Handbook on Information Technology in Finance (International Handbooks on Information Systems)

Author: Detlef Seese | Christof Weinhardt | Frank Schlottmann

International Handbook on Information Systems
Series Editors: Peter Bernus, Jacek Błażewicz, Günter Schmidt, Michael Shaw

Titles in the Series

M. Shaw, R. Blanning, T. Strader and A. Whinston (Eds.) Handbook on Electronic Commerce ISBN 978-3-540-65882-1
J. Błażewicz, K. Ecker, B. Plateau and D. Trystram (Eds.) Handbook on Parallel and Distributed Processing ISBN 978-3-540-66441-3
H. H. Adelsberger, B. Collis and J. M. Pawlowski (Eds.) Handbook on Information Technologies for Education and Training ISBN 978-3-540-67803-8
C. W. Holsapple (Ed.) Handbook on Knowledge Management 1: Knowledge Matters ISBN 978-3-540-43527-3
C. W. Holsapple (Ed.) Handbook on Knowledge Management 2: Knowledge Directions ISBN 978-3-540-43527-3
J. Błażewicz, W. Kubiak, T. Morzy and M. Rusinkiewicz (Eds.) Handbook on Data Management in Information Systems ISBN 978-3-540-43893-9
P. Bernus, P. Nemes and G. Schmidt (Eds.) Handbook on Enterprise Architecture ISBN 978-3-540-00343-4
S. Staab and R. Studer (Eds.) Handbook on Ontologies ISBN 978-3-540-40834-5
S. O. Kimbrough and D. J. Wu (Eds.) Formal Modelling in Electronic Commerce ISBN 978-3-540-21431-1
P. Bernus, K. Mertins and G. Schmidt (Eds.) Handbook on Architectures of Information Systems ISBN 978-3-540-25472-0 (Second Edition)
S. Kirn, O. Herzog, P. Lockemann and O. Spaniol (Eds.) Multiagent Engineering ISBN 978-3-540-31406-6
J. Błażewicz, K. Ecker, E. Pesch, G. Schmidt and J. Węglarz Handbook on Scheduling ISBN 978-3-540-28046-0
F. Burstein and C. W. Holsapple (Eds.) Handbook on Decision Support Systems 1 ISBN 978-3-540-48712-8
F. Burstein and C. W. Holsapple (Eds.) Handbook on Decision Support Systems 2 ISBN 978-3-540-48715-9

Detlef Seese Christof Weinhardt Frank Schlottmann (Editors)

Handbook on Information Technology in Finance

Editors

Prof. Dr. Detlef Seese
University of Karlsruhe (TH)
Institute AIFB
76128 Karlsruhe
Germany

Dr. Frank Schlottmann
Gillardon AG financial software
Edisonstraße 2
75015 Bretten
Germany

Prof. Dr. Christof Weinhardt
University of Karlsruhe (TH)
IISM
76218 Karlsruhe
Germany

ISBN 978-3-540-49486-7

e-ISBN 978-3-540-49487-4

DOI 10.1007/978-3-540-49487-4
Library of Congress Control Number: 2008924335
© 2008 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Production: le-tex publishing services, Leipzig
Cover-design: WMXDesign GmbH, Heidelberg
Printed on acid-free paper
springer.com

PREFACE

Why do we need a handbook on Information Technology (IT) and Finance? First, because both IT and finance are among the most prominent driving forces of our contemporary world. Second, because both areas are developing at tremendous speed, creating an urgent need for up-to-date information on recent developments. Third, because serious applications of IT in finance require specialists with professional training and knowledge in both areas.

Over the last decades the world has seen many changes in politics, economics, science and legislation. The driving forces behind many of these developments are technological in nature, and one of the key technologies in this respect is Information Technology. IT is the most prominent technology revolutionizing industrial development, from products and processes to services, as well as finance, which is itself one of the central pillars of modern economics. The explosive development of the Internet underlines the importance of IT: it is today’s key factor driving global access to and availability of information, and it enables the division of labour on an international scale, that is, globalization.

The profound transformation of finance and the financial industry over the last twenty years was driven by technological developments, for example desktop computers (PCs) with sufficient storage capacity and speed to perform efficient computations at the front desk as well as in the back office, and network technology enabling intranets, the Internet, home banking and service-oriented architectures (SOAs), which allow new and more efficient forms of business processes and financial services. Other important developments are reflected in several eWords, such as eCommerce, eBusiness and eOrganisation, as well as in electronic markets, digital money and intelligent software agents. These developments have led to entirely new scientific areas such as Market Engineering and Knowledge Management.
The area of “pure” mathematical finance has also seen impressive developments, for example the groundbreaking discoveries of Black, Scholes and Merton in the early 1970s, honored in 1997 with the Nobel Prize, which have many applications to derivative financial instruments. Likewise, the stunning discoveries of Benoît Mandelbrot on the Fractal Geometry of Nature, with many applications in mathematics, physics and economics, explaining chaos as well as The (Mis)Behavior of Markets (cf. e.g. the corresponding books by Mandelbrot 1982 and by Mandelbrot & Hudson 2004), are a real milestone that contributed essentially to the creation of the new area of Econophysics. The latter discoveries in particular were influenced by the development of modern IT, since without powerful computers and efficient algorithms they simply would not have been possible. The important influence of IT on finance can be observed in many ways, e.g. efficient algorithmic solutions realizing breakthrough mathematical solutions to financial problems, IT methods supporting new business processes, or “simply” the IT-driven worldwide access to data and the visualization of discoveries.

However, the development of modern IT and contemporary business life, with its global connectivity and speed, lead to a growing complexity of tasks and questions that often have to be solved with diminishing resources. This increases the risk of many decisions and, in conjunction with turbulence in financial markets, emphasizes the need to further improve and develop the mathematical and methodical foundations of finance as well as the technological and methodological basis of IT. It also caused the Basel Committee on Banking Supervision to issue recommendations that serve as the basis for several legislative initiatives to improve the risk and security management of financial institutions.

The present book is an attempt to cover both areas, modern IT and contemporary finance, and to demonstrate the variety of their interactions. Its primary objective is twofold: first, up-to-date surveys of key areas are provided, covering new developments, essential methods and top results in both areas; second, a series of case studies and applications illustrates the link between IT and finance.
The intended readers are students as well as experts in both areas, IT and finance, and of course all those working in or interested in applications of IT in the financial services sector. We are convinced that the book will be useful for academics as well as practitioners seeking information on areas outside their expertise or wishing to study relevant examples of professional applications. We have organized the book into three parts: the first on IT systems, infrastructure and applications in finance, the second on IT methods in finance, and the third on trends and challenges in the financial sector. An introduction to the respective content is given at the beginning of each part.

The idea for this book was born when two of us, Detlef and Frank, met Günter Schmidt in 2004 at the International Workshop on Computational Management, Science, Economics, Finance and Engineering in Neuchâtel. During a conference break Günter Schmidt pointed out the need for a comprehensive handbook covering all aspects of the latest developments in IT and their impact on finance, which lead to new computational tools for implementing numerical computations more efficiently and enable new forms of markets, new organizational structures, new forms of cooperation, services and business, and new products. Later Detlef and Frank continued this discussion on an excursion to Montreux, walking along the shore of Lac Léman. It was a beautiful spring day with bright sunshine, white clouds and glittering snow on the peaks of the Alps; the yellow mimosa trees and the pink Japanese cherry trees were in blossom, and the many colors of the tulips on the fresh green of the lawn heightened our joy and optimism, which convinced us that the challenge of such a Handbook could be mastered and was worth all the effort. After returning to Karlsruhe the two of us contacted Christof and convinced him to join our team and to make every effort to bring the project to a successful end. Since then we have quite often had the feeling that the walk along Lac Léman probably raised our optimism too much, given the enormous effort necessary to realize the huge project of capturing the complete picture of IT in finance. Now that our work is done, we are convinced that it is probably impossible to capture such a vivid and dynamic world of research and applications in a single handbook. However, we are also convinced that we present an excellent selection of highlights, facets and snapshots giving the reader a bird’s-eye view of promising areas as well as contemporary and prospective applications shaping the future of finance from the perspective of IT.

Of course, we are indebted to all those who contributed chapters to this Handbook, cf. the list of contributors following the Contents section. With their motivation, work and co-operation they create advances in the areas of IT and finance, and this Handbook reflects a selection of these advances in its contributions. Moreover, we are grateful to all those who helped us through the arduous process of completing this handbook. First, we thank Günter Schmidt for his encouragement to start the project. We thank Springer-Verlag for its support and especially Dr. Werner A. Mueller for his valuable advice, his patience regarding the final delivery and his encouragement to proceed with this Handbook. We wish to express our special gratitude to Dirk Neumann, who contributed so many of his ideas and so much of his time and energy to the creation of this handbook that we can justly say that he is a co-editor of it. Furthermore, we wish to express our gratitude to the many colleagues we met at different conferences and meetings, e.g. FinanceCom in Regensburg in May 2005, the 10th Symposium on Finance, Banking, and Insurance in Karlsruhe in December 2005, the conferences Quantitative Methods in Finance in Sydney in 2005 and 2006 and the International Workshop on Computational and Financial Econometrics in Geneva in 2007, with whom we discussed the handbook project and from whom we received many valuable hints; to Tim Becker, Hagen Buchwald, Jianer Chen, Jörn Dermietzel, Mike Fellows, Mark M. Davydov, Hermann Göppl, Benjamin Graf, Volker Gruhn, Ralf Herrmann, Christian Hipp, Carsten Holtmann, Ming-Yang Kao, Gottfried Koch, Dirk Krause, Michael Lesko, Torsten Lüdecke, Benoît Mandelbrot, Daniel Minoli, Rolf Niedermeier, Martin Op’t Land, Fethi Abderrahmane Rabhi, Svetlozar T. Rachev, Hartmut Schmeck, Rudi Studer, Marliese Uhrig-Homburg and Ute Werner for comments, hints, reviews and/or recommendations; to Christian Reimsbach Kounatze and Carolin Michels for helping us with the conversion of the manuscripts from LaTeX to the final style in Word; to Jörn Dermietzel, Tobias Dietrich, Andreas Mitschele and Gabriele Seese for their help with various technical work related to the project; to Ingeborg Götz for her office work related to the book; to GILLARDON AG financial software for providing an IT infrastructure which efficiently supported the difficult production of the final Handbook; and to Stefanie Altinger for supporting the final production.

Detlef would like to thank Regina Betz, Fethi Abderrahmane Rabhi, Pradeep Kumar Ray and Nandan Parameswaran from the University of New South Wales, Sydney, and Stephan Chalup, University of Newcastle, for their invitation and support during Detlef’s research semester in winter 2006/2007. Christof and Detlef gratefully acknowledge support from the Deutsche Forschungsgemeinschaft (DFG), especially from the project Graduate School Information Management and Market Engineering (IME). Last but not least, it is a pleasure for us to thank our wives Gabriele, Renate and Bianca for their constant support, their love and their patience during the many nights, weekends and cancelled holidays we spent on the Handbook project. We thank all those concerned.

Karlsruhe, March 2008

Detlef Seese Christof Weinhardt Frank Schlottmann

TABLE OF CONTENTS

Preface
Contributors

PART I. IT SYSTEMS, INFRASTRUCTURE AND APPLICATIONS IN FINANCE
Introduction to Part I

CHAPTER 1
SOA in the Financial Industry – Technology Impact in Companies’ Practice
Heiko Paoli, Carsten Holtmann, Stephan Stathel, Olaf Zeitnitz, Marcel Jakobi
1.1 Introduction
1.2 SOA – Different Perspectives
1.3 SOA – Technical Details
1.3.1 Elements and Layers
1.3.2 Comparison with State-of-the-art Distributed Architecture Alternatives
1.4 Value Proposition of SOA
1.5 Case Study Union Investment
1.5.1 Introduction to the Use Case Company
1.5.2 SOA Implementation at the Use Case Company
1.6 Discussion
1.7 Conclusion

CHAPTER 2
Information Systems and IT Architectures for Securities Trading
Feras Dabous, Fethi Rabhi
2.1 Introduction
2.2 Financial Markets: Introduction and Terminology
2.2.1 Introduction to Financial Markets
2.2.2 Capital Markets Trading Instruments
2.2.3 Equity Market
2.2.4 Derivatives Market
2.2.5 Types of Orders
2.2.6 Classifications of Marketplaces
2.3 Information Systems for Securities Trading
2.3.1 Securities Trading Information Flow
2.3.2 Categories of Capital Markets’ Information Systems
2.4 Case Studies
2.4.1 SMARTS
2.4.2 X-STREAM
2.5 IT Integration Architectures and Technologies
2.5.1 Transport Level
2.5.2 Contents (Data) Level
2.5.3 Business Process Level
2.6 Conclusion

CHAPTER 3
Product Management Systems in Financial Services Software Architectures
Christian Knogler, Michael Linsmaier
3.1 The Initial Situation
3.2 Weaknesses in Software Architectures and Business Processes
3.3 Objectives and Today’s Options
3.4 A ‘Product Server Based Business Application Architecture’
3.4.1 Business Considerations
3.4.2 Technical Considerations
3.5 Migrating into a New Architecture
3.5.1 Blueprint
3.5.2 Product Data Analysis
3.5.3 Repository
3.5.4 Consolidated Architecture
3.5.5 Innovated Architecture
3.6 Impacts on Software Development
3.7 Conclusion

CHAPTER 4
Management of Security Risks – A Controlling Model for Banking Companies
Ulrich Faisst, Oliver Prokein
4.1 Introduction
4.2 The Risk Management Cycle
4.2.1 Identification Phase
4.2.2 Quantification Phase
4.2.3 Controlling Phase
4.2.4 Monitoring Phase
4.3 A Controlling Model for Security Risks
4.3.1 Assumptions
4.3.2 Determining the Optimal Security and Insurance Level
4.3.3 Constraints and Their Impacts
4.4 Conclusion

CHAPTER 5
Process-Oriented Systems in Corporate Treasuries: A Case Study from BMW Group
Christian Ullrich, Jan Henkel
5.1 Introduction
5.2 The Technological Environment
5.3 Challenges for Corporate Treasuries
5.4 Portrait BMW Group
5.5 Portrait BMW Group Treasury
5.5.1 Organization and Responsibilities
5.5.2 Business Process Design
5.6 Business Process Support
5.6.1 Best-of-Breed System Functionalities
5.6.2 Decision-Support Functionalities
5.7 Final Remarks

CHAPTER 6
Streamlining Foreign Exchange Management Operations with Corporate Financial Portals
Hong Tuan Kiet Vo, Martin Glaum, Remigiusz Wojciechowski
6.1 Introduction
6.2 Challenges of Multinational Financial Management
6.3 Corporate Financial Portals
6.3.1 Key Characteristics of Corporate Portals
6.3.2 Corporate Financial Portals
6.4 Bayer’s Corporate Foreign Exchange Management with the Corporate Financial Portal
6.4.1 Corporate Foreign Exchange Management
6.4.2 Bayer’s Foreign Exchange Management Practice
6.4.3 Redesigning Foreign Exchange Risk Management with CoFiPot
6.4.4 Lessons Learned
6.5 Conclusion

CHAPTER 7
Capital Markets in the Gulf: International Access, Electronic Trading and Regulation
Peter Gomber, Marco Lutat, Steffen Schubert
7.1 Introduction
7.2 The Evolution of GCC Stock Markets and Its Implications
7.3 Requirements of International Investors
7.4 Systematic Analysis of the Exchanges in the Region
7.4.1 Bahrain Stock Exchange (BSE)
7.4.2 Tadawul Saudi Stock Market (TSSM)
7.4.3 Kuwait Stock Exchange (KSE)
7.4.4 Muscat Securities Market (MSM)
7.4.5 Doha Securities Market (DSM)
7.4.6 Dubai Financial Market (DFM)
7.4.7 Abu Dhabi Securities Market (ADSM)
7.5 Approaches to Further Open up the Stock Markets to the International Community – The Dubai International Financial Exchange (DIFX) Case
7.6 Conclusion
7.7 Appendix

CHAPTER 8
Competition of Retail Trading Venues – Online-brokerage and Security Markets in Germany
Dennis Kundisch, Carsten Holtmann
8.1 Introduction
8.2 Service Components and Fees in Securities Trading
8.2.1 Service Components in Securities Trading
8.2.2 Access Fees for Participants at German Exchanges
8.2.3 Fees for Price Determination Services
8.2.4 Discussion of Services and Fees
8.3 Analysis of the Price Models of Online-brokers
8.3.1 Forms of Appearance of Online-brokerage
8.3.2 Price Models of Selected Online-brokers
8.3.3 Comparison of the Price Models and Discussion
8.4 Outlook: Off-exchange-markets as Competitors
8.5 Conclusion

CHAPTER 9
Strategies for a Customer-Oriented Consulting Approach
Alexander Schöne
9.1 Introduction
9.2 Qualitative Customer Service Through Application of Portfolio Selection Theory
9.2.1 Markowitz’ Portfolio Selection Theory
9.2.2 Application to Real-World Business
9.2.3 Consideration of Existing Assets
9.2.4 Illiquid Asset Classes and Transaction Costs
9.2.5 Conclusions for Daily Use
9.3 Support of Customer-Oriented Consulting Using Modern Consulting Software
9.3.1 Architecture of a Customer-Oriented Consulting Software
9.3.2 Mobile Consulting as a Result of Customer-Orientation
9.4 Conclusion

CHAPTER 10
A Reference Model for Personal Financial Planning
Oliver Braun, Günter Schmidt
10.1 Introduction
10.2 State of the Art
10.2.1 Personal Financial Planning
10.2.2 Systems and Models
10.2.3 Reference Models
10.2.4 Requirements for a Reference Model for Personal Financial Planning
10.3 Framework for a Reference Model for Personal Financial Planning
10.4 Analysis Model
10.4.1 Dynamic View on the System: Use Cases
10.4.2 Structural View on the System: Class Diagrams
10.4.3 Use Case Realization: Sequence and Activity Diagrams
10.4.4 Example Models
10.5 System Architecture
10.6 Conclusion and Future Trends

CHAPTER 11
Internet Payments in Germany
Malte Krueger, Kay Leibold
11.1 Introduction: The Long Agony of Internet Payments
11.2 Is There a Market for Innovative Internet Payment Systems?
11.2.1 Use of Traditional Payment Systems: Internet Payments and the Role of Payment Culture
11.2.2 The Role of Technology
11.2.3 The Role of Different Business Models
11.3 Internet Payments in Germany: A View from Consumers and Merchants
11.3.1 The Consumers’ View
11.3.2 The Merchants’ View
11.4 SEPA
11.5 Outlook

CHAPTER 12
Grid Computing for Commercial Enterprise Environments
Daniel Minoli
12.1 Introduction
12.2 What is Grid Computing and What are the Key Issues?
12.3 Potential Applications and Financial Benefits of Grid Computing
12.4 Grid Types, Topologies, Components, Layers – A Basic View
12.5 Comparison with other Approaches
12.6 A Quick View at Grid Computing Standards
12.7 A Pragmatic Course of Investigation on Grid Computing

CHAPTER 13
Operational Metrics and Technical Platform for Measuring Bank Process Performance
Markus Kress, Dirk Wölfing
13.1 Introduction
13.2 Industrialisation in the 21st Century?
13.3 Process Performance Metrics as an Integral Part of Business Performance Metrics
13.3.1 Work Processes, Qualitative Criteria and Measuring Process Value
13.4 Controls for Strategic Alignment and Operational Customer and Resource Management
13.5 Metrics Used for Input/Output Analysis
13.6 Output Metrics: Cash Inflows a Function of Output Quality
13.7 Cash Outflows a Function of Available Capacity
13.8 Time and Volume as Non-financial Metrics of Process Performance
13.8.1 Fundamental Input/Output Process Performance Characteristics
13.9 Technical Realization
13.9.1 Monitoring Challenges
13.10 Related Work
13.11 SOA Based Business Process Performance Monitoring
13.11.1 Data Warehouse
13.11.2 Complex Event Processing
13.11.3 Measuring “Time”
13.11.4 Measuring “Volume”
13.11.5 Measurement “Capacity”
13.11.6 Measuring “Quality”
13.11.7 Measuring “Costs” or “Discounted Cash Flows” (DCF)
13.12 Conclusion

CHAPTER 14 Risk and IT in Insurances
Ute Werner
14.1 The Landscape of Risks in Insurance Companies
14.2 IT’s Function for Risk Identification and Analysis
14.2.1 Components of Underwriting Risk
14.2.2 Assessing Hazard, Exposure, Vulnerability and Potential Loss
14.2.3 Information Management – Chances and Challenges
14.3 IT as a Tool for Managing the Underwriting Risk
14.3.1 Disaster Assistance
14.3.2 Claims Handling
14.3.3 Market Development
14.4 Conclusion

CHAPTER 15 Resolving Conceptual Ambiguities in Technology Risk Management
Christian Cuske, Tilo Dickopp, Axel Korthaus, Stefan Seedorf
15.1 New Challenges for IT Management
15.2 Knowledge Management Applications in Finance
15.2.1 An Extended Perspective of Operational Risk Management
15.2.2 Formal Ontologies as an Enabling Technology
15.2.3 The Case for Ontology-based Technology Risk Management
15.3 Technology Risk Management Using OntoRisk
15.3.1 The Technology Risk Ontology
15.3.2 The OntoRisk Process Model
15.3.3 System Architecture
15.4 Case Study: Outsourcing in Banking
15.4.1 Motivation and Influencing Factors
15.4.2 Trading Environment
15.4.3 Analysis and Results
15.5 Conclusion

CHAPTER 16 Extracting Financial Data from SEC Filings for US GAAP Accountants
Thomas Stümpert
16.1 Introduction
16.2 Structure of 10-K and 10-Q Filings
16.2.1 Structure of Document Elements
16.2.2 Embedded Plain Text
16.2.3 Embedded HTML
16.2.4 Financial Reports Overview
16.3 Information Retrieval Within EASE
16.3.1 Information Extraction from Traditional Filings
16.3.2 Standard Vector Space Model
16.3.3 Extended Vector Space Model
16.3.4 Extraction of HTML Documents
16.4 The Role of XBRL
16.5 Empirical Results
16.6 Conclusion

PART II. IT METHODS IN FINANCE
Introduction to Part II

CHAPTER 17 Networks in Finance
Anna Nagurney
17.1 Introduction
17.2 Financial Optimization Problems
17.3 General Financial Equilibrium Problems
17.3.1 A Multi-Sector, Multi-Instrument Financial Equilibrium Model
17.3.2 Model with Utility Functions
17.3.3 Computation of Financial Equilibria
17.4 Dynamic Financial Networks with Intermediation
17.4.1 The Demand Market Price Dynamics
17.4.2 The Dynamics of the Prices at the Intermediaries
17.4.3 Precursors to the Dynamics of the Financial Flows
17.4.4 Optimizing Behavior of the Source Agents
17.4.5 Optimizing Behavior of the Intermediaries
17.4.6 The Dynamics of the Financial Flows Between the Source Agents and the Intermediaries
17.4.7 The Dynamics of the Financial Flows Between the Intermediaries and the Demand Markets
17.4.8 The Projected Dynamical System
17.4.9 A Stationary/Equilibrium Point
17.4.10 Variational Inequality Formulation of Financial Equilibrium with Intermediation
17.4.11 The Discrete-Time Algorithm (Adjustment Process)
17.4.12 The Euler Method
17.4.13 Numerical Examples
17.5 The Integration of Social Networks with Financial Networks

CHAPTER 18 Agent-based Simulation for Research in Economics
Clemens van Dinther
18.1 Introduction
18.2 Simulation in Economics
18.2.1 Benefits of Simulation
18.2.2 Difficulties of Simulation
18.3 Agent-based Simulation Approaches
18.3.1 Pure Agent-based Simulation: The Bottom-up Approach
18.3.2 Monte Carlo Simulation
18.3.3 Evolutionary Approach
18.3.4 Reinforcement Learning
18.4 Summary

CHAPTER 19 The Heterogeneous Agents Approach to Financial Markets – Development and Milestones
Jörn Dermietzel
19.1 Introduction
19.2 Fundamental Assumptions in Financial Theory
19.2.1 Rationality
19.2.2 Efficient Market Hypothesis
19.2.3 Representative Agent Theory
19.3 Empirical Observations
19.4 Evolution of Models
19.4.1 Fundamentalists vs. Chartists
19.4.2 Social Interaction
19.4.3 Adaptive Beliefs
19.4.4 Artificial Stock Markets
19.5 Competition of Trading Strategies
19.5.1 Friedman Hypothesis Revisited
19.5.2 Dominant Strategies
19.6 Multi-Asset Markets
19.7 Extensions and Applications
19.8 Conclusion and Outlook

CHAPTER 20 An Adaptive Model of Asset Price and Wealth Dynamics in a Market with Heterogeneous Trading Strategies
Carl Chiarella, Xue-Zhong He
20.1 Introduction
20.2 Adaptive Model with Heterogeneous Agents
20.2.1 Notation
20.2.2 Portfolio Optimization Problem of Heterogeneous Agents
20.2.3 Market Clearing Equilibrium Price – A Growth Model
20.2.4 Population Distribution Measure
20.2.5 Heterogeneous Representative Agents and Wealth Distribution Measure
20.2.6 Performance Measure, Population Evolution and Adaptiveness
20.2.7 An Adaptive Model
20.2.8 Trading Strategies
20.2.9 Fundamental Traders
20.2.10 Momentum Traders
20.2.11 Contrarian Traders
20.3 An Adaptive Model of Two Types of Agents
20.3.1 The Model for Two Types of Agents
20.3.2 Wealth Distribution and Profitability of Trading Strategies
20.3.3 Population Distribution and Herd Behavior
20.3.4 A Quasi-Homogeneous Model
20.4 Wealth Dynamics of Momentum Trading Strategies
20.4.1 Case: (L1,L2) = (3,5)
20.4.2 Other Lag Length Combinations
20.5 Wealth Dynamics of Contrarian Trading Strategies
20.5.1 Case: (L1,L2) = (3,5)
20.5.2 Other Cases
20.6 Conclusion
20.7 Appendix
20.7.1 Proof of Proposition 1
20.7.2 Time Series Plots, Statistics and Autocorrelation Results

CHAPTER 21 Simulation Methods for Stochastic Differential Equations
Eckhard Platen
21.1 Stochastic Differential Equations
21.2 Approximation of SDEs
21.3 Strong and Weak Convergence
21.4 Strong Approximation Methods
21.5 Weak Approximation Methods
21.6 Monte Carlo Simulation for SDEs
21.7 Variance Reduction

CHAPTER 22 Foundations of Option Pricing
Peter Buchen
22.1 Introduction
22.2 The PDE and EMM Methods
22.2.1 Geometrical Brownian Motion
22.2.2 The Black-Scholes pde
22.2.3 The Equivalent Martingale Measure
22.2.4 Effect of Dividends
22.3 Pricing Simple European Derivatives
22.3.1 Asset and Bond Binaries
22.3.2 European Calls and Puts
22.4 Dual-Expiry Options
22.4.1 Second-order Binaries
22.4.2 Binary Calls and Puts
22.4.3 Compound Options
22.4.4 Chooser Options
22.5 Dual Asset Options
22.5.1 The Exchange Option
22.5.2 A Simple ESO
22.5.3 Other Two Asset Exotics
22.6 Barrier Options
22.6.1 PDE’s for Barrier Options
22.6.2 Image Solutions for the BS-pde
22.6.3 The D/O Barrier Option
22.6.4 Equivalent Payoffs
22.6.5 Call and Put Barriers
22.7 Lookback Options
22.7.1 Equivalent Payoffs for Lookback Options
22.7.2 Generic Lookback Options
22.7.3 Floating Strike Lookback Options
22.8 Summary

CHAPTER 23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data
Wei Sun, Svetlozar (Zari) Rachev, Frank Fabozzi
23.1 Introduction
23.2 Stylized Facts of Financial Intra-daily Data
23.2.1 Random Durations
23.2.2 Distributional Properties of Returns
23.2.3 Autocorrelation and Seasonality
23.2.4 Clustering
23.2.5 Long-range Dependence
23.3 Computer Implementation in Studying Intra-daily Data
23.3.1 Data Transformation
23.3.2 Data Cleaning
23.4 Research on Intra-Daily Data in Finance
23.4.1 Studies of Volatility
23.4.2 Studies of Liquidity
23.4.3 Studies of Market Microstructure
23.4.4 Studies of Trade Duration
23.5 Long-Range Dependence
23.5.1 Estimation and Detection of LRD in Time Domain
23.5.2 Estimation and Detection of LRD in Frequency Domain
23.5.3 Econometric Modeling of LRD
23.6 Fractal Processes and Long-Range Dependence
23.6.1 Specification of the Fractal Processes
23.6.2 Estimation of Fractal Processes
23.6.3 Simulation of Fractal Processes
23.6.4 Implications of Fractal Processes
23.7 Summary

CHAPTER 24 Bayesian Applications to the Investment Management Process
Biliana Bagasheva, Svetlozar (Zari) Rachev, John Hsu, Frank Fabozzi
24.1 Introduction
24.2 The Single-Period Portfolio Problem
24.3 Combining Prior Beliefs and Asset Pricing Models
24.4 Testing Portfolio Efficiency
24.4.1 Tests Involving Posterior Odds Ratios
24.4.2 Tests Involving Inefficiency Measures
24.5 Return Predictability
24.5.1 The Static Portfolio Problem
24.5.2 The Dynamic Portfolio Problem
24.5.3 Model Uncertainty
24.6 Conclusion

CHAPTER 25 Post-modern Approaches for Portfolio Optimization
Borjana Racheva-Iotova, Stoyan Stoyanov
25.1 Introduction
25.2 Risk and Performance Measures Overview
25.2.1 The Value-at-Risk Measure
25.2.2 Coherent Risk Measures
25.2.3 The Conditional Value-at-Risk
25.2.4 Performance Measures
25.3 Heavy-tailed and Asymmetric Models for Assets Returns
25.3.1 One-dimensional Models
25.3.2 Multivariate Models
25.4 Strategy Construction Based on CVaR Minimization
25.4.1 Long-only Active Strategy
25.4.2 Long-short Strategy
25.4.3 Zero-dollar Strategy
25.4.4 Other Aspects
25.5 Optimization Example
25.6 Conclusion

CHAPTER 26 Applications of Heuristics in Finance
Manfred Gilli, Dietmar Maringer, Peter Winker
26.1 Introduction
26.2 Heuristics
26.2.1 Traditional Numerical vs. Heuristic Methods
26.2.2 Some Selected Approaches
26.2.3 Further Methods and Principles
26.3 Applications in Finance
26.3.1 Portfolio Optimization
26.3.2 Model Selection and Estimation
26.4 Conclusion

CHAPTER 27 Kernel Methods in Finance
Stephan Chalup, Andreas Mitschele
27.1 Introduction
27.2 Kernelisation
27.3 Dimensionality Reduction
27.3.1 Classical Methods for Dimensionality Reduction
27.3.2 Non-linear Dimensionality Reduction
27.4 Regression
27.5 Classification
27.6 Kernels and Parameter Selection
27.7 Survey of Applications in Finance
27.7.1 Credit Risk Management
27.7.2 Market Risk Management
27.7.3 Synopsis and Possible Future Application Fields
27.8 Overview of Software Tools
27.9 Conclusion

CHAPTER 28 Complexity of Exchange Markets
Mao-cheng Cai, Xiaotie Deng
28.1 Introduction
28.2 Definitions and Models
28.3 On Complexity of Arbitrage in a General Market Model
28.4 Polynomial-Time Solvable Models of Exchange Markets
28.5 Arbitrage on Futures Markets
28.6 The Minimum Number of Operations to Eliminate Arbitrage
28.7 Remarks and Discussion

PART III. FURTHER ASPECTS AND FUTURE TRENDS
Introduction to Part III

CHAPTER 29 IT Security: New Requirements, Regulations and Approaches
Günter Müller, Stefan Sackmann, Oliver Prokein
29.1 Introduction
29.2 Online Services in the Financial Sector
29.2.1 Integration of Financial Services
29.2.2 Security Risks of Personalized Services
29.3 Security beyond Access Control
29.3.1 Security: Protection Goals
29.3.2 Security: Dependability
29.4 Security: Policy and Audits
29.4.1 Security Policies
29.4.2 Logging Services
29.5 Regulatory Requirements and Security Risks
29.5.1 Data Protection Laws
29.5.2 Regulatory Requirements
29.5.3 Managing Security and Privacy Risks
29.6 Conclusion

CHAPTER 30 Legal Aspects of Internet Banking in Germany
Gerald Spindler, Einar Recknagel
30.1 Introduction
30.2 Contract Law – Establishing Business Connections, Concluding Contracts, and Consumer Protection
30.2.1 Establishing the Business Connection
30.2.2 Contract Law
30.2.3 Terms and Conditions
30.2.4 Consumer Protection – EU-Legislation and Directives
30.3 Liability – Duties and Obligations of Bank and Customer
30.3.1 Phishing and Its Variations – A Case of Identity
30.3.2 Legal Situation in Case of Phishing or Pharming
30.3.3 Procedural Law and the Burden of Proof

CHAPTER 31 Challenges and Trends for Insurance Companies
Markus Frosch, Joachim Lauterbach, Markus Warg
31.1 Starting Position
31.2 Challenges and Trends
31.2.1 Business and IT Strategy in the Balanced Scorecard
31.2.2 Self-controlling Systems: Management Based on the Contribution Margin
31.2.3 Unique Selling Propositions: On Demand Insurance
31.2.4 Sales and Marketing Support: Access to Primary Data
31.2.5 Talent Management: Identify, Hire and Develop Talented Staff
31.3 Summary and Outlook

CHAPTER 32 Added Value and Challenges of Industry-Academic Research Partnerships – The Example of the E-Finance Lab
Wolfgang Koenig, Stefan Blumenberg, Sebastian Martin
32.1 Introduction
32.2 Structure of the EFL
32.3 Generation and Sharing of Knowledge Between Science and Practice: Research Cycle
32.3.1 Problem Definition
32.3.2 Operationalization
32.3.3 Application
32.3.4 Analysis
32.3.5 Utilization
32.4 Added Value and Challenges of the Industry-Academic Research Partnership
32.5 Conclusion

Index

CONTRIBUTORS

Biliana Bagasheva
Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106–3110, USA

Stefan Blumenberg
Institute for Information Systems/E-Finance Lab, Frankfurt University, 60325 Frankfurt/Main, Germany

Oliver Braun
Lehrstuhl für Informations- und Technologiemanagement, Saarland University, Im Stadtwald 15, 66123 Saarbrücken, Germany

Peter Buchen
School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia

Mao-cheng Cai
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, P. R. China

Stephan K. Chalup
School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia

Carl Chiarella
University of Technology, Sydney, PO Box 123 Broadway, NSW 2007, Sydney, Australia

Christian Cuske
Protiviti, Taunusanlage 17, 60325 Frankfurt/Main, Germany

Feras T. Dabous
Senior Advisor, Smarts Group International, Level 4, 55 Harrington Street, The Rocks, Sydney NSW 2000, Australia

Xiaotie Deng
Department of Computer Science, City University of Hong Kong, Hong Kong, P. R. China

Jörn Dermietzel
Graduate School IME and Institute AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe, Germany

Tilo Dickopp
Lehrstuhl für Wirtschaftsinformatik III, University of Mannheim, 68131 Mannheim, Germany

Frank Fabozzi
Yale School of Management, 135 Prospect Street, Box 208200, New Haven, Connecticut, 06520-8200, USA

Ulrich Faisst
Department of Information Systems & Financial Engineering, Business School, University of Augsburg, Augsburg, Germany

Markus Frosch
Promerit AG, Wiesenhüttenstraße 1, 60329 Frankfurt/Main, Germany

Martin Glaum
Chair for General Business Administration, International Management Accounting and Auditing, Justus-Liebig-University of Giessen, 35390 Giessen, Germany

Manfred Gilli
Department of Econometrics, University of Geneva, 40, Bd du Pont d’Arve, CH-1211 Geneva 4, Switzerland

Peter Gomber
Chair of Business Administration, especially e-Finance, Goethe-University Frankfurt, Robert-Mayer-Straße 1, 60054 Frankfurt/Main, Germany

Xue-Zhong He
University of Technology, Sydney, PO Box 123 Broadway, NSW 2007, Sydney, Australia

Jan Henkel
BMW AG, 80788 München, Germany

Carsten Holtmann
Research Unit Information Process Engineering (IPE), FZI Forschungszentrum Informatik, Karlsruhe, Germany

John Hsu
Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110, USA

Marcel Jakobi
Union Investment, Wiesenhüttenstraße 10, 60329 Frankfurt/Main, Germany

Christian Knogler
msg systems ag, Robert-Bürkle-Str. 1, 85737 Ismaning, Germany

Wolfgang Koenig
Institute for Information Systems/E-Finance Lab, Frankfurt University, 60325 Frankfurt/Main, Germany

Axel Korthaus
Lehrstuhl für Wirtschaftsinformatik III, University of Mannheim, 68131 Mannheim, Germany

Markus Kress
Institute AIFB, University of Karlsruhe (TH), 76128 Karlsruhe, Germany

Malte Krueger
Lecturer, Institute for Economic Policy Research, University of Karlsruhe (TH), and Consultant, PaySys Consultancy GmbH, 76128 Karlsruhe, Germany

Dennis Kundisch
Department of Information Systems & Financial Engineering, Competence Center IT & Financial Services, University of Augsburg, 86159 Augsburg, Germany

Joachim Lauterbach, vwd Vereinigte Wirtschaftsdienste GmbH, Tilsiter Straße 1, 60487 Frankfurt/Main
Kay Leibold, Researcher, Institute for Economic Policy Research, University of Karlsruhe (TH), 76128 Karlsruhe, Germany
Michael Linsmaier, msg systems ag, Robert-Bürkle-Str. 1, 85737 Ismaning, Germany
Marco Lutat, Chair of Business Administration, especially e-Finance, Goethe-University Frankfurt, Robert-Mayer-Strasse 1, 60054 Frankfurt/Main, Germany
Dietmar Maringer, Centre for Computational Finance and Economic Agents (CCFEA), University of Essex, Wivenhoe Park, Colchester CO4 3SQ, United Kingdom
Sebastian Martin, Institute for Information Systems/E-Finance Lab, Frankfurt University, 60325 Frankfurt/Main, Germany
Daniel Minoli, Stevens Institute of Technology, Hoboken, NJ 07030-5991, USA
Andreas Mitschele, Institute AIFB, University of Karlsruhe (TH), 76128 Karlsruhe, Germany

Günter Müller, Institute of Computer Science and Social Studies, Department of Telematics, University of Freiburg, 79085 Freiburg, Germany
Anna Nagurney, Department of Finance and Operations Management, Isenberg School of Management, University of Massachusetts, Amherst, Massachusetts 01003, USA
Heiko Paoli, Research Unit Information Process Engineering (IPE), FZI Forschungszentrum Informatik, 76131 Karlsruhe, Germany
Eckhard Platen, School of Finance & Economics and Dept. of Mathematical Sciences, University of Technology Sydney, Australia
Oliver Prokein, Institute of Computer Science and Social Studies, Department of Telematics, University of Freiburg, 79085 Freiburg, Germany
Fethi Rabhi, School of Information Systems Technology and Management, The University of New South Wales, Sydney, NSW 2052, Australia
Svetlozar (Zari) Rachev, School of Economics and Business Engineering, University of Karlsruhe, Postfach 6980, 76128 Karlsruhe, Germany, and Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110, USA

Borjana Racheva-Iotova, Research and Development, FinAnalytica Inc., Milin Kamak 65, Sofia 1164, Bulgaria
Einar Recknagel, RAe Schulz Noack Bärwinkel, Baumwall 7, 20459 Hamburg, Germany
Stefan Sackmann, Institute of Computer Science and Social Studies, Department of Telematics, University of Freiburg, 79085 Freiburg, Germany
Günter Schmidt, Lehrstuhl für Informations- und Technologiemanagement, Saarland University, Im Stadtwald 15, 66123 Saarbrücken, Germany
Alexander Schöne, GILLARDON AG financial software, Edisonstr. 2, 75015 Bretten, Germany
Steffen Schubert, Dubai International Financial Exchange Ltd., Level 7, The Exchange Building, Gate District, Dubai International Financial Centre, P.O. Box 53536, Dubai, United Arab Emirates
Stefan Seedorf, Lehrstuhl für Wirtschaftsinformatik III, University of Mannheim, 68131 Mannheim, Germany
Gerald Spindler, Juristische Fakultät, Universität Göttingen, Platz der Göttinger Sieben 6, 37073 Göttingen, Germany

Stephan Stathel, Research Unit Information Process Engineering (IPE), FZI Forschungszentrum Informatik, Haid-und-Neu-Str. 10–14, 76131 Karlsruhe, Germany
Stoyan Stoyanov, FinAnalytica Inc., Milin Kamak 65, Sofia 1164, Bulgaria
Wei Sun, Institute for Statistics and Mathematical Economics, University of Karlsruhe, 76128 Karlsruhe, Germany
Thomas Stümpert, Institute AIFB, University of Karlsruhe (TH), 76128 Karlsruhe, Germany
Christian Ullrich, BMW AG, 80788 München, Germany
Clemens van Dinther, FZI – Research Center for Information Technologies, Universität Karlsruhe (TH), Haid-und-Neu-Str. 10–14, 76131 Karlsruhe, Germany
Hong Tuan Kiet Vo, Institute of Information Systems and Management IISM, Research Group Information & Market Engineering, University Karlsruhe (TH), 76137 Karlsruhe, Germany
Markus Warg, Deutscher Ring Versicherungsunternehmen, Ludwig-Erhard-Straße 22, 20459 Hamburg, Germany

Ute Werner, Lehrstuhl für Versicherungswissenschaft, Universität Karlsruhe (TH), 76133 Karlsruhe, Germany
Peter Winker, Department of Economics, University of Giessen, Licher Str. 64, 35394 Giessen, Germany
Dirk Wölfing, cirquent | BMW Group, Königsberger Straße 29, 60487 Frankfurt/Main
Remigiusz Wojciechowski, Bayer AG, Treasury Systems Manager, Building Q26, Room 3.308, 51368 Leverkusen, Germany
Olaf Zeitnitz, Union Investment, Wiesenhüttenstraße 10, 60329 Frankfurt/Main, Germany

PART I IT Systems, Infrastructure and Applications in Finance

Introduction to Part I

The sixteen chapters of this part cover essential aspects of IT systems and their relation to the infrastructure of the financial services industry.

In the first chapter, Paoli, Holtmann, Stathel, Zeitnitz and Jakobi present the Service-Oriented Architecture (SOA) as an innovative concept for designing software infrastructures in the financial industry. SOA promises benefits such as the decoupling of tasks and increased reuse of software components, since a SOA software system is divided into logical, self-sustaining parts (services). In contrast to, e.g., objects in object-oriented architectures, services are loosely coupled to form the business logic of, e.g., a company. SOA also has a strong economic impact, as it allows business logic to be adapted more flexibly to new market requirements and at the same time promises to reduce the total cost of ownership. A case study from the German financial industry provides insights into the introduction of a SOA and reveals how much of the promised impact and potential of SOA can be realized in daily practice today.

Chapter 2 is devoted to information systems and IT architectures for securities trading. Here, Dabous and Rabhi introduce the basic principles of securities trading by presenting essential concepts such as financial instruments, order types and market structures. The chapter describes the securities trading cycle as a number of business processes within and across different institutions such as brokerage houses, investment banks and markets. It classifies the different types of information and transaction systems and their role in the trading cycle. For illustration, the chapter focuses on the architecture of two commercial systems. It also examines the architecture of such systems in general and surveys the different levels of integration and the technologies currently in use.
It also discusses current trends in using service-oriented computing, business process automation and integration principles to develop stable, flexible and extensible business processes.

The next chapter, by Knogler and Linsmaier, investigates product management systems in financial services software architectures. Flexibility and adaptability of core business application software

are major drivers for both business and IT managers in the insurance industry. Extensively automated straight-through processing from the point of sale (or point of contact) into the back-office systems and back again has to be enabled in order to cut the time and costs spent on sales and service processes. The authors present a concept of ‘Product Data Driven Business Application Software Architecture’ for the insurance industry, reflected by a product management system. Moreover, the scalability of services is a new approach that ranges from simple services (e.g. calculations) to product-related business services, which play a key role in nearly all core processes within the financial services and insurance industry. The chapter authors identify strategic IT consulting and systems integration services as well as standard business applications as the ingredients for tomorrow’s business success.

Faisst and Prokein discuss the management of security risks and introduce a controlling model for banking companies in Chapter 4. The increasing importance of information and communication technologies, new regulatory obligations (e.g. Basel II) and growing external risks (e.g. hacker attacks) put security risks in the management focus of banking companies. Management has to decide whether to bear security risks, to invest in technical security mechanisms in order to decrease the frequency of events, or to invest in insurance policies in order to lower the severity of events. Based on a presentation of the state of the art in the management of security risks, this chapter develops an optimization model to determine the optimal amount to be invested in technical security mechanisms and insurance policies. The model also considers budget and risk limits as constraints. The chapter is intended in particular to help practitioners in controlling security risks.

In Chapter 5, Ullrich and Henkel present a case study on process-oriented systems in corporate treasuries.
Any kind of risk and transaction management conducted in a corporate treasury department requires technology support. The reason for this stems not only from the volume, complexity and time criticality of the necessary data processing, but also from the desire to make quicker and better-informed decisions. A practical conflict emerges from the need to consolidate IT systems on the one hand while at the same time granting a certain degree of specification on the other. A case study provides insights into the treasury organization, financial services and treasury process of BMW Group, a multinational manufacturer of premium automobiles and motorcycles. In order to address the above-mentioned conflict, BMW’s treasury system functions and their capability of supporting BMW’s treasury operations are outlined. It is found that, given the current market environment for IT systems, BMW needs to rely on a combination of vendors in order to obtain a complete business solution.

How to streamline foreign exchange management operations with corporate financial portals is investigated by Kiet Vo, Glaum and Wojciechowski in Chapter 6. This work outlines the characteristics and determinants of multinational financial processes and identifies the challenges financial management has to face when striving for operational effectiveness and efficiency. It is argued that these challenges can be managed effectively only with the support of IT. Hence, the concept of a corporate financial portal as a central hub for financial information, systems and transaction services is introduced and discussed. To complement the

theoretical discussion, a corresponding application is introduced, and the impact of a corporate financial portal implementation on the foreign exchange management process of Bayer AG is evaluated.

Stock markets in the Gulf region have experienced significant growth in the recent past. This area is discussed in Chapter 7 by Gomber, Lutat and Schubert, with special emphasis on international access, electronic trading and regulation of this market, for which market performance, trading activity and market capitalization have increased by nearly triple-digit percentages annually. This development has two effects: on the one hand, financial markets become more attractive to international investors, who had hardly any access for a long time; on the other hand, the region’s exchanges are undergoing modernization of their market models, regulatory frameworks, functionalities and technical infrastructures. This chapter highlights recent stock market developments in the six Gulf Cooperation Council (GCC) countries against the background of international investors’ requirements. The region’s exchanges are characterized from an organizational and regulatory perspective with a focus on trading mechanisms. Recent approaches to modernizing and opening up GCC markets are investigated, taking the Dubai International Financial Exchange (DIFX) as a case study.

Kundisch and Holtmann investigate the competition of retail trading venues, online brokerage and security markets in Germany in Chapter 8. Germany is one of the few countries with a vibrant regional stock market structure. Some of the markets specifically address the retail investor with their offerings and compete on price as well as on quality. Retail investors cannot access these markets directly, but have to make use of intermediaries such as an online broker.
The explicit transaction costs are one important factor that influences a retail investor’s decision whether and where to perform her transaction. This contribution analyses the services that are affiliated with a stock market transaction on the one hand and the pricing schemes of four online brokers on the other. The research question addressed is whether potentially existing price competition among the stock markets and other affiliated service providers is visible to the retail investor. Due to the pricing schemes of the online brokers, which are what is primarily relevant to the retail investor, potentially existing price competition at the level of the stock markets does not become visible. However, differences can be reported in the distortions caused by the different pricing schemes of the analysed online brokers.

In Chapter 9, Schöne develops strategies for a customer-oriented consulting approach in sales processes for financial products. Following a phase in which customer service was not considered a core business, banks have begun focusing on customer service again. The chapter discusses two central requirements for customer-oriented consulting. First, it highlights which methods are available for enhancing the functional quality of consulting services. Then it introduces the IT structures that modern consulting software should provide, especially to enable quick reactions to changing market requirements.

Chapter 10 is devoted to reference models for financial planning. Here, Braun and Schmidt present a reference model for personal financial planning based on an application architecture and a corresponding system architecture. Personal financial

planning is the process of meeting life goals through the management of finances. After an introduction, the chapter provides a short review of the state of the art in the field of reference modelling for personal financial planning and describes the framework of a suitable reference model. The chapter discusses essential parts of reference models at the analysis and architecture levels. A prototype (FiXplan – IT-based personal financial planning) is described as an example in order to evaluate the compiled requirements. Finally, the results are summarized and future research opportunities are discussed.

Internet payments in Germany are the central focus of Chapter 11 by Krueger and Leibold. After discussing the historical development, they take a look at the current state of the market and some of the important drivers. Then the role of technology and of different business models is investigated. Section three of this chapter presents the special situation of Internet payments in Germany. It is followed by a discussion of the Single Euro Payments Area (‘SEPA’) and an outlook on possible future developments.

Grid computing is an evolving field of virtualized distributed computing environments, providing dynamic “runtime” selection, sharing and aggregation of (geographically) distributed autonomous resources. Minoli’s Chapter 12 surveys the grid computing field as it applies to corporate environments and focuses on the potential advantages of the technology in these environments. After a survey of the historical development, it gives a thorough introduction to grid computing and its key issues. An analysis of potential applications and financial benefits of grid computing and a survey of different grid types, topologies, components and layers are provided. This basic view is compared with other approaches.
A brief look at grid computing standards and a pragmatic course of investigation on grid computing close the chapter.

In Chapter 13, Kress and Wölfing present operational metrics and a technical platform for measuring bank process performance. Industrial methods for improving productivity require metrics to control the performance of processes. For this purpose, the end-to-end processes should be broken down into components, ideally reflecting the division of labour. The interfaces between the components of the process serve as measurement points. The coordination of the different workers is based on controlling information produced in a technical business process management platform. Technically, the business process management platform produces this information by means of a monitoring system integrated in a SOA. Depending on the specific scenario, various technical approaches exist for building such a monitoring solution, which has to cope with increasing data volumes as well as constant change. The presented solution provides business-driven monitoring in challenging technical environments.

Werner gives a survey of special aspects of risk and IT in insurance in Chapter 14. Insurance companies run different risks, with underwriting and timing risk being the most typical ones. As insurance contracts cover natural and man-made hazards and the resulting damages, it is important for insurers to assess their interdependent effects. Geographical Information and Remote Sensing Systems are used for risk identification and analysis since they support the mapping of hazards,

exposure, vulnerability and risk. Insurance companies need this information for risk inspection and pricing. Accumulations of insured risks have to be controlled, too. This task may be enhanced by IT, since it requires the identification and visualization of spatial and temporal contingencies, which is important for risk selection and complex risk analyses. Geographic Information and Remote Sensing Systems have been used as tools for disaster assistance and insurance claims management. Another promising application area is market development, i.e. increasing the quantity and improving the quality of insurance business written. Both effects can reduce the underwriting risk of insurers.

Cuske, Dickopp, Korthaus and Seedorf investigate in Chapter 15 how conceptual ambiguities in technology risk management can be resolved. Knowledge management is widely recognized as a business enabler, especially in large organizations depending on intangible assets. Accordingly, financial institutions need to acquire, store and apply knowledge to support their core business processes. This holds particularly for risk management, which cuts across the financial service domain as one core competency. Unlike, e.g., market or credit risk, operational risk suffers from an underlying conceptual ambiguity, creating special challenges for knowledge management. This chapter presents the OntoRisk process model, which is used for identifying, measuring and controlling technology risk, a subcategory of operational risk that is also a focus of the Basel II regulation. The process is supported by an ontology-based platform implementing a suitable knowledge representation and application components supporting the process steps. The advantages over traditional approaches are analysed by means of a case study.
The last chapter of this part contains a contribution by Stümpert, who discusses how to extract financial data from SEC filings for US GAAP accountants as a key example of automatic information retrieval. U.S. domestic companies with publicly traded securities have to disseminate financial information in the EDGAR database of the Securities and Exchange Commission (SEC), a regulatory authority with the task of protecting investors and maintaining fair, honest and efficient markets. The most important financial information in this database is contained in 10-K filings, which comprise the annual reports, i.e. balance sheets, cash flow statements and income statements. The chapter presents a software agent that extracts data from annual and quarterly reports and enables financial accountants to compare financial data of different companies and different years easily. The structure of 10-K filings varies between companies and also changes over time for individual companies. Moreover, their technical specification is comparatively weak. Altogether, this makes automated data extraction a great challenge. The focus of this chapter is the recognition of balance sheets, that is, how to find relevant sections in a large document. A filing consists of either HTML or plain ASCII text. For HTML-encoded content, the agent builds a DOM (Document Object Model) instance on top of a filing. This DOM-based approach revealed additional potential for navigating these filings in order to detect financial information faster and more reliably, even when filings do not adhere strictly to syntactical conventions. For plain text, a modified vector space model has been developed. The agent detects 100% of balance sheet data from 10-K filings in a sample of the companies of the Dow Jones Industrial Average Index.

CHAPTER 1 SOA in the Financial Industry – Technology Impact in Companies’ Practice Heiko Paoli, Carsten Holtmann, Stephan Stathel, Olaf Zeitnitz, Marcel Jakobi

1.1 Introduction

Service-oriented Architecture (SOA) is basically an innovative concept for designing software infrastructures. To better understand the idea and concepts of SOA, it seems promising to take a look at how Information and Communication Technology (ICT) has developed since the beginning of the 1970s (see Table 1.1). Four major developments concerning architectural paradigms can be observed:

• Monolithic and host-based approaches: The history of ICT started with monolithic software systems on mainframes. Typical applications in that era were developed for accounting, payroll and inventory purposes. Software primarily concentrated on the functions it should provide. Systems were designed as single, self-contained blocks, and therefore the reuse of system elements or components was largely impossible. Functionality was implemented on hosts; terminals provided the user interface to the systems. The next step in software architecture development was the era of client-server systems, beginning in the 1980s.
• Client Server (CS) approaches: With the CS paradigm, software increasingly started to address the needs of single user groups through applications for Enterprise Resource Planning (ERP), Human Resource Management (HR) or collaboration. Terminals were replaced by personal computers, and software systems were divided into two parts: servers built the backbone and provided the necessary business functionality, while the clients were used mainly for presentation purposes. The hardware and/or operating systems of client and server became independent of each other. Hence, server functionality could be used by multiple clients, while the communication scheme was based on the technology known as Remote Procedure Call (RPC). Clients and servers were still coupled quite tightly.

• Multiple Tier approaches: In the 1990s the object-oriented software development paradigm became common, and it became possible to divide complex systems into smaller parts (components and objects) that communicate with each other to build the overall functionality. One central goal of the object-oriented paradigm was to ease the reuse of system parts and objects. To support the reuse of server objects in a distributed environment, the concept of a communication middleware was introduced. This made it possible to build complex systems, but as in client-server systems all parts of the distributed system were still tightly coupled through the middleware.

To solve the challenges of multiple tier systems, SOA started to take off in the early 2000s. Today SOA is one of the most frequently used terms in discussions about state-of-the-art software architecture paradigms. Nearly every important software vendor offers its own solutions to support SOA. The promises are huge and so is the uncertainty about the real benefits: What exactly distinguishes SOA from other paradigms for distributed software systems? What are the primary benefits, and which of them can be observed in real-world scenarios?

The remainder of the chapter addresses these questions. After an introduction to some of the different understandings one could have of SOA, the often-mentioned technical and economic benefits are discussed, before the implementation of a SOA in a German Asset Management Company (AM) is illustrated and discussed with respect to both aspects.

Table 1.1. Architecture approaches (see Magic Software (2005))

Approach             Applications                     Range                      Characteristics
Host Based           Accounting, Payroll, Inventory   Function (Discrete)        Monolithic; very tightly coupled; no reuse possible
Client-Server        ERP, HR, Collaboration (Office)  Employee (Internal)        RPC; tightly coupled; little reuse possible
Multiple Tier (Web)  CRM, eBusiness, EAI              Customer (External)        Objects; tightly coupled; medium reuse possible
Enterprise Network   BPM, CPM/BI                      Business Process (Global)  Services; loosely coupled; reuse possible

1.2 SOA – Different Perspectives

The introduction provided a rough idea of the history of different ICT architecture paradigms and briefly summarized their important characteristics. The question of what a SOA is was raised. The answer is simple: it depends. As in any emerging ICT field, there is a huge heterogeneity of understandings of what SOA refers to. Typically none of them is able to sketch all aspects of the topic; some try to be broad and high-level, others focus on special characteristics.1 To illustrate different perspectives, three approaches, namely a software vendor’s, a consultant’s and a common perspective, are briefly described before the basic architecture of a SOA is presented.

A first way of understanding SOA can be characterized as a vendor’s perspective on the terms service, service orientation, service-oriented architecture and composite application, since it has been provided by IBM. Interestingly, the very basis of this understanding is the definition of a service as a repeatable business task (such as checking a customer’s credit or opening a new account). Service orientation is then understood, primarily in non-technical terms, as a way of integrating business processes as a linked chain of services and the outcomes that they provide.2 Hence, a service-oriented architecture is an IT architectural style that supports service orientation. These basic definitions from a software vendor interestingly stress the business aspect of services and SOA. Nothing is said about the requirements on the infrastructure, middleware or software architecture. In this definition business processes seem to be the most important aspect to focus on. Composite applications facilitate business processes across multiple functions and systems. A SOA infrastructure obviously has to be built upon the business needs; this is in clear contrast to older concepts, where the IT infrastructure dictated how business tasks had to be handled.
Building applications on a SOA approach seems to be a task of decomposing business needs into functions and services. Composite applications can be described as a jigsaw puzzle in which business services are connected by using middleware functionalities. Following these insights, a SOA has to be

• process-oriented,
• context-driven and dynamic, which means that it accomplishes business goals, responds to business events and changes over time depending on changing business needs,
• user-centric, which means that it integrates and provides information requested by specific user groups or roles.

1 Even manufacturers of the available software solutions which support this new paradigm have different understandings of SOA, leading to a diversity of SOA products with very different properties.
2 A set of related and integrated services that support a business process built upon a service-oriented architecture is dubbed a composite application.

A second perspective on SOA can be dubbed a consultant’s perspective. According to (Gartner 2005), a service-oriented architecture consists of some established formal interfaces between reusable business components and clients. Services are designed to be used across applications and organizations. Like the first perspective, this understanding concentrates on the business aspect of SOA, but it also integrates a more technical perspective and stresses the possibility of reusable business components.

A third understanding can be dubbed a common understanding, as it is provided in the free encyclopaedia Wikipedia (Wikipedia 2005). As in the previous definitions, a service is defined as an abstract and self-contained part needed for the business, without considering how a service could be implemented. While additional questions concerning the business perspective remain open, this source at least clarifies

• how (technical) services could be used (through a stateless request/response interaction pattern), and
• how (technical) services interact with one another (orchestration, choreography).

Using a service requires at least two parties (provider and consumer). The relationship between a consumer and a provider can be dynamic, in which case the binding between the two is established through a service repository (discovery). This definition allows a provider of a specific service to be a consumer of another service at the same time. As these perspectives on SOA do not explain its fundamental nature, the technical aspects are described in more detail in the following.
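The roles named in this third understanding (provider, consumer, discovery through a repository, and a stateless request/response pattern) can be illustrated with a minimal Python sketch. All names in it (`ServiceRepository`, `check_address`, `open_account` and the request fields) are assumptions made for illustration and do not come from the cited sources; a real SOA would place a network protocol and a service bus between the parties.

```python
from typing import Callable, Dict

# Hypothetical types for illustration: a service maps a request to a response.
Request = Dict[str, object]
Response = Dict[str, object]
Service = Callable[[Request], Response]


class ServiceRepository:
    """Maps service names to providers so that consumers can discover them."""

    def __init__(self) -> None:
        self._providers: Dict[str, Service] = {}

    def register(self, name: str, provider: Service) -> None:
        self._providers[name] = provider

    def discover(self, name: str) -> Service:
        if name not in self._providers:
            raise LookupError(f"no provider registered for {name!r}")
        return self._providers[name]


def check_address(request: Request) -> Response:
    """Provider: stateless, the reply depends only on the request."""
    address = str(request.get("address", "")).strip()
    return {"valid": bool(address)}


def open_account(repository: ServiceRepository, request: Request) -> Response:
    """Provider of 'open_account' that is itself a consumer of 'check_address'."""
    check = repository.discover("check_address")  # binding happens at call time
    if not check({"address": request.get("address", "")})["valid"]:
        return {"opened": False, "reason": "invalid address"}
    return {"opened": True, "customer": request["customer"]}


repository = ServiceRepository()
repository.register("check_address", check_address)
repository.register("open_account", lambda req: open_account(repository, req))

# A consumer only knows the service name and the request/response shape.
result = repository.discover("open_account")({"customer": "C-1", "address": "Main St 1"})
```

Because the consumer looks the provider up by name on every call, a provider could be replaced in the repository without changing the consumer, which is the dynamic binding the text describes.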

1.3 SOA – Technical Details

To illustrate the idea of SOA in more detail, some insight into its basic elements and layers as well as a comparison with alternative state-of-the-art technologies is useful; both are provided in the following.

1.3.1 Elements and Layers

Different from the alternative understandings of SOA that can be found in business scenarios, the technical elements and details of a SOA are more a matter of consensus among most authors. Following e.g. (Krafzig et al. 2004), a SOA comprises

• services,
• application front-ends,
• a service repository, and
• a service bus.

Application front-ends are built upon services. The communication between services, service repository and application front-ends is enabled by the service bus.

Figure 1.1. Basic Elements of a SOA, see (Krafzig et al. 2004)

The service repository is needed to find and bind services dynamically to application front-ends and/or other services. A service can be divided into further parts (see Figure 1.1). Services can be accessed through their publicly visible interfaces. Without knowing the interface it is not possible to access or use a service. Hence it is necessary to describe interfaces in an appropriate way (see footnote 3). A service can have more than one interface. The advantage is that the service can then be used via different protocols and/or technologies. A service needs a concrete implementation which provides the business logic and the data to fulfil its tasks. Data is typically partitioned in such a way that every partition can only be manipulated through one service. Hence the data of a service can be interpreted as the private property of that single service, and if other services need to access that data, the associated service and its interface(s) have to be used. The implementation of a service is typically hidden from the consumers. Last but not least, a service also has a contract. The idea of contracts is often disregarded in current implementations. Service contracts define what the service guarantees to its consumers (e.g. quality, security, performance …) and also state what is required to use the service (e.g. price, required authorization level …).

One can distinguish between business services and basic (or technical) services. The latter are typically applied to
• interact with technical infrastructure elements (send and receive messages, use of data buffers and queues),
• connect to already existing legacy systems or applications (e.g. sales system, customer relationship management systems),
• use the technical infrastructure (through wrappers, adapters, transformations …).

Footnote 3: Web services, e.g., can be described by WSDL (Web Services Description Language).
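The anatomy of a service described above (interface, contract, implementation, privately owned data) can be sketched in a few lines of Python. This is only an illustration: the class names, the contract fields and the address check are assumptions made for the example and do not come from Krafzig et al. or from any concrete SOA product.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


@dataclass(frozen=True)
class Contract:
    """What the service guarantees and what it requires (illustrative fields)."""
    max_response_ms: int      # guarantee given to consumers
    required_auth_level: int  # requirement placed on consumers


class AddressCheckInterface(Protocol):
    """Publicly visible interface; consumers only ever depend on this."""

    def check_address(self, customer_id: str, auth_level: int) -> bool: ...


@dataclass
class AddressCheckService:
    """Implementation: business logic plus the data partition it owns."""
    contract: Contract
    _addresses: Dict[str, str] = field(default_factory=dict)  # private data

    def register(self, customer_id: str, address: str) -> None:
        self._addresses[customer_id] = address

    def check_address(self, customer_id: str, auth_level: int) -> bool:
        # Enforce the consumer-side requirement stated in the contract.
        if auth_level < self.contract.required_auth_level:
            raise PermissionError("authorization level too low")
        return bool(self._addresses.get(customer_id, "").strip())


service: AddressCheckInterface = AddressCheckService(
    Contract(max_response_ms=200, required_auth_level=2)
)
```

Other services never touch `_addresses` directly; they have to go through `check_address`, which mirrors the rule that a data partition is manipulated through exactly one service.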


Heiko Paoli et al.

Table 1.2. Technical elements in a SOA

Term: Definition/Comment
SOA: Entirety of application front-ends, services, service repository, and service bus.
Application front-end: Logical aggregation and presentation of services.
Service: Logical unit of business logic, which can be used by application front-ends. A service can be used through its interface and assures the conditions from its contract.
Service Repository: Optional part of a SOA for managing services. It allows dynamic coupling between services and/or application front-ends.
Service Bus: Communication infrastructure between services, application front-ends, and service repository.
Contract: A set of conditions under which a service can be used, together with conditions the service will assure when it is used.
Implementation: Business logic and data of a service needed to fulfil its contract.
Interface: Description of a service to connect and use it.
Business logic: Functionality of a service which is needed by application front-ends.
Data: Data which belongs to a service and can be accessed and managed by the business logic.

In contrast, the former provide business functionality to support business processes (often by using basic services), e.g. to check the creditworthiness of a customer. The business services can again be layered, meaning that there are simpler ones (e.g. to check the address of a customer) and more complex ones (e.g. to execute a securities order). The more complex business services typically call the simpler ones to fulfil their goals. The communication infrastructure between all services is called the enterprise service bus. Typically, a SOA is implemented using a number of building blocks or layers (see Figure 1.2). The lowest layer (legacy layer) consists of already existing legacy systems and applications, which typically provide back-office and persistence functionality. This is the core IT architecture. In the next layer (integration layer) these legacy systems and applications are encapsulated and integrated through basic services. On top of the basic service layer, in the business service layer (service layer), services are integrated (or orchestrated) to build business processes and business workflows. The uppermost layer (portal layer) is used for presentation purposes. This layer is the business process management portal, consisting of composite applications for controlling and monitoring the business processes. Referring to the basic description of a SOA, the integration and service layers can be implemented using different technologies; the best choice depends on the requirements to be met. Web services are independent of the platform (hardware, programming language and operating system), well supported by different providers and hence applied quite often.


Figure 1.2. Layers of a SOA implementation, adapted from (Eckwert and Saft 2006)

But they also have their shortcomings: the performance of web services is typically limited, and traditional transaction protocols (e.g. two-phase commit) are difficult to support. CORBA is often considered a possible alternative because it is well suited for handling transactions and provides high performance. On the other hand, it offers little support for B2B integration, so it seems best suited for internal processes; the same holds for the J2EE technology. Of course a mix of different technologies could also be used for a SOA implementation, as long as SOA goals such as loose coupling, reuse and usage of services are respected.
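The layering described in this section can be illustrated with a small, technology-neutral sketch: a legacy system, a basic service that wraps it, and a more complex business service that calls a simpler one. All class names, the semicolon-separated record format and the toy data are assumptions chosen for the example; a real implementation would use web services, CORBA or J2EE components as discussed above.

```python
class LegacyDepotSystem:
    """Legacy layer: a stand-in for an existing back-office application."""

    def __init__(self):
        self._units = {("D1", "FUND-A"): 10}  # persisted holdings (toy data)

    def update(self, record: str) -> str:
        # The legacy system only understands a semicolon-separated record.
        depot, fund, delta = record.split(";")
        key = (depot, fund)
        self._units[key] = self._units.get(key, 0) + int(delta)
        return f"{depot};{fund};{self._units[key]}"


class DepotBasicService:
    """Integration layer: wraps the legacy record format behind a typed call."""

    def __init__(self, legacy: LegacyDepotSystem):
        self._legacy = legacy

    def change_units(self, depot: str, fund: str, delta: int) -> int:
        reply = self._legacy.update(f"{depot};{fund};{delta}")
        return int(reply.split(";")[2])


class DepotCheckService:
    """Service layer: a simple business service."""

    def __init__(self, known_depots):
        self._known = set(known_depots)

    def is_known(self, depot: str) -> bool:
        return depot in self._known


class SecuritiesOrderService:
    """Service layer: a more complex business service built on the two above."""

    def __init__(self, check: DepotCheckService, depots: DepotBasicService):
        self._check = check
        self._depots = depots

    def execute(self, depot: str, fund: str, units: int) -> int:
        if not self._check.is_known(depot):
            raise ValueError(f"unknown depot {depot!r}")
        return self._depots.change_units(depot, fund, units)


order_service = SecuritiesOrderService(
    DepotCheckService({"D1"}), DepotBasicService(LegacyDepotSystem())
)
print(order_service.execute("D1", "FUND-A", 5))  # prints 15
```

A portal-layer application would call only `SecuritiesOrderService`; it never sees the legacy record format, which is the encapsulation the integration layer is meant to provide.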

1.3.2 Comparison with State-of-the-art Distributed Architecture Alternatives

To finally understand what a SOA is, it helps to understand what it is not. Therefore a brief delimitation from other architecture alternatives is given.
• Client Server: The client-server approach is one of the best established architectures for distributed computing. The approach and the technology available are mature and stable. Systems of this type can be easily implemented and optimized for special application scenarios. On the other hand, once a client-server system is established it is difficult to change the functionality of the system. Its flexibility is therefore limited, and it is best used if the business process is expected to be stable and changes are rare (see (Sadoski 1997)).


• J2EE: Java 2 Platform, Enterprise Edition (J2EE) is also a mature technology and widely used for distributed computing in object-oriented environments. Objects are typically smaller than services in a SOA and are more closely intertwined with each other. The technology is independent of the operating system but, of course, not of the programming language. Objects can be inserted, modified or replaced without stopping and restarting the whole system, so J2EE increases the flexibility for adapting business processes compared to client-server systems. The expected reuse factor in a J2EE environment is as high as expected in a SOA. One main difference between objects of a J2EE architecture and services of a SOA is that distributed objects lack an explicit contract; they only define an interface. From the conceptual point of view objects are passive and can be used by any other object. Objects do not commit their functionality to other objects as services do. It is possible to implement a SOA with J2EE (see (Modi 2005)).
• P2P: In a true peer-to-peer (P2P) architecture every participant is independent and equal. Central components or services which are used by all other participants should not exist. Every participant provides services to and also consumes services from others. The whole infrastructure is very dynamic and self-organizing, which leads to a high level of flexibility, but on the other hand reliability cannot be guaranteed. The technology behind peer-to-peer networks is relatively young and mainly used for sharing scenarios. In practice most peer-to-peer architectures use some central components such as index servers or registries for better performance, with the disadvantage of introducing single points of failure. Without central services it is very difficult to find the resources needed in a reasonable amount of time. P2P networks are well suited for highly dynamic environments (see (Ratnasamy et al. 2000), (Stoica et al. 2001)).
• Multi Agent Systems: Multi agent systems typically consist of many individual units, so-called agents. The most important properties of an agent are that it is autonomous, aware of its environment and pro-active. The system infrastructure provides the communication technology for all agents in the system. Agents provide and consume services, which can be regulated through contracts, so an agent may have to pay for services that other agents provide. In turn, agents commit to the services they provide and define the level of quality. A major difference between agents and services is that the former can ‘decide on their own’; agents are meant to behave like human beings. Therefore it is also possible that agents behave opportunistically, meaning that they may try to cheat others. Multi agent systems can be regarded as a mixture of peer-to-peer architectures and SOA, but because of the complexity of the infrastructure required, they seem rarely suited for huge internet applications or B2B integration (see (Peng et al. 1998)).

1.4 Value Proposition of SOA

Having introduced the basic elements, layers and particularities of SOA, we now take a closer look at the benefits this paradigm promises. Technical and economic aspects are introduced separately.
• Flexibility/Loose Coupling/Agility
Loose coupling of services is one of the most important goals of a SOA. Loose coupling is important as it minimizes the probability of having a single point of failure. Every service is as independent as possible. A service should be able to run in parallel to all other services and should not be affected by failures of other services. If redundant services exist, it should even be possible to switch on the fly to another equivalent service. One major consequence of loose coupling is that services should be stateless; otherwise they are not independent. So normally a service receives all parameters it needs to fulfil its business functionality at its instantiation. The result of its processing is then delivered to the consumer. One related problem is the difficulty of implementing traditional transaction protocols like two-phase commit on the basis of stateless services.
• Application Integration
The design of SOA supports easy application integration. Applications can be regarded as independent services which are used by other services through a connector to the SOA communication framework (Enterprise Service Bus or ESB). The connector translates the data formats, messages and protocols into the common communication model of the ESB. Other services can virtually integrate functionality of applications and provide a combined view of the integrated functionality from all applications as a new composite application.
• Complexity Reduction
Another promise of SOA is the reduction of system complexity. A good indicator of the complexity of a software system is the number of point-to-point connections between all software components used.
SOA separates software functionality into technically oriented functionality, like storing data in databases, and business-related functionality, like providing a new customer account, while the required communication middleware is normally provided by a deployed SOA framework. This separation of concerns eliminates the need for software components to have many point-to-point connections between each other.
• Reuse
Developing software is complex, error-prone and cost-intensive. Therefore one of the oldest aims in developing software is reusability. In the past many attempts were made to support the reuse of software and software components: At the beginning of software development the concept of procedural programming languages


was developed. All the functionality was put into procedures and then bundled into software libraries. Reuse is then established by using already developed libraries to build new software. In the era of client-server systems the business functionality was put into the server, where it could be reused by many (different) clients over a network. The next attempt came with the idea of object orientation. Instead of procedures, whole objects could now be reused, and with technologies like CORBA or J2EE this is also possible over a network. In a SOA this idea is applied to services, which are now the reusable part. Today over 70% of all data used is managed by mainframe systems (see (Datamonitor 2006)). To access these data in a SOA, suitable connectors are needed for the legacy systems. Today over a hundred connectors for well established legacy systems are already available for SOA infrastructures.

After discussing the technical value proposition, the economic potentials have to be addressed. These are cost savings and the flexibility to innovate or differentiate:
• Costs
One goal of introducing a new technology is the reduction of costs. Potential costs can be divided into total cost of ownership and costs for developing new functionality. With SOA it is expected that the total cost of ownership is reduced because there are fewer connections between the software components and less data redundancy. The SOA enterprise bus is the single, central communication platform that connects all software components (services), which leads to reduced costs for system maintenance.
• Economic Flexibility/Differentiation
The second important value proposition is the ability to easily adapt the infrastructure to business challenges. If the requirements of the business change, the IT infrastructure should allow for technical adaptations. Increased flexibility makes it possible to defend or to achieve a competitive advantage.
The delay in adapting IT infrastructures to business challenges has to be minimized. This goal is supported by SOA because the architecture is designed for flexibility. Normally, new business ideas are realized by implementing new business services, and business services are self-sustaining and do not have strong dependencies on each other. Therefore it is possible to develop new business services independently by using the technical basic services. This typically leads to shorter development times at lower costs.
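The complexity argument made above, that the number of point-to-point connections is a good indicator of system complexity, can be made concrete with a little arithmetic. The sketch below compares the worst case of fully meshed components with the case where each component needs only one connector to the service bus. Both formulas are simplifying assumptions for illustration: real landscapes are rarely fully meshed, and a bus adds its own operational complexity.

```python
def point_to_point_links(n: int) -> int:
    """Upper bound on direct links if every component talks to every other one."""
    return n * (n - 1) // 2


def bus_links(n: int) -> int:
    """With a central service bus, each component needs a single connector."""
    return n


for n in (5, 20, 100):
    print(n, point_to_point_links(n), bus_links(n))
# prints:
# 5 10 5
# 20 190 20
# 100 4950 100
```

The quadratic growth of the first column against the linear growth of the second is the intuition behind the expected reduction in maintenance costs.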

1.5 Case Study Union Investment

1.5.1 Introduction to the Use Case Company

Union Investment (UI) is tightly integrated into the co-operative sector where the local ‘Volks- und Raiffeisenbanken/Sparda-Banken’ serve the local clients (B2C)


and two central banks, ‘DZ Bank’ and ‘WGZ-Bank’, provide national and international services for the Volks- and Raiffeisenbanken. A group of companies (‘Verbundunternehmen’) is responsible for offering financial products: the case company Union Investment is responsible for funds, whereas Bausparkasse Schwäbisch Hall concentrates on real estate savings/loans, R+V on insurance, and so on. These companies are mainly owned by the central banks, which are themselves owned by local banks. Three computing centres (GAD eG, Fiducia IT AG and Sparda Datenverarbeitung (SDV)) provide IT services including banking applications for the local banks and their individual IT departments. The local banks (about 1,200 with 12,000 subsidiaries, more than 40,000 sales people and over 20 million clients) are independent in their decision to sell UI funds or products of competitors. Hence there is high pressure on UI to have a competitive product portfolio, an attractive fee structure and appropriate services concerning information, consulting/presales, after-sales services and so on. The co-operative sales structure of UI differs from that of other German or European asset managers (AM). The main difference is that UI sells its funds only through the local bank channel and does not actively approach clients inside or outside the co-operative sector. UI supports the local sales channel with massive public relations efforts (TV, newspapers, ...) for the strong UI brand. With EUR 137 billion in assets under management, UI ranks third in Germany (up from fifth place six years ago). Nearly all German AM offer depot services/client accounting for their customers. UI keeps around 7 million deposits for 4 million clients. 90% of the transactions originate from the B2B local bank channel; around 10% arrive through the direct/online channel from the client (these clients are also registered at a local bank and had contact with the local bank when they opened their account).
External B2B business processes like account opening or subscription/redemption of funds are key factors for UI. The fluctuating number of transactions over the day, week, month and year and the strong influence of the overall stock market require a very efficient back office to handle huge numbers of faxes and letters (up to 40,000 fax pages per day have been reached, plus similar numbers of letters). UI tries to maximize the share of online transactions and to use a straight-through-processing (STP) approach. The local banks use standard desktops and full-service back-office banking applications from their regional computing centre partners. Each computing centre typically runs different front- and back-office systems and uses different technological principles (3270, Java, HTML, etc.). Seamless integration into this external world is crucial for UI:
1. 90% of the transactions result from the B2B local bank side,
2. all required services have to be available at the local bank desktop and through web/internet front-ends,
3. highly standardized interfaces are needed to keep development and maintenance manageable.


Besides the external B2B interface there are also the following B2C and internal interfaces, which require the same or similar transactional services:
• UI call centre system for the agents servicing B2B and B2C calls,
• B2C web system for UI private clients.
In total there are six different front-ends, three from external computing centres and three owned by UI for B2B web, B2C web and the call centre. The core functionality as well as the core processes are implemented in the depot management system Daska. The satellite/external systems usually access the core system through service interfaces (see Section 3.2). In 1999/2000 the IT infrastructure of UI consisted of approximately 160 applications with more than 800 interfaces. Quite a number of these interfaces required intense orchestration. The boom of fund sales due to financial market developments and the success of new UI products increased the pressure on the internal IT department to handle ever increasing volumes. The management of IT-based business processes instead of single applications required higher adaptability and a more agile integration strategy. These basic conditions led to the strategic decision to introduce a central EAI platform. The most important objectives were:
• Faster adaptability to business process requirements
• Reduction of the complexity of the system architecture
• Increased interface quality
• Increased level of automation (STP)
• Decreased data replication efforts
• Decreased maintenance efforts

The steps UI followed to introduce the EAI platform are described in more detail below.

1.5.2 SOA Implementation at the Use Case Company

The SOA-based EAI infrastructure of UI was developed in a dedicated process. Every phase increased the importance of the EAI platform for UI.
• Pilot Study
The very first step was the analysis of existing interfaces and systems. Core business processes were identified and modelled in ARIS. 17 EAI tools were evaluated against a criteria catalogue that comprised criteria like ‘transaction security with two-phase commit’ and ‘universal tools for all integration levels’, to name just two. After the first check, two tools remained that fulfilled the core requirements of UI. After additional nice-to-have criteria were also considered, Vitria BusinessWare was selected.
• First service roll-outs
In the first EAI projects simple replication interfaces were implemented which transferred data from the depot management system into additional databases to


ensure fast transfer to web front-ends. This project served as the proof of concept of the EAI platform. Additionally, new processes and standards (development, deployment process, etc.) had to be developed and adapted. The first services were established in 2002. Through standard interfaces the Volks- and Raiffeisenbanken were given the opportunity to access the depot management system of UI, so that transactions and depot openings can now be processed. The process logic as well as the plausibility checks were implemented in the EAI. With this project UI used process integration within the middleware layer for the first time. Furthermore, the status of the EAI platform changed to that of a real online system.
• Architecture Redesign
With the subproject KOS (consolidation of on-line systems) in 2004/2005 the SOA was completely redesigned. This became necessary because new front-end systems and external interfaces were added to the service interfaces and new complex services had to be implemented. The requirements for the platform changed (a hundred times more requests, 24 × 7 availability, etc.). The goal was to bring real-time data from the legacy back-end system to all connected front-ends. As a major effect, UI got rid of the replicated databases and reduced the complexity within the EAI. The process logic was centralized, and the integration of new external systems is now a simplified task. Every direct access to the database was removed and replaced by a service call via CTG (CICS Transaction Gateway). The subproject was carried out in several steps:

Step I:
• Construction of the basic components of the SOA
• Implementation of 14 service methods
• Integration of the B2B front-end system into the SOA
• Integration of a database into the SOA

Step II:
• Extension of the SOA to approximately 6 services with 43 methods
• Integration of the B2C front-end system into the SOA
• Integration of the call centre front-ends into the SOA

Step III/IV:
• Merging of the B2B and B2C front-ends into a new, consolidated front-end system
• Extension of the SOA to 12 services with 105 methods
• Integration of the VR banks into the SOA

• New Architecture

The SOA is now divided into two functional blocks. The business services, on the one hand, are structured into customer services and deposit services. The second


block encapsulates all back-ends within a technical service (every back-end has its own service implementation, e.g. the Oracle service). The business layer communicates with the back-end systems through the technical layer, which is transparent for the business layer and the front-ends. So UI is able to change the back-end systems without any implications for the front-ends. The methods within the services are classified into process methods and basic methods. Process methods encapsulate the business process and control the orchestration of one or more basic methods. The integration of a front-end system is based on process methods. The basic methods are rather lightweight and reusable; they are used by the process methods or for internal functions. All methods are administered through the service repository (SR). The SR controls the status of every method (online/offline) and the dependencies between process and basic methods. In addition, the SR implements an access control list for all known front-ends, which restricts every front-end to a specific set of methods. The heart of the SOA is the service engine. Since the consolidation, the legally binding data (the ‘juridical continuance’) is accessed on the host and no longer on replicated databases; hence an abstraction layer was implemented through CICS services. All requests from the SOA to the host are handed over to CICS via the CICS Transaction Gateway and are answered by the appropriate programs. This makes it possible to use the whole logic in the host and to implement new logic without integrating SQL statements or business logic into the SOA. Communication with the outside world is done via XML structures. The goal of the service engine (SE) was to make the interface between CICS and XML a matter of design rather than of implementing the mappings by hand. The SE implements the technical error handling and an XML-based mapping description which is used to transform XML into CICS data structures and vice versa at runtime.
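The idea of a mapping description that drives the XML-to-host transformation at runtime can be sketched as follows. The chapter does not show the format UI actually uses, so everything here is an assumption made for illustration: the mapping is a plain list of (element name, field width) pairs, the host record is a fixed-width text string, and the names `ORDER_MAPPING`, `xml_to_record` and `record_to_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping description: XML element name -> field width in the host record.
ORDER_MAPPING = [("depot", 10), ("fund", 12), ("units", 8)]


def xml_to_record(xml_text: str, mapping) -> str:
    """Build a fixed-width record from an XML request using the mapping."""
    root = ET.fromstring(xml_text)
    parts = []
    for name, width in mapping:
        value = root.findtext(name, default="")
        if len(value) > width:
            raise ValueError(f"value for {name!r} exceeds {width} characters")
        parts.append(value.ljust(width))
    return "".join(parts)


def record_to_xml(record: str, mapping, root_tag: str = "reply") -> str:
    """Cut a fixed-width record back into XML elements using the same mapping."""
    root = ET.Element(root_tag)
    pos = 0
    for name, width in mapping:
        ET.SubElement(root, name).text = record[pos:pos + width].rstrip()
        pos += width
    return ET.tostring(root, encoding="unicode")


request = "<order><depot>D1</depot><fund>FUND-A</fund><units>5</units></order>"
record = xml_to_record(request, ORDER_MAPPING)
print(len(record))  # prints 30
```

Adding a field to a service then means adding one entry to the mapping rather than writing new transformation code, which is the kind of effort the configuration-driven approach is meant to save.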
This mapping description reduces the implementation costs for a new service. The platform supports on-the-fly deployment of service configurations and cyclically detects changes within the service repository. All requests and replies are logged in a database. Approximately 200,000 requests per day, such as buy/sell orders for funds, are processed by the platform. The new architecture consists of the following components:
a. Service adaptor: Converts client-specific requests into internal requests; supported technologies include HTTP, MQ-Series, etc.
b. Dispatcher: Handles the processing of the requests and uses the service repository. The dispatcher is part of the enterprise service layer and of the system integration layer.
c. Service repository: Contains information about services (name, type, access rights, etc.).
d. Enterprise service: Enterprise services contain the business logic and use integration services to access the legacy systems.
e. Integration service: Handles the access to the legacy systems.
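How a dispatcher and a service repository with method status and access control lists might cooperate is sketched below. This is a minimal illustration under stated assumptions, not the UI implementation: the class and method names, the front-end identifiers and the fixed fund price are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class MethodEntry:
    handler: Callable[[dict], dict]
    online: bool = True


class ServiceRepository:
    """Keeps method status (online/offline) and a per-front-end access list."""

    def __init__(self):
        self._methods: Dict[str, MethodEntry] = {}
        self._acl: Dict[str, Set[str]] = {}

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        self._methods[name] = MethodEntry(handler)

    def allow(self, front_end: str, method: str) -> None:
        self._acl.setdefault(front_end, set()).add(method)

    def set_online(self, name: str, online: bool) -> None:
        self._methods[name].online = online

    def lookup(self, front_end: str, method: str) -> Callable[[dict], dict]:
        if method not in self._acl.get(front_end, set()):
            raise PermissionError(f"{front_end} may not call {method}")
        entry = self._methods[method]
        if not entry.online:
            raise RuntimeError(f"{method} is offline")
        return entry.handler


class Dispatcher:
    """Routes an internal request to the method found in the repository."""

    def __init__(self, repository: ServiceRepository):
        self._repository = repository

    def dispatch(self, front_end: str, method: str, request: dict) -> dict:
        return self._repository.lookup(front_end, method)(request)


def get_price(request: dict) -> dict:
    """Basic method; returns a fixed price purely for the sketch."""
    return {"price": 100}


def buy_fund(request: dict) -> dict:
    """Process method that orchestrates a basic method."""
    price = get_price({"fund": request["fund"]})["price"]
    return {"status": "accepted", "amount": price * request["units"]}


repository = ServiceRepository()
repository.register("buy_fund", buy_fund)
repository.allow("b2b_web", "buy_fund")
dispatcher = Dispatcher(repository)
```

A service adaptor would sit in front of `dispatch` and translate HTTP or message-queue requests into the internal `dict` form; it is left out here to keep the sketch short.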

1.6 Discussion

To illustrate which benefits of the SOA approach were realized at UI, they are discussed in more detail below.
• Flexibility/Loose Coupling/Agility
The main advantage of the new SOA for UI is the elimination of replication interfaces. This avoids data inconsistency and improves quality and response times. The business logic is decoupled and located only at the host, whereas it was distributed among several layers prior to the SOA introduction. In principle, changes of business logic and/or technical services should have become easier, faster and cheaper. Choosing the correct service for adaptations is still not always straightforward because of the high number of existing services. Once the service to revise has been found, most adaptations can be made by changing the configuration of its description, in contrast to changing the Java code itself as before. As a consequence these changes no longer require a compilation of the service implementation, and they can be performed by staff other than IT developers. Loose coupling of the components was required for the orchestration of the services, but it is also part of the strategy of UI to improve the stability and reliability of the whole system. Most of the services (business and technical) depend on additional technical services. So far, it is not fully transparent whether these dependencies are really necessary or could at least be minimised by restructuring the technical service layer itself. These dependencies lead to complex release planning. As for the reliability of the whole infrastructure, availability increased after the SOA introduction. However, risk increased as well: with the service repository/service dispatcher the architecture now has a new single point of failure. Prior to the introduction of the SOA, redundancy avoided single points of failure. For critical components additional investments in fallback solutions and load balancing became necessary.
One surprising result of the new architecture is performance. As XML is used intensively in the new system (which entails a lot of serialisation, de-serialisation and transformation effort for this text-based format), lower performance was expected. Surprisingly, the performance was ten times higher than in the old system.
• Application Integration
With the introduction of SOA seven applications have been consolidated and integrated. It became easier to build new application layers consisting of composite applications. Because of independent efforts in enterprise application integration, a prototype had already been established before the full roll-out of the final solution. Hence, these independent strategies fitted together nicely and alternatives for application integration were not discussed.


• Complexity Reduction
The centralisation of the business logic in the depot system reduced the interface of the SOA to its core capabilities. Multiple development efforts for ensuring business plausibility were no longer needed. Inside the SOA, technical aspects and functions are wrapped and can be orchestrated through business processes. Hence, the complexity of the technical services and business plausibility is hidden from the user. It is now possible to add new front-ends and new services at minimal cost. On the other hand, additional effort for coordination and testing is needed. Modifications to already installed and utilised services or basic functionalities are sensitive. Therefore, advance planning prior to modifications and complex tests afterwards became necessary and have to be coordinated by a central release planning. The total number of software components has more or less remained the same.
• Reuse
When a large number of software projects use the same platform, reusability of components becomes an important issue. Every project can profit from the availability of service tools and service components. The first step for every new EAI project is to examine whether the existing components can be utilised at different points or whether different components/services have to be implemented. Analyses at the beginning of the SOA introduction showed that reusing the already installed software components from the old system was not suitable. Hence most functionality has been developed from scratch; one exception was the communication interfaces to the B2B partners. The newly developed components will be easily reusable in the further development of the system. In recent years many components, modules and service tools have been created which accelerate the development of new projects.
Some of these components are:
• error processing: centralized, data-based error handling,
• deployment operations: fully automated, staging (E-T-A-P) support,
• data base XSL transformer: centralized service, can read XSLT documents dynamically from data base or file system,
• ...
In the beginning the re-use factor was rather low and limited to some source code or parts of the framework. With the introduction of basic functions and the SOA, the ratio of ‘newly created software parts’ to ‘re-used software parts’ improved a lot. Nearly 80% of the basic services can be re-used without customization. The rest is split into basic services where minor adaptations are necessary and a small share of new functions and services.
• Economic Aspects
Besides the technical observations, economic ones can be illustrated.
• Costs: In the case of UI the direct costs of the EAI system have not been reduced by introducing the SOA, but the SOA made it possible to consolidate the front-end systems, where the cost reduction was significant.


UI noticed that developing new functionality through services is not cheaper than adding new functionality to the old architecture. The pure implementation effort for functionality as a service is lower, but there is increased planning effort because services should be reusable and must conform to the overall SOA strategy. Cost reduction is possible both for operating and for maintaining the system, but maintenance became more complex in terms of dependencies on reused services.
• Economic Flexibility/Differentiation: As stated above, it turned out that the SOA itself is more flexible than the old solution, but it was also mentioned that this flexibility is hardly helpful for UI today. At the moment the time needed for implementing new functionality is nearly the same as it was prior to the introduction. This is because major changes of the functionality in the back-end system often also require changes in the front-ends. Unfortunately these front-ends are administered by UI partners; they are neither controlled by UI nor part of the SOA structure. With the new SOA, users of the system got direct access to real-time data and no longer have to work with replicated data. This increases user satisfaction. Because of the separation between business-oriented services and technical services it is easier to locate which services should be changed or adapted if the business requires this. It is often possible to change only the configuration of a business service instead of changing the implementation by means of programming. A good example of the new flexibility is the implementation of an OCR system which uses existing SOA services like name search or ordering. For the new technology only a few parameters had to be added to the services.
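The fallback solutions mentioned in the discussion of loose coupling rest on the property that stateless services receive everything they need with each request, so an equivalent instance can answer in place of a failed one. The sketch below shows the idea in its simplest form. The function and exception names are hypothetical, and the sketch assumes that a failed call had no effect on the back-end, which would have to be verified for requests such as orders before retrying them.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no equivalent service instance could answer."""


def call_with_fallback(
    replicas: Sequence[Callable[[dict], dict]], request: dict
) -> dict:
    """Try equivalent, stateless service instances in order until one answers."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure and try the next instance
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed")


def primary(request: dict) -> dict:
    raise ConnectionError("primary instance down")  # simulated outage


def standby(request: dict) -> dict:
    return {"depot": request["depot"], "status": "ok"}


print(call_with_fallback([primary, standby], {"depot": "D1"}))
# prints {'depot': 'D1', 'status': 'ok'}
```

Because the request carries all parameters, `standby` needs no knowledge of what `primary` did before it failed; a stateful service would have to transfer session data first.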

1.7 Conclusion

Business requirements drive the requirements of the IT infrastructure. A modern IT infrastructure has to support business processes. In a dynamic environment the IT architecture has to be prepared for continuously changing processes. One of the key success factors of today’s infrastructures is flexibility. The new development paradigm is to assemble new applications from existing components (composite applications). Programmers create services, but architects and business analysts orchestrate services to implement business processes. Therefore there are two closely intertwined levels: the business level and the IT level. In the case of UI the main goals for introducing a new architecture were
• reduction of costs,
• easier maintenance of the system,
• timeliness of data, and
• elimination of redundancy.

The introduced SOA is fast and flexible, whereas the direct costs remained the same; cost reduction was achieved by using the SOA for front-end consolidation.

Heiko Paoli et al.

The strategic decision to introduce a central EAI platform was mainly driven by requirements like agility and a high STP rate. Vendors of EAI/ESB systems typically promise fast deployment and high return on investment (ROI). UI realized that this only holds true for either build-from-scratch or very simple infrastructures where no legacy systems etc. have to be integrated. Even there, the result still seems vague. SOA success requires time and expertise. Whether an economic analysis of the business case turns out positive or negative depends heavily on the technical situation (such as the existing infrastructure with respect to the number of interfaces, STP requirements, BPO and so on) and the individual framework of the company (e. g. with respect to existing contracts). In a more or less static or simple environment where changes are rare, it seems hard to calculate a positive business case. SOA fits perfectly into complex infrastructures with time-dependent linked interfaces and a high rate of change. In such infrastructures, data is often replicated and (nearly) identical business services are developed and maintained several times. Minimizing data replication and reusing services will result in cost reductions. Technical challenges have to some extent been transferred into business challenges: management and planning of releases and testing became more complex, and the flexibility gained is not fully used because of external dependencies on the business partners. The new architecture allows more configuration and needs less programming than the old architecture, but a developer is still needed to change the configuration. The new architecture is of more or less the same complexity as the old one; a reduction in complexity could not be achieved. But even if the total complexity is the same, the overall structure has been improved.
The old structure had grown historically and consisted of a number of completely different systems together with a lot of workarounds (spaghetti pattern). Now the different subsystems form an integrated structure and follow a common idea. This leads to an interesting insight into a psychological side effect of SOA: the employees ‘like to work’ with the new architecture. Development of the old system often meant doing boring and unpleasant work, whereas further development of the SOA tends to be interesting and motivating. All in all, the introduction of SOA was a success, and the new system provides more flexibility, higher performance and reliability at the same cost. From the perspective of Union Investment, the possibility of cost reduction alone is not the main reason for introducing a SOA. To further increase the momentum of the new technology at UI and in general, continuous improvement is needed with respect to technical services and also with respect to organisational aspects. These areas are themselves still subjects of scientific research. Two of the most important questions might be how to divide new functionality into appropriate services and how to innovate new services at all.

1 SOA in the Financial Industry – Technology Impact in Companies’ Practice

• How ‘big’ should a typical service be? What are the rules for designing a service? Today this question is answered only by rules of thumb, and answering it surely needs more experience with SOA. In practice, most implementations tend to design services that mirror the business process. Therefore the driver of service granularity is the business rather than some technical aspect. It makes no sense to have artificial rules such as a maximum number of methods a single service may have before it is split. To decide how requirements could or should be divided into services and/or orchestrated, a detailed analysis and examination is recommended. First, it should be ascertained which requirements the business process has (top-down approach); second, it has to be decided how these requirements can best be provided by business and technical services (bottom-up approach). At this point it also has to be checked whether services should be reused by already installed and integrated systems. Today’s guidelines at UI prefer thin business services with a high reuse rate.
• What can, or even should, service innovation look like? This process surely needs strong communication between a number of organisational units, such as business departments and system development. SOA scenarios share a problem with most software development processes, namely that business experts and system experts have to be more closely connected, and that methods, tools and languages have to be available that are easy for both to understand but powerful enough to be of real help. So far there is no continuous innovation cycle at UI; there are no regularly planned milestones for checking and refactoring deployed services.
SOA has a high potential for improving companies’ infrastructures. The momentum that can be reached surely depends on how well the different internal stakeholder groups can be supported and how well technical and economic questions can be addressed simultaneously.

CHAPTER 2 Information Systems and IT Architectures for Securities Trading
Feras Dabous, Fethi Rabhi

2.1 Introduction

Electronic finance (e-finance) is mainly concerned with the automation of traditional activities in the finance domain and their associated processes. In particular, trading has become more and more a global activity because of recent technological developments that have facilitated the instantaneous exchange of information, securities and funds worldwide. Trading decisions have also become more complex as they involve the cooperation of different participants interacting across wide geographical boundaries and different time zones. Although traditional communication media (such as phone, email and fax) are steadily being replaced by automated systems, there are still many areas that require human intervention. The evolution of IT middleware and integration technologies (particularly those based on the Internet) is providing many opportunities for addressing such gaps (Banks 2001). One of the first challenges for software architects and IT professionals is acquiring sufficient knowledge about existing practices, business processes, systems and architectures. This chapter should primarily be regarded as a contribution to the understanding of the domain of capital markets trading, illustrating some of its associated architectures, technologies and systems. The remainder of this chapter is organized as follows. Section 2.2 introduces financial markets in general and reviews the concepts of capital market structures, trading instruments, and trading order types. Section 2.3 discusses the information flow across the trading cycle in a capital market. It also addresses the different types of information systems that are in use across such a cycle. Section 2.4 takes a closer look at the software architecture of two selected systems. Section 2.5 discusses an integration model that illustrates the different levels of abstraction currently being used in capital markets information systems. Section 2.6 concludes this chapter.

2.2 Financial Markets: Introduction and Terminology

2.2.1 Introduction to Financial Markets
Financial markets have the role of facilitating the movement of funds among people and organizations. The surplus units (e. g. fund suppliers) seek to generate returns (profits) by supplying their funds to the deficit units (e. g. borrowers). Financial markets have evolved in a way that enhances the efficiency with which trades take place and maximizes the fairness of trading in order to attract investors. The evolution of financial markets is a continuous process in response to changing needs, new technologies and changes in public policy. Financial markets are constantly introducing new financial instruments with various levels of risk management, which give market participants more confidence to conduct trades. Financial markets are classified into primary and secondary markets. The former is used by deficit units such as businesses, governments and households to raise funds for expenditure on goods, services and assets. The latter is used by surplus units such as investors to trade what has been bought in the primary market. Financial markets are also classified into money markets and capital markets (Viney 2000). Money markets bring market participants together to trade market supported instruments offering high levels of liquidity. Australian money market participants include banks, superannuation funds, money market corporations, fund managers, building societies, credit unions, cash management trusts and companies. Money market instruments include exchange settlement account funds, treasury notes, intercompany loans, interbank loans, etc. Capital markets, on the other hand, comprise both a primary and a secondary capital market. Companies that need to raise capital by offering new stocks use the primary market. These new stocks can include initial public offerings (IPOs), rights issues, placements and company options.
Once those newly issued stocks are sold in the primary market, they can continue to be traded through the secondary market, called the stock exchange market. Capital market participants include traders (e. g. investors), intermediaries (e. g. brokers), statutory authorities (e. g. securities commissions) and third parties such as commercial information vendors (e. g. Bloomberg or Reuters). The market structure is determined by the trading rules and trading systems used by that market. These rules and systems determine who can trade, what to trade, when, where and how they can trade. They also determine rules governing market information dissemination to traders and other market participants (Harris 2003). Understanding market regulations, mainly trading rules and systems, is crucial for conducting trades. Normally, trading strategies that are applied to one marketplace

are not applicable to most other markets, which have different regulations and structures. Such differences are motivated by differing views and research outcomes on how to achieve fairness and efficiency in a marketplace. A framework for market characterization has been defined in (Aitken et al. 1997). The framework has six crucial elements: technology, regulation, information, participants, instruments and trading protocol. It also presents four characteristics that determine market fairness and efficiency: liquidity, transparency, volatility and transaction costs. Market organizers have the role of finding the combination of the six elements that achieves the required levels of fairness and efficiency. In the remainder of this section, we illustrate key elements of market microstructure: trading instruments, types of trading orders, and types of market structure.

2.2.2 Capital Markets Trading Instruments
Capital markets have evolved by introducing a number of trading instruments that diversify investment opportunities and attractiveness. These instruments can accommodate different trading styles for investors. They range from trading physical stocks (i. e. shares) to contracts on the willingness to trade stocks on a future date at a price determined today.

2.2.3 Equity Market
The equity market, commonly known as the share market, plays the role of both primary and secondary capital market. It is the place where financial assets are bought and sold. These assets can be new stocks issued on the primary market or existing stocks that are currently traded on the secondary market. Secondary market transactions do not directly affect the cash flow of the corporation whose shares are being traded. The cash flow of that corporation is directly affected when it issues new stocks on the primary market. However, it has been suggested that the existence of a well-developed secondary market is of great significance to a corporation that may be seeking to raise capital in the primary market (Viney 2000). The existence of an active, liquid, well-organized market in existing shares adds to the attractiveness of acquiring new shares once they are advertised in the market. Figure 2.1 (based on (Viney 2000)) summarizes the flow of capital and ownership rights of stocks in secondary capital markets in Australia and shows the participants’ roles in that process.

Figure 2.1. Roles of secondary capital market participants and the flow of funds in Australia

2.2.4 Derivatives Market
The derivatives market provides instruments that focus on risk management (interest rate, foreign exchange and price risks). These instruments, such as options and futures, derive their value from the underlying security in the equity market or, for index-based derivatives, from the value of a specified index. The unit of derivative trading is a contract. Once the contract is written, it can be traded on the derivatives market at a price that depends on the movement of the underlying stock price or the underlying index value. Each contract is a variant of an obligation, or a right without obligation, to trade (sell/buy) the underlying stocks at a predetermined price before or on a predetermined date, at some cost or at no cost. Derivatives investors may be obliged to pay or receive margins on a timely basis during the life of the contract. Organized derivatives can also be based on an equity market index (index derivatives), which gives the investor exposure to a certain market index. The advantage here is exposure to the broad range of securities comprising an index rather than to one particular company. Index derivatives are normally cash settled.
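The difference between an obligation and a right without obligation can be made concrete by looking at the value of a contract at its predetermined date. The following is a minimal sketch, assuming contracts that can only be exercised on that date and ignoring premiums, margins and transaction costs; the function names are illustrative and do not come from any particular system.

```python
def call_payoff(spot: float, strike: float) -> float:
    """Value at expiry of the right (not the obligation) to buy at the strike price."""
    # The holder exercises only if the market price is above the strike.
    return max(spot - strike, 0.0)


def put_payoff(spot: float, strike: float) -> float:
    """Value at expiry of the right (not the obligation) to sell at the strike price."""
    # The holder exercises only if the market price is below the strike.
    return max(strike - spot, 0.0)


def long_futures_payoff(spot: float, agreed_price: float) -> float:
    """Value at expiry of the obligation to buy at the agreed price."""
    # An obligation cannot be walked away from, so losses are possible.
    return spot - agreed_price


for spot in (8.0, 10.0, 12.0):
    print(spot, call_payoff(spot, 10.0), put_payoff(spot, 10.0),
          long_futures_payoff(spot, 10.0))
```

The option payoffs are never negative because the holder can let the contract lapse, whereas the futures position gains or loses symmetrically with the underlying price.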

2.2.5 Types of Orders
There are a number of standardized orders that investors can use to inform their brokers about the order specification, and/or that traders or brokers can use to inform the market about the intention to sell (ask order) or buy (bid order). Some markets may not support certain types of orders. In this case, brokers may offer them as services to their clients. Brokers may also support a more complicated order specification service. In both cases, the broker service implies the execution of the order as soon as all the requirements and conditions set by the trader are satisfied. Harris (2003) presents a number of common order types, some of which are:

• Market order is an instruction to trade at the best currently available price. This type of order is traded immediately, but the trading price is not fixed before trading.
• Limit order is an instruction to trade at a specified price or better. The price of a bid limit order is normally equal to or less than the current best bid price, while the price of an ask limit order is normally equal to or greater than the current best ask price. This order is only traded when the market price moves towards it.
• Any order can carry additional information such as validity, expiration, etc. Some of the validity and expiry instructions are: open-order, day-order, good-this-week, good-this-month, good-until, immediate-or-cancel, fill-or-kill, good-after, market-on-open, market-on-close. Quantity instructions may also be included, such as all-or-nothing or minimum-or-none orders. Other instructions may also be attached, such as hidden-volume (undisclosed) orders and substitution orders.
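The order types listed above can be sketched as a small data structure. This is an illustrative model only, assuming a single validity instruction per order; the class names, field names and the helper is_marketable are hypothetical and are not taken from Harris (2003) or from any exchange interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Side(Enum):
    BID = "bid"  # intention to buy
    ASK = "ask"  # intention to sell


class OrderType(Enum):
    MARKET = "market"
    LIMIT = "limit"


@dataclass(frozen=True)
class Order:
    side: Side
    order_type: OrderType
    quantity: int
    limit_price: Optional[float] = None  # only meaningful for limit orders
    validity: str = "day-order"          # e.g. "good-until", "fill-or-kill"

    def __post_init__(self) -> None:
        if self.order_type is OrderType.LIMIT and self.limit_price is None:
            raise ValueError("a limit order needs a limit price")


def is_marketable(order: Order, best_bid: float, best_ask: float) -> bool:
    """Return True if the order could trade immediately against the best quotes."""
    if order.order_type is OrderType.MARKET:
        return True  # trades at the best available price, whatever it is
    if order.side is Side.BID:
        return order.limit_price >= best_ask  # buyer accepts the best ask
    return order.limit_price <= best_bid      # seller accepts the best bid
```

A bid limit order priced below the best ask is not marketable under this sketch; it waits until the market price moves towards it, as described for limit orders above.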

2.2.6 Classifications of Marketplaces
There are two main types of trading sessions: continuous and call. A continuous session allows brokers and traders to place orders at any time during the trading session, and the exchange executes trades as soon as a bid/ask match occurs. In a call session, by contrast, brokers and traders can also place orders at any time during the session, but only until the market is called; the accumulated orders are matched when the market is called. A classification of market microstructures (Harris 2003) shows the following types of markets:
• Quote-driven dealer markets: in this market type, the dealers participate in each trade by quoting bid and ask prices when requested by brokers or traders. Traders request quotes from and negotiate with a number of the security’s dealers and normally select the dealer with the best quote that suits their order. If the submitted order is not matched immediately, it is processed according to its type and its routing instructions; for instance, it may be cancelled or may wait as an outstanding order. Matching of outstanding orders can occur through a negotiation process in which the dealer moves his offered price to match the outstanding order and/or the trader updates his order price to match the dealer’s price. NASDAQ, the London Stock Exchange (LSE) and the Reuters200 foreign exchange trading system are examples in which the quote-driven market is the dominant structure, organized by a dealer association and an exchange.
• Order-driven (auction) markets: these include oral auctions (open outcry), single-priced auctions, continuous rule-based two-sided auctions and crossing networks. In this type of market, trader orders for any particular security are centralized in a single orderbook where buyers seek the lowest price and sellers seek the highest price. An order-driven market uses order precedence rules to match buyers to sellers and trade pricing

rules to price the resulting trades. The primary order precedence rule in all markets is the order price, where higher bids and lower asks have higher priority for matching. Other precedence rules vary between markets and are used when a number of orders have the same price. These rules include placement time, where earlier submitted orders have higher priority; order size, where smaller or larger order volume has higher priority; public order precedence, where public traders have higher priority than floor traders; and display precedence, where disclosed orders have higher priority than undisclosed orders. Continuous auction markets, which maintain a real-time outstanding orderbook for all securities, are used in most stock exchanges, such as the ASX continuous trading session.
• Brokered markets: in such markets, brokers take over the exchange’s order-matching role under certain circumstances, which is normally called crossing. The brokers’ role is to search for one or more counterparty traders for a trading request. This normally occurs for very large trading orders, which are called block orders. In a dealer market, the dealer sometimes refuses to conduct risky trades because of expected abnormal price movements or because the security is illiquid. In auction markets, block orders can have an impact on the market price, especially if the security is illiquid, which may cause the market price to move away from the quoted price before the block order is matched. The ASX allows brokers under some conditions to arrange trades for block orders through in-market and off-market crossing trading rules.
• Hybrid markets: these combine characteristics of the previous three market types. Normally, the structural features of one market type are dominant, while some features of other market types are embedded. The NYSE, which is classified as an order-driven market, requires its specialist dealers to offer liquidity if no one else does. The NASDAQ, which is classified as a quote-driven dealer market, requires its dealers to display or execute limit orders.
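The order precedence rules of an order-driven market can be illustrated with a small orderbook for a single security. The sketch below assumes only two precedence rules, price first and placement time second, and it assumes that each trade is priced at the price of the resting order; real markets apply further rules (size, public order and display precedence) and their own trade pricing rules. The class OrderBook and its methods are hypothetical names.

```python
import heapq
import itertools
from typing import List, Tuple

Trade = Tuple[float, int]  # (trade price, traded quantity)


class OrderBook:
    """Limit orderbook for one security with price-time precedence."""

    def __init__(self) -> None:
        # Heap entries are (key, sequence number, remaining quantity).
        # Bids use key = -price so that the highest bid is at the top;
        # asks use key = price so that the lowest ask is at the top.
        self._bids: List[Tuple[float, int, int]] = []
        self._asks: List[Tuple[float, int, int]] = []
        self._seq = itertools.count()  # earlier orders get smaller numbers

    def submit(self, side: str, price: float, quantity: int) -> List[Trade]:
        if side not in ("bid", "ask"):
            raise ValueError("side must be 'bid' or 'ask'")
        is_bid = side == "bid"
        own, opposite = (self._bids, self._asks) if is_bid else (self._asks, self._bids)
        trades: List[Trade] = []
        while quantity > 0 and opposite:
            key, seq, resting_qty = opposite[0]
            resting_price = key if is_bid else -key
            crosses = price >= resting_price if is_bid else price <= resting_price
            if not crosses:
                break
            traded = min(quantity, resting_qty)
            trades.append((resting_price, traded))
            quantity -= traded
            if traded == resting_qty:
                heapq.heappop(opposite)  # resting order fully filled
            else:
                # Partially filled: keep its place in the queue.
                heapq.heapreplace(opposite, (key, seq, resting_qty - traded))
        if quantity > 0:
            # The unmatched remainder becomes an outstanding order.
            heapq.heappush(own, (-price if is_bid else price, next(self._seq), quantity))
        return trades
```

Under these assumptions, two asks at the same price are filled in the order in which they were placed, and a better-priced ask is always filled before a worse-priced one regardless of its placement time.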

2.3 Information Systems for Securities Trading

This section describes the information flow across the securities trading process. It also illustrates the different categories of information systems that can be utilized across such a process.

2.3.1 Securities Trading Information Flow
Market information includes real-time trading data and events such as orders, trades, quotes, indexes and announcements. It also includes historic information such as past trading data and companies’ periodic reports. Additional statistical data and analytics may also be available; however, participants such as brokers

Figure 2.2. Flow of trading information within the trading cycle

can generate their own analytics based on raw historic market data. As mentioned earlier, the dissemination of market information is crucial to the fairness and efficiency of a capital market; it therefore contributes to the market’s attractiveness to traders and encourages companies to list on it. Figure 2.2 illustrates the overall information flow across the trading processes for an order-driven market structure, derived from the way the Australian Stock Exchange (ASX) (www.asx.com.au) operates. It takes the shape of a continuous cycle in which we identify five phases: pre-trade analytics, trading, post-trade analytics, settlement and registry. Each trading phase is described below.
2.3.1.1 Pre-trade Analytics Phase
The trading cycle starts with a pre-trade analytics phase that is normally conducted by brokerage firms. In this phase, analysts use real-time information (trades, orders, volume, etc.) and correlate it with company announcements and historic information (including periodic company reports) to predict price movements with a degree of certainty. The purpose of this phase is to support buy or sell decisions in order to gain capital profit or prevent capital loss, respectively. Most brokers provide the outcome of their analysis as a service to their clients (investors), normally for extra fees. Investors can seek historic information, delayed real-time market information and announcements from the

capital market web site or from third parties such as commercial information vendors, e. g. Bloomberg (Bloomberg.com) and Reuters (www.reuters.com). These services are offered free or for a fee that depends on the level of detail and real-time accuracy.
2.3.1.2 Trading Phase
The trading phase basically implements a market structure as discussed in the previous section. Buyers and sellers both signal their intentions in the form of orders (the different types of orders were also discussed earlier), and different markets have different methods for determining trades depending on the order type, the market structure and the degree of automation of the trade process. At the heart of the trading system is the trading engine (TE).
2.3.1.3 Post-trade Analytics Phase
A capital market attracts investors by maximizing the fairness and efficiency of trading. Post-trade analytics focuses on detecting illegal trading behavior based on the available real-time and historic market information. Post-trade analysis can also use this information for other purposes, such as analyzing past and current market behavior and trading patterns and predicting future behavior. A capital market surveillance department is an example of a place where post-trade analytics is conducted. The major responsibility of a surveillance department is to detect illegal trading incidents and to generate real-time alerts for such incidents. A surveillance department must have real-time access to every piece of market information across all stages of the trading process, such as order and trading data, settlement and registry databases, and possibly brokerage firms’ databases and investors’ holdings and personal data. Because of the information privacy measures that are enforced, surveillance departments may not be able to access some of this information. Instead, they partially cope by implementing intelligent techniques to detect illegal trading behavior. Because of the heuristic nature of these techniques, surveillance is in constant need of improvement.
2.3.1.4 Settlement and Registry Phases
Settlement and registry is a transaction-based process for transferring securities from the seller’s profile to the buyer’s profile and transferring money from the buyer’s account to the seller’s account. The transfer of both securities and money is simultaneous and irrevocable once the settlement process has started. The registry phase completes the settlement process by changing the ownership of shares in the company’s registry from the seller to the buyer. A number of brokers may be involved in a single trade; this happens, for example, when one large bid order in the equity market is matched with a number of smaller ask orders. The settlement process is complicated because it involves transferring money and ownership of securities between the different parties involved.
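The requirement that securities and money move simultaneously can be sketched as an all-or-nothing transfer between two parties. This is a simplified, single-process illustration under the assumption that balances are held in memory; it does not model brokers, a depository or the registry update, and the names Participant, settle and SettlementError are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


class SettlementError(Exception):
    """Raised when a trade cannot be settled; no balance is changed."""


@dataclass
class Participant:
    cash: float
    holdings: Dict[str, int] = field(default_factory=dict)


def settle(buyer: Participant, seller: Participant,
           security: str, quantity: int, price: float) -> None:
    """Transfer securities to the buyer and money to the seller together."""
    value = quantity * price
    # Check both legs before changing anything, so that either both
    # transfers happen or neither does.
    if seller.holdings.get(security, 0) < quantity:
        raise SettlementError("seller does not hold enough securities")
    if buyer.cash < value:
        raise SettlementError("buyer does not hold enough cash")
    seller.holdings[security] -= quantity
    buyer.holdings[security] = buyer.holdings.get(security, 0) + quantity
    buyer.cash -= value
    seller.cash += value
```

A trade in which one large bid is matched with several smaller asks would, under this sketch, be settled as one call per matched ask, each against a different seller.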

2.3.2 Categories of Capital Markets’ Information Systems
We now take a close look at the different types of computer-based systems that participate in the securities trading process. These systems include those that facilitate information dissemination to market participants and contribute to the market’s transparency. Other systems facilitate order routing to and from the exchange and across the trading cycle. Notably, each phase in the trading cycle is supported by one or more computer information systems. In the following classification (based on (Harris 2003)), one system may perform tasks related to more than one category.
2.3.2.1 Information Collection and Distribution Systems
These systems have replaced the clerks (market reporters) who used to observe and record all trading activities. Information collection includes recording all market activities such as orders, cancellations, amendments, trades, quotes and announcements. Most exchanges provide real-time market feeds that can be accessed by outside systems. The distribution of market information is directly related to the market’s transparency. Distribution can occur on a real-time or delayed basis, governed by market regulations and the type of requester. Most markets provide delayed quotes and companies’ historic information as a free service to the public. Real-time services are only available to registered members such as brokers and data vendors. Data vendors, such as Bloomberg and Reuters, often support query and broadcast services for their customers. Query services provide information on demand, while broadcast services provide continuous streams of information as it is received from the participating exchange. Broadcast services include trade reports (ticker tape) and/or quotes (quotation feeds). A broadcast service of delayed quotes is normally available free on the watch board of the market, while query services for delayed quotes are also available for free on the Web sites of markets or data vendors.
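The difference between query and broadcast services can be sketched with a small in-memory service. This is a hypothetical illustration, not the interface of any data vendor: query returns the latest known quote on demand, while subscribers registered for a symbol are called back whenever a new quote is published.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Quote = Dict[str, float]                 # e.g. {"bid": 9.9, "ask": 10.1}
Listener = Callable[[str, Quote], None]  # called with (symbol, quote)


class MarketDataService:
    def __init__(self) -> None:
        self._latest: Dict[str, Quote] = {}
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, symbol: str, listener: Listener) -> None:
        """Broadcast service: register to receive every future quote."""
        self._listeners[symbol].append(listener)

    def publish(self, symbol: str, quote: Quote) -> None:
        """Called when a new quote arrives from the exchange feed."""
        self._latest[symbol] = quote
        for listener in self._listeners[symbol]:
            listener(symbol, quote)

    def query(self, symbol: str) -> Optional[Quote]:
        """Query service: return the latest quote on demand, if any."""
        return self._latest.get(symbol)
```

A delayed service could be modelled on top of this sketch by holding quotes back for a fixed interval before calling publish.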
2.3.2.2 Trading Engines
The trading engine (TE) is at the heart of an automated trading system. The main function of the TE is to match bid/ask orders using the order precedence rules set by market regulation. In an order-driven market such as the Australian Stock Exchange (ASX), the TE implements a number of bid/ask order-matching algorithms that depend on the security concerned, the current trading phase (opening, continuous, closing etc.) and the type of order. Once a security’s best bid price matches its best ask price, an irreversible trade is executed. In-house developed trading engines are often the choice of large markets, such as SYCOM at the Sydney Futures Exchange (www.sfe.com.au), SuperDot at the New York Stock Exchange (www.nyse.com) and SuperMontage at NASDAQ (www.nasdaq.com). Other markets choose a third-party trading engine. For example, SWX Swiss and Shanghai Stock Exchange use

OMX’s X-STREAM trading engine (www.omxgroup.com), which is described in more detail later in this chapter. Another example is OMX’s CLICK trading system, which is used in a number of equity options markets such as the International Securities Exchange and the ASX.
2.3.2.3 Order Routing Systems
Order routing systems transmit orders between the appropriate market participants and systems. Traders use them to send orders to the broker or dealer. Brokers use them to transmit orders to a dealer or an exchange and to report back to the trader about the status of the order. Dealers and exchanges use them to report trades to the brokers and/or traders and to the registry and settlement systems. At the ASX, order routing systems use the message-based SEATS Open Interface (OI). Order routing systems of Australian brokers, such as IRESS IOS (www.iress.com.au), conform to the SEATS OI. The American ITS (Intermarket Trading System) facilitates trading across different equity markets in the US. Some Electronic Communication Networks in the US participate in order routing systems.
2.3.2.4 Order Presentation Systems
Order presentation systems present information about orders and quotes to the market participants (e. g. brokers). In order-driven markets, this information is presented on traders’ screens connected to the screen-based trading system. Brokerage firms support their own presentation/visualization systems that receive feeds from order routing systems. For example, the IRESS system has its own facility for presenting orders to brokers, whereas SMARTS (described later) has two subsystems, called SPREAD and REPLY, which provide order visualization facilities to market researchers and surveillance personnel. Dealer markets often use messaging systems that allow participants to communicate via messages.
2.3.2.5 Settlement and Registry Systems

Settlement and registry systems facilitate registering the traded volume to the buyer and transferring the trade value to the seller. The two functions are usually distinguished. Settlement is the transfer of money in return for title; the legal title is then often (but not always) held in a depository. Settlement and depository functions are often performed by the same organization, generally called a central securities depository (CSD). The registry system largely duplicates this information but is a specialized function: its role is to facilitate communication between a company and its shareholders (voting, annual meetings, distribution of annual reports and dividends, etc.). This role is becoming increasingly important as issues of corporate governance become more prominent. At the ASX, CHESS is the official settlement system and depository; it acts upon trades received from SEATS, in cooperation with the brokers participating in the trade. CHESS can communicate with an issuer-sponsored registry if one of the trade counterparties is not CHESS sponsored.

2.3.2.6 Surveillance Systems

Market surveillance is supported by computer systems that try to detect illegal market behavior that violates market regulations. Surveillance systems should have access to every piece of real-time or historical market information. For example, the Financial Services Authority (FSA) (www.fsa.gov.uk) in the UK is tendering for a supplier to upgrade its SABRE surveillance system. SMARTS (www.smarts-systems.com) is a universal surveillance system that is used in more than 16 countries.

2.4 Case Studies

This section presents two selected systems that participate in the securities trading cycle, namely the SMARTS surveillance system and the X-STREAM trading engine.

2.4.1 SMARTS

SMARTS (www.smarts-systems.com) is a securities market surveillance and information system that provides easy-to-use alert management, reporting and research capabilities for exchanges. The system allows accurate alert algorithms to be created. SMARTS originated in the research labs at Sydney University, where it was developed under the direction of Prof. Mike Aitken and two PhD students, Thomas Jones and Tim Cooper. After the involvement of Computershare Ltd in 1996, the tool changed its focus from research to surveillance and compliance. SMARTS is now used by many exchanges, including Jakarta, Hong Kong, SWX Swiss Exchange, ASX, Tokyo Commodities Exchange, the Stock Exchange of Thailand, Singapore and Moscow. The SMARTS architecture comprises the following important elements:

1. A number of application components that perform surveillance and research functions on real-time and historical financial data.
2. A data model called MPL, which stores all real-time and historical market information in an optimized manner.
3. A number of software modules (called engines) for creating and manipulating MPL data. The most commonly used are the tracking and alerting engines.
4. The Alice language, which is used to specify the rules and conditions that cause alerts to be issued.


2.4.1.1 Application Components

SMARTS comprises a number of application components (or modules) that carry out predefined and autonomous tasks on behalf of users. There are different types of components. The surveillance-oriented components are: ACCESS (a benchmark visualization tool depicting the normal behavior of trading metrics representing liquidity, volatility, transparency and transaction costs), REPLAY (replays historical order and trade data), SPREAD (an interactive graph of trading behavior on price and time axes), ALDIT (the Alice language editor), ALMAS (the Alice Management System) and COLLUDE (depicts trading activity between brokers). Besides real-time detection of illegal behavior in capital markets (e.g. insider trading and market manipulation), SMARTS can also be used as a platform for capital market analysis and research. The analysis- and research-oriented components are: EVENTS (market research), FACTS (market statistics reports), INFORM (an information editor for news such as company announcements), PORTFOLIO (visual representations of trading metrics) and REPORT (lists of transactions, such as trades and orders, pertaining to one or more securities, houses, traders and/or clients).

2.4.1.2 SMARTS MPL Data Model

The MPL data model manages all information relevant to the SMARTS application components. As Figure 2.3 shows, there are two types of data: transient data, which is held in main memory, and persistent data, which is held on disk. The MPL data model can manage and process real-time transactions as they arrive from the trading engine. It also provides the infrastructure required for monitoring multiple markets (either different trading engines or different types of markets such as options and futures).

Figure 2.3. SMARTS architecture

The MPL data model manages the following market-relevant data and concepts: trades; orders; quotes; historical indexes; interest rates; dividends and stock splits; static data such as company names, shares on issue and index weightings; special fields for options, futures, bills and bonds; broker house or trader cash positions; inventories of holdings; and news announcements. One feature of MPL is that new data (i.e. market transactions) is added to the track data (FAV format) in append mode only. This “never delete anything” policy preserves the full history of events so that, at any time, a surveillance department can regenerate the exact conditions as they occurred in the past for research and investigation purposes. Because data is only appended and never deleted, redundant copies can be created in different formats and indexed for performance, such as AM data (indexed by security, containing information such as best bid/ask, trade information, security status control information and liquidity information for each trading day) and Daytot data (daily summaries indexed by date, including open price, close price, maximum price, minimum price, average price and turnover). MPL comprises a set of APIs that can be invoked from SMARTS application components, from C programs or through Alice language commands. At the lowest level, MPL data is stored in proprietary binary formats that provide efficient access and storage.

2.4.1.3 MPL Manipulation Engines

The alerting engine is the most significant part of any surveillance system. This engine implements the specialized alerting language Alice.
Once the alerting engine is started (by the “alertserver” program), it interprets all text alert rules from the companion Alice code into a convenient data structure that directs the engine's processing for that trading day. Normally, the alerting engine examines those alert rules that are activated either by the type of the FAV message being processed or on a timed basis.

2.4.1.4 The Alice Language

SMARTS uses the in-house programming language Alice for coding alerting rules based on historical or real-time trading data. It is also used for generating reports from the same data. Alice programs make it possible to separate SMARTS functionality cleanly from the alerting rules. The alerting rules (i.e. the Alice program) can be changed at any time to incorporate new or modified rules without recompiling SMARTS. The alerting engine interprets the contents of the respective Alice program to build the in-memory structure that represents those rules. The engine then uses this structure to decide when (following which events) to check rules and which rules to check. An event can be the type of trading event (i.e. the FAV message type) currently being processed, or it can be time-based (e.g. every 5 minutes).

A converter program needs to be written for every trading engine that SMARTS connects to. The converter generates FAV-formatted messages from the trading engine output (trades, orders, control information, announcements, etc.). Information from more than one trading board can be combined in a single FAV file. Conversely, the output of a trading engine can form more than one feed to SMARTS, with the data of each feed collected in a separate FAV file. Normally, order, trade and control data compose one feed, announcements (news) compose a second feed and index data composes a third feed.
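The two ideas above, an append-only track and alert rules that fire according to message type, can be sketched as follows. This is not Alice and not the FAV format: the classes `Track` and `AlertEngine`, the message fields and the 10 percent threshold are hypothetical choices made purely for illustration, and time-based rules are left out.

```python
class Track:
    """Append-only store: records can be added and read, never changed or removed."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(dict(record))   # store a copy of the message

    def replay(self):
        """Return copies of all records in arrival order."""
        return [dict(r) for r in self._records]


class AlertEngine:
    """Keeps rules in a table keyed by message type (hypothetical design)."""

    def __init__(self):
        self._rules = {}   # message type -> list of rule functions

    def rule(self, msg_type):
        def register(fn):
            self._rules.setdefault(msg_type, []).append(fn)
            return fn
        return register

    def process(self, track, message):
        track.append(message)   # history first, then rule checks
        alerts = []
        for rule in self._rules.get(message["type"], []):
            text = rule(message, track)
            if text:
                alerts.append(text)
        return alerts


engine = AlertEngine()


@engine.rule("TRADE")
def price_jump(message, track):
    # Compare the new trade with the previous trade in the same security.
    trades = [r for r in track.replay()
              if r["type"] == "TRADE" and r["security"] == message["security"]]
    if len(trades) < 2:
        return None
    previous = trades[-2]["price"]
    if abs(message["price"] - previous) / previous > 0.10:
        return f"price jump in {message['security']}: {previous} -> {message['price']}"
    return None


def daily_summary(track, security):
    """Derived, redundant view computed from the track (in the spirit of Daytot data)."""
    prices = [r["price"] for r in track.replay()
              if r["type"] == "TRADE" and r["security"] == security]
    return {"open": prices[0], "close": prices[-1],
            "max": max(prices), "min": min(prices)}
```

Because the track is never modified, a summary such as `daily_summary` can be recomputed at any time and will always agree with the recorded history.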

2.4.2 X-STREAM

X-Stream, previously called ASTS (Automated Securities Trading System), is a system built and commercialized by Computershare (and recently purchased by OMX of Sweden). It is primarily designed as a trading engine and market management system, but it can also keep track of users’ accounts, forward market information and provide a personal instant messaging service to users of the system. X-Stream can be customized for different markets and system configurations and is one of the most widely used trading systems in the world.

2.4.2.1 Orders and Instruments

A large number of different types of instruments can be traded with X-Stream. It was developed to work seamlessly with securities, options, futures, bonds, swaps, commodities, foreign exchange, etc. As such, there must be a defined standard for how X-Stream treats these different types. With the different types of instruments comes the ability to trade in different ways. An order is an “instruction to a broker/dealer to buy, sell, deliver, or receive securities or commodities that commits the issuer of the ‘order’ to the terms specified”.1

2.4.2.2 The X-Stream Architecture

The X-Stream software architecture is broken up into several components. Figure 2.4 depicts the organization of these components, which are described next.

1 Computershare Technology Services PTY LTD. Programmer’s References for ASTS Client SDK. Version 1.6.2002

Figure 2.4. X-Stream architecture

2.4.2.3 Trading Engine

The Trading Engine is the main transaction component. Its main function is to take orders and process the order book. It also manages all the user accounts on the system. On start-up, the Trading Engine requests all of the market and system data from the relational database server. It then stores this data in log files that are updated in real time. If the Trading Engine ever crashes, it can be restarted from the data in these log files.

2.4.2.4 Relational Database Server

All of the parameters in X-Stream (such as the number and type of users, which securities to trade, security boards, currencies, etc.) are held in the relational database server. This gives the system its flexibility, because all of the system parameters can be defined easily within the relational database.

2.4.2.5 Backup Trading Engine

The Backup Trading Engine processes the same orders as the Trading Engine. If the Trading Engine fails, the Backup Trading Engine can transparently take over as the primary trading engine. Because the Backup Trading Engine starts up after the main Trading Engine, it simply downloads the market information from the Trading Engine’s log files. In all other respects, the Backup Trading Engine is the same as the main Trading Engine.
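The recovery idea used by both engines, rebuilding state from a log, can be sketched in a few lines. The sketch assumes a line-per-event JSON log and two event kinds (`new` and `cancel`); the actual X-Stream log format is not described here, and all names are hypothetical.

```python
import json


class EngineState:
    """Order state that is fully determined by the sequence of events applied."""

    def __init__(self):
        self.orders = {}   # order id -> order details

    def apply(self, event):
        if event["kind"] == "new":
            self.orders[event["id"]] = {"side": event["side"],
                                        "price": event["price"],
                                        "qty": event["qty"]}
        elif event["kind"] == "cancel":
            self.orders.pop(event["id"], None)


class LoggedEngine:
    """Writes every event to the log before applying it to its own state."""

    def __init__(self, log_path):
        self.state = EngineState()
        self._log_path = log_path

    def handle(self, event):
        with open(self._log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(event) + "\n")
        self.state.apply(event)


def recover(log_path):
    """Rebuild the state by replaying the log, as a restarted or backup engine might."""
    state = EngineState()
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            state.apply(json.loads(line))
    return state
```

Since the state depends only on the ordered events, an engine that replays the same log arrives at the same order book as the engine that wrote it.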


2.4.2.6 Governor

The Governor handles fault management for the whole system. After an initialization protocol with the Governor, the two trading engines stay synchronized. If either engine fails, communication between them breaks down and both engines attempt to contact the Governor; the one that succeeds becomes the only trading engine.

2.4.2.7 Gateway

As many client machines may be connected to the Trading Engine, the Gateway component is used to balance the load of transactions and queries and to distribute all the market data to the clients. The Gateway processes the same transactions as the Trading Engine so that it maintains a complete picture of the market.

2.4.2.8 Trader Workplace

The Trader Workplace is the end-user interface to the X-Stream software, developed by Computershare. Its functions depend on which user is logged into the terminal. Its main purposes are to display market information, submit orders, allow communication between brokers, submit market-sensitive information, and control and operate a market.

2.4.2.9 Client Developed Software

X-Stream also comes with a complete software development kit, called Comet SDK, for interfacing with all aspects of the system. Any client that wishes to develop or modify its own client for the Trading Engine can interface through Comet. The main use of the SDK is to connect to the Gateway as a client application; however, because the SDK is so extensive, it can be used to connect to any part of the system. It is even flexible enough to support developing a different Gateway or Governor. Software components communicate with each other using the AMP Messaging Protocol, X-Stream’s main messaging protocol for handling all transaction and query information, which is formally described in the message definition language ASN.1 (ISO/IEC 8824-1:2002).
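The arbitration role of the Governor in Section 2.4.2.6 can be pictured as a "first claim wins" rule. The sketch below is a single-process stand-in under that assumption: the class name, the `claim` method and the use of a thread lock in place of network communication are all hypothetical.

```python
import threading


class Governor:
    """Grants the role of sole trading engine to the first engine that reaches it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = None   # name of the engine that won, if any

    def claim(self, engine_name):
        """Return True if engine_name is, or has just become, the sole engine."""
        with self._lock:
            if self._active is None:
                self._active = engine_name
            return self._active == engine_name


governor = Governor()
results = {}


def contend(name):
    # Each engine calls the Governor after losing contact with its peer.
    results[name] = governor.claim(name)


threads = [threading.Thread(target=contend, args=(name,))
           for name in ("primary", "backup")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Whichever thread reaches the Governor first wins, so exactly one entry in `results` is `True`; an engine that cannot reach the Governor at all never becomes active.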

2.5 IT Integration Architectures and Technologies

We now turn our attention to the middleware technologies that are used to facilitate the integration between the different systems described earlier. As these technologies intervene at different levels of abstraction, we first define a layered conceptual model that will be used to classify the different technologies involved.


Figure 2.5. Conceptual integration model for middleware technologies

Figure 2.5 shows three possible levels of integration between two different systems X and Y. The business process level is the highest level in the hierarchy. It offers business services in a way that “hides” the underlying data formats and middleware platforms, and it is the appropriate level for defining integrated services that correspond to particular business models. The content (data) level defines the structure and format of the messages exchanged between software components. The transport level defines the communication interfaces between the various software components. We now present a brief review of existing technologies classified in relation to this model.

2.5.1 Transport Level

For security reasons, most financial systems have traditionally interacted with each other over private or dedicated networks. An example is the SWIFT Transport Network (STN), based on the X.25 protocol. Because the communication software is non-standard, this is a very expensive solution. Other than for historical reasons, there is less and less interest in integration at this level because of the low-level nature of the corresponding programming interfaces. Even private networks are now moving towards IP-based solutions (intranets and extranets). For example, SWIFT has developed a new generation of interactive services (called SWIFTNet) using an IP-based network called the Secure IP Network (SIPN). Besides loosely coupled communication by means of messages and file transfers, systems can also interact in a tightly coupled way using dedicated Application Programming Interfaces (APIs) accessible through network protocols. Possibilities for such interactions include remote procedure calls and remote objects.

2.5.2 Contents (Data) Level

The content layer provides languages and models to describe and organize information so that the participating applications can understand the semantics of the content and the types of business documents. The objective of interactions at this layer is to achieve a seamless integration of data formats, data models and languages (Medjahed et al. 2003). The two major enabling technologies at this layer are Electronic Data Interchange (EDI) and eXtensible Markup Language (XML)-based frameworks. EDI (www.edi-information.com) is well defined and well established in the industry in the sense that it provides both a defined syntax and defined vocabularies for the values of EDI message fields (Bussler 2001). Trading partners exchange business documents via value-added networks (VANs) using EDI standards. EDI is an established technique for B2B integration that focuses on interactions at the transport layer, where VANs handle message delivery and routing. EDI standards also provide a single homogeneous solution for interactions at the content layer, where the set of supported document types is limited. However, EDI standards, as currently defined, do not support interactions at the business process (BP) layer. EDI was in use long before the emergence of the Internet. It depends on a moderately sophisticated technology infrastructure based on proprietary and expensive VANs, and on ad hoc development that requires direct communication between partners. It therefore remains unaffordable for the majority of the business community. The emergence of the Internet as a global standard for data exchange has led EDI to expand in many directions. Open Buying on the Internet (OBI) (www.openbuy.org) is an initiative that leverages EDI to define an Internet-based procurement framework, using the freely available Internet instead of expensive VANs.
In parallel with XML developments, a number of international organizations were already working towards the adoption of common standards in the financial area. One such standard is the Financial Information eXchange (FIX) protocol (www.fixprotocol.org) for communicating securities transactions between two parties. Since FIX does not define how data is transported, an XML version of FIX called FIXML (www.fixml.org) has been defined. FIXML facilitates real-time delivery of messages and supports multiple currencies and types of financial instruments. It specifies how trading orders are advertised, acknowledged, placed, rejected and executed, and it can also be used to manage settlement instructions between the parties to a trade. Another standard, the Financial Products Markup Language (FpML) (www.fpml.org), is a business information exchange standard for the electronic dealing and processing of financial derivative instruments. It establishes a new protocol for sharing information on, and dealing in, financial derivatives over the Internet. FpML currently focuses on a subset of derivative instruments, such as interest rate swaps and forward rate agreements. The Research Information Exchange Markup Language (RIXML) (www.rixml.org) is an initiative to develop an open industry standard for tagging and delivering investment research. It provides a structure for classifying financial research that enables users to define specific sorting, filtering and personalization criteria across all research publishers. In each case, a working group defines a Document Type Definition (DTD) that specifies the XML tags and how they can legally be combined. These DTDs can then be used with XML parsers and other XML tools to rapidly create applications that process and display the stored information in whatever way is required. Other related standards include ISO 15022 (www.iso15022.org), which defines a standard for messages in banking, securities and related financial services and is suitable for inclusion within an XML format. Another is SWIFT’s messaging service, FIN, which runs on SWIFT’s private network (STN) and enables financial institutions to exchange standardized financial messages worldwide.
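To make the difference between a field-based message and an XML rendering concrete, the sketch below parses a FIX-style tag=value order and re-expresses it as XML. The tag numbers used (35 MsgType, 55 Symbol, 54 Side, 38 OrderQty, 44 Price) are standard FIX tags, but the XML layout is a simplified illustration of our own and does not follow the FIXML schema; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SOH = "\x01"   # FIX field delimiter

# Subset of FIX tag numbers and their field names.
TAG_NAMES = {"35": "MsgType", "55": "Symbol", "54": "Side",
             "38": "OrderQty", "44": "Price"}


def parse_fix(raw):
    """Split a tag=value message into a dictionary keyed by tag number."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[tag] = value
    return fields


def to_xml(fields):
    """Render the known fields as child elements (illustrative layout only)."""
    root = ET.Element("Order")
    for tag, value in fields.items():
        name = TAG_NAMES.get(tag)
        if name is not None:
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


# A new single order (MsgType D): buy (Side 1) 100 units at a price of 25.50.
raw_order = SOH.join(["35=D", "55=BHP", "54=1", "38=100", "44=25.50"]) + SOH
order_xml = to_xml(parse_fix(raw_order))
```

The XML form is more verbose but self-describing, which is what allows generic XML parsers and tools to process it.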

2.5.3 Business Process Level

The business process layer is concerned with the conversational interactions among the activities that comprise a BP. The objective of interactions at this layer is to allow autonomous and heterogeneous partners to come online, advertise their terms and capabilities, and engage in peer-to-peer interactions with any other partners (Medjahed et al. 2003). Interoperability at this higher level is challenging because it requires an understanding of the semantics of partner BPs. A few approaches have emerged to facilitate BP interactions. We focus here on two popular ones, namely workflows and Web services.

2.5.3.1 Workflows

Currently, BPs within an organization are integrated and managed using ERP systems such as SAP R/3, Baan or PeopleSoft, using workflow systems like IBM’s MQ Series, or manually on an on-demand basis. Many Workflow Management System (WFMS) products are available, each of which proposes a different specification; this makes them hard to replace and ineffective for the needs of B2B interactions (Medjahed et al. 2003). For this reason, a workflow reference model has been defined (Hollinsworth 1994), which has motivated standardization efforts such as jointFlow, the Simple Workflow Access Protocol (SWAP) (Bolcer and Kaiser 1999) and the Wf-XML message set (www.wfmc.org). However, these efforts provide little support for inter-enterprise BPs. The purpose of inter-enterprise workflows is to automate BPs that interconnect and manage communication among disparate systems. An Inter-Enterprise Workflow (IEWf) is a workflow consisting of a set of activities that are implemented by different enterprises; that is, it represents a BP that crosses organizational boundaries. Delivering the next generation of IEWf systems is attracting attention, with the objective of interconnecting cross-organizational BPs while supporting the integration of diverse users, applications and systems (Yang and Papazoglou 2000). However, no current approach seems to be extensively used in the financial sector.

2.5.3.2 Web Services

The Web Services model (Alonso et al. 2004) is emerging steadily as a loosely coupled integration framework for Internet-based e-business applications. Unlike traditional workflows, Web services support inter-enterprise workflows in which BPs across enterprises interact in a loosely coupled fashion by exchanging XML document-based messages. The idea of the Web Services model is to break down Web-accessible applications into smaller services. These services are accessible through electronic means, namely the Internet. They are self-describing and provide semantically well-defined functionality that allows users to access and perform the offered tasks. Such services can be distributed and deployed over a number of Internet-connected machines. A service provider is able to describe a service, publish it and allow it to be invoked by interested parties. A service requester may request the location of a service through a service broker, which also supports service search. There are four emerging de facto standards for Web services. The first is the Simple Object Access Protocol (SOAP) (www.w3.org/SOAP), an XML-based protocol for exchanging document-based messages across the Internet. The second is the Web Services Description Language (WSDL) (www.w3.org/TR/wsdl), a general-purpose XML-based language for describing the interface, protocol bindings and deployment details of Web services. The third is Universal Description, Discovery and Integration (UDDI) (www.uddi.org), a set of specifications for efficiently publishing and discovering information about Web services.
The last is the Business Process Execution Language for Web Services (BPEL) (www.siebel.com/bpel), which emerged from a combined effort to merge Microsoft’s XLANG and IBM’s WSFL. The Web Services model is not yet mature enough for wide acceptance, but it attempts to cover all three layers of the interaction framework. At the transport layer, SOAP is used as the messaging protocol. At the content layer, a WSDL-based XML schema is used. At the business process layer, BPEL has yet to evolve to cope with complex inter-enterprise interaction issues. In a previous study (Dabous 2005), we showed with examples that some of the functionality of capital market systems can be reused to help automate many of the domain’s business processes. We therefore consider each such system as consisting of one or more e-services that can facilitate the enactment of domain business processes. Such business processes relate to trading activities across the trading cycle discussed earlier. They include, but are not limited to, the placement of orders, online settlement and registry once a trade is matched, real-time surveillance and real-time information dissemination (announcements, quotes, trades, etc.).
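As an illustration of a document-based SOAP message, the sketch below builds a SOAP 1.1 envelope for an order placement request. The envelope namespace is the standard SOAP 1.1 one; the application namespace `urn:example:trading`, the `PlaceOrder` element and its children are hypothetical and do not correspond to any published WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
APP_NS = "urn:example:trading"                          # hypothetical service namespace


def build_place_order(symbol, side, quantity, price):
    """Return a SOAP envelope, as text, carrying one PlaceOrder request."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}PlaceOrder")
    for name, value in (("Symbol", symbol), ("Side", side),
                        ("Quantity", quantity), ("Price", price)):
        ET.SubElement(request, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


message = build_place_order("BHP", "buy", 100, 25.5)
```

In a deployed service, the structure of the `PlaceOrder` payload would be fixed by the provider's WSDL description rather than chosen by the requester.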


Figure 2.6. High level service oriented architecture for securities trading

In (Rabhi and Benatallah 2002), a preliminary service-oriented architecture for the integration of capital market financial services was proposed. In this architecture, each software component provides either a basic service or a composite (integrated) service. In a basic service, the service logic is provided entirely by that architectural component. In a composite service, the architectural component interacts with and requires other services (called outsourced services) to fulfill its role. A service-based approach therefore separates business concerns (i.e. the use of the service) from technological considerations (i.e. the computing and networking infrastructure that supports the service). Figure 2.6 shows the essential basic and integrated services associated with securities trading. Very few capital market systems are integrated at this level, so B2B integration at this level in the capital markets domain is still an active research area.
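The distinction between basic and composite services can be sketched with plain functions and a lookup table. The registry, the service names and the data fields below are hypothetical and are not taken from the architecture of (Rabhi and Benatallah 2002); the sketch only shows a composite service that relies on two outsourced basic services.

```python
class ServiceRegistry:
    """Minimal publish/find table standing in for a service broker."""

    def __init__(self):
        self._services = {}

    def publish(self, name, service):
        self._services[name] = service

    def find(self, name):
        return self._services[name]


def match_order(order):
    """Basic service: all logic is provided by this component."""
    return {"security": order["security"], "qty": order["qty"],
            "price": order["price"], "status": "matched"}


def settle_trade(trade):
    """Basic service: computes the trade value and marks the trade settled."""
    return {**trade, "status": "settled", "value": trade["qty"] * trade["price"]}


def make_trade_and_settle(registry):
    """Composite service: looks up and invokes the outsourced services it needs."""
    def trade_and_settle(order):
        trade = registry.find("matching")(order)
        return registry.find("settlement")(trade)
    return trade_and_settle


registry = ServiceRegistry()
registry.publish("matching", match_order)
registry.publish("settlement", settle_trade)
registry.publish("trade_and_settle", make_trade_and_settle(registry))

result = registry.find("trade_and_settle")({"security": "ABC", "qty": 100, "price": 2.5})
```

The caller of `trade_and_settle` deals only with the business-level service; which components provide matching and settlement can change without affecting it.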

2.6 Conclusion

The main purpose of this chapter is to provide software architects and IT professionals with sufficient knowledge about existing practices, business processes, systems and architectures in the area of securities trading. It introduced the concepts of market structures and order types, and described the information flow throughout the trading cycle and the different types of information systems that participate in this cycle. The chapter also paid attention to business process integration issues by reviewing relevant middleware technologies according to their level of abstraction. We have seen that most integration currently occurs at the transport and content layers. Heterogeneous development technologies and middleware platforms, together with the absence of dominant standards, have made it difficult to achieve an interoperable environment that facilitates integration among different systems. There are many opportunities yet to be exploited for integration at the business process (BP) level. This is essential for two reasons. On the one hand, we are witnessing dramatic changes in computing, data communication and middleware technologies. On the other hand, the BPs themselves are changing as a result of regulations, global competition and new financial instruments or market structures. Future developments may see the emergent technologies mature, especially at the BP level, which will facilitate the integration of multiple markets worldwide.

Acknowledgements

We wish to acknowledge the Capital Markets Cooperative Research Centre (CMCRC) for funding this work. We are also grateful to staff at SMARTS and Computershare for their support, especially in providing information about their systems.

References

Aitken M, Frino A, Jarnecic E, McCorry M, Segara R, Winn R (1997) The microstructure of Australian stock exchange. Securities Industry Research Centre of Asia-Pacific (SIRCA)
Alonso G, Casati F, Kuno H, Machiraju V (2004) Web Services: Concepts, Architectures and Applications. Springer
Banks E (2001) e-Finance: The Electronic Revolution. John Wiley & Sons
Bolcer G, Kaiser G (1999) SWAP: Leveraging the Web to Manage Workflow. IEEE Internet Computing, 3(1), pp. 55-88
Bussler C (2001) B2B Protocol Standards and their Role in Semantic B2B Integration Engines. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 24(1), pp. 3-11
Dabous FT (2005) Pattern-Based Approach for the Architectural Design of e-Business Applications. PhD thesis, The University of New South Wales, Australia
Harris L (2003) Trading and Exchanges: Market Microstructure for Practitioners. Oxford University Press
Hollinsworth D (1994) The workflow reference model, #TC00-1003, Brussels, Belgium
Medjahed B, Benatallah B, Bouguettaya A, Ngu AHH, Elmagarmid AK (2003) B2B Interactions: Issues and Enabling Technologies. The VLDB Journal, pp. 59-85, Springer-Verlag, Apr
Rabhi FA, Benatallah B (2002) A Service-Based Architecture for Capital Markets Systems, 16(1), pp. 15-19
Viney C (2002) Financial Institutions Instruments and Markets. McGrath’s
Yang J, Papazoglou M (2000) Interoperation Support for electronic business. Communications of ACM, 43(6), pp. 39-47

CHAPTER 3
Product Management Systems in Financial Services Software Architectures
Christian Knogler, Michael Linsmaier

3.1 The Initial Situation

Consolidating and modernizing software architectures in the financial services industry has become a top issue on the agenda of Chief Information Officers (CIOs). Common software architectures have matured over the past twenty years or more, but the business itself has changed so much that many companies, especially insurance carriers, have architectures that lag behind business innovation. Today, more and more CIOs agree that they have to spend major parts of their IT budget on maintenance and that the ratio between costs and innovation is no longer appropriate (e.g. cf. (Lier 2005)). (Hersh 2006) confirms this and predicts that reducing operational costs will stay a top priority for quite a while, because maintenance costs can range from uncomfortably high to downright crushing when carriers spend 80 percent or more of their budgets on maintenance rather than on new projects. What forces carriers in the financial services industry to innovate their business? First, customers expect products and services that meet their needs and that are delivered on time and within budget. From the customer's point of view, products and services must have an optimum price/performance ratio. Whether a carrier focuses on ‘quality leadership’ or ‘price leadership’ depends on its business strategy. Delivery on time and within budget means that customers expect an offer or proposal to be delivered immediately at the point of sale and the printed contract to be delivered within 48 hours. Second, competitors have also perceived customers' expectations and try to meet this challenge in their own way. Third, a 2004 study reported by (Hersh 2006) came up with a result that we already know from fast-moving consumer goods markets (e.g. mobile phones): products that get to market quickly typically generate more revenue and greater profits.
Fourth, the value chain of a financial services (FS) company starts with product management, which thus has a strong influence on the subsequent elements of the value chain (e.g. underwriting, risk carrying/transformation, claims management, etc.). Our conclusion is that business should be improved by improving product management and the subsequent core business processes (which must include IT issues). It is not our objective to discuss how an FS company arrives at a new business strategy. What we offer is a discussion of the weaknesses of current software architectures and business processes and of the opportunities for improvement.

3.2 Weaknesses in Software Architectures and Business Processes

When the product development or the marketing department starts to create a new product, it ends up with a paper-based description of the product. Life insurance products are often ‘described’ in spreadsheets because these allow interactive calculation. Based on this information the IT department implements a ‘piece of software’ and the product development department tests it. One might say that this is how software development works in general. So where are the weaknesses?

The product development department has comprehensive product knowledge but lacks a clear method for defining a financial services product. In particular, every piece of information that has to be implemented as code needs a precise description. It is well known that no software specification is as perfect as software engineers would like it to be. From our point of view the traditional process based on division of labor has to be reconsidered. Trying to improve paper-based description and coding is, as Watzlawick (1983) called it, ‘more of the same’. What the product development department needs is a tool-based approach to product definition which enables it to precisely define and test the calculations, the business rules, the properties of the product, etc. Gruhn (2005) reports a comprehensive set of requirements for such a tool, for which he uses the acronym PMS (product management system). We follow Gruhn and use both the term and the acronym below. Finally, a PMS has to generate a ‘piece of software’ that includes all the product information needed by the core business applications of a carrier.

Weaknesses have been identified not only in the product implementation process; software deployment is critical as well. If an IT department implements FS products in the traditional way, it has to implement the same product several times: first for the contract management system, second for the offline sales system, third, for example, for a broker portal, and so on.
The authors have seen insurance companies that have to deploy products to ten or more business applications. A quite common way to meet this challenge is to create a so-called ‘calculation box’ which contains the company’s set of calculations. Because of their platform variety, IT departments use ANSI C code (a version of

3 Product Management Systems in Financial Services Software Architectures


the programming language C standardized by the American National Standards Institute) to provide a multi-platform calculation box. Although this seems to be quite a good approach, it is not a silver bullet, because the process of definition has not changed (as described above) and an FS product consists of much more than calculations. What about attributes, properties and business rules, for instance?

Looking at the sales channels of FS companies, it is easy to see that every company today has many more sales channels than ten or fifteen years ago. On the one hand, carriers cooperate with independent brokers and banks that operate nationwide or Europe-wide and sell the carriers’ products ‘over their counter’. On the other hand, they cooperate with big multinational groups and deliver customized FS products as add-ons to the groups’ products (for instance a finance and insurance package for car buyers) or as part of the groups’ benefit programs (e.g. special homeowner loans, private financial security for future retirees, and so on). Common to all cooperation partners of a carrier is that they have their own IT platforms and business applications. Even today some carriers still provide only simple paper-based proposal forms to their business partners; others offer at least a calculation box (which normally does not include the special business rules for the partner). Media discontinuity and a lack of data validation at the point of sale (no wonder, since a paper form cannot validate its data) are among the major reasons why callbacks are still necessary and why processing a standard proposal form takes several weeks. If a partner asks for seamless process integration and automation, many carriers are taken by surprise. A remarkable aspect in this context is that some carriers today do not even have process integration and automation between their own sales staff and their back office.
Process integration and automation is not only an issue in sales. Just take a look at claims management or, quite simply, at a claims notification. Today an FS company has many points of service. Sooner or later every point of sale turns into a point of service. No matter what has been sold (a simple ‘Bausparvertrag’¹ or a new car with a leasing and insurance package), customers have questions or claims and come back to where they bought the product. Regardless of whether the service employee at the point of service calls the call center or fills out a paper form, media discontinuity, callbacks, needless idle time and so on are the unwanted effects.

¹ A translation is quite difficult because a ‘Bausparvertrag’ is a special kind of (future) homeowner savings contract with the right to conclude a homeowner mortgage loan contract at a special fixed interest rate at the end of the savings period. This combination of a savings contract with an optional future loan contract is a common form of government promotion of private home ownership.

A lack of integration and automation is often a symptom of a more substantial weakness: inflexibility (of applications/components and sometimes in people’s minds). In most cases this goes hand in hand with the well-known ‘legacy concept’: undocumented business rules and logic, inefficient code and so on. For example, product enhancements that appear quite straightforward to the product development department (‘small tweaks’ from an actuary’s point of view) take the IT department a year or more and millions of euros to implement and test. Carriers that are listed on a US stock exchange and run a legacy system in the back office are often still struggling with the implementation of the Sarbanes-Oxley Act (SOX)². Cyphert (2006) recently reported that some companies have repeatedly been unable to project their claims, new business, and the like and have made up for it with workarounds. Although he was referring to American carriers, the situation with legacy systems in Europe is, from our point of view, at least similar or even worse. The same is true of what Gartner recently found: the most problematic area for Property and Casualty (P&C) insurance companies is rating/quoting and new-product development, which means that carriers want to get their products to market faster and be more accurate in their pricing. Gartner says that they are trying new concepts, but the technology has not matured at the rate they would like. The second problem area is underwriting, which in Gartner’s opinion is an extension of the movement toward pricing and new-product development (Hyle 2006).

A typical legacy architecture looks like this (the list is not exhaustive):
• A separate contract administration system for every (or almost every) product line and a policy-centric view of data
• Code and tables for the products are an integral part of the system; separating them is not easy and takes considerable effort
• (Almost) no modularization/componentization (monolithic design)
• A large number of (sometimes old) technologies/platforms³

Common side effects of such legacy architectures are skill-set problems with old technologies, because the people who understand them have retired. If a carrier wants to rely on third-party outsourcers instead, they ask for current documentation, which in most cases is not available.
The old documentation no longer fits because the system has been customized, modified and debased to such an extent. How does an FS company know when to pull the trigger on replacing legacy systems? The following questions might help to clarify:
• Are the systems inhibiting the company from being competitive or, put differently, are the business requirements met today?
• Will the company be able to meet the business requirements in two or three years?
• Does the company have a clear vision of its future business? A clear strategy and the steps to implement it?
• Is it clear how the core business processes will run in the future? Which steps in each process will be supported or even automated by IT?

² We prefer the acronym SOX instead of SOA because in the IT world SOA is widely used for Service Oriented Architectures.
³ Bozovic and Jakubik (2006) relate that they have seen companies “that internally have to juggle up to 30 different technology platforms”.

3.3 Objectives and Today’s Options

One of the biggest changes today is the range of options available to companies. “Companies are starting to change what their choices are for policy administration,” Harris-Ferrante (2006) says. “It used to be companies were thinking about buy, build, or outsource – those were the three choices you had. A lot of companies now are looking at the different alternatives and broadening some of these options.” Such options include installing components rather than the entire system or adding different alternatives on top of the current system, which could improve the policy system without changing it. “We’ve seen companies start looking more and more at the option of adding things on top of the system – Business Process Management (BPM), workflow, and different types of supplemental systems you can add on top without replacing what you already have,” says Harris-Ferrante. “Companies are taking a variety of approaches to their policy system vs. the big-bang approach.”

Our approach is similar but not identical. We, too, consider the risks of a big-bang approach to be too high in most cases and prefer an approach that we call ‘smart consolidation’. What does it mean? The weaknesses described earlier in this article lead us to the following conclusions:
• The product development department has to consolidate and optimize⁴ its methods of defining FS products
• The IT department has to consolidate and centralize the way it provides product information to business applications inside and outside the company

Based on these conclusions we have established an architecture that we call ‘product server based business application architecture’. A product server is an application that provides every piece of product information that is required by a business application (inside or outside the company). In general, a product server is independent of product line and platform/technology, and it is the runtime environment for the products defined and implemented with a PMS.

⁴ ‘Industrialize’ is sometimes used as an equivalent term.
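As a rough illustration of the product server idea, the following Python sketch shows one product definition being deployed once and then queried by any calling application. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `Product`, `ProductServer`, `deploy` and `quote`, the ‘TermLife’ product, its age rule and its rate of 1.5 per mille are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Data = Dict[str, float]


@dataclass
class Product:
    """Hypothetical product definition as a PMS might export it."""
    name: str
    premium: Callable[[Data], float]  # the calculation
    rules: List[Callable[[Data], Optional[str]]] = field(default_factory=list)  # business rules


class ProductServer:
    """Single runtime that answers product requests for every application."""

    def __init__(self) -> None:
        self._products: Dict[str, Product] = {}

    def deploy(self, product: Product) -> None:
        self._products[product.name] = product

    def quote(self, product_name: str, data: Data) -> dict:
        product = self._products[product_name]
        # Each rule returns an error message, or None if the data passes.
        errors = [msg for msg in (rule(data) for rule in product.rules) if msg]
        if errors:
            return {"ok": False, "errors": errors}
        return {"ok": True, "premium": round(product.premium(data), 2)}


def age_rule(data: Data) -> Optional[str]:
    return None if 18 <= data["age"] <= 65 else "age must be between 18 and 65"


server = ProductServer()
server.deploy(Product(
    name="TermLife",
    premium=lambda d: d["sum_insured"] * 1.5 / 1000,  # assumed rate: 1.5 per mille
    rules=[age_rule],
))

# Two different callers (say, a broker portal and the back office) issue
# the same request and receive the same answer.
portal_result = server.quote("TermLife", {"age": 40, "sum_insured": 100_000})
backoffice_result = server.quote("TermLife", {"age": 40, "sum_insured": 100_000})
rejected = server.quote("TermLife", {"age": 70, "sum_insured": 100_000})
```

The point of the sketch is that calculation and business rules live in one place; a calling application only needs the request/response interface.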


3.4 A ‘Product Server Based Business Application Architecture’

The following diagram displays the main components of the suggested architecture:

[Figure 3.1 (diagram not reproduced). Components shown: Product Server; Contract Management System; Claims Management System; Commission & Charge System; Point of Sale / Service System; Misc. Supporting Systems (e.g. Business Partner, DWH, etc.); Intermediary 1, Intermediary 2, …, Intermediary n]

Figure 3.1. Product server based business application architecture

3.4.1 Business Considerations

From a professional point of view our architecture is able to provide a broad range of services in order to support the following processes:

Table 3.1. Processes and services

Processes | Services (examples)
quotations | calculations (premiums, sums, endowment values, installments, etc.); data validation; identification of needs/recommendation of products
offers/proposals | product structure (e.g. available coverage); additional calculations (e.g. loan redemption schedules, detailed coverage premiums, …); data validation; actuarial methods (e.g. contract alterations); business rules (both global and sales channel specific); document output (control of text modules)
contracts | additional calculations (e.g. reserves, commissions, …); product structure; additional actuarial methods (e.g. contract conversion); business rules (underwriting, straight-through-processing, …); document output (control of text modules)
claims | context sensitive claims notification questionnaire; calculations (loss reserve, compensation, installments, etc.); business rules (compensation yes/no, what, how, immediate claims adjustment, …); document output (control of text modules)
assistance services | calculations; data validation; business rules; document output (control of text modules)

The services are available throughout the entire process chain and are not limited to the company itself, which means that financial intermediaries are also able to use the services (both offline and online) provided by the product server.

Straight-through-Processing

The standardized ‘rollout of services’ enables a positive shift in process quality and speed. Anywhere and at any time, business applications can call a service and get the same precise result for the same request. The service examples given in Table 3.1 go beyond a simple calculation/validation approach. This architecture is the basis for an all-round service at the point of sale/service and for straight-through-processing (STP). From our point of view STP is crucial to lowering costs and speeding up execution. While the case for STP may be changing, the goals remain the same: to reduce the number of exceptions and to allow as many proposals as possible to be executed, settled, and cleared without manual intervention. Considering the various sales channels of a carrier with different products, different prices and different business rules, we believe that execution costs and elapsed time cannot be kept at a competitive level without running a product server as part of an STP strategy.

STP is not limited to sales processes; service processes are an even bigger case. As stated earlier in this article, every point of sale is a potential point of service. A process chain which is free of media discontinuity and which integrates service partners seamlessly is important in tomorrow’s competition, where productivity as well as quality and speed of service are crucial. Imagine a car repair shop or a homeowner who enters a carrier’s customer web portal and reports minor damage (car body damage, glass damage, etc.). If the attached documentation (photos, cost estimate, etc.) is sufficient and the business rules for minor damage allow instant settlement, the clients get their approval only hours later. It is clear that an STP approach to service processes needs a powerful fraud detection system as well as human claims auditors who systematically re-check settled claims or check cases that were flagged by the fraud detection system. Traeger (in El Hage and Jara 2003) already describes a scenario called ‘claims portal solution’ that integrates every service provider and business partner of an insurance carrier and uses a car breakdown to illustrate straight-through-processing in service processes.

Outsourcing

Is a product server based business application architecture compatible with an outsourcing approach? Outsourcing is a trend among FS companies, and we are sure that our architecture is the key to keeping major parts of the core business know-how in-house. The product development departments create, maintain and test the products using a PMS. At the end of this process they provide encapsulated product information in a platform-independent format. The outsourcing partner tests its integrity and deploys it to the business applications that the partner runs for the company.

Collaboration

Using a product management system is also a sophisticated way to develop products collaboratively. The headquarters is able to determine general structures, calculations and business rules, and subsidiaries abroad add local value. New business strategies go one step further (cf., e.g., Riemer 2005). The separation of risk carrying from product selling to individuals is the core element of these strategies, and they imply some sort of ‘wholesale-retail scenario’ with insurance factories that create ‘white label products’ and sell them via other insurance companies or financial intermediaries (brokers, banks, etc.) who apply their own brand to these products. Again, a PMS is the professional basis for implementing such a strategy. If two FS companies use the same PMS, it is easy to share (basic) parts of their products. As long as the shared elements do not influence the interface to the business applications, this is a good way to exchange product development experience or ‘product kernels’ that are expected to be used in both companies.
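The idea of a shared ‘product kernel’ with local additions can be sketched as a layered product definition. The following Python fragment is only an illustration under assumed names: the dictionary layout, the `locked` key and the function `localize` are hypothetical and do not describe any particular PMS. It mirrors the constraint stated above that shared elements must not alter the interface to the business applications.

```python
from copy import deepcopy


def localize(kernel: dict, local: dict) -> dict:
    """Return a new product definition: the shared kernel plus local additions.

    Keys listed under kernel["locked"] (here: the interface to the business
    applications) must not be changed by the local layer.
    """
    locked = set(kernel.get("locked", []))
    product = deepcopy(kernel)  # never modify the shared kernel itself
    for key, value in local.items():
        if key in locked or key == "locked":
            raise ValueError(f"'{key}' is fixed by the kernel and cannot be overridden")
        if isinstance(value, dict) and isinstance(product.get(key), dict):
            product[key] = {**product[key], **value}  # merge, e.g., rule sets
        else:
            product[key] = value
    return product


# Hypothetical kernel defined by headquarters.
KERNEL = {
    "name": "HouseholdBasic",
    "locked": ["interface"],
    "interface": ["sum_insured", "postcode"],
    "rules": {"max_sum_insured": 500_000},
}

# A subsidiary adds a local rule and its own product name.
local_product = localize(KERNEL, {"name": "HouseholdBasic Local", "rules": {"min_age": 18}})
```

Attempting `localize(KERNEL, {"interface": [...]})` raises a `ValueError`, which is the sketch's way of keeping the interface stable for every business application that consumes the product.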

3.4.2 Technical Considerations

Based on our architectural approach we have derived some important technical requirements for a PMS. Since one of the core points of the suggested architecture is straight-through-processing, a PMS will have to support the core components of the architecture. We found the following characteristics to be crucial for a state-of-the-art PMS.

Operating System Independence

Many carriers use different operating systems. There are Windows-based notebooks (or even handhelds) at the point of sale, open source/Linux-based workstations at the point of service, Unix-based web applications for, e.g., brokers, and finally z/OS-based (or any other classic OS) mainframe back office systems. Since the runtime environment must show the same behavior and compute the same results on all platforms, it has to use operating system independent technologies whenever different platforms in (potentially) different companies have to be served.

Database System Independence

The technologies for data storage also differ depending on which application/component is used. For example, the back office system may use a DB2 database while the web application uses, e.g., an Oracle database or even MySQL. The handhelds of the point of sale system, on the other hand, should not need a separate database management system but should use only the file system. This means that a PMS cannot be based on one database system only; any kind of (product) data storage must be supported.

Serving a Broad Array of Technologies

There are as many different interface requirements as there are components which have to use the PMS. On the mainframe side (e.g. a COBOL administration system) program libraries have to be integrated, in a workstation environment .NET or the Java Native Interface (JNI) is used, and for web applications web services or interfaces for J2EE (Java 2 Platform, Enterprise Edition) applications are needed. From our point of view a PMS is not ready for integration, and cannot be a central part of the suggested architecture, until it is able to connect to and serve the systems of FS carriers.

Deployment

Any advantages gained from the support of different operating systems, databases and interfaces are lost if the product data itself cannot be distributed efficiently. During the implementation of a new product, it is important to ensure that the product data is available to all locations/applications in the value chain as quickly as possible.
A product implementation based on the classic cascading phases of component design, creation of source code, translation, (technical) testing, integration, etc. (the ‘traditional long running software development process’) will not be sufficient. A smarter approach has to be chosen, since one of the major success factors for process optimization is the separation of the requirements of product development from those of software development. In general there are two options available: first, a code-based development approach, such as J2EE components, that was designed for simple deployment (besides other requirements, of course), or second, an interpreter-based, data-driven approach. We favor the data-driven approach because it guarantees platform independence for all platforms. Product data and product functionality can thus be distributed (deployed) quickly and without any conversions. (We are aware that potential performance issues have to be taken care of.)
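To make the interpreter-based, data-driven option more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a product is shipped as JSON and that a formula is encoded as a nested prefix expression; the encoding, the function `evaluate` and the ‘TermLife’ example are hypothetical and not taken from any actual PMS.

```python
import json
import operator

# The interpreter only knows a small, fixed set of operations.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(expr, inputs):
    """Interpret a formula given as data.

    A formula is a number, the name of an input field, or a list
    [operator, left, right] whose operands are formulas themselves.
    """
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return inputs[expr]
    op, left, right = expr
    return OPS[op](evaluate(left, inputs), evaluate(right, inputs))


# Product data as it might be deployed: plain text, no compilation step.
PRODUCT_DATA = json.loads("""
{
  "name": "TermLife",
  "premium": ["/", ["*", "sum_insured", "rate_per_mille"], 1000]
}
""")

premium = evaluate(
    PRODUCT_DATA["premium"],
    {"sum_insured": 100000, "rate_per_mille": 1.5},
)
```

Because the product is pure data, deploying a changed formula means distributing a new file; only the small interpreter has to exist on each platform. The price is the interpretation overhead, which is the performance concern acknowledged above.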


Scalability

Carriers require a PMS to be scalable, since there are multiple usage scenarios, such as:
• A workstation/notebook environment
• Working as part of a system controlled by a transaction monitor (e.g. CICS, IBM’s Customer Information Control System) with a large number of concurrent users
• Working as part of a mainframe batch environment that processes millions of units
• Working on servers in a web environment with thousands of requests within one hour⁵

Hence the internal technical architecture of a PMS must guarantee flexible scalability, not only on single-processor systems but also on multiprocessor machines.

Integration

The ability to integrate is different from the ability to be integrated into business applications. Ability to integrate means that existing product components, tables, calculation boxes (see also ‘migration methods’ below), etc. can be integrated or wrapped seamlessly. This enables a PMS to be used as a commoditized interface (using plug-in mechanisms, user exits, etc.) for any external product-related components.
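A plug-in mechanism of the kind mentioned under ‘Integration’ might look like the following Python sketch. All names (`PluginRegistry`, `register`, `call`, `legacy_annuity_factor`) are hypothetical; the legacy routine is a stand-in written in Python, whereas a real calculation box would more likely be a compiled library reached through a foreign-function interface.

```python
from typing import Callable, Dict


class PluginRegistry:
    """One uniform entry point for wrapped external calculation routines."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Callable[..., float]] = {}

    def register(self, name: str):
        """Decorator that makes a wrapper available under a service name."""
        def decorator(func: Callable[..., float]) -> Callable[..., float]:
            self._plugins[name] = func
            return func
        return decorator

    def call(self, name: str, **params: float) -> float:
        if name not in self._plugins:
            raise KeyError(f"no plug-in registered under '{name}'")
        return self._plugins[name](**params)


registry = PluginRegistry()


def legacy_annuity_factor(i: float, n: int) -> float:
    """Stand-in for an existing calculation box routine: present value of
    an annuity-immediate of 1 per period at interest rate i over n periods."""
    return (1 - (1 + i) ** -n) / i


@registry.register("annuity_factor")
def annuity_factor(interest_rate: float, periods: int) -> float:
    # The wrapper maps the uniform parameter names to the legacy signature.
    return legacy_annuity_factor(interest_rate, periods)


factor = registry.call("annuity_factor", interest_rate=0.05, periods=1)
```

Callers see only the service name and named parameters, so a wrapped routine can later be replaced by a native implementation without touching the calling applications.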

3.5 Migrating into a New Architecture

Earlier in this article we described the most important reasons for a migration. From our point of view a migration to the suggested architecture is suited to achieving a lower total cost of ownership (TCO). Whatever a CIO’s reasons are in detail, TCO is always on his or her mind. Figure 3.2 shows the two major options for a carrier to transform its business application architecture. Before a company decides to go for a new architecture, the major questions listed earlier in this article have to be answered clearly. Considering the given constraints (technical, business, organizational, etc.) and the actual requirements, our experience is that many companies come to the conclusion that a smart consolidation of their legacy systems, based on our approach of a product server based business application architecture, limits the risks of a migration better than introducing a completely new ensemble of core business applications (e.g. contract administration system plus business partner plus commission and charge, and so on). In our view a company will rather accept the moderately increased costs of the consolidation approach than the risk of a much more expensive total failure of a new system landscape.

⁵ In this context carriers require so-called ‘24 × 7’ availability, which means that the system must be up and running 24 hours a day, 7 days a week and 365 days a year.

[Figure 3.2 (diagram not reproduced). Two panels, ‘consolidate or innovate?’ and ‘... consolidate, then innovate!’, each rated by costs, risk and benefit. Labels: original architecture, consolidated architecture, innovated architecture (blueprint), PMS. Legend: original component, renovated component, (re)developed component]

Figure 3.2. Costs, risks and benefits for consolidation and innovation

Figure 3.3 displays the main milestones in our program scope from the blueprint to an innovated architecture.

[Figure 3.3 (diagram not reproduced). Title: ‘5 milestones to an innovated product server based architecture’. Labels: blueprint, product data analysis, repository, consolidated architecture, innovated architecture; analyze, reengineering, renovate components, substitute or enhance components. Axis: t]

Figure 3.3. The milestones and activities to the drafted blueprint


3.5.1 Blueprint

The blueprint⁶ of the future IT architecture displays the company’s specific ‘interpretation’ of the ‘product server based business application architecture’. Based on the blueprint, the following questions have to be answered clearly for each component:
1. Is there a need to retain the component in the new architecture?
2. Is it possible to compensate for the component’s deviations from new requirements by renovation?
3. If the answer to question 2 is ‘No’: Are the costs and risks of a substitution (make or buy) appropriate to the expected benefits?
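The three questions form a small decision procedure that can be applied to every component in the blueprint. The Python sketch below is one possible reading, not a prescribed method: the enum `Decision`, the function `assess` and the outcomes for a ‘No’ to question 1 and a ‘No’ to question 3 are assumptions, since the text leaves those cases open.

```python
from enum import Enum


class Decision(Enum):
    DROP = "not part of the new architecture"       # assumed outcome
    RETAIN = "retain, renovating where necessary"
    SUBSTITUTE = "substitute (make or buy)"
    UNRESOLVED = "revisit the blueprint"            # assumed outcome


def assess(needed: bool, renovation_sufficient: bool,
           substitution_justified: bool) -> Decision:
    """Apply questions 1 to 3 to a single component."""
    if not needed:                 # question 1
        return Decision.DROP
    if renovation_sufficient:      # question 2
        return Decision.RETAIN
    if substitution_justified:     # question 3
        return Decision.SUBSTITUTE
    return Decision.UNRESOLVED


# Hypothetical component inventory with the answers to the three questions.
components = {
    "contract administration": (True, True, False),
    "offline sales system": (True, False, True),
    "old rate table viewer": (False, False, False),
}
plan = {name: assess(*answers) for name, answers in components.items()}
```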

3.5.2 Product Data Analysis

The second milestone is fundamental to the architectural transformation laid down in the blueprint: the analysis of “metadata” or “product data”. The existing components are analyzed with a clear focus on the product-based metadata they contain, referred to hereafter as product data. Such product data is usually scattered throughout databases or is an integral part of the program code of the components. The product data found in the databases usually includes product module information such as product names or product selling periods, quotation rates or even simple business rules. The program code itself contains metadata or functionality which cannot be stored in database structures or should not be stored there for performance reasons (e.g. financial or insurance calculation instructions). Some of these program code units are so-called calculation boxes. In this case it is sometimes reasonable to reuse the calculation box within the PMS.

3.5.3 Repository

This milestone is the basis for the consolidated architecture (renovation scenario) as well as for the innovated architecture (substitution or enhancement scenario). The repository is vital and comprises the entire (formerly scattered) product data. In general, the design of the repository has to meet current and future business requirements.

⁶ A blueprint is, conceptually, a plan or design documenting an architecture or an engineering design. While this may be as simple as a quick sketch or outline using pencil and paper, a blueprint is traditionally produced by the contact printing process of cyanotype, which yields the familiar white lines on a blue background. The term is often used metaphorically to mean any detailed plan (Wikipedia 2006).

Redesign the Product Model?

We have learned in the past that professionals love to discuss this issue. In our view, decisions on it should be driven primarily by the business requirements. If the business requirements can be satisfied by consolidation (and probably some enhancements) for the next 3–5 years, the existing contract administration systems are the benchmark. If not, the company has to work carefully on its future product model design because it has to satisfy the business needs for the next ten or more years. Everything in between is a compromise or temporary solution. It is well known that such solutions live longer than anyone expected them to. In general a compromise implies advantages and disadvantages. Complex interfaces, and therefore performance trouble during batch processing, are a possible disadvantage in this case. In our experience performance losses occur mainly when writing to and reading from the product server (the runtime environment of a PMS). Instead of a bad compromise we prefer the following way:
• Create a product model based on the consolidated architecture.
• Define a product model that satisfies your future business requirements.
• Lay one upon the other and analyze the differences.

Consider elements of the future model which are not part of the current model as harmless enhancements of the current model. Allow for small workarounds for existing model concepts which could affect the future usability of the product model. Furthermore, a product design guideline has to ensure that newly developed products fit easily into the future model.

Redesign the Products?

Once the product model has been fixed, the next decision concerns the products themselves. The question is: do we have to (re-)implement every product or only new products? We believe that a full migration of all products would be the optimum, but in fact business constraints are the real driver of this decision.
Therefore, both the current and future product portfolios have to be analyzed and the following questions have to be answered:
• Is there a small number of products in the portfolio which are too expensive to migrate? Are there ‘exotic’ products which would also be too expensive? Is it possible to convert the policies that use these products? (If not, consider terminating the policies.)
• Are there any policies in the database which cannot be sold or adjusted anymore? Are there products with a very small number of active policies? If so, these products need not be migrated; it is cheaper to manage the policies manually. The workaround in this case is to create a ‘one size fits all’ product⁷ and link it to the ‘read-only’ policy.

⁷ This is some sort of ‘dummy’ product without quotation rates or any other product-specific business functions.

All other products have to be migrated into the PMS and, if necessary, adapted to additional/future business requirements (e.g. classification for the data warehouse system).

A Direct Access to the Repository?

From our point of view components should be modified only during the next step (consolidated architecture). Therefore we do not yet access the repository directly through its interface. The question is how the components can use the repository anyway. In general, database tables contain the majority of the product metadata, and these tables can be maintained via the repository in this way:
• The product development department maintains the repository.
• A small SQL (Structured Query Language) product data export tool transfers the data from the repository into the database tables.

The advantages are (a) a standardized definition and maintenance of product data and (b) central storage. Argument (a) also improves quality, as every certified quality assurance professional knows from experience. And last but not least (c): other applications/systems (e.g. business intelligence/data warehouse) can access the repository information via a standard interface.
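The export step described above can be pictured with a few lines of Python and the standard `sqlite3` module. This is a sketch under assumptions: the repository is represented as a plain list of records, and the table name `product`, its columns and the sample rows are invented for illustration. A production tool would target the carrier's actual database system and table layout.

```python
import sqlite3

# Hypothetical repository content maintained by product development.
REPOSITORY = [
    {"product_code": "TL01", "name": "Term Life", "sales_start": "2006-01-01", "sales_end": None},
    {"product_code": "HH02", "name": "Household Basic", "sales_start": "2005-07-01", "sales_end": "2006-12-31"},
]


def export_products(conn: sqlite3.Connection, repository: list) -> None:
    """Copy product data from the repository into the table read by the
    existing components. Existing rows with the same key are replaced."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS product (
               product_code TEXT PRIMARY KEY,
               name         TEXT NOT NULL,
               sales_start  TEXT,
               sales_end    TEXT
           )"""
    )
    with conn:  # commit on success, roll back on error
        conn.executemany(
            """INSERT OR REPLACE INTO product
                   (product_code, name, sales_start, sales_end)
               VALUES (:product_code, :name, :sales_start, :sales_end)""",
            repository,
        )


conn = sqlite3.connect(":memory:")
export_products(conn, REPOSITORY)
export_products(conn, REPOSITORY)  # re-running the export does not duplicate rows
row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
names = [r[0] for r in conn.execute("SELECT name FROM product ORDER BY product_code")]
```

The components keep reading their familiar tables, while the repository becomes the only place where product data is edited.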

3.5.4 Consolidated Architecture We have already pointed out that most carriers want to reduce their project risks or at least spread them. The ‘5 milestones to an innovated product server based architecture’ are our suggestion to spread the risks. As we have stated prior in this article the blueprint milestone fixes (among other things) which components can be retained and if necessary renovated. Common renovation issues are: Mathematical Subroutines Are mathematical subroutines (‘calculation box’) reusable within the PMS? FS companies often use for several reasons encapsulated subroutines for mathematical instructions. Sometimes it is reasonable, sometimes its even a (temporarily) ‘must’ to reuse these instructions. In general a State-of-the-Art PMS is able to integrate such encapsulated subroutines via a standard interface. The benefit is that the business applications just call the product server. If a carrier has several subroutines that have to be reused, a PMS is a smart option to substitute the different interfaces with one standard interface to the product server. Second, subroutines that were not available before can be made available to every business application that runs an OS which is appropriate for the subroutines. Third, within the PMS every method is able to use every subroutine that is integrated. Replacing Embedded Functionality What about embedded functionality that could be replaced by a service provided by the product server? What are the risks, what are the benefits? Its common sense

3 Product Management Systems in Financial Services Software Architectures


that cutting off embedded functionality is risky because in most cases it entails subsequent in-depth adaptations. The quality of the adaptations in question has to be considered carefully. In case of doubt we prefer the safe road to the consolidated architecture and bundle this kind of adaptation for a later implementation, when we go for the last milestone, the innovated architecture.

Minor Process Improvements

Even today it is not a matter of course that business applications prevent wrong user input, wrong selections, clicks on wrong buttons and so on. A product server provides services that are not difficult to integrate and that significantly improve the usability of a process. Some examples:

• Immediate validation of every user input and, in case of a mistake, a suggestion for a context-specific correction
• Delivery of results as soon as they can be calculated, without clicking the ‘calc’ button
• Context-specific filtering of selection lists: only currently valid items can be selected, invalid items are not visible
• Context-specific disabling of controls (input fields, lists, buttons, etc.) if their availability makes no sense

This kind of improvement also has to be considered carefully, because it depends on the application how easy or difficult it is to implement. Again, in case of doubt we prefer to bundle these modifications with a later implementation when components are replaced or enhanced.
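Two of the services listed above, input validation with a suggested correction and context-specific selection lists, can be sketched as follows. The rule tables, field names and limits are invented for illustration; a real product server would derive them from the product data.

```python
# Hypothetical product rules (assumed values, not taken from any real product).
FIELD_RULES = {
    "premium": {"min": 50.0, "max": 5000.0},
}
PAYMENT_MODES = {
    "monthly": {"min_premium": 50.0},
    "yearly": {"min_premium": 500.0},
}


def validate(field, value):
    """Return (is_valid, suggestion); the suggestion is the nearest valid value."""
    rule = FIELD_RULES[field]
    if value < rule["min"]:
        return False, rule["min"]
    if value > rule["max"]:
        return False, rule["max"]
    return True, value


def selectable_payment_modes(premium):
    """Context-specific selection list: only modes valid for this premium."""
    return [
        mode
        for mode, rule in PAYMENT_MODES.items()
        if premium >= rule["min_premium"]
    ]
```

An application component would call `validate` after each input and populate its list control from `selectable_payment_modes`, so that the rules themselves stay inside the product server.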

3.5.5 Innovated Architecture

This is the last milestone in our roadmap to an innovative architecture. All in all, the main objective of an innovation is to satisfy future business requirements and to reduce TCO. A consolidated architecture is a preliminary stage for the implementation of future business requirements such as straight-through-processing⁸ (STP). During the initial step the company has developed its blueprint based on the architectural requirements:

• Business processes
• Straight-through-processing
• Outsourcing
• Collaboration

⁸ Although the management of a company recognizes the benefits of STP immediately, people with underwriting skills do not always do so. It is no secret that underwriters often maintain that ‘their work is an art and not a science’. Gaining commitment from the underwriters will be crucial for the project's success and is therefore a No. 1 duty for the project managers.


Christian Knogler, Michael Linsmaier

Regardless of a make-or-buy decision, the new application that will replace an existing one has to meet these functional requirements. So let's have a look at how we can satisfy these requirements during this step: Service requests to the product server are handled by a single generic mechanism⁹. Future enhancements of the business functionality provided by the product server are easier to implement and do not force program code modifications in the requesting application. Therefore the broad range of services we mentioned in “business considerations” can be provided. The standardized product data enable a common language for the interaction of the components. Furthermore, these components get product data which were not available before at all or were not available in that quality. Since, in addition, the same product services are available all over the process chain, the components can fulfill the straight-through-processing requirement. Because the new product model is the benchmark for newly developed components, there is no need for conversions (or at least only for a few, if they are inevitable) between the requesting application and the product server. Thus the overall performance of processes will be enhanced. A PMS is the appropriate development and maintenance environment that enables carriers to implement new products or product enhancements, thus satisfying the product-related business needs. To satisfy requirements that concern business processes, a PMS is important and more than half the battle. As we have exemplified in Table 1 (processes and services), a PMS is able to provide a variety of services, but the initial step to process improvement is process design.

When a company has created a clear vision of the innovated process and the project team is sure that this process will satisfy the business requirements, it derives the software requirements and differentiates between services delivered by the product server and functionality implemented in applications/components. Regardless of a make-or-buy decision, the new application that will replace an existing one has to meet these functional requirements. Bozovic and Jakubik (2006) give us a word of caution: “If IT can’t provide the agility not just to support but to help promote business innovation, the gap between the two will widen and then the business can and will tank.” We have experienced that this approach to process innovation and implementation is unfamiliar to companies that have not worked with a ‘product server based business application architecture’ before, and hence we want to sum up the important PMS aspects in this context:

• The product server provides a range of new services and makes available product data which were not available before at all or were not available in that quality.
• Service requests to the product server are handled by a single generic mechanism.¹⁰ Future enhancements of the business functionality provided by the product server are easier to implement and do not force program code modifications in the requesting application.
• The future product model is the benchmark for newly developed components. No conversions (or at least only a few, if they are inevitable) between the requesting application and the product server is a strong requirement.

⁹ Because not only calculation rules etc. are product specific but also the product structures (e.g. attributes) or even the types of product functions, the communication with a PMS must be built as a generic mechanism. This means that the mechanism has to use flexible structures for communication.
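The single generic mechanism summarized above can be pictured as one entry point that accepts a flexible key/value request and dispatches it to whatever services are registered for a product. The following is only a sketch under assumptions: the registry, the request layout and the sample calculation are hypothetical and do not describe the interface of any particular PMS.

```python
def calculate_sum_insured(params):
    # Hypothetical product function; the factor is made up for illustration.
    return params["premium"] * 100.0


# Registry of product services. New services are added here, so the
# requesting application needs no code change when functionality grows.
SERVICES = {
    ("ULF", "CalculateSumInsured"): calculate_sum_insured,
}


def handle_request(request):
    """Single generic entry point working on a flexible key/value structure."""
    key = (request["product"], request["service"])
    service = SERVICES.get(key)
    if service is None:
        return {"status": "error", "message": f"unknown service {key}"}
    return {"status": "ok", "result": service(request.get("params", {}))}


response = handle_request(
    {"product": "ULF", "service": "CalculateSumInsured", "params": {"premium": 124.0}}
)
```

The design choice illustrated here is that requests and responses are plain dictionaries rather than product-specific types, which is one way to obtain the flexible communication structures the footnote calls for.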

3.6 Impacts on Software Development

The Model-driven Approach and Product Data

It is obvious that software development has reached a new level of evolution. This evolution can be compared to the evolution in the textile industry, where materials are no longer made at home but by mechanized looms in factories. In the same way, software development has moved from pure programming activities with CASE tool support to today's model-driven architecture (MDA) approach. The goal of the MDA approach is to define technical specifications in a platform-independent model. They are then transferred to the corresponding concrete implementation by using platform-specific models (PSM), e.g. via model code generators for a .NET platform. This is an integrated path from a formally described technical specification (via UML¹¹) to executable components. The approach of a PMS is quite similar. A formal business model is implemented instead of traditional (database) programming in a common language or tool. This business model describes the structure of the (real) product data. In contrast to the MDA approach, the intermediate step of a platform-specific model is skipped, since the user of a PMS is (in general) a member of the product development or actuarial department. Hence “executable” product data have to be generated directly. A product model is able to provide a business description of an object, e.g. a “fund contract”. A technical class, for example one described with UML, in contrast describes an object abstractly, based on its technical structure. Thus, from a technical point of view, the product data illustrate an operative object more concretely than a “technical” class.

¹⁰ Because not only calculation rules etc. are product specific but also the product structures (e.g. attributes) or even the types of product functions, the communication with a PMS must be built as a generic mechanism. This means that the mechanism has to use flexible structures for communication.

¹¹ The Unified Modeling Language (UML) is a non-proprietary object modeling and specification language used in software engineering. UML includes a standardized graphical notation that may be used to create an abstract model of a system: the UML model. While UML was designed to specify, visualize, construct, and document software-intensive systems, UML is not restricted to modeling software. UML has its strengths at higher, more architectural levels and has been used for modeling hardware (engineering systems) and is commonly used for business process modeling, systems engineering modeling, and representing organizational structure (Wikipedia 2006).


[Figure 3.4 shows three boxes. Technical Model: class Contract with the attributes Premium : float(idl) and SumInsured : float(idl) and the operation CalculateSumInsured() : float(idl). Product Data: Unit Linked Fund Product with ShortName: ULF, Premium := 124, SumInsured and CalculateSumInsured[]. Objects: FundContract4711 MsMeier : Contract with Premium : float(idl) = 124 and SumInsured : float(idl) = 250000. The object is an instance of the technical class and is described by the product data.]

Figure 3.4. Product data as a description of operative objects
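The relationship shown in Figure 3.4 can be restated in a few lines of code. This is a sketch under assumptions: the figure names the operation CalculateSumInsured but gives no formula, so the rule below is a placeholder that simply returns the value shown in the figure, and the dictionary layout of the product data is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """Technical model: describes the structure of a contract abstractly."""
    premium: float
    sum_insured: float


# Product data: describes one concrete product in business terms.
UNIT_LINKED_FUND = {
    "short_name": "ULF",
    "premium": 124.0,
    # Placeholder rule (assumption): returns the value shown in the figure.
    "calculate_sum_insured": lambda premium: 250000.0,
}


def create_contract(product):
    """Create an operative object as described by the product data."""
    premium = product["premium"]
    return Contract(
        premium=premium,
        sum_insured=product["calculate_sum_insured"](premium),
    )


fund_contract_4711 = create_contract(UNIT_LINKED_FUND)
```

The class says only that a contract has a premium and a sum insured; the product data say which values and rules apply, which is why the text calls the product data the more concrete description.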

Unified Modeling Language as a Product Definition Language?

From our point of view, UML is not appropriate for business users to specify the business aspects of a product. Why? On the one hand, UML is a modeling language/tool for software engineers rather than for FS product development specialists. On the other hand, UML would give rise to a myriad of problems that business users do not have to face without UML (e.g. large product libraries could comprise thousands of product modules). Last but not least, the integration of the product development process into a standard software development process is very difficult, as we have learned from the past. This led the authors at the beginning of the 1990s to the conclusion that business users in an FS product development department need a specific business-oriented development environment.

Different Development Processes

Earlier in this chapter we have pointed out that the traditional way of implementing FS products is not sufficient. Introducing a PMS enables the IT department to concentrate on technical issues, while the product development department provides ready-to-run software modules that encapsulate the entire product-specific information and the required business/actuarial functionality. Although this seems to make things easier for the IT department, it has to cope with a new kind of requirement: application components have to be enabled to act in accordance with context-sensitive results, outcomes and messages provided by the product server, or even to delegate functionality to the product server. The product development department, on the other hand, delivers all the business/actuarial functionality.

[Figure 3.5 contrasts two processes. Software development process: Software Engineer, Platform Independent Model, Platform Specific Model, Platform Code, Application Component. Product development process: Product Expert, Model-oriented Language, Product Data, Interpretable Product Data Information, Product Management System Component.]

Figure 3.5. Different development processes

An important question that IT departments and product development departments ask is: “What is our functional and data scope?” The rule of thumb is that everything that is product-specific should be implemented by the product development department using a PMS and provided by the product server.

3.7 Conclusion

To cope with today's challenges, IT must provide a flexible, nimble architecture that not only offers tangible benefits today but also adapts to whatever tomorrow brings. A product server based business application architecture offers many tangible benefits:

• A well-integrated PMS is a considerably faster tool for developing products than existing methods.
• The integration means much less involvement by IT staff in taking products from concept to rollout.
• This architecture is well suited to a smart consolidation approach when a carrier wants to migrate several contract administration systems into one, because the entire product management is vital to an FS company and has to be consolidated first.
• After a successful consolidation of the product management, the company can start to renovate or replace other components in its architecture at much lower risk than before, because there is one platform for all new and existing products.
• One platform for all products and their related services is a core element in a straight-through-processing strategy that, in combination with a consolidated administration environment, presents enormous potential cost savings, since many carriers currently rely on multiple systems for their broad array of products and for core processes that have to be automated. There is no other way to process standard business with the speed and quality that clients expect, while shareholders expect an increase in productivity, revenue and profit on the one side and a proportional decrease in costs on the other.

References

Bozovic A, Jakubik M (2006) Time to Say Goodbye. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1059
Cyphert M in Hyle RR (2006) Taking the Big Step. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1061
Gruhn V (2005) Agieren statt reagieren. vb-versicherungsbetriebe 4/2005, 20−21
Hahn C-P (2005) Metadaten, Application Understanding und Software Migration. Gesellschaft für Informatik e. V., Softwaretechnik Trends Band 25 Heft 4, ISSN 0720-8928, 28−29
Harris-Ferrante K in Hyle RR (2006) Taking the Big Step. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1061
Hersh C (2006) Power Toolbox. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1060
Lier M (2005) Dies und das aus der elektronischen Versicherungswelt. Die IT-Budgets regiert der Rotstift. Versicherungswirtschaft 7/2005, 480−487
Openmdx.org: openMDX introduction: Version 1.0. http://www.openmdx.org/documents/introduction/openMDX_Introduction.pdf
Riemer O (2005) Vorsorge Lebensversicherung AG: Die Versicherungsfabrik. I.VW Management-Information 4/2005, 14−19
Traeger M (2004) Automatisierung/Industrialisierung im Schadenmanagement. In: El Hage B, Jara M (eds): Schadenmanagement. Grundlagen, Methoden und Instrumente, Praktische Erfahrungen. I.VW Management-Information 6/2003, 94−112
Watzlawick P (1983) Anleitung zum Unglücklichsein. Piper, München
Wikipedia contributors (2006) Unified Modeling Language. Wikipedia, The Free Encyclopedia. Retrieved February 25, 2006 from http://en.wikipedia.org/w/index.php?title=Unified_Modeling_Language&oldid=41083251

CHAPTER 4 Management of Security Risks – A Controlling Model for Banking Companies Ulrich Faisst, Oliver Prokein

4.1 Introduction

Increasing virtualization of business processes and the growing adoption of the ICT involved increase the importance of security risks. The “Electronic Commerce Enquête IV” survey carried out in August 2004 concluded that the majority of German banking companies plan to increase their investments in ICT within the next two years (Sackmann and Strüker 2005). However, the rising deployment of ICT implies growing security risks. To lower security risks, banking companies may invest in technical security mechanisms and insurance policies. Generally, the more a banking company invests in technical security mechanisms and insurance policies, the lower are the expected losses and the opportunity costs of the economical capital charge, and vice versa (Faisst 2004). Overall, a trade-off exists between the expected losses and the opportunity costs of the economical capital charge on the one hand and the investments in technical security mechanisms and insurance policies on the other.

In practice, such investment decisions depend on explicit responsibilities within a banking company. Where an explicit responsibility exists, the decision-maker might tend to make every possible investment within his budget, although from a holistic view not every investment is profitable. If no explicit responsibility exists, the decision-maker might tend to minimize costs and therefore neglect further investments, although such investments are profitable from a holistic view.

This contribution¹ aims at developing an optimization model that is able to map the described trade-off between the expected losses and the opportunity costs of the economical capital charge on the one hand and the investments in security mechanisms and insurance policies on the other in a decision calculation. Moreover, the model helps to allocate available budgets to security mechanisms (ex-ante prevention) and insurance policies (ex-post risk transfer) in an efficient way. In order to present the state of the art of the management of security risks, we first portray the risk management cycle.

¹ This contribution is an enhanced and revised version of the following already published contribution: Faisst U, Prokein O (2005) An Optimization Model for the Management of Security Risks in Banking Companies. In: Müller G, Lin K-J (eds.) Proceedings of the Seventh IEEE International Conference on E-Commerce Technology CEC 2005, Munich, IEEE Computer Society Press, Los Alamitos, CA, pp. 266–273.

4.2 The Risk Management Cycle

The following risk management cycle in Figure 4.1 illustrates the main activities to handle security risks. The cycle contains four phases: identification, quantification, controlling and monitoring (Piaz 2001).

Figure 4.1. The risk management cycle

4.2.1 Identification Phase

Within the scope of the identification phase the security risks are identified and classified. Security in ICT covers a wide range, from the physical protection of the hardware to the protection of personal data against deliberate attacks (Müller et al. 2003). In an open information system like the Internet, one cannot assume that all parties involved (such as communication partners, service providers etc.) trust or even know each other (Müller and Rannenberg 1999). Therefore, the analysis of security risks requires not only the observation of external attackers but also the inclusion of all parties involved as potential attackers. The concept of multilateral security (Rannenberg 1998) considers the security requirements of all parties involved. The security risks result from threats to the so-called four protection goals of multilateral security according to Müller and Rannenberg (1999):

• Confidentiality,
• Integrity,
• Accountability,
• Availability.

Confidentiality

Firm-specific data or personal data of individual users must not become known to unauthorized users:

• Message contents have to be confidential with respect to all parties apart from the communication partner,
• communication partners (sender and/or recipients) have to be able to remain anonymous to each other. It should further not be possible for other parties to observe them,
• neither potential communication partners nor other parties should be able to determine the current location of a mobile terminal or its user without his consent.

Integrity

A change of messages by unauthorized people must not go unnoticed, i.e. manipulation of messages must be detected.

Accountability

Accountability of communication transactions is indispensable for the required internalization of actions. Otherwise, responsibilities and liabilities, efficient incentive schemes, and the definition of property rights and their transactions cannot be established:

• The recipient of a message should be able to prove to a third party that a certain communication partner has sent the message,
• the sender of a message should be able to prove and verify the sending of a message together with its actual content. Beyond that, it should be possible for the sender to prove that the message was received,
• the payment for the used services cannot be withheld from the provider, or at least the provider receives sufficient evidence that a service was requested. Conversely, the service provider can only require payment for services which have been correctly provided.

Availability

Restricted availability can lead to material or immaterial impacts, i.e. the communication network has to enable communication between all partners who require it.


The concept of multilateral security is one possible concept for defining security risks. Moreover, to quantify the security risks it is necessary to consider the attacks that threaten and violate the four protection goals. According to the CSI/FBI Computer Crime and Security Survey (CSI/FBI 2005), viruses, unauthorized access and the theft of proprietary information are the most serious attacks. The following table illustrates the protection goals of multilateral security, selected attacks and potential economic impacts. The probability of loss occurrence arises from the observed attacks, the amount of losses from the economic impacts:

Table 4.1. Economic impacts of attacks

Protection goal | Selected attacks | Potential economic impact
Confidentiality | Theft of data, access misuse etc. | Loss of competitive advantage, third-party liability claims, penalties etc.
Integrity | Sabotage, man-in-the-middle attack, computer bug etc. | Loss of data, business interruption, sales shortfall etc.
Accountability | IP spoofing, social hacking, inadequate access control etc. | Loss of image, business interruption, third-party liability claims etc.
Availability | Distributed denial-of-service attack, computer virus, server failure etc. | Loss of recovery, loss of market share etc.

The threats can be classified into different categories. The German “Bundesamt für Sicherheit in der Informationstechnik (BSI)” (Federal Office for Information Security) distinguishes between the categories act of nature beyond control (e.g. fire), organizational deficiencies (e.g. insufficient maintenance), human error (e.g. incorrect data input), technical failure (e.g. server failure) and deliberate act (e.g. theft of hardware) (BSI 2004). According to Basel II, banks have to charge capital for operational risk. Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events. This definition includes legal risk, but excludes strategic and reputational risk (Basel Committee on Banking Supervision 2001). Security risks are a subset of operational risks. Figure 4.2 compares the loss event type classification according to Basel II with the loss event type classification according to BSI and shows that security risks are included in a large number of categories of operational risk.

[Figure 4.2 maps two classifications onto each other by means of examples. Loss event type classification according to Basel II: internal fraud; external fraud; employment practices and workplace safety; clients, products & business practices; damage to physical assets; business disruption and system failures; execution, delivery & process management. Loss event type classification according to BSI: act of nature beyond control; organizational deficiencies; human error; technical failure; deliberate act. Examples shown in the figure: fire, insufficient maintenance, theft of hardware, careless erasure of objects, computer virus, theft of data, insufficient net capacity, misuse of confidential information, server switch-off in going concern, breakdown of electric power supply, terror attack, negligent demolition of hardware, incorrect data input, server failure.]

Figure 4.2. Security risks as a subset of operational risks

4.2.2 Quantification Phase

The identified security risks are measured by different methods within the quantification phase (Cruz 2002). So far, no quantification model has been developed specifically for the measurement of the security risks defined above. For the measurement of operational risk, of which security risks are a subset, the Basel Committee on Banking Supervision (2004) suggests five different quantification methods in order to determine the economical capital charge. The methods range from simple, factor-based approaches to complex stochastic loss distribution models based on the Value-at-Risk (Basel Committee on Banking Supervision 2004). Beyond that, further methods exist for the quantification of operational risks, for instance questioning techniques or causal methods such as Bayesian Belief Networks (Faisst and Kovacs 2003). Selected methods are described in more detail in the following.

4.2.2.1 Questioning Techniques

Questioning techniques subsume methods like expert interviews and self-assessments by responsible managers. Operational risks are identified and quantified by using structured interview guidelines as well as management workshops. Besides the identification and quantification of operational risks, questioning techniques are used to raise the awareness of operational risk within the company.


4.2.2.2 Indicator Approaches

Indicator approaches use a specific indicator (single indicator approaches) or a set of key indicators (key indicator approaches) to determine the amount of operational risk indirectly. Such indicators are selected on the basis of empirical surveys as well as expert opinions, provided that a relationship to operational risk can be assumed. An example of a single indicator approach is the Basic Indicator Approach (Basel Committee on Banking Supervision 2001), in which the gross income of a banking company is used as an exposure indicator and multiplied by a factor to determine the economical capital charge. Key indicator approaches consider a set of specific indicators. These comprise key indicators for historic losses and company-specific risk indicators, such as system downtime, employee loyalty or the number of transactions. Such indicators may finally be gathered in a scorecard for operational risk.
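The Basic Indicator Approach mentioned above reduces to a single multiplication. The sketch below follows the form in the final Basel II framework of 2004, where the factor (alpha) is 15% and is applied to the average of the positive annual gross income figures of the previous three years; the function name and the sample figures are illustrative assumptions.

```python
def basic_indicator_charge(gross_income_last_3_years, alpha=0.15):
    """Capital charge = alpha x average of the positive annual gross incomes.

    Years with zero or negative gross income are left out of both the
    numerator and the denominator.
    """
    positive = [gi for gi in gross_income_last_3_years if gi > 0]
    if not positive:
        return 0.0
    return alpha * sum(positive) / len(positive)


# Illustrative figures (in millions), not taken from any real bank.
charge = basic_indicator_charge([100.0, -20.0, 140.0])
```

The approach needs no loss data at all, which is why it is attractive for small institutions, but it also means the charge does not react to investments in security mechanisms.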

4.2.2.3 Stochastic Methods

Stochastic methods use distribution functions to describe the level of operational risk. One of these methods is the operational Value-at-Risk. This approach is based on the general Value-at-Risk approach, which was originally developed for market risks (Beeck and Kaiser 2000). The frequency and severity of events are normally forecast by running simulations based on historical loss data.
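A simulation of this kind can be sketched as follows. The distributional choices are assumptions made for illustration only (Poisson-distributed event frequency, lognormal severity, made-up parameters); in practice both distributions would be fitted to historical loss data.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed event count (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_annual_losses(rng, lam, mu, sigma, n_years):
    """Simulate the aggregate loss of each year: frequency first, then severities."""
    losses = []
    for _ in range(n_years):
        n_events = poisson(rng, lam)
        losses.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return losses


def value_at_risk(losses, confidence):
    """Empirical quantile of the simulated aggregate loss distribution."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]


rng = random.Random(42)  # fixed seed so that the run is reproducible
losses = simulate_annual_losses(rng, lam=5.0, mu=10.0, sigma=1.0, n_years=10_000)
expected_loss = sum(losses) / len(losses)
op_var = value_at_risk(losses, 0.999)
```

The difference between `op_var` and `expected_loss` corresponds to the unexpected loss that has to be covered by capital, while the expected loss is the mean of the simulated distribution.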

4.2.2.4 Causal Methods

Causal methods are used to analyse the relationship between sources and drivers of operational risk and the resulting losses. One such method is Bayesian Belief Networks, which can be used to connect historical data on past events with expert opinions on future events.

Table 4.2 classifies the described quantification methods. Risk managers at banking companies face the problem of selecting and implementing approaches to quantify operational risk. Single indicator approaches appear to be suitable only for small banking companies, for which the use of more advanced approaches is too costly. Large banking companies have usually implemented stochastic methods, such as operational Value-at-Risk, in connection with questioning techniques. This helps banking companies to determine the level of operational risk on the basis of a large amount of historical data and expert opinions. Further development of quantification methods for operational risk is still required. An integrated and consistent method is needed to quantify operational risk for the banking company as a whole and for its business units. The creation of interfaces to other methods and risk types as well as a consistent aggregation of risk types is still a challenge.

Table 4.2. Overview on quantification methods²

Questioning techniques:
• Expert interviews (Piaz 2001)
• Self assessments (Piaz 2001)

Indicator approaches:
• Single indicator approaches
  − Basic Indicator approach (Basel Committee on Banking Supervision 2004)
  − Standardized approach (Basel Committee on Banking Supervision 2004)
  − Internal measurement approach (Basel Committee on Banking Supervision 2001)
  − CAPM-approach (Beeck and Kaiser 2000)
• Key indicator approaches
  − Key performance indicator approaches (Piaz 2001)
  − Key risk indicator approaches (Piaz 2001)
  − Scorecard approaches (Basel Committee on Banking Supervision 2001)

Stochastic methods:
• Loss distribution approach based on operational Value-at-Risk (Basel Committee on Banking Supervision 2001)
• Extreme value theory (Cruz 2002)

Causal methods:
• Bayesian Belief Networks (Gemela 2001)
• Neural Networks (Cruz 2002)

² Basel II approaches are based on indicator approaches (such as the Basic Indicator, Standardized, Internal Measurement or Scorecard Approach) as well as a stochastic method (Loss Distribution Approach based on operational Value-at-Risk).

4.2.3 Controlling Phase

Based on the identified and quantified operational risks, decisions on bearing, reducing, avoiding or transferring the security risks are made within the controlling phase. There are internal and external controlling instruments for operational risks:

• Internal controlling instruments are focused on the sources and drivers of operational risk and are used to prevent loss events caused by operational risks.
• External controlling instruments aim at transferring operational risks out of the company. External parties carry potential losses caused by operational risks. Such instruments are insurance policies, outsourcing of processes or systems, and alternative risk transfer instruments such as Operational Risk Linked Bonds.

4.2.4 Monitoring Phase

The monitoring phase encompasses all procedures and techniques, which are necessary for a continuous monitoring of operational risk. Thereby it is analyzed, if • all the occurred events have been prior identified as possible events, • the distribution of probabilities of occurrence of events and the distribution of severities of losses have been anticipated within the quantification phase, • the selected controlling measures have lead to the desired results. To summarize the state-of-the-art in each of the four presented phases of Operational Risk Management, Table 4.3 provides an overview on research questions in selected contributions: Table 4.3. Overview on phases and research in selected contributions Phase

Research questions

Identifica- • Which methods can tion be used to identify operational risks?

Method

Source

Descriptive analysis

(Basel Committee on Banking Supervision 2001; Brink 2000; Cruz 2002; Eller et al. 2002; Jörg 2002; Faisst and Kovacs 2003; Lokarek-Junge and Hengmith 2003; Marshall 2001; Piaz 2001)

• How can identified operational risks be categorised? • How can sources and drivers of operational risks be analysed?

Quantification

• Which methods can be used to quantify operational risks? • How can operational risks be aggregated? • How can rare events with large severity be quantified?

Case study (Hoffman 2002) (N = 20 banking companies) Descriptive analysis

(Brink 2000; Buhr 2000; Cruz 2002; Eller et al. 2002; Faisst and Kovacs 2003; Jörg 2002; Marshall 2001; Piaz 2001).

Case study (Hoffman 2002) (N = 20 banking companies)

4 Management of Security Risks – A Controlling Model for Banking Companies


Table 4.3. Continued

Phase: Controlling
• Research question: Which quantification method is suitable for which implementation area?
  Method and source: Descriptive analysis (Faisst and Kovacs 2003)
• Research question: Which amount of losses has been caused by operational risk (based on historical data)?
  Method and source: Empirical study, N = 89 banking companies (Basel Committee on Banking Supervision 2004); Simulation model (Beeck and Kaiser 2000)
• Research question: How much does a business process contribute to the aggregated operational risk?
  Method and source: Simulation model (Ebnöther et al. 2001)
• Research questions: Which instruments can be used to control the level of operational risk? Which impact do these controlling instruments have on the frequency and severity of events caused by operational risk?
  Method and source: Descriptive analysis (Brink 2000; Cruz 2002; Eller et al. 2002; Jörg 2002; Locarek-Junge and Hengmith 2003; Marshall 2001; Piaz 2001); Case study (Spahr 2001); Case study, N = 20 banking companies (Hoffman 2002)

Phase: Monitoring
Research questions:
• Which procedures and methods should be implemented to monitor operational risk?
• Which requirements have to be considered to monitor operational risk in the most relevant business processes?
Method and source:
• Descriptive analysis (Basel Committee on Banking Supervision 2001; Basel Committee on Banking Supervision 2004; Brink 2000; Cruz 2002; Eller et al. 2002; Locarek-Junge and Hengmith 2003; Marshall 2001; Piaz 2001)
• Case study, N = 20 banking companies (Hoffman 2002)


This contribution focuses on the controlling phase and the steering of security risks by implementing security mechanisms to prevent loss events (ex-ante) and by using insurance policies to transfer parts of the losses in case of an event (ex-post). The contribution analyzes which combinations of ex-ante and ex-post controlling measures lead to an efficient solution.

4.3 A Controlling Model for Security Risks

The model aims to solve the described trade-off between the expected losses and the opportunity costs of the economical capital charge on the one hand and the investments in security mechanisms and insurance policies on the other. The amount to be invested in security mechanisms and insurance policies is thereby optimized.

4.3.1 Assumptions

The time horizon is a single period.

4.3.1.1 Assumption 1: Independence of a Single Information System
A single information system is considered, neglecting possible dependencies on other information systems.

4.3.1.2 Assumption 2: Relevant Cashflow Parameters
The expected total negative cashflow μ of the information system is composed of the expected losses due to security risks E(L), the opportunity costs of the economical capital charge OCC, the investments in security mechanisms I_SM and the investments in insurance policies I_Ins:

$$\mu = E(L) + OCC + I_{SM} + I_{Ins} \tag{4.1}$$

The cashflow items E(L), OCC, I_SM and I_Ins are estimated ex-ante.

4.3.1.3 Assumption 2a: Expected Losses
The expected loss E(L) results from multiplying the expected frequency of attempted attacks λ by the expected loss given events LGE (with LGE > 0). For simplicity, we assume constant LGE. We also assume that investments in security mechanisms can reduce the expected frequency of attempted attacks λ ex-ante by the so-called security level (SL = 1 − a). The security level represents the percentage of attacks prevented through the implementation of technical security mechanisms. To account for the impact of these mechanisms, the expected frequency of attempted attacks λ is multiplied by the factor a (with 0 < a ≤ 1), which represents the percentage of successful attacks. We further assume that investments in insurance policies can reduce the amount of losses LGE by the so-called insurance level (IL = 1 − b). The insurance level represents the percentage of the loss given events that is transferred, i.e. insured. To account for the impact of insurance policies, the expected loss given events LGE is multiplied by the factor b (with 0 < b ≤ 1), which represents the percentage of not insured loss given events. Thus, the expected losses E(L) are:

$$E(L) = E\left[(a \cdot N) \cdot (b \cdot LGE)\right] = (a \cdot E(N)) \cdot (b \cdot LGE) = (a \cdot \lambda) \cdot (b \cdot LGE) \tag{4.2}$$

with:
E(N) = λ: expected frequency of attempted attacks N,
a: percentage of successful attacks,
LGE: expected loss given events,
b: percentage of not insured loss given events.

We assume that the frequency of successful attacks Q is Poisson distributed. The Poisson distribution has the property that the variance equals the expected value. The variance σ_Q² is therefore:

$$\sigma_Q^2 = E(Q) = a \cdot \lambda \tag{4.3}$$

For constant LGE, the standard deviation of losses σ_L is given by:

$$\sigma_L = \sqrt{a \cdot \lambda} \cdot (b \cdot LGE) \tag{4.4}$$
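Equations (4.2) to (4.4) can be evaluated directly. The following Python sketch is only an illustration: the function names and the input values are assumptions made here, not part of the original model description.

```python
import math


def expected_loss(a, b, lam, lge):
    """Expected loss E(L) = (a * lambda) * (b * LGE), equation (4.2)."""
    return (a * lam) * (b * lge)


def loss_std(a, b, lam, lge):
    """Standard deviation of losses, equation (4.4).

    The number of successful attacks is Poisson with mean a * lambda (4.3),
    and every event costs the constant, not insured amount b * LGE.
    """
    return math.sqrt(a * lam) * (b * lge)


# Hypothetical inputs, chosen only for illustration.
lam, lge = 0.3, 1000.0   # attempted attacks per period, loss given event
a, b = 0.5, 0.5          # share of successful attacks, not insured share

print("E(L)    =", expected_loss(a, b, lam, lge))
print("sigma_L =", loss_std(a, b, lam, lge))
```

Because the loss per event is constant, the variance of the losses is simply the Poisson variance a·λ scaled by (b·LGE)², which is what `loss_std` encodes.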

4.3.1.4 Assumption 2b: Opportunity Costs of Economical Capital Charge
Under assumption 2a of a Poisson distributed frequency of loss events and of constant LGE, the economical capital charge ECC can be determined as follows:

$$ECC = \gamma \cdot E(L) \tag{4.5}$$

The economical capital charge ECC results from multiplying E(L) by the so-called gamma-factor γ (see footnote 3). With an opportunity interest rate r, the opportunity costs of the economical capital charge OCC are given by:

$$OCC = r \cdot ECC = r \cdot \gamma \cdot E(L) \tag{4.6}$$

The opportunity costs of the economical capital charge OCC are deterministic. The standard deviation σ_OCC is therefore given by:

$$\sigma_{OCC} = 0 \tag{4.7}$$

3 The gamma-factor translates the estimate of expected losses into an estimate for the unexpected losses to be covered by capital charge (Basel Committee on Banking Supervision 2001).

4.3.1.5 Assumption 2c: Investments in Security Mechanisms
According to assumption 2a, increasing investments in security mechanisms I_SM imply decreasing expected losses E(L), and vice versa. We assume that the frequency of attempted attacks λ can be reduced ex-ante by implementing security mechanisms:

$$I_{SM} = \frac{\lambda \cdot LGE}{\left[a \cdot \lambda \cdot LGE\right]^{\beta}} - 1 \tag{4.8}$$

with: I_SM = 0 for a = β = 1; I_SM > 0 for 0 < a < 1 and 0 < β < 1. According to assumption 2a and equation (4.8), there is an inverse relationship between the investments in security mechanisms I_SM and the percentage of successful attacks a for a constant calibration factor β. This calibration factor determines the sensitivity of the relationship (with 0 < β ≤ 1). Like the opportunity costs of the economical capital charge, the investments in security mechanisms I_SM are deterministic. The standard deviation σ_SM is therefore given by:

$$\sigma_{SM} = 0 \tag{4.9}$$

4.3.1.6 Assumption 2d: Investments in Insurance Policies
According to assumption 2a, increasing investments in insurance policies I_Ins imply decreasing E(L), and vice versa. If the security risks fulfil the criteria of insurability, banking companies are able to reduce the extent of damage ex-post by investments in insurance policies:

$$I_{Ins} = \frac{\lambda \cdot LGE}{\left[b \cdot \lambda \cdot LGE\right]^{\delta}} - 1 \tag{4.10}$$

with: I_Ins = 0 for b = δ = 1; I_Ins > 0 for 0 < b < 1 and 0 < δ < 1. Analogous to the investments in security mechanisms, we assume an inverse relationship between the investments in insurance policies I_Ins and the percentage of not insured loss given events b for a constant calibration factor δ. This calibration factor determines the sensitivity of the relationship (with 0 < δ ≤ 1).

The investments in insurance policies I_Ins are deterministic, and the standard deviation is given by:

$$\sigma_{Ins} = 0 \tag{4.11}$$
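The four cashflow items of assumptions 2a to 2d can be collected in one function. The sketch below is a minimal illustration of equations (4.1), (4.2), (4.6), (4.8) and (4.10); the function name, the dictionary keys and the parameter values in the usage lines are assumptions made here for illustration.

```python
def total_cashflow(a, b, lam, lge, r, gamma, beta, delta):
    """Return the cashflow items and their sum mu for given a and b."""
    if not (0 < a <= 1 and 0 < b <= 1):
        raise ValueError("a and b must lie in (0, 1]")
    exp_loss = (a * lam) * (b * lge)                    # (4.2)
    occ = r * gamma * exp_loss                          # (4.6)
    i_sm = lam * lge / (a * lam * lge) ** beta - 1      # (4.8)
    i_ins = lam * lge / (b * lam * lge) ** delta - 1    # (4.10)
    mu = exp_loss + occ + i_sm + i_ins                  # (4.1)
    return {"E(L)": exp_loss, "OCC": occ, "I_SM": i_sm, "I_Ins": i_ins, "mu": mu}


# Boundary case stated with (4.8) and (4.10): a = beta = 1 and b = delta = 1
# give zero investments, so mu reduces to (1 + r * gamma) * lam * LGE.
print(total_cashflow(a=1.0, b=1.0, lam=0.3, lge=1000.0,
                     r=0.1, gamma=7.3, beta=1.0, delta=1.0))
```

Only E(L) is stochastic in the model; OCC, I_SM and I_Ins are deterministic, which is why the standard deviations in (4.7), (4.9) and (4.11) are zero.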

4.3.1.7 Assumption 3: Solution Space with Continuous σ and Its Transformation on μ(σ)
We assume that any value σ ∈ (0, ∞) can be attained and that the corresponding cashflows can be mapped through the continuous function μ(σ) (see footnote 4). Only one σ can be realized; combinations are not possible.

4.3.2 Determining the Optimal Security and Insurance Level

In order to determine the optimal security and insurance level (SL*, IL*) and the corresponding optimal amounts to be invested in technical security mechanisms I*_SM and insurance policies I*_Ins, we assume a risk-neutral decision-maker who aims at minimizing the expected total negative cashflow μ. The expected total negative cashflow is obtained by substituting (4.2), (4.6), (4.8) and (4.10) into (4.1):

$$\mu = (1 + r \cdot \gamma) \cdot (a \cdot \lambda) \cdot (b \cdot LGE) + \frac{\lambda \cdot LGE}{(a \cdot \lambda \cdot LGE)^{\beta}} + \frac{\lambda \cdot LGE}{(b \cdot \lambda \cdot LGE)^{\delta}} - 2 \tag{4.12}$$

The first and second partial derivatives of equation (4.12) with respect to a are given by:

$$\frac{\partial \mu}{\partial a} = \lambda \cdot (1 + r \cdot \gamma) \cdot (b \cdot LGE) - \beta \cdot \frac{\lambda \cdot LGE}{(\lambda \cdot LGE)^{\beta} \cdot a^{\beta+1}} \tag{4.13}$$

$$\frac{\partial^2 \mu}{\partial a^2} = \beta \cdot (\beta + 1) \cdot \frac{\lambda \cdot LGE}{(\lambda \cdot LGE)^{\beta} \cdot a^{\beta+2}} > 0 \tag{4.14}$$

Setting (4.13) to zero yields the necessary condition, and (4.13) together with (4.14) the sufficient condition, for a minimum of the expected total negative cashflow with respect to a. Solving (4.13) for a leads to:

$$a = \frac{\left(\dfrac{\beta \cdot \lambda \cdot LGE}{b \cdot (1 + r \cdot \gamma)}\right)^{\frac{1}{\beta+1}}}{\lambda \cdot LGE} \tag{4.15}$$

4 Thus, it is assumed that any number of σ can be obtained. In reality only a finite number of discrete values of σ exist. Another simplification is the assumption that for any number of σ a continuous function μ(σ) exists.

The corresponding expression for the variable b is obtained analogously and is given by:

$$b = \frac{\left(\dfrac{\delta \cdot \lambda \cdot LGE}{a \cdot (1 + r \cdot \gamma)}\right)^{\frac{1}{\delta+1}}}{\lambda \cdot LGE} \tag{4.16}$$
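Conditions (4.15) and (4.16) each give one variable as a function of the other, so they can also be solved numerically by applying them in turn until both values stop changing. This alternating iteration is not the method used in the text (which solves the system in closed form); it is a sketch under the assumption that the iteration is started at a = b = 1, and all function names are hypothetical. The sketch does not enforce the domain (0, 1] of a and b.

```python
def best_response_a(b, k, c, beta):
    """Equation (4.15): minimising a for given b, with k = lambda * LGE, c = 1 + r * gamma."""
    return (beta * k / (b * c)) ** (1 / (beta + 1)) / k


def best_response_b(a, k, c, delta):
    """Equation (4.16): minimising b for given a."""
    return (delta * k / (a * c)) ** (1 / (delta + 1)) / k


def solve_first_order_conditions(lam, lge, r, gamma, beta, delta,
                                 tol=1e-12, max_iter=1000):
    """Alternate between (4.15) and (4.16) until a and b are stable."""
    k, c = lam * lge, 1 + r * gamma
    a, b = 1.0, 1.0
    for _ in range(max_iter):
        a_new = best_response_a(b, k, c, beta)
        b_new = best_response_b(a_new, k, c, delta)
        if abs(a_new - a) < tol and abs(b_new - b) < tol:
            return a_new, b_new
        a, b = a_new, b_new
    raise RuntimeError("iteration did not converge")


# Parameter values of Example 4.1 in this chapter.
a_opt, b_opt = solve_first_order_conditions(lam=0.3, lge=1000.0, r=0.1,
                                            gamma=7.3, beta=0.2, delta=0.6)
print(round(a_opt, 2), round(b_opt, 2))
```

In logarithms, one full round of the iteration scales the distance to the solution by 1/((β+1)(δ+1)), which is below one for positive β and δ, so the scheme converges for the parameter ranges of the model.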

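Substitution of (4.16) in (4.15) and of (4.15) in (4.16) leads to

$$a^* = (\lambda \cdot LGE)^{\frac{\delta - \beta \cdot (\delta+1)}{\beta \cdot (\delta+1) + \delta}} \cdot \left(\frac{\beta^{(1+\delta)}}{\delta \cdot (r \cdot \gamma + 1)^{\delta}}\right)^{\frac{1}{\beta \cdot (\delta+1) + \delta}} \tag{4.17}$$

$$b^* = (\lambda \cdot LGE)^{\frac{\beta - \delta \cdot (\beta+1)}{\beta \cdot (\delta+1) + \delta}} \cdot \left(\frac{\delta^{(1+\beta)}}{\beta \cdot (r \cdot \gamma + 1)^{\beta}}\right)^{\frac{1}{\beta \cdot (\delta+1) + \delta}} \tag{4.18}$$

The optimal security level SL* and insurance level IL* are (see assumption 2a)

$$SL^* = 1 - a^* \tag{4.19}$$

$$IL^* = 1 - b^* \tag{4.20}$$

and are therefore obtained by substitution of (4.17) in (4.19) and of (4.18) in (4.20):

$$SL^* = 1 - \left[(\lambda \cdot LGE)^{\frac{\delta - \beta \cdot (\delta+1)}{\beta \cdot (\delta+1) + \delta}} \cdot \left(\frac{\beta^{(1+\delta)}}{\delta \cdot (r \cdot \gamma + 1)^{\delta}}\right)^{\frac{1}{\beta \cdot (\delta+1) + \delta}}\right] \tag{4.21}$$

$$IL^* = 1 - \left[(\lambda \cdot LGE)^{\frac{\beta - \delta \cdot (\beta+1)}{\beta \cdot (\delta+1) + \delta}} \cdot \left(\frac{\delta^{(1+\beta)}}{\beta \cdot (r \cdot \gamma + 1)^{\beta}}\right)^{\frac{1}{\beta \cdot (\delta+1) + \delta}}\right] \tag{4.22}$$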
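The closed-form expressions (4.17) to (4.22) translate almost literally into code. The following sketch is an illustration only; the function name and the dictionary keys are assumptions, and the parameter values are those of Example 4.1 in this chapter.

```python
def optimal_levels(lam, lge, r, gamma, beta, delta):
    """Evaluate a*, b*, SL* and IL* from equations (4.17) to (4.20)."""
    k = lam * lge                       # lambda * LGE
    c = 1 + r * gamma                   # (r * gamma + 1)
    d = beta * (delta + 1) + delta      # common denominator of the exponents
    a_star = (k ** ((delta - beta * (delta + 1)) / d)
              * (beta ** (1 + delta) / (delta * c ** delta)) ** (1 / d))   # (4.17)
    b_star = (k ** ((beta - delta * (beta + 1)) / d)
              * (delta ** (1 + beta) / (beta * c ** beta)) ** (1 / d))     # (4.18)
    return {"a*": a_star, "b*": b_star,
            "SL*": 1 - a_star,          # (4.19)
            "IL*": 1 - b_star}          # (4.20)


levels = optimal_levels(lam=0.3, lge=1000.0, r=0.1, gamma=7.3, beta=0.2, delta=0.6)
print({name: round(value, 2) for name, value in levels.items()})
```

Note that the formulas do not by themselves guarantee a*, b* ∈ (0, 1]; for other parameter values the result should be checked against the domain stated in assumption 2a.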

The minimum of the expected total negative cashflow μ*(a*, b*) is obtained by substituting equations (4.17) and (4.18) into equation (4.12). In doing so, it is further possible to determine the optimal amounts to be invested in security mechanisms I*_SM and insurance policies I*_Ins:

$$I_{SM}^* = \frac{\lambda \cdot LGE}{\left[(\lambda \cdot LGE)^{\frac{2 \cdot \delta}{(1+\delta) \cdot \beta + \delta}} \cdot \left(\dfrac{\beta^{(1+\delta)}}{\delta \cdot (1 + r \cdot \gamma)^{\delta}}\right)^{\frac{1}{(1+\delta) \cdot \beta + \delta}}\right]^{\beta}} - 1 \tag{4.23}$$

$$I_{Ins}^* = \frac{\lambda \cdot LGE}{\left[(\lambda \cdot LGE)^{\frac{2 \cdot \beta}{(1+\delta) \cdot \beta + \delta}} \cdot \left(\dfrac{\delta^{(1+\beta)}}{\beta \cdot (1 + r \cdot \gamma)^{\beta}}\right)^{\frac{1}{(1+\delta) \cdot \beta + \delta}}\right]^{\delta}} - 1 \tag{4.24}$$


Figure 4.3. Minimum of the expected total negative cashflow (see footnote 5)

The mapped area in Figure 4.3 illustrates all possible μ(a,b)-combinations and the corresponding security and insurance levels (SL, IL) for given values of β and δ. However, there is only one minimum μ*(a*,b*) and therefore only one efficient (SL*,IL*)-solution.

Example 4.1: Consider the following instance of the model: β = 0.2; δ = 0.6; r = 0.1; LGE = 1,000; λ = 0.3; γ = 7.3. The optimal security and insurance levels are SL* = 0.58 (with a* = 0.42) and IL* = 0.9 (with b* = 0.1). A banking company would therefore invest I*_SM = 112.99 in technical security mechanisms and I*_Ins = 36.99 in insurance policies. The expected loss is E(L) = 13.18 and the opportunity costs of the economical capital charge are OCC = 9.62. The minimum of the expected total negative cashflow is therefore μ* = 172.78.

Result 4.1: For any given values of (β, δ), only one minimum of the expected total negative cashflow μ*(a*,b*), and hence only one (SL*, IL*)-solution, exists.
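The numbers of Example 4.1 can be cross-checked without using the closed-form solution by evaluating (4.12) on a grid over the domain of a and b. The sketch below is such a brute-force check; the grid resolution and all names are choices made here for illustration, and the located minimum is only as accurate as the grid spacing.

```python
def mu(a, b, lam=0.3, lge=1000.0, r=0.1, gamma=7.3, beta=0.2, delta=0.6):
    """Expected total negative cashflow, equation (4.12), with Example 4.1 defaults."""
    k = lam * lge
    return ((1 + r * gamma) * a * b * k
            + k / (a * k) ** beta
            + k / (b * k) ** delta
            - 2)


steps = 400                                        # grid spacing 1/400 = 0.0025
grid = [i / steps for i in range(1, steps + 1)]    # covers (0, 1]

# Smallest mu over all grid points, together with the point where it occurs.
mu_min, a_min, b_min = min((mu(a, b), a, b) for a in grid for b in grid)

print("mu_min =", round(mu_min, 2), "at a =", a_min, "and b =", b_min)
```

A grid search says nothing rigorous about uniqueness, but it is a quick way to confirm that the reported optimum is not an artefact of the algebra.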

4.3.3 Constraints and Their Impacts

In practice, constraints such as a limited economical capital charge and a limited budget affect the controlling of security risks. In the following, we analyze the impact of such constraints.

5 According to assumption 2a, the domains of a, b and SL, IL are a, b ∈ (0; 1] and SL, IL ∈ [0; 1), respectively. The closer SL and IL approach 1, the greater the expected total negative cashflow μ. Therefore, μ is not bounded and can rise infinitely. For illustration, the plotted graph shows all (SL, IL)-combinations within the domain [0; 0.99].


At first, we express the expected total negative cashflow as a function of the standard deviation. The standard deviation of the expected total negative cashflow σ_ETNC follows from (4.3), (4.7), (4.9) and (4.11):

$$\sigma_{ETNC} = \sigma_L = \sqrt{a \cdot \lambda} \cdot (b \cdot LGE) \tag{4.25}$$

$$a \cdot \lambda = \frac{\sigma_{ETNC}^2}{(b \cdot LGE)^2} \tag{4.26}$$

In the following, σ_ETNC is denoted as σ. We obtain E(L), OCC, I_SM and I_Ins as functions of σ by substituting (4.26) into (4.2), (4.6), (4.8) and (4.10):

$$E(L) = \frac{\sigma^2}{b \cdot LGE} \tag{4.27}$$

$$OCC = r \cdot \gamma \cdot \frac{\sigma^2}{b \cdot LGE} \tag{4.28}$$

$$I_{SM} = \frac{\sigma^2}{a \cdot b^2 \cdot LGE \cdot \left(\dfrac{\sigma^2}{b^2 \cdot LGE}\right)^{\beta}} - 1 \tag{4.29}$$

$$I_{Ins} = \frac{\sigma^2}{a \cdot b^2 \cdot LGE \cdot \left(\dfrac{\sigma^2}{a \cdot b \cdot LGE}\right)^{\delta}} - 1 \tag{4.30}$$

The expected total negative cashflow μ(σ) is obtained from (4.27), (4.28), (4.29) and (4.30):

$$\mu = \frac{\sigma^2}{b \cdot LGE} \cdot \left(1 + r \cdot \gamma + \frac{1}{a \cdot b} \cdot \left(\left(\frac{b^2 \cdot LGE}{\sigma^2}\right)^{\beta} + \left(\frac{a \cdot b \cdot LGE}{\sigma^2}\right)^{\delta}\right)\right) - 2 \tag{4.31}$$
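Since (4.31) is only a re-parameterisation of (4.12), both must return the same value whenever σ is computed from a, b and λ via (4.25). The sketch below checks this identity numerically; the function names and the test point (a = b = 0.5) are arbitrary choices made here, not values from the text.

```python
import math


def mu_ab(a, b, lam, lge, r, gamma, beta, delta):
    """Expected total negative cashflow in terms of a and b, equation (4.12)."""
    k = lam * lge
    return ((1 + r * gamma) * a * b * k
            + k / (a * k) ** beta
            + k / (b * k) ** delta
            - 2)


def sigma_of(a, b, lam, lge):
    """Standard deviation sigma, equation (4.25)."""
    return math.sqrt(a * lam) * b * lge


def mu_sigma(sigma, a, b, lge, r, gamma, beta, delta):
    """Expected total negative cashflow in terms of sigma, equation (4.31)."""
    s2 = sigma ** 2
    bracket = (1 + r * gamma
               + ((b * b * lge / s2) ** beta + (a * b * lge / s2) ** delta) / (a * b))
    return s2 / (b * lge) * bracket - 2


# Arbitrary illustrative point; the remaining parameters follow Example 4.1.
a, b, lam, lge = 0.5, 0.5, 0.3, 1000.0
r, gamma, beta, delta = 0.1, 7.3, 0.2, 0.6

sigma = sigma_of(a, b, lam, lge)
difference = mu_ab(a, b, lam, lge, r, gamma, beta, delta) - mu_sigma(sigma, a, b, lge, r, gamma, beta, delta)
print("difference between (4.12) and (4.31):", difference)
```

Note that λ no longer appears explicitly in `mu_sigma`: it has been eliminated through (4.26), while a and b remain as arguments.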

Equation (4.31) describes μ(σ) as a continuous function of σ that maps the domain σ ∈ (0; ∞) to μ(σ) ∈ (μ*; ∞). Figure 4.4 illustrates the trade-off between the expected losses E(L) and the opportunity costs of the economical capital charge OCC on the one hand and the investments in security mechanisms I_SM and insurance policies I_Ins on the other. The expected total negative cashflow μ(σ) thereby possesses only one minimum, μ*(σ*).

Example 4.2: Analogous to Example 4.1, we consider the following instance of the model: β = 0.2; δ = 0.6; r = 0.1; LGE = 1,000; λ = 0.3; γ = 7.3; SL* = 0.58 (with a* = 0.42); IL* = 0.9 (with b* = 0.1) and μ* = 172.78. Substituting these values in (29) leads to an optimal risk level of σ* = 36.17.

Figure 4.4. Trade-off between E(L) and OCC on the one hand and I_SM and I_Ins on the other (plot of μ(σ), E(L), OCC and I = I_SM + I_Ins over σ; the minimum μ* is attained at σ*)

As mentioned above, the minimum of the expected total negative cashflow μ*(a*,b*), and with it the (SL*,IL*)-solution, is derived from equation (4.12) in connection with (4.19) and (4.20). The corresponding optimal risk level σ* can be determined from equation (4.31) in conjunction with μ*.

4.3.3.1 Constraint 1: Risk Limits of the Economical Capital Charge

Setting limits on the economical capital charge determines the amount of risk a banking company is prepared to carry. We assume that a banking company defines a limit of the economical capital charge LECC for a single information system. In doing so, the set of feasible solutions is restricted. In order to illustrate the impact on the expected total negative cashflow, we consider two different limits of the economical capital charge LECC_i (i = 1, 2), both of which are shown in Figure 4.5. LECC_1 intersects the expected total negative cashflow in (μ1, σ1). In this case, the banking company is still able to realize the global minimum of the expected total negative cashflow in (μ*, σ*); the limit LECC_1 is therefore not fully used (σ* < σ1). In the second case, however, the limit LECC_2 intersects the expected total negative cashflow in the suboptimal solution (μ2, σ2). The expected total negative cashflow μ2 is greater than μ*, and the corresponding risk σ2 is accordingly smaller (σ2 < σ*).

Result 4.2: If the LECC exceeds the risk level σ* of the optimal solution (μ*, σ*), a banking company can still realize the minimal expected total negative cashflow μ*. However, if the LECC is smaller than σ*, a banking company can only realize suboptimal solutions.
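Result 4.2 can be illustrated numerically. The sketch below reads the limit as an upper bound on σ, which is how Figure 4.5 draws it; this reading, the name `sigma_max`, the two limit values and the grid search itself are assumptions made here for illustration. The parameters are those of Example 4.1.

```python
import math

LAM, LGE, R, GAMMA, BETA, DELTA = 0.3, 1000.0, 0.1, 7.3, 0.2, 0.6


def mu(a, b):
    """Expected total negative cashflow, equation (4.12)."""
    k = LAM * LGE
    return ((1 + R * GAMMA) * a * b * k
            + k / (a * k) ** BETA
            + k / (b * k) ** DELTA
            - 2)


def sigma(a, b):
    """Standard deviation of the expected total negative cashflow, equation (4.25)."""
    return math.sqrt(a * LAM) * b * LGE


def constrained_minimum(sigma_max, steps=300):
    """Smallest mu on a grid over (0, 1] x (0, 1] subject to sigma <= sigma_max.

    Returns a tuple (mu, a, b); raises ValueError if no grid point is feasible.
    """
    grid = [i / steps for i in range(1, steps + 1)]
    feasible = ((mu(a, b), a, b)
                for a in grid for b in grid
                if sigma(a, b) <= sigma_max)
    return min(feasible)


loose = constrained_minimum(sigma_max=100.0)   # hypothetical limit above sigma*
tight = constrained_minimum(sigma_max=20.0)    # hypothetical limit below sigma*
print("limit not binding:", round(loose[0], 2))
print("limit binding:    ", round(tight[0], 2))
```

With the loose limit the search returns the unconstrained minimum of Example 4.1; with the tight limit only combinations with lower risk remain, and the smallest attainable μ is strictly larger.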

Figure 4.5. Risk limits of economical capital charge (plot of μ(σ) over σ with the limits LECC_1 and LECC_2; marked values μ*, μ1, μ2 and σ2, σ*, σ1)

Analogous to the limits of the economical capital charge, budget limits restrict the set of feasible solutions. Their impact is analyzed in the following.

4.3.3.2 Constraint 2: Budget Limits

Budget limits BL limit the payments in business areas or, in our case, in information systems. Budgeting generally considers the expected losses E(L) ex-ante. However, L is a random variable that can exceed the defined budget limit ex-post. The excess amount can be covered through equity capital. Figure 4.6 illustrates the impact of a budget limit BL by way of example. A budget limit BL intersects the expected total negative cashflow in the suboptimal solution (μ_BL, σ_BL). In this case, a banking company can still realize the optimal solution (μ*, σ*). If, however, the budget limit BL is smaller than μ*, the information system cannot be operated.

Figure 4.6. Budget limits (plot of μ(σ) over σ with the budget limit BL; marked values μ*, μ1, μ2, σ_BL and σ*)

Result 4.3: If the budget limit BL is at least as large as the minimal expected total negative cashflow (BL ≥ μ*), a risk-neutral decision-maker will still choose the optimal solution (μ*, σ*).
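The effect of a budget limit can be sketched in the same way. The code below treats BL as an upper bound on μ and collects all grid points that respect it; the two budget values, the function name and the grid resolution are hypothetical, and the parameters are again those of Example 4.1.

```python
import math

LAM, LGE, R, GAMMA, BETA, DELTA = 0.3, 1000.0, 0.1, 7.3, 0.2, 0.6


def mu(a, b):
    """Expected total negative cashflow, equation (4.12)."""
    k = LAM * LGE
    return ((1 + R * GAMMA) * a * b * k
            + k / (a * k) ** BETA
            + k / (b * k) ** DELTA
            - 2)


def sigma(a, b):
    """Standard deviation of the expected total negative cashflow, equation (4.25)."""
    return math.sqrt(a * LAM) * b * LGE


def within_budget(budget_limit, steps=200):
    """Summarise all grid points with mu <= budget_limit, or return None if there are none."""
    grid = [i / steps for i in range(1, steps + 1)]
    affordable = []
    for a in grid:
        for b in grid:
            m = mu(a, b)
            if m <= budget_limit:
                affordable.append((m, sigma(a, b)))
    if not affordable:
        return None                      # BL < mu*: no affordable configuration
    best_mu, sigma_at_best = min(affordable)
    sigmas = [s for _, s in affordable]
    return {"mu_min": best_mu, "sigma_at_mu_min": sigma_at_best,
            "sigma_min": min(sigmas), "sigma_max": max(sigmas)}


print(within_budget(200.0))   # hypothetical budget above mu*
print(within_budget(150.0))   # hypothetical budget below mu*
```

For a budget above μ* the optimum stays attainable and the budget only bounds the range of risk levels that can be financed; for a budget below μ* the function returns `None`, mirroring the statement that the information system cannot be operated.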

4.4 Conclusion

The developed decision model maps the existing trade-off between the expected losses and the opportunity costs of the economical capital charge on the one hand, and the investments in security mechanisms and insurance policies on the other hand, in a common framework. The model thereby optimizes the investments in ex-ante and ex-post controlling mechanisms. Only one (a*,b*)-combination, and hence only one (SL*,IL*)-combination, exists that minimizes the expected total negative cashflow. Furthermore, the model points out that the constraints (limits of the economical capital charge and budget limits) can, under certain conditions, lead to suboptimal solutions. Normally a banking company determines its limits centrally for each individual information system. Limits of the economical capital charge and budget limits that are not fully tapped cannot be exchanged. This can lead to inefficiencies from a holistic point of view, since exchanging potential that is not fully tapped could yield a greater utility.

However, further research questions arise from the defined assumptions:
• The model considers an isolated information system and assumes that it is independent of all other systems. Correlations with other information systems are not considered. Taking correlations into account can lead to different results.
• We further assume that investments in security mechanisms and insurance policies can be mapped within the domain σ ∈ (0; ∞). This follows from the assumed continuity of the function μ(σ). In reality, presumably only discrete action alternatives exist.
• For simplicity, we assumed constant loss given events. However, if the standard deviation of the loss given events is taken into account, the expected total negative cashflow will be affected. A further research topic is therefore the modeling of the loss given events as random variables.


References

Basel Committee on Banking Supervision (2001) Working Paper on the Regulatory Treatment of Operational Risk. Retrieved December 12, 2005, from www.bis.org/publ/bcbs_wp8.pdf
Basel Committee on Banking Supervision (2004) International Convergence of Capital Measurement and Capital Standards. Retrieved December 12, 2005, from http://www.bis.org/publ/bcbs107.pdf
Beeck H, Kaiser T (2000) Quantifizierung von Operational Risk. In: Johanning L, Rudolph B (eds.) Handbuch Risikomanagement. Uhlenbruch Verlag, Bad Soden/Ts, pp. 633−654
Brink J van den (2000) Operational Risk: Wie Banken das Betriebsrisiko beherrschen. Dissertation, St. Gallen, Switzerland
BSI (2004) IT-Grundschutzhandbuch. Retrieved March 12, 2004, from http://www.bsi.de/gshb/deutsch/download/GSHB2004.pdf
Buhr R (2000) Messung von Betriebsrisiken – ein methodischer Ansatz. Die Bank 40: 202−206
Cruz A (2002) Modeling, Measuring and Hedging Operational Risk. John Wiley & Sons, Chichester
CSI/FBI (2005) Computer Crime and Security Survey 2005. Retrieved April 9, 2005, from http://www.usdoj.gov/criminal/cybercrime/FBI2005.pdf
Ebnöther S, Vanini P, McNeil A, Antolinez-Fehr P (2001) Modelling Operational Risk. Working Paper, ETH Zürich
Eller R, Gruber W, Reif M (2002) Handbuch Operationelle Risiken – Aufsichtsrechtliche Anforderungen, Quantifizierung und Management, Praxisbeispiele. Schäffer-Poeschel Verlag, Stuttgart
Faisst U (2004) Ein Modell zur Steuerung operationeller Risiken in IT-unterstützten Bankprozessen. Banking and Information Technology (BIT) – Sonderheft zur Multikonferenz Wirtschaftsinformatik 2004 in Essen 1: 35−50
Faisst U, Kovacs M (2003) Quantifizierung operationeller Risiken – ein Methodenvergleich. Die Bank 43: 342−349
Gemela J (2001) Financial analysis using Bayesian networks. Applied Stochastic Models in Business and Industry 17: 57−67
Hoffman DG (2002) Managing Operational Risk – 20 Firmwide Best Practice Strategies. John Wiley & Sons, Chichester
Jörg M (2002) Operational Risk – Herausforderung bei der Implementierung von Basel II. Diskussionsbeiträge zur Bankbetriebslehre, Hochschule für Bankwirtschaft, Frankfurt, Germany
Locarek-Junge H, Hengmith L (2003) Management des operationalen Risikos der Informationswirtschaft in Banken. In: Geyer-Schulz A, Taudes A (eds.) Informationswirtschaft: Ein Sektor mit Zukunft – Symposium. Köllen Druck + Verlag, Bonn
Marshall CL (2001) Measuring and Managing Operational Risk in Financial Institutions. John Wiley & Sons, Chichester
Müller G, Eymann T, Kreutzer M (2003) Telematik- und Kommunikationssysteme in der vernetzten Wirtschaft. Oldenbourg, München
Müller G, Rannenberg K (1999) Multilateral Security in Communications. Vol. 3: Technology, Infrastructure, Economy. Addison-Wesley-Longman, New York
Piaz JM (2001) Operational Risk Management bei Banken. Versus-Verlag, Zürich
Rannenberg K (1998) Zertifizierung Mehrseitiger Sicherheit: Kriterien und organisatorische Rahmenbedingungen. Vieweg, Braunschweig, Wiesbaden
Sackmann S, Strüker J (2005) 10 Jahre E-Commerce – Eine stille Revolution in deutschen Unternehmen. Konradin Verlag, Leinfelden
Spahr R (2001) Steuerung operationaler Risiken im Electronic und Investment Banking. Die Bank 41: 660−663

CHAPTER 5
Process-Oriented Systems in Corporate Treasuries: A Case Study from BMW Group
Christian Ullrich, Jan Henkel

5.1 Introduction

Today, most organizations in all sectors of industry, commerce and government are fundamentally dependent on their information systems. In the words of (Rockart 1988): “Information technology has become inextricably intertwined with business.” In industries such as telecommunications, media, entertainment and financial services, where the product is digitized, the existence of an organization depends on the effective application of information technology (IT). The treasury management function of multinational companies provides another interesting example. Treasury management has developed during the last two decades from a simple function of protecting and enhancing cash flows, to a complex one, where all aspects pertaining to modern cash management, foreign exchange management, investment management, exposure management, risk-return analysis, and accounting are handled. The reason for the changing role of corporate treasuries can be found in the changing financial landscape itself. Changes in financial markets over the past decade have been rapid and profound. New computing and telecommunication technologies, along with the removal of legal and regulatory barriers to entry, have led to greater competition among a wider variety of institutions, in broader geographic areas, and across an expanding array of instruments. The result has been an increase in the efficiency with which financial markets channel funds from borrowers to lenders. Moreover, there has been considerable improvement in the ability of market players to adjust their risk-return profiles. Among the key technological innovations are those that have enabled the development of databases critical to pricing and managing the risks of financial instruments. The ability to value risky financial assets laid the foundation for understanding the risks embedded in those assets and managing them on an aggregated, portfolio basis. 
While the value of risk unbundling has been known for decades (Markowitz 1971), the ability to create sophisticated instruments that could be effective in a dynamic market had to await the last decade’s development of computer and telecommunications technologies. In fact, technological changes have tremendously contributed to the evolution of financial markets, since more people and institutions have a greater number of alternatives for supplying or acquiring funds. In addition, both lenders and borrowers have a much wider range of instruments from which to choose the risk profile of their financial positions and to adjust that profile in response to changes in their own situations or in the overall financial and economic environment (BIS 2005; Bodnar et al. 1999). According to (Greenspan 1999), “… the extraordinary development and expansion of financial derivatives …” has been “by far the most significant event in finance during the past decade”. For example, it has helped to reduce the cost of transferring savings from anywhere around the globe into investments anywhere around the globe. It has also reduced the reliance of particular classes of borrowers and lenders on particular types of financial intermediaries. However, it has also challenged financial and technology service providers to identify the value they add to the intermediation process.

Although these changes have created potential for more robust financial markets, in reality the shifts have not been without difficulties. Some significant disturbances in the past decade have challenged market participants’ and regulators’ understanding of risk and revealed considerable weaknesses in risk management tools and practices. For example, in 1997 and 1998, capital flows to a number of Asian countries were disrupted, with severe effects on their economies. In 1998, the global financial system was rocked by the Russian debt default and the troubles of Long-Term Capital Management.
Those events were followed by the worst excesses of the stock market bubble, including not only the overvaluation of many stocks, but also the lapses in corporate governance revealed by a range of corporate disasters which have marked that era (see e.g. Benston et al. 2003; Culp and Miller 1994; Mello and Parsons 1994). New technologies and changing market structures require regulatory bodies to constantly review their legal frameworks. In recent years, a range of regulatory requirements (FAS, IFRS, KonTraG, SOX) have been put into practice in order to promote a greater flow of accurate information and thereby enable private participants to make better informed decisions. Today, we observe that companies are not merely trying to fulfil regulatory requirements, but strive to integrate regulatory demands into a corporate value-oriented context. This tendency has become particularly obvious in firms’ financial risk management practices. As creators of value, treasuries seek to progress from pure measurement of financial risk and return to effective decision-support analyses (see footnote 1). In addition, they play a pioneering role in passing risk management expertise on to other, core business related areas, in order to establish enterprise-wide risk management frameworks. In order to fulfil this role, treasuries face the challenge of moving from monitoring and managing accounts locally, regionally or globally to providing detailed analytical reports on the financial position of the company. These new tasks can hardly be accomplished without technology support.

In this chapter, we present the current status of how IT systems support BMW’s treasury activities. We take different views according to the notions of (Zachman 1987) (see footnote 2). The contribution is organized in seven sections as follows. In Section 5.2 we give an overview of the technological options available. Section 5.3 presents the major challenges that multinational corporations are confronted with today when optimizing their treasury procedures. In Section 5.4 we portray BMW, its business strategy and its current position on the international market for automobiles and motorcycles (“ballpark view”). In Section 5.5 an “owner’s perspective” of BMW’s treasury process is taken. We describe the role and function of BMW’s treasury department by examining its organizational structure and responsibilities, as well as its processes and the flow of information (“architect’s view”). In Section 5.6, special emphasis is put on the functionality and architecture of BMW’s treasury IT (“designer’s view”). How IT supports the company’s treasury operations is shown within a case study. Section 5.7 gives the conclusion.

1 Note that this behaviour is contrary to the theoretical findings of (Modigliani and Miller 1958), which basically state that under the assumption of perfect capital markets, firms need not manage their financial risks since investors can do it for themselves.

5.2 The Technological Environment

The market for treasury IT systems is characterized by a huge number of differently aligned system manufacturers, which offer the whole range from simple plug-and-play solutions up to complicated integrated systems including a comprehensive range of functionalities. In the following, we survey the technological possibilities that exist. We focus on four key technological options that have emerged to shape the topography of information architectures over the past three decades:
• legacy systems
• best-of-breed solutions
• enterprise resource planning systems (ERP)
• decision-support systems (DSS)

2 In 1987, John Zachman published an influential approach to the elements of system development. Instead of representing the process as a series of steps, he organized it around the points of view taken by the various players within a company. These players included (1) the CEO or whoever is setting the agenda and strategy for an organization (“ballpark view”), (2) the business people who run the organization (“owner’s view”), (3) the systems analyst who wants to represent the business in a disciplined form (“architect’s view”), (4) the designer, who applies specific technologies to solve the problems of the business (“designer’s view”), (5) the IT experts who choose a particular language, program listings, database specifications, networks, etc. (“builder’s view”), and finally, (6) the system itself.


A brief look back in history demonstrates that companies were deploying large numbers of internally developed, agglomerated smaller systems isolated by function – so called legacy systems – to support critical business processes. Little thought was given to integrating these applications with each other and existing systems. As a consequence, these resulting information architectures made it difficult for decision-makers to combine and gain insight from information housed in unrelated systems. The delivery of sophisticated treasury management system (TMS) solutions has progressed hand in hand with the availability of the necessary network, server and processor power. Until the mid 1990s, sophisticated risk management was the preserve of banks whose massive IT budgets enabled them to deploy the mainframe computers, which offered the necessary storage capacity and processing power. IT budgets for corporate treasuries were comparatively modest – despite the volume of money which they process and the various risks that this implies. The advent of networked minicomputers in the early 1990s was the first big advance, as it enabled low-cost PCs to share information to complete such processes as the aggregation of different kinds of treasury deals. This provided much of the raw data needed for risk analysis in a timely and convenient way. However, early networked PCs were significantly constrained by disk capacity, network speed, processor power and memory capacity. As the demand for treasury systems grew, it was cheerfully met by programmers who produced more and more automated solutions in order to limit operational risks. Still, success was inhibited by the slowness of the technology on which they had to operate. 
In recent years, these constraints have been eliminated by the exponential improvement in capacity and performance, coupled with the downward pressure on prices, which finally led to the adoption of best-of-breed TMS specific to individual business process segments in many corporations, especially large multinational ones. This is also reflected in a range of industry surveys which confirm that the majority of industrial corporations today use TMS for supporting treasury processes (Ernst & Young 2005; PricewaterhouseCoopers 2003). Very generally, a TMS can be characterized as a “… complex piece of software which can be used to automate, record, and control many core treasury functions. TMSs, sometimes referred to as ‘Treasury Workstations’ also act as a central database for information flowing in and out of the treasury, …” (Unknown 2006). The TMS vendor business itself has evolved into a fragmented industry in which numerous providers exhibit strength in different regions and among different customer segments, but no competitor is capable of dominating the field globally. Well-recognized TMS providers on a global basis include FXpress, Reuters, SAP, SimCorp, SunGard, Trema and XRT-CERG. While the TMS competitive landscape changes slowly, the shifts that are taking place arise largely from consolidation, as the recent acquisitions of Integrity Treasury by SunGard, Selkirk by Thomson and Richmond by Trema demonstrate. Still, TMS providers have focused on providing internet-based tools that facilitate the extension of transaction processing beyond the central treasury operation, including foreign business subsidiaries (customers) and banks (suppliers). Web browsers on desktops and

cheaper bandwidths create an environment in which deployment of internet-based tools is a viable option for corporate treasuries. Over the last decade, ERP systems have been widely developed to replace aging legacy systems, as well as to better integrate and automate core business processes, such as manufacturing, logistics, distribution, inventory, quality, human resources, shipping, invoicing, and accounting. Thus, all functional departments that are involved in operations or production are integrated in one system. ERP systems are often called back office systems, indicating that customers and the general public are not directly involved. This is contrasted with front office systems like customer relationship management (CRM) systems, eFinance systems, or supplier relationship management (SRM) systems that deal with companies’ external business partners. ERP systems are cross-functional and enterprise-wide, and are marked by a seamless design that reduces time spent maintaining and upgrading separate systems and interfaces. Treasury and risk management functionalities are not represented as single systems but as integrated modules. This may remove organizational, geographic and political barriers. For instance, a single-vendor package of enterprise solutions for a range of global transaction processing needs brings the advantages of simplicity, integration and single-contact points for sales and support. Vendors whose packages provide the greatest functional coverage include SAP, Oracle and PeopleSoft. However, although ERP systems are designed to handle heavy transaction-processing loads, they have not been universally esteemed for their performance management features. Reporting and analysis capabilities are pre-built and usually not customizable. The complexity involved in creating new reports or coding changes has accelerated the adoption of third-party products that hide back-end complexity while simplifying reporting and analysis activities.
Furthermore, the difficulty of extracting, combining and analyzing information from multiple disparate sources has led to an explosion of business intelligence tools, such as DSS, in very recent years.³ DSS can take many different forms and the term can be used in many different ways (Alter 1980). For example, (Finlay 1994) considers a DSS broadly as “a computer-based system that aids the process of decision making.” More precisely, (Turban 1995) defines it as “an interactive, flexible, and adaptable computer-based information system, especially developed for supporting the solution of a non-structured management problem for improved decision making. It utilizes data, provides an easy-to-use interface, and allows for the decision maker’s own insights.” At the conceptual level, (Power 2002) differentiates model-driven, communication-driven, data-driven, document-driven, and knowledge-driven DSS:

• A model-driven DSS emphasizes access to and manipulation of a statistical, financial, optimization, or simulation model. Model-driven DSS use data and parameters provided by DSS users to aid decision makers in analyzing a situation, but they are not necessarily data intensive.
• A communication-driven DSS supports more than one person working on a shared task.
• A data-driven DSS or data-oriented DSS emphasizes access to and manipulation of a time series of internal company data and, sometimes, external data.
• A document-driven DSS manages, retrieves and manipulates unstructured information in a variety of electronic formats.
• A knowledge-driven DSS provides specialized problem solving expertise stored as facts, rules, procedures, or in similar structures.

Moreover, at the system level, (Power 1997) differentiates enterprise-wide DSS and desktop DSS. Enterprise-wide DSS are linked to large data warehouses and serve many users in a company, whereas desktop single-user DSS are small systems that reside on individual users’ PCs. In the field of treasury, DSS tools are mostly model- and data-driven desktop solutions, which involve the special subjects of financial market prediction, scenario analysis and risk-return optimization. Figure 5.1 characterizes all four system types in terms of their functional breadth (which refers to the system’s functional reach within the company) and their functional depth (which refers to the degree of specialization). Typically, systems with high functional breadth, such as ERP, are marked by low functional depth. Conversely, systems with high functional depth, such as legacy systems or DSS, have low functional breadth.

³ According to (Keen and Morton 1978), the concept of decision support has evolved from two main areas of research: the theoretical studies of organizational decision making done at the Carnegie Institute of Technology during the late 1950s and early 1960s, and the technical work on interactive computer systems, mainly carried out at the Massachusetts Institute of Technology in the 1960s. It is considered that the concept of decision-support systems became an area of research of its own in the middle of the 1970s, before gaining in intensity during the 1980s.

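Figure 5.1. Categorization of system solutions (vertical axis: functional depth, from low to high; horizontal axis: functional breadth across Treasury, Cost Management, HR, Materials Management and Supply Chain Management; systems shown: Legacy & Decision Support Systems, Best-of-breed Systems, Enterprise Resource Planning (ERP) Systems)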

5.3 Challenges for Corporate Treasuries

For treasury operations today, the application and integration of IT systems has become a key success factor in meeting a range of internal and external challenges (see Figure 5.2):

• enhancing transaction processing
• providing timely, accurate, periodic management information
• complying with business controls and corporate governance
• integrating decision-support mechanisms into the existing IT systems infrastructure

Transaction Processing: Treasury management has developed during the last two decades from a simple function of protecting and enhancing cash flows to a broader one, in which all aspects pertaining to modern cash management, foreign exchange management, investment management, risk-return and exposure management and analysis, and accounting are handled. Naturally, these activities are accompanied by increasing transaction volumes. Since the flow of data is faster than ever, the data itself is more detailed, and it has no regional boundaries, data management is becoming more sophisticated. Consider a global corporate treasury operation that comes with a whole taxonomy of data around it. For instance, if the company has 1 million Euros in an account, that figure is associated with a currency, a legal entity, a bank, a business unit, geographic data, a date, etc. Consider now that the same firm has thousands of accounts with many banks all over the globe. This may result in voluminous financial transactions and hence huge amounts of information across thousands of accounts that have to be sliced, diced, aggregated and segmented in many different ways. The collection of the necessary data from a range of sources within a company in order to compile management and treasury position reports as well as financial forecasts may thus be a very time-consuming process. Consequently, there is a technological need to quickly process financial transactions from initiation to settlement in an automated and secure fashion. According to Figure 5.2, the main challenges that multinational firms are confronted with are still efficiency-related. The main purpose here is to exploit economies of scale that result from efficient transaction processing. Among system vendors, the term “transaction processing”, understood as an efficient, instantaneous flow of information within a system in order to realize economies of scale, is often referred to as straight-through-processing.

Figure 5.2. The changing role of corporate treasury (Source: PricewaterhouseCoopers). The figure compares value contributions (scale 0 to 100) “Today” and “Tomorrow” across four layers: Transaction Processing, Management Information, Business Controls/Corporate Governance, and Decision-Support. Annotations: cost savings through economies of scale and IT systems support; business process reengineering to reflect environmental changes in the organization; search for value enhancement, deployment of expert knowledge.

Management Information: Often, financial processes lack transparency because the relevant data does not come from one single source, but is distributed over a variety of systems. Missing interfaces make straight-through-processing and exchange of real-time data difficult. As a result, firms encounter contradictory data, so that questions about the liquidity reserves in a particular country, legal entity, or business unit, or about the financial obligations towards a certain business partner, often cannot be answered. In order to avoid such problems, automated systems are necessary that provide precisely timed access to accurate information, and thus allow for real-time financial performance measurement, risk-return controlling and financial reporting. The degree of straight-through-processing becomes a particular issue during those times when management reporting cycles coincide with important market shifts or roll-over dates and rate fixings.
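The account taxonomy described under Transaction Processing above (a balance tied to a currency, a legal entity, a bank, a business unit, a country and a date) can be illustrated with a minimal Python sketch. The class `Balance`, the function `aggregate` and all sample values are hypothetical and are not taken from any treasury system discussed in this chapter; the sketch only shows what “slicing and dicing” along arbitrary dimensions amounts to.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Balance:
    """One account balance with the dimensions named in the text (hypothetical layout)."""
    amount: float
    currency: str
    legal_entity: str
    bank: str
    business_unit: str
    country: str
    value_date: date


def aggregate(balances, *dimensions):
    """Sum balance amounts grouped by the given attribute names."""
    totals = defaultdict(float)
    for balance in balances:
        key = tuple(getattr(balance, dim) for dim in dimensions)
        totals[key] += balance.amount
    return dict(totals)


# Illustrative data only; entity and bank names are made up.
balances = [
    Balance(1_000_000.0, "EUR", "Entity A", "Bank X", "Sales", "DE", date(2006, 1, 31)),
    Balance(250_000.0, "EUR", "Entity B", "Bank Y", "Production", "AT", date(2006, 1, 31)),
    Balance(400_000.0, "USD", "Entity C", "Bank X", "Sales", "US", date(2006, 1, 31)),
]

by_currency = aggregate(balances, "currency")
by_bank_and_currency = aggregate(balances, "bank", "currency")
```

The same function answers questions such as “liquidity per legal entity” or “balances per bank and currency” simply by changing the dimension names, which is only possible if every balance carries the full set of attributes from a single consistent source.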
In addition, since managers rarely rely on a single source of information, the focus of information systems planning has moved towards the integration of individual systems into coherent sources of management information. This involves information analysis techniques such as data modelling and entity analysis to devise new ways of organizing and delivering information. Moreover, advances in communication capability facilitate new and complex forms of organization embodying distributed network structures and processes.

Business Controls/Corporate Governance: It is nothing new that the treasury function in global corporations also faces a highly competitive and demanding regulatory environment that mandates a seamlessly integrated corporate treasury solution. The recent spate of corporate scandals and failures, including Enron, WorldCom, and Global Crossing in the United States, Daewoo Group in Korea, and HIH in Australia, has raised serious questions about the way public corporations are governed around the world. When managerial self-dealings are excessive and left unchecked, they can have serious negative effects on corporate values and the proper functioning of capital markets. In fact, there is growing consensus around the world that it is important to strengthen corporate governance in order to protect the rights of shareholders, curb managerial excesses, and restore confidence in capital markets.⁴ Recent years have seen a strong push for corporate governance recommendations and regulations, such as Sarbanes-Oxley in the US or IFRS and hedge accounting rules (IAS 39) in Europe. This regulatory demand, coupled with stakeholder pressure for more financial transparency and control over treasury transactions, is making increased tracking and justification of treasury transactions not only desirable but necessary. Since spreadsheet calculations are prone to error due to their low degree of automation, there is now a demand to create financial figures using a controlled, visible and audited process. As a consequence, more and more organizations turn to treasury automation as a system solution to the problems of clarity and accountability in the production of financial figures. A related trend is the demand from corporations for their TMS to not only control and automate treasury processes, but also document them in order to prove that a clear and controlled process exists. As a consequence, treasury technology providers must now include compliance analytics and metrics, such as verification that profit/loss entries have been double-checked, or checks of payment authorities and approvals.

Decision-Support: It is anticipated that in the next years there will be a shift in focus from efficiently “running the business” to effectively “managing and optimizing the business” (Ernst & Young 2005; PricewaterhouseCoopers 2003). Accordingly, resources are expected to be redirected from back-office transaction processing to front-office decision-support and performance management related activities. Two important resources that aid users in managerial decision making within an organization are models and data.⁵ Making better decisions faster requires more than access to clean and complete data. Greater connectivity and access to an increased variety and volume of information generate larger data volumes and also greater informational complexity. Since data offer an intrinsic value proposition, it has to be determined which data are relevant for the underlying decision context. Data Mining, also known as Knowledge-Discovery in Databases (KDD), is important for “the nontrivial extraction of implicit, previously unknown, and potentially useful information from data” (Frawley et al. 1992). Furthermore, computational methods from the field of Artificial Intelligence (AI), such as Fuzzy Logic (Nguyen and Walker 2005), Machine Learning (Mitchell 1997) and Evolutionary Computation (Ashlock 2006), can transform data into information, and, at peak performance, into intelligence. As the literature on applied informatics has shown, these areas can be helpful for solving decision problems involving corporate financial planning (Pacheco et al. 2000), risk management (Chan et al. 2002), and financial market forecasting (Peramunetilleke and Wong 2002; Ullrich et al. 2005). It is important to note that the opportunities in this operational area of treasury management clearly drive the demand for more broadly and technically better educated people whose value-orientated assignments should not be blocked by repetitive actions. Since manual preparation of relevant information may take a long time and lead to decisions that are based on outdated or incomplete data, automated applications with a high degree of interconnection to market and intra-company data sources are required that enable operational units to make quicker and more informed decisions.

We observe today that multinational corporations are not only trying to fulfil regulatory requirements, but strive to integrate them in a corporate value-orientated context. The need for consolidating and integrating treasury systems stems from the desire to improve business controls through visible processes and the installation of transparent management information systems. The purpose here is not only to avoid the functional gaps that naturally exist due to aging legacy systems and heterogeneously organized decision-support solutions. It is also about minimizing departmental deficits in terms of redundant and complicated flows of information. Hence, the ability to gather transactional data from remote locations through automated data feeds to the corporate finance engine without additional interface requirements would certainly represent a significant win. For these reasons, it is quite obvious that a consolidated treasury system landscape can only be realized by companies if they are heading towards single-vendor ERP systems. However, the barriers to streamlining information flows in treasury can be significant. It is important to note at this point that treasury is not a priority in ERP systems, whose vendors are slow to respond to changing circumstances that impact treasury. The slow uptake of new Financial Accounting Standards (FAS) rules on derivatives and hedges is one such example. The lack of flexibility around interfacing to external information sources, due to the need for strict interfacing rules, inflexible reporting capabilities, and the inability to perform complex treasury analysis are all arguments that ERP fails to offer a comprehensive treasury solution. In addition, the increasing demand for value creation requires highly specific and context-dependent decision-support solutions that supply decision-makers with superior information. For these reasons, the choice of IT systems that suit a company’s business needs is an interesting practical problem that is far from trivial. In the following we will provide insights into the treasury organization, process and system support of BMW Group (“BMW”), a multinational manufacturer of premium automobiles and motorcycles.

⁴ The term “corporate governance” is understood here as “the economic, legal and institutional framework, in which corporate control and cash flow rights are distributed among shareholders, managers, and other stakeholders of the company” (Eun and Resnick 2004, p. 472). From the view of regulatory authorities the resulting problem to be solved can be stated as “how to best protect outside investors from expropriation by the controlling insiders so that the former can receive fair returns on their investments” (Eun and Resnick 2004, p. 474). How authorities deal with this problem has enormous implications for shareholder welfare, corporate allocation of resources, financing and valuation, development of capital markets, and economic growth.

⁵ Note that we view a “model” as a computer-executable procedure that may require data inputs, a view that is widely held in the DSS literature (Gachet 2004; Power 2000).
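The compliance checks mentioned under Business Controls/Corporate Governance in this section (double-checking of entries, payment authorities and approvals, and documentation of the process) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Payment`, `approve` and `is_released`, the per-user authority limits and the rule that a creator may not approve their own payment are all hypothetical and do not describe any particular TMS.

```python
from dataclasses import dataclass, field


@dataclass
class Payment:
    """A payment instruction awaiting release (hypothetical structure)."""
    payment_id: str
    amount: float
    created_by: str
    approvals: list = field(default_factory=list)
    audit_trail: list = field(default_factory=list)


def approve(payment, user, authority_limits):
    """Record an approval only if the four-eyes and authority checks pass.

    Every attempt, successful or not, is written to the audit trail so that
    the process can be documented afterwards.
    """
    if user == payment.created_by:
        payment.audit_trail.append((user, "rejected: creator cannot approve"))
        return False
    if authority_limits.get(user, 0.0) < payment.amount:
        payment.audit_trail.append((user, "rejected: insufficient authority"))
        return False
    if user not in payment.approvals:
        payment.approvals.append(user)
    payment.audit_trail.append((user, "approved"))
    return True


def is_released(payment, required_approvals=1):
    """A payment may be released once enough independent approvals exist."""
    return len(payment.approvals) >= required_approvals


# Illustrative users and limits (assumed values).
limits = {"alice": 100_000.0, "bob": 5_000_000.0}
payment = Payment("P-001", 250_000.0, created_by="carol")
results = [
    approve(payment, "carol", limits),  # creator tries to approve own payment
    approve(payment, "alice", limits),  # authority limit below payment amount
    approve(payment, "bob", limits),    # independent user with sufficient authority
]
```

Keeping rejected attempts in the audit trail, rather than only successful approvals, reflects the point made above that a system should not only control the process but also be able to prove that a controlled process exists.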

5.4 Portrait BMW Group

The BMW Group is one of the ten largest car manufacturers in the world and possesses, with its BMW, MINI and Rolls-Royce brands, three of the strongest premium brands in the car industry. The BMW Group also has a strong market position in the motorcycle sector and operates successfully in the area of financial

services. The BMW Group aims to generate profitable growth and above-average returns by focusing on the premium segments of the international automobile markets. With this in mind, a wide-ranging product and market initiative was launched back in 2001, which has resulted, over the past years, in the BMW Group expanding its product range considerably and strengthening its worldwide market position. The BMW Group will continue in this vein in the coming years. The BMW Group is the only automobile company worldwide to operate with all its brands exclusively in the premium segments of the automobile market, from the small car to the absolute top segment. The BMW Group’s vehicles provide outstanding product substance in terms of aesthetic design, dynamic performance, cutting-edge technology and quality, underlining the company’s leadership in technology and innovation. As an international corporation, the BMW Group currently has 22 production and assembly plants in 12 countries. Within the framework of its sales strategy, the BMW Group has, since the 1970s, been consistently pursuing its objective of having its own sales subsidiaries in all the major markets around the world. Thus in 1981, BMW was the first European manufacturer to establish a sales subsidiary in Japan. Today, the sales network consists of 35 group-owned sales companies and 3,000 dealerships. Approximately 100 more countries are handled by local importers. This means that the BMW Group is represented in over 150 countries on all five continents. In 2005 the BMW brand outperformed its historical record for the fifth time in a row since 2001 to 1,327,992 units. For the first time, more than 200,000 MINI brand cars were sold in a single year, with the number of cars delivered increasing by 8.7% to 200,428 units, as compared to 2004. The Rolls-Royce brand confirmed its position at the top of the absolute luxury class. 
796 Phantoms were delivered to customers in 2005, marginally more than the 792 sold in the previous year. In the business year 2005, the BMW Group employed approximately 105,800 people. Three quarters of all BMW employees are located in Germany. According to IAS, the BMW Group reported revenues of 46.7 billion Euros and a net profit of 2.2 billion Euros. The BMW ordinary share has been listed since 1926. In 1999 the BMW Group introduced the 1 Euro par value share. Concurrently, BMW moved to the collective custody account procedure; physical share certificates are no longer available. As of 31 December 2005, the subscribed capital of BMW AG amounted to 674 million Euros and comprised 622,227,918 ordinary shares and 52,196,162 preferred shares.

5.5 Portrait BMW Group Treasury

There are many dimensions for determining the structure, role and function of a treasury organization. Generally, corporate treasury practices can vary according to size, geographical reach, and process complexity. At one end there are treasuries that focus on managing daily cash flows and optimizing working capital through process excellence. At the other end of the spectrum, there are treasuries

that are strategic to the business and whose objective is to support the financial performance of the company. Clearly any organization can exercise its choice on the scope of the treasury functions that it undertakes and where it positions itself between these two extremes. In doing this, it may be governed by a variety of considerations. It may choose to handle only those needs driven by utilitarian motives such as liquidity support or, on the other hand, it may consider treasury as a “core” organizational process and hence handle the full range of services. It may also choose to outsource portions of the activities required or it may choose to foster these capabilities in-house. Independently, an organization can also decide on the extent of centralization of treasury management. For instance, it may be efficient to centralize transaction processing, while the front office may need to be decentralized to aid speedy local decision-making. It may also be important to have a common risk management strategy, while execution may be decentralized. Many large corporations manage their business entities in a decentralized fashion to promote local ownership of business decisions and accountability for results which is in turn favoured by many investors.

5.5.1 Organization and Responsibilities

At BMW, the treasury division is responsible for all of the company’s relations with the capital markets. This includes pension funds management, financial marketing, and corporate finance. The main goals are to minimize the cost of capital to the company, to protect the solvency of the company and of each of its individual subsidiaries, and to safeguard BMW’s financial independence in light of strategic decisions along with the requirements for profitability and value contribution by managing risk-return. The question of appropriately aligning the Treasury within the enterprise organization, whether as a cost, service or profit centre, has a special bearing on the definition of the individual functional areas of the Treasury and, of course, on the overall definition of treasury goals and strategy. (PricewaterhouseCoopers 2003) report that the great majority of corporate treasuries are organized as service centre models. Only some companies characterize their treasuries as cost centres, and even fewer as profit centres. The main features of a service centre include the management of financial risks and liquidity within narrow guidelines, a restricted scope of action for entering open positions, as well as the idea of serving and supporting operational subsidiaries in order to optimize performance with regard to foreign exchange and liquidity questions. Since conformity with BMW goals requires both efficient processes and positive financial performance as compared to financial market developments, the company may consider its treasury function a hybrid one. It is organized according to the service centre principle but does not possess all of its characteristics, such as narrow operational guidelines. Instead, the BMW Group Treasury also contains notions of the profit centre concept, especially with regard to its risk management practices, and of the cost centre concept.

Figure 5.3. Vertical perspective: BMW Group Treasury

Figure 5.3 depicts the organizational structure of BMW Group Treasury. In this article, we will not embrace the BMW Group Treasury as a whole, but rather focus on those divisions which are responsible for the execution of treasury functions as commonly known. These functions are attributed to the Corporate Finance department along with local Treasury Centres (TC). The purpose of the latter ones is to align local treasuries with the BMW Group’s treasury strategy and represent the BMW Group Treasury in the local markets. This allows the local subsidiaries (Sales, Financial Services, and Production companies) to benefit from uniform standards, methods, and expertise. TCs are located in Europe (Belgium, Netherlands, Great Britain and Austria) and the United States. Recently, the company has decided to build an Asia-Pacific TC in Singapore. It is due to this step that BMW Group Treasury is now able to cover the entire eastern hemisphere and realize economies of scale from optimized processes in treasury management. In spite of the functional basic structure of the organization, a decentralized leadership philosophy prevails at BMW. This leadership philosophy is marked by a central guidance that is coupled with decentralized responsibility and stretches through all parts of the BMW organization. Thus, decisions on the overall treasury policy, procedures and guidelines (including strategy, concepts, standards, principles and structuring) are made centrally in Munich. Local TCs are wholly responsible for the way in which the treasury policy and strategy are executed. This helps BMW to ensure a uniform approach to the financial markets. It also enables BMW to offset contrary positions and evaluate executed transactions via the application of a centralized treasury system which we will refer to below. 
Furthermore, the core benefits of decentralization hold: local proximity, operating in the same time zone, knowledge of local culture, regulations and accounting practices, as well as existing opportunities for local support. The fact that BMW neither has a totally centralized nor a totally decentralized treasury is not an unusual phenomenon in German multinational companies. (PricewaterhouseCoopers 2003) demonstrate that 62% of the responding companies pursue a similar approach to treasury management. This can be attributed to the fact that, traditionally, expansion into new markets has been accompanied by deployment of local treasury staff to handle funding, investment, liquidity and exposure management in unfamiliar business environments, especially in those where change is rapid and the organization has

significant business and/or financial exposure. At present, financial transactions are conducted exclusively via BMW TCs as long as the BMW Group is not affected by adverse legal, tax or economic effects.
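The offsetting of contrary positions through a centralized treasury system, as mentioned in this subsection, can be illustrated with a small Python sketch. The function name `net_positions`, the sign convention (positive for long, negative for short) and all amounts are assumptions made for illustration; they are not BMW figures and do not describe the actual system.

```python
from collections import defaultdict


def net_positions(positions):
    """Net signed positions per currency across all reporting entities.

    `positions` is an iterable of (entity, currency, amount) tuples, where a
    positive amount is a long position and a negative amount a short one.
    """
    net = defaultdict(float)
    for _entity, currency, amount in positions:
        net[currency] += amount
    return dict(net)


def gross_positions(positions):
    """Sum of absolute positions per currency, i.e. what would have to be
    handled if every entity dealt with its position separately."""
    gross = defaultdict(float)
    for _entity, currency, amount in positions:
        gross[currency] += abs(amount)
    return dict(gross)


# Illustrative positions reported by local treasury centres (made-up numbers).
positions = [
    ("TC Europe", "USD", -30_000_000.0),
    ("TC America", "USD", 50_000_000.0),
    ("TC Europe", "GBP", 20_000_000.0),
    ("TC Asia", "GBP", -20_000_000.0),
]

net = net_positions(positions)
gross = gross_positions(positions)
```

In this made-up example the GBP positions cancel out completely and only part of the USD position remains, which is the kind of effect a central view over decentralized execution makes visible.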

5.5.2 Business Process Design

In this subsection the current status of the operational work and information flows at BMW is examined throughout the treasury process chain. This is important because the functional representation of the Treasury as shown in Figure 5.3 is not valid from a business process and information flow perspective. As depicted in Figure 5.4, the treasury process chain as a whole further affects other parts of Group Finance (namely the Financial Controlling and Financial Reporting divisions) and other divisions of the company (Corporate Sales). This horizontal, cross-functional flow model of a business is better suited for identifying simplifications of processes, standardization of procedures across diverse business units and their integration, both within the organization and with those of external organizations, including customers and suppliers. Intuitively, these aspects provide an important starting point for IT systems design. An effective separation of duties is ensured by grouping single process activities into process units or process classes. A typical treasury is divided into the following process units: front office, middle office and back office. Due to the decentralization of treasury activities at BMW, desks are separated by market, i.e. there is a front and back office in each of BMW’s financial entities. Middle office tasks are managed centrally. The treasury process starts at the local operating units, which estimate and adjust their future sales volumes periodically according to the rolling window principle. Based on product- and market-related sales forecasts, important middle office responsibilities include periodic determination of exposures and quantification of risks which, above all, provide the foundation for front office activities as will be described below. Middle office is further responsible for medium- and long-term financial planning, including profit/loss and budget rates.
Trading activities are managed by setting counterparty limits, which represent the maximum exposure to a particular counterparty or group of related counterparties. Limit calculation is based on equity, a rating factor (as delivered by Standard & Poors, Moody’s and Fitch) and a country factor. Compliance with counterparty limits is monitored very closely. Non-compliance is not tolerated and repeated non-compliance is a dismissible offence. Exposure limits, which represent maximum exposure to particular market rates, or stop loss limits, which represent the maximum loss that a particular deal or strategy may incur before it must be closed out, are not relevant. Middle office’s second area of responsibilities – “risk controlling” – includes benchmarking, performance analysis and reporting, and is aligned to the prevailing risk management objectives of the front office. Performance measurement provides information about the value contribution that treasury has produced according to pre-determined benchmarks and shows whether and how strategic

5 Process-Oriented Systems in Corporate Treasuries

Figure 5.4. Horizontal perspective: BMW Treasury Process Chain [Figure not reproduced. Entities shown: TC Asia, TC America, TC Europe, AG. Process steps shown: Sales-orientated Planning; Financial Planning, Risk Measurement, Limit Mgmt.; Risk Analysis, Strategy Development; Implementation, Documentation; Confirmation, Settlement, Reconciliation; Accounting; Performance Analysis, Reporting. Process units: Middle Office, Front Office, Back Office, Middle Office. Divisions: Controlling, Treasury, Reporting, Controlling.]

requirements and goals have been reached by providing management reports. Hence, middle office is the process unit that covers both the beginning and the end of the treasury process chain. BMW’s front office is divided into several desks, each focusing on a specific market. With regard to the corporate finance departments, front office desks at BMW cover foreign exchange, fixed income, commodities, money markets and capital markets. Consider the risk management process, for instance: when risk models suggest that the trader should prepare to enter the market, hedging strategies are developed, analyzed and finally executed. In the case of foreign exchange, the primary factor that affects the risk manager’s decision in the longer term is macroeconomic data. The results of market research conducted by banks and independent research institutes, global and domestic events, political and social conditions, fiscal and monetary policy, chart analyses and rumours are important only for the shorter term. Traders at BMW consider these factors and sources of information and translate them into a range of hedging strategies that are designed to match BMW’s overall objectives. The best strategy that is also in line with BMW’s hedging policy is finally chosen. In this context, it is important to note that decisions on financial risk management strategy and tactics are seldom made independently and entirely by the front office itself. Owing to the complexity of interacting decision-relevant parameters, as well as for reasons of supervision, risk management decisions at BMW are often discussed and made within a committee, which includes not only risk management officers but also financial controlling experts along with corporate finance and treasury executives. Upon execution, the deal has to be documented and finally recorded. Deal records are required for both correct revaluation and settlement.
It should be noted that online trading platforms are not relevant for BMW. Trading is not done continuously, but occasionally and hence it is sufficient to use phone and email as communication media.


Christian Ullrich, Jan Henkel

The nature of the back office’s business can be summarized as transaction processing. This involves checking and verifying the accuracy of trades as entered by the front office. Back office further has to ascertain that all deals have been confirmed by the counterparty and that all resulting cashflows from these transactions will be available for cash management, settlement and accounting. Cash management, as part of the back office, is responsible for reconciling deals with bank statements. Only when forecasted cashflows have been reconciled with the bank account are they posted via interface to the accounting system. Standard agreements regarding settlement, payment instructions, confirmations, etc. are set up with bank partners.

5.6 Business Process Support

IT systems are both drivers and enablers of change in treasury processes. IT is a driver because developments in hardware and software open up new opportunities for changing strategies and processes. It is also an enabler, since its installation and application allow companies to support existing strategies and business processes. In the following, we focus on the role of IT as an enabler. A key consideration for companies when selecting a treasury management system is the set of functional requirements that the organization needs to meet. If the treasury function is aligned more operationally, a system is required that is closely integrated with the planning, forecasting and budgeting tools within the business and with external sources of data, such as banks and capital markets. The foundation for value-orientation then consists in a high degree of straight-through processing, i.e. transactions are recorded only once and are afterwards available to other subsystems or are automatically booked in the ledger. However, for treasuries that are strategic to their business, information extraction is the primary concern because such treasuries are required to perform complex and sophisticated analysis on business data in order to support financial decision-making. Since the BMW Group Treasury can be characterized neither as a purely operationally aligned treasury nor as one that is totally strategic to its business, a more differentiated view is required that segregates operational workflows from decision-making procedures. In BMW Group Treasury’s framework of organizational structure and workflows, the objective is to integrate all of BMW’s financial positions as a basis for the central, real-time management of these positions, and to standardize treasury processes across all BMW entities. Figure 5.5 illustrates the IT system architecture deployed by BMW.
It combines the strengths of an ERP as the general finance engine (SAP), interfacing into a best-of-breed treasury application (Trema) that is again interfaced to back-office satellite systems, real-time news provision service (Reuters), historical market rate database (Zentrale Kursdatenbank ZKDB) and specialized DSSs which are not yet fully implemented.


Figure 5.5. IT systems support for treasury process and business process units [Figure not reproduced. It shows the same entities, process steps, process units and divisions as Figure 5.4, together with the supporting systems: TREMA Treasury Management System, Reuters, ZKDB, DSS, SAP and Legacy Systems.]

5.6.1 Best-of-Breed System Functionalities Whereas the ERP solution SAP has so far been used for data collection and high-level reporting only, the central component in supporting BMW’s treasury activities is Finance KIT, a best-of-breed TMS from the Swedish system vendor Trema. Finance KIT uses a standard client-server architecture along with a portable window environment. The system is used Group-wide, mainly to guarantee operational integrity by embracing front, middle and back office functionalities. Intrinsic to the design are automated transaction flows that reduce redundant keying. However, Finance KIT also supports certain functionalities within trading and risk management. Systematic coverage and evaluation of all corporate financial risks and market evaluation of balance sheet positions and financial derivatives are covered Group-wide. All BMW Group entities, namely BMW AG (Internal Audit, Financial Controlling, Group Reporting, and Financial Processes Integration), BMW Financial Services, BMW overseas entities (TCs) and network partners (Auditing, System partners), are connected via Finance KIT. The system is available 24 hours a day, seven days a week. It connects approximately 200 users worldwide, who access Finance KIT via an internet browser. The central database is located in Munich. The system handles an annual transaction volume of approximately 680 billion euros. About 7000 user enquiries per year, with an increasing tendency, have to be handled by BMW’s internal Finance KIT support and Treasury Consulting. Finance KIT includes an interface to local systems for accounting and payments, as well as standardized reporting and a uniform system layout.


Figure 5.6. Information flow in Finance KIT (Trema 2002)

The information flow in Finance KIT consists of four types of information (Trema 2002): static information, market information, transaction and cashflow management information, and calculated information. Figure 5.6 provides an overview of the different kinds of information and their flow in Finance KIT. Static information is the information that usually does not change. It includes the definition of currencies, counterparties, issuers, countries, rules, etc. and is used by most calculations. Editors are used to define and modify static information. There is a specific editor for each type of static information. Currencies, for instance, are defined in the Currency Editor. The report application is a report generator which is attached to each editor and allows creating reports from information defined in the editors. Market information refers to the market quotes of currencies, instruments, yield curves, equities, volatilities, etc. which are stored in Finance KIT. Market information is used to measure and evaluate the risk of positions. This information is typically imported from market information providers, such as Reuters, but can also be entered manually. The Finance KIT rate board can be used to view and modify the market information. Market rate reports can be established via a specific market rate reporting engine. Transaction information is information that is essential to define a transaction, for example, transaction and cashflow amounts, transaction counterparties and deal rates. Calculated information such as unrealized results, market value, etc., is not part of this information. Transaction boards are used to enter, view or modify this information. For example, deals are entered in the Deal Capture transaction board. Some reports, such as Transaction Log Report, Daily Deals Report and Cashflow Report are designed to report this kind of information. A primary role also consists in calculating figures and in displaying them in different applications. 
In order to do these calculations, information


Figure 5.7. Finance KIT principles (Trema 2002)

and instructions are required from other parts of Finance KIT, such as static data, market data and transaction data. In Finance KIT, there are five basic principles that support the treasury process chain from a systems point of view: “Transaction Flow”, “Cashflow”, “Portfolio Hierarchy”, “Valuation”, and “Real-Time”. The entire information flow in Finance KIT is based on these principles. Because of their strong impact on the system’s performance, as well as on its various applications, we describe these ideas in more detail, following Figure 5.7. We start in the front office. Every trade that is entered in Deal Capture is regarded as a transaction. The process of transaction handling and control from initial input to final acceptance is referred to as “Transaction Flow”. The Transaction Flow is used to monitor the movement of transactions from one state to another. Different applications, mainly transaction boards, are configured to control and manage the transactions in different parts of the flow, and access rights to these applications are limited to those people who actually use them. When a transaction is committed in a transaction board, it receives a state and status. The


Table 5.1. Entering a bond and a foreign exchange forward via Deal Capture

Bond
  Instrument Name               Bond A
  Nominal Amount (Face Value)   USD 10,000,000
  Price                         105.043
  Portfolio                     Bond Portfolio
  Counterparty                  Bank A

FX Forward
  Currency Pair                 EUR/USD
  Purchased Amount              EUR 1,000,000
  Rate                          1.1500 + 150 bps
  Portfolio                     FX Forward Portfolio
  Counterparty                  Bank B
  Value Date                    March 15, 2007

state of the transaction determines where it stands in the Transaction Flow. All transactions must have a state and can only have one of the following states at a time: “Open”, “Trader Verify”, “Back Office Verify” or “Final”. In the following we consider two imaginary transactions, a bond purchase and a foreign currency purchase that will be followed from trade entry in the front office, through the middle office, to bookkeeping in the back office. Let us imagine that on the morning of January 22, 2006, the trader (located in TC America) decides to buy a government bond as well as Euros for US-Dollars. The trader enters the transactions via “Deal Capture” in Finance KIT. Table 5.1 shows all the information the trader needs to specify in order to make these transactions. As soon as they are saved in Finance KIT, these transactions are split into their resulting cashflows (“Cashflow” principle). A cashflow is a certain amount of money in a particular currency that occurs at a specific point in time. Thus, each cashflow has a value date, necessary for valuation, payment settlement and bookkeeping. The nature and processing rules of various cashflows are dependent on the type of financial instrument that is recorded. Generally, cashflows are divided into four groups. “Simple Cashflows” are those cashflows that actually take place, for example a coupon or redemption. “Pseudo Cashflows” are used for calculating the actual settlement before fixing takes place, for example in forward rate agreements. “Proxy Cashflows” are used for the valuation of floating rate instruments. “Conditional Cashflows” are used for options where the magnitude of the cashflows is known but the delivery is uncertain. In order to administer various cashflows it is necessary to further classify cashflows into different types. All cashflow types are defined during implementation. The six main cashflow types are: “Principal”, “Interest”, “P/L”, “Fee”, “Payment”, and “Balance”. 
Each cashflow type contains subtypes. For example the cashflow type Principal has the subtypes “Principal”, “Amortization”, “Expiration”, and “Redemption”. These subtypes are used in money settlement applications and to define accounting rules. All cashflows can be analyzed and valued together as one position or divided into several components by various grouping parameters. Table 5.2 depicts the cashflows for the bond transaction as generated by Finance KIT. Furthermore, Table 5.3 depicts the cashflows for the foreign exchange forward contract as generated by Finance KIT.


Table 5.2. Cashflows of the bond transaction

Date           Cashflow Amount   Currency   Description
Jan 29, 2006    -10,504,300.00   USD        The principal payment for the bond.
Jan 29, 2006       -525,000.00   USD        The accrued interest of the bond paid to the previous owner.
Jun 03, 2008        900,000.00   USD        The future coupon of the bond.
Jun 03, 2009        900,000.00   USD        The future coupon of the bond.
Jun 03, 2009     10,000,000.00   USD        The redemption payment of the bond.

Table 5.3. Cashflows of the foreign exchange forward transaction

Date           Cashflow Amount   Currency   Description
Mar 15, 2007      1,150,000.00   EUR        The Euros bought.
Mar 15, 2007     -1,000,000.00   USD        The US Dollars paid for the Euros bought.
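The Cashflow principle can be illustrated with a minimal Python sketch. The names used here (`Cashflow`, `CashflowGroup`, `bond_cashflows`) and the interest subtypes "Accrued" and "Coupon" are assumptions made for illustration; they are not Finance KIT's data model. Only the principal payment is computed (nominal amount times price in percent); the accrued interest and coupon amounts are taken as given from Table 5.2.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from enum import Enum


class CashflowGroup(Enum):
    """The four cashflow groups named in the text."""
    SIMPLE = "Simple Cashflow"
    PSEUDO = "Pseudo Cashflow"
    PROXY = "Proxy Cashflow"
    CONDITIONAL = "Conditional Cashflow"


@dataclass(frozen=True)
class Cashflow:
    value_date: date
    amount: Decimal   # negative = paid, positive = received
    currency: str
    cf_type: str      # e.g. "Principal", "Interest"
    subtype: str      # "Accrued" and "Coupon" are assumed subtype names
    group: CashflowGroup = CashflowGroup.SIMPLE


def bond_cashflows(nominal, price_pct, accrued, coupon, settle, coupon_dates, currency):
    """Split a bond purchase into its cashflows (hypothetical helper)."""
    principal = nominal * price_pct / Decimal(100)
    flows = [
        Cashflow(settle, -principal, currency, "Principal", "Principal"),
        Cashflow(settle, -accrued, currency, "Interest", "Accrued"),
    ]
    flows += [Cashflow(d, coupon, currency, "Interest", "Coupon") for d in coupon_dates]
    # Redemption of the face value on the last coupon date.
    flows.append(Cashflow(coupon_dates[-1], nominal, currency, "Principal", "Redemption"))
    return flows


flows = bond_cashflows(
    nominal=Decimal("10000000"),
    price_pct=Decimal("105.043"),
    accrued=Decimal("525000"),   # taken from Table 5.2, not derived
    coupon=Decimal("900000"),    # taken from Table 5.2, not derived
    settle=date(2006, 1, 29),
    coupon_dates=[date(2008, 6, 3), date(2009, 6, 3)],
    currency="USD",
)
for cf in flows:
    print(cf.value_date, f"{cf.amount:>15,.2f}", cf.currency, cf.subtype)
```

Keeping each cashflow as an immutable record with its own value date mirrors the statement that valuation, settlement and bookkeeping all operate on individual cashflows rather than on the deal as a whole.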

When the trader commits these transactions in Deal Capture, they obtain the transaction state Open. Trade tickets are printed automatically in order to document conformity to market conditions at the time of the trade. The nominal amount of the transaction is verified against the trader limit. When the trader commits the transactions, they move forward in the Transaction Flow. If the transaction exceeds the limit a warning appears on the trader’s screen and the exceeded limit is logged. The state of the transaction defines which transactions are visible in which transaction board. For example, committed transactions no longer appear in Deal Capture, unless they are sent back to the Open state from another transaction board. In the configuration of this case study, the next state for these transactions is Trader Verify. When transactions are in this state, each trader must review the transactions they have entered. If the transaction is incorrect, the trader sends it back to the Open state in order to modify or delete the transaction. Transactions reach the back office after they have been verified by the front office traders. A back office clerk views the two transactions that are now in the state Back Office Verify in the “Back Office Verification” transaction board. If an error is identified, the respective transaction is sent back to the Open state for correction. The states of the transactions change to Final as soon as back office receives transaction confirmations from the deal counterparties. If the transaction data is correct and the counterparty confirmation is received, the back office clerk confirms the transaction using the “Confirmation” transaction board and transfers the payment to the bank. The money is automatically transferred to the bank by either fax or file. In this context, different payment rules are defined, which enable the system to select the correct transfer method for each payment. 
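One reading of the Transaction Flow described above is a small state machine in which a transaction either moves one step forward or is sent back to Open for correction. The following Python sketch is an illustration under that assumption; the class names, the `move_to` method and the trader limit of 50,000,000 are hypothetical and are not taken from Finance KIT.

```python
from enum import Enum


class State(Enum):
    OPEN = "Open"
    TRADER_VERIFY = "Trader Verify"
    BACK_OFFICE_VERIFY = "Back Office Verify"
    FINAL = "Final"


# Allowed moves: one step forward, or back to Open for correction.
ALLOWED = {
    State.OPEN: {State.TRADER_VERIFY},
    State.TRADER_VERIFY: {State.BACK_OFFICE_VERIFY, State.OPEN},
    State.BACK_OFFICE_VERIFY: {State.FINAL, State.OPEN},
    State.FINAL: set(),
}


class Transaction:
    def __init__(self, deal_id, nominal, trader_limit):
        self.deal_id = deal_id
        self.nominal = nominal
        self.state = State.OPEN
        self.log = []
        # As in the text, a limit excess is logged but does not block the deal.
        if nominal > trader_limit:
            self.log.append(f"limit exceeded: {nominal} > {trader_limit}")

    def move_to(self, new_state):
        """Change state, rejecting moves that the flow does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.log.append(f"{self.state.value} -> {new_state.value}")
        self.state = new_state


bond = Transaction("BOND-A", nominal=10_000_000, trader_limit=50_000_000)
bond.move_to(State.TRADER_VERIFY)       # committed in Deal Capture
bond.move_to(State.OPEN)                # trader finds an error and sends it back
bond.move_to(State.TRADER_VERIFY)       # corrected and committed again
bond.move_to(State.BACK_OFFICE_VERIFY)  # verified by the trader
bond.move_to(State.FINAL)               # counterparty confirmation received
print(bond.state.value, len(bond.log))
```

Because each state maps to an explicit set of permitted successors, a transaction board only needs to query the current state to decide which transactions it displays.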
The next day the bank account statement is collected via file and is automatically reconciled against the payments sent from Finance KIT. At the end of each business day, the bookkeeping entries from transactions in the Final state are automatically generated. This also requires the definition of rules in order to enable Finance KIT to identify the correct bookkeeping account for each cashflow. Bookkeeping entries are made on the value date of each cashflow. For the government bond, the principal payment and the interest accrued to the previous owner are booked on January 30, 2006. The generated entries are as follows:

Table 5.4. Bond transaction bookkeeping entries

Bookkeeping Account      Entry Type          Amount
Cash Account (USD)       Credit       11,029,300.00
Bond Account             Debit        10,504,300.00
Bond Interest Account    Debit           525,000.00

Once the FX forward transaction reaches its value date, the cashflows resulting from that transaction are booked. Thus, if 1 Euro equals 1.1700 US-Dollars on March 15, 2007, the bookings read according to Table 5.5:

Table 5.5. FX Forward bookkeeping entries

Bookkeeping Account       Entry Type         Amount
Cash Account (USD)        Credit       1,000,000.00
Cash Account (EUR)        Debit          854,700.85
FX Revaluation Account    Debit          145,299.15
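The amounts in Tables 5.4 and 5.5 can be checked with a few lines of Python. This is only an arithmetic sketch: it works on the amounts as printed, does not model currencies or accounting rules, and the helper name `totals` is hypothetical. It confirms that debits equal credits in both tables and that the figures in Table 5.5 follow from dividing 1,000,000.00 by the spot rate of 1.1700.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def totals(entries):
    """Return (total debits, total credits) for a list of bookings."""
    debit = sum((amt for _, side, amt in entries if side == "Debit"), Decimal(0))
    credit = sum((amt for _, side, amt in entries if side == "Credit"), Decimal(0))
    return debit, credit


# Table 5.4: principal plus accrued interest is paid from the cash account.
bond_entries = [
    ("Cash Account (USD)", "Credit", Decimal("11029300.00")),
    ("Bond Account", "Debit", Decimal("10504300.00")),
    ("Bond Interest Account", "Debit", Decimal("525000.00")),
]

# Table 5.5: amounts as printed, derived from the spot rate on the value date.
spot = Decimal("1.1700")
cash_amount = Decimal("1000000.00")
converted = (cash_amount / spot).quantize(CENT, rounding=ROUND_HALF_UP)
revaluation = cash_amount - converted
fx_entries = [
    ("Cash Account (USD)", "Credit", cash_amount),
    ("Cash Account (EUR)", "Debit", converted),
    ("FX Revaluation Account", "Debit", revaluation),
]

for name, entries in (("bond", bond_entries), ("fx forward", fx_entries)):
    debit, credit = totals(entries)
    print(name, debit, credit, debit == credit)
```

`Decimal` is used instead of floating point so that the rounding to cents is explicit and reproducible.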

In this example, the result is a debit entry of 854,700.85 Euros, which is 145,299.15 Euros less than the amount implied by the transaction exchange rate. This difference is booked as a foreign exchange loss in the foreign exchange revaluation account of the US operating unit. The result from the forward points is booked into the interest rate profit/loss account. Entries in the closing books can be made at an arbitrary date. They are similar to bookkeeping entries, except that the items booked in the closing books are not actual cashflows but unrealized items (for example accrued interest, net market value). The unrealized items need to be allocated to the correct accounting period, for which entries are made at the end of every month. The effect of the two transactions can be seen in different back office reports, including the “Delivery Report”, the “Daily Deals” and the “Outstanding Positions” Report. The effect of the sample transactions on the liquidity position is seen in real-time in “Treasury Monitor” as soon as the transactions are entered. There are three major benefits that arise from this procedure of handling financial transactions. First, all transaction types are properly documented and recorded in a uniform way by entering data once only. This allows for better accuracy and correctness, since bookkeeping entries and payments are not made from transactions which have not been sufficiently verified. Second, communications in the


business process are clearly defined, which significantly improves error handling by reducing miscommunication and the time spent on correcting data inconsistencies. Third, data migration to spreadsheets and neighbouring systems is done from a single-source, central database. This substantially reduces operational risk by eliminating the human error that would occur from loading data manually. The operations that take place in the middle office are related to risk controlling and the profit/loss supervision of the BMW Group Treasury’s positions. Positions are administered via so-called portfolios that can be defined both with regard to competencies and on a product group level (“Portfolio Hierarchy” principle). For example, let us first consider the BMW Group portfolio of competencies from a top-down perspective. The Group portfolio itself is the parent portfolio for each of its regions. Regions form a second parent level in the top-down hierarchy and consist of entity sub-portfolios, i.e. sub-portfolios for every BMW subsidiary assigned to a specific region. Furthermore, every entity consists of a range of businesses, which are represented by a third level of sub-portfolios. The overall treasury position is monitored by viewing the entire BMW Group Treasury portfolio, while the American Region position is monitored by the American Region’s portfolio. Since the same data and the same transactions will often be viewed from different perspectives, one portfolio can also belong to several hierarchies. For example, on a product group level, the currency spot risk can be transferred from the currency forward portfolio to the spot portfolio. Figure 5.8 displays the portfolio hierarchy as defined for the BMW case study. The transactions are entered into the bond and forward portfolios.
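The Portfolio Hierarchy principle amounts to aggregating transactions over a tree. The sketch below is a minimal illustration: the `Portfolio` class is hypothetical, and the tree layout (MAIN containing BOND and FX, with FX FORWARD under FX and also under an AMERICAN REGION portfolio) is an assumed example, not the hierarchy of Figure 5.8.

```python
class Portfolio:
    """A node in a portfolio tree (hypothetical illustration)."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.transactions = []

    def add(self, child):
        """Attach a sub-portfolio and return it for convenience."""
        self.children.append(child)
        return child

    def all_transactions(self):
        """Collect transactions of this portfolio and all underlying ones."""
        result = list(self.transactions)
        for child in self.children:
            result.extend(child.all_transactions())
        return result


# Product-group hierarchy (assumed layout).
main = Portfolio("MAIN")
bond_pf = main.add(Portfolio("BOND"))
fx_pf = main.add(Portfolio("FX"))
fwd_pf = fx_pf.add(Portfolio("FX FORWARD"))

# The same portfolio object can also be attached to a second hierarchy.
region = Portfolio("AMERICAN REGION")
region.add(fwd_pf)

bond_pf.transactions.append("Bond A purchase")
fwd_pf.transactions.append("EUR/USD forward")

print(main.all_transactions())
print(fx_pf.all_transactions())
print(region.all_transactions())
```

Selecting a higher node simply widens the set of transactions that enter the position, which is the behaviour the text describes for the MAIN and FX portfolios.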
To view the positions of both transactions at the same time, the risk manager must choose to view the “MAIN” portfolio. It is possible to view both transactions because both portfolios are underlying portfolios of the MAIN portfolio. Similarly, if the risk

Figure 5.8. Example of portfolio hierarchies at a product group level


manager selects the foreign exchange (FX) portfolio, only foreign exchange transactions are included in the position results, because only foreign exchange transactions are located in the foreign exchange portfolio and its underlying portfolios. The portfolio tree structure thus allows positions to be consolidated and aggregated, which enables the efficient control of risks and results on various organizational levels and according to various criteria. When the risk manager has selected the position to be monitored, Finance KIT retrieves the relevant transaction cashflows and calculates the requested key figures based on the cashflows and the market information it contains. In Finance KIT, risk and unrealized result calculations are based on the market value of specific cashflows at the time of valuation. Therefore, when a transaction is entered, it is valued with mark-to-market rates in order to manage risks and calculate realized and unrealized results (“Valuation” principle). According to the Valuation principle, the market value of a position is the present value of the future cashflows that make up the position. To obtain the market value of a cashflow, it has to be discounted at a specific discounting rate. The discounting rate is defined separately for each instrument and can be either a direct mark-to-market quotation of the instrument or a specific yield curve. By default, market value is expressed in the base currency of the portfolio. Thus the present values denominated in a foreign currency are converted into the base currency using the current spot exchange rate. All future cashflows are discounted to the date from which the position is monitored. This produces the present value of the cashflows on which risk figures are based. The most recent market information is always used when valuing positions. The discount rate depends on the definitions of the instrument from which the cashflow results.
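The Valuation principle can be sketched as a discounted sum of future cashflows, converted into the base currency at the spot rate. The Python function below is a simplified illustration, not Finance KIT's valuation logic: it assumes a single flat annual discount rate, annual compounding and an actual/365 day count, and the 5% rate and the conversion factor are hypothetical inputs. The cashflows are the future USD cashflows of the bond from Table 5.2.

```python
from datetime import date


def present_value(cashflows, valuation_date, annual_rate, spot_to_base=1.0):
    """Discount future cashflows to valuation_date and convert to base currency.

    cashflows: list of (value_date, amount) in the instrument's currency.
    annual_rate: flat discount rate (assumption; a yield curve would vary by date).
    spot_to_base: units of base currency per unit of cashflow currency.
    """
    pv = 0.0
    for value_date, amount in cashflows:
        if value_date < valuation_date:
            continue  # past cashflows are not part of the market value
        years = (value_date - valuation_date).days / 365.0
        pv += amount / (1.0 + annual_rate) ** years
    return pv * spot_to_base


future_flows = [
    (date(2008, 6, 3), 900_000.0),
    (date(2009, 6, 3), 900_000.0),
    (date(2009, 6, 3), 10_000_000.0),
]
valuation_date = date(2006, 1, 29)

pv_usd = present_value(future_flows, valuation_date, annual_rate=0.05)
pv_eur = present_value(future_flows, valuation_date, annual_rate=0.05,
                       spot_to_base=1.0 / 1.17)  # hypothetical USD-to-EUR conversion
print(round(pv_usd, 2), round(pv_eur, 2))
```

Replacing the flat rate with a lookup into a yield curve by value date would bring the sketch closer to the instrument-specific discounting described in the text.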
The rate can be any direct market quotation, yield curve or zero-coupon rate available in Finance KIT. These quotations and yield curves are usually obtained from external market information sources. The zero-coupon rate is calculated from swap yield curves or instruments. Similarly, other instrument-specific definitions, such as date basis and price type for the interest (periodic or yield), affect the calculations. Middle office can view the case study in different reports, for example in the key-figure report and in the periodic P/L report. The risk manager can request reports in which only the transactions that have reached a certain transaction state are included. In the example used in this case study, the risk manager keeps Treasury Monitor open on the computer screen at all times in order to monitor the overall risk position of the treasury and to see the change in the risk figures in real-time as new transactions and market information are received. Limit Follow-Up must also be open in order to see how the defined portfolio and instrument limits are used in real-time. If the risk figures reach an unsatisfactory level, the user can examine the position in more detail in Limit Follow-Up, in order to determine the origin of the excess risk. To identify the root of the problem, the user can split each key-figure by counterparty, instrument, portfolio, transaction, etc. Based on this information, the risk manager can inform the traders of possible hedging or advise other actions. As the position is monitored in real-time, the discrepancies from the desired levels of risk can be


Table 5.6. Position Monitoring

Market Value        10,500,200.00 USD
Net Market Value        -4,100.00 USD
IR Exposure             82,420.00 USD

quickly identified. In this case study, the user monitors the money market (MM) portfolio. The bond transaction is part of the money market position and the figures in Table 5.6 are the key figures of this transaction. They can be seen in reports and in Treasury Monitor. The market value is the present value of the future cashflows of the bond. The net market value is the difference between the purchase price and the current market value. Interest rate (IR) exposure is the change in the market value (for example profit or loss) if the discount rate rises by a defined percentage. The risk manager can view similar profit/loss figures for the FX transaction. It is also possible to see the effect of these transactions on the credit risk of the treasury by printing a credit risk report. Deal Capture continuously receives the latest market information. This information is used in valuations in order to obtain current and reliable risk and result figures (“Real-Time” principle). Whether it is a new deal entered into the system or updated market quotes for currencies, instruments, yield curves or volatilities, Finance KIT automatically updates all new or modified information in real-time for all its applications without it having to be requested. Real-Time means that all new information is available instantly, i.e. risks, market value and other key figures are recalculated as soon as there is a change in a market variable and/or a position. Recalculation occurs continuously, according to the updating cycles of the market rates. Market rates are obtained from Reuters, an external news provider. For every position, Finance KIT calculates the P/L for the chosen periodicity (daily, monthly, quarterly, annually) and displays realized and unrealized results, break-even rates and net present value. Revenues are displayed by rate and by currency. Reports are not updated in real-time, which means that they must be updated manually in order to reflect new information.
The main advantage of real-time position updating is that it enables increased trading and risk control. All mathematical calculations and formulae for trade entry, valuation and risk calculations are programmed in so-called “Financial Objects”. Financial Objects are procedures which Finance KIT stores and then uses to make calculations. Financial Objects and instrument information determine how Finance KIT handles cashflows.
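The key figures described above can be expressed in a few lines of Python. This is a sketch under stated assumptions, not a reproduction of a Financial Object: the function names are hypothetical, the sign convention for the net market value (market value minus purchase price) is chosen so that it matches the negative figure in Table 5.6, and the interest rate exposure is computed by revaluing hypothetical cashflows after a one percentage point rise in a flat discount rate. The IR exposure in Table 5.6 is not reproduced, because the underlying curve and bump size are not given.

```python
def market_value(cashflows, rate):
    """Present value of (years_from_valuation, amount) pairs at a flat rate."""
    return sum(amount / (1.0 + rate) ** t for t, amount in cashflows)


def net_market_value(market_val, purchase_price):
    """Unrealized result: current market value minus purchase price."""
    return market_val - purchase_price


def ir_exposure(cashflows, rate, bump=0.01):
    """Change in market value if the discount rate rises by `bump`."""
    return market_value(cashflows, rate + bump) - market_value(cashflows, rate)


# Net market value from the figures in Tables 5.2 and 5.6.
nmv = net_market_value(10_500_200.00, 10_504_300.00)

# Hypothetical two-year cashflow profile and flat 5% rate for the bump example.
example_flows = [(1.0, 900_000.0), (2.0, 10_900_000.0)]
exposure = ir_exposure(example_flows, rate=0.05)

print(nmv, round(exposure, 2))
```

In a real-time setting the same functions would simply be re-evaluated whenever a rate or a position changes, which is what the Real-Time principle describes.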

5.6.2 Decision-Support Functionalities TMS often lack in-depth risk analysis capabilities that fit the very specific nature of single risk factor exposures. This can be attributed to slow updating cycles from TMS system vendors, their reluctance in addressing single customer needs, and


functionally restricted applications due to a high degree of standardization. In order to replace individual spreadsheet-based solutions, large efforts are currently being made at BMW to enhance the best-of-breed environment with statistical decision-support system functionalities in the areas of currency hedging, interest rate and asset management. Even if suitable system providers are selected, this is a complex and time-consuming task because systems integration has to be achieved on a variety of levels, such as complementing or even redefining existing processes, incorporating corporate goals and restrictions, user training, etc. In Section 5.2 we distinguished between different kinds of decision-support systems, namely model-driven, communication-driven, document-driven, and knowledge-driven ones. In addition, academics and practitioners have discussed building decision-support systems in terms of four major components: (a) the user interface, (b) the database, (c) the model and analytical tools, and (d) the decision-support system architecture (Power 2002). At BMW, decision-support system functionalities are intended to be model- and data-driven desktop solutions for the use of a few specialists only. Although these solutions are generally model-independent, they are not data-independent from Finance KIT. All financial planning data is saved centrally in Finance KIT. Thus, automated interfaces have to be established that enable the transfer of predefined corporate financial planning data from Finance KIT. In addition, risk factor market data, such as rates, volatilities and correlations, need to be transferred to the system from an external database. Automation is important in order to eliminate human error and to react quickly to market events. As soon as the systems are supplied with the required data, the treasury process is continued according to front and middle office responsibilities (see Section 5.2).
For example, for the purpose of strategy development, analysis and implementation, a front office trader may simulate a range of alternative financial instruments via historical or Monte Carlo simulation methods in order to compare their risk-return profiles. Implementation of intelligent optimization procedures is necessary in order to obtain optimal solutions to decision problems, such as finding the optimal asset allocation or determining the optimal mix of hedging instruments, in a reasonable amount of time. Once a decision is made and a deal has been executed – no matter if it was recommended by the system or not – it is entered in Finance KIT’s Deal Capture, and all references are updated. Otherwise, for the purpose of middle office responsibilities, market risk and opportunities can be quantified by evaluating exposures with respective market factor scenarios which serves as a foundation for risk reporting and strategy development.
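As an illustration of how a Monte Carlo simulation can compare the risk-return profiles of hedging alternatives, the following Python sketch simulates the USD cost of buying EUR 1,000,000 at a future date for different hedge ratios of a forward contract. It is not BMW's decision-support system: the lognormal exchange rate model, the volatility, the forward rate of 1.1650 and the function name `simulate_costs` are all assumptions chosen for the example.

```python
import math
import random
import statistics


def simulate_costs(hedge_ratio, n_paths=10_000, spot=1.15, forward=1.165,
                   vol=0.10, horizon=1.0, notional_eur=1_000_000, seed=42):
    """Return (mean, standard deviation) of the simulated USD cost.

    hedge_ratio: share of the EUR amount bought at the forward rate;
    the remainder is bought at the simulated spot rate at the horizon.
    """
    rng = random.Random(seed)  # fixed seed so that alternatives share the same paths
    costs = []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Driftless lognormal model for the EUR/USD rate (assumption).
        s_t = spot * math.exp(-0.5 * vol ** 2 * horizon + vol * math.sqrt(horizon) * z)
        effective_rate = hedge_ratio * forward + (1.0 - hedge_ratio) * s_t
        costs.append(notional_eur * effective_rate)
    return statistics.mean(costs), statistics.pstdev(costs)


results = {ratio: simulate_costs(ratio) for ratio in (0.0, 0.5, 1.0)}
for ratio, (mean_cost, std_cost) in results.items():
    print(f"hedge ratio {ratio:.1f}: mean {mean_cost:,.0f} USD, std {std_cost:,.0f} USD")
```

Using the same seed for every alternative means that differences between the results stem from the hedging strategy rather than from sampling noise, which makes the comparison of risk-return profiles easier to interpret.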

5.7 Final Remarks

The subject of information systems architecture and process support in corporate treasuries is receiving considerable attention. Over the years, decentralized governance structures have resulted in highly fragmented, duplicative information architectures that complicate information integration and increase ongoing support and maintenance costs. In contrast, today’s fast-paced business environment requires companies not only to simplify information flows, but also to improve their knowledge and information architectures. Thus, a conflict has to be resolved that arises from the need to consolidate IT systems on the one hand, while still granting a certain degree of IT systems specification on the other. Since these two objectives are somewhat contradictory, determining the optimal IT systems architecture becomes a very complex problem. We demonstrated in detail how BMW copes with this conflict by considering the current state of IT systems support for the BMW Group’s treasury process. Uniform data records, single data sources and standardized information flows between process partners have enabled BMW to streamline its business processes and reduce operational risks originating from human error. This consolidation provides the foundation for the current implementation of context-dependent decision-support systems in the area of risk management. Given the technological opportunities offered by system vendors, we argue that no single vendor is able to comprehensively meet the challenges that BMW faces today. As the market stands now, BMW needs to rely on a combination of vendors in order to obtain a complete business solution.

References

Ashlock D (2006) Evolutionary computation for modelling and optimization, Springer
Alter SL (1980) Decision support systems: current practice and continuing challenges, Addison-Wesley, Reading/Mass
Benston G, Bromwich M, Litan RE, Wagenhofer A (2003) Following the Money: The Enron failure and the state of corporate disclosure. AEI-Brookings Joint Center for Regulatory Studies, Washington DC
Bank for International Settlements BIS (2005) Triennial central bank survey – foreign exchange and derivatives market activity in 2004
Bodnar G, Marston RC, Hayt G (1999) Survey of derivatives usage by U.S. nonfinancial firms. Financial Management, 27:70–91
Chan M-C, Wong C-C, Cheung BK-S, Tang GY-N (2002) Genetic algorithms in multi-stage portfolio optimization system. In proceedings of the eighth international conference of the Society for Computational Economics, Computing in Economics and Finance, Aix-en-Provence, France
Culp C and Miller M (1994) Risk management lessons from Metallgesellschaft, Journal of Applied Corporate Finance
Ernst & Young (2005) Treasury operations survey 2005 results
Eun CS and Resnick BG (2004) International financial management. Third edition, Irwin McGraw Hill, New York
Finlay PN (1994) Introducing decision support systems, NCC Blackwell, Blackwell Publishers, Oxford/UK Cambridge/Mass

Christian Ullrich, Jan Henkel

Frawley W, Piatetsky-Shapiro G, Matheus C (1992) Knowledge discovery in databases: An overview. AI Magazine, Fall 1992, 213–228
Gachet A (2004) Building model-driven decision support systems with dicodess, VDF, Zurich
Greenspan A (1999) Financial derivatives. Speech, before the Futures Industry Association, Boca Raton, Florida, March 19
Keen PGW and Morton MSSS (1978) Decision support systems: an organizational perspective, Addison-Wesley Publishing, Reading/Mass
Markowitz HM (1971) Portfolio selection: efficient diversification of investments, Yale University Press
Mello AS and Parsons JE (1994) Maturity structure of a hedge matters. Lessons from MG AG, Derivatives Quarterly, Fall, 7–15
Mitchell TM (1997) Machine Learning, McGraw-Hill
Modigliani F and Miller M (1958) The cost of capital, corporation finance and the theory of investment. American Economic Review, 48:261–97
Nguyen HT and Walker EA (2005) A first course in fuzzy logic, Third edition, Chapman & Hall
Pacheco MA, Noronha M, Vellasco M, Lopes C (2000) Cash flow planning and optimization through genetic algorithms. In: Computing in Economics and Finance 2000, Society for Computational Economics (eds)
Peramunetilleke D, Wong RK (2002) Currency exchange rate forecasting from news headlines. In proceedings of the thirteenth Australasian conference on database technologies, 131–139, Australian Computer Society, Inc
Power DJ (1997) What is a DSS? The On-Line Executive Journal for Data-Intensive Decision Support, October 21, vol 1, no 3, http://dssresources.com/papers/dssarticles.html
Power DJ (2000) Web-based and model-driven decision support systems: concepts and issues. In proceedings of the Americas conference on information systems, Long Beach, California
Power DJ (2002) Decision support systems: concepts and resources for managers, Quorum Books, Westport/CT
PricewaterhouseCoopers (2003) Corporate treasury in Deutschland, Industriestudie
Rockart J (1988) The line takes leadership – IS management in a wired society, Sloan Management Review, Summer, 57–64
Trema (2002) Introducing Finance KIT for Unix and Windows, Version 8.0.8
Turban E (1995) Decision support and expert systems: management support systems, Prentice Hall, Englewood Cliffs/NJ
Ullrich CM, Seese D, and Chalup S (2005) Predicting foreign exchange rate return directions with support vector machines. In proceedings of the fourth Australasian data mining conference, Simoff SJ, Williams GJ, Galloway J and Kolyshkina I (eds), 221–240, Sydney, Australia
Unknown (2006) Treasury management systems – choosing a TMS. Treasury Today, April, 7
Zachman JA (1987) A framework for information systems architecture, IBM Systems Journal, 26, no 3

CHAPTER 6 Streamlining Foreign Exchange Management Operations with Corporate Financial Portals Hong Tuan Kiet Vo, Martin Glaum, Remigiusz Wojciechowski

6.1

Introduction

Global competition and decreasing profit margins force companies to reduce their operational costs. This puts pressure on all business units and operations, including financial management. One approach to increasing operational efficiency is the automation of existing processes by eliminating unnecessary manual interventions. Alternatively, one can design and implement new, improved processes. This is a challenging task that becomes even more complex if financial management has to deal with the heterogeneous environments of a multinational corporation. In this setting, information systems are required that help to reduce the complexity of the integration, communication, coordination, and collaboration challenges. Thus, financial management is in search of the equivalent of the global information system postulated by Chou (1999). In this work, we present and discuss the concept of the corporate financial portal as a global financial information system and management reporting tool. We discuss the extent to which corporate financial portals can help to meet the information management challenges of today’s multinational financial management. To complement the theoretical argumentation, we present the foreign exchange management practices as they are currently conducted by Bayer AG, a German company with core competencies in the business areas of health care, nutrition, and high-tech materials. By means of a corporate financial portal and respective service implementations, Bayer’s foreign exchange (FX) management practices have been fundamentally restructured and simplified. We evaluate the impact of the corporate financial portal implementation on Bayer’s foreign exchange management practices by analyzing the transaction data from 2002 to 2006. The chapter is structured as follows: Section 6.2 briefly outlines the challenges the financial management of multinational corporations has to face.
In today’s international environment, financial management on a corporate level is faced with integration, coordination and collaboration challenges when striving for operational efficiency. In Section 6.3, we present the concept of the corporate financial portal and discuss how such systems can be of use in the context of financial management. In Section 6.4, we describe the Corporate Financial Portal (CoFiPot) project at Bayer. We explain how the information and transaction services offered by the portal led to major improvements in the efficacy, efficiency and quality of foreign exchange management at Bayer. The chapter concludes with a summary of the lessons learned and an outline of further research questions regarding this topic.

6.2

Challenges of Multinational Financial Management

Corporate financial management has to fulfill three primary functions. First, it has to guarantee the financing of long-term business activities throughout the organization. Second, it has to manage short-term liquidity needs and optimize corporate cash flows and cash balances. Third, it has to design and implement financial risk management strategies.1 While already demanding in a domestic setting, these tasks become highly complex in a multinational context. Multinational corporations (MNCs) operate in several countries and are therefore confronted with a multitude of economic, financial, legal and tax environments. Moreover, MNCs are faced with additional risks, in particular exchange rate risks and political risks. The complexity of multinational financial management is increased by the fact that MNCs are usually organized as groups of legally independent firms, with local subsidiaries in various foreign countries under the common control of the parent company. On the one hand, the group structure can be used by the MNC to exploit differences in national legal, tax and other systems. On the other hand, it sets further restrictions for management, as each unit is subject to the legal requirements of its respective host country (Glaum and Brunner 2003). Geographical distance, different time zones as well as language and cultural differences further contribute to the challenges. As mentioned before, in this setting financial management has to meet integration, coordination and collaboration demands.

Today, MNCs are subject to fierce global competition. Decreasing profit margins force them to reduce their operational costs. This also puts pressure on financial management to cut costs and increase the efficiency of its financial operations. The automation of existing processes by eliminating unnecessary manual interventions is one way to increase operational efficiency. An alternative approach is to design and implement new, improved processes. The prerequisite in each case is the integration of the involved information systems. The financial industry is well known for its highly heterogeneous systems landscape (Weitzel et al. 2003). Considering that the growth of MNCs is often based on mergers and acquisitions, MNCs are often confronted with the need to integrate a highly diverse systems landscape with regard to global business processes.

A further challenge for financial management is the coordination of financial activities in the multinational business environment. As noted above, geographical distances, different time zones and local holidays make coordination and communication within the financial organization difficult. In particular, if the corporate centre has to provide in-house banking services (Richtsfeld 1994) to the dispersed affiliates, it will want to establish mechanisms and processes that enable services on a global scale, regardless of time zones and geographical restrictions.

In addition to the coordination requirements, collaboration also poses challenges in this context. The major goals and strategies of the MNC’s financial management are developed at the corporate centre. However, local managers may pursue their own, personal goals. The MNC thus may have to deal with principal-agent problems (Hommel and Dufey 1996) that affect the willingness and quality of collaboration within the organization. Yet the quality of the decisions made by the corporate financial centre strongly depends on the timely delivery of accurate financial reports provided by the operational and legal sub-units. The efficient execution of decisions also depends on the willingness of local management to follow the MNC’s overall financial strategies. Thus, the financial management of the MNC has to design and implement incentive mechanisms and control structures that enforce and ensure collaboration and compliance within the organization.

1 Refer to (Cooper 2000; Eun and Resnick 2004; Eiteman, Moffett et al. 2006; Shapiro 2006) for a detailed discussion of (multinational) financial management functions; a brief summary is presented by Glaum and Brunner (2003).
In order to meet these challenges, financial management has to make use of the benefits that information systems based on Internet technology offer. The following section therefore presents the concept of the corporate financial portal and discusses to what extent this tool can support the main financial management functions.

6.3

Corporate Financial Portals

Corporate portals are applications built upon and accessed through the use of Internet technology. Their primary function is to offer a single point of access to personalized, business-relevant information, applications and services (e. g. Collins 2001; Dias 2001; Vo et al. 2006). Depending on the actual field of application, the literature distinguishes between a multitude of different types of corporate portals, including knowledge portals, process portals, and enterprise information portals (Dias 2001). This work focuses on the concept of corporate financial portals that address the information management requirements of the corporate financial management functions on a global scale. A corporate financial portal serves as a central hub for financial information, applications and services on the corporate level as well as on the level of the individual affiliates. In the following, we discuss key characteristics of corporate portals in general, as well as the role of the portal concept in the context of financial management of MNCs.

6.3.1

Key Characteristics of Corporate Portals

Today, most companies have an efficient, always-on communications channel: the intranet. This network can be accessed from virtually everywhere in the world, provided that access to the Internet is available. Applications built upon Internet technology are called Web applications. Every corporate portal implementation is a special form of Web application that can consequently be accessed from everywhere by means of a standard Web browser. In contrast to traditional applications, Web applications offer various advantages to users and operators alike. From the latter’s perspective, Web applications have to be installed and maintained on one central application server only, which reduces the complexity of software administration and maintenance tasks (Balzert 2000, p. 946) and is particularly helpful in a global operating environment. Further, users are guaranteed to always access the latest version of the application without having to attend to installation and software updates. From the end-users’ perspective, the primary function of corporate portals is to act as a hub for information, applications and services. For this purpose, the integration of data and systems is a prerequisite (cf. Davydov 2001; Linthicum 2003). Corporate portal initiatives therefore have to be accompanied by associated integration initiatives. Preferably, the integration efforts result in a reusable business integration architecture (Vo et al. 2005), which forms the basis for any portal service or application, as it allows portal services to easily access the data and system functionality needed for further processing and transformation. A corporate portal will not per se solve the technical issues of system and data integration: these problems have to be addressed within separate integration initiatives.
However, the corporate portal can foster integration on the business process level: corporate portal services can access the data and system functionality of one system and make them available to the user within a specific business context. More importantly, the corporate portal architecture supports the implementation of application components and services that can easily combine, transform, and process data and functionality from various information sources. This facilitates the streamlining of existing business processes as well as the development of new ones. With the corporate portal, these services will be available instantly throughout the organization. The corporate portal therefore not only “opens the door” to existing integration, but also facilitates process integration on a global level.

6.3.2

Corporate Financial Portals

With the corporate financial portal, financial management is offered a tool that supports the coordination needs within the organization. The corporate financial portal can provide frequently requested information and services regardless of time zone and geographical restrictions. In this context, the corporate financial portal can be regarded as a one-stop shop (Österle 2001) that fulfills the user’s financial information and transaction needs. Thus, the objective is to establish within the financial community the habit of searching the corporate financial portal first for the needed information or service. To support this objective, it is important to inform portal users on a regular basis about new services that are available and to monitor their wishes. As a result, the portal will help to reduce the communication and coordination needs of standard processes. Moreover, in a multinational environment, financial management processes depend on a variety of information that is dispersed throughout the organization. The corporate financial portal and the accompanying integration activities provide the means to access, process, and analyze the financial data throughout the organization with regard to specific business scenarios. As an example, a corporate financial portal can implement services that periodically (e. g. on a daily basis) evaluate the corporate foreign exchange exposure level and automatically notify the persons in charge in case critical limits are reached. Aside from the information requirements, the effective and efficient coordination of the information flows is a critical determinant of multinational financial management operations. From a corporate perspective, information processes have to be coordinated between subsidiaries and the corporate centre.
As the quality of financial management decisions significantly depends on the quality of the underlying data, it is important to provide the financial managers of the local subsidiaries with the necessary incentives to supply data of the required quality. With a corporate portal, the corporate finance centre can offer value-adding reporting services for these local subsidiaries. Because the quality of these reports depends on the quality of the original data provided by the local managers, the local managers have an incentive to provide high-quality data for their own benefit. Financial planning at the corporate level, which requires the collection of planning figures from the dispersed local affiliates, is one potential scenario for the application of a corporate portal as a reporting hub. In summary, the corporate financial portal provides the instruments to improve the effectiveness and efficiency of the financial processes on a global scale. It fosters the management of data, information and knowledge regardless of time zones and geographical boundaries. Still, one has to keep in mind that a corporate financial portal is not a product that is bought off the shelf. Rather, it is an information system that follows an evolutionary development process and thus grows in functionality with the number of services provided. In the following section, we describe a corporate financial portal implementation and evaluate its impact on the financial management processes of the company concerned.
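The limit-monitoring service mentioned in this section (periodic evaluation of the exposure level with automatic notification) can be sketched as follows. This is a minimal illustration under our own assumptions, not the implementation of any particular portal: the class `ExposureLimit`, the function `check_exposure_limits`, the limit values and the contact addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExposureLimit:
    currency: str
    max_abs_exposure: float  # critical limit in currency units (hypothetical)
    contact: str             # person in charge (hypothetical address)


def check_exposure_limits(net_exposure, limits):
    """Return one alert message per currency whose critical limit is reached.

    net_exposure maps a currency code to the current net exposure; long
    positions are positive, short positions negative.
    """
    alerts = []
    for limit in limits:
        exposure = net_exposure.get(limit.currency, 0.0)
        if abs(exposure) >= limit.max_abs_exposure:
            alerts.append(
                f"To {limit.contact}: {limit.currency} net exposure "
                f"{exposure:,.0f} has reached the limit of "
                f"{limit.max_abs_exposure:,.0f}"
            )
    return alerts


# Example run of the daily check with made-up figures.
limits = [
    ExposureLimit("USD", 5_000_000, "fx-desk@example.com"),
    ExposureLimit("JPY", 800_000_000, "fx-desk@example.com"),
]
todays_exposure = {"USD": -6_200_000, "JPY": 120_000_000}
for message in check_exposure_limits(todays_exposure, limits):
    print(message)
```

In a portal setting such a check would be triggered by a scheduler and the messages handed to a notification channel; both are left out here because the text does not specify them.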

6.4

Bayer’s Corporate Foreign Exchange Management with the Corporate Financial Portal

Bayer is a global enterprise with 110,200 employees and core competencies in three business areas: health care, nutrition, and high-tech materials. The Bayer Group comprises three business areas2 that operate independently under the leadership of the management holding company Bayer AG. The German headquarters supervises and coordinates the activities of 350 companies worldwide. In 2005, the Bayer Group generated net sales of 27.3 billion euros and an operating result (EBIT) of 2.8 billion euros.3 In 2000, Bayer decided to reorganize the group’s financial management structure. The decision was to centralize the financial management activities at the Corporate Center in Leverkusen, Germany, and thus to reduce the number of dispersed regional financial services centers worldwide. Currently, the Bayer Group’s Corporate Finance Center comprises the Corporate Finance, Corporate Treasury, Credit Risk Management, and Corporate Financial Controlling departments. It takes the role of a global financial services centre and bears global responsibility for the financial management of the Bayer Group. The strategic reorganization poses new requirements and challenges for the information management of Bayer’s financial management. On the one hand, the Corporate Finance Center has to provide the worldwide subsidiaries with the required financial information and services. On the other hand, group-level financial management decisions require the Corporate Finance Center to collect, aggregate and process the relevant financial data from the individual subsidiaries worldwide. To address these information management challenges, the Corporate Financial Portal project was initiated in 2001. The objective was to provide a central point of entry for all manner of financial services for the global financial community of the Bayer Group by means of an intranet corporate portal, the CoFiPot.
Currently, the CoFiPot provides services that support the key tasks of corporate treasury management, corporate financing, corporate controlling, and corporate financial planning. Furthermore, a knowledge base and a number of general services address the requirements of Bayer’s financial community. Figure 6.1 shows a screenshot of the homepage of the Corporate Financial Portal. In this section, we describe and discuss the impact of the implementation of the corporate financial portal on Bayer’s foreign exchange risk management practices. We first describe the corporate foreign exchange management processes and then elaborate on the portal services that were introduced to reengineer key aspects of the corporate foreign exchange management.

2 Subgroups: Bayer HealthCare, Bayer CropScience, Bayer MaterialScience.
3 Figures from Bayer Investor Relations: http://www.investor.bayer.de/

Figure 6.1. The homepage of the Corporate Financial Portal (as of February 2007)

6.4.1

Corporate Foreign Exchange Management

Companies with international business operations are exposed to foreign exchange risk inasmuch as the cash flows from their projects depend on exchange rates and future exchange rate changes cannot be fully anticipated. Broadly speaking, one can distinguish three concepts to measure the effect of exchange rate changes on the firm. The accounting exposure concept (translation or book exposure) measures the impact of parity changes on accounting profits and owners’ equity. The transaction exposure concept concentrates on contractual commitments which involve the actual conversion of currencies. Finally, the economic exposure concept also takes into account the long-term effects of exchange rate changes on a firm’s competitiveness in its input and output markets.4 Surveys among MNCs show that the majority of companies base their foreign exchange risk management on the hedging of transaction exposure (e. g. Glaum 2000; Marshall 2000; Pramborg 2005). Companies first try to determine their exposures on the basis of their open foreign-currency denominated contracts (receivables, payables, loans, etc.). Sometimes, foreign-currency cash flows which are expected in the short run, typically over the budgeting horizon, are included in the determination of the foreign exchange risk exposure. Depending on the firm’s hedging strategy, financial management can then neutralize the exposure by entering into counter-balancing forward contracts. For example, an exporting firm that expects US-dollar inflows from US export receivables can sell the expected future US-dollar inflow in the forward market. The effects of exchange rate changes on the receivables and on the forward market position will then cancel each other out, so that the home currency value of the future cash flow is fixed (US-dollar amount times the forward rate). Figure 6.2 depicts Bayer’s foreign exchange risk management process.5

4 Refer to (Glaum 2001) for a detailed analysis of the three exposure concepts.

[Figure 6.2 (diagram); labels: Corporate FX management practices and guidelines (e.g. hedging limits and restrictions, benchmark definition, …); Measuring FX Exposure; Hedging FX Exposure; Controlling FX Hedging; feedback]

Figure 6.2. Bayer’s foreign exchange risk management process

In order to gain from economies of scale and consequently to reduce the cost of risk management, MNCs tend to manage foreign currency risk on a global level rather than on a domestic level. More precisely, firms try to identify their overall net positions in any given currency by subtracting expected cash outflows (“short” positions) from expected cash inflows (“long” positions) over all of their sub-units. Since the effects of exchange rate changes on long and short positions cancel each other out, only the net position is effectively exposed to exchange risk. Hence, the netting of foreign currency exposure over the parent company and all subsidiaries results in a substantially reduced volume of hedging transactions. In order to enable netting on a global, group-wide level, corporate financial management has to deploy a central treasury management system which retrieves the local exposure data from the subsidiaries and implements mechanisms and functionality to support the multilateral netting process (Cooper 2000).
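The netting logic described above can be illustrated with a short sketch. The function names and the sample cash flows are our own assumptions for illustration; they do not reproduce the data model of any treasury management system.

```python
from collections import defaultdict


def net_positions(cash_flows):
    """Net expected cash flows per currency across all sub-units.

    cash_flows is an iterable of (entity, currency, amount) tuples;
    inflows ("long" positions) are positive, outflows ("short"
    positions) negative.
    """
    net = defaultdict(float)
    for _entity, currency, amount in cash_flows:
        net[currency] += amount
    return dict(net)


def gross_positions(cash_flows):
    """Sum of absolute exposures per currency, i.e. the volume that
    would have to be hedged if every position were hedged separately."""
    gross = defaultdict(float)
    for _entity, currency, amount in cash_flows:
        gross[currency] += abs(amount)
    return dict(gross)


# Hypothetical expected cash flows of three sub-units.
flows = [
    ("Entity A", "USD", 10_000_000),    # expected USD inflow
    ("Entity B", "USD", -7_000_000),    # expected USD outflow
    ("Entity B", "JPY", 500_000_000),
    ("Entity C", "JPY", -500_000_000),
]

print(net_positions(flows))    # USD: 3,000,000 net long; JPY: fully offset
print(gross_positions(flows))  # USD: 17,000,000; JPY: 1,000,000,000
```

In this made-up example only 3 million USD remain exposed after netting, compared with a gross volume of 17 million USD, and the JPY positions cancel out entirely.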

6.4.2

Bayer’s Foreign Exchange Management Practice

The primary goal of foreign exchange management at Bayer is to reduce the risks that arise from exchange rate fluctuations. This is achieved through the elimination of the effects of exchange rates on booked and anticipated foreign exchange cash flows, such that the impact of currency rate changes on receivables and payables is fully offset (except for hedging costs). Financial risk management is focused solely on exposures which have an impact on cash flows. This means that purely accounting-based risks, for example gains and losses stemming from foreign exchange translation, are not hedged. Foreign currency exposures of booked cash flows are completely hedged, while currency exposures from anticipated cash flows are only partially hedged. The hedge ratios for the anticipated foreign-currency exposures are determined once a year by the corporate financial management. Bayer distinguishes between two perspectives on FX management: the group-level perspective and the legal entity perspective. From the group perspective, the corporate financial headquarters hedges the group net exposure with external bank counterparties, usually by executing FX forward contracts. This practice ensures that the FX risk of the Bayer group is hedged according to the corporate financial guidelines. Yet the legal entities do not automatically participate in the group-level hedging activities, in the sense that the hedging results are not automatically attributed to the respective corporate entities and their exposure. The legal entities are free to decide on their individual FX management strategy, with the option to hedge their FX risk or to speculate on improving their earnings by betting on positive exchange rate developments. If they decide to safeguard their own P&L from the effects of FX rate fluctuations, they may hedge their exposure with the corporate financial service centre (CFSC) by means of internal FX hedges. In this case, the CFSC acts as an in-house bank and quotes market-oriented prices for the requested hedging transaction.

5 Refer to (Glaum and Brunner 2003) for a detailed discussion of the individual process steps.
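The hedging rule stated above (booked exposures fully hedged, anticipated exposures hedged according to a ratio) can be written down as a small calculation. The sketch below is an illustration only: the function name, the use of one ratio per currency, and all figures are assumptions, since the text does not disclose the actual ratios.

```python
def hedge_requirement(booked, anticipated, hedge_ratios):
    """Amount to hedge per currency.

    booked and anticipated map currency codes to net exposures (long
    positive, short negative). Booked exposure is hedged in full;
    anticipated exposure is hedged at the ratio given for its currency.
    hedge_ratios maps currency codes to values between 0 and 1; a
    missing currency is treated as "not hedged" (an assumption).
    """
    for ccy, ratio in hedge_ratios.items():
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"hedge ratio for {ccy} must lie between 0 and 1")
    currencies = sorted(set(booked) | set(anticipated))
    return {
        ccy: booked.get(ccy, 0.0)
        + hedge_ratios.get(ccy, 0.0) * anticipated.get(ccy, 0.0)
        for ccy in currencies
    }


# Hypothetical group net exposures and yearly hedge ratios.
booked = {"USD": 3_000_000}
anticipated = {"USD": 8_000_000, "JPY": -200_000_000}
ratios = {"USD": 0.5, "JPY": 0.25}

print(hedge_requirement(booked, anticipated, ratios))
# USD: 3,000,000 + 0.5 * 8,000,000 = 7,000,000
# JPY: 0.25 * (-200,000,000) = -50,000,000
```

A positive result means that the currency would be sold forward, a negative result that it would be bought forward.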

6.4.3

Redesigning Foreign Exchange Risk Management with CoFiPot

In this section we present the redesigned foreign exchange risk management process at Bayer as it has been implemented with the means offered by the CoFiPot. Still, due to the possibilities offered by the portal, financial processes at Bayer are in a state of constant evolution. The discussion follows the steps of the foreign exchange risk management process presented in Section 6.4.1. We exclude the task of determining the corporate FX guidelines from the subsequent discussion, as the definition of corporate financial management policies is not within the scope of the presented project.

6.4.3.1 Identification and Measuring FX Exposure

To determine the Bayer group net exposure, the data on the booked foreign exchange exposure has to be reported by the legal entities on a regular basis. For the majority of entities this process is done automatically on the basis of a daily data import process. The remaining entities that lack the technical requirements to participate in the automated process have to report their exposure data manually, either via e-mail or by directly entering the data using an exposure upload service provided by the Corporate Financial Portal ((1) in Figure 6.3). In either case, the reported data is stored in a central exposure database that is managed and used primarily by the financial headquarters for the purpose of group-level exposure management.

[Figure 6.3 (diagram); labels: (1) Report Exposure; (2) Monitor Exposure; Entity A; Entity B; Local exposure; Corporate Center; Manual exposure report; Automated exposure report; Email, CoFiPot; Corporate Exposure Database; Request exposure reports (on-demand); FX Exposure Monitor]

Figure 6.3. Measuring FX exposure at Bayer with CoFiPot

[Figure 6.4 (screenshot); annotation: FX Exposure reports offer different levels of aggregation (from group level to the individual transaction)]

Figure 6.4. CoFiPot FX Exposure Monitor service

The role of the FX Exposure Monitor is to access the exposure database, extract the relevant data, and prepare the required exposure reports ((2) in Figure 6.3). Personalization services provided by the CoFiPot application framework enable the FX Exposure Monitor to provide exposure reports on different levels of detail and aggregation, depending on the user’s preferences and security level. Group-level FX managers have access to all exposure data, while local FX managers only get the exposure reports for the entities within their responsibility. By means of the FX Exposure Monitor, the local FX managers have access to and rely on the same exposure reporting capabilities as the group-level FX managers. The FX Exposure Monitor offers on-demand reports of the current FX exposure that automatically highlight critical exposure levels. The exposure report provides three levels of aggregation, from the group net exposure through the entity level to the transaction level. Moreover, the application offers basic data warehouse slice-and-dice capability with regard to the dimensions region, subgroup, entity and date (cf. Figure 6.4).

6.4.3.2 Hedging FX Exposure

On the basis of the FX exposure level reported by the CoFiPot, the corporate centre decides on the subsequent hedging actions on the group level. In the past, this process was done on a monthly basis. However, due to systems integration and process automation, a weekly evaluation of the current corporate foreign currency exposure could be implemented. The process of evaluating the foreign currency exposure, deciding on the hedging needs, and subsequently engaging in hedging transactions on the global FX market has been automated to a large extent. The CoFiPot not only supports affiliates in their data reporting tasks, but also assists them in their individual risk management needs.
If the local FX manager identifies the need to safeguard the legal entity’s profit and loss from FX rate fluctuations, he may do so by entering into internal FX forward contracts with the CFSC. He can either rely on the traditional voice trade channel (i.e. telephone trading) or choose the CoFiPot FX Trade Entry application (Figure 6.5) to enter into valid FX transactions around the clock (24/7). With the FX Trade Entry, the FX manager has to specify the trade type, the settlement type, and the currency pair with the buy or sell amount, respectively. The application relies on personalization mechanisms to reduce the interaction complexity by pre-selecting suitable options and omitting irrelevant ones. Using the “get quote” functionality, the FX manager obtains a quote for the requested transaction that is based on the current market rate as provided by an external market data provider. The quote is valid for five minutes. Figure 6.6 describes the FX exposure hedging process. After the FX manager executes the FX trade, the CoFiPot FX Trade Entry application directly saves the trade details as specified into the corporate treasury management system (TMS) and notifies the staff at the CFSC for final agreement and confirmation. It is important to note that, regardless of the actual time the trade is processed at the CFSC, the trade conditions on execution are guaranteed. Upon agreement and


Hong Tuan Kiet Vo et al.

[Figure 6.5. CoFiPot FX Trade Entry service. Screenshot annotation: quote from market data provider; mid rate between bid and ask; valid for 5 minutes.]

[Figure 6.6. Hedging FX risk process (intercompany). Diagram labels: CoFiPot FX Trade Entry; steps 1 to 4: request deal (24/7), get valid market quotes, save trade, confirmation and settlement; participants: legal entity (local exposure, hedging demand), external market data provider, corporate treasury management system, corporate center.]

confirmation, the FX manager automatically receives a confirmation letter as a contract for this hedge. The FX manager then confirms the trade either by replying to the e-mail with the confirmation letter attached or via the CoFiPot. Furthermore, documentation that complies with the IFRS hedge accounting requirements and guidelines [6] is automatically created and stored for the transaction. This documentation is necessary for the subsequent controlling requirements.

[6] The reader is referred to (Deloitte 2006) for a comprehensive discussion of the IFRS hedge accounting guidelines and requirements.
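The quote-and-execute step of the FX Trade Entry service described above (a mid-rate quote that stays valid for five minutes, followed by saving the trade) can be illustrated with a minimal Python sketch. This is not the CoFiPot implementation: the class `Quote`, the function `execute`, the field names and the example currency pair are all hypothetical and serve only to make the validity rule concrete.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Validity period stated in the text; everything else here is illustrative.
QUOTE_VALIDITY = timedelta(minutes=5)


@dataclass(frozen=True)
class Quote:
    currency_pair: str   # hypothetical example: "EUR/USD"
    bid: float
    ask: float
    issued_at: datetime

    @property
    def mid(self) -> float:
        """Mid rate between bid and ask."""
        return (self.bid + self.ask) / 2

    def is_valid(self, now: datetime) -> bool:
        """A quote can only be executed within its validity window."""
        return self.issued_at <= now <= self.issued_at + QUOTE_VALIDITY


def execute(quote: Quote, amount: float, now: datetime) -> dict:
    """Return the trade details to be saved (hypothetical record layout)."""
    if not quote.is_valid(now):
        raise ValueError("quote expired; request a new quote")
    return {
        "pair": quote.currency_pair,
        "amount": amount,
        "rate": quote.mid,       # conditions at execution are fixed here
        "executed_at": now,
    }
```

Fixing the rate in the saved record at execution time mirrors the statement that the trade conditions on execution are guaranteed regardless of when the trade is later processed.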


6.4.3.3 Controlling FX Hedging Activities

Every FX transaction that is closed with the CFSC is stored and managed by the corporate treasury management system (TMS). Consequently, these trades automatically participate in the TMS workflows, which include the valuation of the individual trades. The TMS provides FX reporting capabilities that originally were only available to the employees at the CFSC. In this regard, the primary objective of the CoFiPot FX Trades application is to provide convenient access to these FX trade reports as provided by the TMS, not only for the CFSC but for every FX manager within Bayer’s financial community. With the FX Trades application, the FX manager can now request an overview of all FX trades that he is responsible for, together with the respective market values either in euros or in local currency. Following the general CoFiPot reporting standards, the FX Trades application allows for a slice and dice of the data according to the dimensions entity, portfolio, date, type, and current trade status (Figure 6.7). It is important to note that the level of detail provided by the CoFiPot FX Trades reports satisfies the requirements of Bayer’s accounting principles (HGB, IFRS, US-GAAP).

[Figure 6.7. CoFiPot FX Trades service. Screenshot annotations: close new internal/external FX transactions; documentation according to IAS 39 is automatically created for new FX transactions.]

6.4.4 Lessons Learned

In the following, we evaluate the effects of the corporate financial portal on the foreign exchange risk management process of Bayer. The discussion follows the steps of the risk management process.


The integration of the transactional data from the local/regional ERP systems and the automated daily import of this data into the corporate exposure database have significantly reduced the need for coordination and communication with the respective subsidiaries. More importantly, the exposure database is updated on a daily basis, offering the CFSC a timelier and more accurate overview of the group-level exposure. Currently, virtually all of the relevant affiliates are integrated into the exposure upload process.

The process of managing foreign currency risk has also improved significantly, as the integration efforts laid the basis for the automation of this process. The information stored in the exposure database is assessed and used to automatically determine and execute hedging activities at the corporate level. As stated in Section 6.4.2, this is done on a weekly basis, which would not be manageable with manual processing. In fact, automation would allow for an even higher frequency of the risk management process. Yet it still has to be evaluated whether a higher frequency would result in any further significant improvement of the risk management process.

Traders of subsidiaries have the option to close foreign exchange transactions using the traditional means of communication (e.g. telephone, fax) or the corporate financial portal. Figure 6.8 illustrates the development of the hedging transactions per month for the periods under review and shows whether the telephone or the CoFiPot was used as the transaction channel. Furthermore, the line illustrates the development of the ratio of CoFiPot transactions versus telephone transactions. In 2005, on average 82% of all internal transactions were closed using CoFiPot. From February 2002 until June 2006, a total of 8,966 internal trades were closed using the financial portal (1,694 using the telephone trading channel). We determined through observation that the average time to execute an internal transaction is about 4 minutes using the telephone. Even under the assumption that closing a transaction using the portal takes approximately the same amount of time, one has to consider that with the telephone two persons are involved in the trading process.

[Figure 6.8. CoFiPot vs. telephone transactions (intercompany). Monthly data from January 2002 to May 2006: number of CoFiPot transactions, number of telephone transactions, and percentage of CoFiPot transactions.]

The analysis of the transaction data further shows that since the introduction of the CoFiPot FX management services in January 2002, the number of transactions has increased from an average of 60 trades per month in 2001 to an average of 235 trades per month in 2005. Figure 6.8 further outlines the different trading channels used for the transactions. The data reveals that, following the introduction phase, CoFiPot has replaced traditional telephone trading for most transactions. As of July 2002, the number of CoFiPot FX Trade Entry trades surpasses that of telephone trades.

There are at least two reasons for the strong increase in total hedging transactions (cf. Figure 6.9). First, the average number of hedges per entity has increased from 4 to 8 trades per month. Second, and more importantly, the increase in hedges stems from an increased number of entities that hedge their exposure. Figure 6.9 shows that this number has increased from 14 entities in 2001 to 35 entities in 2005. Both effects explain the strong increase in hedges. Fluctuations in this number are due to external business events. As an example, the Lanxess spin-off in January 2005 was followed by the departure of the respective entities, which consequently led to a significant drop in hedging activities in the subsequent periods.

To further support the subsidiaries with their FX management operations, the CFSC offers them the option to participate in a so-called Autohedging process.
This initiative is another example that illustrates the role of the corporate financial portal as an enabler of new financial practices. The idea is to automatically create internal hedges whenever the foreign exchange exposure of a participating subsidiary changes substantially. The only prerequisite is that the subsidiary feeds the corporate exposure database with its exposure data. The corporate centre will continuously observe the currency exposure of these subsidiaries and execute

[Figure 6.9. CoFiPot vs. telephone transactions per entity; number of active entities. Series: average number of telephone trades per entity per month, average number of CoFiPot trades per entity per month, and number of active hedging entities; y-axis: number of entities and transactions; annotated events: CoFiPot Trade Entry, Autohedging, Lanxess spin-off.]


hedging activities when needed. The financial portal can be used to monitor these hedging activities. Following the active promotion of this service in 2004, the Autohedging service currently accounts for one half of all FX exposure hedges. Setting aside effects from external business events, the CoFiPot currently accounts for an average of 80 percent of all hedging transactions.
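The chapter does not specify how a "substantial" change in exposure is detected in the Autohedging process. Purely as an illustration, the idea can be sketched as a threshold rule: a new internal hedge is proposed only when the reported exposure deviates from the already hedged amount by at least a given threshold. The function name, the parameters and the threshold logic below are assumptions, not Bayer's actual rule.

```python
def autohedge_amount(current_exposure: float,
                     hedged_exposure: float,
                     threshold: float) -> float:
    """Return the size of a new internal hedge, or 0.0 if no hedge is needed.

    Hypothetical rule: hedge the full change in exposure once its absolute
    value reaches the threshold. A positive result means additional exposure
    to hedge; a negative result means existing hedges should be reduced.
    """
    change = current_exposure - hedged_exposure
    return change if abs(change) >= threshold else 0.0
```

With a threshold of 250,000, an exposure that rises from 1,000,000 to 1,500,000 would trigger a hedge of 500,000, whereas a rise to 1,100,000 would not trigger any hedge.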

6.5 Conclusion

This paper identifies and discusses the challenges that the financial management of multinational corporations has to meet. Under the pressure to continuously improve financial business processes, financial management not only has to find ways to streamline existing processes but also has to look for new ways of doing business by designing and implementing new processes. This demand can only be met with the help of modern information systems that reduce the complexity of integration, communication, coordination and collaboration challenges, which can be especially demanding in a multinational business environment. We describe the concept of the corporate financial portal as a central hub for financial information, systems and services. We argue that with this information system some of the above-mentioned demands can be met. Moreover, we believe that a corporate financial portal project not only helps to solve existing operational needs but, more importantly, also enables new financial practices that improve the quality of financial management functions. Finally, we describe and evaluate the corporate financial portal implementation of Bayer AG. On the basis of Bayer’s foreign exchange risk management practices, we evaluated the role of the corporate financial portal within the context of redesigning the underlying processes of this key function. System integration and process automation enabled us to design and implement new financial practices, such as hedging currency exposure on a weekly rather than a monthly basis. Furthermore, we were able to better meet the foreign exchange risk management demand of the local affiliates by giving them the possibility to use the corporate financial portal to execute and monitor inter-company foreign exchange transactions. Following the process reengineering, we observed a strong increase in the hedging activities of the affiliates since the introduction of this Web-based trading service.

References

Balzert H (2000) Lehrbuch der Software-Technik – Bd. 1. Software-Entwicklung. Heidelberg; Berlin, Spektrum – Akademischer Verlag
Chou DC (1999) “Is the Internet the Global Information System.” Decision Line 30(2): 4−6
Collins H (2001) Corporate Portals. New York
Cooper R (2000) Corporate Treasury and Cash Management. New York, Palgrave Macmillan
Davydov M (2001) Corporate Portals and eBusiness Integration, McGraw-Hill Companies
Deloitte (2006) iGAAP 2006 Finanzinstrumente: IAS 32, IAS 39 und IFRS 7, CCH
Dias C (2001) Corporate Portals: a literature review of a new concept in Information Management. International Journal of Information Management 21: 269−287
Eiteman DK et al. (2006) Multinational Business Finance, Addison Wesley
Eun CS and Resnick BG (2004) International Financial Management, McGraw-Hill Professional
Glaum M (2000) Foreign Exchange Risk Management in German Non-Financial Corporations: An Empirical Analysis. Risk Management – Challenge and Opportunity. M. Frenkel, U. Hommel and B. Rudolf. Berlin, Springer: 373−393
Glaum M and Brunner M (2003) Finanz- und Währungsmanagement in Multinationalen Unternehmungen. Management Multinationaler Unternehmungen. D. Holtbrügge. Heidelberg, Physica-Verlag: 307−326
Hommel U and Dufey G (1996) Currency Exposure Management in Multinational Companies: ‘Centralized Coordination’ as an Alternative to Centralization. Strategische Führung internationaler Unternehmen: Paradoxien, Strategien und Erfahrungen. J. Engelhard. Wiesbaden, Gabler: 199−220
Linthicum DS (2003) Next Generation Application Integration. From Simple Information to Web Services, Addison-Wesley
Marshall AP (2000) Foreign exchange risk management in UK, USA and Asia Pacific multinational companies. Journal of Multinational Financial Management 10: 185−211
Österle H (2001) Geschäftsmodell des Informationszeitalters. Business Networking in der Praxis: Beispiele und Strategien zur Vernetzung mit Kunden und Lieferanten. H. Österle, E. Fleisch and R. Alt. Berlin, Springer: 17−37
Pramborg B (2005) Foreign exchange risk management by Swedish and Korean nonfinancial firms: A comparative survey. Pacific-Basin Finance Journal 13: 343−366
Richtsfeld J (1994) In-House-Banking: neue Erfolgsstrategien im Finanzmanagement internationaler Unternehmen. Wiesbaden, Gabler
Shapiro AC (2006) Multinational Financial Management, Wiley & Sons
Vo HTK et al. (2006) Corporate Portals from a Service Oriented Perspective. CEC/EEE 2006, San Francisco, IEEE
Vo HTK et al. (2005) Integration of Electronic Foreign Exchange Trading and Corporate Treasury Systems with Web Services. Wirtschaftsinformatik 2005, Springer
Weitzel T et al. (2003) Straight Through Processing auf XML-Basis im Wertpapiergeschäft. Wirtschaftsinformatik 45(4): 409−420

CHAPTER 7
Capital Markets in the Gulf: International Access, Electronic Trading and Regulation
Peter Gomber, Marco Lutat, Steffen Schubert

7.1 Introduction

In 1981, the Gulf Cooperation Council (GCC) was formed by six Arab countries: Kuwait, Bahrain, Qatar, Saudi Arabia, Oman and the United Arab Emirates (UAE). The GCC aims to promote cooperation between its member states, mainly in the fields of economy and industry. So far, the GCC has succeeded in a number of areas. Citizens of GCC countries can move freely among the six countries without the need for visas, and there is no customs duty within the GCC on products made within the GCC. More significantly, professionals of one GCC state have since been free to work in another. Members of the GCC group of nations are also allowed to own shares in the companies that operate within that group. The academic literature offers only little information on Arab stock markets, although these markets have developed rapidly in the recent past and are currently aiming to establish world-class stock exchange standards. In the following, we provide an insight into those markets by describing regional macroeconomic and exchange business developments. The chapter is structured as follows: Section 7.1 gives an overview of the member countries of the GCC by presenting recent macroeconomic developments, while Section 7.2 characterizes the exchange business in the region in terms of trading figures, market capitalization etc. International investors’ requirements are investigated in Section 7.3, and the exchanges of the region are analyzed in Section 7.4. The most recent exchange in the Gulf, the Dubai International Financial Exchange (DIFX), serves as a case study of the approaches to further open up the region’s markets in Section 7.5. Findings are summarized and conclusions presented in the last section. While the combined GCC population was at a level of 21.89 million in 1990 and 25.99 million in 1995, it reached more than 35 million in 2005 according to estimates, making the GCC one of the fastest growing regions in the world. The Kingdom of Saudi


Peter Gomber et al.

Table 7.1. Population in the GCC countries (Source: Central Intelligence Agency 2005)

Country    Population (million)
Bahrain     0.69
KSA        26.42
Kuwait      2.34
Oman        3.00
Qatar       0.86
UAE         2.56
Total      35.87

Table 7.2. GDP for GCC countries, purchasing power parity, compared to Germany and the UK [1] (Source: Central Intelligence Agency 2005)

Country           GDP for 2004 ($ billion)
Bahrain              13.01
KSA                 310.2
Kuwait               48.0
Oman                 38.09
Qatar                19.49
UAE                  63.67
Total GCC           492.46
Germany           2,362.00
United Kingdom    1,782.00

Arabia (KSA), by far the biggest GCC nation, makes up more than 73 percent of the whole population, while Bahrain, the smallest of those countries, contributes less than 2 percent. In terms of population, the whole GCC region is comparable to Canada, whose population is estimated at 32.8 million for July 2005 (Central Intelligence Agency 2005). The discrepancy in the size of the countries’ populations is also reflected in the national gross domestic products (GDP), which, for 2004, range between $13.01 billion for Bahrain and $310.2 billion for the Kingdom of Saudi Arabia. Compared to those of Germany and the UK, even the GDP for the whole Gulf region is low. Relative to population, the highest GDP per capita can be found in the United Arab Emirates and the lowest in the Kingdom of Saudi Arabia, despite its oil wealth. In 2003, GDP growth rates increased significantly in the GCC countries compared to the average annual growth rate from 1998 to 2002 (except for Oman). Since then, growth rates have fallen slightly in most countries but are still remarkable.

[1] In the following tables, we will provide comparisons to Germany, the UK and/or the US and the World respectively without listing the comparison countries explicitly in the headers of the tables.
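As a quick check of the population shares quoted above, the figures of Table 7.1 can be recomputed with a few lines of Python. The variable names are illustrative; the data are copied from the table.

```python
# Population in million, copied from Table 7.1.
population_mn = {
    "Bahrain": 0.69, "KSA": 26.42, "Kuwait": 2.34,
    "Oman": 3.00, "Qatar": 0.86, "UAE": 2.56,
}

total_mn = sum(population_mn.values())

# Share of each country in the combined GCC population, in percent.
share_pct = {country: 100 * pop / total_mn
             for country, pop in population_mn.items()}

for country, share in share_pct.items():
    print(f"{country:8s} {share:5.1f} %")
```

The total matches the 35.87 million reported in the table, and the shares for KSA (more than 73 percent) and Bahrain (less than 2 percent) agree with the text.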

7 Capital Markets in the Gulf


Table 7.3. GDP per capita of GCC countries, purchasing power parity (Source: Central Intelligence Agency 2005)

Country           GDP per capita for 2004 ($)
Bahrain           19,200
KSA               12,000
Kuwait            21,300
Oman              13,100
Qatar             23,200
UAE               25,200
Germany           28,700
United Kingdom    29,600

Table 7.4. Real GDP growth rates, annual change in percentage (Source: IMF, Middle East and Central Asia Regional Economic Outlook, September 2005, p. 41 and IMF, World Economic Outlook, September 2005, p. 205−206)

Country    1998−2002 average    2003    2004    2005 (projected)    2006 (projected)
Bahrain    4.8                   7.2    5.4     7.1                 5.2
KSA        1.5                   7.7    5.2     6.0                 4.7
Kuwait     0.8                   9.7    7.2     3.2                 3.2
Oman       3.6                   1.9    4.5     3.8                 6.2
Qatar      7.4                   8.6    9.3     5.5                 7.1
UAE        4.0                  11.6    7.8     7.5                 5.5
Germany    1.7                  −0.2    1.6     0.8                 1.2
UK         2.9                   2.5    3.2     1.9                 2.2
USA        2.9                   2.7    4.2     3.5                 3.3
World      3.3                   4.0    5.1     4.3                 4.3

Building on this macroeconomic comparison with developed economies, the following section gives an overview of the development of the stock markets in the region.

7.2 The Evolution of GCC Stock Markets and Its Implications

Stock exchanges in the GCC region do not have a long history compared to markets in the US or Europe. Local stock exchanges have been in place both formally and informally since the 1970s, but organized trading activity started only after the 1980s. Since the inauguration of the Dubai Financial Market (DFM) in March 2000, all six member states of the GCC have officially regulated stock exchanges (Dubai Financial Market 2006). Following the commencement of the Dubai Financial Market, the Abu Dhabi Securities Market (ADSM) opened in November 2000, making the UAE the only GCC member country that is home to more than one stock exchange. Table 7.5 provides the commencement dates of regulated stock exchanges for the GCC countries.

Table 7.5. Stock exchange commencement in the GCC countries (Source: FINCORP, GCC Markets – A special review, November 10, 2005, p. 2)

Country    Name of the stock market               Year of commencement
Kuwait     Kuwait Stock Exchange (KSE)            1977
Oman       Muscat Securities Market (MSM)         1988
Bahrain    Bahrain Stock Exchange (BSE)           1988
KSA        Tadawul Saudi Stock Market (TSSM)      1989–2001*
Qatar      Doha Securities Market (DSM)           1997
UAE        Abu Dhabi Securities Market (ADSM)     2000
UAE        Dubai Financial Market (DFM)           2000

* Automated clearing system was introduced in 1989 and the electronic trading system Tadawul started in 2001

Going hand in hand with the growing economies in the region (see Section 7.1), market indicators show that GCC stock markets have developed positively in recent years, i.e. they have seen growth in market capitalization, trading activity and stock performance. In terms of absolute market capitalization, the region is dominated by the exchange of Saudi Arabia, which is by far the biggest market, followed by the aggregated market capitalization of both UAE markets; the smallest markets are those of Oman and Bahrain. Table 7.6 compares the GCC market capitalization to Germany and the United Kingdom. [2]

Table 7.6. Market capitalization for GCC countries, Germany and the UK (Source: World Bank, World Development Indicators 2005, Table 5.4 – Stock markets; GCC exchanges websites, Federation of Euro-Asian Stock Exchanges (FEAS) 2005)

Country           Market capitalization at the end of 2004 (US-$ millions)
Bahrain              13,500
KSA                 306,248
Kuwait               75,200
Oman                  9,317
Qatar                40,452
UAE                  84,088
Total GCC           528,805
Germany           1,079,026
United Kingdom    2,412,434

[2] Due to data availability the following statistics are listed until the end of 2004.


Table 7.7. Market capitalization as a percentage of national GDP

Country           Market cap as a percentage of GDP (2004)
Bahrain           103.8
KSA                98.7
Kuwait            156.7
Oman               24.5
Qatar             207.6
UAE               132.1
Germany            45.7
United Kingdom    135.4
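The percentages in Table 7.7 can be reproduced from the GDP figures of Table 7.2 and the market capitalization figures of Table 7.6. The short Python sketch below does this; the variable and function names are illustrative, and the only assumption is the unit conversion from $ billion to $ million.

```python
# GDP for 2004 in $ billion (Table 7.2).
gdp_bn = {
    "Bahrain": 13.01, "KSA": 310.2, "Kuwait": 48.0, "Oman": 38.09,
    "Qatar": 19.49, "UAE": 63.67, "Germany": 2362.00,
    "United Kingdom": 1782.00,
}

# Market capitalization at the end of 2004 in US-$ million (Table 7.6).
market_cap_mn = {
    "Bahrain": 13_500, "KSA": 306_248, "Kuwait": 75_200, "Oman": 9_317,
    "Qatar": 40_452, "UAE": 84_088, "Germany": 1_079_026,
    "United Kingdom": 2_412_434,
}


def market_cap_to_gdp_pct(country: str) -> float:
    """Market capitalization as a percentage of GDP, one decimal place."""
    gdp_mn = gdp_bn[country] * 1000  # convert $ billion to $ million
    return round(100 * market_cap_mn[country] / gdp_mn, 1)


ratios = {country: market_cap_to_gdp_pct(country) for country in gdp_bn}
```

Note that the GDP figures are stated at purchasing power parity, so the ratio depends on that choice of GDP measure.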

Table 7.8. Annual market capitalization growth rates in percentage (Source: Exchanges’ websites and Federation of Euro-Asian Stock Exchanges, 2005)

Name of the stock exchange      2002      2003      2004     2002−2004 average
Bahrain Stock Exchange          14.54     29.13     40.7      27.67
Tadawul Saudi Stock Market       2.0     110.14     94.7      60.99
Kuwait Stock Exchange           27.0      72.0      22.2      38.71
Muscat Securities Market        15.6      40.65     29.9      28.30
Doha Securities Market          44.0     152.7      51.4      76.62
Abu Dhabi Securities Market    252.8      49.1      82.7     112.61
Dubai Financial Market          20.33     50.26    145.47     64.33
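The chapter does not state how the 2002−2004 averages in Table 7.8 are computed. They appear to be compound (geometric) averages of the annual growth rates rather than arithmetic means; this is an inference from the numbers, not a statement by the authors. The Python sketch below, with illustrative names, computes the compound average and agrees with the reported values up to rounding.

```python
import math


def compound_average(growth_rates_pct):
    """Compound (geometric) average of annual growth rates, in percent."""
    factor = math.prod(1 + g / 100 for g in growth_rates_pct)
    return (factor ** (1 / len(growth_rates_pct)) - 1) * 100


# Annual growth rates for 2002, 2003, 2004 and the reported average (Table 7.8).
table_7_8 = {
    "Bahrain Stock Exchange":      ([14.54, 29.13, 40.7], 27.67),
    "Tadawul Saudi Stock Market":  ([2.0, 110.14, 94.7], 60.99),
    "Kuwait Stock Exchange":       ([27.0, 72.0, 22.2], 38.71),
    "Muscat Securities Market":    ([15.6, 40.65, 29.9], 28.30),
    "Doha Securities Market":      ([44.0, 152.7, 51.4], 76.62),
    "Abu Dhabi Securities Market": ([252.8, 49.1, 82.7], 112.61),
    "Dubai Financial Market":      ([20.33, 50.26, 145.47], 64.33),
}

for name, (rates, reported) in table_7_8.items():
    print(f"{name:28s} computed {compound_average(rates):7.2f}  reported {reported:7.2f}")
```

For comparison, the arithmetic mean for the Bahrain Stock Exchange would be about 28.12 percent, which does not match the reported 27.67 percent.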

Market capitalization in the region as a whole is less than half the German and less than one fourth of the UK market capitalization. Concerning the ratio of market capitalization to GDP, many countries in the GCC region show quite high ratios, led by Qatar with a market capitalization of 207.6 percent of its GDP. Beginning in 2002, GCC members have seen triple-digit growth rates in their market capitalization, which can be explained by two factors. First, a successful wave of privatization has begun in these countries, e.g. in the telecommunication sector. This development has met with interest from local investors, which has led to initial public offerings regularly being over-subscribed many times; the IPO of Ettihad Etisalat, for example, was over-subscribed 51 times in Saudi Arabia in November 2004 (Jafarkhan 2004). Second, index performance has soared by double digits in most markets in recent years. Taking a closer look at the market capitalization growth rates, it is noticeable that the most positive developments have been experienced by countries with high oil production, which are benefiting from soaring oil prices. Another explanation for the favourable environment for the stock business is the repatriation of money. Money that was invested in markets in the US and the UK has been reinvested in the home markets. This started after the attacks of September 2001, as Arab investors feared that


Table 7.9. Index performance in percentage (Source: Exchanges’ websites and FEAS 2005)

Name of the stock exchange      2002     2003     2004     2002−2004 average
Bahrain Stock Exchange           3.41    28.81    30.17    20.14
Tadawul Saudi Stock Market       4.0     76.23    84.93    50.21
Kuwait Stock Exchange           24.1     63.9     11.9     31.54
Muscat Securities Market        26.2     42.1     23.8     30.45
Doha Securities Market          37.3     69.8     64.5     56.53
Abu Dhabi Securities Market      7.8     28.6     74.8     34.32
Dubai Financial Market          21.1     45.4     49.0     37.92

Table 7.10. Annual growth rates of trade volume in percentage (Source: GCC exchanges’ websites and FEAS 2005)

Name of the stock exchange      2002      2003       2004      2002−2004 average
Bahrain Stock Exchange            7.21     26.4       70.71     32.25
Tadawul Saudi Stock Market       60.0     346.0      197.0     176.74
Kuwait Stock Exchange            87.6     142.7       −6.0      62.36
Muscat Securities Market         41.02    156.64      24.12     65.00
Doha Securities Market          113.8     264.6       97.0     148.56
Abu Dhabi Securities Market     150.0     176.0      343.0     212.67
Dubai Financial Market          157.5      49.3     1337.8     280.93

the US government, among others, might take it from them or freeze their accounts. As a further reason, it is argued that the GCC-US dollar peg and low interest rates in the region provide little incentive to save and instead encourage borrowing money to speculate on better returns (Rutledge 2005). Besides the initial public offerings, companies that are already listed issue new shares, as there is a positive climate for secondary offerings. Cumulatively, those factors are reflected in an increase in trading activity. Table 7.10 and Table 7.11 show the development of trade volume growth rates and the increase in the number of trades, respectively, for the years 2002 to 2004. Despite these statistics on the positive evolution of the GCC stock market business, one can argue that, compared to traditional markets like the US or Europe, markets in the region are still developing with regard to exchange activity. This can be shown by taking a closer look at the respective turnover ratios, i.e. the quotient of traded volume and market capitalization, and by comparing them to other countries. Except for Saudi Arabia, all GCC markets show a low turnover ratio compared to the markets in Germany, the UK and the US. Qatar in particular can be seen as an outlier, as its trade value accounts for only 0.16 percent of its overall market capitalization.


Table 7.11. Annual increase in the number of trades in percentage (Source: GCC exchanges’ websites and FEAS 2005)

Name of the stock exchange      2002      2003      2004       2002−2004 average
Bahrain Stock Exchange          −2.87     12.73      7.62        5.62
Tadawul Saudi Stock Market      71.0     264.0     254.0       180.35
Kuwait Stock Exchange           46.8     107.5      −2.3        43.84
Muscat Securities Market        58.67     92.92     33.44       59.85
Doha Securities Market          88.9     352.2     111.3       162.31
Abu Dhabi Securities Market     97.0     126.0     n/a**       111.00 (2002/2003)
Dubai Financial Market          84.2       0.1     754.3       150.68

** Annual report 2004 contains only statistics for 2003

Table 7.12. Turnover ratios defined as trade volume relative to market capitalization (Source: World Bank, World Development Indicators 2005; GCC exchanges’ websites)

Country           Turnover ratio (2004, percent)
Bahrain             3.4
KSA               204.1
Kuwait             68.9
Oman               31.5
Qatar               0.16
UAE                 3.4
Germany           130.0
United Kingdom    100.6
USA               122.6

Two major implications can be derived from the recent evolution of GCC stock markets. First, the rapid increase in listed companies, market capitalization and trading activity requires the regional exchanges to provide professional and reliable infrastructures and functionalities to their investors. As local investors develop and improve their skills in stock trading, they will call for regulation, market integrity, trading models, information services and operational reliability according to international standards where these do not yet exist. Second, due to their strong stock performance and the resulting attractiveness, GCC markets will experience broader interest from international investors seeking high-profit investments. Stock exchanges in the region as well as local investors should have an interest in attracting foreign investment, as this will lead to additional liquidity. If obstacles to foreign investment in the countries of the region are too high, other GCC and non-GCC investors will stay away, and local markets will forgo this additional liquidity.

7.3 Requirements of International Investors

Attracting international investors is important for developing economies, because foreign investment is one way to fuel economic growth. The following analysis of international investors’ requirements discusses (i) transaction costs and market liquidity, (ii) the instrument scope, (iii) information dissemination, (iv) market regulation and trading surveillance, (v) trading systems and their mechanisms and (vi) investment restrictions.

(i) Transaction Costs and Market Liquidity

A well-organized financial market is characterized by low costs for all aspects related to a security transaction (Reilly 1989, pp. 75−76). When making a transaction, any investor needs to take total transaction costs into account, as they have a negative influence on the investment’s performance. The cost components of a transaction can be classified as either explicit or implicit. Cost components that are known and calculable before a trade occurs are referred to as explicit, while costs induced by the trade itself and by the market conditions at the time of the transaction are referred to as implicit. Implicit trading costs are closely linked to market liquidity and are not easily observable (Davydoff et al. 2002, p. 4). The explicit costs are determined directly by the exchange itself and by the intermediary serving the investor, while the implicit costs result from various factors, e.g. the form of the market system or the existence of liquidity providers (Degryse and Van Achter 2002). Possible explicit cost components of a transaction are broker commissions, market maker commissions, exchange fees, and clearing and settlement costs. [3] In most cases the investor will not face explicit exchange fees directly, unless he has direct non-intermediated access to a market, because he is charged a commission by the intermediary. Nevertheless, there is an indirect relationship between the exchange fees and the commission an investor is charged. Market-related cost components are often displayed in bundled form, e.g. exchange, clearing and settlement fees as an aggregated cost component if those services are offered from one source. International investors in particular will be able to compare costs between international markets and potential new markets.

Market liquidity is widely seen as the most decisive criterion for market quality, as it gives investors the opportunity to buy and sell equities at justifiable costs. But liquidity can be influenced only indirectly by the design of an exchange, as it results from various other factors which have to be fulfilled first, e.g. market transparency, market integrity, and a broad scope of investment instruments (Gomber 2000, p. 81). A process may then set in where further liquidity is attracted by the liquidity already existing in a market. Exchanges can improve their market quality by engaging market makers to provide (artificial) liquidity on a continuous basis. For international investors, liquidity in developing markets might be even more important than in established markets, as the perceived investment risk is higher in these markets and the fear of not being able to exit positions if necessary is more pronounced. It is important to note that liquidity is rarely a critical issue for the average-size order and during phases with normal trading activity, but more so for block orders and under extreme market conditions (Radcliffe 1973 and Reilly and Wright 1984).

[3] For a definition of transaction costs and their components see Pagano and Röell 1990, p. 75.

(ii) Instrument Scope

Investors reduce their risk exposure by a well balanced and diversified portfolio (Davis and Steil 2001, p. 52). This requires the market to offer a broad variety of possible investments over different company sectors and different instrument types, i. e. equities, futures, options, bonds etc. Access to derivatives is important to both investors and market makers as they allow them to reduce trading risk by hedging their positions.

(iii) Information Dissemination

Information on a securities market can be divided into market endogenous and market exogenous information (Gomber 2000, p. 19). Data which originates directly from trading is called market endogenous (Füss 2002, p. 9): this includes real-time data on price development, traded volumes, order book depth etc. This category also includes up-to-date ex-post market statistics. Real-time endogenous information is especially important for decisions on the order execution style and also essential for the evolution and implementation of new technologies: e. g. algorithmic trading is dependent on market endogenous information as it analyzes current and past market conditions in order to optimize the execution style for an order (Economist 2005, p. 14). Information originating from the companies listed on an exchange is denoted as exogenous. It includes financial reports on a regular basis and ad-hoc news for the respective company. As analyst research is based on exogenous information, its dissemination is indispensable for investment decisions. The availability of both endogenous and exogenous data and access for all market participants to this information contributes to market transparency, and a high grade of market transparency is necessary for a market’s information efficiency (Fama 1970, p. 383). Both types of information should be easily accessible for any investor independent of his/her location in the world. Market fairness requires that all information is distributed in a way that all investors have the chance to learn it simultaneously, thus enabling a level playing field. Consequently, missing, incorrect, incomplete or time-lagged data potentially lead to wrong investment decisions. If investors thereby feel treated in an unfair manner, they will stay away from those markets (Schiereck 1995, p. 30). Information by exchanges as well as listed companies should be made available in English; in a region of the world where Arabic is the main language and script, this is worth mentioning.

Peter Gomber et al.

(iv) Market Regulation and Trading Surveillance

Insider trading is a problem for every securities market, generally unwanted and “no longer regarded as socially acceptable behavior” (Sunder 1992, p. 12). Repeated public cases of insider trading reduce the international investor’s trust in the markets. In order to avoid such cases, trading surveillance and firm market regulation are indispensable. In market regulation the following principles regarding a local regulator should be followed (International Organization of Securities Commissions 2005): As a principle, the regulator should be operationally independent from governments and the regulated exchanges. Furthermore, it should be accountable for the exercise of its functions and powers. This requires the regulator to act at any time following the highest professional standards including those for confidentiality. The regulator’s responsibilities should be clearly stated so that every market participant is aware of the situations where the regulator may be called upon. It is necessary that the regulator is equipped with adequate powers and capacities to perform its duties, which include inspection, investigation and surveillance of the market and its participants. As international investors are usually located far away from their investment markets and the companies they invest in, they are dependent on a reliable and functional market regulation, which should ensure that those investors are not discriminated against in comparison to locally based investors.

(v) Trading System and Its Mechanisms

The trading system and its mechanisms can be seen as the core part of an exchange. In the 1990s several studies on the preferences of institutional investors (including questions on their preferred trading system) showed that investors favor computer-based over floor-based trading (Economides and Schwartz 1994 and Schiereck 1995, p. 122). Electronic markets combine various advantages over traditional floor-based trading, such as anonymity, pre- and post-trade transparency and reduced risk of pre-trade information leakage via intermediating brokers. Independent of the design of the trading system, electronic or floor-based, market participants need fast, reliable and unrestricted access to their investment markets. In order to offer fair trading services, all investors, local as well as international, and the brokers acting as their agents need a level playing field concerning market access. Trading hours of an exchange are of importance to international investors, especially if their location is on a different continent or their investment securities are cross-listed. In this case they need to coordinate different time zones (Miller and Puthenpurackal 2005, p. 12). Trading days which differ due to cultural reasons present a further obstacle and public holidays may also be different from the international investor’s location.
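The time-zone and trading-day coordination problem can be made concrete with a small calculation. The following Python sketch is illustrative only: the exchange session (Sunday to Thursday, 10:00 to 12:00 at UTC+3) and the investor's desk hours (Monday to Friday, 09:00 to 17:00 at UTC+1) are assumed values, daylight saving time is ignored, and sessions that would wrap around the end of the week are not handled.

```python
MINUTES_PER_DAY = 24 * 60


def sessions_in_utc(days, start, end, utc_offset_hours):
    """Return weekly sessions as (start, end) minutes since Monday 00:00 UTC.

    days: weekday indices, 0 = Monday ... 6 = Sunday
    start, end: local (hour, minute) tuples of one session per day
    """
    sessions = []
    for day in days:
        base = day * MINUTES_PER_DAY - utc_offset_hours * 60
        sessions.append((base + start[0] * 60 + start[1],
                         base + end[0] * 60 + end[1]))
    return sessions


def weekly_overlap_minutes(sessions_a, sessions_b):
    """Sum the pairwise overlap of two session lists, in minutes per week."""
    total = 0
    for a_start, a_end in sessions_a:
        for b_start, b_end in sessions_b:
            total += max(0, min(a_end, b_end) - max(a_start, b_start))
    return total


# Assumed exchange: Sunday-Thursday, 10:00-12:00 local time, UTC+3.
exchange = sessions_in_utc([6, 0, 1, 2, 3], (10, 0), (12, 0), 3)
# Assumed investor desk: Monday-Friday, 09:00-17:00 local time, UTC+1.
investor = sessions_in_utc([0, 1, 2, 3, 4], (9, 0), (17, 0), 1)

overlap = weekly_overlap_minutes(exchange, investor)
weekly_trading = weekly_overlap_minutes(exchange, exchange)
print(f"{overlap} of {weekly_trading} weekly trading minutes overlap")
```

Under these assumptions the investor's desk is open during only a fraction of the exchange's weekly trading time: the Sunday session is lost entirely, and on the remaining days only the last hour of trading overlaps.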

(vi) Investment Restrictions

When investors trade in an international environment they have to deal with different legal and regulatory frameworks, possibly even including investment restrictions. When talking about foreign investments one has to differentiate between foreign direct investment (FDI) and portfolio investment. While the International Monetary Fund states on FDI that it “… is made to acquire a lasting interest in an enterprise operating in an economy, other than that of the investor, the investor’s purpose being to have an effective voice in the management of the enterprise”, portfolio investment is described as “gaining ownership without control” and is of short-term interest (Italy and Razin 2005, p. 2). Possible restrictions for foreign investments may be limitations on foreign ownership, screening or notification procedures, management and operational restrictions (Golub 2003, p. 7). Limitations on foreign ownership again can have two forms: the typical form of “limiting the share of companies’ equity capital in a target sector that non-residents are allowed to hold” or a prohibition of any foreign ownership (Golub 2003, p. 7). An international investor will only invest in a market as long as cost and risk, caused by access restrictions, is justifiable for him/her.

7.4 Systematic Analysis of the Exchanges in the Region

This section examines the seven exchanges of the GCC countries regarding the key aspects concerning infrastructure, market functionality and regulation. An overview of the tradable instruments types, the form of the market mechanisms, information dissemination and the services offered along the securities trading value chain will be provided. Individual pricing models are investigated with a focus on the incentives for liquidity provision and the distribution of commissions resulting from a transaction. Finally, an insight into the regulatory framework and possible restrictions on foreign ownership will be given. In order to introduce the structure of the analysis, the Bahrain Stock Exchange will be described in greater detail first. This provides the structure for the remaining six exchanges4. They will be presented in tabular form and follow the analysis scheme of the Bahrain Stock Exchange.

7.4.1 Bahrain Stock Exchange (BSE)

On the Bahrain Stock Exchange investors are able to trade four types of tradable instruments, namely equities, government and corporate bonds as well as mutual funds on Bahraini equities. Derivatives are not offered.

4 The analysis is based on information available until January 2006.

The trading mechanisms/system can be described as follows: The BSE provides a hybrid market mechanism, i. e. it is order and quote driven. Market makers provide (artificial) liquidity. The Bahrain Automated Trading System (BATS) was established in 1999 and is based upon Computershare’s Horizon system. Supported order types are market and limit orders including a restriction on order validity period (BSE 1999, article 17−18). Investors can also submit iceberg orders (BSE 1999, article 20). Price determination takes place according to the following criteria (in sequence): price priority, source of order priority, time priority, cross priority and random factor priority (BSE 1999, article 23). The source of order priority discriminates between Bahraini and non-Bahraini orders, whereby orders from Bahraini investors are prioritized. A trading safeguard is effective for equities only and prevents price fluctuations of more than 10 percent from the last closing price (BSE 1999, article 15). Bonds and mutual funds are free in their price fluctuations. The market is accessed via brokers, who have to use the automated trading system to input their orders. The automated trading system is available via terminals on the trading floor and, since 2001, via remote access (BSE 2001, article 1). It permits two hours of continuous trading per day, namely from 10 a. m. to 12 p. m., Sundays through Thursdays, and a pre-opening phase takes place every morning from 9.30 a. m. to 10 a. m. followed by an opening auction (BSE 1999, article 25−26). Real-time market data is disseminated via BATS, which features an open order book. Historical market statistics are provided by BSE in English, ranging from daily to annual publication frequency. Listed companies are, among other obligations, required to publish quarterly, semi-annual and annual financial reports (Bahrain Monetary Agency 2003, article 34).
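The priority sequence and the price safeguard described above can be sketched in a few lines of Python. This is a minimal illustration, not the BATS implementation: the `BuyOrder` fields and function names are hypothetical, only the first three criteria (price, source of order, time) are modelled, and cross priority and random factor priority are omitted because their exact definitions are not given here.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BuyOrder:
    """Hypothetical order record used only for this illustration."""
    order_id: str
    limit_price: Decimal
    is_bahraini: bool
    entry_seq: int  # lower number = entered earlier


def priority_key(order):
    # Buy side: higher price first, then Bahraini origin, then earlier entry.
    return (-order.limit_price, not order.is_bahraini, order.entry_seq)


def rank_buy_orders(orders):
    """Return the orders sorted by price, source of order and time priority."""
    return sorted(orders, key=priority_key)


def within_price_band(price, last_close, limit=Decimal("0.10")):
    """Check the equity safeguard: at most 10 percent from the last close."""
    return abs(price - last_close) <= last_close * limit


orders = [
    BuyOrder("A", Decimal("1.00"), is_bahraini=False, entry_seq=1),
    BuyOrder("B", Decimal("1.00"), is_bahraini=True, entry_seq=2),
    BuyOrder("C", Decimal("1.01"), is_bahraini=False, entry_seq=3),
    BuyOrder("D", Decimal("1.00"), is_bahraini=True, entry_seq=4),
]
ranked_ids = [o.order_id for o in rank_buy_orders(orders)]
print(ranked_ids)
```

In this sample, order C ranks first on price alone, while the non-Bahraini order A falls behind B and D at the same price even though it was entered earliest, which is the effect of the source of order priority.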
As services along the value chain, BSE provides trading facilities and a clearing and settlement unit including a central counterparty (CCP). The net position of trades in a security is settled via the broker’s clearance account two days after the trade’s occurrence (T+2). Afterwards, brokers are responsible for settlement with their clients. BSE’s pricing model differentiates between equities and bonds. Brokerage commissions are negotiable (between the investor and the broker) but the exchange fee is a fixed rate in a prescribed brokerage commission model with commissions decreasing depending on the amount of a transaction. The pricing schedule for equity trading is shown in Table 7.13:

Table 7.13. Pricing schedule for equity trading at the BSE (Source: BSE website)

Value of Transaction (BD*) | Broker (negotiable) | BSE | Total (depending on negotiations)
1 – 20,000 | 24 bps | 6 bps | 30 bps
20,001 – 50,000 | 16 bps | 4 bps | 20 bps
Above 50,000 | 8 bps | 2 bps | 10 bps

* BD 1 = US-$ 2.65259
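As a worked illustration of Table 7.13, the following Python sketch computes the broker and exchange components for a given transaction value. It rests on two assumptions that the table does not settle: the rate of the tier containing the transaction value is applied to the whole transaction (rather than marginally per tier), and the broker charges the full listed rate although it is negotiable. The function name `bse_commission` is hypothetical.

```python
from decimal import Decimal

# (upper bound of tier in BD or None, broker bps, exchange bps), from Table 7.13
TIERS = [
    (Decimal("20000"), Decimal("24"), Decimal("6")),
    (Decimal("50000"), Decimal("16"), Decimal("4")),
    (None, Decimal("8"), Decimal("2")),
]
BPS = Decimal("10000")


def bse_commission(value_bd):
    """Return broker, exchange and total commission in BD for one trade.

    Assumption: the tier rate applies to the entire transaction value.
    """
    value = Decimal(str(value_bd))
    if value <= 0:
        raise ValueError("transaction value must be positive")
    for upper, broker_bps, exchange_bps in TIERS:
        if upper is None or value <= upper:
            broker = value * broker_bps / BPS
            exchange = value * exchange_bps / BPS
            return {"broker": broker, "exchange": exchange,
                    "total": broker + exchange}


for amount in (10000, 30000, 100000):
    fees = bse_commission(amount)
    share = fees["broker"] / fees["total"]
    print(amount, fees["total"], f"broker share {share:.0%}")
```

In every tier the broker component is four times the exchange component, so the broker's share of the total is the same at all transaction sizes.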


The BSE pricing schedule does not offer any pricing incentives for liquidity provision, neither for market makers nor for investors submitting limit orders. As Table 7.13 shows, the overall commission is split 80 percent to the broker and 20 percent to the exchange. The regulatory framework for Bahrain’s capital market was set by the Bahrain Monetary Agency (BMA) in 2002 (Bahrain Monetary Agency 2006). At the same time, the BMA is responsible for Bahrain’s monetary policy and the regulation of all financial institutions in the kingdom. Since 1999, restrictions on foreign ownership have been reduced, allowing GCC nationals to own up to 100 percent (formerly 49 percent) of shares of a listed Bahraini company while non-GCC nationals may hold up to 49 percent (formerly 24 percent) of shares (Bahrain Stock Exchange 2006). Table 7.14 summarizes the BSE analysis and provides the framework for the other markets.

Table 7.14. Characteristics of the Bahrain Stock Exchange (BSE)

Category | Results
Tradable instruments | Equities, government and corporate bonds, mutual funds
Trading mechanisms/system | Continuous, order-driven with liquidity providers; BATS electronic trading system based on Computershare’s Horizon since 1999; opening hours: Sunday – Thursday, pre-opening 9.30 a.m. – 10 a.m., continuous trading 10 a.m. – 12 p.m.
Supported order types | Limit and market orders (extended by special conditions), hidden volume, order validity restrictions
Safeguards | 10 percent of fluctuation from the last closing price (for equities only)
Information dissemination | Market statistics ranging from daily to annual; company financial statements quarterly, semi-annual and annual
Services along the value chain | Trading, clearing with CCP, settlement (T+2)
Pricing model | Broker’s commission negotiable, while exchange fee is not; linear fee model with discounts for higher trading values
Regulatory framework | Regulated by the Bahrain Monetary Agency (BMA) since 2002; BMA regulates all financial institutions and serves as central bank
Restriction on foreign ownership | Discrimination between GCC and non-GCC nationals; GCC nationals allowed to hold up to 100 percent, non-GCC nationals 49 percent of a company’s shares

7.4.2 Tadawul Saudi Stock Market (TSSM)

Table 7.15. Characteristics of the Tadawul Saudi Stock Market

Category | Results
Tradable instruments | Equities
Trading mechanisms/system | Continuous, order-driven without liquidity provision; Tadawul electronic trading system since 2001 (based on Computershare’s Horizon); trading hours: Saturday – Wednesday, 10 a.m. – 12 p.m., 4 p.m. – 6.30 p.m.; Thursday: 10 a.m. – 12 a.m.
Supported order types | Limit and market orders (extended by special conditions), hidden volume, order validity restrictions
Safeguards | None
Information dissemination | Market statistics ranging from daily to annual; company financial statements quarterly
Services along the value chain | Trading, clearing, settlement
Pricing model | Linear model with floor of 15 Saudi Riyal* for total commission, otherwise a maximum of 15 bps of trade value (negotiable, see appendix)
Regulatory framework | Regulated by the autonomous Capital Market Authority (CMA) based on the Capital Market Law (CML) (Capital Market Authority 2006)
Restriction on foreign ownership | Foreign participation in the stock market only through mutual funds (Saudi Arabia General Investment Authority 2006)

* 1 Saudi Riyal = USD 0.26666

Additional information on TSSM: Bonds are currently traded via telephone, but in October 2005 the Saudi government announced that it would “… establish a secondary market for government bonds” (AlJasser and Banafe 2003, p. 182 and Banker 2005, p. 74). Mutual funds on Saudi equities are offered by investment companies.
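The TSSM pricing rule in Table 7.15 can be read as a proportional commission with a minimum charge. The Python sketch below follows that reading, which is an assumption: the total commission is the negotiated rate (at most 15 bps) applied to the trade value, but never less than 15 Saudi Riyal. The function name `tssm_commission` and its parameters are hypothetical, and the detailed schedule in the appendix is not reproduced here.

```python
from decimal import Decimal

FLOOR_SAR = Decimal("15")      # minimum total commission per trade
MAX_RATE_BPS = Decimal("15")   # upper limit of the negotiable rate


def tssm_commission(trade_value_sar, negotiated_bps=MAX_RATE_BPS):
    """Return the total commission in Saudi Riyal for one trade.

    Assumed reading of Table 7.15: proportional fee with a fixed floor.
    """
    value = Decimal(str(trade_value_sar))
    rate = Decimal(str(negotiated_bps))
    if value <= 0:
        raise ValueError("trade value must be positive")
    if not (Decimal("0") <= rate <= MAX_RATE_BPS):
        raise ValueError("rate must be between 0 and 15 bps")
    proportional = value * rate / Decimal("10000")
    return max(FLOOR_SAR, proportional)


small_trade = tssm_commission(5000)          # floor applies
large_trade = tssm_commission(100000)        # 15 bps applies
negotiated = tssm_commission(100000, 10)     # negotiated 10 bps
print(small_trade, large_trade, negotiated)
```

Under this reading the floor binds for trades below 10,000 Saudi Riyal at the maximum rate, so small orders carry a proportionally higher explicit cost.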

7.4.3 Kuwait Stock Exchange (KSE)

Table 7.16. Characteristics of the Kuwait Stock Exchange

Category | Results
Tradable instruments | Equities, options, forwards, futures
Trading mechanisms/system | Continuous, order-driven without liquidity provision; Kuwait Automated Trading System (KATS); access only via brokers; trading hours: auction at 8.50 a.m., 9 a.m. – 12.30 p.m. continuous equity trading
Supported order types | Limit and market orders
Safeguards | None
Information dissemination | Market statistics available via KSE website in Arabic language only, company information: n/a
Services along the value chain | Trading; clearing and settlement is performed by the Kuwait Clearing Company (KCC)
Pricing model | Linear model with discount for higher transaction values (see appendix for detailed information)
Regulatory framework | Regulated by the government-owned Kuwait Investment Authority (KIA)
Restriction on foreign ownership | Foreign participation in Kuwaiti companies is limited to 49 percent, expatriates within Kuwait are allowed to hold 100 percent

7.4.4 Muscat Securities Market (MSM)

Table 7.17. Characteristics of the Muscat Securities Market

Category | Results
Tradable instruments | Equities, commercial and government bonds
Trading mechanisms/system | Since December 2005 electronic, continuous and order-driven market, no liquidity provision; Atos Euronext trading system; trading hours: Sunday – Thursday, regular market 10 a.m. – 11 a.m., parallel & bond markets 11:30 a.m. – 12:30 p.m., third market 12:35 p.m. – 12:55 p.m.
Supported order types | Limit/market orders and various extensions in restrictions and execution conditions
Safeguards | No information available
Information dissemination | Real-time market information dissemination through the automated trading system; quarterly financial company statements
Services along the value chain | Trading; clearing and settlement is performed by the Muscat Depository & Securities Registration Company (Muscat Securities Market 2006a), settlement at T+3
Pricing model | MSM’s commission is a fixed percentage of the order value, the broker’s commission is subject to negotiation for equities trading (see appendix for details)
Regulatory framework | In 1998 regulation was transferred from MSM to a governmental Capital Market Authority (CMA)
Restriction on foreign ownership | No restrictions (Muscat Securities Market 2006b)

7.4.5 Doha Securities Market (DSM)

Table 7.18. Characteristics of the Doha Securities Market

Category | Results
Tradable instruments | Equities
Trading mechanisms/system | Electronic order-driven market based on Computershare’s Horizon system, access only personally via local brokers; trading hours: no information available
Supported order types | Limit/market orders
Safeguards | No information available
Information dissemination | Market statistics on a monthly basis; quarterly financial statements by companies; weekly publication of main financial indicators, e. g. dividend yield, p/e ratios
Services along the value chain | Trading; clearing house organized by the settlement bank, T+3 cycle
Pricing model | Exchange fee is a fixed rate of 30 percent of the broker’s commission
Regulatory framework | No information available
Restriction on foreign ownership | Since April 2005 foreigners (including GCC nationals) are allowed to hold up to 25 percent of a listed company’s shares (Doha Securities Market 2006)

7.4.6 Dubai Financial Market (DFM)

Table 7.19. Characteristics of the Dubai Financial Market

Category | Results
Tradable instruments | Equities, bonds; mutual funds on equities and fixed income existent
Trading mechanisms/system | Order-driven, automated screen-based trading system (Computershare’s Horizon), access via local brokers; trading hours: pre-opening at 9.30 a.m.; continuous trading from 10 a.m. – 1 p.m.
Supported order types | Limit/market orders
Safeguards | No information available
Information dissemination | Real-time and historic market endogenous information through the trading system screens on the trading floor; market statistics publications ranging from daily to annual; companies’ financial statements have to be published on a quarterly, semi-annual and annual basis (Emirates Securities and Commodities Authority 2000, article 36)
Services along the value chain | Trading, clearing and settlement
Pricing model | Commissions resulting from a transaction are distributed among broker, DFM, and the market regulator (see appendix for details)
Regulatory framework | Regulated by the financially and administratively autonomous Emirates Securities and Commodities Authority (ESCA) which is led by government officials; exchange rules for market conduct
Restriction on foreign ownership | None

7.4.7 Abu Dhabi Securities Market (ADSM)

Table 7.20. Characteristics of the Abu Dhabi Securities Market

Category | Results
Tradable instruments | Equities (bonds are announced)
Trading mechanisms/system | Automated central limit order matching trading system (Computershare’s Horizon), access via local brokers; trading hours: pre-opening at 10 a.m.; continuous trading from 10.30 a.m. – 12.30 p.m.
Supported order types | Limit/market orders
Safeguards | No information available
Information dissemination | Real-time and historic market information through the trading system screens on the trading floor; market statistics publications ranging from daily to annual; companies’ financial statements have to be published on a quarterly, semi-annual and annual basis (Emirates Securities and Commodities Authority 2000, article 36)
Services along the value chain | Trading, clearing and settlement
Pricing model | Commissions resulting from a transaction are distributed among broker, ADSM, and the market regulator; exchange fee: percentage of trade value with a floor for total commission (for detailed information see appendix)
Regulatory framework | Regulated by the financially and administratively autonomous Emirates Securities and Commodities Authority (ESCA) which is led by government officials
Restriction on foreign ownership | No restrictions

As presented above, all exchanges except the Saudi market, which allows foreign participation via mutual funds only, have opened up their markets for foreigners. Some markets discriminate between GCC and non-GCC nationals. While the markets in Oman and the UAE impose no restrictions on non-GCC investments, the markets in Bahrain, Kuwait and Qatar allow non-GCC nationals to hold only a minority of a listed company’s shares.
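The ownership limits collected in Tables 7.14 to 7.20 can be gathered in a small lookup structure, for example to screen whether a planned non-GCC investment is feasible at all. The Python sketch below is illustrative: it assumes that each cap applies to the aggregate holding of non-GCC investors in one listed company (the tables do not state whether the limit is aggregate or per investor), it encodes "no restrictions" as 100 percent, and it encodes the Saudi market as 0 because direct participation is not possible. The names are hypothetical.

```python
# Maximum share (percent) of a listed company that non-GCC nationals may
# hold directly, as reported in Tables 7.14 to 7.20 (status: January 2006).
NON_GCC_CAP_PERCENT = {
    "BSE": 49,    # Bahrain
    "KSE": 49,    # Kuwait
    "DSM": 25,    # Qatar; the cap also applies to GCC nationals
    "MSM": 100,   # Oman: no restrictions
    "DFM": 100,   # Dubai: none
    "ADSM": 100,  # Abu Dhabi: no restrictions
    "TSSM": 0,    # Saudi Arabia: participation only through mutual funds
}


def max_additional_stake(exchange, current_foreign_percent):
    """Return the remaining headroom in percentage points under the cap.

    Assumption: the cap is an aggregate limit on non-GCC holdings.
    """
    cap = NON_GCC_CAP_PERCENT[exchange]
    return max(0, cap - current_foreign_percent)


headroom = {
    "BSE": max_additional_stake("BSE", 40),
    "DSM": max_additional_stake("DSM", 30),
    "MSM": max_additional_stake("MSM", 0),
    "TSSM": max_additional_stake("TSSM", 0),
}
print(headroom)
```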

7.5 Approaches to Further Open up the Stock Markets to the International Community − The Dubai International Financial Exchange (DIFX) Case

The previous sections identified the underlying infrastructure of the markets in the region. These markets developed over a relatively short period of time. If we look at the drivers we find that the underlying economies of the Gulf markets developed strongly on the back of oil and gas. This development meant that a significant local cash reserve was created and that any surplus liquidity was invested in business opportunities abroad. Over time the local economy created more business opportunities and the net export of cash slowed down. A significant number of these businesses had been financed by debt before the need for a stock market arose. The markets are located in comparably small countries which traditionally are closed to outsiders for investment purposes. However, the next step in the development of a stock market lies in widening the investor base. The underlying legal framework hinders rapid expansion into non-GCC areas. Whilst most markets are fully electronic in the best sense of the word, market participants cannot be located in other jurisdictions than the home market. All markets have introduced electronic trading systems, but market access to these systems in the GCC markets is different from access to electronic markets in the US and Europe, where brokers have remote access to the exchanges’ trading systems, i. e. access without restrictions to the broker’s location. In some cases trading firms actually have to be present “on the floor” even if actual trading is fully electronic, trades are entered right there into the central system and are matched by a Central Limit Order Book (CLOB). Today, there are two ways to accommodate growth in stock markets:

(i) Evolutionary Regional Market Development

The development of markets in the GCC shows many similarities with market development in Europe during the past 30 years. Harmonization of the regulatory framework of local markets is a cornerstone of the capital market evolution. In this, there are many similarities to the EU, from the quest to connect markets to the establishment of a single currency. The overall geographical market of the region, however, is less harmonized and will need more development to come even close to the environment currently existing in Europe. In recent history, markets of the region and beyond started discussing ways to connect their markets technically in the hope that this will lead to increased liquidity. Before technical connectivity can pave the way, it is more important that the legal and regulatory systems of these markets are harmonized. Striving towards a common currency will also lead to a more integrated capital market structure. These developments will build an essential foundation which will ultimately allow local markets to fuse in an evolutionary way.

(ii) Leapfrog Development

The revolutionary development of markets is preferred in the region as time seems to be the least available commodity. The DIFX is a prime example of such a development as it benefits from the concept of an international financial center. The DIFX was set up within the Dubai International Financial Centre (DIFC) and specifically targets an international investor, issuer and intermediary community. The following section therefore describes it further as a case study of an approach to further open up the markets in the region. Dubai (and thus the UAE) did not want the evolutionary approach over years and decades to change their stock exchanges and therefore designed the concept of financial free zones. The generic free zone concept has been established over the years for trade and allows foreigners or international organizations to build a business as 100% owners and with the right of full profit repatriation and quite often with a beneficial tax regime. To take this concept one step further to the creation of a financial centre, a lot of changes and additions needed to be made to the underlying idea.

The Financial Centre

The Financial Centre creates a framework which allows leapfrogging the evolutionary capital market development. A capital market will only be as efficient or as international as its framework allows it to be. Three major components make up this environment:

1. The legal framework and the market oversight (a regulator)
2. The technical and organizational infrastructure (a “municipality”)
3. The rule of law and the court system.

Hence, the International Financial Centre in Dubai started with the creation of the legal framework. It took on experts of markets ranging from New Zealand via Europe to Canada and created a legacy-free system of interlinking laws. This effort effectively created a country inside a country in which the newly created regulator works to IOSCO5 Standards and takes over the roles traditionally carried out by the central bank and/or a country regulator. In case of Dubai, a close working relationship with the organizations of the federal government ensures enforcement in the home country but outside the Centre. Commercial and civil laws were recreated by the Centre using international best practice whilst only the criminal laws are left from the Federation. More importantly, the UAE government allowed the Financial Centre to choose the language of its laws. The use of the English language for all applicable new laws creates an environment which leaves the market participants with a readily recognizable legal framework where content does not get lost in translation.6

The creation of a legal framework was followed by the second step, the creation of a physical center which – in the long run – aims at rivaling the main financial centers around the globe in its provision of technical infrastructure and international connectivity. The Financial Centre’s master plan states that it wants to create an environment which will provide office premises as well as residential and retail areas for 60,000 people.7 Finally, the Financial Centre created a Court System with international judges which completes the basic structure of this new environment.8

5 International Organization of Securities Commissions.
6 See the DFSA website for the whole range of laws: http://www.dfsa.ae
7 For the DIFC setup and framework refer to the DIFC website: http://www.difc.ae/about_us/index.html
8 For full details see the DIFC Courts website: http://www.difccourts.ae/

Dubai International Financial Exchange

A key component of this new structure is the Dubai International Financial Exchange (DIFX). It was set up on the framework of the DIFC and built in less than two years. All relevant systems, services, rules and regulation aim to provide an internationally accessible capital market with a straight through processing environment.9

9 For background information on the establishment of the DIFX see its homepage: http://www.difx.ae

The DIFX, which went live in September 2005, aims at achieving the following goals to be able to attract international investments:

• The DIFX positions itself strategically as a complementary market to the national markets in the region. National markets perform an important task in their home countries which the DIFX is not meant to copy. The DIFX was created to complement these markets with an additional value proposition which ranges from product types to remote members. Thus it aims at keeping companies from the region which would long for international capital through European or US markets closer to their home market. The DIFX’s structure is built in such a way that it can be used in the form of a ‘hub and spoke’ model for the enhancement of local markets in their ongoing development.

• At the heart of the DIFX rests the concept of the ‘authorized’ and the ‘recognized’ member. An authorized member is located inside the Centre and is regulated by the exchange regulator. The differentiation from the national markets is the embedded concept of the recognized or remote member. It allows a trading member to be located in another jurisdiction as long as his legal environment is accepted by the exchange and its regulator. The market aims at providing a platform in which internationally active trading members and institutional investors can trade with lesser known regional banks and brokers who move into the highly regulated environment around the DIFX.


• The challenge of remote membership of the DIFX is two-fold. Firstly, it lies in the design of the legal framework of the exchange and its subsidiaries and its acceptance in the home jurisdictions of the trading members. Secondly, the technological aspect is driven by the large region the DIFX will cover and the focus on connectivity of a remote trading member to the exchange. This has to be done in a way which does not create a disadvantage for some members due to latency.

• The DIFX created an infrastructure which is known as a “vertical silo”. In its development it will provide a trading system for equity and derivatives, a clearing house with a central counterparty (CCP), a Central Securities Depository (CSD) and a registry all under one roof. To complement this structure for international connectivity, the settlement organization will connect to an International Central Securities Depository (ICSD) system in Europe. The way the whole market is structured allows the DIFX to offer and perform certain tasks in clearing and settlement for the local markets as they develop outside their home markets. It further facilitates the creation of dual listings, which is one way for the local markets to expand their reach across their borders and to enhance local market liquidity.

• The DIFX further initiated the payment clearing of its trading currency – the US Dollar – to ensure currency availability during the trading day for margin calls and settlement instructions.

• For issuers, the DIFX aims to provide a streamlined listing system. Oversight of the market rests with the regulator whilst the listing authority has been delegated to the exchange. The exchange has created a self-regulatory function for market oversight and listing. Listing is not hindered in any way by the access or ownership restrictions for institutional investors that exist in one form or another in most local markets in the region.

• The market model of the DIFX provides for a Central Limit Order Book and will integrate market makers to provide liquidity for the less liquid products. For the trading system for equity and derivative products the ATOS-Euronext system NCS was chosen. For the clearing, settlement and registry function TCS of India delivered the software. These core systems were enhanced by a collection of supporting systems from the delivery of corporate actions to the dissemination of real-time data. The network of the exchange builds on GLTrade and SWIFT connectivity. The system conglomerate works to the latest IOSCO standards and aims at providing a modern straight through processing environment.

• International investors and intermediaries demand a readily available system for data information from the market (market endogenous information) as well as from the issuers on the market, e. g. on corporate actions (market exogenous information). The DIFX maintains a comprehensive data dissemination concept which allows over 700,000 professional users worldwide to access the market data.


Peter Gomber et al.

• The rules of the exchange demand high listing standards from the issuers and education with exams from the traders on the market. To achieve this, the DIFX has built the DIFX Academy, which has teamed up with institutions such as the University of Frankfurt and FT Knowledge/New York Institute of Finance.10 Courses cover general market know-how, specialist listing requirements on corporate governance, and trader education and examination.

Following the tabular description of stock markets from Section 7.4, the DIFX can be characterized as follows:

Table 7.21. Characteristics of the Dubai International Financial Exchange

Category | Results
Tradable instruments | Equities, GDRs, bonds, sukuks, certificates, ETFs; options and futures planned for 2006
Trading mechanisms/system | Order-driven, automated screen-based trading system with market maker interaction (Atos Euronext trading system); market access via local and international brokers; trading hours: Monday-Friday 2 p.m. – 5 p.m.
Supported order types | Limit/market orders with various extensions in restrictions and execution conditions
Safeguards | Volatility interruption
Information dissemination | Real-time dissemination via the major data vendors
Services along the value chain | Trading, clearing with central counterparty (CCP), settlement and registry
Pricing model | Based on international standards with different pricing levels for liquidity providers and liquidity takers
Regulatory framework | Regulated by the financially and administratively autonomous Dubai Financial Services Authority (DFSA)
Restriction on foreign ownership | None

The many facets of the markets started to adjust to international best practices and standards and the impact is now visible in all national markets. This development can propel the region’s capital markets forward in the coming years and move them closer to the established markets which will gradually close the gap between the Asian and the European time zones.

10 For the development of its trader exam the DIFX cooperated with the Chair of e-Finance at Frankfurt University; for details see: http://efinance.wiwi.uni-frankfurt.de

7.6 Conclusion

The previous sections gave an overview of the six GCC countries and the region as a whole. They showed that the economies and stock markets in the region have developed very dynamically in recent years. Stock trading activity has risen, and market performance was consistently positive. Compared to markets in the US or Europe, exchanges in the Gulf countries have a shorter history, and consequently trading has so far been limited to equities in most cases. International investors require markets to offer a high degree of liquidity, free market access, transparency and reliability, and to pose no obstacles to foreign ownership. All GCC exchanges have migrated from manual trading to electronic trading systems, but market access is mostly available via local brokers only, which may limit international investment. In some countries barriers to foreign ownership still exist, e.g. in Saudi Arabia, which allows no direct foreign ownership, while mainly the smaller countries in the region have tried to open up and lift restrictions. The Dubai International Financial Exchange is an attempt to open up the securities market further and to attract international investment by applying standards that can also be found in the US or Europe. Overall, exchanges in the Gulf have already undergone modernization and begun to attract foreign capital to their markets, and they are rapidly catching up with the functionality and organizational structure of their counterparts in international stock markets.

7.7 Appendix

The pricing model of the Tadawul Saudi Stock Market (TSSM):

Floor | SR 15
Maximum total commission (negotiable) | 15 bps

SR 1 = US-$ 0.26666

The pricing model of the Kuwait Stock Exchange (KSE):

Transaction Value (KD) | Total Commission
1 – 50,000 | 12.5 bps
> 50,000 | 10 bps of the excessive amount

KD 1 = US-$ 3.42571


The pricing model of the Muscat Securities Market (MSM):

Broker

MSM Total

Equities (Value < RO 100.000) 25−60 bps (min. RO 0.5) 15 bps (min. RO 0.2) 40−75 bps

Equities (Value > RO 100,000)

Bonds (Value < RO 250.000)

Bonds (Value > RO 250,000)

50 bps of the excessive amount 10 bps of exc. amount

16 bps

10 bps of the excessive amount 2 bps of exc. amount

4 bps 20 bps

RO 1 = US-$ 2.5974

The pricing model of the Doha Securities Market (DSM): Brokerage commission is negotiable and the exchange fee amounts to 30 percent of this commission.

The pricing model of the Dubai Financial Market (DFM):

Fee | Shares | Bonds | Funds & Warrants
Variable: Broker | 30 bps | 3 bps | 15 bps
Variable: DFM | 10 bps | 1 bps | 10 bps
Variable: Clearance | 5 bps | 0.5 bps | –
Variable: ESCA | 5 bps | 0.5 bps | –
Variable: Total | 50 bps | 5 bps | 25 bps
Fixed Charge | AED 10 (AED 75) | AED 10 (AED 75) | AED 10 (AED 75)

AED 1 = US-$ 0.27227

The pricing model of the Abu Dhabi Securities Market (ADSM):

Fee | Trade Value ≤ AED 15,000 | Trade Value > AED 15,000
Broker | AED 45 | 30 bps
ADSM | AED 22.5 | 15 bps
Authority | AED 7.5 | 5 bps
Total | AED 75 | 50 bps

AED 1 = US-$ 0.27227


Acknowledgement

The authors gratefully acknowledge the support of the E-Finance Lab. For further information on the E-Finance Lab, please see the chapter “Added Value and Challenges of Industry-Academic Research Partnerships – The Example of the E-Finance Lab” in this volume.

References

Al-Jasser MS, Banafe A (2002) “The development of debt markets in emerging economies: the Saudi Arabian experience”, BIS Papers No. 11, June 2002
Bahrain Monetary Agency (2003) “Disclosure Standards”, December 2003. Retrieved January 5, 2006, from http://www.bahrainstock.com/downloads/law/BMA_Disclosure_Standrads_Book.pdf
Bahrain Monetary Agency (2006) “Regulations and Supervision”. Retrieved January 5, 2006, from http://www.bma.gov.bh/cmsrule/index.jsp?action=article&ID=898
Bahrain Stock Exchange (1999) “Resolution No. (4) of 1999”. Retrieved January 5, 2006, from http://www.bahrainstock.com/downloads/law/Trading%20Rules%20and%20Procedures.pdf
Bahrain Stock Exchange (2001) “Resolution No. (6) of 2001”. Retrieved January 5, 2006, from http://www.bahrainstock.com/downloads/law/Trading%20Rules%20and%20Procedures.pdf
Bahrain Stock Exchange (2006) “Foreign Ownership”. Retrieved January 6, 2006, from http://www.bahrainstock.com/bahrainstock/bse/foreignownership.html
Banker (2005) “Saudi’s new bond market beckons”, The Banker, November 7, 2005, p. 74
Capital Market Authority (2006) Retrieved January 6, 2006, from http://www.cma.org.sa/cma_en/default.aspx
Central Intelligence Agency (2005) “The World Factbook 2005”. Retrieved November 22, 2005, from http://www.cia.gov/cia/publications/factbook/
Davis EP, Steil B (2001) “Institutional Investors”, The MIT Press, Cambridge, Massachusetts; London, England
Davydoff D, Gajewski JF, Gresse C, Grillet-Aubert L (2002) “Trading costs analysis: A comparison of Euronext Paris and the London Stock Exchange”, OEE research paper
Degryse H, Van Achter M (2002) “Alternative Trading Systems and Liquidity”, SUERF-colloquium book “Technology and Finance: challenges for financial markets, business strategies and policymakers”, M. Balling, F. Lierman, A. Mullineux, editors; Chapter 10; Routledge, London, New York
Doha Securities Market (2006) Retrieved January 9, 2006, from http://english.dsm.com.qa/site/topics/static.asp?cu_no=2&lng=1&template_id=33&temp_type=42&parent_id=16


Dubai Financial Market (2006) Retrieved January 5, 2006, from http://www.dfm.co.ae
Economides N, Schwartz R (1995) “Equity Trading Practices and Market Structure: Assessing Asset Manager’s Demand for Immediacy”, Financial Markets, Institutions & Instruments 4, 1995, pp. 1−46
Economist (2005) “The march of the robo-traders”, September 17, 2005, pp. 12−13
Emirates Securities and Commodities Authority (2000) “Resolution No. 3/2000”. Retrieved January 9, 2006, from http://www.dfm.co.ae/dfm/ESCARulesRegulations/ESCA_Rules/Transparency.DOC
Fama EF (1970) “Efficient Capital Markets. A Review of Theory and Empirical Work”, Journal of Finance, Vol. 25, May, pp. 383−417
Federation of Euro-Asian Stock Exchanges (2005) Retrieved November 25, 2005, from http://www.feas.org/
Fincorp (2005) “GCC Markets – A special review”, November 10, 2005
Füss R (2002) “The Financial Characteristics between ‘Emerging’ and ‘Developed’ Equity Markets”, EcoMod 2002-International Conference on Policy Modeling, Proceedings Brussels, July 4−6, Brussels 2002
Golub S (2003) “Measures Of Restrictions on Inward Foreign Direct Investment for OECD Countries”, OECD Economics Department Working Papers, No. 357, OECD Publishing
Gomber P (2000) “Elektronische Handelssysteme – Innovative Konzepte und Technologien”, Physica-Verlag, Heidelberg
IMF (2005a) “Middle East and Central Asia Regional Economic Outlook”, September 2005. Retrieved November 22, 2005, from http://www.imf.org/external/pubs/ft/reo/2005/eng/meca0905.pdf
IMF (2005b) “World Economic Outlook”, September 2005. Retrieved November 22, 2005, from http://www.imf.org/external/pubs/ft/weo/2005/02/
International Organization Of Securities Commissions “Objectives and Principles of Securities regulation”. Retrieved December 27, 2005, from http://riskinstitute.ch
Italy G, Razin A (2005) “Foreign Direct Investment vs. Foreign Portfolio Investment”, NBER Working Papers 11047, National Bureau of Economic Research, Inc, p. 2
Jafarkhan KK (2004) “Ettihad Etisalat share trading to begin tomorrow”, Khaleej Times Newspaper Online Edition, December 19, 2004
Miller D, Puthenpurackal J (2005) “Security fungibility and the cost of capital”, Working Paper Series, 426, European Central Bank, January 2005
Muscat Securities Market (2006a) Retrieved January 7, 2006, from http://www.msm.gov.om/pages/default.aspx?c=100
Muscat Securities Market (2006b) Retrieved January 7, 2006, from http://www.msm.gov.om/pages/default.aspx?c=171&tid=24
Pagano M, Röell A (1990) “Trading Systems in European Stock Exchanges: Current Performance and Policy Options”, Economic Policy, 10, pp. 65−115


Radcliffe RC (1973) “Liquidity Costs and Block Trading”, Financial Analysts Journal, 29, July−August, pp. 73−80
Reilly FK (1989) “Investment Analysis and Portfolio Management”, 3rd ed., Dryden Press, Chicago
Reilly FK, Wright DJ (1984) “Block Trading and Aggregate Stock Price Volatility”, Financial Analysts Journal, 40, no. 2, March−April, pp. 54−60
Rutledge E (2005) “Is oil alone fuelling GCC stock markets?”, Khaleej Times Newspaper Online Edition, February 23, 2005
Saudi Arabia General Investment Authority (2006) “Why Saudi Arabia”. Retrieved January 6, 2006, from http://www.sagia.gov.sa/innerpage.asp?ContentID=3
Schiereck D (1995) “Internationale Börsenplatzentscheidungen institutioneller Investoren”, Gabler Verlag, Wiesbaden
Sunder S (1992) “Insider Information and its Role in Securities Markets”, Carnegie Mellon University, Working Paper No. 1992-18, Pittsburgh
World Bank (2005) “World Development Indicators 2005”. Retrieved November 23, 2005, from http://www.worldbank.org/data/wdi2005/

CHAPTER 8
Competition of Retail Trading Venues − Online-brokerage and Security Markets in Germany
Dennis Kundisch, Carsten Holtmann

8.1 Introduction

German regional stock exchanges compete with one another in horizontal inter-exchange competition for the order flow of investors.1 In particular, they try to be an attractive option for private retail investors and to differentiate themselves from the competition. Differentiation often comes in terms of ‘price quality’ and the attempt to reduce implicit transaction costs for investors. Therefore, many exchanges have begun to extend their established trading rules. Under different labels and with slightly differing service attributes, best-execution guarantees, for example, have been introduced in most markets.2 These offerings have in common that the investor gets a guarantee that the price for a transaction will be at least as good as it would be at a predefined, often more liquid, reference market. For retail investors, it is not only the implicit transaction costs but also the explicit ones3 that are relevant for the order-routing decision.4 The explicit costs are the sum of commissions and fees charged by the service providers that participate in routing and matching an order and in clearing a deal.

1 See e.g. (Picot et al. 1996; Röhrl 1996; Rudolph and Röhrl 1997) for the different forms in which exchanges appear. Concerning the competition between exchanges see e.g. (Hammer 2004). Germany is one of only eleven countries in the world that have more than five national exchanges, see (Clayton et al. 2006).
2 E.g. Duesseldorf: “Quality Trading”, Hamburg/Hanover: “Execution Guarantee”, Munich: “maxone”, Stuttgart: “Best-Price-Principle”. On the potential effects of best execution offerings (so-called cream skimming) see e.g. (Easley et al. 1996; Battalio 1997).
3 For a distinction between implicit and explicit transaction costs see e.g. (Lüdecke 1996; Gomber 2000).
4 See e.g. (Barber et al. 2005) with respect to load fees of mutual funds. See also (Picot et al. 1996), p. 117, who state that a trade in a global financial market is always executed at the place with the lowest transaction costs.


Kirschner analyzed these fees for all German regional exchanges as well as for XETRA, the Swiss exchange and the Wiener Börse (Vienna exchange). He finds substantial differences in the fees when comparing these trading venues.5 Interestingly, these differences in fees have not led access intermediaries, such as online-brokers, to offer access to selected (regional) exchanges only.6 It is left to the investor, the retail trader, to choose between the numerous trading options, taking the heterogeneous terms and services into account. Due to the different price models of online-brokers, the differing price models of exchanges and other participating service providers become hardly comparable or are not visible at all. Moreover, most retail investors may not even know which service providers participate in routing, matching and clearing a transaction.

Hence, the objective of this contribution is to identify the relevant services and their fees for routing and executing an order at an exchange and to integrate these data with the price models of established online-brokers in the German financial services market. The following are the guiding research questions:

• What are the relevant components of the service “exchange transaction” and which fees are charged for these components?7
• Is there competition among exchanges based on the fees charged? If so, do the price models of the online-brokers distort this competition, or can a retail investor recognize this competition and thus take it into account in the order-routing decision?

To answer these questions, the primary service components of a stock market transaction are identified and the respective fees are evaluated. Subsequently, it is analyzed whether and how these fees are passed on to retail investors through the price models of online-brokers. We focus our analysis on German regional exchanges (Berlin8, Duesseldorf, Hamburg, Hanover, Frankfurt (FWB), Munich, and Stuttgart) as well as the electronic trading system XETRA.

The structure of this contribution9 is as follows: First, the service components of securities trading and their respective fees are introduced. Thereafter, the forms in which online-brokerage appears are discussed and a short market overview is provided. Then, four price models of relevant online-brokers are briefly portrayed. Finally, the price models and the fees for service components are integrated and conclusions are drawn. The contribution closes with a summary and an outlook on current developments in the market, with particular emphasis on OTC offerings.

5 See (Kirschner 2003).
6 One exception is the online-broker Fimatex, which currently does not offer access to the regional exchanges at Hamburg and Hanover (as at March 2006).
7 The internal processes of a banking institution such as account services will not be covered in this analysis.
8 The initiative “Nasdaq Germany” was closed in autumn 2003 after having operated for only around half a year; currently, there is no trading market at the exchange in Bremen. Hence, this exchange will not be analyzed further in the following. At the end of 2005 the Bremen exchange was bought by SWX Swiss Exchange. In 2007, trading of derivative securities shall start on a fully electronic exchange system.
9 This contribution is an extended and revised version of the contribution: Kundisch D, Henneberger M, Holtmann C (2005) Börsenwettbewerb über explizite Transaktionskosten auch beim Privatanleger?, in: Österreichisches BankArchiv, Vol. 53, No. 2, 2005, pp. 117–129.

8.2 Service Components and Fees in Securities Trading

8.2.1 Service Components in Securities Trading

Market participants receive the complex service offering “exchange transaction” when trading on exchange markets. This offering is composed of different components that are provided by a number of players. Private investors typically do not have direct access to the market, but have to use financial intermediaries to route their orders to the markets and to hold their deposits and accounts. Both services constitute the core service offerings of online-brokers, which can hence be described as the provision of market access. Typically, private investors only have a direct business relationship with the broker; they are indirect customers of any other service provider involved in exchange transactions. Those indirect service offerings are charged by the broker on behalf of the providers.10 Depending on the type and transparency of the broker’s business and fee model, the prices of the single service components are either hidden or observable, and in the latter case relevant to the investors’ decisions. The following service components, which can be grouped by the different phases of a market transaction, extend the core offering of the broker:11

• Information services: Information services are used before as well as after a transaction. Before the trade, the investment decision may be based on information about the current market settings. After the trade, information may be needed to monitor and, if necessary, to revise the decision. Information about the market settings is typically acquired from specialized companies like Reuters or Bloomberg. Brokers usually provide it for free or for a monthly fee; very detailed information (e.g. research or real-time quotes) is sometimes charged additionally.
• Price determination services: In Germany, exchange transactions can be concluded at any of the regional markets or within the automated trading system XETRA, which is run by Deutsche Börse AG. The fees charged differ between the markets and depend on the service providers involved, e.g. the exchange broker. Exchanges charge their fees depending on the transaction volume, starting with a minimum amount. Exchange brokers receive special commissions, also referred to as ‘courtage’, for finding partners with corresponding trading interests. Moreover, exchanges charge annual fees for market access.
• Clearing and settlement services: Clearing and settlement is provided by specialized companies in combination with the federal state central banks, which hold the deposits and accounts of the brokers.

In the remainder, the fee models of price determination and clearing and settlement providers are discussed before those of the brokers are analyzed in more detail.

10 Consequently, the broker sells a kind of system or composite good; see (Matutes and Regibeau 1988).
11 See (Picot et al. 1996) for a phase model of market transactions; see (Holtmann 2004) for an analysis of exchanges as service providers.

8.2.2 Access Fees for Participants at German Exchanges

Online-brokers need an admission for every single market they plan to route orders to. This admission typically involves a one-time accreditation fee as well as recurring annual participation fees (see the following table).

Table 8.1. Accreditation and annual fees12

Exchange | Annual participation fee, min. (in €) | Annual participation fee, max. (in €) | Comments | Multiplier | Accreditation fee (one time)
Berlin | 1,500 | 6,000 | 4 steps at 1,500 € each | 100% | Equals the corresponding annual participation fee
Duesseldorf | 1,000 | 22,500 | Not defined explicitly | 100%* | Equals twice the amount of the participation fee when not already a participant at another regional exchange; in the latter case it equals the participation fee
Frankfurt (FWB) | 7,500 | 15,000 | Participants accessing exclusively via Xontro 7,500 €, participants on the floor 15,000 € with additional 1,500 € for every trader | Multiplier not existent | For lead brokers (‘Skontroführer’) 20,000 €, none for the others
Hamburg | 50 | 10,000 | Not defined explicitly | Multiplier not existent | Equals the corresponding annual participation fee
Hanover | 50 | 20,000 | Not defined explicitly | Multiplier not existent | 5,000 €
Munich | 500 | 7,000 | 15 steps at 250 € or 500 € each, Specialists: 35,000 € additionally | 350% | 1,500 €
Stuttgart | 1,200 | 6,100 | Different steps, official exchange brokers 5,000 € additional | Multiplier not existent | 1,800−6,100 €, 35,000 € for lead brokers (‘Skontroführer’)
XETRA: two direct connections | 37,500 | 37,500 | Fees for XETRA | Multiplier not existent | None
XETRA: one direct connection, one Internet connection | 28,500 | 28,500 | Fees for XETRA | Multiplier not existent | None
XETRA: Internet connection | 10,500 | 10,500 | Fees for XETRA | Multiplier not existent | None
@XETRA Workstation | 13,500 | 13,500 | Fees for XETRA | Multiplier not existent | None

* In Duesseldorf the multiplier is typically not applied; instead the fees are negotiated on an individual basis within the given interval

12 Information taken from (Börse Berlin-Bremen 2004; Börse Duesseldorf 2004; Börse Munich 2004b; Börse Stuttgart 2004; Hanseatische Wertpapierbörse Hamburg 2004; Niedersächsische Börse zu Hannover 2003; Deutsche Börse AG 2003; Deutsche Börse AG 2004) as well as from emails and phone calls with the market providers.

The annual participation fees in particular are regularly revised and vary significantly between the markets, depending on the given and/or anticipated turnover volume of the broker at the respective market.13 Differences between the markets are immense, and Berlin, Duesseldorf and Munich define a multiplier in their scales of charges and fees that can be adapted easily to current developments. The market operators also charge for the provision of the technical infrastructure. Brokers and other direct market participants pay monthly fees for using the standardized Xontro system. Xontro is the system that allows for electronic trading and order processing at the German regional exchanges; it is provided by Deutsche Börse. Market participants can choose between different forms of accessing Xontro:14

• Dial-in connection: The dial-in connection is the least expensive and easiest way to get the full range of functionalities for order processing. It is typically used by brokers with a small or medium order processing volume, or as a backup solution by brokers with a higher volume.
• Permanent connection: Market participants that have high(er) processing volumes and run their own order processing systems might want a permanent connection to the systems of Deutsche Börse AG. For such a connection they are charged a monthly fee of 7,500 €, no matter at which markets the participant is registered.15

13 See e.g. § 2 para. 2 in the scales of charges and fees of Börse Stuttgart.

8.2.3 Fees for Price Determination Services

Market maker commissions, which have to be paid for most transactions on regional exchanges, are calculated in basis points (bp) of the order volume. XETRA is not only an electronic but also an automated trading system in the sense that the matching of orders and the determination of prices take place without human intervention; orders compete directly with each other. Instead of market maker commissions, XETRA system fees are charged. Partial executions of orders are handled as separate transactions that are normally charged individually (XETRA transactions can be handled differently if executed within one day16). Table 8.2 summarizes the amounts charged for trading stocks; fees for trading in bonds are calculated differently and are not shown here.

If transactions are executed and documented in Xontro, contract notes are created that provide information on trading partners, price, date, fees, etc. Contract notes document the terms of any contract that has been concluded at an exchange. Generation and transmission of contract notes are charged by the exchanges; rates differ between the regional markets (see Table 8.3). Contract note fees are charged by Deutsche Börse on behalf of the different regional exchanges.17 Normally there are no corresponding charges on XETRA; for voluntary contract notes for XETRA contracts, one has to pay 0.06 € per transaction. In addition, market participants have to pay a monthly fee of 55 € for any market they access via Xontro.

14 See (BrainTrade 2004b), p. 12.
15 Munich runs the proprietary trading system maxone that can be accessed through the Xontro connection; see (Deutsche Börse Systems, BrainTrade 2004), pp. 8–9.
16 (Deutsche Börse AG 2003).
17 (BrainTrade 2004a).


Table 8.2. Market maker commission and (XETRA) system fees, respectively, for order execution18

Exchange | Variable allowance | Exception for German blue chips (DAX) | Minimum/maximum allowance in € for stocks | Notes
Berlin | 8 bp | 4 bp | 0.75/– |
Duesseldorf | 8 bp | 4 bp | 0.75/– |
Frankfurt (FWB) | 8 bp | 4 bp | 0.75/–* |
Hamburg | 8 bp | 4 bp | 0.75/8.00 |
Hanover | 8 bp | 4 bp | 0.75/– |
Munich | 8 bp | 4 bp | 0.75/– |
Stuttgart | 8 bp | 4 bp | 0.75/12.00 |
XETRA: High Volume | 0.56 bp | None | 0.70/21.00 | 20,000 € minimum fee
XETRA: Medium Volume | 0.588 bp | None | 0.73/22.05 | 5,000 € minimum fee
XETRA: Low Volume | 0.644 bp | None | 0.80/24.15 | 2,000 € minimum fee

* For derivative leverage products and for derivative investment products, there is a maximum allowance of 3 € and 12 €, respectively
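The fee logic behind Table 8.2 can be sketched in a few lines. This is only an illustration under stated assumptions: the commission is taken to be the order volume times the basis-point rate, raised to the minimum allowance and, where a maximum exists, capped at it. The function name `execution_fee` and its parameters are hypothetical and not part of any exchange's scale of charges.

```python
from typing import Optional


def execution_fee(volume_eur: float, rate_bp: float,
                  minimum_eur: float,
                  maximum_eur: Optional[float] = None) -> float:
    """Fee for one execution: volume times rate, with a floor and an optional cap."""
    fee = volume_eur * rate_bp / 10_000      # 1 bp = 1/10,000 of the volume
    fee = max(fee, minimum_eur)              # minimum allowance applies to small orders
    if maximum_eur is not None:
        fee = min(fee, maximum_eur)          # cap, e.g. Hamburg or Stuttgart
    return round(fee, 2)


# Rates, minima and maxima taken from Table 8.2 (stocks outside the DAX)
print(execution_fee(5_000, 8, 0.75))            # Berlin, no cap
print(execution_fee(20_000, 8, 0.75, 8.00))     # Hamburg, cap of 8.00 EUR applies
print(execution_fee(5_000, 0.56, 0.70, 21.00))  # XETRA high volume, minimum applies
```

For a typical retail order of 5,000 €, the sketch shows the XETRA minimum fee being binding while the regional commission is still proportional to volume.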

Table 8.3. Contract Note Fees19

Exchange | Contract note fee in € | Monthly fee for Xontro access to the respective market in €
Berlin | 1.75 | 55.00
Duesseldorf | 1.75 | 55.00
Frankfurt | 1.75 | 55.00
Hamburg | 1.70 | 55.00
Hanover | 1.70 | 55.00
Munich | 1.00 | 55.00
Stuttgart | 1.74 | 55.00

18 Table 8.2 is based on the references already given in Table 8.1. The internalization system XETRA Best is not illustrated.
19 (BrainTrade 2004a).


Fees for Clearing and Settlement20

After a deal has been closed at an exchange, the duties of the participants involved regarding delivery and payment are registered and fulfilled in the clearing and settlement phase. Besides these basic services, clearing and settlement providers may offer additional services (e.g. custody). Trades on national exchanges are reported to Clearstream Banking AG as one of the main clearing and settlement service providers. As the central depository for securities, Clearstream Banking AG holds the deposits of the banks and brokers involved in securities market trading. Clearstream charges base fees for the holding of accounts and deposits and additional fees per transaction. Base fees have to be paid monthly; they are calculated in bp from the individual volumes. Different fee scales apply to different security types. Table 8.4 illustrates the pricing scheme for stocks, mutual funds and similar securities. The fee is calculated by summing up the security values in the single categories (e.g. for a deposit with a volume of € 150 million, the first € 100 million are charged at 0.2 bp and the remaining € 50 million at 0.175 bp).

Table 8.4. Annual Fees of Clearstream Banking AG

Deposit (market value in million €) | Fee (in bp, excluding taxes)
0 to 100 | 0.200
From 100 to 250 | 0.175
From 250 to 500 | 0.150
From 500 to 1,000 | 0.125
From 1,000 to 5,000 | 0.100
From 5,000 to 10,000 | 0.080
From 10,000 to 25,000 | 0.060
From 25,000 to 100,000 | 0.040
From 100,000 | 0.020
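The tiered calculation described above, in which each rate applies only to the part of the deposit that falls into its band, can be sketched as follows. This is a minimal illustration, assuming the bands of Table 8.4 are applied marginally, as in the € 150 million example; the names `TIERS` and `deposit_fee` are hypothetical, and the sketch says nothing about the billing period.

```python
# Upper band limits (market value in million EUR) and rates (bp) from Table 8.4
TIERS = [
    (100, 0.200), (250, 0.175), (500, 0.150), (1_000, 0.125), (5_000, 0.100),
    (10_000, 0.080), (25_000, 0.060), (100_000, 0.040), (float("inf"), 0.020),
]


def deposit_fee(value_mio: float) -> float:
    """Fee in EUR, charging each rate only on the slice inside its band."""
    fee = 0.0
    lower = 0.0
    for upper, rate_bp in TIERS:
        if value_mio <= lower:
            break                                        # deposit ends below this band
        slice_mio = min(value_mio, upper) - lower        # part of the deposit in this band
        fee += slice_mio * 1_000_000 * rate_bp / 10_000  # 1 bp = 1/10,000
        lower = upper
    return round(fee, 2)


# Example from the text: 100 million at 0.2 bp plus 50 million at 0.175 bp
print(deposit_fee(150))  # 2875.0
```

Under this reading, the € 150 million deposit costs 2,000 € for the first band plus 875 € for the second.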

Transaction-based fees are also charged on a monthly basis. For a standard trade they typically amount to around € 1.15 (tax free) for each of the two partners involved in a trade. Depending on the monthly trading volume, Clearstream provides discounts and rebates, which are illustrated in Table 8.5.21

Table 8.5. Discount Scheme for transaction-based Fees of Clearstream Banking AG

Number of bookings per month | Rebate | Fee per trade and participant
From 5,000 | 5.0% | 1.09 €
From 10,000 | 10.0% | 1.04 €
From 20,000 | 12.5% | 1.01 €
From 50,000 | 15.0% | 0.98 €

Since 2003 Eurex Clearing has acted as a Central Counterparty (CCP) for transactions in stocks at the regional exchange in Frankfurt and in XETRA to allow for the netting of trades (only the net positions of the participants have to be cleared and settled), lower risks and more anonymity in trading. Hence, Eurex Clearing AG becomes the trading partner for both parties involved in a trade and ensures its timely clearing (Clearstream Banking AG remains the settlement service provider). The fees charged by Eurex Clearing differ between the two procedures that can be applied, namely the gross and the net procedure. In the first case a transaction fee of € 0.70 has to be paid; in the latter a more complex pricing scheme is applied to the so-called netting units. In both cases partial executions are regarded as single executions and prices are listed without taxes:22

Table 8.6. Pricing scheme of Eurex Clearing AG

Number of transactions per “settlement netting unit” and day | Fee per transaction
From 1 to 1,000 | 0.55 €
From 1,001 to 2,500 | 0.53 €
From 2,501 to 5,000 | 0.51 €
From 5,001 to 10,000 | 0.49 €
From 10,001 to 20,000 | 0.47 €
From 20,000 | 0.45 €

20 The numbers given in the next paragraph are taken from (Clearstream Banking AG 2003) and (Eurex Clearing AG 2004).
21 Additional services of Clearstream and the corresponding fees are not discussed here in more detail.
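The per-trade fees in Table 8.5 are consistent with applying the rebate to the standard fee of € 1.15 and rounding to the cent. The following sketch checks this under two assumptions that are not stated in the source: “From 5,000” is read as an inclusive threshold, and amounts are rounded half-up. The names `BASE_FEE`, `REBATES` and `fee_per_trade` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

BASE_FEE = Decimal("1.15")  # standard fee per trade and participant, as given in the text

# (minimum bookings per month, rebate) from Table 8.5, highest threshold first
REBATES = [
    (50_000, Decimal("0.150")),
    (20_000, Decimal("0.125")),
    (10_000, Decimal("0.100")),
    (5_000, Decimal("0.050")),
]


def fee_per_trade(bookings_per_month: int) -> Decimal:
    """Standard fee reduced by the rebate of the highest threshold reached."""
    for threshold, rebate in REBATES:
        if bookings_per_month >= threshold:
            discounted = BASE_FEE * (1 - rebate)
            return discounted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return BASE_FEE  # below 5,000 bookings no rebate applies


for n in (1_000, 5_000, 10_000, 20_000, 50_000):
    print(n, fee_per_trade(n))
```

Decimal arithmetic is used instead of floats so that a value such as 1.035 is rounded to 1.04 deterministically.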

22 A netting unit is a number of similar buy and sell orders that shall be netted before settlement. The definition of the netting units as well as the decision which trades may be settled in the net or gross procedure is left to each participant.

8.2.4 Discussion of Services and Fees

Diverse specialized companies contribute to the provision of the service offering “exchange transaction”. The individual fees that are charged for this complex service differ significantly between the providers:

• Fees for market access: Access fees differ substantially between providers and markets. This is because (i) the range in the official scales of charges and fees of the analyzed markets is very wide (e. g. Hanover from 50 € to 20,000 €) and (ii) the differences between the markets are also large (e. g. Duesseldorf 22,500 € at the maximum vs. Berlin 6,000 € at the maximum). Access and accreditation fees can be regarded as fixed costs for the brokers that are (mostly) independent of the individual transactions. The same holds for Xontro fees.
• Price determination services: Market maker commissions that have to be paid to the exchange broker are variable costs for the broker. At present fee structures do not differ between the markets, except for a few caps. Differences do exist between XETRA and the regional exchanges: in single cases theoretical XETRA fees amount to only 7% of the respective fees of the regional exchanges. But this rather theoretical value has to be put into perspective, as for most trades the minimum commission of 0.70 €, 0.73 € or 0.80 € has to be paid (most retail trades in fact have a volume below 12,000 €). The supposed price advantage of XETRA is therefore smaller than anticipated. Fees for the contract notes are also variable costs that can easily be passed on. Munich in particular is less expensive (1 €) than all other exchanges (1.70 € to 1.75 €). The fee for clearing and settlement connectivity neither varies between markets nor is correlated with the number of orders. These 55 € per month can be regarded as fixed costs for the broker.
• Clearing and settlement services: Fees are charged by Clearstream Banking AG and Eurex Clearing AG for holding deposits and for clearing and settlement services in a narrower sense. The latter can be directly associated with single trades and, therefore, can be regarded as variable costs for an online-broker. Differences between exchanges arise from the central counterparty (CCP), which exists at the markets in Frankfurt (regional market and XETRA). Trades with the CCP involved allow for significantly lower costs.23

Table 8.7. Market maker commission and order volume

Bp       Minimum market maker commission in €    Volume of orders up to which minimum market maker commission has to be paid
8        0.75                                    937.50
4        0.75                                    1,875.00
0.644    0.80                                    12,422.36
0.588    0.73                                    12,414.97
0.56     0.70                                    12,500.00

23 It has to be mentioned, though, that the introduction of the CCP involves significant investments in hardware and software as well as additional infrastructure components for the market participants. Additionally, the CCP is not yet available for all products. Hence traditional and new processes and infrastructures have to be run in parallel; see (Kalbhenn 2002).
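The volume thresholds in Table 8.7 follow from dividing the minimum commission by the rate in basis points. The sketch below reproduces that arithmetic and shows why the minimum commission dominates for typical retail volumes. It assumes that "Bp" means basis points of the order volume; the function names and the optional cap parameter are hypothetical and only serve the illustration.

```python
"""Minimum-commission thresholds from Table 8.7 (illustrative sketch)."""

# (rate in basis points, minimum commission in EUR), as listed in Table 8.7.
TABLE_8_7 = [(8, 0.75), (4, 0.75), (0.644, 0.80), (0.588, 0.73), (0.56, 0.70)]


def commission(volume_eur, rate_bp, minimum_eur, cap_eur=None):
    """Commission in EUR: rate times volume, floored at the minimum.

    cap_eur is an optional upper limit, standing in for the caps that some
    market segments apply.
    """
    fee = max(minimum_eur, volume_eur * rate_bp / 10_000)
    if cap_eur is not None:
        fee = min(fee, cap_eur)
    return round(fee, 2)


def threshold_volume(rate_bp, minimum_eur):
    """Order volume in EUR up to which only the minimum commission is due."""
    return round(minimum_eur * 10_000 / rate_bp, 2)


if __name__ == "__main__":
    for rate_bp, minimum_eur in TABLE_8_7:
        print(rate_bp, minimum_eur, threshold_volume(rate_bp, minimum_eur))
    # A 5,000 EUR order: 8 bp rate versus the 0.56 bp rate with its minimum.
    print(commission(5_000, 8, 0.75), commission(5_000, 0.56, 0.70))
```

For a 5,000 € order the 0.56 bp rate alone would yield 0.28 €, so the 0.70 € minimum applies, which is the effect the discussion of price determination services describes.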


After discussing the services and fees that are associated with an exchange transaction, the offerings and charges of four online-brokers are illustrated in more detail. We analyze whether, how, and how transparently online-brokers pass on to their customers the prices they themselves have to pay. The aim is to discuss if and to what extent private investors can base their trading decisions on information about the prices that the different service providers charge for their individual offerings.

8.3 Analysis of the Price Models of Online-brokers

8.3.1 Forms of Appearance of Online-brokerage

Companies that provide access to markets offer securities trading services for their customers. As members of the so-called sell-side, they trade securities on behalf of their customers; in contrast, dealers such as market makers trade with their customers.24 To put it differently, online-brokers do business in their own name but for the account of their customers.25 Since the beginning of the nineties their appearance has changed, particularly due to the influence of information technology.26

Today the term broker, frequently specified as discount- or direct-broker, is often used for different kinds of securities trading service companies that give retail investors the opportunity to trade securities by offering access to markets. As ‘access intermediaries’ they largely offer technical infrastructures that allow for entry and routing of customer orders as well as custody of deposits and accounts.27 Since the Internet came into intensive use, the expression online-broker (often used synonymously with e-broker) has served as an umbrella term for these different types. Over time it has embraced characteristics of differing stages of technical evolution and service offerings. Starting with the first direct-brokers that had a clear focus on cost leadership, online-brokers with a broader service offering and less pronounced discount characteristics have been entering the German market. Most of these brokers are subsidiaries of German or international private banks (see Table 8.8).

In the following we will refer to an online-broker as an intermediary that focuses its business primarily on offering market access and order routing via online channels. Moreover, we will concentrate on the part of the service offering that retail investors utilize for Internet-based securities trading at a German exchange (national Internet-brokerage).

24 See (Harris 2002).
25 See also §2 para. 3 Securities Trade Act (Wertpapierhandelsgesetz).
26 See (SEC 1999).
27 See (Weinhardt et al. 1999).


Dennis Kundisch, Carsten Holtmann

Table 8.8. Online-brokers, their parent companies and number of online accounts

Name of the broker    Parent company                       Nationality of parent company    Number of online securities accounts in Germany in 2005 (estimations)
1822direkt            Frankfurter Sparkasse                Germany                          > 160,000
Citibank              Citigroup                            USA                              > 330,000**
comdirect bank        Commerzbank                          Germany                          > 550,000
Cortal Consors        BNP Paribas                          France                           > 500,000
DAB bank              Hypovereinsbank/Unicredito           Germany/Italy                    > 900,000*
ING Diba              ING Group                            Netherlands                      > 458,000
Easytrade             Postbank                             Germany                          > 430,000
E*TRADE               E*TRADE FINANCIAL                    USA                              > 10,000
Fimatex               Société Générale                     France                           > 22,500
maxblue               Deutsche Bank                        Germany                          > 400,000
S Broker              German Savings Banks’ Association    Germany                          > 100,000

* Including around 450,000 accounts of the FondsServiceBank GmbH in the B2B business
** Some of which may only be used offline

8.3.2 Price Models of Selected Online-brokers

The fees that are relevant for closing a deal at an exchange are the issue of interest in the following. Therefore, we will have a look at the fees a retail investor is charged by his online-broker. The analysis only covers orders in stocks that were entered via the Internet28 to be routed to a German exchange. These fees are called explicit transaction costs.29 A substantial part of these transaction costs are the order commissions. In general, the calculation of order commissions depends on the volume of a transaction. Moreover, some online-brokers charge a so-called trading place fee. Depending on the price model there are also fees for placing a limit order, for canceling or changing an order, or for partial executions of orders. In addition to these costs, which are described in the brokers’ tariffs (also called terms & conditions or rates & fees on the respective websites), so-called external allowances, which are generally not described in detail in the tariffs, may also be charged to the customer. They may include market maker commission, XETRA fees, clearing fees or a contract note fee. Trades in registered securities may also result in additional fees for the entry in the register of shareholders.

In the following, the price models of selected online-brokers (Cortal Consors, comdirect, DAB bank and maxblue) are briefly sketched. These brokers offer a comparable service spectrum concerning market information, market access and tools for market analysis. Moreover, all of them have existed for at least five years and hold more than 400,000 online accounts in Germany. The data refer to the standard tariffs; special conditions, e. g. for active traders, were not considered.30 The following tables briefly summarize the four price models.

Table 8.9. Price models (in €)31

                  Order commission for a national Internet stock trade                            Trading place fee          Other fees
Broker            %       Fixed-step charge    Basic charge    Minimum charge    Max. charge      XETRA/regional exchange    Limit    Change/Cancellation    Partial execution
Comdirect         0.25    –                    4.90            9.90              59.90            1.50/2.50+                 2.50*    2.50/2.50              ✓
Cortal Consors    0.25    –                    4.95            9.95              69.00            0.95/2.95                  –        2.50/2.50              (✓)#
DAB bank          0.25    –                    4.95            7.95              55.00            1.50/2.90                  –        2.50/2.50              (✓)#
Maxblue           –       ✓                    –               19.99             34.99            –/–                        –        4.99/4.99              ✓

* This fee applies only if the order is not executed on the same day
+ These are minimum fees. For orders with a volume above 100,000 €, the fee is 0.0015% and 0.0025%, respectively
# The order commission is waived for the second and any further partial execution

Table 8.10. Charged external allowances (in €)

Broker            Market maker commission    Clearing & Settlement    Contractual note fee    Registration fee for registered securities
Comdirect         ✓                          –                        –                       0.93
Cortal Consors    ✓                          –                        –                       1.95
DAB bank          ✓                          –                        –                       0.93
Maxblue           ✓                          ✓                        ✓                       0.93

28 These results can be broadly applied also to publicly traded securities such as warrants, exchange traded funds or derivative investment certificates. Concerning bonds, there are generally at least different market maker commissions.
29 Implicit transaction costs include e. g. the spread or the price impact of an order (see e. g. (Gomber 2000)).
30 Often, special customer segments such as actively trading customers are grouped into a community. Examples of such communities that get special conditions are “Startrader” and “Platinumtrader” at Cortal Consors or “comdirect first” at comdirect. See (Kundisch and Krammer 2006) for some insights concerning attitudes of retail investors dependent on their trading activity.
31 Moreover, additional fees may be charged in specific cases, e. g. for the custody of foreign stocks. For the sake of simplicity, these fees are not covered in the following. For the detailed price models see (Comdirect bank 2006; Cortal Consors 2005; DAB bank 2004; Deutsche Bank Privat- und Geschäftskunden AG 2006).
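To make the percentage-based models in Table 8.9 concrete, the following Python sketch computes the broker-side cost of a XETRA order for comdirect and Cortal Consors. It rests on an assumption that the source does not confirm: the minimum and maximum charges are taken to bound the sum of basic charge and percentage component. Limit fees and external allowances are left out, and the class and method names are hypothetical.

```python
"""Sketch of a percentage-based price model as read from Table 8.9."""

from dataclasses import dataclass


@dataclass(frozen=True)
class PriceModel:
    name: str
    rate_pct: float       # percentage of the order volume
    basic_eur: float      # basic charge
    min_eur: float        # minimum order commission
    max_eur: float        # maximum order commission
    xetra_fee_eur: float  # flat trading place fee for XETRA

    def order_commission(self, volume_eur: float) -> float:
        # Assumption: min/max bound the sum of basic charge and percentage.
        raw = self.basic_eur + volume_eur * self.rate_pct / 100
        return round(min(max(raw, self.min_eur), self.max_eur), 2)

    def xetra_cost(self, volume_eur: float) -> float:
        # Commission plus trading place fee; no limit fee, no allowances.
        return round(self.order_commission(volume_eur) + self.xetra_fee_eur, 2)


# Values from Table 8.9; comdirect's 1.50 EUR is the minimum XETRA fee that
# applies to order volumes up to 100,000 EUR.
COMDIRECT = PriceModel("comdirect", 0.25, 4.90, 9.90, 59.90, 1.50)
CORTAL_CONSORS = PriceModel("Cortal Consors", 0.25, 4.95, 9.95, 69.00, 0.95)

if __name__ == "__main__":
    for volume in (1_000, 5_000, 30_000):
        for model in (COMDIRECT, CORTAL_CONSORS):
            print(volume, model.name, model.xetra_cost(volume))
```

Because the external allowances of Table 8.10 are ignored, the output is not meant to reproduce the published cost comparison; it only shows how basic charge, percentage, minimum and maximum interact under the stated assumption.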


8.3.3 Comparison of the Price Models and Discussion

8.3.3.1 General Comparison of the Price Models

The overview of the price models reveals that each price model on its own is quite straightforward; however, a comparison is difficult since each price model consists of different service components with their respective fees, as well as different fees for the same service component. The following figure graphically illustrates the comparison of transaction costs for a limit order in XETRA. While Cortal Consors provides the lowest explicit transaction costs for a trade volume up to 5,500 €, maxblue is the least expensive provider for an order volume between 5,500 € and 10,000 € and for volumes above 11,700 €. Between 10,000 € and 11,700 €, Cortal Consors is again the best choice among the big four online-brokers. It is not the objective of this contribution to analyze these apparent differences in detail. Rather, the question is whether these numerous fees and commissions become visible and transparent for retail investors.

[Figure: line chart “Executed XETRA order with limit”; x-axis: order volume in Euro (0 to 30,000); y-axis: explicit transaction costs in Euro (0 to 80); series: comdirect, Cortal Consors, DAB bank, maxblue]

Figure 8.1. Comparison of explicit transaction costs for a limit order in XETRA

8.3.3.2 Transparency of the Fees for Service Components for Investors

The components of the price models of the covered online-brokers do not allow for an exact mapping to the service and fee components that exchanges or other service providers charge the online-broker to route, execute, and clear a trade. Therefore the differentiation into market access, price discovery, and clearing and settlement fees cannot be used one-to-one for the following discussion.


All online-brokers have to pay accreditation and annual fees at the exchanges. These fees are independent of individual orders32 and may vary between different exchanges as well as between different online-brokers. Theoretically, an online-broker could allocate the annual participation fee of each exchange uniformly across the order volume expected to be routed to that market. In contrast, accreditation fees may be regarded as sunk costs since they were paid once and should not influence the cost calculation of online-brokers today. In practice, however, the considered online-brokers behave differently. According to answers to our inquiries concerning this issue, at least one broker sums up the annual participation fees for all German trading places, divides this sum by all expected national orders, and considers the result as a fixed cost allocation within its order commission. Hence, potentially different fees of the exchanges will not become visible for a retail investor.

At the time the analysis was conducted, all market makers at German regional exchanges charged similar commissions. These commissions are passed through to the retail investor by all of the brokers considered. Hence, competition between the exchanges would be visible for an investor, but for this service component there is no differentiation in the market with the exception of caps. There are some market segments at specific exchanges that have a cap on market maker commissions (at Stuttgart 12 € for all retail derivatives (EUWAX segment), domestic stocks, and mutual funds; at Frankfurt 3 € and 8 € for derivative leverage products and derivative investment products, respectively (Smart Trading segment); at Hamburg 12 € for all stock orders). EUWAX and Smart Trading in particular use these caps intensively as a sales argument in their marketing strategies.33

So-called XETRA fees are only passed through by maxblue, while the other three brokers charge a flat fee (and benefit from this in most cases). Specifically, comdirect charges a minimum fee of 1.50 €. Additionally, for a trade volume higher than 100,000 €, a volume-dependent fee of 0.0015% is applied. Orders of this volume, however, will be very rare exceptions among retail investors. Whether the XETRA fee at Cortal Consors, comdirect, and DAB bank includes fixed cost allocations or may be a sum of other variable costs is visible neither to an investor nor to the authors.
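The pooled allocation of annual participation fees described above can be contrasted with a per-exchange allocation in a few lines of Python. All figures in this sketch are hypothetical (the exchange names, fee amounts and order counts are invented for illustration and are not taken from the fee schedules); only the two allocation rules come from the text.

```python
"""Pooled versus per-exchange allocation of annual participation fees."""

# Hypothetical annual participation fees in EUR and expected orders per year.
ANNUAL_FEES = {"Exchange A": 20_000, "Exchange B": 6_000, "Exchange C": 10_000}
EXPECTED_ORDERS = {"Exchange A": 800_000, "Exchange B": 300_000, "Exchange C": 100_000}


def pooled_allocation(fees, orders):
    """One surcharge per order: all fees divided by all expected orders."""
    return sum(fees.values()) / sum(orders.values())


def per_exchange_allocation(fees, orders):
    """Surcharge per order and exchange: each fee over that exchange's orders."""
    return {name: fees[name] / orders[name] for name in fees}


if __name__ == "__main__":
    print("pooled:", round(pooled_allocation(ANNUAL_FEES, EXPECTED_ORDERS), 4))
    for name, value in per_exchange_allocation(ANNUAL_FEES, EXPECTED_ORDERS).items():
        print(name, round(value, 4))
```

With the pooled rule every order carries the same surcharge regardless of the trading place, so a fee difference between exchanges cannot reach the investor; the per-exchange rule would expose it.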

32 According to the terms and conditions of the exchanges, the annual participation fees are set with regard to the overall relevance and importance of a participant for the exchange. In the long run, these fixed costs in fact depend in a sense on the number and the volume of orders that are routed to a specific exchange.
33 In addition, in case of incorrect information by an online-broker, a retail customer may come to the conclusion that with respect to market maker commissions, one exchange may be less expensive compared to another. E. g. one broker told us that the market maker commission in Berlin is 4 bp on all stocks instead of just the DAX stocks (email as of 04/17/2004). Incorrect information was also provided by several online-brokers with respect to the charges for the registration of registered shares.


In addition to the market maker commission at regional exchanges, Cortal Consors, comdirect and DAB bank charge a regional exchange fee. It is not visible at all for a retail investor which service and fee components are included in this fixed fee. maxblue is the only broker in the sample that passes through all external allowances. Hence, the differences in contractual note fees (see Table 8.3), which in the worst case amount to a moderate 0.75 € per trade, will only become visible for maxblue customers in the sample.

Clearing and settlement is centrally coordinated by Clearstream Banking AG and in some cases, concerning stock trades at FWB or via XETRA, by Eurex Clearing AG. Competition can only be expected between FWB and XETRA on the one hand and all the other regional exchanges on the other hand. According to the hotlines of the brokers in the sample, only maxblue passes through this fee. Thus, a potential advantage of XETRA and FWB will in most cases not become visible to a retail investor.

The handling of partial executions is directly relevant to retail investors. Cortal Consors and DAB bank waive their order commission for the second and any further partial execution but charge trading place fees and external allowances, whereas maxblue and comdirect charge for each partial execution as if it were a single and separate order, except for XETRA trades that are executed on the same day. Thus, trades at maxblue or comdirect may become substantially more expensive compared to trades at Cortal Consors or DAB bank due to partial executions. One should note that partial executions generally only occur on XETRA. Regional exchanges in most cases execute an order completely in one step; e. g. Börse Munich advertises a partial execution rate of below 0.2% of all orders.

8.3.3.3 Discussion and Limitations

There are a number of service providers involved in the complex offering “exchange transaction”.
On the level of online-brokers, there are in particular different fees for participation at the different exchanges (see Table 8.1). Another aspect is the diversity in fees for contractual notes (see Table 8.3). For all other service and fee components, there is only competition on two levels:

• Generally, the market maker commission is identical for all regional exchanges, except for a few exchanges that offer a cap on this commission, and differs significantly from the XETRA fee. The XETRA fee is substantially lower than the market maker commission; however, in most cases the minimum fee will be triggered for a retail trade. Thus, the difference to the market maker commission is put into perspective in most instances.
• Concerning the clearing and settlement of trades, the fees of the FWB and XETRA may be lower compared to the other regional exchanges due to the introduction of the Central Counterparty (CCP), leaving aside the one-time fixed costs for the adaptation of the intermediaries’ infrastructures. A retail investor may benefit from the scale of his intermediary, in terms of the number of transactions and the consolidated order volume, since the fees depend on the number and the volume of the transactions of the intermediary in total. These discounts are independent of the exchanges, though.

Normally, the moderate differences in the fees between the competing exchanges do not become visible for a retail investor due to the price model of his broker. However, there are differences in these models: On the one hand, Cortal Consors and DAB bank refrain from charging variable costs for most of the external allowances but charge a fixed fee per market. On the other hand, maxblue passes through all external allowances that may be attributed to a specific trade. Whether the price models affect the behaviour of the different investors cannot be observed. Additionally, one important aspect should be mentioned here: None of the analyzed brokers lists the different costs for the different exchanges. Moreover, it took even the authors of this contribution a number of weeks to obtain all the relevant data. Thus, most of the time, an investor will only learn about the fees at a trading place ex post from his contract note and will not have the opportunity at that time to compare these costs to the costs the same trade would have triggered at a different trading place.

Some limitations accompany the presented analysis and should be taken into account when considering the conclusions above:

• We only looked at transactions at German exchanges in national stocks. There may be substantial differences in the service components and fees for foreign exchanges or trades in foreign securities.
• We only analyzed the price models of four established online-brokers in the German market. There may be substantially differing terms & conditions at other brokers that potentially would lead to other conclusions. Citibank, e. g., recently introduced a flat rate model at 9.99 € per trade without market maker commissions, trading place fees or limit fees.

8.4 Outlook: Off-exchange-markets as Competitors

So far, only markets in Germany under the official market and legal supervision of the state-run Exchange Supervisory Authority have been analyzed as trading venues. These official exchanges provide pre- and post-trade transparency and offer order-driven market models. In recent years additional trading venues have gained importance for retail investors: off-exchange trading on specific alternative trading systems (ATS). Bypassing the traditional exchanges, retail investors can communicate directly via their brokers with the issuers of retail derivatives or with market makers. Investors get the chance to request ‘a quote’ (meaning the sell and the buy price at which the counterparty is willing to trade) for securities to be bought or sold. The market maker or issuer sends a quote to this specific customer, who may decide with a mouse click within a specified time interval (typically 3 to 8 seconds) whether he wants to buy (or sell) the securities at the posted price. This market model is called request-for-quote and belongs to the quote-driven trading mechanisms.34

According to a recent study by the authors, primarily retail derivatives are traded in these markets; they account for around 80% of the total turnover. Due to the lack of transparency, one has to rely on expert estimates for all turnover figures, prices and volumes of executed trades in this market. In interviews with German brokers conducted in 2005, experts estimated the turnover via ATS in investment retail derivatives at around 50% of the total turnover of 41 billion Euro in this product category in 2004, whereas the turnover via ATS in levered retail derivatives accounts for around 60% of the total turnover of 52 billion Euro in 2004. In 2005, not only did the market itself see another double-digit increase, but trading via ATS also became ever more popular among private investors. Retail derivatives are not the only products traded via ATS: a number of German bonds, most German stocks, European blue chips, all NASDAQ and Dow Jones stocks as well as some selected Asian stocks can also be traded there.

As for competition between trading places, here it really becomes visible for the customer. However, it seems that this is mostly due to the price models of the online-brokers. Regardless of the real payment streams between brokers and issuers or market makers, the explicit transaction costs for the customer are lower at all analyzed brokers compared to a trade that is executed at an exchange.

• maxblue offers the greatest discount compared to its prices for an exchange-executed order. Generally, the broker commission is 5 € lower and no external allowances are charged.35
• DAB bank offers a substantial discount on the broker commission only on trades with three36 selected issuers and market makers (so-called “star partners”). For these, there is a flat fee without external allowances. For all other issuers and market makers, the broker commission is the same and there is a fixed trading place fee of 0.80 €, which is substantially lower than the regional exchange fee of 2.90 € or the XETRA fee of 1.50 €. No other external allowances are charged.37
• comdirect and Cortal Consors just waive all external allowances but charge the same broker commission as if the order were routed to an exchange.38

34 See e. g. (Holtmann 2004).
35 The only external allowance is a so-called liquidity provision fee which is charged when trading with the market maker Lang & Schwarz.
36 Out of more than 20.
37 The only external allowance is a so-called liquidity provision fee which is charged when trading with the market maker Lang & Schwarz.
38 Again, the only external allowance is a so-called liquidity provision fee which is charged when trading with the market maker Lang & Schwarz.


Additionally, all of the brokers listed above offer free trade campaigns with selected issuers for a limited time span (usually one to three months) or marketing campaigns with reduced broker commissions for all trades via ATS, sometimes only for specific (community) groups of customers. To sum up, competition between exchanges on the one hand and specialized off-exchange-markets on the other hand is much more visible for the retail investor than the competition among the different exchanges. The caps on the market maker commission at EUWAX and Smart Trading can be interpreted as a direct answer to the price competition described above.

8.5 Conclusion

German regional exchanges compete among themselves and against the fully electronic cash market trading system XETRA for the order flow of retail customers. Additional competition can be observed with the off-exchange markets. In this contribution we first described what the fee models of these exchanges mean for direct market participants, such as access intermediaries like online-brokers. Next, we analyzed whether any price competition between the exchanges and other service providers that contribute to the execution and settlement of a trade is traceable and seems decision-relevant for the retail customer as an indirect market participant. Four established online-brokers in the German market were selected and their price models were analyzed with regard to the question: Do the price models of online-brokers distort or even blur existing price competition (if any) between German exchanges? Online-brokers were chosen for at least one major reason: They focus their business on providing information about and access to markets, order routing and account-related services. Thus, cross-subsidization should be much less pronounced than at other financial services providers, specifically branch-based banks. Finally, a brief outlook on the off-exchange markets, an increasingly recognized alternative for self-directed retail investors to execute an order, was provided.

All in all, the cost competition between the German exchanges can be classified as moderate. Only the participation fees at an exchange and the contractual note fees differ substantially between the considered trading venues. With respect to clearing and settlement services, small cost advantages can be stated for XETRA and FWB compared to the other regional exchanges due to the introduction of the CCP. Often, these costs are included in the market-independent broker commissions. In some cases, a fee is charged that just differentiates between XETRA on the one hand and regional exchanges on the other. However, there are some differences in the price models of the four analyzed online-brokers. While Cortal Consors and DAB bank in particular do not charge external allowances except for the market maker commission, maxblue passes through all external allowances. Whether this results in different investment and trading behavior on the side of the retail investor is questionable, yet not observable.


To anticipate explicit transaction costs, it seems advisable to look not only at the tariffs of all the participating service providers but also at the market model of the exchanges, which defines the trading rules. For retail investors, who generally trade with comparatively low volumes per transaction, it might be favorable to choose trading places where partial executions are avoided. Particularly on XETRA, partial executions are quite common and may result in substantially higher transaction costs compared to an order that is executed at a regional exchange, even if the commission and fees for a single transaction are lower on XETRA.

As for market models, the quote-driven off-exchange markets in Germany have gained importance in recent years. Most established online-brokers meanwhile offer access not only to all German exchanges but also to at least one off-exchange market where retail investors can trade directly with issuers or market makers, bypassing the exchanges. The fees for trading off-exchange provide (partially substantial) discounts compared to orders routed to and executed at exchanges. Moreover, due to the quote-driven business model, partial executions are impossible.

With respect to clearing and settlement, there are some new developments that might lead to further competition in the future. Since February 2004 clearing and settlement have been allowed at any licensed central depository for securities.39 However, Clearstream Banking AG is still the only player in this market.

In summary, inter-exchange competition is moderate with respect to costs and mostly does not become visible to the retail investor due to the distorting price models of online-brokers. However, price competition between exchanges on the one hand and ATS without governmental supervision on the other hand seems to be the real battlefield of the years to come. It remains to be seen which consequences the current re-regulation of the securities market and the implementation of the Markets in Financial Instruments Directive (MiFID) will bring.
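The cost effect of partial executions mentioned in this conclusion can be illustrated with a small worked example in Python. It is a sketch under stated assumptions rather than the brokers' actual billing logic: every partial execution is small enough that only the minimum order commission applies, the partial executions fall on different days (so the same-day exception for XETRA trades does not apply), and external allowances are ignored. The function names are hypothetical; the fee values are the minimum commissions and XETRA trading place fees of Cortal Consors and comdirect.

```python
"""Worked example: explicit cost of an order filled in several parts."""


def cost_commission_waived(n_fills, commission_eur, place_fee_eur):
    """Commission once, trading place fee per partial execution
    (the handling described for Cortal Consors and DAB bank)."""
    return round(commission_eur + n_fills * place_fee_eur, 2)


def cost_each_fill_separate(n_fills, commission_eur, place_fee_eur):
    """Every partial execution billed like a separate order
    (the handling described for maxblue and comdirect)."""
    return round(n_fills * (commission_eur + place_fee_eur), 2)


if __name__ == "__main__":
    fills = 3
    # Cortal Consors: minimum commission 9.95 EUR, XETRA fee 0.95 EUR.
    print("Cortal Consors:", cost_commission_waived(fills, 9.95, 0.95))
    # comdirect: minimum commission 9.90 EUR, XETRA fee 1.50 EUR.
    print("comdirect:", cost_each_fill_separate(fills, 9.90, 1.50))
```

Under these assumptions three partial executions cost 12.80 € with the commission waiver and 34.20 € without it, although the single-execution costs of the two brokers differ by only 0.50 €.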

References Barber B, Odean T, Zheng L (2005) Out of Sight, Out of Mind: The Effects of Expenses on Mutual Fund Flows, forthcoming in: Journal of Business, 2005, vol. 78, no. 6, pp. 2095–2119 Battalio R (1997) Third Market Broker-Dealers: Cost Competitors or Cream Skimmers? Journal of Finance, 1997, vol. 52, pp. 341–352 Börse Berlin-Bremen (2004) Gebührenordnung der Börse Berlin-Bremen, as at 1.1.2004 Börse Duesseldorf (2004) Gebührenordnung Börse München (2004a) Jahresbericht 2003/2004 – Mit Vielfalt und Wettbewerb Marktchancen erschließen und Zukunft sichern

39

See o.V. (2004).

8 Online-brokerage and Security Markets in Germany

191

Börse München (2004b) Gebührenordnung für die Börse München. In Kraft seit 1.10.2003
Börse Stuttgart AG (2003) Jahresbericht 2002 der Baden-Württembergischen Wertpapierbörse
Börse Stuttgart AG (2004) Gebührenordnung, as at 1.1.2004
BrainTrade (2004a) Gebühren für XONTRO
BrainTrade (2004b) Kreditinstitute – Anmeldung XONTRO, as at 25.2.2004
Clayton MJ, Jorgensen BN, Kavajecz KA (2006) On the presence and market structure of exchanges around the world, in: Journal of Financial Markets 9 (2006) pp. 27–48
Clearstream Banking AG (2003) Preisverzeichnis Inland für Kunden der Clearstream Banking AG, Frankfurt, as at 17.11.2003
Comdirect bank (2006) Preis- und Leistungsverzeichnis, as at 01.03.2006
Cortal Consors (2005) Preis- und Leistungsverzeichnis, as at 01.07.2005
DAB bank (2004) Preis- und Leistungsverzeichnis der DAB bank, as at 01.11.2004
Deutsche Bank Privat- und Geschäftskunden AG (2006) Preis- und Leistungsverzeichnis, as at 01.01.2006
Deutsche Börse AG (2003) Preisverzeichnis ab 01.04.2003 für die Nutzung des elektronischen Handelssystems Xetra
Deutsche Börse AG (2004) Gebührenordnung für die Frankfurter Wertpapierbörse, as at 02.01.2004, FWB12
Deutsche Börse Systems, BrainTrade (2004) Kreditinstitute – Technische Anbindung, Version 4.0, as at 19.1.2004
Easley D, Kiefer N, O’Hara M (1996) Cream-Skimming or Profit Sharing? The Curious Role of Purchased Order Flow, in: Journal of Finance, 1996, vol. 51, pp. 811–833
Eurex Clearing AG (2004) Preisverzeichnis, as at 23.02.2004
Gomber P (2000) Elektronische Handelssysteme – Innovative Konzepte und Technologien, Physica, Heidelberg
Hammer T (2004) Kleiner Handel, großer Gewinn, in: Die Zeit, no. 22, 19.05.2004, p. 29
Hanseatische Wertpapierbörse Hamburg (2004) Gebührenordnung, as at March 2004
Harris LE (2002) Trading and Exchanges, Oxford University Press, New York, N.Y.
Holtmann C (2004) Organisation von Märkten – Market Engineering für den elektronischen Wertpapierhandel, Dissertation, Fachbereich Wirtschaftswissenschaften, Universität Karlsruhe (TH), Karlsruhe
Kalbhenn C (2002) Der unbeliebte Kontrahent, in: Börsenzeitung, no. 112, 14.06.2002, p. 8
Kirschner S (2003) Transaktionskosten der Börsen im deutschsprachigen Raum, in: Bankarchiv, no. 11, 2003, pp. 819–828
Kort K (2004) Internet-Broker setzen Kundenberater ein, in: Handelsblatt, no. 53, 16.03.2004, p. 20
Kundisch D, Krammer A (2006) Transaktionshäufigkeit als Indikator für die Angebotsgestaltung bei deutschen Online-Brokern, in: Der Markt, vol. 46, no. 1, 2006

Dennis Kundisch, Carsten Holtmann

Lüdecke T (1996) Struktur und Qualität von Finanzmärkten, Gabler, Wiesbaden
Niedersächsische Börse zu Hannover (2003) Gebührenordnung, as at 11.8.2003
Matutes C, Regibeau P (1988) Mix and Match: Product Compatibility without Network Externality, in: RAND Journal of Economics, vol. 19, pp. 221–234
Picot A, Bortenlänger C, Röhrl H (1996) Börsen im Wandel, Fritz Knapp, Frankfurt a.M.
o.V. (2004) Deutsche Börse ermöglicht Rivalen Abwicklung, in: Handelsblatt, no. 42, 01.03.2004, p. 22
Röhrl H (1996) Börsenwettbewerb: Die Organisation der Bereitstellung von Börsenleistungen, Deutscher Universitäts-Verlag, Wiesbaden
Rudolph B, Röhrl H (1997) Grundfragen der Börsenorganisation aus ökonomischer Sicht, in: Hopt, K. J./Rudolph, B./Baum H. (Hrsg.): Börsenreform – Eine ökonomische, rechtsvergleichende und rechtspolitische Untersuchung, Schäffer-Poeschel, Stuttgart, pp. 143–285
SEC (1999) On-Line Brokerage: Keeping Apace of Cyberspace. Securities and Exchange Commission (SEC), New York, N.Y.
Weinhardt C, Gomber P, Holtmann C, Groffmann HD (1999) Online-Brokerage als Teil des Online-Banking – Phasenintegration als strategische Chance, in: Locarek-Junge, H./Walter, B. [eds.]: Banken im Wandel: Direktbanken und Direct Banking, Bd. 18. Berlin-Verlag 2000, Berlin, pp. 99–120

CHAPTER 9

Strategies for a Customer-Oriented Consulting Approach

Alexander Schöne

9.1

Introduction

Current publications and projects in the field of banking show that banks are focusing on customer service again. After a phase in which customer service was not considered a core business, and was even delegated to separate companies, many banks have turned their attention back to this field. There are several reasons for this. For one, the success of various financial distribution companies has shown that customer service, when carried out consistently and comprehensively, holds high potential for financial service providers. Consulting that satisfies the customer often leads to long-term customer commitment, which gives financial service providers a reliable and predictable basis. However, financial service providers face many challenges in the field of customer service, especially when it comes to legal requirements such as the implementation of the EC broker guidelines or the German Pension Tax law.

A more recent trend is that banks are extending customer service to all of their customer segments. Particular attention is being paid to the private customer and wealthy private customer segments. Because of the low number of such customers, banks often neglected these segments in the past and left them to specialized banks. Recently, different models have shown how a successful consulting offer can be set up even with a low number of potential customers. The retail customer segment has also become an interesting consulting field for banks thanks to more in-depth standardization and industrialization. To keep customers from going over to the competition, banks have to actively support the move back to enhanced customer service, not only for wealthy private customers but for all customer segments.

Many banks have the wrong expectations for their consulting services. They only accompany the trend toward increased customer service with related marketing activities.
In reality, however, such an approach is doomed to fail. Even the wrong approach toward customers can cause customers to lose interest in the consulting a bank offers, for example if the bank focuses on acquiring new customers instead of consistently using its existing relationships. All in all, consulting is at the moment mostly a matter of appearance and is rarely actually lived and practiced in banks.

To move successfully toward customer-oriented consulting, prerequisites have to be met in several areas. In addition to organization and processes, consulting software is an important aspect of this change. Concentrating efforts on a single area, organization or software, will generally not achieve the desired effect. Since the two fields interact strongly, the success of this movement can only be ensured when both are addressed. Driving this movement forward is in the bank’s interest as well as the customers’.

The remainder of the chapter discusses two central requirements for customer-oriented consulting. First, it highlights the methods available for enhancing the functional quality of consulting services. Then it introduces the structures that modern consulting software should provide, especially to enable quick reactions to changing market requirements.

9.2

Qualitative Customer Service Through Application of Portfolio Selection Theory

One focus of high-quality customer support is the analysis and derivation of a customer-specific recommendation. Customers want to be acknowledged as individuals regardless of their financial situation, and do not want impersonal support. Recommendations therefore deserve special attention when providing support. However, the reality is quite different. At the moment, two approaches dominate consulting:

• The standard recommendation: The institute defines a recommendation for a certain type of customer. During the consultation, the consultant assigns the customer to one of the customer types and uses the recommendation defined for that type. This allows specialists in the back office to optimize the recommendation ahead of time using optimization tools. However, such recommendations do not necessarily address the customer’s specific situation.

• The subjective recommendation: During the consultation, the consultant puts together an individual recommendation for the customer from the bank’s product portfolio. Even though the customer’s situation is given special attention, the consultant cannot really optimize the recommendation because of the complex interactions between the individual investment instruments.

Both approaches have advantages and disadvantages, yet neither really works in the customer’s best interest. Better suited to customer service are special consulting tools of the kind already used in the back office for optimizing recommendations. In terms of quality, they should provide the same benefits as in the back office, but should be easier to use.

9.2.1

Markowitz’ Portfolio Selection Theory

One of the most widespread optimization methods for determining the best asset structure is Markowitz’ portfolio selection theory. The optimization is based on three parameters:

• Return, meaning the performance of an investment.
• Risk, meaning the range within which the return of an investment fluctuates around its expected value.
• Correlation, meaning the interdependency between the performance of two investments.

Whereas return and risk are common terms for investors, the term “correlation” is not widely used in consulting. In portfolio selection theory, however, it plays a central role. The correlation describes the connection between two investment instruments. For example, when the development of the DAX is compared to that of the Dow Jones, a tendency toward a similar development can be seen: if the DAX goes up, the Dow Jones tends to go up as well, and vice versa. This is called a positive correlation. If, however, the DAX is compared to the development of the price of gold, a tendency toward opposite developments can be seen. This is called a negative correlation.

Each investment instrument is described using these three parameters. Risk and return are visualized in a risk-return diagram. The correlation does not come into play until two or more investment instruments are combined. For example, if two investments, let’s call them A and B, were purchased for the same amount, one would intuitively expect the resulting risk-return point to lie exactly in the middle between the two investments. However, this is generally not the case. Because of the correlation, i.e. the interdependency of the two investments, the risk of the combination is generally lower than the average of the two individual risks, and it can even be lower than the risk of either investment on its own. Thus, diversification, i.e. the distribution of the assets across different investment instruments, is very important.
Diversification is not only used to place the risk on numerous “shoulders”, but more specifically to reduce the overall risk. The central message of portfolio selection theory, for which Harry M. Markowitz and colleagues received the Nobel prize in 1990, is based on the recognition that all possible combinations of a number of investment instruments form a contiguous area in the risk-return diagram. To select the best portfolio, one only has to consider those combinations whose risk-return point lies on the upper edge of this area. All other portfolios are inefficient, meaning that other combinations can be found which provide a higher return at the same risk or a lower risk at the same return. For asset consulting this means that the consultant does not have to consider all possible combinations of investment instruments, but can concentrate on those which lie on this edge, the so-called efficiency curve. The selection of the best portfolio for an individual customer situation is thus based on four possibilities:

• The minimum risk portfolio is the portfolio with the lowest risk of all possible combinations. It is especially suited for customers who want the lowest risk possible.
• The optimal risk portfolio is the portfolio with the lowest risk for a given return. It is especially suited for customers who have a certain target return.
• The optimal return portfolio is the portfolio with the greatest return for a given risk. It is especially suited for customers who do not want to exceed a given risk.
• The portfolio with the maximum risk premium is especially suited for customers who want to invest economically and who do not define a target risk or return, but instead want the portfolio that delivers the highest return above the current market interest rate in relation to the risk taken.

Using these four portfolios, an optimal portfolio can be selected for every customer situation in the field of asset consulting. No other combination represents a sensible alternative for a customer.
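The four selection rules above can be illustrated with a minimal sketch for just two asset classes. All input values (returns, risks, correlation, risk-free rate) and all function names are hypothetical and chosen only for illustration; they are not taken from the chapter’s examples. The risk of a mix is computed with the standard two-asset formula for the standard deviation of a portfolio, and the sketch simply scans all mixes in 1% steps instead of using a dedicated optimization algorithm.

```python
import math

# Hypothetical inputs (assumptions, not data from the chapter):
# expected return and risk (standard deviation) of asset classes A and B,
# their correlation, and a risk-free market interest rate.
RET_A, RISK_A = 0.08, 0.20
RET_B, RISK_B = 0.05, 0.10
CORR = -0.2
RISK_FREE = 0.02


def portfolio(w_a):
    """Return (expected return, risk) for weight w_a in A and 1 - w_a in B."""
    w_b = 1.0 - w_a
    ret = w_a * RET_A + w_b * RET_B
    var = ((w_a * RISK_A) ** 2 + (w_b * RISK_B) ** 2
           + 2 * w_a * w_b * CORR * RISK_A * RISK_B)
    return ret, math.sqrt(var)


# All long-only mixes in 1% steps, stored as (weight in A, return, risk).
grid = [(w / 100, *portfolio(w / 100)) for w in range(101)]


def minimum_risk():
    """Mix with the lowest risk of all combinations."""
    return min(grid, key=lambda p: p[2])


def optimal_risk(target_return):
    """Lowest-risk mix among those that reach the target return."""
    candidates = [p for p in grid if p[1] >= target_return - 1e-12]
    return min(candidates, key=lambda p: p[2])


def optimal_return(target_risk):
    """Highest-return mix among those that do not exceed the target risk."""
    candidates = [p for p in grid if p[2] <= target_risk + 1e-12]
    return max(candidates, key=lambda p: p[1])


def maximum_risk_premium():
    """Mix with the highest return above the risk-free rate per unit of risk."""
    return max(grid, key=lambda p: (p[1] - RISK_FREE) / p[2])


for label, mix in [
    ("minimum risk", minimum_risk()),
    ("optimal risk (target return 6.5%)", optimal_risk(0.065)),
    ("optimal return (target risk 15%)", optimal_return(0.15)),
    ("maximum risk premium", maximum_risk_premium()),
]:
    w_a, ret, risk = mix
    print(f"{label}: {w_a:.0%} in A, return {ret:.2%}, risk {risk:.2%}")
```

With the assumed negative correlation, the minimum risk mix has a lower risk than either asset class alone, which is the diversification effect described above.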

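Figure 9.1. Selection of the optimal portfolio (risk-return diagram with risk on the horizontal and return on the vertical axis; it marks the minimum risk portfolio, the optimal risk portfolio for a target return of 7.00%, the optimal return portfolio for a target risk of 21.20%, and the portfolio with the maximum risk premium)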

9.2.2

Application to Real-World Business

How these insights are used can be shown through a few simple examples. In these examples it is important to remember that the aim of asset consulting is the strategic selection of investment instruments. Thus, the consulting is done at the level of asset classes, such as European stocks or Eurozone bonds, and not at the level of specific securities. Relatively few and relatively general asset classes were selected for the examples. In actual use, however, any number of asset classes can be used, at any level of detail.

The starting point is a customer with the asset structure shown in Table 9.1. The customer’s goal is to reduce his risk. If possible, he would still like to earn the same return as before. To keep the example simple, we assume that the institute does not offer any other asset classes.

First, the customer is shown the positions of the individual asset classes and of his own assets in the risk-return diagram. The numbers shown in Figure 9.2 are based on benchmarks of the last 10 years. A return of 6.6% with a risk of 12.4% is calculated for the customer’s assets. Next, the efficiency curve is calculated and displayed. It can be seen immediately that the customer can clearly improve his situation by selecting the optimal risk portfolio (cf. Figure 9.2). If the return of 6.6% is retained, a shift can reduce the risk by more than half (to 6.0%). The structure of the corresponding portfolio is listed in Table 9.2.

Due to space limitations, we cannot describe the results for all portfolios on the efficiency curve. However, an analysis of their composition shows that in the lower risk range, with a return between 4% and 6.5%, the combination is mostly determined by the ratio of public real estate funds to bonds. In the upper risk range, with a return between 6.5% and 8%, the ratio of real estate funds to stocks is the determining factor.
Even though public real estate funds have a risk-return profile similar to that of bonds (cf. Figure 9.2), they are a sensible addition to any portfolio because of their correlation with the other asset classes. An analysis of the optimal portfolios on the efficiency curve also shows that not all eight asset classes contribute equally to the result. Limiting the consulting to the three investment instruments European stocks, Eurozone bonds and public real estate funds reduces the complexity of the consulting without a significant loss of quality for the customer. This is illustrated by comparing the efficiency curves for all eight asset classes and for the three selected ones (cf. Figure 9.3). Similar results apply to product portfolios with a noticeably higher number of products. Although 30 or more asset classes are generally offered in consulting, fewer than ten are usually enough for high-quality customer support. All other asset classes unnecessarily increase the complexity for the consultant as well as the customer. Strong consulting concentrates on the relevant products and asset classes.

Figure 9.2. Efficiency curves in the risk-return diagram (risk on the horizontal axis from 0% to 30%, return on the vertical axis from 0% to 10%; plotted: USA stocks, European stocks, world stocks, Asian stocks, public real estate funds, international bonds, corporate bonds and Eurozone bonds, together with the original structure and the recommended structure)

Table 9.1. Original asset structure

Asset class                 Amount
European stocks             10,000 EUR
USA stocks                   5,000 EUR
Asian stocks                10,000 EUR
World stocks                20,000 EUR
Eurozone bonds               5,000 EUR
International bonds         10,000 EUR
Corporate bonds              5,000 EUR
Public real estate funds     5,000 EUR

Table 9.2. Optimized asset structure

Asset class                 Amount
European stocks              9,760 EUR
USA stocks                  13,680 EUR
Public real estate funds    46,560 EUR
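The figures of the optimized structure can be checked with a few lines of arithmetic. The following sketch only uses the amounts from Table 9.2 and the risk values quoted in the text (12.4% before, 6.0% after); the variable names are illustrative.

```python
# Amounts from Table 9.2 (EUR).
optimized = {
    "European stocks": 9_760,
    "USA stocks": 13_680,
    "Public real estate funds": 46_560,
}

total = sum(optimized.values())
# Share of each asset class in the optimized portfolio.
weights = {name: amount / total for name, amount in optimized.items()}

# Risk of the original and the optimized structure as quoted in the text.
risk_before, risk_after = 0.124, 0.060
risk_reduction = (risk_before - risk_after) / risk_before

for name, weight in weights.items():
    print(f"{name}: {weight:.1%}")
print(f"Total: {total:,} EUR, relative risk reduction: {risk_reduction:.1%}")
```

The total equals the 70,000 EUR of the original structure in Table 9.1, so the recommendation is a pure reallocation of the existing assets.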

Figure 9.3. Comparison of the efficiency curves (risk on the horizontal axis from 0% to 30%, return on the vertical axis from 0% to 10%; curves shown for eight asset classes and for three asset classes)

9.2.3

Consideration of Existing Assets

During an interview, customers often decide to restructure only part of their assets and not their entire portfolio. The correct approach in this kind of situation will be explained using the data from the previous example. The customer has received 70,000 EUR from an inheritance and would now like to invest it. However, the existing assets totaling 70,000 EUR (see Table 9.1) should not be touched. The target return is 7%.

Consultant A comes up with the following solution: Since the existing 70,000 EUR will not be restructured, he simply ignores them in his suggestion. His task is then the same as in the first example: 70,000 EUR are to be distributed among the eight available asset classes. He finds the optimal risk portfolio with a return of 7% on the efficiency curve and recommends the following investments: 12,177 EUR in European stocks, 25,867 EUR in USA stocks and 31,956 EUR in public real estate funds.

Consultant B, however, thinks a little further ahead. He knows that even if the original assets are not restructured, they still have a significant influence on the overall result, both through their own return and through their correlation with the newly added investments. With consultant A’s recommendation, the existing assets (return 6.6%) and the new investment (return 7%) together yield a return of only 6.8% on the overall assets of 140,000 EUR. Consultant A would therefore not achieve the customer’s specified goal. Consultant B determines the correct efficiency curve for the total assets of 140,000 EUR, taking into account that 70,000 EUR are already allocated to fixed asset classes and 70,000 EUR can be freely distributed. To achieve the specified target of 7% while minimizing the risk, the customer would have to invest 48,000 EUR in USA stocks and 22,000 EUR in public real estate funds.
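The return side of this example can be retraced with simple value-weighted averages. The sketch below uses only figures from the text (70,000 EUR existing assets with a return of 6.6%, 70,000 EUR new money, target 7%); the function and variable names are illustrative. It covers the return only: the risk of the total assets also depends on the correlations, which is why consultant B recalculates the efficiency curve for the full 140,000 EUR.

```python
existing_assets = 70_000   # EUR, not to be restructured (Table 9.1)
new_money = 70_000         # EUR from the inheritance
existing_return = 0.066    # return of the existing structure (from the text)
target_return = 0.07       # customer's target


def blended_return(amount_a, return_a, amount_b, return_b):
    """Value-weighted return of two blocks of assets."""
    return (amount_a * return_a + amount_b * return_b) / (amount_a + amount_b)


# Consultant A: the new money alone is optimized for a return of 7%.
overall_a = blended_return(existing_assets, existing_return, new_money, 0.07)

# Consultant B: the new money has to lift the total assets to 7%.
total = existing_assets + new_money
required_new_return = (
    target_return * total - existing_return * existing_assets
) / new_money

print(f"Overall return with A's proposal: {overall_a:.1%}")
print(f"Return the new money must earn:   {required_new_return:.1%}")
```

On these figures, A’s proposal yields the 6.8% mentioned in the text, and the new money would have to earn 7.4% for the total assets to reach the 7% target.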

The results of this example also apply to consulting with a greater number of asset classes. A new amount to be invested should never be considered separately from the overall assets. The existing assets have a significant influence on the result, even if they themselves are not changed. The “wrong” result not only falls short of the customer’s specifications, but is inefficient as well, meaning that a better combination exists which offers a lower risk at the same return. Even the number of recommended asset classes is visibly higher with the “wrong” recommendation. If the existing assets are correctly integrated into the consulting, the recommendation is automatically reduced to the central investment instruments, since the existing assets already provide a basic spread of risk.

9.2.4

Illiquid Asset Classes and Transaction Costs

The previous examples focus on liquid asset classes alone. Just as existing security assets have to be integrated into the analysis, so do illiquid asset classes, such as real estate or equity investments. In principle they are easy to handle: in general they are treated as a fixed block during the consulting, but they still influence the total risk and the total return on the customer’s assets, usually significantly.

Transaction costs can be disregarded for liquid asset classes without a major influence on the result, since they are generally minor and are the same for all liquid asset classes. Studies have shown that transaction costs no longer play a role for liquid asset classes when the investment volume is 10,000 EUR or more. However, transaction costs cannot be ignored when dealing with illiquid asset classes. Here they can be considerable, in which case they have a significant influence on the result. Thus, in order not to exclude illiquid asset classes from the recommendation, a planning horizon has to be defined for the optimization. However, Markowitz’ original portfolio theory and the related calculation algorithms fail when it comes to transaction costs and multi-period planning horizons. Alternative approaches exist which solve the problem using modern algorithms. These algorithms do not perform as well as conventional algorithms, which means they are only partially suitable for use in customer service.

The possible uses of these algorithms will be explained in the following example. A customer has the asset structure shown in Table 9.3. The variable transaction costs are 1.0% for the asset class “money market”, 1.5% for “securities”, 3.5% for “real estate” and 5% for “closed participations” (listed as equity investments in Table 9.3). Savings deposits are not subject to transaction costs. Transaction costs are added to the purchase price when buying and subtracted from the proceeds when selling.
The customer’s goal is an optimal return portfolio, meaning he wants to retain his current level of risk but maximize his return over a period of 5 years. The best restructuring recommendation for this customer is shown in Table 9.4.

Table 9.3. Asset structure before optimization

Asset class             Financial instrument    Amount
Liquidity               Savings deposits        15,000 EUR
Liquidity               Money market            75,000 EUR
Securities stocks       European stocks         40,000 EUR
Securities stocks       USA stocks              45,000 EUR
Securities annuities    Eurozone bonds          15,000 EUR
Securities annuities    International bonds     45,000 EUR
Illiquid bonds          Real estate             80,000 EUR
Illiquid bonds          Equity investments      25,000 EUR

Table 9.4. Optimized asset allocation under consideration of transaction costs

Asset class             Financial instrument    Amount
Liquidity               Savings deposits          8,697 EUR
Liquidity               Money market                  0 EUR
Securities stocks       European stocks               0 EUR
Securities stocks       USA stocks               54,400 EUR
Securities annuities    Eurozone bonds                0 EUR
Securities annuities    International bonds     190,400 EUR
Illiquid bonds          Real estate              54,400 EUR
Illiquid bonds          Equity investments       27,200 EUR

These shifts create transaction costs of 4,903 EUR. Nevertheless, after 5 years the customer can count on total assets of 439,479 EUR, compared to 432,570 EUR without the shift, and without an increased risk. Thus, even after transaction costs he increases his assets by almost 7,000 EUR (6,909 EUR). With a planning horizon of 10 years, the customer could even achieve an advantage of over 25,000 EUR.
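The transaction costs of the restructuring can be retraced from Tables 9.3 and 9.4 and the cost rates given in the text. The sketch below rests on two assumptions that the text does not state explicitly: the 1.5% rate for “securities” applies to stocks and bonds alike, and the cost rate is applied to the change in each position. Rates are held in basis points to keep the arithmetic exact; all names are illustrative.

```python
# (amount before, amount after, transaction cost rate in basis points)
# Amounts from Tables 9.3 and 9.4, rates from the text.
positions = {
    "Savings deposits":    (15_000,   8_697,   0),
    "Money market":        (75_000,       0, 100),
    "European stocks":     (40_000,       0, 150),
    "USA stocks":          (45_000,  54_400, 150),
    "Eurozone bonds":      (15_000,       0, 150),
    "International bonds": (45_000, 190_400, 150),
    "Real estate":         (80_000,  54_400, 350),
    "Equity investments":  (25_000,  27_200, 500),
}


def transaction_cost(before, after, rate_bp):
    """Cost of moving one position from 'before' to 'after' (assumption:
    the rate applies to the traded amount, whether bought or sold)."""
    return abs(after - before) * rate_bp / 10_000


total_cost = sum(transaction_cost(*p) for p in positions.values())
assets_before = sum(before for before, _, _ in positions.values())
assets_after = sum(after for _, after, _ in positions.values())

# Projected total assets after 5 years as quoted in the text.
advantage_after_5_years = 439_479 - 432_570

print(f"Transaction costs: {total_cost:,.0f} EUR")
print(f"Assets before/after the shift: {assets_before:,} / {assets_after:,} EUR")
print(f"Advantage after 5 years: {advantage_after_5_years:,} EUR")
```

Under these assumptions the sum reproduces the 4,903 EUR quoted in the text, and the same amount is the difference between the totals of Table 9.3 and Table 9.4: the costs are paid out of the savings deposits.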

9.2.5

Conclusions for Daily Use

Markowitz’ portfolio selection theory is a model for optimizing the asset situation. An optimal portfolio combination is created based on the parameters return, risk and correlation. The existing high-performance algorithms can be used in customer service as well. Thus, portfolio selection theory is a sensible basis for neutral and individual customer support.

Consequences for the consulting can also be derived from the application of portfolio selection theory. Customer consulting should not be based on specific products, but on general asset classes. In addition, the underlying asset classes can be reduced to the central classes without a significant disadvantage for the customers. Reducing the offered product range to central asset classes is also important with regard to transaction costs. The consulting must take the existing assets into account, otherwise it will produce incorrect results.

Since the consulting is performed at the asset class level and not for specific products, very little data is required for the actual application in customer support. The values for return, risk and correlation form the basis for the results. The back office should provide these values and update them regularly, usually every quarter. Portfolio selection theory covers only one aspect of the consulting, meaning that consultants should not use it as an independent tool, but should integrate it into their consulting software. These requirements can easily be covered by web-based applications. The data is stored and the calculations are performed on a high-performance server, whereas the results are displayed on the consultant’s local computer. Web-based applications also offer the flexibility of integrating the portfolio selection functionality into an existing application as a specialized calculation core, or of using it as a comprehensive, specialized consulting solution.

However, portfolio selection theory has also faced criticism. Many studies have tried to disprove or confirm the assumptions of the theory. Without going into detail, the following can be said: A theory does not depict reality, it only offers an approach for better understanding reality. Portfolio selection theory fulfills that purpose. It should not be overrated or used as the only basis for consulting. Rather, it is a useful approach for neutral and objective customer support.

9.3

Support of Customer-Oriented Consulting Using Modern Consulting Software

As explained at the start of this chapter, consulting rests on two pillars: the functional content and the corresponding technical support by suitable consulting software. Together they form an inseparable component of customer-oriented consulting. This second part of the chapter describes the software-related aspects of customer-oriented consulting. It deals with the fundamental architecture of such software and introduces possibilities for its targeted use, including in mobile sales channels.

9.3.1

Architecture of a Customer-Oriented Consulting Software

Until recently, comprehensive financial planning solutions, which consider all aspects of a customer’s situation at once, were regarded as the be-all and end-all of consulting. However, actual use has shown that comprehensive financial planning is too complex to be used efficiently in all customer segments. The effort and the resulting costs are not justified for many customers.

Moreover, comprehensive financial planning consulting is inappropriate in the larger segments, i.e. for retail banking and private banking customers. Recently, the trend has been moving toward topic-oriented systems, which achieve comprehensive consulting by other means. A strong consulting solution meets several requirements:

• It must consider the institute’s wide range of customers and must be able to flexibly address individual consulting situations.
• At the same time, it must enable neutral, high-quality consulting without neglecting product sales.
• It must strike a balance between a uniform approach and topic-oriented consulting.
• In addition, it must be usable in all sales channels.

The specific structure of such a consulting solution can be seen in the following figure.

Figure 9.4. Example of the architecture of a customer-oriented consulting software (components shown: the consulting levels Self Assessment, Quick Analysis and Complete Analysis; Consulting Modules A to D; the Central Database; and the connected systems CRM/Workflow, Back Office and Market Data)

Modern consulting software allows flexible adjustment to individual consulting situations. Each customer inquiry is processed in a separate module. These inquiries may address topics such as pension schemes, construction financing, asset analyses or asset restructuring. This reduces the complexity of the consulting, since only the data relevant to a specific topic has to be entered. However, isolated data silos must be avoided in order to preserve the holistic approach of the consulting. The entered data must be stored centrally. Thus, all consulting modules have to be based on the same database, so that data such as income, asset values, marital status, etc. only has to be entered once, after which it is available in all modules. Each additional consultation increases the amount of data, so that consultants continually gain more information about their customers. The consulting time and effort are still reduced, since only topic-relevant data is used in each module. Thus, the focus remains on the customer and efficient consulting is ensured.

Not all consulting has to take place at the same level. Modern consulting software offers the possibility of addressing customers individually by selecting a consulting level. The lowest consulting level could be a short analysis that offers easy access to complex topics and requires only four or five entries. The next level could, for example, be a first qualified recommendation obtained through a quick analysis. Here, even a small amount of input data can provide an overview of the consulting situation. Tax-related effects are also presented clearly. If the consultant realizes that there is a greater need for consulting, the consulting can be intensified at any time, up to a complete, comprehensive collection of data on the customer’s overall situation at the top consulting level. When different consulting levels are used, it is important that the same module is active in the background regardless of the level. This allows all customers to be treated with the same quality, and with a justifiable investment of time.

The central component of modern consulting software is an integrated action and product recommendation, which generates a customer-specific, high-quality recommendation independent of the consultant. Different recommendation logics can be used for this: static decision trees, optimization algorithms (such as Markowitz’ portfolio selection model described above), fuzzy algorithms or knowledge-based systems.
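The principle that customer data is entered once and is then available in every module can be sketched as follows. This is only an illustration under simple assumptions, not a description of any particular product: the class names, field names and the customer identifier are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CentralDatabase:
    """Single store for customer data shared by all consulting modules."""
    records: dict = field(default_factory=dict)

    def missing(self, customer_id, required_fields):
        """Fields a module needs that have not been entered yet."""
        known = self.records.get(customer_id, {})
        return [name for name in required_fields if name not in known]

    def store(self, customer_id, **data):
        """Add or update data for a customer."""
        self.records.setdefault(customer_id, {}).update(data)


@dataclass
class ConsultingModule:
    """Topic-oriented module that only asks for topic-relevant data."""
    name: str
    required_fields: tuple

    def fields_to_ask(self, db, customer_id):
        return db.missing(customer_id, self.required_fields)


db = CentralDatabase()
pension = ConsultingModule("pension scheme", ("income", "marital_status", "age"))
financing = ConsultingModule("construction financing", ("income", "asset_values"))

# First consultation: nothing is known yet, so all pension fields are asked.
first = pension.fields_to_ask(db, "customer-001")
db.store("customer-001", income=52_000, marital_status="married", age=41)

# Later consultation on another topic: income is already available.
second = financing.fields_to_ask(db, "customer-001")
print(first, second)
```

Because both modules read from the same store, the second consultation only has to collect the asset values, while the stock of customer data grows with every topic.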
To convince the customer that the recommendation is right, the recommendation should be derived in three steps:

• Neutral structure recommendation based on asset classes.
• Conversion proposal based on the existing products.
• Reasoning and presentation of the products.

In contrast to a mere product recommendation, the first step generates a recommendation for asset classes, i.e. for neutral “products”. During an asset analysis, the customer would, for example, receive a recommendation as to which asset classes he should invest in and which he should sell. Specific products from the institute’s product portfolio should not be introduced into the recommendation until the second step. A deliberate transition from a neutral recommendation to specific products is easier for customers to understand. To increase the customer’s trust in the recommendation, the transition is explained to each customer during the third step: the customer is told why each of the proposed products supports his optimal strategy.

Interfaces, for example to the institute’s CRM, to existing portfolio and/or market data, and to the subsequent administration, can be used to ensure that the consulting software is not a standalone system but is efficiently integrated into the institute’s consulting process and workflow.
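The three-step derivation described above can be sketched as a small pipeline. Everything in this sketch is hypothetical: the current holdings, the target shares (which in practice would come from the recommendation logic, for example a portfolio optimization), the product names and the function names are assumptions made for illustration.

```python
def structure_recommendation(current, target_percent):
    """Step 1: neutral recommendation per asset class.
    Positive amounts are purchases, negative amounts are sales."""
    total = sum(current.values())
    classes = sorted(set(current) | set(target_percent))
    return {
        c: total * target_percent.get(c, 0) / 100 - current.get(c, 0)
        for c in classes
    }


def conversion_proposal(changes, catalog):
    """Step 2: translate asset-class changes into the institute's products."""
    return [(catalog[c], amount) for c, amount in changes.items() if amount != 0]


def reasoning(changes, catalog):
    """Step 3: one explanatory sentence per proposed product."""
    lines = []
    for asset_class, amount in changes.items():
        if amount == 0:
            continue
        action = "buy" if amount > 0 else "sell"
        lines.append(
            f"{action} {abs(amount):,.0f} EUR of {catalog[asset_class]} "
            f"to bring '{asset_class}' to its target share"
        )
    return lines


# Hypothetical customer situation and target structure (shares in percent).
current = {"European stocks": 30_000, "Eurozone bonds": 40_000}
target_percent = {
    "European stocks": 20,
    "Eurozone bonds": 30,
    "Public real estate funds": 50,
}
# Hypothetical product catalog of the institute.
catalog = {
    "European stocks": "Example Europe Equity Fund",
    "Eurozone bonds": "Example Euro Bond Fund",
    "Public real estate funds": "Example Real Estate Fund",
}

changes = structure_recommendation(current, target_percent)
proposal = conversion_proposal(changes, catalog)
for line in reasoning(changes, catalog):
    print(line)
```

Keeping the first step free of product names makes the transition from the neutral structure to concrete products an explicit, explainable step, as the text recommends.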

9.3.2

Mobile Consulting as a Result of Customer-Orientation

Customer-oriented consulting must focus on the customer’s goals. This means that consulting has to be available not only during regular banking hours in an office, but also, for example, at the customer’s home. Such services have become more important not only for wealthy private customers, but also for “normal” bank customers. More and more banks already take advantage of this possibility or are planning to do so. However, a frequent mistake is to give customers a different consulting service at home than in the bank’s offices. For cost reasons alone, consultations in a customer’s home should not be treated as small-talk sessions or as a way to meet informational needs, but as full-value consultations offering additional sales opportunities. Solid consulting must offer the customer consistent, high-quality support regardless of the location.

The consulting software used must be suitable for mobile as well as stationary sales. However, this is seldom the case. Even consulting software which has been installed on a laptop, and can thus be used for mobile consulting, confronts the financial institute with high financial and organizational hurdles. The core problem in such cases is data storage. Customer data must be exchanged between the consultant’s laptop and the central solution. This requires a high replication effort, especially when the central solution has been integrated into the institute’s in-force business system or the CRM. Thus, many institutes decide not to use a central consulting solution, but to work with the solution installed on the consultant’s laptop both in the office and at the customer’s location. This means that the customer data is not stored centrally, but as isolated data on the consultant’s laptop. This, in turn, prevents consistent implementation of the sales strategy.
Regardless of whether data is exchanged between mobile and stationary solutions or is only stored in the mobile solution, the institute will always have to perform organizational tasks. Software updates, market and product data, as well as recommendation data have to be distributed to the consultants’ laptops on a regular basis. Additional hidden costs arise from the increased maintenance and hardware requirements of the consultants’ laptops; these are often ignored when calculating the profitability of implementing a consulting solution. The use of conventional consulting solutions in stationary and mobile business thus presents the institute with technical, organizational and financial challenges and makes the consultant’s daily work with the consulting solution more difficult.

9.3.2.1 Web-Based Consulting Software as a Basis for Stationary and Mobile Sales

Web-based software offers many advantages for mobile and stationary sales. The software is installed on a central server and the consultants can access it using

Alexander Schöne

a common internet browser, such as Microsoft Internet Explorer or Netscape Navigator. It does not matter whether the server is located in an external computer center, the company headquarters or a branch office. Consultants can access the solution from any workplace using the browser, without having to install additional software. Access is possible from the intranet or from an external, secured network. When at the customer’s location, consultants can use, for example, a GPRS, UMTS or WLAN connection to access the consulting software from their laptops. In doing so, they always have access to the latest data. Technically, this type of connection is similar to an internet connection using a dial-up telephone line. In contrast to common telephone connections, however, the duration of the connection is irrelevant: the user only pays for the amount of data transferred. Regardless of whether the consultation lasts 30 minutes or 3 hours, the decisive factor for the costs is the amount of data transferred during the meeting.

The advantages of mobile access are obvious. The requirements placed on the laptop are minimal, since only an internet browser needs to be installed. Moreover, the institute does not incur any costs for maintaining a separate mobile consulting solution. Consultants always have access to the latest customer, market and product data. The interfaces available in the consulting software, such as those to CRM or market data systems, can be used in mobile consulting as well, without having to install additional programs on the consultant’s laptop. These advantages are often weighed against the expectation of high costs and the fear of security problems. However, both concerns can be resolved in practice.

9.3.2.2 Minimal Additional Costs in Mobile Consulting

Modern consulting software focuses on keeping the required data transfers as low as possible.
For example, graphics are no longer generated and transferred in a standard, resource-intensive graphic format, but as so-called vector graphics. These offer the advantage that they can be scaled as needed without loss of quality and require only a minimum of memory. The majority of screens in consulting are text-oriented, which significantly reduces the data volumes that have to be transferred. Another advantage of web-based consulting software lies in the fact that only the results have to be transferred from the server to the consultant’s laptop, regardless of the complexity of the calculations or the quantity of required data. It does not matter whether a specific analysis is performed for a retail customer with 10 asset values or a wealthy private customer with 200 asset values. All necessary data is located on the central server, where the calculations are performed and the concluding page with the results is generated. Only the relatively small presentation of the results is transferred. Thus, the performance of the mobile consulting solution no longer depends on the laptop’s hardware, but on the bandwidth of the transmission. This means that the speed of mobile consulting is comparable to the speed of stationary consulting.
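The point that only results travel over the mobile link can be illustrated with a minimal Python sketch. It is not taken from any actual consulting product: the function `summarize_portfolio`, the field names and the generated test positions are all hypothetical. The sketch aggregates a portfolio on the "server" side and measures the size of the serialized result, which stays small whether the portfolio holds 10 or 200 positions.

```python
import json

def summarize_portfolio(positions):
    """Server-side step: reduce any number of positions to a small summary."""
    total = sum(p["value"] for p in positions)
    by_class = {}
    for p in positions:
        by_class[p["asset_class"]] = by_class.get(p["asset_class"], 0.0) + p["value"]
    # Only aggregated shares leave the server, not the individual positions.
    shares = {k: round(v / total, 4) for k, v in sorted(by_class.items())}
    return {"total_value": round(total, 2), "shares": shares}

def make_positions(n):
    """Hypothetical test data: n positions spread over three asset classes."""
    classes = ["bonds", "equities", "cash"]
    return [{"asset_class": classes[i % 3], "value": 1000.0 + i} for i in range(n)]

def payload_bytes(obj):
    """Size of the JSON-serialized object in bytes."""
    return len(json.dumps(obj).encode("utf-8"))

retail = make_positions(10)     # retail customer with 10 asset values
wealthy = make_positions(200)   # wealthy customer with 200 asset values

retail_result = summarize_portfolio(retail)
wealthy_result = summarize_portfolio(wealthy)

print(payload_bytes(wealthy), "bytes of input data stay on the server")
print(payload_bytes(wealthy_result), "bytes of result are transferred")
```

In this sketch the input grows with the number of positions, while the result payload is bounded by the number of asset classes shown on the result page.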

Table 9.5. Data volumes and transmission costs for mobile consulting

                                                          Simple Support    Comprehensive Support
Data volumes (compressed, without consulting synopsis)    450 Kbytes        1,200 Kbytes
Cost                                                      0.74 EUR          1.98 EUR
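As a rough plausibility check on Table 9.5, the following Python sketch derives the per-Kbyte rate implied by the two table rows and recomputes the costs from it. The single volume-based rate of 0.00165 EUR per Kbyte is an assumption inferred from the table, not a tariff stated in the text, and the helper names are hypothetical.

```python
# Figures from Table 9.5: (compressed volume in Kbytes, cost in EUR).
TABLE_9_5 = {
    "simple": (450, 0.74),
    "comprehensive": (1200, 1.98),
}

def implied_rate_per_kbyte(volume_kb, cost_eur):
    """Cost per Kbyte implied by one table row."""
    return cost_eur / volume_kb

def transmission_cost(volume_kb, rate_eur_per_kb):
    """Volume-based cost in EUR, rounded to cents."""
    return round(volume_kb * rate_eur_per_kb, 2)

rates = {name: implied_rate_per_kbyte(v, c) for name, (v, c) in TABLE_9_5.items()}

# Assumption: one flat rate per Kbyte; this value reproduces both table rows.
ASSUMED_RATE = 0.00165

for name, (volume_kb, cost_eur) in TABLE_9_5.items():
    print(name, round(rates[name], 5), transmission_cost(volume_kb, ASSUMED_RATE))
```

Both rows are consistent with the same rate to within rounding, which supports reading the costs as purely volume-dependent.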

A typical page from standard, web-based consulting software requires an average data volume of approximately 40 Kbytes. By compressing the transmission and storing transferred contents in a local proxy, this can generally be reduced to 5 to 10 Kbytes. The number of pages opened during a consultation depends mostly on the complexity of the consulting. Simple consulting generally requires no more than 40 pages, whereas comprehensive consulting for a wealthy customer can require 150 pages or more. Table 9.5 shows the average values determined for the total data volume transferred during actual consultations and the resulting costs. The costs vary depending on the provider and the type of contract which has been signed (see footnote 1). As can be seen, an average of one to two euros is accrued per consultation. Compared to the consultant’s personnel costs, this is only a fraction of the total costs. When compared to the alternative costs, for example for higher maintenance and better laptops if solutions have to be installed on each consultant’s laptop, the advantages of mobile consulting using a Web-based solution are more than obvious.

9.3.2.3 Highest Data Security in Mobile Consulting

The security of mobile consulting is a core concern. A distinction has to be made between authentication and authorization in the consulting software on the one hand, and the security (encryption) of the data transmission between the terminal and the server on the other. Web-based consulting software requires the client to have a Web browser with an IP connection to the consulting server. Regardless of which mobile communication standard is used, e.g. GSM, GPRS, UMTS or WLAN, the internet is generally used as an open network between the mobile network operator and the server. Figure 9.5 shows the principle of a connection by GPRS (General Packet Radio Service) from a cell phone.
To counteract any electronic eavesdropping within the involved networks, a secured point-to-point connection is formed between the mobile terminal (e.g. the consultant’s laptop) and the server. This can be achieved either by setting up a VPN (Virtual Private Network) or an SSL connection between the browser and the Web server. According to the current state of technology, these connections are secure against eavesdropping.

1 The calculation is based on the business rates offered by a major German cellular network provider (at the beginning of 2006). The rate is suitable for consultants with five to ten mobile consultations a month. Lower rates are available for a greater number of mobile consultations, which would reduce the costs by approximately 30%.

[Figure 9.5. Mobile consulting with laptop and cell phone. Diagram labels: Mobile Consulting, Stationary Consulting, Firewall, Central Server.]

With this approach, there is no difference in the security of transmission and authentication between a mobile terminal connection and a terminal that accesses the server directly via the internet. Unauthorized persons can neither eavesdrop on the customer data during the consultation, nor can they access the consulting solution. Thus, Web-based consulting solutions offer the same service scope in stationary and mobile use with low costs and little organizational effort. Mobile consulting offers the same data security as consultations within a company’s offices. This means that a Web-based consulting solution is well suited for all financial institutes that would like to offer stationary and/or mobile consulting.
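The SSL variant of the secured connection can be sketched on the client side with Python’s standard library. This is only an illustration under assumptions: the `ssl` module implements TLS, the successor of SSL; the minimum protocol version is our choice, not something the text prescribes; and the host name used in the usage note is hypothetical. Authentication and authorization within the consulting software itself are a separate concern and are not shown.

```python
import http.client
import ssl

def make_client_context():
    """Build a client-side TLS context that verifies the server certificate."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    # Assumption: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

def fetch_page(host, path, ctx):
    """Request one page over the encrypted connection and return its bytes."""
    conn = http.client.HTTPSConnection(host, context=ctx, timeout=10)
    try:
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()

ctx = make_client_context()
```

A call such as `fetch_page("consulting.example.com", "/", ctx)` would then retrieve a page only if the server presents a certificate that is valid for that host name.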

9.4 Conclusion

In this chapter, two central requirements for customer-oriented consulting were discussed. First, the discussion revealed that methods and tools available in back-office solutions can be used in consulting services to enhance quality and service for the customer. Applying the portfolio selection theory of Markowitz enables the consultant to derive an individual proposal for the customer’s needs which is superior to other consulting approaches. Second, it was shown that modern software architecture enables consultants to serve their customers not only in the bank office but also at the customer’s home. Both aspects are key fundamentals of a customer-oriented consulting approach.

References

Elton E, Gruber J (1995) Modern Portfolio Theory and Investment Analysis, John Wiley & Sons, New York
Markowitz H (1952) Portfolio Selection, Journal of Finance 7 (1), pp. 77–91
Markowitz H (1997) Portfolio Selection: Efficient Diversification of Investments, second edition, Blackwell Publishers, Cambridge MA

CHAPTER 10
A Reference Model for Personal Financial Planning
Oliver Braun, Günter Schmidt

10.1 Introduction

We base the discussion of the business requirements on the application architecture LISA, in which four views on models are defined (Schmidt 1999):

1. the granularity of the model, distinguishing between industry model, enterprise model, and detailed model;
2. the elements of the model, distinguishing between data, function, and coordination;
3. the life cycle of modelling, distinguishing between analysis, design, and implementation;
4. the purpose of modelling, distinguishing between problem description and problem solution.

We will present an analysis model as part of the application architecture at the granularity of an industry model. It contains data, function, and coordination models related to the purpose of modelling. The language we use to develop the analysis model is the Unified Modeling Language (UML). An intrinsic part of the proposed system architecture is the use of web technologies, as the global Internet and the World Wide Web are the primary enabling technologies for delivering customized decision support. We refer to the combination of the analysis model and the system architecture as the reference model. We believe that combining analysis model and system architecture could enhance the usability of the reference model.

We understand personal financial planning as the process of meeting life goals through the management of finances (Certified Financial Planner’s (CFP) Board of Standards 2005). Our reference model serves two purposes: first, the analysis model is a conceptual model that can serve financial planners as a decision support tool. Second, system developers can map the analysis model to the system architecture at the design stage of system development. Furthermore, the reference model captures existing knowledge in the field of IT-supported personal financial planning. The model also addresses interoperability through the concept of platform-independent usage of personal financial planning tools.

In Section 10.2 we provide definitions and a short review of the state of the art in the field of reference modelling for personal financial planning research. Furthermore, we outline some features displaying the advantages of the reference model. In Section 10.3 we describe the framework of our reference model, and in Sections 10.4 and 10.5 (parts of) the reference models for the analysis and architecture level are described. We have developed a prototype, FiXplan (IT-based personal financial planning), in order to evaluate the requirements compiled in Section 10.2. Finally, Section 10.6 summarizes the results and discusses future research opportunities within the field of reference models for personal financial planning.

10.2 State of the Art

We will briefly review concepts of personal financial planning in Section 10.2.1, and systems and models in Section 10.2.2. Section 10.2.3 discusses core concepts of reference modelling, and Section 10.2.4 is dedicated to the discussion of requirements for a reference model for personal financial planning.

10.2.1 Personal Financial Planning

The field of personal financial planning is well supplied with textbooks, among them Böckhoff and Stracke (2001), Keown (2003), Nissenbaum et al. (2004), Schmidt (2006) and Woerheide (2002), to name just a few. Most books can be used as guides for handling personal financial problems, e.g. maximizing wealth, achieving various financial goals, determining emergency savings, or maximizing retirement plan contributions. There are also papers in journals ranging from the popular press to academic journals. Braun and Kramer (2004) give a review of software for personal financial planning in German-speaking countries.

We will start our discussion with some definitions related to the world of personal financial planning. The Certified Financial Planner’s (CFP) Board of Standards defines (personal) financial planning as follows:

Definition 10.2.1 (CFP Board of Standards 2005) Financial planning is the process of meeting your life goals through the proper management of your finances. Life goals can include buying a home, saving for your child’s education or planning for retirement. Financial planning provides direction and meaning to your financial decisions. It allows you to understand how each financial decision you make affects other areas of your finances. For example, buying a particular investment product might help you pay off your mortgage faster or it might delay your retirement significantly. By viewing each financial decision as part of a whole, you can consider its short and long-term effects on your life goals. You can also adapt more easily to life changes and feel more secure that your goals are on track.

The draft ISO/DIS 22 222-1 of the Technical Committee ISO/TC 222, Personal financial planning, of the International Organization for Standardization defines personal financial planning as follows:

Definition 10.2.2 (ISO/TC 222 2004) Personal financial planning is an interactive process designed to enable a consumer/client to achieve their personal financial goals.

In the same draft, Personal Financial Planner, consumer, client and financial goal are defined as follows:

Definition 10.2.3 (ISO/TC 222 2004) A Personal Financial Planner is an individual practitioner who provides financial planning services to clients and meets all competence, ethics and experience requirements contained in this standard. A consumer is an individual or a group of individuals, such as a family, who have shared financial interests. A client of a Personal Financial Planner is an individual who has accepted the terms of engagement by entering into a contract of services. A financial goal is a quantifiable outcome aimed to be achieved at some future point in time or over a period of time.

The Certified Financial Planner’s (CFP) Board of Standards and the Technical Committee ISO/TC 222, Personal financial planning, of the International Organization for Standardization define the personal financial planning process as follows:

Definition 10.2.4 (CFP Board of Standards 2005; ISO/TC 222 2004) The personal financial planning process shall include, but is not limited to, six steps that can be repeated throughout the client and financial planner relationship. The client can decide to end the process before having passed all the steps.
The process involves gathering relevant financial information, setting life goals, examining your current financial status and coming up with a strategy or plan for how you can meet your goals given your current situation and future plans. The financial planning process consists of the following six steps:

1. Establishing and defining the client-planner relationship. The financial planner should clearly explain or document the services to be provided to the client and define both his and his client’s responsibilities. The planner should explain fully how he will be paid and by whom. The client and the planner should agree on how long the professional relationship should last and on how decisions will be made.

2. Gathering client data and determining goals and expectations. The financial planner should ask for information about the client’s financial situation. The client and the planner should mutually define the client’s personal and financial goals, understand the client’s time frame for results and discuss, if relevant, how the client feels about risk. The financial planner should gather all the necessary documents before giving the advice the client needs.

3. Analyzing and evaluating the client’s financial status. The financial planner should analyze the client’s information to assess the client’s current situation and determine what the client must do to meet his goals. Depending on what services the client has asked for, this could include analyzing the client’s assets, liabilities and cash flow, current insurance coverage, investments or tax strategies.

4. Developing and presenting financial planning recommendations and/or alternatives. The financial planner should offer financial planning recommendations that address the client’s goals, based on the information the client provides. The planner should go over the recommendations with the client to help the client understand them so that the client can make informed decisions. The planner should also listen to the client’s concerns and revise the recommendations as appropriate.

5. Implementing the financial planning recommendations. The client and the planner should agree on how the recommendations will be carried out. The planner may carry out the recommendations or serve as the client’s “coach,” coordinating the whole process with the client and other professionals such as attorneys or stockbrokers.

6. Monitoring the financial planning recommendations. The client and the planner should agree on who will monitor the client’s progress towards his goals. If the planner is in charge of the process, he should report to the client periodically to review his situation and adjust the recommendations, if needed, as the client’s life changes.
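The six steps, together with the two properties stated in Definition 10.2.4 (steps can be repeated, and the client may end the process early), can be captured in a small Python sketch. The class and member names are hypothetical and are not part of the CFP or ISO documents; the sketch only records which steps have been performed.

```python
from enum import IntEnum

class PlanningStep(IntEnum):
    """The six steps of the personal financial planning process."""
    ESTABLISH_RELATIONSHIP = 1
    GATHER_DATA_AND_GOALS = 2
    ANALYZE_STATUS = 3
    DEVELOP_RECOMMENDATIONS = 4
    IMPLEMENT_RECOMMENDATIONS = 5
    MONITOR_RECOMMENDATIONS = 6

class PlanningProcess:
    """Tracks performed steps; steps may repeat, the client may end early."""

    def __init__(self):
        self.history = []
        self.ended = False

    def perform(self, step):
        if self.ended:
            raise RuntimeError("process already ended by the client")
        self.history.append(PlanningStep(step))

    def end(self):
        """The client decides to end the process, possibly before step 6."""
        self.ended = True

    def remaining(self):
        """Steps that have not been performed at least once."""
        done = set(self.history)
        return [s for s in PlanningStep if s not in done]

process = PlanningProcess()
for step in (1, 2, 3, 2):   # step 2 is repeated after the analysis
    process.perform(step)
print([s.name for s in process.remaining()])
```

Modelling the steps as an ordered enumeration keeps the numbering of the definition, while the history list allows any step to be revisited.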

10.2.2 Systems and Models

Models are simplifications of reality. Models exist at various levels of abstraction. A level of abstraction indicates how far removed from reality a model is. High levels of abstraction represent the most simplified models. Low levels of abstraction correspond closely to the system they are modelling. An exact 1:1 correspondence is no longer a model but rather a transformation of one system to another. Systems are defined as follows:

Definition 10.2.5 (Rumbaugh et al. 2005) A system is a collection of connected units organized to accomplish a purpose. A system can be described by one or more models, possibly from different viewpoints. The complete model describes the whole system.

The personal financial planning process can be supported by financial Decision Support Systems (DSS). In general, a Decision Support System can be described as follows:

Definition 10.2.6 (Turban 1995) A Decision Support System (DSS) is an interactive, flexible, and adaptable computer-based information system, specially developed for supporting the solution of a non-structured management problem for improved decision making. It utilizes data, provides an easy-to-use interface, and allows for the decision maker’s own insights.

Continuous interaction between system and decision makers is important (Briggs 1997). Interactive decision making has been accepted as the most appropriate way to obtain the correct preferences of decision makers (Mathieu and Gibson 1993; Mukherjee 1994). If this interaction is to be supported by a web-based system, then there is a need to manage the related techniques (or models), to support the data needs, and to develop an interface between the users and the system. Advances in web technologies and the emergence of e-business have strongly influenced the design and implementation of financial DSS. Improved global accessibility, in terms of the integration and sharing of information, has made it more convenient to obtain data from the Internet. Growing demand for fast and accurate information sharing increases the need for web-based financial DSS (Chou 1998; Dong et al. 2002; Dong et al. 2004; Fan et al. 1999, 2000; Hirschey et al. 2000; Li 1998; Mehdi and Mohammad 1998).

The application of DSS to some specific financial planning problems is described in several articles. Schmidt and Lahl (1991) discuss a DSS based on expert system technology whose focus is portfolio selection. Samaras et al. (2005) also focus on portfolio selection. They propose a DSS which includes three techniques of investment analysis: Fundamental Analysis, Technical Analysis and Market Psychology. Dong et al. (2004) likewise focus on portfolio selection and report on the implementation of a Web-based DSS for Chinese financial markets. Zahedi and Palma-dos-Reis (1999) describe a personalized intelligent financial DSS. Zopounidis et al. (1997) give a survey on the use of knowledge-based DSS in financial management.
The basic characteristic of such systems is the integration of expert systems technology with models and methods used in the decision support framework. They survey some systems applied to financial analysis, portfolio management, loan analysis, credit granting, assessment of credit risk, and assessment of corporate performance and viability. These systems provide the following features:

• they support all stages of the decision-making process, i.e. structuring the problem, selecting alternative solutions and implementing the decision;
• they respond to the needs and the cognitive style of different decision makers based on individual preferences;
• they incorporate a knowledge base to help the decision maker understand the results of the mathematical models;
• they ensure objectiveness and completeness of the results by comparing expert estimations and results from mathematical models;
• they significantly reduce the time and costs of the decision-making process, while increasing the quality of the decisions.

Special emphasis on web-based decision support is given in Bhargava et al. (2006). Even though the above efforts in developing DSS have been successful in solving some specific financial planning problems, they do not lead to a general framework for solving any financial planning problem. There is no DSS which covers the whole process of financial planning.

Models can be defined as follows:

Definition 10.2.7 (Rumbaugh et al. 2005) A model is a semantically complete description of a system.

According to that definition, a model is an abstraction of a system from a particular viewpoint. It describes the system or entity at the chosen level of precision and viewpoint. Different models provide more-or-less independent viewpoints that can be manipulated separately. A model may comprise a containment hierarchy of packages in which the top-level package corresponds to the entire system. The contents of a model are the transitive closure of its containment (ownership) relationships from top-level packages to model elements. A model may also include relevant parts of the system’s environment, represented, for example, by actors and their interfaces. In particular, the relationship of the environment to the system elements may be modelled. The primary aim of modelling is to reduce the complexity of the real world in order to simplify the construction process of information systems (Frank 1999).
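The statement that the contents of a model are the transitive closure of its containment relationships can be made concrete with a short Python sketch. The package and element names are hypothetical and the dictionary representation is our simplification, not UML notation: each key owns the elements listed as its value.

```python
def contents(model, root):
    """Return every element reachable from root via containment (ownership)."""
    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in model.get(node, []):
            if child not in found:
                found.add(child)
                stack.append(child)   # descend into nested packages
    return found

# Hypothetical containment hierarchy; the top-level package stands for the system.
model = {
    "FinancialPlanningSystem": ["DataModel", "FunctionModel"],
    "DataModel": ["Asset", "Liability"],
    "FunctionModel": ["CashFlowAnalysis"],
}

print(sorted(contents(model, "FinancialPlanningSystem")))
```

Starting from the top-level package, the traversal collects both the directly owned packages and everything they own in turn, which is exactly what distinguishes the transitive closure from the direct containment relation.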

10.2.3 Reference Models

Mišić and Zhao (1999) define reference models as follows:

Definition 10.2.8 (Mišić and Zhao 1999) A reference model describes a standard decomposition of a known problem domain into a collection of interrelated parts, or components, that cooperatively solve the problem. It also describes the manner in which the components interact in order to provide the required functions. A reference model is a conceptual framework for describing system architectures, thus providing a high-level specification for a class of systems.

Research on reference modelling can be divided into two groups (Fettke and Loos 2003): the investigation of methodological aspects (e.g. Lang et al. 1996; Marshall 1999; Remme 1997; Schütte 1998), and the construction of concrete reference models (e.g. Becker and Schütte 2004; Fowler 1997; Hay 1996; Scheer 1994). To the best of our knowledge, there is no reference model for personal financial planning. Our reference model for personal financial planning consists of an analysis model for the system analysis of a personal financial planning system and of a corresponding system architecture of an information system for personal financial planning.

10.2.3.1 Analysis Model

Surveys of reference models at the analysis stage of system development (i.e. conceptual models) are given in various papers (e.g. Scholz-Reiter 1990; Marent 1995; Mertens et al. 1996; Fettke and Loos 2003; Mišić and Zhao 1999). The construction of models requires languages. We will use the Unified Modeling Language (UML) for our analysis model.

Definition 10.2.9 (Rumbaugh et al. 2005) The Unified Modeling Language (UML) is a general-purpose modelling language that is used to specify, visualize, construct, and document the artefacts of a software system.

UML was developed by the Rational Software Corporation to unify the best features of earlier methods and notations. The UML models can be categorized into three groups:

• State models (that describe the static data structures)
• Behaviour models (that describe object collaborations)
• State change models (that describe the allowed states for the system over time)

UML also contains a few architectural constructs that allow modularizing the system for iterative and incremental development. We will use UML 2.0 (Rumbaugh et al. 2005) for our analysis model.

10.2.3.2 System Architecture

In general, architecture can be defined as follows:

Definition 10.2.10 (Rumbaugh et al. 2005) An architecture is the organizational structure of a system, including its decomposition into parts, their connectivity, interaction mechanisms, and the guiding principles that inform the design of a system.

Definition 10.2.11 (Bass et al. 1998) The term system architecture denotes the description of the structure of a computer-based information system. The structures in question comprise software components, the externally visible properties of those components, and the relationships among them.

IBM (Youngs et al. 1999) has put forth an Architecture Description Standard (ADS), which defines two major aspects: functional and operational. The IBM ADS leverages UML to help express these two aspects of the architecture. This standard is intended to support harvesting and reuse of reference architectures. For a comprehensive overview of architectures, languages, methods, and techniques for analysing, modelling, and constructing information systems in organisations, we refer to the Handbook on Architectures of Information Systems (Bernus et al. 2005).
The reference architecture we use to derive our system architecture is a service oriented architecture (SOA) proposed by the World Wide Web Consortium (W3C) (Booth et al. 2005) for building Web-based information systems. According to the Gartner group, by 2007 service oriented architectures will be the dominant strategy (more than 65%) for developing information systems. The architecture we propose supports modular deployment of both device-specific user interfaces, through multichannel delivery of services, and new or adapted operational processes and strategies.

Service oriented architectures refer to an application software topology which separates business logic from user interaction logic. The business logic is represented in one or multiple software components (services) that are exposed via well-defined formal interfaces. Each service provides its functionality to the rest of the system through a well-defined interface described in a formal markup language, and the communication between services is platform and language independent. Thus, modularity and reusability are ensured, enabling several different configurations and achieving multiple business goals. The first service oriented architectures were based on DCOM or on Object Request Brokers (ORBs) following the CORBA specification.

A service is a function that is well-defined, self-contained, and does not depend on the context or state of other services. Services are what you connect together using Web Services. A service is the endpoint of a connection. Also, a service has some type of underlying computer system that supports the connection offered. The combination of services, internal and external to an organization, makes up a service oriented architecture. The technology of Web Services is the most likely connection technology of service oriented architectures. Web Services essentially use XML to create a robust connection.

Web Services technology is currently the most promising methodology for developing web information systems. Web Services allow companies to reduce the cost of doing e-business, to deploy solutions faster and to open up new opportunities. The key to reaching these goals is a common program-to-program communication model, built on existing and emerging standards such as HTTP, Extensible Markup Language (XML, see also Wüstner et al. 2005), Simple Object Access Protocol (SOAP), Web Services Description Language (WSDL) and Universal Description, Discovery and Integration (UDDI).

However, real Business-to-Business interoperability between different organizations requires more than the aforementioned standards. It requires long-lived, secure, transactional conversations between Web Services of different organizations. To this end, a number of standards are under way, e.g. the Web Services Conversation Language (WSCL) and the Business Process Execution Language for Web Services (BPEL4WS). With respect to the description of a service’s supported and required characteristics, the WS-Policy Framework is under development by IBM, Microsoft, SAP, and other leading companies in the area. In addition, other non-standard languages and technologies have been proposed for composing services and modelling the transactional behaviour of complex service compositions. Such proposals include the Unified Transaction Modelling Language (UTML) and the SWORD toolkit.
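To give an impression of the XML-based program-to-program communication described above, the following Python sketch builds a minimal SOAP 1.1 request envelope with the standard library. The envelope namespace is the one defined for SOAP 1.1; the service namespace `urn:example:financial-planning`, the operation `GetNetWorth` and the parameter `clientId` are hypothetical and do not describe the FiXplan prototype or any real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:financial-planning"            # hypothetical service namespace

def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

request_xml = build_request("GetNetWorth", {"clientId": "C-1001"})
print(request_xml)
```

Because the message is plain XML, the caller needs to know only the interface of the service (in practice described in WSDL), not the platform or language in which the service is implemented.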

10.2.4 Requirements for a Reference Model for Personal Financial Planning

Our reference model for personal financial planning should fulfil the following requirements concerning input, data processing, and output. The input has to be collected in a complete way, i.e. adequate to the purpose. The input data have to be collected from the client and/or from institutions that are involved in this process. Such institutions may be savings banks, mortgage banks, life insurers, car insurers, providers of stock market quotations, pension funds, etc. This stage is time consuming, and the quality of the financial plan depends on the completeness and correctness of the collected input data. Data processing (i.e. data evaluation and data analysis) has the task of processing the input data in a correct, individual, and networked way. Correct means that the results have to be precise and error-free according to accepted methods of financial planning. Individual means that the client’s concrete situation with all its facets has to be at the centre of the planning, and that generalizations are not permitted. Networked means that all effects and interdependencies of the data have been considered and structured into a financial plan. The output has to be coherent and clear.

10.2.4.1 Analysis Model

The main task of an analysis model for personal financial planning is to help financial planners advise individuals on how to manage their personal finances properly. Proper personal financial planning is important because many individuals lack a working knowledge of financial concepts and do not have the tools they need to make the decisions most advantageous to their economic well-being. For example, the Federal Reserve Board’s Division of Consumer and Community Affairs (2002) stated that “financial literacy deficiencies can affect an individual’s or family’s day-to-day money management and ability to save for long-term goals such as buying a home, seeking higher education, or financial retirement. Ineffective money management can also result in behaviours that make consumers vulnerable to severe financial crises”.
Furthermore, the analysis model should fulfil the following requirements:
• Maintainability (the analysis model should be understandable and alterable)
• Adaptability (the analysis model should be adaptable to specific user requirements)
• Efficiency (the analysis model should solve the personal financial planning problem as fast and with as little effort as possible)
• Standards oriented (the analysis model should be oriented towards ISO/DIN/CFP Board standards; it should fulfil e.g. the CFP financial planning practice standards (Certified Financial Planner’s (CFP) Board of Standards 2005))

10.2.4.2 System Architecture

An architecturally significant requirement is a system requirement that has a profound impact on the development of the rest of the system. Architecture gives the rules and regulations by which the software has to be constructed. The architect has the sole responsibility for defining and communicating the system’s architecture. The architecture’s key role is to define the set of constraints placed on the


Oliver Braun, Günter Schmidt

design and implementation teams when they transform the requirements and analysis models into the executable system. These constraints contain all the significant design decisions and the rationale behind them. Defining an architecture is an activity of identifying what is architecturally significant and expressing it in the proper context and viewpoint. The system architecture for an IT-based personal financial planning system should fulfil the following requirements:
• Maintainability (the architecture should be understandable and alterable)
• Adaptability (the architecture should be adaptable to specific user requirements)
• Efficiency (the architecture should deliver a framework that allows a Personal Financial Planner to solve personal financial planning problems as fast and with as little effort as possible, e.g. by automatic data gathering)
• Standards oriented (the architecture should be oriented towards well-established standards, e.g. service oriented architectures)
• Reliability and robustness (the architecture should support a reliable and robust IT system)
• Portability (the architecture should support a platform-independent usage of the personal financial planning system)

In the following sections, we describe a reference model for personal financial planning which meets the requirements specified in this section. In Section 10.3 we describe the framework for our reference model, and in Sections 10.4 and 10.5 we describe (parts of) the reference models for the analysis and architecture level.

10.3 Framework for a Reference Model for Personal Financial Planning

Our reference model for personal financial planning consists of two parts (see Figure 10.1):
1. An analysis model, which can help financial planners to do personal financial planning.
2. A system architecture, which can help system developers to develop logical models at the design stage of system development.

Our reference model for personal financial planning has two purposes: first, the analysis model can help financial planners to do personal financial planning in a systematic way. Second, at the design stage of system development, system developers can take the analysis model produced during system analysis and apply the system architecture to it. An intrinsic part of our reference model at the architecture level is the usage of web technologies, as web technologies have already influenced, and will continue to strongly influence, the design and implementation of financial information systems in general.


Figure 10.1. Analysis model and system architecture

Design starts with the analysis model and the software architecture document as the major inputs (Conallen 2002). Design is where the abstraction of the business takes its first step into the reality of software. At design level, the analysis model is refined such that it can be implemented with the components that obey the rules of the architecture. As with analysis, design activities revolve around the class and interaction diagrams. Classes become more defined, with fully qualified properties. As this happens, the level of abstraction of the class shifts from analysis to design. Additional classes, mostly helper and implementation classes, are often added during design. In the end, the resulting design model is something that can be mapped directly into code. This is the link between the abstractions of the business and the realities of software. Analysis and design activities help transform the requirements of the system into a design that can be realized in software. Analysis begins with the use case model, the use cases and their scenarios, and the functional requirements of the system that are not included in the use cases. The analysis model is made up of classes and collaborations of classes that exhibit the dynamic behaviours detailed in the use cases and requirements. The analysis model and the design model are often the same artefact. As the model evolves, its elements change levels of abstraction from analysis to detailed design. Analysis-level classes represent objects in the business domain. Analysis


focuses on the functional requirements of the system, ignoring the architectural constraints of the system. Use case analysis comprises those activities that take the use cases and functional requirements to produce an analysis model of the system. Analysis focuses on the functional requirements of the system, so the fact that some or all of the system will be implemented with Web technologies is beside the point. Unless the functional requirements state the use of a specific technology, references to architectural elements should be avoided. The analysis model is made up of classes and collaborations of classes that exhibit the dynamic behaviours detailed in the use cases and the requirements. The model represents the structure of the proposed system at a level of abstraction beyond the physical implementation of the system. The classes typically represent objects in the business domain, or problem space. The level of abstraction is such that the analysis model could be applied equally to any architecture. Important processes and objects in the problem space are identified, named, and categorized during analysis. The emphasis is on ensuring that all functional requirements, as expressed by the use cases and other documents, are realized somewhere in the system. Ideally, each use case is linked to the classes and packages that realize it. This link is important in establishing the traceability between requirements and use cases and the classes that will realize them. The analysis model is an input for the design model and can be evolved into the design model.

10.4 Analysis Model The analysis model of our reference model for personal financial planning consists of the following parts: Use Cases build the dynamic view of the system. Class diagrams build the structural view of the system. Both are combined in the behavioural view of the system with activity diagrams and sequence diagrams.

10.4.1 Dynamic View on the System: Use Cases Use cases, use case models and use case diagrams are defined as follows: Definition 10.4.1 (Rumbaugh et al. 2005) A use case is the specification of sequences of actions, including variant sequences and error sequences, that a system, subsystem, or class can perform by interacting with outside objects to provide a service of value. A use case is a coherent unit of functionality provided by a classifier (a system, subsystem, or class) as manifested by sequences of messages exchanged among the system and one or more outside users (represented as actors), together with actions performed by the system. The purpose of a use case is to define a piece of behaviour of a classifier (including a subsystem or the entire system), without revealing the internal structure of the


classifier. A use case model is a model that describes the functional requirements of a system or other classifier in terms of use cases. A use case diagram is a diagram that shows the relationships among actors and use cases within a system.

10.4.2 Structural View on the System: Class Diagrams

The hierarchy of the dynamic view of the system (use cases) may provide a start but usually falls short when defining the structural view of the system (classes). The reason is that certain objects are likely to participate in many use cases and logically cannot be assigned to a single use case package. At the highest level, the packages are often the same, but at the lower levels of the hierarchy, there are often better ways to divide the packages. Analysis identifies a preliminary mapping of required behaviour onto structural elements, or classes, in the system. At the analysis level, it is convenient to categorize all discovered objects into one of three stereotyped classes: «boundary», «control», and «entity», as suggested by Jacobson et al. (1992); see Figure 10.2.

Figure 10.2. Structural elements (stereotypes «boundary», «control», and «entity», each shown with its icon)

Definition 10.4.2 (Jacobson et al. 1992) Boundary classes connect users of the system to the system. Control classes map to the processes that deliver the functionality of the system. Entity classes represent the persistent things of the system, such as the database.

Entities are shared and have a life cycle outside any one use of the system, whereas control instances typically are created, executed, and destroyed with each invocation. Entity properties are mostly attributes, although entities often also carry operations that organize their properties. Controllers do not specify attributes. Initially, their operations are based on verb phrases in the use case specification and tend to represent functionality that the user might invoke. In general, class diagrams can be defined as follows:

Definition 10.4.3 (Rumbaugh et al. 2005) A class diagram is a graphic representation of the static view that shows a collection of declarative (static) model elements, such as classes, types, and their contents and relationships.
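To make the three stereotypes more tangible, the following is a minimal Python sketch, not part of the reference model itself. The class names `Asset`, `BuildBalanceSheetControl`, and `PlannerBoundary`, as well as their methods, are hypothetical and chosen only for illustration: the entity carries attributes and outlives a single request, the control object holds no attributes and is created per invocation, and the boundary is the only part the user talks to.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """«entity»: a persistent thing of the system; mostly attributes."""
    name: str
    market_value: float


class BuildBalanceSheetControl:
    """«control»: no attributes; its operation is named after a verb phrase."""

    def total_assets(self, assets: list) -> float:
        # Sum the market values of all asset entities handed in.
        return sum(asset.market_value for asset in assets)


class PlannerBoundary:
    """«boundary»: connects the user of the system to the system."""

    def __init__(self, store: list) -> None:
        # The store stands in for persistence; entities outlive any one call.
        self._store = store

    def enter_asset(self, name: str, market_value: float) -> None:
        self._store.append(Asset(name, market_value))

    def show_total_assets(self) -> str:
        # The control instance is created, executed, and discarded per call.
        control = BuildBalanceSheetControl()
        return f"Total assets: {control.total_assets(self._store):.2f}"
```

In this sketch a caller would create one `PlannerBoundary` around a shared list, call `enter_asset` for each item, and read the result from `show_total_assets`; the list plays the role of the database mentioned in Definition 10.4.2.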


10.4.3 Use Case Realization: Sequence and Activity Diagrams

The construction of behavioural diagrams that form the model’s use case realizations helps to discover and elaborate the analysis classes. A use case realization is an expression of a use case’s flow of events in terms of objects in the system. In UML, this event flow is expressed with sequence diagrams.

Definition 10.4.4 (Rumbaugh et al. 2005) A sequence diagram is a diagram that shows object interactions arranged in time sequence. In particular, it shows the objects participating in an interaction and the sequences of messages exchanged.

A sequence diagram can realize a use case, describing the context in which the invocation of the use case executes. The sequence diagram describes the objects that are created, executed, and destroyed in order to execute the functionality described in the use case. It does not include object relationships. Sequence diagrams provide a critical link of traceability between the scenarios of the use cases and the structure of the classes. These diagrams can express the flow in a use case scenario in terms of the classes that will implement it. An additional diagram that is useful for mapping analysis classes into the use case model is the activity diagram.

Definition 10.4.5 (Rumbaugh et al. 2005) An activity diagram is a diagram that shows the decomposition of an activity into its constituents.

Activities are executions of behaviour. Activities are behavioural specifications that describe the sequential and concurrent steps of a computational procedure. Activities map well to controller classes, where each controller executes one or more activities. We create one activity diagram per use case.

10.4.4 Example Models The basic analysis use case model (see Figure 10.3) is oriented to the Certified Financial Planner’s (CFP) Board of Standards and ISO/TC 222 definition of the personal financial planning process (see Def. 2.4). In the following, we will describe the subsystem Core Personal Financial Planning in detail. Core Personal Financial Planning includes three use cases Determining the client’s financial status, Determining a feasible to-be concept, and Determining planning steps that are elaborated in more detail in Sections 10.4.4.1 through 10.4.4.3.


Figure 10.3. Basic use case model

10.4.4.1 Determining the Client’s Financial Status

Determining the client’s financial status means answering the question “How is the current situation defined?” We will pick the following scenarios as the main success scenarios.

Determining the client’s financial status
1. Data gathering. The Financial Planner asks the client for information about the client’s financial situation. This includes information about assets, liabilities, income and expenses.
2. Data evaluation. The Financial Planner uses two financial statements: balance sheet and income statement.


3. Data analysis. The Financial Planner analyses the results from balance sheet and income statement. He can investigate for example the state of the provisions and pension plan, risk management, tax charges, return on investments and other financial ratios.

In the following, the scenario is described in more detail.

Data gathering: assets and liabilities

The first category of assets is cash, which refers to money in accounts such as checking accounts, savings accounts, money market accounts, money market mutual funds, and certificates of deposit with a maturity of less than one year. The second category of assets is fixed assets, also called fixed-principal assets, fixed-return assets, or debt instruments. This category includes investments that represent a loan made by the client to someone else. Fixed-return assets are primarily corporate and government bonds, but may also be mortgages held, mutual funds that invest primarily in bonds, fixed annuities, and any other monies owed to the client. The third category of assets is equity investments. Examples of equity investments are common stock, mutual funds that hold mostly common stock, variable annuities, and partnerships. The fourth category of assets is the value of pensions and other retirement accounts. The last category of assets is usually personal or tangible assets, also known as real assets. This category includes assets bought primarily for comfort, like home, car, furniture, clothing, jewellery, and any other personal articles with a resale value. Liabilities are debts that a client currently owes. Liabilities may be charge account balances, personal loans, auto loans, home mortgages, alimony, child support, and life insurance policy loans. In general, assets are gathered at their market value, and debts are gathered at the amount a client would owe if he were to pay the debt off today.
Data gathering: income and expenses

Income includes wages and salaries, income from investments such as bonds, stocks, certificates of deposit, and rental property, and distributions from retirement accounts. Expenses may be expenses for housing, food, and transportation. Expenses can be grouped by similarity; for example, one can choose insurance premiums as a general category and include in it life insurance, auto insurance, health insurance, homeowner’s or renter’s insurance, disability insurance, and any other insurance premiums one pays. The acquisition of personal assets (e.g. buying a car) is also treated as an expense.

Data evaluation: balance sheet

A balance sheet has three sections. The first section is a statement of assets (things that one owns). The second section is a statement of liabilities (a list of one’s debts). The third section is called net worth. It represents the difference between one’s assets and one’s liabilities. Businesses regularly construct balance sheets following a rigorous set of accounting rules. Personal balance sheets are constructed much less formally, although they should follow some of the same general rules incorporated into business balance sheets. The assets on a personal balance sheet may be placed in any order. The most common practice is to list assets in descending order of liquidity. Liquidity refers to an asset’s ability to be converted to cash quickly with little or no loss in value. Liabilities are normally listed in order of increasing maturity. The net worth on a balance sheet is the difference between the total value of assets owned and the total amount of liabilities owed. Net worth is the measure of a person’s financial wealth.

Data evaluation: income statement

Whereas the personal balance sheet is like a snapshot of the client’s net worth at a particular point in time, an income statement is a statement of income and expenses over a period of time. The time frame is usually the past year. The first section of an income statement is the income. The second section is a statement of expenses. Expenses are usually listed in order of magnitude, with the basic necessities being listed first. The third section is a summary that represents the contribution to savings. Contribution to savings is the difference between the total of income and the total of expenses. A positive contribution to savings would normally show up in the purchase of investments, a reduction in debt, or some combination of the two. A negative contribution to savings shows up in the liquidation of assets, an increase in debt, or some combination of the two.

Data analysis

The net worth figure is significant in and of itself, since the larger the net worth, the better off a person is financially, but the balance sheet and income statement contain further information a financial planner should consider. The traditional approach to analysing financial statements is the use of financial ratios. Financial ratios combine numbers from the financial statements to measure a specific aspect of the personal financial situation of a client. Financial ratios are used
1. to measure changes in the quality of the financial situation over time,
2. to measure the absolute quality of the current financial situation, and
3. as guidelines for how to improve the financial situation.

We will group the ratios into the following eight categories: liquidity, solvency, savings, asset allocation, inflation protection, tax burden, housing expenses, and insolvency/credit (see also DeVaney et al. 1994; Greninger et al. 1994). For example, the Solvency Ratio is defined as

Solvency Ratio = Total Assets / Total Liabilities

and the Savings Rate is defined as

Savings Rate = Contribution to Savings / Income.

UML diagrams

In this section we will elaborate the use case Determining the client’s financial status: data evaluation as an example of the UML diagrams.
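As a brief aside before the diagrams, the two ratios defined in the data analysis step can be written down directly. The following Python sketch is only an illustration; the function names and the handling of a zero denominator are assumptions that are not specified in the text.

```python
def solvency_ratio(total_assets: float, total_liabilities: float) -> float:
    """Solvency Ratio = Total Assets / Total Liabilities."""
    if total_liabilities == 0:
        # Assumption: a client without debts is reported as unboundedly solvent.
        return float("inf")
    return total_assets / total_liabilities


def savings_rate(contribution_to_savings: float, income: float) -> float:
    """Savings Rate = Contribution to Savings / Income."""
    if income == 0:
        # Assumption: the rate is undefined without income, so we refuse to guess.
        raise ValueError("savings rate is undefined for zero income")
    return contribution_to_savings / income
```

With hypothetical figures, a client holding total assets of 250,000 against total liabilities of 100,000 has a Solvency Ratio of 2.5, and a contribution to savings of 9,000 out of an income of 60,000 gives a Savings Rate of 0.15.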


The class diagram in Figure 10.4 shows a first-pass analysis model for the use case Determining the client’s financial status: data evaluation. The most important elements are the class names, the types, and the relationships. This basic class diagram can be extended with the operations and attributes described in the use case specification, which yields the elaborated class diagram (Figure 10.5).

Figure 10.4. Basic use case analysis model elements (classes shown: ASSET, LIABILITY, INCOME, EXPENSE, DATA EVALUATION, BUILD BALANCE SHEET, BUILD INCOME STATEMENT)

Figure 10.5. Elaborated analysis model elements
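The analysis classes named in Figure 10.4 can be sketched in code along the lines of the data evaluation step described earlier. The following Python sketch is an assumption-laden illustration, not the content of Figure 10.5: the attribute names (`liquidity_rank`, `years_to_maturity`, and so on) and the dictionary-shaped results are hypothetical, and sorting expenses by amount only approximates the rule that basic necessities are listed first.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    market_value: float      # assets are gathered at market value
    liquidity_rank: int      # assumption: 1 = most liquid (e.g. cash)


@dataclass
class Liability:
    name: str
    payoff_amount: float     # amount owed if the debt were paid off today
    years_to_maturity: float


@dataclass
class Income:
    name: str
    amount: float


@dataclass
class Expense:
    name: str
    amount: float


def build_balance_sheet(assets, liabilities) -> dict:
    """Assets by descending liquidity, liabilities by increasing maturity."""
    ordered_assets = sorted(assets, key=lambda a: a.liquidity_rank)
    ordered_liabilities = sorted(liabilities, key=lambda d: d.years_to_maturity)
    total_assets = sum(a.market_value for a in ordered_assets)
    total_liabilities = sum(d.payoff_amount for d in ordered_liabilities)
    return {
        "assets": ordered_assets,
        "liabilities": ordered_liabilities,
        "net_worth": total_assets - total_liabilities,
    }


def build_income_statement(incomes, expenses) -> dict:
    """Income, expenses in order of magnitude, and contribution to savings."""
    ordered_expenses = sorted(expenses, key=lambda e: e.amount, reverse=True)
    total_income = sum(i.amount for i in incomes)
    total_expenses = sum(e.amount for e in ordered_expenses)
    return {
        "income": list(incomes),
        "expenses": ordered_expenses,
        "contribution_to_savings": total_income - total_expenses,
    }
```

The four data classes play the role of entity classes, while the two functions stand in for the control classes BUILD BALANCE SHEET and BUILD INCOME STATEMENT; keeping the controls free of state mirrors the remark that controllers do not specify attributes.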


Figure 10.6 shows a simple sequence diagram that indicates a basic flow of the use case Determining the client’s financial status: data evaluation. Messages corresponding to operations on the classes are drawn to correspond to the main narrative text alongside the scenario. Figure 10.7 shows the same use case scenario as the sequence diagram in Figure 10.6, but with System expressed in terms of the analysis classes in Figure 10.5. Figure 10.8 shows the basic activity diagram for the use case Determining the client’s financial status: data evaluation elaborated with classes from Figure 10.6.

Figure 10.6. Basic sequence diagram

Figure 10.7. Sequence diagram elaborated with analysis objects


Figure 10.8. Basic activity diagram

10.4.4.2 Determining a Feasible To-be Concept

The use case scenario is as follows:

Determining a feasible To-be concept
1. Data gathering and feasibility check. The Financial Planner asks for information about the client’s future financial situation. This includes information about the client’s future incomes and expenses and the client’s assumptions on the performance of his assets. Future incomes and expenses are determined by life-cycle scenarios. The Financial Planner asks the client for his requirements for a feasible To-be concept. A To-be concept is feasible if and only if all requirements are fulfilled at every point in time.
2. Data evaluation. The Financial Planner uses two financial statements: (planned) balance sheets and (planned) income statements.
3. Data analysis. The Financial Planner analyses the results from balance sheets and income statements. He can investigate for example the state of the provisions and pension plan, risk management, tax charges, return on investments and other financial ratios.

Data evaluation and data analysis have to be done in much the same way as in Determining the client’s financial status. In this section, we will concentrate our discussion on the first step: data gathering.
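The feasibility condition stated in step 1 (all requirements fulfilled at every point in time) can be sketched as a simple check over planned periods. The Python sketch below is only one possible reading: the `PlannedPeriod` fields, the two example requirements (a minimum cash reserve and a non-negative net worth), and the function names are hypothetical and are not taken from the reference model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PlannedPeriod:
    label: str                # e.g. a planning year
    liquid_assets: float
    total_assets: float
    total_liabilities: float


# A requirement is any predicate over one planned period.
Requirement = Callable[[PlannedPeriod], bool]


def min_cash_reserve(amount: float) -> Requirement:
    """Hypothetical requirement: liquid assets never fall below `amount`."""
    return lambda period: period.liquid_assets >= amount


def non_negative_net_worth(period: PlannedPeriod) -> bool:
    """Hypothetical requirement: planned assets cover planned liabilities."""
    return period.total_assets - period.total_liabilities >= 0


def is_feasible(plan: List[PlannedPeriod], requirements: List[Requirement]) -> bool:
    """Feasible if and only if every requirement holds in every period."""
    return all(req(period) for period in plan for req in requirements)


def first_violation(
    plan: List[PlannedPeriod], requirements: List[Requirement]
) -> Optional[Tuple[str, int]]:
    """Return (period label, requirement index) of the first failure, if any."""
    for period in plan:
        for index, req in enumerate(requirements):
            if not req(period):
                return (period.label, index)
    return None
```

Reporting the first violated requirement, rather than only a yes/no answer, is a design choice of this sketch; it gives the Financial Planner a starting point for adjusting the plan when the check fails.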


Future incomes and expenses are determined by financial goals. Financial goals are not just monetary goals, although such goals are obvious and appropriate. Goals can be short term, intermediate term, and long term. Short-term goals refer to the next twelve months. What does a client want his personal balance sheet to look like in twelve months? What would he like to do during the coming year: where would he like to spend his vacation? How much would he like to spend on the vacation? What assets, such as a new computer or a new car, would he like to acquire? Intermediate goals are usually more financial in nature and less specific in terms of activities and acquisitions. As one thinks further into the future, one cares more about having the financial ability to do things and less about the details of what one will do. Most intermediate goals would likely include more than just wealth targets. For example, a client may target a date for buying his first home. Intermediate goals often address where one wants to be in five or twenty years. For most people, long-term goals include their date of retirement and their accumulated wealth at retirement. What sort of personal balance sheet does a client want to have at retirement?

The goal of a system for personal financial planning is to help a Personal Financial Planner handle the financial situation of a client in such a way that the client can maximize his/her quality of life by using money in the way that is best for him/herself. Goals are about the management, protection, and ultimate use of wealth. There is a strong propensity to associate wealth or standard of living with quality of life. Wealth is normally defined as the value of assets owned less any outstanding debts. Standard of living is measured in terms of annual income. For some people, wealth or standard of living may be the correct measure of their quality of life. For many, that is not the case.
First, many people give away substantial amounts of wealth during their lifetimes (to places of worship, charities, and non-profit organizations such as alma maters). People give gifts because they derive pleasure from it or because they have a sense of obligation. Second, many people hold jobs noted for low pay (religious professions such as priests) and earn little compared with what their level of education could earn them in other professions. A lot of people pass up lucrative corporate incomes in order to be self-employed. Some people seek jobs with no pay. Full-time, stay-at-home parents are examples of people who accept unpaid jobs based on the benefits they perceive from managing the household. Obviously, many people regularly make choices that are not wealth- or income-maximizing, and these choices are perfectly rational and important to their lives. Proper financial planning seeks only to help people make the most of their wealth and income given the non-pecuniary choices they make in their lives, not just maximize their wealth and income.

Life-cycle scenarios (see Braun 2006) are, for example, applying for a personal loan, purchasing a car, purchasing and financing a home, auto insurance, homeowner insurance, health and disability insurance, life insurance, retirement planning, and estate planning. Requirements for a feasible To-be concept as introduced in (Braun 2007) are primarily requirements on balance sheets, such as the composition of the entire portfolio,


a cash reserve, etc. Analogous to the financial goals of a company, a client may have the following two main requirements for a feasible To-be concept: assurance of liquidity (i.e. to have enough money at all times to cover his consumer spending) and prevention of personal insolvency (i.e. to avoid inability to pay).

Assurance of liquidity (liquidity management) means that one has to prepare for anticipated cash shortages in any future month by ensuring that enough liquid assets are available to cover the deficiency. Some of the more liquid assets include a checking account, a savings account, a money market deposit account, and money market funds. The more funds are maintained in these types of assets, the more liquidity a person will have to cover cash shortages. Even if one does not have sufficient liquid assets, one can cover a cash deficiency by obtaining short-term financing (such as using a credit card). If adequate liquidity is maintained, one will not need to borrow every time one needs money. In this way, a person can avoid major financial problems and therefore be more likely to achieve financial goals.

Prevention of personal insolvency can be carried out by purchasing insurance. Property and casualty insurance insures assets (such as car and home), health insurance covers health expenses, and disability insurance provides financial support in case of disability. Life insurance provides the family members or other named beneficiaries with financial support if they lose their breadwinner. Thus, insurance protects against events that could reduce income or wealth. Retirement planning ensures that one will have sufficient funds for the old-age period. Key retirement planning decisions involve choosing a retirement plan, determining how much to contribute, and allocating the contributions.

If the feasibility check fails, Personal Financial Planners can apply their knowledge and experience to balance the financial situation.
They can take actions that adjust the financial structure by changing the requirements for a feasible To-be concept, by changing future expenses (such as expenses for a car), or by financing expenses in alternative ways (for example by increasing revenues). These actions may be necessary to ensure that the main financial goals are fulfilled.

10.4.4.3 Measures to Take

The financial goals of a client determine the planning steps, as gathered in Section 10.4.4.2 (Determining a feasible To-be concept), and the activities needed to achieve a feasible To-be concept.

10.5 System Architecture

Architecture influences just about everything in the process of developing software. Architecture gives the rules and regulations by which the software has to be constructed. Figure 10.9 shows the general architecture of our prototype FiXplan (IT-based Personal financial planning), and the data flow between the User Interfaces, the Application Layer, and the Sources (see also Braun and Schmidt 2005).

Figure 10.9. General three-tier architecture of FiXplan (figure labels: Clients: Excel with Visual Basic, web browsers; Server: FiXplan Web Services Toolbox, FiXplan Main System, Java classes, Java Server Pages, PHP classes; Sources: external web sites, financial applications, database server; connections labelled SOAP/HTTP, HTML/HTTP, and SQL)

Basically, the system contains the following components:
• Clients: Used by Personal Financial Planners or clients to access our main financial planning system and our Web Services. User interfaces may be web browsers such as Internet Explorer or Mozilla, or applications such as Microsoft Excel.
• Server: As described in detail in the analysis model. The Web Services Toolbox provides useful tools for personal financial planning.
• Sources: Used as a repository for the tools of the personal finance framework.

The three-tier architecture of FiXplan supports modular deployment of both device-specific user interfaces through multi-channel delivery of services and new or adapted operational processes and strategies. All connectivity interfaces are based on standard specifications. At the client side, the web browser and PHP scripts handle the user interface and the presentation logic. At the server side, the web server receives all HTTP requests from the web user and propagates them to the application server, which implements the business logic of all the services for personal financial planning. At the database server side, transactional or historical data of the day-to-day operations are stored in the system database by an RDBMS. The application server sends the query to the database server and gets the result set. It prepares the response after a series of processes and returns it to the web server to be propagated back to the client side. A registered user can log in and pick an existing session or create a new session of his preference, such as a level of user


Figure 10.10. Basic service oriented architecture (the service consumer sends an XML service request based on WSDL to the service provider using SOAP; the service provider returns an XML service response based on WSDL, also sent using SOAP)

protection, access rights and level of complexity. Based on the three-tier structure, FiXplan can run in both internet and intranet environments. The user can, for example, use a web browser to access a web server via HTML and HTTP. The kernel of the system is placed on the web server.

FiXplan uses a collection of Web Services that communicate with each other. The communication can involve either simple data passing or two or more services coordinating some activity, so some means of connecting services to each other is needed. Figure 10.10 illustrates a basic service oriented architecture. It shows a service consumer (below) sending a service request message to a service provider (above). The service provider returns a response message to the service consumer. The request and the subsequent response are defined in a way that is understandable to both the service consumer and the service provider. A service provider can also be a service consumer.

All the messages shown in Figure 10.10 are sent using SOAP. SOAP generally uses HTTP, but other means of connection such as the Simple Mail Transfer Protocol (SMTP) may be used. HTTP is the familiar connection we all use for the Internet. In fact, it is the pervasiveness of HTTP connections that will help drive the adoption of Web Services. SOAP can be used to exchange complete documents or to call a remote procedure. SOAP provides the envelope for sending Web Services messages over the Internet/intranet. The envelope contains two parts:
• An optional header providing information on authentication, encoding of data, or how a recipient of a SOAP message should process the message.
• The body that contains the message.

These messages can be defined using the WSDL specification. WSDL uses XML to define messages. XML has a tagged message format, and both the service provider and the service consumer use these tags. In fact, the service provider could send the data shown at the bottom of Figure 10.10 in any order: the service consumer uses the tags, not the order of the data, to get the data values.

FiXplan is both Web Service Provider and Web Service Consumer, as shown in Figure 10.11.
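The envelope structure and the tag-based lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FiXplan code: the application namespace `urn:example:fixplan`, the element name `BalanceResponse` and the helper functions are hypothetical, while the envelope namespace is the one defined by SOAP 1.1 and the field names `accNr` and `balance` are taken from Figure 10.11.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:fixplan"                          # hypothetical application namespace


def build_envelope(fields, header_fields=None):
    """Wrap (tag, value) pairs in a SOAP envelope and return it as an XML string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_fields:  # the header part is optional
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        for tag, value in header_fields:
            ET.SubElement(header, f"{{{APP_NS}}}{tag}").text = value
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    response = ET.SubElement(body, f"{{{APP_NS}}}BalanceResponse")  # hypothetical name
    for tag, value in fields:
        ET.SubElement(response, f"{{{APP_NS}}}{tag}").text = value
    return ET.tostring(envelope, encoding="unicode")


def read_field(envelope_xml, tag):
    """Return the text of the first element with the given tag, or None if absent."""
    root = ET.fromstring(envelope_xml)
    element = root.find(f".//{{{APP_NS}}}{tag}")
    return None if element is None else element.text


# The same data sent in two different orders.
first = build_envelope([("accNr", "12345"), ("balance", "250.00")])
second = build_envelope([("balance", "250.00"), ("accNr", "12345")])

# The consumer looks values up by tag, so the order does not matter.
print(read_field(first, "balance"), read_field(second, "balance"))
```

Because `read_field` searches by tag name, both envelopes yield the same values although their serialised forms differ.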

[Figure: three columns labelled Clients, Server and Sources. Clients: FiXplan Excel sheet (CreditBalance.xls) and Java classes, acting as WS consumers. Server: FiXplan Main System with the Web Services Toolbox, ShowBalance.php (WS provider and WS consumer) and PHP classes. Sources: Bank Giro Service, GetSaldo.jws (WS provider). The connections are labelled SOAP/HTTP and SOAP/HTTPS and carry accNr and balance.]

Figure 10.11. Service provider and service consumer

In this example, a Visual Basic function called from an Excel sheet sends a SOAP message to the PHP Web Service ShowBalance.php. ShowBalance.php calls the function GetBalance.jsp, which gets the balance via a SOAP message from the Web Service of a Bank Giro Service. ShowBalance.php thus retrieves the balance of account accNr and constructs a SOAP message that can be passed back to the Visual Basic function in the Excel sheet.
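The double role of ShowBalance.php, consumer towards the bank and provider towards the client, can be sketched as follows. This is a simplified in-process illustration, not the FiXplan implementation: plain dictionaries stand in for SOAP messages, and the function names, the `fault` key and the account data are hypothetical.

```python
# Hypothetical account data held by the bank's giro service.
ACCOUNTS = {"1001": 250.0, "1002": 75.5}


def bank_giro_service(request):
    """Pure service provider: answers a balance request for one account."""
    acc_nr = request["accNr"]
    if acc_nr not in ACCOUNTS:
        return {"fault": f"unknown account {acc_nr}"}
    return {"accNr": acc_nr, "balance": ACCOUNTS[acc_nr]}


def show_balance(request, upstream=bank_giro_service):
    """Provider for the client and, at the same time, consumer of the bank service."""
    response = upstream({"accNr": request["accNr"]})  # consumer role
    if "fault" in response:
        return response
    # Provider role: build the response message for the calling client.
    return {"accNr": response["accNr"], "balance": response["balance"]}


def spreadsheet_client(acc_nr):
    """Pure service consumer, standing in for the function called from the Excel sheet."""
    response = show_balance({"accNr": acc_nr})
    return response.get("balance")


print(spreadsheet_client("1001"))
```

Passing the upstream service as a parameter keeps the middle tier testable: a stub can replace the bank service without touching the client.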

10.6 Conclusion and Future Trends

We have presented a reference model for personal financial planning consisting of an analysis model and a corresponding system architecture. An intrinsic part of our reference model at the architecture level is the usage of web technologies. Our reference model for personal financial planning fulfils two kinds of purposes: first, the analysis model is a conceptual model that can serve financial planners as a decision support tool. Second, at the design stage of system development, system developers can take the analysis model and apply the system architecture to it. The reference model combines business concepts of personal financial planning with technical concepts from information technology. We evaluated our reference model by building the IT system FiXplan based on this reference model. The prototype shows that our reference model fulfils maintainability, adaptability, efficiency, reliability, and portability requirements. Furthermore, it is based on ISO and CFP Board standards.

Further research is needed to extend the reference model with respect to the automation of the to-be concept feasibility check. Here, classical mathematical programming models as well as fuzzy models may deliver a solution. While in classical models the vague data is replaced by average data, fuzzy models offer the opportunity to model the subjective ideas of the client or the Personal Financial Planner as precisely as they are able to describe them. Thus the risk of applying a wrong model of reality, and of selecting wrong solutions that do not reflect the real problem, could be reduced.

References

Bass L, Clements P, Kazman R (1998) Software Architecture in Practice. The SEI Series in Software Engineering. Addison Wesley, Reading
Becker J, Schütte R (2004) Handelsinformationssysteme. Redline Wirtschaft, Frankfurt am Main
Bernus P, Mertins K, Schmidt G (2005) Handbook on Architectures of Information Systems. Springer, Berlin
Bhargava HK, Power DJ, Sun D (2006) Progress in web-based decision support technologies. Decision Support Systems, to appear
Böckhoff M, Stracke G (2001) Der Finanzplaner. Sauer, Heidelberg
Booth D, Haas H, McCabe F, Newcomer E, Champion M, Ferris C et al. (eds) Web Services Architecture, W3C. Retrieved August 5, 2005, from http://www.w3c.org/TR/ws-arch/
Braun O (2006) Lebensereignis- und Präferenzorientierte Persönliche Finanzplanung – Ein Referenzmodell für das Personal Financial Planning. In: Schmidt G (ed) Neue Entwicklungen im Financial Planning. Hochschule Liechtenstein
Braun O (2007) IT-gestützte Persönliche Finanzplanung. Habilitationsschrift, Saarland University
Braun O, Kramer S (2004) Vergleichende Untersuchung von Tools zur Privaten Finanzplanung. In: Geberl S, Weinmann S, Wiesner DF (eds) Impulse aus der Wirtschaftsinformatik. Physica, Heidelberg, pp. 119–133
Braun O, Schmidt G (2005) A service oriented architecture for personal financial planning. In: Khosrow-Pour M (ed) Managing modern organizations with information technology, Proceedings of the 2005 IRMA International Conference. Idea Group Inc, pp. 1032–1034
Briggs RO, Nunamaker JF, Sprague RH (1997) 1001 Unanswered research questions in GSS. Journal of Management Information Systems 14: 3–21
Certified Financial Planner’s (CFP) Board of Standards Inc (2005) Retrieved August 14, 2005, from http://www.cfp.net
Chou SCT (1998) Migrating to the web: a web financial information system server. Decision Support Systems 23: 29–40
Conallen J (2002) Building Web Applications with UML, 2nd edn. Pearson Education, Boston


DeVaney S, Greninger SA, Achacoso JA (1994) The Usefulness of Financial Ratios as Predictors of Household Insolvency: Two Perspectives
Dong JC, Deng SY, Wang Y (2002) Portfolio selection based on the internet. Systems engineering: Theory and Practice 12: 73–80
Dong JC, Du HS, Wang S, Chen K, Deng X (2004) A framework of Web-based Decision Support Systems for portfolio selection with OLAP and PVM. Decision Support Systems 37: 367–376
Fan M, Stallaert J, Whinston AB (1999) Implementing a financial market using Java and Web-based distributed computing. IEEE Computer 32: 64–70
Fan M, Stallaert J, Whinston AB (2000) The internet and the future of financial markets. Communications of the ACM 43: 82–88
Federal Reserve Board Division of Consumer and Community Affairs (2002) Financial Literacy: An Overview of Practice, Research, and Policy. Retrieved on October 24, 2006 from http://www.federalreserve.gov/pubs/bulletin/2002/1102lead.pdf
Fettke P, Loos P (2003) Multiperspective Evaluation of Reference Models – Towards a Framework. In: Jeusfeld MA, Pastor O (eds) ER 2003 Workshops, LNCS 2814. Springer, Heidelberg, pp. 80–91
Fowler M (1997) Analysis Patterns: Reusable Object Models. Menlo Park
Frank U (1999) Conceptual Modelling as the Core of the Information Systems Discipline – Perspectives and Epistemological Challenges. In: Haseman WD, Nazareth DL (eds) Proceedings of the 5th Americas Conference on Information Systems (AMCIS 1999). Milwaukee, Wisconsin, pp. 695–697
Greninger SA, Hampton VL, Kitt KA (1994) Ratios and Benchmarks for Measuring the Financial Well-Being of Families and Individuals. Financial Counselling and Planning 5: 5–24
Hay DC (1996) Data Model Patterns – Conventions of Thought. New York
Hirschey M, Richardson VJ, Scholz S (2000) Stock price effects of internet buy-sell recommendations: the Motley Fool Case. The Financial Review 35: 147–174
International Organization for Standardization (ISO/TC 222) (2004) ISO/DIS 22 222-1 of the Technical Committee ISO/TC 222, Personal financial planning
Jacobson I, Christerson M, Jonsson P, Overgaard G (1992) Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley, Boston
Keown AJ (2003) Personal Finance: Turning Money into Wealth. Prentice Hall, Upper Saddle River
Lang K, Taumann W, Bodendorf F (1996) Business Process Reengineering with Reusable Reference Process Building Blocks. In: Scholz-Reiter B, Stickel E (eds) Business Process Modelling. Berlin, pp. 264–290
Li ZM (1998) Internet/Intranet technology and its development. Transaction of Computer and Communication 8: 73–78
Marent C (1995) Branchenspezifische Referenzmodelle für betriebswirtschaftliche IV-Anwendungsbereiche. Wirtschaftsinformatik 37(3): 303–313
Marshall C (1999) Enterprise Modeling with UML: Designing Successful Software through Business Analysis. Reading


Mathieu RG, Gibson JE (1993) A methodology for large scale R&D planning based on cluster analysis. IEEE Transactions on Engineering Management 30: 283–291
Mehdi RZ, Mohammad RS (1998) A web-based information system for stock selection and evaluation. Proceedings of the International Workshop on Advance Issues of E-Commerce and Web-Based Information Systems
Mertens P, Holzer J, Ludwig P (1996) Individual- and Standardsoftware: tertium datur? Betriebswirtschaftliche Anwendungsarchitekturen mit branchen- und betriebstypischen Zuschnitt (FORWISS-Rep. FR-1996-004). Erlangen, München, Passau
Mišić VB, Zhao JL (1999) Reference models for electronic commerce. Proceedings of the 9th Hong Kong Computer Society Database Conference – Database and Electronic Commerce. Hong Kong, pp. 199–209
Mukherjee K (1994) Application of an interactive method for MOLIP in project selection decision: a case from Indian coal mining industry. International Journal of Production Economics 36: 203–211
Nissenbaum M, Raasch BJ, Ratner C (2004) Ernst & Young’s Personal Financial Planning Guide, 5th edn. John Wiley & Sons
Remme M (1997) Konstruktion von Geschäftsprozessen – Ein modellgestützter Ansatz durch Montage generischer Prozesspartikel. Wiesbaden
Rumbaugh J, Jacobson I, Booch G (2005) The Unified Modelling Language Reference Manual, 2nd edn. Addison-Wesley, Boston
Samaras GD, Matsatsinis NF, Zopounidis C (2005) Towards an intelligent decision support system for portfolio management. Foundations of Computing and Decision Sciences 2: 141–162
Scheer A-W (1994) Business Process Engineering – Reference Models for Industrial Companies, 2nd edn. Berlin
Schmidt G, Lahl B (1991) Integration von Expertensystem- und konventioneller Software am Beispiel der Aktienportfoliozusammenstellung. Wirtschaftsinformatik 33: 123–130
Schmidt G (1999) Informationsmanagement. Springer, Heidelberg
Schmidt G (2006) Persönliche Finanzplanung – Modelle und Methoden des Financial Planning. Springer, Heidelberg
Scholz-Reiter B (1990) CIM – Informations- und Kommunikationssysteme. München, Wien
Schütte R (1998) Grundsätze ordnungsmäßiger Referenzmodellierung – Konstruktion konfigurations- und anpassungsorientierter Modelle. Wiesbaden
Turban E (1995) Decision Support and Expert Systems, 4th edn. Prentice-Hall, Englewood Cliffs
Woerheide W (2002) Core concepts of Personal Finance. John Wiley & Sons
Wüstner E, Buxmann P, Braun O (2005) The Extensible Markup Language and its Use in the Field of EDI. In: Bernus P, Mertins K, Schmidt G (eds) Handbook on Architectures of Information Systems, 2nd edn. Springer, Heidelberg, pp. 391–419


Youngs R, Redmond-Pyle D, Spaas P, Kahan E (1999) A Standard for Architecture Description. IBM Systems Journal 38. Available at http://www.research.ibm.com/journal/sj/381/youngs.html
Zahedi F, Palma-dos-Reis A (1999) Designing personalized intelligent financial decision support systems. Decision Support Systems 26: 31–47
Zopounidis C, Doumpos M, Matsatsinis NF (1997) On the use of knowledge-based decision support systems in financial management: a survey. Decision Support Systems 20: 259–277

CHAPTER 11

Internet Payments in Germany

Malte Krueger, Kay Leibold

11.1 Introduction: The Long Agony of Internet Payments

Ten years ago, internet payments seemed to be a new and quickly expanding universe, ready to be taken over by new and fanciful payment methods. New companies were taking, if not the market, then at least the headlines by storm. The year 1994 saw the foundation of upstarts such as DigiCash, CyberCash and First Virtual, and 1994 was also the birth year of SSL (the current standard for encrypting confidential information sent over the internet). Entrepreneurs were not the only ones busy in this year. Regulators had also become aware of innovations in the payment system. In particular, DigiCash’s pilot with ‘Cyberbucks’ that could be sent over the internet as a means of payment triggered high hopes and grave fears. Evangelists of the New Economy regarded Cyberbucks as THE future internet payment system; monetary reformers hoped that systems such as Cyberbucks would free money users from the malicious influence of the state; and, as a mirror image, central bankers feared the loss of monetary control. Thus, it is not surprising that the European Monetary Institute (the precursor of the European Central Bank (ECB)) came out with its ‘Report on Electronic Money’ as early as 1994 (EMI 1994). Subsequently, many more reports were written. The EU Commission prepared a regulatory scheme for e-money, and after a long period of haggling between the European Commission, member states and the ECB the ‘e-money directive’ was passed (EU Directive 2000).¹

¹ In the US, the authorities were less worried about internet payment systems. In particular, the Fed took a much more relaxed view of e-money (Greenspan 1997). However, the image of a laissez-faire regime in the US – as compared to tight regulation in the EU – is somewhat mistaken (Krueger 2001). In the US, e-payment providers have to comply with state regulations. These regulations may be less severe than the EU e-money regulation. However, the EU e-money directive has the advantage of providing a ’passport’ for the whole of the EU.

Today, the frenzy about internet payments has died down. Most of the pioneers of internet payments went into bankruptcy. Big schemes with powerful supporters (for instance Secure Electronic Transaction (SET), championed by the credit card organisations) never got anywhere. One of the few exceptions is the Peer-to-Peer (P2P) payment scheme PayPal, which has taken the US market by storm and is currently expanding into other regions of the world. Apart from PayPal there are few success stories. There are three important factors that may help to explain why innovative new internet payment schemes did not take off.
• First, traditional means of payment – used in the offline world – proved surprisingly adaptable to internet usage.
• Second, internet payments in general and security in particular were seen as technical problems awaiting technical solutions.
• Third, due to the pervasive availability of free content, the market for internet payments – in particular for micro payments – grew much more slowly than expected.
Below we will take a look at the current state of the market and some of the important drivers.

11.2 Is There a Market for Innovative Internet Payment Systems?

11.2.1 Use of Traditional Payment Systems: Internet Payments and the Role of Payment Culture

What could be observed over the last ten years was the successful adaptation of real world payment systems to the internet. One of the results of this trend is that the supposedly borderless internet exhibits surprisingly large national differences when it comes to payments. In retail payments, a distinction is usually made between giro and check countries. Typical check countries are the US and the UK, whereas many continental European countries are typical giro countries (see Table 11.1).

In the typical check country, the main payment media at the Point of Sale (POS) are cash, checks and credit cards (with debit cards catching up fast at the moment). In remote retail payments, the check clearly dominates. In particular, it is still used for most recurring payments such as rent, electricity payments or salaries. In giro countries, the main POS payment media are cash and debit cards, with credit cards a distant third. For remote payments, people use either credit transfers, standing orders or direct debits, all of which can be summarised under the label ‘giro payment’ or ‘Automatic Clearing House (ACH) payment’.

Table 11.1. Relative importance of cashless payment instruments in 2002

                  Check   Credit/debit cards   Credit transfers   Direct debits   Card-based e-money
Germany             0.7                 14.9               42.2            41.9                  0.2
Belgium             0.8                 39.0               43.2            11.6                  5.4
France             24.5                 32.9               17.7            24.9                  0.1
Italy              14.6                 37.5               32.8            14.5                  0.6
Netherlands         Nap                 37.0               32.2            26.9                  3.9
Sweden              0.1                 61.2               29.5             9.2                  Nav
Switzerland         0.2                 35.9               57.3             4.9                  1.8
United Kingdom     13.9                 45.1               21.5            19.5                  Nav
United States      37.2                 48.4                6.2             8.2                  Neg

Percentage of total volume of cashless transactions. Nap = not applicable, Nav = not available, Neg = negligible. Source: (BIS 2006)

Over the last 20 years, the move from paper to electronic initiation and processing of payments has been notably different in the two payment cultures. Whereas check payments are still mostly paper-based, giro countries have been very successful in reducing the share of paper-based payments. The processing is almost completely electronic, and even the initiation of payments relies less and less on paper as an increasing number of customers use phone banking or online banking. In Germany, for instance, there are more than 33 million online accounts (out of a total of 85 million giro accounts). From the point of view of customers, the giro system has a number of advantages. Recurrent payments do not require any intervention by the payer, and it is possible to initiate electronic P2P payments by phone or PC.

Table 11.2. Relative importance of various internet payment systems

Belgium Denmark France Germany Netherlands United Kingdom United States

Bank Cards 50 10 51 24 26 60 > 80

Direct debits

Giro transfer 33

Cash upon del. 9

30 23

6 6 4

Other 8

90 21 30

43 19 47 10

Sources: Adams (2003), BIS (2004), Celent (2002), FEVAD (2006), insites (2004a) and (2004b), Krueger et al. (2005) and own estimates


The observed differences in payment cultures in the offline world have had a strong impact on payment patterns in the online world. In giro countries, credit transfers and direct debits have gained a large share in Business-to-Consumer (B2C) e-commerce. By contrast, in check countries the credit card clearly dominates, with a share of 60% and more. Even more striking is the difference in the area of P2P payments. Whereas in giro countries credit transfers can easily be used for P2P payments, in check countries there is no widely accessible payment medium for electronic P2P payments. That has opened a market for newcomers such as PayPal.

When it comes to international payments, there is more convergence between the two payment cultures in the area of Consumer-to-Business (C2B) payments. Cross-border payments are generally dominated by the credit card. However, with the EU directive on cross-border payments, credit transfers have become an attractive option for both C2B and Consumer-to-Consumer (C2C) payments within the EU. On the whole, it can be said that an innovative payment provider may find a very different payment environment depending on the geographic market it wants to enter. With the spread of online banking and the increasing use of online credit transfers, the European market poses a bigger challenge for suppliers of alternative payment systems than the US market.

11.2.2 The Role of Technology

The early developments of internet payment systems were very much technology-driven.² Security and, in some cases, anonymity were seen as key elements of a successful internet payments product. One example is the ECash system offered by DigiCash. ECash was designed as an almost cash-like means of payment for the internet. It allows users to generate and store ‘one-way tokens’ on their PC which can be spent almost like cash. To provide anonymity, ECash used blind signatures – a technique that had been developed by David Chaum. After trials in 1994, the first commercial roll-out was undertaken by Mark Twain Bank in the US. Subsequently, DigiCash licensed the technology to a number of banks in various countries (Merita Bank in Finland, the Swedish Post and Advance Bank in Australia). In Germany, Deutsche Bank acquired an ECash license and started a pilot in 1997.

² A comprehensive overview of the early systems can be found in (Furche and Wrightson 1997).

Another prominent example is CyberCash – a US payment service provider. CyberCash Inc. co-operated with Landesbank Sachsen and Dresdner Bank to make CyberCash the German internet payment standard. To use CyberCash, both consumer and merchant have to download and install CyberCash software. This software allows them to encrypt all payment messages sent between consumer and merchant and between merchant and payment gateway. In Germany, CyberCash and its partners offered three distinct types of internet payment: Cybercoin (a prepaid micropayment system), electronic direct debit and credit card payments.

Maybe the most prominent example of these early developments is SET. Security concerns with respect to the use of credit cards in open networks prompted the two large credit card companies EuroCard/MasterCard and Visa to develop SET as a standard. The system is based on digital signatures and heavy encryption. It was piloted in 1997 and subsequently promoted as a secure standard for credit card payments on the internet.

Although all of these solutions found the support of large players in the field of payments, none could succeed. DigiCash (in 1998) and CyberCash (in 2001) had to declare bankruptcy, and SET has been revised and re-branded over and over again without achieving critical mass. What went wrong? All of these systems relied on users to install extra software on their PCs. This not only proved time-consuming and cumbersome; the wallet approach also limited portability. Given the strong preference of payment system users for convenience, this approach did not meet with enthusiasm among customers. Merchants often found it difficult to integrate the new solutions into their shop systems. Moreover, while anonymity may be positive from the point of view of customers, merchants want to know their customers. Finally, the new systems were based on a narrow concept of ‘security’. Security was defined as protection against interference by third parties (‘hackers’) and as irrevocability of payments. However, a system with these traits leaves an important risk with customers: the risk that the merchant does not deliver.

A second wave of internet payment systems tries to remedy the shortcomings of the first wave (Böhle 2001).
In particular, in order to provide portability, a server-based approach is used and SSL has become the encryption standard. Since SSL is integrated into standard browser software, customers no longer need to download additional encryption software. In payment systems with a server-based approach, customer data and account information are stored on a central server. Customers can access this server from any entry point to the internet. Authentication and authorisation are usually carried out with user names and passwords. Prominent examples are PayPal, FirstGate and the paysafecard. The latter is particularly interesting because it shows that anonymous internet payments can be provided with a system based on fairly basic technologies (i.e. a simple number tied to a prepaid online account). Companies in the second wave also follow a more focussed approach. For instance, PayPal initially concentrated strongly on the P2P segment and on payments in connection with online auctions. FirstGate is focussing on digital goods and offers merchants a variety of additional services. paysafecard is eyeing the segments of the online market where anonymity is valued and seeks to position its system as a marketing tool.
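The blind signature technique used by ECash, mentioned earlier in this section, can be illustrated with a toy RSA example. This is a textbook sketch and not DigiCash's actual protocol: the key is far too small for real use, the token and the blinding factor are hypothetical values, and padding, token formats and double-spending checks are left out. It assumes Python 3.8 or later for the modular inverse via `pow(x, -1, n)`.

```python
# Toy RSA key of the issuing bank (illustrative only, far too small for real use).
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent, known only to the bank


def blind(token, r):
    """Customer: hide the token with a random factor r (r must be coprime to n)."""
    return (token * pow(r, e, n)) % n


def sign(blinded_token):
    """Bank: sign the blinded value without learning the token itself."""
    return pow(blinded_token, d, n)


def unblind(blinded_signature, r):
    """Customer: remove the blinding factor to obtain a signature on the token."""
    return (blinded_signature * pow(r, -1, n)) % n


def verify(token, signature):
    """Anyone: check the bank's signature using the public key (e, n)."""
    return pow(signature, e, n) == token % n


token = 1234        # hypothetical token chosen by the customer
r = 7               # hypothetical blinding factor, coprime to n

blinded = blind(token, r)
signature = unblind(sign(blinded), r)
print(verify(token, signature))
```

The bank only ever sees `blinded`, so it cannot later link the signed token to the withdrawal, yet the unblinded signature verifies against the bank's public key like an ordinary RSA signature.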


11.2.3 The Role of Different Business Models

11.2.3.1 Two Types of C2B E-commerce

In the real world, it can be observed that different goods are paid for with different payment systems. Bread is mostly paid for in cash, rent is often paid by bank transfer or check, and car hire is generally settled by credit card. In the world of e-commerce, a particularly important distinction is between soft and hard goods (or digital and ‘real’ goods). Soft goods can be bought online and immediately delivered over the internet. Hard goods have to be mailed or picked up in person. Soft goods are also often of low value compared to hard goods.

From the point of view of payments, e-commerce in hard goods does not pose any new problem. After all, it is just another form of distant trade. Long before the internet, distant trade was well established, for instance in the form of mail order business. Mail order merchants have successfully dealt with the problem that customers are far away and that a direct exchange of money and goods is impossible. Thus, the early claim that e-commerce requires new types of payment and cannot flourish without such innovative e-payment systems seems unfounded. In fact, in many countries, mail order merchants have successfully embraced e-commerce. Conversely, pure e-tailers are growing in size and sophistication and are increasingly employing the risk management techniques required for distant trade. Such techniques allow them to make use of post-payment systems such as bank transfer or check payment after delivery. Thus, in the segment of hard goods there does not seem to be an urgent demand for innovative internet payment schemes.

Many suppliers of innovative internet payment solutions seem to have set their eyes on e-commerce in soft goods. However, the wide availability of free goods has kept demand for payment products in check.
It has repeatedly been argued that the ‘free lunch culture’ of the internet is inefficient and unlikely to survive. The laws of economic theory are invoked, which supposedly state that the user has to bear the costs. Thus, it is expected that the share of goods that have to be paid for will rise in the future as the internet becomes a ‘normal’ market place. When looking at the current situation, it is indeed true that in the area of digital content web surfers can find a wide choice of free offerings. In Germany, for instance, companies offering digital content on the internet derive 95% of their earnings from advertising and 5% from paying customers (see Figure 11.1). However, such figures do not prove that the business model is unsustainable or likely to change. Indeed, a look at the structure of earnings of the content industry in the real world shows that a high share of earnings from advertising is not unusual.

Figure 11.1. Digital content in the online and offline world (Germany 2003), Source: VDZ

To be sure, the share of advertising is not as high in the offline world as in the online world. Thus, there is hope for e-payment providers. However, when judging the prospective market two factors have to be taken into account. First, the charging model can be ‘per click’ or subscription. Subscription can be handled very well with established payment systems (for instance standing order or direct debit). Thus, this segment of the market does not create a demand for innovative internet payment systems. The problem, from the point of view of e-payment providers, is that subscription can cover a fairly large chunk of the market. Looking at print media in Germany, 50–90% of sales revenue comes from subscription. US figures for online paid content equally show a high fraction of subscription: 80% of sales revenues in the first half of 2005 (OPA 2006).

What about economic theory? Should not the user be the one who pays? Well, there does not seem to be any economic law that says it is inefficient if a third party (the advertiser) voluntarily pays. What economic theory does say is that price should equal marginal costs. On that count, a price close to zero does not seem to be wide of the mark. Producing content is a fixed cost business. Marginal costs mainly consist of the costs of distribution. Thus, a structure as in Figure 11.1 – a higher share of advertising revenue in the online world than in the offline world – does not seem inefficient.

It should not be overlooked, however, that the future development can be very different, depending on the particular market segment. Music is a prominent example. The potential market for online sales is huge. The current losses due to file sharing etc. are substantial. For years, the music industry has been grappling with the problem without finding a solution. Now Apple has come up with the iPod and created a new business model: downloading music at a price of US$0.99 per song. Apple’s sales are impressive. Up to September 2006, iTunes has sold over 1.5 billion sound files worldwide (200 million in Europe). The example seems to show that there is life in the ‘pay per click’ model after all. However, nagging questions remain. First, Apple’s success may comfort the digital content industry but not innovative payment providers. So far, Apple is simply using the credit card as payment instrument.
Recently, PayPal has been added as another payment option. To keep costs in check, Apple is aggregating credit card transactions. Thus, this is in fact another example of how adaptive established payment methods such as the credit card are. Second, for the moment, Apple seems to provide a possible new business model for the industry as a whole. However, at US$0.99 the price of a song is way above marginal costs. In such a situation economic forces should drive down the price. The only thing that could prevent this is an effective copy protection that is not hacked within weeks or months. Falling prices would be bad news for the music industry. But they might ultimately trigger demand for some type of micro payment scheme that is capable of dealing effectively with transactions well below US$1. A German player that seems well positioned to benefit from this development is FirstGate. Other potential contenders are T-Pay and WEB.DE.

11.2.3.2 C2C Payments and the Success of PayPal

PayPal is by far the most successful new internet payment scheme. In the US, it has grown phenomenally and it is currently expanding into other countries. The question is whether PayPal will be as successful in these countries as in the US. As has been discussed above, in giro countries there are already electronic C2C payment mechanisms. Thus, unlike in the US, there is no gap in the payments market of these countries that needs to be filled. For instance, the common means of payment for eBay purchases is the credit transfer, mostly initiated via the internet. Thus, for eBay transactions, PayPal does not fill a gap but competes with established means of payment. As in other markets, the essential parameters of competition are service and price. With respect to prices, PayPal has already been forced to go below the price level in the US. With respect to service, PayPal seems to be offering more convenience to its users than most traditional payment methods used on the internet. Most notably, the payment process can be integrated into the eBay purchasing process, allowing the seller to automatically fill in the PayPal payment order for the buyer. Another important ‘plus’ is the immediate notification of the payee that payment has been made. Finally, PayPal also offers buyer protection.
PayPal has to compete with the credit transfer – a payment mechanism that is well known to European users. Furthermore, although a credit transfer requires more time for funds to travel from payer to payee (2–3 days), the funds are credited to the giro account of the seller. In the case of a PayPal payment, they still have to be transferred from the PayPal account to the giro account. Finally, banks and payment service providers are implementing systems that allow the credit transfer to be integrated into the shopping process.³ Given the competition from new innovative internet payment systems and established payment methods, it is still open whether PayPal can carry its momentum from the US to continental Europe. At the moment, the use of PayPal in the German market is still limited, but growing.

Table 11.3. PayPal results Q3 2006

Total revenues (USD million)             340
Number of accounts (million)
  total                                  122.5
  Germany                                > 3
Active accounts (million)*               30.9
Total payment volume (USD million)       9,123

Source: eBay Inc. financial results and own estimates
* Accounts that made or received at least one payment in the last quarter
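Section 11.2.3.1 notes that Apple aggregates credit card transactions to keep costs in check. The arithmetic behind this can be sketched as follows. The fee schedule used here (a fixed fee of 0.25 plus 2% per card transaction) is purely hypothetical and chosen only to make the effect visible; actual card fees are not given in this chapter.

```python
def separate_fees(amounts, fixed_fee, percent_fee):
    """Total fees if every purchase is processed as its own card transaction."""
    return sum(fixed_fee + percent_fee * amount for amount in amounts)


def aggregated_fees(amounts, fixed_fee, percent_fee):
    """Total fees if all purchases are bundled into a single card transaction."""
    return fixed_fee + percent_fee * sum(amounts)


# Hypothetical example: ten songs at 0.99 each, fixed fee 0.25, variable fee 2%.
purchases = [0.99] * 10
FIXED_FEE = 0.25      # assumption, not a figure from the text
PERCENT_FEE = 0.02    # assumption, not a figure from the text

separate = separate_fees(purchases, FIXED_FEE, PERCENT_FEE)
bundled = aggregated_fees(purchases, FIXED_FEE, PERCENT_FEE)
print(f"separate: {separate:.3f}, aggregated: {bundled:.3f}")
```

Under these assumptions the variable fee is the same in both cases, while the fixed fee is paid once instead of ten times, which is why aggregation matters most for low-value purchases.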

11.3 Internet Payments in Germany: A View from Consumers and Merchants

Like every market, the payment services market has a supply and a demand side. On the supply side there are payment service providers (banks, e-money institutions and others). On the demand side there are merchants and consumers. From the merchants’ point of view, payment instruments should include a payment guarantee, should be user-friendly and cheap and should provide a wide reach (in terms of customers). Consumers desire safe and simple payment methods which are widely accepted and secured by the legal framework. When designing and implementing an internet payment system, payment solution providers and payment service providers have to take the – sometimes conflicting – interests of these two groups into account.

11.3.1 The Consumers’ View

The information on consumers’ views provided in this section is based on an online survey carried out by the Institute for Economic Policy Research in 2005. A total of 15,342 participants took part in the 8th online survey ‘Internet payment systems: The consumers’ view’. Most of the participants are male (72%), relatively experienced in using the net (93%), have been online for at least three years (88%) and are highly educated (37% have a university degree); the largest age group is 26 to 35 years (29%). The survey reveals that when it comes to internet payments, traditional payment methods known from the offline world dominate (see Table 11.4). Almost 78% have used the online transfer (credit transfer via internet banking), more than 62% have used cash on delivery, around 55% have used debits and 53% a paper-based credit transfer. Finally, 47% have used credit cards. New internet payment systems such as collection/billing, e-mail systems, prepaid systems or mobile payments have been used much less.

3 The German internet payment system ‘giropay’ (based on online banking) was launched by a group of German banks in co-operation with PayPal. It can be used to fund PayPal transactions.


Table 11.4. Which internet payment system have you been using when purchasing on the internet?

Payment system                 %
Mobile phone                   1.5
Prepaid systems                9.0
E-mail systems                 18.1
Collection/Billing             29.6
Paper based debit              36.3
Credit card                    47.3
Paper based credit transfer    52.9
Online debit                   54.2
Cash on delivery               62.5
Online credit transfer         77.6

Source: (Krueger et al. 2005). N=(12518−13234), in per cent of participants, multiple entries possible

The survey also shows that consumer behaviour towards hard and soft goods differs. Whereas 97% of participants have bought hard goods, only 65% have bought soft goods. It is noteworthy, however, that the willingness to pay for digital goods rose in 2005. In particular, the very experienced participants are willing to pay for digital goods (87%). As shown in Figure 2, the credit card is used most often for settling digital goods. Hard goods, by contrast, were mostly paid for by credit transfer, with direct debits in second place. The low average price of many soft goods (songs, articles, pictures, etc.) and the possibility to make a payment and receive the purchased goods instantaneously seem to support collection/billing systems, which are in second place. High-value e-goods were more often settled by online banking or credit card.

The security of e-commerce and e-payments is a much discussed topic. Our analysis suggests that lack of appropriate payment methods may not be a big issue. This seems to contradict the common perception that customers are reluctant to purchase via the internet because they are concerned about the security of internet payment systems. 26% of participants of IZV8 have had negative experiences with the online purchase process. This contrasts with ‘only’ 8% who have had negative experiences with the online payment process (see Figure 11.2). Fortunately, both percentages have been decreasing over time. Thus, the security of internet payments does not seem as bad as is often assumed. Still, in spite of these results, the subjective impression of the security of e-payments is much less reassuring: 61% of the participants feel unsafe when buying goods online. To be sure, the respondents in the survey are mostly experienced internet users. Thus, the results are not representative of the population as a whole. However, they demonstrate that the lack of trust with respect to e-commerce and the internet may reflect low confidence in the internet in general rather than a specific reservation against particular payment products. Thus, once people get used to using the net, they also become more confident about carrying out transactions over the net. If this is true, new payment systems may not be required to enhance consumer trust. Rather than looking for new ways of payment, innovators should seek to make existing payments more user friendly. After all, the payments industry is – to a large extent – a service industry.

Figure 11.2. Negative experience when purchasing/paying on the internet, Source: (Krueger et al. 2005)

11.3.2 The Merchants’ View

The merchants’ view is based on data from the third merchant survey conducted by ECC-Handel and the Institute for Economic Policy Research in 2005 (see Van Baal 2005). In this survey, 449 companies took part. With respect to payment methods offered, the merchant survey confirms the results of the consumer survey. Most merchants offer payment methods which are known from the offline world. Special internet payment systems are rare. Pre-paid credit transfer (82%), cash on delivery (COD) (64%), credit transfer after invoice (49%) and direct debit (47%) are named most often. Up to 28% of the shops accept credit cards (SSL or CVV/CVC-based) or PayPal. Most of the dedicated internet payment systems, for instance telephone-based systems, collection/billing systems or prepaid systems, are accepted by only a small fraction of internet merchants. However, in some segments of the market, these payment systems may have a higher penetration rate.

Figure 11.3. Which methods of payment do you accept? (n = 368), Source: (Van Baal 2005)
[Stacked bar chart, 0% to 100%. Answer categories: Yes; No, but planned for 2006; No, not planned. Payment methods shown: payment in advance, cash on delivery, invoice, debit entry, credit card with SSL encryption, PayPal, credit card verified by CVC/CVV, credit card with verification, other, collection/billing systems, online transfer, credit card without encryption, prepaid solutions, payment platform, mobile payment, GeldKarte, payment per telephone.]

The adoption of a payment system depends, as in the consumer survey, on the type of goods (see Figure 11.4). In particular, shops which sell digital goods, e.g. music, require particular online payment systems. Digital goods can be delivered ‘per click’ and therefore payments should be just as fast. Therefore, the share of direct debits, collection/billing and credit cards is relatively high, whereas credit transfers (for prepayment or after invoice) and COD are not widely offered by merchants selling digital goods. When selling physical goods the rate of diffusion is more important than speed of payment. Therefore invoice, prepayment and COD are more common in this area.

Figure 11.4. Does your company offer the following payment methods for sales on the internet? Share of companies answering ‘yes’, Source: (Van Baal 2005) (n1 = 321 suppliers of physical goods, n2 = 54 suppliers of digital goods, n3 = 74 suppliers of both types of goods; * sign. at 90%, ** sign. at 95%, *** sign. at 99%)
[Bar chart, 0 to 100 per cent, with three series: physical goods; digital goods; physical and digital goods. Payment methods shown: payment in advance***, cash on delivery***, invoice***, debit entry, credit card, PayPal**, other, collection/billing systems***, online transfer***, prepaid solutions***, payment platform***, mobile payment***, GeldKarte*, payment per telephone***.]

In the media, a lot of attention has been given to the risks consumers face when purchasing over the internet. However, in many cases, merchants are the ones facing most of the risk. Many merchants deliver before payment has been received – or even before it has been initiated. In addition, they often accept means of payment that allow payment reversals by the payer. Therefore, from the point of view of the merchant, risk management is essential. The survey shows that risk is, indeed, an issue and that merchants use different strategies to deal with risk. On average, the merchants participating in the survey have an expected loss rate between 0.6% and 1% (Figure 11.5). Considered on its own, this percentage is not as dramatic as media reports suggest. However, it compares unfavourably with the offline world, where the loss rate for direct debits initiated at the POS is 0.12%. Moreover, a substantial portion of online traders experience far higher loss rates: 8% suffer losses above 5% and 2.5% even suffer losses of more than 10%. These results show how important proper risk management is.


Payment losses (% of turnover)    Share of participants
0 - 0.5%                          37.6%
0.6 - 1%                          23.8%
1.1 - 3%                          18.1%
3.1 - 5%                          12.4%
5.1 - 10%                         5.7%
more than 10%                     2.5%

Figure 11.5. How large are internet payment losses for your company in percent of your turnover? (n = 282; estimates of participants; for example: 18.1% of the participants have payment losses of 1.1 to 3% of their turnover.), Source: (Van Baal 2005)
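The figures reported for Figure 11.5 can be checked with a few lines of arithmetic. The Python sketch below assumes that the ‘expected loss rate between 0.6% and 1%’ quoted in the text refers to the bucket that contains the median respondent; that reading is an assumption, as are the function names. It also adds up the two highest buckets to obtain the share of merchants with losses above 5% of turnover.

```python
# Shares of merchants per loss bucket as reported in Figure 11.5
# (percent of participants, n = 282).
loss_distribution = [
    ("0-0.5%", 37.6),
    ("0.6-1%", 23.8),
    ("1.1-3%", 18.1),
    ("3.1-5%", 12.4),
    ("5.1-10%", 5.7),
    ("more than 10%", 2.5),
]


def median_bucket(distribution):
    """Return the label of the bucket that contains the median respondent."""
    total = sum(share for _, share in distribution)
    running = 0.0
    for label, share in distribution:
        running += share
        if running >= total / 2:
            return label
    raise ValueError("empty distribution")


def share_from(distribution, first_label):
    """Sum the shares of first_label and of every bucket listed after it."""
    labels = [label for label, _ in distribution]
    start = labels.index(first_label)
    return sum(share for _, share in distribution[start:])


median = median_bucket(loss_distribution)
high_loss_share = share_from(loss_distribution, "5.1-10%")
print(median, round(high_loss_share, 1))
```

Under this reading the median merchant falls into the 0.6 to 1% bucket, and the two highest buckets together account for roughly 8% of the participants.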

Figure 11.6. The following instruments are suitable to restrict the risk of payment losses. Which does your company use? (251 ≥ n ≥ 217), * cash on delivery
[Stacked bar chart, 0% to 100%. Answer categories: Used; Not used. Instruments shown: selective use of COD* or advance, checking of customer data, internal check of blocking list, delivery generally only with cash on, address check, check of customer credit-worthiness, amount limit for new customers, own scoring for risk valuation, risk depending offer of payment systems, payment guarantee by payment system, identification of customer by bank, insurance for payment losses by 3rd party.]


Risk management starts with the selection of appropriate payment methods. In addition, to minimize risk, other instruments such as third party guarantees, address verification or selective strategies can be implemented. According to the results (Figure 11.6), many merchants prefer simple solutions. Selective use of COD or payment in advance was mentioned most often (70%). Checking account or card numbers is in second place (55%). The use of external data and scoring models is much less popular.

11.4 SEPA

For a long time, retail payments were not an area of particular interest to regulators. However, this has been changing over the years. Increasingly, consumer protection laws, antitrust laws and prudential supervision are becoming relevant for retail payments. Moreover, in the EU, the movement towards a single market has become increasingly important for the area of payments. The EU Commission has set the goal of a Single Euro Payments Area (‘SEPA’) and is putting a lot of pressure on the payments industry to abolish all differences between intra-EU payments and national payments. While trade in goods and services has become ever more integrated within the EU, payment systems were mostly confined within national borders. Even after the introduction of the Euro, cross-border payments within the EU remained expensive and cumbersome. The EU Commission became increasingly irritated with this situation and finally resorted to regulation to change it. In 2001 it introduced Regulation (EC) No 2560/2001 on cross-border payments in euro to mandate equal pricing for national and cross-border payments denominated in EUR. Regulation 2560/2001 has become something of a wake-up call for the EU banking industry. In 2002, 42 banks and the three European Credit Sector Associations launched the European Payments Council (EPC). The task of the EPC is to coordinate the activities of the banking industry to achieve the integration of the euro area payment system. The main activities are the creation of a European credit transfer, a European direct debit system and European standards for card and cash payments. Additionally, the EPC works on internet payments and mobile payments. The EPC has set an ambitious timeline that envisions the introduction of new European payment instruments by 2008 and the gradual adoption of these instruments until 2010. The EPC co-operates with the European Central Bank and the EU Commission.
Both institutions are putting a lot of pressure on banks to stay within the proposed time frame. It is still a point of debate whether European and national solutions should coexist or whether, ultimately, there should only be European solutions. The banks favour an approach that allows for co-existence. But the EU Commission and the European Central Bank are strongly in favour of discontinuing all national solutions. It remains to be seen whether shutting down all national solutions will always benefit customers. For instance, few expect that a European debit card scheme will be as cheap as the Dutch PIN-system. Consequently, Dutch merchants will not see themselves as winners of SEPA. Equally, a European direct debit may offer less protection to bank customers than the current German system, which is highly popular with German account holders.

The drive towards SEPA also has important implications for internet payments. As has been shown above, the world of internet payments mirrors the offline world in many respects. Thus, the only truly pan-European means of payment is the credit card. Most other methods are more or less national. To some extent, specialised service providers have tried to overcome this situation. Companies like Bibit/RBS Worldpay and GlobalCollect have tried to connect to a large number of national systems in order to offer merchants the possibility to let customers in various countries all use their local payment instruments. This has worked up to a point. However, it seems immediately obvious that such a procedure comes at a cost. Moreover, the EU Commission as well as the ECB are pressing for a genuinely European system that does not involve any cross-country differences. Thus, as the standard payment systems of the offline world are transformed, the same will have to happen to internet payment systems. This process will partly be automatic. For instance, once there is a European direct debit, merchants can offer direct debit throughout Europe. This definitely is a plus. However, it remains to be seen how consumers will react, because the European direct debit will not be quite the same as the direct debit to which they have grown accustomed. Thus, in countries like Germany, consumers may well end up with fewer rights with respect to direct debit payments.

11.5 Outlook

The payment requirements of e-commerce have been met to an astonishing degree by existing payment systems. In some cases, existing systems were used without any change (paper check, cash on delivery, credit transfer). In other cases, there were smaller modifications (credit card or debit payment without signature). Finally, there are cases of more complex adaptations to the internet (Verified by Visa, Secure Code, the integrated online credit transfer). The strong performance of existing payment systems has made it difficult for innovative newcomers to enter the market. This is particularly true for the giro countries, which found it easier to adapt a largely paperless system to the new requirements imposed by e-commerce. But even in the check countries, a similar development may follow. Moving from paper check to e-check may ultimately lead to a system that looks very much like an electronic debit system. In addition, ACH payments are gaining market share. Thus, in the long run, there may be convergence between the two systems. Where would that leave new internet payment schemes? Strictly speaking, if the US moved towards a giro system, there would be much less need for a C2C system such as PayPal. However, in the US, PayPal has already reached critical mass and may therefore be able to survive – in particular if it can manage to gain a high reputation for Quality of Service (QoS).


That would leave only micro payments as a potential market for innovative new payment schemes. The problem is that the size of this market is constrained by the prevalence of advertising finance and of subscriptions. Moreover, aggregation can be used to reduce the per-transaction costs of conventional payment methods. Still, many new firms have set their hopes on micropayments and the ‘pay per click’ model. In Germany, for instance, there are Firstgate click&buy, T-Pay and Web.cent – to name but a few. These payment systems offer convenient ways of billing small amounts. The flip side is the low diffusion of these systems and their restricted application area. Thus, it remains to be seen whether such systems will be able to gain critical mass.
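The remark that aggregation reduces per-transaction costs can be illustrated with a small calculation. The Python sketch below assumes a purely hypothetical fee schedule for a conventional payment method (a fixed fee of 0.30 plus 3% of the amount); neither the schedule nor the function names come from the chapter. It compares ten separate purchases of 0.99 each with a single aggregated payment over the same total.

```python
def transaction_fee(amount, fixed_fee=0.30, percentage=0.03):
    """Fee for one payment under a hypothetical fixed-plus-percentage schedule."""
    return fixed_fee + percentage * amount


def fee_share_separate(purchases):
    """Fees as a share of turnover when every purchase is paid separately."""
    total_fee = sum(transaction_fee(amount) for amount in purchases)
    return total_fee / sum(purchases)


def fee_share_aggregated(purchases):
    """Fees as a share of turnover when all purchases are billed as one payment."""
    total = sum(purchases)
    return transaction_fee(total) / total


purchases = [0.99] * 10  # ten songs at 0.99 each
separate = fee_share_separate(purchases)
aggregated = fee_share_aggregated(purchases)
print(f"separate: {separate:.1%}, aggregated: {aggregated:.1%}")
```

With these assumed parameters the fixed fee is charged once instead of ten times, so the fee burden drops from roughly a third of turnover to about 6%. Real fee schedules differ, but the mechanism is the same whenever a fixed per-transaction component exists.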

References

Adams J (2003) How Europe pays. European Card Review March/April: 11−16
Bank for International Settlements (BIS) (2006) Statistics on payment and settlement systems in selected countries (‘Red Book’). Basel
Bank for International Settlements (BIS) (2004) Survey of e-money and internet and mobile payments. Basel
Böhle K (2001) The Potential of Server-based Internet Payment Systems – An attempt to assess the future of Internet payments – ePSO Background Paper No. 3. Institute for Prospective Technological Studies, Sevilla (http://epso.jrc.es/Docs/Backgrnd-3.pdf)
Celent (2002) Innovation in Internet Payments: The Plot Against Credit Cards. Celent Communication, New York
European Central Bank (ECB) (2004) Payment and securities settlement systems in the European Union (‘Blue Book’). Frankfurt/M
EU Directive (2000) Directive 2000/46/EC of the European Parliament and of the Council of 18 September 2000 on the taking up, pursuit and prudential supervision of the business of electronic money institutions. Official Journal of the European Communities L 275 of 27 October
EU Regulation (2001) Regulation (EC) No 2560/2001 of the European Parliament and of the Council of 19 December 2001 on cross-border payments in euro. Official Journal L 344 of 28 December 2001
European Monetary Institute (EMI) (1994) Report to the Council of the European Monetary Institute on Prepaid Cards, by the Working Group on EU Payment Systems. Frankfurt/M
FAVED (2006) Chiffres Clés. Vente à distance e-commerce. (www.faved.com)
Furche A, Wrightson G (1997) Computer Money. Internet- und Kartensysteme. Ein systematischer Überblick. dpunkt, Heidelberg
Furche A, Wrightson G (2000) Why do stored value systems fail? Netnomics 2:1, March: 37−47
Greenspan A (1997) Fostering Financial Innovation: The Role of Government. In: Dorn, James A (ed) The Future of Money in the Information Age. Cato Institute, Washington, D.C.: 45−50
Insites (2004a) 6.000.000 nederlanders kochten het voorbije jaar via internet. (www.insites-consulting.com) (download 2004)
Insites (2004b) 2.250.000 Belgian made online purchases in the past year. (www.insites-consulting.com) (download 2004)
Krueger M, Leibold K, Smasal D (2005) Internet Payment Systems: The consumer’s view – results of the 8th online survey. University of Karlsruhe. (http://www.iww.uni-karlsruhe.de/izv8)
Krueger M (2001a) Innovation and Regulation – The Case of E-money Regulation in the EU – ePSO Background Paper No. 5. Institute for Prospective Technological Studies, Sevilla. (http://epso.jrc.es/Docs/Backgrnd-5.pdf)
Krueger M (2001b) Offshore E-Money Issuers and Monetary Policy. First Monday 6:10 (October) (http://firstmonday.org/issues/issue6_10/krueger/index.html)
Online Publishers Association (OPA) (2005) Online Paid Content US Market Spending Report, October
Van Baal S, Krueger M, Hinrichs J-W (2005) Internet-Zahlungssysteme aus Sicht der Händler: Ergebnisse der Umfrage IZH3. Institut für Handelsforschung, Universität Köln, September

CHAPTER 12 Grid Computing for Commercial Enterprise Environments Daniel Minoli

12.1 Introduction

Grid Computing1 (or, more precisely, a “Grid Computing System”) is a virtualized distributed computing environment. Such an environment aims at enabling the dynamic “runtime” selection, sharing, and aggregation of (geographically) distributed autonomous resources based on the availability, capability, performance, and cost of these computing resources, and, simultaneously, also based on an organization’s specific baseline and/or burst processing requirements. When people think of a grid, the idea of an interconnected system for the distribution of electricity, especially a network of high-tension cables and power stations, comes to mind. In the mid-1990s the grid metaphor was (re)applied to computing, by extending and advancing the 1960s concept of “computer time sharing”. The grid metaphor strongly illustrates the relation to, and the dependency on, a highly-interconnected networking infrastructure.

This chapter is a survey of the Grid Computing field as it applies to corporate environments and it focuses on the potential advantages of the technology in these environments. The goal is to serve as a familiarization vehicle for the interested Information Technology (IT) professional. It should be noted at the outset, however, that no claim is made here that there is a single or unique solution to a given computing problem; Grid Computing is one of a number of available solutions in support of optimized distributed computing. Corporate IT professionals, for whom this text is intended, will have to perform appropriate functional, economic, business-case, and strategic analyses to determine which computing approach ultimately is best for their respective organizations. Furthermore, it should be noted that Grid Computing is an evolving field and, so, there is not always one canonical, normative, universally-accepted, or axiomatically-derivable view of “everything grid-related”; it follows that occasionally we present multiple views, multiple interpretations, or multiple perspectives on a topic, as it might be perceived by different stakeholders of communities of interest.

1 The material in this chapter is summarized from the book (Minoli D 2005) A Networking Approach to Grid Computing, Wiley, New York.

This chapter begins the discussion by providing a sample of what a number of stakeholders define Grid Computing to be, along with an assay of the industry. The precise definition of what exactly this technology encompasses is still evolving and there is not a globally-accepted normative definition that is perfectly nonoverlapping with other related technologies. Grid Computing emphasizes (but does not mandate) geographically-distributed, multi-organization, utility-based, outsourcer-provided, networking-reliant computing methods. For a more complete treatment of the topic the reader is referred to the textbook [34].

Virtualization is a well known concept in networking, from Virtual Channels in Asynchronous Transfer Mode, to Virtual Private Networks, to Virtual LANs, and Virtual IP Addresses. However, an even more fundamental type of virtualization is achievable with today’s ubiquitous networks: machine cycle and storage virtualization through the auspices of Grid Computing and IP storage. Grid Computing is also known as utility computing, or what IBM calls on-demand computing. Grid computing is a virtualization technology that was talked about in the 1980s and 1990s and entered the scientific computing field in the past ten years. The technology is now beginning to make its presence felt in the commercial computing environment. In the past couple of years there has been a lot of press and market activity, and a number of proponents see major penetration in the immediate future. IBM, Sun, Oracle, AT&T, and others are major players in this space. Grid computing cannot really exist without networks (the “grid”), since the user is requesting computing or storage resources that are located miles or continents away.
The user need not be concerned about the specific technology used in delivering the computing or storage power: all the user wants and gets is the requisite “service”. One can think of grid computing as a middleware that shields the user from the raw technology itself. The network delivers the job requests anywhere in the world and returns the results, based on an established service level agreement. The advantages of grid computing are that different hardware can be mixed and matched in the network; that the cost is lower because there is a better, statistically-averaged utilization of the underlying resources; and that availability is higher because, if a processor fails, another processor is automatically switched into service. Think of an environment of a Redundant Array of Inexpensive Computers (RAIC), similar to the concept of RAID. Grid Computing is intrinsically network-based: resources are distributed all over an intranet, an extranet, or the Internet. Users can also get locally-based virtualization by using middleware such as VMware that allows a multitude of servers right in the corporate Data Center to be utilized more efficiently. Typically corporate servers are utilized for less than 30−40% of their available computing power. Using virtualization mechanisms the firm can improve utilization, increase availability, reduce costs, and make use of a plethora of mix-and-match processors; at a minimum this leads to server consolidation.
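The claim that shared resources achieve a better, statistically-averaged utilization can be made concrete with a toy calculation. The Python sketch below uses made-up load figures for three applications whose peaks occur at different times of day; the application names, the numbers and the function names are all illustrative assumptions. It compares the capacity needed when each application has dedicated servers sized for its own peak with the capacity needed when one shared pool covers the peak of the combined load.

```python
# Hypothetical load (in arbitrary CPU units) for three applications,
# sampled at six points over one day. Illustrative numbers only.
loads = {
    "trading":   [5, 10, 90, 60, 10, 5],
    "reporting": [80, 40, 5, 5, 10, 60],
    "web":       [10, 20, 30, 40, 90, 30],
}


def combined_load(loads):
    """Total load across all applications at each sampling point."""
    return [sum(values) for values in zip(*loads.values())]


def dedicated_capacity(loads):
    """Each application gets its own servers, sized for its own peak."""
    return sum(max(series) for series in loads.values())


def shared_capacity(loads):
    """One shared pool only has to cover the peak of the combined load."""
    return max(combined_load(loads))


def utilization(loads, capacity):
    """Average combined load divided by the installed capacity."""
    total = combined_load(loads)
    return sum(total) / len(total) / capacity


dedicated = dedicated_capacity(loads)
shared = shared_capacity(loads)
util_dedicated = utilization(loads, dedicated)
util_shared = utilization(loads, shared)
print(dedicated, shared, round(util_dedicated, 2), round(util_shared, 2))
```

In this constructed example the dedicated setup needs 260 units and runs at about 38% average utilization, while the shared pool needs 125 units and runs at 80%. The gain depends entirely on the peaks not coinciding; applications that peak together benefit far less from sharing.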


Security is a key consideration in grid computing. Users want to obtain their services in a trustworthy and confidential manner. Then there is the desire for guaranteed levels of service and predictable, reduced costs. Finally, there is the need for standardization, so that a user with appropriate middleware client software can transparently reach any registered resource in the network. Grid Computing supports the concept of the Service Oriented Architecture, where clients obtain services from loosely-coupled service-provider resources in the network. Web Services based on the Simple Object Access Protocol (SOAP) and Universal Description, Discovery and Integration (UDDI) protocols are now key building blocks of a grid environment [35].
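The idea of a client reaching “any registered resource” through middleware can be sketched as a lookup against a registry, using the selection criteria named in the introduction (availability, capability and cost). The Python example below is a deliberately simplified, in-memory stand-in: it does not use SOAP or UDDI, and the `Resource` fields, the resource names and the `select_resource` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """One registered compute resource; all fields are hypothetical."""
    name: str
    available: bool
    cpu_cores: int
    cost_per_hour: float


def select_resource(registry, min_cores):
    """Pick the cheapest available resource that meets the core requirement."""
    candidates = [
        r for r in registry if r.available and r.cpu_cores >= min_cores
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.cost_per_hour)


registry = [
    Resource("cluster-a", available=True, cpu_cores=64, cost_per_hour=12.0),
    Resource("cluster-b", available=False, cpu_cores=128, cost_per_hour=9.0),
    Resource("cluster-c", available=True, cpu_cores=32, cost_per_hour=5.0),
    Resource("cluster-d", available=True, cpu_cores=96, cost_per_hour=10.5),
]

chosen = select_resource(registry, min_cores=48)
print(chosen.name if chosen else "no suitable resource")
```

The caller states only what it needs (here, a minimum number of cores) and never names a machine, which is the transparency the text describes. A production grid would add authentication, service levels and a standardized discovery protocol on top of this basic matching step.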

12.2 What is Grid Computing and What are the Key Issues?

In its basic form, the concept of Grid Computing is straightforward: with Grid Computing an organization can transparently integrate, streamline, and share dispersed, heterogeneous pools of hosts, servers, storage systems, data, and networks into one synergistic system, in order to deliver agreed-upon service at specified levels of application efficiency and processing performance. Additionally, or alternatively, with Grid Computing an organization can simply secure commoditized “machine cycles” or storage capacity from a remote provider, “on-demand”, without having to own the “heavy iron” to do the “number crunching”. Either way, to an end-user or application this arrangement (ensemble) looks like one large, cohesive, virtual, transparent computing system ([11]; [30]). Broadband networks play a fundamental enabling role in making Grid Computing possible and this is the motivation for looking at this technology from the perspective of communication. According to IBM’s definition ([48]; [24]), “a grid is a collection of distributed computing resources available over a local or wide area network that appear to an end user or application as one large virtual computing system. The vision is to create virtual dynamic organizations through secure, coordinated resource-sharing among individuals, institutions, and resources. Grid Computing is an approach to distributed computing that spans not only locations but also organizations, machine architectures, and software boundaries to provide unlimited power, collaboration, and information access to everyone connected to a grid.” “… The Internet is about getting computers to talk together; Grid Computing is about getting computers to work together.
Grid will help elevate the Internet to a true computing platform, combining the qualities of service of enterprise computing with the ability to share heterogeneous distributed resources – everything from applications, data, storage and servers.”

Another definition, this one from The Globus Alliance (a research and development initiative focused on enabling the application of grid concepts to scientific and engineering computing), is as follows [28]: “The grid refers to an infrastructure that enables the integrated, collaborative use of high-end computers, networks, databases, and scientific instruments owned and managed by multiple organizations. Grid applications often involve large amounts of data and/or computing and often require secure resource sharing across organizational boundaries, and are thus not easily handled by today’s Internet and Web infrastructures.”

Yet another industry-formulated definition of Grid Computing is as follows ([14]; [15]): “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. A grid is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose. The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. This sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization (VO).”

Whereas the Internet is a network of communication, Grid Computing is seen as a network of computation: the field provides tools and protocols for resource sharing of a variety of IT resources. Grid Computing approaches are based on coordinated resource sharing and problem solving in dynamic, multi-institutional VOs.
A (short) list of examples of possible VOs include: application service providers, storage service providers, machine-cycle providers, and members of industry-specific consortia. These examples, among others, represent an approach to computing and problem solving based on collaboration in data-rich and computation-rich environments ([18]; [37]). The enabling factors in the creation of Grid Computing systems in recent years have been the proliferation of broadband (optical-based) communications, the Internet, and the World Wide Web infrastructure, along with the availability of low-cost, high-performance computers using standardized (open) Operating Systems ([9]; [14]; [18]). The role communications as a fundamental enabler will be emphasized throughout this chapter. Prior to the deployment of Grid Computing, a typical business application had a dedicated server platform of servers and an anchored storage device assigned to each individual server. Applications developed for such platforms were not able to share resources, and from an individual server’s perspective it was not possible, in general, to predict, even statistically, what the processing load would be at different times. Consequently, each instance of an application needed to have its own excess capacity to handle peak usage loads. This predicament typically resulted in higher overall costs than would otherwise need to be the case [23]. To address these lacunae, Grid Computing aims at exploiting the opportunities afforded by the synergies, the economies of scale, and the load smoothing that result from the ability to share and aggregate distributed computational capabilities, and deliver


these hardware-based capabilities as a transparent service to the end-user (see footnote 2). To reinforce the point, the term “synergistic” implies “working together so that the total effect is greater than the sum of the individual constituent elements.” From a service provider perspective, Grid Computing is somewhat akin to an Application Service Provider (ASP) environment, but with a much higher level of performance and assurance [5]. Specialized ASPs, known as Grid Service Providers (GSPs), are expected to emerge to provide grid-based services, including, possibly, “open-source outsourcing services.”

Grid Computing started out as the simultaneous application of the resources of many networked computers to a single (scientific) problem [14]. Grid Computing has been characterized as the “massive integration of computer systems” [45]. Computational grids have been used for a number of years to solve large-scale problems in Science and Engineering. The noteworthy fact is that, at this juncture, the approach can already be applied to a mix of mainstream business problems. Specifically, Grid Computing is now beginning to make inroads into the commercial world, including financial services operations, making the leap forward from such scientific venues as research labs and academic settings [23].

The possibility exists, according to the industry, that with Grid Computing companies can save as much as 30% of certain key line items of the operations budget (in an ideal situation), which is typically a large fraction of the total IT budget [41]. Companies spend, on average, 6% of their top-line revenues on IT services each year (see footnote 3); for example, a $10B/yr Fortune 500 company might spend $600M/yr on IT. Grid middleware vendors assert that cluster computing (aggregating processors in parallel-based configurations) yields reductions in IT costs and costs of operations that are expected to reach 15% by 2005 and 30% by 2007–8 in most early-adopter sectors.
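The budget arithmetic above can be made concrete with a short sketch. The 6% IT share, the $10B revenue figure, and the 15% and 30% savings rates come from the text; the share of the IT budget that counts as operations (run_share below) is not given in the text and is purely a hypothetical placeholder, as is the function name.

```python
def it_budget_savings(revenue, it_share=0.06, run_share=0.70, savings_rate=0.30):
    """Return (it_budget, run_budget, savings), all in the unit of `revenue`.

    run_share is a HYPOTHETICAL assumption: the fraction of the IT budget
    spent on operations (the text only says it is "a large fraction").
    """
    it_budget = revenue * it_share        # e.g., 6% of top-line revenue
    run_budget = it_budget * run_share    # assumed operations portion
    savings = run_budget * savings_rate   # savings on that portion only
    return it_budget, run_budget, savings


# Example from the text: a $10B/yr company, at the two quoted savings rates.
for rate in (0.15, 0.30):
    it_budget, run_budget, savings = it_budget_savings(10e9, savings_rate=rate)
    print(f"IT budget ${it_budget / 1e6:,.0f}M, "
          f"operations ${run_budget / 1e6:,.0f}M (assumed), "
          f"savings at {rate:.0%}: ${savings / 1e6:,.0f}M")
```

The point of the sketch is that the quoted percentages apply to selected line items rather than to the whole IT budget, so the absolute figure depends heavily on the assumed operations share.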
Use of enterprise grids (middleware-based environments to harvest unused “machine cycles”, thereby displacing otherwise-needed growth costs) is expected to result in a 15% savings in IT costs by the year 2007–8, growing to a 30% savings by 2010 to 2012 [10]. These potential savings are what this chapter is all about.

In 1994 this author published the book Analyzing Outsourcing, Reengineering Information and Communication Systems (McGraw-Hill), calling attention to the possibility that companies could save 15−20% or more in their IT costs by considering outsourcing; the trends of the mid-2000s have, indeed, validated this (then) timely assertion [32]. At this juncture we call early attention to the fact that the possibility exists for companies to save as much as 15−30% of certain key line items of the IT operations (run-the-engine) budget by using Grid Computing and/or related computing/storage virtualization technologies. In effect, Grid Computing, particularly the utility computing aspect, can be seen as another form of outsourcing. Perhaps utility computing will be the next phase of outsourcing and be a major trend in the decade of the 2010s. As we discuss later in the book, the evolving Grid Computing standards can be used by companies to deploy a next-generation kind of “open source outsourcing” that has the advantage of offering portability, enabling companies to easily move their business among a variety of pseudo-commodity providers.

This chapter explores practical advantages of Grid Computing and what is needed by an organization to migrate to this new computing paradigm, if it so chooses. The chapter is intended for practitioners and decision-makers in organizations (not necessarily for software programmers) who want to explore the overall business opportunities afforded by this new technology. At the same time, the importance of the underlying networking mechanism is emphasized.

For any kind of new technology, corporate and business decision-makers typically seek answers to questions such as these: (i) “What is this stuff?”; (ii) “How widespread is its present/potential penetration?”; (iii) “Is it ready for prime time?”; (iv) “Are there firm standards?”; (v) “Is it secure?”; (vi) “How do we bill it, since it’s new?”; (vii) “Tell me how to deploy it (at a macro level)”; and (viii) “Give me a self-contained reference... make it understandable and as simple as possible.” Table 12.1 summarizes these and other questions that decision-makers, CIOs, CTOs, and planners may have about Grid Computing. Table 12.2 lists some of the concepts embodied in Grid Computing and other related technologies.

Footnote 2: As implied in the opening paragraphs, a number of solutions in addition to Grid Computing (e.g., virtualization) can be employed to address this and other computational issues; Grid Computing is just one approach.

Footnote 3: The range is typically 2% to 12%. See Minoli D (2007) Enterprise Architecture A thru Z: Frameworks, Business Process Modeling, SOA, and Infrastructure Technology – Migrating to State-of-the-Art Environments, in Enterprises with IT-intensive Assets: Platform, Storage, and Network Virtualization, Taylor & Francis Group.
Grid Computing is also known by a number of other names (although some of these terms have slightly different connotations), such as “grid” (the term “the grid” was coined in the mid-1990s to denote a proposed distributed computing infrastructure for advanced science and engineering); “computational grid”; “computing-on-demand”; “On Demand computing”; “just-in-time computing”; “platform computing”; “network computing”; “computing utility” (the term used by this author in the late 1980s [33]); “utility computing”; “cluster computing”; and “high performance distributed computing”. With regard to nomenclature, in this text, besides the term Grid Computing, we will also interchangeably use the terms grid and computational grid. We use the term grid technology to describe the entire collection of Grid Computing elements, middleware, networks, and protocols.

To deploy a grid, a commercial organization needs to assign computing resources to the shared environment and deploy appropriate grid middleware on these resources, enabling them to play the various roles that need to be supported in the grid (e.g., scheduler, broker, etc.). Some (minor) application re-tuning and/or parallelization may, in some instances, be required; data accessibility will also have to be taken into consideration. A security framework will also be required. If the organization subscribes to the service provider model, then grid deployment would mean establishing adequate access bandwidth to the provider, some possible application re-tuning, and the establishment of security policies (the assumption being that the provider will itself have a reliable security framework).


The concept of providing computing power as a utility-based function is generally attractive to end-users requiring fast transactional processing and “scenario modeling” capabilities. The concept may also be attractive to IT planners looking to control costs and reduce Data Center complexity. The ability to have a cluster, an entire Data Center, or other resources spread across a geography connected by the Internet (and/or, alternatively, connected by an intranet or extranet), operating as a single transparent virtualized system that can be managed as a service, rather than as individual constituent components, likely will, over time, increase business agility, reduce complexity, streamline management processes, and lower operational costs [23].

Grid technology allows organizations to utilize numerous computers to solve problems by sharing computing resources. The problems to be solved might involve data processing, network bandwidth, or data storage, or a combination thereof. In a grid environment, the ensemble of resources is able to work together cohesively because of defined protocols that control connectivity, coordination, resource allocation, resource management, security, and chargeback. Generally, the protocols are implemented in the middleware. The systems “glued” together by a computational grid may be in the same room, or may be distributed across the globe; they may be running on homogeneous or heterogeneous hardware platforms; they may be running on similar or dissimilar operating systems; and they may be owned by one or more organizations. The goal of Grid Computing is to provide users with a single view and/or single mechanism that can be utilized to support any number of computing tasks: the grid leverages its extensive informatics capabilities to support the “number crunching” needed to complete the task, and all the user perceives is, essentially, a large virtual computer undertaking his or her work (developerWorks).
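The “single large virtual computer” view described above can be illustrated with a minimal local sketch. It assumes that a thread pool from Python's standard library stands in for the grid middleware; a real grid would dispatch the pieces to remote machines and add security, resource management, and chargeback. The function names (chunked, crunch, run_on_virtual_computer) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(seq, n_chunks):
    """Split seq into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(seq) // n_chunks))  # ceiling division, at least 1
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def crunch(chunk):
    """The 'number crunching' applied to one piece of the workload."""
    return sum(x * x for x in chunk)


def run_on_virtual_computer(data, workers=4):
    """The caller sees one function call; the pool hides how the work
    is split up and where each piece actually runs."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = pool.map(crunch, chunked(data, workers))
        return sum(partial_results)


data = list(range(1000))
print(run_on_virtual_computer(data) == sum(x * x for x in data))
```

The design choice to expose a single entry point mirrors the transparency goal in the text: the user submits a task and receives a result without knowing which resources did the work.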
In recent years one has seen an increasing roster of published articles, conferences, tutorials, resources, and tools related to this topic (developerWorks). Distributed virtualized Grid Computing technology, as we define it today, is still fairly new, it being only a decade in the making. However, as already implied, a number of the basic concepts of Grid Computing go back as far as the mid-1960s and early 1970s. Recent advances, such as ubiquitous high-speed networking in both private and public venues (e.g., high-speed intranets/high-speed Internet), make the technology more deployable at the practical level, particularly when looking at corporate environments.

Table 12.1. Issues of interest and questions that CIOs/CTOs/Planners have about grid computing

What is Grid Computing and what are the key issues?
    Grid Benefits and Status of Technology
    Motivations For Considering Computational Grids
    Brief History of Grid Computing
    Is Grid Computing Ready for Prime Time?
    Early Suppliers and Vendors
    Challenges
    Future Directions
What are the Components of Grid Computing Systems/Architectures?
    Portal/user interfaces
    User Security
    Broker function
    Scheduler function
    Data management function
    Job management and resource management
Are there stable Standards Supporting Grid Computing?
    What is OGSA/OGSI?
    Implementations of OGSI
    OGSA Services
    Virtual Organization Creation and Management
    Service Groups and Discovery Services
    Choreography, Orchestration, and Workflow
    Transactions
    Metering Service
    Accounting Service
    Billing and Payment Service
    Grid System Deployment Issues and Approaches
    Generic Implementations: Globus Toolkit
Security Considerations – Can Grid Computing be Trusted?
What are the Grid Deployment/Management Issues?
    Challenges and Approaches
    Availability of Products by Categories
    Business Grid Types
    Deploying a basic computing grid
    Deploying more complex computing grid
    Grid Operation
What are the Economics of Grid Systems?
    The Chargeable Grid Service
    The Grid Payment System
How does one pull it all together?
    Communication and networking Infrastructure
    Communication Systems for Local Grids
    Communication systems for national grids
    Communication Systems for Global Grids

Table 12.2. Definition of some key terms

Grid Computing

(Virtualized) distributed computing environment that enables the dynamic “runtime” selection, sharing, and aggregation of (geographically) distributed autonomous (autonomic) resources based on the availability, capability, performance, and cost of these computing resources, and, simultaneously, also based on an organization’s specific baseline and/or burst processing requirements. Enables organizations to transparently integrate, streamline, and share dispersed, heterogeneous pools of hosts, servers, storage systems, data, and networks into one synergistic system, in order to deliver agreed-upon service at specified levels of application efficiency and processing performance. An approach to distributed computing that spans multiple locations and/or multiple organizations, machine architectures, and software boundaries to provide power, collaboration, and information access. Infrastructure that enables the integrated, collaborative use of computers, supercomputers, networks, databases, and scientific instruments owned and managed by multiple organizations. A network of computation: namely, tools and protocols for coordinated resource sharing and problem solving among pooled assets … allows coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. Simultaneous application of the resources of many networked computers to a single problem … concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. Decentralized architecture for resource management, and a layered hierarchical architecture for implementation of various constituent services. Combines elements such as distributed computing, high-performance computing, and disposable computing, depending on the application. Local, metropolitan, regional, national, or international footprint.

Systems may be in the same room, or may be distributed across the globe; they may be running on homogeneous or heterogeneous hardware platforms; they may be running on similar or dissimilar operating systems; and they may be owned by one or more organizations.

Types: (i) Computational grids: machines with set-aside resources stand by to “number-crunch” data or provide coverage for other intensive workloads; (ii) Scavenging grids: commonly used to locate and exploit machine cycles on idle servers and desktop machines for use in resource-intensive tasks; and (iii) Data grids: a unified interface for all data repositories in an organization, and through which data can be queried, managed, and secured. Computational grids can be local, Enterprise grids (also called Intragrids), and Internet-based grids (also called Intergrids). Enterprise grids are middleware-based environments to harvest unused “machine cycles”, thereby displacing otherwise-needed growth costs.

Other terms (some with slightly different connotations): “computational grid”; “computing-on-demand”; “On Demand computing”; “just-in-time computing”; “platform computing”; “network computing”; “computing utility”; “utility computing”; “cluster computing”; and “high performance distributed computing”.

Virtualization

An approach that allows several operating systems to run simultaneously on one (large) computer (e.g., IBM’s z/VM operating system lets multiple instances of Linux coexist on the same mainframe computer). More generally, it is the practice of making resources from diverse devices accessible to a user as if they were a single, larger, homogeneous, appear-to-be-locally-available resource. Dynamically shifting resources across platforms to match computing demands with available resources: the computing environment can become dynamic, enabling autonomic shifting of applications between servers to match demand. The abstraction of server, storage, and network resources in order to make them available dynamically for sharing by IT services, both internal to and external to an organization. In combination with other server, storage, and networking capabilities, virtualization offers customers the opportunity to build more efficient IT infrastructures. Virtualization is seen by some as a step on the road to utility computing.

Clusters

Aggregating of processors in parallel-based configurations, typically in a local environment (within a Data Center); all nodes work cooperatively as a single unified resource. Resource allocation is performed by a centralized resource manager and scheduling system. Comprised of multiple interconnected independent nodes that co-operatively work together as a single unified resource; unlike grids, cluster resources are typically owned by a single organization. All users of clusters have to go through a centralized system that manages allocation of resources to application jobs. Cluster management systems have centralized control, complete knowledge of system state and user requests, and complete control over individual components.

(Basic) Web Services (WS)

Web Services provide standard infrastructure for data exchange between two different distributed applications (grids provide an infrastructure for aggregation of high-end resources for solving large-scale problems). Web Services are expected to play a key constituent role in the standardized definition of Grid Computing, since Web Services have emerged as a standards-based approach for accessing network applications.

Peer-to-peer (P2P)

P2P is concerned with the same general problem as Grid Computing, namely, the organization of resource sharing within virtual communities. The grid community focuses on aggregating distributed high-end machines such as clusters, whereas the P2P community concentrates on sharing low-end systems such as PCs connected to the Internet. Like P2P, Grid Computing allows users to share files (many-to-many sharing). With grid, the sharing is not only in reference to files, but also other IT resources.
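The first definition in Table 12.2 describes runtime selection of resources based on availability, capability, performance, and cost. A minimal sketch of such a selection step follows. The Resource fields, the scoring rule (cheapest total job cost among eligible resources), and all sample values are hypothetical illustrations; they are not taken from any particular grid middleware.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    available: bool        # availability
    cpus: int              # capability
    speed: float           # performance, in work units per hour
    cost_per_hour: float   # cost


def select_resource(pool: List[Resource], min_cpus: int,
                    work_units: float) -> Optional[Resource]:
    """Pick the available, capable resource with the lowest total job cost."""
    eligible = [r for r in pool if r.available and r.cpus >= min_cpus]
    if not eligible:
        return None

    def job_cost(r: Resource) -> float:
        hours = work_units / r.speed   # faster machines finish sooner
        return hours * r.cost_per_hour

    return min(eligible, key=job_cost)


# Hypothetical resource pool.
pool = [
    Resource("a", True, 8, 1.0, 2.0),
    Resource("b", True, 16, 2.0, 3.0),
    Resource("c", False, 32, 4.0, 1.0),   # busy, so never selected
    Resource("d", True, 2, 1.0, 0.5),
]
chosen = select_resource(pool, min_cpus=4, work_units=10)
print(chosen.name if chosen else "no eligible resource")
```

A production broker would weigh further factors, such as the burst and baseline requirements mentioned in the definition, but the filter-then-rank structure is the same.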

As far back as 1987 this researcher was advocating the concept of Grid Computing in internal Bell Communications Research White Papers (e.g., in Special Report SR-NPL-000790, an extensive plan written by the author listing progressive data services that could be offered by local telcos and RBOCs, entitled A Collection of Potential Network-Based Data Services [33]). In a section called “Network for a Computing Utility” it was stated: “The proposed service provides the entire apparatus to make the concept of the Computing Utility possible. This includes as follows: (1) the physical network over which the information can travel, and the interface through which a guest PC/workstation can participate in the provision of machine cycles and through which the service requesters submit jobs; (2) a load sharing mechanisms to invoke the necessary servers to complete a job; (3) a reliable security mechanism; (4) an effective accounting mechanism to invoke the billing system; and, (5) a detailed directory of servers... Security is one of the major issues for this service, particularly if the PC is not fully dedicated to this function, but also used for other local activities. Virus threats, infiltration and corruption of


data, and other damage must be appropriately addressed and managed by the service; multi-task and robust operating systems are also needed for the servers to assist in this security process … The Computing Utility service is beginning to be approached by the Client/Server paradigm now available within a Local Area Network (LAN) environment … This service involves capabilities that span multiple 7-layer stacks. For example, one stack may handle administrative tasks, another may invoke the service (e. g., Remote Operations), still another may return the results (possibly a file), and so on … Currently no such service exists in the public domain. Three existing analogues exist, as follows: (1) timesharing service with a centralized computer; (2) highly-parallel computer systems with hundreds or thousands of nodes (what people now call cluster computing), and (3) gateways or other processors connected as servers on a LAN. The distinction between these and the proposed service is the security and accounting arenas, which are much more complex in the distributed, public (grid) environment … This service is basically feasible once a transport and switching network with strong security and accounting (chargeback) capabilities is deployed, as shown in Figure... A high degree of intelligence in the network is required … a physical network is required … security and accounting software is needed … protocols and standards will be needed to connect servers and users, as well as for accounting and billing. These protocols will have to be developed before the service can be established …”

12.3 Potential Applications and Financial Benefits of Grid Computing

Grid proponents take the position that Grid Computing represents a “next step” in the world of computing, and that Grid Computing promises to move the Internet evolution to the next logical level. According to some ([46], [47], [48], among others), “utility computing is a positive, fundamental shift in computing architecture,” and many businesses will be completely transformed over the next decade by using grid-enabled services, as these businesses integrate not only applications across the Internet but also raw computer power and storage. Furthermore, proponents prognosticate that infrastructure will appear that will be able to connect multiple regional and national computational grids, creating a universal source of pervasive and dependable computing power that will support new classes of applications [3]. Most researchers, however, see Grid Computing as an evolution, not a revolution. In fact, Grid Computing can be seen as the latest and most complete evolution of more familiar developments, such as distributed computing, the Web, peer-to-peer (P2P) computing, and virtualization technologies [27].

Some applications of Grid Computing, particularly in the scientific and engineering arena, include, but are not limited to, the following [6]:
• Distributed Supercomputing/Computational science;
• High-Capacity/Throughput Computing: large-scale simulation/chip design and parameter studies;
• Content sharing, e.g., sharing digital contents among peers;
• Remote software access/renting services: ASPs and Web Services;
• Data-intensive computing: drug design, particle physics, stock prediction;
• On-demand, real-time computing: medical instrumentation and mission critical initiatives;
• Collaborative Computing (eScience, eEngineering, …): collaborative design, data exploration, education, e-learning;
• Utility Computing/Service-Oriented Computing: new computing paradigm, new applications, new industries, and new business.

The benefits gained from Grid Computing can translate into competitive advantages in the marketplace. For example, the potential exists for grids to ([27], [9]):
• Enable resource sharing
• Provide transparent access to remote resources
• Make effective use of computing resources, including platforms and data sets
• Reduce significantly the number of servers needed (25−75%)
• Allow on-demand aggregation of resources at multiple sites
• Reduce execution time for large-scale, data processing applications
• Provide access to remote databases and software
• Provide load smoothing across a set of platforms
• Provide fault tolerance
• Take advantage of time zone and random diversity (in peak hours, users can access resources in off-peak zones)
• Provide the flexibility to meet unforeseen emergency demands by renting external resources for a required period instead of owning them
• Enable the realization of a Virtual Data Center

(Naturally, there are also challenges associated with a grid deployment, this field being new and evolving.)

As implied by the last bullet point, there is a discernible IT trend afoot toward virtualization and on-demand services. Virtualization (and supporting technology; see footnote 4) is an approach that allows several operating systems to run simultaneously on one (large) computer. For example, IBM’s z/VM operating system lets multiple instances of Linux coexist on the same mainframe computer. More generally, virtualization is the practice of making resources from diverse devices accessible to a user as if they were a single, larger, homogeneous, appear-to-be-locally-available resource. Virtualization supports the concept of dynamically shifting resources across platforms to match computing demands with available resources: the computing environment can become dynamic, enabling autonomic shifting of applications between servers to match demand [38].
There are well-known advantages in sharing resources, as a routine assessment of the behavior of the M/M/1 queue (memoryless/memoryless/1 server queue) versus the M/M/m queue (memoryless/memoryless/m servers queue) demonstrates: a single more powerful queue is more efficient than a group of discrete queues of comparable aggregate power.

Footnote 4: Virtualization can be achieved without Grid Computing, but many view virtualization as a step towards the goal of deploying Grid Computing infrastructures.

Grid Computing represents a development in virtualization: as we have stated, it enables the abstraction of distributed computing and data resources such as processing, network bandwidth, and data storage to create a single system image; this grants users and applications seamless access (when properly implemented) to a large pool of IT capabilities. Just as an Internet user views a unified instance of content via the Web, a Grid Computing user essentially sees a single, large virtual computer [27]. “Virtualization”, the driving force behind Grid Computing, has been a key factor since the earliest days of electronic business computing.

Studies have shown that when problems can be parallelized, such as in the case of data mining, records analysis, and billing (as may be the case in a bank, securities company, financial services company, insurance company, etc.), then significant savings are achievable. Specifically, when a classical model may require, say, $100K to process 100K records, a grid-enabled environment may take as little as $20K to process the same number of records. Hence, the bottom line is that Fortune 500 companies have the potential to save 30% or more in the run-the-engine costs on the appropriate line item of their IT budget. Grid Computing can also be seen as part of a larger re-hosting initiative and underlying IT trend at many companies (where alternatives such as Linux® or possibly Windows Operating Systems could, in the future, be the preferred choice over the highly reliable, but fairly costly, UNIX solutions). While each organization is different and the results vary, the directional cost trend is believable. Vendors engaged in this space include (but are not limited to) IBM, Hewlett-Packard, Sun, and Oracle.
IBM uses “on-demand” to describe its initiative; HP has its Utility Data Center (UDC) products; Sun has its N1 Data-center Architecture; and Oracle has the 10g family of “grid-aware” products. Several software vendors also have a stake in Grid Computing, including but not limited to Microsoft, Computer Associates, Veritas Software, and Platform Computing [2]. Commercial interest in Grid Computing is on the rise, according to market research published by Gartner. The research firm estimated that 15% of corporations adopted a utility (grid) computing arrangement in 2003, and that the market for utility services in North America would increase from US$8.6 billion in 2003 to more than US$25 billion in 2006. By 2006, 30% of companies would have some sort of utility computing arrangement, according to Gartner [2]. Based on published statements, IBM expects the sector to move into “hypergrowth” in 2004, with “technologies … moving ‘from rocket science to business service’”, and the company had a target of doubling its grid revenue during that year [12]. According to economists, proliferation of high performance cluster and Grid Computing and Web Services (WSs) applications will yield substantial productivity gains in the United States and worldwide over the next decade [10]. A recent report from research firm IDC concluded that 23% of IT services will be delivered from offshore centers by 2007 [31]. Grid Computing may be a mechanism to enable companies to reduce costs, yet keep the jobs, the intellectual capital, and the data from migrating abroad. While distributed computing does enable the idea


of “remoting” functions, with Grid Computing this “remoting” can be done to some in-country regional location rather than to a 3rd World location (just as electric and water grids have regional or near-country scope, rather than far-flung 3rd World remote scope). IT jobs that migrate abroad (particularly if terabytes of data about U.S. citizens and our government become resident in, and owned by, politically unstable 3rd World countries) have, in the opinion of this researcher, National Security/Homeland Security risk implications in the long term.

While there are advantages with grids (e.g., potential reduction in the number of servers of 25 to 50% and in related run-the-engine costs), some companies have reservations about immediately implementing the technology. Some of this hesitation relates to the fact that the technology is new, and in fact may be overhyped by the potential providers of services. Other issues may be related to “protection of turf”: eliminating vast arrays of servers implies reduction in Data Center space, reduction in management span of control, reduction in operational staff, reduction in budget, etc. This is the same issue that was faced in the 1990s regarding outsourcing (e.g., see [32]). Other reservations may relate to the fact that infrastructure changes are needed, and this may have a short-term financial disbursement implication. Finally, not all situations, environments, and applications are amenable to a grid paradigm.
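The resource-sharing argument made earlier in this section (M/M/1 versus M/M/m) can be checked numerically. The sketch below uses the standard M/M/1 mean response time, 1/(μ − λ), and the Erlang C formula for M/M/m; the specific rates (four servers, each 80% loaded) are hypothetical values chosen only for illustration.

```python
from math import factorial


def mm1_response_time(lam, mu):
    """Mean time in system for an M/M/1 queue (arrival rate lam, service rate mu)."""
    if lam >= mu:
        raise ValueError("unstable queue: lam must be below mu")
    return 1.0 / (mu - lam)


def mmm_response_time(lam, mu, m):
    """Mean time in system for an M/M/m queue: total arrival rate lam,
    m servers of rate mu each, one shared waiting line (Erlang C)."""
    a = lam / mu    # offered load
    rho = a / m     # utilization per server
    if rho >= 1:
        raise ValueError("unstable queue: lam must be below m * mu")
    tail = a ** m / factorial(m) / (1.0 - rho)
    p_wait = tail / (sum(a ** k / factorial(k) for k in range(m)) + tail)
    return p_wait / (m * mu - lam) + 1.0 / mu


# Hypothetical example: 4 servers, each with arrival rate 0.8 and service rate 1.0.
m, lam, mu = 4, 0.8, 1.0
separate = mm1_response_time(lam, mu)            # 4 isolated queues
pooled = mmm_response_time(m * lam, mu, m)       # 4 servers, shared queue
single_fast = mm1_response_time(m * lam, m * mu) # 1 server, 4x as fast
print(f"separate: {separate:.2f}  pooled: {pooled:.2f}  single fast: {single_fast:.2f}")
```

With equal aggregate capacity, the isolated queues give the longest mean response time, the pooled servers do better, and the single fast server does best, which is the efficiency ordering the text relies on.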

12.4 Grid Types, Topologies, Components, Layers – A Basic View

Grid Computing embodies a combination of a decentralized architecture for resource management, and a layered hierarchical architecture for implementation of various constituent services (GRID Infoware). A Grid Computing system can have local, metropolitan, regional, national, or international footprints. In turn, the autonomous resources in the constituent ensemble can span a single organization, multiple organizations, or a service provider space. Grids can be focused on the pooled assets of one organization or span virtual organizations that use a common suite of protocols to enable grid users and applications to run services in a secure, controlled manner [37]. Furthermore, resources can be logically aggregated for a long period of time (say, months or years), or for a temporary period of time (say, minutes, days, or weeks).

Grid Computing often combines elements such as distributed computing, high-performance computing, and disposable computing, depending on the application of the technology and the scale of the operation. Grids can, in practical terms, create a virtual supercomputer out of existing servers, workstations, and even PCs, to deliver processing power not only to a company’s own stakeholders and employees, but also to its partners and customers. This metacomputing environment is achieved by treating such IT resources as processing power, memory, storage and network bandwidth as pure commodities. Like an electricity or water network,


computational power can be delivered to any department or any application where it is needed most at any given time, based on specified business goals and priorities. Furthermore, Grid Computing allows charge-back on a per-usage basis rather than for a fixed infrastructure cost [23].

Present-day grids encompass the following types [11]:
• Computational grids, where machines with set-aside resources stand by to “number-crunch” data or provide coverage for other intensive workloads;
• Scavenging grids, commonly used to find and harvest machine cycles from idle servers and desktop machines for use in resource-intensive tasks (scavenging is usually implemented in a way that is unobtrusive to the owner/user of the processor); and
• Data grids, which provide a unified interface for all data repositories in an organization, and through which data can be queried, managed, and secured.

As already noted, no claim is made herewith that there is a single solution to a given problem; Grid Computing is one of the available solutions. For example, while some of the machine-cycle inefficiencies can be addressed by virtual servers/re-hosting (e.g., VMWare, MS VirtualPC and VirtualServer, LPARs from IBM, partitions from Sun and HP, which do not require a grid infrastructure), one of the possible approaches to this inefficiency issue is, indeed, Grid Computing. Grid Computing does have an emphasis on geographically distributed, multi-organization, utility-based (outsourced), networking-reliant methods, while clustering and re-hosting have more (but not exclusively) of a datacenter-focused, single-organization-oriented approach. Organizations will need to perform appropriate functional, economic, and strategic assessments to determine which approach is, in the final analysis, best for their specific environment. (This text is on Grid Computing; hence, our emphasis is on this approach, rather than on other possible approaches, such as virtualization.)
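The scavenging-grid idea described above (harvesting cycles from idle machines) can be sketched as a toy dispatcher. Everything here is a hypothetical illustration: the idle threshold, the machine names, and the greedy rule that hands each work chunk to whichever idle machine frees up first. Real scavenging middleware would also monitor the owner's activity and withdraw work when the machine is needed.

```python
import heapq


def scavenge(machines, chunk_hours, idle_threshold=0.25):
    """Assign work chunks to idle machines.

    machines: dict of name -> current utilization (0.0 to 1.0)
    chunk_hours: list of estimated run times, one per work chunk
    Returns (plan, makespan), where plan maps each idle machine to the
    indices of the chunks it receives.
    """
    idle = sorted(name for name, util in machines.items() if util <= idle_threshold)
    if not idle:
        raise ValueError("no idle machines to harvest")
    heap = [(0.0, name) for name in idle]   # (busy-until time, machine)
    heapq.heapify(heap)
    plan = {name: [] for name in idle}
    for index, hours in enumerate(chunk_hours):
        busy_until, name = heapq.heappop(heap)   # machine that frees up first
        plan[name].append(index)
        heapq.heappush(heap, (busy_until + hours, name))
    makespan = max(busy_until for busy_until, _ in heap)
    return plan, makespan


# Hypothetical pool: two idle desktops and one busy server.
machines = {"pc1": 0.05, "pc2": 0.10, "srv1": 0.90}
plan, makespan = scavenge(machines, [2, 2, 1, 1])
print(plan, makespan)
```

The busy server is left alone, which reflects the requirement that scavenging be unobtrusive to the owner of the processor.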
Figures 12.1, 12.2, and 12.3 provide a pictorial view of some Grid Computing environments. Figure 12.1 depicts the traditional computing environment, where a multitude of often-underutilized servers support a disjoint set of applications and data sets. As implied by this figure, the typical IT environment prior to Grid Computing operated as follows: a business-critical application runs on a designated server. While the average utilization may be relatively low, during peak cycles the server in question can get overtaxed. As a consequence of this momentary overload, the application can slow down, experience a halt, or even stall. In this traditional instance, the large data set that the application is analyzing exists only in a single data store (while multiple copies of the data could exist, it would not be easy with the traditional model to synchronize the databases if two programs independently operated aggressively on the data at the same time). Server capacity and access to the data store limit how quickly the desired results can be returned. Machine cycles on other servers cannot be constructively utilized, and available disk capacity remains unused [27].
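A short worked example may help quantify the underutilization described above. It uses the four utilization readings labeled in Figure 12.1; the assumptions that all servers are identical and that pooled servers may run up to an 85% ceiling are illustrative only and are not stated in the text.

```python
import math

# Utilization readings for Servers A-D as labeled in Figure 12.1.
utilization = {"A": 0.33, "B": 0.45, "C": 0.20, "D": 0.52}


def servers_needed(loads, target=0.85):
    """Fewest identical servers that carry the combined load at or below `target`."""
    total = sum(loads)
    # The small epsilon guards against floating-point noise when total / target
    # lands exactly on a whole number.
    return max(1, math.ceil(total / target - 1e-9))


total_load = sum(utilization.values())         # about 1.5 server-equivalents of work
average = total_load / len(utilization)        # about 37.5% per dedicated server
pooled = servers_needed(utilization.values())  # servers needed if the load is pooled
per_server = total_load / pooled               # utilization if the load splits evenly

print(f"average utilization: {average:.1%}")
print(f"servers needed when pooled: {pooled}")
print(f"utilization per pooled server: {per_server:.1%}")
```

Under these assumptions the four dedicated servers average 37.5% utilization, while the same work would fit on two pooled servers at about 75% each. The sketch ignores peak loads, which is exactly the case the paragraph above warns about.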

12 Grid Computing for Commercial Enterprise Environments


Figure 12.1. Standard computing environment [Diagram: Users 1–4 submit Job Requests 1–4 over the corporate intranet (bridged/routed LAN segments and VLANs) and receive Results 1–4; in the data center, SAN-attached Servers A, B, C, and D show utilization of 33%, 45%, 20%, and 52%, alongside Data Sets 1–4.]

Figure 12.2 depicts an organization-owned computational grid; here, a middleware application running on a Grid Computing Broker manages a small(er) set of processors and an integrated data store. A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. In a grid environment, workload can be broken up and sent in manageable pieces to idle server cycles. Not all applications are necessarily instantly migratable to a grid environment without at least some re-design. While legacy business applications may, a priori, fall into this class, a number of Fortune 500 companies are indeed looking into how such legacy applications can be modified and/or re-tooled so that they can run on grid-based infrastructures. A scheduler sets rules and priorities for routing jobs on a grid-based infrastructure. When servers and storage are enabled for Grid Computing, copies of the data can be stored in formerly unused space and easily made available [27]. A grid also provides mechanisms for managing the distributed data in a seamless way [39], [19]. Grid middleware provides the facilities that allow applications and users to use the grid. Middleware such as Globus [17], Legion [22], and UNICORE (UNiform Interface to Computer Resources) (Huber 2001) provides software infrastructure to handle various challenges of computational and data grids [39]. Figure 12.3 depicts the utility-oriented implementation of a computational grid. This concept is analogous to the electric power network (grid), where power generators are distributed, but users are able to access electric power without concerning themselves with the source of the energy and its pedestrian operational management (URL gridcomputing). As suggested by these figures, Grid Computing

Figure 12.2. Grid computing environment (local implementation)

Figure 12.3. Grid computing environment (remote implementation)

[Diagrams for Figures 12.2 and 12.3; recoverable labels: Users 1–4; Job Requests 1–4 and Results 1–4; corporate intranet (bridged/routed LAN segments and VLANs); middleware; Enterprise Grid and/or InterGrid; Grid Computing Resource Broker; Server/Mainframe A and Server/Mainframe B, with utilization readings of 83% and 85%; data; LAN (e.g., GbE); SAN (e.g., FC, iFCP); data center.]

Figure 12.4. Pictorial view of World Wide InterGrid [Diagram; recoverable labels: Application A, Application B, grid resource brokers, grid information services, a database, routers, and nodes labeled R1, R2, R3, R5, and R6.]

aims to provide seamless and scalable access to distributed resources. Computational grids enable the sharing, selection, and aggregation of a wide variety of geographically distributed computational resources (such as supercomputers, computing clusters, storage systems, data sources, instruments, and developers), and present them as a single, unified resource for solving large-scale computing- and data-intensive applications (e.g., molecular modeling for drug design, brain activity analysis, and high-energy physics). An initial grid deployment at a company can be scaled over time to bring in additional applications and new data. This allows gains in speed and accuracy without significant cost increases. Several years ago, this author coined the phrase “the corporation is the network” [36]. Grid Computing supports this concept very well: with Grid Computing, all a corporation needs to run its IT apparatus is a reliable high-speed network that connects it to a distributed set of virtualized computational resources not necessarily owned by the corporation itself. Grid Computing started out as a mechanism for sharing computational resources distributed all over the world for basic science applications, as illustrated pictorially in Figure 12.4 ([39], [14]). However, other types of resources, such as licenses or specialized equipment, can now also be virtualized in a Grid Computing environment. For example, if an organization’s software license agreement limits the number of users that can use the license simultaneously, license management tools operating in grid mode could be employed to keep track of how many


concurrent copies of the software are active. Such tools keep the number of active copies from exceeding the allowed number and schedule jobs according to priorities defined by the automated business policies. Specialized equipment that is remotely deployed on the network could also be managed in a similar way, thereby reducing the need for the organization to purchase multiple devices, in much the same way that people in the same office today share Internet access or printing resources across a LAN [23]. Some grids focus on data federation and availability; other grids focus on computing power and speed. Many grids involve a combination of the two. For end users, all infrastructure complexity stays hidden [27]. Data (database) federation makes disparate corporate databases look as if the constituent data were all in the same database. Significant gains can be secured if one can work on all the different databases, including selects, inserts, updates, and deletes, as if all the tables existed in a single database (see footnote 5). Almost every organization has significant unused computing capacity, widely distributed among a tribal arrangement of PCs, midrange platforms, mainframes, and supercomputers. For example, if a company has 10,000 PCs with an average computing power of 333 MIPS each, this equates to an aggregate of roughly 3 tera (10^12) floating-point operations per second (TFLOPS) of potential computing power, loosely counting one instruction as one operation. As another example, in the U.S. there are an estimated 300 million computers; at an average computing power of 333 MIPS, this equates to a raw computing power of 100,000 TFLOPS. Mainframes are generally idle 40% of the time; Unix servers are actually “serving” something less than 10% of the time; most PCs do nothing for 95% of a typical day [27]. This is an inefficient situation for customers.
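The data federation idea described above can be sketched in a few lines. This is only a stand-in, not how a commercial federation product works: it uses SQLite's `ATTACH DATABASE` statement so that one connection (playing the federator) can query tables that live in two separate databases (the federatees) as if they were in one. The schema names `sales` and `billing`, the tables, and the rows are hypothetical.

```python
import sqlite3

# The "federator": a connection that holds no business tables of its own.
federator = sqlite3.connect(":memory:")

# Two "federatees", each a separate in-memory database attached under its own name.
federator.execute("ATTACH DATABASE ':memory:' AS sales")
federator.execute("ATTACH DATABASE ':memory:' AS billing")

# Each federatee owns its own table and data (illustrative rows only).
federator.execute(
    "CREATE TABLE sales.orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
federator.execute("CREATE TABLE billing.invoices (order_id INTEGER, paid INTEGER)")
federator.executemany(
    "INSERT INTO sales.orders VALUES (?, ?, ?)",
    [(1, "Acme", 120.0), (2, "Globex", 75.5)],
)
federator.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [(1, 1), (2, 0)],
)

# One statement joins tables from both databases as if they were a single database.
unpaid = federator.execute(
    "SELECT o.customer, o.amount "
    "FROM sales.orders AS o "
    "JOIN billing.invoices AS i ON i.order_id = o.id "
    "WHERE i.paid = 0"
).fetchall()
print(unpaid)
```

The query returns the single unpaid order (Globex, 75.5). In a real federation the federatees would be remote systems of different types, and the federator would need client software for each, as footnote 5 points out.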
The TFLOPS speeds that are possible with Grid Computing enable scientists to address some of the most computationally intensive scientific tasks, from problems in protein analysis that will form the basis for new drug designs, to climate modeling, to deducing the content and behavior of the cosmos from astronomical data [45]. Many scientific applications are also data-intensive, in addition to being computationally intensive. By 2006, several physics projects will produce multiple petabytes (10^15 bytes) of data per year. This has been called “peta-scale” data. PCs now ship with up to 100 gigabytes (GB) of storage (as much as an entire 1990 supercomputer center) [13]: one petabyte would equate to 10,000 of these PCs, or to the PC base of a “smaller” Fortune 500 company. Data grids also have some immediate commercial application. Grid-oriented solutions are the way to manage this sort of storage requirement, particularly from a data access perspective (more than just from a physical storage perspective).

5 The “federator” system operates on the tables in the remote systems, the “federatees”. The remote tables appear as virtual tables in the “federator” database. Client application programs can perform operations on the virtual tables in the “federator” database, but the real persistent storage is in the remote database. Each “federatee” views the “federator” as just another database client connection. The “federatee” is simply servicing client requests for database operations. The “federator” needs client software to access each remote database. Client software for IBM Informix®, Sybase, Oracle, etc. would need to be installed to access each type of federatee (IBM Press Releases).

As grids evolved, the management of “peta-scale” data became burdensome. The confluence of large dataset size, geographic distribution of users and resources, and computationally intensive scientific analyses prompted the development of data grids, as noted earlier ([39]; [8]). Here, a data middleware (usually part of a general-purpose grid middleware) provides facilities for data management. Various research communities have developed successful data middleware such as Storage Resource Broker (SRB) [1], Grid Data Farm [42], and European Data Grid Middleware. These middleware tools have been effective in providing a framework for managing high volumes of data, but they are often incompatible with one another. The key components of Grid Computing include the following (developerWorks):
• Resource management: the grid must be aware of what resources are available for different tasks;
• Security management: the grid needs to ensure that only authorized users can access and use the available resources;
• Data management: data must be transported, cleansed, parceled, and processed; and,
• Services management: users and applications must be able to query the grid in an effective and efficient manner.
More specifically, Grid Computing can be viewed as being composed of a number of logical hierarchical layers. Figure 12.5 depicts a first view of the layered architecture of a grid environment. At the base of the grid stack, one finds the grid fabric, namely, the distributed resources that are managed by a local resource manager with a local policy; these resources are interconnected via local-, metropolitan-, or wide-area networks. The grid fabric includes and incorporates networks; computers such as PCs and processors using operating systems such as Unix, Linux, or Windows; clusters using various operating systems; resource management systems; storage devices; and databases.
The security infrastructure layer provides secure and authorized access to grid resources. Above this layer, the core grid middleware provides uniform access to the resources in the fabric; the middleware is designed to hide the complexities of partitioning, distributing, and load balancing. The next layer, the user-level middleware layer, consists of resource brokers or schedulers responsible

Figure 12.5. One view of Grid Computing Layers


Figure 12.6. Layered Grid Architecture, Global Grid Forum (This presentation is licensed for use under the terms of the Globus Toolkit Public License. See http://www.globus.org/toolkit/download/license.html for the full text of this license.)

for aggregating resources. The grid programming environments and tools layer includes the compilers, libraries, development tools, and so on, that are required to run the applications (resource brokers manage the execution of applications on distributed resources using appropriate scheduling strategies, while grid development tools are used to grid-enable applications). The top layer consists of the grid applications themselves [9]. Building on this intuitive idea of layering, it would be advantageous if industry consensus were reached on the series of layers. Architecture standards are now under development by the Global Grid Forum (GGF). The GGF is an industry advocacy group; it supports community-driven processes for developing and documenting new standards for Grid Computing. The GGF is a forum for exchanging information and defining standards relating to distributed computing and grid technologies. GGF is fashioned after the Grid Forum, the eGrid European Grid Forum, and the Grid Community in the Asia-Pacific. GGF is focusing on open grid architecture standards [48]. Technical specifications are being developed for architecture elements, e.g., security, data, resource management, and information. Grid architectures are being built on Internet protocols and services (for example, communication, routing, name resolution, etc.). The layering approach is used to the extent possible because it is advantageous for higher-level functions to use common lower-level functions. The GGF’s approach has been to propose a set of core services as basic infrastructure, as shown in Figure 12.6. These core services are used to construct high-level, domain-specific solutions. The design principles are: keep the participation cost low; enable local control; support adaptation; and use the “IP hourglass” model, in which many applications use a few core services to support many fabric elements (e.g., operating systems).
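The layer stack described above (the view of Figure 12.5) can be written down as a small data structure. This is a sketch for orientation only: the enum names paraphrase the layer names in the text, and the strict "use only the layers below" rule in `may_call` is an assumption about layered designs in general, not a rule stated by the GGF.

```python
from enum import IntEnum


class GridLayer(IntEnum):
    """Grid layers, numbered from the bottom of the stack to the top."""
    FABRIC = 1
    SECURITY_INFRASTRUCTURE = 2
    CORE_MIDDLEWARE = 3
    USER_LEVEL_MIDDLEWARE = 4
    PROGRAMMING_ENVIRONMENTS_AND_TOOLS = 5
    APPLICATIONS = 6


def layers_below(layer):
    """Layers that a given layer can build on, listed bottom-up."""
    return [lower for lower in GridLayer if lower < layer]


def may_call(caller, callee):
    """In a strictly layered design, a layer uses only the layers beneath it."""
    return callee < caller


for layer in GridLayer:
    print(layer.value, layer.name)
```

For example, `layers_below(GridLayer.CORE_MIDDLEWARE)` yields the fabric and the security infrastructure, which matches the observation that higher-level functions benefit from common lower-level functions.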
In the meantime, Globus Toolkit™ has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols. The toolkit, a “middleware plus” capability, addresses issues of security, information discovery, resource management, data management, communication, fault detection, and portability (The Global Grid Forum).

12.5 Comparison with other Approaches

It is important to note that certain IT computing constructs are not grids, as we discuss next. In some instances, these technologies are the optimal solution for an organization’s problem; in other cases, Grid Computing is the best solution, particularly if in the long term one is especially interested in supplier-provided utility computing. The distinction between clusters and grids relates to the way resources are managed. In the case of clusters (aggregations of processors in parallel-based configurations), resource allocation is performed by a centralized resource manager and scheduling system; also, nodes cooperatively work together as a single unified resource. In the case of grids, each node has its own resource manager, and such a node does not aim at providing a single system view [5]. A cluster comprises multiple interconnected independent nodes that cooperatively work together as a single unified resource. This means that all users of clusters have to go through a centralized system that manages the allocation of resources to application jobs. Unlike grids, cluster resources are almost always owned by a single organization. Actually, many grids are constructed by using clusters or traditional parallel systems as their nodes, although this is not a requirement. An example of a grid that contains clusters as its nodes is the NSF TeraGrid [20]; another example is the World Wide Grid, many of whose nodes are clusters located in organizations such as NRC Canada, AIST-Japan, N*Grid Korea, and the University of Melbourne.
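The difference in resource management between clusters and grids can be made concrete with a toy model. The sketch below is an illustration under simplifying assumptions, not a description of any real scheduler: `ClusterManager`, `GridNode`, `broker_submit`, and the `max_share` policy field are all hypothetical names. The cluster manager sees and controls every node, whereas the grid broker can only ask each autonomous node, which applies its own local policy.

```python
class ClusterManager:
    """Centralized manager with complete knowledge of, and control over, every node."""

    def __init__(self, free_cpus):
        self.free_cpus = dict(free_cpus)  # node name -> free CPUs

    def submit(self, cpus):
        # The manager inspects global state and places the job directly.
        for node, free in self.free_cpus.items():
            if free >= cpus:
                self.free_cpus[node] = free - cpus
                return node
        return None


class GridNode:
    """Autonomous node whose own resource manager enforces a local policy."""

    def __init__(self, name, free_cpus, max_share):
        self.name = name
        self.free_cpus = free_cpus
        self.max_share = max_share  # largest request the local policy will accept

    def offer(self, cpus):
        # The node, not the broker, decides whether to accept the work.
        return cpus <= self.max_share and cpus <= self.free_cpus

    def run(self, cpus):
        self.free_cpus -= cpus


def broker_submit(nodes, cpus):
    """A grid broker can only ask; each node answers for itself."""
    for node in nodes:
        if node.offer(cpus):
            node.run(cpus)
            return node.name
    return None


cluster = ClusterManager({"n1": 4, "n2": 8})
grid = [GridNode("site-a", 16, max_share=4), GridNode("site-b", 8, max_share=8)]

cluster_placement = cluster.submit(6)
grid_placement = broker_submit(grid, 6)
print(cluster_placement, grid_placement)
```

Here the cluster places the 6-CPU job on `n2` because it has room, while on the grid `site-a` declines the job despite having 16 free CPUs, since its local policy caps requests at 4, and the job lands on `site-b`.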
While cluster management systems such as Platform’s Load Sharing Facility, Veridian’s Portable Batch System, or Sun’s Sun Grid Engine can deliver enhanced distributed computing services, they are not grids themselves; these cluster management systems have centralized control, complete knowledge of system state and user requests, and complete control over individual components (such features tend not to be characteristic of a grid proper) [15]. Grid Computing also differs from basic Web Services, although it now makes use of these services. Web Services have become an important component of distributed computing applications over the Internet (Grid Computing using .NET and WSRF.NET 2004). The World Wide Web is not (yet, in itself) a grid: its open, general-purpose protocols support access to distributed resources but not the coordinated use of those resources to deliver negotiated qualities of service [15]. So, while the Web is mainly focused on communication, Grid Computing enables resource sharing and collaborative resource interplay toward common business goals. Web Services provide a standard infrastructure for data exchange between two different distributed applications, while grids provide an infrastructure for the aggregation of high-end resources for solving large-scale problems in science, engineering, and commerce. However, there are similarities as well as dependencies:


(i) as with the World Wide Web, Grid Computing keeps complexity hidden: multiple users enjoy a single, unified experience; and, (ii) Web Services are utilized to support Grid Computing mechanisms: these Web Services will play a key constituent role in the standardized definition of Grid Computing, since Web Services have emerged in the past few years as a standards-based approach for accessing network applications. The recent trend is to implement grid solutions using Web Services technologies; for example, the Globus Toolkit 3.0 middleware is being implemented using Web Services technologies. In this context, low-level Grid Services are instances of Web Services (a Grid Service is a Web Service that conforms to a set of conventions that provide for controlled, fault-resilient, and secure management of stateful services) ([21]; [19]).

Both peer-to-peer (P2P) and Grid Computing are concerned with the same general problem, namely, the organization of resource sharing within virtual organizations (VOs). As is the case with peer-to-peer environments, Grid Computing allows users to share files; but unlike P2P, Grid Computing allows many-to-many sharing; furthermore, with a grid the sharing extends beyond files to other resources as well. The grid community generally focuses on aggregating distributed high-end machines such as clusters, while the P2P community concentrates on sharing low-end systems such as PCs connected to the Internet [9]. Both disciplines take the same general approach to solving this problem, namely, the creation of overlay structures that coexist with, but need not correspond in structure to, underlying organizational structures.
Each discipline has made technical advances in recent years, but each also has, in its current instantiations, a number of limitations: the strengths and weaknesses of the two approaches are complementary, which suggests that the interests of the two communities are likely to grow closer over time [16]. P2P networks can amass computing power, as does the SETI@home project, or share content, as Napster and Gnutella have done in the recent past. Given the number of grid and P2P projects and forums that began worldwide at the turn of the decade, it is clear that interest in the research, development, and commercial deployment of these technologies is burgeoning [9].

Grid Computing also differs from virtualization. Resource virtualization is the abstraction of server, storage, and network resources in order to make them available dynamically for sharing by IT services, both inside and outside an organization. Virtualization is a step along the road to utility computing (Grid Computing) and, in combination with other server, storage, and networking capabilities, offers customers the opportunity to build, according to advocates, an IT infrastructure “without” hard boundaries or fixed constraints [25]. Virtualization has somewhat more of an emphasis on local resources, while Grid Computing has more of an emphasis on geographically distributed inter-organizational resources. The universal problem that virtualization solves in a data center is that of dedicated resources. While dedicating resources does address performance, the method lacks fine granularity. Typically, IT managers make an educated guess as to how many dedicated servers they will need to handle peaks, purchase extra servers, and later find out that a significant portion of these servers are grossly underutilized. A typical data center has a large amount of idle infrastructure, bought and


set up online to handle peak traffic for different applications. Virtualization offers a way of moving resources from one application to another dynamically. However, the specifics of the desired virtualizing effect depend on the specific application deployed [47]. Three representative press-time products are HP’s Utility Data Center, EMC’s VMware, and Platform Computing’s Platform LSF. With virtualization, the logical functions of the server, storage, and network elements are separated from their physical functions (e.g., processor, memory, I/O, controllers, disks, switches). In other words, all servers, storage, and network devices can be aggregated into independent pools of resources. Some elements may even be further subdivided (server partitions, storage LUNs) to provide an even more granular level of control. Elements from these pools can then be allocated, provisioned, and managed, manually or automatically, to meet the changing needs and priorities of one’s business. Virtualization can span the following domains [25]:
i. Server virtualization for horizontally and vertically scaled server environments. Server virtualization enables optimized utilization, improved service levels, and reduced management overhead.
ii. Network virtualization, enabled by intelligent routers, switches, and other networking elements supporting virtual LANs. Virtualized networks are more secure and more able to support unforeseen spikes in customer and user demand.
iii. Storage virtualization (server-, network-, and array-based). Storage virtualization technologies improve the utilization of current storage subsystems, reduce administrative costs, and protect vital data in a secure and automated fashion.
iv. Application virtualization, which enables programs and services to be executed on multiple systems simultaneously. This computing approach is related to horizontal scaling, clusters, and Grid Computing, where a single application is able to execute cooperatively on a number of servers concurrently.
v. Data center virtualization, whereby groups of servers, storage, and network resources can be provisioned or re-allocated on the fly to meet the needs of a new IT service or to handle dynamically changing workloads (Hewlett-Packard Company 2002).
Grid Computing deployment, although potentially related to a re-hosting initiative, is not just re-hosting. As Figure 12.7 depicts, re-hosting implies the reduction of a typically large number of servers (possibly using some older and/or proprietary OS) to a smaller set of more powerful and more modern servers (possibly running open source OSs). This is certainly advantageous from an operations, physical maintenance, and power and space perspective. There are savings associated with re-hosting. However, applications are still assigned to specific servers. Grid Computing, on the other hand, permits the true virtualization of the computing function, as seen in Figure 12.7. Here applications are not pre-assigned to a server; rather, the “runtime” assignment is made based on real-time considerations. (Note: in the bottom diagram, the hosts could be collocated or spread all over the world. When local


Figure 12.7. A comparison with re-hosting

hosts are aggregated in tightly-coupled configurations, they tend to generally be of the cluster parallel-based computing type; such processors, however, can also be nonparallel-computing-based grids, e. g., by running the Globus Toolkit. When geographically dispersed hosts are aggregated in distributed computing configurations


they tend to generally be of the Grid Computing type and not running in a clustered arrangement; Figure 12.7 does not show geography, and the reader should assume that the hosts are arranged in a Grid Computing arrangement.)

In summary, like clusters and distributed computing, grids bring computing resources together. Unlike clusters and distributed computing, which need physical proximity and operating homogeneity, grids can be geographically distributed and can be heterogeneous. Like virtualization technologies, Grid Computing enables the virtualization of IT resources. Unlike virtualization technologies, which virtualize a single system, Grid Computing enables the virtualization of broad-scale and disparate IT resources [27]. Scientific community deployments such as the distributed data processing system being deployed internationally by “Data Grid” projects (e.g., GriPhyN, PPDG, EU DataGrid), NASA’s Information Power Grid, the Distributed ASCI Supercomputer system that links clusters at several Dutch universities, the DOE Science Grid and DISCOM Grid that link systems at DOE laboratories, and the TeraGrid mentioned above, which is being constructed to link major U.S. academic sites, are all bona fide examples of Grid Computing. Multi-site schedulers such as Platform’s MultiCluster can reasonably be called (first-generation) grids. Other examples of Grid Computing include the distributed computing systems provided by Condor, Entropia, and United Devices, which make use of idle desktops; peer-to-peer systems such as Gnutella, which support file sharing among participating peers; and a federated deployment of the Storage Resource Broker, which supports distributed access to data resources [18].

12.6 A Quick View at Grid Computing Standards

One of the challenges with any computing technology is getting the various components to communicate with each other. Nowhere is this more critical than when trying to get different platforms and environments to interoperate. It should, therefore, be immediately evident that the Grid Computing paradigm requires standard, open, general-purpose protocols and interfaces. Standards for Grid Computing are now being defined and are beginning to be implemented by the vendors ([11], [7]). To make the most effective use of the computing resources available, these environments need to utilize common protocols [4]. Standards are the “holy grail” for Grid Computing. Regarding this issue, proponents make the case that we are now indeed entering a new phase of Grid Computing in which standards will define grids in a consistent way, enabling grid systems to become easily built “off-the-shelf” systems. Standards-based grid systems have been called by some “Third-Generation Grids,” or 3G Grids. First-generation “1G Grids” involved local “Metacomputers” with basic services such as distributed file systems and site-wide single sign-on, upon which early-adopter developers created distributed applications with custom communications protocols. Gigabit testbeds extended 1G Grids across distance, and attempts to create “Metacenters” explored issues of inter-organizational integration.


1G Grids were totally custom-made proofs of concept [7]. 2G Grid systems began with projects such as Condor, I-WAY (the origin of Globus), and Legion (the origin of Avaki), where underlying software services and communications protocols could be used as a basis for developing distributed applications and services. 2G Grids offered basic building blocks, but deployment involved significant customization and the filling in of many gaps. Independent deployments of 2G Grid technology today involve enough customized extensions that interoperability among 2G Grid systems is very difficult. This is why the industry needs 3G Grids [7]. By introducing standard technical specifications, 3G Grid technology will have the potential to allow both competition and interoperability, not only among applications and toolkits, but also among implementations of key services. The goal is to mix and match components; but this potential will only be realized if the grid community continues to work at defining standards [7]. The Global Grid Forum community is applying lessons learned from 1G and 2G Grids and from Web services technologies and concepts to create 3G architectures [7]. GGF has driven initiatives such as the Open Grid Services Architecture (OGSA). OGSA is a set of specifications and standards that integrate and leverage the worlds of Web services and Grid Computing (Web services are viewed by some as the “biggest technology trend” in the last five years [29]). With this architecture, a set of common interface specifications supports the interoperability of discrete, independently developed services. OGSA brings together Web standards such as XML (eXtensible Markup Language), WSDL (Web Services Description Language), UDDI (Universal Description, Discovery, and Integration), and SOAP (Simple Object Access Protocol), with the standards for Grid Computing developed by the Globus Project [30].
The Globus Project is a joint effort on the part of researchers and developers from around the world that are focused on grid research, software tools, testbeds, and applications. As TCP/IP (Transmission Control Protocol/Internet Protocol) forms the backbone for the Internet, the OGSA is the backbone for Grid Computing. The recently-released Open Grid Services Infrastructure (OGSI) service specification is the keystone in this architecture [7]. In addition to making progress on the standards front, Grid Computing as a service needs to address various issues and challenges. Besides standardization, some of these issues and challenges include: security; true scalability; autonomy; heterogeneity of resource access interfaces; policies; capabilities; pricing; data locality; dynamic variation in availability of resources; and, complexity in creation of applications (GRID Infoware).
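To give a feel for the Web standards that OGSA builds on, the following sketch assembles a plain SOAP 1.1 request envelope in XML using only the Python standard library. It is a generic illustration, not an OGSA- or OGSI-conformant message: the service namespace `urn:example:grid-job-service`, the `submitJob` operation, and its parameters are hypothetical and do not come from the Globus Toolkit or any GGF specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:grid-job-service"              # hypothetical service namespace


def build_request(operation, params):
    """Build a SOAP 1.1 envelope whose body carries one operation element."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical job submission request; names and values are for illustration only.
message = build_request("submitJob", {"executable": "risk_model", "cpus": 8})
print(message)
```

In a Web-services-based grid, a WSDL document would describe which operations such a service offers, and the envelope would typically travel over HTTP; neither step is shown here.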

12.7 A Pragmatic Course of Investigation on Grid Computing

In order to identify possible benefits to their organizations, planners should understand Grid Computing concepts and the underlying networking mechanisms.


Practitioners interested in Grid Computing are asking basic questions (developerWorks): What do we do with all of this stuff? Where do we start? How do the pieces fit together? What comes next? As implied by the introductory but rather encompassing discussion above, Grid Computing is applicable to enterprise users at two levels:
1. Obtaining computing services over a network from a remote computing service provider; and,
2. Aggregating an organization’s dispersed set of uncoordinated systems into one holistic computing apparatus.
As noted, with Grid Computing organizations can optimize computing and data resources, pool such resources to support large-capacity workloads, share the resources across networks, and enable collaboration [27]. Grid technology allows the IT organization to consolidate and reduce the number of platforms that need to be kept operating. Practitioners should look at networking as not only the best way to understand what Grid Computing is, but, more importantly, the best way to understand why and how Grid Computing is important to IT practitioners (rather than “just another hot technology” that gets researchers excited but never has much effect on what network professionals see in the real world). One can build and deploy grids of a variety of sizes and types: for large or small firms; for a single department or the entire enterprise; and for enterprise business applications or for scientific endeavors. Like many other recent technologies, however, Grid Computing runs the risk of being over-hyped. CIOs need to be careful not to be oversold on Grid Computing: a sound, reliable, conservative economic analysis is, therefore, required that encompasses the true Total Cost of Ownership (TCO) and assesses the true risks associated with this approach. Like the Internet, Grid Computing has its roots in the scientific and research communities. After about a decade of research, open systems are poised to enter the market.
Coupled with a rapid drop in the cost of communication bandwidth, commercial-grade opportunities are emerging for Fortune 500 IT shops searching for new ways to save money. All of this has to be properly weighed against the commoditization of machine cycles (just buy more processors and retain the status quo), and against reduced support costs by way of sub-continent outsourcing of IT support and related application development (just move operations abroad, but otherwise retain the status quo). During the past ten years or so, a tour-de-force commoditization has been experienced in the computing hardware platforms that support IT applications at businesses of all sizes. Some have encapsulated this rapidly occurring phenomenon with the phrase “IT doesn’t matter”. The price-disrupting predicament brought about by commoditization affords new opportunities for organizations, although it also has conspicuous process- and people-dislocating consequences. Grid Computing is but one way to capitalize on this Moore’s-Law-driven commoditization.

We have already noted that at its core Grid Computing is, and must be, based on an open set of standards and protocols that enable communication across heterogeneous, geographically dispersed environments. Therefore, planners should track the work that the standards groups are doing. It is important to understand the significance of standards in Grid Computing, how they affect capabilities and facilities, what standards exist, and how they can be applied to the problems of distributed computation [4].

There is effort involved with resource management and scheduling in a grid environment. When enterprises need to aggregate resources distributed within their organization and prioritize the allocation of resources to different users, projects, and applications based on their Quality of Service (QoS) requirements (call these Service Level Agreements), they need to be concerned about resource management and scheduling. User QoS-based allocation strategies enhance the value delivered by the utility. The need for QoS-based resource management becomes significant whenever more than one competing application or user needs to utilize shared resources [21].

Regional economies will benefit significantly from Grid Computing technologies, as suggested earlier in the chapter, assuming that two activities occur [10]: first, the broadband infrastructure needed to support Grid Computing and Web services must be developed in a timely fashion (this being the underlying theme of this text); and second, states and regions must attract (or grow from within) a sufficient pool of skilled computer and communications professionals to fully deploy and utilize the new technologies and applications.

One can ask: what has been learnt in the past 10 years related to grids that can be taken forward? What are the valid approaches we make normative going forward? What is enough to make it good enough for the commercial market? What is economically viable for the commercial market? What is sufficient to convey the value of grid technology to a prospective end user?
What are the performance levels for businesses: Tera(FLOPS, bps, B) or Mega(FLOPS, bps, B)? Tera or Giga? Peta or Exa? What are the requisite networking mechanisms, without which Grid Computing cannot happen? For a more complete treatment of this topic the reader is referred to the book by D. Minoli, A Networking Approach to Grid Computing, Wiley, 2005, New York.

References

1. Baru C, Moore R, Rajasekar A, Wan M (1998) The SDSC Storage Resource Broker. Proceedings of IBM Centers for Advanced Studies Conference. IBM
2. Bednarz A, Dubie D (2003) How to: How to get to utility computing. Network World, 12/01/03
3. Berman F, Fox G, Hey AJ (2003) Grid Computing: Making the Global Infrastructure a Reality, May 2003
4. Brown MC (2003) Grid Computing – moving to a standardized platform. August 2003, IBM’s developerWorks Grid Library, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, www.ibm.com


5. Buyya R, Frequently Asked Questions. Grid Computing Info Centre, GridComputing Magazine
6. Buyya R (2003) Delivering Grid Services as ICT Commodities: Challenges and Opportunities. Grid and Distributed Systems (GRIDS) Laboratory, Dept. of Computer Science and Software Engineering, The University of Melbourne, Melbourne, Australia, IST (Information Society Technologies), Milan, Italy
7. Catlett C (2003) The Rise Of Third-Generation Grids. Global Grid Forum Chairman, Grid Connections, Fall 2003, Volume 1, Issue 3, The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA
8. Chervenak A, Foster I, Kesselman C, Salisbury C, Tuecke S (1999) The Data Grid: Towards An Architecture For The Distributed Management And Analysis Of Large Scientific Datasets
9. Chetty M, Buyya R (2002) Weaving Computational grids: How Analogous Are they With Electrical grids? Computing In Science & Engineering, July/August 2002, IEEE
10. Cohen RB, Feser E (2003) Grid Computing, Projected Impact in North Carolina’s Economy & Broadband Use Through 2010. Rural Internet Access Authority, September 2003
11. developerWorks staff (2003) Start Here to learn about Grid Computing, August 2003, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States
12. Fellows W (2003) IBM’S Grid Computing Push Continues. Gridtoday, Daily News & Information for the Global Grid Community, October 6, 2003: VOL. 2 NO. 40, Published by Tabor Communications Inc, 8445 Camino Santa Fe, San Diego, California 92121, (858) 625-0070
13. Foster I (2002) The Grid: A New Infrastructure for 21st Century Science. Physics Today, 55 (2) pp. 42–47
14. Foster I, Kesselman C (eds) (1999) The Grid: Blueprint for a Future Computing Infrastructure. Morgan Kaufmann Publishers, San Francisco, CA
15. Foster I (2002) What is the Grid? A Three Point Checklist. Argonne National Laboratory & University of Chicago, July 20, 2002, Argonne National Laboratory, 9700 Cass Ave, Argonne, IL, 60439, Tel: 630 252-4619, Fax: 630 252-5986, mailto:[email protected]
16. Foster I, Iamnitchi A (2003) On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing. 2nd International Workshop on Peer-to-Peer Systems (IPTPS’03), February 2003, Berkeley, CA
17. Foster I, Kesselman C (1997) Globus: A Metacomputing Infrastructure Toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 11 (2) pp. 115–128
18. Foster I, Kesselman C, Tuecke S (2001) The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of High Performance Computing Applications, 15 (3) pp. 200–222
19. Fox G, Pierce M, Gannon RGD, Thomas M (2002) Overview of Grid Computing Environments. GFD-I.9, Feb 2003, Copyright (C) Global Grid Forum (2002). The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA
20. Grid Computing using .NET and WSRF.NET (2004) Tutorial. GGF11, Honolulu, June 6, 2004
21. GRID Infoware Grid Computing, Answers to the Enterprise Architect Magazine Query. Grid Computing Info Centre (GRID Infoware), Enterprise Architect Magazine: http://www.cs.mu.oz.au/~raj/GridInfoware/gridfaq.html
22. Grimshaw AS, Wulf WA, the Legion Team (1997) The Legion Vision of a Worldwide Virtual Computer. Communications of the ACM, 40 (1) pp. 39–45, January 1997
23. Haney M (2003) Grid Computing: Making inroads into financial services. 24 Apr 2003, Issue No: Volume 4, Number 5, IBM’s developerWorks Grid Library, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, www.ibm.com
24. Hawk T (2003) IBM Grid Computing General Manager. Grid Computing Planet Conference and Expo, San Jose, 17 June 2002. Also as quoted by Globus Alliance, Press Release, July 1, 2003
25. Hewlett-Packard Company (2002) HP Virtualization: Computing without boundaries or constraints, Enabling an Adaptive Enterprise. HP Whitepaper, 2002, Hewlett-Packard Company, 3000 Hanover Street, Palo Alto, CA 94304-1185 USA, Phone: (650) 857-1501, Fax: (650) 857-5518, http://www.hp.com
26. Huber V (2001) UNICORE: A Grid Computing environment for distributed and parallel computing. Lecture Notes in Computer Science, 2127 pp. 258–266
27. IBM Press Releases. IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, http://www.ibm.com
28. Kesselman C, Globus Alliance, Press Releases. USC/Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292-6695, Tel: 310 822-1511 x338, Fax: 310 823-6714, mailto:[email protected] http://www.globus.org, mailto:[email protected]
29. Lehmann M (2003) Who Needs Web Services Transactions? Oracle Magazine, http://otn.oracle.com/oraclemagazine, November/December 2003
30. McCommon M (2003) Letter from the Grid Computing Editor: Welcome to the new developerWorks Grid Computing resource. Editor, Grid Computing resource, IBM, April 7, 2003
31. McDougall P (2003) Offshore Hiccups In An Irreversible Trend. Dec. 1, 2003, InformationWeek, CMP Media Publishers, Manhasset, NY
32. Minoli D (1994/5) Analyzing Outsourcing, Reengineering Information and Communication Systems. McGraw-Hill
33. Minoli D (1987) A Collection of Potential Network-Based Data Services. Bellcore/Telcordia Special Report, SR-NPL-000790, Piscataway, NJ
34. Minoli D (2005) A Networking Approach to Grid Computing. Wiley, New York
35. Minoli D (2006) The Virtue of Virtualization. Network World, Feb. 27
36. Minoli D, Schmidt A (1997) Client/Server Applications On ATM Networks. Prentice-Hall/Manning


37. Myer T (2003) Grid Computing: Conceptual flyover for developers. May 2003, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States
38. Otey M (2004) Grading Grid Computing. SQL Server Magazine, January 2004, Published by Windows & .Net Magazine Network, a Division of Penton Media Inc., 221 E. 29th St., Loveland, CO 80538
39. Padala P (2003) A Survey of Grid File Systems. Editor, GFS-WG (Grid File Systems Working Group), Global Grid Forum, September 19, 2003. The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA
40. Smetanikov M (2003) HP Virtualization a Step Toward Planetary Network. Web Host Industry Review (theWHIR.com) April 15, 2003. Web Host Industry Review, Inc. 552 Church Street, Suite 89, Toronto, Ontario, Canada M4Y 2E3, (p) 416-925-7264, (f) 416-925-9421
41. Sun Networks Press Release (2003) Network Computing Made More Secure, Less Complex With New Reference Architectures, Sun Infrastructure Solution. September 17, 2003. Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054, Phone: 1-800-555-9SUN or 1-650-960-1300
42. Tatebe O, Morita Y, Matsuoka S, Soda N, Sekiguchi S (2002) Grid datafarm architecture for petascale data intensive computing. In Henri E. Bal, Klaus-Peter Lohr, and Alexander Reinefeld, editors, Proceedings of the Second IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid2002), Berlin, Germany, IEEE Computer Society pp. 102–110
43. The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA, http://www.ggf.org
44. URL grid computing, http://www.gridcomputing.com/
45. Waldrop M (2002) Hook enough computers together and what do you get? A new kind of utility that offers supercomputer processing on tap. MIT Enterprise Technology Review, May 2002, Technology Review, One Main Street, 7th Floor, Cambridge, MA, 02142, Tel: 617-475-8000, Fax: 617-475-8042
46. Yankee Group (2003) Enterprise Computing & Networking Report on Utility Computing in Next-Gen IT Architectures, Yankee Group, August 2003. Yankee Group, 31 St. James Avenue (aka the Park Square Building), Boston MA 02116, (617) 956-5000
47. Yankee Group (2004) Enterprise Computing & Networking Report on Performance Management Road Map for Utility Computing, Yankee Group, February 2004. Yankee Group, 31 St. James Avenue (aka the Park Square Building), Boston MA 02116, (617) 956-5000
48. Zhang LJ, Chung JY, Zhou Q (2002) Developing Grid Computing applications, Part 1: Introduction of a Grid architecture and toolkit for building Grid solutions. October 1, 2002, Updated November 20, 2002, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, http://www.ibm.com (IBM’s developerWorks/views/grid/articles archives)

CHAPTER 13
Operational Metrics and Technical Platform for Measuring Bank Process Performance
Markus Kress, Dirk Wölfing

13.1 Introduction

The progressive globalization of financial markets is requiring major adaptation on the part of market participants to move beyond national-level competition and achieve international and global competitiveness. The entire banking industry is now focusing on major performance enhancements and gains in domestic market share as a springboard to successful international expansion. Banks are concentrating their efforts on market segments offering the potential for growth and rising profits, resulting in a reorientation within the overall financial services sector. New types of banks, including distribution, processing and portfolio banks, are evolving as the market consolidates due to mergers and acquisitions. This dual trend toward specialization and consolidation is forging banks that will be able to compete in international and global markets.

Performance enhancement efforts are aimed at a complete realignment of internal and inter-corporate processes. In contrast to the trend in recent years, the focus is no longer on cutting costs alone, but rather on simultaneously improving services to customers. Not only must processes become more efficient, they must be made more customer-friendly as well. Attempts are being made to transfer approaches that have proven effective in other industries, particularly manufacturing, to the financial sector. Industrialisation is the buzzword for boosting performance by applying industrial methods.

What is this formula for success being borrowed from industry? Can industrialisation really mean progress in the 21st century, or is it all just another round of hype to be eventually debunked by management pundits? Are industrialised forms of process organisation truly superior to those conventionally employed by service firms?
Industry has seldom relied on intuition to address these types of questions, developing instead measurement techniques to provide answers on the basis of objective criteria, in the true engineering tradition. The same approach is now being applied to process design and realignment in a banking environment and is the subject of the first part of the chapter. After an introductory discussion of measurement systems for tracking business operational performance, we categorise the various types of measurement systems employed. This categorisation provides a basis for defining the concept of performance. We then outline a process model employing metrics in line with this concept of performance. The second part of the chapter presents a model of a service-oriented IT architecture, into which we plug a number of elements for the measurement and reporting of process performance at financial services organisations.

13.2 Industrialisation in the 21st Century?

The principal idea underlying industrial production is the division of labour. An industrialised organisation generates a collective product through a series of controlled process workflows (e.g., assembly line car production). The division of labour occurs within industrial business processes (Richter-von Hagen and Stucky 2004). Industrialisation arose from the analysis of complex workflows based on the principle of the division of labour. This analysis focuses on measuring discrete steps of production labour and rendering them manageable. This led to the invention of the assembly line and to an era in which manufactured products were initially highly standardised with minimal automation. Each production step involved minutely specified, pre-planned physical labour, which is why factory work has long been seen as monotonous. Industrial manufacturing was likewise thought to largely preclude any customisation.

Simplifying and standardising workflows made possible the introduction of mechanised and/or automated production technology. Industrialisation is thus often equated with automation. Automation is a central but not the sole constitutive element of industrialisation, as automation can also be employed in manual labour processes. The essence of industrialisation is the intelligent organisation and control of workflows within a process of discrete production steps.

Modern controls now allow industrial production to be tailored to a wide variety of individual customer requirements. The mechanical assembly line of yore has been replaced by complex IT systems for controlling the manufacturing process. Knowledge- and rule-based high-performance systems allow unprecedented flexibility for process customisation. Upon submitting an order, customers are now able to specify the relevant production information and track production and delivery status online. The production process thus starts and ends with the customer.
The industrial idea of coordinating labour within a segmented production process was initially conceived for the manufacture of goods. Today, however, we see the same principle being employed at call centres in a service environment. The service in question is broken down into individual tasks. Incoming calls are directed so as to provide specialised services of a defined quality within a specific period of time. Planning and controls allow capacity to be adapted to meet customer requirements.

The principle of division of labour can, however, transcend the borders of the individual enterprise. In global markets, processes are designed and managed extending from the customer to the supplier. Companies establish overarching, multi-organisation process chains controlled via purchasing (Supply Chain Management, SCM) and sales (Customer Relationship Management, CRM) systems. Whereas industrialisation was originally concerned with organising processes in order to simplify activities¹, managing complex process steps has now become the key to successful process optimisation in the 21st century. This means abandoning a “radical cure” approach (Hammer and Champy 2003) in favour of ongoing adaptation of business processes to changing external factors by means of optimised process controls (Osterloh and Frost 2003).

13.3 Process Performance Metrics as an Integral Part of Business Performance Metrics

Processes are complex and largely interdependent systems, in which minor changes in parameters (e.g. capacity) may significantly impact overall process performance. These interdependencies and the need for ongoing adaptation to changing market and competitive conditions require an effective system of controls. Operational control systems involve a quantitative presentation of the status quo versus a given target, as well as instruments for analysing the factors that influence the complex system we refer to as “processes”.

Discussion of process management today revolves mainly around challenges to technical process support, the integration of technical process steps into operational systems and the quality of technical control functions in an information technology context (see, for example, Kahno et al. 2005 and Ray et al. 2005 for further reading). The potential of these technical considerations as control parameters, from an operations management perspective, has not yet been adequately considered. In operations management, process performance controls are viewed in the overall context of business performance metrics, quantitative reporting of process performance being seen as an integral part thereof (Sandt 2004; Siebert 1998).

Figure 13.1 provides an overview of concepts of metrics systems. Classic management systems employing financial and managerial accounting, used at the start of the 20th century, were revised and augmented toward the century’s close. Value driver concepts were developed on the financial metrics side based on the idea of shareholder value. A number of different quality management concepts have been developed in addition to financial concepts (e.g. Six Sigma). The Balanced Scorecard method integrates financial and qualitative aspects into a comprehensive control and management system. Methodological concepts based upon this method are currently being discussed (Mende 1995).

¹ Such as Frederick Winslow Taylor, who was the first to break down the labour process into discrete activities, identifying major potential for increasing productivity.


Figure 13.1. Metrics concepts according to (Sandt 2004). [Diagram labels: Metrics concepts for business management; Financial metrics; Non-financial metrics; Accounting systems (DuPont system, ZVEI system, etc.); Quality management concepts; Value driver concepts; Balanced Scorecard; Metrics management concept; Concepts involving select metrics.]

Managing risk is the primary managerial concern in the banking industry, where process-oriented thinking is less developed than in manufacturing. Initial attempts are currently being made to develop quality management systems based on methods such as Six Sigma (Moormann 2006). Process-oriented thinking in a risk management context is subsumed under the concept of operational risk. Frances Frei of the Harvard Business School has been pursuing an interesting value-based approach. In her studies, customer value is defined as the primary metric for multi-channel retail bank distribution. The modern concept of performance encompasses both financial and qualitative yardsticks such as customer retention data and process flexibility, incorporating forward-looking as well as historical elements (Schrank 2002). In the following passage we attempt to integrate this expanded notion of process performance into a value-oriented approach in order to develop a model system of process performance metrics.

13.3.1 Work Processes, Qualitative Criteria and Measuring Process Value

In the academic discussion (cf. Mende 1995), a distinction is made between metrics for process performance and for process execution. Process performance metrics break down into four categories: time, quality, costs and flexibility. Some examples of process execution metrics are: throughput times, defect rate, process reliability, customer relevancy, operational impact, information flow, directability, efficiency, know-how, information systems, motivation and innovative capability. This distinction between process performance and process configuration allows a further distinction between operative concepts and types of metrics:

1. Processes may be viewed either as a series of black-box segments with inputs and outputs, or in terms of specific workflows.
2. Metrics may be strictly financial or may also incorporate non-financial aspects (including synthetic metrics such as scoring, etc.) relevant to analysing workflows.

The input/output concept and financial metrics involve a greater level of abstraction, making them particularly useful for simplification in an “end-to-end” view. Within such a view, overall process performance may be broken down into a manageable number of process segments. A detailed analysis could then concentrate on individual segments while remaining focused on the greater context. Figure 13.2 shows a matrix of these conceptual views with examples of each.

Figure 13.2. Cluster diagram of various metrics systems with examples. [Matrix of process concept by metric type. Input/output concept: non-financial metrics (amount, time, quality, disruptive factors, etc.), financial metrics (cost-income ratio, ROI, etc.). Mapped workflows: non-financial metrics (processing/set-up times, defects/rejects per machine, defects per employee), financial metrics (ROI per machine, cost per workstation).]

Financial metrics based on the input/output concept can be employed to ascertain the value created through a given process. The “inputs” used are the cash outflows resulting from the process and the “outputs” are the cash inflows generated.


The present value of these cash flows represents the contribution of a given process to the sum total of added value created by an enterprise.² The presentation below mainly follows the input/output concept, since it abstracts from the specific process.
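The value contribution of a process can be illustrated with a short calculation. The following Python sketch is only an illustration under stated assumptions: the function name `process_value`, the period cash flows and the 8% discount rate are hypothetical and are not taken from the chapter. It discounts the net cash flow of each period (inflows minus outflows) to its present value, which is the core of the discounted cash flow idea.

```python
from typing import Sequence


def process_value(inflows: Sequence[float],
                  outflows: Sequence[float],
                  discount_rate: float) -> float:
    """Present value of the net cash flows of one process.

    Period t (t = 1, 2, ...) is discounted by (1 + discount_rate) ** t.
    All names and figures here are illustrative assumptions.
    """
    if len(inflows) != len(outflows):
        raise ValueError("inflows and outflows must cover the same periods")
    return sum(
        (cash_in - cash_out) / (1.0 + discount_rate) ** t
        for t, (cash_in, cash_out) in enumerate(zip(inflows, outflows), start=1)
    )


# Hypothetical three-period example for a single process.
example_value = process_value(
    inflows=[120.0, 130.0, 140.0],
    outflows=[100.0, 100.0, 100.0],
    discount_rate=0.08,
)
print(round(example_value, 2))
```

Working with net cash flows per period keeps the sketch independent of any depreciation method; whether end-of-period discounting is appropriate for a given process is itself an assumption to be checked.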

13.4 Controls for Strategic Alignment and Operational Customer and Resource Management

Metrics are useful in delineating a starting situation (current), an objective (target) and the trajectory from current to target levels. Metrics are thus configured as a function of both the trajectory to be traversed and the envisioned target. Three levels of target definition can be identified:

1. The market-oriented strategic level at which the medium- to long-range alignment of the enterprise is defined. The primary focus here is on adapting business processes so as to attain the strategic targets set. Strategic business objectives (market, cost, quality objectives etc.) are transformed into metrics and pursued on this level.
2. The customer-oriented level of products or services at which performance outputs for customers are defined. Processes extending from and to the customer are defined on this level for each service provided. Controls are exercised within the framework of a defined product life cycle.
3. The resource level. The objective on this level is to optimally adapt capacity to short- and medium-term market demand (capacity management). Capacity is a function of personnel skills and IT systems.

² The discounted cash flow (DCF) method is employed for valuing cash flows. DCF has the advantage over traditional profitability measures that it provides a macro-level view over a number of periods without having to arbitrarily posit a particular method of depreciation.

13.5 Metrics Used for Input/Output Analysis

Process performance can be tracked by abstracting from concrete workflows and employing an input/output concept. Breaking down end-to-end processes into process segments is also beneficial for analytical purposes. Process segments exhibit the same characteristics as the end-to-end macro-process, namely inputs and outputs described in terms of the financial criterion cash flow, of output quality and of available capacity. Process steps are viewed primarily in terms of duration and volume. Figure 13.3 presents a model process and process segments, along with the criteria required for measuring process performance.

Figure 13.3. Process segment from an input/output viewpoint. [Diagram labels: Input: cash flows (expenses) based on available IT and personnel capacity; Output: cash flows (income) based on output quality; Duration/Time; Volume.]
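The segment view can be sketched as a small data structure. The Python example below is a minimal illustration, not the authors' model: the names `ProcessSegment` and `end_to_end`, the loan process and all figures are hypothetical. It shows that a roll-up of segments can be described by the same attributes as a single segment.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ProcessSegment:
    """One black-box segment, described only by its inputs and outputs."""
    name: str
    cash_outflow: float    # input side: expenses for IT and personnel capacity
    cash_inflow: float     # output side: income earned for output of agreed quality
    duration_hours: float  # time needed to pass through the segment
    volume: int            # business events handled in the segment


def end_to_end(segments: Iterable[ProcessSegment]) -> ProcessSegment:
    """Roll segments up into one macro-process with the same attributes.

    Assumption for this sketch: segments run one after another, so cash
    flows and durations add up, and the volume of the macro-process is the
    number of business events that reach the last segment.
    """
    segs = list(segments)
    if not segs:
        raise ValueError("at least one segment is required")
    return ProcessSegment(
        name="end-to-end",
        cash_outflow=sum(s.cash_outflow for s in segs),
        cash_inflow=sum(s.cash_inflow for s in segs),
        duration_hours=sum(s.duration_hours for s in segs),
        volume=segs[-1].volume,
    )


# Hypothetical loan process split into three segments.
loan_process = end_to_end([
    ProcessSegment("application", 400.0, 0.0, 2.0, 100),
    ProcessSegment("credit decision", 900.0, 0.0, 6.0, 80),
    ProcessSegment("disbursement", 300.0, 2500.0, 1.0, 75),
])
print(loan_process)
```

Parallel or branching segments would need a different aggregation rule for duration and volume; the sequential rule is chosen here only because it is the simplest case.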

13.6 Output Metrics: Cash Inflows a Function of Output Quality

The concept of quality is key to the generation of cash inflows. The term ‘quality’ denotes two separate things.

Quality as a characteristic of financial services

Services differ in terms of type and delivery. A particular banking product represents a specific type of service, while delivery concerns the manner in which a banking product is made available to customers (e.g. channel, location, speed, etc.). Banking product features represent quality characteristics that can be described nearly³ entirely in terms of specific numeric attributes. Service levels are now generally used for the detailed outlining and measurement of service delivery.

Time is another quality characteristic of financial services, as speed of availability makes it possible to differentiate otherwise commodity-like money products. The rapidity of loan approvals and customer-friendly features of lending and investment products (especially in the online segment) are major factors in selecting a banking partner. “Time-to-market” and “time-to-customer” are key for success in the banking business today. “Real-time” products are set to be the next major step for enhanced customer service. Time considerations are thus a critical factor and standard to most service levels.

The fulfilment of quality standards is the precondition for generating cash inflows. Measurement systems are relied on to document adherence to service quality standards. Income generated is thus equal to the number of qualitatively equivalent services provided multiplied by the market prices obtained for these.

³ The few surviving physical products like cash are an exception. Other exceptions are contractual provisions governing work to be performed (terms and conditions, etc.).

Defects as a lack of specific characteristics

The dual connotations of the term ‘quality’ are effectively explained if we consider defects as a “lack of a (promised) characteristic”. With defects, the defined “target” characteristic differs from the “actual” characteristic, i.e., the banking product or the service related to that product does not correspond to specifications or expectations. Defects are process failures that have an impact on cash inflows and the usage of resources. Quality management concepts provide a framework for the in-depth analysis of these interdependent phenomena. Measurement systems must be able to track and document defects/errors. Corrective measures to cash inflows due to identified quality defects are implemented external to the process itself.
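The relationship between quality, defects and income can be made concrete with a small sketch. The Python code below is an illustration under assumptions that go beyond the text: the class `ServiceDelivery`, the function `income_and_defect_rate`, the 24-hour service level and the price are hypothetical, and the rule that a defective delivery earns nothing is a simplification (the chapter notes that corrections to cash inflows are handled outside the process).

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ServiceDelivery:
    """One delivered service, measured against its service level."""
    delivery_hours: float    # measured "actual" characteristic
    agreed_max_hours: float  # promised "target" characteristic

    @property
    def is_defect(self) -> bool:
        # A defect is the lack of the promised characteristic.
        return self.delivery_hours > self.agreed_max_hours


def income_and_defect_rate(deliveries: Sequence[ServiceDelivery],
                           market_price: float) -> Tuple[float, float]:
    """Return (income, defect rate) for deliveries of one product.

    Simplifying assumption: only deliveries that meet the service level
    earn the market price; defective ones earn nothing.
    """
    if not deliveries:
        return 0.0, 0.0
    defects = sum(1 for d in deliveries if d.is_defect)
    conforming = len(deliveries) - defects
    return conforming * market_price, defects / len(deliveries)


# Hypothetical loan approvals with a 24-hour service level.
approvals = [ServiceDelivery(h, 24.0) for h in (10.0, 20.0, 30.0, 24.0)]
income, defect_rate = income_and_defect_rate(approvals, market_price=50.0)
print(income, defect_rate)
```

A real measurement system would track several quality characteristics per service level, not delivery time alone; time is used here because the section singles it out.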

13.7 Cash Outflows a Function of Available Capacity

The concept of “capacity” relates to the input side of the equation as “quality” relates to the delivery aspect (or “output” of a financial service process). Bank process capacity is a function of two resources: personnel and information technology. Capacity depends on the amount of resources made available for a process. IT resources are those IT functionalities required for a specific activity, while personnel resources are the available man-hours of a specific personnel function. Available but unused capacity is referred to as idle capacity. Providing capacity typically generates cash outflows.
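Idle capacity and the outflows it causes can be expressed in a few lines. The following Python sketch is hypothetical throughout: the class `Resource`, its fields and the figures are invented for illustration, and it assumes that cash outflows are driven by the capacity provided rather than by the capacity used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A personnel function or an IT functionality provided to a process."""
    name: str
    available_hours: float  # capacity made available in the period
    used_hours: float       # capacity actually consumed by the process
    cost_per_hour: float    # cash outflow per hour of provided capacity

    def __post_init__(self) -> None:
        if not 0.0 <= self.used_hours <= self.available_hours:
            raise ValueError("used_hours must lie between 0 and available_hours")

    @property
    def idle_hours(self) -> float:
        """Available but unused capacity."""
        return self.available_hours - self.used_hours

    @property
    def cash_outflow(self) -> float:
        # Assumption: outflows depend on provided, not on used, capacity.
        return self.available_hours * self.cost_per_hour

    @property
    def idle_cost(self) -> float:
        """Share of the cash outflow attributable to idle capacity."""
        return self.idle_hours * self.cost_per_hour


clerks = Resource("loan clerks", available_hours=160.0,
                  used_hours=120.0, cost_per_hour=40.0)
print(clerks.idle_hours, clerks.cash_outflow, clerks.idle_cost)
```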

13.8 Time and Volume as Non-financial Metrics of Process Performance

Time

Time as a process characteristic differs from time as a delivery requirement in connection with service levels. As a process characteristic, time is an indicator of process performance. Turnaround time, processing time, down time, make-ready time, waiting time and other times are examples of non-financial work process characteristics used to describe process performance. The technical documentation of the start and finish of a given processing event depends on the specific organisation of workflows. The concept of processing time must first be defined and applied to the work process before technical documentation can take place.
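Once start and finish points have been defined, the time metrics follow from simple timestamp arithmetic. The Python sketch below uses one possible set of definitions, which are assumptions and not the authors' definitions: waiting time runs from receipt to start of work, processing time from start to finish, and turnaround time from receipt to finish. The type `ProcessingEvent` and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class ProcessingEvent(NamedTuple):
    received: datetime  # business event arrives at the process segment
    started: datetime   # work on the event begins
    finished: datetime  # work on the event ends


def waiting_time(event: ProcessingEvent) -> timedelta:
    """Time the event waits before work begins (assumed definition)."""
    return event.started - event.received


def processing_time(event: ProcessingEvent) -> timedelta:
    """Time during which the event is actively worked on (assumed definition)."""
    return event.finished - event.started


def turnaround_time(event: ProcessingEvent) -> timedelta:
    """Total time from receipt to completion (assumed definition)."""
    return event.finished - event.received


# Hypothetical event: received 09:00, started 09:30, finished 10:15.
event = ProcessingEvent(
    received=datetime(2024, 1, 8, 9, 0),
    started=datetime(2024, 1, 8, 9, 30),
    finished=datetime(2024, 1, 8, 10, 15),
)
print(waiting_time(event), processing_time(event), turnaround_time(event))
```

With these definitions, turnaround time is always the sum of waiting and processing time; down time and make-ready time would require additional timestamps per event.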


Volume

Volume represents the number of business events within a larger process or within individual process segments. This figure always refers to business events or services to the customer. Modular process segments allow the provision of various service packages consisting of standardised individual services. The number of business events per process segment is thus an important measure for the standardisation of services. Volume is another non-financial process performance characteristic.

13.8.1 Fundamental Input/Output Process Performance Characteristics

Information on performance quality and quantity, type and amount of resources provided and process time and volume allow for the documenting of process performance and creation of added value. The following productivity measures represent fundamental metrics:

\[
\frac{\text{Unit of time}}{\text{Unit of performance}}
\quad\text{or}\quad
\frac{\text{Unit of performance}}{\text{Unit of time}}
\quad\text{or}\quad
\frac{\text{Unit of performance}}{\text{Resource}}
\]

and value-creation metrics including:

\[
\frac{\text{Income}}{\text{Unit of volume}}, \quad
\frac{\text{Income}}{\text{Unit of time}}, \quad
\frac{\text{Cash outflows}}{\text{Unit of volume}}, \quad
\frac{\text{Cash outflows}}{\text{Unit of time}}
\]

The Discounted Cash Flow (DCF) method is applied to determine the value of cash flows generated by a given process. In a value-based approach, the resulting figure sums up process performance. Performance may also incorporate forward-looking elements by factoring in the present value of projected cash flows. Process segments can also be further defined to include other non-financial metrics within an “end-to-end” view. Process performance data obtained on the basis of these fundamental metrics permits the study of causalities and potential performance improvements as well as the mapping of more complex interrelationships (such as those denoted by the concept of flexibility).
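The ratios and the present-value calculation described above can be illustrated with a minimal sketch. The function names, the choice of hours as the unit of time, and the end-of-period discounting convention are assumptions made for illustration; the chapter itself does not prescribe them.

```python
def fundamental_metrics(volume, hours, income, cash_outflows):
    """Ratios for one process segment in one reporting period.

    volume: number of business events; hours: time consumed (assumed unit).
    """
    return {
        "time_per_unit": hours / volume,        # unit of time / unit of performance
        "units_per_time": volume / hours,       # unit of performance / unit of time
        "income_per_unit": income / volume,
        "income_per_time": income / hours,
        "outflows_per_unit": cash_outflows / volume,
        "outflows_per_time": cash_outflows / hours,
    }


def present_value(net_cash_flows, discount_rate):
    """Discount net cash flows (income minus outflows) of periods 1, 2, ...

    Assumes each cash flow arrives at the end of its period.
    """
    return sum(
        cf / (1.0 + discount_rate) ** period
        for period, cf in enumerate(net_cash_flows, start=1)
    )
```

For instance, `present_value([110.0], 0.10)` discounts a single projected net cash flow by one period, which is how projected cash flows can enter the performance figure as a forward-looking element.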

Markus Kress, Dirk Wölfing

13.9 Technical Realization

While the sections above discussed the business view of business process performance (BPP), the last part of this article addresses the technical aspects,
especially the monitoring of BPP. This section analyzes how the business requirements for this kind of monitoring can be realized in a technical implementation. The resulting technical monitoring approach depends on the system landscape in which the business processes run. It makes a considerable difference whether a process is implemented end-to-end in one application or distributed among various non-connected systems, since it is more difficult to extract and relate data from different data sources. The monitoring approach therefore needs to consider and address the specific system architecture. As a service-oriented architecture (SOA) is a promising architecture paradigm due to the flexibility and agility it provides, the monitoring approach introduced in this paper is designed especially for SOAs. At the moment there is a clear SOA trend, so the decision to examine monitoring in SOAs can be seen as market- or practice-driven. The terms monitoring and business process are used here in a technical context, so their meaning differs from that used above. Monitoring here means measuring the KPIs of business processes by extracting process data and performing calculations to obtain the KPI values. Business processes consist of a number of activities which require human interaction or automatic processing. In a SOA these activities are related to services, so a business process can be defined as a set of orchestrated services. Through this orchestration and the standardized management of services, a SOA separates the what (processes and metrics) from the how (resources) (O’Connel et al. 2006). To recapitulate, the technical monitoring of BPP has to implement the business requirements stated above within a SOA.

13.9.1 Monitoring Challenges

Monitoring business processes implemented in a SOA environment poses various challenges. The main challenge of performance monitoring within a SOA lies in its distributed nature. SOA monitoring has to span different components as well as architectural layers, which makes data collection difficult or at least labor-intensive. Data required for KPI calculations has to be retrieved from these different sources. In practice it is often complicated to correlate the data, e.g. with the corresponding business process instance. Furthermore, there is no standardized reference architecture. There are many different SOA stacks with various types of services and service hierarchies. In practice SOAs are implemented in many different ways, shaped by the existing system landscape: legacy systems as well as strategically chosen tools affect the architecture. These heterogeneous service implementations are the second big challenge that must be coped with. This heterogeneity may cause difficulties in data collection, e.g. when the services use different data formats and encodings. Generally, monitoring of business process performance requires a comparison between target and actual data. While target data is defined manually in advance, actual data accrues during the execution of business processes. The technical solution must be able to collect this actual data automatically. Specific data transformations such as format changes, mappings and aggregations may be necessary to provide the business view of the monitoring data in an adequate way. The key questions that need to be addressed are:

• what data must be monitored
• where is the data stored
• how can the data be retrieved
• how can data from different sources be correlated with each other (data redundancies and bad data quality can make the correlation even more challenging)

13.10 Related Work

As SOA-based BPM is a relatively new approach, monitoring concepts are still in development and information on this topic is scarce. The majority of existing SOA monitoring approaches are technically motivated and therefore mainly consider non-functional aspects. This kind of monitoring comprises the measurement of technical response times, error localization, and the determination of the technical Quality of Service (QoS), among others. Various techniques and protocols can be applied for this kind of monitoring, e.g. URL probing,4 SNMP requests and traps,5 or NetFlow.6 Besides these technical aspects, the measurement of BPP additionally requires the monitoring of sequences of service calls, as processes are orchestrated from services as described in the technical introduction. Only a few articles consider the measurement of business process performance based on business-driven KPIs in a SOA. One such approach is described in (Kano et al. 2005), but the authors do not address the special characteristics of SOA monitoring, especially its heterogeneity and distributed nature. Nor do these approaches address the different possibilities for implementing a SOA. As stated before, these SOA-specific characteristics lead to various challenges. A multitude of monitoring approaches that are not necessarily SOA-specific exists, such as business activity monitoring (BAM), OASIS Common Base Event Infrastructure, and IBM Common Event Infrastructure, among others, which could be applied for implementing SOA monitoring. BAM in particular is used more and more in BPM and SOA projects.

4 Monitoring of URLs to make sure that the response is as expected.
5 Standardized protocol for monitoring and controlling agents from central management stations.
6 Open (but proprietary) Cisco protocol for collecting IP traffic information.

13.11 SOA Based Business Process Performance Monitoring

As mentioned above, the technical solution presented in this paper addresses performance monitoring of business processes running in SOA environments.

Figure 13.4. SOA performance monitoring architecture

The monitoring approach is explained by means of a service-based reference architecture that has been successfully applied in several projects. This architecture comprises five layers, each representing specific functionalities and system types (see Figure 13.4). The presentation layer is the interface between user and system. On this layer the appearance of the graphical user interface, such as masks and forms, is defined. The business process layer describes the business process with a focus on the business view. The two layers below represent the service hierarchy, which is based here on an order acceptance example. There are two kinds of services, enterprise and integration services. While enterprise services encapsulate business functionalities, integration services have a more technical focus, such as data transformations and data access. The application and persistence layer contains all systems and databases existing in the enterprise. The layers in Figure 13.4 are split vertically in order to illustrate the business process and monitoring-specific aspects. The left side shows the architectural design for implementing business processes without any monitoring aspects. These are depicted on the right side. As
the monitoring approach does not require any business process execution itself, the right-hand business process layer is empty. The main advantages of this reference architecture, and of SOA in general, are the clear separation of different functionalities and the loose coupling between components. The loose coupling of the business process and the legacy systems is achieved by the two service layers, which provide the abstraction required. This loose coupling, together with the existing standards, leads to the adaptability and flexibility mentioned earlier. Due to its adaptable, flexible style of architecture, a SOA provides the foundation for shorter time-to-market and for reduced costs and risks in development and maintenance (Pallos 2001). Furthermore, so-called “service enabling” can leverage existing investments in legacy applications. SOA should not be equated with web services, as these are only one way of implementing a SOA. Web services are widely used because they are based on platform-independent standards; they are therefore the enabling technology and driver for the SOA trend mentioned earlier. Despite the fact that SOA is based on standards, there exists a multitude of vendor-specific so-called SOA stacks, which consist of different service hierarchies. Many vendors are mainly driven by their existing tool set. Furthermore, the service types within a SOA are not standardized; again, there are proprietary vendor-specific definitions of service types. Services can be divided into business and technical services, or into business, abstraction, and access services. Besides these classifications there are others that use different service definitions. For example, the IBM SOA reference architecture uses services such as interaction or partner services, among others (High et al. 2005). The focus here is not on the specific tools and platforms for implementing such an architecture, but on the different functionalities each layer represents.
For example, there are special software systems called business process management systems (BPMS) for the technical implementation and support of business processes. These systems can be defined as “specialist rapid application development software for the automation of rules based processes …” (Pyke 2005). They are also able to work with services, so they can be used in a SOA environment. Our monitoring approach does not necessarily require a BPMS. Despite the importance of BPM technology (Miers 2005), many companies do not yet have a BPMS in use. If the process is implemented technically, EAI or portal software is most commonly used instead. In view of this situation, our approach makes the use of a BPMS optional. Within this architecture the monitoring solution must be able to collect the data that accrues during business process executions. Which data can be monitored without much effort depends on the software tools in use, as many tools provide monitoring capabilities. For example, most BPMSs provide sophisticated monitoring support on the workflow layer but lack it on the other service layers. Using an architecture as described enables detailed and flexible monitoring across all layers, even if the business process is not implemented completely end-to-end in one software system. In our technical solution there is a special monitoring channel (see Figure 13.4) for the collection of monitoring data. Services have to publish messages
containing the monitoring data on this channel.7 This is done either by using a special monitoring service or by publishing the data on the channel directly. This message-based asynchronous communication ensures that the business process execution is independent of the monitoring application: if the monitoring application fails, the business process execution can proceed undisturbed. Standards like JMS8 can be used for performing read or write operations on the channel. In the non-Java world a separate monitoring service or special products like the .NET9 based ActiveJMS10 could be used. The message-driven approach delivers data to the monitoring application in real time, enabling real-time monitoring. As business processes are an orchestration of services that may themselves consist of services, depending on the service hierarchy, context information must be provided with each service call. This context information must contain everything required to relate service calls to the corresponding business process, sub-process, and/or activity. It thereby solves the data correlation problem described above. For example, context information could comprise a unique ID identifying the business process instance the service call belongs to (and, of course, it relates the data that is exchanged via the service call). Using this approach, an error situation can be detected when it occurs. If a proactive monitoring approach is required, services must be instrumented for it. For example, business processes or transactions must be executable in a test mode that performs service calls for tracing only. One such approach is described in (Roch 2006). The monitoring solution has various advantages, as it:

• works in heterogeneous system landscapes
• solves the data correlation difficulties
• is independent of the concrete SOA implementation
• provides real-time monitoring
• can be extended into a proactive business performance management instrument

There are also some disadvantages:

• higher network load
• existing services must be adapted
• if process steps are not implemented as services, monitoring data has to be extracted from the systems rather than in the service layer

Once the monitoring data is extracted it has to be stored somewhere. In practice there is often only a daily delivery of data after close of business, when a huge amount of data is transferred into a data warehouse in a batch run. The use of a data warehouse to store the monitoring data is the traditional approach. A newer approach is based on complex event processing (CEP). If required, the CEP and data warehouse services (both illustrated in Figure 13.4) can use other services to retrieve further data (this is indicated by the dashed arc in Figure 13.4). This could be the case if not all business process steps are implemented as services, so that the data cannot be published on the monitoring channel. Both the data warehouse and the CEP approach are explained in the following sections.

7 For better visibility, not all arcs between the services and the monitoring channel are drawn.
8 Java Message Service, a messaging standard for exchanging messages.
9 Development and execution environment for creating Microsoft Windows-based applications.
10 Open source software providing JMS functionality to .NET clients.
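The idea of a monitoring channel with context information can be sketched as follows. This is a minimal, in-process illustration only: a `queue.Queue` stands in for a JMS destination, and the names `publish_event`, `check_credit`, `drain_by_process` and the message fields are hypothetical, not taken from the authors' implementation.

```python
import queue
import time

# In-process stand-in for the monitoring channel (assumption: a real system
# would use a message broker, e.g. a JMS topic or queue).
monitoring_channel = queue.Queue()


def publish_event(process_id, service, phase):
    """Publish one monitoring message without waiting for any consumer."""
    monitoring_channel.put_nowait({
        "process_id": process_id,   # context information: process instance ID
        "service": service,
        "phase": phase,             # "start" or "end" of the service call
        "timestamp": time.time(),
    })


def check_credit(process_id):
    """Hypothetical enterprise service instrumented for monitoring."""
    publish_event(process_id, "check_credit", "start")
    approved = True  # placeholder for the actual business logic
    publish_event(process_id, "check_credit", "end")
    return approved


def drain_by_process(channel):
    """Monitoring side: read all pending messages, grouped by process instance."""
    grouped = {}
    while True:
        try:
            message = channel.get_nowait()
        except queue.Empty:
            return grouped
        grouped.setdefault(message["process_id"], []).append(message)
```

Because the service only puts a message on the channel and returns, a stalled or failed consumer does not block the business call, which mirrors the decoupling argument made above.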

13.11.1 Data Warehouse

A data warehouse is a special kind of database for storing historical data in a way that provides fast data access and reporting functionality. The data warehouse is loaded with monitoring data via an ETL process (extract, transform, and load). In our approach the extraction of monitoring data is done by the services, which have to publish monitoring data on the channel as mentioned above. The monitoring application is responsible for the transformation and loading. To accomplish this, it has to collect the monitoring data from the channel, transform it in accordance with the requirements and send it to the data warehouse. The actual loading technique depends on the data warehouse in use. There are file-based and web-service-based approaches as well as completely proprietary methods. Within the data warehouse additional transformations may take place. Once the data is finally stored in the data warehouse it can be used for reports, queries or dashboards. Different standards for accessing data stored in a data warehouse exist, such as XMLA11. According to (Fingerhut 2006), a pure data warehouse approach based on fixed data models is not adequate for real-time monitoring in a fast-changing environment, as the adaptation of the underlying data models is costly. Complex event processing can be a more adequate approach.
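The transform and load steps performed by the monitoring application can be sketched as below. SQLite is used purely as a stand-in for a data warehouse, and the table name `monitoring_fact`, the message fields and the function names are assumptions made for this illustration.

```python
import sqlite3
from datetime import datetime, timezone


def transform(message):
    """Map one raw monitoring message to a warehouse row (format changes only)."""
    event_time = datetime.fromtimestamp(
        message["timestamp"], tz=timezone.utc
    ).isoformat()
    return (
        message["process_id"],
        message["service"].lower(),  # normalise naming across heterogeneous services
        message["phase"],
        event_time,
    )


def load(conn, rows):
    """Load transformed rows into a (hypothetical) fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS monitoring_fact "
        "(process_id TEXT, service TEXT, phase TEXT, event_time TEXT)"
    )
    conn.executemany("INSERT INTO monitoring_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def calls_per_service(conn):
    """Example report: number of service calls started, per service."""
    cursor = conn.execute(
        "SELECT service, COUNT(*) FROM monitoring_fact "
        "WHERE phase = 'start' GROUP BY service"
    )
    return dict(cursor.fetchall())
```

A report such as `calls_per_service` corresponds to the queries and dashboards mentioned above; a production warehouse would typically use its own bulk-loading interface rather than row-wise inserts.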

11 XML for Analysis defines message interfaces for the interaction between clients and analytical data providers based on an industry standard language.

13.11.2 Complex Event Processing

Complex Event Processing (CEP), originally developed by David Luckham at Stanford University, is now being adopted by more and more commercial tool vendors (Tibco, Progress Software, etc.). CEP is a promising approach. Its starting point is a so-called event cloud, which is the result of event-generating IT systems. As events do not necessarily arrive in a predefined order and may also have complex relationships, they form this cloud. CEP focuses on the extraction of information from this cloud by using special algorithms that consider cause and time of
events. Special maps and filters can be applied to provide different abstractions and views on the data, e.g. for determining existing event patterns. Another related approach (a subset of CEP) is event stream processing (ESP), which assumes that events are ordered by time. As the events are ordered in time they form a stream rather than a CEP-like cloud. A streaming database is a special kind of database designed for ESP to capture data and time very efficiently. In our scenario a streaming database could be used instead of a data warehouse. In contrast to ESP, CEP is somewhat more costly in terms of processing time, as the event order has to be determined. Still, CEP is considered the more efficient approach for real-time monitoring of huge amounts of data. A data warehouse service could provide additional data, so CEP does not necessarily replace a data warehouse; this depends on the specific scenario. The disadvantage of CEP and ESP compared to the data warehouse approach is the lack of standardization.

Having introduced and explained the architecture for monitoring BPP in a SOA environment, especially how data can be extracted and related from different sources, we now discuss the implementation of a measurement environment. The most important measurement categories listed in the first part of this paper are time, number, capacity, and quality. Flexibility was also mentioned, but as it cannot be monitored automatically it is not considered further here. The following sections discuss which factors can be measured within a SOA, and under which prerequisites.
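The difference between an unordered event cloud and a time-ordered stream can be illustrated with a deliberately simplified sketch. It is not a CEP engine: it only restores the time order of events by timestamp and then applies one pattern (a service call that was started but never completed). The event fields and function names are assumptions.

```python
def to_stream(event_cloud):
    """Turn an unordered collection of events into a time-ordered stream."""
    return sorted(event_cloud, key=lambda event: event["timestamp"])


def open_calls(event_cloud):
    """Pattern: service calls with a 'start' event but no later 'end' event."""
    still_open = set()
    for event in to_stream(event_cloud):
        key = (event["process_id"], event["service"])
        if event["phase"] == "start":
            still_open.add(key)
        elif event["phase"] == "end":
            still_open.discard(key)
    return still_open
```

The sorting step is the extra work referred to above: if the events were already guaranteed to arrive in time order, as ESP assumes, it could be skipped.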

13.11.3 Measuring “Time”

As mentioned, three types of time have to be measured: cycle time, processing time, and idle time. Under the assumption that all technical as well as manual processing steps are implemented as services, the cycle time is calculated as the difference between the time the last service completed and the time the first service was called. The monitoring tool must be able to determine when the last service call has been performed. If this is not possible, business processes never finish according to the monitoring reports. The measurement of processing and idle times must be distinguished for humans and technical systems. The calculation for humans is more difficult. It is not possible to determine the real time a case worker spends working on a specific case, for two reasons. Firstly, offline working time cannot be measured. This is the time the case worker works on a case without using any technical system. Secondly, even if a case worker has started to use a technical system, it cannot be assured that the case worker works on the corresponding case the whole time until processing is completed. For example, the case worker could have opened a mask for a case and then gone to lunch. Idle times can be calculated as the difference between the working time and the processing time. As there is a measurement error in the processing time,
it will affect the idle time as well. As with processing time, direct measurement of idle time is not possible. The processing time is the basis for the calculation of the load factor of resources. Legal aspects have to be considered as well when the work of employees is monitored. In some countries it may be illegal to monitor employees directly, while anonymous role-based monitoring may be allowed. The processing time and idle time of services are simple to measure, as the start and end of a service call are known. The processing time corresponds exactly to the service call duration.
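Under the assumption stated above (every step is a service that reports its start and end), the three time measures can be computed from the monitoring messages of one process instance. The following is a sketch only; the field names are hypothetical, and it assumes that each service is called at most once at a time within the instance and that every "end" has a matching "start".

```python
def cycle_time(events):
    """Completion of the last service minus the call of the first service."""
    first_start = min(e["timestamp"] for e in events if e["phase"] == "start")
    last_end = max(e["timestamp"] for e in events if e["phase"] == "end")
    return last_end - first_start


def processing_time(events):
    """Sum of the service call durations of one process instance."""
    started = {}
    total = 0.0
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["phase"] == "start":
            started[event["service"]] = event["timestamp"]
        else:
            total += event["timestamp"] - started.pop(event["service"])
    return total


def idle_time(working_time, measured_processing_time):
    """Idle time as working time minus processing time (inherits its error)."""
    return working_time - measured_processing_time
```

For manual steps the result of `processing_time` is only an upper bound on the real effort, for the reasons given above (offline work and interruptions are invisible to the system).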

13.11.4 Measuring “Volume”

To measure the volume, the number of business processes running or finished must be determined. As each service call has to contain context information identifying the business process it is related to, it is relatively easy to calculate the number of business processes, sub-processes or activities. Depending on the context information provided, the number can also be determined per department, product, or other category of interest. If there is no context information at all and the service cannot be altered, the measurement can be difficult, as already stated. As the number has to be correlated with time, e.g. when the number of active processes in a specific period of time has to be measured, the timestamp of each measurement has to be stored. Based on the volume measurement, a reuse level of services can be calculated. The reuse level indicates how often a service was called from different clients or organizational units. The measurement of business events per sub-process or activity, in combination with the reuse level, can serve as an indicator of standardization.
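Both volume figures can be derived from the stored service calls, as the following sketch shows. The record fields (`process_id`, `service`, `client`, `timestamp`) are assumptions, and the reuse level is interpreted here as the number of distinct clients calling a service, which is one possible reading of the definition above.

```python
def active_processes(calls, window_start, window_end):
    """Number of distinct process instances with a call in [start, end)."""
    return len({
        call["process_id"]
        for call in calls
        if window_start <= call["timestamp"] < window_end
    })


def reuse_level(calls, service):
    """Number of distinct clients (or organizational units) calling a service."""
    return len({call["client"] for call in calls if call["service"] == service})
```

Counting distinct identifiers rather than raw calls is what makes the context information indispensable: without a process instance ID, repeated calls of the same service could not be attributed to one business event.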

13.11.5 Measuring “Capacity”

Both human and technical resources are considered in the capacity measurement. The capacity of human resources corresponds to the number of case workers and their working time. These case workers can be assigned specific roles or specific business process activities, depending on the system in use. For example, in role-based systems a role determines which process steps a case worker is allowed and able to perform. The capacity can be given as the number of case workers (in hours or persons) per business process, business step, or role, among others. The capacity of technical resources is calculated based on the technical equipment in use. The equipment can be a single server or a server farm, the number of processors, or the memory, among others. As different services may run on the same server(s), the share of resource usage of each specific service must be calculated.
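One simple way to attribute shared technical capacity to individual services is a proportional allocation, sketched below. The allocation key (measured busy time per service) and the function name are assumptions; the chapter only states that a share must be calculated, not how.

```python
def capacity_share(busy_seconds_by_service, server_capacity):
    """Split a server's capacity among services in proportion to their busy time."""
    total_busy = sum(busy_seconds_by_service.values())
    if total_busy == 0:
        raise ValueError("no measured usage to allocate capacity by")
    return {
        service: server_capacity * busy / total_busy
        for service, busy in busy_seconds_by_service.items()
    }
```

Other allocation keys, such as call counts or memory consumption, would fit the same structure.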
13.11.6 Measuring “Quality”

The term quality can be understood in two ways: as a characteristic and as a divergence from predefined objectives. Each business process is related to at least one product, which is processed by that business process. Quality describes which characteristics the product must have in certain states of the business process. If a characteristic can be described using attributes, plausibility checks can be applied to assure it. It could then be monitored how often plausibility checks failed and how much time had to be invested in rework (with a measurement error as mentioned in the section on measuring time). The second aspect of quality is handled in most cases by defining SLAs. An SLA defines the targets a business process must meet. The quality measurement makes use of the other performance measurements, such as time and volume. These measurements correspond to the actual data, the predefined SLAs to the target data. To measure quality, actual and target data must be compared.
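The comparison of actual and target data can be sketched as follows. The sketch assumes that every SLA target is an upper bound (for example a maximum cycle time) and uses hypothetical KPI names; lower-bound targets would need the comparison reversed.

```python
def sla_violations(actual, target):
    """Return, per KPI, by how much the actual value exceeds its target."""
    return {
        kpi: actual[kpi] - limit
        for kpi, limit in target.items()
        if kpi in actual and actual[kpi] > limit
    }


def plausibility_failure_rate(checks_failed, checks_total):
    """Share of plausibility checks that failed."""
    return checks_failed / checks_total
```

An empty result from `sla_violations` means that all measured KPIs stayed within their targets for the period under review.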

13.11.7 Measuring “Costs” or “Discounted Cash Flows” (DCF)

The costs or DCFs are not measured directly. Instead, costs and DCFs are calculated ex post based on the measured volume (KPI “volume”) and the capacities provided (KPI “capacity”). The volume indicates how often and to what extent resources (either human or technical systems) were used. In combination with the related cost information, activity-based costing can be performed. The result of this accounting technique is the expense streams caused by the execution of business processes.
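The ex post calculation can be illustrated with a minimal activity-based costing sketch. The cost driver (minutes of resource use per business event) and all names are assumptions chosen for illustration.

```python
def activity_costs(volumes, minutes_per_event, rate_per_minute):
    """Cost per activity = volume x resource use per event x cost rate."""
    return {
        activity: volumes[activity]
        * minutes_per_event[activity]
        * rate_per_minute[activity]
        for activity in volumes
    }


def total_expense(costs_by_activity):
    """Expense stream of the whole process for the period."""
    return sum(costs_by_activity.values())
```

The per-period totals obtained this way are the cash outflows that would enter a discounted cash flow calculation.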

13.12 Conclusion

The continuous increase of productivity in the financial industry requires more sophisticated methods than the traditional automation of single functions, such as Straight Through Processing (STP). The continuous improvement of processes through more specialized and standardized process segments, combined with quality and capacity management, requires metrics to control process performance. A very practical method of measuring process performance is based on the components of the end-to-end processes. Each component has input and output interfaces and is valued with financial indicators such as cost or incoming and outgoing cash flows. For detailed analysis, non-financial process indicators are required. By this
methodology, the overall metrics of the processes are split into parts that work together to produce the quality of service the customer expects. The information needed to control the process technically can be produced by a monitoring system integrated in a SOA. Depending on the specific scenario, various technical approaches exist for building such a monitoring solution, which has to cope with increasing data volume as well as constant change. The presented solution provides business-driven monitoring in challenging technical environments. Currently the implementation effort in heterogeneous environments tends to be enormous, but it can be anticipated that more sophisticated tools will become available in the near future that will simplify such implementations.

References

Achenbach W, Lieber K, Moormann J, Six Sigma in der Finanzbranche, 2. Auflage, Bankakademie Verlag, Frankfurt
Fingerhut S (2006) Achieving Operational Awareness With Business Activity Monitoring (BAM), Business Integration Journal, May/June 2006, from http://www.bijonline.com/index.cfm?section=article&aid=272
Frei FX, Harker PT (2000) Value Creation and Process Management: Evidence from Retail Banking. In: Melnick E, Nayyar P, Pinedo M, Seshadri S (eds) Creating Value in Financial Services, Kluwer Academic Publishers
Frei F, Kalakota R, Leone A, Marx L (1999) Process Variation as a Determinant of Bank Performance: Evidence from the Retail Banking Study, Management Science, Vol. 45, No. 9, pp. 1210−1220
Hammer M, Champy J (2003) Reengineering the Corporation, New York
High J, Kinder S, Graham S (2005) IBM’s SOA Foundation, White Paper, from http://download.boulder.ibm.com/ibmdl/pub/software/dw/webservices/ws-soa-whitepaper.pdf
Richter-von Hagen C, Stucky W (2004) Business-Process- und Workflow-Management: Prozessverbesserung durch Prozessmanagement, Teubner Verlag, Reihe Wirtschaftsinformatik, Stuttgart
Kano M, Koide A, Liu T, Ramachandran B (2005) Analysis and Simulation of Business Solutions in a Service-oriented Architecture, IBM Systems Journal, Vol. 44, No. 4
Mende M (1995) Ein Führungssystem für Geschäftsprozesse, Dissertation at Hochschule St. Gallen, Bamberg
Miers D (2005) BPM: driving business performance, White Paper, BP Trends, from http://www.filenet.com/English/Products/White_Papers/bpmdbp.pdf
O’Connel J, Pyke J, Whitehead R (2006) Mastering your organization’s processes, Cambridge University Press
Osterloh M, Frost J (2003) Kernkompetenz Prozessmanagement, Gabler Verlag, Wiesbaden
Pallos MS (2001) Service-Oriented Architecture: A Primer, EAI Journal, from http://www.eaijournal.com/PDF/SOAPallos.pdf
Pyke J (2005) Waking up the BPM paradigm, from http://www.itweb.co.za/sections/business/2005/0508190822.asp
Ray G, Muhanna W, Barney J (2005) Information Technology and the Performance of the Customer Service Process: A Resource Based Analysis, MIS Quarterly, Vol. 29, No. 4, December
Roch E (2006) Monitoring and Management within SOA: Lessons from a SOA Early Adopter, from http://blogs.ittoolbox.com/eai/business/archives/monitoring-and-managementwithin-soa-lessons-from-a-soa-early-adopter-8473
Sandt J (2004) Management mit Kennzahlen und Kennzahlensystemen, Wiesbaden
Schrank R (2002) Neukonzeption des Performance Measurements, Verlag Wissenschaft und Praxis, Mannheim
Siebert G (2002) Performance Management, Leistungssteigerung mittels Benchmarking, Balanced Scorecard und Business Excellence-Modell, Deutscher Sparkassenverlag, Stuttgart
Wölfing D (2006) Six Sigma und Business Process Management im Kontext industrieller Bankprozesse. In: Achenbach W, Lieber K, Moormann J, Six Sigma in der Finanzbranche, 2. Auflage, Bankakademie Verlag, Frankfurt

CHAPTER 14
Risk and IT in Insurances
Ute Werner

14.1 The Landscape of Risks in Insurance Companies

The risk landscape of insurance companies is quite diverse: it stretches from insurance and investment risks to operational and regulatory risks, not to mention the risks (and opportunities) involved in generating integrated information technology infrastructures.1 In this paper, multiple levels of risk are discussed, the most aggregate one comprising the risks of insurance companies and their management. Risks transferred from customers to be managed by insurers make up the least aggregate level.

Figure 14.1. Risk landscape of insurance companies (labels in the figure: Solvency II regulations; collective savings and investment; asset-liability management; underwriting and timing risk; operational risk)

1 See e.g. the survey of 44 leading insurers from around the world done by (PricewaterhouseCoopers 2004).

Risk management in insurance companies works partly as a bottom-up integration of risks: the individual risks of customers are combined in risk groupings, based on an analysis of these individual risks and their assessment in relation to all the other risks accepted. The premiums charged by the insurer for accepting the risk transferred from the insured are calculated in due consideration of the expected frequency and extent of damages in the risk group. Additionally, the financial cover provided to the insured party as laid out in the respective insurance contract is considered. However, the less experience, knowledge, or data insurers have for any particular risk class, the less reliable their estimates will be. This uncertainty is reflected by supplementary risk loadings. Differences between the total of premiums collected for each line of business and the payments due after the risk has materialized for a number of customers in a given period make up the insurance risk of the company. It is especially significant for property and casualty or reinsurance business, since the uncertainty regarding the realization of risks (underwriting risk)2 is high compared to life insurance: in the latter line of business, the triggering event, e.g. death, is certain to happen, and the main share of the payments due at that moment has been laid out in the insurance contract. It is, therefore, known beforehand. However, timing risk, i.e. the uncertainty about when an insured event will occur, still requires very sophisticated investment planning. Returns on capital investments make up an important part of the overall economic performance of insurance companies. They are generated by reserving and investing that share of the premiums collected which has not yet been spent on reimbursing the insured. Moreover, collective savings and investments are an essential element of cash value life insurance or whole life health insurance.
They definitely constitute a risk for the insurance company if a certified minimum return has been guaranteed to the customers. Therefore, market, credit and liquidity risks have to be controlled carefully for maintaining the ability of the firm to satisfy its duties. In recent years this has become more difficult following declines in equity and fixed income markets. Nevertheless, matching assets and liabilities according to their values, timing, and inherent risk is central for the insurance business. This also affects the management of underwriting, investment and pricing3 as well as their interdependencies, or capital allocation according to the risk carried by different business units.
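The premium logic described earlier in this section (expected frequency and extent of damages in the risk group, plus a supplementary risk loading) can be illustrated with a small sketch. The multiplicative loading and the function name are simplifying assumptions for illustration and not an actuarial standard.

```python
def risk_premium(expected_frequency, expected_severity, loading_factor=0.0):
    """Expected loss per policy and period, increased by a risk loading.

    expected_frequency: expected number of claims per policy and period
    expected_severity: expected loss amount per claim
    loading_factor: supplementary loading reflecting estimation uncertainty
    """
    expected_loss = expected_frequency * expected_severity
    return expected_loss * (1.0 + loading_factor)
```

A higher `loading_factor` would be chosen for risk classes where the insurer has little experience or data, in line with the argument above.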

2

3

Will an event occur at all? What is the loss amount to be expected from possible damages? – The terminology used to describe ‘insurance risk’ varies. By differentiating between underwriting and timing risk we follow the standard setting approaches of FAS 113 within US GAAP and the discussion laid out in (IAIS 2005). The practice of cash-flow underwriting, e.g., implies that coverage is provided for a premium level that is actuarially less than necessary to pay claims and expenses. In this case, the investment profit on the premiums is used (partially) to compensate for the underwriting loss.
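The premium logic described above (expected frequency and extent of damages, plus a loading that grows with uncertainty) can be sketched in a few lines. This is only an illustration under stated assumptions: the loading is taken as a multiple `k` of the standard error of the observed mean loss, which is one of several possible loading principles and is not prescribed by the text; the function names and all figures are hypothetical.

```python
import math
from statistics import mean, stdev


def pure_premium(frequency: float, severity: float) -> float:
    """Expected loss per policy: claims per policy-year times average claim size."""
    return frequency * severity


def loaded_premium(yearly_loss_per_policy: list[float], k: float = 1.0) -> float:
    """Observed mean loss per policy plus a risk loading of k standard errors.

    Assumption for illustration: the loading shrinks as more years of
    experience become available, mirroring the idea that scarce data
    leads to higher supplementary loadings.
    """
    n = len(yearly_loss_per_policy)
    if n < 2:
        raise ValueError("need at least two years of observations")
    standard_error = stdev(yearly_loss_per_policy) / math.sqrt(n)
    return mean(yearly_loss_per_policy) + k * standard_error


if __name__ == "__main__":
    print(pure_premium(0.05, 4000.0))             # 5 claims per 100 policies
    print(loaded_premium([180.0, 220.0]))         # two years of experience
    print(loaded_premium([180.0, 220.0] * 5))     # ten years, same variability
```

With the same variability in the data, the ten-year history yields a smaller loading than the two-year history, which is the effect the paragraph on risk loadings describes.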

14 Risk and IT in Insurances


Furthermore, it includes decisions on how to fulfil the requirements of insurance regulatory authorities regarding the solvency4 of insurance companies. Solvency II regulations in the EU will extend from minimum capital provisions to detailed rules for asset investments, and they will be supplemented by comprehensive risk management and publication principles (European Commission 2003). These principles should include operational risks as well, even if their quantification might be difficult (Linder and Ronkainen 2004), since they occur in multifarious forms ranging from risks related to human behaviour following diverging interests to data analysis procedures or data management. Basically all IT-supported critical business processes can be understood as sources of operational risk, especially if they are vulnerable to accidents or attacks from inside or outside the company.5 Surveys by (Tillinghast-Towers Perrin 2002)6 and (PricewaterhouseCoopers 2004) indicate that most of the insurance companies sampled still have to develop technology infrastructures that are integrated enough to allow accounting for interactions and correlations among risk sources. This, however, is fundamental for designing a holistic, enterprise-wide management of risks.

In the following, the focus will be on underwriting risks as the most typical risks of insurance companies. Section 14.2 refers to catastrophe modelling in order to distinguish between hazard, exposure, vulnerability and loss as the main components of underwriting risk. This perspective is useful for identifying and analysing single risks in their natural, technical or social context; at the same time it allows accounting for interrelations between risks as far as they are reflected in the business of insurers. It will be shown how information technology is put into service to fulfil these tasks. Section 14.3 then introduces IT as a tool for managing the underwriting risk of the company. Several applications in disaster assistance, claims handling and market development are presented. Section 14.4 concludes with first thoughts on IT’s cost and benefit in insurance.

4 The meaning of the term has been extended during the past years. Since its introduction in the 1970s, ‘solvency’ has designated minimum capital requirements that are derived from generalized assumptions about the risk involved in the kind and amount of insurance business written. This focus on insurance risk changed when the Solvency II project was launched in 2000. Its objective is to create a prudential framework for the management of all the risks insurance companies are facing. This is in line with international developments in solvency, risk management and accounting. For an overview refer to (Linder and Ronkainen 2004); the latest work of the US Capital Adequacy Task Force is listed under www.naic.org/committees_e_capad.htm.

5 Since the Basel II accord, operational risk is commonly defined as potential loss resulting from inadequate or failed internal processes, people and systems, or from external events. Interesting approaches for assessing operational risks in insurance companies have been developed by (Dexter et al. 2006) and (Tripp et al. 2004).

6 The report is based on a Web survey conducted in 2002. Executives from 94 companies headquartered around the world and working in all sectors of the insurance industry responded.


Ute Werner

14.2 IT’s Function for Risk Identification and Analysis

14.2.1 Components of Underwriting Risk

Among natural scientists, engineers and economists, risk is modelled as a function of the interaction between four basic components: (1) certain hazards, (2) elements exposed to hazardous events with specified characteristics, (3) susceptibility of the exposed elements to the hazardous impact, and (4) resulting consequences (Mechler 2004; Sinha and Goyal 2004; UN/ISDR 2004).

[Figure: components (1) Hazard, (2) Exposure, (3) Vulnerability, (4) Loss; module labels: Event Creation, Damage Function]

Figure 14.2. Structure of risk and of catastrophe models (Grossi et al. 2005; Borst et al. 2006)
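The four components named above can be read as a small processing chain: a hazard event meets the exposed objects, a vulnerability relation turns intensity into a damage ratio, and the loss follows from the insured value. The sketch below is a minimal illustration of that chain only. All class names, zones, construction classes and curve thresholds are hypothetical, and the linear damage curve is an assumption chosen for brevity, not a calibrated model.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """(1) Hazard: one historical or synthetic event."""
    name: str
    intensity_by_zone: dict[str, float]  # e.g. peak gust in m/s per zone


@dataclass
class Asset:
    """(2) Exposure: one insured object."""
    zone: str
    insured_value: float
    construction: str


# (3) Vulnerability: per construction class, the intensity at which damage
# starts and the intensity at which a total loss is assumed (invented values).
DAMAGE_CURVES = {
    "wood": (25.0, 60.0),
    "masonry": (35.0, 80.0),
}


def damage_ratio(construction: str, intensity: float) -> float:
    """Share of the insured value destroyed, rising linearly between thresholds."""
    start, total = DAMAGE_CURVES[construction]
    if intensity <= start:
        return 0.0
    if intensity >= total:
        return 1.0
    return (intensity - start) / (total - start)


def event_loss(event: Event, portfolio: list[Asset]) -> float:
    """(4) Loss: sum of value times damage ratio over all exposed assets."""
    return sum(
        asset.insured_value
        * damage_ratio(asset.construction, event.intensity_by_zone.get(asset.zone, 0.0))
        for asset in portfolio
    )
```

Keeping the four parts separate means that a new hazard type or a revised vulnerability curve can be swapped in without touching the exposure data.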

Computer-based catastrophe models7 show a similar structure; they have been used since the late 1980s for assessing loss potentials due to extreme natural events such as earthquakes or hurricanes. In the aftermath of the terrorist attacks of 9/11/2001, the modelling has been extended to include human-induced disasters. This was made possible by continuing advances in information technology. Geographic Information Systems (GIS) facilitate the linking and integration of various kinds of data:
• scientific measurements and concepts regarding natural hazards,
• engineering knowledge on structural characteristics of the built environment,
• models of human goal-oriented behaviour,
• and statistics or reports on historical occurrences of hazardous events.

7 FEMA (www.fema.gov/hazus/) offers an open-source multi-hazard catastrophe model (HAZUS-MH MR1); proprietary modeling software and services have been developed by AIR Worldwide (www.air-worldwide.com), EQECAT (www.eqecat.com), and RMS (www.rms.com). For a short history of catastrophe modeling cf. (Grossi et al. 2005).


Remote sensing, on the other hand, is able to deliver pictures or imagery useful for a rapid updating of GIS-based maps, e. g. after a disastrous event. The sensors are either based on the ground or mounted on aircraft and spacecraft. Data acquired from these systems are combined with related information to produce up-to-the-minute maps, track weather events, or create databases of urban infrastructure and much more. GIS, therefore, have become an ideal environment for conducting multifaceted, spatially oriented studies of risk. This induced a renaissance of risk mapping. The mapping focus, however, shifted from displaying (historical) evidence to up-to-date analysis and inspection information for various businesses.8

The hazard component of risk is usually modelled based on cause-effect relationships. In the case of (extreme) natural events this can be founded on detailed scientific and geophysical data collected by measurements, and on observations of losses registered. If human-induced events are to be investigated, data is quite scarce, especially for small-scale events, and the information available might not be representative of other events to come (Kunreuther et al. 2005). This is especially true for malicious attacks, as human actions can be directed to produce damages of a certain extent while taking into account potential protection measures. Moreover, data collection processes for human-induced events are not standardized.

In catastrophe modelling, the exposure module contains geographic data on the location of the elements at risk (address, postal code, CRESTA zones9, etc.) as well as information on typical physical characteristics of the exposed objects and subjects. This includes property data such as the construction type of houses, the year they were built, or the number of stories. Regarding the population, data such as age and gender distribution, average daytime population, and maximum capacity of schools, hospitals, shopping centres or stadia have to be collected (Balmforth et al. 2005).

The creation of synthetic events is possible via probabilistic models and computer simulation or by the deterministic approach of scenario planning. Probabilistic models are used to generate damage distributions or a set of events that can be used to calculate such a distribution. Scenario models provide a smaller array of estimates, but can easily be adapted for individualized loss assessments. In conjunction with the hazard module, both approaches are able to differentiate between various qualities of events (is it a hurricane or a tornado? an industrial accident or vicious sabotage?), related intensities (e. g. in terms of wind speed or pressure waves of a bomb blast) and estimated frequency (Borst et al. 2006).

8 E. g., real estate: assessment of property values or online listings with aerial photos; utility and telecommunication planning: review of new housing development; agriculture and forestry: appraisal of drought damage; insurance: prospective exposure analysis and claims management – to name just a few.

9 CRESTA means ‘Catastrophe Risk Evaluation and Standardizing Target Accumulations’. This working program was set up by the insurance industry in 1977; it aims at establishing a globally uniform system for the accumulation risk control of natural hazards. GfK MACON is the official supplier of CRESTA maps comprising 13222 zones and 455 subzones of all countries in the world (www.cresta.org or www.gfk-macon.com/cresta).
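To make the probabilistic approach to synthetic events more concrete, the following sketch simulates many years of events and derives an empirical damage distribution from them. It is a toy example under explicit assumptions: event arrivals follow a Poisson process and event losses are exponentially distributed. Neither choice comes from the text, and the parameters are invented; a real model would draw intensities from the hazard module and pass them through damage functions.

```python
import random


def simulate_annual_losses(n_years: int, annual_rate: float,
                           mean_loss: float, seed: int = 0) -> list[float]:
    """Simulate total loss per year for n_years synthetic years.

    annual_rate -- assumed average number of events per year
    mean_loss   -- assumed mean loss per event (exponential severity)
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    losses = []
    for _ in range(n_years):
        total = 0.0
        # Exponential waiting times between events yield Poisson event counts.
        t = rng.expovariate(annual_rate)
        while t < 1.0:
            total += rng.expovariate(1.0 / mean_loss)
            t += rng.expovariate(annual_rate)
        losses.append(total)
    return losses


def exceedance_probability(losses: list[float], threshold: float) -> float:
    """Empirical probability that the annual loss exceeds the threshold."""
    return sum(loss > threshold for loss in losses) / len(losses)


if __name__ == "__main__":
    sample = simulate_annual_losses(10_000, annual_rate=0.5, mean_loss=10.0)
    for threshold in (0.0, 10.0, 50.0):
        print(threshold, exceedance_probability(sample, threshold))
```

A scenario model, by contrast, would evaluate a handful of hand-picked events instead of a large random set, which is why it yields fewer estimates but is easier to tailor to a single risk.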


Damage functions represent a quantitative estimate of the impact produced by a hazard phenomenon on the elements at risk. Damage functions reflect a thorough and detailed understanding of the local conditions and practices (Clark 2002). These are “determined by physical, social, economic, and environmental factors or processes, which increase the susceptibility of a community to the impact of hazards” (UN/ISDR 2004, vol. II), and thus its vulnerability. The mapping of locations is therefore essential for any kind of vulnerability assessment. (UN/ISDR 2004) defines risk as “the probability of harmful consequences, or expected losses (deaths, injuries, property, livelihoods, economic activity disrupted or environment damaged) resulting from interactions between natural or human-induced hazards and vulnerable conditions”. This refers to direct and indirect10, economic and human losses including latent (environmental) conditions that may represent future threats. All personal and commercial insurance lines – from property and casualty to life, accident and health, or reinsurance – are therefore affected. The following examples illustrate how insurers employ GIS and other information technology for assessing their underwriting risk.

14.2.2 Assessing Hazard, Exposure, Vulnerability and Potential Loss

Only risks that are sufficiently known can be assessed regarding the consequences for the risk carrier. Before a (financial) risk is transferred to an insurance company it will, therefore, be evaluated according to its individual characteristics and as to its fit with other risks accepted so far. This includes analyses of spatial interdependencies.11 Thanks to advances in information technology the century-old craft of risk mapping for insurance purposes has been revived: Fire insurance maps originated in London in the late 18th century, as insurers attempted to gain accurate information about insured buildings (Nehls 2003). As in other parts of the world, fire was one of the most troublesome problems: The large number of wooden buildings, constructed in close proximity to one another, primitive lighting systems, and unsafe ovens and furnaces all contributed to an environment where fires were frequent and often devastating (Ericson 1985). In 1867, D.A. Sanborn founded a mapping company in New York that still exists, holding over 1.2 million historic and updated maps of approximately 12,000 American cities and towns. Sanborn employed surveyors in each state of the U.S. and systematized the map-producing process, introducing standards for accuracy and design that the company published in 1905. The large-scale maps (1:600) detailed the location, footprint and material composition of all buildings – mainly in the downtown areas – including framing, flooring, and roofing materials, the number of stories, the location of windows, doors, water and gas mains, fire walls as well as building use – thus depicting important components of the fire hazard. This was complemented with information on nearby baker’s ovens, stored kerosene, blacksmith forges or the location of gas lines. The surveyors also noted vulnerability-related characteristics like street and sidewalk widths, prevailing wind directions, number and type of fire hydrants, railroad corridors and nearby rivers as well as the strength of local fire departments (www.edrnet.com/reports/key.pdf).

10 Indirect losses such as business interruption are a consequence of the originating physical destruction. For a detailed discussion cf. Chapters 2 and 3 of (NRC 1999).

11 The actuarial assumption of independent risk groupings is needed for mathematical reasons; empirical features of risk groupings include space- and time-dependent accumulation effects.

These ‘Sanborn Maps’ served the same purpose as GIS does for property and casualty insurance today: Individual objects can be inspected without personal examinations of the buildings and their surroundings. This is useful when underwriting new risks, or for renewals with expanded coverage. Even an underwriter with no knowledge of the territory would be provided with the information necessary to make an informed decision about the risk in question. Hazard maps are created to show, e. g., landslide and fire zones or areas with more or less criminal activity, thus serving as a quick identifier of the hazard potential for a certain address or zip code. Moreover, geographical data can be combined with economic data such as typical asset values, thus displaying economic aspects of exposure; or it may be blended with statistics and reports on previous damaging events to support assumptions about the vulnerability to certain hazards. Underwriting can be facilitated further by calculating indicators that sum up the totality of risks identified. (Munich Re 2002), e. g., demonstrates an approach for assessing the risk of megacities regarding material losses due to various natural hazards. This macro- and mesoscale information may be supplemented by more detailed information, e. g. on fire and police stations in the area. Remote sensing is useful for identifying the composition of roof material and the types of building materials used on a structure. All this assists in the decision on whether to insure any particular property for fire, burglary, etc., and what premiums to charge for insurance. More detailed and reliable data about the risks accepted reduces uncertainty, which in turn allows reducing the safety loadings on the net premiums calculated. It also facilitates premium differentiation according to the relative risk in the areas covered.

Hazardous installations may be examined with the focus on their surroundings. As (Munich Re 2002) points out, the exposure accumulation within a 5-km radius of a chemical plant can be specified, for instance regarding physical, economic, or human and social effects. And if the prevailing wind directions are registered as in the Sanborn maps, hitherto underestimated loss scenarios can be identified and simulated, which might supply important information for pricing. Figure 14.3 illustrates this approach by implicitly referring to health risks of people living in an area with a high density of chemical plants, the Rhine-Neckar-Triangle in Germany. This information could easily be complemented and enhanced by focusing on the addresses of persons insured in life and health lines. Another part of the financial exposure of the company in case of hazardous incidents could be outlined by emphasizing nearby critical facilities or zones of heightened vulnerability due to heavy daytime use like schools or shopping districts: such a situation might translate into environmental liability compensations if one or more of the chemical companies have signed up for this kind of insurance.


Figure 14.3. Chemical plants in the Rhine-Neckar-Triangle, Germany
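The exposure accumulation within a 5-km radius of a hazardous installation, as mentioned above, boils down to a distance filter over geocoded risks. The sketch below shows one way to do this with the haversine great-circle distance. The coordinates and sums insured are made up for illustration, and a production GIS would typically use spatial indexes and projected coordinates instead of a plain loop.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accumulation_within(site: tuple[float, float],
                        risks: list[tuple[float, float, float]],
                        radius_km: float = 5.0) -> float:
    """Total sum insured of all risks (lat, lon, value) within radius_km of site."""
    return sum(
        value
        for lat, lon, value in risks
        if haversine_km(site[0], site[1], lat, lon) <= radius_km
    )


if __name__ == "__main__":
    plant = (49.50, 8.40)  # hypothetical plant location
    insured = [
        (49.50, 8.40, 100.0),  # at the plant itself
        (49.52, 8.40, 200.0),  # roughly 2 km to the north
        (49.60, 8.40, 400.0),  # roughly 11 km to the north
    ]
    print(accumulation_within(plant, insured))
```

Running the same filter with a wind-dependent, non-circular footprint instead of a fixed radius would be the natural next step towards the loss scenarios discussed in the text.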

Geographic data and its integration with other data relevant to the insurance of objects, subjects and liabilities provide insurance companies with the possibility to visualize their exposure in terms of risk accumulations and total coverage in certain areas. As another example, Figure 14.4 shows the location of embassies in Berlin, Germany. They are represented according to a vulnerability index for terrorist attacks which takes into account permanent political characteristics as well as current critical international situations (Borst et al. 2006). It is evident that some of the embassies that carry a higher degree of vulnerability are located at close distances to each other, thus generating a ‘hot field’ that radiates to the surrounding areas. If an insurance company wants to judge its exposure regarding buildings in the vicinity of embassies, it has to consider, however, that the more attractive potential targets are to terrorists, the more protection will be provided to fend off attacks. This may result in heightening the danger for other embassies as long as they are less protected (Kunreuther and Heal 2003). In this case, GIS can only be a starting point for a more thorough analysis of underwriting risk. Investment risk, on the other hand, might be related more directly to the vicinity of real estate and ‘hot fields’. Both kinds of risk may induce risk selection processes on the part of the insurer since the risk capital allocated to each business segment has to be placed in the most beneficial way. (Munich Re 2002; Munich Re 2003) explain this with the example in Figure 14.5 that has been extended for our purposes: The reinsurer uses a rating tool for geocoding new risks based on their addresses in order to compare them with risks already accepted. The probable maximum loss amount in the vicinity of the new risks is indicated by the broken line circle. In this way, accumulation situations can be identified in a timely manner; they can be calculated in advance and avoided, if necessary. Even certain investment risks might be taken into account as long as they are connected to geographical locations.

Figure 14.4. Locations of embassies in Berlin, Germany

Figure 14.5. Accumulation control of underwriting and investment risk (kinds of risk shown: new underwriting risk, accepted underwriting risk, investment risk (real estate))

Data from various sources is also required for assessing covariant risks and domino effects such as lifeline and business interruption, e. g. following disastrous events in urban areas. Past events and their evolution may serve as a background data pool for complex risk analyses aiming at the detection of loss patterns and cause-effect relationships. These observations could be combined with current data, e. g. satellite imagery on traffic situations or land use, to simulate the likely propagation of damages under present conditions. This may also foster the understanding of non-observable effects and assist in the development of predictive models.

GIS have been installed and used by a broad range of industries as well as governmental agencies and the military.12 Insurers introduced them mainly in cooperation with consultants who, at the same time, provided the data necessary for detailed risk assessments. In general, the insurance business still needs to refine the granularity of data collected before an extended employment of geographical underwriting is possible. In the past, business was often written on the basis of large rating zones on the state or county level, or numerous individual risks with multiple locations were drawn together in one contract (Munich Re 2002). This aimed, of course, at reducing the cost of data collection, management and storage. Geocoding, on the contrary, implies that not only the addresses or locations of each risk transferred have to be recorded but also other situational attributes relevant for its coverage. Ideally, data should be collected and stored in a standardized way, in cooperation with clients and reinsurers. For assessing human-induced hazards such as industrial accidents or terrorist attacks, a fine resolution of data on the street or address level is necessary (Munich Re 2003). This also applies to weather hazards like flooding, where a few metres on the ground or even centimetres of water level can cause a big difference in damages to be expected. Figure 14.6 shows the potentially flooded areas (marked by the dark line) and the corresponding water levels (the darker the higher) for the historic city of Heidelberg. For other hazards such as earthquake or windstorm, data for 5-digit postcodes are suitable (Munich Re 2003). Insurance companies have to consider carefully whether they want to rely on external expertise for risk assessment or assume this part of their business. Modern information technology provides a number of flexible, user-friendly applications and services that assist in linking risk assessment with the risk management process.

Figure 14.6. Flood risk identification map provided by the state of Baden-Württemberg; (http://rips-dienste.lfu.baden-wuerttemberg.de/rips/hwgk/Default.aspx)

12 A very well-known non-proprietary application is HAZUS-MH, developed by the U.S. Federal Emergency Management Agency (FEMA) under contract with the National Institute of Building Sciences (NIBS). It contains models for estimating potential structural and economic losses from earthquakes, floods, and hurricanes as well as impacts on the population. Another GIS-based tool developed by the U.S. Defense Threat Reduction Agency (DTRA), FEMA, and Science Applications International Corporation (SAIC) is CATS – the ‘Consequences Assessment Tool Set’. It serves government agencies and military institutions to estimate and analyze effects due to extreme natural events and human-induced disasters including industrial accidents and terrorist attacks.

14.2.3 Information Management – Chances and Challenges

The term GIS is used to denominate either just the computer systems capable of capturing, storing, manipulating, updating, analyzing, and displaying geographically referenced information (USGS), or it includes the graphic data13 and associated attributes collected in relational database systems. The overall function of Geographic Information Systems, however, is to support the management of data acquisition, its integration and distribution. Insurance companies with their large databases, mixed network computing environments and distributed processing needs are typical users of full-blown GIS with intra- and internet-based services. A central data repository and a development tool such as SDE (Spatial Database Engine from ESRI) allow the professional user to store and manipulate all variations of geographic features while sharing the data in a client/server environment over large networks or even via (standard) web browsers. PC-based graphic user interfaces facilitate model calculations, displays, and risk assessments and provide the convenience to view GIS data from anywhere in the world in a matter of minutes. This is extremely valuable for a business that covers extended geographic areas.

An enterprise-wide application of GIS has been introduced by PartnerRe, a global reinsurer headquartered in Bermuda (www.esri.com/library/fliers/pdfs/cspartnerre.pdf). A centralized system set up for all risks and offices guarantees worldwide consistent data and calculation methods. Conservative loss scenarios are used for evaluating single risks and the total exposure of the company. Underwriters, therefore, are able to proceed based on real-time information about used and remaining reinsurance capacity by peril zone. Another goal of the implementation project was to provide timely post-event loss estimations. This can be achieved by a satellite data acquisition and analysis process. It has been developed to enable the overlaying of an event’s geographic spread and intensity against the company’s exposures.14

Other reinsurers with longstanding international experience have continuously been updating the internet access to their digital atlases, libraries and services. Swiss Re, e. g., addresses its business partners via a catastrophe network that disseminates data using an ArcIMS application. The worldwide natural hazard atlas provides information on about 650,000 locations and is available over the internet. Other electronic services include, inter alia, a life and health underwriting system intended to automate the application process for regular risks, or an e-business solution for property and casualty lines with online access to reinsurance quotes and capacity, as well as account management.15 Munich Re, too, has installed a new internet-accessible platform to connect with its cedants; specific versions for brokers and other business partners are currently being designed (www.munichre.com/assets/pdf/connect/2005_07_connect_flyer_eng.pdf). The platform offers a wide variety of analytical and transaction tools for product development, underwriting, risk management, or claims settlement. The reinsurer also created an interactive online tool for its “Natural Hazards Assessment Network (NATHAN)”, which is published in two versions – the ‘light’ one without insurance statistics is open to the public at the company’s website. It covers about the same topics and range of locations as Swiss Re’s digital atlas. Both systems allow the user to drill down through superimposed layers of data to discover the information provided and data associated with a designated location – for example earthquake and windstorm hazards within 30 km of a preselected city.

Beyond its use for reaching out to business partners, IT and especially worldwide digital maps surely have the potential to raise the interest of the general public in issues related to geography.16 The technology, therefore, can support the marketing and communication of risk mitigation measures, which in turn contribute to risk reduction on the individual and societal level. This approach has been used by insurance companies, NGOs and local, state or federal authorities. It has to be carefully planned and prepared in cooperation with IT professionals, risk experts and data providers: On the one hand, lots of potentially useful data for risk assessment is still missing, not publicly available or hard to obtain. To name just a few examples: Research on daytime distributions of populations around hazardous installations has just started (Balmforth et al. 2005), and access to data on elements sensitive to terrorist actions such as critical infrastructure is restricted (Kunreuther et al. 2005); the same can be experienced with data on technological hazards needed to judge possibilities of damage propagation due to supply chains or other interdependencies between businesses.

Contrary to this shortage, there is an overwhelmingly large amount of primary data being collected from such diverse media as overhead imagery, real-time global positioning feeds and electronic sensors. They come in various formats and different scales. The sources providing these data are quite diverse, too, ranging from authorities (official census data or nation-wide mapping projects17) to research institutes or private companies offering online library and other web-based support services18. A strategic agenda for integrating these multi-source geographic data into existing databases and models of decision support and risk management is, therefore, necessary. In any case it has to be guaranteed that the data delivered correspond to certain quality standards, and legacy databases might need an update for enabling geospatial analyses. Special consideration should be given to data that was collected as paper maps, CAD drawings or observations from field surveys since a conversion to digital format is possible. Other services offered by professional providers of geospatial solutions include geocoding of addresses, joint analysis of two- and three-dimensional remote sensing data such as Synthetic Aperture Radar (SAR) and Light Detection and Ranging (LIDAR), satellite image processing19 and interpretation, as well as workflow design. The IT infrastructure could be prepared for applications like live mapping, dynamic spatial modelling, automated change detection, and location-based services. Alternatively, subscriptions with professional providers of these functions might be used. In this way, only relevant data are transmitted, even automatically, which reduces the more time-consuming queries on demand (Abler and Richardson 2003).

13 Point (also called node), line (or arc) and area (or polygon) in either vector or raster form which represent a geometry of topology, size, shape, position and orientation. Spatial features are stored with longitude and latitude assigned to each location. This is called geocoding or georeferencing.

14 It is conceivable that a continued integration of GIS with business intelligence solutions such as SAS BI further improves the efficiency and transparency of business processes. This and the enhancement of reporting capabilities was also a goal of PartnerRe’s project.

15 For technical aspects cf. Bayerl (2005).

16 Google Earth and NASA’s World Wind, e. g., are open to everybody and employ visualization tools in 3-D that allow zooming from satellite altitude to earth. Additional guides and features can be accessed through a simplified menu (www.earth.google.com or http://worldwind.arc.nasa.gov).

17 Projects like “The National Map” combine partners from U.S.G.S, federal agencies, state and local governments, NGOs and private industry (U.S. Department of the Interior/U.S. Geological Survey: The National Map: The Nation’s Topographic Map for the 21st century. http://nationalmap.gov/nmreports.html, viewer at http://nmviewogc.cr.usgs.gov/viewer.htm).

18 GlobeXplorer, for instance, has a large commercial library of georeferenced digital aerial imagery. It is distributed over the internet using a proprietary platform that dynamically searches, retrieves, compacts, and delivers images from an online database (www.globexplorer.com). This company cooperates with CDS Business Mapping that provides risk analysis tools for insurers, e. g., automated property lookups for underwriting (www.riskmeter.com).

19 This involves georeferencing, orthorectification, color balancing, resolution merging across satellite platforms and more, to ensure the desired quality in terms of clarity, color and resolution, even over larger areas.
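Section 14.2.3 mentions that a centralized system lets underwriters see used and remaining reinsurance capacity by peril zone in real time. A minimal sketch of such bookkeeping is given below. The class, the zone label and the limits are hypothetical and only illustrate the idea of checking a conservative scenario loss against the remaining capacity before a risk is accepted; they do not describe any reinsurer's actual system.

```python
class CapacityLedger:
    """Tracks used and remaining capacity per peril zone (illustrative only)."""

    def __init__(self, limits: dict[str, float]):
        self.limits = dict(limits)
        self.used = {zone: 0.0 for zone in self.limits}

    def remaining(self, zone: str) -> float:
        """Capacity still available in the given peril zone."""
        return self.limits[zone] - self.used[zone]

    def try_accept(self, zone: str, scenario_loss: float) -> bool:
        """Book a new risk if its scenario loss fits into the remaining capacity."""
        if scenario_loss > self.remaining(zone):
            return False  # accepting would exceed the zone limit
        self.used[zone] += scenario_loss
        return True


if __name__ == "__main__":
    ledger = CapacityLedger({"EQ-ZONE-1": 100.0})  # hypothetical zone and limit
    print(ledger.try_accept("EQ-ZONE-1", 60.0), ledger.remaining("EQ-ZONE-1"))
    print(ledger.try_accept("EQ-ZONE-1", 50.0), ledger.remaining("EQ-ZONE-1"))
```

Keeping one shared ledger for all offices is what makes the figures consistent worldwide; in practice this would sit in a central database rather than in memory.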


14.3 IT as a Tool for Managing the Underwriting Risk

14.3.1 Disaster Assistance

Not all disasters come by surprise, though the warning times differ considerably. Extreme natural events such as volcanic eruptions and wildfires can be observed as they develop. Figure 14.7 displays a fire watch map provided by the Australian Department of Land Information. It shows the locations of wildfires (bright stars in the centre) as mapped from satellite imagery on April 16, 2006 for Northwestern Australia, and a large number of lightning events (dark squares) registered for the same day. The data is available in near-real time, i. e. within two hours of the satellite passing over, and can be used in disaster modelling. In this case, the topographical and vegetation (fuel) data are fed to a combustion model, which simulates the spread of the fire from its origin, depending on further data input such as wind speed and direction as received over the internet from weather services. The simulation results are displayed with GIS over the topography to assist in assessing the consequences (elements exposed and their vulnerability), for managing the vegetation and other mitigation measures, as well as evacuation planning. Insurance companies may use this information system as a support tool for deciding where to set up catastrophe operations and how many loss adjusters or additional helpers are needed.

Figure 14.7. Fire watch map of Northwestern Australia, 04-16-2006; (http://firewatch.dli.wa.gov.au/landgate_firewatch_public.asp)

14 Risk and IT in Insurances


Larger insurance companies employ designated teams of specialists who assist in response efforts by establishing temporary claims offices in the field. After the 2004 tsunami in Southeast Asia, Allianz Group set up clearly identifiable information booths on the island of Phuket, Thailand, and opened a 24-hour care centre for tsunami victims. It supported local aid organizations in their efforts to supply emergency shelter, medication and clean water (www.allianz.com/azcom/dp/cda/0,,610273-44,00.html). And during the Katrina catastrophe in summer 2005, AXA Art Insurance took care of art objects in the New Orleans area which otherwise would have been destroyed or heavily damaged (footnote 20). As a result of lessons learned from Hurricane Andrew, which devastated parts of Florida in 1992, the insurer Travelers developed mobile claim headquarters (www.travelers.com/spt01portalmain.asp?startpage=/tcom/search/searchredirect.asp?attr1=disaster; Gallagher 2002). These custom-built vehicles are equipped with onboard computers and databases, printers, photocopiers, cell phones, fax machines, and even two generators to supply electricity when the power goes out. Thus, debit cards can be issued on site and handed to clients in distress. Claims assessors may carry cell phones, pagers and digital cameras, and use Personal Digital Assistants, tablet PCs (footnote 21) or notebooks with wireless networking capabilities to connect to a central server for information retrieval and update in real time. With a GPS toolbar, GPS measurements can be captured directly into a (central) geo-database. This is extremely useful when working in heavily destroyed areas where no street signs or landmarks are left: even the locations of buildings that have been blown away or damaged beyond recognition can be retrieved, and policyholders identified. Moreover, GPS indicates the proximity to resources for human assistance.
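As an illustration of the last point, the proximity of a GPS position to assistance resources can be computed with the haversine great-circle distance. The sketch below is an assumption about how such a lookup might work, not a description of any insurer's system; the function names and the idea of a simple name-to-coordinate dictionary are hypothetical.

```python
import math
from typing import Dict, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in decimal degrees

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two GPS positions in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_resource(position: LatLon, resources: Dict[str, LatLon]) -> Tuple[str, float]:
    """Return the name of the closest resource and its distance in km."""
    name = min(resources, key=lambda n: haversine_km(position, resources[n]))
    return name, haversine_km(position, resources[name])
```

A loss adjuster's device could call nearest_resource with its current GPS fix and a list of shelters, hospitals or mobile claim offices; the same distance function also works for matching a captured position to the nearest insured address in the geo-database.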
Footnote 20: The company hired private security teams and helicoptered them to over 90 locations in the area. They also installed generators to control humidity and removed art to other safe locations (www.axa-art.com/about/m_092905.html).

Footnote 21: These come with handwriting recognition and speech-recording functionality; some of them are designed specifically for field use, have outdoor viewable screens and have been ruggedized to survive dropping and resist moisture and dust. For a case study and tutorial on investigating insurance claims using ArcGIS 9 and a tablet PC, see www.esri.com/news/arcuser/0205/files/tablet_pc.pdf.

The functionality of the various devices that an individual field force member uses may even be extended through a personal area network (PAN) as supported by Bluetooth. IT therefore enables loss adjusters to process all the information necessary for damage estimates right on site, which allows clients to file claims swiftly. In turn, the insurer’s GIS can be updated in real time by incorporating actual claim information as it becomes available. This is valuable input for early loss assessments, for damage zoning, and for reviews of loss reserving. Furthermore, a map with optimal routing and driving directions to the locations of claims made can be set up for the adjusters. Munich Re (2003) gives an example of how satellite data may be applied for loss estimations during disasters. Figure 14.8 is an adapted version of Munich Re’s image and shows a small section of an area flooded during the August 2002 events in Europe. The flooded area is hatched in dark grey. The light-grey borders and captions relate to postcode areas, and the individual risks are represented by white dots. Thus it is possible to distinguish clearly between policyholders that are affected by the flooding and those that are not. The claims made are pictured as white triangles, which is helpful for deriving loss zones. Claims relating to risks located outside of these zones can be identified and clarified in subsequent investigations.

Figure 14.8. Linking up flooding information with portfolio data: individual risks as white dots and claims made as white triangles; adapted from (Munich Re 2003)
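The overlay of flood extent, portfolio and claims described for Figure 14.8 amounts to a point-in-polygon test. The following sketch uses the standard ray-casting test on plane coordinates; the data layout (risk identifiers mapped to coordinates, claims as a set of identifiers) and the function names are assumptions made for illustration, and a production GIS would use its own spatial operators.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how often a ray to the right crosses the boundary."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_portfolio(
    risks: Dict[str, Point], claims: Set[str], flood_zone: List[Point]
) -> Tuple[Set[str], Set[str]]:
    """Return (risks inside the flood zone, claims from outside the zone)."""
    affected = {rid for rid, loc in risks.items() if point_in_polygon(loc, flood_zone)}
    to_review = {rid for rid in claims if rid not in affected}
    return affected, to_review
```

The first returned set corresponds to the policyholders affected by the flooding, the second to claims that relate to risks outside the loss zone and may therefore call for a follow-up investigation.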

14.3.2 Claims Handling

The mapping of claims made with the help of IT is also useful during non-disaster periods, since their spatial patterns can indicate risk sources not yet considered. It is further necessary for optimized service management and for fraud prevention. The Risk Management Agency (RMA) of the U.S. Department of Agriculture employs data mining and remote sensing in its fight against abuse in the crop insurance program (www.rma.usda.gov/news/2005/12/1230compliance.html): A centralized data warehouse provides all crop insurance-related data collected over time, so that producers whose claiming patterns appear to be atypical can be identified. These clients are informed, and some of them are inspected during the growing season. This has led to a continual and substantial reduction in indemnities paid during the past years. To facilitate the spot-checking, investigators are trained to acquire high-resolution Landsat imagery from the USDA Image Archive (footnote 22) and make preliminary determinations either to approve a crop insurance claim or to forward it to a remote sensing expert for further investigation.

Footnote 22: Cf. www.fas.usda.gov/pecad/remote.html for information on the Production Estimates and Crop Assessment Division with links to other interesting projects.

In Germany, RapidEye (www.rapideye.de) is preparing geo-information services and analyses based on a combination of satellite technology, IT and e-commerce. The system consists of five satellites carrying multi-spectral imagers (MSI), ground stations, a data processing centre and internet-based information distribution, all designed and built by an international team of technology providers. The satellites are scheduled for launch in 2007 and are then expected to provide images of each point on earth on a daily basis. This high frequency and extended coverage is meant to ensure up-to-date information even under temporarily cloudy conditions. Insurance companies may use the data provided by RapidEye in multiple ways, either directly or via geographic information services. As far as claims handling in agricultural insurance is concerned, continuous remote monitoring of crop type and plant vitality during the growing season will be possible, ideally in cooperation with the producers. This facilitates an early determination of yield loss, since unwelcome changes in the crop can be discerned by comparing multi-temporal data sets. Other conceivable applications include the monitoring of temporary changes in regional traffic intensity (e.g. due to holiday seasons), which is helpful for roadside assistance programs. In general, call centres of insurance companies using GIS can support customers in finding preferred providers for auto repair and other necessities in case of a breakdown on the road. Additionally, web-based interactive information systems can be created for clients equipped with GPS functionality. Remote sensing also allows the tracking of mobile objects such as vessels and vehicles, including their cargo, which should help prevent fraud in transport insurance lines.
Since tracking a truck implies tracking the driver and his or her driving behaviour, the technology could even help to reduce the number of accidents due to excess hours of driving or speeding. It may further assist in the recovery of losses from stolen cars, as long as these cars have GPS devices installed.
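The data-mining step described earlier in this section, identifying producers whose claiming patterns appear atypical, can be approximated with a robust outlier score. The sketch below uses the median and the median absolute deviation (MAD) of loss ratios. It is an assumed stand-in, since the source does not say which method RMA applies; the producer identifiers, the loss-ratio input and the threshold of 3.5 are illustrative.

```python
from statistics import median
from typing import Dict, List


def flag_atypical(loss_ratios: Dict[str, float], threshold: float = 3.5) -> List[str]:
    """Return producers whose loss ratio is unusually high for the group.

    Uses the modified z-score 0.6745 * (x - median) / MAD, which is less
    distorted by the outliers themselves than mean and standard deviation.
    """
    values = list(loss_ratios.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values (nearly) identical: nothing stands out.
        return []
    return sorted(
        producer
        for producer, ratio in loss_ratios.items()
        if 0.6745 * (ratio - med) / mad > threshold
    )
```

Producers returned by such a screen would not be treated as fraudulent; in line with the procedure described above, they would be candidates for notification and spot checks during the growing season.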

14.3.3 Market Development

Remote sensing further serves innovative approaches in insurance marketing, e.g. when fine-tuning premium rates according to the individual risk transferred. As mentioned before, flooding risk needs to be assessed on a small-scale basis, and available two-dimensional information on flood zones is too coarse since it does not capture risk characteristics like a building’s elevation compared to the surrounding area. Here, Digital Elevation Models (DEM) employed in conjunction with remote sensing data are helpful for distinguishing between risks at the level of individual addresses and their susceptibility to flooding.

Auto insurance is another line of business where individual risk ratings are feasible thanks to IT. The popularity of GPS in cars for its navigational and safety features can be used for premium differentiation, provided the driving history is recorded. Monitoring driving behaviour delivers quantitative measures of subjective risk factors that might be more reliable than the rating parameters applied so far: It is not unreasonable to expect that data on when, how long, and where a car is actually driven gives a better representation of the accident risk than knowledge about the owner’s profession, age and sex, or where the car is garaged (Holdredge 2005). Good drivers will be rewarded by lower premiums, and insurance companies using this individual risk rating approach will be rewarded by attracting good drivers.

GIS software and associated demographic and socio-economic data packages further allow insurers to profile existing customers not only according to their contract or claims history and other data recordable along with the insurance business (footnote 23) but also regarding data related to their geographical characteristics. These are, e.g., purchase patterns or average income for the zip code area they live in. Data mining may then result in distinguishable groups of customers with specific claims expectations. This can be mapped for further investigations. Consequently, new business will be sought in areas and among persons with characteristics similar to the ‘good risks’ discovered before. Microgeographic marketing developed about ten years ago based on the hypothesis that neighbours must have more in common than just the zip code or street address.

Maps may further be used to visualize the context for insurance buying behaviour by representing the kind and extent of hazards existing around clients’ residences. Other important information includes socio-economic data like income or age group as well as risk mitigation measures on the individual, local, and regional level. This is basic to locating target markets with the intent of marketing to target customers. They are the models to which insurance products, accompanying services and their prices are fitted. Information on preferred insurance acquisition channels is valuable for the (re)engineering of distribution networks ranging from insurance representatives and cooperating banks or retail businesses to intermediaries like employers, sports clubs, and the internet. A mapping of these agents and their networks might display geographic areas where more, less, or different distribution support is needed. Once again, GIS can be used to optimize the routing of individual sales agents and to coordinate the sales areas allotted to them. Moreover, agents may be matched with clients following demographic or socio-economic criteria as well as kind of business. This is, of course, helpful for enhancing the communication between agent and client. Risk mapping has additional benefits since the visualization provided makes it easier for the agent to explain hazard, exposure, vulnerability and risk to the customer.

Footnote 23: Which channel was selected for buying insurance: internet, phone, or agent? What is known about the personality of the client (call centers do check on this!)? Which insurance lines have been signed by an individual, her family? What is the customer’s satisfaction level with insurance and other services? What about her price sensitivity? Is there information available regarding risk taking behavior or risk management?
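The role of a DEM in address-level flood rating, as discussed at the start of this section, can be illustrated with a small sketch: a building's elevation is compared with the mean elevation of the surrounding grid cells. The grid layout, the function name relative_elevation and the choice of a square neighbourhood are assumptions for illustration only; actual rating models combine elevation with hydrological data.

```python
from typing import List


def relative_elevation(dem: List[List[float]], row: int, col: int, radius: int = 1) -> float:
    """Elevation of cell (row, col) minus the mean of its neighbours.

    dem is a rectangular grid of elevations in metres. A negative result
    means the cell lies in a local depression, i.e. it is plausibly more
    exposed to flooding than a flat flood-zone map would suggest.
    """
    rows, cols = len(dem), len(dem[0])
    neighbours = [
        dem[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
        if (r, c) != (row, col)
    ]
    if not neighbours:
        raise ValueError("DEM must contain at least one neighbouring cell")
    return dem[row][col] - sum(neighbours) / len(neighbours)
```

Two addresses inside the same two-dimensional flood zone can thus receive different ratings: one on a slight rise yields a positive value, one in a hollow a negative value.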


Market development will increase the quantity and improve the quality of insurance business written. Both effects are able to reduce the underwriting risk of insurers.

14.4 Conclusion

This article could give only an overview of the topic with some exemplary cases. Though this is by no means an exhaustive treatment of the many applications IT has to offer for risk analysis and risk management in insurance, it hopefully illustrates the broad array of possible implementations. Each investment in technology and in human and organizational resources has, of course, to be considered carefully with regard to cost and benefit. Business reengineering in particular involves time-intensive processes whose costs have to be weighed against benefits that are hard to assess. The usability of IT as discussed here is, however, steadily improving, which reduces the costs of implementation on all levels of the enterprise. This includes partners of an extended GIS network as well as clients.

References

Abler F, Richardson DB (2003) Geographic Management Systems for Homeland Security. In: Cutter SL, Richardson DB, Wilbanks TJ (eds) The Geographical Dimension of Terrorism. Taylor and Francis, New York and London, pp. 117–126
Balmforth H, McManus H, Fowler A (2005) Assessment and management of ‘at risk’ populations using a novel GIS based UK population database tool. Safety and Security Engineering, WIT Transactions on The Built Environment 82. Retrieved February 18, 2007, from http://library.witpress.com/pages/PaperInfo.asp?PaperID=15133
Bayerl C (2005) GEO Services at Swiss Re: Maintaining a company-wide centralized geodatabase. Paper delivered at the ESRI User Conference 2005. Retrieved February 18, 2007, from http://gis.esri.com/library/userconf/proc05/papers/pap1194.pdf
Borst D, Jung D, Syed MM, Werner U (2006) Development of a methodology to assess man-made risks in Germany. Nat. Hazards Earth Syst. Sci. 6: 779–802
Clark K (2002) The Use of Computer Modeling in Estimating and Managing Future Catastrophe Losses. The Geneva Papers on Risk and Insurance 27: 181–195
Dexter N, Ford C, Jakhria P, Kelliher P (2006) Quantifying Operational Risk in Life Insurance Companies, developed by the Life Operational Risk Working Party. Retrieved February 18, 2007, from http://www.actuaries.org.uk/files/pdf/sessional/sm20040322.pdf


Ericson T (1985) Sanborn Fire Insurance Maps 1867-1950, Exchange 27/3, newsletter published by the Wisconsin Historical Society. Retrieved February 18, 2007, from http://www.wisconsinhistory.org/localhistory/articles/sanborn.asp
European Commission (2003) Design of a future prudential supervisory system in the EU – Recommendations by the Commission Services. Retrieved February 18, 2007, from http://www.europa.eu.int/comm/internal_market/insurance/docs/markt-2509-03/markt-2509-03_en.pdf
Gallagher J (2002) Road Map to Connectivity. Insurance & Technology, Sept. 13, 2002. Retrieved February 18, 2007, from http://www.insurancetech.com/resources/fss/showArticle.jhtml?articleID=14706023
Grossi P, Kunreuther H, Windeler D (2005) An introduction to Catastrophe Models and Insurance. In: Grossi P, Kunreuther H (eds) Catastrophe Modeling: A new approach to managing risk. Springer, New York, pp. 23–42
Holdredge WD (2005) Gaining position with technology. In: Towers Perrin (ed) emphasis 2005/3: 2–5. Retrieved February 18, 2007, from http://www.towersperrin.com/TILLINGHAST/publications/publications/emphasis/Emphasis_2005_3/Holdredge.pdf
IAIS (International Association of Insurance Supervisors, 2005) Guidance Paper on Risk Transfer, Disclosure and Analysis of Finite Reinsurance. Retrieved February 18, 2007, from http://www.iaisweb.org/061021_GP_11_Guidance_Paper_on_Risk_transfer_disclosure_and_analysis_of_finite_reinsurance.pdf
Kunreuther H, Heal G (2003) Interdependent security. Journal of Risk and Uncertainty 26-2/3: 231–249
Kunreuther H, Michel-Kerjan E, Porter B (2005) Extending Catastrophe Modeling to Terrorism. In: Grossi P, Kunreuther H (eds) Catastrophe Modeling: A new approach to managing risk. Springer, New York, pp. 209–231
Linder U, Ronkainen V (2004) Solvency II – Towards a New Insurance Supervisory System in the EU. Scandinavian Actuarial Journal 104/6: 462–474
Mechler R (2004) Natural Disaster Risk Management and Financing Disaster Losses in Developing Countries. Karlsruhe editions II: Risk Research and Insurance Management (Karlsruher Reihe II: Risikoforschung und Versicherungsmanagement), vol. 1. Verlag Versicherungswissenschaft, Karlsruhe
Munich Re (2002) Getting to the “point” – Does geographical underwriting improve risk management? Topics 2002: 41–45
Munich Re (2003) Geographical underwriting – Applications in practice. Topics 2003: 44–49
Nehls C (2003) Sanborn Fire Insurance Maps, A Brief History. Geostat Center and Department of History, University of Virginia. Retrieved February 18, 2007, from http://fisher.lib.virginia.edu/collections/maps/sanborn/history.html
NRC (National Research Council, 1999) The Impacts of Natural Disasters: A Framework for Loss Estimation. Committee on Assessing the Costs of Natural Disasters. Retrieved February 18, 2007, from http://www.nap.edu/catalog/6425.html


PricewaterhouseCoopers (2004) Enterprise-wide Risk Management for the Insurance Industry, Global Study. Retrieved February 18, 2007, from http://www.pwc.com
Sinha R, Goyal A (2004) Seismic Risk Scenario for Mumbai. In: Malzahn D, Plapp T (eds) Disasters and Society – From Hazard Assessment to Risk Reduction. Proceedings of the international conference at Karlsruhe, July 26–27, 2004. Logos Berlin, pp. 107–114
Tillinghast-Towers Perrin (2002) Enterprise Risk Management in the Insurance Industry. 2002 Benchmarking Survey Report. Retrieved February 18, 2007, from http://www.tillinghast.com
Tripp MH, Bradley HL, Devitt R, Orros C, Overton GL, Pryor LM, Shaw RA (2004) Quantifying Operational Risk in General Insurance Companies. Developed by a Giro Working Party (general insurance research organisation), presented to the Institute of Actuaries, 22 March 2004. Retrieved February 18, 2007, from http://www.actuaries.org.uk/files/pdf/life_insurance/orcawp/op_risk_capital.pdf
UN/ISDR (Inter-Agency Secretariat of the International Strategy for Disaster Reduction, 2004) Vol. I: Living with Risk. A global review of disaster reduction initiatives. Vol. II Annexes: Annex 1: Basic terms of disaster risk reduction: 2–7. United Nations Publications, Geneva. Retrieved February 18, 2007, from http://www.unisdr.org/eng/about_isdr/bd-lwr-2004-eng.htm

CHAPTER 15
Resolving Conceptual Ambiguities in Technology Risk Management
Christian Cuske, Tilo Dickopp, Axel Korthaus, Stefan Seedorf

15.1 New Challenges for IT Management

Although the industrialization of financial products and services has been at the top of the agenda in the financial service domain for many years now, there is still considerable pressure to lower the cost-income ratio. This continuing trend stimulates restructuring of the IT environments (ECB 2004) and thus leads to new types of risk. From the corporate perspective, a major challenge lies in coping with rapid technological change and the increasing complexity of the IT landscape (BCBS 2003). At the same time, the restructuring of business processes and application systems is externally triggered by international regulations, such as the International Accounting Standards (IASB 2004) and the Sarbanes-Oxley Act (Sarbanes-Oxley Act 2002). These significant changes are further accompanied by the required integration of IT management activities into the operational risk management process with the advent of Basel II (BCBS 2005a). These internal and external factors are driving forces for operational technology risk management. Naturally, the stronger the influence of IT on a firm’s economic success, the greater the associated risks (BIS 2005). This poses new challenges to financial enterprises with regard to controlling and governing their risks more systematically. The most damaging technology (or IT/system) risks are often “those that are not well understood” (ITGI 2002). Given these assertions, the need for innovative approaches and new concepts offering integrated support is more evident than ever before.

This chapter is structured as follows: We first identify risk management as a largely uncovered area for knowledge management applications in finance and suggest the application of ontologies for supporting the knowledge-intensive risk management processes. We present a technology risk ontology which constitutes the backbone of the proposed OntoRisk method. The technology risk management process is divided into three phases, namely identification, measurement and controlling. Finally, the system architecture of a tool supporting OntoRisk is described, and the applicability as well as the advantages of the OntoRisk method are demonstrated by means of a case study.

15.2 Knowledge Management Applications in Finance

In this section, we investigate the impact of knowledge management in the financial service domain as a means to sustain and improve the competitiveness of financial organizations. In particular, we elaborate on the potential for improvement regarding one of their core competencies: operational risk management. We discuss how ontologies can be applied as a knowledge representation technique to address common problems in that field.

15.2.1 An Extended Perspective of Operational Risk Management

Knowledge management has been attracting increasing attention in recent years. The motivation for organizations to tap the full potential of knowledge management lies in an increasing focus on their core competencies and the acknowledgement of knowledge as a strategic resource (Prahalad and Hamel 1990; Probst et al. 1996). This trend is underlined by empirical studies (Skyrme and Amidon 1997) stating that active management of knowledge is critical for achieving a sustainable competitive advantage. Despite the existence of various definitions (Holsapple and Joshi 2004), knowledge is frequently distinguished from information and data. It refers to the collection of cognitions and capabilities that individuals or social groups use for problem solving, and it resides predominantly in people’s heads (Probst et al. 2003). From a technology-centric perspective, however, knowledge can also be codified, stored and distributed to be leveraged on the organizational level. From this point of view, a “codification strategy” for knowledge management is considered most valuable, whereas the second major perspective found in the literature, the human-oriented perspective, stresses a “personalization strategy” building on socialization and internalization (Hansen et al. 1999; Swan 2003). Knowledge management can generally be described as “any process or practice of creating, acquiring, capturing, sharing and using knowledge […] to enhance learning and performance in organizations” (Quintas et al. 1997). Both exploration and exploitation of new and existing knowledge can be seen as the outcome of a successful holistic knowledge management approach. Such an approach will affect the enterprise in various dimensions, such as organization, employees, technology, and strategy.


Financial organizations can be regarded as knowledge-intensive firms since they rely on creating, capturing, transferring, and applying specialized knowledge in their core business processes. In recent years, substantial parts of this industry have been undergoing structural changes that led to a focus on core competencies as well as the industrialization of financial services and products (Lamberti 2005). In this dynamic business setting, knowledge management can be an enabler for process improvements and a more efficient management of the firm’s most valuable assets. Today, a variety of applications already support knowledge-intensive business processes in finance. Examples include customer relationship management and creditworthiness ratings (Westenbaum 2003). Although knowledge management solutions are in place at many banks and insurance companies (Chong et al. 2000; Spies et al. 2005), a few core competencies are still insufficiently supported. Risk management in particular is a key process that affects almost all lines of business in the financial service industry. As Marshall et al. (1996) put it: “If knowledge management is of growing importance to every kind of business, its impact is perhaps most obvious in the financial service industry. This is because effective management of knowledge is key to managing risk. And the driving issue in both financial services and corporate treasuries today is risk management.” Thus, risk reduction should be viewed as a new key objective of knowledge management, in addition to the ones listed in (Abecker and van Elst 2004), which are the improvement of innovation, quality, cost-effectiveness and time-to-market. Losses stemming from the insufficient management of various types of risk, such as market, credit and operational risk, may lead to bankruptcies or even destabilize the overall economy.
A series of risk management failures hitting the headlines in the 1990s, e.g. the downfall of Barings Bank, can partly be traced to the failure of controls but was also caused by the ineffective management of organizational knowledge (Holsapple and Joshi 2004). Since then, increasing complexity due to accelerating technological change has created even higher demands for effective risk management in the financial service domain. As a consequence, knowledge management is essential for addressing the problems encountered in risk management in general and, owing to its fuzziness, in operational risk management in particular. The ability to cope with financial risks involves

• a clear understanding of the company’s assets and processes and their role in the creation of value,
• awareness of the significant threats and their impacts, and
• the application of appropriate policies and procedures helping to measure and control risks.

However, ambiguities due to terminological confusion and uncertainties with respect to foreseeing the future are severe problems in the practical application of operational risk management. Moreover, information relevant for risk management activities may reside in different organizational units and may have to be collected, analyzed and interpreted in a meaningful way, even if the data show inconsistencies. This requires a common language among the participants of the risk management process, which makes it possible to share their knowledge, specify an explicit model and transform it into quantitative figures useful for decision making. Against this background, it becomes clear that operational risk management in particular can benefit strongly from the results of knowledge management research. For example, knowledge representation techniques such as formal ontologies can help to agree on shared vocabularies. Consequently, the above challenges should be addressed by an appropriate fusion of knowledge and risk management. In the following sections, we propose the ontology-based method OntoRisk, which combines the modeling of organizational knowledge with a quantification approach for resolving ambiguities in technology risk management.

15.2.2 Formal Ontologies as an Enabling Technology

Although formal ontologies have a long tradition in knowledge representation and artificial intelligence (AI), they have not yet been widely adopted in financial applications. A widespread definition states that “an ontology is an explicit specification of a conceptualization” (Gruber 1993). Ontologies provide a structure for a commonly agreed understanding of a problem domain by specifying sets of concepts, relations and axioms. A shared vocabulary of terms can be regarded as an ontology in its simplest form. The attribute “formal” refers to the specification in a formal language. The main purpose of ontologies is to foster a systematic understanding of real-world phenomena and to create a reusable representation as a model. With the emerging vision of the Semantic Web (Berners-Lee et al. 2001), new technologies have been developed that facilitate the embedding of ontologies in information systems. In this respect, the role of ontologies can be explained in a more technical way: “Ontologies have been developed to provide machine-processable semantics of information sources that can be communicated between different agents” (Fensel 2004). The uses of ontologies in information systems are manifold. In general, they serve as a unifying framework for different viewpoints and aim at reducing conceptual and terminological confusion (Uschold and Gruninger 1996). Here, we highlight two general advantages of ontologies:

• Support of communication between the involved stakeholders by developing an enterprise-wide understanding of a specific problem domain;
• Provision of interoperability between information systems with ontologies as part of their architecture. This way, formalized knowledge can be more easily transferred and translated to different perspectives.

In the context of enterprise-wide information systems, two applications can be identified. First, ontologies facilitate the integration of enterprise information when applied to enterprise modeling (Fox and Gruninger 1998). Second, ontologies may serve as the leveraging element in knowledge management (Maedche 2003; O’Leary 1998), since they respond well to structuring and modeling problems and support knowledge processes by flexibly integrating information from various sources (Staab et al. 2001).
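The notion of an ontology as a set of concepts, relations and axioms can be made tangible with a very small sketch. It is not the technology risk ontology of this chapter and does not use a Semantic Web language; the class Ontology, the concept names and the relation “threatens” are hypothetical and chosen only to show the three building blocks.

```python
from typing import Dict, Set, Tuple


class Ontology:
    """Tiny ontology: concepts, an is-a hierarchy, and named relations."""

    def __init__(self) -> None:
        self.parents: Dict[str, Set[str]] = {}  # concept -> direct super-concepts
        self.relations: Set[Tuple[str, str, str]] = set()  # (subject, predicate, object)

    def add_concept(self, name: str, *parents: str) -> None:
        for parent in parents:
            self.parents.setdefault(parent, set())
        self.parents.setdefault(name, set()).update(parents)

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        # Integrity rule: relations may only link concepts of the vocabulary.
        for concept in (subject, obj):
            if concept not in self.parents:
                raise KeyError(f"unknown concept: {concept}")
        self.relations.add((subject, predicate, obj))

    def is_a(self, concept: str, ancestor: str) -> bool:
        # Axiom: is-a is reflexive and transitive.
        seen: Set[str] = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            if current == ancestor:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return False


def build_risk_ontology() -> Ontology:
    """Illustrative vocabulary; the concept names are assumptions."""
    onto = Ontology()
    onto.add_concept("Risk")
    for kind in ("MarketRisk", "CreditRisk", "OperationalRisk"):
        onto.add_concept(kind, "Risk")
    onto.add_concept("TechnologyRisk", "OperationalRisk")
    onto.add_concept("ApplicationSystem")
    onto.relate("TechnologyRisk", "threatens", "ApplicationSystem")
    return onto
```

Because the hierarchy is explicit, two stakeholders (or two information systems) that share this structure can answer questions such as “is technology risk a kind of risk?” in the same way, which is the communication and interoperability benefit named above.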


These characteristics qualify ontologies for risk management applications, especially in those areas that lack an explicit, shared understanding and strongly depend on domain knowledge. An ontology-based approach makes it possible not only to provide precise definitions but also to unify different methods and user perspectives. After describing the case for ontology-based technology risk management, we will introduce a concrete technology risk ontology which helps resolve the conceptual ambiguities.

15.2.3 The Case for Ontology-based Technology Risk Management The management of operational risk (and technology risk as a subcategory) has become increasingly important recently, especially after its integration into the regulatory requirements established by the 2005 Basel capital adequacy framework (BCBS 2005a). This integration was a reaction to the increasing dependence on IT environment in the financial service domain. Similarly, external regulations for insurance companies are also beginning to take shape with Solvency II (EC 2004). The general understanding of technology risk is embedded in the definition of operational risk: The risk of loss resulting from inadequate or failed internal processes, people and systems or from external events (BCBS 2005a). Accordingly, the management of technology risks represents an integral part of financial risk management. Technology-related risk comprises system failures, technologyrelated security issues, systems integration aspects or system availability concerns (BCBS 2003). In the following, we use the general term “technology risk” instead of the terms “system risk” or “IT risk” sometimes encountered in literature in order to indicate the broader meaning and to include risk types that are related to the technical infrastructure and IT management tasks as well. Since a clear shared understanding of technology risk does not yet exist on a general basis, there is a fundamental need for conceptual disambiguation. To this end, the idea of applying formal ontologies as one way of facilitating the understanding of technology-related risk in the banking industry has been proposed (Cuske et al. 2005c). Two main objectives for the effective management of operational technology risk inspire an ontology-based approach. According to the second pillar of the Basel II framework, senior management “[…] is responsible for understanding the nature and level of risk being taken” (BCBS 2005a). 
On the one hand, this implies a profound, unified understanding of the character and implications of risks and the use of a common language. Such a “shared understanding” should be based upon explicit knowledge about the corporate structure across functional divisions as one key element. On the other hand, the management of risks becomes an integral part of strategic management. In order to account for Basel II compliance, the risk management approach has to be comprehensible and clearly specified and the information base has to be made explicit. Ontologies are capable of providing a holistic approach to the described problems because they help to clarify the fuzzy understanding of risks and provide a suitable representation for the formal modeling of organizational knowledge. In addition, their ability to integrate different perspectives helps to combine different risk management approaches.

15.3 Technology Risk Management Using OntoRisk In order to achieve the two risk management objectives “understanding” and “management”, a formal but extensible knowledge model is required, which reflects a generally accepted interpretation of the relevant concepts and their interrelations and supports the management process. In the following, we introduce our OntoRisk method (cf. Figure 15.1), which is built upon a technology risk ontology. This ontology not only captures a shareable conceptual understanding but is also used in the OntoRisk method to create concrete, quantitative models of the technology risks an enterprise is exposed to at the time of modeling. It incorporates the regulatory perspective taken by, for example, Basel II as well as existing approaches to IT Governance as part of a Corporate Governance strategy. Subsequently, we describe the steps of the OntoRisk risk management process as depicted below. According to general risk management practices, the risk management process is divided into a minimum of three different phases (BCBS 2003; Culp 2002). Here, we define the following phases: The identification phase involves the creation of the knowledge base, which is then incrementally enriched during the measurement and controlling phases. Throughout the process, the knowledge base is transformed into different stakeholder perspectives. Finally, we describe the system architecture of a software tool for supporting the OntoRisk method.

Figure 15.1. Technology Risk Management Process in OntoRisk


15.3.1 The Technology Risk Ontology In order to specify an ontology for technology risk, we first have to survey its fundamentals including the notions of “business and operational risk”. Since a commonly agreed terminology of business, operational or technology risk does not exist, the precise specification of these concepts and their interrelationships is a challenging task. In recent years, however, the development of the Basel II framework has led to a partial convergence of operational risk concepts within the banking sector (BCBS 2003). Such a level of agreement has not been reached in the insurance sector yet, but with Solvency II a similar development can be observed. However, the conceptualization of business and operational risk is only the first step towards a formal model of technology risk. Thus, we carry out an analysis of the regulatory and the corporate view of technology risk to eliminate remaining conceptual ambiguities. Then, we construct our extensible ontology using Description Logics as formal notation (Baader et al. 2004). Business risk is an integral part of the value adding process and is thus embedded into every business activity. It is best characterized as a random deviation from a preset monetary target (Knight 1965), which results in a negative financial effect. Operational risk used to be residually defined as financial risk that cannot be categorized as market, credit or interest rate risk (EC 1999). Following a “definitio ex positivo” of the term (GO30 1993), we understand operational risk as the risk of possible financial losses stemming from the failure of resources in operating the value chain. As stated before, the Basel II accord categorizes the causes of operational risk as internal processes, people, systems or external events. Subsequently, we limit the scope of our ontology to the technology risk category. For some risk scenarios it is difficult to draw the line between different risk types. 
Clear definitions of technology risk categories can foster a deeper understanding of the scope and inner structure of the encountered risks. For example, information systems risks can be subdivided into general risks (e. g. gaining unauthorized access), application risks (e. g. malfunctioning information processing) and user-driven risks (e. g. misuse of a human-machine interface) (van den Brink 2001). Wolf proposes a broader understanding of technology-related risk, which further includes IT strategy and external events affecting technology resources (Wolf 2005). A detailed analysis of all technology resources and their fundamental properties is beyond the scope of this chapter. Following the German implementation of the second pillar of Basel II, we base our technology risk model on a minimum of three categories: IT components, IT management tasks and application systems. Basic IT components include hard- and software components, which are the building blocks of more complex application systems directly supporting the business activities creating value. IT management tasks refer to the system lifecycle and include, for example, the operation and migration of application systems. In recent years, general standards for IT management have been gaining importance. For example, ITGI’s “Control Objectives for Information and related Technology” (COBIT) (ITGI 2002) and BSI’s “IT Baseline Protection” (BSI 2003)


Table 15.1. Classification of resource properties relevant to technology risk

Properties of Technology Resources
Source | Security of IT Components | Compliance of Application Systems | Soundness of IT Management
BSI 2003 | availability, integrity, confidentiality | |
ITGI COBIT | confidentiality, availability, integrity | reliability, compliance | effectiveness, efficiency

propose normative guidelines for IT governance and security management. The BSI IT baseline protection manual provides a structured overview of threat scenarios and safeguard instructions. Information resources are grouped into modules like general components, infrastructure, networked systems, etc. A threat catalog describes the typical vulnerabilities of information resources. Depending on the particular threat scenario, availability, integrity or confidentiality of information may be compromised. The COBIT framework takes a complementary approach because it provides IT management guidelines for minimizing risks and increasing overall performance from a process-oriented perspective. In the following, we draw on BSI and COBIT for the identification of relevant risk factors and essential properties of technology resources. As depicted in Table 15.1, the properties defined by BSI and COBIT are assigned to the three generic resource types mentioned above. Special attention needs to be paid to the properties describing compliance of application systems. They very much depend on national regulations, for example IDW (IDW 2002) or Sarbanes-Oxley (Sarbanes-Oxley Act 2002). Here, we follow a widely-used classification scheme comprising completeness, timeliness and correctness as a least common denominator.

Before we start to specify our technology risk ontology, let us formulate our objective (called a “motivating scenario” by Grüninger and Fox 1995), which concisely states the development goal: To establish an integrated understanding of operational technology risk, that is, one accepted both internally and by regulators, and its formalization as a model to enable integrated risk management. Derived from this objective are the so-called “competency questions” that our ontology must be “competent” to answer (Grüninger and Fox 1995). They describe the requirements on the ontology:

1. What are the special characteristics of technology risk and how does it relate to the more general concepts of operational and business risk?
2. How does a generic model of technology risk represent the understanding coined by Basel II? How can the model be integrated into IT management and IT governance approaches?
3. Which is a feasible categorization of the risk-relevant resources and what is their contribution to the creation of value?
4. Which properties are relevant to technology risk management?

Figure 15.2 summarizes the main concepts of the technology risk ontology, which are subsequently described in more detail. In the first step of our ontology formalization, we distinguish three groups of technology resources according to their basic properties and vulnerabilities. The top-tier technology resources are categorized as atomic IT components (ITComponent), application systems (ApplicationSystem) and operative tasks of IT management (ITManagementTask). As mentioned before, we describe these concepts and their relationships using Description Logic: (15.1)

Figure 15.2. Technology Risk Ontology


ITComponents are essentially physical computer parts (Hardware), executable computer programs (Software), communication devices (Network), and auxiliary physical infrastructure (Infrastructure):1 (15.2) Unlike elementary IT components, application systems fulfill a business function and directly contribute to the creation of value. The relationship between application systems and IT components is expressed by composition (see relation form in Figure 15.2). Because of complex information flows on the systems level, there is a functional dependency (depend) between different application systems. (15.3) As stated before, management activities potentially influence technology risk. Mandatory managing tasks (ITManagementTask) comprise planning and controlling of the development, introduction, operation, migration, and termination of application systems, either on a per-project basis or recurrently. The degree of maturity of these tasks influences the compliance of the respective application systems. (15.4) An operational loss is accompanied by an event in which a relevant property of a technology resource (Property) is compromised. Following the classification scheme in Table 15.1, we specify three basic properties, which are described in more detail below: security of IT components, compliance of application systems and soundness of IT management tasks. (15.5) Within the scope of IT components, risk is defined as lack of security (cf. Hammer 1999). IT security is further subdivided into the properties integrity, availability and confidentiality. The effect of violating an IT component’s required property constraints may cascade through the IT topology and affect the application systems, finally resulting in financial losses. In order to ensure compliance of an application system and to guarantee that the value chain is properly operated, at least three property constraints have to be fulfilled: correctness, completeness and timeliness. 
However, subsequent extensions to this schema are permitted. IT management tasks are included in the ontology because the control functions on a managerial level help to address technology-related risks. Sound management activities as part of the system life cycle have a positive effect on the quality (Effectiveness), cost (Efficiency) and adherence to deadlines (Schedule). (15.6)

1 To avoid cluttering up the diagram, the subset relationships in Formulas 15.2, 15.4 and 15.6 were left out in Figure 15.2.

Finally, a definition of technology risk as a subcategory of business and operational risk is derived. Technology risk is characterized as a potential loss which results from the failure of a technology resource to support a business activity. (15.7) In this ontology, both the internal and external perspective on technology risk is reflected by explicitly stating the dependencies between technology resources and their risk-relevant properties. Moreover, application of the ontology can help to better comply with specific regulatory requirements. In the banking sector, for example, a mapping of business activities to Basel II business lines (BusinessLine) and resource properties to event types (EventType) can be defined using our ontology. Evaluation plays a key role in ontology development. An analysis of the taxonomic relationships in the complete specification was conducted using the OntoClean method (Guarino and Welty 2002). Moreover, two key measures for assessing the ontology’s quality are precision and coverage with respect to the intended models (Guarino and Persidis 2003). On an abstract level, the proposed ontology covers all relevant domain concepts to implement sound technology risk management practices. The precise specification of the relevant terms using Description Logic was regarded equally important. The ontology’s applicability in a real-world scenario is surveyed by means of a case study described below. The ontology presented here helps to resolve the conceptual ambiguities still residing in a traditionally fuzzy problem domain. Based on a consolidation of accepted approaches it provides a common language that permits the formalization of the organization’s knowledge on the IT system structure in general and technology risk phenomena in particular. The ontology thus contributes to a shared understanding of technology risk. 
The proposed model can be adapted to meet enterprise-specific and regulatory requirements due to the extensibility of ontology hierarchies. In the following sections, we will elaborate on how it can be applied in all phases of the risk management life cycle.
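As a reading aid, the taxonomy described in prose in this section could be written in Description Logic roughly as follows. This is a sketch derived only from the text, not a reproduction of Formulas (15.1) to (15.6): the concept names TechnologyResource, Security, Compliance and Soundness are assumptions, and role axioms for form, depend and influence are left out because their exact direction and restrictions are not stated in the prose.

```latex
\begin{align*}
\mathit{ITComponent} \sqcup \mathit{ApplicationSystem} \sqcup \mathit{ITManagementTask}
  &\sqsubseteq \mathit{TechnologyResource} \\
\mathit{Hardware} \sqcup \mathit{Software} \sqcup \mathit{Network} \sqcup \mathit{Infrastructure}
  &\sqsubseteq \mathit{ITComponent} \\
\mathit{Security} \sqcup \mathit{Compliance} \sqcup \mathit{Soundness}
  &\sqsubseteq \mathit{Property} \\
\mathit{Integrity} \sqcup \mathit{Availability} \sqcup \mathit{Confidentiality}
  &\sqsubseteq \mathit{Security} \\
\mathit{Correctness} \sqcup \mathit{Completeness} \sqcup \mathit{Timeliness}
  &\sqsubseteq \mathit{Compliance} \\
\mathit{Effectiveness} \sqcup \mathit{Efficiency} \sqcup \mathit{Schedule}
  &\sqsubseteq \mathit{Soundness}
\end{align*}
```

Each line states that the concepts on the left are subsumed by the concept on the right, which mirrors the "is subdivided into" wording used above.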

15.3.2 The OntoRisk Process Model The integration of the ontology into the risk management process is described according to the model depicted in Figure 15.1. This risk management process should be viewed as an iterative task.


15.3.2.1 Identification “Banks should identify the operational risk inherent in all types of products, activities, processes and systems” (BCBS 2003). As a preparation, a model of all relevant technology resources is developed based upon the technology risk ontology. During the identification phase, the domain experts select the most risk-sensitive properties and bind estimated risk event frequencies to them. The resulting Systems & Threats Model (cf. Figure 15.1) serves as an input for subsequent measurement. It is obtained in a top-down approach as described below:

1. Discovery and documentation of the application systems, IT components and management tasks within the predefined scope.
2. Insertion of the relevant dependencies between IT components, management tasks and application systems.
3. Identification of the potential threat scenarios for application systems as well as for the underlying IT components and IT management tasks.
4. Assignment of threat scenarios to properties in conjunction with self-assessment by the domain experts.

The granularity of the modeled instances is at the discretion of the risk manager. However, there should be a trade-off between the complexity and the maintainability of the model. The relevant threats are identified during interviews with domain experts. This process can be systematized if catalogues of potential hazards (and measures) are utilized, as provided by the IT security management guidelines of BSI and COBIT (van den Brink 2001; Wolf 2005). In OntoRisk, identification is supported by the ability to auto-generate questionnaires from an extension of the ontology. In order to achieve this goal, the technology resources are subdivided into more fine-grained objects (BSI 2003). The key issue here is the mapping from technology resources to generic terms defined in the above standards.
In the following excerpt, a BSI data transmission component (DataTransmissionComponent) is introduced as subclass of a network component: (15.8) In the second step, we define additional restrictions on the risk-relevant properties: (15.9) During identification, domain experts will judge the Frequency of a threat scenario derived from the above mentioned standards: (15.10)

The presented approach provides the means for the iterative development of the Systems & Threats Model and simultaneously allows for instant evaluation by domain experts. The ontology thereby acts as an enabler for an enterprise-wide understanding of various potential hazards and ultimately leads to increased risk awareness.
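The four identification steps can be pictured with a small data structure. The following Python sketch is only an illustration under stated assumptions: the class and method names are hypothetical and do not reflect the OntoRisk tool, the frequency scale H/M/L is borrowed from the self-assessment idea, and the rating "M" in the usage lines is an arbitrary example value.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical stand-ins for the three top-tier resource concepts of the ontology.
RESOURCE_KINDS = {"ITComponent", "ITManagementTask", "ApplicationSystem"}
FREQUENCY_LEVELS = ("H", "M", "L")  # assumed high / medium / low expert rating


@dataclass
class Threat:
    description: str
    resource: str                      # name of the affected resource
    prop: str                          # risk-relevant property, e.g. "Availability"
    frequency: Optional[str] = None    # filled in during self-assessment


@dataclass
class SystemsAndThreatsModel:
    resources: dict = field(default_factory=dict)      # name -> kind
    dependencies: list = field(default_factory=list)   # (source, relation, target)
    threats: list = field(default_factory=list)

    def add_resource(self, name, kind):
        """Step 1: document a resource within the predefined scope."""
        if kind not in RESOURCE_KINDS:
            raise ValueError(f"unknown resource kind: {kind}")
        self.resources[name] = kind

    def relate(self, source, relation, target):
        """Step 2: insert a dependency between two documented resources."""
        for name in (source, target):
            if name not in self.resources:
                raise KeyError(f"undocumented resource: {name}")
        self.dependencies.append((source, relation, target))

    def add_threat(self, description, resource, prop):
        """Step 3: record a threat scenario against a resource property."""
        if resource not in self.resources:
            raise KeyError(f"undocumented resource: {resource}")
        self.threats.append(Threat(description, resource, prop))

    def questionnaire(self):
        """Generate one question per threat that has not been assessed yet."""
        return [
            f"How often may '{t.description}' compromise the "
            f"{t.prop} of {t.resource}? (H/M/L)"
            for t in self.threats
            if t.frequency is None
        ]

    def assess(self, resource, prop, level):
        """Step 4: bind an expert's frequency estimate to a property."""
        if level not in FREQUENCY_LEVELS:
            raise ValueError(f"unknown frequency level: {level}")
        for t in self.threats:
            if t.resource == resource and t.prop == prop:
                t.frequency = level


model = SystemsAndThreatsModel()
model.add_resource("PowerSupply", "ITComponent")
model.add_resource("FrontOffice", "ApplicationSystem")
model.relate("PowerSupply", "form", "FrontOffice")
model.add_threat("Disruption of power supply", "PowerSupply", "Availability")
questions = model.questionnaire()
model.assess("PowerSupply", "Availability", "M")  # arbitrary example rating
```

Keeping unanswered threats as the source of the questionnaire means the model can be extended iteratively: every newly added threat shows up as an open question until a domain expert has rated it.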

15.3.2.2 Measurement The development of the Quantification Model for risk measurement (cf. Figure 15.1) represents an integral part of risk management. In the context of operational risk, many quantification approaches can be identified varying in both sophistication and risk sensitivity. A thorough analysis of the existing approaches is carried out in (Cuske et al. 2005a). For example, in the Basel II standardized approach the risk capital charge is calculated on the basis of the annual gross incomes in eight business lines (BCBS 2005a). The more sophisticated Advanced Measurement Approaches (AMA) include self-assessment, loss database, extreme value theory and early indicator approaches. The functional correlation approach, which was first proposed by (Kühn and Neu 2003) and further refined by (Leippold and Vanini 2005), is particularly promising since it takes the internal enterprise structure into account. The operational risk environment is modeled as a graph with functional dependencies between assets to which risk factors are attached. Corresponding risk events may lead to chain reactions influencing the overall loss. When no analytical solution exists, a simulation approach is applied (Kühn and Neu 2003). By including dependencies in the simulation model, the structural knowledge is leveraged and the nature of operational technology risk is adequately represented. The general adoption of the functional-dependency model for the ontology-based quantification of technology risk is described in (Cuske et al. 2005b). Here, we focus on how the technology risk ontology is applied as part of the OntoRisk method. The Systems & Threats Model serves as a starting point for measurement. Subsequently, we transform it into the simulation model, which is based on the functional-dependency approach. The following statements define the mapping of application systems to assets (Asset). Assets are characterized as directly contributing to the value chain.
The risk-relevant properties (States) of assets are mapped onto elements (Element). The stochastic behavior of an element is injected by a supporting function (formula), which is represented as a text string.

(15.11)
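The mapping of application systems to assets and of their risk-relevant properties to elements can be illustrated as a plain data transformation. The Python sketch below is an assumption-laden illustration, not the OWL view definition used by the authors: the function name, the dictionary layout and the formula syntax are all hypothetical.

```python
def to_quantification_view(application_systems, formulas=None):
    """Map application systems to assets and their properties to elements.

    application_systems: dict mapping a system name to a list of property names.
    formulas: optional dict mapping (system, property) to a formula text string;
              the formula syntax used here is purely illustrative.
    """
    formulas = formulas or {}
    assets = []
    for system, properties in application_systems.items():
        elements = [
            {
                "element": f"{system}.{prop}",
                # The stochastic behavior is carried as a text string.
                "formula": formulas.get((system, prop), ""),
            }
            for prop in properties
        ]
        assets.append({"asset": system, "elements": elements})
    return assets


view = to_quantification_view(
    {
        "FrontOffice": ["Timeliness"],
        "BackOffice": ["Correctness", "Completeness"],
    },
    formulas={("FrontOffice", "Timeliness"): "PowerSupply.Availability"},
)
```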

Moreover, IT components (ITComponent) and management tasks (ITManagementTask) are defined as risk factors (Riskfactor). These stochastic parameters have an impact on assets via the relations form and influence. The properties of risk factors act as input parameters for the simulation model (Input) and are mapped accordingly. In this way, the identified frequencies of threat scenarios are integrated into the simulation model. (15.12) Loss functions (LossFunction) are introduced to calculate the effects of a failure of assets during operation of the value chain (operate). Financial losses are integrated as outputs into the simulation model. (15.13) The overall loss is calculated using a stochastic process defined by the individual outputs. The resulting risk potential is then retrieved through risk measures such as Value at Risk (VaR) or Conditional VaR (CVaR). By defining the mapping into the measurement view, we are able to combine the advantages of a shared understanding achieved by an ontology-based approach with a quantification of technology risk. This provides an elegant solution for system support, and diminishes the semantic gap between domain knowledge and its representation in a simulation model.

15.3.2.3 Controlling The controlling phase comprises the population of the Analysis Model (cf. Figure 15.1). In the following, we just single out selected aspects, such as compliance with general risk management standards (BCBS 2005a). In this context, a basic regulatory requirement for risk management is backtesting of the quantitative model with historical loss data. This requires that occurring loss events are stored in loss databases in order to be compared to the calculated losses on a regular basis. The proposed ontology enables verification based on loss data, since the simulation results are written into the knowledge base. The risk potential computed by means of simulation has to be validated against empirical loss data on the basis of the predefined business line or event type schema. The business lines can be mapped via the map relation: (15.14) Moreover, Basel II demands a classification into EventTypes. The assignment to event types is also specified using the map relation. The next statement shows the first level of Basel II event type classification (BCBS 2005a):

(15.15)


A case for decision support is also found in the alternatives of hedging or avoiding technology risk (Cruz 2002; King 2001). As a rule of thumb, especially those risk events with a low frequency but high severity are to be prevented beforehand. Here, we introduce new concepts like HighSeverity to obtain risk factors and systems causing high severity losses.
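Two of the controlling tasks mentioned in this section, backtesting against loss data per event type and singling out low-frequency, high-severity risks, can be sketched in a few lines of Python. The function names, the thresholds and the sample figures are hypothetical; only the seven level-1 event type names are taken from the Basel II framework.

```python
from collections import defaultdict

# First-level Basel II event types.
BASEL_II_EVENT_TYPES = (
    "Internal fraud",
    "External fraud",
    "Employment practices and workplace safety",
    "Clients, products and business practices",
    "Damage to physical assets",
    "Business disruption and system failures",
    "Execution, delivery and process management",
)


def aggregate_by_event_type(loss_events):
    """Sum observed losses per event type; loss_events holds (event_type, amount)."""
    totals = defaultdict(float)
    for event_type, amount in loss_events:
        if event_type not in BASEL_II_EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type}")
        totals[event_type] += amount
    return dict(totals)


def backtest(simulated_risk_potential, observed_losses):
    """Return the event types whose observed losses exceed the simulated figure."""
    observed = aggregate_by_event_type(observed_losses)
    return sorted(
        event_type
        for event_type, total in observed.items()
        if total > simulated_risk_potential.get(event_type, 0.0)
    )


def high_severity(risk_factors, max_frequency, min_severity):
    """Select risk factors with low frequency but high severity.

    risk_factors maps a name to (frequency, severity); both thresholds are
    illustrative parameters chosen by the risk manager.
    """
    return [
        name
        for name, (frequency, severity) in risk_factors.items()
        if frequency <= max_frequency and severity >= min_severity
    ]


# Made-up figures for illustration only.
exceeded = backtest(
    {"Business disruption and system failures": 100.0},
    [
        ("Business disruption and system failures", 60.0),
        ("Business disruption and system failures", 50.0),
        ("External fraud", 5.0),
    ],
)
critical = high_severity(
    {"OracleDB": (0.01, 500000.0), "Reuters": (0.30, 1000.0)},
    max_frequency=0.05,
    min_severity=100000.0,
)
```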

15.3.3 System Architecture The OntoRisk process is supported by a software application, which is built around the ontology presented so far (cf. Figure 15.3). In this section, we briefly highlight the key aspects of our implementation. A more in-depth discussion can be found in (Cuske et al. 2005b). Throughout the application we use the Web Ontology Language (OWL) (W3C 2004) as the standardized format for ontology serialization. A graphical OWL-Editor (n) is used to edit both the technology risk ontology (o) and the views (p), which contain the mappings that have been formalized in Description Logic in our discussion on the OntoRisk process above. A reasoner (p), i. e. an inference engine (W3C 2004), transforms the technology risk ontology into the perspectives specific to the respective process steps. The graphical editors, report engines and calculations which cooperate to support a certain step in the risk management process are implemented separately (q) but share an in-memory model of the (transformed) ontology as a common data representation format. The result of these efforts is a tightly integrated set of three tools, one for each process step, which supports the user through questionnaire generation (r), a simulation module (s) and a graphical analysis and report interface (t).

Figure 15.3. Architecture of the OntoRisk Platform [diagram labels: Ontology Editor; Technology Risk Ontology (OWL); Reasoning; OWL Systems & Threats View, OWL Quantification View, OWL Analysis View; View-Specific Code; Identification, Measurement, Controlling; callouts n to t]

15.4 Case Study: Outsourcing in Banking To demonstrate the benefits of the OntoRisk approach in a real-world application, a case study in an area of special interest is introduced: IT management considering the outsourcing of a substantial part of the value chain. The area under examination is one business line in banking, namely trading and sales. The case study does not aim at giving decision support on IT outsourcing itself. Instead, its purpose is to illustrate the effects of our ontology-based approach on the technology risk management process. In the identification phase, we analyze common threat scenarios. Within the measurement phase, two alternatives for quantification are compared: the Basel II standard measurement approach and the OntoRisk approach, which utilizes the proposed ontology-based quantification. The controlling phase is covered within the discussion of the simulation results.

15.4.1 Motivation and Influencing Factors The permanent cost pressures on financial institutions have been nurturing the discussion on lowering vertical integration, especially in the German banking industry where the cost/income ratio is higher than the EU average (ECB 2005b). In this context, the outsourcing of core activities is seen as one way to reduce the vertical integration and realize long term cost benefits. From a regulatory perspective, outsourcing is defined “[...] as a regulated entity’s use of a third party (either an affiliated entity within a corporate group or an entity that is external to the corporate group) to perform activities on a continuing basis that would normally be undertaken by the regulated entity, now or in the future” (BCBS 2005b). There exist various business models, but intra-group outsourcing is often preferred (ECB 2004). Service Level Agreements (SLA) are used to contractually define the quality of the provided service (e. g. availability) together with penalty provisions for non-fulfillment. It is important to note, however, that outsourcing of IT leads to new types of risk by itself (Foit 2005). A recent study in the banking industry shows that the most significant risks in outsourcing are the loss of direct control, an increase in operational risks and loss of internal skills (ECB 2004).

In our case study, we restrict the outsourcing scenario to trading and sales. Following the Basel II understanding, trading and sales includes proprietary trading, trade on the account of corporate clients as well as market making and treasury (BCBS 2005a). This business line has been selected because it proves to be particularly sensitive to operational risks as indicated by the relatively high capital charge (β = 18%) imposed by the Basel II standardized approach. In this business line, technology risk takes a high share within operational risk due to the strong dependence on information technology. To guarantee a high degree of generality, our case is based on publicly available data sources summarized in Table 15.2. The percentage of technology risk events in trading and sales varies between 33% and 66%. Furthermore, the income achieved in trading and sales has a substantial share of the overall return of a bank.

Table 15.2. Overview of data sources used in the case study (BCBS 2002a, 2002b; ECB 2005a; Federal Reserve Bank of Boston 2004)

Public reports | Relevant data sources | Year
Quantitative Impact Study II – Tranche 2 | Collection of loss events between 1998 and 2000 at 30 international banks | 2002
2002 Loss Data Collection Exercise | Quantitative Impact Study II extended survey of 2001, 89 international banks | 2002
2004 Loss Data Collection Exercise | Survey of individual losses – 2004 of 27 US-banks by US Federal Bank | 2004
EU banking statistics | Profitability of EU banks in 2004 | 2005
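To make the capital charge of the standardized approach concrete, the following Python sketch applies the Basel II beta factors to annual gross income per business line and averages over three years, with a negative yearly total replaced by zero. The function name, the dictionary keys and the income figures are illustrative assumptions; the beta factors are those of the Basel II framework, including the 18% for trading and sales cited above.

```python
# Basel II beta factors per business line (standardized approach).
BETAS = {
    "corporate_finance": 0.18,
    "trading_and_sales": 0.18,
    "retail_banking": 0.12,
    "commercial_banking": 0.15,
    "payment_and_settlement": 0.18,
    "agency_services": 0.15,
    "asset_management": 0.12,
    "retail_brokerage": 0.12,
}


def standardized_capital_charge(gross_income_by_year):
    """Three-year average of the beta-weighted gross income.

    gross_income_by_year: list of three dicts mapping a business line to its
    annual gross income. A negative yearly total is floored at zero.
    """
    if len(gross_income_by_year) != 3:
        raise ValueError("the standardized approach averages over three years")
    yearly_charges = []
    for year in gross_income_by_year:
        weighted = sum(BETAS[line] * income for line, income in year.items())
        yearly_charges.append(max(weighted, 0.0))
    return sum(yearly_charges) / 3


# Made-up gross income figures for a bank active only in trading and sales.
charge = standardized_capital_charge(
    [
        {"trading_and_sales": 100.0},
        {"trading_and_sales": 100.0},
        {"trading_and_sales": 100.0},
    ]
)
```

Because the charge depends only on gross income, it does not react to changes in the IT landscape such as outsourcing, which is the limitation the simulation-based approach is meant to address.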

15.4.2 Trading Environment The exemplary IT environment serves to support a generic trading process, which follows a clear separation of business functions, adopted from the German implementation of the second pillar of Basel II (Bundesbank 2005). Figure 15.4 visualizes the system topology, which comprises six application systems, each contributing to different tasks. MarketData supplies all necessary information for trading activities executed in the FrontOffice. Every transaction is validated by a Confirmation and forwarded to the BackOffice. The trading limits are checked by the LimitController. All transactions are finally forwarded to the GeneralLedger. In this scenario, we choose Confirmation as a candidate for outsourcing. As a result, the firm is no longer directly responsible for inspection of transactions and the back office now depends on an external provider. This results in different risk assessments and potentially influences the required capital charge. Potential losses are triggered by malfunctioning IT components and management tasks. In the example case, the relevant risks affecting IT components are systematically identified by analyzing potential threat scenarios as described in the BSI baseline protection manual. For the IT management tasks we refer to the COBIT standard. The concrete impacts of these scenarios on the risk-relevant properties are displayed in Table 15.3. Domain experts are responsible for judging the estimated frequency of a risk event. Further, the affected property constraints of application systems are included in the Systems & Threats Model. Before measurement, the qualitative judgment of risk events by domain experts has to be translated to quantitative figures. In the simulation model, the loss events are modeled as a Bernoulli distribution with the average frequency as stochastic parameter. At this stage, loss databases sometimes provide a supplementary tool for risk assessment. Loss severity is modeled independently of the frequency distribution (BCBS 2005a). For our case study, we consider loss drivers such as the missed return in trading or settlement errors resulting from the failure of supporting application systems. Following common operational risk practices (Cruz 2002), a lognormal distribution is applied and fitted using appropriate data from the banking statistics. In the outsourcing scenario, losses resulting from the confirmation system are covered by an SLA and set to zero.
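The simulation set-up described above (Bernoulli loss frequency, independent lognormal severity, losses of the outsourced system set to zero) can be sketched as a small Monte Carlo experiment. The Python code below is a minimal illustration, not the OntoRisk simulation: it ignores the functional dependencies between systems, and all probabilities, lognormal parameters and the 99% confidence level are made-up assumptions.

```python
import math
import random

# Hypothetical risk events: affected application system, Bernoulli probability
# per period, and lognormal severity parameters (mu, sigma).
EVENTS = [
    {"system": "MarketData", "p": 0.10, "mu": 10.0, "sigma": 1.0},
    {"system": "FrontOffice", "p": 0.05, "mu": 11.0, "sigma": 1.2},
    {"system": "Confirmation", "p": 0.08, "mu": 10.5, "sigma": 1.1},
    {"system": "BackOffice", "p": 0.04, "mu": 10.8, "sigma": 0.9},
]


def simulate_losses(events, n_runs=20000, seed=42, outsourced=()):
    """Return one total loss per simulation run."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        total = 0.0
        for event in events:
            if rng.random() < event["p"]:  # Bernoulli frequency
                loss = rng.lognormvariate(event["mu"], event["sigma"])
                if event["system"] in outsourced:
                    loss = 0.0  # assumed to be fully covered by the SLA
                total += loss
        totals.append(total)
    return totals


def var_cvar(losses, alpha=0.99):
    """Empirical Value at Risk and Conditional VaR at confidence level alpha."""
    ordered = sorted(losses)
    index = max(0, min(len(ordered) - 1, math.ceil(alpha * len(ordered)) - 1))
    tail = ordered[index:]
    return ordered[index], sum(tail) / len(tail)


in_house = simulate_losses(EVENTS)
with_outsourcing = simulate_losses(EVENTS, outsourced={"Confirmation"})
var_in, cvar_in = var_cvar(in_house)
var_out, cvar_out = var_cvar(with_outsourcing)
```

Both scenarios use the same seed and draw the same random numbers, so the comparison isolates the effect of setting the outsourced system's losses to zero rather than sampling noise.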

Figure 15.4. Application systems with dependencies and property constraints


Table 15.3. Identification of risk factors

Description of hazard | Risk factor | Application System
Failure of an external service: After a disconnection to Reuters, the market data is temporarily unavailable (cf. BSI – T 4.31). | Reuters → Availability | MarketData → Correctness
Disruption of power supply: Power supply for the trading system is interrupted (cf. BSI – T 4.1). | PowerSupply → Availability | FrontOffice → Timeliness
Deficient database security: Inadequate security mechanisms open up the possibility for intrusion (cf. BSI – T 2.38). | IBMDB2 → Confidentiality | Confirmation → Correctness
Bad facility management: Insufficient management of facilities leads to undetected errors in confirmation processing (cf. COBIT – DS12). | FacilityManagement → Effectiveness | Confirmation → Correctness
Database failure: A failure of hard- or software potentially causes a loss of data integrity (cf. BSI – T 4.26). | OracleDB → Integrity | BackOffice → Correctness
Inadequate data management: Shortcomings result in incomplete data processing/storage (cf. COBIT – DS11). | DataManagement → Effectiveness | BackOffice → Completeness
Software vulnerabilities/errors: If the standard software is too complex, a software error may occur (cf. BSI – T 4.22). | ValuationSoftware → Integrity | LimitController → Completeness
Managing projects: Weak project management increases the pressure on running tasks performed in accounting (cf. COBIT – PO10). | ProjectManagement → Efficiency | GeneralLedger → Correctness

Frequency (H, M or L): one mark (■) per row; the column of each mark is not recoverable here.


15.4.3 Analysis and Results In the case study, comparative figures were used to analyze alternative scenarios with sourcing strategies ranging from an in-house solution to a transfer to an external provider. The most interesting results are summarized in Figure 15.5. On the left-hand side of the figure, the original risk potential of the in-house solution is shown, while the right-hand side depicts the reduced risk potential after outsourcing the confirmation system. For completeness it should be noted that the latter scenario assumes a full coverage of resulting losses by the external provider. The ex-ante analysis of competing scenarios can support IT management in the process of decision making. Additionally, an ex-post analysis can serve as a starting point for the controlling phase.

Without Outsourcing VaR: 4729154 CVaR: 5109822

With Outsourcing VaR: 3071509 CVaR: 3401466

Figure 15.5. Exemplary results for two outsourcing scenarios

15.5 Conclusion

By incorporating results from the area of knowledge management research into risk management, we were able to contribute new solutions to some of the prevalent challenges in that field. In detail, we applied formal ontologies to create a model for a shared understanding of the conceptual building blocks of technology risk management. This technology risk ontology forms the foundation for both our proposed process model and the supporting software toolset. In this way, we were not only able to unravel conceptual ambiguities inherent in this domain, but also to establish the basis for an integrated risk management approach as part of IT governance.

15 Resolving Conceptual Ambiguities in Technology Risk Management


References

Abecker A, van Elst L (2004) Ontologies for Knowledge Management. In: Staab S, Studer R (eds.): Handbook on Ontologies. Springer 435-454
Baader F, Horrocks I, Sattler U (2004) Description Logics. In: Staab S, Studer R (eds.): Handbook on Ontologies. Springer 3-28
Bank for International Settlements (BIS 2005) 75th Annual Report
Basel Committee on Banking Supervision (BCBS 2002a) The 2002 Loss Data Collection Exercise for Operational Risk: Summary of the Data Collected
Basel Committee on Banking Supervision (BCBS 2002b) The Quantitative Impact Study for Operational Risk: Overview of Individual Loss Data and Lessons Learned
Basel Committee on Banking Supervision (BCBS 2003) Sound Practices for the Management and Supervision of Operational Risk
Basel Committee on Banking Supervision (BCBS 2005a) International Convergence of Capital Measurement and Capital Standards: A Revised Framework
Basel Committee on Banking Supervision (BCBS 2005b) Outsourcing in Financial Services
Berners-Lee T, Hendler J, Lassila O (2001) The Semantic Web. Scientific American, May
Bundesamt für Sicherheit in der Informationstechnik (BSI 2003) Baseline Protection Manual
Bundesbank (2005) Mindestanforderungen an das Risikomanagement (MaRisk)
Chong CW, Holden T, Wilhelmij P, Schmidt RA (2000) Where does knowledge management add value? Journal of Intellectual Capital 1 366-380
Cruz MG (2002) Modeling, measuring and hedging operational risk. Wiley, Chichester
Culp CL (2002) The ART of risk management: alternative risk transfer, capital structure, and the convergence of insurance and capital markets. Wiley, New York; Chichester
Cuske C, Dickopp T, Schader M (2005a) Konzeption einer Ontologie-basierten Quantifizierung operationeller Risiken bei Banken. Banking and Information Technology 6 61-74
Cuske C, Dickopp T, Seedorf S (2005b) JOntoRisk: An Ontology-Based Platform for Knowledge-Based Simulation Modeling in Financial Risk Management. Proceedings of the European Simulation and Modeling Conference (ESM), Porto. Available at SSRN: http://ssrn.com/abstract=802665
Cuske C, Korthaus A, Seedorf S, Tomczyk P (2005c) Towards Formal Ontologies for Technology Risk Measurement in the Banking Industry. 1st Workshop Formal Ontologies Meet Industry (FOMI) Proceedings. Available at SSRN: http://ssrn.com/abstract=744404
European Central Bank (ECB 2004) Report on EU Banking Structure
European Central Bank (ECB 2005a) EU banking sector stability
European Central Bank (ECB 2005b) EU Banking Structures


European Commission (EC 1999) Review of regulatory capital for credit institutions and investment firms
European Commission (EC 2004) Framework for Consultation on Solvency II
Federal Reserve Bank of Boston (2004) Results of the 2004 Loss Data Collection Exercise for Operational Risk
Fensel D (2004) Ontologies: a silver bullet for knowledge management and electronic commerce. Springer, Berlin; Heidelberg
Foit M (2005) Management operationeller IT-Risiken bei Banken. Universitätsverlag Regensburg, Regensburg
Fox MS, Gruninger M (1998) Enterprise Modelling. AI Magazine 19 109-121
GO30 (1993) Special Report on Global Derivatives – Derivatives: Practices & Principles. The Group of Thirty Consultative Group on International Economic and Monetary Affairs, Inc.
Gruber TR (1993) A translation approach to portable ontology specifications. Knowl. Acquis. 5 199-220
Grüninger M, Fox M (1995) Methodology for the Design and Evaluation of Ontologies. IJCAI'95, Workshop on Basic Ontological Issues in Knowledge Sharing, April 13
Guarino N, Persidis A (2003) Evaluation Framework for Content Standards. Ontoweb Deliverable 3.5, IST-2000-29243
Guarino N, Welty C (2002) Evaluating ontological decisions with OntoClean. Commun. ACM 45 61-65
Hammer V (1999) Die 2. Dimension der IT-Sicherheit: Verletzlichkeitsreduzierende Technikgestaltung am Beispiel von Public-Key-Infrastrukturen. Vieweg, Braunschweig; Wiesbaden
Hansen MT, Nohria N, Tierney T (1999) What's your strategy for managing knowledge? Harvard Business Review 77 106-116
Holsapple CW, Joshi KD (2004) A formal knowledge management ontology: Conduct, activities, resources, and influences: Research Articles. J. Am. Soc. Inf. Sci. Technol. 55 593-612
Institut der Wirtschaftsprüfer (IDW 2002) Stellungnahme zur Rechnungslegung FAIT 1 - Grundsätze ordnungsgemäßer Buchführung bei Einsatz von Informationstechnologie
International Accounting Standards Committee (IASB 2004) International Accounting Standards
IT Governance Institute (ITGI 2002) Control Objectives for Information and Related Technology
King JL (2001) Operational Risk: Measurement and Modelling. Wiley, Chichester
Knight FH (1965) Risk, Uncertainty and Profit. Harper & Row
Kühn R, Neu P (2003) Functional correlation approach to operational risk in banking organizations. Physica A 322 650-666
Lamberti H-J (2005) Kernelemente der Industrialisierung in Banken aus Sicht einer Großbank. In: Sokolovsky, Löschenkohl (eds.): Handbuch Industrialisierung der Finanzwirtschaft 59-82


Leippold M, Vanini P (2005) The Quantification of Operational Risk. Journal of Risk 8 59-85
Maedche A, Motik B, Stojanovic L, Studer R, Volz R (2003) Ontologies for Enterprise Knowledge Management. IEEE Intelligent Systems 18 26-33
Marshall C, Prusak L, Shilberg D (1996) Financial risk and the need for superior knowledge management. California Management Review 38 77-101
O'Leary DE (1998) Using AI in Knowledge Management: Knowledge Bases and Ontologies. IEEE Intelligent Systems 13 34-39
Prahalad CK, Hamel G (1990) The core competence of the corporation. Harvard Business Review 68 79-91
Probst G, Buchel B, Raub S (1996) Knowledge as a Strategic Resource. Ecole des Hautes Etudes Commerçiales, Université de Genève
Probst G, Raub S, Romhardt K (2003) Wissen managen: wie Unternehmen ihre wertvollste Ressource optimal nutzen. Gabler, Wiesbaden
Quintas P, Lefrere P, Jones G (1997) Knowledge management: a strategic agenda. Long Range Planning 30 385-391
Sarbanes-Oxley Act of 2002 (2002) H.R.3763
Skyrme DJ, Amidon DM (1997) Creating the Knowledge-Based Business. Business Intelligence, London
Spies M, Clayton A, Noormohammadian M (2005) Knowledge management in a decentralized global financial services provider: a case study with Allianz Group. Knowledge Management Research & Practice 3 24-36
Staab S, Studer R, Schnurr H-P, Sure Y (2001) Knowledge Processes and Ontologies. IEEE Intelligent Systems 16 26-34
Swan J (2003) Knowledge Management in Action? In: Holsapple CW (ed.): Handbook on Knowledge Management, Vol. 1. Springer, Berlin; Heidelberg 271-296
Uschold M, Gruninger M (1996) Ontologies: principles, methods, and applications. Knowledge Engineering Review 11 93-155
van den Brink GJ (2001) Operational Risk: The New Challenge for Banks. Palgrave Macmillan
World Wide Web Consortium (W3C 2004) Web Ontology Language (OWL)
Westenbaum A (2003) Bankbetriebliches Wissensmanagement: Entwicklung, Akquisition und Transfer der Unternehmensressource Wissen in Kreditinstituten. Lang, Frankfurt am Main
Wolf E (2005) IS risks and operational risk management in banks. Eul, Lohmar

CHAPTER 16
Extracting Financial Data from SEC Filings for US GAAP Accountants
Thomas Stümpert

16.1 Introduction

Since financial statements are vital for decision makers in the professional investment world, quick access to this data and its automated import are essential for the financial industry. To that end, an investor or analyst requires a comprehensive database with the figures from financial statements for all covered companies over a reasonably long period of time. Investment companies pay service providers such as Reuters (http://www.reuters.com) and Compustat (http://www.compustat.com) for access to refined financial statements.

A popular source of information is the set of filings held in the EDGAR database. EDGAR, the Electronic Data Gathering, Analysis, and Retrieval system, performs automated collection, validation, indexing, acceptance, and forwarding of submissions by companies and others who are required by law to file forms with the U.S. Securities and Exchange Commission (SEC) (U.S. Securities and Exchange Commission SEC 2007). Its primary purpose is to increase the efficiency and fairness of the securities market for the benefit of investors, corporations, and the economy by accelerating the receipt, acceptance, dissemination, and analysis of time-sensitive corporate information filed with the agency. Registered companies are required to submit certain financial reports in prescribed formats, the so-called forms. Unfortunately, the data available in the SEC EDGAR database is only weakly structured in technical terms. Filings contain a mix of natural language text and semi-structured financial tables. Form 10-K is the annual report that most reporting companies file with the SEC, containing annual financial statements. Form 10-Q is the corresponding quarterly update. Filings consist of plain text and/or HTML. As long as these filings are the standard and most up-to-date source of information for professional and private investors, software agents that transform them into an exchangeable format are of great interest. For data extraction methods in general, see e.g. (Baeza-Yates and Ribeiro-Neto 1999) or (Salton 1988).

The software agent described here continues prior explorations of data extraction using agents, namely the earlier EDGAR2XML project, see (Leinemann et al. 2001). EDGAR2XML demonstrated the basic ideas of transforming financial data from 10-K and 10-Q filings into a machine-readable XML format. One of its shortcomings was the heavy use of regular expressions, which made the project hard to maintain. Large regular expressions were needed because the underlying data format of 10-K and 10-Q filings often varies. What a filing should look like is described in the EDGAR filer manual (EDGAR Filer Manual 2007), but many companies deviate from the rules described there. One reason may be that accountants do not care much about data formats; for example, only a few companies support XBRL (Extensible Business Reporting Language XBRL 2007). The software agent described in this chapter, EASE (Extraction Agent for SEC's EDGAR database), uses term weights to identify the interesting sections in 10-K and 10-Q forms. EASE is able to handle these deviations, i.e. the agent takes changing document formats into account.

Before we delve into the details of the EASE project, we take a look at existing implementations with similar objectives. Several EDGAR agents currently exist, e.g. EDGAR Online, see (EDGAR Online 2007) and (FreeEDGAR 2007), 10 k Wizard (10 k Wizard: SEC filings 2007) and FRAANK, see (Financial Reporting and Auditing Agent with Net Knowledge FRAANK 2007) and (Kogan; Nelson; Srivastava; Vasarhelyi; Lu 2000). EdgarScan from PriceWaterhouseCoopers pulls filings from the SEC's servers and parses them automatically to find key financial tables and to normalize financial positions to a common set of items that is comparable across companies (EdgarScan 2007). This normalization makes EdgarScan superior to other implementations. When filings are available in HTML only, EdgarScan apparently converts the HTML documents into plain text before parsing them. Traditional EDGAR agents do not always detect a balance sheet in a 10-K filing correctly. EASE uses two different approaches that take into account the changed document formats (EDGAR filer manual 2007), which now allow embedded HTML and PDF documents. We describe an algorithm which extracts key financial information at a high level for conventional text files.

The chapter is organized as follows. The first section describes the content and structure of SEC 10-K and 10-Q filings. In the following main part we describe the approaches the EASE agent uses to extract financial data from these filings. This part consists of two subsections, one dedicated to the traditional, textual filings and the other to the newer HTML encoded documents. The next section describes the role of XBRL, an XML based standard for the exchange of financial information. The final section contains results and a conclusion.


16.2 Structure of 10-K and 10-Q Filings

A filing is a text document that contains tags which structure its parts. The specification for filings can be found in the EDGAR filer manual (EDGAR filer manual 2007). These tags are specific to SEC filings, but follow syntactical rules similar to those of HTML tags. A filing consists of a single root element named <SEC-DOCUMENT>, which contains a single child named <SEC-HEADER> and one or more children named <DOCUMENT>. The following excerpt shows this structure, which is common to all filings (indentation is added for illustration purposes):

    <SEC-DOCUMENT>
      <SEC-HEADER>
        CONFORMED SUBMISSION TYPE: 10-K
        CONFORMED PERIOD OF REPORT: 20041231
        FILED AS OF DATE: 20050328
        FILER:
          COMPANY DATA:
            CENTRAL INDEX KEY: 0123456789
            …
          FILING VALUES:
            FORM TYPE: 10-K
            …
      </SEC-HEADER>
      <DOCUMENT>
        <TYPE>10-K
        <TEXT>
          SECURITIES AND EXCHANGE COMMISSION
          Washington, D.C. 20549
          Form 10-K
          …
        </TEXT>
      </DOCUMENT>
      …
    </SEC-DOCUMENT>
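Header lines of the form shown in the excerpt lend themselves to a split at the first colon. The following Java sketch illustrates this; the class and method names are hypothetical and not taken from EASE, and the sketch assumes one property per line and ignores the nesting expressed by indentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper (hypothetical name), not part of EASE. */
class HeaderLines {

    /**
     * Splits header text into property name/value pairs.
     * Each line is cut at its first colon; lines without a colon are skipped.
     * Group lines such as "FILER:" end up with an empty value.
     */
    static Map<String, String> parse(String headerText) {
        Map<String, String> properties = new LinkedHashMap<String, String>();
        for (String line : headerText.split("\\r?\\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "NAME: value" line
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (!name.isEmpty()) {
                properties.put(name, value); // a repeated name overwrites the earlier value
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        String header = "CONFORMED SUBMISSION TYPE: 10-K\n"
                + "CONFORMED PERIOD OF REPORT: 20041231\n"
                + "FILED AS OF DATE: 20050328\n";
        System.out.println(parse(header));
    }
}
```

A flat map is the simplest choice for a sketch; a tree of nodes, as used later in the chapter, would be needed to keep groups such as FILER and COMPANY DATA apart.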

16.2.1 Structure of Document Elements

The common structure of all embedded documents is a sequence of named properties followed by a single text node. An example:

    <TYPE>10-K
    <DESCRIPTION>ANNUAL REPORT
    <TEXT>
    actual contents
    </TEXT>

In this example, the first two lines specify two document properties, <TYPE> and <DESCRIPTION>. The text node <TEXT> encloses the actual contents, which is either an embedded document of type HTML, plain text with tags, or even some graphical type. Parsing filings in order to obtain the structure up to this level is comparatively easy. Embedded documents which contain financial information are nearly always either HTML or plain text, and sometimes a mixture of both. The format of these documents varies a lot between companies, and may change over time for a single company.

16.2.2 Embedded Plain Text

Plain text documents are best viewed with browsers that display the contents in a mono-spaced font. Line and page breaks as well as the length of character runs, including white space, are significant for the visible structure of sections, paragraphs and tables. The following excerpt illustrates this.

            2006   2005
    ------------------
    <S>     <C>    <C>
    ASSETS
    Cash    27     85

Representing this example in a proportional font, without respecting line breaks, produces something like

    2006 2005 ------------------<S> <C> <C> ASSETSCash 27 85

which clearly illustrates the importance of the properties stated above. That said, we could easily draw a table by placing a pipe character immediately in front of the "less than" characters of the columns, and would obtain this visible result:

            |2006   |2005
    --------|-------|---
    ASSETS  |       |
    Cash    |27     |85

Here is an example of how a balance sheet should appear in a filing (we call this well-formed or standard):

    ...
    <page>
    ...
    <TABLE>
    <CAPTION>
    CONSOLIDATED BALANCE SHEET
    <S>     <C>    <C>
    ...
    </TABLE>
    ...

This example illustrates a kind of canonical table in a traditional document. First, the document uses <page> tags to separate pages. Second, the table is enclosed in an opening and a closing <TABLE> tag. Third, the table uses <CAPTION> to identify its purpose. Fourth, it uses a sequence of <S> and <C> tags which determines the column specification (<S> stands for stub and <C> stands for column). Finally, the table entries are placed correctly within the text, that is, they start at their respective horizontal positions. Filings often deviate from the example above in a number of points, and some tags are frequently omitted. This makes it difficult to generalize the parsing process.
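For a well-formed table, the column specification line is enough to cut each row into cells. The following Java sketch shows the idea; the names are hypothetical, and the tag name <C> for the stripped column tag follows the EDGAR convention and is an assumption here. The sketch assumes that every entry starts at or after the position of its column tag.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name), not part of EASE. */
class ColumnSpec {

    /** Returns the offsets at which a tag such as <S> or <C> starts in the specification line. */
    static List<Integer> columnStarts(String specLine) {
        List<Integer> starts = new ArrayList<Integer>();
        for (int i = specLine.indexOf('<'); i >= 0; i = specLine.indexOf('<', i + 1)) {
            starts.add(i);
        }
        return starts;
    }

    /** Cuts one table row into cells at the given offsets and trims each cell. */
    static List<String> splitRow(String row, List<Integer> starts) {
        List<String> cells = new ArrayList<String>();
        for (int c = 0; c < starts.size(); c++) {
            // Clamp both ends so that short rows (e.g. "ASSETS") yield empty cells.
            int from = Math.min(starts.get(c), row.length());
            int to = (c + 1 < starts.size())
                    ? Math.min(starts.get(c + 1), row.length())
                    : row.length();
            cells.add(row.substring(from, to).trim());
        }
        return cells;
    }

    public static void main(String[] args) {
        String spec = "<S>" + "     " + "<C>" + "   " + "<C>";
        String row = "Cash" + "    " + "27" + "    " + "85";
        System.out.println(splitRow(row, columnStarts(spec)));
    }
}
```

Rows whose numbers are wider than the column and start to the left of the tag position would be cut in the wrong place, which is one reason why a column specification alone is not sufficient for real filings.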

16.2.3 Embedded HTML

Embedded HTML documents are in no way special to SEC filings when compared to other real-world HTML content published on the web. They are enclosed in <HTML> tags and use <BODY>, <TABLE> and other standard HTML tags. These documents should be viewed with contemporary HTML browsers only. The following figure shows the beginning of a balance sheet.

    (Dollars in millions except per share amounts)
    AT DECEMBER 31:                                          NOTES       2002       2001
    ASSETS
    Current assets
    Cash and cash equivalents                                         $ 5,382    $ 6,330
    Marketable securities                                      D          593         63
    Notes and accounts receivable-trade, net of allowances              9,915      9,101
    Short-term financing receivables                           F       15,996     16,656
    Other accounts receivable                                           1,447      1,261
    Inventories                                                E        3,148      4,304
    Deferred taxes                                             P        2,617      2,402
    Prepaid expenses and other current assets                           2,554      2,344
    Total current assets                                               41,652     42,461
    Plant, rental machines and other property                  G       36,083     38,375

Figure 16.1. Excerpt of a balance sheet from a 10-K filing in HTML format

16.2.4 Financial Reports Overview

In order to parse 10-K and 10-Q filings and detect balance sheet data, we have to analyze the basic structure of these filings. The balance sheet itself is found in the financial statements part. Financial reports consist of natural language text disclosing various aspects of a company, and figures in tabular form. Annual financial reports, the submission of which is prescribed in form 10-K of the SEC, are sectioned as follows (EDGAR Filer Manual 2007):

Item 1. Business
Item 2. Properties
Item 3. Legal Proceedings
Item 4. Submission of Matters to a Vote of Security Holders
Item 5. Market for Registrant's Common Equity and Related Stockholder Matters
Item 6. Selected Financial Data
Item 7. Management's Discussion and Analysis of Financial Condition and Results of Operations
Item 8. Financial Statements and Supplementary Data
Item 9. Changes in and Disagreements With Accountants on Accounting and Financial Disclosure
Item 10. Directors and Executive Officers of the Registrant
Item 11. Executive Compensation
Item 12. Security Ownership of Certain Beneficial Owners and Management
Item 13. Certain Relationships and Related Transactions
Item 14. Principal Accountant Fees and Services

Quarterly updates, prescribed by form 10-Q, contain the following sections, which may be omitted where not applicable (EDGAR Filer Manual 2007):

PART I: FINANCIAL INFORMATION
Item 1. Financial Statements
Item 2. Management's Discussion and Analysis of Financial Condition and Results of Operations
Item 3. Quantitative and Qualitative Disclosures About Market Risk
Item 4. Controls and Procedures

PART II: OTHER INFORMATION
Item 1. Legal Proceedings
Item 2. Unregistered Sales of Equity Securities and Use of Proceeds
Item 3. Defaults Upon Senior Securities
Item 4. Submission of Matters to a Vote of Security Holders
Item 5. Other Information
Item 6. Exhibits

Financial statements in quarterly reports are typically condensed, offering a coarser granularity of financial positions than financial statements in annual reports. They are located in Part I, Item 1 of quarterly updates, and in Item 8 of annual reports.

16.3 Information Retrieval Within EASE

Information retrieval (IR) deals with the representation, storage, organization of, and access to information items. The representation and organization of the information items should provide the user with easy access to the information in which he is interested. The three basic models in information retrieval are called Boolean, vector, and probabilistic (Baeza-Yates and Ribeiro-Neto 1999). In the Boolean model, documents and queries are represented as sets of index terms. In the vector model, documents and queries are represented as vectors in an n-dimensional space. In the probabilistic model, the framework for modeling document and query representations is based on probability theory.

The vector model recognizes that the use of binary weights is too limiting and proposes a framework in which partial matching is possible. This is accomplished by assigning non-binary weights to index terms in queries and in documents. These term weights are ultimately used to compute the degree of similarity between each document stored in the system and the user query. By sorting the retrieved documents in decreasing order of this degree of similarity, the vector model takes into consideration documents which match the query terms only partially. The main effect is that the ranked document answer set is a lot more precise (in the sense that it better matches the user's information need) than the document answer set retrieved by the Boolean model.

Information extraction (IE) is a technology dedicated to the extraction of structured information from texts to fill pre-defined templates. While most contemporary work focuses on either machine learning for IE patterns or wrapper generation, cf. (Baeza-Yates and Ribeiro-Neto 1999), (Kogan; Nelson; Srivastava; Vasarhelyi; Lu 2000) and others, and is geared towards web search engines, we deal with a predefined structure for the information, namely the financial statements of US-GAAP compliant financial reports. The guidelines in the filer manual specify what information must be present, but the technical specification is comparatively weak.

16.3.1 Information Extraction from Traditional Filings

A filing is often not well-formed; for example, tags within a table are often omitted. Because of this, detecting a balance sheet with large regular expressions is hard to maintain (Leinemann et al. 2001). A Boolean model may not be the best solution either. Therefore we use a modified version of the vector space model, cf. (Salton 1988), to identify financial tables. We split a filing into segments, which we use as document equivalents. This is necessary since we are dealing with a single document, the parts of which need to be evaluated. The query is defined as the search for the balance sheet; the formal details are explained below. The aim of these modifications is to obtain a value which ranks the document segments by the likelihood that they contain a balance sheet (or another financial table). First, three typical excerpts of a filing are presented in the example below. Then the classical vector space algorithm is applied (Salton 1988). Since this algorithm does not detect the balance sheet in this example, a refined vector space algorithm is presented afterwards. This algorithm is used in EASE.

16.3.1.1 Example Documents Used for Query

We will refer to the following three text excerpts from real-world filings in order to compare the results of the standard vector space model to our extended version. The second excerpt contains the balance sheet itself.

Document d1:

    <page>
    The status of the pension plans follows.
    <TABLE>
    Assets exceed accumulated benefit obligation
    -------------------------------------------
                                1997        1996
    -------------------------------------------
    Assets, primarily stocks and bonds
    at market                $ 5,074.5   $ 4,327.6
    ...

Document d2:

    <page>
    <TABLE>
    Balance Sheet
                                1997        1996
    -------------------------------------------
    Assets
    Current assets:
    Cash and cash equivalents   800.8       598.1
    ...

Document d3:

    <TABLE>
                        1998      1997      1996
                     -------   -------   -------
    Assets:
    Current assets    $1,569    $1,949    $1,995
    ...
    Balance Sheet from December 31, 1998 is on page 12.

16.3.2 Standard Vector Space Model

The standard vector space model creates a weighted mapping of terms to documents so that the similarity of documents and search queries can be measured. First, we define a set T of n search terms

\[ T = \{ t_1, \dots, t_n \}, \]

each of which is a literal term like "balance sheet" or "assets" or a tag like <TABLE> or <page>. The document segments are defined as the set

\[ D = \{ d_1, \dots, d_m \}. \]

The query weight vector

\[ q' = (q_1, \dots, q_n) \]

is a vector of query weights associated with the terms of T, and

\[ w_j' = (w_{j,1}, \dots, w_{j,n}) \in \mathbb{R}^n \]

is a binary vector with components w_{j,k} = 1 if term t_k exists in d_j, and 0 otherwise. The similarity measure s_j for document d_j is then defined as

\[ s_j = \sum_{k=1}^{n} w_{j,k} \, q_k . \]

Higher values of the similarity s_j indicate a higher probability that a segment contains the balance sheet. The weights used for the terms in this model are based on observations from dozens of tests (see Table 16.1). Using the query weights q = (10, 15, 55, 20) and the search terms T = {<page>, <TABLE>, Balance Sheet, Assets}, we obtain the following similarity measures for documents d1, d2, d3:

    s1 = 1·10 + 1·15 + 0·55 + 1·20 = 45 for d1,
    s2 = 1·10 + 1·15 + 1·55 + 0·20 = 80 for d2,
    s3 = 0·10 + 1·15 + 1·55 + 1·20 = 90 for d3.

The highest similarity measure s3 identifies document d3. But this document does not contain the balance sheet, which is in document d2. This means that using this similarity measure does not necessarily lead to a successful detection of the balance sheet.

Table 16.1. Defined query weights for detection of a balance sheet

    Query   Term            Weight
    q       <page>          10
            <TABLE>         15
            balance sheet   55
            assets          20

Table 16.2. Result for the search query of documents d1, d2 and d3 (weight per document segment)

    Term            d1   d2   d3
    <page>           1    1    0
    <TABLE>          1    1    1
    balance sheet    0    1    1
    assets           1    0    1
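The standard similarity measure is one dot product per segment. The following Java sketch reproduces the three values 45, 80 and 90 from the query weights q = (10, 15, 55, 20) and the occurrence vectors used in the calculation of s1, s2 and s3. The class and method names are hypothetical and not taken from EASE; the occurrence vectors are passed in as given rather than derived from the excerpts.

```java
/** Illustrative only: names are hypothetical and not part of EASE. */
class StandardVectorSpace {

    /** s_j = sum over k of w_{j,k} * q_k for one segment. */
    static int similarity(int[] occurrence, int[] queryWeights) {
        if (occurrence.length != queryWeights.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        int s = 0;
        for (int k = 0; k < queryWeights.length; k++) {
            s += occurrence[k] * queryWeights[k];
        }
        return s;
    }

    /** Index of the segment with the highest similarity (the first one wins on ties). */
    static int bestSegment(int[][] occurrences, int[] queryWeights) {
        int best = 0;
        for (int j = 1; j < occurrences.length; j++) {
            if (similarity(occurrences[j], queryWeights)
                    > similarity(occurrences[best], queryWeights)) {
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] q = {10, 15, 55, 20};   // <page>, table tag, Balance Sheet, Assets
        int[][] w = {
            {1, 1, 0, 1},             // d1
            {1, 1, 1, 0},             // d2
            {0, 1, 1, 1}              // d3
        };
        for (int j = 0; j < w.length; j++) {
            System.out.println("s" + (j + 1) + " = " + similarity(w[j], q));
        }
        System.out.println("best segment: d" + (bestSegment(w, q) + 1));
    }
}
```

With these inputs the best segment is d3, which is exactly the false positive discussed in the text.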


In this example, the standard vector space model does not assign the highest similarity measure to the segment that contains the balance sheet. This result motivates a new, extended version of the vector space model. The main idea is that the order in which the individual search terms occur matters.

16.3.3 Extended Vector Space Model

To take the order of the search terms into account, we replace the vectors in the equations with matrices whose components reflect the order of the terms. This leads to the following modification. Let Q be an n×n term-term matrix

\[ Q = \begin{pmatrix} q_{1,1} & \cdots & q_{1,n} \\ \vdots & \ddots & \vdots \\ q_{n,1} & \cdots & q_{n,n} \end{pmatrix}, \]

where q_{i,k} denotes the weight of term t_i being followed by term t_k. W_j denotes the matrix of occurrences in document d_j,

\[ W_j = \begin{pmatrix} w_{j,1,1} & \cdots & w_{j,1,n} \\ \vdots & \ddots & \vdots \\ w_{j,n,1} & \cdots & w_{j,n,n} \end{pmatrix}, \]

where w_{j,i,k} equals 1 if term t_i precedes t_k in d_j, and 0 otherwise. As similarity measure for d_j we use the sum of weighted occurrences, i.e.

\[ s_j = \sum_{i=1}^{n} \sum_{k=1}^{n} w_{j,i,k} \, q_{i,k} . \]

A high value for the similarity measure again represents a high likelihood for the occurrence of a balance sheet. We apply the modified model to the document segments above (see Table 16.4). In Table 16.3, the query weight of 75 means that "Balance Sheet" is followed by "Assets".

Table 16.3. Query weights for the extended vector space model (rows: preceding term, columns: following term)

    Q               <page>   <TABLE>   Balance Sheet   Assets
    <page>             0        25          65           30
    <TABLE>            0         0          70           35
    Balance Sheet      0        70           0           75
    Assets             0         0           0            0


Table 16.4. Example for term and order weights for both queries and segments (rows: preceding term, columns: following term)

    d1              <page>   <TABLE>   Balance Sheet   Assets
    <page>             0         1           0            0
    <TABLE>            0         0           0            1
    Balance Sheet      0         0           0            0
    Assets             0         0           0            0

    d2              <page>   <TABLE>   Balance Sheet   Assets
    <page>             0         1           0            0
    <TABLE>            0         0           1            0
    Balance Sheet      0         0           0            0
    Assets             0         0           0            0

    d3              <page>   <TABLE>   Balance Sheet   Assets
    <page>             0         0           0            0
    <TABLE>            0         0           1            0
    Balance Sheet      0         0           0            0
    Assets             0         0           1            0
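The extended similarity measure is the element-wise product of the two matrices, summed over all entries. The Java sketch below uses the query weights of Table 16.3 and the occurrence matrices of Table 16.4 as inputs. Three points are assumptions and not taken from the source: the class and field names are hypothetical, the stripped tag name is taken to be the table tag, and the last row of Q (terms following "Assets") is set to zero because the source table is not legible there. With that last assumption the sketch yields 60, 95 and 70 for d1, d2 and d3.

```java
/** Illustrative only: names are hypothetical and not part of EASE. */
class ExtendedVectorSpace {

    // Term order in rows and columns: <page>, table tag, Balance Sheet, Assets.
    // Rows are the preceding term, columns the following term.
    // The last row is an assumption (all zero), see the lead-in.
    static final int[][] Q = {
        {0, 25, 65, 30},
        {0,  0, 70, 35},
        {0, 70,  0, 75},
        {0,  0,  0,  0}
    };

    // Occurrence matrices for the three example segments.
    static final int[][] W1 = {{0, 1, 0, 0}, {0, 0, 0, 1}, {0, 0, 0, 0}, {0, 0, 0, 0}};
    static final int[][] W2 = {{0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}};
    static final int[][] W3 = {{0, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 0}, {0, 0, 1, 0}};

    /** s_j = sum over i and k of w_{j,i,k} * q_{i,k}. */
    static int similarity(int[][] occurrences, int[][] queryWeights) {
        int s = 0;
        for (int i = 0; i < queryWeights.length; i++) {
            for (int k = 0; k < queryWeights[i].length; k++) {
                s += occurrences[i][k] * queryWeights[i][k];
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("s1 = " + similarity(W1, Q));
        System.out.println("s2 = " + similarity(W2, Q));
        System.out.println("s3 = " + similarity(W3, Q));
    }
}
```

Because only the order-dependent entries contribute, d2 now ranks first, while the misleading sentence in d3 no longer dominates the score.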

We obtain a similarity measure of s1 = 60 for segment d1, s2 = 95 for segment d2, and s3 = 70 for segment d3. Segment d2, which contains the balance sheet, now has the highest similarity measure.

The extraction algorithm then tries to capture individual items by their names, and splits rows into columns based on the column specification, the number of consecutive separating white space characters, and the appearance of isolated numbers. First we identify the relevant table and determine the start of the table. Next we extract the multiplier and the currency of the numbers. After that we start extracting the data for the different item labels. The details of the extraction are beyond the scope of this document.
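When a column specification is missing, runs of white space and isolated numbers remain as clues for splitting a row. The following Java sketch illustrates one possible heuristic; it is not the EASE algorithm. The names are hypothetical, the threshold of two consecutive white space characters is an assumption, and so is the accounting convention that parentheses mark negative amounts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative heuristic (hypothetical names), not the EASE implementation. */
class RowSplitter {

    // Two or more white space characters are treated as a column gap (assumed threshold).
    private static final Pattern GAP = Pattern.compile("\\s{2,}");

    // Optional parenthesis and dollar sign, digits with optional thousands separators and decimals.
    private static final Pattern NUMBER =
            Pattern.compile("\\(?\\$?\\s*\\d+(,\\d{3})*(\\.\\d+)?\\)?");

    /** Splits a row into its label and value cells at column gaps. */
    static List<String> cells(String row) {
        List<String> result = new ArrayList<String>();
        for (String cell : GAP.split(row.trim())) {
            if (!cell.isEmpty()) {
                result.add(cell);
            }
        }
        return result;
    }

    /** True if the cell looks like an isolated number such as "$ 5,074.5" or "(1,234)". */
    static boolean isNumber(String cell) {
        return NUMBER.matcher(cell.trim()).matches();
    }

    /** Parses a numeric cell; a value in parentheses is returned as negative. */
    static double parse(String cell) {
        String text = cell.trim();
        boolean negative = text.startsWith("(") && text.endsWith(")");
        double value = Double.parseDouble(text.replaceAll("[()$,\\s]", ""));
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        for (String cell : cells("stocks and bonds at market   $ 5,074.5   $ 4,327.6")) {
            System.out.println(cell + (isNumber(cell) ? " -> " + parse(cell) : ""));
        }
    }
}
```

A single space inside "$ 5,074.5" does not count as a gap, so the currency sign stays attached to its amount.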

16.3.4

Extraction of HTML Documents

Since SEC filings consist of HTML parts and other predefined tags mixed up with plain text, the filing is not conform to the XML specification and for the extraction of HTML flavored documents requires the use of absolute HTML paths that point to the data item to be extracted. The extraction process of HTML flavored documents is two fold: In the first stage, a parser builds a document tree based with an HTML DOM root node for each embedded HTML document. This structure is used by the extractor in the second stage to locate financial statements and extract them into the target object. Input for the document builder is the raw character sequence containing the entire document. Output is an object which

16 Extracting Financial Data from SEC Filings for US GAAP Accountants

369

contains a reference to the raw text and a root node, which represents the root of the document hierarchy. As stated before, a filing consists of multiple parts with varying syntax. The main parser splits the document into a header and a number of documents first, and delegates the actual parsing to specialized parsers for headers and documents, a header parser and a document parser, respectively. The header and each document are added as nodes to the root node. Splitting the entire document into its constituent parts is realized with simple regular expressions, searching the respective opening and closing tags of <SEC-HEADER> and in a single call. The header can be split into lines, and these lines can be split into a property name and its value simply finding the first occurrence of a colon. Each property is added as a single node to the document tree. The document parser uses a single regular expression in order to split the document contents into the leading property values and the final text element. The former are put as properties to the document node, and the text node is added as the only child to the document node. The content of this text element is then passed on to an implementation of an HTML parser. This parser adds the document as child named to the text node. The resulting document tree can be viewed with a special end-user application implemented for demonstration purposes (see screenshot in Figure 16.2). Navigation through the entire document is possible using the methods which the node implementation allows for, among them searching by content and/or

Figure 16.2. Screen of a DOM tree

370

Thomas Stümpert

attribute values, and addressing nodes using path expressions. The address of the “filer” for example would be “/SEC-DOCUMENT/SEC-HEADER/filer”. Moreover, the entire document can be processed in consistent way regardless of the input type. The main challenge for the HTML parser stems from the fact that most real world HTML documents do not follow the W3C recommendations strictly. In contrast to XML, which enforces strict adherence to standards, HTML documents have always been quite sloppy in terms of standardization. Web browsers have been very tolerant towards incorrect syntax since their invention, which lead to the proliferation of HTML editors that produce code not in compliance with recommendations. From a theoretical point of view, XML is a type-2 grammar in Chomsky’s hierarchy. As such, it requires a finite state engine with a stack and thus exceeds the capabilities that pure regular expressions provide. HTML is only a special incorporation of XML or SGML when perfectly in compliance with HTML recommendations. In practice, most HTML documents violate the recommendations frequently. Capturing the document tree reliably despite potential violations almost certainly requires a Turing-complete engine. And in fact, writing such a parser proved to be challenging. The basis for the HTML parser is a skeletal XML parser with special precautions for various expected and unexpected violations. The parser contains two regular expressions. The first expression is used to find comments, opening tags, closing tags, and character data tags in the parsers main loop. The second is used to parse attributes in opening tags. The parser keeps a reference to the node which has been opened most recently. Every time an opening tag is detected, a new node is created. This node will become the current top node. When a closing tag is detected, the current top node is closed, and its parent is made the current top. 
The reference to the current top node resembles the stack required for type-2 parsers. The current top node is the top element of the stack. Creating a new node and making it the current top node resembles the “push” operation of a stack, and making the current top node’s parent the current top node resembles the “pop” operation of a stack. As stated before, this approach must fail for real-world HTML documents and HTML-based filings. First of all, broken tags (opening tags without a closing counterpart) must not become the top node. Second, in some cases closing tags are omitted. An opening tag then forces previously found opening tags to be closed. Third, a closing tag may be misplaced at some later position in the document, in which case it is ignored.

The extraction process is similar to the concept of XPath, which describes how to address a node in an XML document. For example, in order to find out the type of a filing (10-K or 10-Q), the extractor searches for the header node and, within it, for the node named submission type. The following code illustrates this:

    Node root; // given: the root node of the parsed filing
    Node header = root.find("SEC-HEADER");
    Node submissionType = header.find("CONFORMED SUBMISSION TYPE");
    String typeName = submissionType.getText();
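The tolerant stack discipline described above (broken tags never become the top node, an opening tag may close its predecessor, misplaced closing tags are ignored) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EASE parser itself: the names `VOID_TAGS`, `IMPLICIT_SIBLINGS`, `Node` and `parse` are hypothetical, the tag lists are deliberately incomplete, and attribute parsing (the second regular expression mentioned in the text) is left out.

```python
import re

# Assumed, partial list of tags that never have a closing counterpart.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}
# Assumed, partial list of tags whose opening closes an open tag of the same name.
IMPLICIT_SIBLINGS = {"p", "li", "tr", "td", "th"}

# One expression for comments, closing tags and opening tags.
TOKEN = re.compile(
    r"<!--.*?-->|</\s*(?P<close>\w+)\s*>|<\s*(?P<open>\w+)[^>]*>", re.S
)


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.text = ""
        if parent is not None:
            parent.children.append(self)


def parse(html):
    root = Node("#root")
    top = root  # most recently opened node, i.e. the top of the stack
    pos = 0
    for match in TOKEN.finditer(html):
        top.text += html[pos:match.start()]
        pos = match.end()
        if match.group("open"):
            name = match.group("open").lower()
            if name in IMPLICIT_SIBLINGS and top.name == name:
                top = top.parent          # omitted closing tag: close the sibling
            node = Node(name, top)
            if name not in VOID_TAGS:
                top = node                # "push"
        elif match.group("close"):
            name = match.group("close").lower()
            candidate = top
            while candidate is not root and candidate.name != name:
                candidate = candidate.parent
            if candidate is not root:
                top = candidate.parent    # "pop", closing skipped nodes as well
            # otherwise the closing tag is misplaced and is ignored
    top.text += html[pos:]
    return root
```

For the input `<table><tr><td>a<br><td>b</tr></i></table>`, the sketch yields one table with one row and two cells, even though both `</td>` tags are missing and `</i>` has no opening tag.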

The path and search expressions vary from company to company, but there is a set of default methods for identifying a certain node. The extractor checks its configuration database for a specialized path for a given item and company, and falls back to a generic implementation if none is available. This procedure makes it possible to extend the range of covered companies simply by adding specialized entries for a particular company to the configuration database. Our implementation differs from XPath in that it supports regular expressions. Using XPath and regular expressions, a balance sheet can be detected using the following expressions:

    XPath: SEC-DOCUMENT/DOCUMENT
    XPath: /TEXT/HTML/BODY//TABLE[count(TR)>7]/TR/TD
    Regex: .*liabilities.*equity.*
    XPath: ../..
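The lookup with a company-specific fallback and the combined path and regular expression search might look roughly as follows in Python. The sketch assumes a well-formed tree and uses the standard library's `xml.etree.ElementTree`, whose limited XPath support has no `count()` function, so the row count is checked in plain Python. The dictionaries `DEFAULT_PATTERNS` and `COMPANY_PATTERNS`, the sample CIK and the function names are hypothetical stand-ins for the configuration database.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical configuration: generic patterns plus per-company overrides.
DEFAULT_PATTERNS = {"balance sheet": r".*liabilities.*equity.*"}
COMPANY_PATTERNS = {("0000000000", "balance sheet"): r".*liabilities.*shareowners.*"}


def pattern_for(cik, item):
    """Return the company-specific pattern if one is configured, else the default."""
    return COMPANY_PATTERNS.get((cik, item), DEFAULT_PATTERNS[item])


def find_tables(root, pattern, min_rows=8):
    """Tables with more than seven rows in which some cell matches the pattern."""
    regex = re.compile(pattern, re.IGNORECASE | re.DOTALL)
    hits = []
    for table in root.iterfind("./DOCUMENT/TEXT/HTML/BODY//TABLE"):
        rows = table.findall("TR")
        if len(rows) < min_rows:
            continue
        cells = (cell for row in rows for cell in row.findall("TD"))
        if any(regex.fullmatch("".join(cell.itertext())) for cell in cells):
            hits.append(table)  # plays the role of stepping back up with "../.."
    return hits
```

Calling `find_tables(root, pattern_for(cik, "balance sheet"))` on a parsed filing returns the candidate tables; a company without an override silently uses the generic expression.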

16.4 The Role of XBRL

The most important effort in standardizing the exchange of financial information is the Extensible Business Reporting Language (XBRL 2007), which comprises a set of XML schemas, describing how to present items, and taxonomies, describing which items make up a certain type of financial report. The standard is governed by a not-for-profit international consortium (XBRL International Incorporated) of more than 300 organizations, including regulators, government agencies, intermediaries and software vendors. Registrants may choose to voluntarily submit documents in XBRL format in addition to the plain text or HTML documents that meet the filing requirements. Only documents submitted to the EDGAR system in either plain text or HTML are official filings. All XBRL documents are unofficial submissions and are considered furnished, not filed. XBRL is being rapidly adopted by a wide range of regulators to replace both paper-based and legacy electronic financial data collection. It is used for the disclosure of financial performance information by companies, notably in voluntary SEC filings. XBRL is somewhat different from many other XML standards published by consortia. Generally, when two parties wish to exchange information using XML, they need to have prior access to all of the element definitions. But accounting involves the dissemination of information, the vast majority of which conforms to reporting norms, but some of which is unique to the circumstances of a particular company.

16.5 Empirical Results

The basis for the following observations was a sample of 10-K filings for the 30 Dow Jones Industrial Average companies. The performance benchmark was the former

Table 16.5. Overview of detected balance sheets within 10-Q filings

Company                         Detected Balance Sheets  Number of Filings
3M Co.                                               23                 23
Alcoa Inc.                                           23                 23
Altria Group Inc.                                    18                 18
American Express Co.                                 22                 23
AT&T Corp.                                           23                 23
Boeing Co.                                           23                 23
Caterpillar Inc.                                     20                 23
Citigroup Inc.                                       21                 21
Coca-Cola Co.                                        23                 23
E.I. DuPont de Nemours & Co.                         17                 17
Eastman Kodak Co.                                    23                 23
Exxon Mobil Corp.                                    22                 22
General Electric Co.                                 12                 12
General Motors Corp.                                 19                 22
Hewlett-Packard Co.                                  19                 20
Home Depot Inc.                                      18                 18
Honeywell International Inc.                         25                 25
IBM Corp.                                            22                 22
Intel Corp.                                          15                 15
International Paper Co.                              22                 23
J.P. Morgan Chase Co.                                23                 23
Johnson & Johnson                                    21                 22
McDonald’s Corp.                                     21                 21
Merck & Co. Inc.                                     18                 19
Microsoft Corp.                                      19                 19
Procter & Gamble Co.                                 23                 23
SBC Communications Inc.                              16                 18
United Technologies Corp.                            15                 17
Wal-Mart Stores Inc.                                  9                 11
Walt Disney Co.                                      23                 23
Total                                               598                615

EDGAR2XML implementation, which took about two minutes to extract financial data from comparable filings. This package used the GNU regular expression implementation. The DOM parser was able to create the DOM instance for all filings in the sample which contained embedded HTML. The parsing process took only about 1.5 seconds for an average 1 MB filing on a 1.5 GHz single-processor machine with 512 MB of RAM. Navigation within the document tree proved to be very fast. The plain text parser recognized 598 of the 615 text-based balance sheets in the 10-Q filings of the sample (see Table 16.5), and from 206 10-K filings

it detected 100% of the balance sheets correctly. Processing took about 0.5 seconds per filing with an average size of 400 kB. The balance sheets that were not detected correctly in 10-Q filings deviate from the well-defined structure: The filing for CIK 732717 in 1996 contained no tags at all. The consolidated balance sheets for CIK 40730 included those of subsidiaries, which confused the EASE agent. The filing for CIK 40545 for the year 2000 contained a mix of HTML and plain text formatting, which the agent cannot deal with. For CIK 18230 in the year 1997, the agent found overly similar patterns in other segments of the filing. In order to detect financial items in an extracted balance sheet, we worked with synonym lists (see Appendix).

16.6 Conclusion

The focus of this paper is the recognition of balance sheets within 10-K and 10-Q filings. The major challenge in extracting balance sheet data is dealing with varying data formats. The EDGAR filer manual describes the conceptual structure of 10-K and 10-Q filings for filers, but not for data format specialists. As a consequence, tags are omitted and the underlying data format may change. Using only regular expressions to detect the relevant financial data leads to an unmaintainable solution, cf. the former EDGAR2XML project (Leinemann; Schlottmann; Seese; Stümpert 2001). A maintainable approach was a mixture of the vector space algorithm (Salton 1988) and regular expressions. Unfortunately, this approach did not necessarily detect the balance sheet itself, i. e. other document segments of a 10-K or 10-Q filing sometimes had higher similarity measures than the balance sheet. So the idea was born to extend the vector space algorithm by the order of pairwise search terms, e. g. Balance sheet is followed by Assets. The first results from a sample of the companies within the Dow Jones Industrial Average Index are very promising, i. e. in this sample we detected all balance sheets of the 10-K filings.

The filing format is either ASCII-based or HTML-based. An HTML-based filing can be transformed into ASCII-based format, and then the extended vector space model can be applied. Transformation into ASCII-based format was first used by EdgarScan. Another approach is to deal with HTML-based documents directly, e. g. to segment an HTML-enriched filing and create a DOM representation of the filing. Segmentation uses structural information, e. g. a <page> tag is used to get a new DOM entry. The speciality of this approach is not the DOM representation, but the navigation within the DOM tree using both XPath and regular expressions, e. g. detecting a balance sheet by finding all tables with more than seven rows which contain the keywords liabilities and equity, as described in the section about extraction of HTML-based documents. In summary, both the ASCII-based extension of Salton’s vector space algorithm and the HTML-based navigation within the DOM representation of filings are new and lead to robust and easy detection of balance sheets even when structural information is omitted.
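One way to picture the extension of the vector space approach by ordered term pairs is the following Python sketch. It is an assumption-laden illustration, not the scoring used by EASE: the term list, the ordered pairs, the multiplication of cosine similarity by the share of satisfied pairs, and all function names are hypothetical choices made for this example.

```python
import math
import re

# Illustrative query terms and ordered pairs (first term must precede the second).
QUERY_TERMS = ["balance sheet", "assets", "liabilities", "equity"]
ORDERED_PAIRS = [("balance sheet", "assets"), ("assets", "liabilities")]


def term_vector(text):
    """Count how often each query term occurs in the text."""
    lowered = text.lower()
    return {term: len(re.findall(re.escape(term), lowered)) for term in QUERY_TERMS}


def cosine(u, v):
    """Cosine similarity of two term vectors over QUERY_TERMS."""
    dot = sum(u[t] * v[t] for t in QUERY_TERMS)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def order_score(text):
    """Share of ordered pairs whose second term occurs after the first."""
    lowered = text.lower()
    satisfied = 0
    for first, second in ORDERED_PAIRS:
        start = lowered.find(first)
        if start >= 0 and lowered.find(second, start + len(first)) >= 0:
            satisfied += 1
    return satisfied / len(ORDERED_PAIRS)


def score(segment):
    """Similarity to the query, damped when the expected term order is missing."""
    query = {term: 1 for term in QUERY_TERMS}
    return cosine(term_vector(segment), query) * order_score(segment)


def best_segment(segments):
    return max(segments, key=score)
```

A segment from the notes that mentions all four terms in an arbitrary order scores lower than a segment in which "balance sheet" is followed by "assets" and then "liabilities", which is the effect the ordering criterion is meant to achieve.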

Acknowledgements

The author thanks his students Özkan Cetinkaya and Ralf Spöth for technical support. Furthermore, he thanks Frank Schlottmann and Detlef Seese for their ideas and work on the former EDGAR2XML project, which contributed the basic ideas to the EASE project.

Appendix: Synonym List for Balance Sheet Data

current assets=total current assets
financial investments=financing assets
financial investments=financial investments
financial investments=investment in affiliates
financial investments=investments
financial investments=investments and long.term receivables
financial investments=investments in equity affiliates
financial investments=total financial services assets
financial investments=total loans
fixed assets=fixed assets
fixed assets=land.? buildings.? and equipment
fixed assets=land.? buildings.? machinery.? and equipment
fixed assets=net properties
fixed assets=parks.? resorts.? and other property
fixed assets=plant.? rental machines.? and other property..net
fixed assets=plants.? properties.? and equipment
fixed assets=properties.? plants.? and equipment
fixed assets=property and equipment
fixed assets=property.? plant.? and equipment
intangible assets=intangible assets
total assets=total assets
current liabilities=total current liabilities
current liabilities=total deposits
debt=long.term debt
debt=long.term liabilities
debt=long.term borrowings
other liabilities=other liabilities
other liabilities=total deferred credits and other noncurrent liabilities
#equity=total (common)?shareowners. equity
equity=total (common)?stock.?holders.? equity
equity=total (common)?share.?holders.? equity
equity=total (common)?share.?owners.? equity
total liabilities=total liabilities \\s*(and|&)\\s*(common)?\\s*stock.?holder.?s.?\\s*equity
total liabilities=total liabilities \\s*(and|&)\\s*(common)?\\s*share.?owner.?s.?\\s*equity
total liabilities=total liabilities \\s*(and|&)\\s*(common)?\\s*share.?holder.?s.?\\s*equity
total liabilities=total liabilities\\s*(and|&)\\s*(common)?\\s*equity
total liabilities=total.liabilities ..preferred.stock.of.subsidiary.and.stockholders..equity
total liabilities=total liabilities and
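How such a list could be applied is sketched below in Python. The sketch assumes that every entry has the form `canonical item=regular expression`, that a leading `#` disables an entry, and that the doubled backslashes stem from file-format escaping and stand for a single backslash; none of this is confirmed by the text, and the function names are hypothetical. Only a handful of entries are copied from the list above.

```python
import re

# A few entries copied verbatim from the synonym list (raw string keeps "\\s").
SYNONYMS = r"""
current assets=total current assets
fixed assets=property.? plant.? and equipment
debt=long.term debt
#equity=total (common)?shareowners. equity
equity=total (common)?share.?holders.? equity
total liabilities=total liabilities\\s*(and|&)\\s*(common)?\\s*equity
"""


def load_synonyms(text):
    """Turn 'canonical=pattern' lines into (canonical, compiled regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and disabled entries
        canonical, pattern = line.split("=", 1)
        pattern = pattern.replace("\\\\", "\\")  # assumed un-escaping of "\\s"
        rules.append((canonical, re.compile(pattern, re.IGNORECASE)))
    return rules


def canonical_item(label, rules):
    """Map a row label of a balance sheet to its canonical item, if any."""
    cleaned = " ".join(label.split())  # collapse runs of whitespace
    for canonical, regex in rules:
        if regex.fullmatch(cleaned):
            return canonical
    return None
```

With these rules, the row label "Property, plant and equipment" maps to "fixed assets", while a label that matches no pattern yields `None` and would have to be handled by a company-specific entry.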

References

10 k Wizard: SEC filings (2007) url: http://www.tenkwizard.com
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison Wesley
EDGAR Filer Manual (2007) url: http://www.sec.gov/info/edgar/filermanual.htm
EDGAR Online (2007) url: http://www.edgar-online.com
EdgarScan (2007) url: http://edgarscan.pwcglobal.com
Extensible Business Reporting Language XBRL (2007) url: http://www.xbrl.org
Friedl JEF (2002) Mastering regular expressions. O’Reilly, Beijing Cambridge Farnham Köln Paris Sebastopol Taipei Tokyo
Financial Reporting and Auditing Agent with Net Knowledge FRAANK (2007) url: http://fraank.eycarat.ku.edu/
FreeEDGAR (2007) url: http://www.freeedgar.com
Kogan A, Nelson K, Srivastava R, Vasarhelyi M, Lu H (2000) Virtual auditing agents: The EDGAR agent challenge. Decision Support Systems 28:241−253
Leinemann C, Schlottmann F, Seese D, Stümpert T (2001) Automatic extraction and analysis of financial data from the EDGAR database. South African Journal of Information Management 3
Salton G (1988) Automatic text processing: The transformation, analysis, and retrieval of information by computer. Addison Wesley
Stümpert T, Seese D, Cetinkaya Ö, Spöth R (2004) EASE – a software agent that extracts financial data from the SEC’s EDGAR database. Proceedings of the Fourth International ICSC Symposium on Engineering of Intelligent Systems EIS’2004, University of Madeira, Funchal, Portugal, February 29th – March 2, 2004, http://www.icsc-naiso.org, pp. 128; full paper in the electronic proceedings, pp. 1–7
U.S. Securities and Exchange Commission SEC (2007) url: http://www.sec.gov

PART II IT Methods in Finance

Introduction to Part II

This part contains 12 chapters and is devoted to IT methods in finance. There are essentially two ways in which IT enters and influences the methods used in finance. The first comprises methods which originate in IT and informatics and are applied either directly or in a suitably adapted form. The second comprises methods which originate in finance, e. g. mathematical finance, but which require a suitable use of IT for their efficient implementation. This part includes both aspects and adds, as a third, the investigation of problems from finance in computer science.

Nagurney gives a survey on networks in finance in Chapter 17. The chapter traces the history of networks in finance and overviews a spectrum of relevant methodologies for their formulation, analysis, and solution, ranging from optimization techniques to variational inequalities and projected dynamical systems. Numerical examples are provided for illustration purposes.

In Chapter 18 van Dinther discusses the role of agent-based simulation for research in economics. Agent-based simulations have produced a variety of interesting contributions to economic research in the past years. Nevertheless, the methodology of economic simulations in general and agent-based simulations in particular is still in question. This is due to the fact that simulations are conducted without considering the underlying theory and links to standard economic approaches. This contribution discusses the use of simulations in economics and reviews the four most common agent-based approaches and their link to economic theory.

Agents are also the central theme of the next chapter by Dermietzel. He gives a survey on developments and milestones in the heterogeneous agents approach to financial markets. New models of financial markets have been developed over the last 25 years, which are able to explain empirical phenomena observed in real markets. In contrast to classical financial theory, a key feature of these models is the heterogeneity of traders. The models developed from very stylized models in the 1980s to complex dynamical models with realistic trading behaviour. Today these models are used to reproduce and explain the dynamics of financial markets and to investigate improvements and regulations of these sensitive economic mechanisms. This chapter discusses the discrepancies between financial theory

and real market data and provides an overview of important financial market models with heterogeneous traders ranging from 1980 to 2007.

In Chapter 20 Chiarella and He develop an adaptive model of asset price and wealth dynamics in a financial market with heterogeneous agents and examine the profitability of momentum and contrarian trading strategies. In order to characterise asset prices, wealth dynamics and rational adaptiveness arising from the interaction of heterogeneous agents with constant relative risk aversion (CRRA) utility, an adaptive discrete time equilibrium model in terms of return and wealth proportions (among heterogeneous representative agents) is established. Taking trend followers and contrarians as the main heterogeneous agents in the model, the profitability of momentum and contrarian trading strategies is analysed. The authors show the capability of the model to characterize some of the existing evidence on many of the anomalies observed in financial markets, including the profitability of momentum trading strategies over short time intervals and of contrarian trading strategies over long time intervals, rational adaptiveness of agents, overconfidence and underreaction, overreaction and herd behaviour, excess volatility, and volatility clustering.

A survey on simulation methods for stochastic differential equations is given in Chapter 21 by Platen. Stochastic differential equations are becoming the modelling framework in finance, insurance, economics and many other areas of social sciences. Only in rare cases does one have explicit solutions. Furthermore, the modelling of market segments or entire markets often involves a large number of factor processes where many quantitative methods fail to work. For these reasons simulation methods have become essential tools in quantitative finance. This chapter provides a brief introduction into the area of simulation methods for stochastic differential equations with a focus on finance.

The foundations of option pricing are important to many topics in finance. They are discussed by Buchen in Chapter 22. Modern approaches to pricing financial options and derivatives are concerned with the notion of no arbitrage, i. e. the concept “there are no free lunches in efficient markets”. The chapter explores the two principal methods of arbitrage-free pricing and shows how they can be applied to price a wide range of example contracts. The two methods alluded to are the PDE (partial differential equation) method of Black and Scholes, and the EMM (equivalent martingale measure) method of Harrison and Pliska. Both methods have their origins in the stochastic calculus and ultimately, under certain restrictions, are equivalent, as indeed they must be if the ‘Law of One Price’ is to hold. When used together with other analytical tools such as the Gaussian Shift Theorem, the Principle of Static Replication and Parity Relations, one is able to price a range of exotic options with both path independent and path dependent payoffs.

Sun, Rachev and Fabozzi provide a survey on long-range dependence, fractal processes, and intra-daily data in Chapter 23. With the availability of intra-daily price data, researchers have focused more attention on market microstructure issues to understand and help formulate strategies for the timing of trades. The purpose of this chapter is to provide a brief survey of the research employing intra-daily price data. Specifically, the chapter reviews stylized facts of intra-daily data, econometric issues of data analysis, application of intra-daily data in volatility and

liquidity research, and the applications to market microstructure theory. Long-range dependence is observed in intra-daily data. Because fractal processes or fractional integrated models are usually used to model long-range dependence, the chapter also provides a review of fractal processes and long-range dependence in order to consider them in future research using intra-daily data.

Bayesian applications to the investment management process are the central theme of Chapter 24 by Bagasheva, Rachev, Hsu and Fabozzi. The usual investment management practice is to assume that the parameter values estimated from the available data sample are the population parameter values. Failing to take into account the errors intrinsic in sample estimates, however, could lead to suboptimal asset allocations. Bayesian methods provide the theoretical framework and tools to account for estimation risk. Moreover, they allow a researcher or a practitioner to incorporate their subjective views about the model parameters into the asset allocation process. In this chapter, an overview of some aspects of the investment management process within a Bayesian setting is presented. The chapter highlights the advantages that this setting provides over the classical (frequentist) estimation framework.

Racheva-Iotova and Stoyanov discuss a unified framework for constructing optimal portfolio strategies in Chapter 25. It is based on an optimization problem involving the CVaR risk measure and an appropriately selected multivariate model for stock returns. The authors provide an empirical example using the Russell 2000 universe.

In Chapter 26 Gilli, Maringer and Winker give a survey on applications of heuristics in finance. Optimisation is crucial in many areas of finance, but often quite demanding because the type of objective functions and the constraints that have to be considered cannot be handled with traditional optimization techniques. A rather novel group of methods are heuristics. These methods are less restricted in their applicability, and they perform search and optimisation in a non-deterministic fashion by repeatedly updating one or a whole set of candidate solutions and by incorporating stochastic elements. Frequently, these methods are inspired by principles found in nature, e. g., evolution; they are typically not tailored for a certain type of problem, and they are flexible enough to be adapted to complex optimization problems. This chapter presents some of the most popular of these methods and demonstrates how they can be applied to financial problems including portfolio optimization and model selection. These methods are found not only to work reliably, but also to allow a better analysis of complex problems that could not be addressed with traditional methods.

Another class of heuristic approaches to solving several complex problems in finance stems from the area of machine learning. They are surveyed by Chalup and Mitschele in Chapter 27 on kernel methods. Kernel methods are a class of powerful machine learning algorithms which are able to solve non-linear tasks. The chapter presents a concise overview of a selection of relevant machine learning methods and a survey of applications to show how kernel methods have been applied in finance. The overview of learning concepts addresses methods for dimensionality reduction, regression, and classification. The concept of kernelisation, which can be used to transform classical linear machine learning methods into non-linear kernel methods, is emphasised. The survey of applications of kernel

methods in finance covers the areas of credit risk management and market risk management, and discusses possible future application fields. It concludes with a brief overview of relevant software toolboxes.

In recent years several related results have been developed, showing that special problems from finance are provably complex, i. e. their complexity is so high that heuristic methods are needed to approach solutions. Some of these are results on the complexity of decision problems in foreign exchange markets. These problems are discussed in Chapter 28 by Cai and Deng. They study the computational complexity of finding arbitrage in foreign exchange markets, taking market frictions into consideration. The problem is approached through several steps, refining the computational complexity issue with respect to arbitrage. First, the computational complexity of detecting the existence of an arbitrage opportunity in a general market model is investigated. This problem has a negative answer, since identifying arbitrage turns out to be NP-complete. Secondly, market models are described for which polynomial time algorithms exist to locate an arbitrage opportunity. It is shown that two important market structure models belong to this category. Thirdly, the impact of futures on the complexity of finding arbitrage is explored. Finally, the minimal number of transactions necessary to bring the exchange system back to some non-arbitrage state is studied. The results show that different exchange systems exhibit different computational complexities for finding arbitrage. The complexity results help to shed new light on how monetary system models are adopted and evolve in reality.

CHAPTER 17 Networks in Finance Anna Nagurney

17.1 Introduction

Finance is concerned with the study of capital flows over space and time in the presence of risk. As a subject, it has benefited from numerous mathematical and engineering tools that have been developed and utilized for the modeling, analysis, and computation of solutions in the present complex economic environment. Indeed, the financial landscape today is characterized by the existence of distinct sectors in economies, the proliferation of new financial instruments, with increasing diversification of portfolios internationally, various transaction costs, the increasing growth of electronic transactions through advances in information technology and, in particular, the Internet, and different types of governmental policy interventions. Hence, rigorous methodological tools that can capture the complexity and richness of financial decision-making today and that can take advantage of powerful computer resources have never been more important and needed for financial quantitative analyses.

In this chapter, the focus is on financial networks as a powerful tool and medium for the modeling, analysis, and solution of a spectrum of financial decision-making problems ranging from portfolio optimization to multi-sector, multi-instrument general financial equilibrium problems, dynamic multi-agent financial problems with intermediation, as well as the financial engineering of the integration of social networks with financial systems.

Throughout history, the emergence and evolution of various physical networks, ranging from transportation and logistical networks to telecommunication networks, and the effects of human decision-making on such networks have given rise to the development of rich theories and scientific methodologies that are network-based (cf. Ford and Fulkerson 1962; Ahuja et al. 1993; Nagurney 1999; Geunes and Pardalos 2003). The novelty of networks is that they are pervasive, providing the fabric of connectivity for our societies and economies, while, methodologically, network theory has developed into a powerful and dynamic medium for

abstracting complex problems, which, at first glance, may not even appear to be networks, with associated nodes, links, and flows.

The topic of networks as a subject of scientific inquiry originated in the paper by (Euler 1736), which is credited with being the earliest paper on graph theory. A graph in this setting means, mathematically, a way of abstractly representing a system in terms of vertices (or nodes) and edges (or arcs, equivalently, links) connecting various pairs of vertices. Euler was interested in determining whether it was possible to stroll around Königsberg (later called Kaliningrad) crossing each of the seven bridges over the River Pregel exactly once. The problem was represented as a graph in which the vertices corresponded to land masses and the edges to bridges.

(Quesnay 1758), in his Tableau Economique, conceptualized the circular flow of financial funds in an economy as a network and this work can be identified as the first paper on the topic of financial networks. Quesnay’s basic idea has been utilized in the construction of financial flow of funds accounts, which are a statistical description of the flows of money and credit in an economy (see Cohen 1987). The concept of a network in economics, in turn, was implicit as early as the classical work of (Cournot 1838), who not only seems to have first explicitly stated that a competitive price is determined by the intersection of supply and demand curves, but had done so in the context of two spatially separated markets in which the cost associated with transporting the goods was also included. (Pigou 1920) studied a network system in the form of a transportation network consisting of two routes and noted that the decision-making behavior of the users of such a system would lead to different flow patterns. Hence, the network of concern therein consists of the graph, which is directed, with the edges or links represented by arrows, as well as the resulting flows on the links.

(Copeland 1952) recognized the conceptualization of the interrelationships among financial funds as a network and asked the question, “Does money flow like water or electricity?” Moreover, he provided a “wiring diagram for the main money circuit.” Kirchhoff is credited with pioneering the field of electrical engineering by being the first to have systematically analyzed electrical circuits and with providing the foundations for the principal ideas of network flow theory. Interestingly, (Enke 1951) had proposed electronic circuits as a means of solving spatial price equilibrium problems, in which goods are produced, consumed, and traded, in the presence of transportation costs. Such analog computational devices were soon to be superseded by digital computers along with advances in computational methodologies, that is, algorithms, based on mathematical programming.

In this chapter, we further elaborate upon historical breakthroughs in the use of networks for the formulation, analysis, and solution of financial problems. Such a perspective allows one to trace the methodological developments as well as the applications of financial networks and provides a platform upon which further innovations can be made. Methodological tools that will be utilized to formulate and solve the financial network problems in this chapter are drawn from optimization, variational inequalities, as well as projected dynamical systems theory. We begin with a discussion of financial optimization problems

within a network context and then turn to a range of financial network equilibrium problems.

17.2 Financial Optimization Problems

Network models have been proposed for a wide variety of financial problems characterized by a single objective function to be optimized, as in portfolio optimization and asset allocation problems, currency translation, and risk management problems, among others. This literature is now briefly overviewed with the emphasis on the innovative work of (Markowitz 1952, 1959) that established a new era in financial economics and became the basis for many financial optimization models that exist and are used to this day. Although many financial optimization problems (including Markowitz’s) had an underlying network structure, and the advantages of network programming were becoming increasingly evident (cf. Charnes and Cooper 1958), not many financial network optimization models were developed until some time later. Some exceptions are several early models due to (Charnes and Miller 1957) and (Charnes and Cooper 1961). It was not until the last years of the 1960s and the first years of the 1970s that the network setting started to be extensively used for financial applications.

Among the first financial network optimization models that appear in the literature were a series of currency translation models. (Rutenberg 1970) suggested that the translation among different currencies could be performed through the use of arc multipliers. Rutenberg’s network model was multiperiod with linear costs on the arcs (a characteristic common to the earlier financial network models). Each node of such a generalized network represented a particular currency in a specific period, and the flow on an arc represented the amount of cash moving from one period and/or currency to another. (Christofides et al. 1979) and (Shapiro and Rutenberg 1976), among others, introduced related financial network models. In most of these models, the currency prices were determined according to the amount of capital (network flow) that was moving from one currency (node) to the other.
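The role of arc multipliers in such a generalized network can be illustrated with a toy example in Python. All numbers below are hypothetical and are not taken from Rutenberg's model; the sketch only shows that the amount arriving at the head of an arc equals the amount sent from its tail times the arc's multiplier, so that a path through (currency, period) nodes multiplies the flow by the product of its multipliers. Arc costs are not modeled.

```python
from math import prod

# Arc multipliers: units arriving at the head node per unit sent from the tail node.
# Nodes are (currency, period) pairs; all rates are hypothetical.
ARCS = {
    (("USD", 1), ("EUR", 1)): 0.80,  # exchange USD for EUR in period 1
    (("EUR", 1), ("EUR", 2)): 1.01,  # hold EUR from period 1 to period 2
    (("USD", 1), ("USD", 2)): 1.02,  # hold USD from period 1 to period 2
    (("USD", 2), ("EUR", 2)): 0.79,  # exchange USD for EUR in period 2
}


def delivered(amount, path):
    """Amount arriving at the last node of a path of (currency, period) nodes."""
    return amount * prod(ARCS[(tail, head)] for tail, head in zip(path, path[1:]))


# Two ways of turning 100 USD in period 1 into EUR in period 2.
convert_first = delivered(100.0, [("USD", 1), ("EUR", 1), ("EUR", 2)])
convert_later = delivered(100.0, [("USD", 1), ("USD", 2), ("EUR", 2)])
```

Comparing `convert_first` with `convert_later` is the kind of choice that a network optimization model makes over all arcs simultaneously rather than path by path.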

Figure 17.1. Network structure of classical portfolio optimization

(Barr 1972) and (Srinivasan 1974) used networks to formulate a series of cash management problems, with a major contribution being the introduction by (Crum 1976) of a generalized linear network model for the cash management of a multinational firm. The links in the network represented possible cash flow patterns and the multipliers incorporated costs, fees, liquidity changes, and exchange rates. A series of related cash management problems were modeled as network problems in subsequent years by (Crum and Nye 1981), (Crum et al. 1983), and others. These papers further extended the applicability of network programming in financial applications. The focus was on linear network flow problems in which the cost on an arc was a linear function of the flow. (Crum, Klingman, and Tavis 1979), in turn, demonstrated how contemporary financial capital allocation problems could be modeled as an integer generalized network problem, in which the flows on particular arcs were forced to be integers. In many financial network optimization problems the objective function must be nonlinear due to the modeling of the risk function and, hence, typically, such financial problems lie in the domain of nonlinear, rather than linear, network flow problems. (Mulvey 1987) presented a collection of nonlinear financial network models that were based on previous cash flow and portfolio models in which the original authors (see e. g. Rudd and Rosenberg 1979 and Soenen 1979) did not realize, and, thus, did not exploit the underlying network structure. Mulvey also recognized that the (Markowitz 1952, 1959) mean-variance minimization problem was, in fact, a network optimization problem with a nonlinear objective function. The classical Markowitz models are now reviewed and cast into the framework of network optimization problems. See Figure 17.1 for the network structure of such problems. 
Additional financial network optimization models and associated references can be found in (Nagurney and Siokos 1997) and in the volume edited by (Nagurney 2003). Markowitz’s model was based on mean-variance portfolio selection, where the average and the variability of portfolio returns were determined in terms of the mean and covariance of the corresponding investments. The mean is a measure of the average return and the variance is a measure of the dispersion of the returns around the mean return. Markowitz formulated the portfolio optimization problem as one of risk minimization, with the objective function:

$$\text{Minimize } V = X^T Q X \tag{17.1}$$

subject to constraints representing, respectively, the attainment of a specific return, a budget constraint, and that no short sales were allowed, given by:

$$R = \sum_{i=1}^{n} X_i r_i \tag{17.2}$$

$$\sum_{i=1}^{n} X_i = 1 \tag{17.3}$$

$$X_i \geq 0, \quad i = 1, \ldots, n. \tag{17.4}$$

17 Networks in Finance

Here $n$ denotes the total number of securities available in the economy, $X_i$ represents the relative amount of capital invested in security $i$, with the securities being grouped into the column vector $X$, $Q$ denotes the $n \times n$ variance-covariance matrix on the return of the portfolio, $r_i$ denotes the expected value of the return of security $i$, and $R$ denotes the expected rate of return on the portfolio. Within a network context (cf. Figure 17.1), the links correspond to the securities, with their relative amounts $X_1, \ldots, X_n$ corresponding to the flows on the respective links: $1, \ldots, n$. The budget constraint and the nonnegativity assumption on the flows are the network conservation of flow equations. Since the objective function is that of risk minimization, it can be interpreted as the sum of the costs on the $n$ links in the network. Observe that the network representation is abstract and does not correspond (as in the case of transportation and telecommunication) to physical locations and links.

Markowitz suggested that, for a fixed set of expected values $r_i$ and covariances of the returns of all assets $i$ and $j$, every investor can find an $(R, V)$ combination that best fits his taste, solely limited by the constraints of the specific problem. Hence, according to the original work of (Markowitz 1952), the efficient frontier had to be identified and then every investor had to select a portfolio through a mean-variance analysis that fitted his preferences. A related mathematical optimization model (see Markowitz 1959) to the one above, which can be interpreted as the investor seeking to maximize his returns while minimizing his risk, can be expressed by the quadratic programming problem:

$$\text{Maximize } \alpha R - (1 - \alpha) V \tag{17.5}$$

subject to:

$$\sum_{i=1}^{n} X_i = 1 \tag{17.6}$$

$$X_i \geq 0, \quad i = 1, \ldots, n, \tag{17.7}$$

where $\alpha$ denotes an indicator of how risk-averse a specific investor is. This model is also a network optimization problem with the network as depicted in Figure 17.1, with Equations (17.6) and (17.7) again representing a conservation of flow equation. A collection of versions and extensions of Markowitz’s model can be found in (Francis and Archer 1979), with $\alpha = 1/2$ being a frequently accepted value. A recent interpretation of the model as a multicriteria decision-making model along with theoretical extensions to multiple sectors can be found in (Dong and Nagurney 2001), where additional references are available. References to multicriteria decision-making and financial applications can also be found in (Doumpos, Zopounidis, and Pardalos 2000).

A segment of the optimization literature on financial networks has focused on variables that are stochastic and have to be treated as random variables in the optimization procedure. Clearly, since most financial optimization problems are of large size, the incorporation of stochastic variables made the problems more complicated and difficult to model and compute. (Mulvey 1987) and (Mulvey and Vladimirou 1989, 1991), among others, studied stochastic financial networks, utilizing a series of different theories and techniques (e.g., purchase power priority, arbitrage theory, scenario aggregation) that were then utilized for the estimation of the stochastic elements in the network in order to be able to represent them as a series of deterministic equivalents. The large size and the computational complexity of stochastic networks, at times, limited their usage to specially structured problems where general computational techniques and algorithms could be applied. See (Rudd and Rosenberg 1979, Wallace 1986, Rockafellar and Wets 1991, and Mulvey et al. 2003) for a more detailed discussion on aspects of realistic portfolio optimization and implementation issues related to stochastic financial networks.
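As an illustration of the weighted mean-variance problem (17.5) – (17.7), the following minimal Python sketch maximizes $\alpha R - (1 - \alpha) V$ over the budget constraint and the nonnegativity constraints by projected gradient ascent. This technique is a generic stand-in, not one of the network algorithms from the literature cited above, and the three-security data (`r`, `Q`), the function names, and the iteration count are assumptions chosen only for illustration.

```python
import numpy as np

def project_simplex(v, s=1.0):
    """Euclidean projection of v onto {x >= 0, sum(x) = s}."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - s
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def objective(X, r, Q, alpha):
    """alpha * R - (1 - alpha) * V, with R = r'X and V = X'QX."""
    return alpha * (r @ X) - (1.0 - alpha) * (X @ Q @ X)

def mean_variance_portfolio(r, Q, alpha=0.5, iters=2000):
    """Projected gradient ascent for (17.5)-(17.7); assumes alpha < 1."""
    n = r.size
    X = np.full(n, 1.0 / n)                   # start from equal weights
    # Step 1/L, where L bounds the curvature of the concave objective.
    step = 1.0 / (2.0 * (1.0 - alpha) * np.linalg.eigvalsh(Q).max())
    for _ in range(iters):
        grad = alpha * r - 2.0 * (1.0 - alpha) * (Q @ X)
        X = project_simplex(X + step * grad)  # budget and no-short-sales
    return X

# Hypothetical data: expected returns and a positive definite covariance matrix.
r = np.array([0.08, 0.12, 0.15])
Q = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
X = mean_variance_portfolio(r, Q, alpha=0.5)
print(np.round(X, 4), round(objective(X, r, Q, 0.5), 5))
```

Raising `alpha` toward 1 puts more weight on expected return and less on variance, so the resulting weights shift toward the securities with higher `r`.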

17.3 General Financial Equilibrium Problems We now turn to networks and their utilization for the modeling and analysis of financial systems in which there is more than a single decision-maker, in contrast to the above financial optimization problems. It is worth noting that (Quesnay 1758) actually considered a financial system as a network. (Thore 1969) introduced networks, along with the mathematics, for the study of systems of linked portfolios. His work benefited from that of (Charnes and Cooper 1967) who demonstrated that systems of linked accounts could be represented as a network, where the nodes depict the balance sheets and the links depict the credit and debit entries. Thore considered credit networks, with the explicit goal of providing a tool for use in the study of the propagation of money and credit streams in an economy, based on a theory of the behavior of banks and other financial institutions. The credit network recognized that these sectors interact and its solution made use of linear programming. (Thore 1970) extended the basic network model to handle holdings of financial reserves in the case of uncertainty. The approach utilized two-stage linear programs under uncertainty introduced by (Ferguson and Dantzig 1956) and (Dantzig and Madansky 1961). See (Fei 1960) for a graph theoretic approach to the credit system. More recently, (Boginski et al. 2003) presented a detailed study of the stock market graph, yielding a new tool for the analysis of market structure through the classification of stocks into different groups, along with an application to the US stock market. (Storoy et al. 1975), in turn, developed a network representation of the interconnection of capital markets and demonstrated how decomposition theory of mathematical programming could be exploited for the computation of equilibrium. The utility functions facing a sector were no longer restricted to being linear functions. 
(Thore 1980) further investigated network models of linked portfolios, financial intermediation, and decentralization/decomposition theory. However, the computational techniques at that time were not sufficiently well-developed to handle such problems in practice.

(Thore 1984) later proposed an international financial network for the Euro dollar market and viewed it as a logistical system, exploiting the ideas of (Samuelson 1952) and (Takayama and Judge 1971) for spatial price equilibrium problems. In this paper, as in Thore’s preceding papers on financial networks, the microbehavioral unit consisted of the individual bank, savings and loan, or other financial intermediary and the portfolio choices were described in some optimizing framework, with the portfolios being linked together into a network with a separate portfolio visualized as a node and assets and liabilities as directed links. The above contributions focused on the use and application of networks for the study of financial systems consisting of multiple economic decision-makers. In such systems, equilibrium was a central concept, along with the role of prices in the equilibrating mechanism. Rigorous approaches that characterized the formulation of equilibrium and the corresponding price determination were greatly influenced by the Arrow-Debreu economic model (cf. Arrow 1951, Debreu 1951). In addition, the importance of the inclusion of dynamics in the study of such systems was explicitly emphasized (see, also, Thore and Kydland 1972). The first use of finite-dimensional variational inequality theory for the computation of multi-sector, multi-instrument financial equilibria is due to (Nagurney et al. 1992), who recognized the network structure underlying the subproblems encountered in their proposed decomposition scheme. (Hughes and Nagurney 1992 and Nagurney and Hughes 1992) had, in turn, proposed the formulation and solution of estimation of financial flow of funds accounts as network optimization problems. Their proposed optimization scheme fully exploited the special network structure of these problems. 
(Nagurney and Siokos 1997) then developed an international financial equilibrium model utilizing finite-dimensional variational inequality theory for the first time in that framework. Finite-dimensional variational inequality theory is a powerful unifying methodology in that it contains, as special cases, such mathematical programming problems as: nonlinear equations, optimization problems, and complementarity problems. To illustrate this methodology and its application in general financial equilibrium modeling and computation, we now present a multi-sector, multi-instrument model and an extension due to (Nagurney et al. 1992) and (Nagurney 1994), respectively. For additional references to variational inequalities in finance, along with additional theoretical foundations, see (Nagurney and Siokos 1997) and (Nagurney 2001, 2003).

17.3.1 A Multi-Sector, Multi-Instrument Financial Equilibrium Model

Recall the classical mean-variance model presented in the preceding section, which is based on the pioneering work of (Markowitz 1959). Now, however, assume that there are $m$ sectors, each of which seeks to maximize his return and, at the same time, to minimize the risk of his portfolio, subject to the balance accounting and nonnegativity constraints. Examples of sectors include: households, businesses, state and local governments, banks, etc. Denote a typical sector by $j$ and assume that there are liabilities in addition to assets held by each sector. Denote the volume of instrument $i$ that sector $j$ holds as an asset by $X_i^j$, and group the (nonnegative) assets in the portfolio of sector $j$ into the column vector $X^j \in R_+^n$. Further, group the assets of all sectors in the economy into the column vector $X \in R_+^{mn}$. Similarly, denote the volume of instrument $i$ that sector $j$ holds as a liability by $Y_i^j$, and group the (nonnegative) liabilities in the portfolio of sector $j$ into the column vector $Y^j \in R_+^n$. Finally, group the liabilities of all sectors in the economy into the column vector $Y \in R_+^{mn}$. Let $r_i$ denote the nonnegative price of instrument $i$ and group the prices of all the instruments into the column vector $r \in R_+^n$.

It is assumed that the total volume of each balance sheet side of each sector is exogenous. Recall that a balance sheet is a financial report that demonstrates the status of a company’s assets, liabilities, and the owner’s equity at a specific point of time. The left-hand side of a balance sheet contains the assets that a sector holds at a particular point of time, whereas the right-hand side accommodates the liabilities and owner’s equity held by that sector at the same point of time. According to accounting principles, the sum of all assets is equal to the sum of all the liabilities and the owner’s equity. Here, the term “liabilities” is used in its general form and, hence, also includes the owner’s equity. Let $S^j$ denote the financial volume held by sector $j$. Finally, assume that the sectors under consideration act in a perfectly competitive environment.
17.3.1.1 A Sector’s Portfolio Optimization Problem

Recall that in the mean-variance approach for portfolio optimization, the minimization of a portfolio’s risk is performed through the use of the variance-covariance matrix. Hence, the portfolio optimization problem for each sector $j$ is the following:

$$\text{Minimize } \begin{pmatrix} X^j \\ Y^j \end{pmatrix}^T Q^j \begin{pmatrix} X^j \\ Y^j \end{pmatrix} - \sum_{i=1}^{n} r_i \left( X_i^j - Y_i^j \right) \tag{17.8}$$

subject to:

$$\sum_{i=1}^{n} X_i^j = S^j \tag{17.9}$$

$$\sum_{i=1}^{n} Y_i^j = S^j \tag{17.10}$$

$$X_i^j \geq 0, \quad Y_i^j \geq 0, \quad i = 1, 2, \ldots, n, \tag{17.11}$$

where $Q^j$ is a symmetric $2n \times 2n$ variance-covariance matrix associated with the assets and liabilities of sector $j$. Moreover, since $Q^j$ is a variance-covariance matrix, one can assume that it is positive definite and, as a result, the objective function of each sector’s portfolio optimization problem, given by the above, is strictly convex. Partition the symmetric matrix $Q^j$ as

$$Q^j = \begin{pmatrix} Q_{11}^j & Q_{12}^j \\ Q_{21}^j & Q_{22}^j \end{pmatrix},$$

where $Q_{11}^j$ and $Q_{22}^j$ are the variance-covariance matrices for only the assets and only the liabilities, respectively, of sector $j$. These submatrices are each of dimension $n \times n$. The submatrices $Q_{12}^j$ and $Q_{21}^j$, in turn, are identical since $Q^j$ is symmetric. They are also of dimension $n \times n$. These submatrices are, in fact, the symmetric variance-covariance matrices between the assets and the liabilities of sector $j$. Denote the $i$-th column of matrix $Q_{(\alpha\beta)}^j$ by $Q_{(\alpha\beta)i}^j$, where $\alpha$ and $\beta$ can take on the values of 1 and/or 2.

17.3.1.2 Optimality Conditions

The necessary and sufficient conditions for an optimal portfolio for sector $j$ are that the vector of assets and liabilities, $(X^{j*}, Y^{j*}) \in K^j$, where $K^j$ denotes the feasible set for sector $j$, given by (17.9) – (17.11), satisfies the following system of equalities and inequalities: for each instrument $i$; $i = 1, \ldots, n$, we must have that:

$$2 (Q_{(11)i}^j)^T \cdot X^{j*} + 2 (Q_{(21)i}^j)^T \cdot Y^{j*} - r_i^* - \mu_1^j \geq 0,$$

$$2 (Q_{(22)i}^j)^T \cdot Y^{j*} + 2 (Q_{(12)i}^j)^T \cdot X^{j*} + r_i^* - \mu_2^j \geq 0,$$

$$X_i^{j*} \left[ 2 (Q_{(11)i}^j)^T \cdot X^{j*} + 2 (Q_{(21)i}^j)^T \cdot Y^{j*} - r_i^* - \mu_1^j \right] = 0,$$

$$Y_i^{j*} \left[ 2 (Q_{(22)i}^j)^T \cdot Y^{j*} + 2 (Q_{(12)i}^j)^T \cdot X^{j*} + r_i^* - \mu_2^j \right] = 0,$$

where $\mu_1^j$ and $\mu_2^j$ are the Lagrange multipliers associated with the accounting constraints, (17.9) and (17.10), respectively. Let $K$ denote the feasible set for all the asset and liability holdings of all the sectors and all the prices of the instruments, where $K \equiv \{\bar{K} \times R_+^n\}$ and $\bar{K} \equiv \prod_{j=1}^{m} K^j$. The network structure of the sectors’ optimization problems is depicted in Figure 17.2.
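For readers who want the intermediate step, the optimality conditions of this subsection can be recovered from the Lagrangian of problem (17.8) – (17.11). The sketch below treats the prices $r_i$ as given to the sector and uses the multipliers $\mu_1^j$ and $\mu_2^j$ from the text; the symbol $\mathcal{L}^j$ is introduced here only for illustration.

```latex
\begin{align*}
\mathcal{L}^j &= \begin{pmatrix} X^j \\ Y^j \end{pmatrix}^T Q^j \begin{pmatrix} X^j \\ Y^j \end{pmatrix}
  - \sum_{i=1}^{n} r_i \left( X_i^j - Y_i^j \right)
  - \mu_1^j \Big( \sum_{i=1}^{n} X_i^j - S^j \Big)
  - \mu_2^j \Big( \sum_{i=1}^{n} Y_i^j - S^j \Big), \\
\frac{\partial \mathcal{L}^j}{\partial X_i^j}
  &= 2 (Q_{(11)i}^j)^T \cdot X^j + 2 (Q_{(21)i}^j)^T \cdot Y^j - r_i - \mu_1^j, \\
\frac{\partial \mathcal{L}^j}{\partial Y_i^j}
  &= 2 (Q_{(22)i}^j)^T \cdot Y^j + 2 (Q_{(12)i}^j)^T \cdot X^j + r_i - \mu_2^j .
\end{align*}
```

Because the holdings are constrained to be nonnegative, each partial derivative must be nonnegative at the optimum and must vanish whenever the corresponding holding is positive, which is the content of the two inequalities and the two complementarity equalities.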

Figure 17.2. Network structure of the sectors’ optimization problems

17.3.1.3 Economic System Conditions

The economic system conditions, which relate the supply and demand of each financial instrument and the instrument prices, are given by: for each instrument $i$; $i = 1, \ldots, n$, an equilibrium asset, liability, and price pattern, $(X^*, Y^*, r^*) \in K$, must satisfy:

$$\sum_{j=1}^{m} \left( X_i^{j*} - Y_i^{j*} \right) \begin{cases} = 0, & \text{if } r_i^* > 0 \\ \geq 0, & \text{if } r_i^* = 0. \end{cases} \tag{17.12}$$

The definition of financial equilibrium is now presented along with the variational inequality formulation. For the derivation, see (Nagurney et al. 1992) and (Nagurney and Siokos 1997). Combining the above optimality conditions for each sector with the economic system conditions for each instrument, we have the following definition of equilibrium.

Definition 17.1: Multi-Sector, Multi-Instrument Financial Equilibrium
A vector $(X^*, Y^*, r^*) \in K$ is an equilibrium of the multi-sector, multi-instrument financial model if and only if it satisfies the optimality conditions and the economic system conditions (17.12), for all sectors $j$; $j = 1, \ldots, m$, and for all instruments $i$; $i = 1, \ldots, n$, simultaneously.

The variational inequality formulation of the equilibrium conditions, due to (Nagurney et al. 1992), is given by:

Theorem 17.1: Variational Inequality Formulation for the Quadratic Model
A vector of assets and liabilities of the sectors, and instrument prices, $(X^*, Y^*, r^*) \in K$, is a financial equilibrium if and only if it satisfies the variational inequality problem:

$$\begin{aligned}
&\sum_{j=1}^{m} \sum_{i=1}^{n} \left[ 2 (Q_{(11)i}^j)^T \cdot X^{j*} + 2 (Q_{(21)i}^j)^T \cdot Y^{j*} - r_i^* \right] \times \left[ X_i^j - X_i^{j*} \right] \\
&+ \sum_{j=1}^{m} \sum_{i=1}^{n} \left[ 2 (Q_{(22)i}^j)^T \cdot Y^{j*} + 2 (Q_{(12)i}^j)^T \cdot X^{j*} + r_i^* \right] \times \left[ Y_i^j - Y_i^{j*} \right] \\
&+ \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ X_i^{j*} - Y_i^{j*} \right] \times \left[ r_i - r_i^* \right] \geq 0, \quad \forall (X, Y, r) \in K.
\end{aligned} \tag{17.13}$$

For completeness, the standard form of the variational inequality is now presented. For additional background, see (Nagurney 1999). Define the $N$-dimensional column vector $Z \equiv (X, Y, r) \in K$, and the $N$-dimensional column vector $F(Z)$ such that:

$$F(Z) \equiv D \begin{pmatrix} X \\ Y \\ r \end{pmatrix}, \quad \text{where} \quad D = \begin{pmatrix} 2Q & B \\ -B^T & 0 \end{pmatrix},$$

$$Q = \begin{pmatrix}
Q_{11}^1 & & & Q_{21}^1 & & \\
 & \ddots & & & \ddots & \\
 & & Q_{11}^m & & & Q_{21}^m \\
Q_{12}^1 & & & Q_{22}^1 & & \\
 & \ddots & & & \ddots & \\
 & & Q_{12}^m & & & Q_{22}^m
\end{pmatrix}_{2mn \times 2mn},$$

$$B^T = \begin{pmatrix} -I & \cdots & -I & I & \cdots & I \end{pmatrix}_{n \times 2mn},$$

and $I$ is the $n \times n$-dimensional identity matrix. It is clear that variational inequality problem (17.13) can be put into standard variational inequality form: determine $Z^* \in K$, satisfying:

$$\left\langle F(Z^*)^T, Z - Z^* \right\rangle \geq 0, \quad \forall Z \in K. \tag{17.14}$$
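As a check on the standard form, the product defining $F(Z)$ can be written out block by block. The block labels $F_{X^j}$, $F_{Y^j}$, and $F_r$ are notation introduced here for illustration, and the expansion relies on the text's observation that the off-diagonal blocks of each sector's matrix coincide. With $m$ sectors and $n$ instruments, the dimension is $N = 2mn + n$.

```latex
\begin{align*}
F_{X^j}(Z) &= 2 Q_{11}^j X^j + 2 Q_{21}^j Y^j - r, && j = 1, \ldots, m, \\
F_{Y^j}(Z) &= 2 Q_{22}^j Y^j + 2 Q_{12}^j X^j + r, && j = 1, \ldots, m, \\
F_{r}(Z)   &= \sum_{j=1}^{m} \left( X^j - Y^j \right), \\[4pt]
\left\langle F(Z^*)^T, Z - Z^* \right\rangle
  &= \sum_{j=1}^{m} F_{X^j}(Z^*)^T (X^j - X^{j*})
   + \sum_{j=1}^{m} F_{Y^j}(Z^*)^T (Y^j - Y^{j*})
   + F_{r}(Z^*)^T (r - r^*).
\end{align*}
```

Written out componentwise, the three terms on the right-hand side are the three double sums of the variational inequality for the quadratic model.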

17.3.2 Model with Utility Functions

The above model is a special case of the financial equilibrium model due to Nagurney (1994) in which each sector $j$ seeks to maximize his utility function,

$$U^j(X^j, Y^j, r) = u^j(X^j, Y^j) + r^T \cdot (X^j - Y^j),$$

which, in turn, is a special case of the model with a sector $j$’s utility function given by the general form: $U^j(X^j, Y^j, r)$. Interestingly, it has been shown by (Nagurney and Siokos 1997) that, in the case of utility functions of the form $U^j(X^j, Y^j, r) = u^j(X^j, Y^j) + r^T \cdot (X^j - Y^j)$, of which the above described quadratic model is an example, one can obtain the solution to the above variational inequality problem by solving the optimization problem:

$$\text{Maximize } \sum_{j=1}^{m} u^j(X^j, Y^j) \tag{17.15}$$

subject to:

$$\sum_{j=1}^{m} \left( X_i^j - Y_i^j \right) = 0, \quad i = 1, \ldots, n \tag{17.16}$$

$$(X^j, Y^j) \in K^j, \quad j = 1, \ldots, m, \tag{17.17}$$

with Lagrange multiplier $r_i^*$ associated with the $i$-th “market clearing” constraint (17.16). Moreover, this optimization problem is actually a network optimization problem as revealed in (Nagurney and Siokos 1997). The structure of the financial system in equilibrium is as depicted in Figure 17.3.
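To see why the quadratic model fits this form, and why the multipliers of (17.16) act as prices, the following short derivation may help. It assumes the sign convention in which the multiplier $r_i$ is added to the objective; the symbol $\mathcal{L}$ is introduced here only for illustration.

```latex
\begin{align*}
u^j(X^j, Y^j) &= - \begin{pmatrix} X^j \\ Y^j \end{pmatrix}^T Q^j \begin{pmatrix} X^j \\ Y^j \end{pmatrix}
  \quad \text{(maximizing } U^j \text{ is then minimizing (17.8))}, \\
\mathcal{L}(X, Y, r) &= \sum_{j=1}^{m} u^j(X^j, Y^j)
  + \sum_{i=1}^{n} r_i \sum_{j=1}^{m} \left( X_i^j - Y_i^j \right)
  = \sum_{j=1}^{m} U^j(X^j, Y^j, r).
\end{align*}
```

For fixed multipliers the Lagrangian therefore separates into the $m$ individual sector problems, each taking the prices as given.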

Figure 17.3. The network structure at equilibrium

17.3.3 Computation of Financial Equilibria

In this section, an algorithm for the computation of solutions to the above financial equilibrium problems is recalled. The algorithm is the modified projection method of (Korpelevich 1977). The advantage of this computational method in the context of the general financial equilibrium problems is that the original problem can be decomposed into a series of smaller and simpler subproblems of network structure, each of which can then be solved explicitly and in closed form. The realization of the modified projection method for the solution of the financial equilibrium problems with general utility functions is then presented. The modified projection method can be expressed as:

Step 0: Initialization
Select $Z^0 \in K$. Let $\tau := 0$ and let $\gamma$ be a scalar such that $0 < \gamma \leq \frac{1}{L}$, where $L$ is the Lipschitz constant (see Nagurney and Siokos (1997)).

Step 1: Computation
Compute $\bar{Z}^\tau$ by solving the variational inequality subproblem:

$$\left\langle (\bar{Z}^\tau + \gamma F(Z^\tau)^T - Z^\tau)^T, Z - \bar{Z}^\tau \right\rangle \geq 0, \quad \forall Z \in K. \tag{17.18}$$

Step 2: Adaptation
Compute $Z^{\tau+1}$ by solving the variational inequality subproblem:

$$\left\langle (Z^{\tau+1} + \gamma F(\bar{Z}^\tau)^T - Z^\tau)^T, Z - Z^{\tau+1} \right\rangle \geq 0, \quad \forall Z \in K. \tag{17.19}$$

Step 3: Convergence Verification
If $\max_b |Z_b^{\tau+1} - Z_b^\tau| \leq \varepsilon$, for all $b$, with $\varepsilon > 0$, a prespecified tolerance, then stop; else, set $\tau := \tau + 1$, and go to Step 1.

An interpretation of the modified projection method as an adjustment process, given by (Nagurney 1999), is now provided. In particular, at an iteration, the sectors in the economy receive all the price information on every instrument from the previous iteration. They then allocate their capital according to their preferences. The market reacts to the decisions of the sectors and derives new instrument prices. The sectors then improve upon their positions through the adaptation step, whereas the market also adjusts during the adaptation step. This process continues until no one can improve upon his position, and the equilibrium is reached, that is, the above variational inequality is satisfied with the computed asset, liability, and price pattern.

The financial optimization problems in the computation step and in the adaptation step are equivalent to separable quadratic programming problems of special network structure. Each of these network subproblems can then be solved, at an iteration, simultaneously and exactly in closed form. The exact equilibration algorithm (see, e.g., Nagurney and Siokos 1997) can be applied for the solution of the asset and liability subproblems, whereas the prices can be obtained using explicit formulae. A numerical example is now presented for illustrative purposes and solved using the modified projection method, embedded with the exact equilibration algorithm. For further background, see (Nagurney and Siokos 1997).
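The steps of the modified projection method can be sketched in Python for the quadratic model, using the two-sector, three-instrument data of Example 1. This is a minimal sketch under stated assumptions, not the FORTRAN implementation referred to in the text: each asset and liability subproblem is solved with a sort-based Euclidean projection onto the scaled simplex in place of the exact equilibration algorithm, prices are projected onto the nonnegative orthant, and the step size `gamma=0.15`, the tolerance, and all function names are choices made here for illustration.

```python
import numpy as np

def project_simplex(v, s):
    """Euclidean projection of v onto {x >= 0, sum(x) = s}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - s
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def F(X, Y, r, Qs):
    """F(Z) for the quadratic model; row j of X, Y belongs to sector j."""
    n = r.size
    FX, FY = np.empty_like(X), np.empty_like(Y)
    for j, Q in enumerate(Qs):
        g = 2.0 * Q @ np.concatenate([X[j], Y[j]])  # gradient of the risk term
        FX[j] = g[:n] - r
        FY[j] = g[n:] + r
    return FX, FY, (X - Y).sum(axis=0)              # last block: excess supply

def project(X, Y, r, S):
    """Projection onto K: one scaled simplex per balance sheet side, r >= 0."""
    PX = np.array([project_simplex(x, s) for x, s in zip(X, S)])
    PY = np.array([project_simplex(y, s) for y, s in zip(Y, S)])
    return PX, PY, np.maximum(r, 0.0)

def modified_projection(Qs, S, gamma=0.15, tol=1e-8, max_iter=20000):
    n = Qs[0].shape[0] // 2
    X = np.array([[s / n] * n for s in S], dtype=float)  # equal split of S^j
    Y = X.copy()
    r = np.ones(n)                                       # r_i^0 = 1
    for it in range(1, max_iter + 1):
        FX, FY, Fr = F(X, Y, r, Qs)                      # Step 1: computation
        Xb, Yb, rb = project(X - gamma * FX, Y - gamma * FY, r - gamma * Fr, S)
        FX, FY, Fr = F(Xb, Yb, rb, Qs)                   # Step 2: adaptation
        Xn, Yn, rn = project(X - gamma * FX, Y - gamma * FY, r - gamma * Fr, S)
        change = max(np.abs(Xn - X).max(), np.abs(Yn - Y).max(),
                     np.abs(rn - r).max())
        X, Y, r = Xn, Yn, rn
        if change <= tol:                                # Step 3: convergence
            break
    return X, Y, r, it

Z3 = np.zeros((3, 3))
A1 = np.array([[1, .25, .3], [.25, 1, .1], [.3, .1, 1]])  # sector 1 assets
L1 = np.array([[1, .2, .3], [.2, 1, .5], [.3, .5, 1]])    # sector 1 liabilities
A2 = np.array([[1, 0, .3], [0, 1, .2], [.3, .2, 1]])      # sector 2 assets
L2 = np.array([[1, .5, 0], [.5, 1, .2], [0, .2, 1]])      # sector 2 liabilities
Q1 = np.block([[A1, Z3], [Z3, L1]])
Q2 = np.block([[A2, Z3], [Z3, L2]])

X, Y, r, iters = modified_projection([Q1, Q2], [1.0, 2.0])
print(iters, np.round(X, 5), np.round(Y, 5), np.round(r, 5), sep="\n")
```

The sketch uses a smaller step size and a tighter tolerance than the run reported in Example 1, so its output need not agree with the reported figures digit for digit. In addition, since each sector's assets and liabilities both sum to $S^j$, adding a common constant to all prices leaves every sector's objective unchanged on its feasible set, so the price levels returned depend on the starting prices.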

17.3.3.1 Example 1: A Numerical Example

Assume that there are two sectors in the economy and three financial instruments. Assume that the “size” of each sector is given by $S^1 = 1$ and $S^2 = 2$. The variance-covariance matrices of the two sectors are:

$$Q^1 = \begin{pmatrix}
1 & .25 & .3 & 0 & 0 & 0 \\
.25 & 1 & .1 & 0 & 0 & 0 \\
.3 & .1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & .2 & .3 \\
0 & 0 & 0 & .2 & 1 & .5 \\
0 & 0 & 0 & .3 & .5 & 1
\end{pmatrix}$$

and

$$Q^2 = \begin{pmatrix}
1 & 0 & .3 & 0 & 0 & 0 \\
0 & 1 & .2 & 0 & 0 & 0 \\
.3 & .2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & .5 & 0 \\
0 & 0 & 0 & .5 & 1 & .2 \\
0 & 0 & 0 & 0 & .2 & 1
\end{pmatrix}.$$

The modified projection method was coded in FORTRAN. The variables were initialized as follows: $r_i^0 = 1$, for all $i$, with the financial volume $S^j$ equally distributed among all the assets and among all the liabilities for each sector $j$. The $\gamma$ parameter was set to 0.35. The convergence tolerance $\varepsilon$ was set to $10^{-3}$. The modified projection method converged in 16 iterations and yielded the following equilibrium pattern:

Equilibrium Prices:

$$r_1^* = .34039, \quad r_2^* = .23805, \quad r_3^* = .42156,$$

Equilibrium Asset Holdings:

$$X_1^{1*} = .27899, \quad X_2^{1*} = .31803, \quad X_3^{1*} = .40298,$$

$$X_1^{2*} = .79662, \quad X_2^{2*} = .60904, \quad X_3^{2*} = .59434,$$

Equilibrium Liability Holdings:

$$Y_1^{1*} = .37081, \quad Y_2^{1*} = .43993, \quad Y_3^{1*} = .18927,$$

$$Y_1^{2*} = .70579, \quad Y_2^{2*} = .48693, \quad Y_3^{2*} = .80729.$$

The above results show that the algorithm yielded optimal portfolios that were feasible. Moreover, the market cleared for each instrument, since the price of each instrument was positive. Other financial equilibrium models, including models with transaction costs, with hedging instruments such as futures and options, as well as, international financial equilibrium models, can be found in (Nagurney and Siokos 1997), and the references therein. Moreover, with projected dynamical systems theory (see the book by Nagurney and Zhang 1996) one can trace the dynamic behavior prior to an equilibrium state (formulated as a variational inequality). In contrast to classical dynamical systems, projected dynamical systems are characterized by a discontinuous right-hand side, with the discontinuity arising due to the constraint set underlying the application in question. Hence, this methodology allows one to model systems dynamically which are subject to limited resources, with a principal constraint in finance being budgetary restrictions. (Dong et al. 1996) were the first to apply the methodology of projected dynamical systems to develop a dynamic multi-sector, multi-instrument financial model, whose set of stationary points coincided with the set of solutions to the variational inequality model developed in (Nagurney 1994); and then to study it qualitatively, providing stability analysis results. In the next section, the methodology of projected dynamical systems is illustrated in the context of a dynamic financial network model with intermediation (cf. Nagurney and Dong 2002).

17.4 Dynamic Financial Networks with Intermediation In this section, dynamic financial networks with intermediation are explored. As noted earlier, the conceptualization of financial systems as networks dates to (Quesnay 1758) who depicted the circular flow of funds in an economy as a network. His basic idea was subsequently applied to the construction of flow of
funds accounts, which are a statistical description of the flows of money and credit in an economy (cf. Board of Governors 1980, Cohen 1987, Nagurney and Hughes 1992). However, since the flow of funds accounts are in matrix form, and, hence, two-dimensional, they fail to capture the dynamic behavior on a micro level of the various financial agents/sectors in an economy, such as banks, households, insurance companies, etc. Furthermore, as noted by the (Board of Governors 1980) on page 6 of that publication, “the generality of the matrix tends to obscure certain structural aspects of the financial system that are of continuing interest in analysis,” with the structural concepts of concern including financial intermediation. (Thore 1980) recognized some of the shortcomings of financial flow of funds accounts and developed network models of linked portfolios with financial intermediation, using decentralization/decomposition theory. Note that intermediation is typically associated with financial businesses, including banks, savings institutions, investment and insurance companies, etc., and the term implies borrowing for the purpose of lending, rather than for nonfinancial purposes. Thore also constructed some basic intertemporal models. However, the intertemporal models were not fully developed and the computational techniques at that time were not sufficiently advanced for computational purposes. In this section, we address the dynamics of the financial economy which explicitly includes financial intermediaries along with the “sources” and “uses” of financial funds. Tools are provided for studying the disequilibrium dynamics as well as the equilibrium state. Also, transaction costs are considered, since they bring a greater degree of realism to the study of financial intermediation. 
Transaction costs had been studied earlier in multi-sector, multi-instrument financial equilibrium models by (Nagurney and Dong 1996 a,b) but without considering the more general dynamic intermediation setting. The dynamic financial network model is now described. The model consists of agents with sources of funds, agents who are intermediaries, as well as agents who are consumers located at the demand markets. Specifically, consider m agents with sources of financial funds, such as households and businesses, involved in the allocation of their financial resources among a portfolio of financial instruments which can be obtained by transacting with distinct n financial intermediaries, such as banks, insurance and investment companies, etc. The financial intermediaries, in turn, in addition to transacting with the source agents, also determine how to allocate the incoming financial resources among distinct uses, as represented by o demand markets with a demand market corresponding to, for example, the market for real estate loans, household loans, or business loans, etc. The financial network with intermediation is now described and depicted graphically in Figure 17.4. The top tier of nodes in Figure 17.4 consists of the agents with sources of funds, with a typical source agent denoted by i and associated with node i . The middle tier of nodes in Figure 17.4 consists of the intermediaries, with a typical intermediary denoted by j and associated with node j in the network. The bottom tier of nodes consists of the demand markets, with a typical demand market denoted by k and corresponding to the node k .

Figure 17.4. The Network Structure of the Financial Economy with Intermediation and with Non-Investment Allowed

For simplicity of notation, assume that there are L financial instruments associated with each intermediary. Hence, from each source of funds node, there are L links connecting such a node with an intermediary node with the l -th such link corresponding to the l -th financial instrument available from the intermediary. In addition, the option of non-investment in the available financial instruments is allowed and to denote this option, construct an additional link from each source node to the middle tier node n + 1 , which represents non-investment. Note that there are as many links connecting each top tier node with each intermediary node as needed to reflect the number of financial instruments available. Also, note that there is an additional abstract node n + 1 with a link connecting each source node to it, which, as shall shortly be shown, will be used to “collect” the financial funds which are not invested. In the model, it is assumed that each source agent has a fixed amount of financial funds. From each intermediary node, construct o links, one to each “use” node or demand market in the bottom tier of nodes in the network to denote the transaction between the intermediary and the consumers at the demand market. Let xijl denote the nonnegative amount of the funds that source i “invests” in financial instrument l obtained from intermediary j . Group the financial flows associated with source agent i , which are associated with the links emanating from the top tier node i to the intermediary nodes in the logistical network, into the column vector xi ∈ R+nL . Assume that each source has, at his disposal, an amount of funds Si and denote the unallocated portion of this amount (and flowing on the link joining node i with node n + 1 ) by si . Group then the xi s of all the source agents into the column vector x ∈ R+mnL . 
Associate a distinct financial product $k$ with each demand market, bottom-tiered node $k$, and let $y_{jk}$ denote the amount of the financial product obtained by consumers at demand market $k$ from intermediary $j$. Group these “consumption” quantities into the column vector $y \in R_+^{no}$. The intermediaries convert the incoming financial flows $x$ into the outgoing financial flows $y$.

The notation for the prices is now given. Note that there will be prices associated with each of the tiers of nodes in the network. Let $\rho_{1ijl}$ denote the price associated with instrument $l$ as quoted by intermediary $j$ to source agent $i$ and group the first tier prices into the column vector $\rho_1 \in R_+^{mnL}$. Also, let $\rho_{2j}$ denote the price charged by intermediary $j$ and group all such prices into the column vector $\rho_2 \in R_+^n$. Finally, let $\rho_{3k}$ denote the price of the financial product at the third or bottom-tiered node $k$ in the network, and group all such prices into the column vector $\rho_3 \in R_+^o$.

We now turn to describing the dynamics by which the source agents adjust the amounts they allocate to the various financial instruments over time, the dynamics by which the intermediaries adjust their transactions, and those by which the consumers obtain the financial products at the demand markets. In addition, the dynamics by which the prices adjust over time are described. The dynamics are derived from the bottom tier of nodes of the network on up since it is assumed that it is the demand for the financial products (and the corresponding prices) that actually drives the economic dynamics. The price dynamics are presented first and then the dynamics underlying the financial flows.

17.4.1 The Demand Market Price Dynamics

We begin by describing the dynamics underlying the prices of the financial products associated with the demand markets (see the bottom-tiered nodes). Assume, as given, a demand function d k , which can depend, in general, upon the entire vector of prices ρ3 , that is, d k = d k ( ρ3 ), ∀k .

(17.20)

Moreover, assume that the rate of change of the price ρ3k , denoted by ρ 3k , is equal to the difference between the demand at the demand market k , as a function of the demand market prices, and the amount available from the intermediaries at the demand market. Hence, if the demand for the product at the demand market (at an instant in time) exceeds the amount available, the price of the financial product at that demand market will increase; if the amount available exceeds the demand at the price, then the price at the demand market will decrease. Furthermore, it is guaranteed that the prices do not become negative. Thus, the dynamics of the price ρ3k associated with the product at demand market k can be expressed as: n ⎧ − d ( ρ ) y jk , 3 k ∑ ⎪ j =1 ⎪ = ρ 3k ⎨ n ⎪max{0, d k ( ρ3 ) − ∑ y jk }, ⎪⎩ j =1

if

ρ 3k > 0 (17.21)

if

ρ3k = 0.
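The piecewise rule (17.21) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the two linear demand functions are borrowed from the numerical examples given later in Section 17.4.13 rather than being part of the general model.

```python
def demand(rho3):
    """Linear demand functions d_k(rho3) from the chapter's numerical examples."""
    return [
        -2.0 * rho3[0] - 1.5 * rho3[1] + 1000.0,
        -2.0 * rho3[1] - 1.5 * rho3[0] + 1000.0,
    ]


def demand_price_rate(rho3, y):
    """Right-hand side of (17.21); y[j][k] is the flow from intermediary j to market k."""
    d = demand(rho3)
    rates = []
    for k, price in enumerate(rho3):
        excess = d[k] - sum(row[k] for row in y)  # demand minus amount available
        # At a zero price only an upward movement is allowed,
        # which keeps the price nonnegative.
        rates.append(excess if price > 0 else max(0.0, excess))
    return rates
```

With zero prices and zero flows, both prices are pushed upward; with a zero price and an oversupplied market, the rate is cut off at zero instead of turning negative.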


17.4.2 The Dynamics of the Prices at the Intermediaries

The prices charged for the financial funds at the intermediaries, in turn, must reflect supply and demand conditions as well (and, as shall be shown shortly, also reflect profit-maximizing behavior on the part of the intermediaries, who seek to determine how much of the financial flows they obtain from the different sources of funds). In particular, assume that the price associated with intermediary $j$, $\rho_{2j}$, and computed at node $j$ lying in the second tier of nodes, evolves over time according to:

$$\dot{\rho}_{2j} = \begin{cases} \sum_{k=1}^{o} y_{jk} - \sum_{i=1}^{m}\sum_{l=1}^{L} x_{ijl}, & \text{if } \rho_{2j} > 0, \\ \max\{0,\ \sum_{k=1}^{o} y_{jk} - \sum_{i=1}^{m}\sum_{l=1}^{L} x_{ijl}\}, & \text{if } \rho_{2j} = 0, \end{cases} \tag{17.22}$$

where $\dot{\rho}_{2j}$ denotes the rate of change of the $j$-th intermediary’s price. Hence, if the amount of the financial funds desired to be transacted by the consumers (at an instant in time) exceeds that available at the intermediary, then the price charged at the intermediary will increase; if the amount available is greater than that desired by the consumers, then the price charged at the intermediary will decrease. As in the case of the demand market prices, it is guaranteed that the prices charged by the intermediaries remain nonnegative.

17.4.3 Precursors to the Dynamics of the Financial Flows

First some preliminaries are needed that will allow the development of the dynamics of the financial flows. In particular, the utility-maximizing behavior of the source agents and that of the intermediaries is now discussed. Assume that each such source agent’s and each intermediary agent’s utility can be defined as a function of the expected future portfolio value, where the expected value of the future portfolio is described by two characteristics: the expected mean value and the uncertainty surrounding the expected mean. Here, the expected mean portfolio value is assumed to be equal to the market value of the current portfolio. Each agent’s uncertainty, or assessment of risk, in turn, is based on a variance-covariance matrix denoting the agent’s assessment of the standard deviation of the prices for each instrument/product. The variance-covariance matrix associated with source agent $i$’s assets is denoted by $Q^i$, is of dimension $nL \times nL$, and is associated with the vector $x_i$, whereas intermediary agent $j$’s variance-covariance matrix is denoted by $Q^j$, is of dimension $o \times o$, and is associated with the vector $y_j$.


17.4.4 Optimizing Behavior of the Source Agents

Denote the total transaction cost associated with source agent $i$ transacting with intermediary $j$ to obtain financial instrument $l$ by $c_{ijl}$ and assume that:

$$c_{ijl} = c_{ijl}(x_{ijl}), \quad \forall i, j, l. \tag{17.23}$$

The total transaction costs incurred by source agent $i$, thus, are equal to the sum of all the agent’s transaction costs. His revenue, in turn, is equal to the sum of the price (rate of return) that the agent can obtain for the financial instrument times the total quantity obtained/purchased of that instrument. Recall that $\rho_{1ijl}$ denotes the price associated with agent $i$/intermediary $j$/instrument $l$. Assume that each such source agent seeks to maximize net return while, simultaneously, minimizing the risk, with source agent $i$’s utility function denoted by $U_i$. Moreover, assume that the variance-covariance matrix $Q^i$ is positive semidefinite and that the transaction cost functions are continuously differentiable and convex. Hence, one can express the optimization problem facing source agent $i$ as:

$$\text{Maximize} \quad U_i(x_i) = \sum_{j=1}^{n}\sum_{l=1}^{L} \rho_{1ijl}\, x_{ijl} - \sum_{j=1}^{n}\sum_{l=1}^{L} c_{ijl}(x_{ijl}) - x_i^T Q^i x_i, \tag{17.24}$$

subject to $x_{ijl} \ge 0$, for all $j, l$, and to the constraint:

$$\sum_{j=1}^{n}\sum_{l=1}^{L} x_{ijl} \le S^i, \tag{17.25}$$

that is, the allocations of source agent $i$’s funds among the financial instruments made available by the different intermediaries cannot exceed his holdings. Note that the utility function above is concave for each source agent $i$. A source agent may choose to not invest in any of the instruments. Indeed, as shall be illustrated through subsequent numerical examples, this constraint has important financial implications.

Clearly, in the case of unconstrained utility maximization, the gradient of source agent $i$’s utility function with respect to the vector of variables $x_i$, denoted by $\nabla_{x_i} U_i$, where $\nabla_{x_i} U_i = \left(\frac{\partial U_i}{\partial x_{i11}}, \ldots, \frac{\partial U_i}{\partial x_{inL}}\right)$, represents agent $i$’s idealized direction, with the $jl$-component of $\nabla_{x_i} U_i$ given by:

$$\left(\rho_{1ijl} - 2 Q^i_{z_{jl}} \cdot x_i - \frac{\partial c_{ijl}(x_{ijl})}{\partial x_{ijl}}\right), \tag{17.26}$$

where $Q^i_{z_{jl}}$ denotes the $z_{jl}$-th row of $Q^i$, and $z_{jl}$ is the indicator defined as: $z_{jl} = (l-1)n + j$. We return later to describe how the constraints are explicitly incorporated into the dynamics.
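To make the indexing in (17.26) concrete, the following Python sketch evaluates the utility (17.24) and its $jl$-component of the gradient for a single source agent. It is a sketch under stated assumptions: the function names are hypothetical, the quadratic cost is the one used in the chapter's numerical examples, and $Q^i$ is assumed symmetric (as a variance-covariance matrix is), so that the gradient of $x_i^T Q^i x_i$ is $2Q^i x_i$.

```python
def z_index(j, l, n):
    """1-based position of x_ijl inside the vector x_i: z_jl = (l - 1) n + j."""
    return (l - 1) * n + j


def cost(v):
    """Transaction cost c_ijl(x_ijl) = .5 x^2 + 3.5 x (from the numerical examples)."""
    return 0.5 * v * v + 3.5 * v


def dcost(v):
    """Marginal transaction cost, the derivative of cost()."""
    return v + 3.5


def utility(x, rho1, Q, cost_fn):
    """U_i(x_i) in (17.24): revenue minus transaction costs minus x^T Q x."""
    size = len(x)
    revenue = sum(p * v for p, v in zip(rho1, x))
    costs = sum(cost_fn(v) for v in x)
    risk = sum(x[a] * Q[a][b] * x[b] for a in range(size) for b in range(size))
    return revenue - costs - risk


def gradient_component(x, rho1, Q, dcost_fn, j, l, n):
    """The jl-component (17.26): rho_1ijl - 2 Q_row . x_i - dc_ijl/dx_ijl."""
    z = z_index(j, l, n) - 1  # shift to Python's 0-based indexing
    row_dot_x = sum(q * v for q, v in zip(Q[z], x))
    return rho1[z] - 2.0 * row_dot_x - dcost_fn(x[z])
```

For instance, with $n = 2$ intermediaries, one instrument, $Q^i$ equal to the identity, holdings `x = [4.0, 6.0]` and illustrative (assumed) prices `rho1 = [20.0, 25.0]`, the call `gradient_component(x, rho1, Q, dcost, 1, 1, 2)` evaluates $20 - 2\cdot 4 - (4 + 3.5)$.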

17.4.5 Optimizing Behavior of the Intermediaries

The intermediaries, in turn, are involved in transactions both with the source agents, as well as with the users of the funds, that is, with the ultimate consumers associated with the markets for the distinct types of loans/products at the bottom tier of the financial network. Thus, an intermediary conducts transactions both with the “source” agents as well as with the consumers at the demand markets. An intermediary $j$ is faced with what is termed a handling/conversion cost, which may include, for example, the cost of converting the incoming financial flows into the financial loans/products associated with the demand markets. Denote this cost by $c_j$ and, in the simplest case, one would have that $c_j$ is a function of $\sum_{i=1}^{m}\sum_{l=1}^{L} x_{ijl}$, that is, the handling/conversion cost of an intermediary is a function of how much he has obtained from the various source agents. For the sake of generality, however, allow the function to depend also on the amounts held by other intermediaries and, therefore, one may write:

$$c_j = c_j(x), \quad \forall j. \tag{17.27}$$

The intermediaries also have associated transaction costs in regard to transacting with the source agents, which are assumed to be dependent on the type of instrument. Denote the transaction cost associated with intermediary $j$ transacting with source agent $i$ associated with instrument $l$ by $\hat{c}_{ijl}$ and assume that it is of the form

$$\hat{c}_{ijl} = \hat{c}_{ijl}(x_{ijl}), \quad \forall i, j, l. \tag{17.28}$$

Recall that the intermediaries convert the incoming financial flows $x$ into the outgoing financial flows $y$. Assume that an intermediary $j$ incurs a transaction cost $c_{jk}$ associated with transacting with demand market $k$, where

$$c_{jk} = c_{jk}(y_{jk}), \quad \forall j, k. \tag{17.29}$$

The intermediaries associate a price with the financial funds, which is denoted by $\rho_{2j}$, for intermediary $j$. Assuming that the intermediaries are also utility maximizers, with the utility functions for each being comprised of net revenue maximization as well as risk minimization, the utility maximization problem for intermediary agent $j$, with his utility function denoted by $U_j$, can be expressed as:

$$\text{Maximize} \quad U_j(x_j, y_j) = \sum_{i=1}^{m}\sum_{l=1}^{L} \rho_{2j}\, x_{ijl} - c_j(x) - \sum_{i=1}^{m}\sum_{l=1}^{L} \hat{c}_{ijl}(x_{ijl}) - \sum_{k=1}^{o} c_{jk}(y_{jk}) - \sum_{i=1}^{m}\sum_{l=1}^{L} \rho_{1ijl}\, x_{ijl} - y_j^T Q^j y_j, \tag{17.30}$$


subject to the nonnegativity constraints: $x_{ijl} \ge 0$, and $y_{jk} \ge 0$, for all $i, l$, and $k$. Here, for convenience, we have let $x_j = (x_{1j1}, \ldots, x_{mjL})$. The above objective function expresses that the difference between the revenues and the sum of the handling cost, the transaction costs, and the payout to the source agents should be maximized, whereas the risk should be minimized. Assume now that the variance-covariance matrix $Q^j$ is positive semidefinite and that the transaction cost functions are continuously differentiable and convex. Hence, the utility function above is concave for each intermediary $j$.

The gradient $\nabla_{x_j} U_j = \left(\frac{\partial U_j}{\partial x_{1j1}}, \ldots, \frac{\partial U_j}{\partial x_{mjL}}\right)$ represents agent $j$’s idealized direction in terms of $x_j$, ignoring the constraints, for the time being, whereas the gradient $\nabla_{y_j} U_j = \left(\frac{\partial U_j}{\partial y_{j1}}, \ldots, \frac{\partial U_j}{\partial y_{jo}}\right)$ represents his idealized direction in terms of $y_j$. Note that the $il$-th component of $\nabla_{x_j} U_j$ is given by:

$$\left(\rho_{2j} - \rho_{1ijl} - \frac{\partial c_j(x)}{\partial x_{ijl}} - \frac{\partial \hat{c}_{ijl}(x_{ijl})}{\partial x_{ijl}}\right), \tag{17.31}$$

whereas the $jk$-th component of $\nabla_{y_j} U_j$ is given by:

$$\left(-\frac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} - 2 Q^j_k \cdot y_j\right). \tag{17.32}$$

However, since both source agent $i$ and intermediary $j$ must agree in terms of the $x_{ijl}$s, the direction (17.26) must coincide with that in (17.31), so adding both gives us a “combined force,” in which the price $\rho_{1ijl}$ cancels and which, after algebraic simplification, yields:

$$\left(\rho_{2j} - 2 Q^i_{z_{jl}} \cdot x_i - \frac{\partial c_{ijl}(x_{ijl})}{\partial x_{ijl}} - \frac{\partial c_j(x)}{\partial x_{ijl}} - \frac{\partial \hat{c}_{ijl}(x_{ijl})}{\partial x_{ijl}}\right). \tag{17.33}$$

17.4.6 The Dynamics of the Financial Flows Between the Source Agents and the Intermediaries

We are now ready to express the dynamics of the financial flows between the source agents and the intermediaries. In particular, define the feasible set $K_i \equiv \{x_i \mid x_{ijl} \ge 0, \forall i, j, l, \text{ and (17.25) holds}\}$. Let also $K$ be the Cartesian product given by $K \equiv \prod_{i=1}^{m} K_i$ and define $F^1_{ijl}$ as minus the term in (17.33), with $F^1_i = (F^1_{i11}, \ldots, F^1_{inL})$. Then the best realizable direction for the vector of financial instruments $x_i$ can be mathematically expressed as:

$$\dot{x}_i = \Pi_{K_i}(x_i, -F^1_i), \tag{17.34}$$

where $\Pi_K(Z, v)$ is defined as:

$$\Pi_K(Z, v) = \lim_{\delta \to 0} \frac{P_K(Z + \delta v) - Z}{\delta}, \tag{17.35}$$

and $P_K$ is the norm projection defined by

$$P_K(Z) = \operatorname{argmin}_{Z' \in K} \|Z' - Z\|. \tag{17.36}$$

17.4.7 The Dynamics of the Financial Flows Between the Intermediaries and the Demand Markets

In terms of the financial flows between the intermediaries and the demand markets, both the intermediaries and the consumers must be in agreement as to the financial flows $y$. The consumers take into account in making their consumption decisions not only the price charged for the financial product by the intermediaries but also their transaction costs associated with obtaining the product. Let $\hat{c}_{jk}$ denote the transaction cost associated with obtaining the product at demand market $k$ from intermediary $j$. Assume that this unit transaction cost is continuous and of the general form:

$$\hat{c}_{jk} = \hat{c}_{jk}(y), \quad \forall j, k. \tag{17.37}$$

The consumers take the price charged by the intermediaries, which was denoted by $\rho_{2j}$ for intermediary $j$, plus the unit transaction cost, in making their consumption decisions. From the perspective of the consumers at the demand markets, one can expect that an idealized direction in terms of the evolution of the financial flow of a product between an intermediary/demand market pair would be:

$$\left(\rho_{3k} - \hat{c}_{jk}(y) - \rho_{2j}\right). \tag{17.38}$$

On the other hand, as already derived above, one can expect that the intermediaries would adjust the volume of the product to a demand market according to (17.32). Combining now (17.32) and (17.38), and guaranteeing that the financial products do not assume negative quantities, yields the following dynamics:

$$\dot{y}_{jk} = \begin{cases} \rho_{3k} - \hat{c}_{jk}(y) - \rho_{2j} - \dfrac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} - 2 Q^j_k \cdot y_j, & \text{if } y_{jk} > 0, \\[2mm] \max\left\{0,\ \rho_{3k} - \hat{c}_{jk}(y) - \rho_{2j} - \dfrac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} - 2 Q^j_k \cdot y_j\right\}, & \text{if } y_{jk} = 0. \end{cases} \tag{17.39}$$
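As a small illustration of (17.39), the Python sketch below evaluates the right-hand side for one intermediary/demand market pair. The function name and argument names are hypothetical; the caller is assumed to supply the already-evaluated unit transaction cost $\hat{c}_{jk}(y)$, marginal cost $\partial c_{jk}/\partial y_{jk}$, and risk term $Q^j_k \cdot y_j$.

```python
def flow_rate(y_jk, rho3_k, rho2_j, unit_cost, marginal_cost, risk_term):
    """Right-hand side of (17.39) for one intermediary/demand-market pair.

    unit_cost     -- c_hat_jk(y), the consumers' unit transaction cost
    marginal_cost -- d c_jk / d y_jk, the intermediary's marginal cost
    risk_term     -- Q^j_k . y_j, the k-th row of Q^j times y_j
    """
    force = rho3_k - unit_cost - rho2_j - marginal_cost - 2.0 * risk_term
    # A flow that is already zero may only increase.
    return force if y_jk > 0 else max(0.0, force)
```

Plugging in the values reported for Example 3 in Section 17.4.13 ($y_{jk} = 23.7247$, $\rho_{3k} = 272.1509$, $\rho_{2j} = 196.0174$, $\hat{c}_{jk}(y) = y_{jk} + 5$, zero marginal cost, and $Q^j$ equal to the identity so that the risk term is $y_{jk}$) gives a rate close to zero, as expected near a stationary point.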

17.4.8 The Projected Dynamical System

Consider now the dynamic model in which the demand prices evolve according to (17.21) for all demand markets $k$; the prices at the intermediaries evolve according to (17.22) for all intermediaries $j$; the financial flows between the source agents and the intermediaries evolve according to (17.34) for all source agents $i$; and the financial products between the intermediaries and the demand markets evolve according to (17.39) for all intermediary/demand market pairs $j, k$.

Let now $Z$ denote the aggregate column vector $(x, y, \rho_2, \rho_3)$ in the feasible set $\mathcal{K} \equiv K \times R_+^{no+n+o}$. Define the column vector $F(Z) \equiv (F^1, F^2, F^3, F^4)$, where $F^1$ is as has been defined previously; $F^2 = (F^2_{11}, \ldots, F^2_{no})$, with component

$$F^2_{jk} \equiv \left(2 Q^j_k \cdot y_j + \frac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} + \hat{c}_{jk}(y) + \rho_{2j} - \rho_{3k}\right), \quad \forall j, k;$$

$F^3 = (F^3_1, \ldots, F^3_n)$, where

$$F^3_j \equiv \left(\sum_{i=1}^{m}\sum_{l=1}^{L} x_{ijl} - \sum_{k=1}^{o} y_{jk}\right),$$

and $F^4 = (F^4_1, \ldots, F^4_o)$, with

$$F^4_k \equiv \left(\sum_{j=1}^{n} y_{jk} - d_k(\rho_3)\right).$$

Then the dynamic model described by (17.21), (17.22), (17.34), and (17.39) for all $k, j, i, l$ can be rewritten as the projected dynamical system defined by the following initial value problem:

$$\dot{Z} = \Pi_{\mathcal{K}}(Z, -F(Z)), \quad Z(0) = Z_0, \tag{17.40}$$

where $\Pi_{\mathcal{K}}$ is the projection operator of $-F(Z)$ onto $\mathcal{K}$ at $Z$ and $Z_0 = (x^0, y^0, \rho_2^0, \rho_3^0)$ is the initial point corresponding to the initial financial flows and the initial prices. The trajectory of (17.40) describes the dynamic evolution of and the dynamic interactions among the prices and the financial flows. The dynamical system (17.40) is non-classical in that the right-hand side is discontinuous in order to guarantee that the constraints are satisfied; in the context of the above model, these are not only nonnegativity constraints on the variables but also a form of budget constraints. Here this methodology is applied to study financial systems in the presence of intermediation. A variety of dynamic financial models, but without intermediation, formulated as projected dynamical systems can be found in the book by Nagurney and Siokos (1997).
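The two operators in (17.35) and (17.36) can be illustrated on a single source agent's feasible set $K_i = \{x_i \ge 0, \sum x_{ijl} \le S^i\}$. The Python sketch below is an assumption-laden illustration, not the chapter's method: `project_budget` uses a standard sort-based Euclidean projection, and `projected_direction` approximates the limit in (17.35) with a small fixed `delta` instead of taking the limit.

```python
def project_budget(v, s):
    """Euclidean projection P_K(v) onto K = {x >= 0, sum(x) <= s}, with s > 0."""
    clipped = [max(0.0, t) for t in v]
    if sum(clipped) <= s:
        return clipped  # the budget constraint is inactive
    # Otherwise the budget binds: project onto {x >= 0, sum(x) = s}.
    theta, running = 0.0, 0.0
    for k, t in enumerate(sorted(v, reverse=True), start=1):
        running += t
        candidate = (running - s) / k
        if t - candidate > 0:
            theta = candidate  # the last k passing the test gives the shift
    return [max(0.0, t - theta) for t in v]


def projected_direction(z, v, s, delta=1e-6):
    """Finite-delta approximation of Pi_K(z, v) in (17.35)."""
    moved = project_budget([a + delta * b for a, b in zip(z, v)], s)
    return [(p - a) / delta for p, a in zip(moved, z)]
```

At the point `z = [0.0, 5.0]` with `s = 10.0`, the direction `v = [-1.0, 1.0]` is cut down to approximately `[0, 1]`: the component that would push the first flow below zero is removed, which is the source of the discontinuity in the right-hand side of (17.40).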

17.4.9 A Stationary/Equilibrium Point

The stationary point of the projected dynamical system (17.40) is now discussed. Recall that a stationary point $Z^*$ is that point that satisfies

$$\dot{Z} = 0 = \Pi_{\mathcal{K}}(Z^*, -F(Z^*)),$$

and, hence, in the context of the dynamic financial model with intermediation, it is the point at which there is no change in the financial flows and no change in the prices. Moreover, as established in (Dupuis and Nagurney 1993), since the feasible set $\mathcal{K}$ is a convex polyhedron, the set of stationary points of the projected dynamical system of the form given in (17.40) coincides with the set of solutions to the variational inequality problem given by: determine $Z^* \in \mathcal{K}$, such that

$$\langle F(Z^*)^T, Z - Z^* \rangle \ge 0, \quad \forall Z \in \mathcal{K}, \tag{17.41}$$

where in the model $F(Z)$ and $Z$ are as defined above and recall that $\langle \cdot, \cdot \rangle$ denotes the inner product in $N$-dimensional Euclidean space, where here $N = mnL + no + n + o$.
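The equivalence between stationary points and variational inequality solutions can be seen on a deliberately tiny, hypothetical problem that is not part of the chapter's model: take the feasible set $[0, \infty)$ and the map $F(z) = z + 2$, whose unconstrained root $z = -2$ is infeasible. The Python sketch below approximates the projected right-hand side with a small fixed step and tests the inequality (17.41) on a handful of sample points only.

```python
def F(z):
    """Hypothetical one-dimensional map; its root z = -2 lies outside K."""
    return z + 2.0


def project(z):
    """P_K for K = [0, infinity)."""
    return max(0.0, z)


def projected_rhs(z, delta=1e-8):
    """Finite-delta approximation of Pi_K(z, -F(z))."""
    return (project(z - delta * F(z)) - z) / delta


def satisfies_vi(z_star, samples):
    """Check F(z*) (z - z*) >= 0 on the given sample points of K."""
    return all(F(z_star) * (z - z_star) >= 0.0 for z in samples)


samples = [0.0, 0.5, 1.0, 10.0]
boundary_is_stationary = projected_rhs(0.0) == 0.0
boundary_solves_vi = satisfies_vi(0.0, samples)
interior_solves_vi = satisfies_vi(1.0, samples)
```

The boundary point $z^* = 0$ is both stationary and a solution of the sampled inequality, whereas the interior point $z = 1$ is neither: the projected right-hand side there is negative and the inequality fails for $z = 0$.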

17.4.10 Variational Inequality Formulation of Financial Equilibrium with Intermediation

In particular, variational inequality (17.41) here takes the form: determine $(x^*, y^*, \rho_2^*, \rho_3^*) \in \mathcal{K}$, satisfying:

$$\begin{aligned} &\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{l=1}^{L} \left[2 Q^i_{z_{jl}} \cdot x_i^* + \frac{\partial c_{ijl}(x_{ijl}^*)}{\partial x_{ijl}} + \frac{\partial c_j(x^*)}{\partial x_{ijl}} + \frac{\partial \hat{c}_{ijl}(x_{ijl}^*)}{\partial x_{ijl}} - \rho_{2j}^*\right] \times \left[x_{ijl} - x_{ijl}^*\right] \\ &+ \sum_{j=1}^{n}\sum_{k=1}^{o} \left[2 Q^j_k \cdot y_j^* + \frac{\partial c_{jk}(y_{jk}^*)}{\partial y_{jk}} + \hat{c}_{jk}(y^*) + \rho_{2j}^* - \rho_{3k}^*\right] \times \left[y_{jk} - y_{jk}^*\right] \\ &+ \sum_{j=1}^{n} \left[\sum_{i=1}^{m}\sum_{l=1}^{L} x_{ijl}^* - \sum_{k=1}^{o} y_{jk}^*\right] \times \left[\rho_{2j} - \rho_{2j}^*\right] \\ &+ \sum_{k=1}^{o} \left[\sum_{j=1}^{n} y_{jk}^* - d_k(\rho_3^*)\right] \times \left[\rho_{3k} - \rho_{3k}^*\right] \ge 0, \quad \forall (x, y, \rho_2, \rho_3) \in \mathcal{K}, \end{aligned} \tag{17.42}$$

where $\mathcal{K} \equiv \{K \times R_+^{no+n+o}\}$ and $Q^i_{z_{jl}}$ is as was defined following (17.26).

In (Nagurney and Ke 2001) a variational inequality of the form (17.42) was derived in a manner distinct from that given above for a static financial network model with intermediation, but with a slightly different feasible set where it was assumed that the constraints (17.25) had to be tight, that is, to hold as an equality. (Nagurney and Ke 2003), in turn, demonstrated how electronic transactions could be introduced into financial networks with intermediation by adding additional links to the network in Figure 17.4 and by including additional transaction costs and prices and expanding the objective functions of the decision-makers accordingly. We discuss electronic financial transactions subsequently, when we describe the financial engineering of integrated social and financial networks with intermediation.


17.4.11 The Discrete-Time Algorithm (Adjustment Process)

The projected dynamical system (17.40) is a continuous time adjustment process. However, in order to further fix ideas and to provide a means of “tracking” the trajectory of (17.40), we present a discrete-time adjustment process, in the form of the Euler method, which is induced by the general iterative scheme of (Dupuis and Nagurney 1993). The statement of the Euler method is as follows:

Step 0: Initialization
Start with a $Z^0 \in \mathcal{K}$. Set $\tau := 1$.

Step 1: Computation
Compute $Z^\tau$ by solving the variational inequality problem:

$$Z^\tau = P_{\mathcal{K}}(Z^{\tau-1} - \alpha_\tau F(Z^{\tau-1})), \tag{17.43}$$

where $\{\alpha_\tau; \tau = 1, 2, \ldots\}$ is a sequence of positive scalars such that $\sum_{\tau=1}^{\infty} \alpha_\tau = \infty$, $\alpha_\tau \to 0$, as $\tau \to \infty$ (which is required for convergence).

Step 2: Convergence Verification
If $|Z_b^\tau - Z_b^{\tau-1}| \le \epsilon$, for some $\epsilon > 0$, a prespecified tolerance, then stop; otherwise, set $\tau := \tau + 1$, and go to Step 1.

The statement of this method in the context of the dynamic financial model takes the form:

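17.4.12 The Euler Method

Step 0: Initialization Step
Set $(x^0, y^0, \rho_2^0, \rho_3^0) \in \mathcal{K}$. Let $\tau = 1$, where $\tau$ is the iteration counter, and set the sequence $\{\alpha_\tau\}$ so that $\sum_{\tau=1}^{\infty} \alpha_\tau = \infty$, $\alpha_\tau > 0$, $\alpha_\tau \to 0$, as $\tau \to \infty$.

Step 1: Computation Step
Compute $(x^\tau, y^\tau, \rho_2^\tau, \rho_3^\tau) \in \mathcal{K}$ by solving the variational inequality subproblem:

$$\begin{aligned} &\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{l=1}^{L} \left[x_{ijl}^\tau + \alpha_\tau\left(2 Q^i_{z_{jl}} \cdot x_i^{\tau-1} + \frac{\partial c_{ijl}(x_{ijl}^{\tau-1})}{\partial x_{ijl}} + \frac{\partial c_j(x^{\tau-1})}{\partial x_{ijl}} + \frac{\partial \hat{c}_{ijl}(x_{ijl}^{\tau-1})}{\partial x_{ijl}} - \rho_{2j}^{\tau-1}\right) - x_{ijl}^{\tau-1}\right] \times \left[x_{ijl} - x_{ijl}^\tau\right] \\ &+ \sum_{j=1}^{n}\sum_{k=1}^{o} \left[y_{jk}^\tau + \alpha_\tau\left(2 Q^j_k \cdot y_j^{\tau-1} + \hat{c}_{jk}(y^{\tau-1}) + \frac{\partial c_{jk}(y_{jk}^{\tau-1})}{\partial y_{jk}} + \rho_{2j}^{\tau-1} - \rho_{3k}^{\tau-1}\right) - y_{jk}^{\tau-1}\right] \times \left[y_{jk} - y_{jk}^\tau\right] \\ &+ \sum_{j=1}^{n} \left[\rho_{2j}^\tau + \alpha_\tau\left(\sum_{i=1}^{m}\sum_{l=1}^{L} x_{ijl}^{\tau-1} - \sum_{k=1}^{o} y_{jk}^{\tau-1}\right) - \rho_{2j}^{\tau-1}\right] \times \left[\rho_{2j} - \rho_{2j}^\tau\right] \\ &+ \sum_{k=1}^{o} \left[\rho_{3k}^\tau + \alpha_\tau\left(\sum_{j=1}^{n} y_{jk}^{\tau-1} - d_k(\rho_3^{\tau-1})\right) - \rho_{3k}^{\tau-1}\right] \times \left[\rho_{3k} - \rho_{3k}^\tau\right] \ge 0, \quad \forall (x, y, \rho_2, \rho_3) \in \mathcal{K}. \end{aligned}$$

Step 2: Convergence Verification
If $|x_{ijl}^\tau - x_{ijl}^{\tau-1}| \le \epsilon$, $|y_{jk}^\tau - y_{jk}^{\tau-1}| \le \epsilon$, $|\rho_{2j}^\tau - \rho_{2j}^{\tau-1}| \le \epsilon$, $|\rho_{3k}^\tau - \rho_{3k}^{\tau-1}| \le \epsilon$, for all $i = 1, \ldots, m$; $j = 1, \ldots, n$; $l = 1, \ldots, L$; $k = 1, \ldots, o$, with $\epsilon > 0$, a pre-specified tolerance, then stop; otherwise, set $\tau := \tau + 1$, and go to Step 1.

The variational inequality subproblem encountered in the computation step at each iteration of the Euler method can be solved explicitly and in closed form since it is actually a quadratic programming problem and the feasible set is a Cartesian product consisting of the product of $K$, which has a simple network structure, and the nonnegative orthants, $R_+^{no}$, $R_+^{n}$, and $R_+^{o}$, corresponding to the variables $x$, $y$, $\rho_2$, and $\rho_3$, respectively.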

Figure 17.5. The financial network structure of the numerical examples


17.4.13 Numerical Examples

In this section, the Euler method is applied to two numerical examples. The algorithm was implemented in FORTRAN. For the solution of the induced network subproblems in $x$, we utilized the exact equilibration algorithm, which fully exploits the simplicity of the special network structure of the subproblems. The convergence criterion used was that the absolute value of the difference in the flows and prices between two successive iterations was no more than $10^{-4}$. For the examples, the sequence $\{\alpha_\tau\} = .1\{1, \frac{1}{2}, \frac{1}{2}, \frac{1}{3}, \frac{1}{3}, \frac{1}{3}, \ldots\}$, which is of the form given in the initialization step of the algorithm in the preceding section.

The numerical examples had the network structure depicted in Figure 17.5 and consisted of two source agents, two intermediaries, and two demand markets, with a single financial instrument handled by each intermediary. The algorithm was initialized as follows: since there was a single financial instrument associated with each of the intermediaries, we set $x_{ij1} = S^i / n$ for each source agent $i$. All the other variables, that is, the initial vectors $y$, $\rho_2$, and $\rho_3$, were set to zero. Additional details are given in (Nagurney and Dong 2002).

17.4.13.1 Example 2

The data for this example were constructed for easy interpretation purposes. The supplies of the two source agents were: $S^1 = 10$ and $S^2 = 10$. The variance-covariance matrices $Q^i$ and $Q^j$ were equal to the identity matrices for all source agents $i$ and all intermediaries $j$. The transaction cost functions faced by the source agents associated with transacting with the intermediaries were given by:

$$c_{111}(x_{111}) = .5 x_{111}^2 + 3.5 x_{111}, \quad c_{121}(x_{121}) = .5 x_{121}^2 + 3.5 x_{121},$$
$$c_{211}(x_{211}) = .5 x_{211}^2 + 3.5 x_{211}, \quad c_{221}(x_{221}) = .5 x_{221}^2 + 3.5 x_{221}.$$

The handling costs of the intermediaries, in turn, were given by:

$$c_1(x) = .5\left(\sum_{i=1}^{2} x_{i11}\right)^2, \quad c_2(x) = .5\left(\sum_{i=1}^{2} x_{i21}\right)^2.$$

The transaction costs of the intermediaries associated with transacting with the source agents were, respectively, given by:

$$\hat{c}_{111}(x_{111}) = 1.5 x_{111}^2 + 3 x_{111}, \quad \hat{c}_{121}(x_{121}) = 1.5 x_{121}^2 + 3 x_{121},$$
$$\hat{c}_{211}(x_{211}) = 1.5 x_{211}^2 + 3 x_{211}, \quad \hat{c}_{221}(x_{221}) = 1.5 x_{221}^2 + 3 x_{221}.$$

The demand functions at the demand markets were:

$$d_1(\rho_3) = -2\rho_{31} - 1.5\rho_{32} + 1000, \quad d_2(\rho_3) = -2\rho_{32} - 1.5\rho_{31} + 1000,$$

and the transaction costs between the intermediaries and the consumers at the demand markets were given by:

$$\hat{c}_{11}(y) = y_{11} + 5, \quad \hat{c}_{12}(y) = y_{12} + 5, \quad \hat{c}_{21}(y) = y_{21} + 5, \quad \hat{c}_{22}(y) = y_{22} + 5.$$

It was assumed for this and the subsequent example that the transaction costs as perceived by the intermediaries and associated with transacting with the demand markets were all zero, that is, $c_{jk}(y_{jk}) = 0$, for all $j, k$.

The Euler method converged and yielded the following equilibrium pattern:

$$x_{111}^* = x_{121}^* = x_{211}^* = x_{221}^* = 5.000,$$
$$y_{11}^* = y_{12}^* = y_{21}^* = y_{22}^* = 5.000.$$

The vector $\rho_2^*$ had components: $\rho_{21}^* = \rho_{22}^* = 262.6664$, and the computed demand prices at the demand markets were: $\rho_{31}^* = \rho_{32}^* = 282.8106$.

The optimality/equilibrium conditions were satisfied with good accuracy. Note that in this example, the budget constraint was tight for both source agents, that is, $s_1^* = s_2^* = 0$, where $s_i^* = S^i - \sum_{j=1}^{n}\sum_{l=1}^{L} x_{ijl}^*$, and, hence, there was zero flow on the links connecting node 3 with top tier nodes 1 and 2. Thus, it was optimal for both source agents to invest their entire financial holdings in the instruments made available by the two intermediaries.

17.4.13.2 Example 3

The following variant of Example 2 was then constructed to create Example 3. The data were identical to that in Example 2 except that the supply for each source agent was increased so that $S^1 = S^2 = 50$. The Euler method converged and yielded the following new equilibrium pattern:

$$x_{111}^* = x_{121}^* = x_{211}^* = x_{221}^* = 23.6832,$$
$$y_{11}^* = y_{12}^* = y_{21}^* = y_{22}^* = 23.7247.$$

The vector $\rho_2^*$ had components: $\rho_{21}^* = \rho_{22}^* = 196.0174$, and the demand prices at the demand markets were: $\rho_{31}^* = \rho_{32}^* = 272.1509$. It is easy to verify that the optimality/equilibrium conditions, again, were satisfied with good accuracy. Note, however, that unlike the solution for Example 2, neither source agent 1 nor source agent 2 invested his entire financial holdings. Indeed, each opted to not invest the amount $s_i^* = S^i - \sum_{j=1}^{2} x_{ij1}^* = 50 - 2(23.6832) = 2.6336$, and this was the volume of flow on each of the two links ending in node 3 in Figure 17.5. Since the supply of financial funds increased, the price for the instruments charged by the intermediaries decreased from 262.6664 to 196.0174. The demand prices at the demand markets also decreased, from 282.8106 to 272.1509.
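The two examples can be re-run with a short script. What follows is a minimal Python sketch, not the FORTRAN implementation described above: it hard-codes the example data (with $Q^i$ and $Q^j$ equal to the identity), replaces the exact equilibration algorithm by a sort-based Euclidean projection onto each budget set (an assumption of this sketch), and runs a fixed number of iterations instead of applying the $10^{-4}$ stopping rule. The names `project_budget`, `step_sizes`, and `euler` are illustrative.

```python
def project_budget(v, s):
    """Euclidean projection onto {x >= 0, sum(x) <= s}, with s > 0."""
    clipped = [max(0.0, t) for t in v]
    if sum(clipped) <= s:
        return clipped
    theta, running = 0.0, 0.0
    for k, t in enumerate(sorted(v, reverse=True), start=1):
        running += t
        candidate = (running - s) / k
        if t - candidate > 0:
            theta = candidate
    return [max(0.0, t - theta) for t in v]


def step_sizes():
    """Yield the sequence .1 * {1, 1/2, 1/2, 1/3, 1/3, 1/3, ...}."""
    k = 1
    while True:
        for _ in range(k):
            yield 0.1 / k
        k += 1


def euler(supply, iterations=20000):
    """Euler iteration (17.43) for the two-agent, two-intermediary examples."""
    m = n = o = 2
    x = [[supply / n] * n for _ in range(m)]  # x[i][j], single instrument
    y = [[0.0] * o for _ in range(n)]         # y[j][k]
    rho2 = [0.0] * n
    rho3 = [0.0] * o
    alphas = step_sizes()
    for _ in range(iterations):
        a = next(alphas)
        inflow = [sum(x[i][j] for i in range(m)) for j in range(n)]
        outflow = [sum(y[j]) for j in range(n)]
        arriving = [sum(y[j][k] for j in range(n)) for k in range(o)]
        demand = [1000.0 - 2.0 * rho3[k] - 1.5 * rho3[1 - k] for k in range(o)]
        # F^1: risk + source cost + handling cost + intermediary cost - price
        f1 = [[2.0 * x[i][j] + (x[i][j] + 3.5) + inflow[j]
               + (3.0 * x[i][j] + 3.0) - rho2[j] for j in range(n)]
              for i in range(m)]
        # F^2: risk + unit transaction cost + rho2 - rho3 (c_jk is zero)
        f2 = [[2.0 * y[j][k] + (y[j][k] + 5.0) + rho2[j] - rho3[k]
               for k in range(o)] for j in range(n)]
        f3 = [inflow[j] - outflow[j] for j in range(n)]
        f4 = [arriving[k] - demand[k] for k in range(o)]
        # All components are updated from the previous iterate.
        x = [project_budget([x[i][j] - a * f1[i][j] for j in range(n)], supply)
             for i in range(m)]
        y = [[max(0.0, y[j][k] - a * f2[j][k]) for k in range(o)]
             for j in range(n)]
        rho2 = [max(0.0, rho2[j] - a * f3[j]) for j in range(n)]
        rho3 = [max(0.0, rho3[k] - a * f4[k]) for k in range(o)]
    return x, y, rho2, rho3
```

Calling `euler(10.0)` corresponds to Example 2 and `euler(50.0)` to Example 3. Because the stopping rule and the subproblem solver differ from those used in the chapter, the computed values should be expected to agree with the reported ones only approximately.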

17.5 The Integration of Social Networks with Financial Networks

As noted by (Nagurney et al. 2004), globalization and technological advances have made major impacts on financial services in recent years and have allowed for the emergence of electronic finance. The financial landscape has been transformed through increased financial integration, increased cross border mergers, and lower barriers between markets. Moreover, as noted by several authors, boundaries between different financial intermediaries have become less clear (cf. Claessens and Jansen 2000, Claessens et al. 2003, G-10 2001). For example, during the period 1980–1990, global capital transactions tripled, with telecommunication networks and financial instrument innovation being two of the empirically identified major causes of globalization with regard to international financial markets (Kim 1999). The growing importance of networks in financial services and their effects on competition have also been addressed by (Claessens et al. 2003). (Kim 1999) has argued for the necessity of integrating various theories, including portfolio theory with risk management, and flow theory in order to capture the underlying complexity of the financial flows over space and time.

At the same time that globalization and technological advances have transformed financial services, researchers have identified the importance of social networks in a plethora of financial transactions (cf. Nagurney et al. 2004 and the references therein), notably, in the context of personal relationships. The relevance of social networks within an international financial context needs to be examined both theoretically and empirically. It is clear that the existence of appropriate social networks can affect not only the risk associated with financial transactions but also transaction costs.
Given the prevalence of networks in the discussions of globalization and international financial flows, it seems natural that any theory for the illumination of the behavior of the decision-makers involved in this context as well as the impacts of their decisions on the financial product flows, prices, appreciation rates, etc., should be network-based. Recently, (Nagurney et al. 2004) took on a network perspective for the theoretical modeling, analysis, and computation of solutions to international financial networks with intermediation in which they explicitly integrated the social network component. They also captured electronic transactions within the framework since that aspect is critical in the modeling of international financial flows today. Here, that model is highlighted. This model generalizes the model of (Nagurney and Cruz 2003) to explicitly include social networks. For further background on the integration of social networks and financial networks, see (Nagurney et al. 2006).

As in the model of (Nagurney and Cruz 2003), the model consists of $L$ countries, with a typical country denoted by $l$ or $\hat{l}$; $I$ “source” agents in each country with sources of funds, with a typical source agent denoted by $i$; and $J$ financial intermediaries, with a typical financial intermediary denoted by $j$. As noted earlier, examples of source agents are households and businesses, whereas examples of financial intermediaries include banks, insurance companies, investment companies, and brokers, where now we include electronic brokers, etc. Intermediaries in the framework need not be country-specific but, rather, may be virtual.

Assume that each source agent can transact directly electronically with the consumers through the Internet and can also conduct his financial transactions with the intermediaries either physically or electronically in different currencies. There are $H$ currencies in the international economy, with a typical currency being denoted by $h$. Also, assume that there are $K$ financial products, which can be in distinct currencies and in different countries, with a typical financial product (and associated with a demand market) being denoted by $k$. Hence, the financial intermediaries in the model, in addition to transacting with the source agents, also determine how to allocate the incoming financial resources among distinct uses, which are represented by the demand markets, with a demand market corresponding to, for example, the market for real estate loans, household loans, or business loans, etc., which, as mentioned, can be associated with a distinct country and a distinct currency combination. Let $m$ refer to a mode of transaction, with $m = 1$ denoting a physical transaction and $m = 2$ denoting an electronic transaction via the Internet.

The depiction of the supernetwork (see also, e. g., Nagurney and Dong 2002) is given in Figure 17.6. As this figure illustrates, the supernetwork is comprised of the social network, which is the bottom level network, and the international financial network, which is the top level network. Internet links to denote the possibility of electronic financial transactions are denoted in the figure by dotted arcs. In addition, dotted arcs/links are used to depict the integration of the two networks into a supernetwork.

Figure 17.6. The multilevel supernetwork structure of the integrated international financial network/social network system

The supernetwork in Figure 17.6 consists of a social network and an international financial network with intermediation. Both networks consist of three tiers of decision-makers. The top tier of nodes consists of the agents in the different countries with sources of funds, with agent $i$ in country $l$ being referred to as agent $il$ and associated with node $il$. There are $IL$ top-tiered nodes in the network. The middle tier of nodes in each of the two networks consists of the intermediaries (which need not be country-specific), with a typical intermediary $j$ associated with node $j$ in this (second) tier of nodes in the networks. The bottom tier of nodes in both the social network and in the financial network consists of the demand markets, with a typical demand market for product $k$ in currency $h$ and country $\hat{l}$ associated with node $kh\hat{l}$. There are, as depicted in Figure 17.6, $J$ middle (or second) tiered nodes corresponding to the intermediaries and $KHL$ bottom (or third) tiered nodes in the international financial network. In addition, we add a node $J + 1$ to the middle tier of nodes in the financial network only in order to represent the possible non-investment (of a portion or all of the funds) by one or more of the source agents, as also done in the model in the previous section.

The network in Figure 17.6 includes classical physical links as well as Internet links to allow for electronic financial transactions. Electronic transactions are possible between the source agents and the intermediaries, the source agents and the demand markets, as well as the intermediaries and the demand markets. Physical transactions can occur between the source agents and the intermediaries and between the intermediaries and the demand markets.

(Nagurney et al. 2004) describe the behavior of the decision-makers in the model, and allow for multicriteria decision-making, which consists of profit maximization, risk minimization (with general risk functions), as well as the maximization of the value of relationships. Each decision-maker is allowed to weight the criteria individually. The dynamics of the interactions are discussed and the projected dynamical system derived. The Euler method is then used to track the dynamic trajectories of the financial flows (transacted either physically or electronically), the prices, as well as the relationship levels until the equilibrium state is reached.

Fascinatingly, it has recently been shown by (Liu and Nagurney 2005) that financial network problems with intermediation can be reformulated as transportation network equilibrium problems and that electric power networks can be as well (cf. Nagurney et al. 2005). Hence, we now can answer Copeland’s question in that money and electricity are alike in that they both flow on networks, which are, in turn, transformable into transportation networks.

Acknowledgments

The writing of this chapter was supported, in part, by NSF Grant No. IIS 0002647 under the MKIDS Program. This support is gratefully acknowledged. Also, the writing of this chapter was done, in part, while the author was a Science Fellow at the Radcliffe Institute for Advanced Study at Harvard University. This support is also gratefully acknowledged. Finally, the author thanks the anonymous reviewers and editors for helpful comments and suggestions. This chapter is a condensed and adapted version of Nagurney (2005).

References Ahuja RK, Magnanti TL, Orlin JB (1993) Network Flows, Prentice Hall, Upper Saddle River, New Jersey Arrow KJ (1951) “An Extension of the Basic Theorems of Classical Welfare Economics,” Econometrica: 1305−1323 Barr RS (1972) “The Multinational Cash Management Problem: A Generalized Network Approach,” Working Paper, University of Texas, Austin, Texas Board of Governors (1980) Introduction to Flow of Funds, Flow of Funds Section, Division of Research and Statistics, Federal Reserve System, Washington, DC, June Boginski V, Butenko S, Pardalos, PM (2003) “On Structural Properties of the Market Graph,” in Innovations in Financial and Economic Networks, A Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 29−42 Charnes A, Cooper WW (1958) “Nonlinear Network Flows and Convex Programming over Incidence Matrices,” Naval Research Logistics Quarterly 5: 231−240 Charnes A, Cooper WW (1961) Management Models and Industrial Applications of Linear Programming, John Wiley & Sons, New York Charnes A, Cooper, WW (1967) “Some Network Characterizations for Mathematical Programming and Accounting Approaches to Planning and Control,” The Accounting Review 42: 24−52 Charnes A, Miller M (1957) “Programming and Financial Budgeting,” Symposium on Techniques of Industrial Operations Research, Chicago, Illinois, June Christofides N, Hewins, RD, Salkin, GR (1979) “Graph Theoretic Approaches to Foreign Exchange Operations,” Journal of Financial and Quantitative Analysis 14: 481−500 Claessens S, Jansen, M, Editors (2000) Internationalization of Financial Services, Kluwer Academic Publishers, Boston, Massachusetts Claessens S, Dobos G, Klingebiel D, Laeven, L (2003) “The Growing Importance of Networks in Finance and Their Effects on Competition,” in Innovations in Financial and Economic Networks, A Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 
110−135 Cohen J (1987) The Flow of Funds in Theory and Practice, Financial and Monetary Studies 15, Kluwer Academic Publishers, Dordrecht, The Netherlands Copeland MA (1952) A Study of Moneyflows in the United States, National Bureau of Economic Research, New York

Cournot AA (1838) Researches into the Mathematical Principles of the Theory of Wealth, English Translation, Macmillan, London, England, 1897 Crum RL (1976) “Cash Management in the Multinational Firm: A Constrained Generalized Network Approach,” Working Paper, University of Florida, Gainesville, Florida Crum RL, Klingman DD, Tavis LA (1979), “Implementation of Large-Scale Financial Planning Models: Solution Efficient Transformations,” Journal of Financial and Quantitative Analysis 14: 137−152 Crum RL, Klingman DD, Tavis LA (1983) “An Operational Approach to Integrated Working Capital Planning,” Journal of Economics and Business 35: 343−378 Crum RL, Nye DJ (1981) “A Network Model of Insurance Company Cash Flow Management,” Mathematical Programming Study 15: 86−101 Dafermos SC, Sparrow FT (1969) “The Traffic Assignment Problem for a General Network,” Journal of Research of the National Bureau of Standards 73B: 91−118 Dantzig GB, Madansky A (1961) “On the Solution of Two-Stage Linear Programs under Uncertainty,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1, University of California Press, Berkeley, California, pp. 
165−176 Debreu G (1951) “The Coefficient of Resource Utilization,” Econometrica 19: 273−292 Dong J, Nagurney A (2001) “Bicriteria Decision-Making and Financial Equilibrium: A Variational Inequality Perspective,” Computational Economics 17: 29−42 Dong J, Zhang D, Nagurney A (1996) “A Projected Dynamical Systems Model of General Financial Equilibrium with Stability Analysis,” Mathematical and Computer Modelling 24: 35−44 Doumpos M, Zopounidis C, Pardalos, PM (2000) “Multicriteria Sorting Methodology: Application to Financial Decision Problems,” Parallel Algorithms and Applications 15: 113−129 Dupuis P, Nagurney A (1993) “Dynamical Systems and Variational Inequalities,” Annals of Operations Research 44: 9−42 Enke S (1951) “Equilibrium Among Spatially Separated Markets,” Econometrica 10: 40−47 Euler L (1736) “Solutio Problematis ad Geometriam Situs Pertinentis,” Commetarii Academiae Scientiarum Imperialis Petropolitanae 8: 128−140 Fei JCH (1960) “The Study of the Credit System by the Method of Linear Graph,” The Review of Economics and Statistics 42: 417−428 Ferguson AR, Dantzig GB (1956) “The Allocation of Aircraft to Routes,” Management Science 2: 45−73 Ford LR, Fulkerson DR (1962) Flows in Networks, Princeton University Press, Princeton, New Jersey Francis JC, Archer SH (1979) Portfolio Analysis, Prentice Hall, Englewood Cliffs, New Jersey

G-10 (2001) “Report on Consolidation in Financial Services,” Bank for International Settlements, Switzerland Geunes J, Pardalos PM (2003) “Network Optimization in Supply Chain Management and Financial Engineering: An Annotated Bibliography,” Networks 42: 66−84 Hughes M, Nagurney A (1992) “A Network Model and Algorithm for the Estimation and Analysis of Financial Flow of Funds,” Computer Science in Economics and Management 5: 23−39 Korpelevich GM (1977) “The Extragradient Method for Finding Saddle Points and Other Problems,” Matekon 13: 35−49 Kim HM (1999) Globalization of International Financial Markets, Ashgate, Hants, England Liu Z, Nagurney A (2005) “Financial Networks with Intermediation and Transportation Network Equilibria: A Supernetwork Equivalence and Reinterpretation of the Equilibrium Conditions with Computations,” to appear in Computational Management Science Markowitz HM (1952) “Portfolio Selection,” The Journal of Finance 7: 77−91 Markowitz HM (1959) Portfolio Selection: Efficient Diversification of Investments, John Wiley & Sons, Inc., New York Mulvey JM (1987) “Nonlinear Networks in Finance,” Advances in Mathematical Programming and Financial Planning 1: 253−271 Mulvey JM, Simsek KD, Pauling B (2003) “A Stochastic Network Optimization Approach for Integrated Pension and Corporate Financial Planning,” in Innovations in Financial and Economic Networks, A Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 
110−135 Mulvey JM, Vladimirou H (1989) “Stochastic Network Optimization Models for Investment Planning,” Annals of Operations Research 20: 187−217 Mulvey JM, Vladimirou H (1991), “Solving Multistage Stochastic Networks: An Application of Scenario Aggregation,” Networks 21: 619−643 Nagurney A (1994) “Variational Inequalities in the Analysis and Computation of Multi-Sector, Multi-Instrument Financial Equilibria,” Journal of Economic Dynamics and Control 18: 161−184 Nagurney A (1999) Network Economics: A Variational Inequality Approach, Second and Revised Edition, Kluwer Academic Publishers, Dordrecht, The Netherlands Nagurney A (2001) “Finance and Variational Inequalities,” Quantitative Finance 1: 309−317 Nagurney A, Editor (2003) Innovations in Financial and Economic Networks, Edward Elgar Publishing, Cheltenham, England Nagurney A (2005) “Financial Networks,” Invited Chapter for Handbook of Financial Engineering, C. Zopounidis, M. Doumpos, and P. M. Pardalos, Editors, Springer, Berlin, Germany, to appear Nagurney A, Cruz JM (2003) “International Financial Networks with Electronic Transactions,” in Innovations in Financial and Economic Networks, A. Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 135−167

Nagurney A, Cruz JM, Wakolbinger T (2004) “The Co-Evolution and Emergence of Integrated International Financial Networks and Social Networks: Theory, Analysis, and Computations,” to appear in Globalization and Regional Economic Modeling, RJ Cooper, KP Donaghy, GJD Hewings, Editors, Springer, Berlin, Germany Nagurney A, Dong J (1996a) “Network Decomposition of General Financial Equilibria with Transaction Costs,” Networks 28: 107−116 Nagurney A, Dong J (1996b) “General Financial Equilibrium Modelling with Policy Interventions and Transaction Costs,” Computational Economics 9: 363−384 Nagurney A, Dong J (2002) Supernetworks: Decision-Making for the Information Age, Edward Elgar Publishers, Cheltenham, England Nagurney A, Dong J, Hughes M (1992) “Formulation and Computation of General Financial Equilibrium,” Optimization 26: 339−354 Nagurney A, Hughes M (1992) “Financial Flow of Funds Networks,” Networks 22: 145−161 Nagurney A, Ke K (2001) “Financial Networks with Intermediation,” Quantitative Finance 1: 441−451 Nagurney A, Ke K (2003) “Financial Networks with Electronic Transactions: Modeling, Analysis, and Computations,” Quantitative Finance 3: 71−87 Nagurney A, Liu Z, Cojocaru M-G, Daniele P (2005) “Dynamic Electric Power Supply Chains and Transportation Networks: An Evolutionary Variational Inequality Formulation,” to appear in Transportation Research E Nagurney A, Siokos S (1997) “Variational Inequalities for International General Financial Equilibrium Modeling and Computation,” Mathematical and Computer Modelling 25: 31−49 Nagurney A, Wakolbinger T, Zhao L (2006) “The Evolution and Emergence of Integrated Social and Financial Networks with Electronic Transactions: A Dynamic Supernetwork Theory for the Modeling, Analysis, and Computation of Financial Flows and Relationship Levels,” Computational Economics 27: 353−393 Nagurney A, Zhang D (1996) Projected Dynamical Systems and Variational Inequalities with Applications, Kluwer Academic Publishers, Boston, 
Massachusetts Pigou AC (1920) The Economics of Welfare, Macmillan, London, England Quesnay F (1758) Tableau Economique, 1758, reproduced in facsimile with an introduction by H. Higgs by the British Economic Society, 1895 Rockafellar RT, Wets RJ-B (1991) “Scenarios and Policy in Optimization under Uncertainty,” Mathematics of Operations Research 16: 1−29 Rudd A, Rosenberg B (1979) “Realistic Portfolio Optimization,” TIMS Studies in the Management Sciences 11: 21−46 Rutenberg DP (1970), “Maneuvering Liquid Assets in a Multi-National Company: Formulation and Deterministic Solution Procedures,” Management Science 16: 671−684

Samuelson PA (1952) “Spatial Price Equilibrium and Linear Programming,” American Economic Review 42: 283−303 Shapiro AC, Rutenberg DP (1976) “Managing Exchange Risks in a Floating World,” Financial Management 16: 48−58 Soenen L A (1979) Foreign Exchange Exposure Management: A Portfolio Approach, Sijthoff and Noordhoff, Germantown, Maryland Srinivasan V (1974) “A Transshipment Model for Cash Management Decisions,” Management Science 20: 1350−1363 Storoy S, Thore S, Boyer M (1975) “Equilibrium in Linear Capital Market Networks,” The Journal of Finance 30: 1197−1211 Takayama T, Judge GG (1971) Spatial and Temporal Price and Allocation Models, North-Holland, Inc., Amsterdam, The Netherlands Thore S (1969) “Credit Networks,” Economica 36: 42−57 Thore S (1970) “Programming a Credit Network under Uncertainty,” Journal of Money, Banking, and Finance 2: 219−246 Thore S (1980) Programming the Network of Financial Intermediation, Universitetsforlaget, Oslo, Norway Thore S (1984) “Spatial Models of the Eurodollar Market,” Journal of Banking and Finance 8: 51−65 Thore S, Kydland F (1972) “Dynamic for Flow-of-Funds Networks,” in Applications of Management Science in Banking and Finance, S Eilon and TR Fowkes, Editors, Gower Press, England, pp. 259−276 Wallace S (1986) “Solving Stochastic Programs with Network Recourse,” Networks 16: 295−317

CHAPTER 18 Agent-based Simulation for Research in Economics Clemens van Dinther

18.1 Introduction

Financial theory and financial markets are complex constructs that are not always easy to understand. It is not even possible to explain the functioning of financial markets on the basis of theoretical analysis alone, since theory normally assumes fully rational agents. [37] presented the so-called “No trade theorem”, which states that in a state of efficient equilibrium there will be no trade on markets as long as all participants are fully rational (no noise traders or other uninformed traders) and the structure of information acquisition is common knowledge. These assumptions obviously do not hold in real markets, since we observe asymmetrically informed participants as well as technical traders who base their decisions on the analysis of market movements (and not on an assessment of the underlying value of a stock). Thus, apart from theoretical analysis, science uses additional methods for the study of financial markets. Empirical studies are one way, but simulation also yields interesting results. As early as 1996, [3] successfully applied agent-based simulation to the analysis of financial markets in their famous “Santa Fe Artificial Stock Market”. Besides agent-based simulation, Monte Carlo techniques are frequently used to model basic financial market movements (see e.g. [33]).

The present chapter deals with the application of simulation in economics, and in particular with agent-based simulation. It discusses the use of simulation and its application, and presents some general approaches to agent-based simulation and existing links to economic analysis.

Research in social science was for a long time shaped by two commonly accepted approaches for gaining insight: induction and deduction. With induction, one derives general principles from particular cases, e.g. through pattern discovery in empirical data or the observation of certain rules. In the deductive approach, a set of axioms is defined and the consequences deriving from these assumptions are proven [5]. This approach can also be described as inferring the particular from the general. These techniques are applied in many fields of economic research. Game theory uses deduction to solve for equilibria; macroeconomics uses deduction to explain equilibria, e.g. on money or goods markets. Empirical surveys (inductive reasoning) are used for macroeconomic analysis or in econometric theory.

Most economic theories apply the methods of induction and/or deduction in a static way. Game theory, for example, studies static equilibria. The dynamics by which an equilibrium is established are studied in extensive form games, but it is not always possible to develop the extensive form of a game. This is the starting point for Computational Economics (CE). This fairly new branch of economic research is, first, interested in a system’s dynamics and in the processes leading to observed phenomena, which it studies by conducting simulations. Second, CE provides computational techniques that support, e.g., the solution for equilibria.

Real world systems or system models can be scrutinised by experimental methods, or by building formal mathematical models for analytic evaluation and deductive reasoning. One reason for applying simulation is the complexity of the models under investigation. Mathematical and experimental approaches are feasible for the analysis of simple models. Adding more detail to refine simple models increases their complexity and thus makes analytic solutions harder to find. Models for experimental studies are often kept very simple in order not to ask too much of the participants. This is where computational techniques can help researchers to find solutions and explanations in the study of complex systems. In that sense, “Computational Economics can help to study qualitative effects quantitatively” [29].
One popular computational approach is to implement heterogeneous agent models, also known as agent-based computational economics (ACE). [26] presents a number of reasons for the increasing popularity of ACE. One of the most interesting arguments in [26] is the idea of heterogeneous expectations. There would be no trade in a fully rational world where it is common knowledge that all agents are rational, since agents with superior private information are not able to profit from that information. This is because all other traders realize that such an agent’s trade request is based on private information, and thus they will not trade with him.¹ Another argument is the insight (e.g. from laboratory experiments) that individuals often do not behave rationally. Software agents are well suited to modelling bounded rationality.

This chapter is concerned with agents used for simulation studies in economics. Agent-based approaches have become popular in recent years, but economists still criticise the missing link between agent-based approaches and traditional economic theory. Section 18.2 discusses the application of simulation in economics in general. Section 18.3 presents the most common agent-based approaches and links them to standard economic theory. The chapter closes with a summary of the key issues.

¹ For literature on the no trade theorem see also [37, 19].

18.2 Simulation in Economics

In general, simulation in economics does not differ from simulation in other fields. It is understood as a particular type of modelling of the real world or of other systems [20]. Simulations are an imitation of real world processes over time [7] and support the study of these systems. To this end, simulation models represent real world systems in a less detailed and less complex way, described by input and output variables and certain system states. An artificial history of the system is generated during the course of the simulation. This artificial history is used for observation and to draw inferences concerning the objective of the study. [20] understand simulation as a method for theory development, since a specific theory can be studied within a simulation and the results can be used to refine the theory. This process is also depicted in Figure 18.1.

Figure 18.1. Simulation for theory development (figure not reproduced; surviving label: “Refinement”)

Simulations are often understood as an alternative approach to doing research in social science. [42] argues that social science theory can be expressed either mathematically, using formalized expressions, or by verbal explanation. In addition, he regards simulation as a third way of expressing theoretical propositions, namely in computer models. This can be understood as using simulation as a third methodological approach apart from deduction and induction. In the context of evaluating electronic markets, [39] identifies simulation as a third research method besides axiomatic reasoning (cp. deduction) and experimental techniques (cp. induction). Conducting a simulation starts, as in deduction, with the determination of assumptions. In a deductive approach, the consequences of these assumptions are proven. Simulations are not applicable for theorem proving, but they generate data that can be analysed inductively. In this respect, simulations use techniques from both induction and deduction, although the data on which the induction is based are not observed in the real world but generated by a model built on assumptions. Hence, simulations can also be understood as thought experiments in which the assumptions may be simple but the consequences may not be obvious (see e.g. [5]). In the following, advantages and disadvantages of simulation are discussed briefly.

18.2.1 Benefits of Simulation

The advantage of simulation compared to other research methods is primarily that the simulation designer controls every parameter and can adapt it to problem-specific circumstances. This allows for both normative and descriptive studies.

A well designed simulation system can help to understand and explain real world systems and to describe certain observed phenomena by comparing different simulation settings. The descriptive application of simulation is especially useful for teaching and training. Normative studies, in turn, can be conducted by changing a particular parameter and studying the consequences for the system’s development and its output. This helps to diagnose systems. The simulation speed can be varied (in effect, time can be compressed or expanded), which enables the study of slow-motion activities as well as of long periods of time. In comparison to other research methods such as experiments or surveys, simulations are less expensive, because there are no expenses for resource acquisition or for reward payments to experiment participants. Agent-based simulations in particular provide some additional advantages. [6] remarks that agents make it possible to model heterogeneous behaviour and limited rationality, which is not possible in mathematical theorizing.

18.2.2 Difficulties of Simulation

On the other hand, simulations face diverse problems. It is difficult to build appropriate models that accurately represent the real systems. It is therefore essential to verify the correctness of the simulation model, yet in some cases mistakes in a simulation model are hard to identify. Wrong conclusions may then be drawn from incorrect simulation models and taken for granted. Thus, the quality and correctness of simulation results is a sensitive issue. The development of good models is often not only difficult but also time consuming. It may also be difficult to interpret the collected data. One of the most difficult problems in the development of simulation models is deciding which methodological approach to follow. A variety of methods is available, such as system dynamics, queuing models, agent-based simulation, or stochastic models. Only a few of these will produce appropriate results when applied to a specific problem. The use of simulation in economics has grown rapidly in recent years, and some work has been done on the question of how to apply certain methods (e.g. [20]).

Agent-based Computational Economics (ACE) is one specific branch within CE. It describes the “computational study of economies modelled as evolving systems of autonomous interacting agents” [52]. This research approach builds upon an understanding of the economy not from the macro-level perspective of a system of aggregated demand and supply, but from the micro level, as the result of the interaction of numerous independent individuals. [32] notes in a talk² that economics can be understood as the study of those phenomena emerging from the interactions among intelligent, self-interested individuals. ACE conducts interdisciplinary work at the interface of Cognitive Science, Computer Science, and Evolutionary Economics (see also Figure 18.2). Agent-based simulation itself provides several mechanisms, such as genetic algorithms, stochastic components, or other evolutionary techniques. Thus, the decision on which method to apply also has to be made at this level.

² The talk was given to the European Association for Evolutionary Political Economy and is published as an essay.

Figure 18.2. Agent-based computational economics (ACE) at the intersection of Evolutionary Economics, Cognitive Science, and Computer Science, source: T. Eymann at http://www.econ.iastate.edu/tesfatsi/ace.htm

In recent years, more and more effort has been made to link agent-based simulation approaches to economic research. Economists are still suspicious of agent-based simulation approaches, since the theory underlying a simulation model is often not linked to existing economic theory. Agent-based economics researchers have started to close this gap, although there are still open issues. To the author’s knowledge, there is no summary of possible agent-based simulation techniques and their existing links to economic theory. Since such an overview facilitates the work of agent-based economics researchers on the one hand, and makes ACE more credible among economists on the other, this chapter presents the main agent-based modelling techniques and summarises the literature on bridging the gap to economic theory.

[6] presented a valuable analysis of the use of agent-based simulation with respect to the solvability of the underlying theoretical model. He argues that there are three distinct alternatives for the use of agent-based techniques: (i) fully solvable models, (ii) partially solvable models, and (iii) insolvable models. In the first class of models (i), agent-based simulations are applied to models that are completely solvable. In this case, simulations serve for model validation, exemplification of the solution, or the application of stochastic simulations. For example, solutions to mathematical models are often found only in numerical form. If numerical samples are available, an agent-based model can be applied to validate the results.

In those cases in which there is not only a numerical solution but also a solution in closed form,³ simulations can be applied to present the solution more clearly or to study the dynamics of the model. Many models are based on stochastic functions, e.g. the output of the model depends on stochastic input. The functional relation between input and output may be determinable. In such cases, stochastic sampling on the basis of agent-based simulation can be applied: input values are drawn at random in order to determine the output. Stochastic sampling in general is known as “Monte Carlo simulation” and is a common approach in science.

In many cases it is only possible to solve a mathematical model partially (ii), e.g. if a solution is only possible under restrictive assumptions that make the results ambiguous, or if the model cannot be solved completely. In such cases agent-based simulation can be applied as a complement to mathematical theorizing. [6] remarks that one has to distinguish between simulation as an instantiation of a mathematical model and simulation as a supplement to mathematical models. The former case is equivalent to (i). Simplifying such complex models requires assumptions. Agent-based models make it possible to relax such assumptions, since heterogeneous agents can be implemented. In a first stage, agents can be applied in simple models that are restricted by many assumptions, and the outcome can be validated against the partly available solutions or numerical results. In a second stage, the assumptions can be relaxed and the (same) agents can act in the more complex environment. This facilitates a structured analysis of the model.

Insolvable models (iii) are rarely analysed in scientific papers, since they do not yield new insights into a given problem. Nevertheless, agent-based simulation might be a starting point for better understanding models that are not mathematically solvable.
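The stochastic sampling idea for fully solvable models (i) can be sketched in a few lines of Python. The input-output relation model_output, the uniform input distribution, and the sample size are hypothetical choices, picked only because the exact mean output (1/3) is known in closed form and can serve as a check on the sampled result.

```python
import random
import statistics


def model_output(x: float) -> float:
    """Hypothetical input-output relation of the model (an assumption)."""
    return x * x


def monte_carlo_mean(n_draws: int, seed: int = 0):
    """Draw random inputs, evaluate the model, and return (mean, standard error)."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    outputs = [model_output(rng.random()) for _ in range(n_draws)]
    mean = statistics.fmean(outputs)
    std_err = statistics.stdev(outputs) / n_draws ** 0.5
    return mean, std_err


estimate, std_err = monte_carlo_mean(100_000)
exact = 1.0 / 3.0  # closed-form mean of x^2 for x uniform on [0, 1]
print(f"sampled mean {estimate:.4f} +/- {std_err:.4f}, exact value {exact:.4f}")
```

Comparing the sampled mean with the exact value mirrors the validation use described above; when no closed form exists, only the sampled estimate and its standard error remain.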
The classification described above is problem-oriented: it sheds light on the question of which problem or model is handled in agent-based approaches. As argued earlier, there is also a need for a method-oriented view of agent-based simulation, answering the question of how to apply agent-based simulation and which techniques to use. Different techniques are applied in agent-based simulation. The four most popular approaches found in the literature are pure agent-based simulation, Monte Carlo techniques, evolutionary (genetic) approaches, and reinforcement learning techniques. The first approach builds agent societies on the basis of simple rules. Monte Carlo simulations are used either to sample complex models or to model stochastic processes, e.g. non-deterministic behaviour. The main criticism to which agent-based simulation is exposed is its insufficient theoretical basis. In recent years, researchers have started to link agent-based approaches to well-founded theory. Evolutionary mechanisms inspired by biological studies have been widely used to model evolving systems. Today, evolutionary techniques are applied in accordance with well-founded game-theoretic approaches such as evolutionary game theory. Besides evolutionary mechanisms, it is popular to apply reinforcement learning methods, which are related to Markov theory and stochastic games. Certainly, it is not possible to provide an unambiguous classification, since the approaches discussed can also be combined. Nevertheless, these four approaches appear to be the most widely applied techniques. Therefore, the following sections briefly summarize these four methods.

³ The term closed form denotes mathematical solutions of functions or equations in a general and explicit form with a finite number of operations. For example, the integral $F(x) = \int f(x)\,dx$ of a function such as $f(x) = x^2 + cx + 1$ can be solved in general, giving $F(x) = \tfrac{1}{3}x^3 + \tfrac{c}{2}x^2 + x + C$. Sometimes it is only possible to provide an approximation for an integral and to compute numerical examples. Such solutions are not considered “closed form”.

18.3 Agent-based Simulation Approaches

The basic idea of agent-based simulation approaches is to build simple models of a real world system in which agents interact on the basis of social rules. This makes it possible to study macro behaviour from a micro-behavioural perspective. The first work in this area was done by [58] with the theory of self-reproducing automata. It was the 2005 Nobel laureate Thomas Schelling who introduced one of the first agent models, in order to explain the cultural segregation of American residential districts into areas of white and black residents [47]. At that time, computational power was limited. During the last decade, computational progress has made large-scale agent populations possible. That is also why most of the work on ACE has been done in the last ten to fifteen years.

This section briefly presents the four main agent-based approaches used in economics. Since economists are still suspicious of agent-based simulation, increasing effort is being made to link agent-based approaches to standard economic theory. Evolutionary techniques in particular show parallels to economic theory and are therefore presented in the following, besides pure agent-based techniques and Monte Carlo simulation.

18.3.1 Pure Agent-based Simulation: The Bottom-up Approach

Schelling studies spatial arrangements in which agents of different characteristics prefer to live in a neighbourhood with a certain characteristic, e.g. the colour of the neighbours. Agents prefer that at least a particular fraction of their neighbourhood has their own colour. Changing these preferences leads to different compositions of the neighbourhood. In what is here called the pure agent-based approach, the model consists of several agents, the environment in which the agents act, and interaction rules. These interaction rules are fixed for each agent, meaning that the agents’ action sets do not change, in contrast to approaches that use learning mechanisms such as reinforcement learning or genetic algorithms.

[11, p. 7280] presents a very simple but striking example of how changing fixed interaction rules alters the model dynamics: in a group of 10 to 40 agents, each agent is randomly and secretly assigned a protector and an opponent. Each agent tries to hide from the opponent behind the protector. This simple rule results in a highly dynamic system of agents moving around in a seemingly random fashion. If the rule is changed so that the agents have to place themselves between opponent and protector, the system ends up with everyone clustering in a tight knot. This example shows how simple rules result in complex dynamic systems.

[15] incrementally developed Sugarscape, a model of a growing artificial society that received much attention in the agent-based research community. Agents are born into a spatial distribution, the Sugarscape. Agents have to eat sugar in order to survive and thus move around the landscape to find sugar. Starting with simple movement rules, more and more social rules such as sexual reproduction, death, conflicts, history, diseases, and finally trade are introduced. This results in a quite complex society that makes it possible to study several social and economic aspects.

A last example is the computational market model of [50], which builds a small society of agents that can produce either food or gold. Each agent needs food for survival and is equipped with a certain skill level for producing gold or food. Food can be traded for gold. In such a simple model, agents switch from producing food to producing gold, or vice versa, if the exchange rate passes the individual level beyond which it is more lucrative for them to produce the other product. This results in cyclic trade behaviour with large amplitudes of volume and price. When speculators are added, i.e. agents that are capable of storing gold and food, the price stabilises significantly.
The examples described above introduce agent models in their simplest form. Such models are difficult to treat analytically. They exploit the normative effect of agent simulation: particular rules are controlled or added, and their impact on the model’s behaviour is observed. Economists are often interested not only in the dynamics of a model but also in its stability. They want to discover equilibrium states and optimal strategies that lead to equilibrium. Pure agent models are not well suited for finding new strategies, which is why learning techniques or stochastic components have to be used. Such approaches are introduced in the next sections.

18.3.2 Monte Carlo Simulation

In general, the term Monte Carlo (MC) denotes methods for mathematical experiments using random numbers. Monte Carlo methods became widely known through the Manhattan Project, which developed the nuclear bomb during the Second World War. The research group around John von Neumann, namely the researchers Ulam, Metropolis and Fermi, developed the method and applied random number experiments.

18 Agent-based Simulation for Research in Economics


The name Monte Carlo is ascribed to Metropolis, who alluded to Ulam’s passion for poker by referring to the famous casino of Monte Carlo. A first article on stochastic experiments, entitled The Monte Carlo Method, was published by Metropolis and Ulam in 1949 [36]. In fact, stochastic processes for experiments were known earlier,⁴ but the method was systematically developed primarily in the late 1940s by Ulam, Fermi, von Neumann and Metropolis, and has been available for application in many scientific fields since then. The problems studied by Monte Carlo methods can be divided into probabilistic and deterministic problems [23]. Probabilistic problems are cases in which random variables are used to model real stochastic processes, e.g. in queuing theory. One speaks of deterministic Monte Carlo cases if formal theoretical models exist that are impossible or hard to solve numerically. In these cases, the Monte Carlo method helps to find solutions by applying random numbers, e.g. in computing the number π.⁵ In the social sciences, the Monte Carlo method is applied to different problem structures. In Operations Research (OR), probability distributions are used within models, e.g. in queuing theory. In other fields, e.g. in finance, stochastic processes are used to simulate order flows in stock markets. In agent-based simulation, Monte Carlo methods are used for both probabilistic and deterministic models. In most agent-based simulation models, probabilistic MC methods are applied, e.g. to assign valuations to agents or to simulate noise. A well-known example is the contribution of [21], in which a double auction market is simulated by agents with zero intelligence. Instead of implementing fixed strategies, agents in this market draw their bid values from uniform distributions regardless of their true valuations. This strategy is used to represent boundedly rational agents.
[13] apply a deterministic MC approach to sequential, (possibly) multi-unit, sealed-bid auctions. Each agent samples the valuation space of its opponents and solves the incomplete-information game that results from the sample. The results of the samples are aggregated into a policy. This approach is used since the “straightforward expansion of the game is intractable even for very small problems and it is beyond the capability of the current algorithms to solve for the Bayes-Nash equilibria” [13, p.154]. As explained earlier, stochastic sampling is an appropriate approach if the problem is hard or impossible to solve. The next section introduces evolutionary approaches to agent-based simulation.

⁴ [23] refer to work by [22] that describes the computation of π by means of random processes.

⁵ A needle of length $l$ is dropped on a board with parallel lines at distance $t$. The number of trials $N$ and the number of trials $R$ in which the needle crosses a line are counted. The probability of the needle crossing a line is directly related to the value of $\pi$. The needle crosses a line if the distance of the needle’s centre to the closer line is smaller than $0.5\, l \sin(\theta)$, with $\theta$ being the angle between the needle and the lines. For $l \le t$, the probability of the needle crossing a line is $2l/(t\pi)$, which is approximated by the ratio $R/N$. Hence, $\pi$ can be estimated as $\pi \approx 2lN/(tR)$.
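The needle experiment described in the footnote can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure of [22] or [23]: the function name `estimate_pi_buffon`, its parameters and the fixed seed are hypothetical choices made here, and the sketch assumes that the needle is no longer than the line distance. Sampling the angle uses `math.pi`, so the code illustrates the estimator rather than an independent computation of π.

```python
import math
import random


def estimate_pi_buffon(trials, needle_len=1.0, line_gap=2.0, seed=0):
    """Estimate pi by dropping a needle on a board with parallel lines.

    Assumption of this sketch: needle_len <= line_gap, so that the
    crossing probability equals 2 * l / (t * pi).
    """
    if needle_len > line_gap:
        raise ValueError("this sketch assumes needle_len <= line_gap")
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        # Distance from the needle's centre to the closer line.
        dist = rng.uniform(0.0, line_gap / 2.0)
        # Angle between the needle and the lines.
        theta = rng.uniform(0.0, math.pi)
        if dist <= 0.5 * needle_len * math.sin(theta):
            crossings += 1
    if crossings == 0:
        raise ValueError("no crossings observed; increase trials")
    # P(cross) = 2 l / (t pi)  =>  pi is estimated by 2 l N / (t R).
    return 2.0 * needle_len * trials / (line_gap * crossings)


if __name__ == "__main__":
    print(estimate_pi_buffon(200_000))
```

The estimate converges slowly (the error shrinks with the square root of the number of trials), which is typical of Monte Carlo methods.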


18.3.3 Evolutionary Approach

In 1975, John Holland was the first to transfer the idea of biological genetic evolution to artificial adaptive systems [24]. In biology, the genotype is stored on chromosomes that are combined during reproduction. Computer science uses these biological improvement mechanisms as a search technique. Genetic algorithms (GA) have become popular in economic simulations as a way to model the learning of economic agents. GA are applied to find solutions to optimization problems. The search space is encoded in a genetic representation, which is explored by means of an “intelligent” random search. GA are intelligent in that they use historical information to improve the current best strategies in the direction of local or global optima. They do so by applying Charles Darwin’s proposition of “the survival of the fittest”, which states that the fittest individuals survive the battle for scarce resources. Basically, actions or sequences of actions are assigned values, e.g. performance, utility, or payoff. The actions are genetically encoded as strategies, e.g. in string or binary representation. These chromosome representations form the genetic strategy set. A subset of the strategy set forms a population, also called a generation. Each strategy of a population is assigned a value, also called fitness, representing how well the strategy performed in the past. Each time a strategy from the pool of one generation is played, it earns a reward that serves as feedback on the performance of this particular strategy. The fitness is adapted according to the reward. Some strategies will perform better than others. In order to improve the population of strategies, means of innovation, experimentation and replication are applied. GA are widely used in science and industry, e.g.
in industry for the optimization of production schedules, for minimizing the size of circuit designs of computer chips, for improving the efficiency of machines, or for efficient network design [41]. In social science simulation, GA are used as learning algorithms to model the boundedly rational behaviour of software agents.

18.3.3.1 The Learning Model

GA can be understood as models of strategic interaction. That is, economic agents try to find the best strategy in the given environment. Thus, “GA are models of social learning, which in fact is a way of learning by interaction” [46, p.48]. Basically, two components are necessary: (1) the genetic representation of the solution space, and (2) an appropriate fitness function that is used to evaluate the solution space. Each possible solution (strategy) is represented by a gene, e.g. an array of bits or characters. The fitness of a gene represents its ability to compete and derives from payoffs achieved in the past. As such, GA build beliefs about rules in response to experience. The frequency of representation in a population and the fitness of a given strategy indicate the performance of that strategy. In order to determine optimal strategies, it is necessary to search the strategy set. This is done by manipulating the population with the three basic genetic modification mechanisms: selection (also understood as reproduction or imitation), crossover (also understood as communication), and mutation (also understood as innovation or learning by experiment).

Selection simply means that the strongest strategies of a population in terms of fitness are reproduced identically into the next generation. The probability of a strategy i being reproduced from the current population n into the next period’s population depends only on its relative fitness R(i/n), i.e. the individual’s payoff in ratio to the population’s aggregate payoff [46, p. 48].

During crossover, two chromosomes are drawn from the population and split at a randomly drawn position. Then the parts of each chromosome are recombined with those of the other. The crossover operator is also understood as a communication mechanism. It might cause a dramatic change from an individual perspective, but it does not introduce completely new strategies into the population. Figure 18.3 depicts an example of crossover.

Figure 18.3. Adaption of strategies through crossover

Mutation is an operator that randomly manipulates the chromosome by flipping single bits or characters with a (small) mutation probability μ. It is also understood as an operator for experimentation, since it is the only way of introducing abrupt changes. Mutation is just a small change to the individual strategy, but it might bring a completely new strategy into the population.

GA have been widely used for simulating adaptive and boundedly rational agents in economics. Early applications were described by [38] and by [4], who applied GA to the prisoner’s dilemma, and by [1], who simulated auctions under human-like behaviour. [25] give a brief overview of artificial adaptive agents in economic theory. [2] remarks that one of the basic questions in applying GA to economic problems is how well suited GA are to discovering Nash equilibria and to modelling individual and social learning.
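The three operators described in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in any of the cited studies: strategies are assumed to be bit strings of equal length (at least two bits), selection is assumed to be fitness-proportional, and all names (`select`, `crossover`, `mutate`, `next_generation`, `toy_fitness`) are hypothetical.

```python
import random


def toy_fitness(chromosome):
    """Hypothetical fitness: number of 1-bits plus one (kept strictly positive)."""
    return chromosome.count("1") + 1


def select(population, fitness, rng):
    """Selection: reproduce strategies with probability equal to relative fitness."""
    weights = [fitness(s) for s in population]
    return rng.choices(population, weights=weights, k=len(population))


def crossover(a, b, rng):
    """Crossover: split both chromosomes at one random position and swap tails."""
    cut = rng.randrange(1, len(a))  # requires at least two bits
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(chromosome, mu, rng):
    """Mutation: flip each bit independently with (small) probability mu."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < mu else bit
        for bit in chromosome
    )


def next_generation(population, fitness, mu, rng):
    """Apply selection, pairwise crossover and mutation once."""
    parents = select(population, fitness, rng)
    children = []
    for i in range(0, len(parents) - 1, 2):
        children.extend(crossover(parents[i], parents[i + 1], rng))
    if len(parents) % 2:  # an unpaired parent is copied unchanged
        children.append(parents[-1])
    return [mutate(c, mu, rng) for c in children]


if __name__ == "__main__":
    rng = random.Random(0)
    population = ["0000", "1111", "0101", "0011", "1100"]
    for _ in range(10):
        population = next_generation(population, toy_fitness, 0.01, rng)
    print(population)
```

Note that crossover only recombines material already present in the population, whereas mutation is the only operator in this sketch that can create a bit pattern no parent carries.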
18.3.3.2 Evolutionary Games

Game theory studies strategic situations under the assumption of fully rational agents. The analysis of such models is often difficult or impossible. Agent-based simulation using GA models the economy as boundedly rational agents that try to adapt to their environment. Bringing both views together is an important step towards a better understanding of the economy. [31] argues for a middle way between pure agent theory and game theory. Advances in this direction have been made, e.g. by [35], who compares evolutionary processes with learning models, and by [46], who compares evolutionary game theory and GA learning. The latter author is able to show


links between GA and evolutionary game theory, i.e. GA can be understood as “a specific form of a repeated evolutionary game” [46, p. 58]. The game-theoretic concepts of evolutionarily stable states and Nash equilibria are applied to GA, but they are mainly valid for single-population GA. An evolutionary game is understood in the words of [18, p. 16]: “By an evolutionary game, I mean any formal model of strategic interaction over time in which (a) higher payoff strategies tend over time to displace lower payoff strategies, but (b) there is some inertia, and (c) players do not systematically attempt to influence other players’ future actions.”

The most common example is the Hawk-Dove game, where hawks and doves compete for a nutrition source $v$. If two hawks meet at the source, they fight each other and each wins with a probability of 50%, thus receiving an expected payoff⁶ of $(v - c)/2$. If a hawk meets a dove at the source, the hawk dispels the dove and gets the whole source for itself (payoff $= v$), whereas the dove earns 0. Two doves share the source equally and thus each receive a payoff of $v/2$.

One solution concept in evolutionary games is that of evolutionarily stable strategies/states (ESS). A strategy $\sigma$ is called an ESS if it protects the population against an invading strategy $\mu$. Thus, if $F_0$ is the initial fitness, $F(s)$ is the total fitness of an individual following strategy $s$, $\Delta F(s_1, s_2)$ is the change in fitness for playing $s_1$ while the opponent plays $s_2$, and $p$ is the proportion of the population following the mutant strategy $\mu$, then we write:

$$F(\sigma) = F_0 + (1 - p)\,\Delta F(\sigma, \sigma) + p\,\Delta F(\sigma, \mu)$$

$$F(\mu) = F_0 + (1 - p)\,\Delta F(\mu, \sigma) + p\,\Delta F(\mu, \mu)$$

Since $p$ is close to 0, it follows that $F(\sigma) > F(\mu)$ if either $\Delta F(\sigma, \sigma) > \Delta F(\mu, \sigma)$, or $\Delta F(\sigma, \sigma) = \Delta F(\mu, \sigma)$ and $\Delta F(\sigma, \mu) > \Delta F(\mu, \mu)$.

Disadvantages of this solution concept are that, on the one hand, strategies in evolutionary games are repeatedly compared pairwise and, on the other hand, an explicit formulation of the selection process underlying the concept does not exist [46, p. 52]. In contrast, GA learning compares each single strategy to the aggregate of the other strategies. [46, p. 51] directly links this fact to Nash equilibria: “A Nash strategy is defined as the best strategy given the strategies of the competitors […] As such a genetic population can be interpreted as a population of agents, each trying to play Nash.” This enables researchers to apply all the analytical tools available for evolutionary game theory to GA. In this respect, the findings of [46] are an important step towards closing the gap between evolutionary game theory and computational techniques. The next section introduces reinforcement learning theory.
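The ESS condition stated above can be checked mechanically for the pure strategies of the Hawk-Dove game. The following sketch is an illustration only: the strategy labels "H" and "D", the function names and the numerical values of v and c are assumptions introduced here.

```python
def delta_f(own, other, v, c):
    """Change in fitness for playing `own` against `other` in Hawk-Dove."""
    payoffs = {
        ("H", "H"): (v - c) / 2,  # fight: win v or bear cost c, 50% each
        ("H", "D"): v,            # hawk dispels dove
        ("D", "H"): 0.0,          # dove retreats
        ("D", "D"): v / 2,        # doves share the source
    }
    return payoffs[(own, other)]


def total_fitness(s, incumbent, mutant, p, v, c, f0=0.0):
    """F(s) = F0 + (1 - p) * dF(s, incumbent) + p * dF(s, mutant)."""
    return (f0
            + (1 - p) * delta_f(s, incumbent, v, c)
            + p * delta_f(s, mutant, v, c))


def is_ess(sigma, mu, v, c):
    """ESS condition for a small mutant share p, as stated in the text."""
    against_sigma_own = delta_f(sigma, sigma, v, c)
    against_sigma_mut = delta_f(mu, sigma, v, c)
    if against_sigma_own != against_sigma_mut:
        return against_sigma_own > against_sigma_mut
    return delta_f(sigma, mu, v, c) > delta_f(mu, mu, v, c)


if __name__ == "__main__":
    # Source worth more than the injury cost: Hawk resists invasion by Dove.
    print(is_ess("H", "D", v=4, c=2))
    # Injury cost exceeds the source: neither pure strategy is stable.
    print(is_ess("H", "D", v=2, c=4), is_ess("D", "H", v=2, c=4))
```

With v = 4 and c = 2 a hawk population cannot be invaded by doves, whereas with v = 2 and c = 4 neither pure strategy satisfies the condition, which is the case in which only a mixture of hawks and doves can be stable.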

⁶ c denotes the injury costs.

18.3.4 Reinforcement Learning

Reinforcement learning (RL) is a goal-directed, trial-and-error learning procedure. According to [30], RL has relations to cybernetics, statistics, psychology, neuroscience, and computer science. [51, p. 16] allude to its origin in two threads that developed independently. The first thread derives from learning by trial and error, inspired by research in psychology. This approach was already pursued in early work on artificial intelligence. The second main thread concentrates on the problem of optimal control and its solution through value functions and dynamic programming. The merging of both research branches turns RL into an important research approach that is theoretically well founded from a (human) behavioural point of view as well as on the basis of economic theory. The “Law of Effect” states that, out of a set of possible actions, the action that results in the largest satisfaction is the most likely to be chosen [53, 57].⁷ The largest satisfaction is determined by trial and error, and in the course of time the action with the best outcome in the past is chosen. This effect combines two characteristic and important aspects: search (selection) and memory (association). RL is selectional since it tries alternatives and selects the best one by comparing consequences. It is also associative because the action selection depends on particular states. [16] additionally consider the “Power Law of Practice” that was introduced by [10]. This law states that learning is more intensive at the beginning of the process and slows down over the course of time. The optimal control branch was initiated, among others, by [8], who designed a controller that minimizes a measure of a dynamic system’s behaviour over time. His approach implements an optimal return function and uses (dynamic) system states. These two factors define the dynamic programming equation, which is also known as the Bellman equation⁸ for discrete values. The discrete stochastic version of the dynamic control problem became known as the Markov Decision Process (MDP) [9], which specifies a course of state transitions that are independent of state transitions in the past.
[27] contributed the policy iteration method, which was modified by [45] and [44]. These contributions are the basis of RL theory. The idea of the MDP links the computer science theory of RL to the economic theory of stochastic games, as shown in the paragraph on Markov games. The development of RL from psychology on the one hand and mathematics on the other turns RL into a well-qualified foundation for agent-based simulation. It is possible to model human behaviour, as e.g. [16] have shown. Additionally, the models have a mathematical foundation that can be linked to economic theory. The next paragraph describes the RL mechanism. The paragraph “Markov Games” introduces the link to stochastic game theory.

⁷ It was Edward L. Thorndike who studied this effect in animal experiments during the years 1898 to 1911. [14] remarks that Thorndike published a number of articles ([53, 54, 55, 56, 57]) in which he implicitly refers to the law of effect without explicitly defining it. It is in the 1907 monograph that the term is used for the first time. This makes it difficult to determine the date of origin for correct citation.

⁸ Bellman’s work is based on work of the Irish scientist William Rowan Hamilton (1805–1865) and the German mathematician Carl Gustav Jakob Jacobi (1804–1851). The Hamilton-Jacobi equation is a non-linear differential equation for the description of a system’s behaviour.


18.3.4.1 The Learning Model

Reinforcement learning focuses on the goal-directed learning of agents in an uncertain and possibly dynamic environment. Agents seek to achieve goals by performing actions that result in (delayed) rewards. The agents can learn from rewards received in the past by assigning the rewards to the actions that were performed in a particular state. At the beginning of the learning process, agents do not have preferences for certain actions. Thus, agents proceed by trial and error in order to discover the best action. The model is characterised by the environment and by agents that interact with the environment in order to achieve a particular goal. Agents are not certain about their environment since they do not possess all information. Instead, agents have a perception of the environment that gives an indication of the current state, which consists of environmental and (optionally) agent-internal state parameters. Agents face the problem of deciding which action to take in a particular state. This decision problem is often described as the n-armed bandit problem. An n-armed bandit can also be understood as n one-armed bandits.⁹ Since the reward is not deterministic, it is difficult for the agent to determine an optimal policy. The agent faces the problem of either exploiting the action space by choosing actions with high values, or exploring the action space by trying actions with low values. The probability of making a win is equal for all actions regardless of their performance in the past. The decision problem of exploration versus exploitation applies to all reinforcement learning problems. Reinforcement learning is well suited to determining an appropriate action-selection mechanism by optimizing the direct feedback from the environment. The agent identifies certain environmental states and maps its perception of these states into particular actions.
The tuple $(S, A, P, r)$ describes a set of environmental states $S$, a set of possible actions $A$, a probability $P: S \times S \times A \to [0,1]$ for the transition from state $s_1$ to state $s_2$ when choosing action $a$, and a scalar reward function (reinforcement) $r: S \times A \to \mathbb{R}$. The reinforcement is non-deterministic in most cases, i.e. taking action $a$ in the same state at two different points in time might not result in the same reward and transition. Each agent holds a policy $\pi$ that maps the current state $s$ to the action to be performed. The optimal policy maximises the sum of the expected rewards. The present value of a state is determined as the discounted sum of future rewards received by following some fixed policy $\pi$ in state $s$, i.e.

$$V_\gamma^\pi(s) = E\left(\sum_{t=0}^{\infty} \gamma^t\, r_{s,t}^{\pi}\right).$$

Here, $r_{s,t}^{\pi}$ is the random variable for the reward $r$ received $t$ time steps after policy $\pi$ has been started in state $s$, and $\gamma$ ($0 \le \gamma < 1$) is the discount rate (cp. [48, p.266]). The value function maps states to state values. It can be approximated by any kind of function approximator, such as a look-up table or a multi-layered perceptron. The problem of finding the optimal agent’s behaviour is equivalent to finding the optimal value function.

⁹ A one-armed bandit denotes a gambling machine with one arm that can be pulled at a specific cost. The one-armed bandit stochastically pays a reward greater than or equal to zero.
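The trade-off between exploration and exploitation in the n-armed bandit problem can be illustrated with a simple ε-greedy rule. The following sketch is not taken from the cited literature in this form: it assumes Bernoulli rewards (each arm pays 1 with a fixed probability, else 0), estimates action values by running averages, and uses hypothetical names (`run_bandit`, `win_probs`, `epsilon`).

```python
import random


def run_bandit(win_probs, steps, epsilon=0.1, seed=0):
    """Play an n-armed bandit with an epsilon-greedy action selection.

    win_probs[k] is the (assumed) probability that arm k pays a reward of 1.
    Returns the number of pulls and the estimated value of each arm.
    """
    rng = random.Random(seed)
    n = len(win_probs)
    counts = [0] * n
    values = [0.0] * n  # running average of the rewards per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: try any arm
        else:
            arm = max(range(n), key=values.__getitem__)  # exploit best estimate
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the average reward of the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    counts, values = run_bandit([0.1, 0.9], steps=5000)
    print(counts, values)
```

Without the exploratory pulls the agent in this sketch would stay with the first arm that ever paid a reward, which is why some exploration is needed even when an action currently looks inferior.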


Table 18.1. Reinforcement learning approaches for model-based and model-free settings

Model-based                 Model-free
Value Iteration             Adaptive Heuristic Critic (AHC)
Policy Iteration            Q-learning
Modified Policy Iteration   Averaged values

Different methods for modelling optimal behaviour have been proposed in the literature. [30] distinguish model-based and model-free approaches, which are summarized in Table 18.1. Here, the term “model” refers to knowledge about the state transition probabilities P: S × S × A → [0,1] and about the reinforcement function r: S × A → R. In most economic models, the state transitions are not known to the agents. One speaks of a model-based approach if estimates of the model (P and r) are learned and used to find an optimal policy. In model-free approaches, the optimal policies are learned without building a model. Since Q-learning is the most widely used method in economic models, it is explained in more detail in the remainder of this section. It was proposed by [59] and is an asynchronous dynamic programming approach. In general, dynamic programming provides algorithms to compute optimal policies in Markov Decision Processes (MDP). [27] defines an MDP by a set of states, a set of actions, and a transition function. An MDP satisfies the Markov property, which states that transitions from the current state are independent of previous states. That is, the history of how a particular state was reached has no impact on the transition probabilities to further states. Reinforcement learning models that satisfy the Markov property are Markov Decision Processes [51]. Note that MDP theory assumes a stationary environment, i.e. the probabilities of state transitions or of receiving a specific reward do not change over time. Q-learning is often used in ACE, since the approach can be linked to Markov games as one branch of game theory. Markov games are presented in more detail in Section 18.3.4.2. Q-learning can be understood as a controlled Markov process: agents control the state transitions by performing actions.
For each policy π, an action-value Q is defined that determines the “expected discounted reward for executing action a at state x and following policy π thereafter” [60, p. 280]:

$$Q^*(x, a) = R_x(a) + \gamma \sum_y P_{xy}(\pi(x))\, V^*(y) \qquad (18.1)$$

where $\gamma$ is the discount factor of future rewards, $V^*(y)$ is the optimal value in state $y$, and $P_{xy}$ is the probability function for the transition from state $x$ to $y$. Since $V^*(y) = \max_a Q^*(y, a)$, the Q-learning rule can be defined recursively:

$$Q(x, a) = Q(x, a) + \alpha \left( r + \gamma \max_{a'} Q(x', a') - Q(x, a) \right) \qquad (18.2)$$
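Rule (18.2) amounts to a single update of a look-up table after each experienced transition from state x to state x' under action a with reward r. The following sketch illustrates this step; the table layout (a dictionary keyed by state-action pairs), the function name `q_update` and the default values of α and γ are assumptions made for the example.

```python
from collections import defaultdict


def q_update(q, x, a, r, x_next, actions, alpha=0.5, gamma=0.9):
    """Apply one step of rule (18.2) to the table q (modified in place).

    q maps (state, action) pairs to Q-values; unseen pairs default to 0.
    Returns the new value Q(x, a).
    """
    # Value of the best action available in the successor state x'.
    best_next = max(q[(x_next, a_next)] for a_next in actions)
    # Move Q(x, a) towards the target r + gamma * max_a' Q(x', a').
    q[(x, a)] += alpha * (r + gamma * best_next - q[(x, a)])
    return q[(x, a)]


if __name__ == "__main__":
    q = defaultdict(float)
    actions = ["go", "stay"]
    print(q_update(q, "x0", "go", 1.0, "x1", actions))  # first visit
    print(q_update(q, "x0", "go", 1.0, "x1", actions))  # second visit
```

No transition probabilities appear in the update: the agent only needs the observed tuple, which is why Q-learning counts as a model-free approach.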


The tuple (x, a, r, x') denotes the experienced transition from state x to x' while performing action a and receiving reward r. It can be shown for bounded rewards and learning rates 0 ≤ α < 1 that Q(x, a) → Q*(x, a) with probability 1 as the number of repetitions of playing action a in state x goes to infinity [60, p.282]. Once Q has converged to the optimal Q*, it is optimal for agents to act greedily, i.e. to choose the action with the highest Q-value in the given state. Improvements to this algorithm have been proposed. An important contribution was made by [16], who developed a descriptive model of human-like learning behaviour in strategic environments. The reinforcement learning algorithms presented by Erev and Roth can be used for stateless learning. The performance of the Erev-Roth algorithm was tested on models from experimental economics, and the simulation results were compared to the experimental results obtained with human subjects. It was shown that the computational results were similar to the experimental results. The basic difference from Q-learning lies in the action selection. The Q-learning algorithm applies an exploration rate in order to ensure that actions are explored. Accordingly, the agent decides at random whether to explore or to exploit. During exploitation, the action with the maximum Q-value is chosen. The Erev-Roth algorithm introduces propensities into action selection. A propensity $q_{nk}$ to choose a particular action k is assigned to each possible action k of agent n. This propensity is adjusted with respect to the achieved result (reward). The probability $p_{nk}(t)$ of choosing action k at time t results from $p_{nk}(t) = q_{nk}(t) / \sum_j q_{nj}(t)$. Since this algorithm on the one hand simulates human behaviour and on the other hand builds upon the fundamental work on reinforcement learning, with good convergence to stable states, it is used in many social studies, e.g. in the market simulations of [40].
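The propensity-based action selection can be sketched as follows. The text only states that propensities are adjusted with respect to the reward; the additive update used here (the reward is simply added to the propensity of the chosen action) is an assumption corresponding to the most basic variant, and the function names are hypothetical. Propensities are assumed to start out strictly positive and rewards to be non-negative.

```python
import random


def choice_probabilities(propensities):
    """p_k = q_k / sum_j q_j for one agent's propensities q."""
    total = sum(propensities)
    return [q / total for q in propensities]


def reinforce(propensities, action, reward):
    """Return new propensities after an (assumed) additive reinforcement."""
    updated = list(propensities)
    updated[action] += reward
    return updated


def choose(propensities, rng):
    """Draw an action index with probability proportional to its propensity."""
    return rng.choices(range(len(propensities)), weights=propensities, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [1.0, 1.0]                      # equal initial propensities
    k = choose(q, rng)
    q = reinforce(q, k, reward=2.0)     # the chosen action paid a reward of 2
    print(k, choice_probabilities(q))
```

In contrast to the ε-greedy scheme there is no separate exploration step: every action with a positive propensity retains a positive probability of being chosen.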
The next section introduces the theory of stochastic games and links reinforcement learning techniques to game-theoretic approaches.

18.3.4.2 Markov Games

Markov games (MG) are a subfield of game theory. Since the seminal work of [49], MG have been widely studied; they are also known as stochastic games (SG) [43]. An MG can be described as a tuple $\langle n, S, A_{1,\dots,n}, T, R_{1,\dots,n} \rangle$, where $n$ is the number of players, $S$ is a set of states, $A_{1,\dots,n}$ are the action sets of the players, $T$ is a transition function $S \times A_1 \times \dots \times A_n \times S \to [0,1]$ that is controlled by the current state and the performed actions, and $R_i$ is agent $i$’s reward function $S \times A_1 \times \dots \times A_n \to \mathbb{R}$. Each player $i$ wants to maximise the expected sum of discounted rewards $E\left(\sum_{j=0}^{\infty} \gamma^j r_{i,t+j}\right)$. A policy $\pi$ maps states to actions, $S \to A$; thus, the goal is to find an optimal policy for each state. A policy is said to be optimal if there is no other policy that achieves a better result in any state. Comparing stochastic games with classical game theory, stochastic games are an extension of matrix games to multiple states [12], i.e. each state can be described as a matrix game. Matrix games are determined by the tuple $\langle n, A_{1,\dots,n}, R_{1,\dots,n} \rangle$, where $n$ is


the number of players, A = A1 × … × An is the joint action set, Ai is the action set available to player i, and Ri is player i’s payoff function. Alternatively, an MG can be understood as the extension of a Markov decision problem (MDP) to multiple players. MDPs have at least one optimal policy [34]. In Markov games, performance depends on the opponents’ actions. Thus, it is not simple to determine optimal policies. A common approach is to evaluate each policy assuming the opponent’s best choice, i.e. to maximize the reward in the worst case. [34, p. 158] states that each Markov game has a non-empty set of optimal policies, which might be probabilistic. Several algorithms have been proposed to solve stochastic games for optimal policies. For the analysis, zero-sum (also called purely competitive) and general-sum games have to be distinguished. In zero-sum games, the agents’ payoffs sum to zero, whereas in general-sum games the sum of payoffs can differ from zero. Almost all algorithms developed so far apply to zero-sum games. [49] has shown that equilibrium solutions exist for zero-sum games, and [17] were able to demonstrate the same for general-sum games. Both reinforcement learning theory and game theory have developed algorithms to find optimal policies. Nash equilibrium is one solution concept in game theory. In a Nash equilibrium, each agent’s strategy is a best response to the other agents’ strategies. Each finite bimatrix game has at least one Nash equilibrium in mixed strategies. The fundamental difference between game-theoretic solution concepts and reinforcement learning is that game theory assumes that the model of the game is known to the players, whereas reinforcement learning is basically applied to unknown models. Nevertheless, the approaches are very similar.
[49] proposes to compute the quality of each action $a$ in state $s$ as $Q(s, a) \leftarrow R(s, a) + \gamma \sum_{s'} T(s, a, s')\, V(s')$, and the value $V(s)$ of a state as the value of the matrix game at state $s$. The value operator is given by the minimax value of the game. The difference between this algorithm and the MDP’s value iteration is that the max operator is replaced by the value operator. Following this approach, [34] proposes the minimax-Q algorithm. It replaces the max operator in Q-learning with the value operator:

$$Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left( r + \gamma V(s') \right) \qquad (18.3)$$

and

$$V(s) = \max_{\pi(s,\cdot) \in \Pi(A_1)} \; \min_{a_2 \in A_2} \; \sum_{a_1 \in A_1} \pi(s, a_1)\, Q(s, a_1, a_2) \qquad (18.4)$$

It can be shown that in the zero-sum case this algorithm converges to the true values $V^*$. Consequently, it is straightforward to use Q-learning mechanisms for learning in stochastic games. This is possible in a single-agent learning environment. [28] have introduced an algorithm for multi-agent learning in two-player games. It is assumed that agents observe not only their own actions and payoffs but also the competing agents’ immediate payoffs. Under these assumptions the


algorithm converges to Nash equilibrium strategies. Unfortunately, the assumption of observing the competitors’ immediate payoffs is not applicable in many cases. This section has presented theoretical work in both fields, game theory and computer science, that builds the foundation for solving Markov games. Nevertheless, much theoretical work is still needed in order to prove the convergence of Q-learning algorithms in multi-agent learning environments.
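The value operator of equation (18.4) and the update of equation (18.3) can be illustrated for a single state of a zero-sum game. The sketch below is a simplification under stated assumptions: the learning player is assumed to have exactly two actions, so that the max-min over mixed policies can be found exactly by checking the end points and the intersection points of the opponent's payoff lines; for larger action sets a linear program would normally be solved instead. The function names are hypothetical.

```python
def matrix_game_value(q):
    """Maximin value of a zero-sum matrix game for a two-action row player.

    q[a1][a2] is the row player's Q-value for the joint action (a1, a2).
    Returns (value, p), where p is the probability of the first action.
    """
    top, bottom = q

    def worst_case(p):
        # Expected Q-value under policy (p, 1 - p) against the worst reply.
        return min(p * t + (1 - p) * b for t, b in zip(top, bottom))

    candidates = {0.0, 1.0}
    m = len(top)
    for j in range(m):
        for k in range(j + 1, m):
            # Each opponent action gives a line b + p * (t - b) in p;
            # the maximin lies at an end point or where two lines cross.
            denom = (top[j] - bottom[j]) - (top[k] - bottom[k])
            if denom != 0:
                p = (bottom[k] - bottom[j]) / denom
                if 0.0 <= p <= 1.0:
                    candidates.add(p)
    best_p = max(candidates, key=worst_case)
    return worst_case(best_p), best_p


def minimax_q_update(q_sa, r, v_next, alpha=0.5, gamma=0.9):
    """One step of (18.3): blend the old Q-value with r + gamma * V(s')."""
    return (1 - alpha) * q_sa + alpha * (r + gamma * v_next)


if __name__ == "__main__":
    pennies = [[1, -1], [-1, 1]]  # matching pennies, row player's payoffs
    print(matrix_game_value(pennies))
```

For matching pennies the sketch returns the value 0 with both actions played with probability 0.5, which shows why optimal policies in Markov games may have to be probabilistic.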

18.4 Summary

This chapter deals with the application of simulation in economic research. It reflects on the methodology of simulation and its applicability to economic models. Agent-based simulations have become more and more popular in recent years, but there is no common understanding of their different methodological approaches and of their appropriateness for studying specific problems. Some work has been done in the scientific community to close the gap between simulation approaches and economic theory. The present paper identifies four main techniques used in agent-based simulation, presents a brief review of the methodology, and discusses the known similarities to economic theory. Certainly, the four approaches can also be combined according to the economic problem under investigation.

The first technique is the fundamental pure-agent approach, which implements agents, the environment and sets of simple, (mostly) static interaction rules. It follows the agent-based philosophy of studying macro behaviour from a micro-oriented model. Pure agent-based modellers are not so much interested in a direct link to standard economic approaches and tools, but try to understand and explain real economic observations as a result of interacting agents.

Another approach is the Monte Carlo technique, which can be used either probabilistically, to model real-world processes, or deterministically, in order to find solutions by sampling the solution space. In the context of agent-based simulations, Monte Carlo techniques are often combined with other simulation approaches, e.g. to model stochastic processes or to simplify learning by stochastic search.

Evolutionary methods such as genetic algorithms are applied to search for stable states and to identify optimal strategies. Early work (see for example [35, 46]) linked computational methods to economic approaches such as evolutionary game theory. A better theoretical base is essential for economists to gain more confidence in agent-based techniques, but the cited work is a valuable contribution to closing the gap.

The theoretical foundation of reinforcement learning is developed by bridging game theory and computer science theory on the basis of Markov decision problems and Markov games. Some work exists on proving the convergence of RL methods to equilibria of economic games. Still, the assumptions in the mathematical foundations are strong, so that the link is well founded only for zero-sum Markov games. It is still necessary to prove convergence to optimal behaviour in the more common case of general-sum games. Nevertheless, the progress and the results made in this area are promising.

18 Agent-based Simulation for Research in Economics

The presented classification of agent-based approaches cannot claim to be comprehensive. There are other techniques known in computer science, such as neural networks or classifier systems. It is difficult to find a strict classification, since many simulation models combine the presented techniques. Nevertheless, the present paper presents the four most common techniques and their links to economic theory.

References

1. Andreoni JA, Miller JH (1990) Auctions with adaptive artificially intelligent agents. Santa Fe Institute Working Paper, No. 90-01-004
2. Arifovic J (1994) Genetic algorithm learning and the cobweb model. Journal of Economic Dynamics and Control, 18:3–28
3. Arthur WB, Holland J, LeBaron B, Palmer R, Tayler P (1996) Asset pricing under endogenous expectations in an artificial stock market. Technical report, Santa Fe Institute
4. Axelrod R (1987) Evolving new strategies – the evolution of strategies in the iterated prisoner’s dilemma. In Davis L, editor, Genetic Algorithms and Simulated Annealing, Pitman and Morgan Kaufmann, pp. 32–41
5. Axelrod R (2003) Advancing the art of simulation in the social science. Japanese Journal for Management Information System, 12(3), Special Issue on Agent-Based Modeling
6. Axtell R (2000) Why agents? On the varied motivations for agent computing in the social sciences. In Proceedings of the Workshop on Agent Simulation: Applications, Models and Tools, IL, Argonne National Laboratory
7. Banks J (1998) Principles of simulation. In Banks J, editor, Handbook of Simulation, New York, John Wiley & Sons, Inc., pp. 3–31
8. Bellman RE (1957) Dynamic Programming, Princeton University Press, Princeton
9. Bellman RE (1957) A Markov decision process, Journal of Mathematical Mechanics, 6:679–684
10. Blackburn JM (1936) Acquisition of skill: An analysis of learning curves, IHRB Report, No. 73
11. Bonabeau E (2002) Agent-based modeling: Methods and techniques for simulating human systems, in Adaptive Agents, Intelligence, and Emergent Human Organization: Capturing Complexity Through Agent-Based Modeling, Proceedings of the National Academy of Sciences of the USA (PNAS), 99 (3), Washington, DC, National Academy of Sciences of the USA, pp. 7280–7287
12. Bowling M, Veloso M (2000) An analysis of stochastic game theory for multiagent reinforcement learning, Technical Report CMU-CS00-165, Computer Science Department, Carnegie Mellon University, http://www.cs.ualberta.ca/ bowling/papers/00tr.pdf

Clemens van Dinther

13. Cai G, Wurman PR (2005) Monte Carlo approximation in incomplete information, sequential auction games, Decision Support Systems, Special Issue: Decision Theory and Game Theory in agent design, 39(2):153–168
14. Catania C (1999) Thorndike’s legacy: Learning, selection, and the law of effect, Journal of The Experimental Analysis of Behavior, 72(3):425–428
15. Epstein J, Axtell R (1996) Growing Artificial Societies – Social Science From the Bottom Up, MIT Press, Boston
16. Erev I, Roth AE (1998) Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria, The American Economic Review, 88(4):848–881
17. Filar J, Vrieze K (1997) Competitive Markov Decision Processes, Springer Verlag, New York
18. Friedman D (1998) On economic applications of evolutionary game theory. Journal of Evolutionary Economics, 8(1):15–43
19. Fudenberg D, Tirole J (1991) Game Theory, MIT Press, Cambridge
20. Gilbert N, Troitzsch K (2005) Simulation for the Social Scientist, Open University Press, McGraw-Hill Education, Berkshire, England, second edition
21. Gode D, Sunder S (1993) Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality, Journal of Political Economy, 101:119–137
22. Hall (1873) On an experimental determination of pi, Messeng. Math., 2:113–114
23. Hammersley JM, Handscomb DC (1964) Monte Carlo Methods, Spottiswoode, Ballantyne & Co Ltd, London and Colchester
24. Holland JH (1992) Adaptation in Natural and Artificial Systems, MIT Press, Boston, 1st MIT-Press edition (book was originally published 1975)
25. Holland JH, Miller JH (1991) Artificial adaptive agents in economic theory, American Economic Review, Papers and Proceedings, 81:365–370
26. Hommes CH (2006) Heterogeneous agent models in economics and finance, in: Tesfatsion L, Judd KL, editors, Handbook of Computational Economics
27. Howard RA (1960) Dynamic Programming and Markov Processes, The MIT Press, Cambridge, MA
28. Hu J, Wellman MP (1998) Multiagent reinforcement learning: Theoretical framework and an algorithm, in: Proceedings of the 15th International Conference on Machine Learning
29. Judd KL (1997) Computational economics and economic theory: Substitutes or complements? Journal of Economic Dynamics and Control, 21(6):907–942
30. Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: A survey, Journal of Artificial Intelligence Research, 4:237–285
31. Kirman A, The structure of economic interaction: Individual and collective rationality, in: Bourgine P, Nadal JP, editors, Cognitive Economics: An Interdisciplinary Approach, Chapter 18, Springer, Heidelberg, pp. 293–312
32. Krugman P (1996) What economists can learn from evolutionary theorists, talk given to the European Association for Evolutionary Political Economy, http://www.mit.edu/krugman/evolute.html

33. Kunzelmann M, Mäkiö J (2005) Pegged and bracket order as a success factor in stock exchange competition, in: Proceedings of the 2nd Conference FinanceCom05, IEEE
34. Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning, in: Proceedings of the 11th International Conference on Machine Learning, San Francisco, CA, pp. 157–163
35. Marimon R (1993) Adaptive learning, evolutionary dynamics and equilibrium selection in games, European Economic Review, 37:603–611
36. Metropolis N, Ulam S (1949) The Monte Carlo method, Journal of the American Statistical Association, 44(247):335–341
37. Milgrom P, Stokey NL (1982) Information, trade, and common knowledge, Journal of Economic Theory, 26(1):17–27
38. Miller JH (1986) A genetic model of adaptive economic behaviour, Working paper, University of Michigan, Ann Arbor
39. Neumann D (2004) Market Engineering – A Structured Design Process for Electronic Markets, PhD thesis, Universität Karlsruhe (TH), Karlsruhe, Germany
40. Nicolaisen J, Petrov V, Tesfatsion L (2001) Market power and efficiency in a computational electricity market with discriminatory double-auction pricing. IEEE Transactions on Evolutionary Computation, 5(5):504–523, http://www.econ.iastate.edu/tesfatsi/mpeieee.pdf
41. Optimatics (2004) Genetic Algorithm Technology for the Water Industry, information on http://www.optimatics.com/GAhistory.htm
42. Ostrom TM (1988) Computer simulation: The third symbol system, Journal of Experimental Social Psychology, 24(5):381–392
43. Owen G (1982) Game Theory. The MIT Press, Cambridge, MA, second edition
44. Puterman ML (1994) Markov Decision Processes – Discrete Stochastic Dynamic Programming, Wiley Series in Probability and Statistics, John Wiley & Sons Inc., New York
45. Puterman ML, Shin MC (1978) Modified policy iteration algorithms for discounted Markov decision processes, Management Science, 24:1127–1137
46. Riechmann T (2002) Genetic algorithm learning and economic evolution, in: Chen SH, editor, Evolutionary Computation in Economics and Finance, Physica-Verlag, Heidelberg, pp. 45–60
47. Schelling TC, Models of segregation, The American Economic Review, 59(2):488–493
48. Sen S, Weiss G (1999) Learning in multiagent systems, in: Weiss G, editor, Multiagent Systems, The MIT Press, Cambridge (Massachusetts), pp. 259–298
49. Shapley LS (1953) Stochastic games, Proceedings of the National Academy of Sciences of the United States of America, 39(10):1095–1100
50. Steiglitz K, Honig ML, Cohen LM (1996) A computational market model based on individual action, in: Clearwater S, editor, Market-Based Control: A Paradigm for Distributed Resource Allocation, Hong Kong, World Scientific, see http://www.cs.princeton.edu/ ken/scott.ps

51. Sutton RS, Barto AG (2002) Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA
52. Tesfatsion L (2002) Agent-based computational economics: Growing economies from the bottom-up, ISU Economics Working Paper No. 1, February 2002, http://www.econ.iastate.edu/tesfatsi/
53. Thorndike EL (1898) Animal intelligence: An experimental study of the associative processes in animals, Psychological Review Monograph Supplement, II(4, Whole No. 8)
54. Thorndike EL (1901) The Human Nature Club; an Introduction to the Study of Mental Life, Longman, Green and Co., New York, 2nd, enl. edition
55. Thorndike EL (1907) The Elements of Psychology, A.G. Seiler, New York, 2nd edition
56. Thorndike EL (1909) Darwin’s contribution to psychology, University of California Chronicle, 12:65–80
57. Thorndike EL (1911) Animal Intelligence: Experimental Studies, The Animal Behavior Series. Macmillan, New York
58. von Neumann J (1966) Theory of Self-Reproducing Automata, University of Illinois Press, Urbana and London
59. Watkins C (1989) Learning from Delayed Rewards, PhD thesis, University of Cambridge, England
60. Watkins C, Dayan P (1992) Q-learning, Machine Learning, 8(3–4):279–292

CHAPTER 19 The Heterogeneous Agents Approach to Financial Markets – Development and Milestones Jörn Dermietzel

19.1 Introduction

For economists financial markets are one of the most interesting mechanisms to investigate. One reason is the availability of large amounts of data, ranging from high-frequency data to long-term observations covering several decades. On the basis of these huge data sets it is possible to study the dynamics of asset prices balancing supply and demand. On the other hand there is a sound theoretical basis that has mainly been developed in the last century. The results of this theory rest on strict assumptions that enable researchers to derive and solve analytical models of financial markets. Since (Mandelbrot 1963) researchers have discovered numerous statistical properties in real market time series that contradict the theoretical results. These so-called stylized facts, together with the paradigm shift away from the completely rational, representative agent towards boundedly rational, heterogeneous agents, motivated researchers to model financial markets as complex evolving systems of heterogeneous interacting traders. These models can provide explanations for some of the observed stylized facts, as well as for historical events like the internet bubble in the late 90s and the stock crashes of 1987 and 1990. The purpose of this chapter is to provide a survey of financial market models with heterogeneous traders for newcomers and to arouse the interest of researchers in related fields. It presents a brief summary of the development from very stylized models more than 25 years ago to recent complex models, which are able to reproduce and explain many empirically observed anomalies in financial markets. The selected models are examples of pioneering works which influenced this development significantly. In order to preserve the introductory character of this article, the description of the models is kept short and some details are omitted where they are not necessary to understand the idea behind the model or the analysis.
For additional overviews with many references and detailed discussions of related models the reader is referred to the surveys by (LeBaron 2000; LeBaron 2006; Hommes 2006).

The structure of the chapter is as follows: Section 19.2 presents and discusses the basic assumptions of financial theory. Section 19.3 summarizes the most important statistical properties of real-world financial time series, which contradict the results of financial theory. In Sections 19.4 to 19.6 the most important models of financial markets with heterogeneous traders are briefly described. Section 19.7 presents related questions in financial markets, and Section 19.8 concludes and provides an outlook for further research activities.

19.2 Fundamental Assumptions in Financial Theory

Financial theory is based on strong assumptions on the behaviour of traders and on the market environment. These assumptions enable researchers to model and to predict financial markets by using mathematical tools.

19.2.1 Rationality

Rationality is one of the most important assumptions in economics, meaning that every economic individual has rational expectations and acts optimally in an economic sense given these expectations (see (Sargent 1993)). (Friedman 1953) claims that the behaviour of economic individuals in markets can be described as if they behave rationally, since the Friedman hypothesis states that biological competition will eliminate all non-rational traders. The concept of rationality also implies that individuals have unlimited knowledge about their environment, form beliefs that perfectly reflect this knowledge, and are able to derive optimal decisions based on these beliefs, no matter how complex the tasks are. Following this argumentation the rational expectation hypothesis (REH) was introduced by (Muth 1961; Lucas 1971). It states that beliefs are always perfectly consistent with realizations, i.e. rational individuals do not make systematic forecasting errors. According to the Friedman hypothesis these assumptions exclude rules of thumb and market psychology as profitable trading strategies, whereas (Keynes 1936) argues that investor sentiment and market psychology play an important role in financial markets. (Simon 1957) claims that individuals are boundedly rational and use simple rules of thumb, because they are limited in their knowledge about the environment as well as in their computational skills. The most important argument by Simon is that in reality individuals face search costs for obtaining sophisticated information. Indeed it seems highly unrealistic that individuals have full information and can predict every impact arbitrary events can have on the market. A formal proof of the complexity of financial markets is given in (Aspnes et al. 2001). The authors present a simple asset pricing model and show that for a reasonable heterogeneity among trading strategies the task of predicting the future asset price is not solvable in polynomial time. This underlines the theoretical character of the concept of full rationality. The complexity of financial markets forces traders to use imperfect rules of thumb in order to react fast enough to the changing market environment.

19.2.2 Efficient Market Hypothesis

The Efficient Market Hypothesis (EMH) dates back to (Fama 1965). One distinguishes between weak, semi-strong and strong market efficiency, which means that market prices reflect, respectively, all information contained in past prices, all public information, or all public and private information. Even the weak form implies that past prices do not contain any usable information that is not represented by current prices. The argument for efficient markets is that every profit opportunity in an inefficient market would be exploited by rational arbitrageurs and thereby disappear, which drives the price immediately to its fundamental value. Thus, in efficient markets there is no forecastable structure in asset returns. Here again, the criticism of (Simon 1957) comes into play. (Grossman and Stiglitz 1980) argue that if information is costly, e.g. due to search costs, there is no incentive in an efficient market to obtain it. The question then is how the market became informationally efficient in the first place.

19.2.3 Representative Agent Theory

The assumption of rationality leads to the representative agent theory, which aggregates the behaviour of a population of economic individuals into one representative agent acting perfectly rationally. Since it is assumed that all economic individuals are perfectly rational and share common knowledge, they will all behave the same. Thus one can describe the behaviour of a population of rational individuals as if it were one rational individual. The representative agent theory implies the no trade theorem of (Milgrom and Stokey 1982). In a market of homogeneous traders prices always reflect the fundamental price of the underlying asset. No one is willing to sell at lower prices and no one is willing to buy at higher prices. At equilibrium prices there is no arbitrage opportunity. Thus there is no incentive to trade at equilibrium prices. Of course one can assume that there are always some less rational traders around providing liquidity around equilibrium prices, but empirical observations show that there is heavy trade, which contradicts the no trade theorem. A possible explanation of excess trading volumes is heterogeneity among traders. This approach is supported by questionnaires conducted among financial practitioners. The models presented in this article are based on the assumptions that traders are neither perfectly rational nor homogeneous in their expectations.

19.3 Empirical Observations

In the 70s and 80s the assumptions presented above, namely representative agents, rational expectations and the efficient market hypothesis, became the dominant paradigms in finance and in economics in general. But almost in parallel to the development of this theory, empirical observations led to increasing criticism of these strong assumptions. (Mandelbrot 1963) was the first to discover empirical properties in financial time series that contradicted financial theory. Other studies use the statistical properties found in financial time series to show that market prices are not as unpredictable as suggested by the efficient market hypothesis. Examples are (Campbell and Shiller 1988; Lo and MacKinlay 1988; Dacorogna et al. 1995). Their results are strengthened by (Brock et al. 1992), who test a number of simple, frequently used technical trading rules on real time series and find that these can generate significant positive returns. Other researchers, like (Hansen and Singleton 1983; Frankel and Froot 1986; Mehra and Prescott 1988; Cutler et al. 1989), call the strong connection between financial markets and macroeconomic fundamentals into question. Today a number of stylized statistical properties are known. The most important are:

Excess volatility: The volatility of asset prices is higher than the volatility of the underlying economic fundamentals.
Volatility clustering: The variance of asset prices is not stationary. There are switches between phases of high volatility and phases of low volatility.
Fat tails: The distribution of price changes has more weight in the extremes and a higher peak around the mean than the normal distribution suggests.

More stylized facts and extended discussions of these findings are provided e.g. in (Cont 2001; Shiller 2003). In addition to the analysis of real-world time series and the underlying fundamentals, researchers have conducted questionnaires among financial practitioners, e.g. (Allen and Taylor 1990; Ito 1990; Taylor and Allen 1992) and Frankel and Froot, who published a whole series of papers with their data in the late 80s and early 90s (see (Frankel and Froot 1986, 1987a, b, 1990a, b)). These studies show that financial practitioners indeed use chart analysis and similar trading techniques in addition to fundamental analysis. In contrast to the Friedman hypothesis, these alternative trading techniques did not vanish. Instead their importance has increased since then, which is a hint of their success in terms of significant excess returns.

As a result of the discrepancies between real markets and theoretical models, heterogeneous agent models and bounded rationality became increasingly popular in economics and finance in the late 80s and 90s. The developments in mathematics, physics and computer science regarding nonlinear dynamics, chaos and complex systems motivated economists to apply these tools to financial markets. This leads to the interpretation of financial markets as complex evolving systems, as emphasized e.g. in (Anderson et al. 1988). The growing availability of computational tools also stimulated the development in this field enormously.
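Two of the stylized facts named in this section, fat tails and volatility clustering, can be measured on any return series. The following Python sketch is illustrative only: the function names are hypothetical, and the test series is generated by a GARCH(1,1) process, which is not discussed in this chapter and merely serves as an assumed stand-in for real market data. It computes the excess kurtosis as a measure of fat tails and the lag-1 autocorrelation of squared returns as a measure of volatility clustering.

```python
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis; 0 for a normal distribution, > 0 for fat tails."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

def autocorr(xs, lag=1):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var

def garch_returns(n, omega=0.05, alpha=0.10, beta=0.85, seed=0):
    """Synthetic returns with time-varying variance (assumed GARCH(1,1) stand-in)."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    out = []
    for _ in range(n):
        r = rng.gauss(0.0, var ** 0.5)
        out.append(r)
        var = omega + alpha * r * r + beta * var  # variance reacts to the last shock
    return out

returns = garch_returns(20000)
squared = [r * r for r in returns]
fat_tails = excess_kurtosis(returns)
clustering = autocorr(squared, lag=1)
print(f"excess kurtosis: {fat_tails:.2f}")
print(f"lag-1 autocorrelation of squared returns: {clustering:.2f}")
```

A positive excess kurtosis indicates more weight in the tails than the normal distribution, and a positive autocorrelation of squared returns indicates that large price changes tend to follow large price changes. On real data the same two functions can be applied to logarithmic price differences.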

19.4 Evolution of Models

19.4.1 Fundamentalists vs. Chartists

Heterogeneous expectations of traders can have different causes, e.g. different information accuracy, different interpretation of information, or different preferences. Most analytical models concentrate on the two typical types of traders found in financial practice, namely fundamental traders and chart traders. Fundamental traders believe that the price of an asset will return to its fundamental value in the medium to long run. Therefore they invest in undervalued assets, i.e. assets whose price is below their fundamental value. Analogously they sell assets that are overvalued. With their demand and supply they drive the prices of the assets towards their fundamental values. This is the behaviour usually assumed in financial theory, because it is solely based on the development of economic news and the current price that reflects this news. Chart traders believe that there is useful information in past prices, which contradicts financial theory. Their trading is based on the extrapolation of price trends, expecting a trend to persist in the short run. Accordingly they buy an asset if its price shows an upward trend and they sell if decreasing prices are expected. By prolonging observed trends, which were initially started by changes of the fundamental, these traders can produce substantial deviations from fundamental prices. The interaction of fundamental and chart traders can generate complex dynamics that exhibit statistical properties similar to those of real-world price series. One of the first models with fundamental and chart traders was developed in (Beja and Goldman 1980). They present a very stylized model that is designed to explain the implications of speculative behaviour for prices when past prices contain some information that traders can exploit, which is a basic requirement for chart trading. Although the model lacks any micro foundation, it is able to produce substantial deviations of prices from the underlying fundamental price.
This observation contradicts the assumption of financial theory that market prices always reflect the fundamental price of an asset. The model is based on the assumption that prices adjust to excess demand or supply with a given finite speed, instead of adjusting instantaneously. This method, applied in (Beja and Goldman 1980), is today called the market maker mechanism. In real markets market makers provide liquidity. They hold an inventory from which they offer stocks at a fixed sell price, and they buy stocks at a fixed buy price. These prices are permanently adjusted to the market, i.e. if there is excess demand the market maker raises the prices, and if there is excess supply he lowers them. By keeping the range between sell and buy price small, the market maker guarantees liquidity around equilibrium prices. Financial market models often use a stylized version of a market maker who adjusts the price to observed demand and supply in order to determine the market price. Accordingly the price rises as a reaction to excess demand and falls if excess supply is observed. Another well-known model using a market maker to determine asset prices is presented in (Day and Huang 1990). The performance of automated market makers in financial markets is analyzed in (Hakansson et al. 1985). Usually one assumes the price change to be a linear function of excess demand with a constant speed of adjustment. More recently, (Farmer 2002) as well as (Farmer and Joshi 2002) presented a model with nonlinear price adjustment. They define a log-linear price impact function which is closely related to the linear market maker. In their model Beja and Goldman use a linear market maker that adjusts prices to the aggregated excess demand of fundamental traders and chart traders, the so-called speculators. That is

$$\dot{P}(t) = \underbrace{a\left[P_f(t) - P(t)\right]}_{ED_f(t)} + \underbrace{b\left[\psi(t) - r(t)\right]}_{ED_c(t)} + \varepsilon(t), \tag{19.1}$$

where P(t) is the logarithm of the current asset price and Pf(t) is the logarithm of the exogenously given fundamental price, i. e. the price that perfectly reflects the underlying fundamentals of the asset. The excess demand of fundamental traders EDf(t) is a linear function of the difference between the current price and the fundamental price. Thus fundamental excess demand drives the price back to its fundamental value. The excess demand of speculators EDc(t) is determined by the difference of the anticipated price trend of the asset ψ(t) and the return r(t) of an exogenously given alternative investment opportunity. The anticipated price trend can be seen as the average of the heterogeneous expectations of speculators and is revised according to

$$\dot{\psi}(t) = c\left[\dot{P}(t) - \psi(t)\right], \tag{19.2}$$

where c is a positive constant that determines the flexibility of adaptation to the current price. While the parameters a, b and c are invariant, the variables Pf(t) and r(t) may change over time by following a stochastic process. (Beja and Goldman 1980) analyze the stability of their model for constant exogenous parameters, i.e. Pf(t) = Pf, r(t) = 0 and ε(t) = 0 for all t. They show that the system is stable if and only if

$$a > c\,(b - 1) \tag{19.3}$$

and it is oscillating if and only if

$$\left(1 - \sqrt{a/c}\right)^2 < b < \left(1 + \sqrt{a/c}\right)^2. \tag{19.4}$$

Figure 19.1. Stability conditions in (Beja and Goldman 1980), with a/c on the horizontal axis and b on the vertical axis: The dotted area represents stable parameter constellations. Parameters in the grey area lead to oscillating prices with increasing or decreasing amplitudes.
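The stability conditions (19.3) and (19.4) can be explored numerically. The following Python sketch is illustrative and rests on several assumptions that are not part of the original model description: the price equation is read as $\dot P = a(P_f - P) + b(\psi - r)$ and the trend revision as $\dot\psi = c(\dot P - \psi)$, the exogenous inputs are held constant ($P_f(t) = P_f$, $r(t) = 0$, $\varepsilon(t) = 0$), and the system is integrated with a simple explicit Euler scheme. All function names and parameter values are hypothetical.

```python
import math

def is_stable(a, b, c):
    """Stability condition (19.3): a > c (b - 1)."""
    return a > c * (b - 1.0)

def is_oscillating(a, b, c):
    """Oscillation condition (19.4): (1 - sqrt(a/c))^2 < b < (1 + sqrt(a/c))^2."""
    root = math.sqrt(a / c)
    return (1.0 - root) ** 2 < b < (1.0 + root) ** 2

def simulate(a, b, c, p0, psi0=0.0, p_f=0.0, dt=0.01, steps=10000):
    """Explicit Euler integration of the deterministic system.

    Returns the path of the deviation P - P_f.
    """
    p, psi = p0, psi0
    deviations = []
    for _ in range(steps):
        dp = a * (p_f - p) + b * psi   # price adjustment by the market maker
        dpsi = c * (dp - psi)          # speculators revise the anticipated trend
        p += dt * dp
        psi += dt * dpsi
        deviations.append(p - p_f)
    return deviations

# Hypothetical parameter sets: both oscillate, but only the first is stable.
damped = simulate(a=1.0, b=1.5, c=1.0, p0=1.0)
explosive = simulate(a=1.0, b=3.0, c=1.0, p0=1.0, steps=2000)
print(is_stable(1.0, 1.5, 1.0), is_oscillating(1.0, 1.5, 1.0), abs(damped[-1]))
print(is_stable(1.0, 3.0, 1.0), is_oscillating(1.0, 3.0, 1.0),
      max(abs(d) for d in explosive))
```

In the first parameter set the price deviation dies out in damped oscillations, while in the second set, where b is larger, the oscillations grow without bound. Varying a, b and c in this sketch is one way to trace out the regions shown in Figure 19.1.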

These conditions mean that a high amount of speculative demand, i.e. high b, or a high flexibility in the anticipation of trends, i.e. high c, leads to instability. These properties are visualized in Figure 19.1. The analysis of the convergence to the fundamental price by Beja and Goldman shows that speculation can accelerate convergence, but it can also lead to oscillating prices and instability. With their model Beja and Goldman show not only that chart trading can be successful but also that speculative behaviour plays an important role in the dynamics of asset prices. Both observations are in strong contrast to classical financial theory, which states that past prices do not exhibit useful information and thus there is no substantial chart trading.

In the early ’90s the ongoing debate on market efficiency and the random walk model, together with the market crash of ’87, motivated researchers to intensify the search for alternative models of financial markets. In this vein (Chiarella 1992) extended the model of (Beja and Goldman 1980). He replaced the linear function that determines the excess demand of speculators in Equation 19.1 by a non-linear, s-shaped function h(·). This changes Equation 19.1 into

$$\dot{P}(t) = \underbrace{a\left[P_f(t) - P(t)\right]}_{ED_f(t)} + \underbrace{h\left(\psi(t) - r(t)\right)}_{ED_c(t)} + \varepsilon(t). \tag{19.5}$$

The following properties of h(·) qualify this function to represent the aggregated excess demand of many traders with different levels of wealth, time horizons and preferences:

1. $h'(x) > 0$ for all $x$,
2. $h(0) = 0$,
3. there exists an $x^*$ with $h''(x) < 0$ $(> 0)$ for all $x > x^*$ $(< x^*)$,
4. $\lim_{x \to \pm\infty} h'(x) = 0$.

The analysis of the model shows that in the unstable case there exists a stable limit cycle along which the model fluctuates. In addition (Chiarella 1992) presents a discrete time version of the model which is able to generate chaotic motion around the fundamental price. This is an important step towards the replication of realistic price fluctuations. Chiarella points out that the techniques of dynamic optimization theory are not sophisticated enough to analyze a model with a detailed microstructure that could generate similar dynamics, although such a model would be desirable. Therefore the properties of h(·) are an approximation of the aggregation of such microscopic behaviour. Besides the development of a detailed microstructure he also suggests incorporating endogenous changes of the parameters a, b and c, which would lead to temporary regime switching between convergence to fundamental prices, limit cycle motion and chaotic fluctuations.
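The limit cycle behaviour can be illustrated with a concrete S-shaped function. The sketch below is an assumption-laden illustration, not the specification used in (Chiarella 1992): it takes $h(x) = b \tanh(x)$, which satisfies the four properties with $x^* = 0$, holds the fundamental price constant, sets $r = 0$ and $\varepsilon = 0$, reads the trend revision as $\dot\psi = c(\dot P - \psi)$, and integrates with an explicit Euler scheme. Parameter values and function names are hypothetical.

```python
import math

def h(x, b=3.0):
    """Assumed S-shaped excess demand of chartists: h(0) = 0, h' > 0, saturating."""
    return b * math.tanh(x)

def simulate(a=1.0, b=3.0, c=1.0, p0=0.01, psi0=0.0, p_f=0.0,
             dt=0.01, steps=20000):
    """Explicit Euler integration of the nonlinear system.

    Returns the path of the deviation P - P_f.
    """
    p, psi = p0, psi0
    path = []
    for _ in range(steps):
        dp = a * (p_f - p) + h(psi, b)   # fundamental plus nonlinear chartist demand
        dpsi = c * (dp - psi)            # adaptive revision of the anticipated trend
        p += dt * dp
        psi += dt * dpsi
        path.append(p - p_f)
    return path

# Locally unstable case: a < c (h'(0) - 1) = c (b - 1).
path = simulate()
late = path[len(path) // 2:]             # discard the transient
amplitude = max(abs(x) for x in late)
print(f"late-time amplitude of P - P_f: {amplitude:.3f}")
```

Because the chartist demand saturates, the deviation from the fundamental price neither dies out nor explodes in this parameter setting. It settles into a bounded, persistent fluctuation, in contrast to the linear model, where the same parameters produce ever-growing oscillations.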

19.4.2 Social Interaction

The above models focus on the economic interaction of different groups of traders, i.e. fundamentalists and chartists. Another aspect that may play an important role in real markets is social interaction among traders. (Kirman 1991; Kirman 1993) describe the local interaction in ant populations and suggest carrying over these dynamics to the behaviour of traders in financial markets. (Lux 1995) presents a model of social interaction in a financial market. He divides the group of chartists into optimistic and pessimistic chartists. The transitions between these two subgroups are determined by transition laws based on the synergetics literature of (Haken 1993). Accordingly the probabilities of pessimists becoming optimists and vice versa are given by

$$\pi_{+-} = \upsilon \exp(U), \tag{19.6}$$

$$\pi_{-+} = \upsilon \exp(-U), \tag{19.7}$$

with

$$U = \alpha_1 x + \alpha_2 \frac{\dot{p}}{\upsilon}. \tag{19.8}$$

The speed of change is given by υ. While the first part of the exponent represents the social interaction, the second part represents the current price process.

These two forces are weighted by α1 and α2. Here x is an opinion index ranging from −1, i.e. all chartists are pessimistic, to +1, i.e. all chartists are optimistic. That means the more chartists are optimistic, the more pessimists are converted into optimists and vice versa. Assuming that each pessimistic chartist sells a fixed amount $s_c$ of asset shares and each optimistic chartist buys $s_c$ asset shares, the excess demand of chartists sums to

$$ED_c = (n_+ - n_-)\, s_c, \tag{19.9}$$

where $n_+$ and $n_-$ denote the numbers of optimistic and pessimistic chartists.

The excess demand of fundamentalists is determined by the difference between the fundamental value $p_f$ and the actual price $p$. That is

$$ED_f = s_f\,(p_f - p). \tag{19.10}$$

Together 19.9 and 19.10 determine the excess demand ED the market maker has to face. The latter adjusts the price according to

$$\dot{p} = \beta\, ED, \tag{19.11}$$

with

$$ED = ED_c + ED_f. \tag{19.12}$$
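How the opinion index and the price feed back on each other can be sketched in a few lines of Python. The code below is a hedged illustration, not the implementation of (Lux 1995): it assumes that the transition probabilities act as rates per unit time, replaces the stochastic switching of individual chartists by its expected flow, which gives $\dot x = (1 - x)\pi_{+-} - (1 + x)\pi_{-+}$ for $x = (n_+ - n_-)/(n_+ + n_-)$, reads the price adjustment as $\dot p = \beta\, ED$ and the price term in $U$ as $\alpha_2 \dot p / \upsilon$, and integrates with an explicit Euler scheme. All parameter values and names are hypothetical.

```python
import math

def simulate(x0, p0, p_f=10.0, n_chartists=100, s_c=0.01, s_f=1.0,
             v=0.5, alpha1=0.5, alpha2=0.2, beta=1.0, dt=0.01, steps=5000):
    """Mean-field Euler simulation; returns lists of opinion index x and price p."""
    x, p = x0, p0
    xs, ps = [x], [p]
    for _ in range(steps):
        ed_c = x * n_chartists * s_c      # (n_plus - n_minus) * s_c, eq. (19.9)
        ed_f = s_f * (p_f - p)            # eq. (19.10)
        dp = beta * (ed_c + ed_f)         # market maker, eqs. (19.11) and (19.12)
        u = alpha1 * x + alpha2 * dp / v  # eq. (19.8)
        pi_up = v * math.exp(u)           # pessimist becomes optimist, eq. (19.6)
        pi_down = v * math.exp(-u)        # optimist becomes pessimist, eq. (19.7)
        dx = (1.0 - x) * pi_up - (1.0 + x) * pi_down  # assumed expected flow
        x += dt * dx
        p += dt * dp
        xs.append(x)
        ps.append(p)
    return xs, ps

xs, ps = simulate(x0=0.1, p0=10.1)
rest_x, rest_p = simulate(x0=0.0, p0=10.0)
print(f"final opinion index: {xs[-1]:.6f}, final price: {ps[-1]:.6f}")
```

With the weak herding assumed here (alpha1 below one), a small initial optimism and overvaluation die out and the price returns to the fundamental value. Raising alpha1 or alpha2 in this sketch strengthens the feedback between opinion and price, which is the mechanism behind the bubbles and crashes discussed in the text.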

Note that the total numbers of chartists and fundamentalists are given exogenously and stay constant. (Lux 1998) extends the model by allowing traders to switch between both strategies. Switching between optimistic or pessimistic chartists and fundamentalists is based on the profit differential between these strategies. In addition an exit/entry process of traders to the market is modeled that leads to permanent fluctuations. Another extension is made in (Lux 1997). By adding noise to the excess demand which is observed by the market maker equation 19.12 becomes

$$ED = ED_c + ED_f + \varepsilon, \tag{19.13}$$

where ε is i.i.d. with mean 0 and has a symmetric probability density function φ(ε). Assuming that there exists a minimal price step, e.g. cents or pence, one can then determine the probabilities of the price increasing or declining by this minimal step:

$$\pi_{\uparrow p} = \beta \int_{-ED}^{\infty} (ED + \varepsilon)\, \varphi(\varepsilon)\, d\varepsilon, \tag{19.14}$$

$$\pi_{\downarrow p} = \beta \int_{-\infty}^{-ED} (-ED - \varepsilon)\, \varphi(\varepsilon)\, d\varepsilon. \tag{19.15}$$
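A quick numerical check shows how these probabilities relate to the deterministic price adjustment. The following Python sketch assumes, purely for illustration, that ε is normally distributed with standard deviation `sigma` and that the second integral runs from −∞ to −ED. Under these assumptions both integrals have closed forms, and because the noise has mean zero the difference of the two probabilities equals β·ED, the price change implied by the linear market maker rule. Function and parameter names are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def step_probabilities(ed, beta, sigma):
    """Closed forms of (19.14) and (19.15), assuming Gaussian noise."""
    z = ed / sigma
    up = beta * (ed * norm_cdf(z) + sigma * norm_pdf(z))
    down = beta * (-ed * norm_cdf(-z) + sigma * norm_pdf(z))
    return up, down

def up_probability_numeric(ed, beta, sigma, n=100000):
    """Midpoint-rule evaluation of (19.14) as an independent cross-check."""
    lower = -ed
    upper = abs(ed) + 10.0 * sigma   # the density is negligible beyond this point
    width = (upper - lower) / n
    total = 0.0
    for i in range(n):
        eps = lower + (i + 0.5) * width
        density = norm_pdf(eps / sigma) / sigma
        total += (ed + eps) * density * width
    return beta * total

ed, beta, sigma = 0.3, 0.1, 1.0      # hypothetical values
up, down = step_probabilities(ed, beta, sigma)
print(f"up: {up:.6f}, down: {down:.6f}, up - down: {up - down:.6f}")
```

Even with positive excess demand there is a positive probability of a downward price step, which is what distinguishes this noisy market maker from the deterministic rule.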

All three models produce time series that exhibit major stylized facts. While (Lux 1995) concentrates on the occurrence of bubbles and crashes, (Lux 1998) points out the fat tails of the return distribution and (Lux 1997) focuses on the time variation of second moments of the return distribution.

Jörn Dermietzel

19.4.3 Adaptive Beliefs

(Brock and Hommes 1998) introduce a new class of models called Adaptive Belief Systems (ABS). These are based on the cobweb framework of (Brock and Hommes 1997). The basic idea is to provide a framework for models of financial markets in which strategies differ only in the beliefs of the different types of traders, i. e. in the information set and its interpretation by the traders. Once the forecast based on this information is made, all traders act optimally given their individual forecast. There are two assets to invest in: one risky asset with endogenous price $p_t$ paying a stochastic dividend $d_t$, and one risk-free asset paying a constant interest rate $r = R - 1$. Traders are assumed to be myopic mean-variance maximizers with risk aversion $\gamma$, who try to maximize their expected wealth, which evolves according to

$$ W_{t+1} = \underbrace{R\,(W_t - p_t z_t)}_{\text{risk-free investment}} + \underbrace{(p_{t+1} + d_{t+1})\, z_t}_{\text{risky investment}} = R W_t + (p_{t+1} + d_{t+1} - R p_t)\, z_t , \tag{19.16} $$

where $z_t$ is the number of shares of the risky asset the trader holds. Thus the first part of equation 19.16 is the return on the fraction of wealth invested in the risk-free asset, while the second part is the return from the risky asset. Suppose there is a type of traders called $h$. Given the conditional expectation $E_{ht}$ and variance $V_{ht}$, these traders face the maximization problem

$$ \max_{z_{ht}} \left\{ E_{ht} W_{h,t+1} - \frac{\gamma}{2}\, V_{ht} W_{h,t+1} \right\} . \tag{19.17} $$

Assuming that all traders hold the same expected conditional variance, i. e. $V_{ht} = \sigma^2$ for all types $h$, equation 19.17 leads to the optimal demand for shares

$$ z_{ht} = \frac{E_{ht}(p_{t+1} + d_{t+1} - R p_t)}{\gamma \sigma^2} . \tag{19.18} $$
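The step from the maximization problem 19.17 to the demand 19.18 can be written out as follows. This is a sketch under the assumptions that $W_{ht}$ and $p_t$ are known at time $t$ and that $\sigma^2$ is read as the conditional variance of the excess return per share, $p_{t+1} + d_{t+1} - R p_t$.

```latex
\begin{align*}
E_{ht} W_{h,t+1} &= R W_{ht} + E_{ht}(p_{t+1} + d_{t+1} - R p_t)\, z_{ht}, \\
V_{ht} W_{h,t+1} &= z_{ht}^2\, V_{ht}(p_{t+1} + d_{t+1} - R p_t) = z_{ht}^2\, \sigma^2, \\
0 &= \frac{\partial}{\partial z_{ht}}
     \left[ E_{ht} W_{h,t+1} - \frac{\gamma}{2}\, V_{ht} W_{h,t+1} \right]
   = E_{ht}(p_{t+1} + d_{t+1} - R p_t) - \gamma \sigma^2 z_{ht}.
\end{align*}
```

Solving the first-order condition for $z_{ht}$ gives the demand stated above; the objective is concave in $z_{ht}$ for $\gamma > 0$, so the solution is a maximum.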

The model of (Brock and Hommes 1998) assumes that the market always clears, i. e. the outside supply of shares per investor $z_{st}$ equals the aggregate demand of traders. That is

$$ \sum_h n_{ht}\, \frac{E_{ht}(p_{t+1} + d_{t+1} - R p_t)}{\gamma \sigma^2} = z_{st} , \tag{19.19} $$

where $n_{ht}$ denotes the fraction of investors of type $h$. By setting $n_{ht} = 1$ and $z_{st} = 0$, i. e. there is only one type of traders and no outside supply, one can derive the fundamental price

$$ p_t^f = \frac{d}{r} , \tag{19.20} $$

19 The Heterogeneous Agents Approach to Financial Markets

and thereby define the deviation of the price from its fundamental value

$$ \Delta p_t = p_t - p_t^f . \tag{19.21} $$
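Because the price $p_t$ is known when traders submit their demand, the market-clearing condition 19.19 is linear in $p_t$ and can be solved directly. The sketch below does this and checks the fundamental price 19.20 for a single trader type. It assumes a constant expected dividend `d`; the function name and all numbers are hypothetical.

```python
import math


def clearing_price(fractions, forecasts, R, gamma, sigma2, z_s=0.0):
    """Solve equation 19.19 for the current price p_t.

    fractions[h] -- fraction n_ht of traders of type h (must sum to 1)
    forecasts[h] -- forecast E_ht(p_{t+1} + d_{t+1}) of type h
    Since sum_h n_ht = 1, the condition reduces to
    sum_h n_ht * forecasts[h] - R * p_t = gamma * sigma2 * z_s.
    """
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError("fractions must sum to 1")
    mean_forecast = sum(n * e for n, e in zip(fractions, forecasts))
    return (mean_forecast - gamma * sigma2 * z_s) / R


# One trader type, no outside supply, constant expected dividend d:
# a trader who expects p_f + d next period supports exactly p_f = d / r today.
r, d = 0.05, 1.0
p_f = d / r
p = clearing_price([1.0], [p_f + d], R=1.0 + r, gamma=2.0, sigma2=1.0)
```

With a positive outside supply `z_s` the clearing price falls below this value, which reflects the risk premium that risk-averse traders demand for holding the shares.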

This deviation of the price from its fundamental value is one part of the dynamic system Brock and Hommes analyze. The second part of the system consists of the fractions of investors of the different types. In order to define the dynamics of these fractions, one first has to introduce the beliefs of the different types of traders. These are of the form

$$ E_{ht}(p_{t+1} + d_{t+1}) = E_t\!\left(p_{t+1}^f + d_{t+1}\right) + f_h(\Delta p_{t-1}, \ldots, \Delta p_{t-L}) . \tag{19.22} $$

Thus all traders share the first term of the forecast, while the second term, the deterministic function $f_h(\cdot)$, can differ across trader types $h$. The selection of trading strategies is based on a fitness function $\Pi_{ht}$, which is determined by realized profits. This leads to the discrete choice probabilities¹

$$ n_{ht} = \frac{\exp\!\left(\upsilon\, \Pi_{h,t-1}\right)}{U_t} , \tag{19.23} $$

with

$$ U_t = \sum_h \exp\!\left(\upsilon\, \Pi_{h,t-1}\right) , \tag{19.24} $$

where $\upsilon$ is the intensity of choice. (Brock and Hommes 1998) analyze the dynamics for different initial populations consisting of up to four types of traders. The beliefs of these types are characterized by simple linear functions of the form

$$ f_{ht} = a_h\, \Delta p_{t-1} + b_h , \tag{19.25} $$

where $a_h$ captures the trend component and $b_h$ represents the bias. Given this form of beliefs one can model a number of typical trading behaviours, e. g. fundamentalists and chartists. (Brock and Hommes 1998) analyze the stability of systems with two, three and four different types of traders and show that heterogeneity leads to persistent deviations from the fundamental price. A high intensity of switching between strategies leads to irregular, chaotic price fluctuations. In particular, the setup with a population mixture of fundamentalists, trend chasers, contrarians and traders with biased beliefs produces time series that exhibit important stylized facts.

¹ This is a simplified representation. The model of Brock and Hommes (1998) contains more parameters, e. g. the memory strength, which are not taken into account in their analysis.
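The strategy selection in equations 19.23 and 19.24 and the linear beliefs in equation 19.25 can be combined into a small sketch. Two points are assumptions rather than statements from the source: the intensity of choice is taken to be non-negative, and the price deviation is obtained from market clearing with zero outside supply, which together with equation 19.22 gives $R\,\Delta p_t = \sum_h n_{ht} f_{ht}$. All function names are hypothetical.

```python
import math


def choice_fractions(fitness, intensity):
    """Discrete choice fractions n_ht, equations 19.23 and 19.24."""
    # Subtracting the maximum leaves all ratios unchanged but avoids overflow
    # for large (non-negative) intensities of choice.
    top = max(fitness)
    weights = [math.exp(intensity * (pi - top)) for pi in fitness]
    total = sum(weights)  # plays the role of U_t after rescaling
    return [w / total for w in weights]


def linear_belief(a_h, b_h, dp_prev):
    """Linear forecasting rule f_ht, equation 19.25."""
    return a_h * dp_prev + b_h


def next_deviation(fractions, types, dp_prev, R):
    """Price deviation implied by market clearing with zero outside supply.

    Assumption: R * dp_t = sum_h n_ht * f_ht, which follows from 19.19 with
    z_st = 0, the beliefs 19.22 and R * p^f_t = E_t(p^f_{t+1} + d_{t+1}).
    """
    return sum(n * linear_belief(a, b, dp_prev)
               for n, (a, b) in zip(fractions, types)) / R


# Illustrative two-type market: fundamentalists (a = 0, b = 0) and
# trend chasers (a = 1.2, b = 0); the fitness values are made up.
types = [(0.0, 0.0), (1.2, 0.0)]
fractions = choice_fractions([0.1, 0.3], intensity=2.0)
dp_next = next_deviation(fractions, types, dp_prev=1.0, R=1.1)
```

For `intensity = 0` the fractions are equal regardless of fitness, while a very large intensity puts almost all traders on the currently best strategy.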

There exist many extensions and variations of the framework presented by (Brock and Hommes 1998). For example, (Chiarella and He 2001) replace the constant absolute risk aversion (CARA) utility of (Brock and Hommes 1998) by constant relative risk aversion (CRRA) utility in order to link the relative wealth of the traders to their demand and to the price function. They set up a model with growing price and wealth processes that is stationary in terms of returns and wealth proportions. In that paper traders are not allowed to switch between strategies. Switching between strategies is added to the model in the chapter by (Chiarella and He) in this handbook. Other extensions of (Brock and Hommes 1998) are (Chiarella and He 2002; Chiarella and He 2003). In the first article the robustness of the results of Brock and Hommes is tested. To this end the authors relax the assumptions of homogeneous degrees of risk aversion and homogeneous expected conditional variances. They also investigate the effects of different memory lengths on the dynamics. In the latter article the authors replace the equilibrium model by a more realistic market mechanism, i. e. a linear market maker as introduced in (Beja and Goldman 1980; Day and Huang 1990). A very recent extension of the model is (Chang 2007), in which the impact of social interaction in the framework of (Brock and Hommes 1998) is analyzed.

19.4.4 Artificial Stock Markets

With the availability of computational power, researchers were able to construct simulation models and thereby to overcome the limits of mathematical modelling and analytical tractability. One of the most important models of financial markets, namely the Santa Fe Artificial Stock Market (SF-ASM), is such a simulation model. It combines the framework and parts of the theory from financial economics with techniques from computer science, especially from artificial intelligence. The basic model was presented in (LeBaron et al. 1999; Arthur et al. 1997). The economic framework is similar to (Brock and Hommes 1998). Traders can invest in one risky and one risk-free asset. While the risky asset pays a stochastic dividend $d_t$, the risk-free one pays a constant interest rate $r$. The price of the risky asset is determined endogenously by balancing supply and demand analytically. Similar to (Brock and Hommes 1998), traders are myopic mean-variance maximizers who seek to maximize their expected one-period wealth. Each trader forecasts the future return of the risky asset according to

$$ E_t(p_{t+1} + d_{t+1}) = a_{ij}\,(p_t + d_t) + b_{ij} . \tag{19.26} $$

The parameters $a_{ij}$ and $b_{ij}$ are not only heterogeneous among traders but also depend on the current state of the market, i. e. on current fundamental and technical signals. These signals are encoded in a market vector of twelve bits, as illustrated in Table 19.1.

Table 19.1. Market vector (left) and trading rule (right) in the SF-ASM (cf. LeBaron et al. 1999)

Market vector:

Bit   Condition
1     p_t * r / d_t > 1/4
2     p_t * r / d_t > 1/2
3     p_t * r / d_t > 3/4
4     p_t * r / d_t > 7/8
5     p_t * r / d_t > 1
6     p_t * r / d_t > 9/8
7     p_t > 5-period moving average (MA)
8     p_t > 10-period MA
9     p_t > 100-period MA
10    p_t > 500-period MA
11    Always 1
12    Always 0

Trading rule j:

Condition                   Forecast    Variance
(m_j,1, … , m_j,12)         a_j, b_j    σ²
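The market vector in Table 19.1 can be computed from the current price, the dividend, the interest rate and the price history. The following sketch shows one way to do so. Whether the moving averages include the current price, and how histories shorter than the window are treated, is not specified here, so the choices below (past prices only, shorter window if necessary) are assumptions, as are the function names.

```python
def moving_average(prices, window):
    """Mean of the last `window` prices (fewer if the history is shorter)."""
    recent = prices[-window:]
    return sum(recent) / len(recent)


RATIO_THRESHOLDS = (1 / 4, 1 / 2, 3 / 4, 7 / 8, 1, 9 / 8)  # bits 1 to 6
MA_WINDOWS = (5, 10, 100, 500)                             # bits 7 to 10


def market_vector(price, dividend, r, past_prices):
    """Return the twelve market bits of Table 19.1 as a list of 0/1 values.

    past_prices must contain at least one earlier price.
    """
    ratio = price * r / dividend
    bits = [int(ratio > threshold) for threshold in RATIO_THRESHOLDS]
    bits += [int(price > moving_average(past_prices, w)) for w in MA_WINDOWS]
    bits += [1, 0]  # bits 11 and 12 are constant
    return bits


# Illustrative state: the price has just risen from 100 to 110.
bits = market_vector(price=110.0, dividend=5.0, r=0.05,
                     past_prices=[100.0] * 600)
```

In this example the price-dividend signal lies between 1 and 9/8, so bits 1 to 5 are set and bit 6 is not, while all four moving-average bits are set.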

Each trader is able to store parameter tuples for 100 market situations. Each of these trading rules consists of a condition part, a forecasting part and a fitness measure, as shown in Table 19.1. The condition part of rule $j$ consists of twelve bits $(m_{j,1}, \ldots, m_{j,12})$ in {0; 1; #}. In order to identify the current market situation, these bits are compared to the current market vector, where 1 matches 1, 0 matches 0 and # matches both. If all bits match, the rule becomes active. There is at least one active rule, namely the one with $m_1 = m_2 = \ldots = m_{12} = \#$, which matches every market state, but there might also be several active rules. Among these, the rule with the highest fitness is used to forecast the future return as given in equation 19.26. The expected variance of the forecast corresponds to the fitness of the chosen rule.

Traders learn in two ways. First, the fitness of all active trading rules is updated at the end of each trading period in order to reflect the recent forecasting accuracy of the rules. The second mechanism is a genetic algorithm, which is activated repeatedly. The genetic algorithm replaces the 20 rules with the lowest fitness of each agent by mutations and recombinations of well-performing rules. Thereby traders repeatedly adjust their rule set to recent market observations.

There exist numerous extensions and alternative designs of the model. Most of these are reviewed in (LeBaron 2006). (LeBaron 2002) gives an inside view of the design decisions the developers of the model had to face. The complexity of the model and the many design alternatives give rise to criticism, because the large number of parameters that have to be set makes it difficult to pin down clear causalities. Nevertheless the authors show that for slow learning, i. e. when the genetic algorithm is activated every 200 periods on average, the arising price dynamics correspond to the rational expectations equilibrium (REE), i. e. the risk-adjusted fundamental price. For faster learning, i. e. when the genetic algorithm is activated every 50 periods on average, the price dynamics exhibit major stylized facts, and phases of substantial deviation from the REE price are observed.
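The rule-matching step described above, in which condition bits are compared to the market vector and the fittest active rule produces the forecast, can be sketched as follows. The `Rule` class and the function names are hypothetical, conditions are stored as twelve-character strings for readability, and the fitness update and the genetic algorithm are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: str   # twelve characters, each "0", "1" or "#"
    a: float         # slope of the linear forecast
    b: float         # intercept of the linear forecast
    fitness: float   # higher values mean better past performance


def is_active(rule, market_bits):
    """A rule is active if every condition bit matches; "#" matches both."""
    return all(c == "#" or int(c) == bit
               for c, bit in zip(rule.condition, market_bits))


def forecast(rules, market_bits, price, dividend):
    """Forecast p_{t+1} + d_{t+1} with the fittest active rule.

    Assumes that the rule set contains the all-"#" rule, so that at least
    one rule is always active.
    """
    active = [rule for rule in rules if is_active(rule, market_bits)]
    best = max(active, key=lambda rule: rule.fitness)
    return best.a * (price + dividend) + best.b


# Illustrative rule set: a default rule and a fitter rule that requires bit 1.
rules = [Rule("#" * 12, a=1.0, b=0.0, fitness=0.1),
         Rule("1" + "#" * 11, a=0.9, b=2.0, fitness=0.5)]
bit1_set = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
bit1_unset = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
f_set = forecast(rules, bit1_set, price=100.0, dividend=5.0)
f_unset = forecast(rules, bit1_unset, price=100.0, dividend=5.0)
```

When bit 1 is set, both rules are active and the fitter one is used; otherwise only the default rule applies and the forecast is simply `price + dividend`.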

In the SF-ASM model the traders follow standard fundamental and technical trading patterns. (Daniel et al. 2005) use human subject experiments in order to compare the performance and the impact of very heterogeneous trading approaches. To this end, humans from different backgrounds program and update their own traders, which act in an artificial stock market model. This method allows the authors to analyse a huge variety of non-standard trading strategies directly imprinted by humans.

19.5 Competition of Trading Strategies

Different trading strategies have a huge impact on the performance of financial market models. In the models described in the previous sections, traders use strategies based on reasonable beliefs. These are mostly fundamental signals and trend analysis. If a selective framework is applied, these strategies coexist in the market, with temporary dominance of one of the two. Neither of them, however, is eliminated persistently from the market. Thus, there are phases when one strategy works better than the other, but on average both strategies perform well. Nevertheless it is important to analyze the returns of different trading strategies in order to identify dominant strategies and the motivation for strategy selection. One important aspect is the impact of strategies on market dynamics and thereby on the returns of other traders.

19.5.1 Friedman Hypothesis Revisited

Behavioural finance investigates the impact of boundedly rational behaviour on financial markets, in the belief that this helps to understand some important financial phenomena and offers an explanation for the risk premium puzzle. A good survey of behavioural finance is given by (Barberis and Thaler 2003). One important aspect of this field is the noise trader approach, which investigates the impact of obviously wrong information on the market. The Friedman hypothesis states that irrational traders will realize lower returns than rational arbitrageurs and, due to evolutionary pressure, will disappear from the market. An example of such irrational traders are so-called noise traders, who erroneously believe that they have special information about an asset and trade on that wrong information (see (Kyle 1985; Black 1986)). In contrast to the Friedman hypothesis, noise traders do not necessarily perform worse than rational traders. This is shown by (Delong et al. 1990). In their evolutionary model rational arbitrageurs trade against noise traders. The authors show that a significant impact of noise traders may drive asset prices away from their fundamental values and thereby induce additional risk. Rational arbitrageurs take the increased risk into account and, since the opinion of noise traders is generally unpredictable, they leave the market. In the selective model of (Delong et al. 1990) this enables noise traders to realize higher returns than rational traders; the higher the risk aversion of rational traders, the higher the difference in returns. In this scenario it is even possible that rational traders are completely driven out of the market by noise traders. It is noteworthy that it is the risk aversion, or rather the utility function, of rational traders that allows noise traders, who are willing to bear higher risk, to earn higher returns. In terms of utility the rational traders always perform better than the noise traders, and thus, if selection is based on utility instead of returns, noise traders cannot survive in the long run.

19.5.2 Dominant Strategies

Not only wrong information can lead to decreasing returns for sophisticated traders; chart trading can also lower the overall expected return of the market. This is shown in (Joshi et al. 1998). The authors analyze the returns of traders in the SF-ASM model, depending on the use of chart information or of combined chart and fundamental information. They identify a multi-person prisoner’s dilemma in the selection of strategies. Table 19.2 shows the final wealth of an agent for the four scenarios A to D. If all traders, including the agent under observation, abandon technical trading (situation D), the payoff is fairly high. This is the social optimum a regulator would desire. If only the agent under observation uses technical trading (B), he has an even higher payoff. If the number of traders is high, the change of this single agent’s strategy has a negligible impact on the average payoff of all other traders, i. e. it stays approximately at the level of situation D. Thus each individual trader has an incentive to include technical trading if all other traders are not doing so. In situation A all traders, including the agent under observation, include technical trading. This leads to a fairly low payoff level, which is clearly below the payoff in situation D. If all traders but the one under observation include technical trading (situation C), the payoff of this agent is even below the level of A. Thus there is no incentive for individual traders to abandon technical trading if all other traders are using this method. For each individual trader it is a dominant strategy to include technical trading, but if all traders follow this dominant strategy the overall payoff is lowered and the market is locked in an inefficient equilibrium (A).

Table 19.2. Strategy selection leads to a prisoner’s dilemma (cf. (Joshi et al. 1998))

                                  All other traders
The agent under observation       include technical trading    exclude technical trading
includes technical trading        A: ~113                      B: ~154
excludes technical trading        C: ~97                       D: ~137
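The prisoner's dilemma structure of Table 19.2 can be checked mechanically. The sketch below stores the approximate final wealth of the observed agent from the table and tests which choice is dominant. Treating the game as symmetric, so that every trader faces the same payoffs, is an assumption made for illustration, and the function names are hypothetical.

```python
# Approximate final wealth of the observed agent, taken from Table 19.2.
# Key: (choice of the observed agent, choice of all other traders).
PAYOFF = {
    ("include", "include"): 113,  # situation A
    ("include", "exclude"): 154,  # situation B
    ("exclude", "include"): 97,   # situation C
    ("exclude", "exclude"): 137,  # situation D
}
CHOICES = ("include", "exclude")


def best_response(others):
    """Choice of the observed agent that maximizes wealth given the others."""
    return max(CHOICES, key=lambda own: PAYOFF[(own, others)])


def is_dominant(own):
    """True if `own` is the best response to every choice of the others."""
    return all(best_response(others) == own for others in CHOICES)


include_is_dominant = is_dominant("include")
equilibrium_is_inefficient = (PAYOFF[("include", "include")]
                              < PAYOFF[("exclude", "exclude")])
```

Both flags evaluate to true for the values in the table: including technical trading is individually dominant, yet the resulting outcome A yields less wealth than the cooperative outcome D.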

According to (Joshi et al. 2002) a similar situation arises for the optimal frequency to revise the forecasting rules. As shown for the SF-ASM model in (LeBaron et al. 1999) slow learning leads to prices close to the REE, whereas fast learning leads to substantial deviation from the REE-prices, i. e. the market becomes inefficient with fast learning. (Joshi et al. 2002) use this model to show that each individual trader has an incentive to increase the frequency of revising her rules. Again the optimal decision, i. e. fast learning, leads to a suboptimal outcome, because it increases the amount of technical trading and noise in the market. Thereby it becomes more difficult for all traders to predict future prices.

19.6 Multi-Asset Markets

A natural extension of the models with one risky asset is to allow traders to split their investment among several risky assets. But more assets make the mathematical determination of optimal allocations, i. e. portfolios, more complicated. Therefore the first multi-asset models restrict their traders and thereby avoid the problem of portfolio building. A first example is (Westerhoff 2004). In his model fundamental traders are restricted to invest in one particular asset. Thus the number of fundamentalists who are active in this market is fixed. Chartists are allowed to switch between different assets, but are restricted to invest in one risky asset at a time. Thus they do not build portfolios with several risky positions. Chartists choose the market with the strongest trend signal for their risky investment. This changes the proportions of chartists to fundamentalists in the markets and leads to complex dynamics, e. g. co-movement of stocks and the replication of stylized facts. Other examples of multi-asset markets are (Cincotti et al. 2005; Cincotti et al. 2006). The first one avoids portfolio selection by using zero-intelligence traders, i. e. traders determine their portfolios randomly. In (Cincotti et al. 2006) traders are seen as nodes of a sparsely connected graph, so that each agent is influenced by nearby agents. The authors point out that this is the first model that reproduces univariate and multivariate stylized facts at the same time. Very recently (Chiarella et al. 2007) presented a multi-asset model that is based on (Brock and Hommes 1998; Chiarella and He 2001; Chiarella and He 2003). The authors extend the framework with a single risky asset to n risky assets. They apply the standard one-period capital asset pricing model (CAPM), developed simultaneously by (Sharpe 1964; Lintner 1965; Mossin 1966), in order to determine the optimal demand for shares of the risky assets. In order to keep the analysis simple, (Chiarella et al. 2007) exclude strategy switching in their model. Nevertheless it provides a general framework for multi-asset markets, as (Brock and Hommes 1998) does for the classical case of one risky and one risk-free asset. This model might become the basis for numerous extensions and investigations of multi-asset market dynamics.

19.7 Extensions and Applications

The approach of modelling financial markets as evolving systems of heterogeneous interacting agents has been successful in explaining empirical phenomena which were not addressed by classical financial theory. Recently researchers started to extend the established models of this field in order to investigate related questions in financial markets. One application of these models is the regulation of financial markets. An interesting instrument to reduce destabilizing speculation is the use of Tobin-like transaction taxes. The impact of such instruments on the market is investigated e. g. in (Westerhoff 2003; Mannaro et al. 2005; Westerhoff and Dieci 2006). Another interesting extension is the interaction of markets. E. g. (Sallans et al. 2003) analyse the interaction of a financial market with a consumer market. Basically this means replacing the stochastic dividend process by a microstructural model of the underlying company, i. e. its actions, decisions and, most importantly, its cash flows. Naturally such an extension comes at the cost of more complexity and many new parameters. (Sallans et al. 2003) apply the Markov chain Monte Carlo method in order to analyze the impact of the most important parameters systematically. This method not only shows the impact of parameters, it also enables modellers to fine-tune many parameters simultaneously in order to calibrate models to real-world time series. The new insights into financial markets and the traders therein are also used in the new field of Market Engineering, which analyzes and optimizes electronic market platforms with respect to their micro-, infra- and business structure. An introduction to this field is provided e. g. by (Roth 2002; Weinhardt et al. 2003) (resp. (Holtmann et al. 2002)).

19.8 Conclusion and Outlook

The survey presented above shows how the field of financial market models has evolved from very stylized models addressing single issues to very complex models that are able to replicate many stylized facts simultaneously. These models are very successful in explaining important puzzles in the dynamics of financial markets, and thereby reduce the gap between empirical findings in real markets and the corresponding theoretical models. Although recent models are very complex and are able to explain certain aspects of market dynamics, they are by no means realistic. Recall that most models are limited to one risky and one risk-free asset. One of the most important tasks for future research is to extend these models, e. g. to multi-asset markets where traders can build realistic portfolios. Another important step is to connect the financial market with the activities of the underlying economy, e. g. individual companies or industrial sectors. This might shed some light on the interactions and dependencies of financial markets and thereby lead to a better understanding of financial markets as a mechanism which is embedded in an economy. The existing stylized frameworks have to be filled with proper microeconomic details in order to derive more realistic general assumptions explaining the behaviour of traders and markets. Furthermore, the results from financial market models have to find their way into financial theory. These models may help traders and regulators to understand the dynamics of financial markets better than today, and to increase the benefits from these important economic instruments.

References

Allen H, Taylor MP (1990) Charts, noise and fundamentals in the London foreign exchange market. Economic Journal 100(400):49–59
Anderson PW, Arrow KJ, Pines D (1988) The Economy as an evolving Complex System. Addison Wesley Longman Publishing Co
Arthur WB, Holland J, LeBaron B, Taylor P (1997) Asset pricing under endogenous expectations in an artificial stock market. In: Arthur WB, Durlauf S, Lane D (eds) The economy as an evolving complex system II. Addison-Wesley, Reading, MA, pp. 15–44
Aspnes J, Fischer D, Fischer M, Kao M, Kumar A (2001) Towards understanding the predictability of stock markets from the perspective of computational complexity. In: Proc. of 12th annual ACM SIAM Symposium of Discrete Algorithms, pp. 745–754
Barberis N, Thaler R (2003) A survey of behavioural finance. In: Constantinides GM, Harris M, Stulz R (eds) Handbook of the Economics of Finance. Elsevier
Beja A, Goldman MB (1980) On the dynamic behaviour of prices in disequilibrium. Journal of Finance 35:235–248
Black F (1986) Noise. Journal of Finance 41:529–543
Brock WA, Lakonishok J, LeBaron B (1992) Simple technical trading rules and the stochastic properties of stock returns. Journal of Finance 47:1731–1764
Brock WA, Hommes C (1997) A rational route to randomness. Econometrica 65:1059–1095
Brock WA, Hommes C (1998) Heterogeneous beliefs and routes to chaos in a simple asset pricing model. Journal of Economic Dynamics and Control 22:1235–1274
Campbell JY, Shiller R (1988) The dividend-price ratio and expectations of future dividends and discount factors. Review of Financial Studies 1:195–227
Chang SK (2007) A simple asset pricing model with social interactions and heterogeneous beliefs. Journal of Economic Dynamics and Control 31:1300–1325
Chiarella C (1992) The dynamics of speculative behaviour. Annals of Operations Research 37:101–123

Chiarella C, Dieci R, He X (2007) Heterogeneous expectation and speculative behaviour in a dynamic multi-asset framework. Journal of Economic Behaviour and Organization 62(3):408–427
Chiarella C, He X (2007) An adaptive model on asset pricing and wealth dynamics in a market with heterogeneous trading strategies. In this Handbook
Chiarella C, He X (2001) Asset pricing and wealth dynamics under heterogeneous expectations. Quantitative Finance 1:509–526
Chiarella C, He X (2002) Heterogeneous beliefs, risk and learning in a simple asset pricing model. Computational Economics 19:95–132
Chiarella C, He X (2003) Heterogeneous beliefs, risk and learning in a simple asset pricing model with a market maker. Macroeconomic Dynamics 7:503–536
Cincotti S, Ponta L, Pastore S (2006) Information-based multi-asset artificial stock market with heterogeneous agents. 1st International Conference on Economic Sciences with Heterogeneous Interacting Agents (WEHIA 2006), University of Bologna, Italy
Cincotti S, Ponta L, Raberto M (2005) A multi-assets artificial stock market with zero-intelligence traders. 10th Annual Workshop on Economic Heterogeneous Interacting Agents (WEHIA2005), University of Essex, UK
Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance 1:223–236
Cutler DM, Poterba JM, Summers LH (1989) What moves stock prices? Journal of Portfolio Management 15:4–12
Dacorogna MM, Müller UA, Jost C, Pictet OV, Olsen RB, Ward JR (1995) Heterogeneous real time trading strategies in the foreign exchange market. European Journal of Finance 1:383–403
Daniel G, Muchnik L, Solomon S (2005) Traders imprint themselves by adaptively updating their own avatar. In: Mathieu P, Beaufils B, Brandouy O (eds) Artificial Economics – Agent-Based Methods in Finance, Game Theory and Their Applications. Lecture Notes in Economics and Mathematical Systems, Vol. 564, Springer Berlin Heidelberg, pp. 27–38
Day RH, Huang WH (1990) Bulls, bears, and market sheep. Journal of Economic Behaviour and Organization 14:299–330
Delong JB, Shleifer A, Summers LH, Waldmann RJ (1990) Noise trader risk in financial markets. Journal of Political Economy 98:703–738
Fama EF (1965) The behaviour of stock market prices. Journal of Business 38:34–105
Farmer JD (2002) Market force, ecology and evolution. Industrial and Corporate Change 11:895–953
Farmer JD, Joshi S (2002) The price dynamics of common trading strategies. Journal of Economic Behaviour and Organization 49:149–171
Frankel JA, Froot KA (1986) Understanding the US dollar in the eighties: The expectations of chartists and fundamentalists. Economic Record, Special issue: pp. 24–38

Frankel JA, Froot KA (1987a) Short-term and long-term expectations of the yen/dollar exchange rate: evidence from survey data. Journal of the Japanese and International Economies 1:249–274
Frankel JA, Froot KA (1987b) Using survey data to test standard propositions regarding exchange rate expectations. American Economic Review 77:133–153
Frankel JA, Froot KA (1990a) Chartists, fundamentalists and the demand for dollars. In: Courakis AS, Taylor MP (eds) Private behaviour and government policy in interdependent economies. Oxford University Press, New York, pp. 73–126
Frankel JA, Froot KA (1990b) The rationality of the foreign exchange rate. Chartists, fundamentalists and trading in the foreign exchange market. American Economic Review 80(2):181–185
Friedman M (1953) The case of flexible exchange rates. In: Essays in Positive Economics. University of Chicago Press
Grossman S, Stiglitz J (1980) On the impossibility of informationally efficient markets. American Economic Review 70:393–408
Hakansson NH, Beja A, Kale J (1985) On the feasibility of automated market making by a programmed specialist. Journal of Finance 40(1):1–20
Haken H (1993) Synergetics – An Introduction. Springer Series in Synergetics, Berlin: Springer, 3rd edn
Hansen L, Singleton K (1983) Stochastic consumption, risk aversion, and the temporal behaviour of asset returns. Journal of Political Economy 91:249–265
Holtmann C, Neumann D, Weinhardt C (2002) Market engineering – towards an interdisciplinary approach. Technical report, Department of Economics and Business Engineering, Chair for Information Management and Systems, Universität Karlsruhe (TH)
Ito K (1990) Foreign exchange rate expectations. American Economic Review 80:434–449
Joshi S, Parker J, Bedau M (1998) Technical trading creates a prisoner’s dilemma: Results from an agent-based model. Research in Economics, Number 98-12-115e, Santa Fe Institute
Joshi S, Parker J, Bedau M (2002) Financial markets can be at sub-optimal equilibria. Computational Economics 19:5–23
Keynes JM (1936) The general theory of employment, interest and money. Harcourt, Brace and World, New York
Kirman A (1991) Epidemics of opinion and speculative bubbles in financial markets. In: Taylor M (ed) Money and Financial Markets, Macmillan
Kirman A (1993) Ants, rationality and recruitment. Quarterly Journal of Economics 108:137–156
Kyle AS (1985) Continuous auctions and insider trading. Econometrica 47:1315–1336
LeBaron B (2000) Agent-based computational finance: Suggested readings and early research. Journal of Economic Dynamics and Control 24:679–702
LeBaron B (2002) Building the Santa Fe Artificial Stock Market, Working paper, Santa Fe Institute

LeBaron B (2006) Computational finance. In: Tesfatsion L, Judd KL (eds) Handbook of Computational Economics, vol 2. Elsevier
LeBaron B, Arthur WB, Palmer R (1999) Time series properties of an artificial stock market. Journal of Economic Dynamics and Control 23:1487–1516
Lintner J (1965) The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. Review of Economics and Statistics 47:13–37
Lo AW, MacKinlay AC (1988) Stock prices do not follow random walks: Evidence from a simple specification test. Review of Financial Studies 1:41–66
Lucas RE (1971) Econometric testing of the natural rate hypothesis. In: Eckstein O (ed) The Econometrics of Price Determination Conference. Board of Governors of the Federal Reserve System and Social Science Research Council
Lux T (1995) Herd behaviour, bubbles and crashes. The Economic Journal 105(431):881–896
Lux T (1997) Time variation of second moments from a noise trader/infection model. Journal of Economic Dynamics and Control 22:1–38
Lux T (1998) The socio-economic dynamics of speculative markets: interacting agents, chaos, and the fat tails of return distribution. Journal of Economic Behaviour and Organization 33:143–165
Mandelbrot BB (1963) The variation of certain speculative prices. Journal of Business 36:394–419
Mannaro K, Marchesi M, Setzu A (2005) Using an artificial financial market for assessing the impact of Tobin-like transaction taxes. Technical report, DIEE, University of Cagliari
Mehra R, Prescott EC (1988) The equity risk premium: A solution? Journal of Monetary Economics 22:133–136
Milgrom PR, Stokey N (1982) Information, trade and common knowledge. Journal of Economic Theory 26:17–27
Mossin J (1966) Equilibrium in a capital asset market. Econometrica 34:768–783
Muth JF (1961) Rational expectations and the theory of price movements. Econometrica 29:315–335
Roth AE (2002) The economist as an engineer: Game theory, experimental economics and computation as tools of design economics. Econometrica 70(4):1341–1378
Sallans B, Pfister A, Karatzoglou A, Dorfner G (2003) Simulation and validation of an integrated markets model. Journal of Artificial Societies and Social Simulation, 6(4)
Sargent TJ (1993) Bounded Rationality in Macroeconomics. Clarendon Press, Oxford
Sharpe WF (1964) Capital asset prices: a theory of market equilibrium under conditions of risk. Journal of Finance 19:425–442
Shiller RJ (2003) From efficient market theory to behavioural finance. Journal of Economic Perspectives 17:83–104
Simon HA (1957) Models of Man. Wiley, New York

Taylor MP, Allen H (1992) The use of technical analysis in the foreign exchange market. Journal of International Money and Finance 11:304–314
Weinhardt C, Holtmann C, Neumann D (2003) Market Engineering. Wirtschaftsinformatik 45(6):635–640
Westerhoff F (2003) Heterogeneous traders and the Tobin tax. Journal of Evolutionary Economics 13:53–70
Westerhoff F (2004) Multi-asset market dynamics. Macroeconomic Dynamics 8:596–616
Westerhoff F, Dieci R (2006) The effectiveness of Keynes-Tobin transaction taxes when heterogeneous agents can trade in different markets: A behavioural finance approach. Journal of Economic Dynamics and Control 30:193–322

CHAPTER 20
An Adaptive Model of Asset Price and Wealth Dynamics in a Market with Heterogeneous Trading Strategies
Carl Chiarella, Xue-Zhong He

20.1 Introduction

The traditional asset-pricing models, such as the capital asset pricing model (CAPM) of [42] and [34], the arbitrage pricing theory (APT) of [40], or the intertemporal capital asset pricing model (ICAPM) of [38], have investor homogeneity as one of their important assumptions. In particular, the paradigm of the representative agent assumes that all agents are homogeneous with regard to their preferences, their expectations and their investment strategies.1 However, as already argued by Keynes in the 1930s, agents do not have sufficient knowledge of the structure of the economy to form correct mathematical expectations that would be held by all agents.

The other important paradigm underpinning these models, the efficient market hypothesis (EMH), maintains that the current price contains all available information and that past prices cannot help in predicting future prices. However, there is evidence that markets are not always efficient and that there are periods when real data show significantly higher autocorrelation of returns than would be expected under the EMH. Over the last decade, a large volume of empirical work (e.g., [8, 25, 26, 3, 41, 2, 17, 39, 29]) has documented a variety of ways in which asset returns can be predicted based on publicly available information, and many of the results

1 Financial support from the Australian Research Council under Discovery Grant DP0450526 is gratefully acknowledged. An early version of this paper was presented at the 7th Asia Pacific Finance Association Annual Conference, Shanghai, July 24-26, 2000, the 2nd CeNDEF Workshop on Economic Dynamics, University of Amsterdam, 2001, the 19th Economic Theory Workshop, University of Technology, Sydney, 2001, and seminars at the University of Melbourne, the University of New South Wales, and the University of Bielefeld. In particular, we would like to thank Volker Bohm, Cars Hommes, Thomas Lux, Willi Semmler and Jan Wenzelburger for stimulating comments and discussion. The authors are responsible for any remaining errors in this paper.


can be thought of as belonging to one of two broad categories of phenomena.2 On the one hand, returns appear to exhibit continuation, or momentum, over short to medium time intervals, which may imply the profitability of momentum trading strategies over such intervals. On the other hand, there is also a tendency toward reversals over long time intervals, leading to possible profitability of contrarian strategies. The traditional models of finance theory seem to have difficulty in explaining this growing set of stylized facts. As a result, there is a growing dissatisfaction with (i) models of asset price dynamics based on the representative agent paradigm, as expressed for example by [27], and (ii) the extreme informational assumptions of rational expectations.

In order to extend the traditional models of finance theory so as to accommodate some of the aforementioned stylized facts, a literature has developed over the last decade that involves some departure from the classical assumptions of strict rationality and unlimited computational capacity, and introduces heterogeneity and bounded rationality of agents. This strand of literature seeks to explain the existing evidence on many of the anomalies observed in financial markets as the result of the dynamic interaction of heterogeneous agents. In financial markets, individuals are imperfectly rational. They seek to learn about the market from their trading outcomes; as a result, the market may fluctuate around the fully rational equilibrium. A number of recent models use this approach to characterize the interactions of heterogeneous agents in financial markets (e.g., [21, 16, 9, 35, 5, 6, 7, 12, 13, 18, 19, 20, 36, 28, 23]). To avoid the constraints of analytical tractability, many of these authors use computer simulations to explore a wider range of economic settings.
A general finding in many of these studies is that long-horizon agents frequently do not drive short-horizon agents out of financial markets, and that populations of long- and short-horizon agents can create patterns of volatility and volume similar to actual empirical patterns. [5], [6] propose to model economic and financial markets as an adaptive belief system (ABS), which is essentially an evolutionary competition among trading strategies. A key aspect of these models is that they exhibit expectations feedback and adaptiveness of agents. Agents adapt their beliefs over time by choosing from different predictors or expectations functions, based upon their past performance as measured by realized profits. Agents have the standard constant absolute risk aversion (CARA) utility function of the CAPM world but are boundedly rational, in the sense that they do not know the distribution of future returns. The evolutionary model generates endogenous price fluctuations with similar statistical properties to those observed in financial markets. The model of Brock and Hommes has been extended in [12] by allowing agents to have different risk attitudes and different expectation formation schemes for both first and second moments of the price distribution.

2 A detailed discussion of, and references to, the related empirical work are provided in Section 20.2.


Because of the underlying CARA utility function, investors’ optimal decisions depend only on the asset price and do not directly involve their wealth. The resulting separation of asset price and wealth dynamics greatly simplifies the analysis of the model. However, a consequence of this separation is that the model cannot generate the type of growing price process that is observed in the market. [32] and [31] consider a more realistic model where investors’ optimal decisions depend on their wealth (as a result of an underlying constant relative risk aversion (CRRA) utility function) and both price and wealth processes are intertwined and thus growing. Using numerical simulations and comparing the stock price dynamics in models with homogeneous and heterogeneous expectations, they conclude that the homogeneous expectations assumption leads to a highly inefficient market with periodic (and therefore predictable) booms and crashes, while the introduction of heterogeneous expectations leads to much more realistic dynamics and more efficient markets.

[11] develop a theoretical model of the interaction of portfolio decisions and wealth dynamics with heterogeneous agents having CRRA utility functions. A growth equilibrium model of both the asset price and wealth is obtained. To characterize the interaction of heterogeneous agents in financial markets and conduct a theoretical analysis, stationary models in terms of return and wealth proportions (among different types of agents) are then developed. As a special case of the general heterogeneous model, these authors consider models of homogeneous agents and of two heterogeneous agents without switching of strategies. It is found that, in these cases, the heterogeneous model can have multiple steady states and that the convergence to the steady states follows an optimal selection principle: the return and wealth proportions tend to the steady state which has the relatively higher return.
The model developed displays the volatility clustering of the returns and the essential characteristics of the standard asset price dynamics model of continuous time finance in that the asset price fluctuates around a geometrically growing trend.

The aim of the current chapter is twofold. First, to establish an adaptive model of asset price and wealth dynamics in an economy of heterogeneous agents that extends the model in [11] to allow agents to switch amongst different types of trading strategies. Second, to characterize the profitability of the two most popular trading strategies in real markets: momentum and contrarian trading strategies. According to [5, 6] and the references cited therein, a financial market is an interaction of heterogeneous agents who adapt their beliefs from time to time. In our model, based on certain fitness measures, such as realized wealth, the agents are allowed to switch from one strategy to another from time to time. Consequently, a model with adaptive beliefs is established where evolutionary dynamics across predictor choice is coupled with the dynamics of the endogenous variables.

Empirical studies provide some evidence that momentum trading (or trend following) strategies are more profitable over short time intervals, while contrarian trading strategies are more profitable over long time intervals (see for instance [8, 25, 26, 3, 2, 17, 41, 39, 29, 30]). To characterize the profitability of momentum and contrarian trading strategies, a quasi-homogeneous model is introduced, in which agents use exactly the same trading strategies except for different time horizons. Our results in general support the empirical findings on the profitability of momentum and contrarian trading strategies. In addition, the model also exhibits various anomalies observed in financial markets, including overconfidence and underreaction, overreaction and herd behavior, excess volatility, and volatility clustering.

This chapter is organized as follows. Section 20.2 establishes an adaptive model of asset price and wealth dynamics with heterogeneous beliefs amongst agents. It is shown how the distributions of the wealth and population across heterogeneous agents are measured. As a simple case, a model of two types of agents is then considered in Section 20.3. To characterize the profitability of momentum and contrarian trading strategies, a quasi-homogeneous model is also introduced as a special case of the model of two types of agents in Section 20.3. The profitability of momentum and contrarian trading strategies is then analyzed in Sections 20.4 and 20.5, respectively. Section 20.6 concludes.

20.2 Adaptive Model with Heterogeneous Agents

This section is devoted to establishing an adaptive model of asset price and wealth dynamics with heterogeneous beliefs amongst agents. The model can be treated as a generalization and extension of some asset pricing models involving the interaction between heterogeneous agents, for example, [31, 4, 6, 15, 24 and 11]. The key characteristics of this modelling framework are the adaptiveness, the heterogeneity and the interaction of the economic agents. The heterogeneity is expressed in terms of different views on expectations of the distribution of future returns on the risky asset. The modelling framework of this paper extends that of the earlier cited works by focusing on the interaction of both asset price and wealth dynamics. The framework of the adaptive model developed here is similar to the one in [31] and [11].

Our hypothetical financial market contains two investment choices: a stock (or index of stocks) and a bond. The bond is assumed to be a risk-free asset and the stock is a risky asset. The model is developed in the discrete time setting of standard portfolio theory in that agents are allowed to revise their portfolios over each time interval, the new element being the heterogeneity and adaptiveness of agents and the way in which they form expectations on the return distributions.

The use of CARA utility functions has been standard in much of asset pricing theory. It has the characteristic of leading to demands that do not depend on the agents’ wealth, but this dependence turns out to be quite crucial in developing a model exhibiting a growing price trend. A CRRA utility function is sufficient to capture the interdependence of price and wealth dynamics.
The selection of logarithmic utility in the model developed here is based on a number of experimental and empirical studies, as summarized in [33]: “it is reasonable to assume decreasing absolute risk aversion (DARA) and constant relative risk aversion (CRRA)” (p. 65). It is also shown there that the only utility functions with both the DARA and the CRRA property are the power utility functions, of which the logarithmic utility function is a special case.
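The DARA and CRRA properties can be checked directly with the standard Arrow-Pratt risk aversion measures. The short derivation below is only an illustration: the symbols $A(W)$ and $R(W)$ and the power parametrization $(W^{\gamma} - 1)/\gamma$ with $0 < \gamma < 1$ are notation assumed here for the purpose of the check.

$$A(W) = -\frac{U''(W)}{U'(W)}, \qquad R(W) = W\,A(W).$$

For $U(W) = \log(W)$, we have $U'(W) = 1/W$ and $U''(W) = -1/W^2$, so that

$$A(W) = \frac{1}{W} \quad \text{(decreasing in } W\text{)}, \qquad R(W) = 1 \quad \text{(constant)}.$$

For $U(W) = (W^{\gamma} - 1)/\gamma$, we have $U'(W) = W^{\gamma - 1}$ and $U''(W) = (\gamma - 1) W^{\gamma - 2}$, so that

$$A(W) = \frac{1 - \gamma}{W}, \qquad R(W) = 1 - \gamma.$$

Since $(W^{\gamma} - 1)/\gamma \to \log(W)$ as $\gamma \to 0$, the logarithmic case is recovered as the limit with $R(W) \to 1$.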


For the standard portfolio optimization problem, a model in terms of price and wealth is first established in this section. However, it turns out that the resulting model is non-stationary in that both the price and wealth are growing processes. So, in order to reduce the growth model to a stationary model, the return on the risky asset and the wealth proportions (among heterogeneous investors), instead of price and wealth, are used as state variables. Based on certain performance (or fitness) measures, an adaptive mechanism is then introduced, leading to the general adaptive model. The final model includes the dynamics of both the asset price and wealth and it characterizes three important and related issues in the study of financial markets: heterogeneity, adaptiveness, and interaction of agents.

20.2.1 Notation

Denote

$p_t$: price per share of the risky asset at time $t$;
$y_t$: dividend at time $t$;
$r$: risk-free rate, with gross risk-free return $R = 1 + r$;
$N$: total number of shares of the risky asset;
$H$: total number of investors;
$W_{i,t}$: wealth of investor $i$ at time $t$;
$\pi_{i,t}$: proportion of the wealth of investor $i$ invested in the risky asset at time $t$.

It is assumed that3 all the agents have the same attitude to risk with the same utility function $U(W) = \log(W)$. Following the above notation, the return on the risky asset at period $t$ is then defined by4

$$\rho_t = \frac{p_t - p_{t-1} + y_t}{p_{t-1}}. \tag{20.1}$$

3 To make the following analysis more tractable and transparent, the assumption that all agents have the same utility function $U(W) = \log(W)$ is maintained in this paper. However, the analysis can be generalized to the case of utility functions that allow agents to have different risk coefficients, say, $U_i(W) = (W^{\gamma_i} - 1)/\gamma_i$ with $0 < \gamma_i < 1$. As shown by [12], the dynamics generated by differences in risk aversion coefficients are an interesting and important issue, which for the present model is left for future work.

4 The return can also be defined by the difference of the logarithms of the prices. It is known that the difference between these two definitions becomes smaller as the time interval is reduced (say, from monthly to weekly or daily).
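The return definition (20.1) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `simple_returns` and the convention that `prices[t]` and `dividends[t]` hold $p_t$ and $y_t$ are assumptions made here, not part of the model description.

```python
def simple_returns(prices, dividends):
    """Return the list of rho_t = (p_t - p_{t-1} + y_t) / p_{t-1} for t = 1, ..., T.

    prices[t] and dividends[t] are assumed to hold p_t and y_t (hypothetical layout).
    """
    if len(prices) != len(dividends):
        raise ValueError("prices and dividends must have the same length")
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [
        (prices[t] - prices[t - 1] + dividends[t]) / prices[t - 1]
        for t in range(1, len(prices))
    ]
```

The first observation only serves as the base price $p_0$, so a series of $T + 1$ prices yields $T$ returns.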


20.2.2 Portfolio Optimization Problem of Heterogeneous Agents

The framework is that of the standard one-period portfolio optimization problem. The wealth of agent (or investor) $i$ at time period $t + 1$ is given by

$$W_{i,t+1} = (1 - \pi_{i,t}) W_{i,t} R + \pi_{i,t} W_{i,t} (1 + \rho_{t+1}) = W_{i,t}[R + \pi_{i,t}(\rho_{t+1} - r)]. \tag{20.2}$$

As in [6] and [31], a Walrasian scenario is used to derive the demand equation; that is, each trader is viewed as a price taker and the market is viewed as finding (via the Walrasian auctioneer) the price $p_t$ that equates the sum of these demand schedules to the supply. In other words, the agents treat the period $t$ price, $p_t$, as parametric when solving their optimisation problem to determine $\pi_{i,t}$. Denote by $F_t = \{p_{t-1}, p_{t-2}, \ldots; y_t, y_{t-1}, y_{t-2}, \ldots\}$ the information set5 formed at time $t$. Let $E_t$, $V_t$ be the conditional expectation and variance, respectively, based on $F_t$, and $E_{i,t}$, $V_{i,t}$ be the “beliefs” of investor $i$ about the conditional expectation and variance. Then it follows from (20.2) that

$$\begin{aligned} E_{i,t}(W_{i,t+1}) &= W_{i,t}[R + \pi_{i,t}(E_{i,t}(\rho_{t+1}) - r)], \\ V_{i,t}(W_{i,t+1}) &= W_{i,t}^2 \pi_{i,t}^2 V_{i,t}(\rho_{t+1}). \end{aligned} \tag{20.3}$$

Consider investor $i$, who faces a given price $p_t$, has wealth $W_{i,t}$ and believes that the asset return is conditionally normally distributed with mean $E_{i,t}(\rho_{t+1})$ and variance $V_{i,t}(\rho_{t+1})$. This investor chooses a proportion $\pi_{i,t}$ of his/her wealth to be invested in the risky asset so as to maximize the expected utility of the wealth at $t + 1$, as given by $\max_{\pi_{i,t}} E_{i,t}[U(W_{i,t+1})]$. It follows that6 the optimum investment proportion at time $t$, $\pi_{i,t}$, is given by

$$\pi_{i,t} = \frac{E_{i,t}(\rho_{t+1}) - r}{V_{i,t}(\rho_{t+1})}. \tag{20.4}$$
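The wealth recursion (20.2) and the demand rule (20.4) can be sketched as follows. The function and parameter names are hypothetical, and the sketch simply evaluates the two formulas as stated; it does not reproduce the derivation behind (20.4).

```python
def optimal_proportion(expected_return, variance, r):
    """Proportion of wealth in the risky asset, as in (20.4): (E - r) / V."""
    if variance <= 0:
        raise ValueError("the believed variance must be strictly positive")
    return (expected_return - r) / variance


def next_wealth(wealth, proportion, realized_return, r):
    """Wealth at t + 1, as in (20.2): W * [R + pi * (rho - r)] with R = 1 + r."""
    gross_risk_free = 1.0 + r
    return wealth * (gross_risk_free + proportion * (realized_return - r))
```

For instance, an agent who believes in an expected return of 8% with variance 0.04 when the risk-free rate is 2% would, under these assumed inputs, hold `optimal_proportion(0.08, 0.04, 0.02)` of wealth in the risky asset; the result exceeds one, which corresponds to a leveraged position.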

Heterogeneous beliefs are introduced via the assumption that

$$E_{i,t}(\rho_{t+1}) = f_i(\rho_{t-1}, \ldots, \rho_{t-L_i}), \qquad V_{i,t}(\rho_{t+1}) = g_i(\rho_{t-1}, \ldots, \rho_{t-L_i}) \tag{20.5}$$

for $i = 1, \ldots, H$, where the $L_i$ are integers and $f_i$, $g_i$ are some deterministic functions which can differ across investors. Under this assumption, both $E_{i,t}(\rho_{t+1})$ and $V_{i,t}(\rho_{t+1})$ are functions of the past prices up to $t - 1$, which in turn implies that the optimum wealth proportion $\pi_{i,t}$, defined by (20.4), is a function of the history of the prices $(p_{t-1}, p_{t-2}, \ldots)$.7

5 Because of the Walrasian scenario, the hypothetical price $p_t$ at time $t$ is included in the information set to determine the market clearing price. However, agents form their expectations by using the past prices up to time $t - 1$.

6 See Appendix A.1 in [11] for details.

20.2.3 Market Clearing Equilibrium Price – A Growth Model

The optimum proportion of wealth invested in the risky asset, $\pi_{i,t}$, determines the number of shares at price $p_t$ that investor $i$ wishes to hold: $N_{i,t} = \pi_{i,t} W_{i,t}/p_t$. Summing the demands of all agents gives the aggregate demand. The total number of shares in the market, denoted by $N$, is assumed to be fixed, and hence the market clearing equilibrium price $p_t$ is determined by $\sum_{i=1}^{H} N_{i,t} = N$, i.e.,

$$\sum_{i=1}^{H} \pi_{i,t} W_{i,t} = N p_t. \tag{20.6}$$

Thus, equations (20.2) and (20.6) show that, in this model, as in real markets, the equilibrium price $p_t$ and the wealth of investors, $(W_{1,t}, \ldots, W_{H,t})$, are determined simultaneously. The optimum demands of agents are functions of the price and their wealth distribution. Also, as observed in financial markets, the model implies that both the price and the wealth are growing processes in general.
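The simultaneous determination of price and wealth can be made concrete with a small sketch. Substituting (20.1) and (20.2) into (20.6) gives a condition that is linear in $p_t$, because the proportions $\pi_{i,t}$ depend only on prices up to $t - 1$. The closed form used below is an algebraic rearrangement made for this illustration and is not stated in the text; all function and parameter names are hypothetical.

```python
def clearing_price(prev_price, dividend, prev_wealth, prev_pi, pi, r, n_shares):
    """Solve sum_i pi_{i,t} W_{i,t} = N p_t for p_t, using (20.1), (20.2) and (20.6).

    prev_wealth[i] = W_{i,t-1}, prev_pi[i] = pi_{i,t-1}, pi[i] = pi_{i,t}.
    """
    gross = 1.0 + r  # R = 1 + r
    # Part of sum_i pi_{i,t} W_{i,t} that does not depend on p_t.
    constant = sum(
        p_new * w * (gross * (1.0 - p_old) + p_old * dividend / prev_price)
        for p_new, p_old, w in zip(pi, prev_pi, prev_wealth)
    )
    # Coefficient of p_t in sum_i pi_{i,t} W_{i,t}.
    slope = sum(
        p_new * p_old * w for p_new, p_old, w in zip(pi, prev_pi, prev_wealth)
    ) / prev_price
    if n_shares - slope <= 0.0 or constant <= 0.0:
        raise ValueError("no positive market clearing price for these inputs")
    return constant / (n_shares - slope)


def updated_wealth(prev_wealth, prev_pi, prev_price, price, dividend, r):
    """Wealth of each investor after the price p_t is realized, as in (20.2)."""
    rho = (price - prev_price + dividend) / prev_price  # return (20.1)
    return [w * (1.0 + r + p * (rho - r)) for w, p in zip(prev_wealth, prev_pi)]
```

Once the price is known, `updated_wealth` gives the new wealth levels, and the sum of `pi[i] * wealth[i]` should equal `n_shares * price` up to rounding, which is a convenient consistency check on any implementation.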

7 In [31], the hypothetical price $p_t$ is included in the above conditional expectations on the return and variance. In this case, the market clearing price is solved implicitly and is much more involved mathematically. The approach adopted here is the standard one in deriving the price via the Walrasian scenario and also keeps the mathematical analysis tractable. A similar approach has been adopted in [6] and [12]. Of course, other market clearing mechanisms are possible, e.g., a market-maker. It turns out that the type of market clearing mechanism used does affect the dynamics; on this point see [13].

20.2.4 Population Distribution Measure

Now suppose all the agents can be grouped in terms of their conditional expectations of the mean and variance of returns of the risky asset. That is, within a group, all the agents follow the same expectation schemes on the conditional mean and variance of the return $\rho_{t+1}$, and hence the optimum wealth proportion ($\pi_{i,t}$) invested in the risky asset is the same for all agents in the group. Assume all the agents can be grouped into $h$ types (or groups) and that group $j$ has $\ell_{j,t}$ agents at time $t$ with $j = 1, \ldots, h$; then $\ell_{1,t} + \cdots + \ell_{h,t} = H$. Denote by $n_{j,t}$ the proportion of the number of agents in group $j$, at time $t$, relative to the total number of investors, $H$, that is, $n_{j,t} = \ell_{j,t}/H$, so that $n_{1,t} + \cdots + n_{h,t} = 1$.

Some simple examples on return and wealth dynamics when the proportions of different types of agents $n_{j,t}$ are fixed over time are given in [11]. However, this is a highly simplified assumption and it would be more realistic to allow agents to adjust their beliefs from time to time, based on some performance or fitness measures (say, for example, the realized returns or errors, as in [6]). In this way, one can account for investor psychology and herd behavior8. As a consequence, the proportions of different types of agents become endogenous state variables. Therefore the vector $(n_{1,t}, n_{2,t}, \ldots, n_{h,t})$ measures the population distribution among different types of heterogeneous agents. The change in the distribution over time can be used to measure herd behavior among heterogeneous agents, in particular, during highly volatile periods in financial markets.

20.2.5 Heterogeneous Representative Agents and Wealth Distribution Measure

By assuming the adaptiveness of agents’ behavior, agents may switch among different groups from time to time. To track the wealth evolution of each individual agent is certainly an interesting and important issue, but is rather a difficult problem within the current framework. However, by introducing heterogeneous representative agents (HRAs), the model established here is capable of characterizing their performance in terms of wealth distribution over time. By HRAs, we mean that such agents can become a fraction so that the sum of the fractions of all those agents is equal to 1. A more precise construction of such HRAs is given as follows.

Assume that all agents are grouped into $h$ types (according to their beliefs) and that group $j$ has $\ell_{j,t}$ ($j = 1, 2, \ldots, h$) agents at time $t$. Here $h$ is assumed to be fixed, while $\ell_{j,t}$ can vary from period to period. At time $t$, let $\bar{W}_{j,t}$ be the average wealth of agents within group $j$, so that $\ell_{j,t} \bar{W}_{j,t}$ gives the total wealth of group $j$. Denote by $\bar{w}_{j,t}$ the average wealth proportion of group $j$ relative to the total average wealth $\bar{W}_t$ at time $t$, that is,

$$\bar{w}_{j,t} = \frac{\bar{W}_{j,t}}{\bar{W}_t}, \qquad \text{with} \quad \bar{W}_t = \sum_{j=1}^{h} \bar{W}_{j,t}. \tag{20.7}$$

Then the vector $(\bar{w}_{1,t}, \bar{w}_{2,t}, \ldots, \bar{w}_{h,t})$ corresponds to the wealth proportion distribution among HRAs of different types; it measures the average wealth levels associated with different trading strategies.
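The two distribution measures, the population proportions $n_{j,t}$ of Section 20.2.4 and the average wealth proportions of (20.7), are both simple normalizations. A minimal sketch follows; the function names and the list-based inputs are assumptions made for illustration.

```python
def population_proportions(group_sizes):
    """n_j = (number of agents in group j) / H, where H is the total number of agents."""
    total = sum(group_sizes)
    if total <= 0:
        raise ValueError("there must be at least one agent")
    return [size / total for size in group_sizes]


def wealth_proportions(average_wealth):
    """Average wealth proportions as in (20.7): each group's average wealth
    divided by the sum of the group averages."""
    total = sum(average_wealth)
    if total <= 0:
        raise ValueError("the total average wealth must be strictly positive")
    return [w / total for w in average_wealth]
```

Note that, following (20.7), the wealth measure normalizes the group averages rather than the group totals, so it is not weighted by group size.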

8 See more discussion on this aspect in the next section.

20.2.6 Performance Measure, Population Evolution and Adaptiveness

Following [5], [6], a performance measure or fitness function, denoted $(\Phi_{1,t}, \ldots, \Phi_{h,t})$, is publicly available to all agents. Based on the performance measure, agents make a (boundedly) rational choice among the predictors. This results in the Adaptive Rational Equilibrium Dynamics, introduced by [5], an evolutionary dynamics across predictor choice which is coupled to the dynamics of the endogenous variables. In the limit as the number of agents goes to infinity, the probability that an agent chooses trading strategy $j$ is given by the well known discrete choice model or ‘Gibbs’ probabilities9

$$n_{j,t} = \exp[\beta(\Phi_{j,t-1} - C_j)]/Z_t, \qquad Z_t = \sum_{j=1}^{h} \exp[\beta(\Phi_{j,t-1} - C_j)], \tag{20.8}$$

where $C_j \geq 0$ measures the cost of the strategy $j$ for $j = 1, 2, \ldots, h$. The crucial feature of (20.8) is that the higher the fitness of trading strategy $j$, the more traders will select that strategy. The parameter $\beta$, called the intensity of choice or switching intensity, plays an important role and can be used to characterize various psychological effects, such as overconfidence and underreaction, overreaction and herd behavior, as discussed by [22].

On the one hand, when individuals are overconfident, they do not change their beliefs as much as would a rational Bayesian in the face of new evidence. This may result from either a high cost of processing new information or individuals’ reluctance to admit to having made a mistake (so the new evidence is under-weighted). Both overconfidence and underreaction can be partially captured by a small value of the switching intensity parameter $\beta$. In the extreme case when $\beta = 0$, there is no switching among strategies and the population of agents is evenly distributed across all trading strategies10. On the other hand, if the environment is volatile, or agents are less confident about their beliefs, agents may be less reluctant to switch to beliefs which generate better outcomes. This effect can be captured by a high value of the switching intensity parameter $\beta$. An increase in the switching intensity $\beta$ represents an increase in the degree of rationality with respect to evolutionary selection of trading strategies. In the extreme case when $\beta$ is very large (close to infinity), a large proportion of traders are willing to switch more quickly to successful trading strategies. In such a situation, market overreaction and herd behavior may be observed.

A natural performance measure can be taken as a weighted average of the realized wealth return on the proportion invested in the risky asset among $h$ HRAs, given by

$$\Phi_{j,t} = \phi_{j,t} + \gamma \Phi_{j,t-1}; \qquad \phi_{j,t} = \pi_{j,t-1} \frac{\bar{W}_{j,t} - \bar{W}_{j,t-1}}{\bar{W}_{j,t-1}} = \pi_{j,t-1}[r + (\rho_t - r)\pi_{j,t-1}]$$

for $j = 1, \ldots, h$, where $0 \leq \gamma \leq 1$ and $\phi_{j,t}$ is the realized wealth return invested in the risky asset in period $t$. Here $\gamma$ is a memory parameter measuring how strongly the past realized fitness is discounted for strategy selection, so that $\Phi_{j,t}$ may be interpreted as the accumulated discounted return on the proportion of wealth invested by group $j$ in the risky asset.

9 See [37] and [1] for extensive discussion of discrete choice models and their applications in economics.

10 See [11] for models with fixed, but not evenly distributed, population proportions among different types of trading strategies.
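The switching mechanism of this subsection, the discrete choice probabilities (20.8) together with the fitness recursion, can be sketched as below. The function names are hypothetical. Subtracting the largest exponent before calling `exp` is a numerical safeguard added here for illustration; it leaves the probabilities in (20.8) unchanged because the common factor cancels in the ratio.

```python
import math


def choice_probabilities(fitness, costs, beta):
    """Population proportions n_j from (20.8), given last period's fitness values."""
    scores = [beta * (f - c) for f, c in zip(fitness, costs)]
    largest = max(scores)
    # Shift by the largest score to avoid overflow for large beta.
    weights = [math.exp(s - largest) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def update_fitness(prev_fitness, prev_pi, rho, r, gamma):
    """Fitness recursion: Phi_t = phi_t + gamma * Phi_{t-1}, with
    phi_t = pi_{t-1} * [r + (rho_t - r) * pi_{t-1}]."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    phi = prev_pi * (r + (rho - r) * prev_pi)
    return phi + gamma * prev_fitness
```

With `beta = 0` the function returns equal proportions for all strategies, while a very large `beta` concentrates almost the whole population on the strategy with the highest net fitness, in line with the two extreme cases discussed above.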


20.2.7 An Adaptive Model

The above growth model is rendered stationary by formulating it in terms of the risky asset return and the average wealth proportions among the investors, instead of the stock price $p_t$ and the wealth $W_t$. It should be made clear that the term “stationary” is not being used in the econometric sense of a stationary stochastic process. Rather, the term is used to refer to a stationary dynamical system that has fixed points (as opposed to growing trends) as steady state solutions. It may well be that the dynamical system to be analyzed below generates time series that are non-stationary in the econometric sense; this will depend on both the local stability/instability properties of the dynamical system and how noise is processed by the nonlinear system. When such a steady state exists, it follows from (20.1) that it generates a geometrically growing price process. The dynamical system describing the evolution of the average wealth proportions of HRAs and the risky asset return is given by the following proposition.

Proposition 20.1. For group $i$, formed at time period $t - 1$, the average wealth proportions at the next time period $t$ evolve according to

$$\bar{w}_{i,t} = \frac{\bar{w}_{i,t-1}[R + (\rho_t - r)\pi_{i,t-1}]}{\sum_{j=1}^{h} \bar{w}_{j,t-1}[R + (\rho_t - r)\pi_{j,t-1}]} \qquad (i = 1, \ldots, h) \tag{20.9}$$

with return $\rho_t$ given by11

$$\rho_t = r + \frac{\sum_{i=1}^{h} \bar{w}_{i,t-1}[(1 + r)(n_{i,t-1}\pi_{i,t-1} - n_{i,t}\pi_{i,t}) - \alpha_t n_{i,t-1}\pi_{i,t-1}]}{\sum_{i=1}^{h} \pi_{i,t-1}\bar{w}_{i,t-1}(n_{i,t}\pi_{i,t} - n_{i,t-1})}, \tag{20.10}$$

where $\alpha_t$ denotes the dividend yield defined by $\alpha_t = y_t/p_{t-1}$, and the population proportions $n_{i,t}$ evolve according to

$$n_{i,t} = \exp[\beta(\Phi_{i,t-1} - C_i)]/Z_t, \tag{20.11}$$

in which the fitness functions are defined by

$$\Phi_{i,t} = \phi_{i,t} + \gamma \Phi_{i,t-1}, \qquad 0 \leq \gamma \leq 1,$$

$$\phi_{i,t} = \pi_{i,t-1} \frac{\bar{W}_{i,t} - \bar{W}_{i,t-1}}{\bar{W}_{i,t-1}} = \pi_{i,t-1}[r + (\rho_t - r)\pi_{i,t-1}], \qquad Z_t = \sum_{i=1}^{h} \exp[\beta(\Phi_{i,t-1} - C_i)],$$

and the constants $C_i \geq 0$ measure the cost of the strategy for $i = 1, 2, \ldots, h$.

Proof. See Appendix 20.7.1.

11 It is easy to check that $\rho_t \equiv r$ is a trivial solution. As a necessary condition for investing in the risky asset, it is assumed that $E(\rho_t) > r$.


Equations (20.9) and (20.10) constitute a difference equation system for $\bar{w}_{j,t}$ and $\rho_t$, the order of which depends on the choice by agents of the $L_j$ in equation (20.5). It should be stressed that, in the process of rendering the model stationary, it has become necessary to reason in terms of the dividend yield ($\alpha_t$) rather than the dividend ($y_t$) directly. It is easy to see that, when $h \leq H$, $\ell_j \geq 1$ and $\beta = 0$ for $j = 1, \ldots, h$, Proposition 20.1 leads to the model in [11] with fixed proportions $n_{j,t} = n_j$ ($j = 1, \ldots, h$) of heterogeneous agents.
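One step of the system in Proposition 20.1 can be sketched as follows, taking the wealth proportions, the population proportions and the investment proportions as given inputs. The function names are hypothetical, and the sketch only evaluates (20.9) and (20.10) as stated; how $\pi_{i,t}$ and $n_{i,t}$ are produced is left to the strategy and switching rules.

```python
def risky_return(w_prev, n_prev, n_now, pi_prev, pi_now, r, alpha):
    """Return rho_t from (20.10).

    w_prev[i] = average wealth proportion at t-1, n_prev[i] / n_now[i] = population
    proportions at t-1 / t, pi_prev[i] / pi_now[i] = investment proportions at t-1 / t,
    alpha = dividend yield y_t / p_{t-1}.
    """
    groups = list(zip(w_prev, n_prev, n_now, pi_prev, pi_now))
    numerator = sum(
        w * ((1.0 + r) * (n0 * p0 - n1 * p1) - alpha * n0 * p0)
        for w, n0, n1, p0, p1 in groups
    )
    denominator = sum(p0 * w * (n1 * p1 - n0) for w, n0, n1, p0, p1 in groups)
    if denominator == 0.0:
        raise ZeroDivisionError("denominator of (20.10) vanishes for these inputs")
    return r + numerator / denominator


def wealth_proportion_step(w_prev, pi_prev, rho, r):
    """Average wealth proportions at t from (20.9)."""
    gross = 1.0 + r  # R = 1 + r
    grown = [w * (gross + (rho - r) * p) for w, p in zip(w_prev, pi_prev)]
    total = sum(grown)
    return [g / total for g in grown]
```

A full simulation would alternate these two functions with the belief rules of (20.5) and the switching rule (20.11); the new wealth proportions always sum to one by construction.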

20.2.8 Trading Strategies

The adaptive model established in Proposition 20.1 is incomplete until the conditional expectations of agents on the mean and variance of returns are specified. Different trading strategies can be incorporated into this general adaptive model as indicated by equation (20.5). To illustrate various features of the model, only three simple, but well-documented, types of agents, termed fundamentalists, momentum traders and contrarians, are considered in this paper.12 None of these types is fully rational in the sense used in the rational expectations literature. The information on the dividends and realized prices is publicly available to all agent types.

20.2.9 Fundamental Traders

The fundamentalists make forecasts on the risk premium level based on both public and their private information about future fundamentals. It is assumed that

$$E_{F,t}(\rho_{t+1}) = r + \delta_F, \tag{20.12}$$

where $E_{F,t}$ denotes the fundamentalists’ estimate of the conditional mean of $\rho_{t+1}$ for the next period $t + 1$ and $\delta_F$ is their estimate of the risk premium.13 That is, the fundamentalists believe that the excess conditional expected return for the risky asset (above the risk-free rate) is given by the risk premium $\delta_F$ that they may have estimated from a detailed analysis of the risky asset (earnings reports, market prospects, political factors etc.).

20.2.10 Momentum Traders

Momentum traders, in contrast to the fundamental traders, do condition on the past prices. Momentum, or positive feedback, trading has several possible motivations, one being that agents form expectations of future prices by extrapolating trends. They buy into price trends and exaggerate them, leading to overshooting. As a result, there may appear excess volatility.

12 To simplify the analysis, we focus on the conditional mean estimation by assuming that the subjective estimation of variance of all the agents is given by a constant.

13 A constant risk premium is a simplified assumption. In practice, the risk premium may not necessarily be constant but could also be a function of, for example, the variance.


Empirical studies have given support to the view that momentum trading strategies yield significant profits over short time intervals (e.g., [3, 25, 26, 29, 39 and 41]). Although these results have been well accepted, the source of the profits and the interpretation of the evidence are widely debated. In addition, there does not exist in the literature a quantitative model to clarify and give theoretical support to such evidence. As a first step, this issue is discussed in the next section within the framework of the adaptive heterogeneous model outlined in Proposition 20.1. For momentum traders, it is assumed in this paper that their forecasts are “simple” functions of the history of past returns. More precisely, it is assumed that

$$E_{M,t}(\rho_{t+1}) = r + \delta_M + d_M \bar{\rho}_{M,t}, \qquad \bar{\rho}_{M,t} = \frac{1}{L_M} \sum_{k=1}^{L_M} \rho_{t-k}, \tag{20.13}$$

where $E_{M,t}(\rho_{t+1})$ denotes the momentum traders’ estimate of the conditional mean of $\rho_{t+1}$ for the next period $t + 1$, $\delta_M$ is their risk premium estimate and $d_M > 0$ corresponds to the extrapolation rate of the momentum trading strategy. The integer $L_M \geq 1$ corresponds to the memory length of momentum traders. Equation (20.13) states that the expected excess return (above the risk-free rate) of momentum traders has two components: their estimated risk premium $\delta_M$ and the trend extrapolation $d_M \bar{\rho}_{M,t}$, which is positively proportional to the moving average of the returns over the last $L_M$ time periods.

20.2.11 Contrarian Traders The profitability of contrarian investment strategies is now a well-established empirical fact in the finance literature (see, for example, [30]). Empirical evidence suggests that over long time intervals, contrarian strategies generate significant abnormal returns (see, for example, [2, 17, and 8]). Some evidence has shown that overreaction can use aggregate stock market value measures such as dividend yield to predict future market returns, so that contrarian investment strategies are on average profitable. In spite of the apparent robustness of such strategies, the underlying rationale for their success remains a matter of lively debate in both academic and practitioner communities. In the following section, the role of expectational errors in explaining the profitability of contrarian strategies is examined. For contrarian traders, it is assumed that

$$E_{C,t}(\rho_{t+1}) = r + \delta_C - d_C \rho_{C,t}, \qquad \rho_{C,t} = \frac{1}{L_C} \sum_{k=1}^{L_C} \rho_{t-k}, \tag{20.14}$$

where E_{C,t}(ρ_{t+1}) denotes the contrarian agents’ estimate of the conditional mean of ρ_{t+1} for the next period t + 1, δ_C is their risk premium estimate, and d_C > 0 corresponds to their extrapolation rate. The integer L_C ≥ 1 corresponds to the memory length of contrarian agents. Equation (20.14) states that contrarian traders believe that the difference between the excess conditional mean and the risk premium, [E_{C,t}(ρ_{t+1}) − r] − δ_C, is negatively proportional to the moving average of the returns over the last L_C time periods.

In addition to the different trading strategies used by different types of traders, there are various other ways to introduce agent heterogeneity, such as through different risk premia, extrapolation rates and memory lengths.
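The two forecasting rules (20.13) and (20.14) differ only in the sign of the extrapolation term, which the following minimal Python sketch makes explicit. The function name `expected_return`, the list-based return history and the numerical values are illustrative assumptions and are not taken from the chapter.

```python
from typing import Sequence


def expected_return(past_returns: Sequence[float], r: float, delta: float,
                    d: float, L: int) -> float:
    """Forecast r + delta + d * (average of the last L returns).

    Pass d = d_M > 0 for a momentum trader, as in (20.13), and
    d = -d_C < 0 for a contrarian trader, as in (20.14).
    past_returns is ordered oldest first, so past_returns[-1] is rho_{t-1}.
    """
    if L < 1 or len(past_returns) < L:
        raise ValueError("need L >= 1 and at least L past returns")
    moving_average = sum(past_returns[-L:]) / L
    return r + delta + d * moving_average


# Hypothetical history rho_{t-3}, rho_{t-2}, rho_{t-1}; values are made up.
history = [0.01, 0.02, 0.03]
momentum_forecast = expected_return(history, r=0.0, delta=0.01, d=0.5, L=2)
contrarian_forecast = expected_return(history, r=0.0, delta=0.01, d=-0.5, L=2)
```

After a run of positive returns the momentum forecast lies above the risk premium and the contrarian forecast below it, by the same amount.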

20.3

An Adaptive Model of Two Types of Agents

In the remainder of this chapter, the focus is on a simple model of just two types of agents: momentum traders and contrarian traders. In this case, the adaptive model developed in Section 20.2 can be reduced to a simple form, as indicated below. To examine the profitability of momentum and contrarian trading strategies over different time intervals, a special case of the model, termed the quasi-homogeneous model, is then considered. A detailed discussion of the dynamics of such quasi-homogeneous models, including profitability, herd behavior, price overshooting and statistical patterns of returns, is then undertaken in the subsequent sections.

20.3.1 The Model for Two Types of Agents

Assume that there are only two different types of trading strategies. Let w_t and n_t be the differences of the average wealth proportions and of the population proportions of type 1 and type 2 agents; that is

$$w_t = w_{1,t} - w_{2,t}, \qquad n_t = n_{1,t} - n_{2,t}. \tag{20.15}$$

Then it follows from w_{1,t} + w_{2,t} = 1 and n_{1,t} + n_{2,t} = 1 that

$$w_{1,t} = \frac{1 + w_t}{2}, \qquad w_{2,t} = \frac{1 - w_t}{2} \qquad \text{and} \qquad n_{1,t} = \frac{1 + n_t}{2}, \qquad n_{2,t} = \frac{1 - n_t}{2}.$$

Correspondingly, the adaptive model in Proposition 20.1 can be reduced to a simple form given by Proposition 20.2.

Proposition 20.2. The difference of the average wealth proportions w_t evolves according to

$$w_{t+1} = \frac{f_1 - f_2}{f_1 + f_2} \tag{20.16}$$

with return ρ_{t+1} given by

$$\rho_{t+1} = r + \frac{g_{11} + g_{12}}{g_{21} + g_{22}}, \tag{20.17}$$


where

$$\begin{aligned}
f_1 &= (1 + w_t)\,[1 + r + (\rho_{t+1} - r)\pi_{1,t}], \\
f_2 &= (1 - w_t)\,[1 + r + (\rho_{t+1} - r)\pi_{2,t}], \\
g_{11} &= (1 + w_t)\,[(1 + r - \alpha_{t+1})(1 + n_t)\pi_{1,t} - (1 + r)(1 + n_{t+1})\pi_{1,t+1}], \\
g_{12} &= (1 - w_t)\,[(1 + r - \alpha_{t+1})(1 - n_t)\pi_{2,t} - (1 + r)(1 - n_{t+1})\pi_{2,t+1}], \\
g_{21} &= (1 + w_t)\,\pi_{1,t}\,[(1 + n_{t+1})\pi_{1,t+1} - (1 + n_t)], \\
g_{22} &= (1 - w_t)\,\pi_{2,t}\,[(1 - n_{t+1})\pi_{2,t+1} - (1 - n_t)]
\end{aligned}$$

and π_{j,t} (j = 1, 2) are defined by (20.4). The difference of population proportions n_t evolves according to

$$n_{t+1} = \tanh\left[\frac{\beta}{2}\Big((\Phi_{1,t} - \Phi_{2,t}) - (C_1 - C_2)\Big)\right], \tag{20.18}$$

where the fitness functions are defined as

$$\Phi_{j,t+1} = \pi_{j,t}\,[r + (\rho_{t+1} - r)\pi_{j,t}] + \gamma \Phi_{j,t}, \tag{20.19}$$

and C_j ≥ 0 measures the cost of the strategy for j = 1, 2.
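As an illustration of how the map in Proposition 20.2 can be iterated, the sketch below computes one step of (20.16) to (20.19) in Python. It is a minimal sketch under stated assumptions: the function name `step` and its argument list are hypothetical, the demands π_{j,t} and π_{j,t+1} are passed in as given numbers rather than computed from (20.4), and g_12 is taken in the form that mirrors g_11 for type 2 agents. The order of evaluation (first n_{t+1}, then ρ_{t+1}, then w_{t+1} and the fitness values) follows from which quantities each equation needs.

```python
import math


def step(w, n, pi1, pi2, pi1_next, pi2_next, phi1, phi2,
         r, alpha_next, beta, gamma, c1=0.0, c2=0.0):
    """One step of (20.16)-(20.19).

    Returns (rho_next, w_next, n_next, phi1_next, phi2_next).
    """
    # (20.18): new population difference from the current fitness values.
    n_next = math.tanh(0.5 * beta * ((phi1 - phi2) - (c1 - c2)))

    # (20.17): return implied by the demands and population shares.
    g11 = (1 + w) * ((1 + r - alpha_next) * (1 + n) * pi1
                     - (1 + r) * (1 + n_next) * pi1_next)
    g12 = (1 - w) * ((1 + r - alpha_next) * (1 - n) * pi2
                     - (1 + r) * (1 - n_next) * pi2_next)
    g21 = (1 + w) * pi1 * ((1 + n_next) * pi1_next - (1 + n))
    g22 = (1 - w) * pi2 * ((1 - n_next) * pi2_next - (1 - n))
    rho_next = r + (g11 + g12) / (g21 + g22)

    # (20.16): new difference of average wealth proportions.
    f1 = (1 + w) * (1 + r + (rho_next - r) * pi1)
    f2 = (1 - w) * (1 + r + (rho_next - r) * pi2)
    w_next = (f1 - f2) / (f1 + f2)

    # (20.19): fitness update with memory parameter gamma.
    phi1_next = pi1 * (r + (rho_next - r) * pi1) + gamma * phi1
    phi2_next = pi2 * (r + (rho_next - r) * pi2) + gamma * phi2
    return rho_next, w_next, n_next, phi1_next, phi2_next


# Hypothetical check: identical, constant demands for both types.
rho_next, w_next, n_next, _, _ = step(
    w=0.2, n=0.0, pi1=0.5, pi2=0.5, pi1_next=0.5, pi2_next=0.5,
    phi1=0.5, phi2=0.5, r=0.037, alpha_next=0.047, beta=0.5, gamma=0.5)
```

With identical constant demands π for both types, the step leaves w and n unchanged and the return reduces to r + α/(1 − π), which is about 0.131 for the made-up numbers above.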

20.3.2

Wealth Distribution and Profitability of Trading Strategies

The average wealth distribution among the two types of agents (following different trading strategies) is now characterized by w_t, the difference of the average wealth proportions. Over a certain time period, if w_t stays above (below) the initial value w_o and increases (decreases) significantly as t increases, then, on average, type 1 agents accumulate more (less) wealth than type 2 agents, and one may say that the type 1 trading strategy is more (less) profitable than the type 2 trading strategy. Otherwise, if w_t is not significantly different from w_o, then there is no evidence that, on average, either trading strategy is more profitable than the other.
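The profitability criterion above can be turned into a simple summary of a simulated path of w_t. The sketch below is only one possible reading: it compares the average of w_t − w_o with a tolerance, where the function name `more_profitable_type` and the tolerance `tol` are hypothetical choices, not values from the chapter.

```python
def more_profitable_type(w_path, w0, tol=0.01):
    """Return 1 or 2 for the type that accumulates more wealth on average,
    or None if the average gap w_t - w0 stays within the tolerance."""
    mean_gap = sum(w - w0 for w in w_path) / len(w_path)
    if mean_gap > tol:
        return 1
    if mean_gap < -tol:
        return 2
    return None


# Hypothetical wealth-difference paths, for illustration only.
rising = more_profitable_type([0.0, 0.02, 0.05, 0.06], w0=0.0)
flat = more_profitable_type([0.0, 0.001, -0.001], w0=0.0)
```

A fuller treatment would also test whether the drift in w_t is statistically significant, which this sketch does not attempt.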

20.3.3

Population Distribution and Herd Behavior

The distribution of populations using different types of trading strategies is now characterized by the difference of the population proportions n_t. At time period t, if n_t is positive (negative), then this indicates that there are more (fewer) agents using the type 1 trading strategy than the type 2 trading strategy. Moreover, if n_t is significantly different from zero, then this could be taken as an indication of herd behavior. This is, in particular, frequently observed to be the case when the switching intensity β > 0 is high.


When there is evidence on the profitability of type 1 (type 2) trading strategy and a clear indication of herding behavior using type 1 (type 2) trading strategy over the time period, we say type 1 (type 2) trading strategy dominates the market.
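The role of the switching intensity in herding can be seen directly from the tanh rule (20.18). The following sketch evaluates the population difference for a fixed fitness gap and two values of β. The function name `population_difference` and the numbers are illustrative assumptions.

```python
import math


def population_difference(fitness_gap, beta, cost_gap=0.0):
    """n_{t+1} from (20.18), given Phi_1 - Phi_2 and C_1 - C_2."""
    return math.tanh(0.5 * beta * (fitness_gap - cost_gap))


# Same hypothetical fitness advantage of type 1, two switching intensities.
n_low_beta = population_difference(0.1, beta=0.5)
n_high_beta = population_difference(0.1, beta=5.0)

# Share of agents using type 1, recovered as (1 + n) / 2.
share_type1_high_beta = (1 + n_high_beta) / 2
```

For the same fitness gap, a larger β moves n_{t+1} further from zero, so a given performance advantage attracts a larger share of the population.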

20.3.4 A Quasi-Homogeneous Model

As a special case of the adaptive model with two types of agents, consider the case, termed the quasi-homogeneous model, where both types of agents use exactly the same trading strategy except that they use different memory lengths. The trading strategies for both types of agents can be unified by writing

$$E_{i,t}(\rho_{t+1}) = r + \delta_i + d_i \rho_{i,t}, \qquad \rho_{i,t} = \frac{1}{L_i} \sum_{k=1}^{L_i} \rho_{t-k}, \tag{20.20}$$

for i = 1, 2, where L_i ≥ 1 is an integer and r (> 0), δ_i (> 0) and d_i ∈ R are constants. For the quasi-homogeneous model, it is further assumed that δ_1 = δ_2 = δ and d_1 = d_2 = d, but 1 ≤ L_1 ≤ L_2. In the following discussion, assume that the conditional variances of agents are given by a constant σ². It is convenient to standardize both the risk premium δ and the extrapolation rate d by σ², that is, to replace δ by δ/σ² and d by d/σ². Correspondingly, the optimal demand of type j agents in terms of the wealth proportion invested in the risky asset is given by

$$\pi_{j,t} = \delta_j + d_j \rho_{j,t}.$$

It is also assumed that the dividend yield process has the form

$$\alpha_t = \alpha_o + q\,N(0, 1), \tag{20.21}$$

where N(0, 1) is the standard normal distribution and q is a measure of the volatility of the dividend process.14 Because of the highly nonlinear nature of the adaptive model, theoretical analysis (even of the steady states) seems intractable and thus the model is analyzed numerically. However, the results for the non-adaptive model established in [11] underlie the dynamics of the adaptive model established here. In the presence of heterogeneous agents, the non-adaptive model can have multiple steady states, and the convergence to such steady states follows an optimal selection principle: the return and wealth proportions tend to the steady state which has a relatively high return. More importantly, heterogeneity can generate instability which, under the stochastic noise processes, results in switching of the return among different states, such as steady states, periodic cycles and aperiodic cycles, from time to time. One would expect the adaptive model to display even richer dynamics.

14 The normal distribution has been chosen for convenience; it has the disadvantage that the dividend yield α_t could be negative. However, for the parameters used in the simulations this probability is extremely low, so the distribution is truncated at 0.

20.3.4.1 Existence of Steady-state Returns

If α_t = α_o is a constant, the system (20.16)–(20.20) becomes a deterministic dynamical system. The return time series generated by the adaptive model is the outcome of the interaction of this deterministic dynamical system with external noise processes (here the dividend yield process). A first step to understanding the possible dynamical behavior of the noise-perturbed dynamical system is an understanding of the underlying dynamics of the deterministic system, such as the existence of steady states, their stability and bifurcation. When α_t = α_o is a constant, the quasi-homogeneous model has the same steady state as the homogeneous model (in which L_1 = L_2). The existence of such steady states is studied in [11] and the results may be summarized as follows.

(i) The steady state of the wealth proportions stays at the initial level, while the steady state of the return depends on the extrapolation rate d.
(ii) When agents are fundamentalists (d = 0), there is a unique steady-state return, called the fundamental steady-state return. Moreover, high risk premia δ correspond to high levels of the steady-state return.
(iii) There exist two steady-state returns when agents are contrarians (d < 0). One of the steady-state returns is negative while the other is positive. We call the positive steady-state return the contrarian steady-state return. More importantly, with the same risk premium, when agents act as contrarians, the contrarian steady-state return is pushed below the fundamental steady-state return.
(iv) When agents are momentum traders (d > 0) and they extrapolate weakly, there exist two steady-state returns.
However, when d is close to zero, only one of the steady states is bounded and this steady-state return is called the momentum steady-state return. Furthermore, given the same risk premium, compared to the fundamental equilibrium, a weakly homogeneous momentum trading strategy leads to a higher level of steady-state return.

An aim of the following analysis is to determine to what extent the adaptive model with two types of agents reflects these characteristics.

20.3.4.2 Parameters and Initial Value Selection

Using data for the United States during the period 1926−94, as reported by Ibbotson Associates, the annual risk-free interest rate, r = 3.7%, corresponds to the average rate during that period. The initial history of rates of return on the stock consists of a distribution with a mean of 12.2% and a standard deviation of 20.4%. A mean dividend yield of α_o = 4.7% corresponds to the historical average yield on the S&P 500. The initial share price is p_o = $10.00. The analysis in the following sections selects the annual risk-free rate r, standard deviation σ and the mean dividend yield α_o as indicated above. For the simulations, the time period between each trade is one day and simulations are conducted over 20 years. Parameters and initial values are selected as follows, unless stated otherwise,

$$\delta = 0.6, \quad \beta = 0.5, \quad \gamma = 0.5, \quad C_1 = C_2 = 0 \tag{20.22}$$

and

$$w_o = 0, \quad n_o = 0, \quad \Phi_{1,o} = \Phi_{2,o} = 0.5, \quad p_o = \$10. \tag{20.23}$$

Furthermore, annualized returns for risk-free and risky assets are used in the fitness functions Φ j ,t for j = 1, 2 .
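For concreteness, the noisy dividend yield process (20.21) could be sampled as in the following Python sketch. It is an illustration under assumptions: "truncated at 0" is read here as redrawing any negative value, the annual figures α_o = 4.7% and q = 3% are used directly without the conversion to the daily trading period (which the text does not spell out), and the names `dividend_yield`, `ALPHA_0` and `Q` are hypothetical.

```python
import random

ALPHA_0 = 0.047  # mean dividend yield quoted in the text (annual)
Q = 0.03         # noise level used in the noisy simulations (annualized)


def dividend_yield(rng, alpha0=ALPHA_0, q=Q):
    """Draw alpha_t = alpha0 + q * N(0, 1), redrawing negative values.

    Redrawing is one reading of "truncated at 0"; flooring at zero
    would be another.
    """
    while True:
        alpha = alpha0 + q * rng.gauss(0.0, 1.0)
        if alpha >= 0.0:
            return alpha


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
draws = [dividend_yield(rng) for _ in range(1000)]
```

Setting q = 0 recovers the deterministic case α_t = α_o that underlies the no-noise experiments.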

20.4

Wealth Dynamics of Momentum Trading Strategies

This section considers the quasi-homogeneous model with d_1 = d_2 = d > 0 and 1 ≤ L_1 < L_2, that is, both types of agents follow the same momentum trading strategy except for having different memory lengths. The simulations address the question as to which type of agent dominates the market over time. As discussed in Section 20.2, some empirical studies seem to support the view that momentum strategies are profitable over short time intervals, but not over long time intervals. The following discussion examines different combinations of (L_1, L_2) and analyzes the effect of lag length on the wealth dynamics.15 The results indicate that, in general, the strategy with the shorter memory length dominates the market by accumulating more wealth and attracting more of the trading population. The adaptive model outlined in this paper is thus capable of characterizing some broad features found in empirical studies.

20.4.1 Case: (L1,L2) = (3,5)

We consider first the dynamics of the underlying deterministic system (i.e. q = 0 in (20.21)) and then subsequently consider the impact of the noise dividend process on the dynamics.

20.4.1.1 No-noise Case

For d = 0.5, initial population proportion n_o = 0 and any initial wealth proportion w_o, numerical simulations show that

$$\rho_t \to \rho^* = 15.45\%\ \text{p.a.}, \qquad w_t \to w_o, \qquad n_t \to 0.$$

15 The selection of various combinations of lag lengths is arbitrary. However, more extensive simulations (not reported) indicate some robustness of the results presented in this paper.


By changing various parameters and initial values, the following results on the momentum trading strategies from the quasi-homogeneous model have been obtained.

Risk premium and over-pricing. It is found that, ceteris paribus, for δ = 0.35, ρ_t → ρ* = 10.94%, while for δ = 0.6, ρ* = 15.45%. In general, a higher risk-adjusted premium leads to a higher return, and a higher price as well. In fact, for the given parameters, there exists δ_o ∈ (0.69, 0.7), called a bifurcation value,16 such that the returns converge to fixed values for δ < δ_o and diverge for δ > δ_o, leading to price explosion.

Over-extrapolation and overshooting. Momentum traders form expectations of future prices by extrapolating trends. However, when the prices or returns are over-extrapolated, stocks are over-priced, and as a result, overshooting takes place. Based on the parameters selected, simulations indicate that there exists d_o ∈ (0.573, 0.574) such that, ceteris paribus, returns converge to fixed values for d < d_o and diverge for d > d_o, leading prices to exhibit overshooting.

No herding behavior. For either fixed n_o ≠ 0 and a range of w_o (say, n_o = −0.3 and w_o ∈ (−0.5, 0.3)), or fixed w_o ≠ 0 and a range of n_o (say, w_o = −0.3 and n_o ∈ (−1, 0.3)), simulations show that

$$\rho_t \to \rho^* = 15.45\%\ \text{(annual)}, \qquad w_t \to w_o + \varepsilon, \qquad n_t \to 0$$

with ε ≈ 10^{-6}. Also, the switching intensity parameter β has almost no effect on the results (as long as the return series converge to constants). This implies that, without the noise from the dividend yield process, in terms of profitability, the type 1 trading strategy is slightly better than type 2, but not significantly. In addition, the populations of agents using different strategies become evenly distributed. In other words, neither of the momentum trading strategies dominates the market, even though both wealth and population are not evenly distributed initially. Therefore, there is no herd behavior.
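The dependence of the limiting return on the risk premium can be checked against a back-of-envelope relation. If both types hold the same constant proportion π in the risky asset and n_t is constant, equation (20.17) collapses to ρ* = r + α_o / (1 − π). The sketch below additionally sets π equal to the standardized risk premium δ, that is, it neglects the extrapolation term d ρ*. That simplification is an assumption made here for illustration only, and the function name is hypothetical.

```python
def approximate_steady_state_return(r, alpha0, delta):
    """rho* = r + alpha0 / (1 - delta), neglecting the extrapolation term.

    Derived from (20.17) with pi_1 = pi_2 = delta held constant; this is an
    illustrative approximation, not a result quoted from the chapter.
    """
    if delta >= 1.0:
        raise ValueError("the approximation requires delta < 1")
    return r + alpha0 / (1.0 - delta)


# Annual figures from the parameter selection: r = 3.7%, alpha_o = 4.7%.
rho_baseline = approximate_steady_state_return(0.037, 0.047, 0.6)
rho_low_premium = approximate_steady_state_return(0.037, 0.047, 0.35)
```

For the baseline δ = 0.6 this gives 15.45% p.a., and for δ = 0.35 about 10.93% p.a., close to the simulated limits reported above. The approximation says nothing about stability or the bifurcation values, which come from the full dynamics.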

20.4.1.2 Effect of Noise

We select the annualized standard deviation of the noisy dividend yield process, q = 0.03 = 3%. By adding a noisy dividend process to the adaptive system, the general features of the corresponding deterministic system (without the noise) still hold. However, the noise has a significant impact on the dynamics of the system, such as wealth and population distributions, autocorrelation of returns, volatility of returns and prices etc., as indicated below. In particular, the dynamics of the model are greatly affected by agents’ behavior, which is measured by their extrapolation rate, d, and switching intensity, β. The following discussion focuses on the dynamics of the system for various combinations of these two parameters d and β.

16 As in Brock and Hommes [6], the dynamics of the system through various types of bifurcation can be analyzed and are of interest. However, in this paper, we focus on the dynamics of the stochastic system when the return process of the underlying deterministic system is stable.

Wealth distribution is largely influenced by agents’ extrapolation and strategy switching activity. Simulations show that, in general, a strong extrapolation leads the type 1 trading strategy (with lag 3) to accumulate more wealth than the type 2 trading strategy does. In other words, the type 1 trading strategy (with lag of 3) is more profitable than type 2 (with lag of 5) under the noisy dividend process. Furthermore, as the switching intensity β increases, the profitability of the type 1 trading strategy is improved significantly. This result is unexpected and interesting, and it is optimal in the sense that the overall outcome is independent of the initial wealth and population distributions. When the wealth and population are evenly distributed across the two types of strategies initially (i.e. w_o = 0, n_o = 0), on average, the type 1 strategy accumulates more wealth (about 5% to 6%) than the type 2 strategy over the whole period, as indicated by the time series plot for the average wealth proportion difference (w_t) in Figure 20.1. Also, as the extrapolation rate increases (i.e. as d increases), the type 1 strategy accumulates more wealth than the type 2 strategy (say, about 2% to 3% more for d = 0.5, compared to 5−6% more for d = 0.53). This suggests that, when both types of strategies start with the same level of wealth and have the same number of traders, the type 1 trading strategy is more profitable under the noisy dividend process. This result still holds when the initial wealth is not so evenly distributed. However, on average, when the type 1 strategy starts with more initial wealth than type 2 (say w_o = 0.2), the prices can be pushed immediately to very high levels so that any further trend chasing from the type 1 strategy can cause the price to overshoot, leading to an explosion of price.
For a fixed initial wealth proportion w_o (say w_o = 0) and a range of n_o (say, n_o ∈ (−1, 0.5)), w_t increases in t. But for large n_o (say n_o = 0.6), the prices are pushed to explosion. This indicates that the type 1 strategy accumulates more wealth over the period, even when the population of type 2 agents is high initially. However, an initial over-concentration of type 1 agents can lead to overshooting of price.

Herding behavior is measured by the population proportion difference n_t and the switching intensity parameter β. For β = 0, there is no switching between the two trading strategies. However, when agents are allowed to switch (i.e., β > 0), as indicated by the time series plot for the population (n_t) in Figure 20.1, agents switch between the two strategies frequently. In general, because of the profitability of the type 1 strategy, more agents switch from type 2 to type 1, as indicated by the mean and standard deviation of the population n_t in Table 20.1. Also, as the switching intensity β increases, simulations (not reported here) show that the frequency of such switching increases too. Furthermore, as β increases, both prices and returns become more volatile, as indicated by the time series plots of returns (ρ_t) and prices (p_t) in Figure 20.1.

Excess volatility and volatility clustering. As indicated by the time series plot of returns ρ_t in Figure 20.1, adding the noisy dividend process causes an otherwise stable return series to fluctuate. This fact itself is not unexpected. What is of interest is the contrast between the simply normally distributed dividend process that is the input to the system and the return process that is the output of the system. With the increase of either the standard deviation of the noise process q, or agents’ extrapolation rate d, or switching intensity β, both returns and prices become more volatile. Moreover, volatility clustering is also observed.

Figure 20.1. Time series plots for returns (top left), wealth (top right) and population (bottom left) distributions, and prices (bottom right) when the same momentum trading strategies with different lags (L1,L2) = (3,5) are used. Here, δ = 0.6, d = 0.53 and q = 0.03.

Autocorrelation. Significant positive autocorrelations (ACs) are found for lags 1 and 2, negative ACs for lags 3 to 8, and positive ACs for lags 9−14, as indicated by Table 20.4. However, as lag length increases, the ACs become less significant.

Overshooting. Related simulations (not reported here) indicate that either strong extrapolation (corresponding to high d), or high volatility of the dividend yield process (q), or high switching intensity (β) can cause the price to overshoot and lead to price explosion. Numerical simulations also show that, to avoid price overshooting, a minimum level of risk premium (δ) is required.
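The autocorrelation patterns discussed above are sample statistics of the simulated return series. The following sketch shows one standard estimator. The function name and the test series are assumptions for illustration, and the chapter does not state which estimator underlies Table 20.4.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at the given lag,
    using the overall mean and the full-sample sum of squares."""
    n = len(x)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(x)")
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0:
        raise ValueError("series is constant")
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Hypothetical alternating series: negative AC at odd lags, positive at even.
series = [1.0, -1.0] * 5
ac_lag1 = autocorrelation(series, 1)
ac_lag2 = autocorrelation(series, 2)
```

Applying the same function to squared or absolute returns gives a simple check for volatility clustering, since persistent positive values at short lags indicate that large moves tend to follow large moves.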

20.4.2

Other Lag Length Combinations

This section addresses how the above results are affected by different lag length combinations. For (L_1, L_2) = (3, 7), the general dynamic features are similar to the case of (L_1, L_2) = (3, 5), except for the following differences.

(i) The underlying deterministic system is stable over a wider range of extrapolation rates d ∈ [0, d*); d* ≈ 0.57 for L_2 = 5 and d* ≈ 0.773 for L_2 = 7.
(ii) The trading strategy with the short lag dominates the market. However, compared to the previous case, for the same set of parameters and initial values, both the profitability and herd behavior decrease, as indicated by the time series plots for wealth and population in Figure 20.5 and the corresponding statistical results in Table 20.1.
(iii) ACs are significantly positive for lags 1 and 2, either positive or negative (but not significantly) for lag 3, negative for lags 4 to 9, and positive for lags 10 to 15, as indicated in Table 20.4.

For (L_1, L_2) = (10, 14), compared to the previous two cases, the following differences have been observed.

(i) The upper bound d* for returns of the underlying deterministic system to be stable increases to d* ≈ 1.57.
(ii) By adding the noisy dividend yield process, the trading strategy with memory length 10 accumulates more wealth than the one with lag length 14. However, in contrast to the previous cases, for the same set of parameters and initial values, the profitability and herd behavior of the strategy with lag 10 compared to the one with lag 14 are much less significant, as indicated by the corresponding statistical results in Table 20.2, although an increase in extrapolation improves the profitability of the strategy with lag 10, as indicated by the corresponding statistics in Table 20.2 for d = 1.2.
(iii) The ACs oscillate and become less significant when agents extrapolate weakly (say, for d = 0.5), but become more significant when agents extrapolate strongly (say, d = 1.1), as indicated by Table 20.4.
(iv) There is less herding behavior than in the previous cases. This is partially because of the less significant profitability of the strategy with lag 10 over the other with lag 14.

For (L_1, L_2) = (10, 26), the upper bound d* for returns of the underlying deterministic system to be stable increases to d* ∈ (2.2, 2.3). By adding the noisy dividend yield process, with the same parameters and initial values, the profitability of the type 1 trading strategy (with lag length 10) becomes questionable, as indicated by the time series plots in Figure 20.2 and the corresponding statistics in Table 20.3. As demonstrated by Table 20.4, the ACs show fewer patterns and do not die out as lags increase.

Figure 20.2. Time series plots for wealth and population distributions, returns and prices when the same momentum trading strategies with different lags (L1,L2) = (10,26) are used. Here, δ = 0.6, d = 1.2, q = 0.03 and w_o = 0.6.

In summary, we obtain the following results when both types of agents follow the same momentum trading strategy, but with different memory lengths.

(i) Without the noisy dividend yield process, an increase in lag length for either one of the trading strategies stabilizes the return series of the underlying deterministic system, and enlarges the range of the extrapolation coefficient for which the market does not explode. However, for the same set of parameters, the profitability of the trading strategies and herd behavior become less significant.
(ii) Adding the noisy dividend process in general improves the profitability of the trading strategies with short lag length L_1 (say, (L_1, L_2) = (3, 5) and (3, 7)). However, such profitability becomes less significant when the short lag length L_1 increases, and may even disappear (say, (L_1, L_2) = (10, 14) and (10, 26)).
(iii) When trading strategies become profitable, agents tend to adopt herding behavior: more agents switch to the more profitable strategy over the time period. However, over-concentration (in terms of the initial average wealth or population proportion), or over-extrapolation (in terms of high extrapolation rates and improper risk premium levels) can cause overshooting of price and push prices to explosion, leading to a market crash.
(iv) Momentum trading strategies can push the prices to a very high level and lead the returns to be more volatile, exhibiting volatility clustering.
(v) ACs follow certain patterns when one of the trading strategies becomes profitable and die out as lags increase. However, such patterns become less significant when the profitability of the trading strategy becomes less significant, and may even disappear.
(vi) Price levels are determined more by the risk premium levels than by other parameters (such as the extrapolation rate and switching intensity).

20.5 Wealth Dynamics of Contrarian Trading Strategies

This section considers the quasi-homogeneous model with d_1 = d_2 = d < 0 and 1 ≤ L_1 < L_2, that is, both types of agents follow the same contrarian trading strategy except for having different memory lengths. As discussed in Section 20.2, some empirical studies suggest that contrarian trading strategies are more profitable over long periods. The results in this section provide some consistency with this view and show that the adaptive model presented in this paper is capable of characterizing some features found in empirical studies. Furthermore, wealth and population distributions, statistical properties of returns (such as volatility clustering and autocorrelations), and herd behavior are discussed.

20.5.1

Case: (L1,L2) = (3,5)

With the selection of the parameters and initial values in (20.22)−(20.23), we consider first the dynamics of the underlying deterministic system and then the impact of the noisy dividend yield process on the dynamics.

20.5.1.1 No-noise Case

Let q = 0. For d = −0.4, initial difference of population proportions n_o = 0 and any initial wealth proportion w_o, we obtain

$$\rho_t \to \rho^* = 15.45\%\ \text{(p.a.)}, \qquad w_t \to w_o, \qquad n_t \to 0.$$

By changing parameters and initial values, the following results are obtained.

(i) It is found that, ceteris paribus, for the standardized risk premium δ (= δ/σ²) = 0.4, ρ_t → ρ* = 11.52%, while for δ = 0.6, ρ* = 15.45%. In general, a high level of risk-adjusted premium leads to a high return and a correspondingly high price. In fact, for the given parameters, there exists a bifurcation value δ_o ∈ (0.6, 0.7) such that the returns converge to fixed values for δ < δ_o and diverge for δ > δ_o, leading prices to explode.
(ii) Based on the parameters selected, there exists d_o ∈ (−0.53, −0.52) such that, ceteris paribus, returns converge to fixed values for 0 > d > d_o and diverge for d < d_o, leading prices to overshoot. As with the momentum trading strategies, over-extrapolation from contrarian trading strategies also causes overshooting of prices.
(iii) Unlike the case of the momentum trading strategies, wealth distributions of the deterministic system are affected differently by the extrapolation rate d, switching intensity, and initial wealth and population distributions. In general, as d (< 0) decreases and is near the bifurcation value, the profitability of the trading strategy with the long lag (L_2 = 5) is improved significantly, say from 5% for d = −0.454, to 25% for d = −0.48, and to 50% for d = −0.5. However, for fixed d < 0, say d = −0.5, as β increases, the profitability of trading strategy 2 becomes less significant, say from 45% for β = 0.1 to 20% for β = 2. This is different from the case of momentum trading strategies. For fixed n_o ≠ 0 and a range of w_o (say, n_o = 0.3 and w_o ∈ (−0.5, 0.5)), ρ_t → ρ* = 15.45% p.a. and w_t → w_o − ε, n_t → 0 with ε ≈ 10^{-6}. This implies that agents’ wealth is distributed according to their initial wealth distribution, although populations are not evenly distributed initially. For fixed w_o < 0 (say, w_o = −0.3) and n_o ∈ [−1, 1], the type 2 strategy accumulates more wealth than the type 1 strategy over a very short period, but the difference is not significant (about 1%). In other words, when the initial average wealth for the type 2 strategy is more than the average wealth for the type 1 strategy, neither of the contrarian trading strategies can make a significant profit over the other, no matter how the initial populations are distributed. For fixed w_o > 0 (say, w_o = 0.3) and n_o ∈ [−1, 1], the type 2 strategy accumulates more wealth than the type 1 strategy over a very short period, and the difference becomes more significant (up to 37%) as more agents use the type 2 trading strategy initially. This implies that, when the initial average wealth of the type 1 strategy is higher than that of the type 2 strategy, contrarian strategies with long memory length (L_2 = 5) are able to accumulate more wealth over a very short period than the same strategy with short memory length (L_1 = 3). In addition, the profitability becomes more significant when there are more agents using the strategy with long memory length initially. This is different from the case when agents use momentum strategies.
(iv) The dynamics display no significant differences for different values of the switching intensity parameter β when the return process for the underlying deterministic system is stable. However, as d nears the bifurcation value, herd behavior is also observed.

20.5.1.2 Effect of Noise

With the noisy dividend yield process set at q = 3%, we obtain the following results.

Effect of the initial wealth distribution. When the wealth and population are evenly distributed among the two types of trading strategies initially (i.e. w_o = 0, n_o = 0), the type 2 strategy accumulates more average wealth (about 5% to 7%) than the type 1 strategy over the whole period, as indicated by the time series plots of wealth and population in Figure 20.3. Also, as extrapolation increases (i.e. as d decreases), such extrapolation helps the type 2 strategy to accumulate more wealth than the type 1 strategy (say, about 5% for d = −0.45, and about 45% for d = −0.5). This suggests that, when both types of strategies start with the same level of wealth and have an equal number of traders, the strategy with long lag length (L_2 = 5) accumulates more wealth than the one with short lag length (L_1 = 3). In other words, the type 2 strategy benefits significantly from the noisy dividend yield process. This result still holds when the initial wealth is not so evenly distributed (say, w_o ∈ (−0.6, 0.3) for d = −0.5).

Effect of the initial population distribution. Similar to the case without noise, the wealth distribution is affected differently as a function of different initial wealth levels. For fixed w_o < 0 and a range of n_o (say, w_o = −0.3 and n_o ∈ (−0.8, 0.65)), the profitability of trading strategy 2 does not change much for different n_o. However, for fixed w_o > 0 and a range of n_o (say, w_o = 0.3 and n_o ∈ (−0.9, 0.9)), the profitability of trading strategy 2 increases significantly as more and more agents use trading strategy 2. Price overshooting is possible when the populations are over-concentrated in the use of one of the trading strategies.

Herding behavior is also observed for changing values of the parameter β. Given the profitability of the trading strategy with the long lag length, more agents tend to switch to this more profitable strategy, as indicated by the time series plot of population in Figure 20.3 and the corresponding statistics in Table 20.5. Furthermore, as β increases, both prices and returns become more volatile, leading to excess volatility.

Figure 20.3. Time series plots for wealth and population distributions, returns and prices when the same contrarian trading strategies with different lags (L1,L2) = (3,5) are used. Here, δ = 0.6, d = −0.45, and q = 0.03.

Excess volatility and volatility clustering. The addition of a noisy dividend yield process causes an otherwise stable return series to fluctuate. Similar to the case of momentum trading strategies, an increase of either the standard deviation of the dividend yield noise process q or agents’ extrapolation d leads both returns and prices to be more volatile. Volatility clustering is observed, as illustrated by the time series plot of the returns in Figure 20.3 and the corresponding statistics in Table 20.5. The ACs are significantly negative for odd lags and positive for even lags across all lags, as indicated in Table 20.6. Similar to the momentum trading strategies discussed in Section 20.4, the noisy dividend yield process has a significant impact on prices. An increase of q can push prices to significantly high levels. This can also result from either strong extrapolation (corresponding to low d), or high risk premia δ, or high switching intensity β, and causes prices to explode.

20.5.2 Other Cases

For (L1, L2) = (3, 7), the general dynamic features are similar to the above case when (L1, L2) = (3, 5), except for the following differences.


Carl Chiarella, Xue-Zhong He

(i) The underlying deterministic system is stable over a wider range of extrapolation rates d ∈ (d*, 0], with d* ∈ (−0.65, −0.6) for L2 = 7 in contrast with d* ∈ (−0.53, −0.52) for L2 = 5.

(ii) Similar to the previous case, the trading strategy with the longer lag L2 = 7 dominates the market, in particular when d is near the bifurcation value. However, for the same set of parameters, the profitability is slightly lower than in the case of L2 = 5. On the other hand, agents can extrapolate over a wide range (of the parameter d). Similar impacts of the initial wealth and population distributions, the switching intensity, and the standard deviation of the noisy dividend process on the dynamics can be observed over a wider range of the parameters.

For (L1, L2) = (10, 14), the lower bound d* for the dynamic process for returns of the underlying deterministic system to be stable decreases to d* ∈ (−1.6, −1.5). By adding the noisy dividend yield process, with the same parameters and initial values, the profitability of the type 2 trading strategy (with lag length 14) becomes questionable, as indicated by the corresponding statistics in Table 20.5. The patterns of the ACs are maintained, but they become less significant (for the same parameter d = −0.45), as shown in Table 20.6. Compared with the previous cases, there is less herding behavior, as indicated by the corresponding statistics in Table 20.5. This is partially due to the less significant (or even absent) profitability of the strategy with lag 14 over the one with lag 10.

Figure 20.4. Time series plots for wealth and population distributions, returns and prices when the same contrarian trading strategies with different lags (L1,L2) = (10,26) are used. Here, δ = 0.5, d = −0.4, and q = 0.01 .


For (L1, L2) = (10, 26), the lower bound d* (on d) such that the dynamic process for returns of the underlying deterministic system is stable decreases to d* ∈ (−2.2, −2.1). By adding the noisy dividend yield process, the trading strategy with lag length 26 accumulates more wealth than the one with lag length 10, as shown in Figure 20.4 and Table 20.5. However, compared to the previous cases (L1, L2) = (3, 5) and (3, 7), the profitability of the strategy with lag 26 over the one with lag 10 becomes much less significant (about 0.01% to 0.04% more for d = −0.45), although a strong extrapolation rate can improve the profitability of the strategy with lag 26 (about 5% to 7% for d = −2.0). The ACs become less significant when agents extrapolate weakly (say, d = −0.45), as indicated in Table 20.6, and more significant when agents extrapolate strongly (say, d = −2.0).

In summary, we obtain the following results when both types of agents follow the same contrarian trading strategy, but with different lag lengths.

(i) Without the noisy dividend process, for a given set of parameters, an increase in the lag length of the trading strategies stabilizes the return series of the underlying deterministic system. When both d and β are near their bifurcation values, profitability of trading strategies and herd behavior are, in general, observed.

(ii) Adding a noisy dividend yield process, in general, improves the profitability of the trading strategies with long lag lengths (say, L2 = 5, 7, 26). However, such profitability becomes less significant when the relative difference between the two lag lengths is small (say, L1 = 10, L2 = 14).

(iii) Similar to the case of momentum trading strategies, herding behavior is observed when one of the trading strategies becomes (significantly) profitable. Also, over-concentration (in terms of the initial average wealth and population proportion), or over-extrapolation (in terms of low extrapolation rates and improper risk premium levels) can cause overshooting of the price and lead to a price explosion and a market crash. Price levels are determined more by the risk premium levels than by the other parameters (such as the extrapolation rate and switching intensity).

(iv) The ACs are significantly negative for odd lags and positive for even lags when the lag lengths of the contrarian trading strategies are small. However, they become less significant when the lag lengths increase.

20.6 Conclusion

This chapter has taken the basic two-date portfolio optimization model that is the basis of asset pricing theories (such as CAPM) and added investor heterogeneity to it by allowing agents to form different views on the expected value of the return distribution of the risky asset. The outcome is an adaptive model of asset price and wealth dynamics with agents using various trading strategies. As a special case, a quasi-homogeneous model of two types of agents using either momentum or contrarian trading strategies is introduced to analyze the profitability of the trading strategies with different time horizons. It is found that agents with different time horizons coexist. Our results shed light on the empirical finding that momentum


trading strategies are more profitable over short time intervals, while contrarian trading strategies are more profitable over long time intervals. It should be pointed out that this is an unexpected result given the traditional foundations of the adaptive model. Even though the quasi-homogeneous model is one of the simplest cases of the adaptive model, it generates various phenomena observed in financial markets, including rational adaptiveness of agents, overconfidence and under-reaction, overreaction and price overshooting, herding behavior, excess volatility, and volatility clustering. The model also displays the essential characteristics of the standard asset price dynamics model assumed in continuous time finance in that the asset price is fluctuating around a geometrically growing trend. Our analysis in this paper is based on a simplified quasi-homogeneous model. Further analysis and extensions can be found in [14] and [10]. A more extensive analysis of the adaptive model is necessary in order to explore the potential explanatory power of the model. One of the extensions is to consider models of two or three different types of trading strategies, to analyze the profitability of different trading strategies, and to examine the stylized facts of the return distribution. Secondly, the attitudes of agents towards the extrapolation and risk premium change when the market environment changes and this change should be made endogenous. Thirdly, there should be a more extensive simulation study of these richer models once they are developed. In fact a proper Monte Carlo analysis is required to determine whether the models can generate with a high frequency the statistical characteristics of major indices such as the S&P500. These extensions are interesting problems which are left to future research work.

20.7 Appendix

20.7.1 Proof of Proposition 1

Proof. At time period $t-1$, assume that there are $\ell_{j,t-1}$ agents belonging to group $j$. Then, within the group, their optimum demands in terms of the wealth proportion to be invested in the risky asset are the same, denoted by $\bar\pi_{j,t-1}$. It follows from (20.2) and (20.7) that their average wealth proportion, which corresponds to the wealth proportion of the $j$-th HRA, at the next time period $t$ is given by

$$\bar w_{j,t} = \frac{\bar W_{j,t}}{\bar W_t} = \frac{\bar W_{j,t-1}\,[R + (\rho_t - r)\bar\pi_{j,t-1}]}{\bar W_t} = \frac{\bar w_{j,t-1}\,[R + (\rho_t - r)\bar\pi_{j,t-1}]}{\bar W_t/\bar W_{t-1}}. \tag{20.24}$$

Note that

$$\frac{\bar W_t}{\bar W_{t-1}} = \frac{\sum_{j=1}^{h} \bar W_{j,t}}{\bar W_{t-1}} = \sum_{k=1}^{h} \bar w_{k,t-1}\,[R + (\rho_t - r)\bar\pi_{k,t-1}]. \tag{20.25}$$
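The update described by (20.24) and (20.25) can be sketched numerically as follows. This is only an illustrative sketch: the function name `update_wealth_proportions` and all numerical values (two groups, R = 1.05, r = 0.05, a realized return of 8%) are assumptions made for the example and do not come from the chapter.

```python
def update_wealth_proportions(w_prev, pi_prev, rho_t, r, R):
    """Wealth proportions at time t from those at t-1, following (20.24)-(20.25).

    w_prev  : group wealth proportions at t-1 (should sum to one)
    pi_prev : proportions of wealth each group invested in the risky asset at t-1
    rho_t   : realized return of the risky asset over (t-1, t]
    r, R    : risk-free rate and gross risk-free return
    """
    # Numerator of (20.24): each group's wealth proportion times its gross growth
    growth = [w * (R + (rho_t - r) * p) for w, p in zip(w_prev, pi_prev)]
    # Denominator of (20.24), which by (20.25) is the sum of the numerators
    total = sum(growth)
    return [g / total for g in growth]


# Hypothetical inputs: two groups with equal initial wealth proportions
w_next = update_wealth_proportions([0.5, 0.5], [0.3, 0.6], rho_t=0.08, r=0.05, R=1.05)
```

With these assumed inputs the risky asset outperforms the risk-free asset, so the group holding the larger risky proportion ends the period with the larger wealth share, and the proportions still sum to one.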


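Then both (20.24) and (20.25) imply that the time $t$ average wealth proportion for the $j$-th group, formed at $t-1$, is given by (20.9). With the notation introduced in section 20.2, the market clearing equilibrium price equation (20.6) can be rewritten as

$$\sum_{j=1}^{h} n_{j,t}\,\bar\pi_{j,t}\,\bar W_{j,t} = N p_t / H. \tag{20.26}$$

Note that

$$W_t = \sum_{j=1}^{H} W_{j,t} = \sum_{j=1}^{h} \ell_{j,t}\,\bar W_{j,t} = H \sum_{j=1}^{h} n_{j,t}\,\bar W_{j,t}. \tag{20.27}$$

It follows from (20.26) and (20.27) that the market clearing price equilibrium equation (20.26) becomes

$$W_t \sum_{j=1}^{h} n_{j,t}\,\bar\pi_{j,t}\,\bar w_{j,t} = N p_t \sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t}. \tag{20.28}$$

From (20.28),

$$\frac{W_t}{W_{t-1}}\;\frac{\sum_{j=1}^{h} n_{j,t}\,\bar\pi_{j,t}\,\bar w_{j,t}}{\sum_{j=1}^{h} n_{j,t-1}\,\bar\pi_{j,t-1}\,\bar w_{j,t-1}} = (1 + \rho_t - \alpha_t)\,\frac{\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t}}{\sum_{j=1}^{h} n_{j,t-1}\,\bar w_{j,t-1}}. \tag{20.29}$$

Note that

$$\frac{W_t}{W_{t-1}} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar W_{j,t}}{\sum_{j=1}^{h} n_{j,t-1}\,\bar W_{j,t-1}} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t-1}\,[R + (\rho_t - r)\bar\pi_{j,t-1}]}{\sum_{j=1}^{h} n_{j,t-1}\,\bar w_{j,t-1}}. \tag{20.30}$$

Substituting (20.30) into (20.29),

$$\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t-1}\,[R + (\rho_t - r)\bar\pi_{j,t-1}] = (1 + \rho_t - \alpha_t)\,\frac{\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t}}{\sum_{j=1}^{h} n_{j,t}\,\bar\pi_{j,t}\,\bar w_{j,t}}\,\sum_{j=1}^{h} n_{j,t-1}\,\bar\pi_{j,t-1}\,\bar w_{j,t-1}. \tag{20.31}$$

Also, using (20.9),

$$\sum_{j=1}^{h} n_{j,t}\,\bar\pi_{j,t}\,\bar w_{j,t} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar\pi_{j,t}\,\bar w_{j,t-1}\,[R + (\rho_t - r)\bar\pi_{j,t-1}]}{\sum_{k=1}^{h} \bar w_{k,t-1}\,[R + (\rho_t - r)\bar\pi_{k,t-1}]}, \tag{20.32}$$

$$\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t-1}\,[R + (\rho_t - r)\bar\pi_{j,t-1}]}{\sum_{k=1}^{h} \bar w_{k,t-1}\,[R + (\rho_t - r)\bar\pi_{k,t-1}]}. \tag{20.33}$$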


Substitution of (20.32) and (20.33) into (20.31) and simplification of the corresponding expression leads to the equation

$$\sum_{j=1}^{h} n_{j,t}\,\bar w_{j,t-1}\,\bar\pi_{j,t}\,[R + (\rho_t - r)\bar\pi_{j,t-1}] = [(\rho_t - r) + (1 + r - \alpha_t)]\left(\sum_{j=1}^{h} n_{j,t-1}\,\bar\pi_{j,t-1}\,\bar w_{j,t-1}\right). \tag{20.34}$$

Solving for $\rho_t$ from (20.34), one obtains (20.10) for the return $\rho_t$.
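Since (20.34) is linear in the return, the last step of the proof can be checked numerically. The sketch below rearranges (20.34) directly; it is not a transcription of (20.10), which is not reproduced in this appendix. The function name `solve_return` and all input values are hypothetical.

```python
def solve_return(n_t, n_prev, w_prev, pi_t, pi_prev, alpha_t, r, R):
    """Solve the linear equation (20.34) for the return rho_t.

    Writing (20.34) as S0 + rho * S1 = (1 - alpha_t + rho) * A gives
    rho = ((1 - alpha_t) * A - S0) / (S1 - A), provided S1 != A.
    """
    # A: right-hand sum over n_{j,t-1} * pi_{j,t-1} * w_{j,t-1}
    A = sum(n * p * w for n, p, w in zip(n_prev, pi_prev, w_prev))
    # S0: part of the left-hand side that does not multiply rho
    S0 = sum(n * w * q * (R - r * p)
             for n, w, q, p in zip(n_t, w_prev, pi_t, pi_prev))
    # S1: coefficient of rho on the left-hand side
    S1 = sum(n * w * q * p for n, w, q, p in zip(n_t, w_prev, pi_t, pi_prev))
    return ((1.0 - alpha_t) * A - S0) / (S1 - A)


def residual(rho, n_t, n_prev, w_prev, pi_t, pi_prev, alpha_t, r, R):
    """Left-hand side minus right-hand side of (20.34) for a given rho."""
    lhs = sum(n * w * q * (R + (rho - r) * p)
              for n, w, q, p in zip(n_t, w_prev, pi_t, pi_prev))
    rhs = ((rho - r) + (1.0 + r - alpha_t)) * sum(
        n * p * w for n, p, w in zip(n_prev, pi_prev, w_prev))
    return lhs - rhs


# Hypothetical two-group example
args = dict(n_t=[0.5, 0.5], n_prev=[0.5, 0.5], w_prev=[0.5, 0.5],
            pi_t=[0.4, 0.5], pi_prev=[0.45, 0.5], alpha_t=0.02, r=0.05, R=1.05)
rho = solve_return(**args)
check = residual(rho, **args)
```

The residual should vanish up to rounding error, which confirms only that the rearrangement is consistent with (20.34) for these assumed inputs.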

20.7.2 Time Series Plots, Statistics and Autocorrelation Results

For both momentum and contrarian trading strategies with different combinations of lag lengths ( L1 , L2 ) , this appendix provides (i) time series plots for wealth ( w t , the difference of wealth proportions), population ( nt , the difference of population proportions), returns ( ρt ), and prices ( pt ); (ii) numerical comparative statics for wealth (WEA), population (POP), and returns (RET); and (iii) the autocorrelation coefficients (AC) for return series with lags from 1 to 36.
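For readers who want to reproduce statistics of type (iii), a sample autocorrelation coefficient can be computed as sketched below. The estimator shown is the common one that normalizes by the full-sample variance, which is an assumption here since the chapter does not state which estimator was used. The alternating test series is purely illustrative and is not data from the model.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[t] - mean) * (series[t + lag] - mean)
                     for t in range(n - lag))
    return covariance / variance


# Illustrative return series that alternates in sign every period
returns = [(-1) ** t * 0.01 for t in range(200)]
acs = [autocorrelation(returns, lag) for lag in range(1, 37)]
```

For this artificial series the coefficients are negative at odd lags and positive at even lags, the same sign pattern that the contrarian results report for short lag lengths.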

Figure 20.5. Time series plots for wealth and population distributions, returns and prices when the same momentum trading strategies with different lags (L1,L2) = (3,7) are used. Here, δ = 0.6, d = 0.53, and q = 0.03


Table 20.1. Statistics of time series of wealth (WEA), population (POP) and returns (RET) for momentum trading strategies with (L1,L2) = (3,5), (3,7) and δ = 0.6, d = 0.53 and q = 0.03

Table 20.2. Statistics of time series of wealth, population and returns for momentum trading strategies with (L1,L2) = (10,14) and δ = 0.6, q = 0.03, d = 0.53 for (a), and d = 1.2 for (b)

Table 20.3. Statistics of time series of wealth, population and returns for momentum trading strategies with (L1,L2) = (3,7) and q = 0.03 , and δ = 0.6, d = 0.53 for (a), and δ = 0.6, d = 1.2 for (b)


Table 20.4. Autocorrelation coefficients (AC) of returns for momentum trading strategies with (L1,L2) = (3,5), (3,7), (10,14) and (10,26). The parameters are δ = 0.6, d = 0.53 and q = 0.03 for (3,5), (3,7), (10,14) and (10,26); δ = 0.6, d = 1.2, q = 0.03 for (10,14) (a); and δ = 0.6, d = 1.2, q = 0.03 for (10,26) (a)

Table 20.5. Statistics of time series of wealth, population and returns for contrarian trading strategies with parameters δ = 0.6, d = −0.45 and q = 0.03 and lag length combinations (L1,L2) are (3,5) for (a), (3,7) for (b), (10,14) for (c) and (10,26) for (d)


Table 20.6. Autocorrelation coefficients (AC) of returns for contrarian trading strategies with parameters δ = 0.6, d = −0.45, β = 0.5 and q = 0.03 and lag length combinations (L1,L2) = (3,5), (3,7), (10,14) and (10,26)

References

1. Anderson S, de Palma A, Thisse J (1993) Discrete Choice Theory of Product Differentiation, MIT Press, Cambridge, MA
2. Arshanapali B, Coggin D, Doukas J (1998) ‘Multifactor asset pricing analysis of international value investment strategies’, Journal of Portfolio Management (4), 10–23
3. Asness C (1997) ‘The interaction of value and momentum strategies’, Financial Analysts Journal, 29–36
4. Barberis N, Shleifer A, Vishny R (1998) ‘A model of investor sentiment’, Journal of Financial Economics, 307–343
5. Brock W, Hommes C (1997) ‘A rational route to randomness’, Econometrica, 1059–1095
6. Brock W, Hommes C (1998) ‘Heterogeneous beliefs and routes to chaos in a simple asset pricing model’, Journal of Economic Dynamics and Control, 1235–1274
7. Bullard J, Duffy J (1999) ‘Using Genetic Algorithms to Model the Evolution of Heterogeneous Beliefs’, Computational Economics, 41–60
8. Capaul C, Rowley I, Sharpe W (1993) ‘International value and growth stock returns’, Financial Analysts Journal (July/Aug.), 27–36
9. Chiarella C (1992) ‘The dynamics of speculative behaviour’, Annals of Operations Research, 101–123
10. Chiarella C, Dieci R, Gardini L (2006) ‘Asset price and wealth dynamics in a financial market with heterogeneous agents’, Journal of Economic Dynamics and Control, 1755–1786
11. Chiarella C, He X (2001) ‘Asset pricing and wealth dynamics under heterogeneous expectations’, Quantitative Finance, 509–526
12. Chiarella C, He X (2002) ‘Heterogeneous beliefs, risk and learning in a simple asset pricing model’, Computational Economics, 95–132
13. Chiarella C, He X (2003) ‘Heterogeneous beliefs, risk and learning in a simple asset pricing model with a market maker’, Macroeconomic Dynamics (4), 503–536
14. Chiarella C, He X (2005) An asset pricing model with adaptive heterogeneous agents and wealth effects, pp. 269–285 in Nonlinear Dynamics and Heterogeneous Interacting Agents, Vol. 550 of Lecture Notes in Economics and Mathematical Systems, Springer
15. Daniel K, Hirshleifer D, Subrahmanyam A (1998) ‘A theory of overconfidence, self-attribution, and security market under- and over-reactions’, Journal of Finance, 1839–1885
16. Day R, Huang W (1990) ‘Bulls, bears and market sheep’, Journal of Economic Behavior and Organization, 299–329
17. Fama E, French K (1998) ‘Value versus growth: The international evidence’, Journal of Finance, 1975–1999
18. Farmer J (1999) ‘Physicists attempt to scale the ivory towers of finance’, Computing in Science and Engineering, 26–39
19. Farmer J, Lo A (1999) ‘Frontier of finance: Evolution and efficient markets’, Proceedings of the National Academy of Sciences, 9991–9992
20. Franke R, Nesemann T (1999) ‘Two destabilizing strategies may be jointly stabilizing’, Journal of Economics, 1–18
21. Frankel F, Froot K (1987) ‘Using survey data to test propositions regarding exchange rate expectations’, American Economic Review, 133–153
22. Hirshleifer D (2001) ‘Investor psychology and asset pricing’, Journal of Finance, 1533–1597
23. Hommes C (2001) ‘Financial markets as nonlinear adaptive evolutionary systems’, Quantitative Finance, 149–167
24. Hong H, Stein J (1999) ‘A unified theory of underreaction, momentum trading, and overreaction in asset markets’, Journal of Finance, 2143–2184
25. Jegadeesh N, Titman S (1993) ‘Returns to buying winners and selling losers: Implications for stock market efficiency’, Journal of Finance, 65–91
26. Jegadeesh N, Titman S (2001) ‘Profitability of momentum strategies: an evaluation of alternative explanations’, Journal of Finance, 699–720
27. Kirman A (1992) ‘Whom or what does the representative agent represent?’, Journal of Economic Perspectives, 117–136
28. LeBaron B (2000) ‘Agent based computational finance: suggested readings and early research’, Journal of Economic Dynamics and Control, 679–702
29. Lee C, Swaminathan B (2000) ‘Price momentum and trading volume’, Journal of Finance, 2017–2069
30. Levis M, Liodakis M (2001) ‘Contrarian strategies and investor expectations: the U.K. evidence’, Financial Analysts Journal (Sep./Oct.), 43–56
31. Levy M, Levy H (1996) ‘The danger of assuming homogeneous expectations’, Financial Analysts Journal (3), 65–70
32. Levy M, Levy H, Solomon S (1994) ‘A microscopic model of the stock market’, Economics Letters, 103–111
33. Levy M, Levy H, Solomon S (2000) Microscopic Simulation of Financial Markets, from investor behavior to market phenomena, Academic Press, Sydney
34. Lintner J (1965) ‘The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets’, Review of Economics and Statistics, 13–37
35. Lux T (1995) ‘Herd behaviour, bubbles and crashes’, Economic Journal, 881–896
36. Lux T, Marchesi M (1999) ‘Scaling and criticality in a stochastic multi-agent model of a financial market’, Nature (11), 498–500
37. Manski C, McFadden D (1981) Structural Analysis of Discrete Data with Econometric Applications, MIT Press
38. Merton R (1973) ‘An intertemporal capital asset pricing model’, Econometrica, 867–887
39. Moskowitz T, Grinblatt M (1999) ‘Do industries explain momentum?’, Journal of Finance, 1249–1290
40. Ross S (1976) ‘The arbitrage theory of capital asset pricing’, Journal of Economic Theory, 341–360
41. Rouwenhorst K G (1998) ‘International momentum strategies’, Journal of Finance, 267–284
42. Sharpe W (1964) ‘Capital asset prices: A theory of market equilibrium under conditions of risk’, Journal of Finance, 425–442

CHAPTER 21 Simulation Methods for Stochastic Differential Equations Eckhard Platen

21.1 Stochastic Differential Equations

As one tries to build more realistic models in finance, stochastic effects need to be taken into account. In finance, the randomness in the dynamics is in fact the essential phenomenon to be modeled. More than a hundred years ago, Bachelier used in (Bachelier 1900) what we now call Brownian motion or the Wiener process to model stock prices observed at the Paris Bourse. Einstein, in his work on Brownian motion, employed in (Einstein 1906) an equivalent construct. Wiener then developed the mathematical theory of Brownian motion in (Wiener 1923). Itô laid in (Itô 1944) the foundation of stochastic calculus, which represents a stochastic generalization of the classical differential calculus. It allows one to model in continuous time the dynamics of stock prices or other financial quantities. The corresponding stochastic differential equations (SDEs) generalize ordinary deterministic differential equations. A most striking example is given by the modern derivative pricing theory. The Nobel prize winning works in (Merton 1973) and (Black and Scholes 1973) initiated an entire derivatives industry. In our competitive world it is vital to improve continuously the understanding of the underlying stochastic dynamics of financial markets, and to calculate efficiently relevant financial quantities such as derivative prices and risk or performance measures. This will be a major driving force in the development of appropriate numerical methods for the solution of SDEs in finance.

This chapter provides a very basic introduction, as well as a brief overview of the area of simulation methods for SDEs with a focus on finance. An attempt has been made to highlight key approaches and main ideas. Let us consider a stochastic differential equation (SDE) of the form

$$dX_t = a(X_t)\,dt + b(X_t)\,dW_t \tag{21.1}$$


for $t \in [0, T]$, with initial value $X_0 \in \mathbb{R}$. The stochastic process $X = \{X_t,\ 0 \le t \le T\}$ is assumed to be a unique solution of the SDE (21.1), which consists of a slowly varying component governed by the drift coefficient $a(\cdot)$ and a rapidly fluctuating random component characterized by the diffusion coefficient $b(\cdot)$. The second differential in (21.1) is an Itô stochastic differential with respect to the Wiener process $W = \{W_t,\ 0 \le t \le T\}$. Over a short period of time length $\Delta > 0$ the increment of the solution of the SDE (21.1) is approximately of the form

$$X_{t+\Delta} - X_t \approx a(X_t)\,\Delta + b(X_t)\,(W_{t+\Delta} - W_t).$$

Here the increment Wt + Δ − Wt of the Wiener process W is Gaussian distributed with zero mean and variance Δ . Note that the Itô integral is the limit of Riemann sums where the integrand is always taken at the left hand point of the time discretization interval. For an introduction to SDEs we refer to (Øksendal 1998) or (Kloeden and Platen 1999). Itô SDEs have natural applications to finance. In particular, the gains process of a portfolio resembles an Itô integral because the investor has to fix at the beginning of an investment period the holdings in a security. Then the self-financing portfolio value changes according to the changes in the underlying security. The path of a Wiener process is not differentiable. Consequently, the stochastic calculus differs in its properties from classical calculus. This is most obvious in the stochastic chain rule, the Itô formula, which for a twice continuously differentiable function f has the form

$$df(X_t) = \left( f'(X_t)\,a(X_t) + \tfrac{1}{2} f''(X_t)\,b^2(X_t) \right) dt + f'(X_t)\,b(X_t)\,dW_t \tag{21.2}$$

for $0 \le t \le T$. We remark that the extra term $\tfrac{1}{2} f'' b^2$ in the drift function of the resulting SDE (21.2) is characteristic of the stochastic calculus. This has a substantial impact on the dynamics of financial models and derivative prices, as well as on numerical methods for SDEs.
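As a worked illustration of the Itô formula (21.2), consider the special case of geometric Brownian motion. The coefficients $a(x) = \mu x$ and $b(x) = \sigma x$, with constants $\mu$ and $\sigma > 0$ and with $X_0 > 0$, are assumptions made here for the example and are not taken from the text above. Choosing $f(x) = \ln x$ shows how the extra second order term changes the drift:

```latex
\begin{aligned}
f'(x) &= \frac{1}{x}, \qquad f''(x) = -\frac{1}{x^{2}}, \\
d \ln X_t &= \left( \frac{1}{X_t}\,\mu X_t - \frac{1}{2}\,\frac{1}{X_t^{2}}\,\sigma^{2} X_t^{2} \right) dt
            + \frac{1}{X_t}\,\sigma X_t \, dW_t \\
          &= \left( \mu - \tfrac{1}{2}\sigma^{2} \right) dt + \sigma \, dW_t, \\
X_T &= X_0 \exp\!\left( \left( \mu - \tfrac{1}{2}\sigma^{2} \right) T + \sigma W_T \right).
\end{aligned}
```

Classical calculus would give the drift $\mu$ for $\ln X_t$; the correction $-\tfrac{1}{2}\sigma^{2}$ comes entirely from the term $\tfrac{1}{2} f'' b^{2}$.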

21.2 Approximation of SDEs

Since analytical solutions of SDEs are rare, numerical approximations have been developed. These are typically based on a time discretization with points $0 = \tau_0 < \tau_1 < \dots < \tau_n < \dots < \tau_N = T$ in the time interval $[0, T]$, using, for instance, an equal step size $\Delta = T/N$. More general time discretizations can be used. Simulation methods for SDEs have been presented in the monographs (Kloeden and Platen 1999) and (Kloeden et al. 2003), (Milstein 1995), (Jäckel 2002), (Glasserman 2004) and (Platen and Heath 2006), where the last three books are related to finance.


To keep our formulae simple, we discuss in the following only the case of the above one-dimensional SDE. Discrete time approximations for SDEs with a jump component or SDEs that are driven by Lévy processes have been studied, for instance, in (Platen 1982), (Mikulevicius and Platen 1988), (Maghsoodi 1996), (Li and Liu 1997), (Kubilius and Platen 2002) and (Protter and Talay 1997). There is also a developing literature on the approximation of SDEs and their functionals involving absorption or reflection at boundaries, see (Gobet 2000) or (Hausenblas 2000). Most of the numerical methods that we mention have been generalized to multi-dimensional state variables $X_t$ and Wiener processes $W_t$ and the case with jumps. Jumps can be covered, for instance, by discrete time approximations of SDEs if one adds to the time discretization the random jump times for the modeling of events and treats the jumps separately at these times, see (Platen 1982) and (Bruti-Liberati and Platen 2006).

The Euler approximation is the simplest example of a discrete time approximation $Y$ of an SDE. It is given by the recursive equation

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n \tag{21.3}$$

for $n \in \{0, 1, \dots, N-1\}$ with $Y_0 = X_0$. Here $\Delta W_n = W_{\tau_{n+1}} - W_{\tau_n}$ denotes the increment of the Wiener process in the time interval $[\tau_n, \tau_{n+1}]$ and is represented by an independent $N(0, \Delta)$ Gaussian random variable with mean zero and variance $\Delta$. To simulate a realization of the Euler approximation, one needs to generate the independent random variables involved. In practice, linear congruential pseudorandom number generators are often available in common software packages. An introduction to this area is given in (Fishman 1996). There also exist natural uniform random number generators that do not generate any cycles, as described in (Pivitt 1999).
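The recursion (21.3) translates almost line by line into code. The following is a minimal sketch using only the Python standard library; the function name `euler_path` and the geometric Brownian motion coefficients (`mu`, `sigma`) are hypothetical choices for illustration and are not prescribed by the chapter.

```python
import math
import random


def euler_path(a, b, x0, T, N, rng):
    """Return the Euler approximation Y_0, ..., Y_N of dX = a(X) dt + b(X) dW."""
    dt = T / N                      # equal step size Delta = T / N
    y = x0
    path = [y]
    for _ in range(N):
        # Wiener increment: independent N(0, dt) Gaussian random variable
        dW = rng.gauss(0.0, math.sqrt(dt))
        y = y + a(y) * dt + b(y) * dW
        path.append(y)
    return path


# Hypothetical example: a(x) = mu * x, b(x) = sigma * x
mu, sigma = 0.05, 0.2
rng = random.Random(42)             # seeded pseudorandom number generator
path = euler_path(lambda x: mu * x, lambda x: sigma * x, 1.0, 1.0, 250, rng)
```

Passing the generator in as an argument keeps the sketch reproducible and makes it easy to swap in a different source of random numbers.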

21.3 Strong and Weak Convergence

The most general and also most flexible quantitative method for SDEs is the simulation of their trajectories. For simulations it is most important to specify the class of problems that one wishes to investigate before starting to apply a simulation algorithm. There are two major types of convergence to be distinguished. These are the strong convergence for simulating scenarios and the weak convergence for simulating functionals. The simulation of sample paths, as, for instance, the generation of a stock price scenario, a filter estimate for some hidden unobserved variable, the testing of a statistical estimator or a hedge simulation, requires the simulated sample path to be close to that of the solution of the original SDE and, thus, relates to strong convergence.


A discrete time approximation $Y$ of the exact solution $X$ of an SDE converges in the strong sense with order $\gamma \in (0, \infty)$ if there exists a constant $K < \infty$ such that

$$E\,|X_T - Y_N| \le K \Delta^{\gamma} \tag{21.4}$$

for all step sizes $\Delta \in (0, 1)$. Much computational effort is still being wasted in practice on simulations that overlook the fact that, in a large variety of problems in finance, a pathwise approximation of the solution $X$ of an SDE is not required. If one aims to compute, for instance, a moment, a probability, an expected value of a future cash flow, an option price, a risk measure or, in general, a functional of the form $E(g(X_T))$, then only a weak approximation is required. A discrete time approximation $Y$ of a solution $X$ of an SDE converges in the weak sense with order $\beta \in (0, \infty]$ if for any polynomial $g$ there exists a constant $K_g < \infty$ such that

$$|E(g(X_T)) - E(g(Y_N))| \le K_g \Delta^{\beta} \tag{21.5}$$

for all step sizes $\Delta \in (0, 1)$. Numerical methods that can be constructed with respect to this weak convergence criterion are much easier to implement than those required by the strong convergence criterion (21.4).

The key to the construction of most higher order numerical approximations is a Taylor type expansion for SDEs, which is known as the Wagner-Platen expansion, see (Platen and Wagner 1982). It is obtained by iterated applications of the Itô formula (21.2) to the integrands in the integral version of the SDE (21.1). For example, one obtains the expansion

$$X_{\tau_{n+1}} = X_{\tau_n} + a(X_{\tau_n})\,\Delta + b(X_{\tau_n})\,I_{(1)} + b(X_{\tau_n})\,b'(X_{\tau_n})\,I_{(1,1)} + R_{t_0,t}, \tag{21.6}$$

where $R_{t_0,t}$ represents some remainder term consisting of higher order multiple stochastic integrals. Multiple stochastic integrals of the type

$$\begin{aligned}
I_{(1)} &= \Delta W_n = \int_{\tau_n}^{\tau_{n+1}} dW_s, \\
I_{(1,1)} &= \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_2} dW_{s_1}\, dW_{s_2} = \tfrac{1}{2}\left( (I_{(1)})^2 - \Delta \right), \\
I_{(0,1)} &= \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_2} ds_1\, dW_{s_2}, \qquad
I_{(1,0)} = \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_2} dW_{s_1}\, ds_2, \\
I_{(1,1,1)} &= \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_3} \int_{\tau_n}^{s_2} dW_{s_1}\, dW_{s_2}\, dW_{s_3}
\end{aligned} \tag{21.7}$$

on the interval $[\tau_n, \tau_{n+1}]$ form the random building blocks in a Wagner-Platen expansion. Algebraic relationships exist between multiple stochastic integrals which have been described, for instance, in (Platen and Wagner 1982) and (Kloeden and Platen 1999). The above type of stochastic Taylor expansion can be applied in many ways in finance, for instance, in the calculation of sensitivities for derivative prices.
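One of the algebraic relationships in (21.7), namely that the double Itô integral $I_{(1,1)}$ equals half of $(I_{(1)})^2 - \Delta$, can be checked numerically. The sketch below approximates $I_{(1,1)}$ by a Riemann sum with the integrand taken at the left-hand point of each subinterval. The step size `delta`, the number of subintervals `m` and the function name are hypothetical choices for illustration.

```python
import math
import random


def double_ito_integral(increments):
    """Left-point Riemann sum approximating I_(1,1) on one time step."""
    w = 0.0        # inner integral: Wiener increment accumulated so far
    total = 0.0
    for dw in increments:
        total += w * dw    # integrand evaluated at the left end of the subinterval
        w += dw
    return total


rng = random.Random(1)
delta, m = 1.0, 20000      # one step of length delta, split into m subintervals
incs = [rng.gauss(0.0, math.sqrt(delta / m)) for _ in range(m)]

riemann = double_ito_integral(incs)
closed_form = 0.5 * (sum(incs) ** 2 - delta)   # (1/2) * ((I_(1))^2 - delta)
```

The two numbers agree up to a discretization error that shrinks as `m` grows, because the sum of squared sub-increments converges to `delta`.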


21.4 Strong Approximation Methods

If from the Wagner-Platen expansion (21.6) we select only the first three terms, then we obtain the Euler approximation (21.3), which, in general, has strong order $\gamma = 0.5$. By taking one additional term in the expansion (21.6) we obtain the Milstein scheme

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + b(Y_n)\,b'(Y_n)\,I_{(1,1)},$$

proposed in (Milstein 1974), where the double Itô integral $I_{(1,1)}$ is given in (21.7). In general, this scheme has strong order $\gamma = 1.0$. For multi-dimensional driving Wiener processes, the double stochastic integrals appearing in the Milstein scheme have to be generated separately, involving stochastic area integrals, or should be approximated, unless the drift and diffusion coefficients fulfil a commutativity condition, see (Milstein 1995), (Kloeden and Platen 1999) and (Gaines and Lyons 1994). In (Platen and Wagner 1982) it has been described which terms of the expansion have to be chosen to obtain any desired higher strong order of convergence. The strong Taylor approximation of order $\gamma = 1.5$ has the form

$$\begin{aligned}
Y_{n+1} = {} & Y_n + a\,\Delta + b\,\Delta W_n + b\,b'\,I_{(1,1)} + b\,a'\,I_{(1,0)} + \left( a\,a' + \tfrac{1}{2} b^2 a'' \right) \frac{\Delta^2}{2} \\
& + \left( a\,b' + \tfrac{1}{2} b^2 b'' \right) I_{(0,1)} + b \left( b\,b'' + (b')^2 \right) I_{(1,1,1)},
\end{aligned}$$

where we suppress the dependence of the coefficients on $Y_n$ and use the multiple stochastic integrals mentioned in (21.7). For higher order schemes one requires, in general, adequate smoothness of the drift and diffusion coefficients, but also sufficient information about the driving Wiener processes, which is contained in the multiple stochastic integrals.

A disadvantage of higher order strong Taylor approximations is the fact that derivatives of the drift and diffusion coefficients have to be calculated at each step. As a first attempt at avoiding derivatives in the Milstein scheme, one can use the following $\gamma = 1.0$ strong order method

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + \left( b(\hat{Y}_n) - b(Y_n) \right) \frac{1}{2\sqrt{\Delta}} \left( (\Delta W_n)^2 - \Delta \right),$$

with $\hat{Y}_n = Y_n + b(Y_n)\sqrt{\Delta}$. In (Burrage et al. 1997) a $\gamma = 1.0$ strong order two-stage Runge-Kutta method has been developed, where

$$Y_{n+1} = Y_n + \left( \bar{a}(Y_n) + 3\,\bar{a}(\bar{Y}_n) \right) \frac{\Delta}{4} + \left( b(Y_n) + 3\,b(\bar{Y}_n) \right) \frac{\Delta W_n}{4} \tag{21.8}$$

with $\bar{Y}_n = Y_n + \tfrac{2}{3} \left( \bar{a}(Y_n)\,\Delta + b(Y_n)\,\Delta W_n \right)$, using the function

$$\bar{a}(x) = a(x) - \tfrac{1}{2}\, b(x)\, b'(x). \tag{21.9}$$


The advantage of the method (21.8) is that the principal leading error has been minimized within a class of one stage first order Runge-Kutta methods. In the context of simulation of SDEs it must be emphasized that before any properties of higher order convergence can be exploited the question of numerical stability of a scheme has to be satisfactorily answered, see (Kloeden and Platen 1999). Many problems in financial risk management turn out to be multidimensional. Systems of SDEs with extremely different time scales easily arise in modeling, which can cause numerical instabilities for most explicit methods. Implicit methods can resolve some of these problems. We mention here a family of drift implicit Milstein schemes, described in (Milstein 1995), where

$$Y_{n+1} = Y_n + \left\{ \alpha\,\bar{a}(Y_{n+1}) + (1 - \alpha)\,\bar{a}(Y_n) \right\} \Delta + b(Y_n)\,\Delta W_n + b(Y_n)\,b'(Y_n)\,\frac{(\Delta W_n)^2}{2}.$$

Here $\alpha \in [0, 1]$ represents the degree of implicitness and $\bar{a}$ is given in (21.9). This scheme is of strong order $\gamma = 1.0$. In a strong scheme it is almost impossible to construct implicit expressions for the noise terms. However, in finance the diffusion terms are typically of multiplicative nature and can cause numerical instability. In (Milstein, Platen and Schurz 1998) part of the problem is overcome by suggesting the family of balanced implicit methods given in the form

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + (Y_n - Y_{n+1})\,C_n,$$

where $C_n = c^0(Y_n)\,\Delta + c^1(Y_n)\,|\Delta W_n|$ and $c^0$, $c^1$ represent positive real valued uniformly bounded functions. This method is of strong order $\gamma = 0.5$. It shows excellent numerical stability properties in a number of applications in finance and filtering, see (Fischer and Platen 1999).
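The difference between strong order 0.5 and strong order 1.0 can be made visible with a small experiment. The sketch below compares the Euler scheme (21.3) with the Milstein scheme on geometric Brownian motion, whose exact solution is known in closed form, so that the pathwise error in the criterion (21.4) can be estimated directly. The model, the parameter values and the function name `strong_errors` are assumptions chosen for illustration.

```python
import math
import random


def strong_errors(mu, sigma, x0, T, N, paths, seed=0):
    """Estimate E|X_T - Y_N| for the Euler and Milstein schemes applied to
    dX = mu * X dt + sigma * X dW (hypothetical test equation)."""
    rng = random.Random(seed)
    dt = T / N
    err_euler = 0.0
    err_milstein = 0.0
    for _ in range(paths):
        y_e = y_m = x0
        w = 0.0                               # Wiener path value W_T
        for _ in range(N):
            dW = rng.gauss(0.0, math.sqrt(dt))
            w += dW
            y_e += mu * y_e * dt + sigma * y_e * dW
            # Milstein adds b * b' * I_(1,1) with I_(1,1) = ((dW)^2 - dt) / 2;
            # here b(x) = sigma * x, so b * b' = sigma^2 * x
            y_m += (mu * y_m * dt + sigma * y_m * dW
                    + 0.5 * sigma * sigma * y_m * (dW * dW - dt))
        # Exact solution driven by the same Wiener path
        exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * w)
        err_euler += abs(exact - y_e)
        err_milstein += abs(exact - y_m)
    return err_euler / paths, err_milstein / paths


e_err, m_err = strong_errors(mu=0.05, sigma=0.5, x0=1.0, T=1.0, N=50, paths=2000)
```

Both schemes use identical Wiener increments, so the comparison isolates the effect of the additional term. Repeating the experiment with a smaller step size gives a rough empirical estimate of the two convergence orders.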

21.5 Weak Approximation Methods

For many applications in finance, in particular for Monte Carlo simulation of derivative prices, it is not necessary to use strong approximations. Under the weak convergence criterion (21.5) the random increments $\Delta W_n$ of the Wiener process can be replaced by simpler random variables $\Delta \tilde{W}_n$. By substituting the $N(0, \Delta)$ Gaussian distributed random variable $\Delta W_n$ in the Euler approximation (21.3) by an independent two-point distributed random variable $\Delta \tilde{W}_n$ with $P(\Delta \tilde{W}_n = \pm\sqrt{\Delta}) = 0.5$ we obtain the simplified Euler method

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta \tilde{W}_n, \tag{21.10}$$

which converges with weak order $\beta = 1.0$. The Euler approximation (21.3) can be interpreted as the order 1.0 weak Taylor scheme.
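A minimal Monte Carlo sketch of the simplified Euler method follows. It estimates the functional $E(X_T)$ for geometric Brownian motion, for which the exact value $X_0 e^{\mu T}$ is known; the model, the parameter values and the function name are hypothetical and serve only to illustrate the use of two-point increments.

```python
import math
import random


def simplified_euler_mean(mu, sigma, x0, T, N, paths, seed=0):
    """Monte Carlo estimate of E(Y_N) with two-point increments +/- sqrt(dt)."""
    rng = random.Random(seed)
    dt = T / N
    step = math.sqrt(dt)
    total = 0.0
    for _ in range(paths):
        y = x0
        for _ in range(N):
            # Two-point random variable: +sqrt(dt) or -sqrt(dt), each with probability 0.5
            dW = step if rng.random() < 0.5 else -step
            y += mu * y * dt + sigma * y * dW
        total += y
    return total / paths


estimate = simplified_euler_mean(mu=0.05, sigma=0.2, x0=1.0, T=1.0, N=20, paths=20000)
exact = math.exp(0.05)     # E(X_T) = x0 * exp(mu * T) for this test equation
```

No Gaussian random numbers are needed, which is the practical appeal of weak schemes: a single random bit per step suffices for weak order 1.0.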


One can select additional terms from a Wagner-Platen expansion to obtain weak Taylor schemes of higher order. The order β = 2.0 weak Taylor scheme has the form

$$\begin{aligned} Y_{n+1} = {} & Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + b(Y_n)\, b'(Y_n)\, I_{(1,1)} + b(Y_n)\, a'(Y_n)\, I_{(1,0)} \\ & + \left(a(Y_n)\, b'(Y_n) + \tfrac12\, b^2(Y_n)\, b''(Y_n)\right) I_{(0,1)} + \left(a(Y_n)\, a'(Y_n) + \tfrac12\, b^2(Y_n)\, a''(Y_n)\right)\frac{\Delta^2}{2}, \end{aligned} \tag{21.11}$$

which was first proposed in (Milstein 1978). We still obtain a scheme of weak order β = 2.0 if we replace the random variable $\Delta W_n$ in (21.11) by $\Delta\hat W_n$, the double integrals $I_{(1,0)}$ and $I_{(0,1)}$ by $\frac{\Delta}{2}\,\Delta\hat W_n$ and the double Wiener integral $I_{(1,1)}$ by $\frac12\left((\Delta\hat W_n)^2 - \Delta\right)$. Here $\Delta\hat W_n$ can be a three-point distributed random variable with

$$P\left(\Delta\hat W_n = \pm\sqrt{3\Delta}\right) = \frac16 \quad\text{and}\quad P\left(\Delta\hat W_n = 0\right) = \frac23. \tag{21.12}$$
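The reason a three-point variable suffices can be checked directly: its moments agree with those of an N(0, Δ) increment up to order five. The sketch below computes these moments and also shows one possible sampler; the helper names are hypothetical and the step size is an arbitrary illustrative value.

```python
import math
import random


def three_point_law(dt):
    """Return (value, probability) pairs of the three-point variable in (21.12)."""
    r = math.sqrt(3.0 * dt)
    return [(-r, 1.0 / 6.0), (0.0, 2.0 / 3.0), (r, 1.0 / 6.0)]


def moment(law, k):
    """k-th moment of a discrete law given as (value, probability) pairs."""
    return sum(p * v ** k for v, p in law)


def sample_three_point(dt, rng):
    """Draw one three-point increment by inverting a uniform random number."""
    u = rng.random()
    r = math.sqrt(3.0 * dt)
    if u < 1.0 / 6.0:
        return -r
    if u < 1.0 / 3.0:
        return r
    return 0.0


dt = 0.01
law = three_point_law(dt)
# Moments E[dW^k], k = 0..5, of a Gaussian N(0, dt) increment
gaussian_moments = [1.0, 0.0, dt, 0.0, 3.0 * dt ** 2, 0.0]
for k, target in enumerate(gaussian_moments):
    print(k, moment(law, k), target)
```

The sixth moment of the three-point variable is 9Δ³ rather than the Gaussian value 15Δ³, so the agreement stops at order five, which is what a weak order 2.0 scheme requires.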

As in the case of strong approximations, higher order weak schemes can be constructed only if there is adequate smoothness of the drift and diffusion coefficients and a sufficiently rich set of random variables involved for approximating the multiple stochastic integrals. A weak second order Runge-Kutta approximation, see (Kloeden and Platen 1999), that avoids derivatives in a and b is given by the algorithm

$$\begin{aligned} Y_{n+1} = {} & Y_n + \left(a(\hat Y_{n+1}) + a(Y_n)\right)\frac{\Delta}{2} + \left(b(\hat Y^{+}_{n+1}) + b(\hat Y^{-}_{n+1}) + 2\,b(Y_n)\right)\frac{\Delta\hat W_n}{4} \\ & + \left(b(\hat Y^{+}_{n+1}) - b(\hat Y^{-}_{n+1})\right)\frac{(\Delta\hat W_n)^2 - \Delta}{4\sqrt{\Delta}} \end{aligned} \tag{21.13}$$

with $\hat Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta\hat W_n$ and $\hat Y^{\pm}_{n+1} = Y_n + a(Y_n)\,\Delta \pm b(Y_n)\sqrt{\Delta}$, where $\Delta\hat W_n$ can be chosen as in (21.12).

Implicit methods have better numerical stability properties than most explicit schemes, and turn out to be better suited to simulation tasks with potential stability problems. The following family of weak implicit Euler schemes, converging with weak order β = 1.0, can be found in (Kloeden and Platen 1999):

$$Y_{n+1} = Y_n + \{\xi\, a_\eta(Y_{n+1}) + (1-\xi)\, a_\eta(Y_n)\}\,\Delta + \{\eta\, b(Y_{n+1}) + (1-\eta)\, b(Y_n)\}\,\Delta\tilde W_n. \tag{21.14}$$

Here $\Delta\tilde W_n$ can be chosen as in (21.10), and we have to set $a_\eta(y) = a(y) - \eta\, b(y)\, b'(y)$ for ξ ∈ [0,1] and η ∈ [0,1]. For ξ = 1 and η = 0, (21.14) leads to the drift implicit Euler scheme. The choice ξ = η = 0 provides the simplified Euler scheme, whereas for ξ = η = 1 we have the fully implicit Euler scheme.
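The stability benefit of implicitness can be illustrated on a stiff linear test equation, where the implicit step of (21.14) can be solved by hand. The sketch below assumes η = 0 and the hypothetical test equation dX = −λX dt + σX dW; the values of λ, σ and the step size are arbitrary and chosen so that λΔ is large.

```python
import math
import random


def weak_implicit_euler_linear(y0, lam, sigma, xi, T, n_steps, rng):
    """Family (21.14) with eta = 0 applied to dX = -lam*X dt + sigma*X dW.

    For this linear drift the implicit equation
        Y_{n+1} = Y_n - lam*(xi*Y_{n+1} + (1 - xi)*Y_n)*dt + sigma*Y_n*dW
    is solved in closed form for Y_{n+1}.
    """
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    y = y0
    for _ in range(n_steps):
        # Two-point increment as in the simplified Euler method
        dw = sqrt_dt if rng.random() < 0.5 else -sqrt_dt
        y = y * (1.0 - (1.0 - xi) * lam * dt + sigma * dw) / (1.0 + xi * lam * dt)
    return y


lam, sigma, T, n_steps = 50.0, 0.5, 2.0, 20   # dt = 0.1, so lam*dt = 5 (stiff)
rng = random.Random(7)
explicit = [abs(weak_implicit_euler_linear(1.0, lam, sigma, 0.0, T, n_steps, rng))
            for _ in range(200)]
implicit = [abs(weak_implicit_euler_linear(1.0, lam, sigma, 1.0, T, n_steps, rng))
            for _ in range(200)]
print("explicit (xi = 0), smallest |Y_T|:", min(explicit))
print("drift implicit (xi = 1), largest |Y_T|:", max(implicit))
```

For these parameters the exact solution decays rapidly, and only the drift implicit variant reproduces that behaviour at this step size; the explicit variant amplifies the solution at every step.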


A family of implicit weak order 2.0 schemes has been proposed in (Milstein 1995) with

$$\begin{aligned} Y_{n+1} = {} & Y_n + \{\xi\, a(Y_{n+1}) + (1-\xi)\, a(Y_n)\}\,\Delta + \tfrac12\, b(Y_n)\, b'(Y_n)\left((\Delta\hat W_n)^2 - \Delta\right) \\ & + \left\{b(Y_n) + \tfrac12\left(b'(Y_n) + (1-2\xi)\, a'(Y_n)\right)\Delta\right\}\Delta\hat W_n \\ & + (1-2\xi)\left\{\beta\, a'(Y_n) + (1-\beta)\, a'(Y_{n+1})\right\}\frac{\Delta^2}{2}, \end{aligned}$$

where $\Delta\hat W_n$ is chosen as in (21.12). For an implicit method one usually has to solve an algebraic equation at each time step. A weak second order predictor-corrector method, see (Platen 1995), avoids such a calculation and is given by the corrector

$$Y_{n+1} = Y_n + \frac{\Delta}{2}\left(a(\hat Y_{n+1}) + a(Y_n)\right) + \psi_n$$

with

$$\psi_n = \left\{b(Y_n) + \tfrac12\left(a(Y_n)\, b'(Y_n) + \tfrac12\, b^2(Y_n)\, b''(Y_n)\right)\Delta\right\}\Delta\hat W_n + b(Y_n)\, b'(Y_n)\,\frac{(\Delta\hat W_n)^2 - \Delta}{2}$$

and the predictor

$$\hat Y_{n+1} = Y_n + a(Y_n)\,\Delta + \psi_n + \left(a(Y_n)\, a'(Y_n) + \tfrac12\, b^2(Y_n)\, a''(Y_n)\right)\frac{\Delta^2}{2} + b(Y_n)\, a'(Y_n)\,\Delta\hat W_n\,\frac{\Delta}{2},$$

where $\Delta\hat W_n$ is as in (21.12). For the weak second order approximation of the functional $E(g(X_T))$ a Richardson extrapolation of the form

$$V^{\Delta}_{g,2}(T) = 2\,E\left(g\left(Y^{\Delta}(T)\right)\right) - E\left(g\left(Y^{2\Delta}(T)\right)\right) \tag{21.15}$$

was proposed in (Talay and Tubaro 1990), where $Y^{\delta}(T)$ denotes the value at time T of an Euler approximation with step size δ. Using Euler approximations with step sizes δ = Δ and δ = 2Δ, and then taking the difference (21.15) of their respective functionals, the leading error coefficient cancels out, and $V^{\Delta}_{g,2}(T)$ ends up being a weak second order approximation. Further weak higher order extrapolations can be found in (Kloeden and Platen 1999). For instance, one obtains a weak fourth order extrapolation method using

$$V^{\Delta}_{g,4}(T) = \frac{1}{21}\left[32\,E\left(g\left(Y^{\Delta}(T)\right)\right) - 12\,E\left(g\left(Y^{2\Delta}(T)\right)\right) + E\left(g\left(Y^{4\Delta}(T)\right)\right)\right],$$

where $Y^{\delta}(T)$ is the value at time T of the weak second order Runge-Kutta scheme (21.13) with step size δ. Extrapolation methods can only be successfully applied if there is a wide range of time step sizes where the underlying discrete time approximation works efficiently.
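The cancellation of the leading error term in (21.15) can be seen without any sampling noise in a case where the Euler expectation is available in closed form. The sketch below assumes the hypothetical linear test equation dX = μX dt + σX dW with g(x) = x; each Euler step then multiplies the mean by (1 + μδ), so E(g(Y^δ(T))) = x0(1 + μδ)^{T/δ} exactly. In practice each expectation would itself be estimated by Monte Carlo.

```python
import math


def euler_mean_linear(x0, mu, T, dt):
    """Exact E[Y^dt(T)] for the Euler scheme applied to dX = mu*X dt + sigma*X dW.

    The noise term has zero mean, so each step multiplies the mean by (1 + mu*dt).
    """
    n_steps = round(T / dt)
    return x0 * (1.0 + mu * dt) ** n_steps


def richardson(x0, mu, T, dt):
    """Richardson extrapolation (21.15) built from step sizes dt and 2*dt."""
    return 2.0 * euler_mean_linear(x0, mu, T, dt) - euler_mean_linear(x0, mu, T, 2.0 * dt)


x0, mu, T = 1.0, 0.1, 1.0
exact = x0 * math.exp(mu * T)
for dt in (0.1, 0.05, 0.025):
    err_euler = abs(euler_mean_linear(x0, mu, T, dt) - exact)
    err_extrap = abs(richardson(x0, mu, T, dt) - exact)
    print(dt, err_euler, err_extrap)
```

Halving the step size roughly halves the Euler error but reduces the extrapolated error by a factor close to four, consistent with weak orders 1.0 and 2.0 respectively.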

21.6 Monte Carlo Simulation for SDEs

By a discrete time weak approximation Y one can form a straightforward Monte Carlo estimate for a functional of the form $u = E(g(X_T))$, see (Boyle 1977) and (Kloeden and Platen 1999), using the sample average

$$u_{N,\Delta} = \frac{1}{N}\sum_{k=1}^{N} g\left(Y_T(\omega_k)\right),$$

with N independent simulated realizations $Y_T(\omega_1), Y_T(\omega_2), \ldots, Y_T(\omega_N)$. Monte Carlo methods are very general and work in almost all circumstances. However, the computational efficiency is often limited. For high-dimensional functionals it is sometimes the only method of obtaining a result. One pays for this generality by the large sample sizes required to achieve reasonably accurate estimates. To see this we can introduce the mean error $\hat\mu$ in the form

$$\hat\mu = u_{N,\Delta} - E\left(g(X_T)\right).$$

We can decompose $\hat\mu$ into a systematic error $\mu_{sys}$ and a statistical error $\mu_{stat}$, such that

$$\hat\mu = \mu_{sys} + \mu_{stat}, \tag{21.16}$$

where

$$\mu_{sys} = E(\hat\mu) = E\left(\frac{1}{N}\sum_{k=1}^{N} g\left(Y_T(\omega_k)\right)\right) - E\left(g(X_T)\right) = E\left(g(Y_T)\right) - E\left(g(X_T)\right). \tag{21.17}$$


For a large number N of simulated independent sample paths of Y the statistical error $\mu_{stat}$ becomes, by the Central Limit Theorem, asymptotically Gaussian with mean zero and variance of the form

$$Var(\mu_{stat}) = Var(\hat\mu) = \frac{1}{N}\,Var\left(g(Y_T)\right). \tag{21.18}$$

This shows a significant disadvantage of Monte Carlo methods, because the deviation

$$Dev(\mu_{stat}) = \sqrt{Var(\mu_{stat})} = \frac{1}{\sqrt{N}}\sqrt{Var\left(g(Y_T)\right)} \tag{21.19}$$

decreases at only the slow rate $1/\sqrt{N}$ as N → ∞, which determines the length of corresponding confidence intervals.
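The sample average together with an estimate of the deviation (21.19) can be computed in one pass over the simulated values. The following sketch is illustrative only: the model dX = μX dt + σX dW, the choice g(x) = x, the Euler discretization and all parameter values are assumptions, and the interval uses the usual 1.96 quantile of the asymptotic Gaussian law.

```python
import math
import random


def monte_carlo(sample_g, n_paths, rng):
    """Return the sample average of g(Y_T) and its estimated deviation."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        v = sample_g(rng)
        total += v
        total_sq += v * v
    mean = total / n_paths
    sample_var = (total_sq - n_paths * mean * mean) / (n_paths - 1)
    return mean, math.sqrt(sample_var / n_paths)


def sample_g(rng, mu=0.1, sigma=0.3, T=1.0, n_steps=10):
    """g(Y_T) = Y_T from one Euler path of the assumed model, started at 1."""
    dt = T / n_steps
    y = 1.0
    for _ in range(n_steps):
        y += mu * y * dt + sigma * y * rng.gauss(0.0, math.sqrt(dt))
    return y


rng = random.Random(2024)
for n in (1000, 4000, 16000):
    mean, dev = monte_carlo(sample_g, n, rng)
    print(n, mean, dev, (mean - 1.96 * dev, mean + 1.96 * dev))
```

Each fourfold increase in N only halves the estimated deviation, while the systematic error of the Euler scheme (here the gap between 1.01^10 and exp(0.1)) is unaffected by N and can only be reduced through the time discretization.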

21.7 Variance Reduction

One can increase the efficiency in Monte Carlo simulation for SDEs considerably by using variance reduction techniques. These primarily reduce the variance of the random variable actually simulated. The method of antithetic variates, see (Law and Kelton 1991), repeatedly uses the random variables originally generated in some symmetric pattern to construct sample paths that offset each other’s noise to some extent. Stratified sampling is another technique that has been widely used, see (Fournie et al. 1997). It divides the whole sample space into disjoint sets of events and an unbiased estimator is constructed that is conditional on the occurrence of these events. The measure transformation method, see (Milstein 1995), (Kloeden and Platen 1999) and (Hofmann et al. 1992), introduces a new probability measure via a Girsanov transformation and can achieve considerable variance reductions. An efficient technique that uses integral representation has been suggested in (Heath and Platen 2002). Very useful is the control variate technique, which is based on the selection of a random variable Y with known mean E(Y) that allows one to construct an unbiased estimator with smaller variance, see (Newton 1994) and (Heath and Platen 1996).

The control variate technique is a very flexible and powerful variance reduction method. Let $X = \{X_t = (X_t^1, \ldots, X_t^d)^\top,\ t \in [0,T]\}$ be the solution of a d-dimensional SDE with initial value $x = (x^1, \ldots, x^d)^\top \in \Re^d$. We wish to compute the following expected value

$$u = E\left(H(X_T)\right), \tag{21.20}$$

where $H : \Re^d \to \Re$ is some payoff function. Our aim will be to find a control variate by exploiting the particular structure of the underlying SDE. This will enable us to obtain an accurate and fast estimate of $E(H(X_T))$. The control variate technique aims to find a suitable random variable Y with known mean E(Y). It then estimates $E(H(X_T))$ by computing the expected value of

$$Z = H(X_T) - \alpha\left(Y - E(Y)\right)$$

for some suitable choice of α ∈ ℜ. The parameter α can be chosen to minimize the variance of Z. Because

$$E(Z) = E\left(H(X_T)\right),$$

the average

$$u_N = \frac{1}{N}\sum_{i=1}^{N} Z(\omega_i)$$

is an unbiased estimator for $E(H(X_T))$. One assumes that both $H(X_T)$ and Y can be evaluated for any realization ω ∈ Ω. We call the random variable Y a control variate for the estimation of $E(H(X_T))$.

As an example, we consider a stochastic volatility model with a vector process $X = \{X_t = (S_t, \sigma_t)^\top,\ t \in [0,T]\}$ that satisfies the two-dimensional SDE

$$\begin{aligned} dS_t &= \sigma_t\, S_t\, dW_t^1 \\ d\sigma_t &= \gamma(\kappa - \sigma_t)\, dt + \xi\,\sigma_t\, dW_t^2 \end{aligned} \tag{21.21}$$

for t ∈ [0,T] with initial values $S_0 = s > 0$, $\sigma_0 = \sigma > 0$ and γ, κ, ξ > 0. Here the short rate is set to zero and the pricing is performed under an equivalent risk neutral probability measure. We take $W^1$ and $W^2$ to be two independent standard Wiener processes. Stochastic volatility models of similar type have been widely used in the literature, see (Stein and Stein 1991), (Heston 1993) or (Heath et al. 2001). Let us consider the payoff of a European call option

$$H(X_T) = (S_T - K)^+,$$

where K > 0 is the strike and T is the maturity. Now, $\hat X = \{\hat X_t = (\hat S_t, \hat\sigma_t)^\top,\ t \in [0,T]\}$ is an adjusted asset price process with deterministic volatility, which evolves according to the system of SDEs

$$\begin{aligned} d\hat S_t &= \hat\sigma_t\, \hat S_t\, dW_t^1 \\ d\hat\sigma_t &= \gamma(\kappa - \hat\sigma_t)\, dt \end{aligned} \tag{21.22}$$

for t ∈ [0,T] with the same initial values $\hat S_0 = s$ and $\hat\sigma_0 = \sigma$. The adjusted price $\hat u$ for the European call payoff $Y = (\hat S_T - K)^+$ is obtained as the expectation

$$\hat u = E(Y) = E\left((\hat S_T - K)^+\right). \tag{21.23}$$


Since the volatility process $\hat\sigma$ is deterministic, this expectation can be evaluated by using the well-known Black-Scholes formula, see (Black and Scholes 1973). Consequently, in the random variable

$$Z = (S_T - K)^+ - \alpha\left((\hat S_T - K)^+ - E\left((\hat S_T - K)^+\right)\right) = (S_T - K)^+ - \alpha\,(Y - \hat u)$$

the mean $\hat u$ is easily computed without any simulation. To price the option under consideration, one now estimates E(Z), which is the control variate estimator, with Y playing the role of the control variate. This means that one simulates $\sigma_t$, $S_t$ and $\hat S_t$ for t ∈ [0,T]. To do this, one could use the previously mentioned weak second order predictor-corrector scheme, for example. The above method can be very powerful since it takes advantage of the structure of the underlying SDE. Variance reduction techniques can be combined and, if appropriately designed, can reduce the variance by a factor of several thousand.
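The control variate construction for the stochastic volatility example can be sketched as follows. This is a simplified illustration and not the scheme used in the references: instead of the predictor-corrector method it applies an Euler discretization to log S and to σ, the deterministic volatility is discretized on the same grid so that the mean of the control variate is given exactly by a zero-rate Black-Scholes formula with the discrete total variance, α is simply set to 1, and all parameter values and function names are hypothetical.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s, k, total_var):
    """Zero-rate call price when log S_T is Gaussian with variance total_var."""
    sd = math.sqrt(total_var)
    d1 = (math.log(s / k) + 0.5 * total_var) / sd
    return s * norm_cdf(d1) - k * norm_cdf(d1 - sd)


def control_variate_samples(s, sig0, gamma, kappa, xi, k, T,
                            n_steps, n_paths, alpha, rng):
    """Return samples of H = (S_T - K)^+ and of Z = H - alpha*(Y - u_hat)."""
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    # Deterministic volatility on the grid: Euler steps of (21.22)
    sig_hat = [sig0]
    for _ in range(n_steps - 1):
        sig_hat.append(sig_hat[-1] + gamma * (kappa - sig_hat[-1]) * dt)
    # Known mean u_hat = E(Y) of the discretized adjusted process
    u_hat = black_scholes_call(s, k, sum(v * v for v in sig_hat) * dt)
    h_vals, z_vals = [], []
    for _ in range(n_paths):
        log_s = log_s_hat = math.log(s)
        sig = sig0
        for n in range(n_steps):
            dw1 = rng.gauss(0.0, sqrt_dt)
            dw2 = rng.gauss(0.0, sqrt_dt)
            # S and S_hat are driven by the same noise dW^1
            log_s += -0.5 * sig * sig * dt + sig * dw1
            log_s_hat += -0.5 * sig_hat[n] ** 2 * dt + sig_hat[n] * dw1
            sig += gamma * (kappa - sig) * dt + xi * sig * dw2
        h = max(math.exp(log_s) - k, 0.0)
        y = max(math.exp(log_s_hat) - k, 0.0)
        h_vals.append(h)
        z_vals.append(h - alpha * (y - u_hat))
    return h_vals, z_vals, u_hat


def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


h_vals, z_vals, u_hat = control_variate_samples(
    s=100.0, sig0=0.2, gamma=2.0, kappa=0.2, xi=0.3, k=100.0, T=1.0,
    n_steps=25, n_paths=2000, alpha=1.0, rng=random.Random(11))
print("raw estimator (mean, variance):", mean_var(h_vals))
print("control variate estimator (mean, variance):", mean_var(z_vals))
```

Both estimators target the same expectation, but because the payoff on the adjusted process is strongly correlated with the payoff on the stochastic volatility process, the sample variance of Z is much smaller than that of H. Choosing α from an estimate of Cov(H, Y)/Var(Y) instead of fixing α = 1 is a common refinement.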

References

Bachelier L (1900) Théorie de la spéculation. Annales de l’Ecole Normale Supérieure, Series 3 17, pp. 21–86
Black F, Scholes M (1973) The pricing of options and corporate liabilities. J. Political Economy 81, pp. 637–654
Boyle P (1977) A Monte Carlo approach. J. Financial Economics 4, pp. 323–338
Bruti-Liberati N, Platen E (2006) Strong approximations of stochastic differential equations with jumps. J. Comput. Appl. Math. (forthcoming)
Burrage K, Burrage P, Belward J (1997) A bound on the maximum strong order of stochastic Runge-Kutta methods for stochastic ordinary differential equations. BIT 37(4), pp. 771–780
Einstein A (1906) Zur Theorie der Brownschen Bewegung. Ann. Phys. IV 19, pp. 371
Fischer P, Platen E (1999) Applications of the balanced method to stochastic differential equations in filtering. Monte Carlo Methods Appl. 5(1), pp. 19–38
Fishman GS (1996) Monte Carlo: Concepts, Algorithms and Applications. Springer Ser. Oper. Res. Springer
Fournie E, Lebuchoux J, Touzi N (1997) Small noise expansion and importance sampling. Asymptot. Anal. 14(4), pp. 331–376
Gaines JG, Lyons TJ (1994) Random generation of stochastic area integrals. SIAM J. Appl. Math. 54(4), pp. 1132–1146
Glasserman P (2004) Monte Carlo Methods in Financial Engineering, Volume 53 of Appl. Math. Springer
Gobet E (2000) Weak approximation of killed diffusion. Stochastic Process. Appl. 87, pp. 167–197


Hausenblas E (2000) Monte-Carlo simulation of killed diffusion. Monte Carlo Methods Appl. 6(4), pp. 263–295
Heath D, Hurst SR, Platen E (2001) Modelling the stochastic dynamics of volatility for equity indices. Asia-Pacific Financial Markets 8, pp. 179–195
Heath D, Platen E (1996) Valuation of FX barrier options under stochastic volatility. Financial Engineering and the Japanese Markets 3, pp. 195–215
Heath D, Platen E (2002) A variance reduction technique based on integral representations. Quant. Finance 2(5), pp. 362–369
Heston SL (1993) A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financial Studies 6(2), pp. 327–343
Hofmann NE, Platen E, Schweizer M (1992) Option pricing under incompleteness and stochastic volatility. Math. Finance 2(3), pp. 153–187
Itô K (1944) Stochastic integral. Proc. Imp. Acad. Tokyo 20, pp. 519–524
Jäckel P (2002) Monte Carlo Methods in Finance. Wiley
Kloeden PE, Platen E (1999) Numerical Solution of Stochastic Differential Equations, Volume 23 of Appl. Math. Springer. Third corrected printing
Kloeden PE, Platen E, Schurz H (2003) Numerical Solution of SDE’s Through Computer Experiments. Universitext. Springer. Third corrected printing
Kubilius K, Platen E (2002) Rate of weak convergence of the Euler approximation for diffusion processes with jumps. Monte Carlo Methods Appl. 8(1), pp. 83–96
Law AM, Kelton WD (1991) Simulation Modeling and Analysis (2nd ed.). McGraw-Hill, New York
Li CW, Liu XQ (1997) Algebraic structure of multiple stochastic integrals with respect to Brownian motions and Poisson processes. Stochastics Stochastics Rep. 61, pp. 107–120
Maghsoodi Y (1996) Mean-square efficient numerical solution of jump-diffusion stochastic differential equations. SANKHYA A 58(1), pp. 25–47
Merton RC (1973) Theory of rational option pricing. Bell J. Econ. Management Sci. 4, pp. 141–183
Mikulevicius R, Platen E (1988) Time discrete Taylor approximations for Ito processes with jump component. Math. Nachr. 138, pp. 93–104
Milstein GN (1974) Approximate integration of stochastic differential equations. Theory Probab. Appl. 19, pp. 557–562
Milstein GN (1978) A method of second order accuracy integration of stochastic differential equations. Theory Probab. Appl. 23, pp. 396–401
Milstein GN (1995) Numerical Integration of Stochastic Differential Equations. Mathematics and Its Applications. Kluwer
Milstein GN, Platen E, Schurz H (1998) Balanced implicit methods for stiff stochastic systems. SIAM J. Numer. Anal. 35(3), pp. 1010–1019
Newton NJ (1994) Variance reduction for simulated diffusions. SIAM J. Appl. Math. 54(6), pp. 1780–1805
Øksendal BK (1998) Stochastic Differential Equations. An Introduction with Applications (5th ed.). Universitext. Springer


Pivitt B (1999) Accessing the Intel Random Number Generator with CDSA. Technical report, Intel Platform Security Division
Platen E (1982) An approximation method for a class of Itô processes with jump component. Liet. Mat. Rink. 22(2), pp. 124–136
Platen E (1995) On weak implicit and predictor-corrector methods. Math. Comput. Simulation 38, pp. 69–76
Platen E, Heath D (2006) A Benchmark Approach to Quantitative Finance. Springer Finance. Springer. 700 pp. 9 illus., Hardcover, ISBN10 3-540-26212-1
Platen E, Wagner W (1982) On a Taylor formula for a class of Itô processes. Probab. Math. Statist. 3(1), pp. 37–51
Protter P, Talay D (1997) The Euler scheme for Lévy driven stochastic differential equations. Ann. Probab. 25(1), pp. 393–423
Stein EM, Stein J (1991) Stock price distributions with stochastic volatility: An analytic approach. Rev. Financial Studies 4, pp. 727–752
Talay D, Tubaro L (1990) Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Anal. Appl. 8(4), pp. 483–509
Wiener N (1923) Differential space. J. Math. Phys. 2, pp. 131–174

CHAPTER 22 Foundations of Option Pricing Peter Buchen

22.1 Introduction

In this chapter we lay down the foundations of option and derivative security pricing in the classical Black-Scholes (BS) paradigm. The two key assumptions in this approach are the absence of arbitrage and the modelling of asset prices by geometrical Brownian motion (gBm). No arbitrage is the driving mechanism for much of modern finance and still plays a dominant role in models extended beyond the BS framework. It is generally recognised that gBm, which implies Gaussian log-returns of asset prices, while good for analysis is rather a poor description of reality. Financial data very often show stylised features in their log-return distributions that are highly non-Gaussian. These include heavy tails (leptokurtosis), skewness, stochastic volatility and long-range dependence. Unfortunately, no model of asset prices capturing all, or even some, of these features is widely accepted by either theorists or practitioners.

One reason that the BS model has maintained its popularity to the present day is its connection to market completeness (defined later in the text). A single asset with dynamics following gBm, coupled with a derivative security on that asset (or even a riskless asset, such as a non-defaultable zero coupon bond), together admit a complete market. This in turn leads to a unique arbitrage-free price for any derivative security written on the asset. Much recent research is devoted to option and derivative pricing under alternative stochastic models of the asset price, many of which lead to market incompleteness. In such cases there is no unique arbitrage-free price for derivatives, unless further assumptions, possibly involving investor preferences or investor utility, are imposed.

There are two distinct methods for pricing financial derivatives in the BS framework. The first is the PDE approach and the second is called the EMM approach.


As shown by Black and Scholes (Black and Scholes 1973) in their now celebrated paper, derivative prices in the continuous time framework satisfy a diffusion type partial differential equation. We shall derive this PDE using basically their dynamic hedging method. In the EMM approach, derivatives are priced as an expectation under the Equivalent Martingale Measure. Martingales are the mathematical entities that capture the notion of a fair game, or in the present context, no arbitrage. The EMM method is a consequence of the First Fundamental Theorem of Asset Pricing. While the PDE method is specifically tied to the BS model, the EMM method is generally valid for a much wider class of models. The two pricing methods are connected by another celebrated formula, well known to theoretical physicists as the Feynman-Kac Formula.

Practitioners often model asset prices with stochastic volatility or other departures from BS standards. Pricing options and derivatives for such models can no longer be done analytically, so the emphasis becomes one of computational expediency. The EMM approach is particularly useful in this context as it immediately lends itself to a Monte Carlo (MC) algorithm. Indeed, the great majority of all industry computation in financial derivatives today is by MC simulation. Does this mean that BS pricing is obsolete? Not at all. That the BS model has survived to the present day is testimony to its extraordinary applicability and robustness. Monte Carlo methods perform best when run with control variates, which may considerably reduce standard errors of price estimates. Control variates with analytic representations are obviously much preferred, and BS prices admirably perform this role. For example, a practitioner in the foreign exchange market may wish to price a certain barrier option. The exchange rate is assumed to follow a process with stochastic volatility and stochastic interest rates. The barrier option can be priced by MC simulation, using a closed form BS representation of the barrier price (with constant volatility and interest rates) as a control variate.

In this chapter we develop methods for pricing many different types of options and derivatives within the BS framework. These include: vanilla European calls and puts, dual expiry options such as compound and chooser options, two-asset rainbow options such as exchange options, and path-dependent options such as barrier and lookback options. Our approach, while strongly connected to both the PDE and EMM methods, uses in addition a powerful set of related analytical tools. In a nutshell, we employ the following general approach:

• Represent the derivative or exotic option as a portfolio of elementary contracts
• Price the elementary contracts using the EMM method; this is often complemented with a Gaussian Shift Theorem
• Employ the Principle of Static Replication or parity relations to obtain the arbitrage free price of the derivative

The details of this approach are described in many of the examples considered in the text that follows.

22.2 The PDE and EMM Methods

The standard BS model makes the following assumptions: the market is frictionless (i.e. no transaction costs or taxes and no penalties for short selling); the market operates continuously; the risk-free interest rate r is a known constant; the asset price $X_t$ follows gBm with constant volatility σ > 0 and pays no dividends; options and derivatives are European (i.e. no early exercise) and expire at time T with a payoff that depends only on $X_T$; the market is arbitrage free. European options are therefore path independent. We shall also price several exotic options with path dependent payoffs, including barrier and lookback options.

22.2.1 Geometrical Brownian Motion

Under the assumption of gBm, the asset price $X_t$ satisfies a stochastic differential equation (sde) of the form

$$dX_t = X_t(\mu\, dt + \sigma\, dB_t); \qquad X_0 = x_0 \tag{22.1}$$

where μ is the growth rate of the asset. The term $B_t$ is a standard Brownian motion under a measure P (called the real-world measure), with the following properties:

1. For t > 0, $B_t$ is continuous with $B_0 = 0$.
2. $B_t$ has stationary and independent increments, so that $B_t - B_s \stackrel{d}{=} B_{t-s}$. Here we use $\stackrel{d}{=}$ to mean ‘equal in distribution’.
3. For fixed t, $B_t \stackrel{d}{=} N(0,t)$.
4. From 2. and 3. above, $\mathrm{cov}\{B_s, B_t\} = \min(s,t)$ for any s, t > 0.

Thus, while the increments of Brownian motion are independent, its values at two distinct times are not. Since $B_t$ is Gaussian with zero mean and variance t, it is convenient in calculations to write $B_t \stackrel{d}{=} \sqrt{t}\, Z_t$ where $Z_t \stackrel{d}{=} N(0,1)$ is a standard Gaussian random variable (rv). We also take the liberty to drop the subscript and simply write Z for $Z_t$.

Theorem 22.1 (Itô’s Lemma). Let $X_t$ satisfy the sde $dX_t = \alpha(x,t)\, dt + \beta(x,t)\, dB_t$, where $x = X_t$, and let V(x,t) be any $C^{2,1}$ function. Then $V(X_t, t)$ satisfies the sde

$$dV = \left[V_t + \alpha V_x + \tfrac12\beta^2 V_{xx}\right] dt + \beta V_x\, dB_t \tag{22.2}$$


where subscripts on V denote partial derivatives. It is sometimes more instructive to write this last sde in the equivalent form

$$dV = \left[V_t + \tfrac12\beta^2 V_{xx}\right] dt + V_x\, dX_t.$$

Itô’s Lemma is the main tool used to solve sde’s. For example, the sde (22.1) for gBm can be solved by taking $Y_t = \log(X_t/x_0)$. The sde for $Y_t$ then becomes $dY_t = (\mu - \tfrac12\sigma^2)\, dt + \sigma\, dB_t$, and this is readily integrated to give the representation

$$X_t \stackrel{d}{=} x_0\exp\{(\mu - \tfrac12\sigma^2)t + \sigma\sqrt{t}\, Z\}; \qquad Z \sim N(0,1) \tag{22.3}$$
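The representation (22.3) allows gBm to be sampled at a fixed time without any time stepping. The following sketch draws samples in this way and compares the sample statistics with the values implied by (22.3), namely mean (μ − σ²/2)t and variance σ²t for the log-return and mean x0·exp(μt) for the price; the parameter values are arbitrary illustrative choices.

```python
import math
import random


def sample_gbm(x0, mu, sigma, t, rng):
    """Draw X_t from the representation (22.3)."""
    z = rng.gauss(0.0, 1.0)
    return x0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)


x0, mu, sigma, t = 100.0, 0.08, 0.25, 2.0
rng = random.Random(5)
samples = [sample_gbm(x0, mu, sigma, t, rng) for _ in range(50000)]

logs = [math.log(x / x0) for x in samples]
mean_log = sum(logs) / len(logs)
var_log = sum((v - mean_log) ** 2 for v in logs) / (len(logs) - 1)
mean_x = sum(samples) / len(samples)

print("log-return mean:", mean_log, "target:", (mu - 0.5 * sigma ** 2) * t)
print("log-return variance:", var_log, "target:", sigma ** 2 * t)
print("price mean:", mean_x, "target:", x0 * math.exp(mu * t))
```

The drift of the log-return is μ − σ²/2 rather than μ, which is the Itô correction produced by applying the lemma to the logarithm.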

22.2.2 The Black-Scholes pde

Consider a portfolio, established at time t, containing one contract of any derivative security and a short position of h units of the underlying asset X. Let V = V(x,t), where $x = X_t$, denote the price of the derivative; then the portfolio has present value P = V(x,t) − hx. Assume the portfolio is self-financing over the time interval (t, t + dt). This means that changes in portfolio value occur only through changes in derivative and asset prices, but not through changes in the hedge ratio h. Then, the change in P will, by Itô’s Lemma, be

$$dP = dV - h\, dX_t = \left[V_t + \tfrac12\sigma^2 x^2 V_{xx}\right] dt + (V_x - h)\, dX_t.$$

Now choose $h = V_x$ so the last term above vanishes. The expression for dP becomes non-stochastic and hence riskless. In order to avoid arbitrage, the portfolio P must therefore earn the risk-free rate of return r. That is,

$$dP = rP\, dt = r(V - xV_x)\, dt.$$

Equating the two expressions for dP, we obtain the celebrated BS-pde for the derivative price V(x,t)

$$V_t = rV - rxV_x - \tfrac12\sigma^2 x^2 V_{xx} \tag{22.4}$$

This pde describes price evolution of any derivative in the BS framework and remarkably is independent of the parameter μ. The strategy under which it was derived is called ‘dynamic hedging’, because holders of the derivative must hedge their positions by either short selling stock if h > 0, or purchasing stock if h < 0, to continuously eliminate risk. In practice, of course, such continuous hedging is impossible to implement, regardless of the accumulated transaction costs and other real market frictions.

Any general European option with expiry T payoff f(x) must satisfy the pde (22.4) for all t < T with the expiry value V(x,T) = f(x). For a vanilla European call option (the right to buy one unit of the asset for k at time T), the payoff function is $f(x) = (x - k)^+$. For a vanilla European put option (the right to sell one unit of the asset for k at time T), the payoff function is $f(x) = (k - x)^+$.

In theory, one could price any European derivative security by solving this pde by transforming it into a standard initial value (IV) problem for the heat equation. For example, setting $V(x,t) = e^{-r\tau} u(y,\tau)$ with $y = \log x + (r - \tfrac12\sigma^2)\tau$ and τ = T − t, the pde for u(y,τ) becomes $u_\tau = \tfrac12\sigma^2 u_{yy}$ with initial value $u(y,0) = f(e^y)$. Such problems can be solved by Green’s Function methods, as indeed employed by BS in their original paper. However, we shall adopt a somewhat more sophisticated approach, which saves considerable effort when pricing more complex exotic (i.e. non-vanilla) options. Our approach requires more technical machinery, considered in the next section.
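The reduction to the heat equation quoted above can be verified by writing out the intermediate steps. The following is a sketch of the standard chain rule calculation, not a reproduction of the original BS argument. With $V(x,t) = e^{-r\tau}u(y,\tau)$, $y = \log x + (r - \tfrac12\sigma^2)\tau$ and $\tau = T - t$, one finds

$$\begin{aligned} V_t &= e^{-r\tau}\left[r u - u_\tau - (r - \tfrac12\sigma^2)\, u_y\right], \\ x V_x &= e^{-r\tau}\, u_y, \\ x^2 V_{xx} &= e^{-r\tau}\left(u_{yy} - u_y\right). \end{aligned}$$

Substituting these expressions into (22.4) and cancelling the common factor $e^{-r\tau}$ leaves

$$r u - u_\tau - (r - \tfrac12\sigma^2)\, u_y = r u - r u_y - \tfrac12\sigma^2\left(u_{yy} - u_y\right) \quad\Longrightarrow\quad u_\tau = \tfrac12\sigma^2 u_{yy}.$$

At τ = 0 we have y = log x, so the expiry condition V(x,T) = f(x) translates into the initial value $u(y,0) = f(e^y)$.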

22.2.3 The Equivalent Martingale Measure

Mathematically, we say that a stochastic process $Y_t$ is a martingale with respect to a measure Q and a filtration $F_t$ (think of $F_t$ as all the information available up to time t), if $Y_t = E_Q\{Y_s \mid F_t\}$ for all t ≤ s. For example, it is well known that if $B_t$ is a standard Brownian motion, then $B_t$, $B_t^2 - t$ and $\exp\{\sigma B_t - \tfrac12\sigma^2 t\}$ are all martingales.

Definition 22.1. The Equivalent Martingale Measure (EMM) is a measure Q, equivalent¹ to the real world measure P, under which the discounted asset price $\tilde X_t = X_t e^{-rt}$ is a martingale.

This means there exists a measure Q, such that $X_t = e^{-r(s-t)} E_Q\{X_s \mid F_t\}$. The First Fundamental Theorem of Asset Pricing (see (Harrison and Pliska 1981) for a formal proof) tells us that this is not only true for the asset price, but for any derivative security as well.

Theorem 22.2 (First Fundamental Theorem). If the market is arbitrage free, the discounted price of any derivative security is a martingale under the Equivalent Martingale Measure.

This theorem immediately leads to a pricing formula for the derivative, since $V(X_t, t) = e^{-r(s-t)} E_Q\{V(X_s, s) \mid F_t\}$ implies, with s = T and $V(X_T, T) = f(X_T)$,

$$V(x,t) = e^{-r(T-t)}\, E_Q\{f(X_T) \mid X_t = x\} \tag{22.5}$$

Note that since Itô processes are Markov, conditioning on $F_t$ is equivalent to conditioning on $X_t = x$.

¹ Measures P and Q are equivalent if they have the same null sets. That is, P(A) = 0 ⇔ Q(A) = 0.


The Second Fundamental Theorem of Asset Pricing states that if the market is complete, then Q is unique. Market completeness means that it is possible to dynamically replicate the riskless security (via dynamic hedging) with any derivative and its underlying asset. Indeed one can also dynamically replicate any derivative, using just the underlying asset and a riskless security.

Under Q, which is also known as the Risk Neutral Measure, the asset price follows gBm with drift rate r. That is, $dX_t = X_t(r\, dt + \sigma\, dB_t^Q)$. However, we saw in (22.1) that under the real-world measure P, $dX_t = X_t(\mu\, dt + \sigma\, dB_t^P)$. This does not mean that the stock price earns the risk-free rate in the real economy. It does however mean that when pricing derivatives in an arbitrage free way, stock prices must be transformed to the new measure Q, under which they do earn the risk-free rate. The asset price sde’s under the two measures are compatible if and only if the two Brownian motions are related by

$$dB_t^Q = dB_t^P + \left(\frac{\mu - r}{\sigma}\right) dt \tag{22.6}$$

That such a relationship exists is the result of another well known theorem in stochastic calculus known as Girsanov’s Theorem (e.g. see (Klebaner 1998)). We shall not explore this theorem here, but point out that from a practical point of view, we could have simply replaced μ by r at the outset: a rule sometimes called the Correspondence Principle. Under Q, we can therefore replace (22.3) for the asset price by

$$X_T = x\exp\{(r - \tfrac12\sigma^2)\tau + \sigma\sqrt{\tau}\, Z\}; \qquad \tau = T - t \tag{22.7}$$

where $Z = Z_t^Q \stackrel{d}{=} N(0,1)$.

While the EMM approach leading to the representation (22.5) for the derivative price was originally obtained through financial arguments, it could also have been obtained directly from the Feynman-Kac (FK) formula (e.g. see (Friedman 1975)). Basically, the FK formula expresses the solution of diffusion equations of the type (22.4) as a terminal value expectation, with respect to a measure determined from the transition probability density of a certain stochastic process. This measure is precisely the EMM of the First Fundamental Theorem. In fact, it is a simple matter to show by direct substitution that (22.5), with $X_T$ given by (22.7), satisfies the pde (22.4).

The following is a useful and powerful pricing adjunct.

Theorem 22.3 (Principle of Static Replication). If the payoff of a European style derivative can be expressed as a portfolio of elementary contracts, then the arbitrage free price of the derivative is the present value of this portfolio.

Proof. While this theorem is virtually self-evident, the proof follows from either arbitrage principles (the Law of One Price) or from the linearity of both the BS-pde (22.4) and the expectation (22.5).


We remark that static replication is quite different from dynamic replication used in deriving the BS-pde. In static replication, the portfolio is created once at time t and then held to its expiry date. No continuous re-balancing to eliminate risk is required. Both methods however, have their origins in the concept of no arbitrage and both are used as pricing tools.
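The pricing formula (22.5) with $X_T$ sampled from (22.7) translates directly into a Monte Carlo estimate, and the linearity behind static replication can be observed on the same samples. The sketch below is an illustration under stated assumptions: the parameter values are arbitrary, the function names are hypothetical, and the closed-form Black-Scholes call price (which has not been derived at this point in the chapter) is used only as an external check.

```python
import math
import random


def emm_prices(x, k, r, sigma, tau, n_paths, rng):
    """Monte Carlo version of (22.5) for a call, a put and the asset itself."""
    disc = math.exp(-r * tau)
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * math.sqrt(tau)
    sums = {"call": 0.0, "put": 0.0, "asset": 0.0}
    for _ in range(n_paths):
        x_T = x * math.exp(drift + vol * rng.gauss(0.0, 1.0))  # sample of (22.7)
        sums["call"] += max(x_T - k, 0.0)
        sums["put"] += max(k - x_T, 0.0)
        sums["asset"] += x_T
    return {name: disc * total / n_paths for name, total in sums.items()}


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(x, k, r, sigma, tau):
    """Standard Black-Scholes call price without dividends (check only)."""
    d1 = (math.log(x / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return x * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


x, k, r, sigma, tau = 100.0, 105.0, 0.05, 0.2, 1.0
prices = emm_prices(x, k, r, sigma, tau, 50000, random.Random(3))
replicating = prices["asset"] - k * math.exp(-r * tau)   # long asset, short k bonds
print("call:", prices["call"], "closed form:", bs_call(x, k, r, sigma, tau))
print("call - put:", prices["call"] - prices["put"], "asset - k bonds:", replicating)
print("discounted asset:", prices["asset"], "spot:", x)
```

Since the payoff identity (x − k)⁺ − (k − x)⁺ = x − k holds for every sample, the estimated call minus put equals the estimated value of the static portfolio up to rounding, while the discounted asset estimate agrees with the spot price only up to sampling error, reflecting the martingale property under Q.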

22.2.4 Effect of Dividends

We have tended to think of the underlying asset as an equity or common stock. In practice, many equities pay dividends which have so far been ignored. While the payment of dividends is a subject of some complexity, we shall adopt a very simple approach and assume dividends are paid continuously at a constant yield q. It transpires that all the previous analysis still applies, not to the asset price $X_t$, but rather to the dividend corrected asset price $Y_t = X_t e^{-qt}$. The BS pde is modified to

$$V_t = rV - (r - q)\, xV_x - \tfrac12\sigma^2 x^2 V_{xx} \tag{22.8}$$

The EMM formula (22.5) still applies, but the expression (22.7) for $X_T$ must now be replaced by

$$X_T = x\exp\{(r - q - \tfrac12\sigma^2)\tau + \sigma\sqrt{\tau}\, Z\} \tag{22.9}$$

There is another benefit for including dividends. While strictly related only to the equity market, the presence of dividends allows us, with only a slight change in interpretation, to apply all the formulae derived in this chapter to other markets as well. Thus we capture the futures market with the choice X t = futures prices and q = r ; the foreign exchange market with X t = exchange rate and r = rd , q = rf (the domestic and foreign risk free rates); the commodities market with X t = commodity price and q = the convenience yield; and of course the original BS equity market with q = 0 . We shall assume constant dividends as described, for the remainder of this chapter.

22.3 Pricing Simple European Derivatives

We show here how to price European derivatives with fairly simple payoff functions f ( x) . First, we price some elementary building block contracts called bond and asset binaries. These are also known as bond and asset digitals. Then we show how some popular traded contracts, such as vanilla calls and puts, can be expressed as portfolios of these elementary contracts. We finally invoke the Principle of Static Replication to price the calls and puts.


Peter Buchen

22.3.1 Asset and Bond Binaries

Probably the simplest non-trivial contract is a zero coupon bond (ZCB) which pays one unit of cash (a nominal $1 say) at a fixed time T in the future. This payment is made with certainty, regardless of the economic ‘states of the world’ at time T. In the BS framework, we may regard the ZCB as a derivative with expiry payoff \(f(x) = 1\). Whether one uses the PDE method or the EMM method, the present value (pv) of the bond is found to be \(B(x,t) = e^{-r\tau}\) with \(\tau = T - t\). Not surprisingly, this price is independent of the underlying asset price x. Another simple contract is one which pays one unit of the asset (e.g. one share of stock) at T, corresponding to an expiry payoff \(f(x) = x\). Again, it is a simple matter to derive the present value of this contract as \(A(x,t) = x e^{-q\tau}\).

Of particular interest in pricing more complex derivatives are the associated up (if s = +) and down (if s = −) type asset and bond^2 binaries, with time T payoffs^3

\[ \left.\begin{aligned} \text{Asset binary:}\quad A^s_\xi(x,T) &= x\,I(sx > s\xi) \\ \text{Bond binary:}\quad B^s_\xi(x,T) &= I(sx > s\xi) \end{aligned}\right\} \tag{22.10} \]

The parameter ξ > 0 denotes an exercise price. In the case of the asset binary, the holder receives one unit of the asset at time T, but only if the asset is above (or below) the exercise price. For the bond binary, the payout is $1 rather than one unit of asset. It is a greater challenge to price these contracts, which are fundamental building blocks for many complex derivatives. Before doing so, we define two functions which play a ubiquitous role in BS option pricing. They are

\[ [d_\xi, d'_\xi](x,\tau) = \frac{\log(x/\xi) + (r - q \pm \tfrac{1}{2}\sigma^2)\tau}{\sigma\sqrt{\tau}} \tag{22.11} \]

Observe that \(d'_\xi(x,\tau) = d_\xi(x,\tau) - \sigma\sqrt{\tau}\). Starting with the up-type bond binary, using (22.5), we can write its pv as

\[ B^+_\xi(x,t) = e^{-r\tau} E\{I(X_T > \xi)\} \]

But from (22.9), \(X_T > \xi\) implies \(Z > -d'_\xi\), which leads to

\[ B^+_\xi(x,t) = e^{-r\tau} E\{I(Z > -d'_\xi)\} \]

Since \(Z \sim N(0,1)\), it has normal cumulative distribution function

\[ N(d) = E\{I(Z < d)\} = E\{I(Z > -d)\} \]

2 Bond binaries are also known as cash binaries or cash digitals.
3 The function I(x > ξ) is the indicator function, equal to 1 if x > ξ, and equal to zero otherwise.


Symmetry allows us to replace Z by −Z in any expectation. The function N(d) is related to the standard error function by \(N(d) = \tfrac{1}{2}[1 + \mathrm{erf}(d/\sqrt{2})]\). Hence, we obtain \(B^+_\xi(x,t) = e^{-r\tau} N(d'_\xi)\) and similarly, \(B^-_\xi(x,t) = e^{-r\tau} N(-d'_\xi)\). We express both results in a single formula as

\[ B^s_\xi(x,t) = e^{-r\tau} N(s\,d'_\xi) \tag{22.12} \]

The calculation for the up-asset binary follows along similar lines, and results in the representation

\[ A^+_\xi(x,t) = x\,e^{(-q - \frac{1}{2}\sigma^2)\tau}\, E\{e^{\sigma\sqrt{\tau}\,Z}\, I(Z > -d'_\xi)\} \]

Calculations of expectations of the above type are greatly facilitated by the following.

Theorem 22.4 (Uni-variate Gaussian Shift Theorem). If \(Z \sim N(0,1)\) is a standard Gaussian rv with zero mean and unit variance, c is any real constant and F(z) a measurable function, then

\[ E\{e^{cZ} F(Z)\} = e^{\frac{1}{2}c^2}\, E\{F(Z + c)\} \tag{22.13} \]
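As a quick numerical sanity check of the shift identity \(E\{e^{cZ}F(Z)\} = e^{c^2/2}E\{F(Z+c)\}\), the following Python sketch evaluates both sides by Simpson's rule. The helper names `phi` and `expect`, the test function `F`, the constant c = 0.7 and the truncation of the integral to [-12, 12] are all illustrative assumptions, not part of the text.

```python
import math

def phi(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def expect(g, lo=-12.0, hi=12.0, n=4000):
    """Approximate E{g(Z)}, Z ~ N(0,1), by composite Simpson's rule (n even)."""
    h = (hi - lo) / n
    total = g(lo) * phi(lo) + g(hi) * phi(hi)
    for i in range(1, n):
        z = lo + i * h
        total += (4 if i % 2 else 2) * g(z) * phi(z)
    return total * h / 3.0

def F(z):
    # Arbitrary smooth test function (an assumption of this sketch).
    return math.tanh(z) + z * z

c = 0.7
lhs = expect(lambda z: math.exp(c * z) * F(z))            # E{e^{cZ} F(Z)}
rhs = math.exp(0.5 * c * c) * expect(lambda z: F(z + c))  # e^{c^2/2} E{F(Z + c)}
print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error; a smooth F is used so that Simpson's rule converges quickly.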

Proof. The proof follows immediately from the identity \(\Phi(x - c) = e^{cx - \frac{1}{2}c^2}\,\Phi(x)\), where \(\Phi(x) = e^{-\frac{1}{2}x^2}/\sqrt{2\pi}\) is the probability density function (pdf) of the standard normal.

It follows from the Gaussian Shift Theorem (GST), with \(c = \sigma\sqrt{\tau}\), that

\[ A^+_\xi(x,t) = x e^{-q\tau} E\{I(Z + \sigma\sqrt{\tau} > -d'_\xi)\} = x e^{-q\tau} N(d_\xi) \]

Similarly, we find \(A^-_\xi(x,t) = x e^{-q\tau} N(-d_\xi)\), so as a single formula we obtain

\[ A^s_\xi(x,t) = x e^{-q\tau} N(s\,d_\xi) \tag{22.14} \]
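The two binary prices can be sketched in a few lines of Python. The sketch implements the asset binary \(x e^{-q\tau}N(s\,d_\xi)\) and the bond binary \(e^{-r\tau}N(s\,d'_\xi)\), the latter being the form implied by the discounted expectation derived above. The function names and the market data are hypothetical, chosen only for illustration.

```python
import math

def norm_cdf(d):
    """Standard normal cdf N(d), via the error function."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def d_pair(x, xi, r, q, sigma, tau):
    """Return (d, d') as defined in (22.11)."""
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    return d, d - vol

def asset_binary(x, xi, r, q, sigma, tau, s=+1):
    """Up (s=+1) or down (s=-1) asset binary: x e^{-q tau} N(s d)."""
    d, _ = d_pair(x, xi, r, q, sigma, tau)
    return x * math.exp(-q * tau) * norm_cdf(s * d)

def bond_binary(x, xi, r, q, sigma, tau, s=+1):
    """Up (s=+1) or down (s=-1) bond binary: e^{-r tau} N(s d')."""
    _, d_prime = d_pair(x, xi, r, q, sigma, tau)
    return math.exp(-r * tau) * norm_cdf(s * d_prime)

# Illustrative (made-up) market data.
x, xi, r, q, sigma, tau = 100.0, 105.0, 0.05, 0.02, 0.25, 0.75
a_up = asset_binary(x, xi, r, q, sigma, tau, +1)
a_dn = asset_binary(x, xi, r, q, sigma, tau, -1)
b_up = bond_binary(x, xi, r, q, sigma, tau, +1)
b_dn = bond_binary(x, xi, r, q, sigma, tau, -1)
print(a_up + a_dn, x * math.exp(-q * tau))  # up + down asset binaries
print(b_up + b_dn, math.exp(-r * tau))      # up + down bond binaries
```

Each printed pair should coincide: an up and a down binary together pay the asset (or $1) with certainty.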

This completes the derivation of the arbitrage free prices of asset and bond binaries. Later we shall meet second-order versions of these binaries, which can be priced using a bi-variate version of the Gaussian Shift Theorem.

22.3.1.1 Parity Relations

From (22.10) it is obviously true that \(A^+_\xi(x,T) + A^-_\xi(x,T) = x\) and \(B^+_\xi(x,T) + B^-_\xi(x,T) = 1\), so by the Principle of Static Replication, the up-down parity relations for asset and bond binaries,

\[ \left.\begin{aligned} A^+_\xi(x,t) + A^-_\xi(x,t) &= x e^{-q\tau} \\ B^+_\xi(x,t) + B^-_\xi(x,t) &= e^{-r\tau} \end{aligned}\right\} \tag{22.15} \]

will be valid for all t ≤ T. Parity relations such as these pervade the subject of exotic option pricing, so it is well to keep them in mind, both to avoid unnecessary calculations and to simplify the mathematical formulae obtained.

22.3.2 European Calls and Puts

Recall that a European call option of strike price k gives its holder the right to buy one unit of asset at time T for k dollars. The payoff for the contract is \(f(x) = (x - k)^+ = \max(x - k, 0)\). The plus function captures the ‘option’ part of the contract: the holder is not obliged to buy the asset. Indeed, if at expiry the asset price satisfies x < k, then the holder will not exercise the option and the contract will expire worthless. It follows that the payoff has the equivalent representation

\[ f(x) = (x - k)\,I(x > k) = x\,I(x > k) - k\,I(x > k) \]

This payoff is the same as that of a portfolio containing one up-type asset binary and short k up-type bond binaries. From (22.10),

\[ f(x) = A^+_k(x,T) - k B^+_k(x,T) \]

The Principle of Static Replication then tells us that a European call option can be statically replicated at any time before expiry by a portfolio of asset and bond binaries. Holding the portfolio is financially equivalent to holding the call option. It follows that the present value of the call option, for all t < T, is given by \(C_k(x,t) = A^+_k(x,t) - k B^+_k(x,t)\), and by virtue of (22.12) and (22.14), this can be expressed as

\[ C_k(x,t) = x e^{-q\tau} N(d_k) - k e^{-r\tau} N(d'_k) \tag{22.16} \]

where \(d_k, d'_k\) are given by (22.11) with ξ = k. This is the celebrated BS formula for the price of a European call option. The corresponding formula for a European put option is derived from its payoff function

\[ P_k(x,T) = (k - x)^+ = k B^-_k(x,T) - A^-_k(x,T) \]

yielding the result

\[ P_k(x,t) = k e^{-r\tau} N(-d'_k) - x e^{-q\tau} N(-d_k) \tag{22.17} \]

for all t < T. The asset and bond binary parity relations (22.15) also imply the well known put-call parity relation

\[ C_k(x,t) - P_k(x,t) = x e^{-q\tau} - k e^{-r\tau} \tag{22.18} \]
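The static replication argument translates directly into code: build the four first-order binaries at strike k and combine them. The Python sketch below does this and checks put-call parity; the function name `bs_call_put` and the sample parameters are illustrative assumptions.

```python
import math

def norm_cdf(d):
    """Standard normal cdf N(d)."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def bs_call_put(x, k, r, q, sigma, tau):
    """European call and put as portfolios of asset and bond binaries."""
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / k) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    d_prime = d - vol
    asset_up = x * math.exp(-q * tau) * norm_cdf(d)
    asset_dn = x * math.exp(-q * tau) * norm_cdf(-d)
    bond_up = math.exp(-r * tau) * norm_cdf(d_prime)
    bond_dn = math.exp(-r * tau) * norm_cdf(-d_prime)
    call = asset_up - k * bond_up   # long one asset binary, short k bond binaries
    put = k * bond_dn - asset_dn    # long k bond binaries, short one asset binary
    return call, put

# Illustrative inputs: at-the-money, one year, no dividends.
x, k, r, q, sigma, tau = 100.0, 100.0, 0.05, 0.0, 0.2, 1.0
call, put = bs_call_put(x, k, r, q, sigma, tau)
print(round(call, 4), round(put, 4))
print(call - put, x * math.exp(-q * tau) - k * math.exp(-r * tau))  # put-call parity
```

For these inputs the call is worth roughly 10.45 and the put roughly 5.57, and the last line shows both sides of the put-call parity relation.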

When we need to specify the expiry date T of the option explicitly, we shall use a third parameter and write \(C_k(x,t;T)\) and \(P_k(x,t;T)\) for the call and put price. We also apply this rule to other options, e.g. \(A^s_k(x,t;T)\) and \(B^s_k(x,t;T)\).


22.3.2.1 Threshold Calls and Puts

These contracts are similar to standard calls and puts, except the strike price k differs from the exercise price h. The payoffs at expiry are

\[ C_{h,k}(x,T) = (x - k)\,I(x > h) \quad\text{and}\quad P_{h,k}(x,T) = (k - x)\,I(x < h) \tag{22.19} \]

Normally, for the threshold call option h > k, while for the threshold put option h < k. Static replication immediately determines their present values, which can be expressed in terms of asset and bond binaries as

\[ \left.\begin{aligned} C_{h,k}(x,t) &= A^+_h(x,t) - k B^+_h(x,t) \\ P_{h,k}(x,t) &= k B^-_h(x,t) - A^-_h(x,t) \end{aligned}\right\} \tag{22.20} \]

The corresponding parity relation is

\[ C_{h,k}(x,t) - P_{h,k}(x,t) = x e^{-q\tau} - k e^{-r\tau} \tag{22.21} \]

Note that the right hand side is independent of the exercise price h.

22.4 Dual-Expiry Options

There are many exotic options whose payoffs depend on more than just a single asset price at a single future expiry date, as assumed in the standard BS framework. Examples include compound and chooser options whose payoffs depend on a single asset price at two future dates, and so-called two asset rainbow options whose payoffs depend on two asset prices at a single future date. We consider in this section examples of the former type and consider the two-asset case in the next section. There are strong similarities between the two types.

22.4.1 Second-order Binaries

We first price some further elementary binary contracts and use these to price the exotics by static replication. However, the binary contracts needed here are somewhat more complicated.

Definition 22.2 (Second-order asset and bond binaries). Let \(T_1\) and \(T_2\) with \(T_1 < T_2\) be two future exercise dates. A second-order asset binary is a contract that pays at time \(T_2\) one unit of the underlying asset, but only if \(x_1 = X_{T_1}\) is above (or below) a given exercise price \(\xi_1\) and \(x_2 = X_{T_2}\) is above (or below) a given exercise price \(\xi_2\). A second-order bond binary is identical except that it pays $1 at time \(T_2\) rather than one unit of asset.


There are four types of second-order binaries: up-up, up-down, down-up and down-down, depending on whether exercise at \(T_1\) and \(T_2\) is respectively above or below the given price levels. Let \(s_{1,2}\) be sign indicators (±) for these four types and \(\xi_{1,2}\) the exercise prices at \(T_{1,2}\). Then the expiry \(T_2\) payoffs are respectively

\[ \begin{aligned} A^{s_1 s_2}_{\xi_1 \xi_2}(x_1, x_2; T_2) &= x_2\, I(s_1 x_1 > s_1 \xi_1)\, I(s_2 x_2 > s_2 \xi_2) \\ B^{s_1 s_2}_{\xi_1 \xi_2}(x_1, x_2; T_2) &= I(s_1 x_1 > s_1 \xi_1)\, I(s_2 x_2 > s_2 \xi_2) \end{aligned} \]

Since \(x_1\) is actually known (and remains fixed) for all \(T_1 \le t \le T_2\), we can also write down the equivalent payoffs at time \(T_1\). They are

\[ \begin{aligned} A^{s_1 s_2}_{\xi_1 \xi_2}(x_1, T_1) &= A^{s_2}_{\xi_2}(x_1, T_1; T_2)\, I(s_1 x_1 > s_1 \xi_1) \\ B^{s_1 s_2}_{\xi_1 \xi_2}(x_1, T_1) &= B^{s_2}_{\xi_2}(x_1, T_1; T_2)\, I(s_1 x_1 > s_1 \xi_1) \end{aligned} \]

Second-order binaries are compound binaries, i.e. they are binaries of first-order binary contracts.

Theorem 22.5. The present values, for all \(t < T_1 < T_2\), of second-order asset and bond binaries are given respectively by

\[ \left.\begin{aligned} A^{s_1 s_2}_{\xi_1 \xi_2}(x,t) &= x e^{-q\tau_2}\, N(s_1 d_1, s_2 d_2; s_1 s_2 \rho) \\ B^{s_1 s_2}_{\xi_1 \xi_2}(x,t) &= e^{-r\tau_2}\, N(s_1 d'_1, s_2 d'_2; s_1 s_2 \rho) \end{aligned}\right\} \tag{22.22} \]

where \(\rho = \sqrt{\tau_1/\tau_2}\), \(\tau_i = T_i - t\), \(N(d, d'; \rho)\) is the bi-variate cumulative normal distribution function with correlation coefficient ρ, and for i = 1, 2

\[ [d_i, d'_i] = \frac{\log(x/\xi_i) + (r - q \pm \tfrac{1}{2}\sigma^2)\tau_i}{\sigma\sqrt{\tau_i}} \]

Proof. We present only an outline for the up-up type asset binary. Its present value is given by

\[ A^{++}_{\xi_1 \xi_2}(x,t) = e^{-r\tau_2}\, E\{X_2\, I(X_1 > \xi_1)\, I(X_2 > \xi_2)\} \]

where, from (22.9),

\[ X_i = x \exp\{(r - q - \tfrac{1}{2}\sigma^2)\tau_i + \sigma\sqrt{\tau_i}\, Z_i\}; \qquad Z_i \sim N(0,1;\rho) \]

The correlation coefficient is easily derived from the covariance structure of Brownian motion considered in section 22.1. To calculate the above expectation we apply a bi-variate version of the Gaussian Shift Theorem (see below) and employ the bi-variate normal definition

\[ N(a, b; \rho) = E\{I(Z_1 < a)\, I(Z_2 < b)\}; \qquad Z_{1,2} \sim N(0,1;\rho) \]


Theorem 22.6 (Multi-variate Gaussian Shift Theorem). If \(Z \sim N(0,1;R)\) is a standard Gaussian random vector with zero means, unit variances and correlation matrix R, c is any real constant vector and F(Z) a measurable function, then

\[ E\{e^{c^{\top} Z} F(Z)\} = e^{\frac{1}{2} c^{\top} R c}\, E\{F(Z + Rc)\} \tag{22.23} \]

We do not present a proof, which in any case is a straightforward extension of the uni-variate GST (Theorem 22.4). We might also remark that the multi-variate Gaussian Shift Theorem above is the key to pricing many multi-period, multi-asset exotic options, a topic beyond the scope of this exposition. For the bi-variate case under consideration, recall that the correlation matrix has the simple two-dimensional representation

\[ R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \]

22.4.1.1 Parity Relations

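Second-order asset and bond binaries satisfy the parity relations

\[ \left.\begin{aligned} A^{+ s_2}_{\xi_1 \xi_2}(x,t) + A^{- s_2}_{\xi_1 \xi_2}(x,t) &= A^{s_2}_{\xi_2}(x,t;T_2) \\ B^{+ s_2}_{\xi_1 \xi_2}(x,t) + B^{- s_2}_{\xi_1 \xi_2}(x,t) &= B^{s_2}_{\xi_2}(x,t;T_2) \end{aligned}\right\} \tag{22.24} \]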

The terms on the right are first-order asset and bond binaries which expire at time T2 .
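A minimal Python sketch of the second-order binary prices follows, together with a numerical check that the up and down types at \(T_1\) sum to the first-order binary expiring at \(T_2\). The bi-variate normal cdf is approximated here by Simpson's rule applied to the conditional representation \(N(a,b;\rho) = \int_{-\infty}^{a} \phi(z)\, N((b - \rho z)/\sqrt{1-\rho^2})\,dz\); this quadrature, the truncation at ±10, all function names and the sample data are assumptions of the sketch rather than anything prescribed by the text.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def bvn_cdf(a, b, rho, n=2000):
    """Bi-variate normal cdf N(a, b; rho), |rho| < 1, by Simpson's rule."""
    lo = -10.0
    if a <= lo:
        return 0.0
    hi = min(a, 10.0)
    s = math.sqrt(1.0 - rho * rho)
    g = lambda z: phi(z) * norm_cdf((b - rho * z) / s)
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0

def d_pair(x, xi, r, q, sigma, tau):
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    return d, d - vol

def second_order_binaries(x, xi1, xi2, r, q, sigma, tau1, tau2, s1, s2):
    """Return (asset, bond) second-order binary prices as in (22.22)."""
    rho = math.sqrt(tau1 / tau2)
    d1, d1p = d_pair(x, xi1, r, q, sigma, tau1)
    d2, d2p = d_pair(x, xi2, r, q, sigma, tau2)
    asset = x * math.exp(-q * tau2) * bvn_cdf(s1 * d1, s2 * d2, s1 * s2 * rho)
    bond = math.exp(-r * tau2) * bvn_cdf(s1 * d1p, s2 * d2p, s1 * s2 * rho)
    return asset, bond

# Illustrative (made-up) data; s2 = +1 throughout.
x, xi1, xi2, r, q, sigma, tau1, tau2 = 100.0, 95.0, 105.0, 0.05, 0.02, 0.25, 0.5, 1.0
a_uu, b_uu = second_order_binaries(x, xi1, xi2, r, q, sigma, tau1, tau2, +1, +1)
a_du, b_du = second_order_binaries(x, xi1, xi2, r, q, sigma, tau1, tau2, -1, +1)
d2, d2p = d_pair(x, xi2, r, q, sigma, tau2)
first_order_asset = x * math.exp(-q * tau2) * norm_cdf(d2)
first_order_bond = math.exp(-r * tau2) * norm_cdf(d2p)
print(a_uu + a_du, first_order_asset)
print(b_uu + b_du, first_order_bond)
```

A production implementation would normally use a dedicated bi-variate normal routine; the quadrature here is only meant to keep the example self-contained.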

22.4.2 Binary Calls and Puts

It is time to extend our list of elementary contracts. A second-order binary call is a contract which gives its holder at time \(T_1\) a strike k, expiry \(T_2\) European call option, but only if the asset price at \(T_1\) is above (for an up-type) or below (for a down-type) a given exercise price h. The payoff at time \(T_1\) is therefore

\[ C^s_{h,k}(x_1, T_1) = C_k(x_1, T_1; T_2)\, I(s x_1 > s h) \tag{22.25} \]

and s = ± as usual. However, since we know the payoff of the European call option at time \(T_2\) is \((x_2 - k)^+\), we get the alternative binary call payoff at \(T_2\) as

\[ \begin{aligned} C^s_{h,k}(x_2, T_2) &= (x_2 - k)^+\, I(s x_1 > s h) = (x_2 - k)\, I(x_2 > k)\, I(s x_1 > s h) \\ &= A^{s+}_{hk}(x_2, T_2) - k B^{s+}_{hk}(x_2, T_2) \end{aligned} \]

That is, a portfolio of second-order asset and bond binaries. It follows from the Principle of Static Replication that for any time \(t < T_1 < T_2\),

\[ C^s_{h,k}(x,t) = A^{s+}_{hk}(x,t) - k B^{s+}_{hk}(x,t) \tag{22.26} \]

A second-order binary put can be obtained in the same way and is given by

\[ P^s_{h,k}(x,t) = k B^{s-}_{hk}(x,t) - A^{s-}_{hk}(x,t) \tag{22.27} \]

22.4.2.1 Parity Relations

For all \(t < T_1 < T_2\),

\[ \left.\begin{aligned} C^+_{h,k}(x,t) + C^-_{h,k}(x,t) &= C_k(x,t;T_2) \\ P^+_{h,k}(x,t) + P^-_{h,k}(x,t) &= P_k(x,t;T_2) \end{aligned}\right\} \tag{22.28} \]

22.4.3 Compound Options

There are four basic compound options, generally referred to as call–on–call, call–on–put, put–on–call and put–on–put. We show how to price the first of these by representing its equivalent time \(T_1\) payoff as a portfolio of elementary contracts. To fix the parameters, assume the holder has the right to buy, for $a at time \(T_1\), a call option of strike price k and expiry \(T_2\). Then the payoff at time \(T_1\) is given by

\[ V^{cc}(x, T_1) = [C_k(x, T_1; T_2) - a]^+ \tag{22.29} \]

To proceed further we need to ‘remove’ the plus function. This is done as follows. Let x = h solve the equation \(C_k(x, T_1; T_2) = a\). Since \(C_k(x)\) is a positive, monotonic increasing function of x, the equation has a unique solution. This value of h depends only on the fixed parameters \((a, k, T_1, T_2)\) and not on the current time. Hence the calculation, which must be carried out numerically in practice, need only be done once. We can then rewrite the time \(T_1\) payoff as

\[ V^{cc}(x, T_1) = [C_k(x, T_1; T_2) - a]\, I(x > h) \]

since for x > h the holder would exercise the compound option, while for x < h the compound option would simply expire worthless. The payoff is now seen from (22.25) to be a portfolio of two elementary contracts: an up-type second-order binary call and a short position in \(a\) up-type first-order bond binaries. The Principle of Static Replication allows us to write down the present value immediately:

\[ V^{cc}(x,t) = C^+_{h,k}(x,t) - a B^+_h(x,t;T_1) \tag{22.30} \]
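The recipe above (solve once for h, then combine an up-type binary call with a short position in bond binaries) can be sketched in Python as follows. Bisection for h, the Simpson quadrature used for the bi-variate normal cdf, the bracket [1e-8, 1e8], all function names and the sample inputs are assumptions of this sketch, not the author's implementation.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def bvn_cdf(a, b, rho, n=2000):
    """N(a, b; rho) by Simpson's rule on int phi(z) N((b - rho z)/sqrt(1 - rho^2)) dz."""
    lo = -10.0
    if a <= lo:
        return 0.0
    hi = min(a, 10.0)
    s = math.sqrt(1.0 - rho * rho)
    g = lambda z: phi(z) * norm_cdf((b - rho * z) / s)
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0

def d_pair(x, xi, r, q, sigma, tau):
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    return d, d - vol

def bs_call(x, k, r, q, sigma, tau):
    d, dp = d_pair(x, k, r, q, sigma, tau)
    return x * math.exp(-q * tau) * norm_cdf(d) - k * math.exp(-r * tau) * norm_cdf(dp)

def critical_price(a, k, r, q, sigma, tau12):
    """Solve C_k(h, T1; T2) = a for h by bisection (the call price increases in h)."""
    lo, hi = 1e-8, 1e8  # assumed bracket for the root
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(mid, k, r, q, sigma, tau12) > a:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def call_on_call(x, a, k, r, q, sigma, tau1, tau2):
    """Up-type binary call minus a up-type bond binaries expiring at T1."""
    h = critical_price(a, k, r, q, sigma, tau2 - tau1)
    rho = math.sqrt(tau1 / tau2)
    dh, dhp = d_pair(x, h, r, q, sigma, tau1)
    dk, dkp = d_pair(x, k, r, q, sigma, tau2)
    binary_call = (x * math.exp(-q * tau2) * bvn_cdf(dh, dk, rho)
                   - k * math.exp(-r * tau2) * bvn_cdf(dhp, dkp, rho))
    bond_binary = math.exp(-r * tau1) * norm_cdf(dhp)
    return binary_call - a * bond_binary

# Illustrative (made-up) inputs with t = 0.
x, a, k, r, q, sigma, tau1, tau2 = 100.0, 5.0, 100.0, 0.05, 0.02, 0.2, 0.5, 1.0
v = call_on_call(x, a, k, r, q, sigma, tau1, tau2)
print(v, bs_call(x, k, r, q, sigma, tau2))
```

Since the payoff \([C_k - a]^+\) never exceeds \(C_k\), the compound price should lie strictly between zero and the price of the underlying call.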


When we write this price out in full, using Gaussian distribution functions, we get

\[ V^{cc}(x,t) = x e^{-q\tau_2} N(d_h, d_k; \rho) - k e^{-r\tau_2} N(d'_h, d'_k; \rho) - a e^{-r\tau_1} N(d'_h) \tag{22.31} \]

where, as in Theorem 22.5, \(d_h, d'_h\) are evaluated with \(\tau_1\) and \(d_k, d'_k\) with \(\tau_2\). This formula is in complete agreement with that first published in (Geske 1979). Formulae for the other compound options follow in similar fashion. We also note the following easily derived parity relations

\[ \left.\begin{aligned} V^{cc}(x,t;T_1,T_2) - V^{pc}(x,t;T_1,T_2) &= C_k(x,t;T_2) - a e^{-r(T_1 - t)} \\ V^{cp}(x,t;T_1,T_2) - V^{pp}(x,t;T_1,T_2) &= P_k(x,t;T_2) - a e^{-r(T_1 - t)} \end{aligned}\right\} \tag{22.32} \]

22.4.4 Chooser Options

Also known as ‘as-you-like-it’ options and first analysed in (Rubinstein 1991), these exotics give the holder at time \(T_1\) the choice of either a European call option (of strike k, expiry \(T_2\)) or a European put option (of strike h, expiry \(T_2\)). The payoff at time \(T_1\) is therefore

\[ V(x, T_1) = \max\{C_k(x, T_1; T_2),\, P_h(x, T_1; T_2)\} \]

since the holder will choose the option with the greater payoff. Let x = a be the unique solution of the equation \(C_k(x, T_1; T_2) = P_h(x, T_1; T_2)\). The solution is unique since \(C_k\) is a monotonic increasing function of x, while \(P_h\) is a monotonic decreasing function of x. Clearly the holder will choose the call option if x > a, and the put option if x < a. We therefore re-write the payoff in the equivalent form

\[ V(x, T_1) = C_k(x, T_1; T_2)\, I(x > a) + P_h(x, T_1; T_2)\, I(x < a) \tag{22.33} \]

The Principle of Static Replication determines the present value of the chooser option as

\[ V(x,t) = C^+_{a,k}(x,t;T_1,T_2) + P^-_{a,h}(x,t;T_1,T_2) \tag{22.34} \]

The chooser option is simply a portfolio of one up-type binary call option and one down-type binary put option. We conclude this section with the observation that many other dual expiry options can be priced using the same approach, as described in detail in (Buchen 2004). Examples include the extendable (or reset) options analysed in (Longstaff 1990), the one-shout option considered in (Thomas 1994) and many others. Reset options allow the holder, at a fixed time \(T_1\), to extend the life of the option to time \(T_2\), while simultaneously changing other parameters, such as the strike price. The reset may also include a premium payable at time \(T_1\). A one-shout option expires at time \(T_2\), but locks in the exercise value of the option at the shout time \(T_1\). The holder receives the greater of the \(T_1\) and \(T_2\) exercise values.


22.5 Dual Asset Options

Suppose we have two assets with price processes \(X_t, Y_t\) following correlated gBm’s of volatilities \(\sigma_1\) and \(\sigma_2\) and dividend yields \(q_1\) and \(q_2\). The Brownian motions are assumed to have constant correlation coefficient ρ, implying \(\mathrm{cov}\{B_1(t), B_2(t)\} = \rho t\). Under the assumption of gBm, the log-returns of the two assets will have the same correlation coefficient ρ. The best known example of a two asset derivative is the exchange option, first analysed in (Margrabe 1978). It gives its holder the right to exchange asset X for asset Y (i.e. receive X by paying Y) at a fixed time T in the future. The expiry payoff is therefore \((X_T - Y_T)^+\). As for the single asset case, we can price dual-asset derivatives by either a PDE method or a corresponding EMM method. The BS-pde for two asset derivatives is

\[ V_t = rV - \mu_1 x V_x - \mu_2 y V_y - \tfrac{1}{2}\left(\sigma_1^2 x^2 V_{xx} + \sigma_2^2 y^2 V_{yy} + 2\rho\sigma_1\sigma_2\, xy V_{xy}\right) \tag{22.35} \]

where \(\mu_i = r - q_i\). For European style options, this pde should be solved in the domain \((x, y) \in R^2\), t < T with expiry condition \(V(x, y, T) = f(x, y)\). The function f(x, y) is the dual-asset payoff, which for the case of an exchange option is \(f(x, y) = (x - y)^+\). The associated Feynman-Kac solution is

\[ V(x, y, t) = e^{-r\tau}\, E_Q\{f(X_T, Y_T) \mid X_t = x, Y_t = y\} \tag{22.36} \]

where

\[ \begin{aligned} X_T &= x \exp\{(\mu_1 - \tfrac{1}{2}\sigma_1^2)\tau + \sigma_1\sqrt{\tau}\, Z_1\} \\ Y_T &= y \exp\{(\mu_2 - \tfrac{1}{2}\sigma_2^2)\tau + \sigma_2\sqrt{\tau}\, Z_2\} \end{aligned} \]

with \(\tau = T - t\) and \(Z_{1,2} \sim N(0,1;\rho)\). Once we have priced some representative binary options, two asset exotics can be priced by static replication. There are many two asset binaries. A fairly comprehensive set, sufficient to price most two asset exotics, is presented in the following theorem.

Theorem 22.7 (Two Asset Binary Options). The present values of the general two-asset binaries, with expiry T payoffs

\[ \begin{aligned} V_1(x, y, T) &= x^{a_1} y^{a_2}\, I(s_1 x > s_1 \xi_1)\, I(s_2 y > s_2 \xi_2) \\ V_2(x, y, T) &= x^{a_1} y^{a_2}\, I(s x > s y) \\ V_3(x, y, T) &= x^{a_1} y^{a_2}\, I(s_1 x > s_1 \xi_1)\, I(s x > s y) \\ V_4(x, y, T) &= x^{a_1} y^{a_2}\, I(s_2 y > s_2 \xi_2)\, I(s x > s y) \end{aligned} \]

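where \(s, s_i = \pm\) and \(a_i\) are usually 0 or 1 (but don’t have to be), are given respectively by

\[ \left.\begin{aligned} V_1(x, y, t) &= x^{a_1} y^{a_2} e^{\nu\tau}\, N(s_1 D_1, s_2 D_2; s_1 s_2 \rho) \\ V_2(x, y, t) &= x^{a_1} y^{a_2} e^{\nu\tau}\, N(s D) \\ V_3(x, y, t) &= x^{a_1} y^{a_2} e^{\nu\tau}\, N(s_1 D_1, s D; s s_1 \rho_1) \\ V_4(x, y, t) &= x^{a_1} y^{a_2} e^{\nu\tau}\, N(s_2 D_2, s D; s s_2 \rho_2) \end{aligned}\right\} \tag{22.37} \]

where

\[ \begin{aligned} \nu &= -r + a_1(\mu_1 - \tfrac{1}{2}\sigma_1^2) + a_2(\mu_2 - \tfrac{1}{2}\sigma_2^2) + \tfrac{1}{2}(a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + 2\rho a_1 a_2 \sigma_1\sigma_2) \\ D_1(x, \tau) &= [\log(x/\xi_1) + (\mu_1 - \tfrac{1}{2}\sigma_1^2 + a_1\sigma_1^2 + \rho a_2 \sigma_1\sigma_2)\tau] / (\sigma_1\sqrt{\tau}) \\ D_2(y, \tau) &= [\log(y/\xi_2) + (\mu_2 - \tfrac{1}{2}\sigma_2^2 + a_2\sigma_2^2 + \rho a_1 \sigma_1\sigma_2)\tau] / (\sigma_2\sqrt{\tau}) \\ D(x, y, \tau) &= [\log(x/y) + (q_2 - q_1 + \tfrac{1}{2}(\sigma_2^2 - \sigma_1^2) + a_1\sigma_1^2 - a_2\sigma_2^2 + (a_2 - a_1)\rho\sigma_1\sigma_2)\tau] / (\sigma\sqrt{\tau}) \\ \rho_1 &= (\sigma_1 - \rho\sigma_2)/\sigma \\ \rho_2 &= -(\sigma_2 - \rho\sigma_1)/\sigma \\ \sigma^2 &= \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2 \end{aligned} \]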

Proof. We omit the details, but as should now be familiar, we need only apply the bi-variate version of the GST (Theorem 22.6) to the discounted expected payoffs under the martingale measure Q .

22.5.1 The Exchange Option

As already remarked, the payoff at expiry is

\[ V(x, y, T) = (x - y)^+ = x\, I(x > y) - y\, I(x > y) \]

This is a portfolio of a pair of two asset binaries. Using the appropriate formula from Theorem 22.7, we obtain the Margrabe formula (see (Margrabe 1978))

\[ V(x, y, t) = x e^{-q_1\tau} N(d) - y e^{-q_2\tau} N(d') \]

for all t < T and \(\tau = T - t\). We have also defined

\[ [d, d'] = \frac{\log(x/y) + (q_2 - q_1 \pm \tfrac{1}{2}\sigma^2)\tau}{\sigma\sqrt{\tau}} \tag{22.38} \]


The exchange option is a generalisation of both a standard European call option and put option; it in fact includes both as special cases. For example, we may regard a European call option as an exchange of asset X for an agreed amount of cash k at time T. Then the BS call option formula (22.16) is recovered by taking \(q_2 = 0\) and \(y = k e^{-r\tau}\), the present value of the cash, in (22.38), together with \(\sigma_2 = 0\) so that \(\sigma = \sigma_1\).
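The Margrabe formula and its reduction to the BS call can be sketched in Python as below. The function name `margrabe`, the argument order and the sample numbers are illustrative assumptions; the cash leg is modelled, as an assumption of the sketch, by a second asset with zero volatility and zero yield.

```python
import math

def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def margrabe(x, y, q1, q2, sigma1, sigma2, rho, tau):
    """Exchange option: the right to receive X by paying Y at expiry."""
    sigma = math.sqrt(sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2)
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / y) + (q2 - q1 + 0.5 * sigma ** 2) * tau) / vol
    d_prime = d - vol
    return x * math.exp(-q1 * tau) * norm_cdf(d) - y * math.exp(-q2 * tau) * norm_cdf(d_prime)

# Special case: exchange the asset for cash k, i.e. a European call.
k, r, tau = 100.0, 0.05, 1.0
call = margrabe(100.0, k * math.exp(-r * tau), 0.0, 0.0, 0.2, 0.0, 0.0, tau)
print(round(call, 4))

# Exchanging X for Y minus exchanging Y for X is a forward on the spread.
v_xy = margrabe(100.0, 95.0, 0.02, 0.01, 0.2, 0.3, 0.4, tau)
v_yx = margrabe(95.0, 100.0, 0.01, 0.02, 0.3, 0.2, 0.4, tau)
print(v_xy - v_yx, 100.0 * math.exp(-0.02 * tau) - 95.0 * math.exp(-0.01 * tau))
```

The first printed value should match the BS call for the same inputs (about 10.45); the last line shows both sides of the exchange-option parity relation.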

22.5.2 A Simple ESO

ESO’s, or executive stock options, are very topical instruments at the current time. A simple performance-based executive compensation is the award of a call option which pays out at T only if the stock price of the company \(X_T\) exceeds both the strike price and a given index \(Y_T\). The strike price is usually set equal to the stock price at the grant date, so the option will finish in the money only if the stock price goes up over the given period. The payoff is therefore

\[ V(x, y, T) = (x - k)^+\, I(x > y) = x\, I(x > k)\, I(x > y) - k\, I(x > k)\, I(x > y) \]

This is also a portfolio of two asset binaries considered previously, with \((a_1, a_2) = (1, 0)\) for the first term and \((a_1, a_2) = (0, 0)\) for the second. Applying the relevant formula in Theorem 22.7 leads to

\[ V(x, y, t) = x e^{-q_1\tau} N(d_k, d; \rho_1) - k e^{-r\tau} N(d'_k, d'; \rho_1) \tag{22.39} \]

where

\[ \begin{aligned} [d_k, d'_k] &= [\log(x/k) + (r - q_1 \pm \tfrac{1}{2}\sigma_1^2)\tau] / (\sigma_1\sqrt{\tau}) \\ d &= [\log(x/y) + (q_2 - q_1 + \tfrac{1}{2}\sigma^2)\tau] / (\sigma\sqrt{\tau}) \\ d' &= d - \rho_1\sigma_1\sqrt{\tau} \end{aligned} \]

22.5.3 Other Two Asset Exotics

Other two asset exotics include calls and puts on the maximum and minimum of two assets, as first considered in (Stulz 1982). These are readily priced in terms of the two asset binaries listed in Theorem 22.7. For example, a call option on the minimum of two assets has an expiry payoff \(V_k^{\min}(x, y, T) = [\min(x, y) - k]^+\), which can be expressed as the portfolio of two asset binaries:

\[ V_k^{\min}(x, y, T) = x\, I(x > k)\, I(x < y) + y\, I(y > k)\, I(x > y) - k\, I(x > k)\, I(y > k) \]

Its present value can be obtained virtually by inspection using Theorem 22.7, so the details are left to the reader. There is also a simple parity relation for calls on the maximum and minimum of two assets. It is not difficult to see that

\[ V_k^{\max}(x, y, t) + V_k^{\min}(x, y, t) = C_k(x, t) + C_k(y, t) \tag{22.40} \]


must be true for all t < T . The two terms on the right of this equation are standard European call options of strike k : one on asset X , the other on asset Y . Further discussions of two asset rainbow options, albeit by quite a different approach, can be found in (Zhang 1998).

22.6 Barrier Options

Most exotic options are traded over-the-counter (OTC), rather than on an organised exchange. Barrier options, first appearing in (Merton 1973), are the exception: they are actively traded in the FX^4 market. Knock-out barrier options are similar to standard European calls and puts, but the payoff at expiry is made only if the asset price does not hit a given barrier during the life of the option. If the asset price hits the barrier, the option immediately expires worthless. If a trader believes the asset price during a given period will not go below a value b say, they can purchase a down–and–out barrier option with barrier level set at x = b. Depending on circumstances, this option may be considerably cheaper than a corresponding non-barrier option. Of course, the trader runs the risk of losing everything if the underlying asset hits the barrier on or before the expiry date.

There are four basic types of barrier option for any given payoff f(x) at expiry T: two knock-out options and two knock-in options. Within each of these classes there exist up and down types, depending on whether the asset price hits the barrier respectively from below or above. These four barrier options are commonly known as up–and–out, down–and–out, up–and–in and down–and–in. Knock-in options require the asset price to hit the barrier before they become ‘alive’. If the asset price fails to hit the barrier before expiry, knock-in options expire worthless.

It transpires that there are three parity relations linking the prices of the four barrier options. Hence if we know just one of the prices (it does not matter which), we immediately obtain the other three. We shall price the down–and–out barrier option as our benchmark. Most of the ideas in this section can be found in (Buchen 2001).

22.6.1 PDE’s for Barrier Options

Assuming the underlying asset pays dividends, all four barrier option prices satisfy the BS-pde (22.8). They differ only in their domains, expiry payoffs and boundary conditions (BC’s). These are given explicitly by

\[ \begin{array}{llll} \text{D/O}: & x > b, & V_{do}(x, T) = f(x), & V_{do}(b, t) = 0 \\ \text{U/O}: & x < b, & V_{uo}(x, T) = f(x), & V_{uo}(b, t) = 0 \\ \text{D/I}: & x > b, & V_{di}(x, T) = 0, & V_{di}(b, t) = V_0(b, t) \\ \text{U/I}: & x < b, & V_{ui}(x, T) = 0, & V_{ui}(b, t) = V_0(b, t) \end{array} \]

4 Foreign Exchange.


where \(V_0(x, t)\) denotes the price of a standard European option with expiry T payoff f(x) (i.e. an identical option without the barrier imposed). Many boundary value problems of the kind satisfied by barrier options can be solved using image methods. Fortunately, the BS-pde admits, for any given solution, a unique image solution. This image solution and its properties are discussed in the next section.

22.6.2 Image Solutions for the BS-pde

Let V(x, t) be any solution of the BS-pde with expiry payoff V(x, T) = f(x). Then its corresponding image solution relative to the asset level x = b, denoted by \(\overset{*}{V}(x, t)\), satisfies the following four properties:

1. \(\overset{**}{V}(x, t) = V(x, t)\)
2. \(\overset{*}{V}(x, t)\) solves the BS-pde with expiry payoff \(\overset{*}{V}(x, T) = \overset{*}{f}(x)\)
3. \(\overset{*}{V}(b, t) = V(b, t)\)
4. If x > b (x < b) is the domain of V(x, t), then x < b (x > b) is the domain of \(\overset{*}{V}(x, t)\).

Theorem 22.8 (BS Image Solution). Let V(x, t) be any solution of the BS-pde. Then its image solution, satisfying the four properties above, is given explicitly by

\[ \overset{*}{V}(x, t) = (b/x)^{\alpha}\, V(b^2/x, t); \qquad \alpha = 2(r - q)/\sigma^2 - 1 \tag{22.41} \]
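The image operator lends itself to a small higher-order function. The Python sketch below wraps any pricing function V(x, t) and returns \((b/x)^{\alpha}\,V(b^2/x, t)\); it then checks two of the stated properties numerically (the image of the image is the original, and the two agree at x = b). The names `image` and `bs_call` and the parameter values are hypothetical choices made for this illustration.

```python
import math

def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def bs_call(x, k, r, q, sigma, tau):
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / k) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    return x * math.exp(-q * tau) * norm_cdf(d) - k * math.exp(-r * tau) * norm_cdf(d - vol)

def image(V, b, r, q, sigma):
    """Return the image of the pricing function V(x, t) relative to x = b."""
    alpha = 2.0 * (r - q) / sigma ** 2 - 1.0
    def V_star(x, t):
        return (b / x) ** alpha * V(b * b / x, t)
    return V_star

# Illustrative (made-up) parameters.
r, q, sigma, k, T, b = 0.05, 0.02, 0.3, 100.0, 1.0, 90.0
V = lambda x, t: bs_call(x, k, r, q, sigma, T - t)
V_star = image(V, b, r, q, sigma)
V_star_star = image(V_star, b, r, q, sigma)

print(V_star(b, 0.0), V(b, 0.0))                # equal on the barrier
print(V_star_star(120.0, 0.0), V(120.0, 0.0))   # image of the image
```

Because `image` takes and returns a function, images of portfolios can be built by composing it with any other pricing routine.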

Proof. As usual, we leave out the details (which can be found in (Buchen 2001)) but point out the relevant ideas. The BS-pde can be transformed to the heat equation \(U_\tau = \tfrac{1}{2}\sigma^2 U_{\xi\xi}\), where \(\tau = T - t\) and \(\xi = \log(x/b)\). The image solution for the heat equation is very simple: \(\overset{*}{U}(\xi, \tau) = U(-\xi, \tau)\). Back-transforming to BS variables then gives the desired result.

Our next theorem states explicitly the three parity relations for barrier options.

Theorem 22.9 (Barrier Option Parity Relations).

\[ \left.\begin{aligned} V_{do}(x, t) + V_{di}(x, t) &= V_0(x, t) \\ V_{ui}(x, t) &= \overset{*}{V}_{di}(x, t) \\ V_{uo}(x, t) + V_{ui}(x, t) &= V_0(x, t) \end{aligned}\right\} \tag{22.42} \]

Proof. All three parity relations are immediate consequences of the linearity of the barrier option pde’s and properties of the image solution.


Thus if we can price the D/O barrier option for example, the first parity relation determines the D/I option; the second then determines the U/I option; and the third determines the U/O option. We shall follow precisely this prescription. So we next turn our attention to pricing the D/O barrier option.

22.6.3 The D/O Barrier Option

The down–and–out barrier option price \(V_{do}(x, t)\) satisfies the BS-pde (22.8) in x > b, t < T, with expiry payoff \(V_{do}(x, T) = f(x)\) and BC \(V_{do}(b, t) = 0\). The unique solution of this boundary value problem is given by the following theorem.

Theorem 22.10 (Method of Images). Let \(V_b(x, t)\) solve the BS-pde (22.8) in x > 0, t < T with expiry payoff \(V_b(x, T) = f(x)\, I(x > b)\). Then, for all t < T and x > b,

\[ V_{do}(x, t) = V_b(x, t) - \overset{*}{V}_b(x, t) \tag{22.43} \]

Proof. Subject to regularity conditions on the function f(x), boundary value problems for the BS-pde have unique solutions. Hence, if (22.43) satisfies the pde, expiry value and BC, it must be the unique solution. It certainly satisfies the BS-pde because \(V_b\) and \(\overset{*}{V}_b\) do and the pde is linear. It also satisfies the BC, since \(\overset{*}{V}_b = V_b\) at x = b. A simple calculation also shows that (22.43) evaluated at t = T gives

\[ V_{do}(x, T) = f(x)\, I(x > b) - \overset{*}{f}(x)\, I(x < b) \]

which in the domain x > b is equal to f(x).

22.6.4 Equivalent Payoffs

Barrier options are examples of path-dependent options. Their payoffs depend on the whole path followed by the underlying asset price between time t and T . European options, on the other hand, have payoffs which depend only on the asset price at time T , and are therefore path independent. The Method of Images considered in the previous section allows us to get barrier option prices from an equivalent European payoff. That is, the equivalent European payoff exactly replicates the barrier price for all t < T in the domain of the barrier option.


Theorem 22.11. The equivalent European payoffs for the four standard barrier options with expiry payoff f(x) and barrier level x = b are given by

\[ \left.\begin{aligned} V^{eq}_{do}(x, T) &= f(x)\, I(x > b) - \overset{*}{f}(x)\, I(x < b) \\ V^{eq}_{di}(x, T) &= [f(x) + \overset{*}{f}(x)]\, I(x < b) \\ V^{eq}_{ui}(x, T) &= [f(x) + \overset{*}{f}(x)]\, I(x > b) \\ V^{eq}_{uo}(x, T) &= f(x)\, I(x < b) - \overset{*}{f}(x)\, I(x > b) \end{aligned}\right\} \tag{22.44} \]

Proof. The equivalent payoff for the D/O barrier option follows from (22.43). The others can then be obtained using the parity relations (22.42).

22.6.5 Call and Put Barriers

The Method of Images above tells us that it is only necessary to obtain the one solution \(V_b(x, t)\) in order to price all barrier options. Consider call barrier options with strike price k and barrier price h. Then, at expiry

\[ V_b(x, T) = f(x)\, I(x > h) = (x - k)^+\, I(x > h) = (x - k)\, I(x > \ell) \]

where \(\ell = \max(h, k)\). But this is the payoff of a threshold call option, as in (22.19), with strike price k and exercise price \(\ell\). When k > h, this threshold call reduces to a standard European call. The four corresponding barrier option prices are then given by (22.43) and the parity relations (22.42), as

\[ \left.\begin{aligned} C_{do}(x, t) &= C_{\ell,k} - \overset{*}{C}_{\ell,k} \\ C_{di}(x, t) &= C_k - (C_{\ell,k} - \overset{*}{C}_{\ell,k}) \\ C_{ui}(x, t) &= \overset{*}{C}_k + (C_{\ell,k} - \overset{*}{C}_{\ell,k}) \\ C_{uo}(x, t) &= (C_k - \overset{*}{C}_k) - (C_{\ell,k} - \overset{*}{C}_{\ell,k}) \end{aligned}\right\} \tag{22.45} \]

where \(C_{\ell,k}(x, t)\) denotes the present value of the threshold call, given explicitly by (22.20), and \(\overset{*}{C}_{\ell,k}(x, t)\) is its image relative to x = h, defined by (22.41). Observe that, in the case k > h, \(C_{uo}(x, t) = 0\). This makes perfect financial sense, because a U/O barrier option can never finish in the money if its strike price is above the barrier price. A similar analysis shows that the corresponding put barrier prices are given by

\[ \left.\begin{aligned} P_{do}(x, t) &= (P_k - \overset{*}{P}_k) - (P_{\ell,k} - \overset{*}{P}_{\ell,k}) \\ P_{di}(x, t) &= \overset{*}{P}_k + (P_{\ell,k} - \overset{*}{P}_{\ell,k}) \\ P_{ui}(x, t) &= P_k - (P_{\ell,k} - \overset{*}{P}_{\ell,k}) \\ P_{uo}(x, t) &= P_{\ell,k} - \overset{*}{P}_{\ell,k} \end{aligned}\right\} \tag{22.46} \]


where \(P_{\ell,k}(x, t)\) is the threshold put given by (22.20), now with \(\ell = \min(h, k)\). Observe also that the D/O put option is always equal to zero if the strike is below the barrier, i.e. k < h. In summary, call (and put) barrier options can be represented as portfolios of standard European calls (puts), threshold calls (puts) and their images with respect to the barrier level. Many other barrier options can be priced in the same way.
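As an illustration of the call barrier formulae, the following Python sketch prices all four barrier calls from a threshold call and its image relative to the barrier h. It uses \(\overset{*}{C}_k\) as the leading term of the up-and-in price, which is what the parity relation \(V_{ui} = \overset{*}{V}_{di}\) gives. The function names, the dictionary keys and the sample inputs are assumptions of the sketch; the down-type prices are meaningful for x > h and the up-type prices for x < h.

```python
import math

def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def threshold_call(x, h, k, r, q, sigma, tau):
    """Threshold call with exercise price h and strike k (vanilla call when h == k)."""
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / h) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    return x * math.exp(-q * tau) * norm_cdf(d) - k * math.exp(-r * tau) * norm_cdf(d - vol)

def barrier_calls(x, h, k, r, q, sigma, tau):
    """Return the four barrier call prices built from images relative to x = h."""
    alpha = 2.0 * (r - q) / sigma ** 2 - 1.0
    l = max(h, k)
    scale, x_img = (h / x) ** alpha, h * h / x
    c_k = threshold_call(x, k, k, r, q, sigma, tau)                    # vanilla call
    c_k_img = scale * threshold_call(x_img, k, k, r, q, sigma, tau)    # its image
    c_l = threshold_call(x, l, k, r, q, sigma, tau)                    # threshold call
    c_l_img = scale * threshold_call(x_img, l, k, r, q, sigma, tau)    # its image
    return {
        "vanilla": c_k,
        "do": c_l - c_l_img,
        "di": c_k - (c_l - c_l_img),
        "ui": c_k_img + (c_l - c_l_img),
        "uo": (c_k - c_k_img) - (c_l - c_l_img),
    }

# Illustrative (made-up) inputs.
r, q, sigma, tau, k = 0.05, 0.0, 0.2, 1.0, 100.0
down = barrier_calls(100.0, 90.0, k, r, q, sigma, tau)      # barrier below spot
up = barrier_calls(100.0, 120.0, k, r, q, sigma, tau)       # barrier above spot
on_barrier = barrier_calls(90.0, 90.0, k, r, q, sigma, tau)
print(down["do"], down["di"], down["vanilla"])
print(up["uo"], up["ui"], up["vanilla"])
print(on_barrier["do"])  # the knock-out price on the barrier itself
```

The knock-out and knock-in prices of each type add up to the vanilla call, and the knock-out price vanishes on the barrier.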

22.7 Lookback Options

Options whose expiry payoffs depend on the maximum or minimum asset price over a specified time window (and possibly on the expiry value of the asset as well) are called lookback options. Early studies of lookback options can be found in (Goldman et al. 1979) and (Conze and Viswanathan 1991). We adopt here a very different approach based on pricing principles established throughout this chapter. To fix ideas, let $T$ be the expiry date and $[0, T]$ the lookback window. Other time windows are possible, and indeed both continuous time and discrete time monitoring for the min/max asset price have been used. We assume continuous time monitoring throughout. Let $t$, for $0 \le t \le T$, be the current time and define

\[
Y_t = \min_{0 \le s \le t} \{X_s\} \quad \text{and} \quad Z_t = \max_{0 \le s \le t} \{X_s\} \tag{22.47}
\]

where $X_s$ denotes the underlying stock price process (gBm). Thus $Y_t$ represents the running minimum asset price up to time $t$, and $Z_t$ the running maximum. We shall often write $x = X_t$, $y = Y_t$ and $z = Z_t$ for current values. Minimum-type lookback options have expiry payoffs of the form $f(X_T, Y_T)$, while maximum-type lookback options have expiry payoffs of the form $F(X_T, Z_T)$. There are several popular lookback options of the call and put type, including both fixed strike lookbacks and floating strike lookbacks. We shall consider here the two cases of a floating strike lookback call option with payoff $V_c(X_T, Y_T, T) = (X_T - Y_T)^{+}$ and a floating strike lookback put option with payoff $V_p(X_T, Z_T, T) = (Z_T - X_T)^{+}$. Since $Y_T \le X_T \le Z_T$, these payoffs are never negative, so floating strike calls and puts are always exercised at expiry and we can safely drop the ‘plus’ sign. In a sense, the floating strikes, $Y_T$ for a call and $Z_T$ for a put, are optimal from the perspective of the holder. For this reason, lookback options are sometimes called options of ‘minimum regret’.
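The definitions of the running minimum and maximum and of the two floating strike payoffs can be illustrated with a short simulation. The following is a minimal sketch, not part of the original text: the function names, the parameter values and the use of an equally spaced grid (which only approximates the continuous monitoring assumed above) are all illustrative assumptions.

```python
import math
import random


def simulate_gbm_path(x0, r, q, sigma, T, n_steps, seed=0):
    """Simulate one risk-neutral gBm path on an equally spaced time grid."""
    rng = random.Random(seed)
    dt = T / n_steps
    drift = (r - q - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
    return path


def floating_strike_payoffs(path):
    """Return (call payoff X_T - Y_T, put payoff Z_T - X_T) for one path."""
    x_T = path[-1]
    y_T = min(path)  # running minimum Y_T over the whole lookback window
    z_T = max(path)  # running maximum Z_T over the whole lookback window
    return x_T - y_T, z_T - x_T


# Hypothetical parameters chosen only for illustration.
path = simulate_gbm_path(x0=100.0, r=0.05, q=0.0, sigma=0.2, T=1.0, n_steps=252)
call_payoff, put_payoff = floating_strike_payoffs(path)
print(call_payoff, put_payoff)
```

Both payoffs are non-negative on every path because the terminal price always lies between the running minimum and the running maximum, which is why the ‘plus’ sign can be dropped.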

22.7.1 Equivalent Payoffs for Lookback Options

There are several approaches to pricing lookback options. The EMM approach requires knowledge of the joint distribution of the asset price $X_t$ and its running min/max $Y_t$, $Z_t$. The PDE approach, adopted here, makes use of the following facts.


Theorem 22.12. Under the standard BS assumptions,

a. Minimum type lookback option prices $V(x, y, t)$ satisfy the BS-pde (22.8) in the domain $x > y$, $t < T$ with expiry condition $V(x, y, T) = f(x, y)$ and BC $V' = 0$ at $x = y$, where $V'(x, y, t) = V_y(x, y, t)$.

b. Maximum type lookback option prices $V(x, z, t)$ satisfy the BS-pde (22.8) in the domain $x < z$, $t < T$ with expiry condition $V(x, z, T) = F(x, z)$ and BC $V' = 0$ at $x = z$, where $V'(x, z, t) = V_z(x, z, t)$.

A technical discussion of these pde’s can be found in Chapter 15 of (Wilmott et al. 1995), where the somewhat unusual BC’s are analysed in detail. It is clear from Theorem 22.12 that the variables $y$, $z$ act only as parameters in the BS-pde. This fact was exploited in (Buchen and Konstandatos 2005) to derive a new method for pricing lookback options. This method, employed exclusively here, is captured by the following theorem.

Theorem 22.13

a. The equivalent European payoff for a minimum type lookback option is given by

\[
\left.
\begin{aligned}
V_{eq}(x, y, T) &= f(x, y)\, I(x > y) + g(x, y)\, I(x < y) \\
g(x, y) &= f(x, x) - \int_{x}^{y} (f')^{*}(x, \xi)\, d\xi
\end{aligned}
\right\} \tag{22.48}
\]

b. The equivalent European payoff for a maximum type lookback option is given by

\[
\left.
\begin{aligned}
V_{eq}(x, z, T) &= F(x, z)\, I(x < z) + G(x, z)\, I(x > z) \\
G(x, z) &= F(x, x) + \int_{z}^{x} (F')^{*}(x, \xi)\, d\xi
\end{aligned}
\right\} \tag{22.49}
\]

Proof. We point out only the main ideas behind the proof. The details can be found in (Buchen and Konstandatos 2005). It is clear from the pde for minimum type lookback options that $V'(x, y, t) = V_y(x, y, t)$ solves a down-and-out barrier option problem, with barrier at $x = y$ and payoff function $f'(x, y)$. The equivalent European payoff for this barrier option is then given by the first equation in (22.44). The corresponding equivalent payoff for the lookback option is then obtained by integrating this expression with respect to the parameter $y$. A similar analysis, integrating an equivalent up-and-out barrier payoff with respect to the parameter $z$, leads to the equivalent maximum type lookback payoff.

It is important to be clear about the precise meaning of the terms $(f')^{*}(x, \xi)$ and $(F')^{*}(x, \xi)$ appearing in Theorem 22.13. In both terms, we are taking the image (with respect to $x = \xi$) of the partial derivative, rather than the other way round. Thus from (22.41),

\[
(f')^{*}(x, \xi) = (\xi / x)^{\alpha}\, f_{\xi}(\xi^{2}/x,\ \xi)
\]

with a similar interpretation for $(F')^{*}(x, \xi)$. The equivalent payoffs described in Theorem 22.13 have an important ramification which is easily overlooked. The $x$ appearing in (22.48) and (22.49) refers to the asset price at expiry, i.e. $x = X_T$. However, the $y$ and $z$ refer to the running min and max asset prices at the present time $t$, i.e. $y = Y_t$ and $z = Z_t$. The equivalent payoffs for lookback options therefore depend not only on conditions at expiry $T$, but also on conditions at the current time $t$.

22.7.2 Generic Lookback Options

It turns out that all standard lookback options can be priced in terms of just two generic lookback options. The generic min-type lookback is an option that simply pays the minimum asset price at $T$, i.e. the payoff is $f(x, y) = y$. Similarly, the generic max-type lookback has expiry payoff $F(x, z) = z$. Both payoffs are independent of $x$. Let $m(x, y, t)$ denote the present value of the generic min-type lookback option, and $M(x, z, t)$ the present value of the generic max-type lookback option. Then

Theorem 22.14.

\[
m(x, y, t) = (1 + \beta)\, A^{-}_{y}(x, t) + y\,[\, B^{+}_{y}(x, t) - \beta\, \overset{*}{B}{}^{+}_{y}(x, t)\,] \tag{22.50}
\]

\[
M(x, z, t) = (1 + \beta)\, A^{+}_{z}(x, t) + z\,[\, B^{-}_{z}(x, t) - \beta\, \overset{*}{B}{}^{-}_{z}(x, t)\,] \tag{22.51}
\]

where $\beta = \tfrac{1}{2}\sigma^{2} / (r - q)$.

Proof. From $f(x, y) = y$ and $f'(x, y) = 1$ we get from (22.48)

\[
\begin{aligned}
m_{eq}(x, y, T) &= y\, I(x > y) + \Big[\, x - \int_{x}^{y} (\xi / x)^{\alpha}\, d\xi \,\Big]\, I(x < y) \\
&= y\, I(x > y) + x\, I(x < y) - \beta\,[\, y\,(y / x)^{\alpha} - x \,]\, I(x < y) \\
&= (1 + \beta)\, x\, I(x < y) + y\,[\, I(x > y) - \beta\, \overset{*}{I}(x > y)\,]
\end{aligned}
\]

The equivalent payoff is a portfolio of asset and bond binaries and, for the last term, an image with respect to $x = y$ of a bond binary. Static replication and the properties of image options together yield the result stated in (22.50). The calculation for the max-type equivalent payoff follows along similar lines.

The generic lookback options with price $m(x, y, t)$ and $M(x, z, t)$ are the building blocks for more complex lookback options, just as asset and bond binaries were seen to be the building blocks for vanilla calls and puts. We price some standard lookback options using these building blocks in the next section.
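The only calculation in the proof is the integral of the image of the constant 1, which can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it assumes the image exponent $\alpha = 2(r - q)/\sigma^{2} - 1$ (the definition belongs to (22.41), which is not reproduced in this section), so that $\beta = 1/(1 + \alpha)$, and the parameter values and function names are hypothetical.

```python
def image_of_one(x, xi, alpha):
    """Image of the constant payoff 1 with respect to the level xi."""
    return (xi / x) ** alpha


def integral_midpoint(x, y, alpha, n=20000):
    """Midpoint-rule approximation of the integral of (xi/x)^alpha over [x, y]."""
    h = (y - x) / n
    return h * sum(image_of_one(x, x + (i + 0.5) * h, alpha) for i in range(n))


# Hypothetical market parameters; r must differ from q for beta to be finite.
r, q, sigma = 0.10, 0.06, 0.30
alpha = 2.0 * (r - q) / sigma ** 2 - 1.0  # assumed image exponent
beta = sigma ** 2 / (2.0 * (r - q))       # equals 1 / (1 + alpha)

x, y = 90.0, 100.0                         # a point with x < y
numeric = integral_midpoint(x, y, alpha)
closed_form = beta * (y * (y / x) ** alpha - x)

# Equivalent payoff for x < y, written in the two forms used in the proof.
payoff_line_1 = x - numeric
payoff_line_3 = (1.0 + beta) * x - beta * y * (y / x) ** alpha
print(numeric, closed_form, payoff_line_1, payoff_line_3)
```

The two printed payoff values agree to numerical precision, which is the step from the first to the third line of the proof restricted to $x < y$.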


22.7.3 Floating Strike Lookback Options

The payoffs of floating strike lookback calls and puts are given respectively by

\[
V_c(x, y, T) = (x - y) \quad \text{and} \quad V_p(x, z, T) = (z - x)
\]

We have removed the ‘plus’ part of the payoff since both options, as noted earlier, are always exercised at expiry. The lookback call is therefore a portfolio of one asset (long) and one generic min-type lookback (short). The present value, by static replication, is therefore

\[
V_c(x, y, t) = x e^{-q\tau} - m(x, y, t) = C_y(x, t) + L_c(x, y, t) \tag{22.52}
\]

Similar reasoning determines the present value of the lookback put option as

\[
V_p(x, z, t) = M(x, z, t) - x e^{-q\tau} = P_z(x, t) + L_p(x, z, t) \tag{22.53}
\]

We have used (22.50) and (22.51), together with (22.16) and (22.17), to obtain the alternative analytic expressions, where

\[
L_c(x, y, t) = \beta\,[\, y\, \overset{*}{B}{}^{+}_{y}(x, t) - A^{-}_{y}(x, t)\,] \tag{22.54}
\]

\[
L_p(x, z, t) = \beta\,[\, A^{+}_{z}(x, t) - z\, \overset{*}{B}{}^{-}_{z}(x, t)\,] \tag{22.55}
\]

The term $L_c$ represents the lookback premium for the call option. It is the extra price above the current European call price, required to compensate for the possibility of reducing the strike $y$ even further over the remaining life of the option. The term $L_p$ is the corresponding lookback premium for the put option. Floating strike lookback options are always more expensive than their European counterparts.

It is clear that pricing lookback options is not a trivial task. However, once the machinery of equivalent payoffs and generic lookbacks has been set up, the actual analysis is seen to be very easy. In fact, even in pricing the generic lookbacks, the only calculations required were integrations of simple algebraic functions.
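The decomposition of the lookback call into a European call plus a lookback premium can be evaluated directly. The following is a minimal sketch under assumptions that are not restated in this section: the asset and bond binaries are taken in their usual Black-Scholes forms, $A^{s}_{k} = x e^{-q\tau} N(s\,d)$ and $B^{s}_{k} = e^{-r\tau} N(s\,d')$, the image is taken as $\overset{*}{V}(x, t) = (k/x)^{\alpha} V(k^{2}/x, t)$ with $\alpha = 2(r - q)/\sigma^{2} - 1$, and $\beta = \tfrac{1}{2}\sigma^{2}/(r - q)$. All function names and parameter values are hypothetical.

```python
import math


def norm_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def d_pair(x, k, tau, r, q, sigma):
    """Return (d, d') for spot x and exercise level k (assumed BS notation)."""
    vol = sigma * math.sqrt(tau)
    d = (math.log(x / k) + (r - q + 0.5 * sigma ** 2) * tau) / vol
    return d, d - vol


def asset_binary(x, k, tau, r, q, sigma, s):
    """Assumed form of A^s_k: pays X_T when s * (X_T - k) > 0."""
    d, _ = d_pair(x, k, tau, r, q, sigma)
    return x * math.exp(-q * tau) * norm_cdf(s * d)


def bond_binary(x, k, tau, r, q, sigma, s):
    """Assumed form of B^s_k: pays 1 when s * (X_T - k) > 0."""
    _, d_prime = d_pair(x, k, tau, r, q, sigma)
    return math.exp(-r * tau) * norm_cdf(s * d_prime)


def image_bond_binary(x, k, tau, r, q, sigma, s):
    """Image of B^s_k with respect to x = k (assumed image formula)."""
    alpha = 2.0 * (r - q) / sigma ** 2 - 1.0
    return (k / x) ** alpha * bond_binary(k * k / x, k, tau, r, q, sigma, s)


def lookback_call(x, y, tau, r, q, sigma):
    """Return (European call C_y, lookback premium L_c); requires r != q and x >= y."""
    beta = sigma ** 2 / (2.0 * (r - q))
    european = (asset_binary(x, y, tau, r, q, sigma, +1)
                - y * bond_binary(x, y, tau, r, q, sigma, +1))
    premium = beta * (y * image_bond_binary(x, y, tau, r, q, sigma, +1)
                      - asset_binary(x, y, tau, r, q, sigma, -1))
    return european, premium


european, premium = lookback_call(x=120.0, y=100.0, tau=0.5, r=0.10, q=0.06, sigma=0.30)
price = european + premium
print(european, premium, price)
```

With these inputs the premium is positive, so the lookback call is worth more than the European call struck at the current minimum, in line with the remark above.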

22.8 Summary

We have considered in this chapter many different derivatives, including plain vanilla calls and puts and some considerably more intricate exotic options. Yet pricing the exotics was, in principle, no more exacting than pricing the plain vanillas. This arises because our basic approach expresses the payoffs of complex options as portfolios of elementary contracts. These elementary contracts vary considerably, from simple asset and bond binaries to portfolios of such binaries and their images. Most of the analytical effort goes into pricing the elementary contracts. Equation (22.5), the mathematical statement of the First Fundamental Theorem of Asset Pricing, equivalent to the Feynman-Kac formula, is the main tool used to price elementary contracts. In practice, this equation is complemented with the Gaussian Shift Theorem, which in its most general setting is expressed mathematically by (22.23). In traditional approaches to the pricing problem, the EMM pricing formula (22.5) is usually complemented with measure changes or numeraire changes using Girsanov’s Theorem or, as is sometimes done, by effecting the measure change using an Esscher transform. None of these technical devices is really necessary: in the Black-Scholes world they are all equivalent to the Gaussian Shift Theorem.

Once the elementary contracts have been priced, any portfolio of such contracts is then immediately priced using the Principle of Static Replication stated in Theorem 22.3. While this principle is immediate from the linearity of the BS-pde (and also the EMM expectation), it also follows from the fundamental postulate of no arbitrage. Static replication is a powerful pricing tool which appears to have been much under-utilised in pricing exotic options. Indeed, static replication confers an additional benefit when pricing under asset models alternative to gBm. In general, one only needs to price the elementary contracts under the new model; the exotic option will still be priced by its replicating portfolio.

Path dependent options such as barrier and lookback options have received considerable attention in the literature and have led to some very complex analysis. Yet, as we have shown, the introduction of the image solution (22.41) for the BS-pde considerably simplifies the pricing problem and analysis. Indeed, we show in (22.44), (22.48) and (22.49) that general barrier and lookback options have equivalent European path independent payoffs. These payoffs depend on the image solution of related binary contracts, but are otherwise no more complicated than for other European exotics.

Although we have considered a fairly wide class of exotics, we have in reality only scratched the surface of this large topic. We have not even touched on the problem of pricing American and Bermudan options (options with early exercise allowed), Asian options (options with payoffs depending on average asset prices over specified time windows), credit derivatives (options including counterparty default), bond and interest rate derivatives (such as caps, floors and swaps as well as options on these instruments), equity-linked FX options and the world of exotic barrier and exotic lookback options. But, interesting as these options are, they probably do not belong in a treatise on Foundations of Option Pricing.

References

Black F, Scholes M (1973) The pricing of options and corporate liabilities. Journal of Political Economy 81, pp. 637–654
Buchen PW (2001) Image options and the road to barriers. Risk Magazine 14, pp. 127–130
Buchen PW (2004) The pricing of dual expiry options. Mathematical Finance 4, pp. 101–108
Buchen PW, Konstandatos A (2005) A new method of pricing lookback options. Mathematical Finance 15, pp. 245–259
Conze A, Viswanathan (1991) Path-dependent options – the case of lookback options. Journal of Finance 46, pp. 1893–1907
Friedman A (1975) Stochastic differential equations and applications. Academic Press, New York
Geske R (1979) The valuation of compound options. Journal of Financial Economics 7, pp. 63–81
Goldman MB, Sosin HB, Gatto MA (1979) Path-dependent options: buy at the low, sell at the high. Journal of Finance 34, pp. 1111–1128
Harrison JM, Pliska SR (1981) Martingales and stochastic integrals in the theory of continuous trading. Stochastic Processes and their Applications 11, pp. 215–260
Klebaner FC (1998) Introduction to stochastic calculus with applications. Imperial College Press, London
Longstaff L (1990) Pricing options with extendable maturities: analysis and applications. Journal of Finance 45, pp. 935–957
Margrabe W (1978) The value of an option to exchange one asset for another. Journal of Finance 33, pp. 177–186
Merton R (1973) Theory of rational option pricing. Bell Journal of Economics and Management Science 4, pp. 141–183
Rubinstein M (1991) Options for the undecided. Risk Magazine 4(4), p. 43
Stulz RM (1982) Options on the minimum and maximum of two risky assets. Journal of Financial Economics 10, pp. 161–185
Thomas B (1994) Something to shout about. Risk Magazine 6, pp. 56–68
Wilmott P, Howison S, Dewynne J (1995) The mathematics of financial derivatives. Cambridge University Press, Cambridge, UK
Zhang PG (1998) Exotic options: a guide to second generation options. World Scientific, Singapore, River Edge NJ

CHAPTER 23
Long-Range Dependence, Fractal Processes, and Intra-Daily Data
Wei Sun, Svetlozar (Zari) Rachev, Frank Fabozzi

23.1 Introduction

With the adoption of electronic trading and order routing systems, an enormous quantity of trading data in electronic form is now available. These data provide a complete record of transactions and their associated characteristics, such as transaction time, transaction price, posted bid/ask prices, and volumes. They are gathered at the ultimate frequency level in the financial markets and are usually referred to as intra-daily data or high-frequency data.

There is no standardization of the term intra-daily data among researchers in the market microstructure area, and the literature contains several descriptions. (Hasbrouck 1996) mentions microstructure data or microstructure time series. (Engle 2000) uses the term ultra-high frequency data to describe the ultimate level of disaggregation of financial time series. (Alexander 2001) describes high-frequency data as real time tick data. (Gourieroux and Jasiak 2001) use the expression tick-by-tick data while (Tsay 2002) writes “high frequency data are observations taken at fine time intervals.” In this article, the term intra-daily data will be used interchangeably with the other terms used by market microstructure researchers as identified above.

A summary of the literature covering intra-daily data research has been provided in several publications, mainly from 1996 to 2004. (Hasbrouck 1996) provides an overview of the statistical models of intra-daily data that are used for explaining trading behavior and market organization. (Goodhart and O’Hara 1997) summarize many important issues connected with the use, analysis, and application of intra-daily data. (Madhavan 2000) provides a survey covering the theoretical, empirical, and experimental studies of market microstructure theory. (Ghysels 2000) reviews the econometric topics in intra-daily data research, identifying some challenges confronting researchers. In a collection of papers edited by (Goodhart and Payne 2000), several works based on intra-daily data from the foreign exchange market are presented. (Gwilym and Sutcliffe 1999; Bauwens and Giot 2001; Dacorogna et al. 2001) provide detailed coverage of econometric modeling as it pertains to the empirical analysis of market microstructure. The survey by (Easley and O’Hara 2003) is presented in terms of microstructure factors in asset pricing dynamics. (Engle and Russell 2004) highlight econometric developments pertaining to intra-daily data. (Harris 2003; Schwartz and Francioni 2004) provide insights into market microstructure theory based on discussions with practitioners, focusing less on theoretical and econometric issues associated with intra-daily data.

As the full record of every movement in financial markets, intra-daily data offer the researcher or analyst a large sample size that increases statistical confidence. The data can reveal events in the financial market that are impossible to identify with low frequency data. While in some sense intra-daily data might be regarded as the microscope used for studying financial markets, these data are of broader interest in econometric modeling and market microstructure research. Some other implications of intra-daily data are being identified in the literature. A fundamental property of intra-daily data is that “observations can occur at varying random time intervals” (see (Ghysels 2000)). Other fundamental statistical properties that researchers have observed to characterize intra-daily data are fat tails, volatility clustering, daily and weekly patterns, and long-range dependence (see, for example, (Dacorogna et al. 2001)). Based on these observed stylized facts, time series models and market microstructure models have been built. Time series models capture the statistical properties of financial data, while market microstructure models explain trading behavior and market organization. Besides explaining market microstructure theory, intra-daily data have been used in studying various topics in finance. Some examples are risk management (e.g., (Beltratti and Morana 1999)), model evaluation (e.g., (Gençay et al. 2002) and (Bollerslev and Zhang 2003)), market efficiency tests (e.g., (Hotchkiss and Ronen 2002)), electricity price analysis (e.g., (Longstaff and Wang 2004)), information shocks to financial markets (e.g., (Faust et al. 2003) and (Adams et al. 2004)), and financial market anomalies (e.g., (Gleason et al. 2004)).

Any survey of a field with extensive research must necessarily be selective in its coverage. This is certainly the case we faced in preparing this review of research on intra-daily data, fractal processes, and long-range dependence. In this paper, we provide an aerial view of the literature, attempting to combine the research on fractal processes and long-range dependence with the use of intra-daily data. The paper is organized as follows. In Section 23.2, we discuss the stylized facts of intra-daily data observed in financial markets. Computer implementations in studying intra-daily data are the subject of Section 23.3. In Section 23.4, we review the major studies on intra-daily data analyzed in financial markets, running the risk that we have overlooked some important studies that we failed to uncover or that were published while this work was in progress. In Section 23.5, we discuss the analysis of long-range dependence, and in Section 23.6 we introduce two fractal processes (fractional Gaussian noise and fractional stable noise). Our concluding remarks appear in Section 23.7.


23.2 Stylized Facts of Financial Intra-daily Data

In data analysis, an important task is to identify the statistical properties of the target data set. Those statistical properties are referred to as stylized facts. Stylized facts offer building blocks for further modeling, which should be designed so as to encompass these statistical properties. Being a full record of market transactions, intra-daily data exhibit a number of such properties. In this section, some stylized facts of intra-daily data will be reviewed.

23.2.1 Random Durations

For intra-daily data, irregularly spaced time intervals between successive observations are the salient feature compared with classical time series data (see, for example, (Engle 2000) and (Ghysels 2000)). Given time points equally spaced along the time line, intra-daily data may have no observation or several observations at any one time point. In intra-daily data, observations arrive at random times. As a result, the duration between two successive observations is not constant within a trading day. If the time spacing (duration) of a time series is constant over time, it is an equally spaced, or homogeneous, time series. If the duration varies through time, it is an unequally spaced, or inhomogeneous, time series; see (Dacorogna et al. 2001).

23.2.2 Distributional Properties of Returns

Many techniques in modern finance rely heavily on the assumption that the random variables under investigation follow a Gaussian distribution. However, time series observed in finance often deviate from the Gaussian model, in that their marginal distributions are found to possess heavy tails and are possibly asymmetric. (Bollerslev et al. 1992) mention that intra-daily data exhibit fatter tails in the unconditional return distributions and (Dacorogna et al. 2001) confirm the presence of heavy tails in intra-daily returns data. In such situations, the appropriateness of the commonly adopted normal distribution assumption for returns is highly questionable in research involving intra-daily data. It is often argued that financial asset returns are the cumulative outcome of a vast number of pieces of information and individual decisions arriving almost continuously in time. Hence, in the presence of heavy tails, it is natural to assume that they are approximately governed by a non-Gaussian stable distribution, see (Rachev and Mittnik 2000; Rachev et al. 2005). (Marinelli et al. 2000) first model the heavy tailedness in intra-daily data. (Mittnik et al. 2002) point out that other leptokurtic distributions, including Student’s t, Weibull, and hyperbolic, lack the attractive central limit property. (Sun et al. 2006a) confirm the findings of heavy tailedness in intra-daily data.


(Wood et al. 1985) present evidence that stock returns are not independently and identically distributed. They find that the distributions are different for the return series in the first 30 minutes of the trading day, at the market close of the trading day, and during the remainder of the trading day.

23.2.3 Autocorrelation and Seasonality

The study by (Wood et al. 1985), one of the earliest studies employing intra-daily data, finds that the trading day return series is non-stationary and is characterized by a low-order autoregressive process. For the volatility of asset returns, autocorrelation has been documented in the literature (see among others, (Engle 1982; Baillie and Bollerslev 1991; Bollerslev et al. 1992; Hasbrouck 1996; Bollerslev and Wright 2001)). The existence of negative first-order autocorrelation of returns at higher frequency, which disappears once the price formation is over, has been reported in both the foreign exchange market and the equity market. Researchers attribute the negative autocorrelation of stock returns to what is termed the bid-ask bounce. According to the bid-ask bounce explanation, the probability of a trade executing at the bid price and then being followed by a trade executing at the ask price is higher than that of a trade at the bid price followed by another trade at the bid price (see (Alexander 2001; Dacorogna et al. 2001; Gourieroux and Jasiak 2001)).

Many intra-daily data display seasonality. By seasonality, we mean periodic fluctuations. Daily patterns in the trading day have been found in the markets for different types of financial assets (see (Jain and Joh 1988; McInish and Wood 1991; Bollerslev and Domowitz 1993; Engle and Russell 1998; Andersen and Bollerslev 1997; Bollerslev et al. 2000; Veredas et al. 2002)). One such trading pattern is the well-known “U-shape” pattern of daily trading (see (Wood et al. 1985; Ghysels 2000; Gourieroux and Jasiak 2001)). This trading pattern refers to the observation that, within a trading day, trade intensity is high at the beginning and at the end of the day, and trading durations increase and peak during lunch time. As a result, return volatility exhibits a U-shape where the two peaks are the beginning and the end of a trading day, with the bottom approximately during the lunch period. (Hong and Wang 2000) confirm the U-shape patterns in the mean and volatility of returns over the trading day. They find higher trading activity around the open and the close, with open-to-open returns being more volatile than close-to-close returns. Besides the intra-day pattern, there exists a day-of-week pattern evidenced by both lower returns and higher volatility on Monday.
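The negative first-order autocorrelation attributed to the bid-ask bounce can be reproduced with a very small simulation. This is only an illustrative sketch and is not taken from the studies cited above: it assumes a constant mid price, a fixed half-spread, and trades that hit the bid or the ask independently with equal probability, so that all price changes come from the bounce. The function names and parameter values are hypothetical.

```python
import random


def first_order_autocorr(series):
    """Sample first-order autocorrelation of a list of numbers."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i - 1] - mean) for i in range(1, n))
    return cov / var


def simulate_bounce_changes(n_trades, mid=100.0, half_spread=0.05, seed=1):
    """Transaction price changes when trades alternate randomly between bid and ask."""
    rng = random.Random(seed)
    # Each trade executes at the bid (mid - half_spread) or the ask (mid + half_spread).
    prices = [mid + half_spread * rng.choice((-1, 1)) for _ in range(n_trades)]
    return [prices[i] - prices[i - 1] for i in range(1, n_trades)]


changes = simulate_bounce_changes(20000)
rho1 = first_order_autocorr(changes)
print(rho1)
```

Under these assumptions an upward price change (bid to ask) can only be followed by no change or a downward change, which is what drives the estimated autocorrelation below zero even though the mid price never moves.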

23.2.4 Clustering

Many financial time series display volatility clustering. It is observed that large returns are followed by more large returns and small returns by more small returns. Equity, commodity, and foreign exchange markets often exhibit volatility clustering at higher frequency. Volatility clustering becomes pronounced in intra-daily data (see, for example, (Alexander 2001; Haas et al. 2004)). Besides volatility clustering, intra-daily data exhibit quote clustering and duration clustering. Quote or price clustering is the preference for some quotes or prices over others. Duration clustering means that long and short durations tend to occur in clusters (see, for example, (Bauwens and Giot 2001; Chung and Van Ness 2004; Engle and Russell 1998; Feng et al. 2004; Huang and Stoll 2001; Sun et al. 2006b)).

23.2.5 Long-range Dependence

Long-range dependence or long memory (sometimes also referred to as strong dependence or persistence) denotes the property of time series to exhibit persistent behavior. (A more precise mathematical definition will be provided in Section 23.5.) It is generally believed that when the sampling frequency increases for financial returns, long-range dependence will be more significant. (Baillie 1996) discusses long memory processes and fractional integration in econometrics. (Marinelli et al. 2000) propose subordinated modeling to capture long-range dependence and heavy tailedness. Several researchers, focusing on both theoretical and empirical issues, discuss long-range dependence. (Doukhan et al. 2003; Robinson 2003; Teyssière and Kirman 2006) provide an overview of the important contributions to this area of research. Investigating the stocks comprising the German DAX, (Sun et al. 2006a) confirm that long-range dependence does exist in intra-daily return data. We provide a more detailed discussion of long-range dependence in Section 23.5.

23.3 Computer Implementation in Studying Intra-daily Data

23.3.1 Data Transformation

Being a full record of transactions and their associated characteristics, intra-daily data represent the ultimate level of frequency at which trading data are collected. The salient feature of such data is that they are fundamentally irregularly spaced. It is necessary to distinguish intra-daily data from high frequency data because the former are irregularly spaced while the latter are sometimes equally spaced, having been aggregated to a fine fixed time interval. To make the time interval explicit, it is useful to refer to the data by its associated time interval. For example, if the raw intra-daily data have been aggregated to create an equally spaced time series with, say, a five-minute interval, then the return series is referred to as the “5 minutes intra-daily data”. In order to clarify the characteristic of the interval between data points, (Dacorogna et al. 2001) propose employing a definition of homogeneous and inhomogeneous time series. The irregularly spaced time series is called an inhomogeneous time series while the equally spaced one is called a homogeneous time series.

One can aggregate inhomogeneous time series data up to a specified fixed time interval in order to obtain a corresponding homogeneous version. Naturally, such aggregation will either lose information or create noise, or both. For example, if the original data contain more observations than the aggregated series, some information will be lost. If the original data contain far fewer observations than the aggregated series, noise will be generated. Data interpolation is required to create aggregated time series, and different interpolation methods lead to different results. Aggregation therefore leads not only to the loss of information but also to the creation of noise, through errors introduced in the interpolation process. (Engle and Russell 1998) argue that in the aggregation of tick-by-tick data to some fixed time interval, if a short time interval is selected, there will be many intervals having no new information and, as a result, heteroskedasticity will be introduced; and if a wide interval is chosen, microstructure features might be missed. Therefore, it is reasonable to keep the data at the ultimate frequency level (see (Aït-Sahalia et al. 2005)).

Standard econometric techniques are based on homogeneous time series analysis. Applying analytical methods designed for homogeneous time series to inhomogeneous time series may produce unreliable results. That is the dichotomy in intra-daily data analysis: a researcher can retain the full information without creating noise but is challenged by the burden of technical complexity. It is not always necessary to retain the ultimate frequency level. In those instances, inhomogeneous intra-daily data are aggregated to a lower, but still comparatively high, frequency level of homogeneous time series. (Wasserfallen and Zimmermann 1995) show two interpolation methods: linear interpolation and previous-tick interpolation.
Given an inhomogeneous time series with times $t_i$ and values $\varphi_i = \varphi(t_i)$, the index $i$ identifies the irregularly spaced sequence. The target homogeneous time series is given at times $t_0 + j\Delta t$ with fixed time interval $\Delta t$ starting at $t_0$. The index $j$ identifies the regularly spaced sequence. The time $t_0 + j\Delta t$ is bounded by two times $t_i$ of the irregularly spaced series,

\[
I = \max(\, i \mid t_i \le t_0 + j\Delta t \,) \tag{23.1}
\]

\[
t_I \le t_0 + j\Delta t < t_{I+1} \tag{23.2}
\]

Data will be interpolated between $t_I$ and $t_{I+1}$. Linear interpolation gives

\[
\varphi(t_0 + j\Delta t) = \varphi_I + \frac{t_0 + j\Delta t - t_I}{t_{I+1} - t_I}\,(\varphi_{I+1} - \varphi_I) \tag{23.3}
\]

and previous-tick interpolation gives

\[
\varphi(t_0 + j\Delta t) = \varphi_I \tag{23.4}
\]

(Dacorogna et al. 2001) point out that linear interpolation relies on future information whereas previous-tick interpolation is based on information already known. (Müller et al. 1990) suggest that linear interpolation is an appropriate method for stochastic processes with independent and identically distributed (i.i.d.) increments.
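The two interpolation rules (23.3) and (23.4) can be sketched in a few lines. The function below is an illustrative implementation rather than the procedure of any of the cited authors: its name, its arguments and the sample ticks are hypothetical, it assumes strictly increasing tick times, and it falls back to the previous tick when no later tick is available for linear interpolation.

```python
from bisect import bisect_right


def regularize(times, values, t0, dt, n_points, method="previous"):
    """Map an inhomogeneous series (times, values) onto the grid t0 + j*dt."""
    grid_values = []
    for j in range(n_points):
        t = t0 + j * dt
        # I = max(i | t_i <= t), as in (23.1); times must be strictly increasing.
        I = bisect_right(times, t) - 1
        if I < 0:
            raise ValueError("grid point lies before the first tick")
        if method == "previous" or I == len(times) - 1:
            # Previous-tick interpolation (23.4); also used when no later tick exists.
            grid_values.append(values[I])
        else:
            # Linear interpolation (23.3) between ticks I and I + 1.
            weight = (t - times[I]) / (times[I + 1] - times[I])
            grid_values.append(values[I] + weight * (values[I + 1] - values[I]))
    return grid_values


# Hypothetical ticks: times in seconds, values are prices.
tick_times = [0.0, 1.5, 4.0]
tick_values = [10.0, 11.0, 13.5]

previous_tick = regularize(tick_times, tick_values, t0=0.0, dt=1.0, n_points=5)
linear = regularize(tick_times, tick_values, t0=0.0, dt=1.0, n_points=5, method="linear")
print(previous_tick)
print(linear)
```

At the grid point $t = 1$ the linear rule already uses the tick that arrives at $t = 1.5$, which illustrates the remark that linear interpolation relies on future information.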


More advanced techniques have been adopted by some researchers in order to extract the statistical properties of the data while retaining the inhomogeneity of the time series. (Zumbach and Müller 2001), for example, propose a convolution operator to transform the original inhomogeneous time series to a new inhomogeneous time series in order to obtain more sophisticated quantities. Newly developed techniques such as the wavelet method have been adopted to analyze intra-daily data. For example, (Gençay et al. 2001, 2002) employ a wavelet multiscaling method to remove intra-daily seasonality in five-minute intra-daily foreign exchange data. As mentioned in Section 23.2, intra-daily data exhibit daily patterns. Several methods of data adjustment have been adopted in empirical analysis in order to remove such patterns (see (Engle and Russell 1998; Veredas et al. 2002; Bauwens and Giot 2000, 2003; Bauwens and Veredas 2004)).

23.3.2 Data Cleaning

In order to improve the quality of data, data cleaning is required to detect and remove errors and inconsistencies from the data set. Data quality problems result from misspellings during data entry, missing information, and other kinds of data invalidity. There are two types of error: human errors and computer system errors (see (Dacorogna et al. 2001)). When multiple data sources must be integrated, for instance, when pooling the data of each trading day together for one year, the need for data cleaning increases significantly. The reason is that the sources often contain redundant data in different representations. High quality data analysis requires access to accurate and consistent data. Consequently, consolidating different data representations and eliminating duplicate information become necessary.

The object of data cleaning is the time series of transaction information. Usually, this information takes the form of “quotes” or “ticks”. Each quote or tick in the intra-daily data set contains a time stamp, an identification code, and variables of the tick, such as bid/ask price (and volume), trade price (and volume), and location. Intra-daily data might contain several types of errors that require special treatment. Decimal errors occur when the quoting software uses cache memory and fails to change a decimal digit of the quote. Test tick errors are caused by data managers’ testing operations, in which artificial ticks are sent to the system to check the sensitivity of recording. Repeated ticks are caused by data managers’ test operations that let the system repeat the last tick at specified time intervals. Some errors occur when data managers copy the data or when a scaling problem occurs, see (Dacorogna et al. 2001).

(Coval and Shumway 2001) illustrate that the exact time is occasionally identified incorrectly. They use a data cleaning method to ensure that the time stamps on each tick are accurate and scaled to the second level. To aggregate the ticks to the minute level, they sum the variables from 31 seconds past one minute to 30 seconds past the next minute. Some detailed methods used in data cleaning are discussed in (Dacorogna et al. 2001).
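Two of the cleaning steps described above can be sketched in a few lines. The following is only an illustration under stated assumptions: ticks are taken to be time-sorted `(timestamp, price, volume)` tuples, an exact repeat of the previous price and volume is treated as a repeated tick, and each window from 31 seconds past one minute to 30 seconds past the next is labeled by the minute in which it ends. The function names and the sample data are hypothetical and are not the procedures of (Coval and Shumway 2001) or (Dacorogna et al. 2001).

```python
from collections import OrderedDict
from datetime import datetime, timedelta


def drop_repeated_ticks(ticks):
    """Drop ticks whose (price, volume) repeat the previously kept tick.

    Assumption: an exact repeat marks a system-generated duplicate. A real
    filter needs more context to avoid discarding genuine trades.
    """
    cleaned = []
    for tick in ticks:
        if cleaned and tick[1:] == cleaned[-1][1:]:
            continue
        cleaned.append(tick)
    return cleaned


def minute_label(ts):
    """Label of the window running from (mm-1):31 to mm:30."""
    # Seconds 31..59 move into the next minute, seconds 0..30 stay put.
    shifted = ts + timedelta(seconds=29)
    return shifted.replace(second=0, microsecond=0)


def aggregate_to_minutes(ticks):
    """Per window: last price seen and summed volume (ticks sorted by time)."""
    bars = OrderedDict()
    for ts, price, volume in ticks:
        label = minute_label(ts)
        _, total = bars.get(label, (None, 0))
        bars[label] = (price, total + volume)
    return bars


# Hypothetical sample ticks for one instrument.
ticks = [
    (datetime(2001, 1, 2, 10, 0, 31), 100.0, 5),
    (datetime(2001, 1, 2, 10, 0, 45), 100.0, 5),   # exact repeat of the previous tick
    (datetime(2001, 1, 2, 10, 1, 30), 100.5, 2),
    (datetime(2001, 1, 2, 10, 1, 31), 100.25, 4),  # falls into the next window
]
clean = drop_repeated_ticks(ticks)
bars = aggregate_to_minutes(clean)
```

Decimal errors and test ticks would need additional filters, for example a comparison of each quote with a moving reference level, which this sketch does not attempt.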

23.4

Research on Intra-Daily Data in Finance

23.4.1 Studies of Volatility

Volatility is one of the most important risk metrics. (Harris 2003) defines it as the tendency of prices to change unexpectedly. Volatility can be regarded as the market's reaction to news, as reflected in price changes. He distinguishes fundamental volatility from transitory volatility. Fundamental volatility is caused by the endogenous variables which determine the value of trading instruments. Transitory volatility is due to trading activity by uninformed traders. Volatility is not constant over the trading stage, but changes over time. Transitory volatility might occur over a very short time period before prices revert to their fundamental value. Intra-daily data make it possible to observe both fundamental volatility and transitory volatility with sufficient statistical significance.

(Engle 2000) adapts the GARCH model to irregularly spaced ultra-high frequency data. Letting $d_i$ be the duration between two successive transactions and $r_i$ the return between transactions $i-1$ and $i$, the conditional variance per transaction is:

$$V_{i-1}(r_i \mid d_i) = \phi_i \tag{23.5}$$

and the conditional volatility per unit of time is defined as

$$V_{i-1}\!\left(\frac{r_i}{\sqrt{d_i}} \,\Big|\, d_i\right) = \sigma_i^2 \tag{23.6}$$

The connection between equations (23.5) and (23.6) is then established by $\phi_i = d_i \sigma_i^2$. The predicted variance conditional on past returns and durations is $E_{i-1}(\phi_i) = E_{i-1}(d_i \sigma_i^2)$. Using an ARMA(1,1) specification with innovations $\varepsilon_i$, the series of returns per unit of time is

$$\frac{r_i}{\sqrt{d_i}} = a\,\frac{r_{i-1}}{\sqrt{d_{i-1}}} + \varepsilon_i + b\,\varepsilon_{i-1} \tag{23.7}$$

If the current duration contains no information, the simple GARCH specification is used, and then

$$\sigma_i^2 = \omega + \alpha\,\varepsilon_{i-1}^2 + \beta\,\sigma_{i-1}^2 \tag{23.8}$$

If durations are informative, (Engle 2000) proposes an autoregressive conditional duration (ACD) model to define the expected durations. If the ACD model is

$$\psi_i = h + m\,d_{i-1} + n\,\psi_{i-1} \tag{23.9}$$

then the ultra-high frequency GARCH model is expressed as:

$$\sigma_i^2 = \omega + \alpha\,\varepsilon_{i-1}^2 + \beta\,\sigma_{i-1}^2 + \gamma_1 d_i^{-1} + \gamma_2 \frac{d_i}{\psi_i} + \gamma_3 \xi_{i-1} + \gamma_4 \psi_i^{-1} \tag{23.10}$$

where $\xi_{i-1}$ is the long-run volatility computed by exponentially smoothing $r^2/d$ with a parameter of 0.995 such that

$$\xi_i = 0.005 \left(\frac{r_{i-1}^2}{d_{i-1}}\right) + 0.995\,\xi_{i-1} \tag{23.11}$$
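The relations (23.5) to (23.11) can be traced numerically. The sketch below is a minimal illustration, assuming that the previous innovation, the previous variance, the current duration and its expected value are already available; the function names and all parameter values are made up for the example and are not estimates from (Engle 2000).

```python
import math


def per_unit_time_return(r, d):
    """Return per unit of time, r_i / sqrt(d_i), as used in (23.6) and (23.7)."""
    return r / math.sqrt(d)


def smooth_long_run_vol(returns, durations, xi0, weight=0.995):
    """Exponential smoothing of r^2/d as in (23.11); xi[0] is the start value."""
    xi = [xi0]
    for r, d in zip(returns, durations):
        xi.append((1.0 - weight) * r * r / d + weight * xi[-1])
    return xi


def uhf_garch_variance(omega, alpha, beta, gammas,
                       eps_prev, sigma2_prev, d, psi, xi_prev):
    """One step of the variance equation (23.10)."""
    g1, g2, g3, g4 = gammas
    return (omega + alpha * eps_prev ** 2 + beta * sigma2_prev
            + g1 / d + g2 * d / psi + g3 * xi_prev + g4 / psi)


def variance_per_transaction(d, sigma2):
    """phi_i = d_i * sigma_i^2, linking (23.5) and (23.6)."""
    return d * sigma2


# Illustrative (hypothetical) inputs.
sigma2 = uhf_garch_variance(omega=0.1, alpha=0.05, beta=0.9,
                            gammas=(0.02, 0.01, 0.03, 0.04),
                            eps_prev=1.0, sigma2_prev=1.0,
                            d=2.0, psi=2.0, xi_prev=1.0)
phi = variance_per_transaction(2.0, sigma2)
xi = smooth_long_run_vol([0.2], [4.0], xi0=1.0)
```

Setting all four `gammas` to zero reduces the step to the simple GARCH(1,1) recursion (23.8).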

(Alexander 2001) points out that a number of studies have shown that the aggregation properties of GARCH models are not straightforward. The persistence in volatility seems to be lower when it is measured using intra-day data than when it is measured using daily or weekly data. For example, fitting a GARCH(1,1) to daily data would yield a sum of autoregressive and moving average parameter estimates that is greater than the sum estimated by fitting the same GARCH process to 2-day returns. In the heterogeneous ARCH (HARCH) model proposed by (Müller et al. 1997), a modified process is introduced so that the squared returns can be taken at different frequencies. The return $r_t$ of the HARCH($n$) process is defined as follows:

$$r_t = h_t \varepsilon_t, \qquad h_t^2 = \alpha_0 + \sum_{j=1}^{n} \alpha_j \left(\sum_{i=1}^{j} r_{t-i}\right)^{2} \tag{23.12}$$

where the $\varepsilon_t$ are i.i.d. random variables with zero mean and unit variance, $\alpha_0 > 0$, $\alpha_n > 0$, and $\alpha_j \ge 0$ for $j = 1, 2, \ldots, n-1$. The equation for the variance $h_t^2$ is a linear combination of the squares of aggregated returns. Aggregated returns may extend over long intervals from a time point in the distant past up to time $t-1$. The HARCH process belongs to the wide ARCH family but differs from all other ARCH-type processes in the unique property of considering the volatilities of returns measured over different interval sizes.

(Dacorogna et al. 2001) generalize the HARCH process. In equation (23.12), all returns considered by the variance equation are observed over recent intervals ending at time $t-1$. This strong limitation is justified by its empirical success, but a more general formulation of the process, with observation intervals ending in the past before $t-1$, can be written as follows:

$$r_t = h_t \varepsilon_t, \qquad h_t^2 = \alpha_0 + \sum_{j=1}^{n} \sum_{k=1}^{j} \alpha_{jk} \left(\sum_{i=k}^{j} r_{t-i}\right)^{2} + \sum_{i=1}^{q} b_i h_{t-i}^2 \tag{23.13}$$

where

$$\alpha_0 > 0, \quad \alpha_{jk} \ge 0 \ \text{for } j = 1, 2, \ldots, n;\ k = 1, 2, \ldots, j; \quad b_i \ge 0 \ \text{for } i = 1, 2, \ldots, q. \tag{23.14}$$
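The variance equation (23.12) is straightforward to evaluate once the most recent returns are known. The following is a minimal sketch, assuming the past returns are supplied most recent first; the function name and the numbers are illustrative only.

```python
def harch_variance(past_returns, alpha0, alphas):
    """Conditional variance h_t^2 of a HARCH(n) process, equation (23.12).

    past_returns[0] is r_{t-1}, past_returns[1] is r_{t-2}, and so on.
    alphas[j-1] is alpha_j for j = 1..n.
    """
    n = len(alphas)
    if len(past_returns) < n:
        raise ValueError("need at least n past returns")
    h2 = alpha0
    aggregated = 0.0
    for j in range(1, n + 1):
        aggregated += past_returns[j - 1]        # r_{t-1} + ... + r_{t-j}
        h2 += alphas[j - 1] * aggregated ** 2
    return h2


# Hypothetical inputs: three past returns and HARCH(3) coefficients.
h2 = harch_variance([1.0, -2.0, 3.0], alpha0=0.5, alphas=[0.1, 0.2, 0.3])
```

Because each term squares a return aggregated over a different horizon, offsetting returns (here 1.0 and -2.0) contribute less at longer horizons than they would in a plain ARCH sum of squared one-period returns.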

The generalized process equation considers all returns between any pair of two time points in the period between $t-n$ and $t-1$. It covers the case of HARCH (all $\alpha_{jk} = 0$ except some $\alpha_{j1}$), as well as that of ARCH and GARCH (all $\alpha_{jk} = 0$ except some $\alpha_{jj}$).

Realized volatility can be computed from historical data. (Dacorogna et al. 2001) define the realized volatility as:

$$\upsilon(t_i) = \left[\frac{1}{n} \sum_{j=1}^{n} \left| r(\Delta t;\, t_{i-n+j}) \right|^{p}\right]^{1/p} \tag{23.15}$$

where $n$ is the number of return observations, $r$ stands for the returns over regularly spaced time intervals, and $\Delta t$ is the return interval. (Taylor and Xu 1997; Andersen and Bollerslev 1998; Giot and Laurent 2004), among others, show that the daily realized volatility can be estimated by summing intra-daily squared returns. Given that a trading day can be divided into $n$ equally spaced time intervals, and if $r_{i,t}$ denotes the intra-daily return of the $i$th interval of day $t$, the daily volatility for day $t$ can be expressed as:

$$\left[\sum_{i=1}^{n} r_{i,t}\right]^{2} = \sum_{i=1}^{n} r_{i,t}^{2} + 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} r_{j,t}\, r_{j-i,t} \tag{23.16}$$

(Andersen et al. 2001a) show that if the returns have zero mean and are uncorrelated, $\sum_{i=1}^{n} r_{i,t}^{2}$ is a consistent and unbiased estimator of the daily variance. $\left[\sum_{i=1}^{n} r_{i,t}\right]^{2}$ is called the daily realized volatility since all squared returns on the right side of equation (23.16) can be observed when intra-daily data sampled over an equally spaced time interval are available. This is one method for modeling daily volatility using intra-daily data, and a method that has been generalized by (Andersen et al. 2001a, 2001b). Another method is to estimate an intra-daily duration model on trade durations for a given asset. It is observed that longer durations lead to lower volatility, that shorter durations lead to higher volatility, and that durations are informative, see (Engle 2000; Dufour and Engle 2000). (Gerhard and Hautsch 2002) estimate daily volatility from an intra-daily duration model in which irregularly time spaced volatility is used at the aggregated level. Several articles provide a detailed discussion of modeling volatility, see, for example, (Andersen et al. 2005a, 2005b, 2006).

Modeling and forecasting volatility based on intra-daily data has attracted a lot of research interest. (Andersen et al. 2001) improve the inference procedures for using intra-daily data forecasts. (Bollerslev and Wright 2001) propose a method to model volatility dynamics by fitting an autoregressive model to log-squared, squared, or absolute returns of intra-daily data. They show that when working with intra-daily data, a simple autoregressive model can provide better forecasts of future volatility than standard GARCH or exponential GARCH (EGARCH) models. They suggest that intra-daily data can easily be used to generate superior daily volatility forecasts. (Blair et al. 2001) offer evidence that intra-daily returns provide much more accurate forecasts of realized volatility than daily returns. In the context of stochastic volatility models, (Barndorff-Nielsen and Shephard 2002) investigate the statistical properties of realized volatility. (Corsi et al. 2005) study the time-varying volatility of realized volatility. Using a GARCH diffusion process, (Martens 2001; Martens et al. 2002) point out that using intra-daily data can improve out-of-sample daily volatility forecasts. (Bollen and Inder 2002) use the vector autoregressive heteroskedasticity and autocorrelation (VARHAC) estimator to estimate daily realized volatility from intra-daily data. (Andersen et al. 2003) propose a modeling framework for integrating intra-daily data to predict daily return volatility and the return distribution. (Thomakos and Wang 2003) investigate the statistical properties of the daily realized volatility of futures contracts generated from intra-daily data. Using 5-minute intra-daily foreign exchange data, (Morana and Beltratti 2004) illustrate the existence of structural breaks and long memory in the realized volatility process. (Fleming et al. 2003) indicate that the economic value of the realized volatility approach is substantial and that it delivers several economic benefits for investment decision making. (Andersen et al. 2005a) summarize the parametric and non-parametric methods used in volatility estimation and forecasting.
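Equations (23.15) and (23.16) can be checked on a handful of numbers. The sketch below is a minimal illustration, assuming a list of equally spaced intra-daily returns for one day; the function names and the return values are hypothetical.

```python
def realized_volatility(returns, p=2.0):
    """[(1/n) * sum |r_j|^p]^(1/p), equation (23.15)."""
    n = len(returns)
    return (sum(abs(r) ** p for r in returns) / n) ** (1.0 / p)


def sum_of_squared_returns(intraday_returns):
    """First term on the right side of (23.16)."""
    return sum(r * r for r in intraday_returns)


def cross_product_term(intraday_returns):
    """Second term on the right side of (23.16): 2 * sum_i sum_j r_j r_{j-i}."""
    r = intraday_returns
    n = len(r)
    # 0-based indices: lag i = 1..n-1, position j = i..n-1.
    return 2.0 * sum(r[j] * r[j - i] for i in range(1, n) for j in range(i, n))


# Hypothetical intra-daily returns for one trading day.
day = [0.01, -0.02, 0.03]
lhs = sum(day) ** 2
rhs = sum_of_squared_returns(day) + cross_product_term(day)
```

For uncorrelated zero-mean returns the cross-product term has expectation zero, which is why the sum of squared intra-daily returns alone serves as the estimator of the daily variance.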

23.4.2

Studies of Liquidity

A topic of debate in finance is the meaning of liquidity. Some market participants refer to liquidity as the ability to convert an asset into cash quickly, some define liquidity in terms of low transaction costs, and some regard high transaction activity as liquidity, see (Easley and O’Hara 2003) and (Schwartz and Francioni 2004). Schwartz and Francioni point out that a better approach to defining liquidity should be based on the attributes of liquidity. They define depth, breadth, and resiliency as the dimensions of liquidity, while (Harris 2003) identifies immediacy, width, and depth as the relevant attributes of liquidity. In the microstructure literature, the bid/ask spread serves as a proxy for liquidity, see (Easley and O’Hara 2003). (Schwartz and Francioni 2004) show that liquidity can be approximated by the trade frequency of an asset traded in the market. The frequency can be measured by the magnitude of short-term price fluctuations for such an asset. (Chordia et al. 2002) find that order imbalances affect liquidity and returns at the aggregate market level. They suggest using order imbalance as a proxy for liquidity.

Using the bid/ask spread as a proxy for liquidity, (Chung et al. 2001) compare the bid/ask spread on the NYSE and Nasdaq. They find that the average NYSE specialist spreads are significantly smaller than the Nasdaq specialist spreads. (Huang and Stoll 2001) report that dealer markets have relatively higher spreads than auction markets. By comparing spreads in different markets, they find that the spreads on the London Stock Exchange are larger than those for the same stocks listed on the NYSE, and that the spreads on Nasdaq are larger than those on stocks listed on the NYSE. By examining the growth of electronic communication networks (ECNs) on Nasdaq, (Weston 2002) confirms that the electronic trading system has the ability to improve liquidity on Nasdaq. (Kalay et al. 2004) report that the opening is more liquid than the continuous trading stage. For small price changes and small quantities, the supply curve is less elastic than the demand curve.

From the perspective of liquidity providers, researchers usually use order book data. (Coughenour and Deli 2002) categorize the specialist firms on the NYSE into two types: owner-specialist firms and employee-specialist firms. By investigating the influence of these two types of liquidity providers, they show that with similar trading costs, the owner-specialist firms have a greater frequency of large trades and a greater incentive to reduce adverse selection costs. For employee-specialist firms, the stocks traded exhibit less sensitivity between changes in quoted depth and quoted spreads. Meanwhile, these stocks show price stability at the opening. (Peterson and Sirri 2002) investigate order submission strategies and find that limit orders perform worse than market orders in terms of trading costs. Investors nevertheless prefer limit orders, suggesting that individual investors are less able to choose an optimal trading strategy.

23.4.3

Studies of Market Microstructure

Market microstructure refers to the description of trading processes for obtaining a better understanding of how prices adjust in order to reflect new information. Researchers in this field typically investigate the information contained in intra-daily data. The trading mechanisms and the price information flows are the major interest of market microstructure studies. (Madhavan 2000) provides a good survey of market microstructure research based on four topics: price formation, market structure and design, transparency, and other applications.

Transaction costs, private information, and alternative trading systems have been investigated in the research on trading processes. Transaction costs are “market frictions” and reduce the trading frequency of investors. (O’Hara 1995; Madhavan 2000) demonstrate that private information affects investors’ beliefs about the asset value. If investors possess private information, new information arrivals can be reflected by the order flows. The market structure might influence the size of trading costs. Different market structures have different functions for finding prices and matching buyers and sellers. The transparency of a given market structure also affects the functioning of market mechanisms, see (Bloomfield and O’Hara 1999, 2000; O’Hara 1995; Spulber 1999; Madhavan 2000).

How are prices determined in the financial market? As (Madhavan 2000) notes, studying the market maker is the logical starting point for any analysis of market price determination. In market microstructure studies, inventory and asymmetric information are the factors that influence price movements. (Naik and Yadav 2003) examine whether equivalent inventories or ordinary inventories dominate the trading and pricing decisions of dealer firms. They find that ordinary inventories play the main role, consistent with the decentralized nature of market making. (Bollen et al. 2004) propose a new model based on intra-daily data to understand and measure the determinants of the bid/ask spread of market makers. Using a vector autoregressive model, (Dufour and Engle 2000) find that the time durations between transactions affect the process of price formation. (Engle and Lunde 2003) develop a bivariate model of trades and quotes for NYSE traded stocks. They find that high trade arrival rates, large volume per trade, and wide bid/ask spreads can be regarded as information flow variables that revise prices. They suggest that if such information is flowing, prices respond to trades more quickly.

In the study of market quality, much of the debate in the literature, according to (Madhavan 2000), centers on floor versus electronic markets, and on auction versus dealer systems. Using intra-daily data for Bund futures contracts, (Frank and Hess 2000) compare the electronic system of the DTB (Deutsche Terminbörse) and the floor trading of the LIFFE (London International Financial Futures Exchange). They find that floor trading turns out to be more attractive in periods of high information intensity, high volatility, high volume, and high trading frequency. The reason they offer for this finding is that market participants infer more information from observing actual trades. (Coval and Shumway 2001) confirm the claim that market participants gather all possible information rather than relying only on easily observable data, say, past prices, in determining their trade values. They suggest that subtle but non-transaction signals play important roles. They find that electronic exchanges lose information that a face-to-face exchange setting can convey. Based on the co-existence of the floor and electronic trading systems in the German stock market, (Theissen 2002) argues, using intra-daily data, that both systems contribute almost equally to the price discovery process. Based on testing whether upstairs intermediation can lower adverse selection costs, (Smith et al. 2001) report that the upstairs market is complementary in supplying liquidity.
(Bessembinder and Venkataraman 2004) show that the upstairs market does a good job of complementing the electronic exchange because the upstairs market is efficient for many large trades and block-sized trades. By confirming that market structure does affect the incorporation of news into market prices, (Masulis and Shivakumar 2002) compare the speed of adjustment on the NYSE and Nasdaq. They demonstrate that price adjustments to new information are faster on Nasdaq. (Weston 2002) confirms that the electronic trading system has improved the liquidity of Nasdaq. (Boehmer et al. 2005) suggest that since electronic trading reveals more information, market quality could be increased by the information exposed in the electronic trading system. (Huang and Stoll 2001) point out that tick size, bid/ask spread, and market depth are not independent of market structure but are linked to it. It is therefore necessary to take account of tick size, bid/ask spread, and market depth when analyzing the market structure. Numerous studies investigate how market structure affects price discovery and trading costs in different securities markets. (Chung and Van Ness 2001) show that after new order handling rules were introduced on Nasdaq, bid/ask spreads decreased, confirming that market structure has a significant effect on trading costs and the price formation process.

Market transparency is the ability of market participants to observe information about the trading process, see (O’Hara 1995). Referring to the time of the trade, (Harris 2003) defines ex ante (pre-trade) transparency and ex post (post-trade) transparency. Comparing market transparency is complicated by the absence of a criterion that can be used to judge the superiority of one trading system over another, such as floor market versus electronic market, anonymous trading versus disclosure trading, and auction system versus dealer system. According to (Madhavan 2000), there is broad agreement about the influence of market transparency. Market transparency does affect informative order flow and the process of price discovery. Madhavan also points out that while partial disclosure will improve liquidity and reduce trading costs, complete disclosure will reduce liquidity, a situation (Harris 2003) refers to as “ambivalence” from the viewpoint of trader psychology. By investigating OpenBook¹ on the NYSE, (Boehmer et al. 2005) find a higher cancellation rate and a shorter time-to-cancel of limit orders in the book, suggesting that traders attempt to manage the exposure of their trades. By confirming that market design does affect the trading strategy of investors, they support increasing pre-trade transparency and suggest that market quality can be enhanced by greater transparency of the limit order book.

23.4.4

Studies of Trade Duration

There is considerable interest in the information content and implications of the spacing between consecutive financial transactions (referred to as trade duration) for trading strategies and intra-day risk management. Market microstructure theory, supported by empirical evidence, suggests that the spacing between trades be treated as a variable to be explained or predicted, since time carries information and closely correlates with price volatility (see (Bauwens and Veredas 2004), (Engle 2000), (Engle and Russell 1998), (Hasbrouck 1996), and (O’Hara 1995)). (Manganelli 2005) finds that returns and volatility directly interact with trade durations and trade order size. Trade durations tend to exhibit long-range dependence, heavy tailedness, and clustering (see (Bauwens and Giot 2000), (Dufour and Engle 2000), (Engle and Russell 1998), and (Jasiak 1998)). Several studies have modeled durations based on point processes in order to discover the information they contain (see (Bauwens and Hautsch 2006)). Before reviewing the major studies of trade duration in this section, we introduce the mathematics underlying duration models, that is, point processes.

¹ OpenBook was introduced in January 2002, allowing traders off the NYSE floor to observe each price in real time for all listed securities. Before OpenBook, only the best bid/ask could be observed.

23.4.4.1 Point Processes in Modeling Duration

Point processes are stochastic processes. Let us first define the stochastic process. Given a probability space $(\Omega, A, P)$, a family of random variables $(X_t)_{t \in T}$ on $\Omega$ with values in some set $M$ (i.e., $X_t : (\Omega, A) \to (M, B)$ for all $t \in T$, where $T$ is some index set) is defined as a stochastic process with index set $T$ and state space $M$.

A point process is then a sequence $(T_n)_{n \in \mathbb{N}}$ of positive real random variables such that $T_n(\omega) < T_{n+1}(\omega)$ for all $\omega \in \Omega$ and all $n \in \mathbb{N}$, and $\lim_{n \to \infty} T_n(\omega) = \infty$ for all $\omega \in \Omega$. In this definition, $T_n$ is called the $n$th arrival time, with $T_n = \sum_{i=1}^{n} \tau_i$, and $\tau_n = T_n - T_{n-1}$ (where $\tau_1 = T_1$) is called the $n$th waiting time (duration) of the point process.

Point processes and counting processes are tightly connected. A stochastic process $(N_t)_{t \in [0,\infty)}$ is a counting process if: (i) $N_t : (\Omega, A) \to (\mathbb{N}_0, P(\mathbb{N}_0))$ for all $t \ge 0$; (ii) $N_0 \equiv 0$; (iii) $N_s(\omega) \le N_t(\omega)$ for all $0 \le s < t$ and all $\omega \in \Omega$; (iv) $\lim_{s \to t,\, s > t} N_s(\omega) = N_t(\omega)$ for all $t \ge 0$ and all $\omega \in \Omega$; (v) $N_t(\omega) - \lim_{s \to t,\, s < t} N_s(\omega) \in \{0, 1\}$ for all $t > 0$ and all $\omega \in \Omega$; and (vi) $\lim_{t \to \infty} N_t(\omega) = \infty$ for all $\omega \in \Omega$. A point process $(T_n)_{n \in \mathbb{N}}$ corresponds to a counting process $(N_t)_{t \in [0,\infty)}$ and vice versa, i.e.,

$$N_t(\omega) = \left|\{n \in \mathbb{N} : T_n(\omega) \le t\}\right| \tag{23.17}$$

for all $\omega \in \Omega$ and all $t \ge 0$. For all $\omega \in \Omega$ and all $n \in \mathbb{N}$,

$$T_n(\omega) = \min\{t \ge 0 : N_t(\omega) = n\} \tag{23.18}$$

The mean value function of the counting process is $m(t) = E(N_t)$ for $t \ge 0$, with $m : [0, \infty) \to [0, \infty)$ and $m(0) = 0$. $m$ is an increasing and right-continuous function with $\lim_{t \to \infty} m(t) = \infty$. If the mean value function is differentiable at $t > 0$, then the first-order derivative is called the intensity of the counting process. Defining $\lambda(t) = dm(t)/dt$,

$$\lim_{\Delta t \to 0,\, \Delta t \ne 0} \frac{1}{|\Delta t|}\, P(N_{t+\Delta t} - N_t = 1) = \lambda(t) \tag{23.19}$$

and

$$\lim_{\Delta t \to 0,\, \Delta t \ne 0} \frac{1}{|\Delta t|}\, P(N_{t+\Delta t} - N_t \ge 2) = 0 \tag{23.20}$$
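The correspondence between arrival times and the counting process in (23.17) and (23.18) can be made concrete for a single observed path. This is a minimal sketch with hypothetical waiting times; the helper names are assumptions introduced for the example.

```python
from bisect import bisect_right
from itertools import accumulate


def arrival_times(waiting_times):
    """T_n = tau_1 + ... + tau_n for strictly positive waiting times."""
    return list(accumulate(waiting_times))


def count_at(arrivals, t):
    """N_t = number of arrival times T_n with T_n <= t, equation (23.17)."""
    return bisect_right(arrivals, t)


def nth_arrival(arrivals, n):
    """T_n = min{t >= 0 : N_t = n}, equation (23.18), for n = 1, 2, ..."""
    return arrivals[n - 1]


# Hypothetical durations between consecutive trades (in seconds).
waiting = [0.5, 1.5, 0.25]
arrivals = arrival_times(waiting)   # 0.5, 2.0, 2.25
```

Using `bisect_right` makes the path right-continuous: the count already includes an event at the instant it occurs, as required by condition (iv).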

From the viewpoint of point processes, intra-daily financial data can be described as marked point processes; that is, the state space is the product space $\mathbb{R}^2 \otimes M$, where $M$ is the mark space. (Engle 2000) points out that ultra-high frequency transaction data contain two types of processes: (1) the times of transactions and (2) the events observed at the time of each transaction. Those events can be identified or described by marks, such as trade prices, posted bid and ask prices, and volume. As explained earlier, the amount of time between events is the duration. The intensity is used to characterize point processes and is defined as the expected number of events per time increment, considered as a function of time. In survival analysis, the intensity equals the hazard rate. For $n$ durations $d_1, d_2, \ldots, d_n$, which are sampled from a population with density function $f$ and corresponding cumulative distribution function $F$, the survival function $S(t)$ is:

$$S(t) = P[d_i > t] = 1 - F(t) \tag{23.21}$$

and the intensity or hazard rate $\lambda(t)$ is:

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{P[t < d_i \le t + \Delta t \mid d_i > t]}{\Delta t} \tag{23.22}$$

The survival function and the density function can be obtained from the intensity,

$$\lambda(t) = \frac{f(t)}{S(t)} = \frac{-d \log S(t)}{dt} \tag{23.23}$$
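Relation (23.23) can be verified numerically for a concrete duration distribution. The sketch below assumes Weibull-distributed durations purely for illustration (the exponential distribution is the special case with shape parameter 1); the function names, parameter values, and the finite-difference step are assumptions of the example.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t/scale)^shape), cf. (23.21)."""
    return math.exp(-((t / scale) ** shape))


def weibull_density(t, shape, scale):
    """f(t) = (shape/scale) * (t/scale)^(shape-1) * S(t)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0) * weibull_survival(t, shape, scale)


def hazard_from_density(t, shape, scale):
    """lambda(t) = f(t) / S(t), first equality in (23.23)."""
    return weibull_density(t, shape, scale) / weibull_survival(t, shape, scale)


def hazard_from_log_survival(t, shape, scale, h=1e-6):
    """lambda(t) = -d log S(t) / dt, approximated by a central difference."""
    upper = math.log(weibull_survival(t + h, shape, scale))
    lower = math.log(weibull_survival(t - h, shape, scale))
    return -(upper - lower) / (2.0 * h)


# Hypothetical parameters: shape 1.5, scale 3.0, evaluated at t = 2.0.
lam_a = hazard_from_density(2.0, 1.5, 3.0)
lam_b = hazard_from_log_survival(2.0, 1.5, 3.0)
```

With shape 1 the hazard is constant at `1/scale`, which is the memoryless case; shapes above or below 1 give increasing or decreasing hazards respectively.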

23.4.4.2 Major Duration Models

Several models have been proposed to model durations by estimating the intensity. The favored models in the literature are the autoregressive conditional duration (ACD) model proposed by (Engle and Russell 1998), the stochastic conditional duration (SCD) model by (Bauwens and Veredas 2004), and the stochastic volatility duration (SVD) model by (Ghysels et al. 2004). The ACD model expresses the conditional expectation of duration as a linear function of past durations and past conditional expectations. The disturbance is specified as following an exponential distribution or, as an extension, a Weibull distribution. The SCD model assumes that a latent variable drives the movement of durations, so that expected durations in the SCD model are expressed by observed durations driven by a latent variable. The SVD model seeks to capture the mean and variance of durations. After these models were proposed, the following extensions appeared in the literature:

• (Jasiak 1998): the fractionally integrated ACD model;
• (Bauwens and Giot 2000): the logarithmic ACD model;
• (Zhang et al. 2001): the threshold ACD model;
• (Bauwens and Giot 2003): the asymmetric ACD model;
• (Feng et al. 2004): the linear non-Gaussian state-space SCD model.

The ACD($m$,$n$) model specified in (Engle and Russell 1998) is

$$d_i = \psi_i \varepsilon_i \tag{23.24}$$

$$\psi_i = \omega + \sum_{j=1}^{m} \alpha_j d_{i-j} + \sum_{j=1}^{n} \beta_j \psi_{i-j} \tag{23.25}$$
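To see how (23.24) and (23.25) generate clustered durations, an ACD(1,1) path can be simulated. The following is a minimal sketch, assuming unit-mean exponential disturbances and starting the recursion at the unconditional mean $\omega/(1-\alpha-\beta)$; the function name, the seed, and the parameter values are illustrative choices, not estimates from the literature.

```python
import random


def simulate_acd11(n, omega, alpha, beta, seed=0):
    """Simulate n durations from an ACD(1,1) model with exponential(1) errors.

    Requires alpha + beta < 1 so that the unconditional mean exists.
    Returns the durations d_i and the conditional expectations psi_i.
    """
    rng = random.Random(seed)
    psi_prev = omega / (1.0 - alpha - beta)   # start at the unconditional mean
    d_prev = psi_prev
    durations, psis = [], []
    for _ in range(n):
        psi = omega + alpha * d_prev + beta * psi_prev   # equation (23.25)
        d = psi * rng.expovariate(1.0)                   # equation (23.24)
        durations.append(d)
        psis.append(psi)
        d_prev, psi_prev = d, psi
    return durations, psis


durations, psis = simulate_acd11(20000, omega=0.1, alpha=0.1, beta=0.8, seed=1)
mean_duration = sum(durations) / len(durations)
```

With these parameter values the unconditional mean duration is 1, so `mean_duration` should come out close to 1 for a long simulated path.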

(Bauwens and Giot 2000) give the logarithmic version of the ACD model as follows:

$$d_i = e^{\psi_i} \varepsilon_i \tag{23.26}$$

Two possible specifications of conditional durations are

$$\psi_i = \omega + \sum_{j=1}^{m} \alpha_j \log d_{i-j} + \sum_{j=1}^{n} \beta_j \psi_{i-j} \tag{23.27}$$

and

$$\psi_i = \omega + \sum_{j=1}^{m} \alpha_j \log \varepsilon_{i-j} + \sum_{j=1}^{n} \beta_j \psi_{i-j} \tag{23.28}$$

(Zhang et al. 2001) extend the conditional duration to a switching-regime version. Defining $L_q = [l_{q-1}, l_q)$ for $q = 1, 2, \ldots, Q$ and a positive integer $Q$, where $-\infty = l_0 < l_1 < \ldots < l_Q = +\infty$ are the threshold values, $d_i$ follows a $Q$-regime threshold ACD (TACD($m$,$n$)) model; that is, in regime $q$:

$$\psi_i = \omega^{(q)} + \sum_{j=1}^{m} \alpha_j^{(q)} d_{i-j} + \sum_{j=1}^{n} \beta_j^{(q)} \psi_{i-j} \tag{23.29}$$

For example, if there is a threshold value $l_h$ with $0 < h < Q$, the TACD(1,1) model can be expressed as follows:

$$\psi_i = \begin{cases} \omega_1 + \alpha_1 d_{i-1} + \beta_1 \psi_{i-1}, & 0 < d_{i-1} \le l_h \\ \omega_2 + \alpha_2 d_{i-1} + \beta_2 \psi_{i-1}, & l_h < d_{i-1} \end{cases} \tag{23.30}$$
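The two-regime update in (23.30) amounts to selecting a coefficient triple according to the previous duration. A minimal sketch follows, assuming a single threshold and made-up coefficients; the function name and argument layout are hypothetical.

```python
def tacd11_step(d_prev, psi_prev, threshold, regime_low, regime_high):
    """One update of the conditional duration in a two-regime TACD(1,1).

    regime_low and regime_high are (omega, alpha, beta) triples; the low
    regime applies when 0 < d_prev <= threshold, the high regime otherwise.
    """
    if d_prev <= 0.0:
        raise ValueError("durations must be positive")
    omega, alpha, beta = regime_low if d_prev <= threshold else regime_high
    return omega + alpha * d_prev + beta * psi_prev


# Hypothetical coefficients for short-duration and long-duration regimes.
low = (0.1, 0.2, 0.7)
high = (0.3, 0.1, 0.8)
psi_short = tacd11_step(0.5, 1.0, threshold=1.0, regime_low=low, regime_high=high)
psi_long = tacd11_step(2.0, 1.0, threshold=1.0, regime_low=low, regime_high=high)
```

Feeding each new duration back into `tacd11_step` gives the full recursion; with identical triples in both regimes it collapses to the ordinary ACD(1,1) update.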

The threshold $l_h$ determines the regime boundaries. (Fernandes and Grammig 2005) propose nonparametric tests for ACD models and suggest their practical application to the estimation of intra-daily volatility patterns. (Sun et al. 2006) model the trade durations of 18 Dow Jones index component stocks using the ACD model with positively defined fractal processes and find that these fractal processes describe the data better than the ACD model with i.i.d. distributions.

The SCD model given by (Bauwens and Veredas 2004) takes the following form:

$$d_i = \Psi_i \varepsilon_i \tag{23.31}$$

$$\Psi_i = e^{\psi_i} \tag{23.32}$$

where

$$\psi_i = \omega + \beta \psi_{i-1} + u_i \tag{23.33}$$

in which $|\beta| < 1$, $I_{i-1}$ denotes the information set before $d_i$, $u_i \mid I_{i-1} \sim N(0, \sigma^2)$, $\varepsilon_i \mid I_{i-1}$ follows some distribution with positive support, and $u_i$ is independent of $\varepsilon_j \mid I_{i-1}$ for any $i$ and $j$.
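In contrast to the ACD recursion, the log conditional duration in (23.33) is driven by its own Gaussian shock rather than by past durations. The sketch below simulates such a path, assuming unit-mean exponential $\varepsilon_i$ as one admissible distribution with positive support; the function name, seed, and parameter values are illustrative assumptions.

```python
import math
import random


def simulate_scd(n, omega, beta, sigma, seed=0):
    """Simulate n durations from the SCD model (23.31)-(23.33).

    Assumption: epsilon_i ~ exponential(1); the model itself only requires
    a distribution with positive support.
    """
    if abs(beta) >= 1.0:
        raise ValueError("need |beta| < 1")
    rng = random.Random(seed)
    psi = omega / (1.0 - beta)            # start at the mean of the latent AR(1)
    durations = []
    for _ in range(n):
        psi = omega + beta * psi + rng.gauss(0.0, sigma)   # latent factor (23.33)
        durations.append(math.exp(psi) * rng.expovariate(1.0))
    return durations


scd_durations = simulate_scd(1000, omega=0.0, beta=0.9, sigma=0.2, seed=3)
```

Because the latent factor is unobserved, estimating such a model from data requires filtering or simulation-based methods, which this sketch does not cover.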

(Ghysels et al. 2004) propose an SVD model by assuming that durations are independently and exponentially distributed with gamma heterogeneity. More explicitly, their model can be expressed as:

$$d_i = \frac{U_i}{c V_i} \tag{23.34}$$

where $U_i$ and $V_i$ are two independent variables with an exponential distribution and a gamma($a$, $a$) distribution, respectively. With suitable nonlinear transformations, this expression can be rewritten in terms of Gaussian factors:

$$d_i = \frac{\Psi(1, \Phi(F_1))}{c\,\Psi(a, \Phi(F_2))} = \frac{H(1, F_1)}{c\,H(a, F_2)} \tag{23.35}$$

where $F_1$ and $F_2$ are i.i.d. standard normal variables, $\Phi$ is the cdf of the standard normal distribution, $\Psi(a, \cdot)$ is the quantile function of the gamma($a$, $a$) distribution, and $H(a, \cdot) = \Psi(a, \Phi(\cdot))$.
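Equation (23.34) can be simulated directly from its two independent factors. The sketch below is an illustration under stated assumptions: $U_i$ is taken to be unit-rate exponential and gamma($a$, $a$) is read as shape $a$ and rate $a$ (mean 1); the function name, seed, and parameter values are hypothetical.

```python
import random
import statistics


def simulate_svd(n, a, c, seed=0):
    """Simulate n durations d_i = U_i / (c * V_i), equation (23.34).

    Assumptions: U_i ~ exponential(1) and V_i ~ gamma with shape a and
    rate a, i.e. scale 1/a in random.gammavariate's parameterization.
    """
    rng = random.Random(seed)
    durations = []
    for _ in range(n):
        u = rng.expovariate(1.0)
        v = rng.gammavariate(a, 1.0 / a)
        durations.append(u / (c * v))
    return durations


svd_durations = simulate_svd(20000, a=3.0, c=2.0, seed=5)
sample_median = statistics.median(svd_durations)
# Under the assumptions above, P(d > t) = (1 + c*t/a)^(-a), so the median is:
theoretical_median = (3.0 / 2.0) * (2.0 ** (1.0 / 3.0) - 1.0)
```

Mixing the exponential over a gamma-distributed rate produces a heavier tail than a plain exponential, which is the point of the gamma heterogeneity.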

23.5

Long-Range Dependence

Long-range dependence is the dependence structure across long time periods. As stated earlier, it denotes the property of a time series to exhibit persistent behavior, i.e., a significant dependence between very distant observations and a pole in the neighborhood of the zero frequency of its spectrum. In the time domain, if $\{X_t, t \in T\}$ exhibits long-range dependence, its autocovariance function $\gamma(k)$ has the property $\sum_k |\gamma(k)| = \infty$, where $k$ measures the distance between two observations, i.e., the order of lags. In the frequency domain, if $\{X_t, t \in T\}$ exhibits long-range dependence, its spectral density $f(\lambda)$ ($-\pi < \lambda < \pi$) has a “pole” at frequency zero, i.e., $f(0) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma(k) = \infty$.

23.5.1 Estimation and Detection of LRD in Time Domain

23.5.1.1 The Rescaled Adjusted Range Approach

The rescaled adjusted range method, denoted by $R/S$, was proposed by (Hurst 1951) and discussed in detail in (Mandelbrot and Wallis 1969; Mandelbrot 1975; Mandelbrot and Taqqu 1979; Beran 1994). For a time series $\{X_t, t \ge 1\}$, let $Y_T = \sum_{t=1}^{T} X_t$ and

$$S^2(t, k) = \frac{1}{k} \sum_{i=t+1}^{t+k} (X_i - \bar{X}_{t,k})^2 \tag{23.36}$$

where $\bar{X}_{t,k} = k^{-1} \sum_{i=t+1}^{t+k} X_i$. Then define the adjusted range

$$R(t, k) = \max_{0 \le i \le k}\left[Y_{t+i} - Y_t - \frac{i}{k}(Y_{t+k} - Y_t)\right] - \min_{0 \le i \le k}\left[Y_{t+i} - Y_t - \frac{i}{k}(Y_{t+k} - Y_t)\right] \tag{23.37}$$

The standardized ratio $R(t, k)/S(t, k)$ is the rescaled adjusted range, i.e., the $R/S$ statistic. Hurst observed that for large $k$, based on the Nile River data, $\log E[R/S] \approx a + H \log k$ with $H > 0.5$. To determine $H$ by using the $R/S$ statistic, we can do the following:

1. Divide the time series of length $N$ into $K$ blocks.
2. For each lag $k$, starting at the points $t_i = iN/K + 1$, $i = 1, 2, \ldots$, compute $R(t_i, k)/S(t_i, k)$ for all possible $k$ such that $t_i + k \le N$.
3. Plot its logarithm against the logarithm of $k$. This plot is sometimes called the pox plot for the $R/S$ statistic.
4. The parameter $H$ is the estimated slope of the line in the pox plot.

The $R/S$ method requires cutting off both the low and the high end of the plot to obtain reliable estimates. The low end of the plot reflects the short-range dependence in the time series, and there are too few points at the high end. In the literature it is also argued that the $R/S$ method cannot provide confidence intervals for the estimates and cannot discriminate slight LRD from no LRD. Compared with other methods, the $R/S$ approach is less efficient. When the time series is nonstationary and departs from the normal distribution, this method is not robust, see (Lo 1991; Taqqu et al. 1995; Taqqu and Teverovsky 1998).

Lo modifies the $R/S$ approach and proposes a test procedure for the null hypothesis of no LRD. In Lo's method, a weighted sum of autocovariances is used for $S$ instead of the sample standard deviation in order to normalize $R$. In addition, his modification does not consider multiple lags but uses only the length $N$ of the series, i.e.,

$$S_q^2(N) = \frac{1}{N} \sum_{j=1}^{N} (X_j - \bar{X}_N)^2 + \frac{2}{N} \sum_{j=1}^{q} \omega_j(q) \left(\sum_{i=j+1}^{N} (X_i - \bar{X}_N)(X_{i-j} - \bar{X}_N)\right) \tag{23.38}$$

where $\bar{X}_N$ denotes the sample mean of the time series, and $\omega_j(q) := 1 - \frac{j}{q+1}$, $q < N$. We can represent $S_q^2(N)$ by adding the weighted sample autocovariances to the sample variance, i.e.,

$$S_q^2(N) = S^2 + 2 \sum_{j=1}^{q} \omega_j(q)\, \hat{\gamma}_j \tag{23.39}$$

where $\hat{\gamma}_j$ are the sample autocovariances. Lo shows that the distribution of the statistic

$$V_q(N) := \frac{N^{-1/2} R(N)}{S_q(N)} \tag{23.40}$$

converges asymptotically to that of

$$W_1 = \max_{0 \le t \le 1} W_0(t) - \min_{0 \le t \le 1} W_0(t) \tag{23.41}$$

where $W_0$ is the standard Brownian bridge. This fact allows the computation of a 95% confidence interval for $W_1$. Thus, Lo uses the interval $[0.809, 1.862]$ as the asymptotic 95% acceptance region for the null hypothesis of no LRD. Since Lo only provides a method to test whether LRD is present, without suggesting an estimator of $H$, (Teverovsky et al. 1999) modify Lo's method to obtain an estimator of $H$. They suggest using $V_q$ with a wide range of values of $q$, and then plotting the estimates as was done for the pox plot.

23.5.1.2 ARFIMA Model

Conventional analysis of time series under the stationarity assumption typically relies on the standard autoregressive integrated moving average (ARIMA) model of the following form:

$$\alpha(L)(1-L)^d X_t = \beta(L)\varepsilon_t \qquad (23.42)$$

where $\varepsilon_t \sim (0, \sigma^2)$, $\alpha(L) = 1 - \alpha_1 L - \cdots - \alpha_p L^p$ is the autoregressive polynomial in the lag operator $L$, and $\beta(L) = 1 + \beta_1 L + \cdots + \beta_q L^q$ is the moving average polynomial in $L$. All roots of $\alpha(L)$ and $\beta(L)$ lie outside the unit circle. (Granger and Joyeux 1980) and (Hosking 1981) generalize $d$ to a noninteger value by means of the fractional differencing operator defined by

$$(1-L)^d = \sum_{k=0}^{\infty} \frac{\Gamma(k-d)}{\Gamma(-d)\,\Gamma(k+1)} L^k \qquad (23.43)$$

where $\Gamma(\cdot)$ is the gamma function. That is, the autoregressive fractionally integrated moving average (ARFIMA) model allows a non-integer value of $d$. Compared with the ARIMA model, the ARFIMA model is able to capture significant dependence between distant observations. (Hosking 1981) shows that the autocorrelation function $\rho(k)$ of an ARFIMA process decays at a slow hyperbolic rate, that is, $\rho(k) \sim k^{2d-1}$ with $d < 0.5$ as $k \to \infty$, whereas the autocorrelation of an ARIMA process decays exponentially, that is, $\rho(k) \sim r^k$ with $r \in (0,1)$ as $k \to \infty$. The memory of the time series is captured by $d$; therefore, the existence of long memory can be tested based on the statistical significance of the fractional differencing parameter $d$.
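The coefficients of the fractional differencing operator in (23.43) can be computed without evaluating gamma functions, because consecutive coefficients satisfy $\pi_k = \pi_{k-1}(k-1-d)/k$ with $\pi_0 = 1$. The following is a minimal sketch of that recursion; the function names are illustrative and the filter is truncated at the sample length, with pre-sample values assumed to be zero.

```python
import math

def frac_diff_weights(d, n_terms):
    """Coefficients pi_k in (1 - L)^d = sum_k pi_k L^k for k = 0..n_terms-1."""
    weights = [1.0]
    for k in range(1, n_terms):
        # pi_k = pi_{k-1} * (k - 1 - d) / k, which equals
        # Gamma(k - d) / (Gamma(-d) * Gamma(k + 1)) as in (23.43)
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def frac_diff(x, d):
    """Apply the truncated filter (1 - L)^d to x (pre-sample values taken as zero)."""
    w = frac_diff_weights(d, len(x))
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]

def gamma_weight(d, k):
    """Direct evaluation of the k-th coefficient (non-integer d only)."""
    return math.gamma(k - d) / (math.gamma(-d) * math.gamma(k + 1))
```

For $d = 1$ the weights reduce to $(1, -1, 0, \ldots)$, so `frac_diff` becomes the ordinary first difference; for $0 < d < 0.5$ the weights decay hyperbolically, which is the source of the long memory.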

23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data


For the first difference of the series, $Y_t = (1-L)X_t$, the equation

$$(1-L)^d X_t = \alpha^{-1}(L)\beta(L)\varepsilon_t = u_t \qquad (23.44)$$

can be used to estimate $d$. In this case, the Hurst index is $1/2 + d$.

23.5.1.3 Variance-Type Method

(Teverovsky and Taqqu 1995; Taqqu et al. 1995; Taqqu and Teverovsky 1996) discuss the variance-type methods for estimating the Hurst index: the aggregated variance method and the differenced variance method. For the aggregated variance method, the variance of the sample mean $\bar{X}$ is of order $N^{2H-2}$, which suggests the following procedure:
1. For an integer $m$ between 2 and $N/2$, divide a given series of length $N$ into blocks of length $m$, and compute the sample mean over each $k$-th block,

$$\bar{X}_k^{(m)} := \frac{1}{m}\sum_{t=(k-1)m+1}^{km} X_t$$

where $k = 1, 2, \ldots, [N/m]$.
2. For each $m$, compute the sample variance of $\bar{X}_k^{(m)}$,

$$s_m^2 := \frac{1}{[N/m]-1}\sum_{k=1}^{[N/m]} \left(\bar{X}_k^{(m)} - \bar{X}\right)^2 .$$

3. Plot $\log s_m^2$ against $\log m$.
4. For large values of $m$, the result should be a straight line with a slope of $2H - 2$. The slope can be estimated by fitting a least-squares line in the log-log plot.

If the series has no long-range dependence and finite variance, then $H = 0.5$ and the slope of the fitted line is $-1$.

There are two types of non-stationarity: jumps in the mean and slowly declining trends. (Teverovsky and Taqqu 1995) distinguish these from long-range dependence by using the differenced variance method. They difference the variance and study $s^2_{m_{i+1}} - s^2_{m_i}$, where $m_i$ are the successive values of $m$ as defined above. Used together with the original aggregated variance method, the differenced variance method can detect the presence of the two types of non-stationary effects mentioned above.

Another method involving the variance is the variance of residuals method introduced by (Peng et al. 1994). As in the aggregated variance method, the series is divided into blocks of size $m$. Next, the partial sums are computed within each block, i.e.,

$$Y(k)^{(m)} := \sum_{t=(k-1)m+1}^{km} X_t$$


where $k = 1, 2, \ldots, [N/m]$. A regression line $a + bk$ is fitted to the partial sums within each block, and the sample variance of the residuals, $s^{(m)}$, is computed. (Taqqu et al. 1995) prove that in the Gaussian case, the variance of residuals is proportional to $m^{2H}$. By plotting $\log s^{(m)}$ against $\log m$, the slope, i.e., $2H$, can be estimated (see (Taqqu and Teverovsky 1996) for more details).

23.5.1.4 Absolute Moments Method

The absolute moments method is a generalization of the aggregated variance method. It proceeds as follows:
1. For an integer $m$ between 2 and $N/2$, divide a given series of length $N$ into blocks of length $m$, and compute the sample mean over each $k$-th block,

$$\bar{X}_k^{(m)} := \frac{1}{m}\sum_{t=(k-1)m+1}^{km} X_t$$

where $k = 1, 2, \ldots, [N/m]$.
2. For each $m$, compute the $n$-th absolute moment of $\bar{X}_k^{(m)}$,

$$AM_n^{(m)} = \frac{1}{[N/m]-1}\sum_{k=1}^{[N/m]} \left|\bar{X}_k^{(m)} - \bar{X}\right|^n .$$

3. $AM_n^{(m)}$ is asymptotically proportional to $m^{n(H-1)}$.
4. Plot $\log AM_n^{(m)}$ against $\log m$.
5. For large values of $m$, the result should be a straight line with a slope of $n(H-1)$. The slope can be estimated by fitting a least-squares line in the log-log plot.

If the series has no long-range dependence and finite variance, then $H = 0.5$ and the slope of the fitted line is $-n/2$.

Similar to the absolute moments method is the fractal dimension method suggested by (Higuchi 1988); see (Taqqu and Teverovsky 1998). The difference between the two methods is that the absolute moments method (when $n = 1$) computes the aggregated series over non-intersecting blocks, as in step 1 above, while the fractal dimension method uses a moving window. The fractal dimension method requires intensive computation but increases accuracy for shorter time series. (Taqqu et al. 1995) and (Taqqu and Teverovsky 1998) discuss these two methods in detail.
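The block-mean steps listed above translate almost directly into code. The sketch below uses only the Python standard library; the function names and the choice of block sizes are illustrative assumptions, and with `n=2` it reduces to the aggregated variance method (slope $2H - 2$).

```python
import math

def block_means(x, m):
    """Sample means over the non-overlapping blocks of length m (step 1)."""
    n_blocks = len(x) // m
    return [sum(x[k * m:(k + 1) * m]) / m for k in range(n_blocks)]

def absolute_moment(x, m, n=2):
    """n-th absolute moment of the block means around the overall mean (step 2)."""
    xbar = sum(x) / len(x)
    means = block_means(x, m)
    return sum(abs(v - xbar) ** n for v in means) / (len(means) - 1)

def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def hurst_absolute_moments(x, block_sizes, n=2):
    """Estimate H from the log-log slope n(H - 1) (steps 4 and 5)."""
    log_m = [math.log(m) for m in block_sizes]
    log_am = [math.log(absolute_moment(x, m, n)) for m in block_sizes]
    return 1.0 + ls_slope(log_m, log_am) / n
```

For an i.i.d. Gaussian sample, `hurst_absolute_moments(x, [2, 4, 8, 16, 32, 64])` should return a value close to 0.5. In practice the smallest and largest block sizes are often dropped, since the linear relationship is only asymptotic in $m$.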

23.5.2 Estimation and Detection of LRD in Frequency Domain

23.5.2.1 Periodogram Method

(Geweke and Porter-Hudak 1983) introduced a semi-nonparametric procedure to test for long memory based on the slope of the spectral density around the angular frequency $\omega = 0$. Let $g(\omega)$ denote the periodogram of $X_t$ at frequency $\omega$, which is defined as follows:

$$g(\omega) = \frac{1}{2\pi T}\left|\sum_{t=1}^{T} e^{it\omega}(X_t - \bar{X})\right|^2 \qquad (23.45)$$

Then the differencing parameter $d$ can be consistently estimated by the regression

$$\ln g(\omega_j) = c - d\,\ln\left(4\sin^2\left(\frac{\omega_j}{2}\right)\right) + \eta_j, \qquad j = 1, 2, \ldots, n \qquad (23.46)$$

where $\omega_j = 2\pi j/T$ ($j = 1, 2, \ldots, T-1$) denote the Fourier frequencies of the sample, $T$ is the sample size, and $n = f(T) \ll T$ is the number of Fourier frequencies included in the spectral regression. As (Geweke and Porter-Hudak 1983) show, the slope of the line in the log-log plot is $1 - 2H$. Extensions and improvements to the periodogram method, for example, the continuous periodogram method and the averaged (cumulative) periodogram method, have been discussed in the literature; see, for example, (Robinson 1995a; Taqqu et al. 1995; Taqqu and Teverovsky 1998; Moulines and Soulier 1999; Hurvich and Brodski 2001).

23.5.2.2 Whittle-Type Methods

The Whittle estimator is an extension of the periodogram method. If the time series $X_t$ is Gaussian, the Gaussian maximum likelihood estimate (MLE) might have optimal asymptotic statistical properties and can be used for approximation; see (Whittle 1951; Hannan 1973). The periodogram of $X_t$ at frequency $\omega$ is defined as

$$g(\omega) = \frac{1}{2\pi T}\left|\sum_{t=1}^{T} X_t e^{it\omega}\right|^2 \qquad (23.47)$$

which is an estimator of the spectral density. It is evaluated at the Fourier frequencies $\omega_j = 2\pi j/T$. (Beran 1994) shows that the following expression is an approximation to the Gaussian likelihood,

$$L_W(\theta) = -\frac{1}{2\pi}\sum_{j=1}^{[T/2]}\left(\log f_\theta(\omega_j) + \frac{I_N(\omega_j)}{f_\theta(\omega_j)}\right)$$

where $I_N(\omega_j)$ denotes the periodogram at $\omega_j$. The Whittle estimator is found by maximizing $L_W(\theta)$, that is, by minimizing the sum on the right-hand side, for a given parametric spectral density $f_\theta(\omega)$. The Gaussian likelihood can be replaced by different approximations without affecting first-order limit distributional characteristics. (Robinson 2003) shows that the estimates which maximize such approximations are all $\sqrt{n}$-consistent and have the same limiting normal distribution as the Gaussian MLE. (Fox and Taqqu 1986) show that the Whittle estimate $\hat{H}$ of $H$ is asymptotically normal with rate of convergence $T^{1/2}$, i.e., the asymptotic distribution of $\sqrt{T}(\hat{H} - H)$ is Gaussian. Relaxing Gaussianity, (Giraitis and Surgailis 1990) discuss the properties of the Whittle estimate $\hat{H}$ of $H$.

(Robinson 1995b) develops the local Whittle method and (Taqqu and Teverovsky 1998) provide further discussion of it. The local Whittle method is a semi-parametric estimator: it specifies the parametric form of the spectral density only as $\omega$ approaches zero. It assumes that

$$f_{c,H}(\omega) = c\,\omega^{1-2H}$$

for frequencies $\omega$ close to the origin. The estimate minimizes

$$\sum_{j=1}^{m}\left(\log f_{c,H}(\omega_j) + \frac{I_T(\omega_j)}{f_{c,H}(\omega_j)}\right)$$

with respect to $c$ and $H$ for some $m < [T/2]$, $T$ being the length of the data.

(Taqqu and Teverovsky 1998) introduce the aggregated Whittle method, which provides a robust Whittle estimator without requiring exact parametric information about the spectral density. It can be used for longer time series. The method aggregates the data to create a shorter series,

$$X_i^{(m)} := \frac{1}{m}\sum_{t=m(i-1)+1}^{mi} X_t .$$

If the aggregation level $m$ is high enough and long-range dependence is present, the new series approaches a fractional Gaussian noise. In the finite variance case, the Whittle estimator can then increase the estimation accuracy under an underlying fractional Gaussian noise assumption.
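As an illustration of the local Whittle idea, the scale $c$ can be concentrated out of the objective: setting the derivative with respect to $c$ to zero gives $\hat{c}(H) = m^{-1}\sum_j I_T(\omega_j)\,\omega_j^{2H-1}$, which leaves the one-dimensional criterion $\log \hat{c}(H) - (2H-1)\,m^{-1}\sum_j \log\omega_j$. The sketch below minimizes that criterion over a grid; the grid search, the direct (non-FFT) periodogram, and all function names are simplifying assumptions rather than the procedure of the cited papers.

```python
import cmath
import math

def periodogram(x, n_freq):
    """Periodogram |sum_t x_t e^{i t w_j}|^2 / (2 pi T) at w_j = 2 pi j / T, j = 1..n_freq."""
    T = len(x)
    freqs, values = [], []
    for j in range(1, n_freq + 1):
        w = 2.0 * math.pi * j / T
        s = sum(x[t] * cmath.exp(1j * w * (t + 1)) for t in range(T))
        freqs.append(w)
        values.append(abs(s) ** 2 / (2.0 * math.pi * T))
    return freqs, values

def local_whittle_objective(H, freqs, pgram):
    """Objective with c concentrated out: log c_hat(H) - (2H - 1) * mean(log w_j)."""
    m = len(freqs)
    c_hat = sum(I * w ** (2.0 * H - 1.0) for w, I in zip(freqs, pgram)) / m
    return math.log(c_hat) - (2.0 * H - 1.0) * sum(math.log(w) for w in freqs) / m

def local_whittle_from_periodogram(freqs, pgram, grid_step=0.01):
    """Grid minimizer of the concentrated objective over H in (0, 1)."""
    grid = [k * grid_step for k in range(1, int(round(1.0 / grid_step)))]
    return min(grid, key=lambda H: local_whittle_objective(H, freqs, pgram))

def local_whittle_hurst(x, m, grid_step=0.01):
    """Local Whittle estimate of H from the first m Fourier frequencies of x."""
    freqs, pgram = periodogram(x, m)
    return local_whittle_from_periodogram(freqs, pgram, grid_step)
```

The bandwidth `m` is the user's choice: a small `m` stays close to the origin, where the assumed power-law form holds, at the cost of a noisier estimate.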

23.5.3 Econometric Modeling of LRD

Several econometric models have been extended to describe long-range dependence, for example, the extension of the ARMA model to the ARFIMA model discussed in the previous section. In this section, we introduce four types of extensions: the GARCH-type extension, the stochastic volatility type extension, the unit root type extension, and the regime switching type extension.

23.5.3.1 GARCH-Type Extension

(Robinson 1991) suggests extending the GARCH model by using fractional differences in order to accommodate the existence of long-range dependence. The fractional differencing operator is defined as in equation (23.43) by (Baillie et al. 1996). The fractionally integrated GARCH (FIGARCH) model is then

$$(1-L)^d \beta(L)(h_t^2 - \mu) = \alpha(L)(r_t^2 - \mu) \qquad (23.48)$$

For the process to be well defined, the parameters $\alpha_j$, $\beta_j$, and $d$ are constrained so that the coefficients $\theta_j$ are all nonnegative in

$$h_t^2 = \mu + (1-L)^{-d}\alpha(L)\beta^{-1}(L)(r_t^2 - \mu) = \mu + \sum_{j=0}^{\infty}\theta_j\,(r_{t-1-j}^2 - \mu) \qquad (23.49)$$

As implied by equation (23.49), the parameters $\alpha_j$ and $\beta_j$ are constrained as in the standard GARCH model. This also implies that the parameter $d$ is constrained to be positive. (Breidt et al. 1998) argue that the $r_t$ in equation (23.49) is not covariance stationary and the autocovariance function of $r_t$ is not defined. (Bollerslev and Mikkelsen 1996) formulate a fractionally integrated EGARCH model of the following form:

$$\log h_t^2 = \mu_t + \theta(L)\phi(L)^{-1}(1-L)^{-d} g(\varepsilon_{t-1}) \qquad (23.50)$$

where $\theta(z) = 1 + \theta_1 z + \cdots + \theta_p z^p$ for $|z| \le 1$ is a moving average polynomial, $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ is an autoregressive polynomial, and $\phi(z)$ has no roots in common with $\theta(z)$. The fractionally integrated EGARCH model gives a strictly stationary and ergodic process. (Nelson 1991) shows that $(\log h_t^2 - \mu_t)$ is covariance stationary under certain conditions and $d < 0.5$.

23.5.3.2 Stochastic Volatility Type Extension

A long-range dependence stochastic volatility model is discussed by (Breidt et al. 1998). The stochastic volatility model is defined by

$$r_t = h_t\varepsilon_t, \qquad h_t = h\exp(u_t/2),$$

where $u_t$ is independent of $\varepsilon_t$, and $\varepsilon_t$ is independent and identically distributed (i.i.d.) with mean zero and variance one. In a simple long-range dependence model, $u_t$ can be defined by

$$(1-L)^d u_t = \eta_t$$

where $\eta_t$ follows an i.i.d. normal distribution with zero mean and variance $\sigma_\eta^2$, and $d \in (-0.5, 0.5)$. For long-range dependence, $u_t$ can also be expressed as an ARFIMA($p, d, q$) process, defined by

$$(1-L)^d\phi(L)\,u_t = \theta(L)\,\eta_t$$

where $\eta_t$ follows an i.i.d. normal distribution with zero mean and variance $\sigma_\eta^2$.
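To make the simple long-memory stochastic volatility specification concrete, the sketch below simulates $r_t = h\exp(u_t/2)\,\varepsilon_t$ with $(1-L)^d u_t = \eta_t$. It inverts the fractional difference through the moving average weights $\psi_k = \psi_{k-1}(k-1+d)/k$ of $(1-L)^{-d}$, truncated at `n_lags` terms. The truncation, the Gaussian choice for $\varepsilon_t$, and the parameter names are assumptions made for illustration.

```python
import math
import random

def ma_weights(d, n_terms):
    """psi_k in (1 - L)^(-d) = sum_k psi_k L^k, via psi_k = psi_{k-1} (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, n_terms):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi

def simulate_lmsv(n, d, sigma_eta, h_bar, n_lags=500, rng=random):
    """Simulate n returns r_t = h_bar * exp(u_t / 2) * eps_t with fractional-noise u_t."""
    psi = ma_weights(d, n_lags)
    # Draw enough innovations to fill the truncated moving average for every t
    eta = [rng.gauss(0.0, sigma_eta) for _ in range(n + n_lags)]
    returns = []
    for t in range(n_lags, n + n_lags):
        u_t = sum(psi[k] * eta[t - k] for k in range(n_lags))
        h_t = h_bar * math.exp(u_t / 2.0)
        returns.append(h_t * rng.gauss(0.0, 1.0))   # eps_t assumed standard normal
    return returns
```

A call such as `simulate_lmsv(1000, 0.4, 0.5, 0.01)` produces returns that are serially uncorrelated while the log squared returns inherit the slowly decaying autocorrelation of $u_t$.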


23.5.3.3 Unit Root Type Extension

(Robinson 1994) considers the following model, which nests a unit root model, in order to capture the effect of long-range dependence:

$$\phi(L)\,r_t = \varepsilon_t, \quad t \ge 1, \qquad (23.51)$$

$$r_t = 0, \quad t \le 0, \qquad (23.52)$$

where $\varepsilon_t$ is an $I(0)$ process with parametric autocorrelation and

$$\phi(L) = (1-L)^{d_1}(1+L)^{d_2}\prod_{j=3}^{n}\left(1 - 2\cos\omega_j\,L + L^2\right)^{d_j} \qquad (23.53)$$

where $\omega_j$ are given distinct real numbers in $(0, \pi)$, and the $d_j$, $1 \le j \le n$, are arbitrary real numbers. This model also covers seasonal and cyclical components. (Velasco and Robinson 2000) propose the following model:

$$(1-L)^s r_t = \varepsilon_t, \quad t \ge 1, \qquad (23.54)$$

$$r_t = 0, \quad t \le 0, \qquad (23.55)$$

$$(1-L)^{d-s}\varepsilon_t = u_t, \quad t = 0, \pm 1, \ldots, \qquad (23.56)$$

where $s$ is the integer part of $d + 1/2$ and $u_t$ is a parametric $I(0)$ process. $\varepsilon_t$ is a stationary $I(d-s)$ process. (Marinucci and Robinson 1999) discuss the difference between the two models given by equations (23.51)−(23.52) and equations (23.54)−(23.56) with respect to the two definitions of nonstationary $I(d)$ processes.

23.5.3.4 Regime Switching Type Extension

(Diebold and Inoue 2001) show that long-range dependence models and regime switching models are intimately related in several circumstances, including a simple mixture model, a stochastic permanent break model, and a Markov-switching model. They demonstrate that, with suitably adapted time-varying transition probabilities, these regime switching models can generate an autocovariance structure similar to that of fractionally integrated processes. (Banerjee and Urga 2005) provide an overview of recent developments in the modeling of regime switching and long-range dependence. (Haldrup and Nielsen 2006) propose a regime switching multiplicative seasonal ARFIMA model that accommodates both fractional integration and regime switching simultaneously. The model is

$$A_{s_t}(L)\,(1 - \alpha_{s_t}L^{24})\,(1-L)^{d_{s_t}}(y_t - \mu_{s_t}) = \varepsilon_{s_t,t}, \qquad \varepsilon_{s_t,t} \sim \mathrm{nid}(0, \sigma_{s_t}^2)$$

where $A_{s_t}(L)$ is an eighth-order lag polynomial capturing the within-the-day effects, the polynomial $(1 - \alpha_{s_t}L^{24})$ is a daily quasi-difference filter, $s_t = 0, 1$ denotes the regime determined by a Markov chain with transition probabilities, and $\mu_{s_t}$ is the mean.

23.6 Fractal Processes and Long-Range Dependence

Fractal processes (self-similar processes) are tightly connected with the analysis of long-range dependence. Self-similar processes are invariant in distribution with respect to changes of time and space scale. The scaling coefficient or self-similarity index is a non-negative number denoted by $H$, the Hurst parameter. If $\{X(t+h) - X(h), t \in T\} \stackrel{d}{=} \{X(t) - X(0), t \in T\}$ for all $h \in T$, the real-valued process $\{X(t), t \in T\}$ has stationary increments. (Samorodnitsky and Taqqu 1994) provide a succinct expression of self-similarity: $\{X(at), t \in T\} \stackrel{d}{=} \{a^H X(t), t \in T\}$. The process $\{X(t), t \in T\}$ is called H-sssi if it is self-similar with index $H$ and has stationary increments. Long-range dependence processes are asymptotically second-order self-similar (see (Willinger et al. 1998)).

23.6.1 Specification of the Fractal Processes

(Lamperti 1962) first introduced semi-stable processes (which we nowadays call self-similar processes). Let $T$ be either $\mathbb{R}$, $\mathbb{R}_+ = \{t : t \ge 0\}$, or $\{t : t > 0\}$. Then the real-valued process $\{X(t), t \in T\}$ is self-similar with Hurst index $H > 0$ (H-ss) if, for any $a > 0$, $d \ge 1$, and $t_1, t_2, \ldots, t_d \in T$,

$$(X(at_1), X(at_2), \ldots, X(at_d)) \stackrel{d}{=} (a^H X(t_1), a^H X(t_2), \ldots, a^H X(t_d)). \qquad (23.57)$$

23.6.1.1 Fractional Gaussian Noise

For a given $H \in (0,1)$ there is basically a single Gaussian H-sssi process, namely fractional Brownian motion (fBm), which was first introduced by (Kolmogorov 1940). (Mandelbrot and Wallis 1968) and (Taqqu 2003) clarify the definition of fBm as a Gaussian H-sssi process $\{B_H(t)\}_{t \in \mathbb{R}}$ with $0 < H < 1$. (Mandelbrot and van Ness 1968) define the stochastic representation

$$B_H(t) := \frac{1}{\Gamma(H + \frac{1}{2})}\left(\int_{-\infty}^{0}\left[(t-s)^{H-\frac{1}{2}} - (-s)^{H-\frac{1}{2}}\right]dB(s) + \int_{0}^{t}(t-s)^{H-\frac{1}{2}}\,dB(s)\right) \qquad (23.58)$$

where $\Gamma(\cdot)$ represents the gamma function,

$$\Gamma(a) := \int_{0}^{\infty} x^{a-1}e^{-x}\,dx,$$

and $0 < H < 1$ is the Hurst parameter. The integrator $B$ is ordinary Brownian motion. The main difference between fractional Brownian motion and ordinary Brownian motion is that the increments of Brownian motion are independent, while those of fractional Brownian motion are dependent. (Samorodnitsky and Taqqu 1994) define the increments $\{Y_j, j \in \mathbb{Z}\}$ of fractional Brownian motion as fractional Gaussian noise (fGn), that is, for $j = 0, \pm 1, \pm 2, \ldots$, $Y_j = B_H(j+1) - B_H(j)$.

23.6.1.2 Fractional Stable Noise

Fractional Brownian motion can capture the effect of long-range dependence, but has less power to capture heavy tailedness. The existence of abrupt discontinuities in financial data, combined with the empirical observation of sample excess kurtosis and unstable variance, confirms the stable Paretian hypothesis first identified by (Mandelbrot 1963, 1983). It is therefore natural to introduce the stable Paretian distribution into self-similar processes in order to capture both long-range dependence and heavy tailedness. (Samorodnitsky and Taqqu 1994) introduce the $\alpha$-stable H-sssi processes $\{X(t), t \in \mathbb{R}\}$ with $0 < \alpha < 2$. If $0 < \alpha < 1$, the Hurst parameter takes values $H \in (0, 1/\alpha]$, and if $1 < \alpha < 2$, it takes values $H \in (0, 1]$. There are many different extensions of fractional Brownian motion to the stable distribution. The most commonly used is the linear fractional stable motion (also called linear fractional Lévy motion), $\{L_{\alpha,H}(a,b;t), t \in (-\infty, \infty)\}$, which is defined by (Samorodnitsky and Taqqu 1994) as follows:

$$L_{\alpha,H}(a,b;t) := \int_{-\infty}^{\infty} f_{\alpha,H}(a,b;t,x)\,M(dx) \qquad (23.59)$$

where

$$f_{\alpha,H}(a,b;t,x) := a\left((t-x)_+^{H-\frac{1}{\alpha}} - (-x)_+^{H-\frac{1}{\alpha}}\right) + b\left((t-x)_-^{H-\frac{1}{\alpha}} - (-x)_-^{H-\frac{1}{\alpha}}\right) \qquad (23.60)$$

and where $a, b$ are real constants, $|a| + |b| > 0$, $0 < \alpha < 2$, $0 < H < 1$, $H \ne 1/\alpha$, and $M$ is an $\alpha$-stable random measure on $\mathbb{R}$ with Lebesgue control measure and skewness intensity $\beta(x)$, $x \in (-\infty, \infty)$, satisfying $\beta(\cdot) = 0$ if $\alpha = 1$. They define linear fractional stable noise, denoted by $Y(t)$, as the increments $Y(t) = X_t - X_{t-1}$:

$$Y(t) = L_{\alpha,H}(a,b;t) - L_{\alpha,H}(a,b;t-1) = \int_{\mathbb{R}}\left(a\left[(t-x)_+^{H-\frac{1}{\alpha}} - (t-1-x)_+^{H-\frac{1}{\alpha}}\right] + b\left[(t-x)_-^{H-\frac{1}{\alpha}} - (t-1-x)_-^{H-\frac{1}{\alpha}}\right]\right)M(dx) \qquad (23.61)$$


where $L_{\alpha,H}(a,b;t)$ is a linear fractional stable motion defined by equation (23.59), and $M$ is a stable random measure with Lebesgue control measure, given $0 < \alpha < 2$. In this chapter, unless otherwise indicated, the fractional stable noise (fsn) is generated from a linear fractional stable motion. Some properties of these processes have been discussed in (Mandelbrot and Van Ness 1968; Maejima 1983; Maejima and Rachev 1987; Manfields et al. 2001; Rachev and Mittnik 2000; Rachev and Samorodnitsky 2001; Samorodnitsky 1994, 1996, 1998; and Samorodnitsky and Taqqu 1994).

23.6.2 Estimation of Fractal Processes

23.6.2.1 Estimating the Self-Similarity Parameter in Fractional Gaussian Noise

(Beran 1994) discusses the Whittle estimation (which we discussed earlier) of the self-similarity parameter. For fractional Gaussian noise $Y_t$, let $f(\lambda; H)$ denote the power spectrum of $Y_t$ after being normalized to have variance 1, and let $I(\lambda)$ be the periodogram of $Y_t$, that is,

$$I(\lambda) = \frac{1}{2\pi N}\left|\sum_{t=1}^{N} Y_t e^{it\lambda}\right|^2 \qquad (23.62)$$

The Whittle estimator of $H$ is obtained by finding the $\hat{H}$ that minimizes

$$g(\hat{H}) = \int_{-\pi}^{\pi}\frac{I(\lambda)}{f(\lambda; \hat{H})}\,d\lambda \qquad (23.63)$$

23.6.2.2 Estimating the Self-Similarity Parameter in FSN

(Stoev et al. 2002) propose the least-squares (LS) estimator of the Hurst index based on the finite impulse response transformation (FIRT) and wavelet transform coefficients of the fractional stable motion. A FIRT is a filter $v = (v_0, v_1, \ldots, v_p)$ of real numbers $v_i \in \mathbb{R}$, $i = 0, 1, \ldots, p$, of length $p + 1$. It is defined for $X_t$ by

$$T_{n,t} = \sum_{i=0}^{p} v_i X_{n(i+t)} \qquad (23.64)$$

where $n \ge 1$ and $t \in \mathbb{N}$. The $T_{n,t}$ are the FIRT coefficients of $X_t$, that is, the FIRT coefficients of the fractional stable motion. The indices $n$ and $t$ can be interpreted as “scale” and “location”. If $\sum_{i=0}^{p} i^r v_i = 0$ for $r = 0, \ldots, q-1$, but $\sum_{i=0}^{p} i^q v_i \ne 0$, the filter $v$ is said to have $q \ge 1$ zero moments. If $\{T_{n,t}, n \ge 1, t \in \mathbb{N}\}$ are the FIRT coefficients of fractional stable motion with a filter $v$ that has at least one zero moment, (Stoev et al. 2002) prove the following two properties of $T_{n,t}$: (1) $T_{n,t+h} \stackrel{d}{=} T_{n,t}$, and (2) $T_{n,t} \stackrel{d}{=} n^H T_{1,t}$, where $h, t \in \mathbb{N}$ and $n \ge 1$. Suppose that the $T_{n,t}$ are available for the fixed scales $n_j$, $j = 1, \ldots, m$, and locations $t = 0, \ldots, M_j - 1$ at the scale $n_j$, since only a finite number, say $M_j$, of the FIRT coefficients are available at the scale $n_j$. Using these properties, we have

$$E\log|T_{n_j,0}| = H\log n_j + E\log|T_{1,0}| \qquad (23.65)$$

The left-hand side of this equation can be approximated by

$$Y_{\log}(M_j) = \frac{1}{M_j}\sum_{t=0}^{M_j-1}\log|T_{n_j,t}| \qquad (23.66)$$

Then we get

$$\begin{pmatrix} Y_{\log}(M_1) \\ \vdots \\ Y_{\log}(M_m) \end{pmatrix} = \begin{pmatrix} \log n_1 & 1 \\ \vdots & \vdots \\ \log n_m & 1 \end{pmatrix}\begin{pmatrix} H \\ E\log|T_{1,0}| \end{pmatrix} + \frac{1}{\sqrt{M}}\begin{pmatrix} \sqrt{M}\,(Y_{\log}(M_1) - E\log|T_{n_1,0}|) \\ \vdots \\ \sqrt{M}\,(Y_{\log}(M_m) - E\log|T_{n_m,0}|) \end{pmatrix} \qquad (23.67)$$

In short, we can express the above equation as follows:

$$Y = X\theta + \frac{1}{\sqrt{M}}\,\varepsilon \qquad (23.68)$$

Equation (23.68) shows that the self-similarity parameter $H$ can be estimated by a standard linear regression of the vector $Y$ against the matrix $X$. (Stoev et al. 2002) explain this procedure.

23.6.2.3 Estimating the Parameters of Stable Paretian Distribution

A stable distribution requires four parameters for a complete description: an index of stability $\alpha \in (0, 2]$ (also called the tail index), a skewness parameter $\beta \in [-1, 1]$, a scale parameter $\gamma > 0$, and a location parameter $\zeta \in \mathbb{R}$. Unfortunately, there is in general no closed-form expression for the density function and distribution function of a stable distribution. (Rachev and Mittnik 2000) give the definition of the stable distribution: a random variable $X$ is said to have a stable distribution if there are parameters $0 < \alpha \le 2$, $-1 \le \beta \le 1$, $\gamma \ge 0$, and real $\zeta$ such that its characteristic function has the following form:

$$E\exp(i\theta X) = \begin{cases} \exp\left\{-\gamma^\alpha|\theta|^\alpha\left(1 - i\beta\,(\operatorname{sign}\theta)\tan\frac{\pi\alpha}{2}\right) + i\zeta\theta\right\} & \text{if } \alpha \ne 1 \\ \exp\left\{-\gamma|\theta|\left(1 + i\beta\,\frac{2}{\pi}\,(\operatorname{sign}\theta)\ln|\theta|\right) + i\zeta\theta\right\} & \text{if } \alpha = 1 \end{cases} \qquad (23.69)$$

and

$$\operatorname{sign}\theta = \begin{cases} 1 & \text{if } \theta > 0 \\ 0 & \text{if } \theta = 0 \\ -1 & \text{if } \theta < 0 \end{cases} \qquad (23.70)$$

The support of a stable density is either the whole real line $(-\infty, +\infty)$ or a half line; for $0 < \alpha < 1$ and $\beta = 1$ or $\beta = -1$, the support is a half line. In order to estimate the parameters of the stable distribution, the maximum likelihood estimator given in (Rachev and Mittnik 2000) has been employed. Given $N$ observations $X = (X_1, X_2, \ldots, X_N)'$ on the positive half line, the log-likelihood function is of the form

$$\ln L(\alpha, \lambda; X) = N\ln\lambda + N\ln\alpha + (\alpha - 1)\sum_{i=1}^{N}\ln X_i - \lambda\sum_{i=1}^{N}X_i^\alpha \qquad (23.71)$$

which can be maximized using, for example, a Newton-Raphson algorithm. It follows from the first-order condition,

$$\lambda = N\left(\sum_{i=1}^{N}X_i^\alpha\right)^{-1} \qquad (23.72)$$

that the optimization problem can be reduced to finding the value of $\alpha$ which maximizes the concentrated likelihood

$$\ln L^*(\alpha; X) = \ln\alpha + \alpha\nu - \ln\left(\sum_{i=1}^{N}X_i^\alpha\right) \qquad (23.73)$$

where $\nu = N^{-1}\sum_{i=1}^{N}\ln X_i$. The information matrix evaluated at the maximum likelihood estimates, denoted by $I(\hat{\alpha}, \hat{\lambda})$, is given by

$$I(\hat{\alpha}, \hat{\lambda}) = \begin{pmatrix} N\hat{\alpha}^{-2} & \sum_{i=1}^{N}X_i^{\hat{\alpha}}\ln X_i \\ \sum_{i=1}^{N}X_i^{\hat{\alpha}}\ln X_i & N\hat{\lambda}^{-2} \end{pmatrix}$$

It can be shown that, under fairly mild conditions, the maximum likelihood estimates $\hat{\alpha}$ and $\hat{\lambda}$ are consistent and asymptotically have a multivariate normal distribution with mean $(\alpha, \lambda)'$ (see (Rachev and Mittnik 2000)).
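The concentrated likelihood (23.73) is concave in $\alpha$: $\ln\alpha$ is concave, $\alpha\nu$ is linear, and $\ln\sum_i X_i^\alpha$ is a log-sum-exp in $\alpha$. Its derivative $1/\alpha + \nu - \sum_i X_i^\alpha\ln X_i / \sum_i X_i^\alpha$ therefore changes sign at most once. The sketch below exploits this with a bisection search rather than the Newton-Raphson iteration mentioned in the text; the bracketing interval, the tolerance, and the function names are assumptions, and the bracket must contain the root for the search to be valid.

```python
import math

def concentrated_score(alpha, x, nu):
    """Derivative of ln L*(alpha) = ln(alpha) + alpha * nu - ln(sum_i x_i^alpha)."""
    powers = [xi ** alpha for xi in x]
    weighted_log = sum(p * math.log(xi) for p, xi in zip(powers, x))
    return 1.0 / alpha + nu - weighted_log / sum(powers)

def fit_alpha_lambda(x, lo=1e-3, hi=20.0, tol=1e-10):
    """Maximize (23.73) over alpha by bisection, then recover lambda from (23.72)."""
    nu = sum(math.log(xi) for xi in x) / len(x)
    # The score is positive to the left of the maximizer and negative to the right
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if concentrated_score(mid, x, nu) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    lam = len(x) / sum(xi ** alpha for xi in x)   # first-order condition (23.72)
    return alpha, lam
```

The observations must be strictly positive, and very large values of `hi` can overflow `xi ** alpha` for data far above one, so the bracket should be chosen with the data scale in mind.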


Other methods for estimating the parameters of a stable distribution (i. e., the method of moments based on the characteristic function, the regression-type method, and the fast Fourier transform method) are discussed in (Stoyanov and Racheva-Iotova 2004a, 2004b, 2004c).

23.6.3 Simulation of Fractal Processes

23.6.3.1 Simulation of Fractional Gaussian Noise

(Paxson 1997) provides a method to generate fractional Gaussian noise by using the discrete Fourier transform of the spectral density. (Bardet et al. 2003) describe a concrete simulation procedure based on this method that overcomes some of the implementation issues encountered in practice. The procedure is:
1. Choose an even integer $M$. Define the vector of the Fourier frequencies $\Omega = (\theta_1, \ldots, \theta_{M/2})$, where $\theta_t = 2\pi t/M$, and compute the vector $F = (f_H(\theta_1), \ldots, f_H(\theta_{M/2}))$, where

$$f_H(\theta) = \frac{1}{\pi}\sin(\pi H)\,\Gamma(2H+1)\,(1 - \cos\theta)\sum_{t \in \mathbb{Z}}|2\pi t + \theta|^{-2H-1}$$

is the spectral density of fGn.
2. Generate $M/2$ i.i.d. exponential Exp(1) random variables $E_1, \ldots, E_{M/2}$ and $M/2$ i.i.d. uniform $U[0,1]$ random variables $U_1, \ldots, U_{M/2}$.
3. Compute $Z_t = \exp(2i\pi U_t)\sqrt{F_t E_t}$, for $t = 1, \ldots, M/2$.
4. Form the $M$-vector $Z = (0, Z_1, \ldots, Z_{(M/2)-1}, Z_{M/2}, \bar{Z}_{(M/2)-1}, \ldots, \bar{Z}_1)$, where the overbar denotes complex conjugation.
5. Compute the inverse FFT of the complex vector $Z$ to obtain the simulated sample path.
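The five steps above can be sketched with the Python standard library as follows. Several choices here are assumptions rather than part of the cited procedure: the infinite sum in $f_H$ is truncated at `n_terms`, the middle coefficient $Z_{M/2}$ is made real so that the vector is exactly conjugate-symmetric, the inverse transform is written out as a direct $O(M^2)$ sum instead of an FFT, and the output is not rescaled to a target variance.

```python
import cmath
import math
import random

def fgn_spectral_density(theta, H, n_terms=50):
    """f_H(theta) with the sum over t truncated to |t| <= n_terms (an approximation)."""
    tail = sum(abs(2.0 * math.pi * t + theta) ** (-2.0 * H - 1.0)
               for t in range(-n_terms, n_terms + 1))
    return (math.sin(math.pi * H) * math.gamma(2.0 * H + 1.0)
            * (1.0 - math.cos(theta)) * tail / math.pi)

def simulate_fgn(M, H, rng=random):
    """Approximate fGn sample path of even length M via the spectral method."""
    half = M // 2
    z = [0j] * M                       # the zero-frequency coefficient stays 0
    # Steps 1-3: spectral density, exponential amplitudes, uniform phases
    for t in range(1, half + 1):
        f_t = fgn_spectral_density(2.0 * math.pi * t / M, H)
        amp = math.sqrt(f_t * rng.expovariate(1.0))
        z[t] = amp * cmath.exp(2j * math.pi * rng.random())
    # Step 4: conjugate-symmetric extension (middle term forced to be real)
    z[half] = complex(abs(z[half]), 0.0)
    for t in range(1, half):
        z[M - t] = z[t].conjugate()
    # Step 5: inverse DFT written as a direct sum with precomputed roots of unity
    twiddle = [cmath.exp(2j * math.pi * k / M) for k in range(M)]
    return [sum(z[k] * twiddle[(k * n) % M] for k in range(M)).real / M
            for n in range(M)]
```

Because the zero-frequency coefficient is set to 0, the simulated path sums to zero. For $H > 0.5$ the path should show positive sample autocorrelation at short lags; for long series, the direct sum should be replaced by an FFT routine.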

23.6.3.2 Simulation of Fractional Stable Noise

Replacing the integral in equation (23.61) with a Riemann sum, (Stoev and Taqqu 2004) generate an approximation of fractional stable noise. They introduce parameters $n, N \in \mathbb{N}$ and then express the fractional stable noise $Y(t)$ as

$$Y_{n,N}(t) := \sum_{j=1}^{nN}\left(\left(\frac{j}{n}\right)_+^{H-1/\alpha} - \left(\frac{j}{n} - 1\right)_+^{H-1/\alpha}\right)L_{\alpha,n}(nt - j) \qquad (23.74)$$

where $L_{\alpha,n}(j) := M_\alpha((j+1)/n) - M_\alpha(j/n)$, $j \in \mathbb{Z}$. The parameter $n$ is the mesh size and the parameter $N$ is the cut-off of the kernel function.


(Stoev and Taqqu 2003) describe an efficient approximation of $Y_{n,N}(t)$ involving the fast Fourier transform (FFT) algorithm. Consider the moving average process $Z(m)$, $m \in \mathbb{N}$,

$$Z(m) := \sum_{j=1}^{nN} g_{H,n}(j)\,L_\alpha(m - j) \qquad (23.75)$$

where

$$g_{H,n}(j) := \left(\left(\frac{j}{n}\right)^{H-1/\alpha} - \left(\frac{j}{n} - 1\right)_+^{H-1/\alpha}\right)n^{-1/\alpha} \qquad (23.76)$$

and where $L_\alpha(j)$ is a series of i.i.d. standard stable Paretian random variables. Since $L_{\alpha,n}(j) \stackrel{d}{=} n^{-1/\alpha}L_\alpha(j)$, $j \in \mathbb{Z}$, equations (23.74) and (23.75) imply $Y_{n,N}(t) \stackrel{d}{=} Z(nt)$ for $t = 1, \ldots, T$. The computation therefore focuses on the moving average series $Z(m)$, $m = 1, \ldots, nT$. Let $\tilde{L}_\alpha(j)$ be the $n(N+T)$-periodic series with $\tilde{L}_\alpha(j) := L_\alpha(j)$ for $j = 1, \ldots, n(N+T)$, and let $\tilde{g}_{H,n}(j) := g_{H,n}(j)$ for $j = 1, \ldots, nN$ and $\tilde{g}_{H,n}(j) := 0$ for $j = nN+1, \ldots, n(N+T)$. Then

$$\{Z(m)\}_{m=1}^{nT} \stackrel{d}{=} \left\{\sum_{j=1}^{n(N+T)}\tilde{g}_{H,n}(j)\,\tilde{L}_\alpha(m - j)\right\}_{m=1}^{nT} \qquad (23.77)$$

because for all $m = 1, \ldots, nT$, the summation in equation (23.75) involves only $L_\alpha(j)$ with indices $j$ in the range $-nN \le j \le nT - 1$. Using a circular convolution of the two $n(N+T)$-periodic series $\tilde{g}_{H,n}$ and $\tilde{L}_\alpha$, computed by means of their discrete Fourier transforms (DFT), the variables $Z(m)$, $m = 1, \ldots, nT$ (i.e., the fractional stable noise) can be generated.
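For small sample sizes, the moving average (23.75) can be evaluated directly, which makes the role of the kernel (23.76) and of the sub-sampling $Y_{n,N}(t) \stackrel{d}{=} Z(nt)$ easy to see. The sketch below does this in $O(nN \cdot nT)$ operations and does not implement the FFT-based circular convolution, which computes the same sums faster. It assumes symmetric stable innovations (skewness zero) with $\alpha \ne 1$, drawn with the Chambers-Mallows-Stuck formula; all function names are illustrative.

```python
import math
import random

def positive_part_power(x, a):
    """x_+^a, with the convention that the value is 0 whenever x <= 0."""
    return x ** a if x > 0.0 else 0.0

def kernel(j, n, H, alpha):
    """g_{H,n}(j) as in equation (23.76)."""
    a = H - 1.0 / alpha
    return (positive_part_power(j / n, a)
            - positive_part_power(j / n - 1.0, a)) * n ** (-1.0 / alpha)

def symmetric_stable(alpha, rng=random):
    """Standard symmetric alpha-stable draw (Chambers-Mallows-Stuck, alpha != 1)."""
    v = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    return (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
            * (math.cos(v - alpha * v) / w) ** ((1.0 - alpha) / alpha))

def moving_average(g, noise, length):
    """Z(m) = sum_{j=1}^{J} g[j-1] * L(m - j), where noise[k] stores L(k - J + 1)."""
    J = len(g)
    return [sum(g[j - 1] * noise[m - j + J - 1] for j in range(1, J + 1))
            for m in range(1, length + 1)]

def simulate_fsn(T, n, N, H, alpha, rng=random):
    """Approximate fractional stable noise Y(1..T) through Y(t) = Z(n t)."""
    g = [kernel(j, n, H, alpha) for j in range(1, n * N + 1)]
    noise = [symmetric_stable(alpha, rng) for _ in range(n * T + len(g) - 1)]
    z = moving_average(g, noise, n * T)
    return [z[n * t - 1] for t in range(1, T + 1)]
```

A larger mesh size `n` and cut-off `N` improve the Riemann-sum approximation at the cost of more work per observation, which is the motivation for the FFT-based evaluation described in the text.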

23.6.4 Implications of Fractal Processes

Fractal processes have been applied to the study of computer networks. (Leland et al. 1994; Willinger et al. 1997) employ fractal processes in modeling Ethernet traffic. (Feldmann et al. 1998) discuss fractal processes in the measurement of TCP/IP and ATM WAN traffic. (Paxson and Floyd 1995; Paxson 1997; Feldmann et al. 1998) discuss the characteristics of self-similarity in wide-area traffic with respect to fractal processes. (Crovella and Bestavros 1997) provide evidence of self-similarity in world-wide web traffic by means of fractal process modeling. An extensive bibliographical review of the research on network traffic and network performance involving fractal processes and self-similarity is provided by (Willinger et al. 1996). (Sahinoglu and Tekinay 1999) survey studies on the self-similarity phenomenon in multimedia traffic and its implications for network performance.

(Baillie 1996) provides a survey of the major econometric research on long-range dependence processes, fractional integration, and applications in economics and finance. (Bhansali and Kokoszka 2006) review recent research on long-range dependence time series. Theoretical and empirical research on long-range dependence in economics and finance is provided by (Robinson 2003; Rangarajan and Ding 2006; Teyssière and Kirman 2006). Based on the modeling mechanism of fractal processes, (Sun et al. 2006) empirically compare fractional stable noise with several alternative distributional assumptions in either fractal form or i.i.d. form (i.e., normal distribution, fractional Gaussian noise, generalized extreme value distribution, generalized Pareto distribution, and stable distribution) for modeling returns of major German stocks. The empirical results suggest that fractional stable noise dominates these alternative distributional assumptions in both in-sample modeling and out-of-sample forecasting. This finding suggests that the model built on a non-Gaussian non-random walk (fractional stable noise) performs better than models based on the Gaussian random walk, the Gaussian non-random walk, or the non-Gaussian random walk.

23.7 Summary

In this chapter, we review recent studies in finance that use intra-daily price and return data. A large body of research confirms stylized facts (for example, random durations, heavy tailedness, seasonality, and long-range dependence) based on intra-daily data from both the equity market and the foreign exchange market. We summarize the stylized facts reported in the intra-daily data literature. We discuss the research fields in finance where the use of intra-daily data has attracted increased interest and potential applications, such as stylized facts modeling, market volatility and liquidity analysis, and market microstructure theory. Recent research shows that intra-daily data exhibit a strong dependence structure. In order to bring together the study of long-range dependence and intra-daily analysis, we briefly introduce research on long-range dependence and fractal processes. The principal problem in long-range dependence modeling is measuring the self-similarity coefficient, i.e., the Hurst index. We review the studies on long-range dependence from three viewpoints: in the time domain, in the frequency domain, and with respect to econometric models. Finally, we introduce the modeling of long-range dependence in intra-daily data with the help of fractal processes.

References

Adams G, McQueen G, Wood R (2004) The effects of inflation news on intradaily stock returns. Journal of Business 77-3: 547−574
Aït-Sahalia Y, Mykland P, Zhang L (2005) How often to sample a continuous-time process in the presence of market microstructure noise. Review of Financial Studies 18: 356−416
Alexander C (2001) Market Models: A Guide to Financial Data Analysis. John Wiley & Sons, New York


Andersen T, Bollerslev T (1997) Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance 4: 115−158
Andersen T, Bollerslev T (1998) Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review 39: 885−905
Andersen T, Bollerslev T, Das A (2001) Variance-ratio statistics and high-frequency data: testing for changes in intraday volatility patterns. Journal of Finance 59-1: 305−327
Andersen T, Bollerslev T, Diebold F, Ebens H (2001) The distribution of realized stock return volatility. Journal of Financial Economics 61: 43−76
Andersen T, Bollerslev T, Diebold F, Labys P (2000) Exchange rate returns standardized by realized volatility are (nearly) Gaussian. Multinational Finance Journal 4: 159−179
Andersen T, Bollerslev T, Diebold F, Labys P (2001) The distribution of realized exchange rate volatility. Journal of the American Statistical Association 96: 42−55
Andersen T, Bollerslev T, Diebold F, Labys P (2003) Modeling and forecasting realized volatility. Econometrica 71: 529−626
Andersen T, Bollerslev T, Diebold F (2005) Parametric and nonparametric measurements of volatility. In: Aït-Sahalia Y, Hansen L (eds) Handbook of Financial Econometrics. North Holland Press, Amsterdam
Andersen T, Bollerslev T, Christoffersen P, Diebold F (2006) Volatility and correlation forecasting. In: Elliott G, Granger C, Timmermann A (eds) Handbook of Economic Forecasting. North Holland Press, Amsterdam
Andersen T, Bollerslev T, Christoffersen P, Diebold F (2005) Practical volatility and correlation modeling for financial market risk management. In: Carey M, Stulz R (eds) Risks of Financial Institutions. University of Chicago Press
Baillie R (1996) Long memory processes and fractional integration in econometrics. Journal of Econometrics 73: 5−59
Baillie R, Bollerslev T (1991) Intra-day and inter-market volatility in foreign exchange rates. Review of Economic Studies 58: 565−585
Baillie R, Bollerslev T, Mikkelsen H (1996) Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 74: 3−30
Banerjee A, Urga G (2005) Modeling structural breaks, long memory and stock market volatility: an overview. Journal of Econometrics 129: 1−34
Bardet J, Lang G, Oppenheim G, Philippe A, Taqqu M (2003) Generators of long-range dependent processes: a survey. In: Doukhan P, Oppenheim G, Taqqu M (eds) Theory and Applications of Long-range Dependence. Birkhäuser, Boston
Barndorff-Nielsen O, Shephard N (2002) Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of Royal Statistical Society B 64-2: 253−280
Bauwens L, Veredas D (2004) The stochastic conditional duration model: a latent variable model for the analysis of financial durations. Journal of Econometrics 119: 381−412

578

Wei Sun et al.

Bauwens L, Giot P (2000) The logarithmic ACD model: an application to the bidask quote process of three NYSE stocks. Annales d’Economie et de Statistique 60: 117−149 Bauwens L, Giot P (2001) Econometric Modelling of Stock Market Intraday Activity. Kluwer Academic Publishers, Boston Bauwens L, Giot P (2003) Asymmetric ACD models: introducing price information in ACD models. Empirical Economics 28-4: 709−731 Bauwens L, Hautsch N (2006) Modelling financial high frequency data using point processes. In Handbook of Financial Time Series. Springer-Verlag, to appear Beltratti A, Morana C (1999). Computing value at risk with high frequency data. Journal of Empirical Finance 6: 431−455 Beran J (1994) Statistics for Long-Memory Processes. Chapman & Hall, New York Bessembinder H, Venkataraman K (2004) Does an electronic stock exchange need an upstairs market? Journal of Financial Economics 73: 3−36 Bhansali R, Kokoszaka P (2006) Prediction of long-memory time series: a tutorial review In: Rangarajan G, Ding M (eds) Processes with Long-Range Correlations: Theory and Applications. Springer, New York Blair B, Poon S, Taylor S (2001) Forecasting S& P 100 volatility: the incremental information content of implied volatilities and high-frequency index returns. Journal of Econometrics 105: 5−26 Bloomfield R, O’Hara M (1999) Market transparency: who wins and who loses? Review of Financial Studies 2-1: 5−35 Bloomfield R, O’Hara M (2000) Can transparent markets survive? Journal of Financial Economics 55-3: 425−459 Boehmer E, Saar G, Yu L (2005) Lifting the veil: an analysis of pre-trade transparency at the NYSE. Journal of Empirical Finance 10: 321−353 Bollen B, Inder B (2002) Estimating daily volatility in financial markets utilizing intraday data. Journal of Empirical Finance 9: 551−562 Bollen N, Smith T, Whaley R (2004) Modeling the bid/ask spread: measuring the inventory holding premium. 
Journal of Finance 60: 97−141 Bollerslev T, Chou R, Kroner K (1992) ARCH modeling in finance: A review of the theory and empirical evidence. Journal of Econometrics 52: 783−815 Bollerslev T, Domowitz I (1993) Trading patterns and prices in the interbank foreign exchange market. Journal of Finance 48: 1421−1443 Bollerslev T, Mikkelsen H (1996) Modelling and pricing long-memory in stock market volatility. Journal of Econometrics 73: 154−184 Bollerslev T, Cai J, Song F (2000) Intraday periodicity, long memory volatility and macroeconomic announcement effects in the US treasury bond market. Journal of Empirical Finance 7: 37−55 Bollerslev T, Wright J (2001) High-frequency data, frequency domain inference and volatility forecasting. The Review of Economics and Statistics 83-4: 596−602 Bolerslev T, Zhang Y (2003) Measuring and modeling systematic risk in factor pricing models using high-frequency data. Journal of Empirical Finance 10: 533−558

23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data

579

Breidt F, Crato N, de Lima P (1998) The detection and estimation of long memory in stochastic volatility. Journal of Econometrics 73: 325−348 Chordia T, Roll R, Subrahmanyam A (2002) Order imbalance, liquidity and market returns. Journal of Financial Economics 65: 111−130 Chung K, Van Ness B (2001) Order handling rules, tick size and the intraday pattern of bid-ask spreads for Nasdaq stocks. Journal of Financial Markets 4: 143−161 Corsi F, Kretschmer U, Mittnik S, Pigorsch C (2005) The volatility of realized volatility. CFS Working Paper No.2005/33 Coughenour J, Deil D (2002) Liquidity provision and the organizational form of NYSE specialist firms. Journal of Finance 57: 841−869 Coval J, Shumway T (2001) Is sound just noise? Journal of Finance 56: 1887−1910 Crovella M, Bestavros A (1997) Self-similarity in World Wide Web traffic: evidence and possible causes. IEEE/ACM Transactions on Networking 5-6: 835−846 Dacorogna M, Gençay R, Müller U, Olsen R, Pictet O (2001) An Introduction of High-Frequency Finance. Academic Press, San Diego Diebold F, Inoue A (2001) Long memory and regime switching. Journal of Econometrics 105: 131−159 Doukhan P, Oppenheim G, Taqqu M (2003) (eds) Theory and Applications of Long-Range Dependence. Birkhäuser, Boston Dufour A, Engle R (2000) Time and the price impact of a trade. Journal of Finance 55: 2467−2498 Easley D, O’Hara M (2003) Microstructure and asset pricing. In: Constantinides G, Harris M, Stulz R (eds) Handbook of the Economics of Finance. Elsevier, North Holland Engle R (1982) Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 50: 987−1008 Engle R (2000) The econometrics of ultra-high frequency data. Econometrica 68: 1−22 Engle R, Russell J (1998) Autoregressive conditional duration: a new model for irregularly spaced transaction data. Econometrica 66: 1127−1162 Engle R, Russell J (2004) Analysis of high frequency data. 
In: Aït-Sahalia Y, Hansen L (eds) Handbook of Financial Econometrics. North Holland Press, Amsterdam Faust J, Rogers J, Swanson E, Wright J (2003) Identifying the effects of monetary policy shocks on exchange rates using high frequency data. Journal of European Economic Association 1: 1031−1057 Feldmann A, Gilbert A, Willinger W (1998) Data networks as cascades: investing the multifractal nature of Internet WAN traffic. ACM Computer Communication Review 28: 42−55 Feldmann A, Gilbert A, Willinger W, Kurtz T (1998) The changing nature of network traffic: scaling phenomena. ACM Computer Communication Review 28: 5−29

580

Wei Sun et al.

Feng D, Jiang G, Song P (2004) Stochastic conditional duration models with “leverage effect” for financial transaction data. Journal of Financial Econometrics 119: 413−433 Fernandes M, Gramming J (2005) Nonparametric specification tests for conditional duration models. Journal of Econometrics 127: 35−68 Fleming J, Kirby C, Ostdiek B (2003) The economic value of volatility timing using “realized” volatility. Journal of Financial Economics 67: 47−509 Franke G, Hess D (2000) Information diffusion in electronic and floor trading. Journal of Empirical Finance 7: 455−478 Gençay R, Selçuk F, Whitcher B (2001) Differentiating intraday seasonalities through wavelet multi-scaling. Physica A 289: 543−556 Gençay R, Selçuk F, Whitcher B (2002) An Introduction toWavelets and Other Filtering Methods in Finance and Economics. Elsevier, San Diego Gençay R, Ballocchi G, Dacorogna M, Olsen R, Pictet O (2002) Real-time trading models and the statistical properties of foreign exchange rates. International Economic Review 42: 463−491 Gerhard F, Hautsch N (2002) Volatility estimation on the basis of price intensities. Journal of Empirical Finance 9: 57−89 Geweke J, Porter-Hudak S (1983) The estimation and application of long memory time series models. Journal of Time Series Analysis 4: 221−238 Ghysels E (2000) Some econometric recipes for high-frequency data cooking. Journal of Business and Economic Statistics 18: 154−163 Ghysels E, Gouriéroux C, Jasiak J (2004) Stochastic volatility duration models. Journal of Econometrics 119: 413−433 Giot P, Laurent S (2004) Modeling daily Value-at-Risk using realized volatility and ARCH type models. Journal of Empirical Finance 11: 379−398 Giraitis L, Surgailis D (1990) A central limit theorem for quadratic forms in strongly dependent random variables and its application to asymptotical normality of Whittle’s estimate. 
Probability Theory and Related Fields 86: 87−104 Gleason K, Mathur I, Peterson M (2004) Analysis of intraday herding behavior among the sector ETFs. Journal of Empirical Finance 11: 681−694 Goodhard C, O’Hara M (1997) High frequency data in financial market: issues and applications. Journal of Empirical Finance 4: 73−114 Goodhard C, Payne R (2000) (ed) The Foreign Exchange Market: Empirical Studies with High-Frequency Data. Palgrave, Macmillan Gouriéroux C, Jasiak J (2001) Financial Econometrics: Problems, Models and Methods. Princetion University Press, Princeton and Oxford Gramming J, Maurer K (1999) Non-monotonic hazard functions and the autoregressive conditional duration model. The Econometrics Journal 3: 16−38 Gwilym O, Sutcliffe C (1999) High-Frequency Financial Market Data. Risk Books Haldrup N, Nielsen M (2006) A regime switching long memory model for electricity prices. Journal of Econometrics 135: 349−376 Hannan E (1973) The asymptotic theory of linear time series models. Journal of Applied Probability 10: 130−145 Harris L (2003) Trading and exchanges. Oxford university press, Oxford

23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data

581

Hasbrouck J (1996) Modeling microstructure time series. In: Maddala G, Rao C (eds) Statistical Methods in Finance (Handbook of Statistics, Volume 14). North-Holland, Amsterdam Haas M, Mittnik S, Paolella M (2004) Mixed normal conditional heteroskedasticity. Journal of Financial Econometrics 2: 211−250 Higuchi T (1988) Approach to an irregular time series on the basis of the fractal theory. Physica D 31: 277−283 Hong H, Wang J (2000) Trading and returns under periodic market closures. Journal of Finance 55: 297−354 Hotchkiss E, Ronen T (2002) The informational efficiency of the corporate bond market: an intraday analysis. The Review of Financial Studies 15: 1325−1354 Huang R, Stoll H (2001) Tick size, bid-ask spreads, and market structure. Journal of Financial and Quantitative Analysis 36: 503−522 Hurst H (1951) Long-term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers 116: 770−808 Hurvich C, Brodeski J (2001) Broadband semiparametric estimation of the memory parameter of a long memory time series using fractional exponential models. Journal of Time Series Analysis 22: 222−249 Jain P, Joh G (1988) The dependence between hourly prices and trading volume. Journal of Financial and Quantitative Analysis 23: 269−284 Jasiak J (1998) Persistence in intertrade durations. Finance 19: 166−195 Kalay A, Sade O,Wohl A (2004) Measuring stock illiquidity: an investigation of the demand and supply schedules at the TASE. Journal of Financial Economics 74: 461−486 Kolmogoroff A (1940) Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum. Comptes Rendus (Doklady) Acad. Sci. URSS (N.S.) 26: 115−118 Kryzanowski L, Zhang H (2002) Intraday market price integration for shares cross-listed internationally. Journal of Financial and Quantitative Analysis 37: 243−269 Leland W, Taqqu M, Willinger W, Wilson D (1994) On the self-similar nature of Ethernet traffic. 
IEEE/ACM Transactions on Networking 2: 1−15 Lamperti J (1962) Semi-stable stochastic Processes. Transactions of the American Mathematical Society 104: 62−78 Lo A (1991) Long-term memory in stock market prices. Econometrica 59: 1279−1313 Longstaff F, Wang A (2004) Electricity forward prices: a high-frequency empirical analysis. Journal of Finance 59: 1877−1900 Madhavan A (2000) Market Microstructure: A survey. Journal of Financial Markets 3: 205−258 Maejima M (1983) On a class of self-similar processes. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 62: 235−245 Maejima M, Rachev S (1987) An ideal metric and the rate of convergence to a self-similar process. Annals of Probability 15: 702−727

582

Wei Sun et al.

Mandelbrot B (1963) New methods in statistical economics. Journal of Political Economy 71: 421−440 Mandelbrot B (1975) Limit theorems on the self-normalized range of weakly and strongly dependent processes. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 31: 271−285 Mandelbrot B (1983) The Fractal Geometry of Nature. Freeman, San Francisco Mandelbrot B,Wallis J (1969) Computer experiments with fractional Gaussian noises. Water Resources Research 5: 228−267 Mandelbrot B Taqqu M (1979) Robust R/S analysis of long-run serial correlation. Bulletin of the International Statistical Institute 48: 69−104 Mansfield P, Rachev S, Samorodnitsky G (2001) Long strange segments of a stochastic process and long-range dependence. The Annals of Applied Probability 11: 878−921 Marinelli C, Rachev S, Roll R, Göppl H (2000) Subordinated stock price models: heavy tails and long-range dependence in the high-frequency Deutsche Bank price record. In: Bol G, Nakhaeizadeh G, Vollmer K (eds) Datamining and Computational Finance. Physica-Verlag, Heidelberg Marinucci D, Robinson P (1999) Alternative forms of fractional Brownian motion. Journal of Statistical Planning and Inference 80: 111−122 Martens M (2001) Forecasting daily exchange rate volatility using intraday returns. Journal of International Money and Finance 20: 1−23 Martens M, Chang Y, Taylor S (2002) A comparison of seasonal adjustment methods when forecasting intraday volatility. The Journal of Financial Research 25: 283−299 Masulis R, Shivakumar L (2002) Does market structure affect immediacy of stock price responses to news? Journal of Financial and Quantitative Analysis 37: 617−648 McInish T,Wood R (1992) An analysis of intradaily patterns in bid/ask spreads for NYSE stocks. Journal of Finance 47: 753−746 Mittnik S, Paolella M, Rachev S (2002) Stationary of stable power-GARCH processes. 
Journal of Econometrics 106: 97−107 Morana C, Beltratti A (2004) Structural change and long-range dependence in volatility of exchange rate: either, neither or both? Journal of Empirical Finance 11: 629−658 Moulines E, Soulier P (1999) Log-periodogram regression of time series with long-range dependence. Annals of Statistics 27: 1415−1439 Müller U, Dacorogna M, Olsen R, Pictet O, Schwarz M, Morgenegg C (1990) Statistical study of foreign exchange rates, empirical evidence of a price change scaling law, and intraday analysis. Journal Banking and Finance 14: 1189−1208 Müller U, Dacorogna M, Dave R, Olsen R, Pictet O, von Weizsacker J (1997) Volatilities of different time resolutions: analyzing the dynamics of market components. Journal of Empirical Finance 4: 213−239 Naik N, Yadav P (2003) Do dealer firms manage inventory on a stock-by-stock or a portfolio basis? Journal of Financial Economics 69: 235−353

23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data

583

Nelson D (1991) Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59: 347−370 Neilsen B, Shephard N (2004) Econometric analysis of realized covariation: high frequency based covariance, regression, and correlation in financial economics. Econometrica 72: 885−925 O’Hara M (1995) Market Microstructure Theory. Blackwell, Cambridge Paxson V (1997) Fast, approximation synthesis of fractional Gaussian noise for generating self-similar network traffic. Computer Communications Review 27: 5−18 Peng C, Buldyrev S, Simons M, Stanley H, Goldberger A (1994) Mosaic organization of DNA nucleotides. Physical Review E 49: 1685−1689 Peterson M, Sirri E (2002) Order submission strategy and the curious case of marketable limit orders. Journal of Financial and Quantitative Analysis 37: 221−241 Rachev S, Mittnik S (2000) Stable Paretian Models in Finance. Wiley, New York Rachev S, Samorodnitsky G (2001) Long strange segments in a long range dependent moving average. Stochastic Processes and their Applications 93: 119−148 Rachev S, Menn C, Fabozzi F (2005) Fat-Tailed and Skewed Asset Return Distributions. Wiley, New Jersey Rangarajan G, Ding M (2006) (ed) Processes with Long-Range Correlations: Theory and Applications. Springer, New York Robinson P (1991) Testing for strong serial correlation and dynamic conditional heteroskedasticity in multiple regression. Journal of Econometrics 47: 67−84 Robinson P (1994) Efficient tests of nonstationary hypotheses. Journal of the American Statistical Association 89: 1420−1437 Robinson P (1995a) Log periodogram regression of time series with long range dependence. The Annals of Statistics 23: 1048−1072 Robinson P (1995b) Gaussian semiparametric estimation of long range dependence. The Annals of Statistics 23: 1630−1661 Robinson P (2003) (ed) Time Series with Long Memory. Oxford University Press, New York Samorodnitsky G (1994) Possible sample paths of self-similar alpha-stable processes. 
Statistics and Probability Letters 19: 233−237 Samorodnitsky G (1996) A class of shot noise models for financial applications. In: Heyde C, Prohorov Y, Pyke R, Rachev S (eds) Proceeding of Athens International Conference on Applied Probability and Time Series. Volume 1: Applied Probability. Springer, Berlin Samorodnitsky G (1998) Lower tails of self-similar stable processes. Bernoulli 4: 127−142 Samorodnitsky G, Taqqu M (1994) Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman & Hall/CRC, Boca Raton Schwartz R, Francioni R (2004) Equity Markets in Action. John Wiley & Sons, Hoboken

584

Wei Sun et al.

Smith B, Turnbull D, White R (2001) Upstairs market for principal agency trades: analysis of adverse information and price effects. Journal of Finance 56: 1723−1746 Stoev S, Pipiras V, Taqqu M (2002) Estimation of the self-similarity parameter in linear fractional stable motion. Signal Processing 82: 1873−1901 Stoev S, TaqquM(2004) Simulation methods for linear fractional stable motion and FARIMA using the Fast Fourier Transform. Fractals 12: 95−121 Stoyanov S, Racheva-Iotova B (2004a) Univariate stable laws in the field of finance – approximation of density and distribution functions. Journal of Concrete and Applicable Mathematics 2: 38−57 Stoyanov S, Racheva-Iotova B (2004b) Univariate stable laws in the field of finance – parameter estimation. Journal of Concrete and Applicable Mathematics 2: 24−49 Stoyanov S, Racheva-Iotova B (2004c) Numerical methods for stable modeling in financial risk management. In: Rachev S (ed) Handbook of Computational and Numerical Methods. Birkhäuser, Boston Spulber D (1999) Market Microstructure: Intermediaries and the Theory of the Firm. Cambridge University Press, Cambridge Sun W, Rachev S, Fabozzi F (2006a) Fratals or i.i.d.: evidence of long-range dependence and heavy tailedness form modeling German equity market volatility. Technical Report, University of Karlsruhe and UCSB Sun W, Rachev S, Fabozzi F, Kalev P (2006b) Fratals in duration: capturing longrange dependence and heavy tailedness in modeling trade duration. Technical Report, University of Karlsruhe and UCSB Taqqu M, Teverovsky V (1998) Estimating long-range dependence in finite and infinite variance sereis. In: Adler R, Feldman R, Taqqu M (eds) A Practical Guide to Heavy Tails. Birkhäuser, Boston Taqqu M, Teverovsky V, Willinger W (1995) Estimators for long-range dependence: an empirical study. Fractals 3: 785−798 Taylor S, Xu X (1997) The incremental volatility information in one million foreign exchange quotations. 
Journal of Empirical Finance 4: 317−340 Teverovsky V, Taqqu M (1995) Testing for long-range dependence in the presence of shifting means or a slowly declining trend using a variance-type estimator. Preprint Teverovsky V, Taqqu M, Willinger W (1999) A critical look at Lo’s modified R/S statistic. Journal of Statistical Planning and Inference 80: 211−227 Teyssiére G, Kirman A (2006) (eds) Long Memory in Economics. Springer, Berlin Theissen E (2002) Price discovery in floor and screen trading systems. Journal of Empirical Finance 9: 455−474 Thomakos D, Wang T (2003) Realized volatility in the futures markets. Journal of Empirical Finance 10: 321−353 Tsay R (2002) Analysis of Financial Time Series. John Wiley & Sons, New York Velasco C, Robinson P (2000) Whittle pseudo-maximum likelihood estimation for nonstationary time series. Journal of the American Statistical Association 95: 1229−1243

23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data

585

Veredas D, Rodrìguez-Poo J, Espasa A (2002) On the (intraday) seasonality of a financial point process: a semiparametric approach. Working Paper, CORE DP 2002/23, Université catholique de Louvain Wasserfallen W, Zimmermann H (1985) The behavior of intradaily exchange rates. Journal of Banking and Finance 9: 55−72 Weston J (2002) Electronic communication networks and liquidity on the Nasdaq. Journal of Financial Services Research 22: 125−139 Whittle R (1951) Hypothesis Testing in Time Series Analysis. Uppsala, Almqvist Willinger W, Taqqu M, Erramilli A (1996) A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks. Stochastic Networks: Theory and Applications, Royal Statistical Society Lecture Notes Series, Vol. 4. Oxford University Press, Oxford Willinger W, Taqqu M, Sherman R, Wilson D (1997) Self-similarity through high-variability: statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Transactions on Networking 5: 71−86 Willinger W, Paxson V, Taqqu M (1998) Self-similarity and heavy tails: structural modeling of network traffic. In: Adler R, Feldman R, Taqqu M (eds) A Practical Guide to Heavy Tails. Birkhäuser, Boston Wood R, McInish T, Ord J (1985) An investigation of transaction data for NYSE stocks. Journal of Finance 40: 723−739 Zhang M, Russell J, Tsay R (2001) A nonlinear autoregressive conditional duration model with applications to financial transaction data. Journal of Econometrics 104: 179−207 Zumbach G, Müller U (2001) Operators on inhomogeneous time series. International Journal of Theoretical and Applied Finance 4: 147−178

CHAPTER 24
Bayesian Applications to the Investment Management Process

Biliana Bagasheva, Svetlozar (Zari) Rachev, John Hsu, Frank Fabozzi

24.1 Introduction

There are several tasks in the investment management process. These include setting the investment objectives, establishing an investment policy, selecting a portfolio strategy, allocating assets, and measuring and evaluating performance. Bayesian methods have been either used or proposed as a tool for improving the implementation of several of these tasks.

There are three principal reasons for using Bayesian methods in the investment management process. First, they allow the investor to account for the uncertainty about the parameters of the return-generating process and the distributions of returns for asset classes, and to incorporate prior beliefs in the decision-making process. Second, they address a deficiency of the standard statistical measures in conveying the economic significance of the information contained in the observed sample of data. Finally, they provide an analytically and computationally manageable framework in models where a large number of variables and parameters makes classical formulations a formidable challenge.

The goal of this chapter is to survey selected Bayesian applications to investment management. In Section 24.2, we discuss the single-period portfolio problem, emphasizing how Bayesian methods improve the estimation of the moments of returns, primarily the mean. In Section 24.3, we describe the mechanism for incorporating asset-pricing models into the investment decision-making process. Tests of mean-variance efficiency are surveyed in Section 24.4. We explore the implications of predictability for investment management in Section 24.5 and then provide concluding remarks in Section 24.6.

24.2 The Single-Period Portfolio Problem

The portfolio choice problem represents a primary example of decision-making under uncertainty. Let $r_{T+1}$ denote the $N \times 1$ vector of next-period returns, $W$ current wealth, and $\omega$ the vector of asset allocations (fractions of wealth allocated to the corresponding stocks). We denote next-period wealth by $W_{T+1} = W(1 + \omega' r_{T+1})$ in the absence of a risk-free asset and by $W_{T+1} = W(1 + r_f + \omega' r_{T+1})$ when a risk-free asset with return $r_f$ is present. In a one-period setting, the optimal portfolio decision consists of choosing $\omega$ to maximize the expected utility of next-period wealth,

$$\max_{\omega} E\left(U(W_{T+1})\right) = \max_{\omega} \int U(W_{T+1})\, p(r \mid \theta)\, dr, \tag{24.1}$$

subject to feasibility constraints, where $\theta$ is the parameter vector of the return distribution and $U$ is a utility function generally characterized by a quadratic or a negative exponential functional form. A key component of Eq. (24.1) is the distribution of returns $p(r \mid \theta)$, conditional on the unknown parameter vector $\theta$. The traditional implementation of the mean-variance framework¹ proceeds by setting $\theta$ equal to its estimate $\hat{\theta}(r)$, obtained from some estimator applied to the data $r$ (often the maximum likelihood estimator). Then, the investor's problem in Eq. (24.1) leads to the optimal allocation given by

$$\omega^{*} = \arg\max_{\omega} E\left(U(\omega' r) \mid \theta = \hat{\theta}(r)\right). \tag{24.2}$$
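As a rough illustration of the plug-in rule in Eq. (24.2), the following sketch computes unconstrained mean-variance weights from estimated moments for two highly correlated assets. The objective $\omega'\mu - (\lambda/2)\,\omega'\Sigma\omega$, the risk-aversion value, the function names, and all numbers are assumptions made for the example, not taken from the chapter. It shows how a one-percentage-point change in a single estimated mean can turn a long position into a short one.

```python
def inverse_2x2(matrix):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def plug_in_weights(mu_hat, sigma_hat, risk_aversion=1.0):
    """Maximizer of w'mu - (risk_aversion / 2) w'Sigma w with the
    estimates treated as if they were the true parameters."""
    inv = inverse_2x2(sigma_hat)
    return [
        sum(inv[i][j] * mu_hat[j] for j in range(2)) / risk_aversion
        for i in range(2)
    ]


# Hypothetical estimates: 20% volatility each, correlation 0.95.
sigma_hat = [[0.04, 0.038], [0.038, 0.04]]

w_base = plug_in_weights([0.08, 0.08], sigma_hat)
w_bumped = plug_in_weights([0.09, 0.08], sigma_hat)  # first mean raised by 0.01

print("weights with equal means:  ", w_base)
print("weights after a 1pt change:", w_bumped)
```

Because the weights are proportional to $\hat{\Sigma}^{-1}\hat{\mu}$, a nearly singular covariance estimate amplifies small errors in the estimated means.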

The solution in Eq. (24.2), known as the certainty equivalent solution, treats the estimated parameters as the true ones and completely ignores the effect of the estimation error on the optimal decision. The resulting portfolio displays high sensitivity to small changes in the estimated means, variances, and covariances, and usually contains large long and short positions that are difficult to implement in practice.² Starting with the work of Zellner and Chetty (1965), several early studies investigate the effect of parameter uncertainty on optimal portfolio choice by reexpressing Eq. (24.1) in terms of the predictive density function.³ The predictive density function reflects estimation risk explicitly since it integrates over the posterior distribution, which summarizes the uncertainty about the model parameters, updated with the information contained in the observed data. The optimal Bayesian portfolio problem takes the form

$$
\begin{aligned}
\max_{\omega} E_{\theta}\left\{ E_{r \mid \theta}\left( U(W_{T+1}) \mid \theta \right) \right\}
&= \max_{\omega} \iint U(W_{T+1})\, p(r_{T+1} \mid \theta)\, p(\theta \mid r)\, d\theta\, dr_{T+1} \\
&= \max_{\omega} \int U(W_{T+1}) \left[ \int p(r_{T+1} \mid \theta)\, p(\theta \mid r)\, d\theta \right] dr_{T+1},
\end{aligned}
\tag{24.3}
$$

¹ The mean-variance selection rule of Markowitz (1952), given by $\min_{\omega} \omega'\Sigma\omega$ s.t. $\omega'\mu \ge \mu^{*}$, $\omega'\iota = 1$, where $\mu$ is the vector of expected returns, $\Sigma$ is the covariance matrix of returns, and $\iota$ is a compatible vector of ones, provides the same set of admissible portfolios as the quadratic-type expected-utility maximization in Eq. (24.1). Markowitz and Usmen (1996) point out that the conventional wisdom that the necessary conditions for application of mean-variance analysis are normal probability distribution and/or quadratic utility is a "misimpression" (Markowitz and Usmen 1996, p. 217). Almost optimal solutions are obtained using a variety of utility functions and distributions. For example, it is possible to weaken the distribution condition to members of the location-scale family. See Ortobelli, Rachev, and Schwartz (2004).
² See, for example, Best and Grauer (1991).
³ See, for example, Barry (1974), Winkler and Barry (1975), Klein and Bawa (1976), Brown (1976), Jobson, Korkie, and Ratti (1979), Jobson and Korkie (1980), and Chen and Brown (1983).
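The double integral in Eq. (24.3) can be approximated by simulation: draw $\theta$ from the posterior, draw a return from the sampling density given that $\theta$, and average the utility over the draws. The sketch below does this for a single risky asset under assumptions chosen purely for tractability: normal returns with known variance, a conjugate normal prior on the mean, negative exponential utility, and hypothetical numbers and function names throughout.

```python
import math
import random


def posterior_for_mean(sample_mean, T, sigma2, mu0, tau2):
    """Posterior N(mean, var) of mu when returns are N(mu, sigma2) with
    sigma2 known and the prior is mu ~ N(mu0, tau2)."""
    precision = T / sigma2 + 1.0 / tau2
    mean = (T / sigma2 * sample_mean + mu0 / tau2) / precision
    return mean, 1.0 / precision


def predictive_draws(post_mean, post_var, sigma2, n_draws, rng):
    """Draw theta from the posterior, then a return given theta."""
    draws = []
    for _ in range(n_draws):
        mu = rng.gauss(post_mean, math.sqrt(post_var))
        draws.append(rng.gauss(mu, math.sqrt(sigma2)))
    return draws


def expected_utility(weight, draws, risk_aversion=3.0, wealth=1.0):
    """Monte Carlo average of U(W) = -exp(-a W), W = wealth * (1 + weight * r)."""
    total = 0.0
    for r in draws:
        total -= math.exp(-risk_aversion * wealth * (1.0 + weight * r))
    return total / len(draws)


rng = random.Random(0)  # fixed seed so the run is reproducible
sigma2 = 0.0025
post_mean, post_var = posterior_for_mean(0.01, 24, sigma2, 0.005, 0.0004)
draws = predictive_draws(post_mean, post_var, sigma2, 50_000, rng)

# Crude grid search over the fraction of wealth in the risky asset.
grid = [0.2 * k for k in range(16)]
best_weight = max(grid, key=lambda w: expected_utility(w, draws))

draw_mean = sum(draws) / len(draws)
draw_var = sum((r - draw_mean) ** 2 for r in draws) / len(draws)
print("predictive variance (simulated):", draw_var)
print("sampling variance + posterior variance:", sigma2 + post_var)
print("best weight on the grid:", best_weight)
```

The simulated predictive variance exceeds the sampling variance by roughly the posterior variance of the mean, which is how estimation risk enters the investor's perception of portfolio risk in this simple setting.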

where, by Bayes' rule, the posterior density $p(\theta \mid r)$ is proportional to the product of the sampling density (the likelihood function) and the prior density, $f(r \mid \theta)\, p(\theta)$.

The multivariate normal distribution is the simplest and most convenient choice of sampling distribution in the context of portfolio selection, even though empirical evidence does not fully support this model.⁴ In the case where no particular information (intuition) about the model parameters is available prior to observing the data, the decision-maker has diffuse (non-informative) prior beliefs, usually expressed in the form of the Jeffreys prior $p(\mu, \Sigma) \propto |\Sigma|^{-(N+1)/2}$, where $\mu$ and $\Sigma$ are, respectively, the mean vector and the covariance matrix of the multivariate normal return distribution, $N$ is the number of assets in the investment universe, and $\propto$ denotes "proportional to". The joint predictive distribution of returns is then a multivariate Student-t distribution.

Informative prior beliefs are usually cast in a conjugate framework to ensure analytical tractability of the posterior and predictive distributions. The predictive distribution is multivariate normal only when the covariance $\Sigma$ is assumed known and $\mu$ is asserted to have the conjugate prior $N(\mu_0 \iota, \tau^2 I)$, where $\mu_0$ stands for the prior mean, $\iota$ is a vector of ones, and $\tau^2 I$ is the diagonal prior covariance matrix. When both parameters are unknown and conjugate priors are assumed (the conjugate prior for $\Sigma$ in a multivariate setting is an inverse-Wishart with scale parameter $S^{-1}$, where $S$ is the sample covariance matrix), the predictive distribution is multivariate Student-t.⁵

Klein and Bawa (1976) compare the Bayesian and certainty equivalent optimal solutions under the assumption of a diffuse prior for the parameters of the multivariate normal returns distribution (Barry (1974) asserts informative priors) and show that in both cases the admissible sets are the same up to a constant. However, the optimal choice differs in the two scenarios since portfolio risk is perceived differently in each case. Both the optimal individual investor's portfolio and the market portfolio have lower expected returns in the Bayesian setting. Brown (1976) shows that the failure to account for estimation risk leads to suboptimal solutions.

⁴ For example, see Fama (1965).
⁵ See, for example, Brown (1976).
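To make the difference in perceived risk concrete, the sketch below computes the factor by which the predictive covariance matrix exceeds the sample covariance matrix under the normal model with the Jeffreys prior. It relies on a standard result that is not stated in the text and is taken here as an assumption: with $S$ the unbiased sample covariance matrix, the predictive covariance equals $(1 + 1/T)(T-1)/(T-N-2)\, S$. The function name and the sample sizes are hypothetical.

```python
def predictive_covariance_factor(T, N):
    """Scalar c such that the predictive covariance is c * S under the
    normal model with the diffuse prior p(mu, Sigma) ~ |Sigma|^(-(N+1)/2).
    S is the unbiased sample covariance matrix; T is the number of
    observations and N the number of assets."""
    if T <= N + 2:
        raise ValueError("a finite predictive covariance needs T > N + 2")
    return (1.0 + 1.0 / T) * (T - 1.0) / (T - N - 2.0)


# Ten assets observed over progressively longer samples.
factors = {T: predictive_covariance_factor(T, N=10) for T in (24, 60, 120, 600)}
for T, factor in factors.items():
    print(f"T = {T:4d}: predictive covariance = {factor:.4f} x S")
```

Since the adjustment is a scalar multiple of $S$, the set of admissible portfolios is unchanged while the level of perceived risk rises, markedly so when $T$ is small relative to $N$.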


It is instructive to examine the posterior mean under the informative prior assumption. Assuming that $\Sigma = \sigma^2 I$, the $i$th element of $\mu$'s posterior mean has the form

$$E(\mu_i \mid r, \sigma^2) = \left( \frac{T}{\sigma^2} + \frac{1}{\tau^2} \right)^{-1} \left( \frac{T}{\sigma^2}\, \bar{r}_i + \frac{1}{\tau^2}\, \mu_0 \right), \tag{24.4}$$

where ri is the sample mean of asset i, and T is the sample size. The posterior mean is a weighted average of the prior and sample information; that is, the sample mean ri of asset i is shrunk to the prior mean μ0 . The degree of shrinkage depends on the strength of the confidence in the prior distribution, as measured by the prior precision 1 / τ 2 . The higher the prior precision, the stronger the influence of the prior mean on the posterior mean. Shrinking the sample mean reduces the sensitivity of the optimal weights to the sampling error in it. As a result, weights take less extreme values and their stability over time is improved. The prior distribution of μ could be made uninformative by choosing a very large prior variance 2 elements τ . In the extreme case of an infinite prior variance, the posterior mean coincides with the sample mean and the correction for estimation risk becomes insignificant (Brown 1979; Jorion 1985). The approach of employing shrinkage estimators as a way of accounting for uncertainty is rooted in statistics and can be traced back to (James and Stein 1961), who recognized the inadmissibility of the sample mean in a multivariate setting under a squared loss function. The James-Stein estimator given by

μˆ JS = δ μ 0ι + (1 − δ ) r ,

(24.5)

where $\bar{r} = (\bar{r}_1, \ldots, \bar{r}_N)'$ is the vector of sample means, has a uniformly lower risk than $\bar{r}$, regardless of the point $\mu_0$ towards which the means are shrunk.[6] However, the gains are greater the closer $\mu_0$ is to the true value. For the special case when the return covariance matrix has the form $\Sigma = \sigma^2 I$, $\sigma^2$ is known, and the number of assets $N$ is greater than 2, the weight $\delta$ is given by

$$\delta = \min\left\{1,\; \frac{(N-2)/T}{(\bar{r} - \mu_0\iota)'\,\Sigma^{-1}\,(\bar{r} - \mu_0\iota)}\right\}.$$
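The two shrinkage rules above can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function names and the sample numbers are hypothetical and serve only to show how Eq. (24.4) and the James-Stein weight $\delta$ behave. The weight is coded as $(N-2)/T$ divided by the quadratic form, as reconstructed above.

```python
import numpy as np

def posterior_mean_known_variance(r_bar, T, sigma2, mu0, tau2):
    """Eq. (24.4): precision-weighted average of the sample and prior means."""
    data_precision = T / sigma2        # precision of the sample mean
    prior_precision = 1.0 / tau2       # precision of the prior on mu_i
    r_bar = np.asarray(r_bar, dtype=float)
    return (data_precision * r_bar + prior_precision * mu0) / (
        data_precision + prior_precision)

def james_stein(r_bar, T, Sigma, mu0):
    """Eq. (24.5) with the weight delta stated in the text (requires N > 2)."""
    r_bar = np.asarray(r_bar, dtype=float)
    N = r_bar.size
    d = r_bar - mu0                         # deviation from the shrinkage point
    dist = d @ np.linalg.solve(Sigma, d)    # (r - mu0 i)' Sigma^{-1} (r - mu0 i)
    delta = 1.0 if dist == 0.0 else min(1.0, ((N - 2) / T) / dist)
    return delta * mu0 + (1.0 - delta) * r_bar, delta

# Hypothetical inputs: four assets, 60 observations, Sigma = sigma^2 I.
r_bar = np.array([0.02, 0.01, -0.01, 0.04])
Sigma = 0.04 * np.eye(4)
mu_js, delta = james_stein(r_bar, T=60, Sigma=Sigma, mu0=0.01)
mu_post = posterior_mean_known_variance(r_bar, T=60, sigma2=0.04, mu0=0.01, tau2=1e-4)
```

A small prior variance `tau2` pulls every element of `mu_post` toward `mu0`, while a very large `tau2` returns the sample means essentially unchanged, in line with the discussion of the uninformative case.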

Within the portfolio selection context, the effort was initiated with the papers of (Jobson, Korkie, and Ratti 1979; Jobson and Korkie 1980, 1981) and developed by (Jorion 1985, 1986; Grauer and Hakansson 1990). (Dumas and Jacquillat 1990) discuss Bayes-Stein estimation in the context of currency portfolio selection. While the choice of prior distributions is often guided by considerations of tractability, the parameters of the prior distributions (called hyperparameters) are determined in a rather subjective fashion. This has led some researchers to embrace the empirical Bayes approach, which uses sample information to determine the hyperparameter values and is at the heart of the Bayesian interpretation of shrinkage estimators. The shrinkage target is the grand mean of returns $M$:[7]

$$\mu \sim N(M, \tau\Sigma). \tag{24.6}$$

[6] (Berger 1980) points out that the inadmissibility of the sample mean in the frequentist case is translated into inadmissibility of the Bayesian rule under the assumption of a diffuse (improper) prior.

(Frost and Savarino 1986) and (Jorion 1986) employ it in an examination of the portfolio choice problem, asserting the conjugate inverse-Wishart prior for $\Sigma$. They estimate the prior parameters via maximum likelihood, assuming equality of the means, variances, and covariances. Comparing certainty-equivalent rates of return, they find that the optimal portfolios obtained in the Bayesian setting with informative priors outperform the optimal choices under both the classical and diffuse Bayes frameworks.[8] (Jorion 1986) assumes that $\Sigma$ is known and is replaced by its sample estimator $\frac{T-1}{T-N-2}\,S$. Jorion derives the so-called Bayes-Stein estimator of expected returns, a weighted average of the sample means and the mean of the global minimum variance portfolio, $\iota'\Sigma^{-1}\bar{r} \,/\, \iota'\Sigma^{-1}\iota$ (this portfolio is the solution to the variance minimization problem under the constraint that the weights sum to unity).[9] He finds that the Bayes-Stein shrinkage estimator significantly outperforms the sample mean, based on a comparison of the empirical risk function.[10] (Grauer and Hakansson 1990) observe that the portfolio strategies based on the Bayes-Stein and the James-Stein estimators are only marginally better than the historic mean strategies.

(Frost and Savarino 1986) obtain a shrinkage estimator not only for the mean vector but also for the covariance matrix of the predictive returns distribution, thus contributing to a relatively neglected area. A reason why there are relatively more studies concerned only with uncertainty about the mean (see also the discussion of the Black and Litterman model below) may be that optimal portfolio choice is highly sensitive to estimation error in the expected means, while variances and covariances (although also unknown) are more stable over time (Merton 1980). However, given that the optimal investor decision is the result of the trade-off between risk and return, efficient variance estimation seems to be no less important than mean estimation.[11]

[7] It is not unusual to assume that the degree of uncertainty about the mean vector is proportional to the volatilities of returns. A value of $\tau$ smaller than 1 reflects the intuition that uncertainty about the mean is lower than uncertainty about the individual returns.

[8] A certainty-equivalent rate of return is the risk-free rate of return which provides the same utility as the return on a given combination of risky assets.

[9] (Dumas and Jacquillat 1990) argue that in the international context this result introduces country-specific bias. They advocate shrinkage towards a portfolio which assigns equal weights to all currencies.

[10] The empirical risk function is computed as the loss of utility due to the estimation risk,

$$L(\omega^*, \hat{\omega}) = F_{\max} - F(\hat{\omega}),$$

averaged over repeated samples, where $\omega^*$ is the solution to (24.1) when the true parameter vector $\theta$ is known, $\hat{\omega}$ is the portfolio choice on the basis of the sample estimate $\hat{\theta}$, and $F_{\max}$ and $F$ are the corresponding values of the utility functions.
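The shrinkage target used in the Bayes-Stein approach, the mean of the global minimum variance portfolio, can be computed directly from a return sample. The sketch below assumes NumPy and assumes that $S$ is the usual unbiased sample covariance matrix; the function names, the simulated data, and the shrinkage weight of 0.5 are hypothetical, and the weight in particular is not Jorion's estimated weight.

```python
import numpy as np

def gmv_shrinkage_target(returns):
    """Mean of the global minimum variance portfolio and the scaled covariance."""
    returns = np.asarray(returns, dtype=float)   # shape (T, N)
    T, N = returns.shape
    if T <= N + 2:
        raise ValueError("need T > N + 2 for the scaled covariance estimator")
    r_bar = returns.mean(axis=0)
    S = np.cov(returns, rowvar=False)            # assumption: unbiased sample covariance
    Sigma_hat = (T - 1) / (T - N - 2) * S        # (T-1)/(T-N-2) * S, as in the text
    iota = np.ones(N)
    w = np.linalg.solve(Sigma_hat, iota)         # Sigma^{-1} iota
    w_gmv = w / (iota @ w)                       # weights sum to unity
    return w_gmv @ r_bar, w_gmv, Sigma_hat

def shrink_means(r_bar, target, weight):
    """Weighted average of the sample means and a common shrinkage target."""
    return weight * target + (1.0 - weight) * np.asarray(r_bar, dtype=float)

rng = np.random.default_rng(0)
sample = rng.normal(0.01, 0.05, size=(120, 5))   # simulated returns, illustration only
target, w_gmv, Sigma_hat = gmv_shrinkage_target(sample)
shrunk = shrink_means(sample.mean(axis=0), target, weight=0.5)  # 0.5 is arbitrary
```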

24.3 Combining Prior Beliefs and Asset Pricing Models

(Ledoit and Wolf 2003) develop a shrinkage estimator for the covariance matrix of returns in a portfolio selection setting, choosing as a shrinkage target the covariance matrix estimated from Sharpe's (Sharpe 1963) single-factor model of stock returns. They join a growing trend in the shrinkage estimator literature of deriving the shrinkage target structure from a model of market equilibrium. Equivalently, the asset pricing model serves as the reference point around which the investor builds prior beliefs. There is then a trade-off between the degree of confidence in the validity of the model and the information content of the observed data sample.

The influential work of Black and Litterman (Black and Litterman 1990, 1991, 1992) (BL) presumably constitutes the first analysis employing this approach.[12] Their model allows for a smooth and flexible combination of an asset pricing model, the Capital Asset Pricing Model (CAPM), and the investor's views. The CAPM is assumed to hold in general, and the investor's beliefs about expected stock returns can be expressed in the form of deviations from the model predictions.[13] Interpretations of the BL methodology from the Bayesian point of view are scarce and somewhat ambiguous (Satchell and Scowcroft 2000; He and Litterman 1999; Lee 2000; Meucci 2005), although, undoubtedly, the BL decision-maker is Bayesian. The excess returns of the $N$ assets in the investment universe are assumed to follow a multivariate normal distribution, $r \sim N(\mu, \Sigma)$.[14] The implied equilibrium risk premiums $\Pi$ are used as a proxy for the true equilibrium returns, and the distribution of expected equilibrium returns is centered on them, with a covariance matrix proportional to $\Sigma$:

$$\mu \sim N(\Pi, \tau\Sigma), \tag{24.7}$$

where the scalar $\tau$ indicates the degree of uncertainty in the CAPM.[15]

[11] See, for example, (Frankfurter, Phillips, and Seagle 1972).

[12] For example, (Jorion 1991) mentions the possibility of using the CAPM equilibrium forecasts to form prior beliefs but does not pursue the idea further.

[13] BL consider an equilibrium model, such as the CAPM, as the most appropriate neutral shrinkage target for expected returns, since equilibrium returns clear the market when all investors have homogeneous views.

[14] The covariance matrix $\Sigma$ is estimated outside of the model (see (Litterman and Winkelmann 1998) for the specific methodology) and considered as given.

The investor's views (linear combinations of expected asset returns) are expressed as probability distributions of the expected returns on the so-called "view" portfolios:

$$P\mu \sim N(Q, \Omega), \tag{24.8}$$

where $P$ is a $(K \times N)$ matrix whose rows correspond to the $K$ view portfolio weights. The magnitudes of the elements $\omega_i$ of $\Omega$ represent the degree of confidence the investor has in each view. There is no consensus as to which of the distributions in Eqs. (24.7) and (24.8) defines the prior and which the sampling density. (Satchell and Scowcroft 2000; Lee 2000; Meucci 2005) favor the position that the investor views constitute the prior information, which serves to update the equilibrium distribution of expected returns (in the role of the sampling distribution). This interpretation is in line with the Bayesian tradition of using subjective beliefs to construct the prior distribution. On the other hand, He and Litterman's (He and Litterman 1999) reference to Eq. (24.7) as the prior also has grounds in Bayesian theory. Suppose that we are able to take a sample from the population of future returns in which our subjective belief about the expected stock returns is realized. Then, a view could be interpreted as the information contained in this hypothetical sample.[16] The sample size corresponds to the degree of confidence the investor has in his view. The particular definition one adopts does not have a bearing on the results. Deriving the posterior distribution of expected returns is a straightforward application of conjugate analysis and yields the familiar result

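$$\mu \mid \Pi, Q, \Sigma, \Omega, \tau \sim N(\tilde{\mu}, \tilde{V}), \tag{24.9}$$

where the posterior mean and covariance matrix are given by

$$\tilde{\mu} = \left((\tau\Sigma)^{-1} + P'\Omega^{-1}P\right)^{-1}\left((\tau\Sigma)^{-1}\Pi + P'\Omega^{-1}Q\right) \tag{24.10}$$

and

$$\tilde{V} = \left((\tau\Sigma)^{-1} + P'\Omega^{-1}P\right)^{-1}. \tag{24.11}$$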
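The posterior moments in Eqs. (24.10) and (24.11) translate directly into a few lines of linear algebra. The following is a minimal sketch, assuming NumPy; the function name and the two-asset inputs are hypothetical, and explicit matrix inverses are used only to mirror the formulas, not as a recommendation for large problems.

```python
import numpy as np

def black_litterman_posterior(Pi, Sigma, P, Q, Omega, tau):
    """Posterior mean (24.10) and covariance (24.11) of expected returns."""
    Pi = np.asarray(Pi, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    P = np.atleast_2d(np.asarray(P, dtype=float))          # K x N view portfolios
    Q = np.atleast_1d(np.asarray(Q, dtype=float))          # K expected view returns
    Omega = np.atleast_2d(np.asarray(Omega, dtype=float))  # K x K view uncertainty
    prior_precision = np.linalg.inv(tau * Sigma)           # (tau Sigma)^{-1}
    Omega_inv = np.linalg.inv(Omega)
    V = np.linalg.inv(prior_precision + P.T @ Omega_inv @ P)
    mu = V @ (prior_precision @ Pi + P.T @ Omega_inv @ Q)
    return mu, V

# Hypothetical two-asset example with one relative view:
# "asset 1 outperforms asset 2 by 1%".
Sigma = np.array([[0.04, 0.012], [0.012, 0.09]])
Pi = np.array([0.03, 0.05])
P = np.array([[1.0, -1.0]])
Q = np.array([0.01])
mu_bl, V_bl = black_litterman_posterior(Pi, Sigma, P, Q, Omega=[[0.001]], tau=0.05)
```

With a very uncertain view (large `Omega`) the posterior mean stays at `Pi`; with a very confident view (small `Omega`) the posterior view return `P @ mu_bl` moves to `Q`.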

The estimator of expected returns in Eq. (24.10) clearly has the form of a shrinkage estimator (the weights of $\Pi$ and $Q$ sum to 1). When the level of certainty about the equilibrium returns increases ($\tau$ approaches 0), their weight $\left((\tau\Sigma)^{-1} + P'\Omega^{-1}P\right)^{-1}(\tau\Sigma)^{-1}$ increases and the investor optimally holds the market portfolio. If, on the contrary, belief in the deviations from equilibrium returns is strong, more weight is put on the views. (Lee 2000) extends the BL model to the tactical allocation problem. The equilibrium risk premiums $\Pi$ are replaced by the vector of expected excess returns corresponding to a neutral position with respect to tactical bets, i.e., to holding the benchmark portfolio.

[15] The equilibrium risk premiums $\Pi$ are the expected stock returns in excess of the risk-free rate, estimated within the CAPM framework. In the setting of the BL model, the vector $\Pi$ is determined by a procedure appropriately called "reverse optimization". The market-capitalization weights observed in the capital market are considered the optimal weights $\omega^*$. Using the estimate $\hat{\Sigma}$ of the covariance matrix, the risk premiums are backed out of the standard mean-variance result $\omega^* = (1/\lambda)\,\hat{\Sigma}^{-1}\Pi \,/\, \left(\iota'\hat{\Sigma}^{-1}\Pi\right)$, where $\lambda$ is the coefficient of relative risk aversion.

[16] See (Black and Litterman 1992) for this interpretation. Interpreting prior belief in terms of a hypothetical sample is not uncommon in Bayesian analysis. See also (Stambaugh 1999).

Admittedly, the BL methodology does not make use of all of the available information in historical returns, particularly the sample means. (Pastor 2000; Pastor and Stambaugh 1999) address this issue by developing a framework in which uncertainty in the validity of the asset pricing model is quantified in terms of the amount of model mispricing. The estimate of expected returns is a weighted average of the model prediction and the sample mean, thus incorporating the benefits of both the Bayes-Stein and the BL methodologies.[17] Let the return generating process for the stock's excess return be

$$r_t = \alpha + \beta' f_t + \varepsilon_t, \qquad t = 1, \ldots, T, \tag{24.12}$$

where $f_t$ denotes a $(K \times 1)$ vector of factor returns (returns to benchmark portfolios), and $\varepsilon_t$ is a mean-zero disturbance term. The slopes of the regression in Eq. (24.12) are then the stock's sensitivities (betas) to the factors. The stock's expected excess return implied by the model is

$$E(r_t) = \beta' E(f_t). \tag{24.13}$$

That is, the model implies that $\alpha = 0$.[18] When the investor believes there is some degree of pricing inefficiency in the model, the expected excess return will reflect this through an unknown mispricing term:

$$E(r_t) = \alpha + \beta' E(f_t). \tag{24.14}$$

In a single-factor model such as the CAPM, the benchmark portfolio is the market portfolio. In a multifactor model, the benchmarks could be zero-investment, non-investable portfolios whose behavior replicates the behavior of an underlying risk factor (sometimes called factor-mimicking portfolios)[19] or factors extracted from the cross-section of stock returns using principal components analysis.[20]

[17] The investigation of model uncertainty is expanded and explicitly modeled in the context of return predictability using the Bayesian Model Averaging framework by (Avramov 2000; Cremers 2002), among others. See Section 23.5.

[18] $\alpha$ is commonly interpreted as a representation of the skill of an active portfolio manager. (Pastor and Stambaugh 2000) point out that this interpretation is not infallible. For example, the benchmarks used to define $\alpha$ might not price all passive investments.

[19] See, for example, (Fama and French 1993).

[20] See (Connor and Korajczyk 1986).

(Pastor 2000) investigates the implications for portfolio selection of varying prior beliefs about $\alpha$. When beliefs about a pricing model are expressed, the prior mean of $\alpha$, $\alpha_0$, is set equal to zero. It could have a non-zero value when, for example, the investor expresses uncertainty about an analyst's forecast. The prior variance $\sigma_\alpha$ of $\alpha$ reflects the investor's degree of confidence in the prior mean: a zero value of $\sigma_\alpha$ represents dogmatic belief in the validity of the model, while $\sigma_\alpha = \infty$ suggests complete lack of confidence in its pricing power. (Pastor and Stambaugh 1999), investigating the cost of equity of individual firms, suggest that $\alpha_0$ could be set equal to the average ordinary least squares estimate from a subset (cross-section) of firms sharing common characteristics. (Pastor 2000) assumes normality of stock and factor returns, and conjugate uninformative priors for all parameters in Eq. (24.12) except $\alpha$. In the special case of one stock and one benchmark, the optimal weight in the stock is shown to be proportional to the ratio of the posterior mean of $\alpha$ to the posterior mean of the residual variance, $\tilde{\alpha}/\tilde{\sigma}^2$. The posterior mean $\tilde{\alpha}$ has the form of a shrinkage estimator:

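$$\begin{pmatrix}\tilde{\alpha}\\ \tilde{\beta}\end{pmatrix} = M^{-1}\left(\Psi^{-1}\begin{pmatrix}\alpha_0\\ \beta_0\end{pmatrix} + \left(\sigma^2 (X'X)^{-1}\right)^{-1}\begin{pmatrix}\hat{\alpha}\\ \hat{\beta}\end{pmatrix}\right), \tag{24.15}$$

where

$M = \Psi^{-1} + \left(\sigma^2 (X'X)^{-1}\right)^{-1}$,

$\sigma^2 (X'X)^{-1}$ = (sample) covariance estimator of the least-squares estimators $\hat{\alpha}$ and $\hat{\beta}$,

$\left(\sigma^2 (X'X)^{-1}\right)^{-1}$ = sample precision matrix, and

$\Psi^{-1}$ = prior precision matrix.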
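Eq. (24.15) combines the prior mean and the least-squares estimate with matrix weights given by their precisions. The sketch below illustrates this for one stock and one benchmark, assuming NumPy and, for simplicity, a known residual variance $\sigma^2$ (in Pastor's analysis the residual variance is itself uncertain). The function name and the simulated data are hypothetical.

```python
import numpy as np

def shrinkage_posterior_mean(X, r, prior_mean, Psi, sigma2):
    """Eq. (24.15): precision-weighted combination of prior mean and OLS estimate."""
    X = np.asarray(X, dtype=float)               # T x (1 + K), first column of ones
    r = np.asarray(r, dtype=float)
    XtX = X.T @ X
    ols = np.linalg.solve(XtX, X.T @ r)          # (alpha_hat, beta_hat)
    sample_precision = XtX / sigma2              # (sigma^2 (X'X)^{-1})^{-1}
    prior_precision = np.linalg.inv(Psi)         # Psi^{-1}
    M = prior_precision + sample_precision
    posterior = np.linalg.solve(
        M,
        prior_precision @ np.asarray(prior_mean, dtype=float)
        + sample_precision @ ols,
    )
    return posterior, ols

rng = np.random.default_rng(1)
T = 120
f = rng.normal(0.005, 0.04, size=T)                   # simulated benchmark excess returns
r = 0.002 + 1.1 * f + rng.normal(0.0, 0.02, size=T)   # simulated stock excess returns
X = np.column_stack([np.ones(T), f])
Psi = np.diag([0.001 ** 2, 10.0 ** 2])                # tight prior on alpha, loose on beta
post, ols = shrinkage_posterior_mean(X, r, prior_mean=[0.0, 1.0], Psi=Psi,
                                     sigma2=0.02 ** 2)
```

Letting the prior variance of $\alpha$ in `Psi` go to zero reproduces the dogmatic case ($\tilde{\alpha} = \alpha_0$), while a very large prior variance returns the least-squares estimate.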

Pastor's results demonstrate greater stability of optimal portfolio weights, which take less extreme values. Examining the home bias that is observed in international asset allocation studies, Pastor finds that the holdings of foreign equity observed for U.S. investors are consistent with a prior standard deviation $\sigma_\alpha$ equal to 1%, which is evidence of strong belief in the efficiency of the U.S. market portfolio.[21]

Building upon the recognition that no model is completely accurate, (Pastor and Stambaugh 2000) undertake an empirical investigation comparing three asset pricing models from the perspective of optimal portfolio choice, while accounting for investment constraints. The models are the CAPM, the Fama-French model, and the Daniel-Titman model.[22] Pastor and Stambaugh explore the economic significance of different investors' perceptions of the degree of model accuracy by comparing the loss in certainty-equivalent return from holding portfolio A (the choice of an investor with complete faith in model A), when in fact the decision-maker has full confidence in model B or C. They observe that when the degree of certainty in a model is less than 100%, cross-model differences diminish (the certainty-equivalent losses are smaller). Investment constraints dramatically reduce the differences between models, which is in line with Wang's (Wang 1998) conclusion that imposing constraints acts to weaken the perception of inefficiency of the benchmark portfolio (see Section 24.4).

[21] Home bias is a term used to describe the observed tendency of investors to hold a larger proportion of their equity in domestic stocks than suggested by the weight of their country in the value-weighted world equity portfolio.

24.4 Testing Portfolio Efficiency

Empirical tests of mean-variance efficiency in the Bayesian context of both the CAPM and the Arbitrage Pricing Theory (APT) can be divided into two categories. The first one focuses on the intercepts of the multivariate regressions describing the CAPM

$$r_i = \alpha + \beta\, r_M + \varepsilon_i, \qquad i = 1, \ldots, N, \tag{24.16}$$

and the APT

$$r_t = \alpha + \beta_1 f_{1,t} + \cdots + \beta_k f_{k,t} + u_t, \qquad t = 1, \ldots, T, \tag{24.17}$$

where returns are in risk-premium form (in excess of the risk-free rate), $r_M$ in (24.16) is the market risk premium, $f_{j,t}$ is the risk premium (return) of factor $j$ at time $t$, and $\beta_j$ is the return's exposure (sensitivity) to factor $j$. As in the previous section, the pricing implications of the CAPM and the APT yield the restriction that the elements of the parameter vector $\alpha$ are jointly equal to zero. Therefore, the null hypothesis of mean-variance efficiency is equivalent to the null hypothesis of no mispricing in the model.[23] The test relies on the computation of the posterior odds ratio.

At the heart of the tests in the second category lies the computation of the posterior distributions of certain measures of portfolio inefficiency. A strand of the pricing model testing literature focuses on the utility loss as a measure of the economic significance of deviations from the pricing restrictions, for example, by comparing the certainty-equivalent rate of return. (McCulloch and Rossi 1990) follow this approach.

[22] The (Fama and French 1993) model is a factor model in which expected stock returns are linear functions of the stock loadings on common pervasive factors. Book-to-market ratio and size-sorted portfolios are proxies for the factors. The Daniel and Titman (1997) model is a characteristic-based model. Expected returns are linear functions of firms' characteristics. Co-movements of stocks are explained by firms possessing common characteristics, rather than by exposure to the same risk factors, as in the Fama-French model.

[23] When returns are expressed in risk-premium form, and expected returns are linear combinations of exposures to $K$ sources of risk, the mean-variance efficient portfolio is a combination of the $K$ benchmark (factor) portfolios, and performing the test above in the context of the APT is equivalent to testing for mean-variance efficiency of this portfolio.

24.4.1 Tests Involving Posterior Odds Ratios

(Shanken 1987; Harvey and Zhou 1990; McCulloch and Rossi 1991) employ posterior odds ratios to test the point hypotheses of the restrictions implied by the CAPM (the first two studies) and the APT (the third study). The test of efficiency can be expressed in the usual way:

$$H_0: \alpha = 0 \quad \text{vs.} \quad H_1: \alpha \neq 0. \tag{24.18}$$

The investor’s belief that the null hypothesis is true is incorporated in the prior odds ratio, and then updated with the data to obtain the posterior odds ratio. The posterior odds ratio is the product of the ratio of predictive densities under the two hypotheses and the prior odds and is given by

$$G = \frac{p(\alpha = 0 \mid r)}{p(\alpha \neq 0 \mid r)} = \frac{p(r \mid \alpha = 0)}{p(r \mid \alpha \neq 0)}\,\frac{p(\alpha = 0)}{p(\alpha \neq 0)}, \tag{24.19}$$

where $r$ denotes the data.[24] It is often assumed that the prior odds ratio is 1 when no particular prior intuition favoring the null or the alternative exists. Then, $G$ becomes

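$$G = \frac{\int L(\beta, \Sigma \mid \alpha = 0)\, p_0(\beta, \Sigma)\, d\beta\, d\Sigma}{\int L(\alpha, \beta, \Sigma \mid \alpha \neq 0)\, p_1(\alpha, \beta, \Sigma)\, d\alpha\, d\beta\, d\Sigma}, \tag{24.20}$$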

where $L(\beta, \Sigma \mid \alpha = 0)$ is the likelihood function $L(\alpha, \beta, \Sigma)$ evaluated at $\alpha = 0$. Since the posterior odds ratio is interpreted as the probability that the null is true divided by the probability that the alternative is true, a low value of the posterior odds provides evidence against the null hypothesis that the benchmark portfolio is mean-variance efficient.

Assume the disturbances in Eq. (24.16) are independently and identically distributed (i.i.d.) normal with a zero mean vector and a covariance matrix $\Sigma$. (Harvey and Zhou 1990) explore three distributional scenarios: a multivariate Cauchy distribution, a multivariate normal distribution, and a Savage density ratio approach. In the first two scenarios, the prior distribution under the null is taken to be a diffuse one:

$$p_0(\beta, \Sigma) \propto |\Sigma|^{-(N+1)/2}. \tag{24.21}$$

Under the alternative, the prior is

$$p_1(\alpha, \beta, \Sigma) \propto |\Sigma|^{-(N+1)/2} f(\alpha \mid \Sigma), \tag{24.22}$$

where $f(\alpha \mid \Sigma)$ is the prior density function of $\alpha$ (a multivariate Cauchy or a multivariate normal).

[24] We assume that $p(\alpha = 0)$ and $p(\alpha \neq 0)$ are strictly greater than zero.

Following (McCulloch and Rossi 1991), Harvey and Zhou also investigate the so-called Savage density ratio method,[25] asserting a conjugate prior under the alternative hypothesis, $p_1(\alpha, \beta, \Sigma) = N(\alpha, \beta \mid \Sigma)\, IW(\Sigma)$ ($N$ denotes the normal density and $IW$ the inverted Wishart density). The prior under the null is:

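$$p_0(\beta, \Sigma) = p_1(\alpha, \beta, \Sigma \mid \alpha = 0) = \left.\frac{p_1(\alpha, \beta, \Sigma)}{\int p_1(\alpha, \beta, \Sigma)\, d\beta\, d\Sigma}\right|_{\alpha = 0}. \tag{24.23}$$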

Intuitively, large deviations of the intercepts from zero provide greater evidence against the null hypothesis under the multivariate normal prior than under the multivariate Cauchy prior. Therefore, the normal prior is expected to produce a lower posterior odds ratio than the Cauchy prior. The Savage density assumption leads to a simplification of the posterior odds. Assuming a prior odds ratio equal to 1,

$$G = \left.\frac{p(\alpha \mid r)}{p(\alpha)}\right|_{\alpha = 0}, \tag{24.24}$$

where both the marginal posterior density of $\alpha$ in the numerator and the prior density in the denominator can be shown to be multivariate Student-t densities.

In an examination of the efficiency of the market index, (Harvey and Zhou 1990) find that the posterior odds increase monotonically for increasing levels of dispersion in the prior distributions. Both the Cauchy and the normal priors provide evidence against the null. The posterior probability of mean-variance efficiency varies between 8.9% and 15.5% under the normal assumption, and between 26.2% and 27.2% under the Cauchy assumption. The Savage prior case is analyzed for three different prior assumptions of relative efficiency of the market portfolio, reflected in the choice of hyperparameters of $\beta$ and $\Sigma$.[26] The Savage prior offers more evidence against the null than the normal and Cauchy priors: the probability of efficiency is generally less than 1%.

(McCulloch and Rossi 1991) explore the pricing implications of the APT and observe great variability of the posterior odds ratio in response to changing levels of spread of the Savage prior.[27] The ratio in the high-spread specification exceeds the one in the low-spread case by more than 40 times when a five-factor model is considered. Overall, evidence against the null hypothesis is weak in the case of the one-factor model (except in the high-variance scenario) and mixed in the case of the five-factor model. McCulloch and Rossi caution, however, against drawing conclusions about the benefit of adding more factors to the one-factor model. The addition of factors needs to be analyzed in a different posterior-odds framework, in which the restriction of zero coefficients of the new factors is imposed.

[25] The Savage density ratio method involves selecting a particular form of the prior density under the null, as in Eq. (24.23), which results in the simplification of the posterior odds ratio in Eq. (24.24).

[26] Relative efficiency is measured by the correlation $\rho$ between the given benchmark index and the tangency portfolio; $\rho = 1$ implies efficiency of the benchmark. (Shanken 1987) shows that in the presence of a risk-free asset, $\rho$ is equal to the ratio between the Sharpe measure (ratio) of the benchmark portfolio and the Sharpe measure of the tangency portfolio (which is the maximum Sharpe measure).

[27] A parallel could be drawn between McCulloch and Rossi's (McCulloch and Rossi 1990, 1991) investigation and the traditional two-pass regression procedure for testing the APT. The authors first extract the factors using the principal components approach of (Connor and Korajczyk 1986) and then perform the Bayesian analysis. In contrast, (Geweke and Zhou 1996) adopt a single-stage procedure in which the posterior distribution of a measure of the APT pricing error is obtained numerically. Admittedly, the Geweke-Zhou approach can only be applied to a relatively small number of assets, in contrast to the McCulloch-Rossi approach.

24.4.2 Tests Involving Inefficiency Measures

Investors are often less interested in an efficiency test offering a "binary" outcome (reject/do not reject) than in an investigation of the degree of inefficiency of a benchmark portfolio. (Kandel, McCulloch, and Stambaugh 1995) take up this argument and develop a framework for testing the CAPM in which the posterior distribution of an inefficiency measure is computed.[28] (Wang 1998) extends their analysis to incorporate investment constraints. Denote by $p$ the portfolio whose efficiency is being tested and by $x$ the efficient portfolio with the same variance as $p$. Then, the observation that the expected return of $p$ is less than or equal to the expected return of $x$ immediately suggests an intuitive measure of portfolio $p$'s inefficiency:

$$\Delta = \mu_x - \mu_p, \tag{24.25}$$

where $\mu_j$ denotes the expected return of portfolio $j$. The benchmark portfolio is efficient if and only if $\Delta = 0$. The non-negative value of $\Delta$ could also be interpreted as the loss of expected return from holding portfolio $p$ instead of the efficient portfolio $x$ (carrying the same risk as $p$). Another measure of inefficiency explored by Kandel, McCulloch, and Stambaugh is $\rho$, the correlation between $p$ and any efficient portfolio. The posterior densities of $\Delta$ and $\rho$ do not have closed-form solutions under standard diffuse prior assumptions about the mean vector $\mu$ and the covariance matrix $\Sigma$ of the risky asset returns. An application of the Monte Carlo methodology, however, makes their evaluation straightforward. Suppose the posterior densities of the mean and covariance are given by $p(\mu \mid \Sigma, r)$ and $p(\Sigma \mid r)$, respectively. Then, draws from the (approximate) posterior distribution of $\Delta$ and $\rho$ are obtained by drawing repeatedly from the posterior distributions of $\mu$ and $\Sigma$ and then computing the corresponding values of $\Delta$ and $\rho$.

[28] (Shanken 1987; Harvey and Zhou 1990) also discuss similar measures.
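The Monte Carlo evaluation of the posterior of $\Delta$ in Section 24.4.2 can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of any of the cited papers: it assumes NumPy, the diffuse prior $p(\mu, \Sigma) \propto |\Sigma|^{-(N+1)/2}$ (for which $\Sigma \mid r$ is inverse-Wishart and $\mu \mid \Sigma, r$ is normal), no risk-free asset, and no short-sale constraints, so that the efficient portfolio with the same variance as $p$ has a closed-form expected return. All function names and the simulated data are hypothetical.

```python
import numpy as np

def draw_posterior(returns, n_draws, rng):
    """Yield (mu, Sigma) draws from the posterior under the diffuse prior."""
    returns = np.asarray(returns, dtype=float)   # shape (T, N), needs T - 1 >= N
    T, N = returns.shape
    r_bar = returns.mean(axis=0)
    dev = returns - r_bar
    scale_inv = np.linalg.inv(dev.T @ dev)       # inverse of sum of squared deviations
    scale_inv = (scale_inv + scale_inv.T) / 2.0  # enforce exact symmetry
    for _ in range(n_draws):
        # Sigma ~ inverse-Wishart(T-1, scale): invert a Wishart(T-1, scale^{-1}) draw.
        z = rng.multivariate_normal(np.zeros(N), scale_inv, size=T - 1)
        Sigma = np.linalg.inv(z.T @ z)
        Sigma = (Sigma + Sigma.T) / 2.0
        mu = rng.multivariate_normal(r_bar, Sigma / T)   # mu | Sigma, r
        yield mu, Sigma

def inefficiency(mu, Sigma, x_p):
    """Delta = mu_x - mu_p, with x on the frontier at the variance of p."""
    iota = np.ones(len(mu))
    Si_iota = np.linalg.solve(Sigma, iota)
    Si_mu = np.linalg.solve(Sigma, mu)
    A, B, C = iota @ Si_iota, iota @ Si_mu, mu @ Si_mu
    D = A * C - B ** 2
    var_p = x_p @ Sigma @ x_p
    # Upper branch of the frontier: var = (A m^2 - 2 B m + C) / D solved for m.
    mu_x = (B + np.sqrt(max(D * (A * var_p - 1.0), 0.0))) / A
    return mu_x - x_p @ mu

rng = np.random.default_rng(0)
sample = rng.normal(0.01, 0.05, size=(120, 4))   # simulated returns, illustration only
x_p = np.ones(4) / 4.0                           # hypothetical benchmark weights
deltas = [inefficiency(mu, Sigma, x_p)
          for mu, Sigma in draw_posterior(sample, 200, rng)]
```

The list `deltas` approximates the posterior distribution of $\Delta$; its mean or quantiles can then be reported. Investment constraints of the kind discussed in this section would replace the closed-form `mu_x` by a numerical constrained optimization for each draw.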


Kandel, McCulloch, and Stambaugh observe an interesting divergence of results depending on whether or not a risk-free asset is available in the capital market. For example, in the absence of a risk-free asset, most of the mass of the posterior distribution of $\rho$ lies between $-0.1$ and $0.3$, while when the risk-free asset is included, the posterior mass shifts to the interval 0.89 to 0.94 (suggesting a shift from a very weak to a very strong correlation between the benchmark and the efficient portfolio). Similarly, the posterior mass of $\Delta$ lies farther away from $\Delta = 0$ in the former than in the latter case. An investigation into the extent to which the data influence the posterior of $\rho$ reveals that informative, rather than diffuse, priors are necessary to extract the information about inefficiency contained in the data in the presence of a risk-free asset and that, in general, the prior's influence on the posterior is strong. When the risk-free asset is excluded, the data update the prior better, and the results show that the benchmark portfolio (composed of NYSE and AMEX stocks) is highly correlated with the efficient portfolio.

The methodology of Kandel, McCulloch, and Stambaugh is easily adapted to account for investment constraints in testing for mean-variance efficiency of a portfolio. (Wang 1998) proposes to modify $\Delta$ in the following way to incorporate short-sale constraints:

$$\tilde{\Delta} = \max_{x}\left[\, x'\mu - x_p'\mu \;\middle|\; x \geq 0,\; x'\Sigma x \leq x_p'\Sigma x_p \,\right], \tag{24.26}$$

where $x_p$ are the weights of the given benchmark portfolio under consideration, $x$ are the weights of the efficient portfolio, and $x'\mu$ and $x_p'\mu$ are the expected portfolio returns, denoted by $\mu_x$ and $\mu_p$, respectively, in (24.25). The constraint modification to reflect a 50% margin requirement[29] is $x_i \geq -0.5$, $i = 1, \ldots, N$. For each set of draws from the approximate posteriors of $\mu$ and $\Sigma$, the constrained optimization in (24.26) is performed and a draw of $\Delta$ is obtained. Wang compares the posterior distributions of the inefficiency measures with and without investment constraints. When no constraints are imposed, the posterior mean of $\Delta$ is 20.9% (indicating that a portfolio outperforming the benchmark by roughly 20% could be constructed). Imposing the 50% margin constraint brings the posterior mean of $\Delta$ down to 8.37%, while when short sales are not allowed, the posterior mean decreases to 4.25%. Thus, the benchmark's inefficiency decreases as stricter investment constraints are included in the analysis. Additionally, (Wang 1998) observes that uncertainty about the degree of mispricing declines with the imposition of constraints, making the posterior distribution of $\Delta$ less dispersed.

[29] A 50% margin requirement is a restriction on the size of the total short sale position an investor could take. The short sale position can be no more than 50% of the invested capital.

24.5 Return Predictability

Predictability in returns impacts optimal portfolio choice in several ways. First, it brings in horizon effects. Second, it makes possible the implementation of market timing strategies. Third, it introduces different sources of hedging demand. In this section we explore how these three consequences of predictability are examined in the Bayesian literature. With the exception of (Kothari and Shanken 1997), who investigate a Bayesian test of the null hypothesis of no predictability, most of the predictability literature focuses on the implications of predictability for the optimal portfolio choice, rather than on accepting or rejecting the null hypothesis, since portfolio performance and utility gains (losses) provide natural measures for assessing predictive power.

24.5.1 The Static Portfolio Problem

The vector autoregressive (VAR) framework is a convenient and compact tool to model the return-generating process and the dynamics of the endogenous predictive variables. For the simple case of one predictor, its form is:

$$\begin{aligned} r_t &= \alpha + \beta x_{t-1} + \varepsilon_t, \\ x_t &= \theta + \rho x_{t-1} + u_t, \end{aligned} \tag{24.27}$$

where $r_t$ is the excess stock return (return on a portfolio of stocks) in period $t$, $x_{t-1}$ is a lagged predictor variable whose dynamics are described by a first-order autoregressive model, and $\varepsilon_t$ and $u_t$ are correlated disturbances. The vector $(\varepsilon_t, u_t)'$ is assumed to have a bivariate normal distribution with a zero mean vector and a covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_\varepsilon^2 & \sigma_{\varepsilon u} \\ \sigma_{\varepsilon u} & \sigma_u^2 \end{pmatrix}.$$

The predictor is a variable such as the dividend yield, the book-to-market ratio, and interest rate variables, or lagged values of the continuously compounded excess return rt .30 The dividend yield is considered a prime predictor candidate and all of the studies discussed below use it as the sole return predictor. The investor maximizes the expected utility, weighted by the predictive distribution as in Eq. (24.1). (Kandel and Stambaugh 1996) examine the problem in Eq. (24.1) in a static, single-period investment horizon setting, while (Barberis 2000) extends it to consider multi-period horizon stock allocations with optimal rebalancing. Kandel and Stambaugh investigate a no-predictability informative prior for B and Σ . They do so by constructing it as the posterior distribution that would result from combining the diffuse prior p(B, Σ ) ∝ Σ − ( N + 2 ) / 2 with a hypothetical sample identical to the real sample, save for a sample coefficient of determination R 2 equal to 30

Numerous empirical studies of predictability have identified variables with predictive power. See, for example, (Fama 1991).

602

Biliana Bagasheva et al.

zero.31 The behavior of the optimal stock allocations is analyzed over a range of values of the predictors, for a number of samples that differ by the number of predictors N, the sample size T, and the regression R 2 . Kandel and Stambaugh’s results confirm an intuitive relation between the optimal stock allocation and the current value of the predictor variable, xT . Specifically, the greater the positive difference between the one-step ahead fitted value bˆ xT and the returns’ long-term average r = bˆ x , the higher the stock allocation. Kandel and Stambaugh put forward a related criterion for assessing the economic significance of predictability evidence. The optimal allocation ω a in the case when xT = x (where x is the long-term average of the predictor variable) is no longer optimal when xT ≠ x . Then, a comparison of the certainty-equivalent returns associated with the expected utilities of the optimal allocations when xT = x and when xT ≠ x allows one to examine the economic implications (if any). (Kandel and Stambaugh 1996) emphasize the important departure of the evidence of economic significance from the evidence of statistical significance. For example, given an R 2 (unadjusted) from the predictive regression of only 0.025 (implying a p-value of 0.75 of the standard regression F statistic), the investor optimally allocates 0% of his wealth to stocks when predicted return bˆ xT is one standard deviation below its long-term average r , but 61% when bˆ xT = r , under a diffuse prior and a coefficient of risk aversion equal to 2. Under the nopredictability informative prior, the allocations are, respectively, 53% and 83%. Therefore, statistical insignificance of the predictability evidence does not translate into economic insignificance. The mechanism through which predictability affects portfolio choice is further enriched by the investigation of (Barberis 2000), who ties the Kandel and Stambaugh’s framework to the issue of a varying investment horizon. 
Incorporating parameter uncertainty into the portfolio problem tends to reduce optimal stock holdings, and this horizon effect is, not surprisingly, stronger at long horizons than at short horizons. In contrast, when the possibility of predictable returns is taken into account, the risk of stocks perceived by a buy-and-hold investor at long horizons diminishes because the variance of cumulative returns grows more slowly than linearly with the horizon. Thus, a higher proportion of wealth is allocated to stocks at long horizons than when returns are assumed to be i.i.d., and these differences increase with the horizon.32 Analyzing the interaction of the two opposing tendencies, Barberis finds that introducing estimation risk, in a static

31 In a related paper, (Stambaugh 1999) characterizes the economic importance of the sample evidence of predictability by considering hypothetical samples carrying the same information content about $B$ and $\Sigma$ as the actual sample but differing in the value of $y_T$.

32 Empirically observed mean-reversion in returns (negative serial correlation) helps explain the horizon effect. However, Barberis notes that predictability itself may be sufficient to induce this effect, even without mean-reversion. Specifically, the negative correlation between the unexpected returns and the dividend yield innovations is one condition for the horizon effect. See also (Avramov 2000), and Section 24.5 below.

24 Bayesian Applications to the Investment Management Process


setting, reduces the horizon effect for a risk-averse investor: the uncertainty about the process parameters adds to uncertainty about the forecasting power of the predictor(s) and increases risk at longer horizons. As a result, the 10-year buy-and-hold portfolio strategy of an investor with a risk aversion parameter of 10, who takes both predictability and uncertainty into account, results in up to a 50% lower allocation compared to the case of predictability only, with no estimation risk. Both (Barberis 2000) and (Stambaugh 1999) explore the sensitivity of the optimal allocation to varying the predictor's initial value, $x_0$. Long-horizon allocations under uncertainty generally increase with the horizon for low starting values of the predictor and decrease for high starting values, leading to a lesser sensitivity to the predictor's starting value. Stambaugh demonstrates that treating $x_0$ as a stochastic realization of the same process that generated $x_1, x_2, \ldots, x_T$, rather than considering it fixed, brings in additional information about the regression parameters and changes their posterior means. He observes that, when estimation risk is incorporated, the long-horizon (in particular, 20-year) optimal allocation is often decreasing in the predictor, even though the expected return is not. This pattern can be ascribed to the skewness of the predictive distribution. Incorporating uncertainty (particularly the uncertainty about the autoregressive coefficient of the predictor) induces positive skewness for low initial values of the predictor (leading to high allocations) and negative skewness for high initial values (leading to low allocations).
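The claim that predictability makes the variance of cumulative returns grow more slowly than linearly with the horizon can be checked with a short calculation. The sketch below assumes a simple first-order system, $r_{t+1} = a + b x_t + u_{t+1}$ and $x_{t+1} = \theta + \phi x_t + v_{t+1}$, with known parameters (so estimation risk is deliberately ignored) and hypothetical parameter values; it is an illustration, not the specification used in the studies cited above.

```python
import numpy as np

def per_period_variance(h, b, phi, sigma_u, sigma_v, rho):
    """Conditional variance of the h-period cumulative return, divided by h.

    Assumed model (hypothetical): r_{t+1} = a + b*x_t + u_{t+1},
    x_{t+1} = theta + phi*x_t + v_{t+1}, with corr(u, v) = rho and shocks
    independent across time. The shock v_{t+k} enters the cumulative return
    with loading c_k = b * (1 - phi**(h - k)) / (1 - phi).
    """
    k = np.arange(1, h + 1)
    c = b * (1.0 - phi ** (h - k)) / (1.0 - phi)
    total = np.sum(sigma_u**2 + 2.0 * c * rho * sigma_u * sigma_v + c**2 * sigma_v**2)
    return total / h

# Hypothetical parameter values
b, phi, sigma_u, sigma_v = 1.0, 0.95, 0.04, 0.002

one_period = per_period_variance(1, b, phi, sigma_u, sigma_v, rho=-0.9)
ten_neg = per_period_variance(10, b, phi, sigma_u, sigma_v, rho=-0.9)
ten_zero = per_period_variance(10, b, phi, sigma_u, sigma_v, rho=0.0)

print(f"per-period variance, h=1:          {one_period:.6f}")
print(f"per-period variance, h=10, rho<0:  {ten_neg:.6f}")
print(f"per-period variance, h=10, rho=0:  {ten_zero:.6f}")
```

With a sufficiently negative correlation between return shocks and predictor shocks, the per-period variance at ten periods falls below the one-period value, which is the horizon effect; with zero correlation, predictability alone raises it in this setup.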

24.5.2 The Dynamic Portfolio Problem

As mentioned earlier, market timing is one of the modifications to the portfolio allocation problem resulting from predictability. Suppose that an investor at time T with an investment horizon $T + \hat{T}$ follows a dynamic strategy and rebalances at each of the dates $T + 1, \ldots, T + \hat{T} - 1$. The new intertemporal context of the problem allows us to consider a new aspect of parameter uncertainty:33 not only does the investor not know the true parameters of the return-generating process, but the relationship between the returns and the predictors may also be time-varying. At time T, the Bayesian investor solves the portfolio problem taking into account that, at each rebalancing date, the posterior distribution of the parameters is updated with the new information. It turns out that this "learning" (Bayesian updating) process plays an important role in the way the investment horizon affects optimal allocations.34

33 An early discussion of the Bayesian dynamic portfolio problem in a discrete-time setting (without accounting for predictability) can be found in (Winkler and Barry 1975). (Grauer and Hakansson 1990) examine the performance of shrinkage and CAPM estimators in a dynamic, discrete-time setting.

34 (Merton 1971; Williams 1977) show that incorporating learning in a dynamic problem leads to the creation of a new state variable representing the investor's current beliefs. Here, the new state variables are the posterior estimates of the unknown parameters, whose dynamics might be nonlinear. If learning is ignored, the current dividend yield is the only state variable, and it fully characterizes the predictive return distribution.


The underlying factor driving changes in allocations across horizons is now a hedging demand: a risk-averse investor attempts to hedge against the perceived changes in the investment opportunity set (equivalently, in the state variables).35 (Barberis 2000) considers a discrete dynamic setting with i.i.d. stock returns to explore the effects of learning about the unconditional mean of returns and finds that uncertainty induces a very strong negative hedging demand at long horizons.36 A long-horizon investor who admits the possibility of learning about the unconditional mean in the future allocates substantially less to stocks than an investor with a buy-and-hold strategy. While the framework introduced by Barberis involves learning about the unconditional mean of returns only, (Brandt, Goyal, Santa-Clara, and Stroud 2004) address simultaneous learning about all model parameters. The utility loss from ignoring learning is substantial but is negatively related to the amount of past data available and to the investor's risk aversion parameter. Brandt et al. observe that the utility gains from accounting for uncertainty and for learning are of comparable size and increase with the horizon and the current predictor value. They break down the hedging demand and analyze its components: (1) the positive hedging component arising from the negative correlation between returns and changes in the dividend yield and (2) the negative hedging component due to the positive correlation between returns and changes in the model parameters. The aggregate effect can be positive at short horizons (up to five years) but turns negative for longer horizons. Brandt et al. also observe that learning about the mean of the dividend yield and about the correlation between returns and the dividend yield induces a positive hedging demand which could partially offset the negative hedging demand above.
A question of practical importance to investors is whether it is possible to take advantage of the evidence of predictability in practice. (Lewellen and Shanken 2002) offer an insightful, if disappointing, answer. They find that patterns in stock returns, such as predictability, which a researcher observes in the data, cannot be perceived by a rational investor.
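The learning mechanism behind the negative hedging demand can be made concrete with a one-step Bayesian update of the belief about the unconditional mean return. The sketch below assumes a normal prior on the mean and a known return variance (a simplification; the studies cited above treat richer settings), and all numbers are hypothetical.

```python
def update_mean_belief(prior_mean, prior_var, observed_return, return_var):
    """Normal-normal update of the belief about the unconditional mean return.

    Assumption (hypothetical): returns are i.i.d. normal with known variance
    `return_var`, and the prior on the mean is normal.
    """
    post_precision = 1.0 / prior_var + 1.0 / return_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + observed_return / return_var)
    return post_mean, post_var

# Hypothetical annual figures: prior mean 6% with standard deviation 3%,
# return volatility 18%, and an unexpectedly large realized return of 25%.
prior_mean, prior_var = 0.06, 0.03**2
post_mean, post_var = update_mean_belief(prior_mean, prior_var, 0.25, 0.18**2)

print(f"posterior mean: {post_mean:.4f}, posterior sd: {post_var ** 0.5:.4f}")
```

An unexpectedly large return revises the expected return upward, so realized returns and revisions of beliefs move together; this positive co-movement is what makes stocks look riskier to an investor who anticipates future learning.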

24.5.3 Model Uncertainty

(Avramov 2000; Cremers 2002) address what could be viewed as a deficiency shared by the predictability investigations above: model uncertainty, introduced

35 Hedging demands are introduced by (Merton 1973). An investor who is more risk averse than a log-utility investor (i.e., with a coefficient of risk aversion higher than 1) aims at hedging against reinvestment risk and increases his demand for stocks when their expected returns are low. Recall that expected stock returns are negatively correlated with realized stock returns.

36 The intuition behind the negative hedging demand is that an unexpectedly large return leads to an upward revision of the unconditional expected return.


by selecting and treating a certain return-generating process as if it were the true process. At the heart of Bayesian Model Averaging (BMA) is the computation of a weighted Bayesian predictive distribution of the “grand” model, in which individual models are weighted by their posterior probabilities.37 Suppose that each individual model has the form of a linear predictive regression:

$$ r_t = x_{j,t-1} B_j + \varepsilon_{j,t}, \tag{24.28} $$

where $r_t$ = (N × 1) vector of excess returns on N portfolios, $x_{j,t-1} = (1, z_{j,t-1})$, $z_{j,t-1}$ = ($k_j$ × 1) vector of predictors, observed at the end of t − 1, that belong to model j, $B_j$ = (($k_j$ + 1) × N) matrix of regression coefficients, and $\varepsilon_{j,t}$ = disturbance of model j, assumed to be normally distributed with mean 0 and covariance matrix $\Sigma_j$ (Avramov) or $\Sigma$ (Cremers).38 The framework requires that two groups of priors be specified: model priors (i.e., priors of inclusion of each variable in an individual model), and priors on the parameters $B_j$ and $\Sigma_j$ of each model. Each model could be viewed as equally likely a priori and assigned the diffuse prior $P(M_j) = 1/2^K$, where $M_j$, $j = 1, \ldots, 2^K$, is the j-th model. A different prior ties the model selection problem to the variable selection problem, as in (Cremers 2002):

$$ P(M_j) = \rho^{k_j} (1 - \rho)^{K - k_j}, \tag{24.29} $$

where $\rho$ denotes the probability of inclusion of a variable in model j (assumed equal for all variables, but easily generalized to reflect different degrees of prior confidence in subsets of the predictors).39 No predictability (no confidence in any of the potential predictors) is equivalent to not including any of the explanatory variables in the regression in (24.28). Then, returns are i.i.d., and, using (24.29), the model prior is $P(M_j) = (1 - \rho)^K$. The posterior probability of model $M_j$ is given by

$$ P(M_j \mid \Phi_t) = \frac{P(\Phi_t \mid M_j)\, P(M_j)}{\sum_{j=1}^{2^K} P(\Phi_t \mid M_j)\, P(M_j)}, \tag{24.30} $$

37 If K variables are entertained as potential predictors, there are $2^K$ possible models.

38 Both Avramov and Cremers treat the regression parameters $B_j$ as fixed. (Dangl, Halling and Randl 2005) consider a BMA framework with time-varying parameters.

39 (Pastor and Stambaugh 1999) observe that when the set of models considered includes one with a strong theoretical motivation (e.g., the CAPM), assigning a higher prior model probability to it is reasonable.


where $\Phi_t$ denotes all sample information available up to time t. The marginal likelihood function $P(\Phi_t \mid M_j)$ is obtained by integrating out the parameters $B_j$ and $\Sigma_j$:

$$ P(\Phi_t \mid M_j) = \frac{L(B_j, \Sigma_j; \Phi_j, M_j)\, P(B_j, \Sigma_j \mid M_j)}{P(B_j, \Sigma_j \mid \Phi_j, M_j)}, \tag{24.31} $$

where $L(B_j, \Sigma_j; \Phi_j, M_j)$ is the likelihood function corresponding to model $M_j$, $P(B_j, \Sigma_j \mid M_j)$ is the joint prior and $P(B_j, \Sigma_j \mid \Phi_j, M_j)$ is the joint posterior of the model parameters. The weighted predictive return distribution is given by:

$$ P(R_{t+\hat{T}} \mid \Phi_t) = \sum_{j=1}^{2^K} P(M_j \mid \Phi_t) \int P(B_j, \Sigma_j \mid \Phi_t, M_j) \times P(R_{t+\hat{T}} \mid B_j, \Sigma_j, M_j, \Phi_t)\, dB_j\, d\Sigma_j, \tag{24.32} $$

where $R_{t+\hat{T}}$ is the predicted cumulative return over the investment horizon $\hat{T}$. To express prior views on predictability, Cremers considers three quantities directly related to it: the expected coefficient of determination, $E(R^2)$, the expected covariance of returns, $E(\Sigma)$, and the probability of variable inclusion, $\rho$. He adopts conjugate priors for the parameters and includes a hyperparameter which penalizes large models. (Avramov 2000) uses a prior specification for $B_j$ and $\Sigma_j$ based on that of (Kandel and Stambaugh 1996). The size of the hypothetical prior sample, $T_0$, determines the strength of belief in lack of predictability (as $T_0$ increases, belief in predictability diminishes). Both Cremers and Avramov find in-sample and out-of-sample evidence of predictability.40 Avramov estimates a VAR model similar to Eq. (24.27). His variance decomposition of predicted stock returns into model risk, estimation risk, and uncertainty due to forecast error shows that model uncertainty plays a bigger role than parameter uncertainty. He finds that model uncertainty is proportional to the distance of the current predictor values from their sample means. To gauge the economic significance of accounting for model uncertainty, Avramov uses the difference in certainty equivalents as a metric and reaches an interesting result: the optimal allocation for a buy-and-hold investor is not sensitive to the investment horizon. This finding is contrary to the general findings of the Bayesian predictability literature. He ascribes the finding to the positive correlation between the unexpected returns and the innovations in the predictors with the highest posterior probability. The dividend yield, which is most often the only predictor in predictability investigations, has a lower posterior probability than the term premium and market premium predictors and, therefore, a smaller influence in the “grand” model (confirmed by Cremers' results).

40 Other empirical studies of return predictability include (Lamoureux and Zhou 1996; Neely and Weller 2000; Shanken and Tamayo 2001; Avramov and Chordia 2005).
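The mechanics of the model prior (24.29), the posterior model probabilities (24.30), and the model-weighted prediction implied by (24.32) can be sketched in a few lines. The function names, the log marginal likelihoods, and the per-model predictive means below are hypothetical placeholders; in an actual application the marginal likelihoods would come from (24.31) under the chosen parameter priors.

```python
import itertools
import numpy as np

def model_priors(num_predictors, rho):
    """Prior (24.29) over all 2^K inclusion patterns of K candidate predictors."""
    models = list(itertools.product([0, 1], repeat=num_predictors))
    sizes = np.array([sum(m) for m in models])            # k_j for each model
    priors = rho ** sizes * (1.0 - rho) ** (num_predictors - sizes)
    return models, priors

def posterior_model_probs(log_marginal_likelihoods, priors):
    """Posterior model probabilities (24.30), computed on the log scale."""
    log_w = np.asarray(log_marginal_likelihoods, dtype=float) + np.log(priors)
    w = np.exp(log_w - log_w.max())                       # guard against underflow
    return w / w.sum()

K, rho = 2, 0.3                                           # hypothetical values
models, priors = model_priors(K, rho)

# Hypothetical log marginal likelihoods log P(Phi_t | M_j), one per model,
# ordered as in `models`: (0,0), (0,1), (1,0), (1,1).
log_ml = np.array([-520.0, -518.5, -519.2, -518.9])
post = posterior_model_probs(log_ml, priors)

# Hypothetical predictive mean returns of the individual models. The mean of
# the mixture distribution is the posterior-weighted average of these means.
pred_means = np.array([0.010, 0.014, 0.008, 0.012])
bma_mean = float(post @ pred_means)

for m, p0, p1 in zip(models, priors, post):
    print(m, f"prior={p0:.3f}", f"posterior={p1:.3f}")
print(f"BMA predictive mean: {bma_mean:.4f}")
```

The first model, with no predictors included, corresponds to i.i.d. returns and receives prior probability $(1 - \rho)^K$, as noted in the text.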


24.6 Conclusion

The application of Bayesian methods to investment management is a vibrant and constantly evolving field. Space constraints did not allow us to review many worthy contributions.41 Active research is being conducted in the areas of volatility modeling, time series models, and regime-switching models. Recent examples of stochastic volatility investigations include (Jacquier, Polson, and Rossi 1994; Mahieu and Schotman 1998; Uhlig 1997); time series models are explored by (Aguilar and West 2000; Kleibergen and Van Dijk 1993; Henneke, Rachev, and Fabozzi 2006); regime switching has been discussed by (Hays and Upton 1986; So, Lam, and Li 1998), and employed by (Neely and Weller 2000). Bayesian methods provide the necessary toolset when heavy-tailed characteristics of stock returns are analyzed. (Buckle 1995; Tsionas 1999) model returns with a symmetric stable distribution, while (Fernandez and Steel 1998) develop and employ a skewed Student t parameterization. These investigations have been made possible by great advances in computational methods, such as Markov Chain Monte Carlo (see Bauwens, Lubrano, and Richard 1999). The individual investment management areas mentioned above, several of which were surveyed in the previous sections, will continue to evolve in future work. We see the main challenge as lying in their integration into coherent financial models. Without doubt, Bayesian methods are the indispensable framework for embracing and addressing the ensuing complexities.

Acknowledgement

Rachev gratefully acknowledges research support by grants from the Division of Mathematical, Life and Physical Sciences, College of Letters and Science, University of California, Santa Barbara, the Deutsche Forschungsgemeinschaft, and the Deutscher Akademischer Austauschdienst.

References

Aguilar O, West M (2000) Bayesian dynamic factor models and portfolio allocation. J of Business and Economic Statistics 18 (3): 338−357
Avramov D (2000) Stock return predictability and model uncertainty. J of Financial Economics 64: 423−458

41 For a more detailed discussion, see (Rachev, Hsu, Bagasheva, and Fabozzi 2007).


Avramov D, Chordia T (2005) Predicting stock returns, http://ssrn.com/abstract=352980
Barberis N (2000) Investing for the long run when returns are predictable. J of Finance 55 (1): 225−264
Barry C (1974) Portfolio analysis under uncertain means, variances, and covariances. J of Finance 29 (2): 515−522
Bauwens L, Lubrano M, Richard JF (1999) Bayesian inference in dynamic econometric models. Oxford University Press
Berger J (1980) Statistical decision theory. Springer, New York
Best MJ, Grauer RR (1991) Sensitivity analysis for mean-variance portfolio problems. Management Science 37 (8): 980−989
Black F, Litterman R (1990) Asset allocation: Combining investor views with market equilibrium. Goldman Sachs
Black F, Litterman R (1991) Global asset allocation with equities, bonds, and currencies. Fixed Income Research, Goldman Sachs
Black F, Litterman R (1992) Global portfolio optimization. Financial Analysts Journal Sept−Oct: 28−43
Brandt MW, Goyal A, Santa-Clara P, Stroud J (2004) A simulation approach to dynamic portfolio choice with an application to learning about return predictability. NBER Working Paper Series
Brown SJ (1976) Optimal portfolio choice under uncertainty: a Bayesian approach. Ph.D. thesis, University of Chicago
Brown SJ (1979) The effect of estimation risk on capital market equilibrium. J of Financial and Quantitative Analysis 14 (2): 215−220
Buckle DJ (1995) Bayesian inference for stable distributions. J of the American Statistical Association 90: 605−613
Chen S, Brown SJ (1983) Estimation risk and simple rules for optimal portfolio selection. 38 (4): 1087−1093
Connor G, Korajczyk R (1986) Performance measurement with the arbitrage pricing theory: A new framework for analysis. J of Financial Economics 15: 373−394
Cremers KJM (2002) Stock return predictability: A Bayesian model selection perspective. The Review of Financial Studies 15 (4): 1223−1249
Dangl T, Halling M, Randl O (2005) Equity return prediction: Are coefficients time-varying? 10th Symposium on Finance, Banking, and Insurance. University of Karlsruhe
Daniel K, Titman S (1997) Evidence on the characteristics of cross sectional variation in stock returns. J of Finance 52 (1): 1−33
Dumas B, Jacquillat B (1990) Performance of currency portfolios chosen by a Bayesian technique: 1967−1985. J of Banking and Finance 14: 539−558
Fama E (1965) The behavior of stock market prices. J of Business 38: 34−105
Fama E (1991) Efficient capital markets: II. J of Finance 46 (5): 1575−1617
Fama E, French K (1993) Common risk factors in the returns on stocks and bonds. J of Financial Economics 33: 3−56


Fernandez C, Steel M (1998) On Bayesian modeling of fat tails and skewness. J of the American Statistical Association 93: 359−371
Frankfurter GM, Phillips HE, Seagle JP (1971) Portfolio selection: the effects of uncertain means, variances, and covariances. J of Financial and Quantitative Analysis 6 (5): 1251−1262
Frost PA, Savarino JE (1986) An empirical Bayes approach to efficient portfolio selection. J of Financial and Quantitative Analysis 21 (3): 293−305
Geweke J, Zhou G (1996) Measuring the pricing error of the arbitrage pricing theory. The Review of Financial Studies 9 (2): 557−587
Grauer RR, Hakansson NH (1990) Stein and CAPM estimators of the means in asset allocation. International Review of Financial Analysis 4 (1): 35−66
Harvey C, Zhou G (1990) Bayesian inference in asset pricing tests. J of Financial Economics 26: 221−254
Hays PA, Upton DE (1986) A shifting regimes approach to the stationarity of the market model parameters of individual securities. J of Financial and Quantitative Analysis 21 (3): 307−321
He G, Litterman R (1999) The intuition behind Black-Litterman model portfolios. Investment Management Division, Goldman Sachs
Henneke J, Rachev S, Fabozzi F (2006) MCMC based estimation of MS-ARMA-GARCH models. Technical Reports, Department of Probability and Statistics, UCSB
Jacquier E, Polson NG, Rossi PE (1994) Bayesian analysis of stochastic volatility models. J of Business and Economic Statistics 12 (4): 371−389
James W, Stein C (1961) Estimation with quadratic loss. In: Proceedings of the 4th Berkeley Symposium on Probability and Statistics I. University of California Press, pp. 361−379
Jobson JD, Korkie B (1980) Estimation of Markowitz efficient portfolios. J of the American Statistical Association 75 (371): 544−554
Jobson JD, Korkie B (1981) Putting Markowitz theory to work. J of Portfolio Management 7 (4): 70−74
Jobson JD, Korkie B, Ratti V (1979) Improved estimation for Markowitz portfolios using James-Stein type estimators. In: Proceedings of the American Statistical Association, Business and Economics Section. American Statistical Association, Washington
Jorion P (1985) International portfolio diversification with estimation risk. J of Business 58 (3): 259−278
Jorion P (1986) Bayes-Stein estimation for portfolio analysis. J of Financial and Quantitative Analysis 21 (3): 279−292
Jorion P (1991) Bayesian and CAPM estimators of the means: Implications for portfolio selection. J of Banking and Finance 15: 717−727
Kandel S, McCulloch R, Stambaugh R (1995) Bayesian inference and portfolio efficiency. The Review of Financial Studies 8 (1): 1−53
Kandel S, Stambaugh R (1996) On the predictability of stock returns: An asset-allocation perspective. J of Finance 51 (2): 385−424


Kleibergen F, Van Dijk HK (1993) Non-stationarity in GARCH models: A Bayesian analysis. J of Applied Econometrics 8, Supplement: Special Issue on Econometric Inference Using Simulation Techniques: S41−S61
Klein R, Bawa V (1976) The effect of estimation risk on optimal portfolio choice. J of Financial Economics 3: 215−231
Kothari SP, Shanken J (1997) Book-to-market, dividend yield, and expected market returns: A time-series analysis. J of Financial Economics 44: 169−203
Lamoureux CG, Zhou G (1996) Temporary components of stock returns: What do the data tell us? The Review of Financial Studies 9 (4): 1033−1059
Ledoit O, Wolf M (2003) Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J of Empirical Finance 10: 603−621
Lee W (2000) Advanced theory and methodology of tactical asset allocation. John Wiley & Sons, New York
Lewellen J, Shanken J (2002) Learning, asset-pricing tests, and market efficiency. J of Finance 57 (3): 1113−1145
Litterman R, Winkelmann K (1998) Estimating covariance matrices. Risk Management Series, Goldman Sachs
Mahieu RJ, Schotman PC (1998) An empirical application of stochastic volatility models. J of Applied Econometrics 13 (4): 333−359
Markowitz H (1952) Portfolio selection. J of Finance 7 (1): 77−91
Markowitz H, Usmen N (1996) The likelihood of various stock market return distributions, Part I: Principles of inference. J of Risk and Uncertainty 13: 207−219
McCulloch R, Rossi P (1990) Posterior, predictive, and utility-based approaches to testing the arbitrage pricing theory. J of Financial Economics 28: 7−38
McCulloch R, Rossi P (1991) A Bayesian approach to testing the arbitrage pricing theory. J of Econometrics 49: 141−168
Merton R (1971) Optimum consumption and portfolio rules in a continuous-time model. J of Economic Theory 3: 373−413
Merton R (1973) An intertemporal capital asset pricing model. Econometrica 41: 867−887
Merton R (1980) An analytic derivation of the efficient portfolio frontier. J of Financial and Quantitative Analysis 7 (4): 1851−1872
Meucci A (2005) Risk and asset allocation. Springer, Berlin Heidelberg New York
Neely CJ, Weller P (2000) Predictability in international asset returns: A reexamination. J of Financial and Quantitative Analysis 35 (4): 601−620
Ortobelli S, Rachev S, Schwartz E (2004) The problem of optimal asset allocation with stable distributed returns. In: Krinik AC, Swift RJ (eds) Stochastic processes and functional analysis, Lecture notes in pure and applied mathematics, Marcel Dekker: 295−347
Pastor L (2000) Portfolio selection and asset pricing models. J of Finance 55 (1): 179−223
Pastor L, Stambaugh R (1999) Costs of equity capital and model mispricing. J of Finance 54 (1): 67−121


Pastor L, Stambaugh R (2000) Comparing asset pricing models: An investment perspective. J of Financial Economics 56: 335−381
Rachev S, Hsu J, Bagasheva B, Fabozzi F (2007) Bayesian methods in Finance. John Wiley & Sons, forthcoming
Satchell S, Scowcroft A (2000) A demystification of the Black-Litterman model: managing quantitative and traditional portfolio construction. J of Asset Management 1−2: 138−150
Shanken J (1987) A Bayesian approach to testing portfolio efficiency. J of Financial Economics 19: 195−215
Shanken J, Tamayo A (2001) Risk, mispricing, and asset allocation: Conditioning on the dividend yield. NBER Working Paper Series
Sharpe WF (1963) A simplified model for portfolio analysis. Management Science 9: 277−293
So MKP, Lam K, Li WK (1998) A stochastic volatility model with Markov switching. J of Business and Economic Statistics 16 (2): 244−253
Stambaugh R (1999) Predictive regressions. J of Financial Economics 54: 375−421
Tsionas E (1999) Monte Carlo inference in econometric models with symmetric stable disturbances. J of Econometrics 88: 365−401
Uhlig H (1997) Bayesian vector autoregressions with stochastic volatility. Econometrica 65 (1): 59−73
Wang Z (1998) Efficiency loss and constraints on portfolio holdings. J of Financial Economics 48: 359−375
Williams JT (1977) Capital asset prices with heterogeneous beliefs. J of Financial Economics 5: 219−241
Winkler RL, Barry BB (1975) A Bayesian model for portfolio selection and revision. J of Finance 30 (1): 179−192
Zellner A, Chetty VK (1965) Prediction and decision problems in regression models from the Bayesian point of view. J of the American Statistical Association 60: 608−616

CHAPTER 25
Post-modern Approaches for Portfolio Optimization
Borjana Racheva-Iotova, Stoyan Stoyanov

25.1 Introduction

The search for alpha is the major challenge for both traditional equity managers and hedge funds. Assets under management in the hedge fund industry alone rose from about $500 billion to over $1 trillion during 2001−2005 and are expected to grow to $2.5 trillion over the next 4 years; the number of hedge funds is approaching ten thousand. This highly competitive environment makes it challenging to discover scalable strategies that can effectively attract and utilize new investments. Besides, the market itself calls for increased sophistication: financial returns demonstrate pronounced asymmetry, heavy tails, and other phenomena. It is clear that in this new environment the traditional portfolio optimization methods based on the minimization of the standard deviation or tracking error can no longer serve as a viable portfolio construction mechanism, and portfolio managers are forced to move beyond what is now known as modern portfolio management and revise some key concepts in their practices. The research into improved portfolio construction approaches has four directions:

a. constructing better alpha models
b. searching for an alternative risk measure which possesses better properties than the classical standard deviation
c. building realistic models for asset returns
d. combining the three above into a practical optimization framework

In the current chapter we skip item a), considering it a special domain of knowledge of portfolio managers and often the most subjective part of the portfolio construction process, and concentrate on the other three. The motivation behind item b) relates to the well-known fact that the traditional standard deviation used as a proxy for risk has numerous disadvantages. One is that it penalizes profit and loss symmetrically, while a true risk measure should be asymmetric, as risk is associated with potential future loss only.


Risk is associated with future, uncertain losses. Therefore, a realistic model for return distributions is of paramount importance. Today, many studies explore the non-normality of asset returns and suggest alternative approaches. Among the well-known candidates are t-distributions, generalized hyperbolic distributions, see (Bibby and Sorensen 2003), and stable Paretian distributions, see (Rachev and Mittnik 2000). At least some of their forms are subordinated Normal models and thus provide a very practical and tractable framework. A very good introduction to heavy-tailed models in finance is the recent book (Rachev et al. 2005). From the standpoint of portfolio managers, a), b) and c) are parts of one big problem: the construction of the strategy. Therefore, while the three groups of problems are important on their own, portfolio managers face the involved issue of finding a practical framework in which to combine a), b) and c). One viable approach is to discretize the portfolio return distribution by sampling scenarios from it and to employ a suitable optimization technique, such as linear programming or convex programming, the choice of which depends partly on the choice of a risk measure. We find the conditional value at risk (CVaR) minimization of particular interest. Our chapter discusses the re-working of modern portfolio optimization approaches into a framework that allows for increased flexibility, accurate asset modeling, and sound risk measurement. It shows that employing Generalized Stable Distributions together with CVaR minimization can serve as a flexible framework that improves the performance of equity-based strategies across the full range of the long-short ratio and a variety of other investment constraints, see (Rachev et al. 2007). The chapter is organized as follows. Section 25.2 provides a summary of the properties of risk and performance measures and brings to attention CVaR measures and related performance measures such as STARR and R-Ratio.
Section 25.3 discusses several heavy-tailed models with special attention to the Generalized Stable Distributions. We review various optimization models that can be used for a variety of equity-based strategies in Section 25.4. In Section 25.5 we provide some empirical optimization studies using the strategies discussed in Section 25.4 and the Russell 2000 universe.

25.2 Risk and Performance Measures Overview

In a recent survey (Holton 2004), risk is argued to be a subjective phenomenon involving exposure and uncertainty. We do not consider risk in a broad sense, but only in the context of investment management. In our narrow context, exposure is identified with monetary loss, and thus investment risk is related to the uncertain monetary loss that a manager may be exposed to. Subjectivity appears because two different managers may assess one and the same investment as differently risky; it is a question of personal predisposition.


This definition is illuminating but is insufficient in order for risk to be quantified. Still, in line with the reasoning in (Holton 2004), it is easy to see that both uncertainty and monetary loss are essential for the definition. For example, if an asset will surely lose 30% of its value tomorrow, then it is not risky even though money will be lost. Uncertainty alone is not synonymous with risk either. If the price of an asset will certainly increase between 5% and 10% tomorrow then there is uncertainty but no risk as there is no monetary loss. Sometimes risk is qualified as asymmetric in the sense that it is related to loss only. Concerning uncertainty, it is our assumption that there is intrinsic uncertainty pertaining to the future values of the traded assets on the market. If we consider two time instants, present and future, then the inherent uncertainty materializes as a probability distribution of future prices or returns; that is, these are random variables as of the present instant. The managers do not know the probabilistic law exactly but can infer it, to a degree, from the available data – they approximate the unknown law by assuming a parametric model and fitting its parameters. Uncertainty relates to the probable deviations from the expected price or return where these probable deviations are described by the unknown law. Therefore, a measure of uncertainty should be capable of quantifying the probable positive and negative deviations. In this aspect, any uncertainty measure is symmetric. As an extreme case, consider a variable characterized by none uncertainty whatsoever. It follows that this variable is non-random, it is a constant and we know its future value. A classical example of an uncertainty measure is the variance. It is the aver2 age squared deviation from the mean of a distribution, σ 2 = E ( X − EX ) , where X denotes the random variable, e. g. the future price or return, and E is the mathematical expectation. 
Another measure is the standard deviation, σ, which is the square root of the variance. It is easier to interpret as it is measured in the same units as the random variable X. For instance, if X stands for a future price, then σ is measured in dollars; if X stands for a return, σ can be measured in percentage points. The standard deviation is not the only way of measuring uncertainty. There are cases in which it is inappropriate: there are distributions for which the standard deviation is infinite. Other examples of uncertainty measures include the average absolute deviation from the mean, $E|X - EX|$, the interquartile range, etc.

Besides the essential features of risk discussed above, there are others. Investment risk may be relative, and not merely in the sense that we want to compare and rank strategies according to their riskiness. In benchmark tracking problems, it is relevant to demand a smaller risk of the performance of the strategy relative to the benchmark, i.e. a smaller risk of the excess return. If there are multiple benchmarks, then there are multiple relative risks to take into account and strategy construction becomes a multi-dimensional, or multi-criterion, problem.

Apparently, a true operational definition of investment risk is out of reach. Nevertheless, aspects of it allow for quantification. From a practical viewpoint, portfolio managers are interested, among other things, in minimizing portfolio risk, and therefore it is necessary to find a way to measure it. Efforts to quantify risk have given rise to the notion of a risk measure. A risk measure is defined as a functional which


Borjana Racheva-Iotova, Stoyan Stoyanov

assigns a numerical value to a loss distribution. Any risk measure should satisfy certain properties which represent some of the aspects of risk that we talked about.
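The uncertainty measures mentioned above (variance, standard deviation, average absolute deviation, interquartile range) can be computed from a sample as in the following minimal sketch. The function name `uncertainty_measures` and the simulated returns are illustrative assumptions. The check at the end illustrates the symmetry remark: swapping gains and losses, or adding a sure profit, leaves every one of these measures unchanged.

```python
import numpy as np

def uncertainty_measures(x):
    """Return a dict of dispersion measures for a sample x."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    q25, q75 = np.percentile(x, [25, 75])
    variance = np.mean((x - mean) ** 2)         # E(X - EX)^2
    return {
        "variance": variance,
        "std": np.sqrt(variance),               # sigma, same units as X
        "mad": np.mean(np.abs(x - mean)),       # E|X - EX|
        "iqr": q75 - q25,                       # interquartile range
    }

rng = np.random.default_rng(0)
returns = 0.01 * rng.standard_t(df=5, size=10_000)   # hypothetical daily returns

m_pos = uncertainty_measures(returns)
m_neg = uncertainty_measures(-returns)           # gains and losses swapped
m_shift = uncertainty_measures(returns + 0.05)   # a sure profit added

for name in m_pos:
    print(name, m_pos[name], m_neg[name], m_shift[name])
```

Because none of the four numbers reacts to the sign flip or to the shift, none of them can distinguish a sure loss from a sure gain, which is why a separate notion of risk measure is needed.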

25.2.1 The Value-at-Risk Measure

An example of a risk measure which has been widely accepted since the 1990s is the Value-at-Risk (VaR). It is defined as the minimum level of loss at a given, sufficiently high, confidence level 1 – ε. The obvious interpretation is that losses larger than the VaR happen with probability ε. The probability ε is sometimes called the tail probability. For example, if the accepted confidence level is 95%, then losses larger than the corresponding VaR level happen with 5% probability. The formal definition of the VaR is the following.

Definition 25.1. The VaR at confidence level 1 – ε is defined as the negative of the lower ε-quantile of the gain distribution, where ε is in (0, 1):

$$\mathrm{VaR}_\varepsilon(X) = -\inf\{x \mid P(X \le x) \ge \varepsilon\}$$

Here is one aspect in which the VaR differs from the deviation measures (and from all uncertainty measures). As a consequence of the definition, if we add a certain profit C to the random variable X, the VaR of the result can be expressed through the VaR of the initial variable as VaRε(X + C) = VaRε(X) – C. It is also trivial to show that scaling the profit-loss distribution by a positive constant λ scales the VaR by the same constant: VaRε(λX) = λVaRε(X).

As far as the practical side is concerned, computing the VaR of a fitted distribution is an easy task since, by definition, the VaR is a quantile with its sign changed. Quantiles are readily calculated in all statistical packages.

In the last decade, it turned out that the VaR has important deficiencies and, while it provides an intuitive characteristic of portfolio loss, it should generally be abandoned as a risk measure. The most important drawback is that, in some cases, the reasonable diversification effect that every sensible manager expects from a risk measure is not present; that is, the VaR of a portfolio may be greater than the sum of the VaRs of its constituents. This is completely irrational. Another drawback is that the VaR is not very informative. It provides information about a loss threshold: we lose more than the threshold with probability ε, but we do not have a clue as to how large this loss may be, provided that it exceeds the VaR level.
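The definition and the missing diversification effect can be illustrated with a small discrete example. This is a sketch under stated assumptions: the helper `var_discrete` and the two-bond setup (each bond loses 100 with probability 4%, independently of the other) are hypothetical and chosen only to make the effect visible at ε = 5%.

```python
import numpy as np

def var_discrete(outcomes, probs, eps):
    """VaR_eps(X) = -inf{x : P(X <= x) >= eps} for a discrete gain distribution."""
    order = np.argsort(outcomes)
    x = np.asarray(outcomes, dtype=float)[order]
    cdf = np.cumsum(np.asarray(probs, dtype=float)[order])
    # first outcome whose cumulative probability reaches eps
    # (the small tolerance only guards against rounding in the cumulative sum)
    idx = int(np.searchsorted(cdf, eps - 1e-12))
    return -x[idx]

eps = 0.05
# Hypothetical bond: loses 100 with probability 4%, otherwise breaks even.
bond_x, bond_p = [-100.0, 0.0], [0.04, 0.96]
# Sum of two independent copies of the bond.
pair_x = [-200.0, -100.0, 0.0]
pair_p = [0.04 ** 2, 2 * 0.04 * 0.96, 0.96 ** 2]

var_single = var_discrete(bond_x, bond_p, eps)   # 0: a default is rarer than eps
var_pair = var_discrete(pair_x, pair_p, eps)     # 100: at least one default has probability 7.84%

# Translation and scaling properties, checked on the two-bond position.
var_shifted = var_discrete([x + 5.0 for x in pair_x], pair_p, eps)   # VaR(X + C) = VaR(X) - C
var_scaled = var_discrete([2.0 * x for x in pair_x], pair_p, eps)    # VaR(2X) = 2 VaR(X)

print(var_single, var_pair, var_shifted, var_scaled)
```

Each bond on its own has a VaR of zero at this level, yet the position holding both has a VaR of 100, so the VaR of the sum exceeds the sum of the VaRs.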

25.2.2 Coherent Risk Measures

A systematic approach towards construction of risk measures was undertaken in (Artzner et al. 1998). A set of four axioms is proposed and all functionals satisfying these axioms are called coherent risk measures. The definition is as follows.


Definition 25.2. A coherent risk measure is any real-valued functional ρ defined on the space of real-valued random variables satisfying:

a. ρ(Y) ≤ ρ(X), if Y ≥ X
b. ρ(0) = 0, ρ(λX) = λρ(X), for all X and all λ > 0
c. ρ(X + Y) ≤ ρ(X) + ρ(Y), for all X and Y
d. ρ(X + C) = ρ(X) – C, for all X and C

The random variables in the definition stand for random loss. It may be measured in dollars, if X is the future value, or in percentage of the present value, if X stands for future returns. The interpretation of the four axioms also depends on that. Originally, the axioms were introduced for X denoting future value. In this paper, we show how the interpretation changes if X denotes future returns.

(i). Property a) states that if investment A has returns which are not less than the returns of investment B in all states of the world at a given horizon, then the risk of A is not higher than the risk of B. Note that this result is irrespective of the values of A and B today.

(ii). Properties b) and c) imply that the functional is convex, meaning that ρ(λX + (1 – λ)Y) ≤ λρ(X) + (1 – λ)ρ(Y) for λ in [0, 1]. The random quantity λX + (1 – λ)Y stands for the return of a portfolio composed of the two assets X and Y with weights λ and (1 – λ) respectively, and therefore the convexity property states that the risk of a portfolio is not greater than the weighted sum of the risks of its constituents. Therefore the convexity property is behind the diversification effect that we expect.

(iii). Suppose that today the value of our portfolio is PV0 > 0 and we add a certain amount of cash C. Today the value of our portfolio becomes PV0 + C. We will assume that the value tomorrow is stochastic and equals PV1 + C, in which PV1 is a random variable. The return of our portfolio equals

$$X = \frac{PV_1 + C - PV_0 - C}{PV_0 + C} = \frac{PV_1 - PV_0}{PV_0}\left(\frac{PV_0}{PV_0 + C}\right) = h\,\frac{PV_1 - PV_0}{PV_0} = hY$$

where h = PV0/(PV0 + C) is obviously a positive constant and Y is the return of the portfolio without the cash. The positive homogeneity in property b), ρ(λX) = λρ(X), alone implies that ρ(hY) = hρ(Y), i.e. the risk of the new portfolio will be the risk of the portfolio without the cash but scaled by h.

(iv). Suppose that the return of a stock is the random variable X and we build a portfolio by adding a government bond yielding a risk-free rate rB. The return of the portfolio equals wX + (1 – w)rB. The quantity (1 – w)rB is non-random. Property d) states that the risk of the portfolio can be decomposed as ρ(wX + (1 – w)rB) = ρ(wX) – (1 – w)rB, and using the positive homogeneity in b) we get ρ(wX + (1 – w)rB) = wρ(X) – (1 – w)rB. In effect, the risk measure admits the following interpretation: assume that the constructed portfolio is equally weighted, i.e. w = ½; then the risk measure equals the level of the risk-free rate such that the risk of the equally weighted portfolio of the risky and the risk-free asset is zero.

Alternative interpretations are also possible. Assume that we can find a government bond earning return RB at the horizon of interest. Then we can ask the question in the opposite direction, i.e. how much should we invest in the government bond in order to hedge the risk ρ(X)? The calculations are easy to do: the answer is that the additional investment should equal ρ(X)/RB times the initial investment, PV0, in X. In this way we can choose the bond to invest in and, obviously, the larger RB, the smaller the reserve investment in the bond. Since at the horizon the money gained from the bond is ρ(X)PV0, and it is this money which we will use against adverse movements in prices, instead of constructing a bond portfolio we can place in reserve today an amount of cash equal to ρ(X)PV0.

(v). An additional consequence of either property a) or property d) is that if h ≥ 0, then ρ(X + h) ≤ ρ(X). Therefore increasing the mean by a sure amount results in a decrease of the risk as measured by ρ. This is a fundamental property of all coherent risk measures.

In the subsection devoted to the VaR, we noted that a disadvantage of the VaR is that it does not give information about the severity of the losses which exceed the VaR. We know that, if the distributional assumption is correct, losses will exceed the VaR on average in 100ε% of the cases, but the VaR measure does not show how much larger than the VaR the losses will be.
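The reserve calculation in interpretation (iv) can be written out as a short worked equation. The symbol B for the amount placed in the bond and the numbers in the example are illustrative assumptions, not values taken from the text.

```latex
% B: amount invested in the bond today (hypothetical symbol)
B \cdot R_B = \rho(X)\, PV_0
\quad\Longrightarrow\quad
B = \frac{\rho(X)}{R_B}\, PV_0 .
% Example with assumed values \rho(X) = 0.10, R_B = 0.05, PV_0 = 1{,}000{,}000:
B = \frac{0.10}{0.05} \cdot 1{,}000{,}000 = 2{,}000{,}000,
\qquad
B \cdot R_B = 100{,}000 = \rho(X)\, PV_0 .
```

With these assumed numbers, the bond position is twice the size of the risky position, and its gain at the horizon equals the cash reserve ρ(X)PV0 mentioned in the text.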

25.2.3 The Conditional Value-at-Risk

The Conditional Value-at-Risk (CVaR) is a risk measure which does not have the disadvantages of the VaR. Firstly, it satisfies all axioms a) through d) and therefore is coherent and, secondly, it is much more informative. The definition for a general loss distribution follows below.

Definition 25.3. The Conditional Value-at-Risk at 1 – ε confidence level is defined as

$$\mathrm{CVaR}_\varepsilon(X) = \frac{1}{\varepsilon}\int_0^\varepsilon \mathrm{VaR}_p(X)\,dp = E\left(-X \mid -X > \mathrm{VaR}_\varepsilon(X)\right) - \frac{F\left(F^{-1}(\varepsilon)\right) - \varepsilon}{\varepsilon}\,F^{-1}(\varepsilon)$$

where $F^{-1}(p) = \inf\{x \mid F(x) \ge p\}$ is the generalized inverse of the cumulative distribution function (c.d.f.) of X, F(x) = P(X ≤ x). Apparently, CVaRε(X) can be interpreted as the average of all VaRs larger than VaRε(X), and therefore the CVaR captures the information in the entire tail of the loss distribution beyond the corresponding VaRε(X). The second equality has two terms. The second term is present because the loss distribution may happen to have a point mass at VaRε(X) of $F(F^{-1}(\varepsilon)) - \varepsilon$; that is, the loss distribution may have a jump at VaRε(X) and the size of the jump equals the probability of VaRε(X). If the c.d.f. is continuous, or if there is no jump at the selected confidence level, then the CVaR reduces to

$$\mathrm{CVaR}_\varepsilon(X) = E\left(-X \mid -X > \mathrm{VaR}_\varepsilon(X)\right)$$

which is also known as the expected tail loss (ETL).
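For a continuous distribution, the expected tail loss can be estimated from a sample as the average of the worst scenarios. The sketch below is illustrative only: the function names are hypothetical, and the normal closed form σφ(z)/ε − μ, with z the standard normal ε-quantile, is a standard result used here as a sanity check rather than something stated in the text.

```python
import numpy as np
from statistics import NormalDist

def sample_var_cvar(x, eps):
    """Empirical VaR and CVaR (expected tail loss) of a sample of gains x."""
    losses = np.sort(-np.asarray(x, dtype=float))[::-1]   # largest loss first
    m = max(int(np.floor(len(losses) * eps)), 1)          # number of tail scenarios
    return losses[m - 1], losses[:m].mean()

def normal_cvar(mu, sigma, eps):
    """Closed-form CVaR of a normal gain distribution N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(eps)
    return sigma * NormalDist().pdf(z) / eps - mu

rng = np.random.default_rng(1)
mu, sigma, eps = 0.001, 0.02, 0.05            # assumed parameters
x = rng.normal(mu, sigma, size=200_000)

var_hat, cvar_hat = sample_var_cvar(x, eps)
cvar_exact = normal_cvar(mu, sigma, eps)
print(var_hat, cvar_hat, cvar_exact)
```

The sample CVaR is never smaller than the sample VaR, since it averages losses at or beyond the VaR level.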


From a practical point of view, for some loss distributions, such as the normal and Student's t, it is very easy to calculate the CVaR. For others, such as the stable laws, the CVaR calculation becomes a very involved problem, see (Stoyanov et al. 2006). There is a general approach to this problem based on the Monte Carlo method. Let X1, …, XN be independent scenarios from the distribution of X. The CVaRε(X) can be approximated by

$$\mathrm{CVaR}^{appr}_\varepsilon(X) = \min_{\theta}\left[\theta + \frac{1}{N\varepsilon}\sum_{k=1}^{N}\max\left(-X_k - \theta,\, 0\right)\right],$$

for further details see [16]. This approach is appealing from a practical viewpoint as a general framework can be built on it. Furthermore, the arising optimization problem can be reformulated as a linear program by introducing a number of auxiliary variables,

$$
\begin{aligned}
\mathrm{CVaR}^{appr}_\varepsilon(X) = \min_{\theta,\, d}\quad & \theta + \frac{1}{N\varepsilon}\sum_{k=1}^{N} d_k \\
\text{s.t.}\quad & -X_k - \theta \le d_k, \quad k = 1, \dots, N \\
& d_k \ge 0, \quad k = 1, \dots, N
\end{aligned}
\tag{25.1}
$$

for which there are efficient solvers.
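The minimization over θ can be checked numerically without an LP solver. The sketch below assumes hypothetical Student t scenarios and relies on the fact that the objective is convex and piecewise linear in θ with kinks at the scenario losses, so evaluating it at those points is enough. When Nε is an integer, the minimum coincides with the average of the Nε largest losses.

```python
import numpy as np

def cvar_by_minimization(x, eps):
    """min over theta of theta + 1/(N eps) * sum_k max(-x_k - theta, 0).

    The objective is convex and piecewise linear in theta with kinks at the
    scenario losses, so it is evaluated only at theta = -x_k.
    """
    losses = -np.asarray(x, dtype=float)
    n = len(losses)
    # row j holds max(loss_k - theta_j, 0) for the candidate theta_j = losses[j]
    excess = np.maximum(losses[None, :] - losses[:, None], 0.0)
    objective = losses + excess.sum(axis=1) / (n * eps)
    j = int(np.argmin(objective))
    return objective[j], losses[j]        # (CVaR estimate, a minimizing theta)

rng = np.random.default_rng(2)
x = 0.01 * rng.standard_t(df=4, size=1_000)   # hypothetical return scenarios
eps = 0.05
cvar_min, theta_star = cvar_by_minimization(x, eps)

# Direct tail average: mean of the N * eps = 50 largest losses.
tail_avg = np.sort(-x)[::-1][:50].mean()
print(cvar_min, tail_avg, theta_star)
```

The minimizing θ plays the role of the VaR level, which is why it never exceeds the CVaR estimate.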

25.2.4 Performance Measures

The solution of the optimal portfolio problem is a portfolio that minimizes a given risk measure provided that the expected return is constrained by some minimal value R:

$$
\begin{aligned}
\min_{w}\quad & \rho\left(w^T r - r_b\right) \\
\text{s.t.}\quad & w^T Er - Er_b \ge R \\
& l \le Aw \le u
\end{aligned}
\tag{25.2}
$$

where the vector notation $w^T r$ stands for the returns of a portfolio with composition w = (w1, w2, …, wn), l is a vector of lower bounds, A is a matrix, u is a vector of upper bounds, and $r_b$ is some benchmark (which could be set equal to zero). The set defined by the double linear inequalities in matrix notation, l ≤ Aw ≤ u, includes all feasible portfolios. If the benchmark is zero, $r_b = 0$, and instead of the risk measure ρ we use the standard deviation, which is an uncertainty measure, then the optimization problem transforms into the classical Markowitz problem. Optimal portfolio problems with a benchmark are called active. The benchmark could be non-stochastic or stochastic, for example the return of another portfolio or a market index. In case $r_b$ is nonzero and we use the standard deviation instead of ρ, the problem transforms into the classical tracking error problem.

The set of all solutions of (25.2), when varying the value of the constraint, is called the efficient frontier. Along the efficient frontier, there is a portfolio that provides the maximum expected return per unit of risk; that is, this portfolio is a solution to the optimal ratio problem

$$
\begin{aligned}
\max_{w}\quad & \frac{w^T Er - Er_b}{\rho\left(w^T r - r_b\right)} \\
\text{s.t.}\quad & l \le Aw \le u
\end{aligned}
\tag{25.3}
$$

An example of a reward-risk ratio is the celebrated Sharpe ratio, or the information ratio, depending on whether the benchmark is stochastic. In both cases, instead of the risk measure ρ the standard deviation is used. Besides the Sharpe ratio, or the information ratio, many more examples can be obtained by changing the risk and, possibly, the reward functional, see (Biglova et al. 2004) for an empirical study:

• the STARR ratio: $E(w^T r - r_b)/\mathrm{CVaR}_\alpha(w^T r - r_b)$
• the Stable ratio: $E(w^T r - r_b)/\sigma_{r_p}$, where $\sigma_{r_p}$ is the portfolio dispersion. Here it is assumed that the vector r follows a multivariate sub-Gaussian stable distribution and thus $\sigma_{r_p} = (w^T Q w)^{1/2}$, where Q is the dispersion matrix, see (Rachev and Mittnik 2000).
• the Farinelli-Tibiletti ratio: $\left(E(\max(w^T r - t_1, 0)^\gamma)\right)^{1/\gamma} / \left(E(\max(t_2 - w^T r, 0)^\delta)\right)^{1/\delta}$, where $t_1$ and $t_2$ are some thresholds.
• the Sortino-Satchell ratio: $E(w^T r - r_b) / \left(E(\max(t - w^T r, 0)^\gamma)\right)^{1/\gamma}$, γ ≥ 1
• the Rachev ratio (R-ratio): $\mathrm{CVaR}_\alpha(r_b - w^T r)/\mathrm{CVaR}_\beta(w^T r - r_b)$
• the Generalized Rachev ratio (GR-ratio): $\mathrm{CVaR}_{(\gamma,\alpha)}(r_b - w^T r)/\mathrm{CVaR}_{(\delta,\beta)}(w^T r - r_b)$, where $\mathrm{CVaR}_{(\gamma,\alpha)}(X) = \left(E\left((\max(-X, 0))^\gamma \mid -X > \mathrm{VaR}_\alpha(X)\right)\right)^{\gamma^*}$ and $\gamma^* = \min(1, 1/\gamma)$.
• the VaR ratio: $\mathrm{VaR}_\alpha(r_b - w^T r)/\mathrm{VaR}_\beta(w^T r - r_b)$

In the R- and GR-ratio, the reward functional is non-linear. The R-ratio can be interpreted as the ratio between the average (active) profit exceeding a certain threshold and the average (active) loss below a certain level.

Problem (25.3) can be transformed into a simpler problem on condition that the risk measure is strictly positive for all feasible portfolios:

$$
\begin{aligned}
\min_{x,\, t}\quad & \rho\left(x^T r - t r_b\right) \\
\text{s.t.}\quad & x^T Er - tEr_b = 1 \\
& tl \le Ax \le tu
\end{aligned}
\tag{25.4}
$$


where t is an additional variable. If (xo, to) is a solution to (25.4), then wo = xo/to is a solution to problem (25.3). There are other connections between problems (25.3) and (25.4). Let ρo be the value of the objective at the optimal point (xo, to) in problem (25.4). Then 1/ρo is the value of the optimal ratio, i.e. the optimal value of the objective of problem (25.3). Moreover, 1/to is the expected return of the optimal portfolio and ρo/to is the risk of the optimal portfolio. For more details, see (Stoyanov et al. 2007).

For the particular choice of the CVaR as a risk measure, problems (25.2) and (25.4) can be solved by linearizing the risk measure as is done in (25.1). We provide the linearization of problem (25.2); the same approach works for (25.4):

$$
\begin{aligned}
\min_{w,\, \theta,\, d}\quad & \theta + \frac{1}{N\varepsilon}\sum_{k=1}^{N} d_k \\
\text{s.t.}\quad & w^T Er - Er_b \ge R \\
& -w^T r^k + r_b^k - \theta \le d_k, \quad k = 1, \dots, N \\
& d_k \ge 0, \quad k = 1, \dots, N \\
& l \le Aw \le u
\end{aligned}
\tag{25.5}
$$

Here $r^k$ and $r_b^k$ denote the k-th scenario of the asset returns and of the benchmark return, respectively.
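A linear program of the form (25.5) can be passed to a general-purpose LP solver. The following is a minimal sketch, not the authors' implementation: it assumes SciPy's `linprog`, takes the linear constraints l ≤ Aw ≤ u to be the budget constraint plus long-only box bounds, and uses simulated scenarios whose sample means are fixed by construction. The function name `min_cvar_portfolio` and all numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def min_cvar_portfolio(scen, eps, min_excess, bench=None):
    """Scenario LP in the spirit of (25.5), with w'e = 1 and 0 <= w <= 1 (assumed)."""
    scen = np.asarray(scen, dtype=float)
    N, n = scen.shape
    bench = np.zeros(N) if bench is None else np.asarray(bench, dtype=float)
    # Decision vector z = [w_1..w_n, theta, d_1..d_N].
    cost = np.concatenate([np.zeros(n), [1.0], np.full(N, 1.0 / (N * eps))])
    # -w'r^k + rb^k - theta - d_k <= 0 for every scenario k
    A_tail = np.hstack([-scen, -np.ones((N, 1)), -np.eye(N)])
    b_tail = -bench
    # w'Er - Erb >= R  is rewritten as  -w'Er <= -(R + Erb)
    A_ret = np.concatenate([-scen.mean(axis=0), np.zeros(N + 1)])[None, :]
    b_ret = np.array([-(min_excess + bench.mean())])
    A_eq = np.concatenate([np.ones(n), np.zeros(N + 1)])[None, :]
    bounds = [(0.0, 1.0)] * n + [(None, None)] + [(0.0, None)] * N
    res = linprog(cost, A_ub=np.vstack([A_tail, A_ret]),
                  b_ub=np.concatenate([b_tail, b_ret]),
                  A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.fun

rng = np.random.default_rng(3)
N, eps = 400, 0.05
mu = np.array([0.0004, 0.0008, 0.0012])        # hypothetical expected returns
noise = np.column_stack([
    0.008 * rng.standard_normal(N),
    0.012 * rng.standard_t(df=4, size=N),
    0.020 * rng.standard_t(df=3, size=N),
])
scen = mu + (noise - noise.mean(axis=0))        # sample means equal mu up to rounding
R = 0.0007                                      # attainable, e.g. by equal weights
w_opt, cvar_opt = min_cvar_portfolio(scen, eps, R)
print(w_opt, cvar_opt)
```

The optimal objective value is the scenario CVaR of the optimal portfolio, so it can be cross-checked against the average of the Nε worst portfolio losses.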

It is possible to reduce optimal ratio problems to simpler ones under fairly general conditions when the reward functional is not the expected return but a more general one. For more details on the general approach see (Stoyanov et al. 2007), where the ratios are categorized into groups. Unfortunately, the classification is incomplete in the sense that there are ratios which do not belong to any of the groups, such as the R-ratio, the GR-ratio and the Farinelli-Tibiletti ratio. For these ratios, a different approach should be undertaken.
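For a ratio that does fit the reduction, such as the STARR ratio with a zero benchmark, problem (25.4) combined with the scenario linearization becomes a single linear program. The sketch below is an illustration under several assumptions: SciPy's `linprog` is used, the feasible set is long-only with weights summing to one (so that tl ≤ Ax ≤ tu becomes x ≥ 0 and x'e = t), the scenario CVaR is positive for every feasible portfolio, and at least one asset has a positive mean. All names and numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def tail_average(x, eps):
    """Scenario CVaR: mean of the N * eps largest losses."""
    losses = np.sort(-np.asarray(x, dtype=float))[::-1]
    return losses[:max(int(round(len(losses) * eps)), 1)].mean()

def max_starr_portfolio(scen, eps):
    """Maximize mean / CVaR via the reduced problem (25.4), benchmark zero."""
    scen = np.asarray(scen, dtype=float)
    N, n = scen.shape
    mean = scen.mean(axis=0)
    # Decision vector z = [x_1..x_n, t, theta, d_1..d_N].
    cost = np.concatenate([np.zeros(n + 1), [1.0], np.full(N, 1.0 / (N * eps))])
    # -x'r^k - theta - d_k <= 0 for every scenario k
    A_ub = np.hstack([-scen, np.zeros((N, 1)), -np.ones((N, 1)), -np.eye(N)])
    A_eq = np.vstack([
        np.concatenate([mean, np.zeros(N + 2)]),                 # x'Er = 1
        np.concatenate([np.ones(n), [-1.0], np.zeros(N + 1)]),   # x'e = t
    ])
    bounds = [(0.0, None)] * (n + 1) + [(None, None)] + [(0.0, None)] * N
    res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(N),
                  A_eq=A_eq, b_eq=[1.0, 0.0], bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    x, t = res.x[:n], res.x[n]
    return x / t, 1.0 / res.fun          # weights w = x / t, optimal ratio 1 / rho

rng = np.random.default_rng(8)
N, eps = 400, 0.05
mu = np.array([0.0006, 0.0010, 0.0014])         # hypothetical expected returns
noise = 0.01 * rng.standard_t(df=4, size=(N, 3)) * np.array([0.8, 1.1, 1.6])
scen = mu + (noise - noise.mean(axis=0))         # sample means equal mu up to rounding
w_star, starr_opt = max_starr_portfolio(scen, eps)
print(w_star, starr_opt)
```

As stated in the text, the reciprocal of the optimal objective is the optimal ratio, which can be confirmed by recomputing mean over CVaR for the recovered weights.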

25.3 Heavy-tailed and Asymmetric Models for Assets Returns

Properly specifying the distribution of asset returns is vital for risk management. A failure to do so may lead to significant underestimation of portfolio risk and, consequently, to wrong decisions. Consider the VaR defined in the previous section as a very simple illustration. Clearly, since it reduces to quantile estimation, the VaR figure is heavily influenced by the distributional hypothesis. More precisely, it is very sensitive to the modeling of the tail, as it is the tail that accounts for the extreme losses. We mentioned that while the VaR is widely used, it has many deficiencies as a risk measure. However, the above conclusion holds for the CVaR as well, and the CVaR, being an average of VaRs in the entire tail, is even more sensitive to the modeling of the tail.


In the optimal portfolio problem, this issue becomes even more pronounced because the distributional assumption is an input to the optimization problem. A misspecification, or bad parameter estimation, may result in significant deviations of the solution from what would be optimal on the market, because we optimize on incorrect input.

The distributional modeling of stock or fund returns has several dimensions. Firstly, we should have realistic models for the returns of each stock considered separately. That is, we should employ realistic one-dimensional models. Secondly, the model should properly capture the dependence between the one-dimensional variables. In summary, we need a true multivariate model with the above two building blocks correctly specified.
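The sensitivity of tail risk measures to the distributional hypothesis can be illustrated by comparing two models with the same variance. This sketch is an assumption-laden illustration: it uses a Student t distribution with 3 degrees of freedom, rescaled to unit variance, as a stand-in for a heavy-tailed model, and estimates the VaR and CVaR at the 1% tail probability from simulated samples.

```python
import numpy as np

def var_cvar(x, eps):
    """Empirical VaR and CVaR of a sample of gains x."""
    losses = np.sort(-np.asarray(x, dtype=float))[::-1]
    m = max(int(round(len(losses) * eps)), 1)
    return losses[m - 1], losses[:m].mean()

rng = np.random.default_rng(4)
n, eps, df = 500_000, 0.01, 3
normal = rng.standard_normal(n)
# Student t rescaled to unit variance (the variance of t_df is df / (df - 2) for df > 2).
heavy = rng.standard_t(df, size=n) * np.sqrt((df - 2) / df)

var_n, cvar_n = var_cvar(normal, eps)
var_t, cvar_t = var_cvar(heavy, eps)
print("normal:", var_n, cvar_n)
print("heavy :", var_t, cvar_t)
```

Both samples have the same standard deviation, yet the heavy-tailed model gives a larger VaR and a much larger CVaR, and the gap between the two models is wider for the CVaR than for the VaR.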

25.3.1 One-dimensional Models

The cornerstone theories in finance, such as the mean-variance model for portfolio selection and the asset pricing models that have been developed, rest upon the assumption that asset returns follow a normal distribution. Yet, there is little, if any, credible empirical evidence that supports this assumption for financial assets traded in most markets throughout the world. Moreover, the evidence is clear that financial return series are heavy-tailed and, possibly, skewed. Fortunately, several papers have analyzed the consequences of relaxing the normality assumption and developed generalizations of prevalent concepts in financial theory that can accommodate heavy-tailed returns (see (Rachev and Mittnik 2000; Rachev 2003) and references therein).

(Mandelbrot 1963) strongly rejected normality as a distributional model for asset returns, conjecturing that financial return processes behave like non-Gaussian stable processes. To distinguish between Gaussian and non-Gaussian stable distributions, the latter are commonly referred to as “stable Paretian” distributions or “Levy stable” distributions¹. While there were several studies in the 1960s that extended Mandelbrot’s investigation of financial return processes, probably the most notable is (Fama 1963, 1965). His work and that of others led to a consolidation of the stable Paretian hypothesis. In the 1970s, however, closer empirical scrutiny of the “stability” of fitted stable Paretian distributions also produced evidence that was not consistent with the stable Paretian hypothesis. Specifically, it was often reported that fitted characteristic exponents (or tail-indices) did not remain constant under temporal aggregation². Partly in response to these empirical “inconsistencies”, various alternatives to the stable law were proposed in the literature, including fat-tailed distributions being only in the domain of attraction of a stable Paretian law, finite mixtures of normal distributions, the Student t-distribution, and the hyperbolic distribution, see (Bibby and Sorensen 2003).

¹ “Stable Paretian” is used to emphasize that the tails of the non-Gaussian stable density have Pareto power-type decay. “Levy stable” is used in recognition of the seminal work of Paul Levy, who introduced and characterized the class of non-Gaussian stable laws.
² For a more recent study, see (Akgiray and Booth 1988; Akgiray and Lamoureux 1989).

A major drawback of all these alternative models is their lack of stability. As has been stressed by Mandelbrot and argued by (Rachev and Mittnik 2000), among others, the stability property is highly desirable for asset returns. This is particularly evident in the context of portfolio analysis and risk management. Only for stable distributed returns does one obtain the property that linear combinations of different return series (e.g., portfolios) again follow a stable distribution. Indeed, the Gaussian law shares this feature, but it is only one particular member of a large and flexible class of distributions, which also allows for skewness and heavy-tailedness.

Recent attacks on Mandelbrot’s stable Paretian hypothesis focus on the claim that empirical asset return distributions are not as heavy-tailed as the non-Gaussian stable law suggests. Studies that come to such conclusions are typically based on tail-index estimates obtained with the Hill estimator. Because sample sizes beyond 100,000 are required to obtain reasonably accurate estimates, the Hill estimator is highly unreliable for testing the stable hypothesis. More importantly, Mandelbrot’s stable Paretian hypothesis is interpreted too narrowly if one focuses solely on the marginal distribution of return processes. The hypothesis involves more than simply fitting marginal asset return distributions. Stable Paretian laws describe the fundamental “building blocks” (e.g., innovations) that drive asset return processes. In addition to describing these “building blocks”, a complete model should be rich enough to encompass relevant stylized facts, such as

• non-Gaussian, heavy-tailed and skewed distributions
• volatility clustering (ARCH-effects)
• temporal dependence of the tail behavior
• short- and long-range dependence

An attractive feature of stable models – not shared by other distributional models – is that they allow us to generalize Gaussian-based financial theories and, thus, to build a coherent and more general framework for financial modeling. The generalizations are only possible because of specific probabilistic properties that are unique to (Gaussian and non-Gaussian) stable laws, namely, the stability property, the Central Limit Theorem and the Invariance Principle for stable processes. Detailed accounts of properties of stable distributed random variables can be found in (Samorodnitsky and Taqqu 1994; Janicki and Weron 1994).
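The Hill estimator mentioned earlier in this subsection can be sketched in a few lines. The example below is a hedged illustration on idealized data: the sample is drawn from an exact Pareto law with an assumed tail index of 1.7, where the estimator behaves well. For a stable law the tail is only asymptotically of Pareto type, so the result depends strongly on the number k of order statistics used, which is one way to read the caution expressed in the text.

```python
import numpy as np

def hill_tail_index(losses, k):
    """Hill estimate of the tail index from the k largest observations."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]   # descending order
    if k < 1 or k >= len(x) or x[k] <= 0:
        raise ValueError("need 1 <= k < n and positive upper order statistics")
    # reciprocal of the mean log-excess over the (k+1)-th largest observation
    return 1.0 / np.mean(np.log(x[:k] / x[k]))

rng = np.random.default_rng(5)
alpha = 1.7                                        # assumed true tail index
sample = 1.0 + rng.pareto(alpha, size=100_000)     # P(X > x) = x**(-alpha) for x >= 1

estimates = {k: hill_tail_index(sample, k) for k in (100, 1_000, 10_000)}
for k, a_hat in estimates.items():
    print(k, a_hat)
```

Even in this ideal setting the estimate based on only 100 order statistics is noticeably noisier than the ones based on 1,000 or 10,000.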

25.3.2 Multivariate Models

For the purposes of portfolio risk estimation, constructing one-dimensional models for the instruments is not sufficient. Failure to account for the dependencies between the instruments may be fatal for the analysis.


There are two ways to build a complete multivariate model. It is possible to hypothesize a multivariate distribution directly; that is, the dependence between stock returns as well as their one-dimensional behavior. Assumptions of this type include the multivariate Normal, the multivariate Student t, the more general elliptical family, the multivariate stable, etc. Sometimes, in analyzing dependence, an explicit assumption is not made; for instance, the covariance matrix is very often relied on. While an explicit multivariate assumption is not present, it should be kept in mind that this is consistent with the multivariate Normal hypothesis. More generally, the covariance matrix can describe only linear dependencies and this is a basic limitation.

In the last decade, a second approach has become popular. One can specify separately the one-dimensional hypotheses and the dependence structure through a function called a copula. This is a more general and more appealing method because one is free to choose separately different parametric models for the standalone variables and a parametric copula function. For more information, see (Embrechts et al. 2003).

The reader may object that, apart from specifying a parametric model, one can rely on the historical method. That is, we collect the vectors of observed returns and treat them as equally likely scenarios for the future. There are at least two deficiencies of this approach. Firstly, by treating the observed returns as equally likely, we cannot account for the clustering of volatility, which is apparent in daily returns data. Secondly, we tacitly assume that what has happened in the past will happen in the future; that is, the future will repeat the past, which is not a very convincing assumption.
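The limitation of the covariance matrix noted above can be seen in a two-variable example. In this sketch, which uses an artificial pair chosen purely for illustration, the second variable is a deterministic function of the first, yet their linear correlation is essentially zero.

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.standard_normal(100_000)
y = x ** 2 - 1.0                  # fully determined by x, yet uncorrelated with it

linear_corr = np.corrcoef(x, y)[0, 1]
# The dependence becomes visible once a non-linear transform of x is used.
nonlinear_corr = np.corrcoef(np.abs(x), y)[0, 1]

print(linear_corr, nonlinear_corr)
```

A model that summarizes dependence through the covariance matrix alone would treat x and y as unrelated, which is the kind of omission that a copula-based specification is meant to avoid.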

25.4 Strategy Construction Based on CVaR Minimization

In general, there are several types of strategies depending on whether the risk and/or the expected return of the optimal portfolio are compared to those of a benchmark with the goal of beating the benchmark. If so, then the strategy is said to be active, as opposed to the passive strategy of replicating a benchmark. In the classical Markowitz framework, an example of a benchmark tracking problem is the tracking error problem. In it, the standard deviation of the excess return, which is called the tracking error, is minimized subject to constraints. The tracking error in this case is used to measure the deviation of the optimal portfolio return relative to the benchmark. Certainly, this goal can be achieved through a better measure based on the CVaR rather than the standard deviation.

Other types of strategies can be distinguished. If short positions are not allowed, then we have long-only strategies; if short positions are allowed, we have long-short strategies and, as an extreme case of these, the zero-dollar strategies, in which the dollar value of the long part exactly equals the dollar value of the short part. The zero-dollar strategies generally give rise to more complicated problems from a mathematical viewpoint. One of the points of having short positions in the portfolio is to gain return in both bearish and bullish markets.

All problems considered can be extended with transaction cost or turnover constraints. We do not consider them, for brevity of exposition and clarity of the optimization problem formulations. Also, all strategies are considered self-financing in the sense that inflows and outflows of capital are not allowed.

25.4.1 Long-only Active Strategy

The traditional equity portfolio is constructed and managed relative to an underlying benchmark and aims at achieving active return while minimizing the active risk. Typically, the portfolio construction process in such cases suffers from two main disadvantages: the use of the tracking error as a measure of the active risk, which penalizes upside deviations from the market, and the lack of consideration of absolute risk, which can have a dramatic effect in case of market crashes. For example, in periods when the risk of the benchmark increases, the risk of the optimal solution will increase as well, since there is no absolute measure of it in the problem. The traditional definition of the problem is:

$$
\begin{aligned}
\min_{w}\quad & \mathrm{var}\left(w^T r - r_b\right) \\
\text{s.t.}\quad & w^T Er - Er_b \ge R \\
& w^T e = 1 \\
& l \le Aw \le u
\end{aligned}
\tag{25.6}
$$

where var stands for variance; i.e. in the objective there is the active variance. If we replace the active variance with the tracking error, the optimal solution will not change. Problem (25.6) is easier to solve because it reduces to a quadratic problem for which there are efficient solvers. In (25.6), and in all problems considered below, the double inequalities l ≤ Aw ≤ u in matrix form are linear constraints on the weights and generalize all possible linear constraints that can be imposed, such as constraints by industry, or simple box-type constraints either in absolute terms or relative to the weight of the corresponding stocks in the benchmark. Actually, the relative expected return constraint and the constraint that all weights sum up to 1, $w^T e = 1$, can also be generalized in this fashion, but we give them separately as they have special meaning. The relative expected return constraint is of practical importance, and the constraint $w^T e = 1$ guarantees that the strategy is self-financing; i.e. in the absence of transaction costs, the optimal portfolio present value equals the present value of the initial portfolio.


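A modification of (25.6) based on the CVaR can be suggested in which an additional absolute risk constraint is added. One possible version is:

$$
\begin{aligned}
\min_{w}\quad & \mathrm{CVaR}_\beta\left(w^T r - r_b\right) \\
\text{s.t.}\quad & \mathrm{CVaR}_\alpha\left(w^T r\right) \le \mathrm{CVaR}^* \\
& w^T Er - Er_b \ge R \\
& w^T e = 1 \\
& l \le Aw \le u
\end{aligned}
\tag{25.7}
$$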

where the CVaR in the objective is at the confidence level 1 – β and is relative to the benchmark portfolio returns $r_b$. The CVaR in the constraint set is in absolute terms and its confidence level is 1 – α, which may differ from 1 – β. The upper bound CVaR* is selected by the portfolio manager. It has a direct interpretation due to the CVaR being the average loss beyond the corresponding VaR level, see the CVaR definition in the previous section.

Suppose that all the constraints and the parameters in an optimization problem have been selected, including the risk measure (in this case the CVaR and its confidence level), the additional weight constraints and, last but not least, the multivariate model. Then the next step in the analysis is to run back-testing calculations in order to see what we would have gained in terms of risk-adjusted return, had we followed the strategy in the previous 1, 3 or 5 years, for instance. In this way, we can compare several strategies and decide on the best one.

Problem (25.7) has the disadvantage that it may become infeasible for some periods in a back-testing calculation because of the absolute risk constraint or the relative expected return constraint. One reason is that for some periods the upper bound CVaR* may be too low and there may not be a single portfolio satisfying it. An alternative formulation of (25.7) can be more advantageous:

$$
\begin{aligned}
\min_{w}\quad & \lambda_1 \mathrm{CVaR}_\beta\left(w^T r - r_b\right) + \lambda_2 \mathrm{CVaR}_\alpha\left(w^T r\right) - w^T Er \\
\text{s.t.}\quad & w^T e = 1 \\
& l \le Aw \le u
\end{aligned}
\tag{25.8}
$$

In (25.8), the objective has a more complicated form. It can be viewed as a utility function composed of three components with two positive risk-aversion coefficients, λ1 and λ2, which signify the relative importance of the three components. Note that, due to the linearity of the expectation, the expected return of the benchmark can be ignored in (25.8). The two aversion coefficients are free parameters in the problem and can be chosen by the portfolio manager. Alternatively, they can be calibrated by a sequence of back-testing calculations.


The main advantages of (25.7) or (25.8) over the traditional problem (25.6) are:

• they allow for a non-normal distributional assumption
• the risk measure used penalizes only the downside deviations from the market
• the absolute risk of the optimal portfolio can be controlled

In practice, (25.7) or (25.8) can be solved using the linearization procedure applied in (25.5), see the previous section, after we have produced scenarios from the fitted parametric model. The type of the parametric model does not influence the structure of the optimization problem; the technique is one and the same on condition that scenarios are provided.
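The linearization of a problem of the form (25.8) can be sketched as follows. Each of the two CVaR terms receives its own threshold variable and its own set of auxiliary variables. This is an illustrative sketch, not the authors' code: it assumes SciPy's `linprog`, long-only box bounds 0 ≤ w ≤ 1 as the linear constraints, simulated Student t scenarios, and an equally weighted basket as a hypothetical benchmark. The function name `solve_25_8` and all parameter values are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def tail_average(x, eps):
    """Scenario CVaR: mean of the N * eps largest losses."""
    losses = np.sort(-np.asarray(x, dtype=float))[::-1]
    return losses[:max(int(round(len(losses) * eps)), 1)].mean()

def solve_25_8(scen, bench, alpha, beta, lam1, lam2):
    """LP version of (25.8) with w'e = 1 and 0 <= w <= 1 (assumed constraints)."""
    scen = np.asarray(scen, dtype=float)
    N, n = scen.shape
    I, one, zero = np.eye(N), np.ones((N, 1)), np.zeros((N, 1))
    # Decision vector z = [w (n), theta1, theta2, d1 (N), d2 (N)].
    cost = np.concatenate([-scen.mean(axis=0), [lam1, lam2],
                           np.full(N, lam1 / (N * beta)),
                           np.full(N, lam2 / (N * alpha))])
    A_ub = np.vstack([
        np.hstack([-scen, -one, zero, -I, 0 * I]),   # active loss - theta1 <= d1
        np.hstack([-scen, zero, -one, 0 * I, -I]),   # absolute loss - theta2 <= d2
    ])
    b_ub = np.concatenate([-np.asarray(bench, dtype=float), np.zeros(N)])
    A_eq = np.concatenate([np.ones(n), np.zeros(2 * N + 2)])[None, :]
    bounds = [(0.0, 1.0)] * n + [(None, None)] * 2 + [(0.0, None)] * (2 * N)
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.fun

rng = np.random.default_rng(7)
N = 300
scales = np.array([0.8, 1.0, 1.3, 1.6])
scen = 0.0005 + 0.01 * rng.standard_t(df=4, size=(N, 4)) * scales
bench = scen.mean(axis=1)            # hypothetical benchmark: equally weighted basket
w_opt, obj = solve_25_8(scen, bench, alpha=0.05, beta=0.05, lam1=1.0, lam2=0.5)
print(w_opt, obj)
```

Since the benchmark itself is a feasible portfolio here, the optimal objective cannot exceed the value obtained by holding the benchmark, for which the active CVaR term vanishes.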

25.4.2 Long-short Strategy

Long/short hedge funds focus on security selection and aim at achieving absolute returns keeping small market risk exposure by offsetting short and long positions. The increased flexibility introduced by short positions allows reducing dependence with the market and allows taking advantage of overvalued as well as undervalued securities. Suppose that the portfolio manager knows which stocks will be long (winners) and which will be short (losers). The only remaining problem is their relative proportions. In this case, both problems (25.7) and (25.8) can be used without modification. The set l ≤ Aw ≤ u can be defined to account for portfolio manager’s decision. For example, the box constraints of all short stocks are of the type a% ≤ wi ≤ 0 ad those of the long stocks, 0 ≤ wi ≤ b%. The constraint wTe = 1 means that the weights are interpreted as a percentage of the optimal portfolio present value, which, in absence of transaction costs, equals the present value of the initial portfolio as the strategy is by construction self-financing. Therefore this problem does not control the ratio of the exposure of the long and the short legs. A constraint of this type can easily be implemented by including

$$\sum_{i \in S} w_i \le \sum_{i \in L} w_i$$

where L and S are the sets of the long and short stocks respectively. If the manager does not know which stocks to short, then the problem becomes more involved. Basically, there are two options. On one hand, the manager can screen the stocks prior to the optimization and decide on the winners and the losers and after that proceed to the optimization. The screening can be done by adopting a relevant criterion which could be a performance measure. For example, the STARR ratio and the R-ratio are completely compatible with problems (25.7) or (25.8) as they involve the CVaR as a risk measure; see the previous section for more details. In summary, we suggest a two-step procedure:


Borjana Racheva-Iotova, Stoyan Stoyanov

a. rank the stocks according to a performance measure and choose the winners and the losers
b. solve the optimization problem to get the optimal weights

On the other hand, the portfolio manager may decide to use a one-step process. In this case, binary variables can be employed since the sign of the weights is not known beforehand. The optimization problem becomes numerically much more involved than the two-step alternative but provides the true optimal solution. Such problems are solved with mixed-integer programming (MIP) methods.

In all problems considered above, there is a numerical stability issue. When the long/short ratio approaches one, i.e. the zero-dollar problem, the optimization problem becomes numerically unstable because the weights are defined as a percentage of the present value, which becomes zero in the limit. Therefore, the zero-dollar problem is a singularity. To avoid this, the weights can be redefined as a percentage of the total exposure instead, that is, the value of the long leg minus the value of the short leg. In effect, the constraint $w^T e = 1$ should be replaced.
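Step a of the two-step procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that return scenarios are already available for each stock, and it ranks the stocks by a STARR-style ratio, taken here to be the mean return divided by the empirical CVaR of the loss. The function names, the scenario numbers and the tail probability of 20% are hypothetical.

```python
def empirical_cvar(losses, tail_prob):
    """Average of the worst `tail_prob` share of the loss scenarios."""
    k = max(1, round(tail_prob * len(losses)))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


def tail_adjusted_ratio(returns, tail_prob):
    """Mean return per unit of CVaR of the loss (loss = -return)."""
    mean_return = sum(returns) / len(returns)
    risk = empirical_cvar([-r for r in returns], tail_prob)
    if risk <= 0:
        raise ValueError("ratio is only meaningful for a positive CVaR")
    return mean_return / risk


def screen(scenarios, n_long, n_short, tail_prob=0.05):
    """Step a: rank by the ratio and return (winners, losers)."""
    ranked = sorted(
        scenarios,
        key=lambda name: tail_adjusted_ratio(scenarios[name], tail_prob),
        reverse=True,
    )
    return ranked[:n_long], ranked[len(ranked) - n_short:]


# Hypothetical return scenarios for four stocks (illustrative numbers only).
scenarios = {
    "A": [0.02, 0.01, -0.01, 0.03, -0.02],
    "B": [0.01, -0.03, 0.00, 0.01, -0.04],
    "C": [0.01, 0.02, -0.01, 0.00, 0.01],
    "D": [-0.02, 0.01, -0.03, 0.02, -0.01],
}
winners, losers = screen(scenarios, n_long=2, n_short=1, tail_prob=0.2)
print(winners, losers)
```

The winners would then receive box constraints of the long type and the losers box constraints of the short type before step b is solved.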

25.4.3 Zero-dollar Strategy

The zero-dollar strategy is a limit case of the long/short strategy in which the value of the long leg equals in absolute terms the value of the short leg, or, alternatively, the total exposure of the portfolio is split in halves between the short and the long part. The optimization problem becomes more involved because the intuitive formulation in (25.7) or (25.8) should be modified since the weights cannot be defined as a percentage of the present value which is equal to zero. The alternative of (25.8) is, for example:

$$
\begin{aligned}
\min_{w} \quad & \lambda_1 \mathrm{CVaR}_{\beta}\left(w^T r - r_b\right) + \lambda_2 \mathrm{CVaR}_{\alpha}\left(w^T r\right) - w^T E r \\
\text{s.t.} \quad & \sum_{i \in L} w_i = 1 \\
& \sum_{i \in S} w_i = -1 \\
& l \le Aw \le u
\end{aligned}
\tag{25.9}
$$

where S is the set of the short stocks and L is the set of the long stocks. The two new weight constraints mean that the weights are defined as a percentage of the value of the long part. Thus, the sum of all weights is zero, in contrast to the formulation in (25.8). Here we face the same problems as in the long/short case: either we can use the two-step procedure outlined there, or a one-step process involving MIP. The same trick of changing the weight definition works with (25.6) and (25.7) as well. If, instead, the weights are defined as a percentage of the total exposure, then the sum of the positive weights equals ½ and the sum of the negative weights equals −½.
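The relation between the two weight definitions can be checked with a few lines of code. The sketch below is illustrative only: the function name and the weights are hypothetical, and the total exposure is taken to be the value of the long leg minus the (negative) value of the short leg, as described above.

```python
def rescale_to_total_exposure(weights):
    """Convert weights into percentages of the total exposure."""
    long_leg = sum(w for w in weights if w > 0)
    short_leg = sum(w for w in weights if w < 0)   # negative number
    total_exposure = long_leg - short_leg
    return [w / total_exposure for w in weights]


# Hypothetical weights satisfying the constraints of (25.9):
# the long weights sum to 1 and the short weights sum to -1.
w = [0.5, 0.5, -0.75, -0.25]
v = rescale_to_total_exposure(w)

long_sum = sum(x for x in v if x > 0)
short_sum = sum(x for x in v if x < 0)
print(v, long_sum, short_sum)
```

Under either definition the weights sum to zero; only the scale changes, by a factor of two in this case.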

25.4.4 Other Aspects

Apart from being long or long/short, there are other important features of a strategy. One aspect is whether it is correlated with an industry represented by an index. Very often the portfolio manager demands that the optimal portfolio should not be correlated with one or several indexes. Such requirements can be implemented in the optimization problems considered in this chapter by including additional linear weight constraints. Suppose that the returns of the stocks can be represented by a multifactor model in which the explanatory variables are the returns of the indexes:

$$r_i = a_i + \sum_{j=1}^{K} b_{ij} r_j^I + \varepsilon_i, \quad i = 1, \dots, N$$

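where $r_j^I$ are the returns of the corresponding indexes, $a_i$ and $b_{ij}$ are the regression coefficients and $\varepsilon_i$ are the residuals. The returns of a portfolio of these stocks can be expressed in a similar way,

$$w^T r = w^T a + \sum_{j=1}^{K} w^T b_j \, r_j^I + w^T \varepsilon$$

where $b_j = (b_{1j}, \dots, b_{Nj})^T$ collects the loadings of all stocks on index j and $w^T \varepsilon$ is the portfolio residual.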

Thus, in the optimization problem, we have to include additional constraints involving the coefficients in the above equation, since an optimal portfolio with $w^T b_j = 0$ will be uncorrelated with the index j. Of course, the reasoning relies on the multifactor model being realistic. If it is not, or if the regression coefficients are not properly estimated, then the optimal solution will not, in reality, be uncorrelated with the corresponding indexes.
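A minimal sketch of how such a constraint could be assembled is given below. It is a simplification under stated assumptions: only one index is used (K = 1), so that each loading is a simple regression slope, and the data are hypothetical and generated without residuals. The helper names are illustrative and do not refer to any particular optimization package.

```python
def ols_beta(stock_returns, index_returns):
    """Slope b_i of the regression r_i = a_i + b_i * r_I + eps_i."""
    n = len(index_returns)
    mean_s = sum(stock_returns) / n
    mean_x = sum(index_returns) / n
    cov = sum((s - mean_s) * (x - mean_x)
              for s, x in zip(stock_returns, index_returns))
    var = sum((x - mean_x) ** 2 for x in index_returns)
    return cov / var


def index_exposure(weights, betas):
    """Left-hand side w^T b of the neutrality constraint."""
    return sum(w * b for w, b in zip(weights, betas))


# Hypothetical data: three stocks generated exactly from one index.
index_returns = [0.01, -0.02, 0.03, 0.00]
stocks = [
    [0.001 + 1.5 * x for x in index_returns],
    [0.5 * x for x in index_returns],
    [-0.002 + 1.0 * x for x in index_returns],
]
betas = [ols_beta(s, index_returns) for s in stocks]

# One extra row in l <= A w <= u with l = u = 0 enforces w^T b = 0.
constraint_row, lower, upper = betas, 0.0, 0.0

weights = [0.5, 0.5, -1.0]   # a candidate long/short portfolio
exposure = index_exposure(weights, constraint_row)
print(betas, exposure)
```

With several indexes, one such row per index would be appended, with the loadings estimated by a multiple regression instead of the single slope used here.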

25.5 Optimization Example

In this section, we provide a back-testing example of a long-only optimal portfolio strategy using the Russell 2000 universe. The back-testing time period is ten years, from December 1993 to December 2004, with monthly frequency. In the optimization algorithm, we use the proprietary stable model in Cognity Risk & Portfolio Optimization System. In the strategies, the Russell 2000 index is used as the benchmark; that is, rb is the return of Russell 2000. The optimization constraints are the following.
• 0% to 3% limit on single stock
• ±3% industry exposure with respect to the benchmark; the industries being defined by Ford Equity Research
• The active return is strictly positive
• The two-way turnover is below 15% per month. This constraint is used as a soft constraint, i.e. it may be exceeded at times. Also, no limit is imposed in July because the benchmark is adjusted in July.


The back-testing is performed in the following way. We use 450 stocks as the initial universe. One year of daily data is used to calibrate the model, and monthly scenarios are produced by it. Then a version of the optimal portfolio problem (25.8) is solved in which a tail probability of 5% is selected for the CVaR. At the end of the month, the portfolio PV is calculated. The process is repeated the next month. Figure 25.1 shows the stable CVaR portfolio PV compared to the Russell 2000 index.

In addition to the stable method, monthly back-testing is performed for a version of the Markowitz problem (25.6). We use a factor model of 8 factors and 5 years of monthly data to calibrate it. Each month the covariance matrix is estimated through the factor model and the optimization problem is solved. The portfolio PV is calculated at the end of the month. The evolution of the portfolio PVs is shown in Figure 25.2. Note that the PV of the stable portfolio is scaled to start with the same capital as the Markowitz portfolio.

Additional information is given in Tables 25.1 and 25.2. The average monthly turnover is defined as the dollar weighted purchases plus the dollar weighted sales. Tables 25.3 and 25.4 provide details on return-risk ratios. The information ratio is the active return per unit of tracking error.

[Figure: Optimal Portfolio Present Values. x-axis: Date (Dec-93 to Dec-04); y-axis: Present Value; series: Stable CVaR(95%), Russell 2000]

Figure 25.1. The time evolution of the present values of the stable CVaR portfolio compared to the Russell 2000 index


[Figure: Optimal Portfolio Present Values. x-axis: Date (Dec-99 to Dec-04); y-axis: Present Value; series: Stable CVaR(95%), Markowitz (8 factors), Russell 2000]

Figure 25.2. The time evolution of the present values of the Markowitz and the stable CVaR (scaled) portfolios compared to the Russell 2000 index

Table 25.1. Average holdings count

               10 year   5 year   3 year   2 year   1 year
Stable CVaR    112       105      102      100      104
Markowitz      n/a       137      110      104      100

Table 25.2. Average monthly turnover

               11 months   July   All months
Stable CVaR    16%         163%   27%
Markowitz      18%         85%    24%

Table 25.3. Annualized information ratio

               10 year   5 year   3 year   2 year   1 year
Stable CVaR    0.74      0.71     0.93     0.74     1.22
Markowitz      n/a       0.29     −0.24    −0.57    1.03


Table 25.4. Sharpe ratios

               10 year   5 year   3 year   2 year   1 year
Stable CVaR    1.01      0.92     1.22     2.13     1.66
Markowitz      n/a       0.68     0.71     1.99     2.16
Russell 2000   0.42      0.36     0.58     1.82     1.19

Detailed risk vs return statistics follow in Tables 25.5 and 25.6.

Table 25.5. Annualized return and volatility

Return         10 year   5 year    3 year    2 year    1 year
Stable CVaR    17.90%    18.10%    20.30%    34.00%    24.10%
Markowitz      n/a       11.10%    9.90%     24.30%    21.50%
Russell 2000   11.20%    7.90%     21.10%    28.60%    17.10%

Volatility     10 year   5 year    3 year    2 year    1 year
Stable CVaR    17.70%    19.60%    16.50%    15.90%    14.50%
Markowitz      n/a       16.50%    14.00%    12.20%    10.00%
Russell 2000   26.40%    21.80%    21.10%    15.70%    14.40%

Table 25.6. Annualized active return and annualized tracking-error

Active Return    10 year   5 year    3 year    2 year    1 year
Stable CVaR      6.70%     7.80%     8.10%     5.30%     7.00%
Markowitz        n/a       3.30%     −2.20%    −4.30%    6.40%

Tracking-error   10 year   5 year    3 year    2 year    1 year
Stable CVaR      9.00%     11.10%    8.70%     7.20%     5.70%
Markowitz        n/a       11.40%    9.50%     7.70%     6.30%
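Since the information ratio is the active return per unit of tracking error, the stable CVaR entries of Table 25.3 can be cross-checked against Table 25.6. The short sketch below does this; the numbers are copied from the two tables, while the variable names are illustrative. Small gaps are to be expected because the published inputs are rounded.

```python
horizons = ["10 year", "5 year", "3 year", "2 year", "1 year"]

# Stable CVaR portfolio, values taken from the tables above (in percent).
active_return = [6.70, 7.80, 8.10, 5.30, 7.00]      # Table 25.6
tracking_error = [9.00, 11.10, 8.70, 7.20, 5.70]    # Table 25.6
reported_ir = [0.74, 0.71, 0.93, 0.74, 1.22]        # Table 25.3

# Information ratio = active return / tracking error.
recomputed_ir = [a / t for a, t in zip(active_return, tracking_error)]
gaps = [abs(r - p) for r, p in zip(recomputed_ir, reported_ir)]

for h, r, p in zip(horizons, recomputed_ir, reported_ir):
    print(f"{h}: recomputed {r:.3f}, reported {p:.2f}")
```

For every horizon the recomputed ratio differs from the reported one by less than 0.01.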

On the basis of the charts and the information in the tables, it is obvious that both the optimization problem type and the multivariate distributional assumption are vital for the optimal portfolio performance and should be considered.


25.6 Conclusion

In this chapter, we consider a unified framework for constructing optimal portfolio strategies. We stress the importance of both the use of an appropriate optimization problem with a realistic downside risk measure and the multivariate modeling of the stock returns. A back-testing example of two strategies supports this view. We compare the back-testing performance of a classical Markowitz optimization with a strategy constructed according to the post-modern framework described in this chapter and using heavy-tailed modeling of the stock returns. The back-testing results indicate that adopting the more sophisticated approach pays off.


CHAPTER 26 Applications of Heuristics in Finance Manfred Gilli, Dietmar Maringer, Peter Winker

26.1 Introduction

Having the optimal solution for a given problem is crucial in the competitive world of finance; finding this optimal solution, however, is often a considerable challenge. Even if the problem is well defined and all necessary data are available, it is not always well behaved: rather simple constraints are often enough to prohibit closed form solutions or to prevent the reliable application of standard numerical methods. A common way to avoid this difficulty is to restate the problem: restrictive and cumbersome constraints are relaxed and simplifying assumptions are introduced until the revised problem is approachable with the available methods; the relaxed constraints are afterwards superimposed onto the solution of the simplified problem. Unfortunately, subtleties as well as central properties of the initial problem can be lost in this process, and results assumed to be ideal can actually be far away from the true optimum.

Instead of fitting the problem at hand to the available optimization tool, a more desirable approach would be to tailor the optimization approach around the original unmodified problem. This, of course, requires that the method of choice itself is based on a rather flexible concept that forms a general purpose technique. These conditions are met by heuristic optimization approaches.

However, the versatility of these methods comes at some cost. First, the implementation of heuristics might be slightly more demanding than that of traditional methods. In addition, ready-to-use toolboxes are not available yet. Second, the solutions resulting from heuristic optimization contain an additional stochastic component. Nevertheless, a high quality stochastic solution to the relevant problem might be far more useful than a low quality deterministic solution or a solution to a simplified and, consequently, irrelevant problem.
This chapter addresses issues in finance where the problem is well defined but exceeds the capabilities of traditional solution methods, and shows how answers can be found with heuristic methods. Neither the problems presented nor the heuristics employed represent a complete overview. Rather, they serve as examples of the principles of heuristics and of how they can be used in finance. References to further applications and heuristic methods can be found, e.g., in (Winker 2001), (Winker and Gilli 2004), (Maringer 2005b) or (Brabazon and O’Neill 2006).

26.2 Heuristics

26.2.1 Traditional Numerical vs. Heuristic Methods

Not all optimization problems in finance have closed form solutions. As a matter of fact, closed form solutions are the exception rather than the rule for real life problems. Nonetheless, one might still be interested in numerical solutions for given data or parameter settings. Many of the traditional deterministic methods that provide such numerical solutions can be assigned to one of the following three groups:

1. (Almost) exhaustive search methods test as many of the candidate solutions as possible, ideally all of them. Where possible, the search space is narrowed down by excluding areas that are known to be inferior. If this pruning is not possible or if the objective variables are continuous, methods such as grid search structure the solution space in order to identify the most promising areas, which are then investigated more closely.
2. Neighborhood methods based on the first order conditions aim to converge from an initial solution towards the optimum by repeatedly modifying the current solution based on sophisticated guesses. Gradient search, e.g., uses the first derivative of the objective function with respect to the variables, i.e., the gradient, and then adjusts the current candidate solution such that it gets closer to where the first derivative is equal to zero, which is the necessary (but not sufficient) condition for a (local) optimum.
3. General algorithmic approaches can incorporate both of the above principles, and they can also apply a “first things first” approach. Some mathematical programming methods, e.g., only modify one variable at a time, starting with the one supposed to be the most important and then turning to less critical ones.
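The first two groups can be illustrated with a one-dimensional toy problem. The sketch below is an illustration under simple assumptions, not a production routine: the objective, the grid size, the step size and the stopping tolerance are all hypothetical choices.

```python
def grid_search(f, lower, upper, n_points):
    """Evaluate f on an equally spaced grid and return the best point."""
    step = (upper - lower) / (n_points - 1)
    candidates = [lower + i * step for i in range(n_points)]
    return max(candidates, key=f)


def gradient_search(grad, x0, step_size=0.1, tol=1e-8, max_iter=10_000):
    """Move along the gradient until the first derivative is (almost) zero."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:       # first order condition (nearly) satisfied
            break
        x += step_size * g     # uphill step for a maximization problem
    return x


def objective(x):
    """Smooth objective with a single maximum at x = 2."""
    return -(x - 2.0) ** 2 + 3.0


def objective_gradient(x):
    return -2.0 * (x - 2.0)


x_grid = grid_search(objective, 0.0, 4.0, n_points=41)
x_grad = gradient_search(objective_gradient, x0=0.0)
print(x_grid, x_grad)
```

On this well behaved objective both methods locate the maximum at x = 2; the difficulties discussed next arise when the objective has several optima or is not smooth.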
The literature holds an impressive array of different numerical methods, all of which have their merits but also their shortcomings.1 Often designed for a certain type of objective function or constraints, they work efficiently and find the solution deterministically when the underlying assumptions are met. When a problem becomes too demanding for standard methods, it is usually due to one or a combination of the following reasons: (i) even if the problem space has been narrowed down, it still is too vast for exhaustive search; (ii) there exist multiple optima, yet there is no a priori criterion that indicates where to look for the global one or how to avoid convergence to local ones; and/or (iii) discontinuities in the search space obstruct the search process. When the search space is too rough, the area around the global optimum might easily slip through a grid that is too coarse, while a sufficiently fine grid might exceed the computational capacities. The gradient might lead the path to the optimum closest to the current solution which, however, might be a local one that will never be escaped. And once a certain degree of complexity is exceeded or some basic assumptions do not hold, applying these methods is not unlike the proverbial search for the keys not where they have been lost but under the streetlight because it is the only place with light: one might strike it lucky, but more promising areas might be left out.

1 For an introduction to optimization techniques, see, e.g., (Taha 2003).

Compared to these traditional methods, heuristic methods differ in some very fundamental aspects. A salient feature of these methods is the incorporation of stochastic elements; another is that, while still preferring improvements in the objective function, they also allow for (interim) impairments. Owing to these two principles, the search process is no longer deterministic. This can be very beneficial for the search process: unlike in the traditional deterministic methods, converging to an inferior solution once no longer implies that another run cannot produce the global optimum. Also, local optima can be overcome more easily. Furthermore, heuristics are often designed as general principles that can easily be adapted to all sorts of problems and constraints.
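The trap of local optima can be made concrete with a small sketch. The two-peaked objective below is a hypothetical example chosen for illustration; it has a local maximum near x = −1 and a higher, global maximum near x = 2. A deterministic gradient search simply climbs the peak closest to its starting point.

```python
import math


def f(x):
    """Two peaks: a local one near x = -1 and the global one near x = 2."""
    return math.exp(-(x + 1.0) ** 2) + 2.0 * math.exp(-(x - 2.0) ** 2)


def grad_f(x):
    return (-2.0 * (x + 1.0) * math.exp(-(x + 1.0) ** 2)
            - 4.0 * (x - 2.0) * math.exp(-(x - 2.0) ** 2))


def gradient_ascent(x0, step_size=0.1, tol=1e-10, max_iter=10_000):
    """Deterministic uphill search; it never accepts a downhill step."""
    x = x0
    for _ in range(max_iter):
        g = grad_f(x)
        if abs(g) < tol:
            break
        x += step_size * g
    return x


x_from_left = gradient_ascent(-2.0)    # starts in the basin of the local peak
x_from_right = gradient_ascent(1.0)    # starts in the basin of the global peak
print(x_from_left, f(x_from_left))
print(x_from_right, f(x_from_right))
```

The run started at −2 stays at the inferior peak no matter how often it is repeated, which is exactly the behavior that stochastic elements and the acceptance of interim impairments are meant to overcome.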

26.2.2 Some Selected Approaches

26.2.2.1 Threshold Accepting and Simulated Annealing

A very simple, yet also very powerful heuristic is Threshold Accepting (TA). Introduced by (Dueck and Scheuer 1990), this method is not dissimilar to the aforementioned gradient search (GS): the search process starts with some initial solution, and new solutions are then iteratively generated within the neighborhood of the current solution in order to converge to the optimum. However, while GS aims to start off in a “good” neighborhood that is supposed to be close to the optimum and, more importantly, makes sophisticated guesses about the direction in which to move, TA does neither: the initial solution comes from a random guess, and so do all the suggested modifications and moves. However, to prevent a completely random walk, these suggestions are accepted only if they improve the objective value, or if the impairment does not exceed a certain threshold (hence the name). According to this principle, the algorithm accepts any uphill move but also small downhill moves for a maximization problem, and vice versa for a minimization problem. The numerical value of the threshold (T) is therefore a measure of how tolerant the algorithm is towards impairment. Usually it is set rather high in the beginning and gradually lowered towards zero in the course of the iterations. This encourages exploration in the early stage of the search process, which eventually blends into a type of uphill search strategy once the threshold has reached zero.


Table 26.1. Comparison of pseudo-codes for a maximization problem

Uphill search:
    Initialize x (sophisticated guess)
    REPEAT
        sophisticated guess for d;
        xn = x + d;
        If f(xn) > f(x)
            x := xn;
        end;
    UNTIL converged;

Threshold Accepting:
    Initialize x (at random)
    REPEAT
        random guess for d;
        xn = x + d;
        If f(xn) > f(x) - T
            x := xn;
        end;
        lower threshold, T;
    UNTIL converged;
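The Threshold Accepting column of Table 26.1 can be turned into a short runnable sketch. The version below is one possible reading under stated assumptions rather than a reference implementation: the search is confined to an interval, the threshold is lowered linearly to zero over a fixed number of rounds, and the best solution visited is recorded, which is bookkeeping that does not appear in the table. The objective and all parameter values are hypothetical.

```python
import math
import random


def threshold_accepting(f, lower, upper, thresholds, steps_per_threshold,
                        step_scale, rng):
    """Maximize f on [lower, upper] with Threshold Accepting."""
    x = rng.uniform(lower, upper)          # initialize x at random
    fx = f(x)
    best_x, best_f = x, fx                 # bookkeeping, not in Table 26.1
    for T in thresholds:                   # threshold is lowered round by round
        for _ in range(steps_per_threshold):
            d = rng.uniform(-step_scale, step_scale)   # random guess for d
            xn = min(upper, max(lower, x + d))         # keep xn in the interval
            fn = f(xn)
            if fn > fx - T:                # accept unless worse by T or more
                x, fx = xn, fn
                if fx > best_f:
                    best_x, best_f = x, fx
    return best_x, best_f


def two_peaks(x):
    """Hypothetical objective with a local and a global maximum."""
    return math.exp(-(x + 1.0) ** 2) + 2.0 * math.exp(-(x - 2.0) ** 2)


n_rounds = 20
thresholds = [0.5 * (1 - k / (n_rounds - 1)) for k in range(n_rounds)]
rng = random.Random(0)                     # fixed seed for reproducibility
best_x, best_f = threshold_accepting(two_peaks, -4.0, 4.0, thresholds,
                                     steps_per_threshold=200,
                                     step_scale=0.5, rng=rng)
print(best_x, best_f)
```

Once the threshold reaches zero, the acceptance rule coincides with that of the uphill search column, so the final round is a purely uphill random search.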

The ability to (repeatedly) accept downhill steps is one remedy to escape local optima. TA can therefore be shown to converge to the global optimum if certain requirements are met.2 TA is a simplified variant of Simulated Annealing,3 which has a stochastic acceptance criterion that mimics the particle behavior in an annealing or crystallization process. When applying TA, values for its main components have to be found: First, a neighborhood for the variable (vector) x has to be defined from which the modified candidate solution x' will be drawn randomly. For continuous variables, e.g., this can be done by selecting a suitable distribution for a random variable d that represents the suggested change in x; the new solution is then x' = x + d. Next, a threshold sequence has to be found that governs the acceptance decision. The value of the threshold ought to consider the typical change in the objective function f(x) and how much time has already been spent on the search process. This idea of a data driven choice of the threshold sequence was introduced by (Fang and Winker 1997). Finally, the halting criterion has to be specified. In TA, the number of iterations is fixed, not least because this allows predefining the threshold sequence. (Winker 2001) and (Maringer 2005b) offer detailed discussions on the tuning of these parameters. Table 26.1 compares the main steps in a gradient search method to TA.

2 See (Althöfer and Koschnick 1991).
3 See (Kirkpatrick et al. 1983).

26.2.2.2 Evolutionary and Genetic Methods

Evolutionary Strategies (as introduced by (Rechenberg 1973)) and Genetic Algorithms (GAs) (see (Holland 1975)) are based on the idea of natural evolution and selection. Within populations, it is the fitter individuals that outperform (and ultimately replace) the less capable ones and that have a higher chance of reproducing and providing their offspring with an advantageous (genetic) inheritance.

Translated into heuristics, methods built on this principle consider whole populations of search agents where each individual represents one solution. These individuals can reproduce by combining their “genetic information” (i.e., solution) with a partner’s. Typically, this is implemented by some sort of cross-over mechanism that combines part of the genetic patrimony of each parent. In GA, candidate solutions are usually encoded as binary strings. In this case, the cross-over operator decides at which positions the string elements are copied from the paternal string; the remaining positions are filled with the respective maternal elements. In addition, mutation can be performed, where a slight random change is applied to the new solution. If the new individual, often called offspring, is endowed with a genetic structure that has good or superior fitness with respect to the other individuals of the current population, then it is likely to survive and replace a less fit individual. If its fitness is relatively low, it will not be accepted as a new member of the population. By this process of natural selection, favorable characteristics (i.e., parts of the solution) will survive and become refined while inappropriate elements will be removed.4

Evolutionary and genetic methods are therefore quite different from TA: TA can be characterized as a trajectory method where a new solution is found only within the neighborhood of the current one and the search agent traverses the solution space. Discontinuous methods, on the other hand, can always access the full solution space to generate new solutions. Also, TA is a single agent method where only one candidate solution is considered per iteration step, as opposed to multi agent methods where many different solutions are maintained at the same time.5
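The mechanics described above (binary strings, cross-over, mutation and replacement of a less fit individual) can be sketched in a few lines. The example below is a deliberately simple illustration with assumptions of its own: the fitness function merely counts the ones in the string, parents are drawn uniformly at random, and the offspring replaces the worst individual only if it is strictly fitter. None of these choices is prescribed by the text.

```python
import random


def fitness(bits):
    """Hypothetical toy fitness: the number of ones in the string."""
    return sum(bits)


def crossover(father, mother, rng):
    """At each position copy the paternal bit, otherwise the maternal one."""
    return [f if rng.random() < 0.5 else m for f, m in zip(father, mother)]


def mutate(bits, rate, rng):
    """Flip each bit with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]


def genetic_algorithm(n_bits, pop_size, generations, rng, mutation_rate=0.02):
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in population)
    for _ in range(generations):
        father, mother = rng.sample(population, 2)
        child = mutate(crossover(father, mother, rng), mutation_rate, rng)
        worst = min(range(pop_size), key=lambda i: fitness(population[i]))
        if fitness(child) > fitness(population[worst]):
            population[worst] = child      # offspring replaces a less fit one
    return population, initial_best


rng = random.Random(1)                     # fixed seed for reproducibility
population, initial_best = genetic_algorithm(n_bits=30, pop_size=20,
                                             generations=2000, rng=rng)
final_best = max(fitness(ind) for ind in population)
print(initial_best, final_best)
```

Because only the worst individual is ever replaced, and only by a strictly fitter offspring, the best fitness in the population cannot deteriorate from one generation to the next.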

26.2.3 Further Methods and Principles

During the search process with the heuristics outlined so far, it might happen that similar or even the same candidate solutions are repeatedly suggested: e.g., in trajectory methods, the same or close points might be revisited while moving through the search space, and in evolutionary methods different parents might produce identical offspring if the population is rather homogeneous. In some cases this might be advantageous, e.g., when reversing from a local optimum or in order to reinforce supposedly “close to perfect” solutions. An augmentation of this strategy is the elitist principle, which compares the current solution(s) to the best one found so far, commonly called the elitist. In multi agent methods, the elitist might then replace the worst individual of the current population, and in single agent methods, the elitist might replace the current solution if there has been no overall progress over a certain number of periods.6 In either case, the underlying idea is to avoid moving into less promising areas of the search space.

4 For a more in depth presentation, see (Fogel 2001).
5 For a characterization and relevant aspects of different heuristics, see (Winker and Gilli 2004).
6 Alternatively, a restart at an arbitrary point in the search space might be beneficial to check whether or not the elitist represents just a local optimum.

In some cases, however, replication merely incurs high computational costs without much benefit. (Glover 1986) recognized this problem and suggested Tabu Search, which is also a trajectory search method. Given some initial candidate solution, a number of alternative solutions within its neighborhood are generated. The best of these solutions is then chosen and forms the starting point for the next iteration’s neighborhood search. In order to avoid cycles, however, a list of recently visited solutions is kept, and these solutions should not be revisited. This list is updated in each iteration by including the current solution and excluding an old one.

26.2.3.1 Hybrid Methods

Two important issues determining the performance of a heuristic are the capability to explore the search space and the possibility to exploit the experience accumulated during the search. The classical meta-heuristics generally perform well on only one or the other of these features. This suggests the construction of hybrid meta-heuristics, where elements of classical meta-heuristics are combined in order to achieve a better tradeoff between exploration and exploitation. This allows one to imagine a large number of new techniques. In order to get a more compact view of the possible types of hybrid meta-heuristics, it is appropriate to provide a classification combining a hierarchical and a flat scheme (see (Talbi 2002)). The hierarchical scheme distinguishes between low-level and high-level hybridization, and within each level we distinguish relay and co-evolutionary hybridization. Figure 26.1 illustrates this classification. Low-level hybridization replaces a component of a given meta-heuristic by a component from another meta-heuristic. In the case of high-level hybridization, the different meta-heuristics are self-contained. Relay hybridization combines different meta-heuristics in a sequence, whereas in co-evolutionary hybridization the different meta-heuristics cooperate. For example, Genetic Algorithms perform well in the exploration of the search space but are weak in the exploitation of the solutions found.
Therefore, an interesting hybridization, in the form of a low-level co-evolutionary hybrid meta-heuristic, would be to use in a Genetic Algorithm a greedy heuristic for the crossover and a Tabu Search for the mutation. The use of a greedy heuristic to generate the initial population of a Genetic Algorithm and/or Simulated Annealing and Tabu Search to improve the population obtained by the Genetic Algorithm would be an example of a high-level relay hybrid meta-heuristic.

Figure 26.1. Hybridization scheme

For the flat scheme we distinguish between homogeneous versus heterogeneous, global versus partial and specialist versus general hybridization. A homogeneous hybrid uses the same meta-heuristics and a heterogeneous hybrid combines different meta-heuristics. In global hybrids all algorithms explore the same solution space, whereas partial hybrids work in a partitioned solution space. Specialist hybrids combine meta-heuristics which solve different problems, whereas in general hybrids all algorithms solve the same problem. Although hybridization offers a road to novel optimization heuristics which might perform as well as or even better than pure heuristics, one should be aware of the large number of additional parameters one might introduce in the algorithm. Thus, the implementation and tuning of such hybrids might require additional input before achieving a satisfactory performance.

Generally speaking, heuristic optimization techniques can be classified as soft computing techniques and as computational intelligence methods: they do not provide a set of deterministic rules but allow for (and actually benefit from) imprecision and stochastic elements, and in many cases these numerical procedures simulate intelligent behavior. However, they also differ from many alternative methods that would also fall into these classes. Unlike, e.g., Artificial Neural Networks7, heuristic methods are not used for problems like forecasting or estimation, but they can be applied to improve such methods.

26.2.3.2 Implementation Issues

Meta-heuristics are typically not designed for a special type of optimization problem. Nonetheless, comparative studies show that they all have different merits, but that there is no clear “winning” method that would consistently outperform all other heuristics for all problems.
When selecting a method, several aspects should be taken into account. First, a high level of analogy between problem and heuristic allows a more straightforward implementation. E. g., Genetic Algorithms originally deal with binary strings and can therefore be applied to problems where the variables reflect this structure with some ease while redefinitions of variables and constraints might be cumbersome. Second, heuristics vary in the number of parameters, and finding appropriate values for them can be time consuming. Third, because of the differences in their 7

Artificial Neural Networks (ANNs) consist of “cells” that are connected in to a net and that try to mimic the processes in the brain by transforming the input into output. If the (collected and weighted) input of a cell is high enough, it is activated and sends an impulse to other cells – which will or will not trigger these cells to send impulses themselves. Typically trained on a set of observations or data with know inputs and outputs, ANNs represent a non-linear regression and are widely used for estimation and prediction when the actual functional relationship between inputs and outputs is either unknown or too complex to estimate. For more details, see, e. g., (Azoff 1994).


Manfred Gilli et al.

search strategies and convergence behavior, necessary CPU time as well as reliability and stability of the reported solutions vary – which confirms that repeated runs should be performed before a solution is reported. Fourth, personal experience with a method’s strengths and caveats will contribute to a successful implementation. Despite the differences between algorithms and problems, a few general recommendations can be given with regard to the implementation of optimization heuristics, based on the experience of the authors with applications of different methods to a large set of highly complex optimization problems. The first recommendation is to keep the approaches simple. An algorithm with a small number of parameters should be preferred to a more complex algorithm unless a careful performance comparison clearly indicates that the additional parameters can be used to improve the results. Second, one should avoid building in overly detailed problem-specific information. This information – which typically imposes some constraints on the search set or search direction of the algorithms – will reduce the flexibility of the search process without necessarily improving the quality of the final results. Thus, even if one has – or believes to have – some sophisticated guess about what a “good” solution will look like, one should not impose this structure on the optimization heuristic, e. g. as a starting point for a local search heuristic. Third, although it is a nuisance compared to deterministic algorithms, the randomness introduced by heuristic methods can also be exploited for inference about the quality of the results obtained. Therefore, it is recommended to rerun each algorithm several times and to analyze the distribution of results (both with regard to the objective function and the underlying variables, e. g. portfolio weights).
If these distributions are concentrated close to the optimum, the probability of having found a solution close to the global optimum is higher than for widely dispersed results with no concentration close to the best value found. Finally, the information about the distribution of results can be used to derive an optimal allocation of computational resources (see (Winker 2001) and (Winker 2005)).

26.3 Applications in Finance

26.3.1 Portfolio Optimization

26.3.1.1 The Basic Framework

Arguably one of the most prominent optimization problems in finance is portfolio selection. Generally speaking, a combination of assets has to be found that maximizes an investor’s utility. The utility is typically a function of the (expected) return, the risk, or a combination of these two aspects, and the portfolio structure might be subject to certain constraints. In the traditional mean-variance (MV) framework, the portfolio’s risk is a quadratic function of the assets’ covariances and their relative weights in the portfolio, while the portfolio’s expected return is a linear function of

26 Applications of Heuristics in Finance


the assets’ returns and weights. Introduced by (Markowitz 1952), this problem can be split into two sub-problems by first identifying the set of efficient portfolios (i. e., the portfolios with minimum risk given a certain expected return) and then selecting the one efficient portfolio that suits the investor best. If there is also a risk-free asset (see (Tobin 1958) and (Sharpe 1966)), the ideal risky portfolio is the one that maximizes the Sharpe ratio or reward-to-variability ratio, SR = (r_P – r_s)/σ_P, where r_P and r_s are the portfolio’s and the safe asset’s returns, respectively, and σ_P is the portfolio’s volatility. All investors should hold some combination of this all-risky-assets portfolio and the safe asset, where the proportion, again, depends on the investor’s utility function and attitude towards risk. In its original version, markets in a Markowitz framework are assumed to be frictionless, and the only constraints on the (relative) weights are that they add up to one and that they must not be negative. Though the second constraint prohibits a closed-form solution, this optimization problem can be approached with quadratic programming (QP). As soon as additional constraints are introduced that bring this model closer to reality, however, the assumption of frictionlessness no longer holds – and mathematical programming techniques might be inappropriate. Heuristic methods, on the other hand, can deal with these constraints and are therefore well suited for this type of problem. A first application of optimization heuristics to portfolio optimization under constraints and for non-standard objective functions was presented by (Dueck and Winker 1992). Since then, an increasing number of studies have applied heuristic methods to financial optimization problems. The following subsections address some of these problems that are highly relevant in practice, yet cannot be answered satisfactorily by traditional methods.
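The quantities of the basic framework can be written down in a few lines. The following sketch computes the linear expected return, the quadratic variance and the resulting Sharpe ratio for given weights; the function names and the two-asset figures are illustrative assumptions, not data from the chapter.

```python
import math


def portfolio_return(weights, exp_returns):
    """Expected portfolio return: linear in the weights."""
    return sum(w * r for w, r in zip(weights, exp_returns))


def portfolio_variance(weights, cov):
    """Portfolio variance: quadratic form of the weights and the covariance matrix."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


def sharpe_ratio(weights, exp_returns, cov, r_safe):
    """SR = (r_P - r_s) / sigma_P for a portfolio with the given weights."""
    sigma = math.sqrt(portfolio_variance(weights, cov))
    return (portfolio_return(weights, exp_returns) - r_safe) / sigma


# Hypothetical two-asset example (all numbers are assumptions).
example_weights = [0.5, 0.5]
example_returns = [0.08, 0.12]
example_cov = [[0.04, 0.01],
               [0.01, 0.09]]
example_sr = sharpe_ratio(example_weights, example_returns, example_cov, r_safe=0.02)
print(round(example_sr, 4))
```

Any search method, exact or heuristic, only needs such an evaluation function; what changes with additional constraints is how candidate weights may be generated and accepted.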
26.3.1.2 Transaction Costs

Quadratic Programming (QP) can be used to identify the efficient frontier in a mean-variance framework as long as there are only linear constraints on the weights (including simple limits on the weights, restricted short selling and class constraints). Unlike in the original model, however, buying and selling assets usually comes with fees and transaction costs. Transaction costs are often linked to the traded volume. Proportional costs are then charged as a percentage of the value of the trade. In practice, these rates, too, can depend on the volume, resulting in a convex cost function; here, however, we will consider only a fixed percentage rate, i. e., linear costs. If the traded volume is below a certain limit, a minimum fee might come into effect. Alternatively or in addition to proportional costs, there might be some fixed fee per trade. In particular, minimum and fixed fees make the solution space discontinuous, and QP can no longer be used to produce reliable answers; nonetheless, this problem can be approached with heuristic methods. Typically the constraints include that the (relative) weights add up to 100 percent – implying that the whole of the budget has to be invested. In the case of fixed transaction costs, it is advantageous to work with absolute quantities, stock prices and initial endowment, as these costs, too, are expressed in monetary units.
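The cost structures described above can be sketched as a single function. The parameter names and the way the components are combined (a fixed fee plus the larger of the proportional cost and the minimum fee) are assumptions made for illustration; actual fee schedules differ between brokers.

```python
def transaction_cost(trade_value, prop_rate=0.0, min_fee=0.0, fixed_fee=0.0):
    """Cost of one trade in monetary units; no trade means no cost.

    trade_value: signed value of the trade (purchases positive, sales negative)
    prop_rate:   proportional rate applied to the traded volume
    min_fee:     lower bound on the proportional component
    fixed_fee:   flat fee charged per trade
    """
    volume = abs(trade_value)
    if volume == 0:
        return 0.0
    return fixed_fee + max(prop_rate * volume, min_fee)


# The jump from zero cost to the minimum fee for an arbitrarily small trade
# is what makes the cost function, and hence the objective, discontinuous.
print(transaction_cost(0.0, prop_rate=0.01, min_fee=10.0))
print(transaction_cost(1.0, prop_rate=0.01, min_fee=10.0))
print(transaction_cost(5000.0, prop_rate=0.01, min_fee=10.0))
```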


Following the Simulated Annealing (SA) paradigm, which is identical to Threshold Accepting except that it uses a probabilistic acceptance function instead of the threshold sequence, an initial solution can be found by generating a random, yet valid solution with respect to the constraints. In particular, this means that the budget constraint is not violated and that all quantities n_i are either positive or zero, but must not be negative. Each of the subsequent iterations consists of the following steps: first, two assets i and j are randomly picked. The quantity of the first asset, n_i, is then lowered by a random amount, and the second one’s, n_j, is increased as much as possible without violating the budget constraint – given the stock prices and considering the avoided and additional costs. The next step is to compute the change in the Sharpe ratio, ΔSR, that comes with the modification of the quantities. If the Sharpe ratio has actually been increased (ΔSR > 0), the change was advantageous, and the new weights are accepted. If, on the other hand, the new Sharpe ratio is lower, the change might still be accepted with a probability of p = exp(ΔSR/T), where T represents the “temperature” in the original cooling (annealing) process: when the temperature is high, the fraction ΔSR/T is close to zero and the probability p is close to 1 (other things equal). As the temperature becomes lower, the (negative) fraction ΔSR/T goes towards minus infinity, and the probability p goes towards zero. These steps are repeated until a certain halting criterion is met, and the algorithm then reports the portfolio structure that had the highest Sharpe ratio amongst all candidate solutions.
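The iteration scheme just described can be sketched as follows. This is a simplified illustration and not the implementation used in the cited studies: it works on relative weights instead of integer quantities, ignores transaction costs, and uses a geometric cooling schedule; all parameter values and the three-asset data are assumptions.

```python
import math
import random


def sharpe(weights, mu, cov, r_safe):
    """Sharpe ratio of a portfolio with the given relative weights."""
    n = len(weights)
    ret = sum(w * m for w, m in zip(weights, mu))
    var = sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))
    return (ret - r_safe) / math.sqrt(var)


def anneal_sharpe(mu, cov, r_safe, n_iter=5000, t_start=0.05, cooling=0.999, seed=1):
    """Simulated Annealing on portfolio weights (no short sales, fully invested)."""
    rng = random.Random(seed)
    n = len(mu)
    raw = [rng.random() for _ in range(n)]
    w = [x / sum(raw) for x in raw]            # random but valid initial solution
    sr = sharpe(w, mu, cov, r_safe)
    best_w, best_sr = w[:], sr
    temp = t_start
    for _ in range(n_iter):
        i, j = rng.sample(range(n), 2)         # pick two different assets
        shift = rng.uniform(0.0, w[i])         # lower asset i by a random amount ...
        cand = w[:]
        cand[i] -= shift
        cand[j] += shift                       # ... and raise asset j accordingly
        delta = sharpe(cand, mu, cov, r_safe) - sr
        # Uphill moves are always accepted, downhill moves with p = exp(delta / T).
        if delta > 0 or rng.random() < math.exp(delta / temp):
            w, sr = cand, sr + delta
            if sr > best_sr:
                best_w, best_sr = w[:], sr
        temp *= cooling                        # lower the temperature
    return best_w, best_sr


# Hypothetical three-asset data (assumptions for illustration only).
mu = [0.08, 0.12, 0.10]
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.0625]]
best_w, best_sr = anneal_sharpe(mu, cov, r_safe=0.02)
print([round(x, 3) for x in best_w], round(best_sr, 4))
```

Replacing the probabilistic acceptance test by a comparison of ΔSR with a (negative) threshold that is gradually raised to zero would turn the same loop into a Threshold Accepting variant.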
Accepting “downhill moves” that reduce the Sharpe ratio might appear to be a step in the wrong direction – yet this is the only way to escape a local optimum: a greedy uphill search algorithm such as gradient search would directly move towards the optimum closest to the starting point. As it is rather unlikely that this starting point is close to the global optimum, the odds are that the greedy algorithm gets stuck in, and reports, an inferior local optimum. From analyzing the probabilistic acceptance function, p = exp(ΔSR/T), three basic principles become apparent: (i) other things equal, the more pronounced the impairment, the lower the chances that it is accepted; (ii) uphill moves are always preferred over downhill moves since the latter have a lower probability of acceptance; and (iii) by lowering the temperature, T, in the course of the iterations, the algorithm becomes more and more intolerant towards downhill moves; hence the “exploration phase” with a high temperature for finding a good neighborhood will eventually be replaced with a purposive search for the optimum within this neighborhood. In (Maringer 2002, 2005b) this approach has been successfully applied to data for the 30 stocks contained in the German stock index DAX. The results show that increasing the transaction costs will eventually lower the number of different assets an investor should hold in her portfolio. This is true for virtually any form of cost function and any level of initial endowment; only the structure of portfolios of investors who have a high initial endowment (one million Euros and above) and who face only fixed transaction costs is hardly affected. Even more importantly, the results show that considering the cost function while optimizing the portfolios is highly advantageous: individually optimized portfolios are highly likely to outperform an


Figure 26.2. Sharpe ratios for portfolios that have been optimized considering (“optimized”) and ignoring (“simplified”) the transaction costs. [Plot not reproduced: Sharpe ratio on the vertical axis (about -0.3 to 0.3) against proportional costs on the horizontal axis (0.0% to 5.0%) for the two series “optimized” and “simplified”.]

investment into a fund which comes with the same fees as the individual stocks, and adapting a solution that has been found under the assumption of zero costs (i. e., a solution for the basic Markowitz/Sharpe framework) yields a substantially worse reward-to-variability ratio. Figure 26.2 compares the Sharpe ratios for optimized portfolios under proportional transaction costs (with minimum costs per transaction) to those of portfolios simply replicating a portfolio structure optimized under a “perfect market without costs” assumption.

26.3.1.3 Cardinality Constraints

Theoretically, investors are supposed to hold as many different assets in their portfolios as possible, since with the correct weights each will make a marginal contribution to the diversification of risk without diminishing the overall return. Empirical studies, however, tell a different story: amongst other things, a great many privately owned portfolios consist of half a dozen different assets or even fewer, and private investors are reluctant to diversify internationally and hence give away substantial opportunities for diversification [8]. At the same time, however, it is often claimed that good diversification can be achieved already with a rather small number of different assets. Statistical evidence for this claim is usually based on randomly picking a small number of different assets and analyzing the (average) marginal contribution of an additional asset; alternatively, rules exist that rank the available assets and suggest preferring the best-ranked assets over the others. Such rules usually ignore the covariance structure between these assets and are therefore not guaranteed to identify the optimal solution in the MV framework. Imposing a cardinality constraint that limits the number of different assets in the portfolio prohibits closed form

[8] For literature surveys, see, e. g., Chapters 18–19 of (Cuthbertson and Nitzsche 2005) and Chapter 4 of (Maringer 2005b).


solutions as well as accurate numerical solutions with traditional techniques; and exhaustive search for identifying the optimal selection of n out of the N available assets usually comes with too many possible combinations. (Chang et al. 2000) apply three different heuristic methods to this search problem – Simulated Annealing, Genetic Algorithms, and Tabu Search – and find the population based Genetic Algorithm particularly useful. (Gilli and Këllezi 2002a) consider a portfolio selection problem with a cardinality constraint that also takes transaction costs and lot sizes into account. They find that Threshold Accepting (TA), too, is well suited to approach this problem. (Maringer 2005b) uses Simulated Annealing (for a Sharpe framework) and an original hybrid approach (for identifying Markowitz efficient sets) and then analyzes the effects of different cardinality constraints on the diversification of the optimal portfolios for different markets. The results suggest that “small” portfolios (with a rather low number of different assets) can be almost as well diversified as the market index – provided that the assets in these small portfolios are well chosen. In particular, it is found that heuristic optimization clearly outperforms standard selection rules used in the industry: heuristically optimized portfolios with a strict cardinality constraint have a higher Sharpe ratio with the same (or even a lower) number of assets than portfolios based on popular rules, e. g., including the assets with the highest individual Sharpe ratios. This effect is more pronounced the “smaller” the target portfolio ought to be.

26.3.1.4 Index Tracking

Roughly speaking, approaches to portfolio management can be divided into two broad categories: active and passive. Active managers aim to find a portfolio based on their own expectations of future returns and covariances; the previously presented problem of maximizing the (expected) Sharpe ratio under a cardinality constraint falls into this category.
Passive approaches, on the other hand, are based on the assumption that the market cannot be beaten in the long run and that the next best thing to holding the market portfolio is to have a portfolio that reproduces the whole of the market as closely as possible. Closeness measures are typically based on the differences between portfolio and market returns, e. g., the square root of the mean squared deviations. Preferably, the tracking portfolio is composed of a relatively small subset of the stocks in that market. In real-life problems, the size of this subset – and hence the portfolio structure – is typically affected by transaction costs and integer lot size constraints; these restrictions will reduce the number of different assets in the portfolio and will therefore have the effect of an implicit cardinality constraint [9]. However, an additional explicit cardinality constraint can also apply if, e. g., the manager wants to limit the number of assets to be monitored. (Beasley et al. 2003) approach this problem with an evolutionary population based method, while (Gilli and Këllezi 2002b) consider this problem using Threshold Accepting, which has the advantage of being easier to implement. Using the

[9] See, e. g., (Maringer 2002, 2005b).


same empirical data as well as experimenting with different constraints which exceed the capabilities of traditional numerical methods, they conclude that TA, too, is an efficient method for solving the index tracking and benchmarking problems. In their model, the investor faces transaction costs; also, positive deviations from the index (i. e., outperforming it) are rewarded while negative deviations (i. e., underperformance) are punished. Their results indicate that this problem is solvable with TA and that their reported optimal portfolios also perform well in out-of-sample tests.

26.3.1.5 Value at Risk and Expected Shortfall

In the Markowitz framework, the risk is captured either by the variance of the returns (which expresses the mean squared deviations from the expected return) or the volatility, which is simply the square root of the variance. Both variance and volatility are symmetric risk measures, i. e., they measure any deviation from the expected return – be it positive or negative. In practice, however, investors are more concerned about the risk that their portfolio value falls below a certain level. That is the reason why different measures of downside risk are considered in the asset allocation problem. For instance, an investor might try to maximize the future value of their portfolio while requiring that the probability that this future value falls below a certain level, called value-at-risk (VaR), is not greater than a certain shortfall probability. Furthermore, it would be realistic to consider an investor who cares not only about the VaR, but also about the expected shortfall (ES), i. e. the extent to which the portfolio value can fall below the VaR level. Alternative approaches attempt to bring the assumptions of the Markowitz model closer to reality by, for example, dismissing the normality hypothesis in order to account for the fat-tailedness and the asymmetry of asset returns. The uncertainty about future returns, i. e.
about future portfolio value, is generally modeled through a set of possible realizations, called scenarios. Scenarios of future outcomes can be generated relying on a statistical model, past returns or experts’ opinions. The introduction of discrete scenarios makes the objective function discontinuous, which again introduces additional complexity into the optimization problem [10]. Again, for this type of problem, a heuristic method such as the Threshold Accepting heuristic is particularly well suited, as shown in (Gilli and Këllezi 2002a) and (Gilli et al. 2006). In this paper, other risk measures, such as the Omega function (see (Keating and Shadwick 2002)) and the maximum loss, are also considered. Another application, using a population based heuristic for solving the portfolio optimization problem under VaR constraints for stock portfolios, is given by (Maringer 2005a); additional results for bond portfolios can be found in (Winker and Maringer 2005a), and a comparative study is given in (Maringer 2005b). In each of these cases, the authors find that VaR has severe shortcomings when used as the sole measure of risk: the undesirable properties of VaR [11] might lead to over-fitting and underestimating the risk even if the VaR is reliable for the individual assets or non-optimized portfolios.

[10] The minimization of the expected shortfall, also called the conditional value-at-risk (CVaR), can be solved with a linear program if no cardinality constraints are imposed; see (Rockafellar and Uryasev 2000).
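How downside risk measures are read off a set of scenarios can be shown with a small sketch. It uses one simple empirical convention (the k worst scenarios, with k the shortfall probability times the number of scenarios); other quantile conventions exist, and the function names and scenario values are assumptions for illustration.

```python
def worst_case_count(n_scenarios, shortfall_prob):
    """Number of worst scenarios covered by the shortfall probability (at least one)."""
    # The small offset guards against floating point products such as 28.999999...
    return max(int(shortfall_prob * n_scenarios + 1e-9), 1)


def value_at_risk_level(values, shortfall_prob):
    """Portfolio value that is undercut only in the remaining worst scenarios."""
    ordered = sorted(values)
    return ordered[worst_case_count(len(ordered), shortfall_prob) - 1]


def expected_shortfall_value(values, shortfall_prob):
    """Average portfolio value over the worst scenarios, i.e. at or below the VaR level."""
    ordered = sorted(values)
    k = worst_case_count(len(ordered), shortfall_prob)
    return sum(ordered[:k]) / k


# Hypothetical scenario set: 100 equally likely future portfolio values.
scenario_values = [100.0 + i for i in range(100)]
var_level = value_at_risk_level(scenario_values, 0.05)
es_value = expected_shortfall_value(scenario_values, 0.05)
print(var_level, es_value)
```

Because both measures depend only on which scenarios end up among the worst ones, small changes in the portfolio weights can reorder the scenarios and change the result abruptly, which is the source of the difficulty for gradient based methods noted above.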

26.3.2 Model Selection and Estimation

26.3.2.1 Risk Factor Selection

Explaining past stock returns and reliably predicting future performance has long been a major issue in the finance literature. Meanwhile, several theoretically sound and well-founded equilibrium models exist, arguably the most popular of these being the Capital Asset Pricing Model (CAPM). The CAPM links the risk premium of an asset linearly to the market’s risk premium by the beta coefficient: if the market’s return goes up (or down) by 1%, the asset’s return should go up (or down) by beta percent – at least in the long run. Not least because of their simplicity and elegance, the CAPM and the beta coefficient are popular in the industry, but they are also criticized for being too simplistic and for having problems with predicting and explaining short-term price movements. Multiple linear regression (MLR) models aim to reduce this problem by linking the asset’s return to a whole set of different factors that might affect the price movements. Traditionally, there are three types of factors that might be considered: firm-specific information (e. g., balance sheet information), market and industry specific data (e. g., some sector indices), and macroeconomic data (e. g., inflation or interest rates). Usually, the literature suggests bundles of about five different risk factors which are either selected based on statistical criteria or because they are assumed to be economically plausible. These bundles are selected universally and assumed to be the same for all assets. The weights for the factors (which are similar to the beta coefficient) are then computed by regressing historical data. A practical reason for using a “one fits all” set of factors is the vast number of possible alternatives that makes the factor selection problem so challenging.
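As a minimal illustration of how a factor weight is obtained by regressing historical data, the sketch below estimates the intercept and the beta coefficient of a single-factor model by ordinary least squares. Restricting the example to one factor is a simplification of the multi-factor case described above, and the return series are made-up numbers.

```python
def estimate_alpha_beta(asset_returns, market_returns):
    """Least squares fit of asset = alpha + beta * market + error."""
    n = len(asset_returns)
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov_am = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset_returns, market_returns))
    var_m = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov_am / var_m
    alpha = mean_a - beta * mean_m
    return alpha, beta


# Hypothetical data: the asset moves 1.5 times as much as the market plus a constant.
market = [0.01, -0.02, 0.03, 0.00, 0.015]
asset = [0.002 + 1.5 * m for m in market]
alpha_hat, beta_hat = estimate_alpha_beta(asset, market)
print(round(alpha_hat, 6), round(beta_hat, 6))
```

With several candidate factors the regression itself stays simple; the hard part is choosing which factors enter it, since the number of possible bundles grows combinatorially with the number of candidates.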
Identifying the best combination of factors for an individual asset by pure chance is rather unlikely, and, once more, there exists no deterministic approach that is guaranteed to find the optimum. However, heuristic optimization can be applied to solve this problem. (Maringer 2004) suggests a population based approach to identify the five (out of 103) Morgan Stanley Capital International (MSCI) indices that explain most of the variation in an asset’s return. When this methodology is applied to the stocks represented in the S&P 100, the results show that for eighty percent of the investigated cases, this individual selection allows explaining at least thirty percent of their variance; in some cases eighty percent or even more can be explained. Typically, the identified bundles contained one or several industry or sector indices that contributed most to the quality; geographic indices were mainly used to supplement this information;

[11] Unlike, e. g., volatility, VaR can be classified as an incoherent measure of risk, implying that it does not satisfy all criteria for a reliable risk measure; see, e. g., (McNeil et al. 2005).


the estimated regression parameters frequently were statistically significant, too, yet their contribution to the model’s explanatory power was lower. The results also confirmed that, by and large, five is a reasonable number of factors to include in such a model: when there are fewer factors, the explanatory power tends to be noticeably lower; more factors, however, are not very beneficial, and the factor weights are not always significantly different from zero. The results also showed that in many cases the selected bundles work well out of sample and that the results are stable.

26.3.2.2 Risk Estimation and GARCH Models

So far, we have addressed the application of heuristics to problems that can easily be identified as optimization problems. These techniques, however, can also be used in other areas that are highly relevant in finance yet are typically not seen as optimization problems. One such application is parameter estimation for financial or economic models. As pointed out previously, financial risk is often measured in terms of the variance of the returns. In traditional models, the variance is assumed to be constant. A closer look at empirical data, however, reveals that financial data often exhibit what is called volatility clustering: there are times when the risk is high, and times when it is low. If this applies, then an important aspect is left out when computing just the unconditional variance, which is assumed to be constant for the whole of the considered time window and which does not distinguish such phases of high and low risk. The conditional variance, on the other hand, takes this information into account by allowing the variance to change over time, i. e., to be heteroskedastic. In particular, the generalized autoregressive conditional heteroskedasticity (GARCH) model assumes that the current risk consists not only of a constant term, but is also linked to the magnitude of the previous risk and the previous squared deviations from the mean.
These extensions allow the model to distinguish high and low risk phases. This Nobel Prize winning model has been found very useful for estimating risk, but it also comes with a downside: as with many non-linear models, there is no simple general formula for estimating its parameters. Apart from one special case, this task has to be performed by finding parameter values that maximize a log-likelihood function. Most standard econometric and statistical software packages now allow for GARCH estimation using numerical optimization. The log-likelihood function, however, has local optima. Different programs therefore tend to report different “optimal” solutions. Since, from this perspective, GARCH estimation is an optimization problem, heuristic methods can be used as an alternative approach. (Maringer 2005) and, in a more sophisticated manner, (Winker and Maringer 2005) successfully apply Threshold Accepting to this parameter estimation problem. In either case, the parameter estimates they obtain are closer to the true values than those reported by many a specialized program. This high reliability also allows investigating the convergence behavior of GARCH models. In turn, this allows statements about the minimum number of observations or the length of the data series that is required to re-identify the parameters of the true underlying model, depending on the required precision.
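The objective function of this estimation problem can be sketched for the GARCH(1,1) case, in which the conditional variance is a constant plus weighted contributions of the previous squared deviation and the previous variance. The sketch assumes zero-mean returns, Gaussian innovations and the sample variance as starting value; the parameter names omega, alpha and beta are the usual textbook symbols and are not taken from the chapter.

```python
import math


def garch11_loglik(returns, omega, alpha, beta):
    """Gaussian log-likelihood of a zero-mean GARCH(1,1) model.

    Conditional variance: h_t = omega + alpha * r_{t-1}**2 + beta * h_{t-1}.
    Inadmissible parameter values are penalized with minus infinity.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return float("-inf")
    n = len(returns)
    h = sum(r * r for r in returns) / n        # assumed starting value: sample variance
    loglik = 0.0
    for r in returns:
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + r * r / h)
        h = omega + alpha * r * r + beta * h   # variance for the next observation
    return loglik


# Hypothetical return series (assumption for illustration only).
sample_returns = [0.1, -0.1, 0.1, -0.1]
print(garch11_loglik(sample_returns, omega=0.002, alpha=0.1, beta=0.7))
```

A heuristic such as Threshold Accepting would repeatedly perturb (omega, alpha, beta) and evaluate this function, which requires neither derivatives nor a good starting guess.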


26.3.2.3 Indirect Estimation and Agent Based Modeling

In the GARCH problem, the objective function (i. e., the log-likelihood function) is well defined and easy to compute exactly for a given set of parameters. The parameter estimation problem becomes substantially more challenging when this is no longer true and the objective function itself is to some extent random. Examples of such a situation are agent based models (ABMs), which have recently gained interest for the modeling of financial markets [12]. ABMs are non-deterministic models where individuals are assumed to follow a (more or less) simple set of rules. The collective behavior is derived by observing the interactions between agents and the resulting market outcome. It has been observed that this type of model can reflect stylized facts observed in the real world which appear difficult to explain by models at the aggregate level. In an artificial stock market (ASM), e. g., agents might base their buy and sell decisions on their past behavior, the price history, the perceived behavior of their fellow agents, etc. The resulting buy and sell orders lead to changes in the traded asset’s price, which then affect the agents’ behavior, and so on. The actual behavior of the variables modeled through an agent based approach depends on the model specification, but also to a crucial extent on the values of the model parameters governing the individual agents’ behavior. These parameters might include, e. g., the probability of changing an investment strategy or how far to look into the past when following a technical trading rule. Much akin to real markets, ASMs usually contain a certain level of noise which is necessary to produce some of the desired effects, but which also prohibits a perfect prediction of the simulation outcome. As a result, the effect of changing the agents’ parameters and re-running the ASM simulation cannot be seen immediately.
Rather, the ASM with the modified parameters will exhibit modified characteristics only on average, as a convergence result for an increasing number of repeated runs. In this light, finding the parameters for an ABM that exhibits certain characteristics can become a daunting task: not only does the (high dimensional) solution space exhibit many local optima, but the solution space itself exists only as a convergence result and is therefore not directly observable; it has to be approximated by means of Monte Carlo simulations. (Gilli and Winker 2003) used an implementation of a Threshold Accepting hybrid to approach this highly complex optimization problem in an application to a model of the foreign exchange market proposed by (Kirman 1991, 1993). Although several features of their approach can be improved, which is the object of current research activities, the results of the first implementation merit some interest as they point to the future potential of the method. If Kirman’s model is accepted as a reasonable approximation of the foreign exchange market (German Mark versus US-$), the results indicate that this market is characterized by large swings between periods of more fundamentally oriented trading rules and periods when chartist behavior becomes dominant. Future applications of the framework will not only consider different

[12] See, e. g., (Kirman 1991) and (Lux and Marchesi 2000).


models and markets, but might also include tests of different assumptions about what the relevant fundamental values are, etc.
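The idea of an objective function that is only observable through repeated simulation can be made concrete with a toy example. The simulation below is a hypothetical stand-in (a sequence of random switching decisions) and is not Kirman’s model; the statistic, the target value and the number of replications are assumptions for illustration.

```python
import random
import statistics


def toy_simulation(switch_prob, rng, n_steps=100):
    """Hypothetical simulation: share of periods in which an agent switches strategy."""
    switches = sum(1 for _ in range(n_steps) if rng.random() < switch_prob)
    return switches / n_steps


def approx_objective(switch_prob, target_share, n_reps, rng):
    """Monte Carlo approximation of the squared distance between simulated and target statistic."""
    simulated = [toy_simulation(switch_prob, rng) for _ in range(n_reps)]
    return (statistics.fmean(simulated) - target_share) ** 2


rng = random.Random(42)
near = approx_objective(0.30, target_share=0.30, n_reps=200, rng=rng)
far = approx_objective(0.60, target_share=0.30, n_reps=200, rng=rng)
print(near, far)
```

Each evaluation of the objective is itself noisy, so a search heuristic has to trade off the number of replications per candidate against the number of candidates it can afford to examine.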

26.4 Conclusion

This chapter offered a short (and by no means exhaustive) introduction to some challenging financial optimization problems. Traditionally, these problems are simplified until they become approachable by standard optimization techniques or algorithms, and these solutions are then modified such that the previously ignored constraints are taken into account again. Though it is well known that this might easily lead to suboptimal decisions (which, in finance, usually come with losses or opportunity costs), this has long been the only feasible way. Heuristics are a newer type of search and optimization technique that overcomes many of these problems. Usually based on a general purpose concept, they are very flexible to use and can therefore be applied to a much broader array of problems with all sorts of different constraints. Often following some principles found in nature, heuristics incorporate stochastic elements in the generation and/or the acceptance of new candidate solutions. These random elements allow escaping local optima which might be encountered during the search process (and in which traditional numerical methods easily get stuck). Furthermore, heuristics do not demand sophisticated guesses for an initial solution or for their modifications (which often is a prerequisite for traditional methods and which might be a considerable computational challenge). Heuristic methods are becoming increasingly popular and are an acknowledged (if not the only) means to tackle certain problems. The amount of research in this area is steadily growing, and they are gaining importance in commercial applications. The examples discussed in this chapter focus on different aspects of portfolio optimization, parameter estimation and model evaluation. In all of these cases, the use of heuristics allows a closer look at issues that evade a reliable analysis with traditional methods.

References

Althöfer I, Koschnik K-U (1991), On the Convergence of Threshold Accepting. Applied Mathematics and Optimization 24:183–195
Azoff EM (1994), Neural Network Time Series Forecasting of Financial Markets. Wiley, Chichester
Beasley JE, Meade N, Chang T-J (2003), An Evolutionary Heuristic for the Index Tracking Problem. European Journal of Operational Research 148(3):621–643
Brabazon T, O’Neill M (2006), Biologically Inspired Algorithms for Financial Modeling. Springer, New York


Chang T-J, Meade N, Beasley JE, Sharaiha YM (2000), Heuristics for Cardinality Constrained Portfolio Optimisation. Computers and Operations Research 27(13):1271–1302 Cuthbertson K, Nitzsche D (2004), Quantitative Financial Economics. John Wiley & Sons Ltd, Chichester Dueck G, Scheuer T (1990), Threshold Accepting: A General Purpose Algorithm Appearing Superior to Simulated Annealing. Journal of Computational Physics 90:161–175 Dueck G, Winker P (1992), New Concepts and Algorithms for Portfolio Choice. Applied Stochastic Models and Data Analysis 8:159–178 Fang K-T, Winker P (1997), Application of Threshold Accepting to the Evaluation of the Discrepancy of a Set of Points. SIAM Journal on Numerical Analysis 34(5):2028–2042 Fogel DB (2001), Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. 2nd edition, IEEE Press, New York, NY Gilli M, Këllezi E (2002a), Portfolio Optimization with VaR and Expected Shortfall. In: Kontoghiorghes EJ, Rustem B, Siokos S (eds), Computational Methods in Decision-Making, Economics and Finance. Kluwer Applied Optimization Series, pp. 167–183 Gilli M, Këllezi E (2002b), The Threshold Accepting Heuristic for Index Tracking. in: Pardalos P, Tsitsiringos VK (eds), Financial Engineering, E-Commerce and Supply Chain. Applied Optimization Series, Kluwer Academic Publishers, Boston, pp. 1–18 Gilli M, Këllezi E, Hysi H (2006), A Data-Driven Optimization Heuristic for Downside Risk Minimization. Journal of Risk 8(4):1–18 Gilli M, Winker P (2003), A Global Optimization Heuristic for Estimating Agent Based Models. Computational Statistics and Data Analysis 42:299–312 Glover F (1986), Future Paths for Integer Programming and Links to Artificial Intelligence. Computers and Operations Research 13:533–549 Holland JH (1975), Adaption in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI Keating C, Shadwick WF (2002), A Universal Performance Measure. 
The Finance Development Centre, London, http://faculty.fuqua.duke.edu/~charvey/Teaching/BA453_2005/Keating_A_universal_performance.pdf Kirkpatrick S, Gelatt CD, Vecchi MP (1983), Optimization by Simulated Annealing. Science 220:671–680 Kirman A (1991), Epidemics of Opinion and Speculative Bubbles in Financial Markets. In: Taylor M (ed.), Money and Financial Markets. Macmillan, New York, pp. 354–368 Kirman A (1993), Ants, Rationality, and Recruitment. Quarterly Journal of Economics 108:137–156 Lux T, Marchesi M (2000), Volatility Clustering in Financial Markets: A MicroSimulation of Interacting Agents. International Journal of Theoretical and Applied Finance 3:675–702


Maringer D (2002), Portfolioselektion bei Transaktionskosten und Ganzzahligkeitsbeschränkungen. Zeitschrift für Betriebswirtschaft 72(11):1155–1176 Maringer D (2004), Finding the Relevant Risk Factors for Asset Pricing. Computational Statistics and Data Analysis 47:339–352 Maringer D (2005a), Distribution Assumptions and Risk Constraints in Portfolio Optimization. Computational Management Science 2(2):139–153 Maringer D (2005b), Portfolio Management with Heuristic Optimization. Springer, Berlin, Heidelberg, New York Markowitz HM (1952), Portfolio Selection. The Journal of Finance 7(1):77–91 McNeil AJ, Frey R, Embrechts P (2005), Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press, Princeton, NJ Rechenberg I (1973), Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Fromman-Holzboog Verlag, Stuttgart Refenes, Apostolos-Paul (ed.) (1995), Neural Networks in the Capital Markets. John Wiley & Sons, Chichester et al. Rockafellar RT, Uryasev S (2000), Optimization of Conditional Value-at-Risk. Journal of Risk 2(3):21–41 Sharpe WF (1966), Mutual Fund Performance. Journal of Business 39(1):119–138 Taha HA. (2003), Operations Research: An Introduction. 7th edition, Prentice Hall, Upper Saddle River, NJ Talbi EG (2002), A Taxonomy of Hybrid Metaheuristics. Journal of Heuristics 8:541–564 Tobin, J (1958), Liquidity Preference as Behavior Towards Risk. Review of Economic Studies 26(1):65−86 Winker P (2001), Optimization Heuristics in Econometrics: Applications of Threshold Accepting. Wiley, Chichester Winker P (2005), The Stochastics of Threshold Accepting: Analysis of an Application to the Uniform Design Problem. Discussion paper 2005-003E, University of Erfurt Winker P, Maringer D (2005a), The Hidden Risks of Optimising Bond Portfolios under VaR. 
Deutsche Bank Research Notes 13, Frankfurt aM Winker P, Maringer D (2005b), The Convergence of Optimization Based Estimators: Theory and Application to a GARCH Model. Discussion paper 2005003E, University of Erfurt Winker P, Gilli M (2004), Applications of Optimization Heuristics to Estimation and Modelling Problems. Computational Statistics and Data Analysis 47(2):211–22

CHAPTER 27 Kernel Methods in Finance Stephan Chalup, Andreas Mitschele

27.1 Introduction

Kernel methods (Cristianini and Shawe-Taylor 2000; Herbrich 2002; Schölkopf and Smola 2002; Shawe-Taylor and Cristianini 2004) can be regarded as machine learning techniques which are “kernelised” versions of other fundamental machine learning methods. The latter include traditional methods for linear dimensionality reduction such as principal component analysis (PCA) (Jolliffe 1986), methods for linear regression, and methods for linear classification such as linear support vector machines (Cristianini and Shawe-Taylor 2000; Boser et al. 1992; Vapnik 2006b). For all these methods corresponding “kernel versions” have been developed which turn them into non-linear methods. Kernel methods are powerful tools that open the door to a large variety of complex non-linear tasks which previously were beyond the horizon of feasibility, or could not appropriately be analysed with traditional machine learning techniques. However, kernelisation brings a number of new tasks and challenges that need to be addressed. For example, for each application of a kernel method a suitable kernel and associated kernel parameters have to be selected. Also, high-dimensional non-linear data can be extremely complex and can feature counter-intuitive pitfalls (Verleysen and Francois 2005).

Some kernel methods, for example non-linear support vector machines (SVMs) (Vapnik 1998, 2000; Cristianini and Shawe-Taylor 2000; Schölkopf and Smola 2002; Burges 1998; Evgeniou et al. 2000), have become very popular and useful tools in applications. In recent years they have substantially improved former benchmarks in application areas like bioinformatics, computational linguistics, and computer vision (Cristianini and Shawe-Taylor 2000). Specific applications include genome sequence classification (Sonnenburg et al. 2005), other applications in computational biology (Schölkopf et al. 2004), time series prediction (Cao 2003), text categorization (Fu et al. 2004), handwritten digit recognition (Vapnik 2000), pedestrian detection (Kang et al. 2002; Chen et al. 2006), face recognition (Osuna et al. 1997; Li et al. 2004b; Carminati and Benois-Pineau 2005), and other applications in bioengineering, signal and image processing (Camps-Valls et al. 2007).

In machine learning applications the characteristics of the available data play a crucial role. This applies to applications in finance as well, including risk management, asset allocation, time series prediction, and financial instrument pricing. In risk management one may try to estimate the future price uncertainty of a financial instrument exposure using large historical data sets. The reliability of such an analysis depends strongly on the adequacy of the chosen risk management model and on the quality of the input data. However, financial data are generally known to be inherently noisy, non-stationary and deterministically chaotic (Cao and Tay 2001). The potential of kernel machines to deal with such complex and possibly non-linear input data is therefore of particular value. Additionally, financial data sets exhibit highly heterogeneous behavior. While market risk¹ data usually have extensive time series, especially when tick-by-tick data are considered, credit risk² data are sparse because credit events are relatively rare and the time series are short. Many well-established techniques have been developed in the past to handle the deficiencies of these data sets. While some authors have employed artificial intelligence techniques to solve these problems, traditional statistical methods such as GARCH (generalized autoregressive conditional heteroskedasticity) or PCA are more common. (Alexander 2001) gives an introductory overview of the more traditional techniques. Some developments in kernel machines are very recent and have not yet had a chance to be applied in the area of finance.
Our aim is to provide a concise overview of fundamental concepts of kernel methods that are already common in finance applications, as well as aspects of some newer developments which may be of importance for financial applications in the future. There are several excellent texts on introductory or more theoretical aspects of kernel methods which are recommended to complement and extend the present chapter. Among them are the introductory tutorial on SVMs and statistical learning theory by (Burges 1998), the books (Vapnik 1998, 2000, 2006b; Cristianini and Shawe-Taylor 2000; Shawe-Taylor and Cristianini 2004; Schölkopf and Smola 2002; Herbrich 2002; Suykens et al. 2002), as well as relevant book chapters in (Haykin 1999; Gentle et al. 2004; Bishop 2006), and a number of recent collections and overviews on associated topics such as (Müller et al. 2001; Burges 2005; Chapelle et al. 2006).

The first part of the present chapter provides some theoretical background. The next section explains the general idea of kernel machines and kernelisation. Then the three fundamental machine learning paradigms of dimensionality reduction, regression, and classification are addressed, together with associated questions of kernel and parameter selection. The chapter’s second part gives a survey of typical questions and tasks arising in finance applications and of how kernel methods have been applied to solve them. Finally, a brief overview of relevant software toolboxes is given.

¹ Market risk arises due to adverse changes in the market prices of financial instruments, e.g. stocks, bonds or currencies.
² Credit risk describes potential losses for the creditor through a debtor who is not able or willing to repay his debt in full.

27.2 Kernelisation

Kernel methods employ a non-linear feature mapping

$$\phi : X \longrightarrow H \tag{27.1}$$

from an input space $X$, for example $X = \mathbb{R}^d$, to a high-dimensional, possibly infinite-dimensional, feature space $H$. The mapping $\phi$ lifts a potentially non-linear task from $X$ to $H$, where a satisfactory solution is sought “remotely” via traditional, typically linear, tools (cf. Figure 27.1). The idea is that the feature mapping $\phi$ only appears implicitly and does not need to be determined explicitly in any of the computations. Central to this approach is a continuous and symmetric function

$$K : X \times X \longrightarrow \mathbb{R} \tag{27.2}$$

which is used to measure the similarity between inputs. Mercer’s condition (Mercer 1909; Courant and Hilbert 1953; Cristianini and Shawe-Taylor 2000; Herbrich 2002) says that if $K$ is positive semi-definite then there exists a mapping $\phi$ as in Eq. (27.1) from $X$ into some Hilbert space $H$ (that is, a complete dot product space) such that $K$ is a (Mercer) kernel, that is, it can be written as a dot product as follows:

$$K(x_i, x_j) = \phi(x_i)^T \phi(x_j) \quad \text{for } i, j \in \{1, \ldots, k\}. \tag{27.3}$$

Figure 27.1. The feature map of non-linear SVMs: A non-linear separator (thick line on the left) in input space is obtained by applying a linear maximal margin classifier in high-dimensional feature space (on the right)


Note that given a vector space $X = \mathbb{R}^d$ the function $K$ is positive semi-definite if $\sum_{i,j=1}^{k} c_i c_j K(x_i, x_j) \ge 0$ for all $k \in \mathbb{N}$, any selection of examples $x_1, \ldots, x_k \in X$, and any coefficients $c_1, \ldots, c_k \in \mathbb{R}$.

Kernelisation of any algorithm, for example for dimensionality reduction, regression, or classification, can be achieved by seeking a formulation of the algorithm in which variables only occur in the form of dot products $x_i^T x_j$. Then, after formally replacing all the $x_i \in X$ by $\phi(x_i)$, the feature vectors only occur in the form of dot products $\phi(x_i)^T \phi(x_j)$. Finally, provided the selected kernel function $K$ fulfills Mercer’s condition, the substitution of Eq. (27.3) can be applied, which leads to a kernel method. The computations of kernel methods thus only involve kernel values and do not require explicit knowledge of $\phi$. This is often called the kernel trick. The basic idea of kernel methods was already developed in the 1960s (Aizerman et al. 1964) and has been incorporated into the SVM framework since 1992 (Boser et al. 1992; Vapnik 2000, 2006a). The kernel trick was also employed to kernelise other methods such as PCA (Schölkopf et al. 1996, 1997; Schölkopf 1997).
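The kernel trick can be made concrete with a small numerical sketch. It assumes the homogeneous polynomial kernel of degree 2, $K(x, y) = (x \cdot y)^2$, which is chosen here only because its feature map can be written down explicitly; the function names `poly2_kernel` and `poly2_features` are illustrative and not taken from any library. The sketch compares the kernel value with the dot product of the explicit feature vectors and checks numerically that a Gram matrix built from the kernel has no negative eigenvalues.

```python
import numpy as np

def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: K(x, y) = (x . y)^2."""
    return float(np.dot(x, y)) ** 2

def poly2_features(x):
    """Explicit feature map phi for the kernel above (illustration only):
    all pairwise products x_i * x_j, so that phi(x) . phi(y) = (x . y)^2."""
    return np.outer(x, x).ravel()

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

# Kernel value computed in input space (no feature map needed) ...
k_implicit = poly2_kernel(x, y)
# ... and the same value computed via explicit 9-dimensional feature vectors.
k_explicit = float(np.dot(poly2_features(x), poly2_features(y)))

# Gram matrix for a few random points; positive semi-definiteness means
# that its smallest eigenvalue is non-negative (up to rounding error).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
G = np.array([[poly2_kernel(a, b) for b in X] for a in X])
min_eig = np.linalg.eigvalsh(G).min()

print(k_implicit, k_explicit, min_eig)
```

For kernels such as the Gaussian kernel the feature space is infinite-dimensional, so only the left-hand computation is available in practice; this is exactly the situation in which the substitution of Eq. (27.3) is useful.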

27.3 Dimensionality Reduction

There are numerous motivations to perform dimensionality reduction, including reduction of computational complexity, a better understanding of the data, and the avoidance of unwanted effects which can occur in high-dimensional spaces. In finance, particularly in risk management, dimensionality reduction plays a crucial role in enabling the modeling of systems which often have several hundred risk factors. Hitherto, mostly linear methods such as PCA have been used in finance applications, for example in interest rate modeling (Alexander 2001). To describe the fundamental paradigm let $x_t \in \mathbb{R}^d$, $t = 1, \ldots, n$, be data points in the high-dimensional space. The aim is to find a mapping

$$\mathbb{R}^d \longrightarrow \mathbb{R}^r, \quad x_t \mapsto y_t \tag{27.4}$$

for all $t = 1, \ldots, n$ such that $r < d$ and the new lower dimensional representation of the data in $\mathbb{R}^r$ has an equivalent topological structure and still contains all its essential information.

27.3.1 Classical Methods for Dimensionality Reduction

This section briefly describes PCA and multidimensional scaling (MDS). These two classical spectral methods for linear dimensionality reduction are the basis of kernel principal component analysis (KPCA) and Isomap. The latter two methods can be regarded as kernel methods for non-linear dimensionality reduction (Williams 2001; Ham et al. 2004). PCA and MDS are in general not appropriate for processing data samples from non-linear manifolds. This is illustrated in Figure 27.2, where the task is to reduce the embedding of the two-dimensional helical strip shown on the left from an embedding into $\mathbb{R}^3$ to an embedding into $\mathbb{R}^2$. As shown on the right-hand side of Figure 27.2, PCA and classical metric MDS, which produce the same output (Cox and Cox 2001; Xiao et al. 2006), both fail to unfold the helix correctly and instead project it onto the two-dimensional plane. The non-linear method Isomap (Tenenbaum et al. 2000), however, is able to unfold the helix and thus leads to a topologically correct embedding of the helix into $\mathbb{R}^2$.

Figure 27.2. Left: A two-dimensional helical strip is embedded in three-dimensional Euclidean space. Right top: Linear methods such as PCA or metric MDS project the data into $\mathbb{R}^2$. Right bottom: The non-linear method Isomap with neighbourhood parameter 5 is able to unfold the helix and can approximately maintain topologically correct neighbourhood relationships between the data points

27.3.1.1 Principal Component Analysis

PCA is a well-established method for linear dimensionality reduction (Hotelling 1933; Jolliffe 1986). Given a set of sample vectors $x_t \in \mathbb{R}^d$, $t = 1, \ldots, n$, the first step of PCA is to calculate the sample mean $\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t$ and the sample covariance matrix

$$C = \frac{1}{n} \sum_{t=1}^{n} (x_t - \bar{x})(x_t - \bar{x})^T. \tag{27.5}$$

As $C$ is a symmetric $(d \times d)$-matrix it is possible to compute its eigenvalues $\lambda_i \in \mathbb{R}$ and associated eigenvectors $v_i \in \mathbb{R}^d$, $i = 1, \ldots, d$, such that

$$V^T C V = \mathrm{diag}[\lambda_1, \ldots, \lambda_d] \tag{27.6}$$

where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_d$ are the eigenvalues of $C$ in descending order, $\mathrm{diag}[\lambda_1, \ldots, \lambda_d]$ is the diagonal matrix with the eigenvalues as diagonal elements, and $V$ is the matrix with the associated eigenvectors $v_i$, the principal components, as columns. The obtained eigenvectors are mutually orthonormal and define a new basis aligned with the directions of maximum variance in the data. The eigenvalues represent the projected variance of the inputs along the new axes. The number of significant eigenvalues, $r$, indicates the intrinsic dimensionality of the data set. A projection of the data into the $r$-dimensional subspace is given by $y_t = V_r^T (x_t - \bar{x})$, $t = 1, \ldots, n$, where $V_r$ is the $(d \times r)$ submatrix of $V$ containing the first $r$ eigenvectors associated with the $r$ largest eigenvalues as columns. One possible geometric interpretation is that PCA seeks the orthogonal projection of the data onto a lower dimensional linear space such that the variance of the projected data becomes maximal (Jolliffe 1986; Bishop 2006).

27.3.1.2 Multidimensional Scaling

Classical MDS (Cox and Cox 2001) can be motivated as a method which aims to find a faithful lower dimensional embedding of the given data $x_t \in \mathbb{R}^d$, $t = 1, \ldots, n$, under the constraint that pairwise Euclidean distances between data points are preserved as much as possible, that is, $\|x_i - x_j\| \approx \|y_i - y_j\|$, $i, j = 1, \ldots, n$. Under the assumption that the centroid of the data is at the origin, that is $\sum_{t=1}^{n} x_t = 0$, the following relation between the matrix of squared Euclidean distances $D = (\|x_i - x_j\|^2)_{i,j=1,\ldots,n}$ and the Gram matrix $G = (x_i^T x_j)_{i,j=1,\ldots,n}$ holds (Cox and Cox 2001):

$$G = -\frac{1}{2} H D H \tag{27.7}$$

where $H = (H_{ij})_{i,j=1,\ldots,n}$ is the centering matrix with

$$H_{ij} = \delta_{ij} - \frac{1}{n} \quad \text{and} \quad \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \ne j. \end{cases} \tag{27.8}$$

Let $X = [x_1, \ldots, x_n] \in \mathbb{R}^{d \times n}$ be the matrix with the data points as columns. Since $G = X^T X$ is symmetric there is an orthonormal basis of eigenvectors $w_1, \ldots, w_n$ such that

$$G = W \cdot \mathrm{diag}[\mu_1, \ldots, \mu_n] \cdot W^T \tag{27.9}$$

where $\mu_1 \ge \mu_2 \ge \ldots \ge \mu_n$ are the eigenvalues of $G$ in descending order and $W$ is the matrix with the associated eigenvectors $w_i$ as columns. After selecting the $r$ most significant eigenvalues a low-dimensional representation of the data is obtained by

$$x_i \mapsto y_i = (\sqrt{\mu_1}\, w_{1i}, \ldots, \sqrt{\mu_r}\, w_{ri})^T, \quad i = 1, \ldots, n. \tag{27.10}$$

An alternative motivation used in MDS is to preserve the dot products as far as possible, that is $x_i^T x_j \approx y_i^T y_j$, $i, j = 1, \ldots, n$, and to start with an eigenvalue decomposition of the Gram matrix as in Eq. (27.9). This approach works directly on the input data and not on the pairwise distances. Metric MDS can be categorised as a kernel method because it can be interpreted as a form of kernel MDS or kernel PCA, respectively (Williams 2001).
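The statement that PCA and classical metric MDS produce the same output can be checked with a short NumPy sketch. It follows Eqs. (27.5) to (27.10), with two assumptions made for convenience: the data matrix stores one sample per row (the transpose of the $X$ used in the text), and the embedding coordinates are scaled by the square roots of the eigenvalues. The function names are illustrative.

```python
import numpy as np

def pca_project(X, r):
    """PCA following Eqs. (27.5)-(27.6); X holds one sample per row (n x d)."""
    Xc = X - X.mean(axis=0)                    # subtract the sample mean
    C = Xc.T @ Xc / X.shape[0]                 # sample covariance, Eq. (27.5)
    lam, V = np.linalg.eigh(C)                 # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:r]          # indices of the r largest
    return Xc @ V[:, order], lam[order]        # y_t = V_r^T (x_t - mean)

def classical_mds(X, r):
    """Classical MDS from squared Euclidean distances, Eqs. (27.7)-(27.10)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    H = np.eye(n) - np.ones((n, n)) / n             # centering matrix, Eq. (27.8)
    G = -0.5 * H @ D @ H                            # Gram matrix, Eq. (27.7)
    mu, W = np.linalg.eigh(G)
    order = np.argsort(mu)[::-1][:r]
    mu_r = np.maximum(mu[order], 0.0)               # guard against rounding
    return W[:, order] * np.sqrt(mu_r), mu[order]   # Eq. (27.10)

# Synthetic data with clearly separated variances along the axes.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])

Y_pca, lam = pca_project(X, 2)
Y_mds, mu = classical_mds(X, 2)

# The two embeddings agree up to the sign of each coordinate axis, and the
# eigenvalues of the Gram matrix are n times those of the covariance matrix.
same_embedding = np.allclose(np.abs(Y_pca), np.abs(Y_mds), atol=1e-6)
same_spectrum = np.allclose(mu, X.shape[0] * lam)
print(same_embedding, same_spectrum)
```

Note that MDS only needs the distance matrix, not the coordinates themselves; this is the property that Isomap exploits when it replaces Euclidean by geodesic distances.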

27.3.2 Non-linear Dimensionality Reduction

Non-linear dimensionality reduction is often used synonymously with the term manifold learning. Manifolds are central objects in geometry and topology (Spivak 1979). Examples of manifolds are locally Euclidean objects such as lines, circles, spheres, tori, subsets of those objects, and their generalisations to higher dimensions. The aim of manifold learning methods is to extract low-dimensional manifolds from high-dimensional data and faithfully embed them into a lower dimensional space. The use of non-linear dimensionality reduction techniques in practical applications can come with technical and conceptual challenges. With linear dimensionality reduction techniques the outcome is always restricted because there exists essentially only one linear space in each dimension, while in the non-linear case there is a large variety of manifolds. In recent years several new methods for non-linear dimensionality reduction have been developed and many of them are, like PCA and MDS, spectral methods (Xiao et al. 2006). These include, for example, kernel principal component analysis (KPCA) (Schölkopf et al. 1998), isometric feature mapping (Isomap) (Tenenbaum et al. 2000), locally linear embedding (LLE) (Roweis and Saul 2000), maximum variance unfolding (MVU) (Weinberger and Saul 2004; Weinberger et al. 2004, 2005), and Laplacian eigenmaps (Belkin and Niyogi 2003). The different methods have distinct individual advantages, and it is a topic of current research to gain a better understanding of their robustness and of their ability to recognise and preserve the topology and geometry of the underlying manifolds.

27.3.2.1 Kernel Principal Component Analysis

The basic idea of KPCA is to kernelise PCA, that is, to map the data $x_1, \ldots, x_n$ to a higher dimensional space using an implicit non-linear feature map $\phi : \mathbb{R}^d \longrightarrow H$ and then to apply PCA (Schölkopf et al. 1997, 1998). In practice, however, it is not easy to center the mapped data in feature space for calculation of the estimated covariance matrix

$$C = \frac{1}{n} \sum_{i=1}^{n} \left( \phi(x_i) - \frac{1}{n} \sum_{t=1}^{n} \phi(x_t) \right) \left( \phi(x_i) - \frac{1}{n} \sum_{t=1}^{n} \phi(x_t) \right)^T$$

which is required for PCA. According to (Schölkopf et al. 1997, 1998; Williams 2001; Saul et al. 2006) there is an alternative solution to KPCA which utilises the Gram matrix in feature space $K = (\phi(x_i)^T \phi(x_j))_{i,j=1,\ldots,n}$. After suitable kernel substitution with a positive semi-definite kernel, $K$ is called the kernel matrix. A centered kernel matrix $K^C$ can be calculated by

$$K^C = H K H \tag{27.11}$$

where $H = (H_{ij})_{i,j=1,\ldots,n}$ is the centering matrix of MDS as in Eq. (27.8). The remainder of the method follows the approach explained for MDS in Eq. (27.9): Since $K^C$ is symmetric it can be diagonalised, $K^C = W \cdot \mathrm{diag}[\mu_1, \ldots, \mu_n] \cdot W^T$, such that the eigenvalues $\mu_i$ are in descending order on the diagonal and $W$ is the matrix with the associated eigenvectors $w_i$ as columns. Selection of the $r$ most significant eigenvalues leads to a low-dimensional representation of the data as in Eq. (27.10).

27.3.2.2 Isometric Feature Mapping

Isomap (Tenenbaum et al. 2000) is a manifold learning method which can be seen as an extension of MDS where approximate geodesic distances are used as inputs instead of pairwise Euclidean distances. In contrast to Euclidean distances, which are measured straight through the surrounding space $\mathbb{R}^d$, geodesic distances can be longer because they are measured along (shortest) arcs within the manifold using its intrinsic metric. As in Section 27.3 let $x_1, \ldots, x_n \in \mathbb{R}^d$ be given data points which are assumed to be sampled from a low-dimensional manifold $M$ which is embedded in the high-dimensional input space $\mathbb{R}^d$. A basic version of Isomap using $k$-nearest neighbour graphs can then be explained in three steps (Tenenbaum et al. 2000):

1. Select a neighbourhood parameter $k \in \mathbb{N}$ and construct a neighbourhood graph $G$ such that each sample point is connected only to its $k$ nearest neighbours. Weight existing connections between two neighbouring vertices $x_p, x_q \in G$ by their Euclidean distance in $\mathbb{R}^d$.
2. Generate a distance matrix $D = (d_{ij})_{i,j=1,\ldots,n}$ where each coefficient $d_{ij}$ is the shortest path distance in $G$ between the initially given sample points $x_i, x_j \in \mathbb{R}^d$, $i, j = 1, \ldots, n$. The shortest paths in $G$ can be calculated, for example, by Dijkstra’s algorithm (Cormen et al. 2001). The idea is that the path length $d_{ij}$ is an approximation of the geodesic distance between the points $x_i, x_j \in M$, $i, j = 1, \ldots, n$.
3. Apply metric MDS using the $d_{ij}$, $i, j = 1, \ldots, n$, as inputs.

(Ham et al. 2004) showed that Isomap can be regarded as a form of KPCA, that is, as a kernel method, if the following conditionally positive definite kernel matrix is used

$$K_{\text{isomap}} = -\frac{1}{2} H D H, \tag{27.12}$$

where $D$ is Isomap’s distance matrix of step 2 above and $H$ is the centering matrix of MDS which was defined in Eq. (27.8).
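The Gram-matrix route to KPCA described in Section 27.3.2.1 (centre the kernel matrix as in Eq. (27.11), diagonalise it, and scale the leading eigenvectors as in Eq. (27.10)) can be sketched in a few lines of NumPy. The Gaussian kernel with width `sigma` is an assumption made for the example; any positive semi-definite kernel matrix could be passed to `kernel_pca`, and both function names are illustrative.

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """Gaussian kernel matrix K_ij = exp(-||x_i - x_j||^2 / sigma^2)."""
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-D2 / sigma ** 2)

def kernel_pca(K, r):
    """Embed n points into R^r from an (n x n) kernel matrix K."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix, Eq. (27.8)
    Kc = H @ K @ H                             # centered kernel matrix, Eq. (27.11)
    mu, W = np.linalg.eigh(Kc)                 # ascending eigenvalues
    order = np.argsort(mu)[::-1][:r]           # r most significant eigenvalues
    mu_r = np.maximum(mu[order], 0.0)          # guard against rounding
    return W[:, order] * np.sqrt(mu_r), Kc     # coordinates as in Eq. (27.10)

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))                   # 40 sample points in R^3 (one per row)
Y, Kc = kernel_pca(rbf_kernel_matrix(X, sigma=2.0), r=2)
print(Y.shape)
```

With the linear kernel matrix `X @ X.T` the same function reduces to classical MDS, which is one way to see why metric MDS can be interpreted as a special case of kernel PCA.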


27.3.2.3 Example: Rotating Coin

The rotating coin data set consists of a series of 200 digital images taken of a gold coin while it was rotating. Each image had $168^2$ pixels, which means that it can be regarded as a point in a 28224-dimensional vector space. The essential underlying dynamics contained in the image sequence can be represented by a circle, that is, a 1-dimensional manifold embedded in the 2-dimensional plane. The task for the dimensionality reduction method was to recognise the underlying dynamics, extract the circle, and reduce the dimensionality from $168^2 = 28224$ down to 2 dimensions. Figure 27.3 shows the results obtained with Isomap (Tenenbaum et al. 2000) using $k = 3$ as the neighbourhood parameter. Similar results, but with stronger geometric distortions and irregularities, were obtained with KPCA and with Isomap when using $k > 3$. The traditional linear methods PCA and MDS were not able to solve this task.

Figure 27.3. Application of isomap (k = 3) to the rotating coin data. Each point represents an image. Similar images are mapped to points in close vicinity. The rotation encoded in the sequence of high-dimensional pixel arrays is represented by the resulting one-dimensional circle graph
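The three Isomap steps of Section 27.3.2.2 can be sketched as follows. Since the coin images are not available here, the sketch uses a synthetic helix in $\mathbb{R}^3$ as a stand-in for a one-dimensional manifold. Two choices are assumptions of this sketch rather than part of the original method description: shortest paths are computed with the Floyd–Warshall recursion instead of Dijkstra’s algorithm (simpler to write, but cubic in the number of points), and the geodesic distances are squared before double centering, in analogy to Eq. (27.7).

```python
import numpy as np

def isomap(X, k, r):
    """Basic Isomap sketch; X holds one sample per row. Assumes that the
    k-nearest-neighbour graph is connected."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    E = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))

    # Step 1: neighbourhood graph weighted by Euclidean distances.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(E, axis=1)[:, 1:k + 1]    # column 0 is the point itself
    for i in range(n):
        G[i, nearest[i]] = E[i, nearest[i]]
        G[nearest[i], i] = E[i, nearest[i]]        # make the graph symmetric

    # Step 2: all-pairs shortest paths (Floyd-Warshall).
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])

    # Step 3: metric MDS on the approximate geodesic distances.
    H = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * H @ (G ** 2) @ H
    mu, W = np.linalg.eigh(K)
    order = np.argsort(mu)[::-1][:r]
    return W[:, order] * np.sqrt(np.maximum(mu[order], 0.0))

# Helix with two full turns: a curve that linear projections cannot unroll.
t = np.linspace(0.0, 4.0 * np.pi, 100)
X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])

Y = isomap(X, k=3, r=1)
# If the helix is unrolled, the single embedding coordinate should be an
# (almost) linear function of the curve parameter t.
corr = abs(np.corrcoef(Y[:, 0], t)[0, 1])
print(corr)
```

The neighbourhood parameter matters: if `k` is chosen so large that points on different turns of the helix become neighbours, the graph distances short-circuit the manifold and the embedding degrades, which is consistent with the distortions reported above for $k > 3$.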


27.4 Regression

The task of linear regression is to estimate a function

$$f(x) = w^T x + b \tag{27.13}$$

by finding suitable parameters $w \in \mathbb{R}^m$ and $b \in \mathbb{R}$ such that $f(x_i) = y_i$, $i = 1, \ldots, n$, for a given set of iid data points

$$(x_1, y_1), \ldots, (x_n, y_n) \in \mathbb{R}^m \times \mathbb{R}. \tag{27.14}$$

In finance, (mostly linear) regression represents a standard tool to perform time series analysis, especially for forecasting tasks (Alexander 2001). A selection of applications of support vector machine regression in finance will be reviewed later in this chapter. Geometrically, linear regression corresponds to finding an offset and a direction of the line that best fits a set of given data points. A classical solution method for the task of line fitting is least squares optimisation, which minimises the square loss $\sum_{i=1}^{n} (y_i - (w^T x_i + b))^2$ over all $(w, b) \in \mathbb{R}^m \times \mathbb{R}$. SVMs for linear regression are a more recent method which can be used to solve this task by minimising the empirical risk

$$\frac{1}{n} \sum_{i=1}^{n} \|y_i - f(x_i)\|_\varepsilon \tag{27.15}$$

where

$$\|z\|_\varepsilon = \begin{cases} 0, & \text{if } \|z\| \le \varepsilon \\ \|z\| - \varepsilon, & \text{otherwise,} \end{cases} \tag{27.16}$$

is Vapnik’s $\varepsilon$-insensitive loss function (Vapnik 1998). This can be formulated as a constrained optimisation task as follows:

$$
\begin{aligned}
\text{Minimise} \quad & \frac{1}{2} w^T w + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \\
\text{over all} \quad & w \in \mathbb{R}^m,\ b \in \mathbb{R} \text{ and } \xi_i, \xi_i^* \ge 0,\ i = 1, \ldots, n, \\
\text{subject to} \quad & y_i - (w^T x_i + b) \le \varepsilon + \xi_i, \quad i = 1, \ldots, n, \\
& (w^T x_i + b) - y_i \le \varepsilon + \xi_i^*, \quad i = 1, \ldots, n.
\end{aligned}
\tag{27.17}
$$

The parameters $\varepsilon > 0$ and $C \ge 0$ are to be selected separately by the user. $C$ regulates the tolerance level provided by the slack variables $\xi_i, \xi_i^* \ge 0$. If all data points lie within the $\varepsilon$-tube of Eq. (27.16) no slack variables are required.
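The role of the $\varepsilon$-tube and the slack variables can be illustrated with a small pure-Python sketch. It evaluates the loss of Eq. (27.16), the empirical risk of Eq. (27.15), and the smallest slack values that satisfy the constraints of Eq. (27.17) for a fixed linear model. The data points and the model parameters are made up for illustration; no optimisation is performed.

```python
def eps_insensitive(z, eps):
    """Vapnik's epsilon-insensitive loss of Eq. (27.16) for a scalar residual z."""
    return max(0.0, abs(z) - eps)

def linear_model(w, b, x):
    """f(x) = w^T x + b as in Eq. (27.13)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def empirical_risk(w, b, xs, ys, eps):
    """Empirical risk of Eq. (27.15)."""
    losses = [eps_insensitive(y - linear_model(w, b, x), eps)
              for x, y in zip(xs, ys)]
    return sum(losses) / len(losses)

def slack_variables(w, b, xs, ys, eps):
    """Smallest (xi_i, xi_i^*) that satisfy the constraints of Eq. (27.17)."""
    xi, xi_star = [], []
    for x, y in zip(xs, ys):
        f = linear_model(w, b, x)
        xi.append(max(0.0, y - f - eps))        # point lies above the tube
        xi_star.append(max(0.0, f - y - eps))   # point lies below the tube
    return xi, xi_star

# Hypothetical one-dimensional data and a fixed candidate model f(x) = x.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0.1, 0.9, 2.5, 3.0]
w, b, eps = [1.0], 0.0, 0.2

risk = empirical_risk(w, b, xs, ys, eps)
xi, xi_star = slack_variables(w, b, xs, ys, eps)
print(risk, xi, xi_star)
```

Only the third point lies outside the tube (its residual is 0.5), so it is the only one with a non-zero slack variable. Residuals smaller than $\varepsilon$ do not contribute to the risk at all, which is what makes the loss “insensitive”.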

Through application of the method of Lagrange multipliers the dual formulation of Eq. (27.17) can be obtained:

$$
\begin{aligned}
\text{Maximise} \quad & -\frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, x_i^T x_j - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) \\
\text{over all} \quad & \text{Lagrange multipliers } \alpha_i, \alpha_i^* \in [0, C] \\
\text{subject to} \quad & \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0.
\end{aligned}
\tag{27.18}
$$

The SVM in dual space is

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, x_i^T x + b \tag{27.19}$$

where $\alpha_i, \alpha_i^*$ are obtained by solving Eq. (27.18) and $b$ follows from the associated Karush–Kuhn–Tucker (KKT) conditions of optimality. Through kernelisation the above procedure can be extended to non-linear function estimation. After substitution of $x$ by $\phi(x)$ the function to be estimated in Eq. (27.13) becomes

$$f(x) = w^T \phi(x) + b. \tag{27.20}$$

The primal optimisation task is the same as described in Eq. (27.17) except that the $x_i$ are formally replaced by $\phi(x_i)$. Application of Lagrangian optimisation leads to the dual formulation:

$$
\begin{aligned}
\text{Maximise} \quad & -\frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, K(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) \\
\text{over all} \quad & \text{Lagrange multipliers } \alpha_i, \alpha_i^* \\
\text{subject to} \quad & \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0, \\
& 0 \le \alpha_i \le C, \quad i = 1, \ldots, n, \\
& 0 \le \alpha_i^* \le C, \quad i = 1, \ldots, n.
\end{aligned}
\tag{27.21}
$$

Finally the SVM in dual space is

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, K(x_i, x) + b \tag{27.22}$$


where $\alpha_i, \alpha_i^*$ are obtained by solving Eq. (27.21) and $b$ follows from the associated conditions of optimality. This is the same as Eqs. (27.18) and (27.19) except that the dot products $x_i^T x_j$ in Eqs. (27.18) and (27.19) are formally replaced by the associated kernel values $K(x_i, x_j)$. The two free parameters $\varepsilon$ and $C$ in Eq. (27.21) control the VC-dimension (Vapnik 2000) of $f(x)$ and have to be selected separately, for example, through cross-validation. It turns out that the set of data points $x_i$ for which $(\alpha_i - \alpha_i^*) \ne 0$ is sparse. Its elements are called the support vectors of the machine in Eq. (27.22). More details about this method are available in (Vapnik 1998, 2000, 2006b; Cristianini and Shawe-Taylor 2000; Mika et al. 2005; Shawe-Taylor and Cristianini 2004; Suykens et al. 2002; Schölkopf and Smola 2002).
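Once the multipliers are known, evaluating the machine of Eq. (27.22) only requires kernel values between the new input and the support vectors. The following pure-Python sketch shows this evaluation step only. The multipliers, the bias and the Gaussian kernel width are hypothetical values chosen by hand so that the equality constraint of Eq. (27.21) holds; they are not the output of a solver, and solving the quadratic programme itself is outside the scope of the sketch.

```python
import math

def gaussian_kernel(x, y, sigma):
    """K(x, y) = exp(-||x - y||^2 / sigma^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / sigma ** 2)

def svr_predict(x, train_x, alpha, alpha_star, b, kernel):
    """Evaluate Eq. (27.22): f(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b."""
    return sum((a - a_s) * kernel(xi, x)
               for xi, a, a_s in zip(train_x, alpha, alpha_star)) + b

def support_vectors(train_x, alpha, alpha_star, tol=1e-12):
    """Training points whose coefficient (alpha_i - alpha_i^*) is non-zero."""
    return [xi for xi, a, a_s in zip(train_x, alpha, alpha_star)
            if abs(a - a_s) > tol]

# Hypothetical training inputs and multipliers (C = 1 is assumed, so every
# multiplier lies in [0, C], and sum_i (alpha_i - alpha_i^*) = 0).
train_x = [[0.0], [1.0], [2.0], [3.0]]
alpha = [0.5, 0.0, 0.0, 0.2]
alpha_star = [0.0, 0.0, 0.7, 0.0]
b = 0.1

def kernel(u, v):
    return gaussian_kernel(u, v, sigma=1.0)

f0 = svr_predict([0.0], train_x, alpha, alpha_star, b, kernel)
sv = support_vectors(train_x, alpha, alpha_star)
print(f0, sv)
```

The point `[1.0]` has both multipliers equal to zero and therefore drops out of the sum. In a trained machine this sparsity is what keeps prediction cheap even for large training sets.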

27.5 Classification

Classification can be seen as a special case of the regression task of Eqs. (27.13)–(27.14) where the function values are integers, for example $\{-1, 1\}$ in the case of binary classification. Classification tasks are the second dominant application within the finance area. Kernel-based classification has been used extensively in credit risk management, for instance to differentiate between good and bad clients. The perceptron (Rosenblatt 1958; Mitchell 1997) is one of the most basic methods for binary classification. It is often interpreted as an abstract neuron model

$$r = \varphi(w \cdot x + b) \tag{27.23}$$

where $r$ is the associated return, $b \in \mathbb{R}$ is the bias, $w \cdot x = \sum_j w_j x_j$ is the dot product between a weight vector $w \in \mathbb{R}^d$ and an input vector $x \in \mathbb{R}^d$, and $\varphi$ is an activation function, for example, the signum function

$$\varphi : \mathbb{R} \longrightarrow \{-1, 1\}, \quad y \mapsto \varphi(y) = \begin{cases} +1, & \text{if } 0 < y, \\ -1, & \text{if } y \le 0. \end{cases}$$

Geometrically the perceptron defines a hyperplane, that is, the set of points which is perpendicular to the weight vector $w$ and has distance $a = -b / \|w\|$ from the origin:

$$\{x;\ w \cdot x = -b\} = \left\{x;\ \frac{w}{\|w\|} \cdot x - a = 0\right\} \tag{27.24}$$
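The neuron model of Eq. (27.23) and a basic form of the perceptron training rule (Mitchell 1997) can be sketched in pure Python. The sketch assumes the convention that the signum function returns $+1$ for positive arguments and $-1$ otherwise. The learning rate `eta`, the epoch limit and the toy data are illustrative choices, and the update rule shown is the common textbook variant that changes the weights only for misclassified points.

```python
def signum(y):
    """Activation function: +1 for positive arguments, -1 otherwise."""
    return 1 if y > 0 else -1

def predict(w, b, x):
    """Perceptron output r = signum(w . x + b), cf. Eq. (27.23)."""
    return signum(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_perceptron(xs, labels, epochs=100, eta=1.0):
    """Move the hyperplane towards every misclassified training point until
    all points are classified correctly or the epoch limit is reached."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(xs, labels):
            if predict(w, b, x) != t:
                w = [wj + eta * t * xj for wj, xj in zip(w, x)]
                b += eta * t
                errors += 1
        if errors == 0:            # converged: every point is on the right side
            break
    return w, b

# Two linearly separable point classes in the plane (made-up data).
xs = [[2, 1], [3, 2], [2, 3], [-1, -1], [-2, -1], [-1, -3]]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(xs, labels)
predictions = [predict(w, b, x) for x in xs]
print(w, b, predictions)
```

For linearly separable data the rule terminates after finitely many updates, but the resulting hyperplane depends on the order of the training points and is in general not the maximum margin separator.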

By dividing the input space into two halves a hyperplane can be employed for linear binary classification of input patterns. The perceptron training rule (Mitchell 1997) is an iterative algorithm which adjusts the weight vector and bias to place the associated hyperplane in some sensible position between two classes of given training points. In contrast, basic support vector machines (Cristianini and Shawe-Taylor 2000) treat linear classification as a convex optimisation task which in primal weight space represents a standard constrained optimisation problem: The margin of separation between the two point classes should be maximised under the constraint that the patterns are classified correctly, that is, in the case of separable patterns

$$y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \ldots, n \tag{27.25}$$

and in the case of non-separable patterns

$$y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad i = 1, \ldots, n \tag{27.26}$$

where the slack variables $\xi_i \ge 0$ can take exceptions and outliers into account (Cortes and Vapnik 1995). After substituting the $x_i$ by $\phi(x_i)$ the constraints for the non-linear case become

$$y_i (w \cdot \phi(x_i) + b) \ge 1 - \xi_i, \quad i = 1, \ldots, n. \tag{27.27}$$
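The constraints of Eqs. (27.25) and (27.26) can be checked for a given hyperplane with a few lines of pure Python. The sketch computes the smallest slack variables that make Eq. (27.26) hold and the width $2 / \|w\|$ of the band between the hyperplanes $w \cdot x + b = \pm 1$, which is the standard expression for the margin in this scaling. The hyperplane and the data are made up for illustration; no optimisation is performed.

```python
import math

def decision_value(w, b, x):
    """w . x + b for a single input pattern x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def slack_variables(w, b, xs, labels):
    """Smallest xi_i >= 0 with y_i (w . x_i + b) >= 1 - xi_i, Eq. (27.26)."""
    return [max(0.0, 1.0 - t * decision_value(w, b, x))
            for x, t in zip(xs, labels)]

def margin_width(w):
    """Distance 2 / ||w|| between the hyperplanes w . x + b = +1 and -1."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))

# Candidate hyperplane x_1 = 0 and six labelled points (hypothetical values).
w, b = [1.0, 0.0], 0.0
xs = [[2.0, 0.0], [1.0, 1.0], [0.5, 0.0], [-1.0, 0.0], [-3.0, 1.0], [0.2, 0.0]]
labels = [1, 1, 1, -1, -1, -1]

xi = slack_variables(w, b, xs, labels)
width = margin_width(w)
print(xi, width)
```

A slack value of zero means that the point satisfies Eq. (27.25); a value between 0 and 1 means that the point lies inside the margin but on the correct side; a value above 1, as for the last point, indicates a misclassified pattern.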

As for regression, this constrained optimisation task can be formulated in primal and dual form using the method of Lagrangian optimisation. In the dual form feature vectors only occur as parts of dot products $\phi(x_i)^T \phi(x_j)$. Using a positive definite kernel function the latter can be replaced by kernel values according to Eq. (27.3), and a non-linear SVM is obtained by kernelising the concept of the linear maximum margin classifier. It can be shown (Mangasarian 1999; Pedroso and Murata 2001; Ikeda and Murata 2005) that if the distances between the feature vectors and the separating hyperplane are evaluated using an $L_p$-norm, that is

$$\|w\|_p := \begin{cases} \left( \sum_{i=1}^{n} |w_i|^p \right)^{1/p}, & 1 \le p < \infty, \\ \max_{1 \le i \le n} |w_i|, & p = \infty, \end{cases}$$

the problem of margin optimisation becomes a $p$-th order programming problem, for example:

p = 2: Quadratic programming is used by classical SVMs.
p = 1: Linear programming is used by least squares support vector machines (LS-SVMs) (Suykens and Vandewalle 1999a,b; Suykens et al. 2002).

SVMs (Vapnik 2000; Cristianini and Shawe-Taylor 2000; Suykens et al. 2002; Schölkopf and Smola 2002) are among the most commonly used kernel machines. They can be regarded as a newer and, in practical applications, often better alternative to artificial neural networks for function approximation to solve classification or regression tasks. In contrast to traditional artificial neural networks no lengthy iterative training procedure is required. In most cases SVM “training” is faster and does not get stuck in local minima because it is achieved by a direct calculation for linearly constrained optimisation.


We have described the basic form of SVMs for regression and for binary classification. Other important variations and extensions of SVMs are ν-SVMs (Schölkopf and Smola 2002), LS-SVMs (Suykens et al. 2002), SVMs for clustering (Tax and Duin 1999; Ben-Hur et al. 2001; Yang et al. 2002b), and SVMs for multi-class classification (Rifkin and Klautau 2004). The SVM algorithms used before 1999 were typically slower than artificial neural networks with similar generalisation performance (Haykin 1999, p. 345). Significant speed improvements were achieved, for example, through the SMO algorithm by (Platt 199). Since then a variety of techniques have been investigated and evaluated to support the application of SVMs to large data sets (Huang et al. 2006), data sets with small training samples (Hertz et al. 2006), and unbalanced data sets (Raskutti and Kowalczyk 2004). The latter topics are still the subject of current research.

27.6 Kernels and Parameter Selection

The selection of suitable kernels and associated kernel parameters is an important task when using kernel methods for regression, classification, and dimensionality reduction. There are several possibilities to choose the kernel function $K : X \times X \to \mathbb{R}$ (Cristianini and Shawe-Taylor 2000). If input space and feature space are identified, that is $\phi(x) = x$, this results in the linear kernel, which is the dot product of the two input vectors

$$K(x, y) = x \cdot y.$$

Examples of other common kernels are the polynomial kernel

$$K(x, y) = (1 + x \cdot y)^p$$

and the Gaussian or radial basis function (RBF) kernel

$$K(x, y) = e^{-\frac{d(x, y)^2}{\sigma^2}}.$$

The p-Gaussian kernel (Francois et al. 2005)

$$K(x, y) = e^{-\frac{d(x, y)^p}{\sigma^p}}$$

is a generalisation of the Gaussian kernel. The parameter p controls the smallest distance corresponding to the decreasing part of the kernel and can help to alleviate some of the issues that standard Gaussian kernels can encounter in high-dimensional spaces. For example, it can be shown that if data is uniformly distributed within the unit ball, almost all points will have a norm close to 1 in high dimensions (Verleysen and Francois 2005). Consequently, in a high-dimensional Gaussian distribution most data will be contained in the tails of the distribution. This is in contrast to low dimensions where the Gaussian distribution is local and most data is close to zero.
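The kernels listed above translate almost literally into code. The following Python sketch is for illustration only: it assumes that $d(x, y)$ is the Euclidean distance (the text does not fix the metric here), and all function names and default parameter values are hypothetical.

```python
import math


def dot(x, y):
    """Dot product of two equally long sequences."""
    return sum(a * b for a, b in zip(x, y))


def euclidean(x, y):
    """Euclidean distance, used here as the assumed metric d(x, y)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linear_kernel(x, y):
    # K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, p=2):
    # K(x, y) = (1 + x . y)^p
    return (1.0 + dot(x, y)) ** p


def gaussian_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-d(x, y)^2 / sigma^2)
    return math.exp(-euclidean(x, y) ** 2 / sigma ** 2)


def p_gaussian_kernel(x, y, p=2.0, sigma=1.0):
    # K(x, y) = exp(-d(x, y)^p / sigma^p); p = 2 recovers the Gaussian kernel
    return math.exp(-euclidean(x, y) ** p / sigma ** p)
```

With p = 2 the p-Gaussian kernel coincides with the Gaussian kernel, which is a quick sanity check when experimenting with other values of p.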

27 Kernel Methods in Finance


In general, a kernel is only required to be positive semi-definite and should represent a plausible similarity measure between pairs of input examples (cf. Section 27.2). These rather loose prerequisites of the Mercer theorem allow for task-specific kernel design. Specialised string kernels have been developed and applied in biosequence analysis and text categorisation (Cristianini and Shawe-Taylor 2000). Some recent developments in SVMs make it possible to learn task-specific kernels or to use many kernels in parallel (Lanckriet et al. 2004; Sonnenburg et al. 2006). For the dimensionality reduction paradigm a variation of KPCA, called Maximum Variance Unfolding (MVU), was proposed where the kernel matrices are optimised by semi-definite programming to preserve the geometry of the input data up to isometry (Weinberger et al. 2004). Experimental evaluation revealed that the positive semi-definite kernel matrices constructed by MVU have clear advantages over Gaussian, polynomial, or linear kernels (Weinberger et al. 2004). This shows that kernels which are successful for large margin classification are not necessarily suitable for dimensionality reduction and vice versa (Saul et al. 2006). The reason for this is that high-dimensional spaces have a few, sometimes counter-intuitive, properties which can have an essential impact on kernel choice or design (Verleysen and Francois 2005).

In addition to kernel selection, many kernel methods require specific parameters to be selected or adapted. These include the parameter C of C-SVMs for classification, the parameters C and ε of SVMs for regression, the neighbourhood parameter k of Isomap, and specific kernel parameters such as the width σ in Gaussian kernels.

Several of the kernel methods have their roots in statistical learning theory. For example, learning with SVMs and neural networks is based on Valiant’s principle of PAC learning (Valiant 1984).
According to (Vapnik 2006a) the bigger theoretical framework could be described as empirical inference science. SVMs approximately implement Vapnik’s method of structural risk minimisation, which defines a “trade-off between the quality of the approximation of the given data and the complexity of the approximating function” (Vapnik 2000): the generalisation error is bounded by the sum of the training error and a function of the Vapnik-Chervonenkis (VC) dimension, which is a measure of the capacity, or flexibility, of a function class. More specifically, (Vapnik 1998, 2000) proved that hyperplanes with $\|w\| < l$ have a VC dimension which is bounded by $\min(\mathrm{int}[R^2 l^2], n) + 1$, where $R$ is the radius of the smallest ball containing all feature vectors. SVM parameters can be selected to minimise this bound in order to achieve good generalisation ability of the model. In practice, however, these basic bounds, although predictive (Burges 1998), are typically very conservative, and refinements or alternative methods, such as cross-validation and its variants, for example n-fold cross-validation (Mitchell 1997; Stone 1974), are often preferred. For binary classifiers an alternative method to evaluate the performance is receiver operating characteristic (ROC) curve analysis (Fawcett 2004).
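Both ideas in this paragraph, the VC bound and n-fold cross-validation, are simple to compute. The Python sketch below is illustrative only. It assumes that $n$ in the bound denotes the dimension of the feature space, which the text does not state explicitly, and it splits the data into contiguous folds without shuffling; all function names are hypothetical.

```python
def vc_dimension_bound(R, l, n):
    """Evaluate min(int[R^2 l^2], n) + 1 for radius R, norm bound l
    and (assumed) feature space dimension n."""
    return min(int(R * R * l * l), n) + 1


def k_fold_indices(num_samples, k):
    """Split the indices 0..num_samples-1 into k contiguous folds
    whose sizes differ by at most one."""
    folds = []
    start = 0
    for i in range(k):
        size = num_samples // k + (1 if i < num_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validation_splits(num_samples, k):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = k_fold_indices(num_samples, k)
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

In a parameter selection run, a model would be trained on each `train` index set and scored on the matching `test` set, and the parameter values with the lowest average error would be kept. Shuffling the data before splitting is usually advisable for data that is not time-ordered.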


27.7 Survey of Applications in Finance

The aim of this survey is to provide a structured overview of successful or promising kernel machine applications in the area of finance. While the survey does not claim completeness, it tries to highlight some of the main results. Most of the analyses performed so far focus on risk management. In further areas, such as financial instrument pricing (e.g. option pricing) or asset allocation, there have hardly been any applications yet. The survey has been divided into applications within the market risk context, which primarily concern time series forecasting, and within credit risk, which mainly involve rating/scoring schemes and bankruptcy prediction. Not least through the new capital regulations of Basel II (Basel Committee on Banking Supervision 2006), which have been imposed on banks since the beginning of 2007 with a one-year transition period, risk management has taken on an even more prominent role in finance.

27.7.1 Credit Risk Management

The area of credit risk management has rapidly gained importance in recent years, particularly boosted through the new Basel II framework (Basel Committee on Banking Supervision 2006). The first Capital Accord, called Basel I, which was published in 1988, already required banks to build buffers for possible losses through defaults of their clients. While these former regulations were rather undifferentiated, the new guidelines demand a considerably finer differentiation. In this challenging context advanced statistical methods are frequently used by bank practitioners for the analysis of the newly built credit databases. Different machine learning approaches, including neural networks and kernel machines, have only recently been introduced to the area (Thomas et al. 2005).

Credit risk management represents a promising application area for kernel methods. SVMs have been successfully employed to derive bond ratings, to help banks classify their loan base into “good” and “bad” clients, and to detect risks of business failure. Even though banks have put great efforts into extending their available credit data, actual databases are not very large yet. This is due to the relatively rare nature of some events, for example defaults.

In a detailed research overview concerning bond rating estimation (Huang et al. 2004) showed that artificial intelligence (AI) methods, including rule-based expert systems, case-based reasoning and machine learning, delivered very good results in comparison to conventional statistical methods like multiple linear regression or multiple discriminant analysis. The test set prediction accuracy of earlier AI investigations had ranged from around 50% to 88% for the classification problem. In their empirical study they compared back-propagation neural networks to SVMs for bond rating analysis using financial ratios and ratings from Taiwan (1998−2002) and the United States (1991−2000). In general they found
that AI methods performed better than classical approaches with SVMs slightly ahead concerning the results. (Chen and Shih 2006) used a number of new input variables, including stock market information, financial support by the government and financial support by major shareholders, for their multi-class rating system, and also compared different SVM approaches with a back-propagation neural network. Using Taiwanese bank rating data with their SVM models they were able to further improve results with an accuracy rate of 89% to 100% in the training set and 73% to 85% in the test set, substantially outrivalling the benchmark NN approach. Another study on bond rating estimation was performed by (Cao et al. 2006) using multi-class classification with “one-against-all”, “one-against-one” and directed acyclic graph SVM (DAGSVM) approaches. All the SVM approaches, especially the latter one, significantly outperformed the commonly used traditional benchmarks (back-propagation neural networks, logistic regression and ordered probit regression) concerning classification accuracy. Through an additional sensitivity analysis of feature importance (Cao et al. 2006) were able to further enhance the SVM generalisation ability. While rating analysis is mainly relevant for larger companies who can afford to undergo the cost-intensive rating process, credit scoring plays a corresponding role within credit assessment for smaller companies and retail banking customers. (Baesens et al. 2003) provide an extensive study on state-of-the-art classification algorithms and their performance within credit scoring. Apart from common algorithms (for example logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees) they also considered kernel-based methods, namely SVMs and least-squares SVMs (LS-SVMs). They used eight real-life credit scoring data sets which they obtained from two banks and from publicly available sources. 
According to their chosen performance measures (percentage of correctly classified cases, area under the receiver operating characteristic curve) LS-SVMs and neural network classifiers achieved the best results. However, in their study linear classifiers also yielded good outcomes, indicating that the analysed data sets exhibited only weak non-linear behavior. In turn, the authors noted that even small improvements in scoring accuracy can lead to significant savings given the huge volume of banks’ credit business.

(Schebesch and Stecking 2005a) successfully employed SVMs to divide a set of labelled credit applicants into subsets of typical and critical patterns. While class labels for typical patterns were relatively easy to predict even using standard linear classification methods, they stated that the more interesting critical patterns include less trivial training examples. Additionally they suggested using the SVM results as input for linear discriminant analysis in a hybrid approach that would improve generalisation ability. In another contribution (Stecking and Schebesch 2006) outlined how appropriate kernels for the problem class could be chosen and compared. Their experiments with the relatively unknown Coulomb kernel $(1 + \|x_i - x_j\|^2 / E)^{-\delta}$ (Hochreiter et al. 2003), which represents a localised kernel function similar to RBF kernels, indicated better classification performance, based on a lower expected out-of-sample error, than linear, polynomial, sigmoid, and RBF kernels.
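The Coulomb kernel mentioned above can be written down directly from the expression given in the text, $(1 + \|x_i - x_j\|^2 / E)^{-\delta}$. The Python sketch below is only an illustration of that formula: the function name and the default values for $E$ and $\delta$ are hypothetical and are not taken from the cited studies.

```python
def coulomb_kernel(x_i, x_j, E=1.0, delta=1.0):
    """Evaluate (1 + ||x_i - x_j||^2 / E)^(-delta) for two input vectors.

    E and delta are kernel parameters; the defaults are placeholders.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return (1.0 + squared_distance / E) ** (-delta)
```

Like the RBF kernel, this function equals 1 for identical inputs and decays as the inputs move apart, which is the localised behaviour the text refers to.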


As in credit scoring it is usually not possible to distinguish between absolutely “good” and “bad” debtors, (Wang et al. 2005) proposed fuzzy SVMs to evaluate credit risk. In their approach each customer was considered to belong both to the positive and to the negative class via a membership concept. Instances that were identified as outliers were assigned a low membership for one class and a higher membership for the opposite class. As each instance contributed two errors to the total error term they called their new hybrid approach bilateral-weighted fuzzy SVM. In their empirical study they analysed three different data sets: 60 UK corporations (30 failed and 30 non-failed with 12 characteristic variables each), 653 Japanese credit card applications (357 granted and 296 refused with 15 attributes each) and 1225 applicants (323 bad and 902 good with 12 variables each). Overall they found that their newly suggested fuzzy SVM almost consistently outperformed numerous comparable methods (linear/logit regression, neural network, standard and fuzzy SVM with varying kernels). Additionally it provided better generalisation ability than former fuzzy SVM approaches as the negative impact of outliers could be successfully reduced. They noted, however, that computational complexity increased considerably in their approach and that membership generation had to be considered carefully to avoid distortions. Other applications of SVMs in the area of credit scoring include (van Gestel et al. 2003a; Sánchez et al. 2004; Li et al. 2004a; Schebesch and Stecking 2005b; Stecking and Schebesch 2005; Lai et al. 2006b).

Besides the assessment of creditworthiness, that is rating and scoring respectively, different authors have used SVM approaches to predict bankruptcy of companies. It has to be noted that this is also a very important task within the new Basel II guidelines (Basel Committee on Banking Supervision 2006). (Härdle et al.
2005) used an SVM-based approach to predict bankruptcy for 42 US companies that had filed for Chapter 11 of the US Bankruptcy Code in 2001−2002. Based on information from the annual reports they computed a number of financial ratios concerning profit measures, for instance EBIT/TA (earnings before interest and taxes/total assets), leverage ratios, liquidity ratios and turnover ratios, for a total of 84 companies, that is, 42 companies that went bankrupt and 42 other companies with similar characteristics that survived. Discriminant analysis suggested that the most significant predictors belonged to profit and leverage ratios. Through their analysis they found that SVMs were able to classify successful companies into one certain cluster, meaning that they had to have certain characteristics in common, while the failing companies were located outside this cluster. They also showed that SVMs provided superior classification information concerning defaulted companies compared to the common discriminant analysis approach. As their data set was relatively small the linear classifier delivered the best classification results. They noted, however, that for larger data sets non-linear classifiers would probably be better suited.

In their study (Shin et al. 2005) showed that SVMs outperformed back-propagation neural networks in the problem of corporate bankruptcy prediction. In particular they claimed that SVMs could handle smaller sample sizes, which often occur within the credit risk context, more adequately. While they did not thoroughly analyse the optimality of the kernel parameters, (Min and Lee 2005) performed a detailed analysis in their contribution. First of all they compared SVM performance to the common benchmarks: multiple discriminant analysis, logistic regression and three-layer fully connected back-propagation neural networks. Their data sample of 1888 firms, half of them bankrupt and half non-bankrupt, was quite extensive in terms of credit risk. Inter alia they performed a principal component analysis and further steps to discern important features and ran an extensive grid search to determine optimal parameters for the SVM kernel. They found that SVMs outperformed all benchmark methods and consistently achieved similar or better performance than the neural network approach. (Min and Lee 2005) also suggested that SVMs tend to handle smaller sample sizes quite well while generally keeping their generalisation ability through the use of the structural risk minimisation principle.

In another hybridised approach (Min et al. 2006) integrated genetic algorithms (GAs) with SVMs to predict bankruptcy for a data set of 614 Korean companies, half of which filed for bankruptcy between 1999 and 2002. The GA was used to optimise the feature subset and the SVM parameters. For the feature subset selection they used the so-called wrapper approach, which trained the classifier with a certain subset and subsequently evaluated the corresponding classification error using a validation set. For the SVM’s RBF kernel two parameters, $C$ and $\sigma^2$, had to be optimised. As the feature subset selection and the parameter optimisation were mutually dependent, they were optimised simultaneously. (Min et al. 2006) observed that the hybridised model significantly outperformed the common benchmarks logistic regression and artificial neural networks, and was additionally able to improve prediction quality in comparison with the pure SVM approach.
More studies on bankruptcy prediction with SVMs showing superior performance were conducted by (Fan and Palaniswami 2000; Shin et al. 2004; van Gestel et al. 2003b; Yun et al. 2004; Lai et al. 2006a; Hui and Sun 2006).
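Several of the studies above select the RBF kernel parameters $C$ and $\sigma^2$ by searching over candidate values and comparing validation errors. The following Python sketch shows the bare structure of such a grid search. It is not the procedure of any cited study: `toy_validation_error` is a hypothetical stand-in for training an SVM and measuring its error on a validation set, and the grid values are arbitrary.

```python
import itertools
import math


def grid_search(validation_error, c_values, sigma2_values):
    """Return (error, C, sigma2) for the grid point with the lowest
    validation error. `validation_error` is any callable taking (C, sigma2)."""
    best = None
    for c, sigma2 in itertools.product(c_values, sigma2_values):
        error = validation_error(c, sigma2)
        if best is None or error < best[0]:
            best = (error, c, sigma2)
    return best


def toy_validation_error(c, sigma2):
    """Hypothetical error surface with its minimum at C = 10, sigma2 = 0.1.
    In a real study this would train and validate an SVM."""
    return (math.log10(c) - 1.0) ** 2 + (math.log10(sigma2) + 1.0) ** 2


# Logarithmically spaced candidate values (an assumption, not from the text)
c_grid = [10.0 ** k for k in range(-2, 4)]
sigma2_grid = [10.0 ** k for k in range(-3, 3)]
best_error, best_c, best_sigma2 = grid_search(toy_validation_error, c_grid, sigma2_grid)
```

A wrapper-style feature selection, as described for the GA-based hybrid, would add the feature subset as a further search dimension and evaluate each candidate in the same way.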

27.7.2 Market Risk Management

In recent years kernel methods have also been widely used within market risk management due to the well-known properties of financial time series, which usually are inherently noisy, non-stationary and deterministically chaotic (Cao and Tay 2001). The noisiness characteristic, which can lead to over- or underfitting when using estimation methods, implies that there is no complete information available from the past behavior of financial markets. Non-stationarity means that financial time series switch their dynamics between different regions. This behavior causes an ever-changing dependency between input and output variables (Tay and Cao 2001b). Thus, financial forecasting represents a promising area for the application of kernel methods. (Tay and Cao 2001b) compared SVMs to a multi-layer perceptron with back-propagation in forecasting the S&P 500 Daily Index. They reported better forecasting
abilities for the SVM approach and emphasised the low number of parameters (after the kernel had been specified) that had to be calibrated for the SVM. Remarkably, the choice of parameters had a minor influence on the results of their study when using SVMs while the neural network approach was much more affected by parameters. SVMs also showed superior speed characteristics compared to back-propagation. To further improve their results (Tay and Cao 2001b; Cao 2003) built a two-stage neural network architecture where they combined SVMs with a self organizing map (SOM) in a hybrid model. According to the ‘divide-and-conquer’ principle they first clustered the input data in disjoint regions, using SOMs. Secondly, they employed multiple SVMs (also called SVM experts) to fit each region separately with individual and most appropriate kernel functions and corresponding parameters. With this two-stage procedure they were able to significantly improve prediction performance for six different financial data sets compared to the abovementioned single SVM benchmark model. Additionally, their model delivered more efficient learning through the data set splitting and it provided a sparser solution representation as fewer support vectors were used. Another hybrid variation of their standard model (Cao and Tay 2001) was provided in (Cao and Tay 2001a) where they used saliency analysis (SA) and genetic algorithms (GAs) to perform feature selection for the SVMs. Both methods improved convergence, generalisation performance, and training time; however, SA was slightly ahead of the GA version due to lower computational cost. With yet another contribution (Cao and Tay 2002) enhanced SVMs to model non-stationarity of financial time series. Specifically they included the assumption that more recent data may provide more relevant information than older data. The model was tested using real futures contracts and again delivered superior performance compared to standard SVMs.
In a comprehensive study (Hansen et al. 2006) compared SVMs to a selection of modern time-series prediction methods, namely exponential smoothing, autoregressive integrated moving average (ARIMA) and partially adaptive estimated ARIMA. Nine time-series from the Federal Reserve Economic Database (FRED) with widely differing characteristics were used to evaluate the model performances. Impressively, SVMs achieved the best predictive results for eight out of the nine investigated data sets. The direction of daily stock price index changes was predicted by (Kim 2003) in another study of financial time series forecasting using SVMs. According to his empirical investigation, SVMs outperformed back-propagation neural networks (BPN) and case-based reasoning (CBR). (Huang et al. 2005) also successfully predicted market movement direction for the NIKKEI 225 index with SVMs in comparison to a variety of benchmark methods. Remarkably, a newly proposed hybrid approach consisting of a combination of all considered methods delivered even better classification results. (Pérez-Cruz et al. 2003) employed SVMs to estimate the parameters of a GARCH model for the prediction of the conditional volatility of stock market returns. They found that, given normally distributed data, standard GARCH estimators (for instance maximum likelihood) return better results than SVMs, as they implicitly assume the Gaussian distribution. However, in estimating non-normally distributed
probability density functions (pdfs) SVMs outperformed standard methods substantially, being capable of approximating any given distribution. It has to be noted that they only used linear SVMs, leaving improvement through the usage of kernels and non-linear SVMs open for further research. Another detailed study on volatility forecasting and the specific problems of time series data was performed by (Gavrishchaka and Ganguli 2003). The authors found that SVMs could successfully handle both long memory and multiscale effects of inhomogeneous markets without imposing the restrictive model assumptions of other methods. They emphasised the capability of SVMs to process real-time multiscale and high-frequency market data and their ability to tolerate data incompleteness. (Gavrishchaka and Banerjee 2006) extended this analysis from foreign exchange data to stock market data (S&P 500 index). Using kernel methods (Ince and Trafalis 2006) selected stocks for short-term portfolio management. In their study they examined SVMs and minimax probability machines (MPM), both of which provided similarly good results, depending on a sensible choice of free parameters. By assuming that the efficient market hypothesis does not hold around companies’ earnings announcements (Ince and Trafalis 2006) were able to earn excess returns through trading individual stocks after their earnings announcements. (Trafalis et al. 2003) also employed SVM regression for option pricing and obtained the lowest mean squared error in comparison with RBF and MLP networks. Further applications in financial forecasting were presented by (Trafalis and Ince 2000; Ince and Trafalis 2004; van Gestel et al. 2001; Yang et al. 2002a; Zhang et al. 2006; Kamruzzaman et al. 2003). Another interesting hybrid approach that again combined self organizing maps with SVMs for exchange rate prediction was presented by (Ni and Yin 2006).

27.7.3 Synopsis and Possible Future Application Fields

As set forth in the preceding sections, kernel-based algorithms have successfully been introduced in many different financial application areas. In Table 27.1 we present a structured overview of these applications showing their individual publication dates. It can be concluded from the table that support vector regression plays a very dominant role within market risk management while in credit risk management mostly kernel-based classification methods are employed. There are also a smaller number of kernel PCA applications and some hybrid approaches throughout most of the considered areas. Interestingly, we found within the sample of publications of this review that the publication frequency reached another peak in 2006 after an early peak in 2003. These numbers underpin the timeliness of the proposed methods and the domain-specific advantages that kernel methods may deliver within finance.


Table 27.1. Publications of kernel methods in finance in the area of market risk (MR) and credit risk (CR), with time series prediction (TS), rating/scoring (RS) and bankruptcy prediction (BP) as specific applications

| Area | Field | Support Vector Classification | Support Vector Regression | Kernel PCA | Hybrid Approaches | # |
|---|---|---|---|---|---|---|
| MR | TS | (Zhang et al. 2006) | (Trafalis and Ince 2000) (van Gestel et al. 2001) (Cao and Tay 2001) (Yang et al. 2002a) (Pérez-Cruz et al. 2003) (Kamruzzaman et al. 2003) (Cao 2003) (Kim 2003) (Gavrishchaka and Ganguli 2003) (Gavrishchaka and Banerjee 2006) (Hansen et al. 2006) (Ni and Yin 2006) | (Cao et al. 2003b) (Cao et al. 2003a) (Ince and Trafalis 2004) | (Tay and Cao 2001b) (Tay and Cao 2001a) (Huang et al. 2005) | 19 |
| MR | Else | (Ince and Trafalis 2006) | (Trafalis et al. 2003) (Takeuchi et al. 2006) | – | – | 3 |
| CR | RS | (van Gestel et al. 2003a) (Baesens et al. 2003) (Huang et al. 2004) (Li et al. 2004a) (Schebesch and Stecking 2005b) (Stecking and Schebesch 2005) (Stecking and Schebesch 2006) (Cao et al. 2006) (Chen and Shih 2006) (Lai et al. 2006b) | – | – | (Wang et al. 2005) (Schebesch and Stecking 2005a) (Lai et al. 2006a) | 13 |
| CR | BP | (Fan and Palaniswami 2000) (van Gestel et al. 2003b) (Sánchez et al. 2004) (Shin et al. 2004) (Yun et al. 2004) (Härdle et al. 2005) (Min and Lee 2005) (Shin et al. 2005) (Hui and Sun 2006) | – | – | (Min et al. 2006) | 10 |
| # | | 21 | 14 | 3 | 7 | 45 |

27.7.3.1 Possible Future Application Fields

Besides the presented overview of machine learning concepts and the review of ongoing research efforts in the subject matter, it is an aim of this contribution to identify promising trends for future application of kernel methods. Due to their strengths in statistical data analysis, SVMs in particular can possibly improve performance in numerous further financial application fields. While SVM algorithms have been extensively applied in specific credit risk domains, like rating/scoring and bankruptcy prediction, their usage in the analysis of other credit risk relevant parameters, for instance Loss Given Default (LGD)³, has received very little attention until today. The adequate estimation of such parameters is very important for banks’ internal credit risk models and also for the fulfilment of supervisory regulations. Even though it usually involves very heterogeneous data sets with possibly non-linear relations, banks still commonly rely on linear methods, like linear regression, to derive their parameter estimates in practice.

Referring to the advanced dimensionality reduction methods presented in section 27.3, there are also promising new application fields. (Thomason 1998) reviewed PCA as a traditional dimensionality reduction method in the context of financial forecasting models. He stated that standard PCA using the normal distribution assumption may not be particularly suited for financial data. While he proposed a self-developed advanced PCA approach, a number of authors have already employed other recently developed dimensionality reduction algorithms such as KPCA. (Cao et al. 2003a,b) compared the performance of PCA, KPCA and independent component analysis (ICA) for feature extraction of different time series data sets (including real futures contracts).
They found that KPCA had the best characteristics. Within short-term portfolio management (Ince and Trafalis 2004) used KPCA and factor analysis, respectively, to identify the most influential inputs for an SVM-based stock price forecasting model. Furthermore, within market risk management there are different new areas where dimensionality reduction can also be applied, for instance term structure modeling. In these models a high number of possible input factors has to be reduced to eventually make the modeling possible (Alexander 2001). However, to the knowledge of the authors, no kernel method applications have been reported in this area yet.

Apart from such modeling issues, market risk management often involves the approximation of high quantiles for a certain distribution. This is especially relevant in the context of portfolio risk management where value at risk (VaR) measures the risk of a loss within a specific time interval given a certain confidence level (quantile). In a novel application (Christmann 2005) and subsequently (Takeuchi et al. 2006) used SVMs to estimate these high quantiles.

³ LGD is a highly relevant parameter from the Basel II context (Basel Committee on Banking Supervision 2006) and represents the percentage of an exposure that a financial institution loses if a specific obligor defaults. It has the following relation to the alternatively quoted recovery rate (RR): LGD = 1 − RR
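The view of value at risk as a high quantile of a loss distribution can be made concrete with a plain empirical estimate. The Python sketch below is a simple historical-simulation style estimator and is explicitly not the SVM-based quantile estimation of the cited authors; the function name and the convention of passing losses as positive numbers are assumptions made for this illustration.

```python
import math


def empirical_var(losses, confidence):
    """Return the empirical quantile of `losses` at level `confidence`.

    The result is the smallest observed loss such that at least a
    fraction `confidence` of all observations does not exceed it.
    """
    if not losses:
        raise ValueError("losses must not be empty")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    ordered = sorted(losses)
    # Index of the first order statistic whose empirical CDF reaches `confidence`
    k = math.ceil(confidence * len(ordered)) - 1
    return ordered[max(k, 0)]
```

With few observations such an estimate of a very high quantile rests on only a handful of data points, which is one motivation for the model-based quantile estimators discussed in the text.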

27.8 Overview of Software Tools

There are a large number of kernel machine implementations freely available through the internet⁴ with SVM algorithms clearly dominating.

LIBSVM⁵ (Chang and Lin 2001) has probably become the most popular SVM software package, offering a variety of algorithms, including support vector classification (C-SVC, ν-SVC, multi-class classification), regression (epsilon-SVR, ν-SVR) and distribution estimation (one-class SVM). Apart from the source code in C++ and Java the software comes with numerous interfaces to other software packages. It has been integrated as package e1071 into the R project, which is a popular open source statistics software⁶. Another R package including parts of LIBSVM is kernlab, which extends the algorithm spectrum by KPCA, spectral clustering and more.

SVMlight⁷ (Joachims 1999) implements SVMs for pattern recognition, regression and learning of a ranking function in C code. One of the main strengths of this implementation is the fact that through scalable memory requirements it can handle problems with many thousands of support vectors and several hundred thousands of training vectors very efficiently. Additionally the package klaR for the R project and a Java version called mySVM⁸ have been implemented. Users can define their own kernel functions in SVMlight and find example problems on the author’s web site.

LS-SVMlab⁹ (Suykens et al. 2002) is a least-squares SVM toolbox for Matlab which is also available in C. This well-documented software package offers standard classification and regression using LS-SVM algorithms. Additionally KPCA, ultra large scale problems and a number of other advanced methods are supported.

WEKA¹⁰ (Witten and Frank 2005) is a sophisticated environment with graphical user interface for machine learning and data mining which is implemented in Java. It includes a large library of classification algorithms including SVMs and neural networks as well as evaluation tools such as ROC curves [30].

The Spider¹¹ is a Matlab-based toolbox with interfaces to several Matlab and C/C++ libraries for kernel-based algorithms. It includes a large variety of tools for preprocessing, training and evaluation. Several kernel methods are implemented including SVMs for classification and regression, one-class SVMs, and KPCA. A WEKA interface has also been integrated.

SHOGUN¹² (Sonnenburg et al. 2006) is a new machine learning toolbox which focuses on SVMs and implements a variety of kernels including several string kernels which are important in computational biology. It offers the option to employ combined kernels which can be constructed by weighted linear combinations of sub-kernels. SHOGUN connects to LIBSVM (Chang and Lin 2001) and SVMlight (Joachims 1999). It is implemented in C++ and has interfaces to Matlab, Octave, Python and R.

In addition to the already mentioned implementations of KPCA in kernlab and LS-SVMlab, software toolboxes for dimensionality reduction methods are often available from the associated authors’ webpages or general resource pages on manifold learning.¹³

⁴ Links to a selection of software are available at the following web sites: www.support-vector-machines.org/ and www.kernel-machines.org/
⁵ www.csie.ntu.edu.tw/~cjlin/libsvm/
⁶ www.r-project.org
⁷ svmlight.joachims.org/
⁸ www-ai.cs.uni-dortmund.de/SOFTWARE/MYSVM/index.html
⁹ www.esat.kuleuven.ac.be/sista/lssvmlab/
¹⁰ www.cs.waikato.ac.nz/~ml/weka/
¹¹ www.kyb.tuebingen.mpg.de/bs/people/spider/main.html
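The idea of a combined kernel built as a weighted linear combination of sub-kernels, mentioned above in connection with SHOGUN, can be sketched independently of any toolbox. The Python code below does not use or reproduce the SHOGUN interface; all names are hypothetical, and the Gaussian sub-kernel assumes the Euclidean distance. Non-negative weights are required here because a non-negative combination of positive semi-definite kernels is again positive semi-definite.

```python
import math


def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / sigma^2), Euclidean distance assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / sigma ** 2)


def combined_kernel(sub_kernels, weights):
    """Return K(x, y) = sum_m weights[m] * sub_kernels[m](x, y)."""
    if len(sub_kernels) != len(weights):
        raise ValueError("need exactly one weight per sub-kernel")
    if any(weight < 0 for weight in weights):
        raise ValueError("weights must be non-negative")

    def kernel(x, y):
        return sum(w * k(x, y) for w, k in zip(weights, sub_kernels))

    return kernel


# Example: half a linear kernel plus twice a Gaussian kernel
mixed = combined_kernel([linear_kernel, gaussian_kernel], [0.5, 2.0])
```

In multiple kernel learning the weights themselves are optimised during training; in this sketch they are simply fixed by hand.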

27.9 Conclusion

The overview of kernel methods showed that the field has advanced quickly in recent years and provides an umbrella for some of the most successful algorithms for classification, regression, and dimensionality reduction. Non-linear methods such as KPCA, Isomap, and non-linear SVMs for regression and classification can be obtained through kernelisation of linear techniques. Development on the machine learning side is rapid, and new concepts and improvements which have not yet been applied in finance are continually emerging. Recent advances in the machine learning community towards task-specific kernel design offer new opportunities and challenges for financial applications.

Among the financial applications addressed in the review section, the best results have notably been obtained in the area of credit risk whenever the underlying data exhibited non-linear characteristics, as for instance in (Baesens et al. 2003; van Gestel et al. 2003a, 2006). Although SVMs and KPCA have been successfully applied in several studies on financial data, the field of “Kernel Methods in Finance” is still in the early stages of development. The ability of kernel methods to handle non-linear dependencies accurately has the potential to further enhance current results. Given the large amounts that are dealt with on the financial markets, even very small improvements in performance or accuracy can result in considerable savings.

Acknowledgements

The authors are grateful to Stephen Young, who worked as a research assistant and supported the generation of the figures. The second author would like to thank GILLARDON AG financial software for support of his work. Nevertheless, the views expressed in this chapter reflect the personal opinion of the authors and are neither official statements of GILLARDON AG financial software nor of its partners or its clients.

¹² www2.fml.tuebingen.mpg.de/raetsch/projects/shogun
¹³ www.cse.msu.edu/~lawhiu/manifold/

Stephan Chalup, Andreas Mitschele

References

Aizerman M, Braverman E, Rozonoer L (1964) Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control 25, 821–837
Alexander C (2001) Market Models: A Guide to Financial Data Analysis. John Wiley & Sons, Chichester
Baesens B, van Gestel T, Viaene S, Stepanova M, Suykens JAK, Vanthienen J (2003) Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society 54 (6), 627–635
Basel Committee (June 2006) Basel II: International convergence of capital measurement and capital standards: A revised framework – Comprehensive version. Basel Committee on Banking Supervision, Bank for International Settlements
Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation 15 (6), 1373–1396
Ben-Hur A, Horn D, Siegelmann HT, Vapnik VN (2001) Support vector clustering. Journal of Machine Learning Research 2, 125–137
Bishop CM (2006) Pattern Recognition and Machine Learning. Springer
Boser BE, Guyon IM, Vapnik VN (1992) A training algorithm for optimal margin classifiers. In: Proceedings of the fifth annual workshop on computational learning theory. ACM Press, pp. 144–152
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2, 121–167
Burges CJC (2005) Geometric methods for feature extraction and dimensional reduction – a guided tour. In: [61], pp. 59–92
Camps-Valls G, Rojo-Álvarez JL, Martínez-Ramón M (Eds.) (2007) Kernel methods in bioengineering, signal and image processing. Idea Group Inc., Hershey, PA, USA
Cao L (2003) Support vector machines experts for time series forecasting. Neurocomputing 51, 321–339
Cao L, Chua KS, Chong WK, Lee HP, Gu QM (2003a) A comparison of PCA, KPCA, ICA for dimensionality reduction in support vector machine. Neurocomputing 55, 321–336
Cao L, Chua KS, Guan LK (2003b) Combining KPCA with support vector machine for time series forecasting. In: IEEE Computational Intelligence for Financial Engineering. pp. 325–329
Cao L, Guan LK, Jingqing Z (2006) Bond rating using support vector machine. Intelligent Data Analysis 10, 285–296


Cao L, Tay FEH (2001) Financial forecasting using support vector machines. Neural Computing & Applications 10, 184–192
Carminati L, Benois-Pineau J (2005) Support vector tracking of human faces with affine motion models. In: Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) 2005. Montreux, Switzerland
Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/
Chapelle O, Schölkopf B, Zien A (Eds.) (2006) Semi-Supervised Learning. The MIT Press, Cambridge, MA
Chen D, Cao XB, Xu YW, Qiao H (2006) An evolutionary support vector machines classifier for pedestrian detection. In: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems. pp. 4223–4227
Chen W-H, Shih J-Y (2006) A study of Taiwan’s issuer credit rating systems using support vector machines. Expert Systems with Applications 30 (3), 427–435
Christmann A (2005) On a combination of convex risk minimization methods. In: Gaul W, Weihs C (Eds.), Classification – the Ubiquitous Challenge. Springer, pp. 434–441
Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to Algorithms, 2nd Edition. MIT Press and McGraw-Hill
Cortes C, Vapnik VN (1995) Support vector networks. Machine Learning 20, 273–297
Courant R, Hilbert D (1953) Methods of Mathematical Physics. Interscience Publishers, Inc, New York
Cox TF, Cox MAA (2001) Multidimensional scaling, 2nd Edition. Chapman & Hall/CRC
Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel based learning methods. Cambridge University Press
Evgeniou T, Pontil M, Poggio T (2000) Regularization networks and support vector machines. Advances in Computational Mathematics 13 (1), 1–50
Fan A, Palaniswami M (2000) Selecting bankruptcy predictors using a support vector machine approach. In: IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (2000) Vol. 6. pp. 354–359
Fawcett T (2004) ROC graphs: Notes and practical considerations for researchers. Tech. rep., HP Laboratories, Palo Alto, CA, USA, Tech Report HPL-2003-4
Francois D, Wertz V, Verleysen M (2005) On the locality of kernels in high-dimensional spaces. In: ASMDA2005, Applied Stochastic Models and Data Analysis. pp. 238–245
Fu X, Ma Z, Feng B (2004) Kernel-based semantic text categorization for large scale web information organization. Vol. 3251 of Lecture Notes in Computer Science (LNCS). Springer, pp. 389–396
Gavrishchaka VV, Banerjee S (2006) Support vector machine as an efficient framework for stock market volatility forecasting. In: Computational Management Science. Vol. 3. Springer, pp. 147–160


Gavrishchaka VV, Ganguli SB (2003) Volatility forecasting from multiscale and high-dimensional market data. Neurocomputing 55, 285–305
Gentle JE, Härdle W, Mori Y (Eds.) (2004) Handbook of Computational Statistics. Concepts and Methods. Springer
Ham J, Lee DD, Mika S, Schölkopf B (2004) A kernel view of the dimensionality reduction of manifolds. In: Proceedings of the 21st International Conference on Machine Learning
Hansen JV, McDonald JB, Nelsen RD (2006) Some evidence on forecasting time-series with support vector machines. Journal of the Operational Research Society 57 (9), 1053–1063
Härdle W, Moro RA, Schäfer D (2005) Predicting bankruptcy with support vector machines. In: Cizek P, Härdle W, Weron R (Eds.), Statistical Tools for Finance and Insurance. Springer, pp. 225–248
Haykin S (1999) Neural Networks. A Comprehensive Foundation, 2nd Edition. Prentice Hall
Herbrich R (2002) Learning Kernel Classifiers. The MIT Press
Hertz T, Hillel A-B, Weinshall D (2006) Learning a kernel function for classification with small training samples. In: Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh June 25–29, 2006
Hochreiter S, Mozer MC, Obermayer K (2003) Coulomb classifiers: Generalizing support vector machines via an analogy to electrostatic systems. In: Advances in Neural Information Processing Systems (NIPS). Vol. 15. The MIT Press, pp. 561–568
Hotelling H (1933) Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24, 417–441, 498–520
Huang T-M, Kecman V, Kopriva I (2006) Kernel Based Algorithms for Mining Huge Data Sets. Supervised, Semi-supervised, and Unsupervised Learning. Vol. 17 of Studies in Computational Intelligence. Springer
Huang W, Nakamori Y, Wang S-Y (2005) Forecasting stock market movement direction with support vector machine. Computers & Operations Research 32, 2513–2522
Huang Z, Chen H, Hsu C-J, Chen W-H, Wu S (2004) Credit rating analysis with support vector machines and neural networks: a market comparative study. Decision Support Systems 37, 543–558
Hui X-F, Sun J (2006) An application of support vector machine to companies’ financial distress prediction. Vol. 3885 of Lecture Notes in Computer Science (LNCS). Springer, pp. 274–282
Ikeda K, Murata N (2005) Geometrical properties of nu support vector machines with different norms. Neural Computation 17, 2508–2529
Ince H, Trafalis TB (2004) Kernel principal component analysis and support vector machines for stock price prediction. In: Proceedings of the 2004 IEEE International Joint Conference on Neural Networks. Vol. 3. pp. 2053–2058
Ince H, Trafalis TB (2006) Kernel methods for short-term portfolio management. Expert Systems with Applications 30, 535–542


Joachims T (1999) Making large-scale SVM learning practical. In: Schölkopf B, Burges C, Smola A (Eds.), Advances in Kernel Methods – Support Vector Learning. The MIT Press
Jolliffe IT (1986) Principal Component Analysis. Springer-Verlag, New York
Kamruzzaman J, Sarker RA, Ahmad I (2003) SVM based models for predicting foreign currency exchange rates. In: ICDM ’03: Proceedings of the Third IEEE International Conference on Data Mining. IEEE Computer Society, Washington, DC, USA, p. 557
Kang S, Byun H, Lee S-W (2002) Real-time pedestrian detection using support vector machines. Vol. 2388 of Lecture Notes in Computer Science (LNCS). Springer, pp. 268–277
Kim K-j (2003) Financial time series forecasting using support vector machines. Neurocomputing 55, 307–319
Lai KK, Yu L, Huang W, Wang S (2006a) A novel support vector machine metamodel for business risk identification. Vol. 4099 of Lecture Notes in Computer Science (LNCS). Springer, pp. 980–984
Lai KK, Yu L, Zhou L, Wang S (2006b) Credit risk evaluation with least square support vector machine. Vol. 4062 of Lecture Notes in Computer Science (LNCS). Springer, pp. 490–495
Lanckriet GRG, Bie TD, Cristianini N, Jordan MI, Noble WS (2004) A statistical framework for genomic data fusion. Bioinformatics 20, 2626–2635
Li J, Liu J, Xu W, Shi Y (2004a) Support vector machines approach to credit assessment. Vol. 3039 of Lecture Notes in Computer Science (LNCS). Springer, pp. 892–899
Li Y, Gong S, Sherrah J, Liddell H (2004b) Support vector machine based multiview face detection and recognition. Image and Vision Computing 22, 413–427
Maimon O, Rokach L (Eds.) (2005) The Data Mining and Knowledge Discovery Handbook. Springer
Mangasarian OL (1999) Arbitrary-norm separating plane. Operations Research Letters 24, 15–23
Mercer J (1909) Functions of positive and negative type and their connection with the theory of integral equations. Philos. Trans. Roy. Soc. London A 209, 415–446
Mika S, Schäfer C, Laskov P, Tax D, Müller K-R (2005) Support vector machines. In: Gentle JE, Härdle W, Mori Y (Eds.), Handbook of Computational Statistics. Concepts and Methods. Springer, pp. 841–876
Min JH, Lee Y-C (2005) Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications 28, 603–614
Min S-H, Lee J, Han I (2006) Hybrid genetic algorithms and support vector machines for bankruptcy prediction. Expert Systems with Applications 31, 652–660
Mitchell T (1997) Machine Learning. McGraw Hill
Müller K-R, Mika S, Rätsch G, Tsuda K, Schölkopf B (2001) An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks 12 (2), 181–201


Ni H, Yin H (2006) Recurrent self-organising maps and local support vector machine models for exchange rate prediction. Vol. 3973 of Lecture Notes in Computer Science (LNCS). Springer, pp. 504–511
Osuna E, Freund R, Girosi F (1997) Training support vector machines: An application to face detection. In: CVPR ’97: Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR ’97). IEEE Computer Society, Washington, DC, USA, pp. 130–136
Pedroso JP, Murata N (2001) Support vector machines with different norms: Motivation, formulations, and results. Pattern Recognition Letters 22, 1263–1272
Pérez-Cruz F, Afonso-Rodríguez JA, Giner J (2003) Estimating GARCH models using support vector machines. Quantitative Finance 3 (3), 163–172
Platt J (1999) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges C, Smola A (Eds.), Advances in Kernel Methods – Support Vector Learning. The MIT Press
Raskutti B, Kowalczyk A (2004) Extreme re-balancing for SVMs: a case study. SIGKDD Explorations 6 (1), 60–69
Rifkin R, Klautau A (2004) In defense of one-vs-all classification. Journal of Machine Learning Research 5, 101–141
Rosenblatt F (1958) The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65 (6), 386–408
Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290 (5500), 2323–2326
Sánchez M, Prats F, Agell N, Rovira X (2004) Kernel functions over orders of magnitude spaces by means of usual kernels. Application to measure financial credit risk. Vol. 3040 of Lecture Notes in Computer Science (LNCS). Springer, pp. 415–424
Saul LK, Weinberger KQ, Sha F, Ham J, Lee DD (2006) Spectral Methods for Dimensionality Reduction, Ch. 16. In: [19], pp. 293–308
Schebesch KB, Stecking R (2005a) Support vector machines for classifying and describing credit applicants: detecting typical and critical regions. Journal of the Operational Research Society 56 (9), 1082–1088
Schebesch KB, Stecking R (2005b) Support vector machines for credit scoring: Extension to non standard cases. In: Baier D, Wernecke K-D (Eds.), Innovations in Classification, Data Science, and Information Systems. Proceedings of the 27th Annual Conference of the Gesellschaft für Klassifikation e.V., Brandenburg University of Technology, Cottbus, March 12–14, 2003. Springer, pp. 498–505
Schölkopf B (1997) Support vector learning. R. Oldenbourg Verlag, Munich
Schölkopf B, Smola A, Müller K-R (1996) Nonlinear component analysis as a kernel eigenvalue problem. Tech. Rep. 44, Max-Planck-Institut für biologische Kybernetik
Schölkopf B, Smola A, Müller K-R (1997) Kernel principal component analysis. Vol. 1327 of Lecture Notes in Computer Science (LNCS). Springer, Berlin, pp. 583–588
Schölkopf B, Smola A, Müller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 10, 1299–1319


Schölkopf B, Smola AJ (2002) Learning with Kernels. The MIT Press, Cambridge, Massachusetts
Schölkopf B, Tsuda K, Vert J-P (Eds.) (2004) Kernel Methods in Computational Biology. The MIT Press
Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press
Shin K-s, Lee KJ, Kim H-j (2004) Support vector machines approach to pattern detection in bankruptcy prediction and its contingency. Vol. 3316 of Lecture Notes in Computer Science (LNCS). Springer, pp. 1254–1259
Shin K-s, Lee TS, Kim H-j (2005) An application of support vector machines in bankruptcy prediction model. Expert Systems with Applications 28, 127–135
Sonnenburg S, Raetsch G, Schaefer C, Schoelkopf B (2006) Large scale multiple kernel learning. Journal of Machine Learning Research 7, 1531–1565
Sonnenburg S, Rätsch G, Schölkopf B (2005) Large scale genomic sequence SVM classifiers. In: Proceedings of the International Conference on Machine Learning, ICML 2005
Spivak M (1979) A Comprehensive Introduction to Differential Geometry, 2nd Edition. Publish or Perish, Inc
Stecking R, Schebesch KB (2005) Informative patterns for credit scoring using linear SVM. In: Weihs C, Gaul W (Eds.), Classification – the Ubiquitous Challenge: Proceedings of the 28th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Dortmund. Springer, pp. 450–457
Stecking R, Schebesch KB (2006) Comparing and selecting SVM-kernels for credit scoring. In: Spiliopoulou M, Kruse R, Borgelt C, Nürnberger A, Gaul W (Eds.), From Data and Information Analysis to Knowledge Engineering. Proceedings of the 29th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Magdeburg, March 9–11, 2005. Springer, pp. 542–549
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society. Series B (Methodological) 36 (2), 111–147
Suykens JAK, Van Gestel T, De Brabanter J, De Moor B, Vandewalle J (2002) Least Squares Support Vector Machines. World Scientific Pub. Co., Singapore
Suykens JAK, Vandewalle J (1999a) Least squares support vector machine classifiers. Neural Processing Letters 9, 293–300
Suykens JAK, Vandewalle J (1999b) Training multilayer perceptron classifiers based on a modified support vector method. IEEE Transactions on Neural Networks 10 (4), 907–911
Takeuchi I, Le QV, Sears TD, Smola AJ (2006) Nonparametric quantile estimation. Journal of Machine Learning Research 7, 1231–1264
Tax DMJ, Duin RPW (1999) Support vector domain description. Pattern Recognition Letters 20 (11–13), 1191–1199
Tay FEH, Cao L (2001a) A comparative study of saliency analysis and genetic algorithm for feature selection in support vector machines. Intelligent Data Analysis 5 (3), 191–209


Tay FEH, Cao L (2001b) Improved financial time series forecasting by combining support vector machines with self-organizing feature map. Intelligent Data Analysis 5 (4), 339–354
Tay FEH, Cao L (2002) Modified support vector machines in financial time series forecasting. Neurocomputing 48, 847–861
Tenenbaum JB, de Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323
Thomas LC, Oliver RW, Hand DJ (2005) A survey of the issues in consumer credit modelling research. Journal of the Operational Research Society 56, 1006–1015
Thomason MR (1998) Non-traditional PCA for dimensionality reduction of financial forecasting models. Journal of Computational Intelligence in Finance 6 (4), 34–38
Trafalis TB, Ince H (2000) Support vector machine for regression and applications to financial forecasting. In: IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, 2000. Vol. 6. pp. 348–353
Trafalis TB, Ince H, Mishina T (2003) Support vector regression in option pricing. Conference on Computational Intelligence for Financial Engineering: Workshop Paper, Hong Kong Baptist University
Valiant LG (1984) A theory of the learnable. Communications of the ACM 27 (11), 1134–1142
van Gestel T, Baesens B, Garcia J, van Dijcke P (2003a) A support vector machine approach to credit scoring. Bank en Financiewezen 2, 73–82
van Gestel T, Baesens B, Suykens J, Baestaens D, Vanthienen J, De Moor B (2003b) Bankruptcy prediction with least squares support vector machine classifiers. In: Proceedings of the Conference on Computational Intelligence for Financial Engineering (CIFEr’03), March 21–23, Hong Kong. IEEE
van Gestel T, Baesens B, Suykens JAK, van den Poel D, Baestaens D-E, Willekens M (2006) Bayesian kernel based classification for financial distress detection. European Journal of Operational Research 172, 979–1003
van Gestel T, Suykens JAK, Baestaens D-E, Lambrechts A, Lanckriet G, Vandaele B, De Moor B, Vandewalle J (2001) Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Transactions on Neural Networks 12 (4), 809–821
Vapnik VN (1998) Statistical Learning Theory. John Wiley & Sons, NY
Vapnik VN (2000) The nature of statistical learning theory. Springer, NY
Vapnik VN (2006a) Estimation of Dependencies Based on Empirical Data, Ch. Empirical Inference Science, Afterword 2006. In: [118], pp. 400–505
Vapnik VN (2006b) Estimation of dependencies based on empirical data, reprint of 1982 edition, 2nd Edition. Springer
Verleysen M, Francois D (2005) The curse of dimensionality in data mining and time series prediction. Vol. 3512 of Lecture Notes in Computer Science (LNCS). Springer, pp. 758–770


Wang Y, Wang S-Y, Lai KK (2005) A new fuzzy support vector machine to evaluate credit risk. IEEE Transactions on Fuzzy Systems 13 (6), 820–831
Weinberger KQ, Packer BD, Saul LK (2005) Unsupervised learning of image manifolds by semidefinite programming. In: Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics. Barbados
Weinberger KQ, Saul LK (2004) Unsupervised learning of image manifolds by semidefinite programming. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR-04). Vol. 2. Washington D.C., pp. 988–995
Weinberger KQ, Sha F, Saul LK (2004) Learning a kernel matrix for nonlinear dimensionality reduction. In: Proceedings of the Twenty First International Conference on Machine Learning (ICML-04). Banff, Canada, pp. 839–846
Williams CKI (2001) On a connection between kernel PCA and metric multidimensional scaling. In: Advances in Neural Information Processing Systems (NIPS). Vol. 13
Witten IH, Frank E (2005) Data Mining: Practical machine learning tools and techniques, 2nd Edition. Morgan Kaufmann, San Francisco
Xiao L, Sun J, Boyd S (2006) A duality view of spectral methods for dimensionality reduction. In: Proceedings of the 23rd International Conference on Machine Learning (ICML 2006). pp. 1041–1048
Yang H, Chan L, King I (2002a) Support vector machine regression for volatile stock market prediction. Vol. 2412 of Lecture Notes in Computer Science (LNCS). Springer, pp. 391–396
Yang J, Estivill-Castro V, Chalup S (2002b) Support vector clustering through proximity graph modelling. In: Proceedings, International Conference on Neural Information Processing (ICONIP’2002). pp. 898–903
Yun Y, Yoon M, Nakayama H, Shiraki W (2004) Prediction of business failure by total margin support vector machines. Vol. 3213 of Lecture Notes in Computer Science (LNCS). Springer, pp. 441–448
Zhang Z-y, Shi C, Zhang S-l, Shi Z-z (2006) Stock time series forecasting using support vector machines employing analyst recommendations. Vol. 3973 of Lecture Notes in Computer Science (LNCS). Springer, pp. 452–457

CHAPTER 28
Complexity of Exchange Markets
Mao-cheng Cai, Xiaotie Deng

28.1 Introduction

It is generally accepted that the state variables of financial instruments rule out investment strategies with riskless profit, commonly referred to as arbitrage opportunities. This belief rests on the assumption that investment agents actively seek to exploit any arbitrage opportunity in financial markets and that such acts, in turn, deplete any arbitrage opportunity as soon as it arises. Naturally, the speed at which an arbitrage opportunity can be located and taken advantage of is vitally important for profit-seeking investors. Indeed, in a frictionless foreign exchange market, identifying arbitrage states is not difficult at all and can be reduced to the existence of arbitrage on three currencies, as discussed, for example, in (Mavrides 1992).

In reality, friction does exist. Because of friction, arbitrage opportunities may exist in the market but be difficult to find, which in turn makes it difficult to eliminate arbitrage by exploiting it. Empirical studies of foreign exchange markets have shown that arbitrage does exist in reality. An examination of data from ten markets over a twelve-day period by (Mavrides 1992) revealed that arbitrage opportunities existed, and some were observed to persist for a relatively long period. The problem becomes worse in forward and futures markets coupled with covered interest rates, as observed by (Abeysekera and Turtle 1995; Clinton 1988).

An obvious interpretation is that arbitrage opportunities are not immediately identified because of information asymmetry in the market. However, that is not the only reason. Both the time necessary to collect the market information (so that an arbitrage opportunity can be identified) and the time we (or computer programs) need to find a set of transactions for arbitrage are important factors in eliminating arbitrage opportunities.
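The reduction to three currencies in a frictionless market can be sketched as a brute-force check over all ordered triples of currencies. This is only an illustration, not the procedure of (Mavrides 1992): the function name find_triangular_arbitrage, the matrix layout rates[i][j] and the tolerance tol are assumptions of this sketch, and the sketch further assumes that there is no bid-offer spread, so that rates[i][j] * rates[j][i] = 1.

```python
from itertools import permutations


def find_triangular_arbitrage(rates, tol=1e-12):
    """Search a frictionless market for arbitrage among three currencies.

    rates[i][j] is the number of units of currency j obtained for one
    unit of currency i (hypothetical layout). Returns the first triple
    (i, j, k) whose round trip i -> j -> k -> i multiplies the starting
    amount by more than 1, together with that factor, or None if no
    such triple exists. Runs in O(n^3) time for n currencies.
    """
    n = len(rates)
    for i, j, k in permutations(range(n), 3):
        factor = rates[i][j] * rates[j][k] * rates[k][i]
        if factor > 1.0 + tol:
            return (i, j, k), factor
    return None
```

Since the number of triples is cubic in the number of currencies, the check runs in polynomial time, which is consistent with the remark that the frictionless case is not difficult.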
The advance of information and communication technology has clearly made the elimination of arbitrage states in the market much faster and will potentially make the perfect market a possibility, as envisioned by (Gates et al 1995). With market information asymmetry resolved, the last frontier left is how to identify arbitrage opportunities as fast as possible.

In this article, we address this computational issue with the methodology of algorithms and complexity. In this methodology, problems are, at the first level of classification, divided by the computational time they require into two types: easy problems and hard problems. The easy problems are those that, in the terminology of Edmonds, are solvable with good algorithms, i.e., algorithms whose execution time grows polynomially in the input size, measured as the number of bits needed to represent the problem. An important class of hard problems is that of the so-called NP-complete problems (NP stands for non-deterministic polynomial time). Informally, a problem is in NP if we can verify a solution to the problem in polynomial time. An NP-complete problem is a problem in NP that can be used as an oracle to solve any problem in NP in polynomial time. Metaphorically, the situation compares to the different levels of difficulty in solving a high-degree polynomial equation and in verifying a solution of such an equation. In the terms of this methodology, we say that a problem in P (solvable in polynomial time) is easy and an NP-complete problem is hard. Therefore, from this point of view, we are interested in whether arbitrage in foreign exchange markets is in P or NP-complete.

We examine this problem for various models of exchange systems. We study the computational resource constraints (usually in terms of time) of identifying an arbitrage opportunity, and the number of transactional operations needed to remove arbitrage opportunities from the market. We are also interested in the design of efficient algorithms to locate arbitrage and to take advantage of it in a given exchange system.
Exchange market models with lower complexity should be preferred to those with high complexity, since they facilitate arbitrage where such a possibility exists and, in turn, eliminate it. Evaluating the computational complexity of exchange models in this sense would allow for a better understanding of exchange systems in reality.

We model an exchange market with a digraph, called an exchange digraph, in which each vertex represents a currency (or commodity) and each arc from one vertex to another represents an exchange relation from the former to the latter. In a frictional market, the maximum amount that can be exchanged at a given exchange rate is specified, which is formulated as a capacity for each arc. Because of that, we may have multiple arcs from one vertex to another (as they represent different exchange rates). In addition, there may exist integrality constraints: one currency can only be traded for another in an integer amount of the denominations.

In Section 28.2, we introduce the necessary definitions and models. In Section 28.3, we consider a most general model of frictional exchange markets and confirm that it is computationally difficult, more specifically NP-hard, to find an arbitrage opportunity. Obviously, this is in strong contrast to what most would expect. The general hardness result can be related to a barter market that allows for abundant arbitrage opportunities. The hardness of computing arbitrage in a general exchange market does not necessarily imply that today's realistic frictional exchange markets are full of arbitrage opportunities. In Section 28.4, we discuss two models that allow for polynomial-time solvability of finding arbitrage. That is, for those two types of frictional exchange market models, locating arbitrage is computationally easy. If the exchange digraph is star-shaped (i.e., all arcs have a common vertex) or the exchange digraph has a fixed constant number of vertices (note that the number of parallel arcs may still be large), we can decide in polynomial time whether arbitrage conditions exist (and find them if so). Though both cases are quite simple, they agree with reality to some extent. Obviously, when the number of currencies is a fixed constant rather than a changing variable, the computation of arbitrage becomes easier. As a matter of fact, there is only a moderate number of currencies in the world. Therefore, the complexity of arbitrage may not be as pessimistic as the hardness results suggest. In addition, the number of currencies tends to decrease gradually, as with the creation of the Euro. In the star-shaped digraph, one may relate the central vertex to the role of money in a commodity market (or the role of gold in the money market, hailed by many as the preferred model, e.g., (Mundell 2001)). Therefore, and arguably, complexity in commercial activities may well be an important index in understanding the evolution of commodity exchange markets.

We further discuss other issues: arbitrage in currency futures markets in Section 28.5, and the minimal number of transactional operations needed to bring a market to an arbitrage-free state in Section 28.6. We conclude in Section 28.7 with remarks and discussion.

Our study provides a quantitative method for analysing the complexity of arbitrage. Our results show interesting comparisons of theoretical analysis with the reality of exchange markets. Indeed, realistic exchange markets do not fall under the general model, which has a high computational complexity in locating arbitrage opportunities.
In fact, the use of money as the medium of commerce is a good empirical example related to star-shaped exchange markets, as discussed above. Furthermore, the study of polynomial-time algorithms for locating arbitrage opportunities in such markets can lead to efficient tools for investors. Finally, as we apply worst-case complexity in our study, the results hold only for the worst case. This invites studies using other complexity measures, such as average-case complexity, parameterized complexity, and smoothed complexity. It would be interesting to see what new insights those methodologies for complexity analysis shed on the problem of arbitrage.
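The exchange digraph described in this introduction, with parallel arcs that carry a rate and a capacity, can be represented directly, and checking whether a proposed set of exchanges is an arbitrage then takes time linear in the number of arcs. The following is a minimal sketch under stated assumptions: the names Arc, net_balances and is_arbitrage and the tolerance tol are hypothetical, the arbitrage criterion is the verbal one given in Section 28.2 (every net balance nonnegative, at least one positive), and integrality constraints and rounding are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One exchange offer in the digraph (parallel arcs are allowed)."""
    source: str      # currency given up
    target: str      # currency received
    rate: float      # units of target received per unit of source
    capacity: float  # maximum units of source exchangeable at this rate


def net_balances(arcs, amounts):
    """Net balance per currency when amounts[m] units go through arcs[m]."""
    if len(arcs) != len(amounts):
        raise ValueError("need exactly one amount per arc")
    balances = {}
    for arc, x in zip(arcs, amounts):
        if x < 0 or x > arc.capacity:
            raise ValueError("amount violates the capacity bound of an arc")
        balances[arc.source] = balances.get(arc.source, 0.0) - x
        balances[arc.target] = balances.get(arc.target, 0.0) + arc.rate * x
    return balances


def is_arbitrage(arcs, amounts, tol=1e-9):
    """True if no currency ends up negative and at least one is positive."""
    values = net_balances(arcs, amounts).values()
    return all(b >= -tol for b in values) and any(b > tol for b in values)
```

Verifying a candidate in this way is easy. The hardness discussed in this chapter concerns finding such a vector of amounts, not checking one.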

28.2 Definitions and Models Exchange markets start historically in various forms. The barter market, where merchants exchange goods for goods (such as wheat for eggs), is of the most primitive ones. The foreign exchange market is, in many ways, a virtual barter market, of dollars, marks and yens. Formally speaking, we consider n foreign currencies: N = {1, 2,…, n} . For each ordered pair (i, j ) of currencies, one may


Mao-cheng Cai, Xiaotie Deng

change one unit of currency $i$ to $r_{ij}$ units of currency $j$. Rate $r_{ij}$ is the exchange rate from $i$ to $j$. In an ideal market, the exchange rate holds for any amount that is exchanged. An arbitrage opportunity is a set of exchanges between pairs of currencies such that the net balance for each involved currency is nonnegative and the net balance is positive for at least one currency. Under ideal market conditions, there is no arbitrage if and only if there is no arbitrage among any three currencies, see (Mavrides 1992).

In reality, various types of market frictions exist. For example, the bid-offer spread may be expressed in our mathematical format as $r_{ij} r_{ji} < 1$ for some $i, j \in N$. In addition, the traded amount is usually required to be a multiple of a fixed integer amount: hundreds, thousands or millions. Moreover, different traders may bid or offer at different rates, each for a limited amount. A more general model describing these market imperfections includes, for pairs $i \neq j \in N$, $l_{ij}$ different rates $r_{ij}^k$ of exchange from currency $i$ to $j$, each valid for up to $b_{ij}^k$ units of currency $i$, $k = 1, \dots, l_{ij}$, where $l_{ij}$ is the number of different exchange rates from currency $i$ to $j$. In summary, we define the problem FOREX ARBITRAGE as follows:

• INPUT: A directed multigraph $G = (V, E; l)$, where $V$ is the vertex set, $E$ is the arc set, and a function $l : E \to \mathbb{R}_+$ represents the multiplicity of individual arcs. Therefore, for each (ordered) pair $(i, j) \in E$ of vertices, there are $l_{ij}$ arcs from $i$ to $j$. For every arc $(i, j) \in E$ and each $k$, $1 \le k \le l_{ij}$, let $b_{ij}^k$ denote the maximum amount one may change from currency $i$ to $j$ at the exchange rate $r_{ij}^k$.

• OUTPUT: A currency exchange vector $x = (x_{ij}^k)$, $1 \le k \le l_{ij}$, $i \in N$, $j \in N$, $i \neq j$, such that

$$0 \le x_{ij}^k \le b_{ij}^k. \tag{28.1}$$

• There exists arbitrage in the market if there is a currency exchange vector $x = (x_{ij}^k)$ satisfying the following inequalities, for each $i \in V$, with at least one strict inequality:

$$\sum_{j \neq i} \sum_{k=1}^{l_{ji}} \left\lfloor r_{ji}^k x_{ji}^k \right\rfloor - \sum_{j \neq i} \sum_{k=1}^{l_{ij}} x_{ij}^k \ge 0, \quad i = 1, \dots, n. \tag{28.2}$$

Note that the first term on the left-hand side of (28.2) is the total income of currency $i$ from selling other currencies, and the second term is the total expense of currency $i$ for buying other currencies. In addition, we require the currency exchange vector $x$ to be integral, i. e.,

$$x_{ij}^k \ \text{is an integer}, \quad 1 \le k \le l_{ij},\ 1 \le i \neq j \le n. \tag{28.3}$$
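Conditions (28.1)–(28.3) can be checked mechanically for a given exchange vector. The following is a minimal sketch in Python, assuming (as an illustrative choice, not part of the chapter) that rates, bounds and the vector are stored in dictionaries keyed by triples (i, j, k), and that rates are exact fractions. The function names `net_balances` and `is_arbitrage` are hypothetical.

```python
from fractions import Fraction
from math import floor


def net_balances(x, r, currencies):
    """Left-hand side of (28.2) for every currency."""
    balance = {i: 0 for i in currencies}
    for (i, j, k), amount in x.items():
        balance[i] -= amount                        # expense in currency i
        balance[j] += floor(r[(i, j, k)] * amount)  # income in currency j, rounded down
    return balance


def is_arbitrage(x, r, b, currencies):
    """True if x satisfies (28.1)-(28.3) with at least one strict inequality in (28.2)."""
    for key, amount in x.items():
        # (28.3) integrality and (28.1) capacity
        if not isinstance(amount, int) or not 0 <= amount <= b[key]:
            return False
    balance = net_balances(x, r, currencies)
    return all(v >= 0 for v in balance.values()) and any(v > 0 for v in balance.values())


# Illustrative two-currency market (made-up numbers, one offer in each direction).
currencies = [1, 2]
rates = {(1, 2, 1): Fraction(2), (2, 1, 1): Fraction(1)}
bounds = {(1, 2, 1): 5, (2, 1, 1): 5}
trade = {(1, 2, 1): 1, (2, 1, 1): 2}
print(is_arbitrage(trade, rates, bounds, currencies))
```

Here one unit of currency 1 buys two units of currency 2, which are exchanged back for two units of currency 1, so the net balance of currency 1 is positive. Checking a given vector takes time linear in the number of arcs, which is the observation behind membership in NP.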

Note that Equation 28.1 is usually called the capacity constraint. The fact that the exchange rate from the currency i to the currency j may be different from the

28 Complexity of Exchange Markets


exchange rate from the currency j to the currency i is related to the bid-ask spread in exchange markets. Finally, the requirement that the currency exchange vector x be integral is often called the integer constraint. These constraints are often ignored in theoretical analysis and regarded as frictions to the ideal market model. We commonly refer to markets with such constraints as frictional markets.

Defined in this manner, FOREX ARBITRAGE belongs to a class of problems called decision problems. The answer to such a problem is either YES or NO. An algorithm solves the problem correctly if, for every input instance, it provides the correct output. For FOREX ARBITRAGE, the input is the directed multigraph G = (V, E; l), the bound function b(⋅) and the rate function r(⋅) on the multiple edges of the graph. The time complexity is measured by the number of operations the algorithm performs to produce the answer. Usually, over all instances of size n, measured in the number of bits representing the input, the longest time T(n) needed to solve one such instance is used to measure complexity. Though in practice this method, commonly referred to as worst-case analysis, may not be the best metric, it is well accepted in algorithmic analysis. Related complexity measures have been proposed to deal with some of its drawbacks, but worst-case analysis is still used as the first step in understanding the complexity of solving a problem. Our study focuses on this measure. Further studies on different measures of computational complexity would also help towards a full understanding of the computational complexity of finding an arbitrage opportunity. We shall comment on some of the related issues at the end of this chapter.

Two important classes stand out in computational complexity studies. One is the class P, for problems solvable in polynomial time.
More specifically, a problem is in P if there is an algorithm solving it that takes time T(n) bounded by a polynomial P(n), that is, T(n) ≤ P(n) for all n ≥ 0. Those problems are usually regarded as easy to solve; as the most credited inventor of the class P, Jack Edmonds, put it, they are problems with good algorithms. Another is the class NP, introduced by Stephen Cook. Informally, it refers to problems for which, given an instance with a YES answer, it takes polynomial time to verify that the answer is correct (possibly with necessary auxiliary information). The most difficult problems in NP are the NP-complete problems: if there is a polynomial-time algorithm for one of them, then there is a polynomial-time algorithm for every problem in NP. In our computational complexity study, we are interested either in finding an algorithm that solves FOREX ARBITRAGE in polynomial time, or in proving that it is NP-complete. The most common approach to proving a problem NP-complete is to first prove that it is in NP, and then prove that another NP-complete problem can be reduced to it. The former task is usually easier, and most of the effort is spent on the latter. The problems SETCOVER and PARTITION are two of the most useful NP-complete problems from which to start the reduction process. We shall use them in our discussion. Readers interested in the full details of the NP-completeness methodology are referred to an excellent introduction to this topic: (Garey and Johnson 1979).


28.3 On Complexity of Arbitrage in a General Market Model

In this section, we show that the problem of arbitrage in these general frictional markets, FOREX ARBITRAGE, is NP-complete.

Theorem 28.1. FOREX ARBITRAGE is NP-complete, even if all $l_{ij} \in \{0, 1\}$.

Proof. First, it is clear that we can check in polynomial time whether a given currency exchange vector $x = (x_{ij}^k)$, $1 \le k \le l_{ij}$, $i \in N$, $j \in N$, $i \neq j$, satisfies the conditions in the definition. Therefore, the decision problem is in the class NP. Let us now describe a polynomial-time reduction τ from SETCOVER to an instance of our decision problem. As we shall show the result for the case where all $l_{ij} \in \{0, 1\}$, we consider a simple graph $G = (V, E)$. Therefore, the variables $x_{ij}^k$ will be denoted by $x_{ij}$ for $(i, j) \in E$. Consider an instance I of SETCOVER, see, for instance, (Garey and Johnson 1979) and (Karp 1972): Given a collection $C = \{C_1, C_2, \dots, C_p\}$ of subsets of $S = \{e_1, e_2, \dots, e_q\}$ and a positive integer $K \le p$, does $C$ contain a cover for $S$ of size $K$ or less, i. e., a subset $C' \subseteq C$ with $|C'| \le K$ such that every element in $S$ belongs to at least one member of $C'$? We construct an instance τ(I) of the decision problem with $n = p + q + 4$ from I in the following way.

• Currencies: $1, 2, \dots, n$.

• Rates:

$$r_{ij} = \begin{cases}
1 & \text{if } 1 \le i \le p,\ p+1 \le j \le p+q \text{ and } e_{j-p} \in C_i, \\
\frac{1}{2} & \text{if } p+1 \le i \le p+q,\ 1 \le j \le p \text{ and } e_{i-p} \in C_j, \\
1 & \text{if } p+1 \le i \le p+q \text{ and } j = n-3, \text{ or } i = n-3 \text{ and } p+1 \le j \le p+q, \\
\frac{1}{q} & \text{if } i = n-3 \text{ and } j = n-2, \\
q & \text{if } i = n-2 \text{ and } j = n-3, \\
K+1 & \text{if } i = n-2 \text{ and } j = n-1, \\
\frac{1}{K+1} & \text{if } i = n-1 \text{ and } j = n-2, \\
1 & \text{if } i = n-1 \text{ and } j = n, \text{ or } i = n \text{ and } j = n-1, \\
|C_j| & \text{if } i = n \text{ and } 1 \le j \le p, \\
\frac{1}{|C_i|} & \text{if } 1 \le i \le p \text{ and } j = n, \\
0 & \text{otherwise.}
\end{cases}$$

• Bounds:

$$b_{ij} = \begin{cases}
1 & \text{if } 1 \le i \le p,\ p+1 \le j \le p+q \text{ and } e_{j-p} \in C_i, \text{ or } p+1 \le i \le p+q,\ 1 \le j \le p \text{ and } e_{i-p} \in C_j, \\
1 & \text{if } p+1 \le i \le p+q \text{ and } j = n-3, \text{ or } i = n-3 \text{ and } p+1 \le j \le p+q, \\
q & \text{if } i = n-3 \text{ and } j = n-2, \\
1 & \text{if } i = n-2 \text{ and } j = n-3, \\
1 & \text{if } i = n-2 \text{ and } j = n-1, \\
K+1 & \text{if } i = n-1 \text{ and } j = n-2, \\
K & \text{if } i = n-1 \text{ and } j = n, \text{ or } i = n \text{ and } j = n-1, \\
1 & \text{if } i = n \text{ and } 1 \le j \le p, \\
|C_i| & \text{if } 1 \le i \le p \text{ and } j = n, \\
0 & \text{otherwise.}
\end{cases}$$

This completes the construction of τ(I), which is illustrated by digraph G1 in Figure 28.1.

Figure 28.1. Digraph G1 (each arc is labelled with its pair (rate, bound))

Now we claim that there exists arbitrage in market τ(I) if and only if $C$ contains a cover for $S$ of size $K$ or less. We first assume that there exists an arbitrage in market τ(I), i. e., there is a vector $x = (x_{ij})$ satisfying (28.1), (28.2) and (28.3) with at least one strict inequality in (28.2). Among all such vectors, we choose $x$ with the additional property that $\sigma(x) = \sum \{x_{ij} \mid ij \in E(G)\}$ is minimum. Obviously, $\sigma(x) > 0$. Further we have

Claim 28.1. $x_{ij} = 0$ if $p+1 \le i \le p+q$, $1 \le j \le p$ and $e_{i-p} \in C_j$. Otherwise put

$$x'_{ij} = \begin{cases} 0 & \text{if } p+1 \le i \le p+q,\ 1 \le j \le p \text{ and } e_{i-p} \in C_j, \\ x_{ij} & \text{otherwise.} \end{cases}$$

Note that for these arcs $r_{ij} = 1/2$ and $x_{ij} \le b_{ij} = 1$, so $\lfloor r_{ij} x_{ij} \rfloor = 0$; hence $x'$ is still an arbitrage but $\sigma(x') < \sigma(x)$, contradicting the choice of $x$.

Claim 28.2. $x_{in} = 0$ for $i = 1, \dots, p$, and $x_{i(i-1)} = 0$ for $i = n-2, n-1, n$. Indeed, assuming $x_{in} > 0$ for some $1 \le i \le p$, then by (28.2) and Claim 28.1, $x_{in} \le r_{ni} x_{ni} = |C_i| x_{ni}$. By setting $x_{ni} = x_{ni} - \lfloor x_{in} / |C_i| \rfloor$, $x_{in} = 0$ and keeping all other $x_{ij}$ unchanged, we obtain a new arbitrage $x$ with smaller $\sigma(x)$, a contradiction. A similar argument shows that $x_{i(i-1)} = 0$ for $i = n-2, n-1, n$.

Claim 28.3. $x_{(n-2)(n-1)} > 0$. If $x_{(n-2)(n-1)} = 0$, then it follows from (28.2), Claims 28.1 and 28.2 that $x_{(n-1)n} = 0$, $x_{ni} = 0$ for $i = 1, \dots, p$, $x_{ij} = 0$ for $1 \le i \le p$, $p+1 \le j \le p+q$ and $e_{j-p} \in C_i$, and

$$\sum_{i=1}^{n} \Big\{ \sum_{j \neq i} \lfloor r_{ji} x_{ji} \rfloor - \sum_{j \neq i} x_{ij} \Big\} = -\frac{q-1}{q}\, x_{(n-3)(n-2)} \le 0.$$

But as at least one inequality in (28.2) is strict, the left-hand side of the above inequality is positive, a contradiction. Now considering the integrality of $x$ and the bound $b_{(n-2)(n-1)} = 1$, we have $x_{(n-2)(n-1)} = 1$, from which it follows easily that $x_{(n-3)(n-2)} = q$, implying $x_{i(n-3)} = 1$ and $x_{(n-3)i} = 0$ for $i = p+1, \dots, p+q$. We derive further that, for each $i = p+1, \dots, p+q$, there is $h(i)$, $1 \le h(i) \le p$, such that $x_{h(i)i} = 1$ and $x_{nh(i)} = 1$, implying $e_{i-p} \in C_{h(i)}$ by the definition of $r_{ij}$, and hence $C^* = \{C_j \in C \mid x_{nj} = 1\}$ is a cover for $S$. Now let us show $|C^*| \le K$. Indeed, it follows from (28.2) and (28.3) that

$$|C^*| = \sum_{j=1}^{p} x_{nj} \le \lfloor r_{(n-1)n}\, x_{(n-1)n} \rfloor \le K.$$

Conversely, suppose that $C$ contains a cover $C' = \{C_{i_1}, \dots, C_{i_{K'}}\}$ for $S$ with $K' \le K$. We define a currency exchange $x = (x_{ij})$ as follows.

$$x_{ij} = \begin{cases}
1 & \text{if } C_i \in C' \text{ and } e_{j-p} \in C_i, \\
1 & \text{if } p+1 \le i \le p+q \text{ and } j = n-3, \\
q & \text{if } i = n-3 \text{ and } j = n-2, \\
1 & \text{if } i = n-2 \text{ and } j = n-1, \\
K' & \text{if } i = n-1 \text{ and } j = n, \\
1 & \text{if } i = n \text{ and } C_j \in C', \\
0 & \text{otherwise.}
\end{cases}$$

It is easy to check that x = ( xij ) satisfies (28.1)–(28.3) with strict inequality for i = n − 1 in (28.2). Hence there exists arbitrage in market τ ( I ) . The theorem is proved.
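The "easy to check" step of the converse direction can be reproduced on a small instance. The sketch below, in Python, builds the rates and bounds of τ(I) as read from the construction above and evaluates the net balances (28.2) of the exchange vector obtained from a cover. The function names, the dictionary layout and the toy SETCOVER instance are illustrative assumptions, not part of the chapter; subsets are assumed to be non-empty.

```python
from fractions import Fraction
from math import floor


def build_market(subsets, q, K):
    """Rates r and bounds b of tau(I); subsets[i-1] is C_i as a set of element indices 1..q."""
    p = len(subsets)
    n = p + q + 4
    r, b = {}, {}

    def arc(i, j, rate, bound):
        r[(i, j)], b[(i, j)] = Fraction(rate), bound

    for i, C in enumerate(subsets, start=1):
        for e in C:
            arc(i, p + e, 1, 1)               # set vertex -> element vertex
            arc(p + e, i, Fraction(1, 2), 1)  # element vertex -> set vertex
        arc(n, i, len(C), 1)
        arc(i, n, Fraction(1, len(C)), len(C))
    for e in range(1, q + 1):
        arc(p + e, n - 3, 1, 1)
        arc(n - 3, p + e, 1, 1)
    arc(n - 3, n - 2, Fraction(1, q), q)
    arc(n - 2, n - 3, q, 1)
    arc(n - 2, n - 1, K + 1, 1)
    arc(n - 1, n - 2, Fraction(1, K + 1), K + 1)
    arc(n - 1, n, 1, K)
    arc(n, n - 1, 1, K)
    return n, r, b


def vector_from_cover(subsets, q, cover):
    """Exchange vector of the converse direction; cover lists the indices of the chosen sets."""
    p = len(subsets)
    n = p + q + 4
    x = {}
    for i in cover:
        for e in subsets[i - 1]:
            x[(i, p + e)] = 1
        x[(n, i)] = 1
    for e in range(1, q + 1):
        x[(p + e, n - 3)] = 1
    x[(n - 3, n - 2)] = q
    x[(n - 2, n - 1)] = 1
    x[(n - 1, n)] = len(cover)
    return x


def balances(n, x, r):
    """Left-hand side of (28.2) for every currency 1..n."""
    bal = {v: 0 for v in range(1, n + 1)}
    for (i, j), amount in x.items():
        bal[i] -= amount
        bal[j] += floor(r[(i, j)] * amount)
    return bal


# Toy instance: S = {e1, e2, e3}, C1 = {e1, e2}, C2 = {e2, e3}, C3 = {e3}, K = 2.
subsets = [{1, 2}, {2, 3}, {3}]
q, K = 3, 2
n, r, b = build_market(subsets, q, K)
x = vector_from_cover(subsets, q, cover=[1, 2])
bal = balances(n, x, r)
print(all(x[a] <= b[a] for a in x), min(bal.values()) >= 0, bal[n - 1])
```

For this instance every balance is nonnegative and currency n − 1 ends with a surplus of K + 1 − K′, matching the strict inequality for i = n − 1 stated in the proof. The check covers only the direction from cover to arbitrage; the forward direction relies on the minimality argument of Claims 28.1–28.3.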

28.4 Polynomial-Time Solvable Models of Exchange Markets

Now we consider two polynomial-time solvable special cases: when the number of currencies is constant, or when the exchange digraph is star-shaped (a digraph is star-shaped if all arcs have a common vertex). These polynomial-time solvable cases suggest that there are exchange markets where finding an arbitrage is easy. Therefore, for such market models, it is easier to arrive at an arbitrage-free state than in the general model. As a bold claim, we may state that such exchange markets should be preferred over the general market models.

Theorem 28.2. There are polynomial-time algorithms for Problem (P) when the number of currencies is constant.

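First write $y_{ij}^k$ for $\lfloor r_{ij}^k x_{ij}^k \rfloor$ and $\bar{b}_{ij}^k$ for the smallest integer such that $\lfloor r_{ij}^k \bar{b}_{ij}^k \rfloor = \lfloor r_{ij}^k b_{ij}^k \rfloor$, $1 \le k \le l_{ij}$, $1 \le i \neq j \le n$. Then Problem (P) can be reformulated as the following integer linear program $(P^*)$:

$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} w_i \sum_{j \neq i} \Big( \sum_{k=1}^{l_{ji}} y_{ji}^k - \sum_{k=1}^{l_{ij}} x_{ij}^k \Big) \\
\text{subject to} \quad & \sum_{j \neq i} \Big( \sum_{k=1}^{l_{ji}} y_{ji}^k - \sum_{k=1}^{l_{ij}} x_{ij}^k \Big) \ge 0, && i = 1, \dots, n, \\
& r_{ij}^k x_{ij}^k \ge y_{ij}^k, && 1 \le k \le l_{ij},\ 1 \le i \neq j \le n, \\
& 0 \le x_{ij}^k \le \bar{b}_{ij}^k, && 1 \le k \le l_{ij},\ 1 \le i \neq j \le n, \\
& x_{ij}^k,\ y_{ij}^k \ \text{integer}, && 1 \le k \le l_{ij},\ 1 \le i \neq j \le n.
\end{aligned}$$

Hence (P) can be solved in polynomial time, see (Lenstra 1983).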
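The reduced bound used in $(P^*)$, i.e. the smallest integer whose rounded-down proceeds equal those of the original bound, has a closed form when the rate is positive: because the rounded-down proceeds are nondecreasing in the traded amount, the smallest such integer is the ceiling of the target proceeds divided by the rate. A minimal Python sketch, assuming exact rational rates (the function name `reduced_bound` is hypothetical):

```python
from fractions import Fraction
from math import ceil, floor


def reduced_bound(rate, bound):
    """Smallest integer m with floor(rate * m) == floor(rate * bound), assuming rate > 0."""
    target = floor(rate * bound)          # proceeds obtained when trading the full bound
    return ceil(Fraction(target) / rate)  # least amount that already yields those proceeds


print(reduced_bound(Fraction(1, 3), 7))
```

With rate 1/3 and bound 7, trading 7 units yields 2 units after rounding down, and 6 units already yield the same, so the seventh unit can be dropped without changing any balance in the other currency.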


But the approach above has the disadvantage that the number of variables in $(P^*)$ may be very large, since there are $l_{ij}$ variables for each ordered pair $(i, j)$. Now we present another approach that reduces the number of variables by solving $\prod \{l_{ij} \mid 1 \le i \neq j \le n\}$ integer linear programming problems with only one variable $x_{ij}$ for each ordered pair $(i, j)$. For this purpose, we suppose $r_{ij}^1 > r_{ij}^2 > \dots > r_{ij}^{l_{ij}}$ for each ordered pair $(i, j)$. Clearly, for an optimum solution $\bar{x} = (\bar{x}_{ij}^k)$, if $\bar{x}_{ij}^k > 0$, then $\bar{x}_{ij}^h = \bar{b}_{ij}^h$ for all $1 \le h < k$. For notational convenience, we write

$$s_{ij}^{k_{ij}} = \sum_{h=1}^{k_{ij}} \lfloor r_{ij}^h \bar{b}_{ij}^h \rfloor, \qquad u_{ij}^{k_{ij}} = \sum_{h=1}^{k_{ij}} \bar{b}_{ij}^h \qquad \text{for all } 0 \le k_{ij} \le l_{ij},\ 1 \le i \neq j \le n.$$

Note that $s_{ij}^0 = u_{ij}^0 = 0$ by convention. Now for each $(k_{ij})$ of $\{(k_{ij}) \mid 1 \le k_{ij} \le l_{ij},\ 1 \le i \neq j \le n\}$, we solve the following integer linear program $(P^*_{(k_{ij})})$:

$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} w_i \sum_{j \neq i} (y_{ji} - x_{ij}) \\
\text{subject to} \quad & \sum_{j \neq i} (y_{ji} - x_{ij}) \ge \sum_{j \neq i} \big( u_{ij}^{k_{ij}-1} - s_{ji}^{k_{ji}-1} \big), && i = 1, \dots, n, \\
& r_{ij}^{k_{ij}} x_{ij} \ge y_{ij}, && 1 \le i \neq j \le n, \\
& 0 \le x_{ij} \le \bar{b}_{ij}^{k_{ij}}, && 1 \le i \neq j \le n, \\
& x_{ij},\ y_{ij} \ \text{integer}, && 1 \le i \neq j \le n.
\end{aligned}$$

Among all the optimum solutions we choose an optimum solution $(x_{ij}^*, y_{ij}^*)$ of $(P^*_{(k_{ij}^*)})$ such that

$$\sum_{i=1}^{n} w_i \sum_{j \neq i} (y_{ji}^* - x_{ij}^*) + \sum_{i=1}^{n} w_i \sum_{j \neq i} \big( s_{ji}^{k_{ji}^*-1} - u_{ij}^{k_{ij}^*-1} \big)$$

is maximum, and for all ordered pairs $(i, j)$ and $k$, $1 \le i \neq j \le n$, $1 \le k \le l_{ij}$, define

$$\bar{x}_{ij}^k = \begin{cases} \bar{b}_{ij}^k & \text{if } 1 \le k < k_{ij}^*, \\ x_{ij}^* & \text{if } k = k_{ij}^*, \\ 0 & \text{if } k_{ij}^* < k \le l_{ij}. \end{cases}$$

Then the defined $(\bar{x}_{ij}^k)$ is an optimum solution to (P).

Theorem 28.3. It is solvable in polynomial time to find the maximum revenue at the center currency of arbitrage in a frictional foreign exchange market with bid-ask spread, bound and integrality constraints when the exchange digraph is star-shaped.


As the exchange digraph $G = (V, E)$ is star-shaped, the exchange transactions are conducted between the center currency and the other currencies in the market. The problem can be decomposed into subproblems $(P_i^{**})$, $i = 2, \dots, n$:

$$\begin{aligned}
\text{maximize} \quad & \sum_{k=1}^{l_{i1}} y_{i1}^k - \sum_{k=1}^{l_{1i}} x_{1i}^k \\
\text{subject to} \quad & \sum_{k=1}^{l_{1i}} y_{1i}^k - \sum_{k=1}^{l_{i1}} x_{i1}^k \ge 0, \\
& r_{1i}^k x_{1i}^k \ge y_{1i}^k, && 1 \le k \le l_{1i}, \\
& r_{i1}^k x_{i1}^k \ge y_{i1}^k, && 1 \le k \le l_{i1}, \\
& 0 \le x_{1i}^k \le \bar{b}_{1i}^k, && 1 \le k \le l_{1i}, \\
& 0 \le x_{i1}^k \le \bar{b}_{i1}^k, && 1 \le k \le l_{i1}, \\
& x_{1i}^k,\ y_{1i}^k \ \text{integer}, && 1 \le k \le l_{1i}, \\
& x_{i1}^k,\ y_{i1}^k \ \text{integer}, && 1 \le k \le l_{i1},
\end{aligned}$$

where currency 1 is the center currency. Clearly each $(P_i^{**})$ can be solved in polynomial time by the same technique used in the proof of Theorem 28.2.

Theorem 28.4. It is solvable in polynomial time to determine the existence of arbitrage in a frictional foreign exchange market with bid-ask spread, bound and integrality constraints when the exchange digraph is star-shaped.

Indeed, if the exchange digraph G = (V, E) is star-shaped, exchange transactions take place only between the center currency and the other currencies in the market. Therefore, the existence of arbitrage can be decided separately for each non-center currency, and hence in polynomial time.
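The pairwise decomposition can be illustrated with a small Python sketch. It assumes a single offer in each direction between the center and every other currency, and it simply enumerates all integer trades within the bounds. This enumeration is an illustrative stand-in, not the integer programming technique of Theorem 28.2: its running time grows with the numeric size of the bounds, so it is not polynomial in the input length. All names and numbers are hypothetical.

```python
from fractions import Fraction
from math import floor


def pair_has_arbitrage(r_out, b_out, r_back, b_back):
    """Search integer trades between the centre and one other currency.

    r_out, b_out: rate and bound for selling the centre currency;
    r_back, b_back: rate and bound for selling the other currency back.
    """
    for x_out in range(b_out + 1):
        received = floor(r_out * x_out)  # other currency obtained
        # selling back at most what was received keeps the other currency's balance >= 0
        for x_back in range(min(b_back, received) + 1):
            centre_gain = floor(r_back * x_back) - x_out
            other_gain = received - x_back
            if centre_gain >= 0 and (centre_gain > 0 or other_gain > 0):
                return True
    return False


def star_has_arbitrage(arcs):
    """arcs maps each non-centre currency i to the tuple (r_1i, b_1i, r_i1, b_i1)."""
    return any(pair_has_arbitrage(*params) for params in arcs.values())


no_gain = {2: (Fraction(1, 2), 4, Fraction(2), 2)}    # r_12 * r_21 = 1
mispriced = {3: (Fraction(3), 2, Fraction(1, 2), 10)}  # r_13 * r_31 > 1
print(star_has_arbitrage(no_gain), star_has_arbitrage(mispriced))
```

Checking each pair on its own suffices because every non-center currency trades only with the center: if the center gains on some pair, that pair alone is an arbitrage, and otherwise the center breaks even on every pair and some other currency must show a surplus.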

28.5 Arbitrage on Futures Markets

In today's markets, taking foreign exchange as an example, forward markets allow trading of contracts for foreign currency exchanges at a future date. This is called the futures market, in contrast to the market discussed above, which is commonly referred to as the spot market. Arbitrage in futures markets is a more advanced type of arbitrage than arbitrage in the spot market. In this section, we ask whether futures markets make the problem much more difficult.

Figure 28.2. Digraph G2

For example, when a star-shaped exchange market is combined with a futures market, the resulting structure is a coalescence of a star-shaped exchange digraph and its copy, shown as digraph G2 in Figure 28.2. The problem of identifying arbitrage then becomes NP-complete, as shown in the following theorem.

Theorem 28.5. It is NP-complete to decide whether there exists arbitrage in a frictional foreign exchange market with bid-ask spreads, bound and integrality constraints even if its exchange digraph is coalescent.

Proof. It is easily seen that the decision problem is in the class NP. We describe a polynomial-time reduction τ from PARTITION to an instance of the decision problem. Consider an instance I of PARTITION, see (Garey and Johnson 1979) and (Karp 1972): Given a set $C = \{c_2, \dots, c_{n-1}\}$ of positive integers, is there a subset $C' \subseteq C$ such that $\sum_{c_i \in C'} c_i = \sum_{c_i \in C \setminus C'} c_i$?

We put $s = \sum_{i=2}^{n-1} c_i$ and construct an instance τ(I) from I as follows.

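• Currencies: $1, 1', \dots, n, n'$.

• Rates:

$$r_{11'} = r_{1'1} = \tfrac{1}{2}, \qquad r_{ii'} = r_{i'i} = 1 \ \text{ if } i = 2, \dots, n,$$

$$r_{1i} = \begin{cases} \frac{1}{4c_i} & \text{if } i = 2, \dots, n-1, \\ \frac{1}{2} & \text{if } i = n, \end{cases}
\qquad
r_{i1} = \begin{cases} \frac{1}{2} & \text{if } i = 2, \dots, n-1, \\ 2s+1 & \text{if } i = n, \end{cases}$$

$$r_{1'i'} = \begin{cases} \frac{1}{2} & \text{if } i = 2, \dots, n-1, \\ \frac{1}{2s} & \text{if } i = n, \end{cases}
\qquad
r_{i'1'} = \begin{cases} 4c_i & \text{if } i = 2, \dots, n-1, \\ \frac{1}{2} & \text{if } i = n. \end{cases}$$

• Bounds:

$$b_{ii'} = b_{i'i} = 1 \ \text{ if } i = 1, \dots, n, \qquad
b_{1i} = \begin{cases} 4c_i & \text{if } i = 2, \dots, n-1, \\ 1 & \text{if } i = n, \end{cases} \qquad
b_{i1} = 1 \ \text{ if } i = 2, \dots, n,$$

$$b_{1'i'} = \begin{cases} 1 & \text{if } i = 2, \dots, n-1, \\ 2s & \text{if } i = n, \end{cases} \qquad
b_{i'1'} = 1 \ \text{ if } i = 2, \dots, n.$$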

This completes the construction of τ(I). If there is a subset $C' \subseteq C$ with $\sum_{c_i \in C'} c_i = \sum_{c_i \in C \setminus C'} c_i$ $(= s/2)$, put $x^*_{1i} = 4c_i$, $x^*_{ii'} = x^*_{i'1'} = 1$ if $c_i \in C'$, $x^*_{1'n'} = 2s$, $x^*_{n'n} = x^*_{n1} = 1$, and $x^*_{ij} = 0$ otherwise.

Then it is easily seen that the exchange vector $x^*$ so defined is an arbitrage.


Conversely, suppose that there is a vector $\hat{x} = (\hat{x}_{ij})$ that satisfies (28.1)–(28.3) with at least one strict inequality in (28.2). By arguments similar to those used in the proof of Theorem 28.1, we have $\hat{x}_{11'} = \hat{x}_{1'1} = 0$, $\hat{x}_{1'i'} = \hat{x}_{i'i} = \hat{x}_{i1} = 0$ if $i = 2, \dots, n-1$, $\hat{x}_{1n} = \hat{x}_{nn'} = \hat{x}_{n'1'} = 0$, $\hat{x}_{n1} = \hat{x}_{n'n} = 1$, and $\hat{x}_{1'n'} = 2s$.

Moreover, if $\hat{x}_{1i} > 0$, then $\hat{x}_{1i} = 4c_i$ and $\hat{x}_{ii'} = \hat{x}_{i'1'} = 1$; conversely, if $\hat{x}_{i'1'} > 0$, then $\hat{x}_{i'1'} = \hat{x}_{ii'} = 1$ and $\hat{x}_{1i} = 4c_i$. Put $C' = \{c_i \in C \mid \hat{x}_{1i} = 4c_i\}$. Then $C'$ is the required subset. Indeed, it follows from (28.2) that $2s + 1 \ge \sum_{c_i \in C'} 4c_i \ge 2s$, implying $\sum_{c_i \in C'} 4c_i = 2s$, as $\sum_{c_i \in C'} 4c_i$ is even. Therefore $\sum_{c_i \in C'} c_i = s/2$.
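The direction from a partition to an arbitrage can be checked numerically. The Python sketch below builds the rates and bounds of the coalescent market as read from the construction above and evaluates the balances (28.2) of the exchange vector $x^*$. The encoding of the primed currency $i'$ as the pair (i, 1) and of $i$ as (i, 0), the function names, and the toy PARTITION instance are assumptions made for illustration only.

```python
from fractions import Fraction
from math import floor


def build_market(c):
    """c maps i in {2, ..., n-1} to c_i; currency i is (i, 0) and its copy i' is (i, 1)."""
    n = max(c) + 1
    s = sum(c.values())
    r, b = {}, {}

    def arc(u, v, rate, bound):
        r[(u, v)], b[(u, v)] = Fraction(rate), bound

    for i in range(1, n + 1):
        rate = Fraction(1, 2) if i == 1 else 1
        arc((i, 0), (i, 1), rate, 1)
        arc((i, 1), (i, 0), rate, 1)
    for i, ci in c.items():
        arc((1, 0), (i, 0), Fraction(1, 4 * ci), 4 * ci)
        arc((i, 0), (1, 0), Fraction(1, 2), 1)
        arc((1, 1), (i, 1), Fraction(1, 2), 1)
        arc((i, 1), (1, 1), 4 * ci, 1)
    arc((1, 0), (n, 0), Fraction(1, 2), 1)
    arc((n, 0), (1, 0), 2 * s + 1, 1)
    arc((1, 1), (n, 1), Fraction(1, 2 * s), 2 * s)
    arc((n, 1), (1, 1), Fraction(1, 2), 1)
    return n, s, r, b


def vector_from_partition(c, chosen):
    """Exchange vector x* for the index set `chosen` of one side of the partition."""
    n = max(c) + 1
    s = sum(c.values())
    x = {}
    for i in chosen:
        x[((1, 0), (i, 0))] = 4 * c[i]
        x[((i, 0), (i, 1))] = 1
        x[((i, 1), (1, 1))] = 1
    x[((1, 1), (n, 1))] = 2 * s
    x[((n, 1), (n, 0))] = 1
    x[((n, 0), (1, 0))] = 1
    return x


def balances(x, r):
    """Left-hand side of (28.2) for every currency that appears in the market."""
    bal = {v: 0 for arc in r for v in arc}
    for (u, v), amount in x.items():
        bal[u] -= amount
        bal[v] += floor(r[(u, v)] * amount)
    return bal


# Toy instance: c_2 = 1, c_3 = 2, c_4 = 3, so s = 6 and {c_4} is one half of a partition.
c = {2: 1, 3: 2, 4: 3}
n, s, r, b = build_market(c)
x = vector_from_partition(c, chosen=[4])
bal = balances(x, r)
print(all(x[a] <= b[a] for a in x), min(bal.values()) >= 0, bal[(1, 0)])
```

In this instance currency 1 spends 2s units and receives 2s + 1 units back through the chain 1, 4, 4′, 1′, 5′, 5, 1, while every other currency breaks even. The sketch does not address the converse direction, which depends on the structural argument in the proof.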

28.6 The Minimum Number of Operations to Eliminate Arbitrage

Finally we turn to another problem: in order to eliminate arbitrage, how many transactions do we have to make? Note that each arc (a, b) in an exchange digraph represents an exchange rate offer from the currency a to the currency b. Therefore, the number of transactions is related to the number of arcs involved in a sequence of transactions. This minimum number is interesting because it is related to the speed at which an arbitrage state can be reduced to an arbitrage-free state. On the other hand, it also reflects the complication involved in a sequence of arbitrage operations in the most general exchange market model. Our answer is that the number of transactions needed is, in the worst case, quite large: it is proportional to the maximum number of edges in a graph.

Theorem 28.6. There is an exchange digraph of order $n$ such that at least $\lfloor n/2 \rfloor \lceil n/2 \rceil - 1$ transactions and at least $n^2/4 + n - 3$ arcs are needed to bring the system back to a non-arbitrage state.

For instance, consider the currency exchange market corresponding to digraph G3 = (V, E), where the number of currencies is n = |V|, p = ⌊n/2⌋ and K = n². Set C = {aij ∈ E | 1 ≤ i ≤ p, p + 1 ≤ j ≤ n} ∪ {a1(p+1)} ∪ {ai(i−1) | 2 ≤ i ≤ p} ∪ {ai(i+1) | p + 1 ≤ i ≤ n − 1}.

{a( p +1)1}

Figure 28.3. Digraph G3

Then |C| = ⌊n/2⌋⌈n/2⌉ + n − 2 = |E|/2 > n²/4 + n − 3. It follows easily from the rates and bounds that each arc in C has to be used to eliminate arbitrage. And ⌊n/2⌋⌈n/2⌉ − 1 transactions corresponding to {aij ∈ E | 1 ≤ i ≤ p, p + 1 ≤ j ≤ n} {a(p+1)1} are needed to bring the system back to a non-arbitrage state.

28.7 Remarks and Discussion

Research on the computational complexity of concepts in economics has been growing steadily in recent years, e. g., in cooperative games, see (Megiddo 1978, Deng and Papadimitriou 1994), arbitrage in a security market, see (Deng, Li, and Wang 2002), general equilibrium, see (Deng, Papadimitriou, and Safra 2002), and the settlement of the computational complexity of Nash equilibrium in bimatrix form (Daskalakis, Goldberg, and Papadimitriou 2006; Chen and Deng 2006). Our study follows this tradition and applies computational complexity to the study of arbitrage in exchange markets. It shows that different exchange systems result in different computational complexities for arbitrage. Our hardness results and polynomial-time solvability results for different models of exchange markets reveal that market structure is relevant to the speed at which an arbitrage opportunity may be identified and, hence, eliminated. Consequently, complexity could play an important role in determining which kinds of market structures are adopted in reality. Arguably, we may prefer a star-shaped exchange market model, as reflected in the role money now plays in markets. The ease of finding arbitrage in an exchange market of constant size could be interpreted as a force behind the trend towards a reduced number of currencies, such as the establishment of the Euro. We have further obtained interesting results for arbitrage in currency futures markets and for minimization of the transaction operations to eliminate arbitrage in


the most general exchange market. Further study of the complexity of such transactions in more realistic settings would be very interesting. Monetary systems in reality are the result of an evolutionary process that has lasted for centuries. Many factors are observed to be important, see, for example, (Mundell 1961 and 2000). Our work identifies a new and independent factor, the computational complexity of arbitrage, that accounts for differences between exchange systems. Indeed, the transition from a barter economy to a money economy is consistent with our finding that the former may result in a much more complex system than the latter for identifying arbitrage, taking advantage of it, and eliminating it. Our approach may shed new light on the study of monetary systems, and we expect it to become a fruitful method in finance.

References

Abeysekera SP, Turtle HJ (1995) Long-run Relations in Exchange Markets: A test of covered interest parity. J. Financial Research 18(4), 431-447
Ausiello G, Crescenzi P, Gambosi G, Kann V, Marchetti-Spaccamela A, Protasi M (1999) Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties, Berlin, Springer
Chen X, Deng X (2006) Settling the Complexity of Two-Player Nash Equilibrium. FOCS: 261-272
Clinton K (1988) Transactions costs and covered interest arbitrage: Theory and evidence. J. Political Econ. 96(2), 358-370
Daskalakis C, Goldberg PW, Papadimitriou C (2006) The complexity of computing a Nash equilibrium. STOC: 71-78
Deng X, Li ZF, Wang S (2002) Computational Complexity of Arbitrage in Frictional Security Market. International Journal of Foundations of Computer Science (5), 681-684
Deng X, Papadimitriou C (1994) On the Complexity of Cooperative Game Solution Concepts. Math. Oper. Res. 19(2), 257-266
Deng X, Papadimitriou C, Safra S (2002) On the Complexity of Equilibria. In Proceedings of the Thirty Fourth Annual ACM Symposium on the Theory of Computing, 67-71
Garey MR, Johnson DS (1979) Computers and Intractability: A Guide of the Theory of NP-Completeness, Freeman, San Francisco
Gates B, Myhrvold N, Rinearson P (1995) The Road Ahead, November.
Hochbaum D (1997) Approximation Algorithms for NP-hard Problems, PWS, Boston
Jones CK (2001) A Network Model for Foreign Exchange Arbitrage, Hedging and Speculation. International Journal of Theoretical and Applied Finance 4(6), 837-852
Karp RM (1972) Reducibility among combinatorial problems. In R. E. Miller and J. W. Thatcher (eds.), Complexity of Computations, Plenum Press, New York, 85-103
Lenstra Jr HW (1983) Integer programming with a fixed number of variables. Math. Oper. Res. 8(4), 538-548
Mavrides M (1992) Triangular Arbitrage in the Foreign Exchange Market – Inefficiencies, technology and investment opportunities, Quorum Books, London
Megiddo N (1978) Computational Complexity and the game theory approach to cost allocation for a tree. Math. Oper. Res. 3, 189-196
Mundell RA (1961) A Theory of Optimum Currency Areas. The American Economic Review 51(4), 657-665
Mundell RA (2000) Currency Areas, Exchange Rate Systems, and International Monetary Reform, paper delivered at Universidad del CEMA, Buenos Aires, Argentina, April 17. http://www.robertmundell.net/pdf/Currency
Mundell RA (2001) Gold Would Serve into the 21st Century. Wall Street Journal, September 30, 1981, p. 33
Simon H (1972) Theories of Bounded Rationality. In Decision and Organization (R. Radner, Ed.), North-Holland, New York
Zhang S, Xu C, Deng X (2002) Dynamic Arbitrage-free Asset Pricing with Proportional Transaction Costs. Mathematical Finance 12(1), 89-97
Zuckerman D (1993) NP-complete problems have a version that's hard to approximate. In th structure in Complexity Theory Conf., 305-312

Part III Further Aspects and Future Trends

Introduction to Part III

This last part of the handbook contains four chapters devoted to challenges and trends in the financial sector. The first challenge is the security problem, which is of special importance in the digital world. It is presented by Müller, Sackmann and Prokein in their contribution on IT security, which covers new requirements, regulations and approaches. The key problem is that financial institutions are exposed to security risks depending on their interconnections worldwide and the specification of their product. Data and the availability and access to information at any time are a key success factor for companies, but also a means for customers to gain transparency in financial transactions and positions. Today, information systems communicate in a way that is controlled by their owners. In future, these information systems will be infrastructures that will react to a rapidly changing environment. This shift requires new security paradigms, where technical security mechanisms no longer prevent “bad” things from happening, but where security policies based upon national and international regulations provide a flexible and dynamic means to assure security or at least to make risks calculable. In this chapter, the challenges of IT security and its management resulting from the technological development are discussed. The second challenge is to find the right legislative regulations of the digital economy. It is discussed in Chapter 30 by Spindler and Recknagel, which is devoted particularly to legal aspects of Internet banking in Germany. Internet-banking has become increasingly popular in Germany within the last ten years. The variety of possible transactions and the scope of business have widened, starting from pure transaction banking via internet broking to the complete spectrum of banking services including credits and loans. 
Internet-banking is part of the direct-banking spectrum, which besides internet-banking includes communication by telephone, fax or letter. The discussion concerning the terms "online-banking" or "home-banking" is unfruitful and has become obsolete since the termination of the BTX-Service by the German Telekom. Besides the common legal banking questions, additional problems arise from the use of the internet, in particular the necessity of providing sufficient access to the bank-server – crucial for instance


for day trading. Moreover, issues of identifying customers and security assurance are essential within this sensitive business field. Not surprisingly the legal discussion has focused on the problems of phishing, pharming or identity theft, which endanger financial transactions and the acceptance of internet-banking. One outstanding problem is the burden of proof due to an alleged identity theft or a breach of obligations by the customer. In the chapter the conclusion of a contract between bank and customer, the specific dangers of internet-banking and finally the question of liability are analysed and discussed in a nutshell. Frosch, Lauterbach and Warg present challenges and trends for insurance companies in Chapter 31. The actuarial results of insurance companies are generated by three central variables: growth in high yield business sectors, claims expenditure and the reduction in expense ratios. If the consistent and ongoing restructuring of portfolios presupposes the reduction in claims expenditure, the challenge is to create growth in the high yield sectors. The chapter describes the importance of IT for identifying and implementing the appropriate strategy, for self-controlling systems, for the development of unique selling propositions, for providing support to sales and marketing and for identifying, recruiting and developing the right talent. The last chapter of this part and hence the final chapter of the handbook is devoted to the strategically important question of finding the right balance and partnership between industrial and academic research. Here, Koenig, Blumenberg and Martin discuss added value and challenges of industry-academic research partnerships using the successful E-Finance Lab as illustrative example. 
As many financial industry players like banks do not operate their own research and development departments, it is the mission of industry-academic research partnerships like the E-Finance Lab to develop scientifically sound, yet practical methods for an industrialisation of the financial service industry. Thereby, knowledge sharing between research institutes and industry partners implies a two-way stream of information. Research activities have to be soundly based to both ensure high quality of research results and in parallel to enable an efficient knowledge transfer into practice. This chapter reveals the added value of such co-operations as well as the challenges that industry-academic research organisations are facing.

CHAPTER 29 IT Security: New Requirements, Regulations and Approaches Günter Müller, Stefan Sackmann, Oliver Prokein

29.1 Introduction

Over the past decades, the business environment in the banking sector has changed substantially. Financial service providers and direct brokers have entered the market for private customers. Insurers now offer investment funds and other products for retirement provision, and companies that usually operate in other sectors offer new services to customers (e. g. home-banking software from Microsoft). Faced with such increased competition, banks in particular have pursued Customer Relationship Management (CRM) (Sackmann and Strüker 2005) as a strategy for canvassing new customers and optimizing existing customer relationships. This personalization has a serious side effect: the business of whole institutions depends on the availability, correctness and security of the infrastructure that runs the services. Security and the relationship between customers and providers have become critical issues for both sides, in order to protect assets and to provide transparency. This contribution discusses the changing challenges to IT security resulting from the integration of various financial services relying on a dynamic technological infrastructure. In the following section, online portals are presented as one solution for customizing services dynamically according to the needs of the customer. The integration of services and the rapidly changing environment extend the demands on security mechanisms. The challenges to security are the subject of the third section. With a focus on the paradigm shift brought by ubiquitous computing, dependability, as a crucial demand on security approaches, is outlined and discussed. In the fourth section, policies and audits are discussed as promising concepts for reaching dependability. As a result, purely technological mechanisms are not able to provide total security or dependability. IT security becomes subject to risks that have to be considered by value-based management, as discussed in the fifth section. A conclusion completes this chapter.


29.2 Online Services in the Financial Sector

29.2.1 Integration of Financial Services

With the focus on the customer, activities shift from isolated services to integrated services covering the customer’s wider needs. Usually a customer is situated in the middle of a process, e. g. “property purchase”, and needs numerous products and/or services (Österle et al. 2000). Traditionally, the coordination of such a process is up to the customer himself: several providers such as a bank, insurance broker, notary, architect etc. have to be contacted and the different offers have to be evaluated. Obtaining all the information and products required from one source through this integration can benefit the customer by enormously reducing search costs. From a service provider’s view, an integration of services can be realized by means of process portals (Puschmann and Alt 2004). In order to cover the whole process, it becomes necessary for the integrating organization to cooperate with a multitude of business partners, thereby producing a supply chain network. The process portal architecture has to be flexible enough to dynamically integrate the required internal and external services in order to adapt to the individual needs and preferences of different customers. Concrete examples of such flexible process portals can be found, e. g., at Credit Suisse (www.yourhome.ch) or HypoVereinsbank (www.immoseek.de). They are based on a modular architecture, visualized in Figure 29.1, that encompasses four modules: the process, integration, administration and security modules (Puschmann and Alt 2004).

On the basis of personalized data, the process module aims at offering all the required information and products to the customer. The module is composed of five different elements:

• Presentation: Display of all the information and products according to the customers’ needs and preferences on the user interface.
• Navigation: Provision of different browsing models depending on the desired information and products.
• Interaction: The information and products required can be requested by the customer (pull model) or offered by the portal operator (push model).
• Personalization: In order to satisfy the customer process, all information and products required are generated on the basis of personalized data and individual preferences.
• Content Aggregation: According to the personalized data and individual preferences, the portal extracts and applies the information and products from different content sources.

The integration module pools the heterogeneous applications. At the presentation level of the user interface, the integration can be realized by portlets, which are web applications based on standards to facilitate, for example, the exchange of data via

Figure 29.1. Architecture of a process portal (Puschmann and Alt 2004)
[Diagram labels: Security Module, Process Module, Administration Module, Integration Module; Authentification, Secure Communication, Single Sign-On, Session Handling; Presentation, Navigation, Interaction, Personalization, Content Aggregation; User Administration, Role Administration, Portlet Administration, Monitoring.]

HTML. At the functionality level, this integration can be realized by communication between loosely coupled software components such as Web Services¹. At the data level, the integration can be realized with enterprise application integration (EAI), which provides coordinated access to different systems and their data. Web Services and EAI are complementary rather than competing approaches. EAI systems are mainly employed in the intra-organizational field, whereas their standardization makes Web Services suitable for inter-organizational applications. The process and integration modules are further supported by the administration module, which is responsible for adapting to the dynamics of the underlying architecture. To this end, the administration module defines the different roles of the integrated actors by adding or removing portlets. This includes, for example, applying software updates, adding new services or removing obsolete applications. The security module supports the process and integration modules as well and provides the necessary security guarantees for data and communication. The design of the security module is of particular importance in the context of financial services. On the one hand, the underlying CRM requires the collection and use of

¹ The W3C defines a Web service as a software system designed to support interoperable machine-to-machine communication over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards (Booth et al. 2004).


sensitive personal data and has, at a minimum, to guarantee compliance with legal regulations. On the other hand, the integration of services from other service providers requires, at least partly, the communication of this data. Thus, the security module has at least to signal the manner in which personal data is collected and used and to provide technical mechanisms for making the usage transparent and comprehensible. Furthermore, the module has to guarantee security according to the classical protection goals of multilateral security (confidentiality, integrity, accountability and availability) (Müller et al. 2003). One main success factor for the acceptance of process portals and the integration of complete customer processes will be the availability of suitable security functionalities. The dynamic integration of services and the rapidly changing environment raise new demands on security approaches, which are sketched and discussed in the following section.

29.2.2 Security Risks of Personalized Services

The desire of any customer to remain in control of his data poses a tremendous problem to present-day security technology, whose prime objective is to avoid the transmission and exposure of unnecessary personal data. It also reveals a risk that arises if the services described above are accepted. This trade-off limits customization based on extensive profiling: anonymity rules out personalization. Once access to personal data is granted, there is no control over how the data are processed. The challenge for financial service providers is therefore to provide technology that customers accept, in order to gain their trust. One promising way can be seen in an integrated approach of technology and legal contracts (Sackmann et al. 2006).

29.3 Security beyond Access Control

Three seemingly different technical directions (mobile, pervasive and autonomic computing) define the future infrastructure and influence the mechanisms for “security” (Figure 29.2). All three are part of a ubiquitous computing infrastructure. Yet their main emphases are so different that it is currently still justifiable to speak of different areas. The focus of the mobile computing effort is the global reachability of each network participant, irrespective of his location. The inclusion of context is the challenge of pervasive computing. The miniaturization and variety of the devices and services involved make it increasingly unlikely that such information systems can be configured or operated by humans. The systems become autonomous and manage themselves independently.

Figure 29.2. Evolution of computing
[Timeline with the marks 2000, 2010 and 2020 and the labels Mobile Computing, Pervasive Computing and Ambient Computing.]

29.3.1 Security: Protection Goals

The security of an IT system is defined in terms of the four protection goals: confidentiality, integrity, availability and non-repudiation (Figure 29.4). Safety and system stability are not considered part of IT security. The threats to the security of an IT system are analyzed using an “attacker model” that consists of a description of the potential attacks and the different capabilities of an attacker (Müller and Gerd tom Markotten 2000). A threat may open up the possibility of an attack, i. e. a negative change of behavior, possibly leading to some form of information leakage. Once the potential threats are identified, security policies describing how to counter these threats are developed; these policies are expressed by means of rules and are eventually realized by choosing the appropriate security mechanisms. It is worth remarking that, since security mechanisms can only cope with known attacks and not with unforeseen threats, total security cannot be ensured. Security is often equated with access control, which consists of authentication and authorization, and is realized in such a way that unauthorized operations are identified in advance and subsequently forbidden (Lampson et al. 1992). Although this approach has so far constituted a consistent basis for security in information systems, it is not adequate for the development towards ubiquitous computing (Figure 29.2):

• Reachability: Continuous reachability can be endangered by a lack of security in such a way that the reliability of the entire system is negatively influenced. Security in the form of access control can only partly help here, as it requires an appropriate central infrastructure which is not available in such dynamic systems.


• Context-awareness: Context recognition is crucial for ascertaining the meaning of the surrounding components. If context information is not made available with integrity and, if necessary, confidentially, emergent structures cannot be properly supported, for instance in the composition of available services.
• Autonomous Problem Handling: IT systems recognize situations and react without user interaction. Security-related decisions, such as the admission of external code, are also made in this way. In doing so, the system must not put itself into a state that contradicts the guidelines and thereby endangers the reliability of the system.

In spite of constant further development of available mechanisms, the security of current systems is increasingly endangered by more sophisticated attacks, as the number of known vulnerabilities and attacks illustrates (Figure 29.3).

Figure 29.3. CERT statistics on the number of vulnerabilities and incidents registered


29.3.2 Security: Dependability

The foundations of the methods currently used to provide security were developed in the mainframe era and have survived until today with few changes and adjustments. Even though safety and system stability are not usually part of IT security, they are increasingly becoming a backdoor for threats and attacks. Since many processes in daily life and business depend on safe and stable systems, dependability extends IT security towards safety and system stability (Figure 29.4). If, however, one considers present IT systems, a constant rise in system failures can be observed, which is due to an increasing number of software problems. The methods and tools for increasing the reliability of IT systems depend on the respective underlying computing paradigms. Thus, if the paradigms change, the security methods and tools have to be checked for their future suitability.

Figure 29.4. Aspects of dependability

29.4 Security: Policy and Audits

The concentration of security efforts on the specification of undesired system behavior and on mechanisms for its avoidance is partly responsible for declining dependability. Two of the most promising concepts are explained in the following:

1. Security policies describe desirable system behavior. They may be divided into two categories, namely access control and command-based policies.
2. Secure logging services allow audits to be performed and make it possible to distinguish between acceptable and non-acceptable system states in terms of security.


29.4.1 Security Policies

The scope of security policies includes authorization and the control of resources and communication channels. They define which components or services of a system are allowed to access protected resources and regulate how they do so. Thus, security policies restrain components from interacting with protected resources in an uncontrolled manner. Security policies of this kind forbid undesired events (Schneier and Kelsey 1999). In addition to these undesired events, there are events that are desirable but pose a risk to the system.

29.4.1.1 Access Control Based Security Properties

The definition of security policies can be divided into three steps and follows a life cycle model (Figure 29.5) (Control Data 1999):

1. Policy specification: In this phase, risks are identified and the assets to be protected are determined.
2. Policy enforcement: This is the operational phase of the system, where the specified security policy is executed using enforcement mechanisms.
3. Policy evaluation: In this phase, the success of the specified policy in protecting the assets against the predicted threats is evaluated. The security assumptions made during the specification phase are tested and failures are analyzed for future incorporation into the policy.

For the specification of security objectives, a plethora of policy languages has been developed. While some of them are targeted at particular application domains, there also exist “all-round” languages. Although the syntax of these security policy languages differs widely, the basic elements of their rules are based on the common form of low-level action rules (Sloman 1994). They encompass the attributes modality, subject, objective and constraints (Figure 29.6).

Figure 29.5. Life cycle of security policies


Figure 29.6. Attributes of a security policy rule
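The rule attributes named in this section (modality, subject, objective and constraints) can be made concrete with a small sketch. The following Python fragment is an illustration only and does not reproduce any of the policy languages cited here. The class `PolicyRule`, the functions `office_hours` and `evaluate`, the subject `clerk` and the target `customer_profile` are hypothetical names; the two constraint examples are the ones used in this section for provisions and obligations. Denying every request that the rule does not cover is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Callable, Dict, List, Tuple


@dataclass
class PolicyRule:
    """One low-level action rule (all names here are hypothetical)."""
    modality: str   # "authoritative" or "imperative"
    subject: str    # active entity the rule applies to
    action: str     # intended action (part of the objective)
    target: str     # targeted object (part of the objective)
    provisions: List[Callable[[Dict], bool]] = field(default_factory=list)
    obligations: List[str] = field(default_factory=list)


def office_hours(context: Dict) -> bool:
    """Provision: access granted only between 0900 and 1700 hours."""
    return time(9, 0) <= context["now"] <= time(17, 0)


def evaluate(rule: PolicyRule, subject: str, action: str, target: str,
             context: Dict) -> Tuple[bool, List[str]]:
    """Return (allowed, obligations); requests the rule does not cover are denied."""
    if (rule.subject, rule.action, rule.target) != (subject, action, target):
        return False, []
    # Every provision must hold at the time the rule is applied.
    if not all(provision(context) for provision in rule.provisions):
        return False, []
    # Obligations are handed back so that the caller can track them later.
    return True, list(rule.obligations)


rule = PolicyRule(
    modality="authoritative",
    subject="clerk",
    action="read",
    target="customer_profile",
    provisions=[office_hours],
    obligations=["release resource five minutes after access"],
)

print(evaluate(rule, "clerk", "read", "customer_profile", {"now": time(10, 30)}))
print(evaluate(rule, "clerk", "read", "customer_profile", {"now": time(18, 0)}))
```

The first call is allowed and returns the obligation for later supervision; the second is denied because the provision fails. Returning obligations instead of enforcing them reflects that they concern future behavior, which an access decision alone cannot guarantee.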

Security policies act upon a given subject and define intended actions on this subject or parts of it. Policy action rules distinguish between the active subjects, the intended actions and the targeted objects. This results in the objectives of policy rules. The modality of the rules is differentiated into two categories:

• Authoritative: These rules define what a subject is allowed or not allowed to do and give it the power to perform its requested action on a target object. For security policies, authorization rules are mainly concerned with managing access control tasks.
• Imperative: These event-triggered rules carry out management tasks and define what a subject must or must not do when a certain event occurs. Imperative policies have a subject-based interpretation in that the subject needs to interpret the policy statement and perform the actions on the object.

An important attribute of a policy rule is the set of constraints under which the rule is valid. These constraints can be classified as provisions and obligations (Hilty et al. 2005):

• Provisions: Similar to the concept of pre-conditions, provisions relate to actions that need to be performed by the subject before a policy rule becomes valid. Alternatively, they specify conditions to be fulfilled by the environment at the time of rule execution, e. g. the time constraint “access granted only between 0900 and 1700 hours”.
• Obligations: Obligations impose conditions on the future behavior of components and often define how protected data has to be used when accessed, e. g. “release resource five minutes after access”. Thus, obligations are similar to post-conditions.

29.4.1.2 Command-based Security Policies

In addition to the undesired properties, there are also properties that are desirable for the entire system but have to be treated in an individualized manner. For example, a store would go out of business if the retailer were to prevent unknown customers from shopping. In e-commerce, however, this is still the case,


Figure 29.7. Prohibition and command-based security

where each customer usually has to submit his identity before shopping can take place. Figure 29.7 shows the difference between the desirable properties (liveness properties) and the harmful properties (safety properties) of security. Security policies that consider both kinds of properties describe how a secure state can be maintained or reached, instead of just preventing “bad” states from happening. The approach of information security based on access control was introduced when systems exhibited a monolithic structure and a low degree of dynamics and heterogeneity. In such systems, one can, for instance, achieve authorization through a body of rules on the configuration of a firewall or by producing an access control list. If a new participant comes along, these mechanisms are reconfigured by an administrator. To maintain and reach security, prohibitions have to be complemented by commandments. If security policies define explicit regulatory instructions for such systems by way of prohibitions only, security cannot be guaranteed in a changing environment. Spontaneous changes lead to vulnerable, unreliable systems. If, on the other hand, additional commandments are given to the system as security guidelines, security can also be realized in the long term. While the specification of the desired security behavior can remain permanently in place, commandments prove to be flexible due to negotiations between the subjects. Desired properties of an action have to be checked with regard to the action itself and with regard to the consequences that a violation of a property has for other transactions and processes. This check can be made at three different times:

Before execution: To detect the (non-)observance of a commandment before execution, information about the (expected) future behavior of a system is required. By means of a mathematical model of a system, the development of its security level can, in fact, be examined.

During execution: Today, the execution is mostly supervised by a reference monitor. In future, one may employ methods to automatically check the correctness of the program or verify the function performed. For liveness properties, a violation cannot always be decided while the program runs, because it remains possible that the desired event has not yet occurred but will ultimately occur. In this case, probabilities for the violation of the guidelines in the entire process must be calculated. The assessment of a concrete situation on the basis of these probabilities depends on the context of the application and can be supported by learning systems.


After execution: After execution, there may be no chance for a rollback. To detect the security gap and adapt the security policy, audits are needed. In traditional “client-server” architectures, audits are carried out by third parties with the aid of recorded information. The subsequent reaction uses checkpoints and rollback procedures to guide the entire system back to the last state specified as “stable” according to certain criteria. Through the evaluation of recorded information, violations of security guidelines can also be identified and attributed to the components involved.
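The difference between checking prohibitions and checking commandments can be illustrated on a recorded event trace. The following Python sketch is a simplification under stated assumptions: events are plain strings, a prohibition is a set of forbidden events, and a commandment is modelled as "every trigger event must eventually be followed by a response event". The function names and the event names are hypothetical and are not taken from the cited literature.

```python
from typing import List, Optional, Set


def first_safety_violation(trace: List[str], forbidden: Set[str]) -> Optional[int]:
    """Prohibition check: a violation is visible at a definite position of the trace."""
    for index, event in enumerate(trace):
        if event in forbidden:
            return index
    return None


def pending_commandments(trace: List[str], trigger: str, response: str) -> List[int]:
    """Commandment check: positions of triggers that have no later response yet.

    Assumption of this sketch: one response discharges all earlier triggers.
    A non-empty result on a running system is not yet a violation, because the
    response may still occur; only an audit of the finished trace can settle it.
    """
    pending: List[int] = []
    for index, event in enumerate(trace):
        if event == trigger:
            pending.append(index)
        elif event == response:
            pending.clear()
    return pending


trace = ["login", "access_data", "release_resource", "access_data"]
print(first_safety_violation(trace, {"load_external_code"}))           # None
print(pending_commandments(trace, "access_data", "release_resource"))  # [3]
```

In this example no prohibition is violated, while the last access is still waiting for its release. During execution this is an open question; after execution, an audit of the completed trace would classify it as a violation of the commandment.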

29.4.2 Logging Services

Logging is a basic service in computing systems in charge of archiving relevant events as they occur. A number of other services employ information collected by logging. For example, log data can be used to determine whether a system is properly configured, to provide a basis for recovery services based on checkpoint/rollback techniques, and to investigate and detect security incidents by means of audit tools. To obtain a general scenario for logging systems, we classify the network components according to their roles (Figure 29.8) (Lonvick 2001):

• Device is a service that generates a log message;
• Relay is a service that receives a message and forwards it to another service; and
• Collector is a service that receives a message and logs it.

Log mechanisms such as reliable syslog (New and Rose 2001) and Schneier and Kelsey’s proposal (Schneier and Kelsey 1999) use this setting. In current computer systems, the information collected by such log mechanisms has a definite realm circumscribed by the underlying purpose of data collection. This situation changes completely in an infrastructure of ubiquitous computing: the autonomous behavior of components implies that systems follow unspecified and sometimes even chaotic patterns. The relationship between events and their purpose cannot be specified in advance. In consequence, each and every event happening in the system needs to be persistently logged. Logged data can be classified into consentable and consentless data. Consentable data is the data collected according to the security policy; consentless data is the data collected without a previously formalized security policy. This gives rise to an even more acute privacy threat than those faced today: it becomes impossible to

Figure 29.8. Log service setting


completely define the scope of a security policy. This leads to a paradoxical situation: while security threats arise from thorough data collection, there is no other way to assure security. While all mechanisms in computer science take a system-centric view in enforcing security policies, for example to achieve privacy, a more holistic view is needed. Legal and regulatory aspects play an important role.
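The three roles of the log service setting in Figure 29.8 (device, relay and collector) can be sketched in a few lines of Python. The sketch is hypothetical and is not an implementation of reliable syslog or of Schneier and Kelsey’s proposal. As an added assumption, the collector links its entries with a SHA-256 hash chain so that a later audit can at least notice when a stored entry has been altered in place; the class names, the device name `atm-17` and the constant `GENESIS` are illustrative.

```python
import hashlib
from typing import List, Tuple

GENESIS = "0" * 64  # assumed start value of the hash chain


def chain_digest(previous: str, message: str) -> str:
    """Digest that binds a message to the digest of the preceding entry."""
    return hashlib.sha256((previous + message).encode("utf-8")).hexdigest()


class Collector:
    """Receives messages and logs them."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []  # (message, digest)

    def receive(self, message: str) -> None:
        previous = self.entries[-1][1] if self.entries else GENESIS
        self.entries.append((message, chain_digest(previous, message)))

    def verify(self) -> bool:
        """Audit step: recompute the chain and compare with the stored digests."""
        previous = GENESIS
        for message, digest in self.entries:
            if digest != chain_digest(previous, message):
                return False
            previous = digest
        return True


class Relay:
    """Receives a message and forwards it to another service."""

    def __init__(self, next_hop) -> None:
        self.next_hop = next_hop

    def receive(self, message: str) -> None:
        self.next_hop.receive(message)


class Device:
    """Generates log messages."""

    def __init__(self, name: str, next_hop) -> None:
        self.name = name
        self.next_hop = next_hop

    def emit(self, event: str) -> None:
        self.next_hop.receive(f"{self.name}: {event}")


collector = Collector()
device = Device("atm-17", Relay(collector))
device.emit("session opened")
device.emit("session closed")
print(collector.verify())
```

An unkeyed chain of this kind only detects changes to individual stored entries; an attacker who can rewrite the whole log can also recompute all digests. Protecting against that requires secret keys or an external anchor, which this sketch deliberately leaves out.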

29.5 Regulatory Requirements and Security Risks

Financial institutions can offer their customers benefits in the form of new services such as process portals and also a reduction of transaction costs. Such services and the use of ubiquitous computing systems are associated with emerging security and privacy risks when private data is shared with other service providers. Unless these new challenges are managed effectively, the success of the services and the realization of competitive advantages become improbable. For financial institutions, the obligation to take the new security risks into consideration arises mainly for two reasons. Firstly, successful attacks due to security risks have a negative monetary impact on a company’s value. This aspect is gaining importance because the number of vulnerabilities and incidents is increasing dramatically (CERT 2004) and losses due to security risks have increased enormously over recent years (Gordon et al. 2005). Secondly, financial institutions face new regulatory requirements that demand the management of security risks and the careful handling of personal data. In order to determine a company’s value and to fulfill the regulatory requirements, the future development of vulnerabilities and security risks has to be taken into account by adequate risk management. The following sections present the current regulatory requirements from data protection laws as well as the Sarbanes-Oxley Act and Basel II. Finally, four phases for managing security and privacy risks are presented.

29.5.1 Data Protection Laws

In Europe, the collection, storage, processing and transmission of personal data are regulated by the European Personal Data Protection Directive of 1995. This directive constitutes a framework for national laws² facilitating data flow within the European Union. Therefore, the processing of sensitive data in order to personalize services by customer profiling, data mining or automated decision-making is restricted. It must comply with the principles of purpose limitation, data quality and proportionality, social justification, transparency, and the security of data storage and communication. The directive also restricts the export of personal data

² In Germany, the necessary adaptation of the existing “Bundesdatenschutzgesetz” (Federal Data Protection Law) was carried out in March 2001.


to countries that do not come within the scope of the directive. For these countries, the regulation has been extended by the Safe Harbor principle, which allows an exchange of personal data with companies that have committed themselves to the requirements. Focusing on integrated services for customers and offering a network of services provided by different companies challenges the observance of data protection laws in numerous ways. On the one hand, different companies have, for example, different privacy policies that have to be brought together as one. On the other hand, the integration of services provided in different countries or contexts has to take different, and sometimes inconsistent, laws into consideration. The observance of all regulations, laws and policies becomes a task of increasing complexity and uncertainty (Subirana and Bain 2005). This might become one of the biggest hurdles for online commerce in the next decade if the different laws do not converge (Roßnagel 2005; Smith 2005). Integrating services with high flexibility and dynamics confronts companies with increasing costs and risks associated with the correct use of personal data. Thus, the integration of services should inherently be linked with a suitable concept for risk management.

29.5.2 Regulatory Requirements

Prompted by numerous cases of falsified balance sheets, such as Enron, the US government established the Sarbanes-Oxley Act (SOX) in 2002 in order to protect investors and improve the accuracy and reliability of corporate disclosures. The act applies to companies that are listed on US stock exchanges as well as to subsidiaries located in foreign countries. Now coming into effect, SOX signals a fundamental change in business operations and financial reporting. One of the most important provisions of SOX is the certification of financial reports by Chief Executive Officers (CEO) and Chief Financial Officers (CFO) (Armour 2005). The signing officers must certify that they are responsible for establishing and maintaining internal controls. IT affects the financial reports mainly for two reasons (Brown and Nasuti 2005):

1. The financial reporting process of most organizations is driven by IT systems, and almost every company electronically manages the data and documents that are required for the financial reporting process. Thus, IT systems are used to support the financial reports, to provide data for them and to produce them. The Public Company Accounting Oversight Board (PCAOB)³ suggests that the nature and characteristics of the use of IT systems affect a company’s internal control over financial reporting. Data about asset values, sales records and receivables may contain errors that compromise the accounting systems. In this case, the valuation of a company, and thus the financial report, would be affected.

³ The PCAOB is a private-sector, non-profit corporation that was created by the Sarbanes-Oxley Act in order to oversee the auditors of public companies.


2. The second influence of IT systems on the financial report is the value of the IT systems themselves and the security risks that are associated with their use. The value of the IT systems can be measured relatively simply from the purchase price and depreciation over the years. Security risks, on the other hand, have a negative impact on a company’s value and thus on the financial reporting. Companies have to carry out IT audits that require risk management. Key areas of responsibility of IT include identifying risks related to the IT systems and designing and implementing controls to mitigate the identified risks.

Thus, in order to consider the losses caused by security risks within the financial report, the use of risk management is inescapable. A further regulatory requirement that makes the management of security risks necessary for banks is Basel II. Following a first consultative paper in June 1999, the Basel Committee published a second consultative paper containing proposals for revising the 1988 Basel Accord. Besides credit and market risks, Basel II formally addresses operational risk for the first time from a regulatory perspective. The Basel Committee on Banking Supervision defines operational risk as the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events. This definition includes legal risk but excludes strategic and reputational risk (Basel Committee on Banking and Supervision 2005). According to this definition, security risks are a subset of operational risk. In particular, developments such as the use of more highly automated technology, the enormous growth of e-commerce in recent years, large-scale mergers and acquisitions, the emergence of banks as very large-volume service providers and the increased prevalence of outsourcing suggest that operational risk exposures may be substantial and growing (Basel Committee on Banking and Supervision 2001).
As a consequence of these developments, Basel II for the first time demands a regulatory capital charge for unexpected losses due to operational risk. Since security risks are a subset of operational risks, a bank also has to hold capital for security risks. In order to determine the capital charge, Basel II proposes several methods that range from simple indicator-based approaches through to complex stochastic methods (e. g. Value-at-Risk) in a continuum of increasing sophistication and risk sensitivity (Basel Committee on Banking and Supervision 2005). The higher the risk sensitivity of the method, the less capital must be held (Faisst and Kovacs 2003). Thus, there is an incentive for financial institutions to use stochastic methods in order to calculate the capital charge.
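A stochastic method of the Value-at-Risk type can be sketched as a small Monte Carlo simulation. The following Python fragment is a minimal sketch under assumptions that are not taken from this chapter: the number of security incidents per year is drawn from a Poisson distribution, the loss per incident from a lognormal distribution, and the capital figure is read off as the 99.9 % quantile of the simulated yearly losses minus their mean. All parameter values and function names are hypothetical.

```python
import math
import random
from typing import List


def sample_poisson(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_annual_losses(rng: random.Random, lam: float, mu: float,
                           sigma: float, years: int) -> List[float]:
    """Aggregate loss per simulated year: Poisson frequency, lognormal severity."""
    losses = []
    for _ in range(years):
        n_events = sample_poisson(rng, lam)
        losses.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return losses


def quantile(sorted_values: List[float], q: float) -> float:
    """Empirical nearest-rank quantile of a list sorted in ascending order."""
    rank = math.ceil(q * len(sorted_values)) - 1
    return sorted_values[min(len(sorted_values) - 1, max(0, rank))]


rng = random.Random(42)  # fixed seed so that the sketch is reproducible
losses = sorted(simulate_annual_losses(rng, lam=3.0, mu=10.0, sigma=1.2, years=20000))

expected_loss = sum(losses) / len(losses)
var_999 = quantile(losses, 0.999)          # Value-at-Risk at the 99.9 % level
unexpected_loss = var_999 - expected_loss  # basis for the capital figure

print(f"expected loss:   {expected_loss:,.0f}")
print(f"99.9 % VaR:      {var_999:,.0f}")
print(f"unexpected loss: {unexpected_loss:,.0f}")
```

The sketch treats all incidents as independent and uses a single risk class. A real calculation would estimate the distributions from loss data and treat several risk classes and their dependencies, so the figures printed here carry no empirical meaning.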

29.5.3 Managing Security and Privacy Risks

Risk management is not a one-off operational sequence but rather an iterative process. It is necessary to improve the process continuously and to challenge the results critically. Managing security risks encompasses at least four recurring phases – identification, quantification, controlling and monitoring – as illustrated in Figure 29.9:

29 IT Security: New Requirements, Regulations and Approaches

Figure 29.9. Phases of managing security risks: Identification Phase, Quantification Phase, Controlling Phase, Monitoring Phase

1. Identification phase

Security risks arise as threats to the protection goals of confidentiality, integrity, non-repudiation and availability. Thus, at the outset, the vulnerabilities of the information system in use have to be identified. The most common methods for this purpose are the Failure Mode and Effects Analysis (FMEA) and the Hazard and Operability Study (HAZOP).⁴ The FMEA was developed by NASA and is a fault tree method (Deutsche Gesellschaft für Qualität 2001). It makes it possible to examine potential failures in information systems and processes and may be used to set risk management priorities for mitigating known threats and vulnerabilities. The HAZOP is a systematic method for examining a planned information system or process in order to identify and evaluate problems that may constitute risks (Kletz 1992). Based on the identified vulnerabilities and the company’s risk policy, the security risks have to be defined and classified. The main results of the identification phase are the determination of risk sources and potential events (Piaz 2002).

2. Quantification phase

In the following phase, the classified risks have to be quantified, i.e. the monetary impact of the security risks has to be analyzed. There are numerous methods that can be used to quantify security risks: questioning techniques, indicator approaches, stochastic methods, and causal methods (Faisst and Kovacs 2003). As described earlier, Basel II encourages banks to calculate the regulatory capital charge for unexpected losses by means of stochastic methods. These methods generally combine a frequency distribution and a severity distribution to calculate an aggregated loss distribution (Cruz 2003). An example of an aggregated loss distribution is illustrated in Figure 29.10(a). The expected severity of events is plotted on the abscissa and the expected frequency of events on the ordinate.

The aggregated loss distribution of operational risk is usually skewed to the right (Fontnouvelle et al. 2003).⁵ Unexpected losses are an estimate of the amount by which the actual losses can exceed the expected losses over a pre-specified time horizon, measured at a pre-defined level of confidence (e.g. 99.9%) (Schierenbeck 2001). Basel II requires banks to calculate

⁴ Further methods are described, for example, in (Vaughan 1997).
⁵ Since security risks are a subset of operational risks, they are also assumed to be right-skewed.


Günter Müller et al.

Figure 29.10. Aggregated loss distributions and the impact of countermeasures. Each panel plots the expected frequency of events against the expected severity of events and marks the Expected Loss (EL) and the Unexpected Loss (UL): (a) aggregated loss distribution; (b) Ubiquitous Computing can lead to increased expected and unexpected losses; (c) Countermeasures can reduce expected and unexpected losses.

their regulatory capital charge as the sum of expected and unexpected losses, unless a bank can demonstrate that it is adequately capturing the expected losses in its business practices (Basel Committee on Banking Supervision 2005). In this case, the expected losses can be seen as standard risk costs that are taken into account when calculating minimum margins for different products. The bank then only has to charge capital for the unexpected losses.

3. Controlling phase

Having quantified the security risks, the bank can use available countermeasures to bring the expected and unexpected losses to a level that is acceptable to it. The countermeasures available today can be classified into active countermeasures (risk strategies such as risk reduction⁶, risk diversification⁷ and risk avoidance⁸)

⁶ Security risks can be reduced, for example, by implementing technical security mechanisms.
⁷ The virus threat can be diversified by using different operating systems in order to limit the impact.
⁸ To avoid a hacker attack, the bank could, for example, decide to disconnect from the Internet. In this case, however, the opportunities offered by the Internet can no longer be realized.


and passive countermeasures (risk strategies such as risk provision⁹ and risk transfer¹⁰) (Hölscher 2002). In contrast to the active strategies, the risk itself persists when passive strategies are used.

4. Monitoring phase

The risk management process ends with the monitoring phase, which analyzes whether all the events that have occurred had previously been identified as possible events. In a next step, it is examined whether the distribution of the probabilities of occurrence of events and the distribution of the severities of losses were anticipated correctly in the quantification phase. Finally, it has to be analyzed whether the selected controlling measures have led to the desired results.

The increasing dynamics resulting from the integration of full business processes into one process portal and the development of ubiquitous computing technologies, as well as the increasing vulnerabilities of information systems predicted by the CERT statistics, substantially change the security risks for the financial sector. Assuming that an increase in vulnerabilities implies an increase in successful attacks, the result will be a shift in both the frequency and the severity distribution of security risks. This leads to a shift of the aggregated loss distribution, illustrated in Figure 29.10 as the difference between (a) and (b). If the shift cannot be compensated by suitable countermeasures (illustrated as the difference between Figure 29.10(b) and (c)), the financial sector will have to bear it in the form of potentially high costs.
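The shifts between the panels of Figure 29.10 can be made concrete with a small Monte Carlo sketch of a stochastic method. It is only an illustration under stated assumptions: the event frequency is taken to be Poisson and the loss severity lognormal (common modelling choices, not prescribed by the text), and all parameter values and function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed event count (Knuth's method, small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_losses(lam, mu, sigma, years=10000, seed=1):
    """Simulate sorted annual aggregated losses: Poisson frequency, lognormal severity."""
    rng = random.Random(seed)
    return sorted(
        sum(rng.lognormvariate(mu, sigma) for _ in range(poisson(rng, lam)))
        for _ in range(years)
    )


def expected_and_unexpected(losses, confidence=0.999):
    """Return (EL, UL), where UL is the loss quantile at `confidence` minus EL."""
    expected = sum(losses) / len(losses)
    index = min(len(losses) - 1, math.ceil(confidence * len(losses)) - 1)
    return expected, losses[index] - expected


# Hypothetical parameters (lam, mu, sigma) for the three panels of Figure 29.10.
scenarios = {
    "(a) baseline": (5.0, 10.0, 1.0),
    "(b) more vulnerabilities": (8.0, 10.3, 1.0),
    "(c) with countermeasures": (6.0, 10.0, 1.0),
}
for name, (lam, mu, sigma) in scenarios.items():
    el, ul = expected_and_unexpected(simulate_losses(lam, mu, sigma))
    print(f"{name}: EL={el:,.0f} UL={ul:,.0f}")
```

In the monitoring phase, the same simulation could be rerun with parameters re-estimated from the events actually observed, to check whether the distributions assumed in the quantification phase still hold.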

29.6 Conclusion

Some financial institutions have begun to operate process portals. The financial institution – as the operator of the process portal – thereby offers its customers not only its own products but also the products of its business partners. A supply chain network arises. With its ad-hoc networking, the process portal represents a step towards a rapidly changing environment. The development and future use of ubiquitous computing technologies mean a further change in the technical infrastructure and increase the dynamics of the IT systems through decentralized components and self-management properties. Due to the characteristics of ubiquitous computing, namely reachability, context-awareness and autonomous problem handling, access control alone is no longer sufficient and will be inadequate in the future. Reliability and availability are becoming protection criteria for future IT security.

⁹ Capital is accumulated and charged in order to avoid illiquidity due to security risks. Basel II makes the capital charge obligatory for banks.
¹⁰ Risks can be transferred outside the company through an insurance policy or so-called alternative risk transfer instruments (e.g. captives, securitization).


The increasing dynamics of information systems confront companies with new security risks. For the financial sector, two consequences in particular should be taken into consideration:

• Guaranteeing the security of the services offered and the protection of personal data is a crucial success factor with regard to their acceptance. In order to guarantee the protection goals of confidentiality, integrity, non-repudiation and availability and to guarantee reliable systems in a rapidly changing environment, three concepts are discussed: Firstly, policy-based management, in which the desired system behavior is described. Secondly, secure logging services that allow audits to be performed and acceptable and non-acceptable system states to be identified through IT security. Thirdly, commandments, which complement prohibitions and do not aim at preventing security risks but allow active risk management. These approaches may help financial institutions to accomplish active risk management, which has become more important for them in recent years.

• Data protection laws and new regulatory requirements like the Sarbanes-Oxley Act and Basel II demand active risk management and the consideration of security risks as part of the operational risks. In order to fulfill the regulatory requirements, security risks have to be identified, quantified, controlled and monitored in an iterative process. The current development of technology is expected to cause a significant shift in security risks. This means a shift in both the frequency and the severity distribution of security risks, which have to be managed actively.
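The idea behind secure logging, namely that an auditor can detect whether recorded system states were altered afterwards, can be sketched with a simple hash chain. This is a deliberately reduced illustration and not the scheme of Schneier and Kelsey (1999), which additionally uses evolving keys; an unkeyed chain such as this one only reveals tampering if the latest hash is kept somewhere the attacker cannot rewrite. All names are hypothetical.

```python
import hashlib
import json


def append_entry(log, event):
    """Append an event whose hash also covers the previous entry's hash."""
    previous = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((previous + payload).encode("utf-8")).hexdigest()
    log.append({"event": event, "prev": previous, "hash": digest})


def verify(log):
    """Recompute the chain; return False if any entry no longer matches."""
    previous = "0" * 64
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((previous + payload).encode("utf-8")).hexdigest()
        if entry["prev"] != previous or entry["hash"] != expected:
            return False
        previous = entry["hash"]
    return True


# Hypothetical audit trail of a process portal.
audit_log = []
append_entry(audit_log, {"user": "alice", "action": "transfer", "amount": 100})
append_entry(audit_log, {"user": "bob", "action": "login"})
print(verify(audit_log))                 # chain is intact

audit_log[0]["event"]["amount"] = 999    # later manipulation of an entry
print(verify(audit_log))                 # manipulation is detected
```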

References

Armour PG (2005) Sarbanes-Oxley and Software Projects. Communications of the ACM 48: 15–17
Basel Committee on Banking Supervision (2005) International Convergence of Capital Measurement and Capital Standards. Retrieved December 12, 2005, from http://www.bis.org/publ/bcbs118.pdf
Basel Committee on Banking Supervision (2001) Working Paper on the Regulatory Treatment of Operational Risk. Retrieved December 12, 2005, from http://www.bis.org/publ/bcbs_wp8.pdf
Booth D, Haas H, McCabe F, Newcomer E, Champion M, Ferris C et al. (eds) Web Services Architecture, W3C. Retrieved December 12, 2005, from http://www.w3.org/TR/ws-arch/
Brown W, Nasuti F (2005) Sarbanes-Oxley and Enterprise Security: IT Governance – What It Takes to Get the Job Done. Security Management Practices 14: 15–28
CERT (2004) CC Statistics 1988–2004. Retrieved January 1, 2006, from http://www.cert.org/stats/
Control Data (1999) Why Security Policies Fail. White Paper. Retrieved December 13, 2006, from http://www.mis.uwec.edu/keys/Teaching/is365/208770-BT%20Why%20Security%20Policies%20Fail%20-20000718.pdf
Cruz MG (2003) Modeling, Measuring and Hedging Operational Risk. John Wiley & Sons Ltd., Chichester
Deutsche Gesellschaft für Qualität (2001) FMEA – Fehlermöglichkeits- und Einflussanalyse. Beuth-Verlag, Berlin, Zürich, Wien
Faisst U, Kovacs M (2003) Quantifizierung operationeller Risiken – ein Methodenvergleich (in German). Die Bank 43: 342–349
Fontnouvelle P, Virginia DR, Jordan J, Rosengren E (2003) Using Loss Data to Quantify Operational Risk. Retrieved November 23, 2006, from http://www.bis.org/bcbs/events/wkshop0303/p04deforose.pdf
Gordon LA, Loeb MP, Lucyshyn W, Richardson R (2005) CSI/FBI Computer Crime and Security Survey 2005. Retrieved December 11, 2006, from http://i.cmpnet.com/gocsi/db_area/pdfs/fbi/FBI2005.pdf
Hilty M, Basin D, Pretschner A (2005) On Obligations. In: De Capitani di Vimercati S, Syverson P, Gollmann D (eds.) 10th European Symposium On Research In Computer Security. Springer-Verlag, Berlin, Heidelberg, pp. 98–117
Hölscher R (2002) Von der Versicherung zur integrativen Risikobewältigung: Die Konzeption eines modernen Risikomanagements. In: Hölscher R, Elfgen R (eds.) Herausforderung Risikomanagement. Identifikation, Bewertung und Steuerung industrieller Risiken. Gabler, Wiesbaden, pp. 3–31
Kletz T (1992) HAZOP and HAZAN. 3rd Edition, Taylor & Francis Inc, London
Lampson B, Abadi M, Burrows M, Wobber E (1992) Authentication in Distributed Systems: Theory and Practice. ACM Transactions on Computer Systems 10: 265–310
Lonvick C (2001) RFC 3164: The BSD syslog Protocol. Retrieved December 12, 2004, from http://www.rfc-archive.org/getrfc.php?rfc=3164
Müller G, Gerd tom Markotten D (2000) Sicherheit in der Kommunikationstechnik (in German). Wirtschaftsinformatik 42: 487–488
Müller G, Eymann T, Kreutzer M (2003) Telematik- und Kommunikationssysteme in der vernetzten Wirtschaft (in German). Oldenbourg, München, Berlin
New D, Rose M (2001) Reliable delivery for syslog. RFC 3195. Retrieved April 23, 2004, from http://www.faqs.org/rfcs/rfc3195.html
Österle H, Bach V, Schmid R (2000) Mit Customer Relationship Management zum Prozessportal (in German). In: Bach V, Österle H (eds.) Customer Relationship Management in der Praxis. Springer, Berlin, pp. 3–55
Piaz JM (2002) Operational Risk Management bei Banken (in German). Versus, Zürich
Puschmann T, Alt R (2004) Process Portals – Architecture and Integration. In: Sprague R (ed) Proceedings of the Thirty-Seventh Annual Hawaii International Conference on System Sciences. Hawaii, 05.01.2004, Los Alamitos (CA)
Roßnagel A (2005) Verantwortung für Datenschutz (in German). Informatik Spektrum 28: 462–473
Schierenbeck H (2001) Ertragsorientiertes Bankmanagement (in German). Band 2: Risiko-Controlling und integrierte Rendite-/Risikosteuerung. 7th Ed., Gabler, Wiesbaden
Sackmann S, Strüker J (2005) Electronic Commerce Enquête 2005 – 10 Jahre E-Commerce: Eine stille Revolution in deutschen Unternehmen (in German). Institut für Informatik und Gesellschaft, Telematik, Universität Freiburg. K-IT-Verlag, Leinfelden
Sackmann S, Strüker J, Accorsi R (2006) Personalization in Privacy-Aware Highly Dynamic Systems. Communications of the ACM 49 (9): 32–38
Schneier B, Kelsey J (1999) Security audit logs to support computer forensics. ACM Transactions on Information and System Security 2: 159–176
Sloman M (1994) Policy Driven Management For Distributed Systems. Journal of Network and Systems Management 2: 333
Smith B (2005) Protecting Consumers and the Marketplace: The Need for Federal Privacy Legislation. Retrieved Sept. 12, 2006, from http://download.microsoft.com/download/c/2/9/c2935f83-1a10-4e4a-a137-c1db829637f5/PrivacyLegislationCallWP.doc
Subirana B, Bain M (2005) Legal Programming: Designing Legally Compliant RFID and Software Agent Architectures for Retail Processes and Beyond. Springer, New York
Vaughan E (1997) Risk Management. John Wiley & Sons, Inc., New York

CHAPTER 30 Legal Aspects of Internet Banking in Germany Gerald Spindler, Einar Recknagel

30.1 Introduction

Internet-banking has become increasingly popular in Germany within the last ten years. The variety of possible transactions and the scope of business have widened, starting from pure transaction banking via internet broking to the complete spectrum of banking services including credits and loans. Internet-banking is part of the direct-banking spectrum, which besides internet-banking includes communication by telephone, fax or letter. The discussion concerning the terms “online-banking” or “home-banking” is unfruitful and has become obsolete since the termination of the BTX service by Deutsche Telekom.¹ Besides the common legal questions of banking, additional problems arise from the use of the internet, in particular the necessity of providing sufficient access to the bank server, which is crucial for instance for day trading. Moreover, identifying customers and assuring security are essential in this sensitive business field. Not surprisingly, the legal discussion has focussed on the problems of phishing, pharming and identity theft, which endanger financial transactions and the acceptance of internet-banking. One outstanding problem is the burden of proof in cases of alleged identity theft or of an alleged breach of obligations by the customer. In the following, the conclusion of a contract between bank and customer, the specific dangers of internet-banking and finally the question of liability shall be analysed and discussed in a nutshell:

¹ Bildschirmtext and its successor T-Online; the service was terminated in 2001.


30.2 Contract Law – Establishing Business Connections, Concluding Contracts, and Consumer Protection

The legal regime of contracting via the internet is governed by the general provisions of the German Civil Code (Bürgerliches Gesetzbuch)². However, in order to ensure a high level of consumer protection, EU legislation has influenced national legislation and will continue to influence it strongly.

30.2.1 Establishing the Business Connection

The business connection between customer and bank is not limited to short-term deals; the relationship between the parties is generally long-lasting and includes the whole spectrum of banking services. In principle, the core of the internet-banking business consists of money transfer services, securities accounts offering trading and broking, and sometimes lending. Establishing a bank account or securities account exclusively via the internet is not yet practised, although the digital signature is provided for by law ([74] pp. 169 f.). Banks are required to identify an account holder for reasons of taxation and anti-money laundering when first opening an account and establishing the business connection.³ Within an existing connection, renewed identification when a further account is opened is not necessary. Because digital signatures are rarely used in practice, accounts are opened in the traditional way by signing and sending the relevant contracts on paper. The customer may order the relevant forms via the internet, and the bank responds by sending the requested contracts by mail, while the aforementioned identification is ensured by the so-called “post-ident” procedure (see [60] pp. 1605 ff.).

30.2.2 Contract Law

The exchange of declarations of intent between customer and bank via the internet is subject to the same legal rules as in non-internet business (for all [71], § 312 b Rn. 4). For the declarations of intent, i.e. the offer and the acceptance, internet-specific particularities have to be considered, but these do not cause major problems:

² The German Civil Code is abbreviated in the following by its German abbreviation BGB (Bürgerliches Gesetzbuch).
³ § 154 AO (Abgabenordnung) and § 8 GwG (Geldwäschegesetz) demand that the bank obtains certainty about the account holder by identifying him by name, address and date of birth before opening an account. If the bank fails to identify the account holder, assets and deposits can only be disbursed with the approval of the fiscal authorities.


For the conclusion of a contract, an offer and a corresponding acceptance are necessary. The offer via the internet is released when the sender has deliberately clicked the mouse or pressed the “enter” key, has sent an email or (in internet-banking, for example) has entered the necessary PIN and TAN ([22] p. 3000; [81] p. 3007; [27] p. 8; [29] p. 372; [40] Rn. 4.754; [86] p. 1046). The declaration of intent is received when it has entered the addressee’s sphere of control. Transaction orders are received by the bank when they are (automatically) saved on the bank server and the bank has the possibility of taking notice of them ([58] p. 33). If the bank has provided communication channels via email (for example [email protected]), it is obliged to check for incoming mails several times a day. The customer, on the other hand, only has to check his email account if he has designated his email address for business communication ([30] p. 7; [21] p. 369).

30.2.3 Terms and Conditions

The ZKA (Zentraler Kreditausschuss), the central organisation of the five main banking associations ([87]), has developed standard contract terms and conditions ([15] Rn. 109), which differ only in the payment verification instrument ([17] § 8 Rn. 1; [24] § 55 Rn. 27), regulating the usage of PIN/TAN or HBCI verification. These terms and conditions do not specifically address questions of liability, but they impose several obligations and duties on the customer concerning the secrecy of the payment verification instrument. The inclusion of general contract terms and conditions via the internet has been discussed at length; according to the unanimous opinion, there are no obstacles to their inclusion as long as the legal requirements of §§ 305 ff. BGB are observed. According to § 305 sec. 2 BGB, the bank has to point out to the customer that the inclusion of terms and conditions is intended. Moreover, the bank has to enable the customer to take notice of the terms and conditions with reasonable effort (for all see [74] pp. 174 ff.). If the customer applies for a business connection via the internet for the first time, the terms and conditions are sent to him on paper along with the contract.⁴

30.2.4 Consumer Protection – EU-Legislation and Directives

As in all areas of law, European legislation has gained important influence on the legislation concerning internet banking. In particular, the directive concerning the distance marketing of consumer financial services⁵ and the directive on payment

⁴ See above under section 30.2.1.
⁵ Directive 2002/65/EC of the European Parliament and of the Council of 23 September 2002 concerning the distance marketing of consumer financial services and amending Council Directive 90/619/EEC and Directives 97/7/EC and 98/27/EC, [90].


services in the internal market⁶ strongly affect the provision of banking services via the internet.

30.2.4.1 The Directive Concerning the Distance Marketing of Consumer Financial Services

The directive concerning the distance marketing of consumer financial services was adopted on September 23, 2002 in order to harmonize the laws, regulations and administrative provisions of the Member States concerning the distance marketing of consumer financial services and to grant EU-wide consumer protection. The directive was implemented in the German Civil Code in §§ 312 b ff. BGB in 2004.⁷ In general, the directive (and its national implementations) grants the consumer certain rights to information prior to the conclusion of the distance contract, the communication of the contractual terms and conditions, and the right of withdrawal.

The duties of information and the communication of the contractual terms and conditions are specified in Art. 3–5 of the Directive. Among other things, information concerning the supplier, the financial service and the distance contract shall be provided to the consumer prior to the conclusion of the contract. The information on the supplier shall contain its identity and main business.⁸ The information about the financial service shall provide a description of the main characteristics of the service, the price and the relevant risks.⁹ The information concerning the distance contract shall inform the consumer of the existence or absence of a right of withdrawal and how to exercise it, the minimum duration of the contract, and any rights the parties may have to terminate the contract early. The supplier shall provide the consumer with all contractual terms and conditions and the information mentioned on paper or on another durable medium¹⁰ available and accessible to the consumer in good time before the consumer is bound by any distance contract or offer.

The right of withdrawal grants the consumer a period of fourteen days to withdraw from the contract without penalty and without giving any reason. The period for withdrawal begins either on the day of the conclusion of the distance contract or on the day on which the consumer receives the contractual terms and conditions and the information.
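The start of the withdrawal period can be illustrated with a short calculation. The sketch assumes that the later of the two possible starting dates applies and simply adds fourteen calendar days; how the first and last day are counted in a concrete case depends on the applicable national rules on time limits. The function name and the dates are hypothetical.

```python
from datetime import date, timedelta


def withdrawal_deadline(concluded, info_received, period_days=14):
    """Last day of the withdrawal period (simplified).

    Assumption: the period starts on the later of the contract conclusion
    date and the date on which the consumer received the contractual terms
    and the required information.
    """
    start = max(concluded, info_received)
    return start + timedelta(days=period_days)


# Hypothetical case: contract concluded online, documents arrive four days later.
deadline = withdrawal_deadline(date(2006, 3, 1), date(2006, 3, 5))
print(deadline.isoformat())
```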

⁶ Proposal for a Directive of the European Parliament and of the Council on payment services in the internal market and amending Directives 97/7/EC, 2000/12/EC and 2002/65/EC (presented by the Commission) [72].
⁷ Gesetz zur Änderung der Vorschriften über Fernabsatzverträge bei Finanzdienstleistungen vom 02.12.2004, [23]; for a detailed summary of the implementation in the BGB see [74], pp. 106 ff.
⁸ Art. 3 Sec. 1 of the directive.
⁹ Art. 3 Sec. 2 of the directive.
¹⁰ See above section 30.2.3.


The right of withdrawal does not apply to financial services whose price depends on fluctuations in the financial market which may occur during the withdrawal period, such as services related to foreign exchange, money market instruments, transferable securities or comparable instruments, nor to contracts whose performance has been fully completed by both parties at the consumer’s request before the consumer exercises his right of withdrawal.

30.2.4.2 The Directive on Payment Services in the Internal Market

After issuing a recommendation concerning transactions by electronic payment instruments¹¹, which set out guidelines on rights and obligations in electronic transactions, the Commission presented a proposal for a directive on payment services in the internal market,¹² which has recently been adopted. With this proposal the Commission aims to create a single payment market in which improved economies of scale and competition reduce the cost of the payment system. A modern, technology-based economy needs an efficient and modern payment system. The Commission’s initiative focuses on electronic payments as an alternative to cost-intensive cash payments. The directive shall provide legal certainty on core rights and obligations for users and providers in the interests of a high level of consumer protection, improved efficiency, a higher level of automation and pan-European straight-through processing. Though the directive mainly aims at the regulation of trans-European payments, it shall affect domestic payments as well.

The directive specifies the obligations of the payment service user (Art. 46) and of the payment service provider (Art. 47) in relation to payment verification instruments. The payment service user is obliged to use a payment verification instrument in accordance with the terms governing the issuing and usage of the instrument (Art. 46 lit. a)) and to notify the payment service provider, or the entity specified by the latter, without undue delay on becoming aware of loss, theft or misappropriation of the payment verification instrument or of its unauthorised usage (Art. 46 lit. b)). In addition the user shall, as soon as he receives a payment verification instrument, take all reasonable steps to keep its security features safe. Correspondingly, the payment service provider is obliged to make sure that the personalised security features of a payment verification instrument are not accessible to parties other than the holder of the payment verification instrument (Art. 47 lit. a)); to refrain from sending an unsolicited payment verification instrument, except where a payment verification instrument already held by the payment

¹¹ Commission Recommendation of 30 July 1997 concerning transactions by electronic payment instruments and in particular the relationship between issuer and holder [89] – the recommendation is not legally binding.
¹² Proposal for a Directive of the European Parliament and of the Council on payment services in the internal market and amending Directives 97/7/EC, 2000/12/EC and 2002/65/EC; COM(2005) 603 (final) [72].


service user is to be replaced because it has expired or because of the need to add or change security features (Art. 47 lit. b)); and to ensure that appropriate means are available at all times to enable the payment service user to make a notification pursuant to Art. 46 lit. b) (Art. 47 lit. c)).

Of particular interest is the introduction of prima facie evidence in Art. 48, which requires Member States to oblige the payment service provider, where a payment service user denies having authorised a completed payment transaction, to provide at least evidence that the payment transaction was authenticated, accurately recorded, entered in the accounts and not affected by a technical breakdown or some other deficiency. If, once this evidence has been given, the payment service user continues to deny having authorised the payment transaction, he shall provide information or elements to allow the presumption that he could not have authorised the payment transaction and that he did not act fraudulently or with gross negligence with regard to his abovementioned obligations. For the purposes of rebutting this presumption, the use of a payment verification instrument recorded by the payment service provider shall not, of itself, be sufficient to establish either that the payment was authorised by the payment service user or that the payment service user acted fraudulently or with gross negligence.

Art. 50 stipulates the user’s liability for losses from unauthorised payment transactions. The payment service user shall bear the loss resulting from the use of a lost or stolen payment verification instrument and occurring before he has fulfilled his obligation to notify his payment service provider (Art. 46 lit. b)), however only up to a maximum of € 150. The payment service user shall bear all the losses from unauthorised transactions if he incurred them by acting fraudulently or with gross negligence with regard to his obligations. In such cases, the maximum amount of € 150 shall not apply.
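The allocation of losses described for Art. 50 can be written down as a small decision rule. The sketch is a simplification for illustration only: the function and parameter names are hypothetical, and the treatment of losses that occur after the notification (here assumed to fall on the provider) is an assumption that the passage above does not spell out.

```python
def user_liability(loss, before_notification, fraud_or_gross_negligence, cap=150.0):
    """Share of an unauthorised-transaction loss borne by the payment service user."""
    if fraud_or_gross_negligence:
        return loss                # the EUR 150 maximum does not apply
    if before_notification:
        return min(loss, cap)      # capped until the provider has been notified
    return 0.0                     # assumption: later losses fall on the provider


# Hypothetical cases, amounts in euros.
print(user_liability(1000.0, before_notification=True, fraud_or_gross_negligence=False))
print(user_liability(1000.0, before_notification=True, fraud_or_gross_negligence=True))
```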

30.3 Liability – Duties and Obligations of Bank and Customer

The main focus of the legal discussion currently lies on the attribution of liability if the customer denies having placed an order or a payment transaction, claiming that a third party has used the means of identification without the consent or knowledge of the user. While the legal situation is quite clear, the procedural problems concerning the burden of proof are immense.

30.3.1 Phishing and Its Variations – A Case of Identity

Phishing attacks use both social engineering and technical subterfuge to steal consumers’ personal identity data and financial account credentials. Social-engineering schemes use “spoofed” e-mails to lead consumers to counterfeit websites designed to trick recipients into divulging financial data such as


credit card numbers, account usernames and passwords. By hijacking the brand names of banks, e-retailers and credit card companies, phishers often convince recipients to respond. Technical subterfuge schemes plant crimeware onto PCs to steal credentials directly, often using Trojan keylogger spyware. Pharming crimeware misdirects users to fraudulent sites or proxy servers, typically through DNS hijacking or poisoning, “harvesting” the consumer’s information on these websites. In a “man-in-the-middle attack” the attacker places himself between the sender and the recipient, accesses the traffic, modifies it (e.g. by changing the beneficiary of the transaction) and forwards it to the recipient, the bank.

Banks have reacted to these threats by modifying the verification instruments in an attempt to render phishing and pharming attacks ineffective. The basis of the verification instruments is the original and still widespread PIN/TAN system, in which the customer enters his Personal Identification Number and a freely chosen Transaction Number into the browser window or the banking program. This technique of identification is executed without media conversion, as all communication channels run via the customer’s computer, which specifically increases the danger of trojan attacks, pharming and phishing because the risk of being spied on grows. Against these dangers, countermeasures in the form of verification instruments with media conversion have been developed. The use of additional hardware, such as TAN generators or the receipt of a TAN by mobile phone, is intended to prevent manipulated transactions. The safest internet-banking system is the HBCI standard ([25]), in which the transaction is encrypted by means of a chip card and an electronic signature and is therefore considered safe against manipulation.
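Why a second device can defeat a man-in-the-middle attack may be illustrated by a code that is bound to the transaction data. The sketch below is not the mechanism of any particular TAN generator or mobile TAN procedure, which the text does not specify; it merely assumes a secret shared between the bank and the customer’s device and derives a six-digit code from beneficiary and amount with an HMAC. All names and values are hypothetical.

```python
import hashlib
import hmac


def transaction_tan(secret, beneficiary, amount_cents, digits=6):
    """Derive a short code bound to beneficiary and amount (illustrative only)."""
    message = f"{beneficiary}|{amount_cents}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return str(int.from_bytes(digest[:4], "big") % 10**digits).zfill(digits)


def bank_accepts(secret, beneficiary, amount_cents, tan):
    """The bank recomputes the code for the order it actually received."""
    expected = transaction_tan(secret, beneficiary, amount_cents)
    return hmac.compare_digest(expected, tan)


secret = b"hypothetical-shared-secret"   # held by the bank and the customer's device
tan = transaction_tan(secret, "ACCOUNT-ALICE", 50_000)

# Order as the customer entered and confirmed it on the separate device.
print(bank_accepts(secret, "ACCOUNT-ALICE", 50_000, tan))
# Same code, but the beneficiary was changed in transit by an attacker.
print(bank_accepts(secret, "ACCOUNT-MALLORY", 50_000, tan))
```

Because the code is computed outside the possibly infected computer and covers the beneficiary, an order altered in transit no longer matches the code, whereas a classic TAN taken from a list is valid for any order.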

Gerald Spindler, Einar Recknagel

30.3.2 Legal Situation in Case of Phishing or Pharming

When a customer’s account is debited with a transaction amount resulting from phishing or pharming, the question arises whether, and on which legal grounds, the bank can claim the transferred amount from its customer. The possible bases for such a claim by the bank are reimbursement of expenses (cf. 30.3.2.1) or a claim for damages (cf. 30.3.2.2).

30.3.2.1 Reimbursement of Expenses

A claim for reimbursement of expenses in accordance with §§ 670, 675, 676a BGB rests on a contractual basis and therefore requires a “transaction contract” between the bank and the customer concerning the relevant transaction. If a third party, the attacker, initiates a transaction by a phishing or pharming attack, no transaction contract between customer and bank is established, as the customer has not offered to conclude one. The bank cannot distinguish between a genuine and a falsified transaction order and will therefore point out that the transaction order, having been verified by the payment verification instrument (PIN/TAN or HBCI), was initiated by the customer. If the customer denies having given the order, the bank bears the burden of proof for the “transaction contract”. If this proof cannot be supplied, even with the help of prima facie evidence¹³, the bank cannot claim reimbursement of expenses.

Some scholars argue for a contractual obligation based on an apparent declaration of intent, as the incoming falsified order appears to have been given with the genuine means of identification and by its rightful holder. This opinion relies on the doctrine of apparent authority ([24] § 55 Rn. 26; [31] p. 1684, F/35; [42] p. 408; [19] p. 1206; on apparent authority in general [71] § 173 Rn. 14 ff.), arguing that high technical security standards are applied to internet banking. “Apparent authority” is assumed when the repeated action of the agent is unknown to the represented party owing to that party’s lack of diligence, and the third party to the contract is not aware of this fact.

In accordance with the prevailing opinion, however, the application of “apparent authority” in this area has to be rejected ([16] p. 3315; [35] p. 353; [41] p. 24; [86] p. 1047; [44] pp. 145 f). Apparent authority requires that the agent acts repeatedly over a certain period of time. In internet banking, by contrast, it is assumed that the customer himself, and not an agent, is using the verification instrument ([14] Rn. 82; [77] Rn. 21 f). Moreover, the notion of “apparent authority” requires that the acting third party usually acts within the sphere of the represented party, so that the represented party can be assumed to have the power to control its would-be agent ([63] § 167, Rn. 31 ff.; [77] Rn. 25). None of these premises of apparent authority, however, applies in internet banking.
First, the agent does not act several times, as the fraudulent transaction generally occurs only once ([77] Rn. 29); the traditional notion of “apparent authority”, however, requires such repeated action. Further, neither the “phisher” nor the “pharmer” usually acts within the sphere of the customer. These arguments demonstrate that the assumption of a transaction contract concluded by an agent with “apparent authority” does not match the circumstances under which the fraudulent order is given and executed. No transaction contract between customer and bank is concluded when an unauthorised third party uses the customer’s means of verification. Whether the “theft” of the verification instrument was due to the fault of the customer or to an insecure technical environment provided by the bank has to be evaluated in the context of claims for damages. There, contributory negligence by the customer may also play an essential role, and it can be handled flexibly.

¹³ See also Section 30.3.3 for the burden of proof.

30.3.2.2 Claims for Damages

A breach of duty by either the bank or the customer results in a claim for damages by the other party. If the customer neglects the due diligence set out in the online-banking agreement and the general contract terms, he is liable to the bank for the resulting damage, §§ 280 sec. 1, 241 sec. 2 BGB. The terms and conditions concerning online banking do not contain a stipulation on liability;¹⁴ hence, the general rules of the civil code apply, § 276 BGB. Correspondingly, the bank is liable if it neglects its duties and obligations. The exact allocation of duties and obligations between the customer and the bank has not yet been developed by the courts.

aa) Assignment of Duties and Obligations to Bank and to Customer

By offering internet-banking services, the bank has to provide secure technical procedures for the relevant transactions, all the more so as the bank is the cheapest cost avoider with regard to technical safeguards. These duties to maintain safety are imposed on the bank because of the risk it creates by opening up the internet-banking service to its customers (for a comparable argument concerning the use of dialers see [10; 55]). Owing to its personnel, financial and technical resources, the bank can control the resulting risks better than its customers, all the more so as the bank is free to define the extent of the services available and the relevant technical precautions. The bank also has a better overview of the technical threats from the internet than the average user ([35] p. 356; [77] Rn. 32). Finally, the bank derives many financial advantages from internet banking. These services considerably reduce the cost of personnel, branches and back-office operations. Although internet banking provides advantages to the customer as well, such as independence from banking hours and reduced costs, the main benefits accrue to the bank.

This, however, cannot result in imposing all risks on the bank, as the bank cannot control the sphere of the customer. In particular, the customer’s computer may be the main source of danger, being exposed to virus infections or serving as a prime target for phishing and pharming attacks. Therefore, the customer also has to take security measures with due diligence.
A breach of these duties results in liability of the customer.

¹⁴ See above Section 30.3.

bb) Duties of the Bank

The bank is subject to several duties concerning the provision of online banking. It has to provide a secure technical environment, has duties of surveillance with regard to technical and security-relevant developments, has to instruct and warn the customer, and has to prevent misuse of the system once detected. The failure of accessibility is discussed in a short excursus.

(a) Providing Secure Technical Environment by the Bank

A principal duty concerning liability is the provision of a secure technical environment, namely the bank server ([35] pp. 357 f.; [33] p. 217; [36] Rn. 812; [74] pp. 151, 206; [24] § 55 Rn. 19), which is derived from § 25a KWG (Kreditwesengesetz). Owing to the continuing development of technology and to newly evolving hazards, this duty has a dynamic character: the bank is obliged to continuously improve the implemented software in line with the acknowledged standards. The bank server has to be kept safe and secure, preventing intrusion by unauthorised third parties, virus attacks, etc. An insufficiently secured system breaches the duty to maintain safety and results in liability for the ensuing damage ([37] p. 161, [32] p. 130; [74] p. 206). Owing to further advances in technology, the bank has to implement the latest security measures on a regular basis.

While the customer is responsible for securing his home PC¹⁵, the bank’s duty to maintain safety commences the moment the customer contacts the bank server. The bank is therefore responsible for providing a secure server environment, on a contractual basis towards its customers and on a tortious basis towards every third party. Additionally, the bank is liable for the incorrect configuration of the server, for the use of faulty software and for the unauthorised alteration or deletion of data by its employees. Although complete security cannot be provided ([43] p. 163; [66]) because of the ever-changing risk potential of the internet, the bank should always employ state-of-the-art technology in order to minimise its liability risk ([35] p. 359; [74] pp. 107, 225).

This adaptation to state-of-the-art technology has to be weighed against what is economically feasible and against the resulting security gain. The bank cannot be required to employ everything that is technically possible ([57] pp. 162, 437 f.; [62] § 823 Rn. 29). In this field parallels to product liability can be drawn ([77] Rn. 40). The manufacturer of a product is not liable for faults unknown to the state of the art of science and technology at the time the product was placed into circulation, but he is obliged to take newly discovered problems into account in future production ([5; 7; 9; 62] § 823 Rn. 508, 602).
If a new threat to IT security becomes known to the bank, it is obliged to adjust its system to counter these dangers ([35] p. 359; [74] pp. 47 f.). This may make it necessary to provide new or modified verification instruments in due time ([35] p. 357). The discussion concerning the security of the plain PIN/TAN procedure has already started, with the argument that this method is no longer state of the art ([35] p. 357; [77] Rn. 40 f.). This method is exposed to severe attacks by malware secretly installed on the customer’s PC.

As a reaction to these threats, banks have already developed variations of the PIN/TAN procedure involving additional hardware (TAN generator), the communication of a TAN to the mobile phone (mTAN) or other means of communicating with the customer without using the “PC banking channel” ([77] Rn. 42). At present, the HBCI procedure¹⁶ is considered the most secure ([77] Rn. 43). This, however, does not oblige the bank to change its complete verification system from modified PIN/TAN procedures to HBCI. Even if HBCI provides the highest level of security, the new PIN/TAN methods have proven to be secure against the presently known threats ([35] p. 359; [79] pp. 212 f.). Furthermore, HBCI is not widely accepted by customers, because they have to buy additional technical equipment (a smartcard reader) and are bound to the home PC, so that they cannot use internet banking from their place of work.

¹⁵ At least to a certain degree; see below Section 30.3.2.2 cc) (a).
¹⁶ Home-Banking-Computer-Interface, also known as FinTS.

(b) Excursus: Providing Access to the System of the Bank

Besides the necessity of providing a secure infrastructure, accessibility presents a serious problem in a business relationship based on communication via the internet. If the customer is unable to place orders or transfers because of malfunctions or server breakdowns, this may result in liability of the bank, for instance if a stock order is not executed in due time ([26] p. 307; [74] p. 207). Stable and reliable access to the bank server is therefore of crucial importance ([13] p. 141; [38] p. 43; [18] p. 1444; [76] p. 127), although, given the complexity of a server system, a system failure is always possible and the bank cannot guarantee 100 percent access at all times.

Obviously, not every interruption of accessibility can result in liability, e.g., when the server is down for maintenance¹⁷ or a sudden failure interrupts the services until a backup server is started. For such short-term interruptions a breach of duty cannot be assumed unless the banking services remain inaccessible for more than one hour ([2] p. 1541; [3] p. 264; [39] p. 224; [74] pp. 211 ff.). With regard to contributory negligence, courts have to take into account the other means available to the customer for contacting the bank ([56] p. 1591; [51]; for the case of non-accessibility and the resulting liability see also [54]). The claim for breach of duty depends on the question of default, § 276 BGB. Strict liability can only be assumed if the bank has specifically accepted it, for example in its terms and conditions ([74] p. 212). Mere advertisements, on the other hand, are not sufficient to create strict liability.

(c) Duties of Surveillance and Monitoring

Besides the duty of surveillance concerning its own system, the bank also has to monitor, as far as possible, potential security leaks in the customer’s sphere. Although this does not extend to controlling the customer’s PC, the bank nonetheless has to react to threats targeted specifically at the customer. As the bank has superior knowledge¹⁸ of the threats arising from pharming or phishing, for example, it has to react in order to neutralise or minimise the relevant dangers ([77] Rn. 45). By analogy with the manufacturer’s obligation to monitor its products ([78] § 823 Rn. 511), one can distinguish between an active and a passive surveillance obligation. If misuse and events of damage become known to the bank, it is obliged to investigate. In addition, the bank is required, to a certain extent, to keep itself informed through the media and the internet about newly evolving threats, e.g., information on Windows security exploits, active viruses etc. The information gained has to be passed on to the customer, depending on the seriousness of the threat, via the website, by flyer or even by mail. Warnings by e-mail, however, are not advisable, as they may themselves be suspected of being a phishing attack.

¹⁷ Generally this maintenance work occurs at night-time or on weekends, when demand is low.
¹⁸ See above Section 30.3.2.2 aa).

(d) Duties of Instruction and Warning

The bank is obliged to provide the customer with proper instructions on the internet-banking system before initial use ([35] p. 356; [33] p. 218; [14] Rn. 65 ff; [15] Rn. 167; [20] Rn. 527 ff.; [24] § 55 Rn. 19 ff.; [74] pp. 220 ff.). Especially for the technically inexperienced user, the need for information and instruction should not be underestimated. On conclusion of the internet-banking contract the customer has to be provided with information on the use of the system ([14] Rn. 66) and on the dangers of abuse ([14] Rn. 67; [74] p. 220; [24] § 55 Rn. 20). If the bank offers different systems of verification, e.g. PIN/TAN or HBCI, it has to give its customer the choice and provide information on the advantages and disadvantages of the different systems, for example by informing the customer about the higher hardware-related costs of HBCI. If the bank learns of new risks which endanger the security of the system, it is obliged to warn its customers about the threat. This duty to warn correlates with the above-mentioned duty of surveillance.

The information can be provided in print, in the terms and conditions and in the online user guidance on the bank’s website¹⁹ ([35] p. 357; [14] Rn. 67; [74] p. 220; [24] § 55 Rn. 21; [77] Rn. 48). As pop-up windows are often blocked by the browser, instructions should not be provided in this way ([35] p. 357; [77] Rn. 48). Some doubts remain as to whether presenting the customer’s duties of care in the terms and conditions is sufficient, as the “fine print” is not always read by the customer ([20] Rn. 527 q; [77] Rn. 48). Such negligence on the part of the customer, who is also aware of the need to act with due care concerning his bank or credit cards, cannot, however, exculpate him ([14] Rn. 67; [74] p. 220; [28] pp. 251, 260 f.). To guarantee sufficient instruction, website-based instruction is advisable. Instructions sent via e-mail, on the other hand, should be avoided in order to avoid e-mail contact with the customer altogether. This principle should be adopted for the prevention of phishing attacks. A bank should therefore regularly point out (on its website, for example) that it does not contact its customers via e-mail ([35] p. 357; [77] Rn. 48).

(e) Duty to Prevent Misuse

If the bank learns of abusive transactions, it has to ensure that no further unauthorised transactions can be executed ([33] p. 218). Not every suspected misuse entitles the bank to block the internet account,²⁰ but the bank should at least inform the customer. In addition, the bank has to provide a system which enables the customer to block access to his account on his own initiative.²¹

¹⁹ The web presence of several banks has been tested for security and ease of use by the Fraunhofer Institute SIT, see [70].
²⁰ Such a right also cannot be stipulated in the terms and conditions; see [67].

cc) Duties of the Customer

The extent to which duties of care concerning IT-specific security (phishing mails, installation of virus protection and firewalls etc.) can be imposed on the customer is still being discussed intensively. Obligations arising from the general duties of care under tort law also have to be observed within a contractual relationship, § 241 sec. 2 BGB ([71] § 241, Rn. 28; [61] § 280 Rn. 104; [4] § 251 Rn. 92). Within the contractual relationship between a customer and the bank, observing these duties can be regarded as a minimum standard ([77] Rn. 56). The question remains whether the customer’s duties of care are increased within the internet-banking relationship ([33] p. 217). Increased duties of care could be supported by the fact that, given the intensive discussion of the phishing problem in the media and the related information distributed by the bank, a certain awareness of these dangers on the part of the customer can be assumed ([33] p. 217). It is, however, questionable whether the mere knowledge of existing dangers, without the technical knowledge or ability to counter them, can result in enhanced duties. Unlike the duties concerning secrecy and safe custody of the verification instruments, countermeasures against “internet-related threats” are unknown to the average customer. The technical abilities of the average customer therefore define the range within which security-related care can be expected ([79] pp. 178 ff.; [74] p. 225). If a threat which can only be controlled by an “above-average user” leads to damage, it has to be considered a risk to be borne by the bank. This is justified because the internet-banking offer is not limited to experienced users but addressed to all customers regardless of their internet experience ([79] pp. 178 ff.; [74] p. 225).
To determine the customer’s duties of care concerning a threat, the customer’s awareness and the technical and financial countermeasures that can reasonably be expected have to be taken into account. After corresponding instructions by the bank, the customer can be expected to follow commonly known warnings, such as to delete phishing mails, not to open the links these e-mails refer to and not to open attachments from unknown senders ([77] Rn. 59). A particular browser or firewall configuration of the customer’s PC, however, cannot be expected.

²¹ See also Art. 47 of the proposal of a directive on payment services in the internal market.

(a) Duty to Protect the Personal Computer

Courts acknowledge a general duty to install security updates and to use a virus scanner, as these precautions are known to the common user as well. Measures beyond these usual precautions, such as installing a firewall or defining a particular browser configuration, cannot be expected of the common internet user (different opinion [33] p. 217) and therefore cannot constitute obligations whose neglect would amount to a breach of duty.

Defining obligations concerning the protection of the private computer becomes pointless if the customer does not use internet banking from his home PC but from work. Requirements as to browser configuration usually cannot be applied there, as the customer generally does not possess the administrative rights needed to configure the computer or to install a firewall or security programs in the office. It should also be taken into account that an additional threat may arise from the working environment and from the easier access others have to the computer. For a skilled user it is no problem to install a keylogger or other malware on the customer’s office PC and thereby keep track of visited websites and passwords used. A breach of obligation by the customer could only be assumed if it was clear to him that the office PC was accessible to third parties and that no additional security measures had been installed.

(b) Duties Concerning the Storage of Verification Instruments and Emails

When processing banking transactions online, certain security measures can be expected from the customer. This means that PIN and TAN codes should not be stored on the hard drive of the computer and that the signature card (HBCI) should be removed from the card reader after the transaction has been processed ([74] p. 221). In addition, the usual duties concerning the handling of unknown e-mail attachments apply. After sufficient warning notices by the bank, the customer can be expected not to respond to phishing mails even if they appear to have been sent by the customer’s bank. Moreover, the customer should not enter PIN and/or TAN codes on websites opened via links contained in an e-mail ([33] p. 217; [74] p. 222).
Taking into account that phishing has been widely discussed in the media and that warnings are regularly issued by all banking institutions, these simple rules of conduct should be known to every internet user and do not depend on specialised computer knowledge. Whether a breach of duty by the customer can be assumed when the bank itself contacts the customer in other business matters is at least questionable, because under these circumstances it might not be possible for the customer to determine whether a certain mail originates from the bank or not ([16] p. 3314). It is therefore advisable for the bank to point out that it will never ask for verification instruments via e-mail (or in any other way) or, even safer, to abstain completely from contacting its customers via e-mail.

(c) Duties While Processing Transactions

Whether the average customer is able to verify the security characteristics of the website (correct URL, SSL lock symbol, information on the certificate, digital fingerprint) seems questionable (in favour [33] p. 216; doubting [35] p. 356). At least at present these security checks, although easy to apply, are not yet common knowledge ([77] Rn. 64).
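What the checks named above (correct URL, certificate fingerprint) amount to technically can be sketched in a few lines of Python. This is an illustration under assumptions only: the host names are placeholders, the function names `host_matches`, `fingerprint_sha256` and `fingerprint_matches` are hypothetical, and the sketch takes for granted that the bank publishes the SHA-256 fingerprint of its server certificate. In a real connection the DER-encoded certificate could be obtained, for example, with `ssl.SSLSocket.getpeercert(binary_form=True)`.

```python
import hashlib
from urllib.parse import urlsplit


def host_matches(url: str, expected_host: str) -> bool:
    """True only for an https URL whose host is exactly the expected one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host == expected_host.lower()


def fingerprint_sha256(cert_der: bytes) -> str:
    """Colon-separated SHA-256 fingerprint of a DER-encoded certificate."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(cert_der: bytes, published: str) -> bool:
    """Compare the certificate actually presented with the published fingerprint."""
    return fingerprint_sha256(cert_der) == published.strip().upper()


if __name__ == "__main__":
    genuine = "banking.example.com"  # placeholder for the bank's real host name
    print(host_matches("https://banking.example.com/login", genuine))
    # A typical phishing pattern: the genuine name is only a prefix of the host.
    print(host_matches("https://banking.example.com.evil.example/login", genuine))
```

The second URL shows why a glance at the address bar is easily deceived: the genuine name appears in it, but the host that is actually contacted belongs to someone else. That even this simple comparison has to be spelled out supports the doubt expressed in the text that such checks are common knowledge.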


(d) Duties in Case of Misuse

In case of certain or suspected misuse, the customer is obliged to inform the bank and to block access to his accounts ([16] p. 3314; [83] p. 339; [85] Rn. 19/86; [74] p. 223). If the customer fails to inform the bank, and if the bank could have prevented the false transaction had the information reached it in due time, the customer is liable for the resulting damage. In order to prevent further misuse, the customer must change his PIN after recognising its unauthorised use ([85] Rn. 16/68; [74] p. 224). A duty to change the PIN regularly, on the other hand, does not exist, as the growing number of PIN and password based applications in everyday life would overtax the common user ([79] p. 218; [85] Rn. 19/70; [74] p. 224).

dd) Default of the Customer, § 280 Abs. 1 BGB

The terms and conditions concerning internet banking do not contain a specific provision on liability. Therefore, the usual statutory liability applies, so that the customer is liable not only for gross negligence but for any negligence ([35] p. 354). § 280 sec. 1 sentence 2 BGB, however, presumes negligence if the contracting party has breached its duties. This presumption can be rebutted, although in practice this rarely succeeds. For this reason the exact definition of the duties to be fulfilled by the customer is essential ([77] Rn. 71).

ee) Contributory Negligence by the Bank, § 254 BGB

The customer may defend himself against claims of the bank by giving evidence of contributory negligence by the bank, in particular if insufficient security measures were the cause of the damage ([33] p. 219; [16] p. 3315). A concurrent cause could be assumed when the bank makes it harder to recognise manipulated websites by frequently changing the design of its website or by using complicated URLs that are not self-explanatory. The burden of proof for contributory negligence by the bank lies with the customer ([8; 35] p. 360; [82] § 254 Rn. 68).

30.3.2.3 Conclusion

In case of misuse of the verification instruments, the bank may base its claims against the customer on § 670 BGB (reimbursement of expenses) or on claims for damages, §§ 280, 241 sec. 2 BGB. If the bank cannot prove the conclusion of a (transaction) contract (on the burden of proof see the following section), it may only claim damages for a breach of duty by the customer consisting in the insecure handling of the verification instruments. The exact definition of the relevant duties of the customer is essential. Duties of the bank (duties of information, instruction and warning) correlate with duties of the customer. The requirements of due care placed on the customer when using internet banking are no higher than in ordinary internet usage, measured by the standard of the average internet user.

If a breach of duty by the customer can be assumed, possible contributory negligence by the bank has to be taken into account.

30.3.3 Procedural Law and the Burden of Proof

The bank has to state and prove its claim, either for reimbursement of expenses for the executed transaction or for damages, if the customer denies having ordered a payment transaction and claims that a third party has used the means of identification without the consent or knowledge of the user. If the transaction was initiated by the use of the correct verification instrument, the bank will have severe problems in proving the facts. Therefore, alleviations of the burden of proof are discussed: either a complete shift in the burden of proof or the use of prima facie evidence.

30.3.3.1 Shift in the Burden of Proof Towards the Customer

A shift in the burden of proof, obliging the customer to prove that he did not initiate the disputed transaction, has no legal basis in German law and would also not be appropriate ([77] Rn. 78). Although the bank has no insight whatsoever into the customer’s sphere and the customer is able to ensure the safe handling of the verification instruments, there remain additional risks in the internet communication between the customer’s PC and the bank server which are not controlled by the parties. These uncontrollable risks are the main reason why a shift in the burden of proof has to be rejected ([74] p. 145; for internet auctions the case law refuses a shift in the burden of proof as well, [69; 68; 46; 45; 53]). Moreover, a complete shift in the burden of proof to the customer would not allow for an assessment of the actual technical state of the art and its development (e.g. new threats) ([75] p. 208). Therefore, a shift in the burden of proof in internet banking would not be appropriate ([59] p. 111; [74] p. 144; [73] p. 459).
30.3.3.2 Prima Facie Evidence

Most legal scholars hold that the burden of proof remains with the bank; they prefer, however, to apply prima facie evidence, referring to similar problems in the use of bank cards (former ec-cards).²² Applied to internet banking, prima facie evidence means that a transaction initiated by the use of the correct verification instrument originates from the customer or an authorised third person or, if the customer can rebut this, that the customer did not handle the verification instrument with due care.

²² With regard to bank cards, the case law has long applied prima facie evidence, holding that if the card was used with the correct PIN this indicates prima facie that a) the cardholder initiated the transaction or b) the cardholder did not act with due care, making him liable for the resulting damage. Prevailing case law: [12; 34; 47; 48; 49; 50; 52; 1; 65]; see also [74] pp. 151 ff.

No court decisions on the application of prima facie evidence to internet banking have been issued yet.²³ The prima facie rule applies when evidence is produced which allows for a presumption, based on facts, that a certain event very probably causes a specific result. Hence, applying this rule requires the existence of such a presumption in internet banking. If such a presumption can be established, it has to be examined whether it can be rebutted by giving evidence of a phishing or pharming attack, thus showing an atypical cause.

aa) Presumption of Technical Security

The ground for either presumption (the customer has initiated the transaction or has acted negligently) is that, according to common experience, internet-banking applications are generally technically secure. The prevailing opinion amongst legal scholars affirms this security with reference to the PIN/TAN or HBCI system employed. These verification instruments can only be “cracked” with a considerable amount of time and effort, making such an attack improbable from a technical and economic point of view ([83] p. 339; [86] pp. 1047, 1050; [41] p. 24; [84] Rn. 19/17; [74] p. 150; [33] p. 218; [77] Rn. 90). The presumption of technical security is therefore the basis for the high probability that the customer or an authorised third party has initiated the transaction or, alternatively, that the customer has handled his verification instrument negligently, resulting in his liability. A prima facie rule is therefore justified as long as a high technical security standard can be affirmed ([77] Rn. 91).

At present, the signature-based HBCI system is considered secure, in contrast to the PIN/TAN system. While it is agreed that the danger of PINs and/or TANs being calculated or guessed is negligible ([35] p. 369; [86] p. 1047), security hazards arise from the danger of being spied on by phishing, pharming or trojan viruses.
Although the case law has recognised the possibility of “being spied on” as a possible means of rebutting the presumption of security in the use of bank cards, it has not ruled out the use of prima facie evidence in general ([11]). A direct transfer to internet banking is, however, not feasible. Although “physical” spying, e.g., looking over the shoulder of the customer and noting PIN and TANs, cannot be totally excluded, it has to be considered rather improbable. The real threat lies in the virtual attack: phishing, pharming or the use of trojan viruses. Taking these threats into account, the use of the “simple” PIN/TAN does not shield the user at a level which permits the assumption of adequate security. Considering the current state of the art, a distinction has to be made between verification systems with and without media conversion.

²³ The use of PIN/TAN instruments is, however, described as “secure” by [46].


(a) Verification Instruments Without Media Conversion

In view of the new threats posed by phishing and pharming, the actual security level of the PIN/TAN system does not match the danger of being spied out by means of phishing mails. Under these circumstances there are considerable doubts whether the prima facie rule can still be applied to the detriment of the customer ([77] Rn. 95 ff.). Moreover, customers cannot be held liable for acting without due care: protection against phishing24 and pharming is scarcely available to customers. Even if the customer applies all security measures, Trojan viruses can remain undetected and spy out the verification data ([35] p. 356). Due to these dangers the application of prima facie evidence when using PIN/TAN systems without media conversion has to be questioned ([77] Rn. 95 ff.).

(b) Verification Instruments with Media Conversion

The high security standard required for the presumption can be taken for granted when the verification instrument is based on media conversion. If, for example, a TAN generator is used, a spied-out TAN that is digitally linked to a certain transaction cannot be used for another (fraudulent) transaction. The same applies when the TAN is transmitted by SMS to the customer's mobile phone. Illegal use of the verification instruments is only possible when the attacker not only spies out the PIN but also gains access to the additional hardware (TAN generator, mobile phone), which indicates insecure storage by the customer and gives rise to the presumption of negligence.

(c) Conclusion

The prevailing opinion assumes technical security, including for the PIN/TAN system without media conversion, and burdens the customer with the full proof of having been spied out ([33] p. 219; [16] p. 3316). As the customer is obliged to keep the verification instruments secret, the illegal use of PIN and TAN indicates at least negligence by the customer. However, it is still far from clear which role media conversion could play. Due to the uncontrollable dangers of being spied out, the presumption of security can only be maintained for systems with media conversion, while for systems without media conversion the full burden of proof is shifted (back) to the bank ([77] Rn. 95, 101, 114).

bb) Rebutting Prima Facie Evidence

When prima facie evidence is applied, the customer has to rebut it by presenting evidence which shows a substantiated possibility of an atypical event ([6; 16] p. 3317; [33] p. 218; [64] § 286 Rn. 23; [88], vor § 284, Rn. 29). Applied to internet banking, the customer has to state (and prove) a substantiated possibility that an unauthorised third party initiated the transaction. Even if the customer is able to prove an atypical event, he has to show in a second step that he did not handle his verification instruments negligently. Which requirements must be met to rebut the prima facie rule has to be assessed by the court. However, courts have not yet had to decide upon these issues ([35] p. 359).25 If the customer rebuts the evidence by stating that he fell victim to a phishing attack, he should at least be able to produce the phishing mail in question. Whether the mail has to be considered genuine-looking or obviously faked would have to be decided in court. Pharming attacks, however, raise more difficulties. Some legal scholars contend that, due to the difficulty of proving a pharming attack, it should be sufficient to rely on abstracts or media reports ([35] p. 360). However, the purely technical possibility, without connection to the case in question, does not constitute counter-evidence at all.26

24 Save “obvious” e-mails, for example those using faulty grammar.

30 Legal Aspects of Internet Banking in Germany

cc) Final Conclusion

The application of the prima facie rule to internet banking (still) has to be approved by the courts. Whereas systems such as HBCI or PIN/TAN combined with media conversion are undisputed, discussion has recently begun about the safety of PIN/TAN systems without media conversion. When the customer claims a phishing or pharming attack, it is absolutely necessary to save the corresponding mail and the (unchanged) PC system. The use of an antivirus program after a possible attack may lower the chances of rebutting the prima facie evidence.
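The transaction binding on which media conversion relies can be illustrated with a minimal sketch. It assumes, purely for illustration, that the bank and the customer's TAN generator share a secret key and derive the TAN from the transfer details with an HMAC; the function names, the key and the message layout are hypothetical and do not describe any scheme actually used by German banks.

```python
import hashlib
import hmac


def transaction_tan(secret: bytes, payee_account: str, amount_cents: int,
                    digits: int = 8) -> str:
    """Derive a TAN bound to one specific transfer (hypothetical scheme)."""
    # The TAN depends on the payee and the amount, not only on the secret.
    message = f"{payee_account}|{amount_cents}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


def bank_accepts(secret: bytes, payee_account: str, amount_cents: int,
                 tan: str) -> bool:
    """Bank-side check: recompute the TAN for the submitted transfer."""
    expected = transaction_tan(secret, payee_account, amount_cents)
    return hmac.compare_digest(expected, tan)


# Assumed shared secret and illustrative transfer data.
secret = b"key-shared-by-bank-and-tan-generator"
tan = transaction_tan(secret, "PAYEE-A", 25_000)

legitimate = bank_accepts(secret, "PAYEE-A", 25_000, tan)
# An attacker who spied out the TAN tries to redirect the money.
redirected = bank_accepts(secret, "PAYEE-B", 25_000, tan)
```

In this sketch a spied-out TAN is of no use for a transfer to a different payee, which is the property the text relies on when it treats systems with media conversion as sufficiently secure.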

25 The LG Stralsund considered the prima facie evidence of a telephone bill rebutted by proof of the existence of a “Backdoor-explorer 32-Trojan” virus on the customer's hard drive, [55].

26 Although the proof is hard to produce, it has to be demanded, as the similar situation in bank card cases shows, where the abstract possibility of downloading undefined programs “from the internet” is regularly stated in order to “prove” the hack of the PIN.

References

1. AG Frankfurt, NJW (Neue Juristische Wochenschrift) 1998, 687
2. Balzer, Peter, Rechtsfragen des Effektengeschäfts der Direktbanken, in: WM (Wertpapiermitteilungen Nr. IV) 2001, 1533
3. Balzer, Peter, Haftung von Direktbanken bei Nichterreichbarkeit, in: ZBB (Zeitschrift für Bankrecht und Bankwirtschaft) 2000, 258
4. Bamberger, Heinz Georg/Roth, Herbert, BGB, 2. ed.
5. BGH, NJW 1990, 906 (907 f.)
6. BGH, NJW 1991, 230 (231)
7. BGH, NJW 1994, 517 (519 f.)
8. BGH, NJW 1994, 3102 (3105)
9. BGH, NJW 1994, 3349 (3350)
10. BGH, MMR (MultiMedia und Recht) 2004, 308 (311)
11. BGH, NJW 2004, 3623 (3624)
12. BGH, WM 2004, 2309
13. BGHZ (Entscheidungssammlung des Bundesgerichtshofs in Zivilsachen) 146, 138
14. Bräutigam, Peter/Leupold, Andreas, Online-Handel, Chapter VII
15. Neumann, Dania/Bock, Christian, Zahlungsverkehr im Internet, 2004
16. Borges, Georg, Rechtsfragen des Phishing, in: NJW 2005, 3313
17. Derleder, Peter/Knops, Kai-Oliver/Bamberger, Heinz G., Handbuch zum deutschen und europäischen Bankrecht, 2003
18. Borsum, Wolfgang/Hoffmeister, Uwe, Bildschirmtext und Bankgeschäfte, in: BB (Betriebsberater) 1983, 1441 (1444)
19. Borsum, Wolfgang/Hoffmeister, Uwe, Rechtsgeschäftliches Handeln unberechtigter Personen mittels Bildschirmtext, in: NJW 1985, 1205
20. Canaris, Wilhelm, Bankvertragsrecht, 1988
21. Dörner, Heinrich, Rechtsgeschäfte im Internet, in: AcP (Archiv für die civilistische Praxis) 202; 363
22. Geis, Ivo, Die digitale Signatur, in: NJW 1997, 3000
23. Gesetz zur Änderung der Vorschriften über Fernabsatzverträge bei Finanzdienstleistungen vom 02.12.2004, BGBl. I 2004, p. 3102
24. Schimansky, Herbert/Bunte, Hermann-Josef/Lwowski, Hans-Jürgen, Bankrechtshandbuch, 2. ed., 2001
25. http://www.hbci-zka.de
26. Hartmann, in Hadding/Hopt/Schimansky, Entgeltklauseln in der Kreditwirtschaft und E-Commerce von Kreditinstituten – Bankrechtstag 2001
27. Heermann, Peter W., K&R (Kommunikation und Recht) 1999, 6
28. Hellner, Thorwald, Rechtsfragen des Zahlungsverkehrs unter besonderer Berücksichtigung des Bildschirmtextverfahrens, in: Festschrift für Winfried Werner, 1984
29. Hoeren, Thomas/Oberscheidt, Jörn, VuR (Verbraucher und Recht) 1999, 371
30. Hoffmann, NJW-Beil. zu 14/2001
31. Baumbach, Adolf/Hopt, Klaus, Handelsgesetzbuch, 32. ed., 2006
32. Janisch, Sonja, Online-Banking, 2001
33. Karper, Irene, DuD (Datenschutz und Datensicherheit) 2006, 215
34. KG, NJW 1992, 1051 (1052)
35. Kind/Werner, Rechte und Pflichten im Umgang mit PIN und TAN, in: CR (Computerrecht) 2006, 353
36. Koch, Robert, Versicherbarkeit von IT-Risiken, 2005
37. Köhler, Helmut, Die Problematik automatisierter Rechtsvorgänge, insbesondere von Willenserklärungen, in: AcP 182, 126
38. Köndgen, Johannes, Neue Entwicklungen im Bankhaftungsrecht, 1987
39. Krüger, Thomas/Bütter, Michael, Elektronische Willenserklärungen im Bankgeschäftsverkehr: Risiken des Online Banking zugl. Besprechung des Urteils des LG Nürnberg-Fürth vom 19.05.1999 in: WM 2001, 221

40. Kümpel, Siegfried, Bank- und Kapitalmarktrecht, 3. ed.
41. Kunst, Diana, Rechtliche Risiken des Internet-Banking, in: MMR-Beil. (MultiMedia und Recht-Beilage) 09/2001, 23
42. Lachmann, Jens-Peter, Ausgewählte Probleme aus dem Recht des Bildschirmtextes, in: NJW 1984, 405
43. Lange, Thomas, Internet-Banking, 1998
44. Langenbucher, Katja, Die Risikozuordnung im bargeldlosen Zahlungsverkehr, 2001
45. LG Bonn, MMR 2002, 225 (256)
46. LG Bonn, MMR 2004, 179 (180, 181)
47. LG Bonn, NJW-RR (Neue Juristische Wochenschrift-Rechtsprechungs-Report), 1995, 815
48. LG Darmstadt, WM 2000, 9121 (913)
49. LG Frankfurt, WM 1999, 1930 (1932 f.)
50. LG Hannover, WM 1998, 1123
51. LG Itzehoe, MMR 2001, 833 (833)
52. LG Köln, WM 1995, 976 (977)
53. LG Konstanz, MMR 2002, 835 (836)
54. LG Nürnberg-Fürth, VuR 2000, 173
55. LG Stralsund, MMR 2006, 487 (488)
56. Mackenthun, Thomas/Sonnenhol, Jürgen, Entgeltklauseln in der Kreditwirtschaft – e-Commerce von Kreditinstituten, Bericht über den Bankrechtstag am 29.06.2001 in Kiel, in: WM 2001, 1585
57. Marburger, Peter, Die Regeln der Technik im Recht, 1979
58. Mehrings, Josef, Vertragsschluss im Internet: Eine neue Herausforderung für das „alte“ BGB, in: MMR 1998, 30
59. Mellulis, Klaus-J., Zum Regelungsbedarf bei der elektronischen Willenserklärung, in: MDR (Monatsschrift für deutsches Recht) 1994, 109
60. Möller, Mirko, Rechtsfragen im Zusammenhang mit dem PostIdent-Verfahren, in: NJW 2005, 1605
61. Münchener Kommentar zum Bürgerlichen Gesetzbuch, 4. ed.
62. Münchener Kommentar zum Bürgerlichen Gesetzbuch, 4. ed.
63. Münchener Kommentar zum Bürgerlichen Gesetzbuch, 4. ed.
64. Musielak/Foerste, ZPO, 9. ed.
65. LG Osnabrück, NJW 1998, 688
66. OLG Hamburg, MMR 2003, 340 (340)
67. OLG Köln, CR 2000, 537
68. OLG Köln, MMR 2002, 813
69. OLG Naumburg, NJOZ (Neue Juristische Online-Zeitschrift) 2005, 2222 (2224)
70. Phishing: http://www.sit.fraunhofer.de/cms/media/pdfs/phishing.pdf
71. Palandt, BGB, 67. ed., 2008
72. Proposal for a Directive of the European Parliament and of the Council on payment services in the internal market and amending Directives 97/7/EC, 2000/12/EC and 2002/65/EC (presented by the Commission)
73. Roßnagel, Alexander, Auf dem Weg zu neuen Signaturregelungen – Die Novellierungsentwürfe für SigG, BGB und ZPO, in: MMR 2000, 451
74. Recknagel, Einar, Vertrag und Haftung beim Internet-Banking, 2005
75. Sieber, Stefanie/Nöding, Toralf, ZUM 2001, 199
76. Siebert, Lars Michael, Das Direktbankgeschäft, 1998
77. Spindler, Gerald, Online-Banking, Haftungsprobleme, 2006 – abstract presented at the „3. Tag des Bank- und Kapitalmarktrechts 2006“ in Karlsruhe
78. Bamberger, Heinz Georg/Roth, Herbert, BGB, 2. ed.
79. Spindler in Hadding/Hopt/Schimansky, Entgeltklauseln in der Kreditwirtschaft und E-Commerce von Kreditinstituten – Bankrechtstag 2001
80. Trapp, Andreas, Zivilrechtliche Sicherheitsanforderungen an eCommerce, in: WM 2001, 1192 (1199)
81. Ultsch, Michael, Zugangsprobleme bei elektronischen Willenserklärungen – Dargestellt am Beispiel der Electronic Mail, in: NJW 1997, 3007
82. Bamberger, Heinz Georg/Roth, Herbert, BGB, 2. ed.
83. Werner, Stefan, Das Lastschriftenverfahren im Internet, in: MMR 1998, 338
84. Hellner, Thorwald/Steuer, Stephan, BuB (Bankrecht und Bankpraxis)
85. Hellner, Thorwald/Steuer, Stephan, BuB (Bankrecht und Bankpraxis)
86. Wiesgickl, Margareta, Rechtliche Aspekte des Online-Banking, in: WM 2000, 1039
87. http://www.zentraler-kreditausschuss.de/
88. Zöller/Greger, ZPO, 26. ed.
89. (97/489/EC), Official Journal 1997, L 208, pp. 52 ff.
90. (2002/65/EC), Official Journal, 2002, L 271 p. 16

CHAPTER 31 Challenges and Trends for Insurance Companies

Markus Frosch, Joachim Lauterbach, Markus Warg

31.1 Starting Position

There are some 670 insurance companies1 operating in the German insurance market, which grew in 2004 by 1.6 percent in direct non-life/accident insurance, by 2.6 percent in life insurance and by 7.8 percent in health insurance. In such mature and highly competitive markets, policies, prices and commissions converge, so that in many segments insurance services can almost be regarded as standardised products from a customer perspective. Nevertheless, some insurance companies have succeeded in recording organic growth far above the market average. If one excludes the non-actuarial result, i.e. investment profits, the factors underlying this success must be sought in the actuarial result. The actuarial result is determined by premium income, benefits and costs. This article explains in detail why there are significant differences between business results and derives from these the challenges to be faced in the future.

The combined ratio is an important key figure used in reviewing the profitability of a single insurance contract, of all contracts with a customer, of a part of the business in force or of the total insurance business in force. This combined loss-expense ratio for a financial year is the sum of the gross loss ratio (calculated from the losses paid out or reserved in the reporting year plus processing expenses and internal claim settlement costs), the administration expense ratio and the cost ratio for other expenses, each expressed relative to the gross premiums earned. As soon as this ratio exceeds 100 percent, the transaction or business segment under review has suffered an actuarial loss. By comparing, over a period of several years, the trend of the combined ratio of a company's business segments with the combined ratios of other insurance companies, important conclusions can be drawn with regard to the susceptibility to losses and consequently the quality of the business in force and the efficiency of management.2

1 cf. GDV, 2005 Statistical pocket book of the insurance industry, 2 number of insurance companies.

The reasons underlying positive actuarial results, i.e. combined ratios clearly below 100 percent, are that some insurance companies have succeeded in achieving premium growth in high yield business sectors, which is the objective of almost all insurance companies, while at the same time reducing costs. If products and prices are standardised to a large extent, the right service strategy, competence and the services offered become the decisive factors for the success of the sales partner with the customer. This requires multi-faceted interaction between strategy, sales and marketing, and management. The following points must be mastered in order to cover this spectrum optimally:

• First of all it must be ensured that all resources are deployed in line with the company objectives. The business, IT and HR strategies must be balanced. A possible method for this is illustrated by the Balanced Scorecard.
• If the core processes are in line with the company objectives, it is sensible to introduce self-controlling systems. An example of this is sales and marketing management based on contribution margins, which ensures that the incentive structures for the sales partners are in line with the company objectives.
• In a highly standardised market it is effective with regard to customer perception to create a distinction between the company and the competition. An example of a successful unique selling proposition is the introduction of “on demand insurance” by an Austrian insurance company.
• Competent and service-oriented advice is based on many processes that support sales and marketing. Internet-based access to primary data is becoming more important for the integration of the customer, sales and marketing, and management.
• Excellent services provided in such a complex environment result from the employment of excellent staff. The talent required in the view of the company must be identified, recruited and developed. Acquiring this management talent is probably the most formidable of these challenges.
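The combined ratio defined earlier in this section can be sketched as a short calculation. The class name, field names and figures below are illustrative assumptions, not data from the chapter; the sketch simply adds the loss ratio, the administration expense ratio and the ratio for other expenses, each measured against gross premiums earned.

```python
from dataclasses import dataclass


@dataclass
class SegmentYear:
    """Figures for one business segment and financial year (hypothetical)."""
    gross_premiums_earned: float
    losses_paid_or_reserved: float
    claim_settlement_costs: float   # processing and internal settlement costs
    administration_expenses: float
    other_expenses: float


def combined_ratio(s: SegmentYear) -> float:
    """Sum of loss, administration and other-expense ratios."""
    premiums = s.gross_premiums_earned
    loss_ratio = (s.losses_paid_or_reserved + s.claim_settlement_costs) / premiums
    admin_ratio = s.administration_expenses / premiums
    other_ratio = s.other_expenses / premiums
    return loss_ratio + admin_ratio + other_ratio


def is_actuarial_loss(s: SegmentYear) -> bool:
    """A combined ratio above 100 percent means an actuarial loss."""
    return combined_ratio(s) > 1.0


# Made-up example segment: ratio of roughly 94 percent, i.e. no actuarial loss.
motor = SegmentYear(
    gross_premiums_earned=1000.0,
    losses_paid_or_reserved=620.0,
    claim_settlement_costs=50.0,
    administration_expenses=230.0,
    other_expenses=40.0,
)
ratio = combined_ratio(motor)
```

Tracking this value per segment over several years, and against competitors, gives the comparison the text describes.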

2 cf. F. von Fürstenwerth, A. Weiss: Insurance Alphabet, 2001, page 143.

31.2 Challenges and Trends

31.2.1 Business and IT Strategy in the Balanced Scorecard

In keeping with the thinking and conduct of a prudent businessman, the development of the market, staff, processes and finances should be structured in a balanced manner and agreed; the Balanced Scorecard provides a systematic process geared to this.


Figure 31.1. Balanced Scorecard

Naturally it is the duty of a company to generate a net income that is customary for the market; in the international arena a 15 percent return on equity capital is considered a normal return on business activities. Ultimate success cannot be achieved if financial targets are continually missed. The challenge is rather to set out transparently the financial targets and the method by which they can be achieved, and to communicate both to the staff. The strategy regarding the positioning of the company and its business alignment, which ensures that the financial targets are achieved on a medium- and long-term basis, should form part of this Scorecard.

The product portfolio and the business segments, but also market presence and the interaction with the customer, are reflected and understood in the customer Scorecard, so that the costs and revenues contributed to the bottom line by the products or business segments are made transparent. Only on this basis can growth targets be properly specified, i.e. allocated to individual business segments, and financial targets consistently met. The targets for the processing area are then derived from the understanding of the finance and customer areas. Here the balancing act to be achieved in most cases becomes apparent: in concrete terms, reducing the administration expense ratio while simultaneously improving sales and marketing support or other services.

The staff is responsible for the total process. Success can only be achieved on a sustainable basis if the staff understand and support the company objectives and the actions taken to achieve them. Modern personnel policies, such as employment based on ability and incentive systems that reward the achievement of both company objectives and individual targets, are essential for this. In addition to its importance as a communication and management tool, the value of the Scorecard lies in its portrayal of dependencies in visual form. Growth in high yield sectors can only be achieved if these segments are known, i.e. if detailed contribution margin accounting has been performed.

[Figure 31.2: strategy map with four perspectives and their targets. Strategy and finance: U1 guarantee independence, U2 expand market position, U3 ROE 15 %. Customer and markets: K1 transparency in costs and income, K2 unique selling propositions, K3 sales managed based on contribution margin, K4 service offensive. Process and structures: P1 sales support, P2 reduction of admin costs, P3 improvement in degree of automation. Staff and partners: M1 HR process integration, M2 target-based salary, M3 talent management. The diagram also carries the numeric labels 0, 1 and 2.]

Figure 31.2. Dependencies in the Balanced Scorecard

The interdependencies are clarified in Figure 31.2. If a return on equity of 15 percent is set as a target, sales must then be managed on a contribution margin basis so that the overall target can be “broken down” into individual targets. Cost and revenue transparency is in turn a central requirement for managing sales on a contribution margin basis. Only on the basis of this transparency can a sales business plan be prepared whose implementation results in the desired profit targets being met. Reliable information on costs and revenues, in particular from a product perspective, is essential both for cost and revenue transparency and for managing sales on a contribution margin basis. All the milestones needed to meet the targets can only be allocated if there are transparency, clear communication and agreed targets that relate to the superordinate target. The “delivery themes” that underpin this process chain are marked with the number 2.
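The dependency chain just described (cost and revenue transparency first, then contribution-margin-based sales management, then the return-on-equity target) can be written down as a small prerequisite graph. This is only an illustrative sketch: the two edges are taken from the paragraph above, the labels follow Figure 31.2, and the use of Python's `graphlib` module is an assumption for illustration, not a tool named by the authors.

```python
from graphlib import TopologicalSorter

# Each target maps to the set of targets that must be in place first.
# Only the two dependencies stated in the text are modelled here.
prerequisites = {
    "U3: ROE 15 %": {"K3: sales managed based on contribution margin"},
    "K3: sales managed based on contribution margin": {
        "K1: transparency in costs + income"
    },
}

# static_order() yields every target after all of its prerequisites.
delivery_order = list(TopologicalSorter(prerequisites).static_order())

for step, target in enumerate(delivery_order, start=1):
    print(step, target)
```

Read from top to bottom, the resulting order is the sequence in which the targets have to be delivered for the superordinate financial target to become reachable.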

31.2.2 Self-controlling Systems: Management Based on the Contribution Margin

A service, irrespective of whether it is provided by the sales partner or by management, is not an end in itself and can adversely affect profits if incorrectly allocated. The costs of providing services must therefore be allocated to the revenues generated by the transaction. This also applies to costs for advertising and marketing events and in particular to the remuneration of the sales team, as the yield on the transaction is more crucial than the revenue generated by it, i.e. the premiums written by the sales partner. The contribution margin calculation is used as the central tool for providing relevant information for the decision-making process.

The contribution margin is a form of profit calculation that differentiates between fixed and variable costs. It is the difference between the revenue generated and the variable costs incurred in producing that revenue. The contribution margin specifies the amount that is available to cover fixed costs and is often broken down further into contribution margin 1, 2 etc. in order to show the effect of different types of costs. For example, premium income and other income less losses paid out is referred to as contribution margin 1; the same amount less commissions as well is contribution margin 2; and less directly allocatable administration expenses in addition is contribution margin 3 (see footnote 3).

But which contribution margin is the correct one to use for impending decisions? The selection is large: product, class of business, company department, brand, customer, customer group based on age, sex or location, sales, exclusive sales, agency, representative, sales by broker etc. The list can be greatly extended. Finally, the importance of the period used for calculating the contribution margin should be mentioned, as this period (several years, year, month, week) can also affect the contribution margin amount and consequently the information that is calculated for this period and used as a basis for decision-making. This illustrates the scope of the data needed.
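The staged contribution margins described above can be sketched as follows. The function name and the figures are hypothetical; the three stages follow the definitions in the text (income less losses paid out, then less commissions, then less directly allocatable administration expenses).

```python
def contribution_margins(premium_income: float, other_income: float,
                         losses_paid: float, commissions: float,
                         direct_admin_expenses: float) -> tuple:
    """Return contribution margins 1, 2 and 3 as defined in the text."""
    cm1 = premium_income + other_income - losses_paid
    cm2 = cm1 - commissions
    cm3 = cm2 - direct_admin_expenses
    return cm1, cm2, cm3


# Made-up figures for one product line and one period.
cm1, cm2, cm3 = contribution_margins(
    premium_income=1000.0,
    other_income=20.0,
    losses_paid=600.0,
    commissions=150.0,
    direct_admin_expenses=100.0,
)
```

The same function can be evaluated per product, customer group, sales channel or period, which is exactly where the choice of dimension discussed in the text becomes a data question.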
Insurance companies require much more than customer data: product information, commission payments, claim payments, periods, directly allocable costs etc. All relevant data and their interrelationships must be brought together so that evaluations can be performed on consistent data. As illustrated in the SAS Non Life Subject Model, the set of relevant data and its interrelationships must first be defined. This data must be allocated to specific objects that are uniquely identifiable in the insurance world. In information technology these objects are called entities; customer, policy, sales partner, product and costs are examples. In the design of a self-controlling system, however, the entity type is more important than the individual entity: the general characteristics of the customer are the central point, as profiles for target customers or “non-targets” can be determined from them. Using this logical data model, the primary data must then be extracted from the source systems, transformed into the target schema and loaded into a separate data inventory (Detail Data Store). This data inventory, combined with the analysis software, provides the basis from which reports on a contribution margin basis can be produced.
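The extract, transform and load steps described above can be pictured with a minimal sketch. Everything concrete in it is an assumption for illustration: the source field names (`POLNR`, `PROD`, `PREM`, `LOSS`), the `PolicyRecord` entity type and the in-memory dictionary standing in for the Detail Data Store are hypothetical and do not reproduce the SAS model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRecord:
    """Target-schema entity type (hypothetical)."""
    policy_id: str
    product: str
    premium: float
    losses_paid: float


def transform(row: dict) -> PolicyRecord:
    """Map one raw source row to the target schema."""
    return PolicyRecord(
        policy_id=row["POLNR"].strip(),
        product=row["PROD"].strip().lower(),
        premium=float(row["PREM"]),
        losses_paid=float(row["LOSS"]),
    )


def load(records, store: dict) -> None:
    """Load records into the detail store, keyed by policy id."""
    for record in records:
        store[record.policy_id] = record


def margin_by_product(store: dict) -> dict:
    """Report premium less losses paid out per product."""
    totals = defaultdict(float)
    for record in store.values():
        totals[record.product] += record.premium - record.losses_paid
    return dict(totals)


# Extract: rows as they might arrive from a source system (made up).
source_rows = [
    {"POLNR": " P-1 ", "PROD": "Motor", "PREM": "500", "LOSS": "200"},
    {"POLNR": "P-2", "PROD": "motor ", "PREM": "300", "LOSS": "100"},
    {"POLNR": "P-3", "PROD": "Boat", "PREM": "800", "LOSS": "900"},
]
detail_data_store = {}
load((transform(row) for row in source_rows), detail_data_store)
report = margin_by_product(detail_data_store)
```

Because every record is normalised to one entity type before loading, the report can aggregate consistently even though the source rows spell the product differently.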

3 cf. F. von Fürstenwerth, A. Weiss: Insurance Alphabet, 2001, page 155.


Source: SAS – Non Life Subject Model

Figure 31.3. Non life subject model4

[Figure 31.4: technical architecture diagram. Labels recoverable from the diagram: Source Data; ETL Process and Metadata; Integration, Transformation and History Management Layer; Detail Data Store; Logical Data Model; Current Data; Data Marts (Dimensional Physical Models, HOLAP Data Models, Analytical Data Models, Campaign Management Data Mart); Exploitation; Applications (Reporting, Analytics, BSC, Knowledge Base, Performance Management, Campaign Management); Online HTML WA documentation.]

Source: SAS – Insurance Intelligence Solutions: Technical Architecture

Figure 31.4. Insurance intelligence solutions: technical architecture

The extent of the dovetailing of the business strategy – growth in high yield business segments – with IT is clearly illustrated in this process diagram. The dependencies can also be shown in visual terms in the Balanced Scorecard by means of so-called strategy maps, as is illustrated in the following graph for the strategy and customer areas.

4 Source: Non Life Subject Model.

Source: SAS – Strategy Maps

Figure 31.5. Strategy maps

31.2.3 Unique Selling Propositions: On Demand Insurance

Insurance taken out by private households as a unit comprises three different service levels. The first level represents the core product, e.g. the specific insurance cover for a motorboat. At the second level, services that are directly connected to the core product are added, such as advice, explanations, customer service and the processing of claims relating to the motorboat insurance policy. The third level includes additional services that also take into account the specific personal circumstances of the customer with regard to the core product and thereby round off the total benefits offered.5 In the case of motorboat insurance this can be, for example, a smartphone that has worldwide reception and is programmed with location-specific emergency telephone numbers, giving the customer the reassurance that he is prepared for emergencies in this regard as well.

The more similar insurance products become, or the more the customer perceives that there is no difference in quality at the core product level, the more important the provision of services becomes. The services are then no longer regarded as a simple addition to the core product but as primary services provided by the supplier.6 Opportunities thus arise for innovations and special products that trigger a feeling of uniqueness in the mind of the customer. This allows the insurance company, as a result of the low price elasticity of demand, to institute a company-friendly price policy, at least for a certain period of time. Naturally there is the risk that additions made at the second and third levels are substituted by the competition.

5 cf. M. Haller, P. Maas, in P. Albrecht, E. Lorenz, B. Rudolph: Risk Research and Insurance, 2004, page 179–214.

6 cf. M. Kremer, 1994.

[Figure 31.6: three service levels. Level 1: core product. Level 2: core market service / core function. Level 3: market services / additional functions. The diagram also marks innovations as an opportunity and substitutions as a danger.]

Source: Haller M., Maas P., in Albrecht P., Lorenz E., Rudolph B.: Risikoforschung und Versicherung, Karlsruhe 2004, pp. 179-214

Figure 31.6. Service levels

As these services represent additions to the core product, they are, together with the personal circumstances of the customer, also to be included in the core product and its premium calculation. An example of this is the “flexible car insurance with a base fee and a price per kilometre”,7 which is being trialled by UNIQA, the Austrian insurance group, in conjunction with IBM. The objective is to make the premium dependent on the actual use of the car: the premium decreases the less and the more safely the customer drives. Car usage is determined through a combination of satellite navigation, handheld technology and IT. The car is tracked by GPS via satellite, and the position data is then fed to a central computer via a mobile telephone network. A Navi box installed in the car serves as the GPS receiver and the GSM transmitter. The computer then compares the navigation data with a map and calculates the kilometres driven and the routes taken, which are then used to calculate the premium.

If safe driving is rewarded by lower premiums in this manner, the new model could also be a motivating factor for increased road traffic safety. This form of needs-based insurance can also have a beneficial effect on the environment and society. If the car is used less, through more targeted use, and is driven more on safer routes, this will result in a reduction in traffic congestion and accidents, and the environment will be protected.

7 cf. UNIQA press release “Flexible Car Insurance with Basic Fee and Price per Kilometre”, UNIQA press department 5 October 2005.

[Figure 31.7: the Navi box records the tours with the aid of the GPS coordinates; the data is transmitted over a connection secured through encryption; the combination of the GPS and map data is used for the calculation of the insurance premium.]

Figure 31.7. Flexible car insurance

Furthermore, this technology could also be applied as a car finder for locating a stolen car. An emergency call function is also provided by the connection to the mobile telephone network. In an emergency the driver can transmit an emergency call via GSM; the system locates the car and sends help. It is also possible to develop the system further so that an accident can be identified automatically, for example when the car stops suddenly. The use of the tracking system in managing car fleets is also conceivable. The target groups for this needs-based insurance approach are car drivers who do not drive much, owners of second cars, and younger car drivers who are interested in new technology and are looking for favourable rates.8
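The kilometre-based premium described in this section can be sketched in a few lines. This is a simplified illustration under stated assumptions: the real system compares the GPS data with a map, whereas the sketch below merely sums great-circle (haversine) distances between consecutive GPS fixes, and the base fee and price per kilometre are invented values, not UNIQA tariffs.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple, b: tuple) -> float:
    """Great-circle distance between two (latitude, longitude) fixes in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def kilometres_driven(fixes: list) -> float:
    """Sum the distances between consecutive GPS fixes of a tour."""
    return sum(haversine_km(p, q) for p, q in zip(fixes, fixes[1:]))


def premium(base_fee: float, price_per_km: float, km: float) -> float:
    """Base fee plus a usage-dependent component."""
    return base_fee + price_per_km * km


# Illustrative tour along the equator: two segments of one degree each.
tour = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
km = kilometres_driven(tour)
monthly_premium = premium(base_fee=10.0, price_per_km=0.05, km=km)
```

A safety-dependent surcharge or discount per route type, as hinted at in the text, would need the map comparison that this sketch deliberately leaves out.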

8 cf. Dr. Hajek in UNIQA press release “Flexible Car Insurance with Basic Fee and Price per Kilometre”, UNIQA press department 5 October 2005.

31.2.4 Sales and Marketing Support: Access to Primary Data

The scope of activities that support sales and marketing is very extensive. Over and above the information and knowledge required to support the competence of the salesman on site, the integrated process support system is the central point. The process structure is key to the quality and processing speed as perceived by the customer. Outlined below are some process milestones that are of pivotal importance for a high-quality advisory and closing process:

• Range of products
• Automatic data transfer (for existing customers: name, address, …) when preparing the insurance proposal
• Plausibility checks on the range of products to avoid any subsequent queries
• Transfer of proposal data into back office processing systems (portfolio, documentation systems)
• Authority levels
• Contract information with regard to extent of cover
• Information systems that provide extensive and timely information on existing contracts
• On-site changes of a technical and non-technical nature
• Transparency over the status of the processing of the transaction
• Cooperation agreed between sales and the call centre

The quality of these process components, with regard to the quality of the information, the potential for error, the throughput speed and the homogeneity of the information between sales, processing, the service centre and the portfolio team, plays a decisive role in how the customer perceives quality. The standardisation of the support processes and the elimination of any redundant data housekeeping and functions is a central aspect from an IT perspective. One method for achieving this is access to primary data, i.e. sales as well as processing staff have direct access, subject to authority levels, to contract, customer, collection and claims data, among others. This strategy for process support has several advantages:

1. First of all, the data/information is complete and current, as the “leading system” has been accessed.
2. In addition, the data/information is consistent, as no partial data sets are replicated and maintained redundantly.
3. Furthermore, this strategy allows data housekeeping and the number of applications to be reduced, resulting in a decrease in licence fees and maintenance costs in addition to a reduction in complexity.

A technology that allows the primary data held in the portfolio systems to be accessed is IMS Connect, supplied by IBM. IMS Connect is the interface through which Java applications and TCP/IP clients call up IMS transactions. At the same time IMS Connect, together with the IBM development tools, provides a programming environment that has code generators, e.g. for Web services. For example, this enables the SOAP Gateway to talk directly to IMS transactions as a Web service, without the need to write or generate code or to use an application server.9

9 cf. IBM Corporation 2005: more details can be found on the IMS homepage: http://www.ibm.com/ims.
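To give an impression of what a client sends when an IMS transaction is exposed as a Web service, the following sketch builds a generic SOAP 1.1 request with Python's standard library. It is purely illustrative: the service namespace, the operation `GetPolicy` and the element `PolicyNumber` are hypothetical names, and nothing here reproduces the actual message format or configuration of the IBM SOAP Gateway.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
SERVICE_NS = "urn:example:policy-inquiry"               # hypothetical service


def build_request(policy_number: str) -> bytes:
    """Build a SOAP envelope asking for one policy (hypothetical operation)."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}GetPolicy")
    number = ET.SubElement(operation, f"{{{SERVICE_NS}}}PolicyNumber")
    number.text = policy_number
    return ET.tostring(envelope, encoding="utf-8")


def read_policy_number(message: bytes) -> str:
    """Parse a request again, as a receiving gateway would have to."""
    root = ET.fromstring(message)
    path = (f"{{{SOAP_NS}}}Body/{{{SERVICE_NS}}}GetPolicy/"
            f"{{{SERVICE_NS}}}PolicyNumber")
    return root.find(path).text


request = build_request("P-4711")
```

The point of the sketch is only that the caller needs nothing beyond open standards such as SOAP and XML; the mapping of such a message to an IMS transaction is handled on the gateway side.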

31 Challenges and Trends for Insurance Companies

Figure 31.8. Network connection (diagram title: “Network Connection to IMS TM”; labels: SNA with LU0, LU1, LU2, LU6.1 and LU6.2; VTAM; APPC/MVS; TCP/IP; TCP/IP for z/OS; WebSphere MQ; IMS Connect; Base DCE Services; AS/IMS; OTMA; IMS/ESA; IMS applications. © IBM Corporation 2005)

There are basically two options for accessing the primary data in this way: via protocols such as APPC or VTAM (3270 terminals) in older architectures in an SNA network, or via TCP/IP in newer architectures. In IMS an interface called OTMA is used for this purpose. Both IMS Connect and MQ Series (now called WebSphere MQ) are OTMA clients and can call up transactions and receive responses. The following communication options exist: from 3270 terminals using the Telnet3270 protocol; via MQ Series (WebSphere MQ); from applications via socket connections to IMS Connect; using the IMS Connector for Java; or via the SOAP gateway, which makes the transactions available as a Web service.

The SOAP gateway is a Web service solution for IMS that requires the installation of IMS Connect. It enables IMS applications or modules to be integrated into Service Oriented Architectures (SOA). Direct communication between the Web service initiator and IMS is supported, so that no application server needs to be interposed. Open standards such as SOAP and XML are also supported. The Web service is immediately available once the configuration files have been generated with the WebSphere development tools (WDz and RAD) and the SOAP gateway has been informed.

The runtime components required to execute the Java applications form part of IMS Connect. The development components or development tools through which, for example, a Java application or a Web service can be generated from an IMS transaction written in COBOL, are an integral part of the IBM Development Tools

Markus Frosch et al.

Figure 31.9. IMS Connect (diagram labels: Web browser and Web server; TCP/IP client using IMS Connector for Java; any TCP/IP client; TCP/IP network; TCP/IP for z/OS; IMS Connect; OTMA; IMS TM; IMS application with GU IOPCB and ISRT IOPCB calls. © IBM Corporation 2005)

WebSphere Studio Application Developer-Integration Edition (WSAD-IE), the Rational Application Developer (RAD) or the WebSphere Developer for z/OS (WDz).

This diagram (Figure 31.9) illustrates that it is possible to operate Web applications (browser, Web server, IMS Connect; three-level model) and also so-called fat clients (client, IMS Connect; two-level model) in parallel. The applications open a TCP/IP socket and send a message (the call-up of a transaction) to IMS Connect. IMS Connect formats this message and places it in the IMS queue via the OTMA interface. The processing then carried out by IMS is exactly the same as that triggered from a 3270 terminal. The transaction response is sent back to the initiator by the same route. Interestingly, because IMS Connect is very streamlined, application processing is several times faster than with MQ Series.10

In such service oriented architectures it is important that the services are not too small. The smaller the services, the higher the communication expense. Services should be defined as larger functional units, for example an entire transaction. Equating services with individual functions, of which parts of the application contain hundreds, will degrade performance, in particular if these services are distributed across several systems or platforms.
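The effect of service granularity on communication expense can be illustrated with a simple cost model. The following Python sketch is not taken from the IBM material. It assumes that every service call incurs a fixed communication overhead and that the total processing work stays the same however it is split; the function name and all figures are hypothetical and chosen only for illustration.

```python
def total_time_ms(calls: int, overhead_ms: float, work_ms: float) -> float:
    """Elapsed time for one business request that is split into `calls` service calls.

    overhead_ms: communication cost per call (assumed constant for every call).
    work_ms: total processing time (assumed independent of how the work is split).
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    return calls * overhead_ms + work_ms


# Hypothetical figures, not measurements.
OVERHEAD_MS = 2.0   # communication cost per service call
WORK_MS = 40.0      # processing time of the entire transaction

# One coarse-grained service that covers the entire transaction.
coarse = total_time_ms(1, OVERHEAD_MS, WORK_MS)

# The same work exposed as 200 function-sized services.
fine = total_time_ms(200, OVERHEAD_MS, WORK_MS)

print(f"coarse-grained: {coarse:.0f} ms, fine-grained: {fine:.0f} ms")
```

Under these assumptions the communication share grows linearly with the number of calls, which is why a service that spans an entire transaction is preferable to hundreds of function-sized services, especially across systems or platforms, where the per-call overhead is likely to be higher.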

10 cf. IBM Corporation 2005.

Figure 31.10. IMS Connector for Java (IC4J) (© IBM Corporation 2005)

31.2.5 Talent Management: Identify, Hire and Develop Talented Staff

The successful positioning, the agreement of processes and incentive structures for setting the company objectives, the development of appropriate self-controlling systems, innovative ideas and also the optimal support for sales all result from teamwork that is moulded by the company culture and the existing talent. Sentences such as “our success is due to our staff” and “we are developing our talented staff” have been heard more and more often in the last few years. This is probably also due to the recognition that technologically standardised processes and simple activities are increasingly being transferred abroad as part of the ever more noticeable globalisation process. It is only human nature that the population of the Western world is rising up in revolt against this development. However, this will not change the fact that jobs are not going to locations where the level of salaries is comparatively high and capital is not flowing to locations where the rate of interest earned is comparatively low.11 If we want to secure the future, we must change the way we operate. We must be open to new ideas, we must change our perceptions and develop forward-looking concepts with regard to the economy, society and politics.12

11 cf. M. Miegel: Change of era, 2005, page 232.
12 cf. M. Miegel: Change of era, 2005, page 282 et seqq.

Consequently, aspirations at both the society and company level are being concentrated on intellect and its innovative spirit, and thus on staff. However, except in some cases, this recognition, which is becoming increasingly prevalent in society and companies, is not reflected in HR management in the form of a consistent policy for identifying, acquiring and developing talented staff. This already poses a few questions:

• What does a talented member of staff mean to you?
• Which talents are strategically relevant to your company?
• Is the HR organisation consistent with the company strategy?
• Is there a standard process in the company that compares the actual performance of HR to the HR plan?

Talent is an inborn asset that allows one to perform excellently in a field. Talent finds its own way of evolving. Pleasure in performing a task can indicate that more opportunities should be provided to evolve and develop the talent. “The most important thing for developing a talent is the opportunity to do so”13. However, the creation of these development opportunities is not an end in itself for a company but a central factor in its success. Each clearly defined positioning of the company requires specific investment in staff and a HR department that identifies, positions and develops this investment and is geared to achieving the company objectives.14

Figure 31.11 illustrates by way of an example the steps that need to be taken to implement a HR scorecard. The upper half of the diagram shows the interrelationships between the positioning of the company, the value drivers (talented staff) and targets; the bottom half shows the operational action plan, i. e. the control loop between incentive plans and the achievement of targets. Both halves must be brought together in order for the company to achieve success.

Kaplan provides another view of this interrelationship between the positioning of the company and the HR organisation by means of the “Strategic Human Capital Planning” and the “Strategic HR Management” aspects.15 The HR requirements are first of all derived from the values on the Balanced Scorecard: strategic skills and talents, management attributes, culture, alignment with the company objectives and the learning environment. The “Human Capital Development Program” is then defined based on covering the plan profile. A development plan is defined as part of the process for the areas relating to the development of expertise and management skills, the company culture, incentive structures and the agreement of objectives etc.

13 cf. Promerit Talent Management position statement, 06.07.2004 and 10.03.2005.
14 cf. Promerit HR Scorecard: Transparency and Management for the Personnel department, 06.2004.
15 cf. M. Bower, R.S. Kaplan: Balanced Strategy Focused Organizations with the Balanced Scorecard, 2001.

Figure 31.11. Introduction of a HR scorecard16 (diagram title: “Roadmap for implementing a HR scorecard”; stage labels: Preparation; Positioning & strategy; Value drivers & control levels; Initiatives & targets; Implementation; Operational action plan; Measurement & evaluation. The figure states that the HR scorecard can be implemented and established in the organization within 3-6 months with the help of the roadmap and experienced Promerit consultants. © PROMERIT AG)

The so-called health checks provide concrete assistance in analysing the HR organisation with regard to staff requirements and the recruitment and development of such staff. As part of such reviews six aspects are evaluated:

Aspect 1: Positioning of HR
The positioning of the Personnel Department and consequently the status of talent management can be determined by the answers to a few questions.

• Are clear targets based on company strategy set for the Personnel Department? Is HR regarded as a core function with its own responsibilities and “appreciated”?
• Is there a HR strategy? If yes, is this based on company objectives and in line with them?
• How does the Personnel Department view itself? More as an administration unit or rather as a driver of business?
• How is talented staff treated? As a schoolchild or as a customer?
• Is there a talent management process?

16 Source: Promerit AG: HR Scorecard, April 2004.

Aspect 2: HR core processes
The core processes should be transparent and known to both management and staff.

• Are there procedures in place for managing staff and setting objectives?
• Have job specifications been written?
• Is there a systematic talent review, talent development, based on appraisals? Is there proper administration of so-called personal master data or are there browser-based talent pools showing the talent available in the company that can be consequently tracked and employed?
• Is there a succession plan?
• Is there a correlation between training and the long-term development of talent?
• Is remuneration based on performance and are there variable components based on the company success – self-controlling remuneration models?

Aspect 3: HR support processes
The classical personnel administration function and the support processes up to the administration of access rights to IT applications should be in line with the positioning of HR.

• Are the personnel administration functions covered to a large extent by systems? Is there a system-supported interface with other areas such as accounting?
• Is this function perceived as a competent service or just as a pure service?
• Are the authority levels with regard to employment contracts and their negotiation centralised?
• Are benchmarks used for administration expense? If yes, are all functions taken into account or does the business process outsourcing of, for example, the payroll only provide illusory advantages?

Aspect 4: Staff and competencies
The central aspects to be taken into account with regard to the development of talent and the alignment of the HR organisation to the company objectives are the following.

• Are strategic functions communicated, raised in a systematic manner and documented?
• Are there skill and competence profiles in place that are consistent with the positioning of the company and the relevant strategic functions?
• Is an evaluation of human capital performed?
• How motivated and committed is the staff? Are there benchmarks for sick days taken and absences from work?
• How are the competence of management and cooperation perceived by the staff?
• Is there sufficient talent available? Do the business plan, the HR plan and recruiting dovetail?

• Are the company objectives made known to the staff and is there a performance culture?
• How is the willingness to adjust and flexibility characterised? Do job rotations and temporary transfers to projects increase self-esteem?

Aspect 5: Acceptance of HR
A survey carried out in a few interviews allows an assessment of how the HR unit is regarded and to what extent the Personnel department is involved in the implementation of company strategy.

• How is the value added by HR perceived by top management?
• How satisfied are middle management and staff as customers?
• How does the staff sense the appreciation shown to them by HR?
• Does HR have the ability to manage and coach operating departments with regard to personnel matters – or is HR only a supplier and a processor?
• Do talented staff feel like customers who are systematically looked after?
• Is HR proactive in changing processes? Or are “only” decisions from top management implemented?
• How attractive is the company to external talent?

Aspect 6: HR controlling
The ability of the HR unit to provide information is not only used to ensure that Annual General Meetings run smoothly but also indicates its competence and willingness to be compared to internal and external benchmarks on an ongoing basis.

• Are key personnel figures available for the entire company? Can developments – not only relating to staff numbers – be highlighted?
• Is the contribution made by HR measured?
• Is the implementation of HR strategy communicated, for example on the Intranet?
• Are parameters set to measure the efficiency of HR processes?
• Are employee satisfaction surveys carried out?
• What is the state of the system support?

The evaluation of this health check indicates how well the HR organisation is aligned to the company objectives. Any frictions and risks to the achievement of the company objectives are highlighted and development steps recommended. Only when such checks become as routine for a company as the review of potential areas of savings will the right talent be available at the right time in the company. Only then will excellent staff performance be achieved, will the innovative ideas that the interaction of the aspects highlighted in this article requires be utilised to the full, and will our opportunities in the age of globalisation become apparent.

Figure 31.12. Summary of results: HR Health Check17 (diagram title: “HR Health Check: evaluation of the six dimensions”; each dimension is rated on a scale from 1 to 5: HR positioning; HR core; HR support processes; People & competencies; Acceptance of HR / customer satisfaction; Value based HR controlling. Talent Management Consulting Health Check, Frankfurt, March 2004. © PROMERIT AG)

31.3 Summary and Outlook

This chapter highlights the importance of IT from an insurance company perspective. It is evident that the role played by IT today extends far beyond the automation of previously manual processes. The preparation of a strategy is hardly conceivable without IT and the resultant income and cost transparency. In mature markets, such as the German insurance market, undifferentiated strategies no longer have a future. The focal points for income-based growth, sales targets, marketing actions and administration changes can only be properly planned and forecast if the relevant information is known.

The importance of IT for self-controlling systems, unique selling propositions and sales support is surely not surprising; what may be surprising, however, is the speed with which the link between sales and IT is being developed. Direct access to contribution margin and commission information as well as to primary data that affects the proposal and the contract seems to replace in part the classical administration function, with the result that administration staff are trained rather as carriers of special knowledge and oriented towards supporting sales.

17 Source: Promerit AG: HR Scorecard, March 2004.

In addition to these developments we would venture to forecast that the importance of IT as part of the HR organisation will be revolutionised in the next five years. A pivotal factor for success is identifying, recruiting and developing the talent required by each company. Achieving this on a professional basis requires more than administering personnel master data in a standard software package. It requires openness and transparency on the part of the HR organisation in terms of the highlighted health check. Furthermore, talent pools must be built up that allow the available talent resources to be utilised more effectively for achieving the company objectives.

We would like to conclude with the observation that these developments will also result in changes being made to the academic profiles at universities. Company requirements will shift from the classical computer scientist towards information managers who control the interaction of the aspects highlighted in the Scorecard example.

CHAPTER 32
Added Value and Challenges of Industry-Academic Research Partnerships – The Example of the E-Finance Lab
Wolfgang Koenig, Stefan Blumenberg, Sebastian Martin

32.1 Introduction

The financial service industry is believed to be on the verge of a dramatic [r]evolution. A substantial redesign of its value chains aimed at reducing costs, providing more efficient and flexible services and enabling new products and revenue streams is imminent. But there seems to be no clear migration path or goal that sheds light on the question of where the finance industry and its various players will be, and should be, a decade from now. The urgent need for action has been recognized by practitioners as well as the academic world. This is why the E-Finance Lab (EFL) was founded in 2003 as a cooperation between the universities of Frankfurt and Darmstadt and the industry partners Accenture, BearingPoint, Deutsche Bank, Deutsche Börse, Deutsche Postbank, FinanzIT, IBM, Microsoft, Siemens, T-Systems, as well as DAB bank and IS.Teledata.

The mission of the EFL is the development of industrial methods for the impending change process of the financial service industry. Important challenges include the design of smart production infrastructures, the development and evaluation of advantageous sourcing strategies, and smart selling concepts to enable new revenue streams for financial service providers in the future. The overall goal is to contribute methods and views to the realignment of the financial supply chain.

32.2 Structure of the EFL

The EFL is organized into five clusters, as depicted in Figure 32.1. In all clusters, theoretical model-related research, technical prototyping as well as case studies and interviews with industry experts are employed to devise value-creating strategies.

Cluster 1 “Sourcing and IT Management in Financial Processes” focuses on sourcing and aligning the IT resource with financial business processes as a means

Figure 32.1. The EFL “Temple”

of internally and externally rearranging the financial value chain. Related research topics are operational risk mitigation and sourcing contracts. The results so far include a real options based model for sourcing decisions under uncertainty and empirical evidence for substantial efficiency potentials in financial processes.

Cluster 2 “Emerging IT Architectures to support Business Processes within the E-Finance Industry” focuses on the evaluation and development of innovative technologies and technical concepts in the context of the E-Finance industry. Of special interest is the relationship between business processes and technologies. A further research area is IT risk with respect to Basel II recommendations on operational risk.

Cluster 3 “Customer Management in a Multi-Channel Environment” develops, evaluates, and implements a set of Customer Management solutions in order to generate substantial efficiency improvements for financial institutions. The charter of cluster 3 is thus to look at customers as assets and to optimize the customer equity.

Cluster 4 “Reshaping the Banking Business” develops scenarios of how the German and European banking business will evolve. Based on the hypothesis that industry value chains will be rearranged in the near future, cluster 4 focuses on the core research question: “Who shall perform what function in an efficient banking value chain?”. Specific business processes, outsourcing initiatives of non-core activities and the sales and distribution function in retail banking are central topics within the cluster.

Cluster 5 analyzes the “Securities Trading Value Chain” and its dynamics. The cluster has a specific focus on the impact of market regulation, the intermediation

relationships between the industry players and recent technological innovations in this field.

The roof cluster pulls together the research efforts and results of the five EFL research clusters in order to boost synergies and to allow a unified view of current technology-enabled trends in the financial services industry. Joint research, conferences and discussions are targeted at answering the question: “What will the finance industry look like in 2012?”. Moreover, the roof cluster strives to coordinate research methods to contribute to a cohesive E-Finance framework.

32.3 Generation and Sharing of Knowledge Between Science and Practice: Research Cycle

The work within the EFL is performed in close cooperation with the industry partners that support the research. Thus, we are not only focused on a scientifically sound research process but also on a mutual exchange of knowledge.

Consolidation has been a widespread phenomenon in the financial services industry for many years. Many of today’s most successful firms have rapidly grown both in scale and scope. Given the strategic uncertainty about future core competencies and profitable business areas, becoming bigger and broader and maintaining “deep pockets” seemed to be one of the safest options. We believe that the financial services industry is currently on the verge of a new competitive landscape in which much more focus is indispensable. Technological advances, the ever increasing importance of brands and reputation and the imperative to exploit cross-selling opportunities imply that horizontal scale and scope economies will remain of utmost importance. However, because the same technological advances boost the standardization and industrialization of business processes, an influx of specialized and highly efficient service providers will strongly erode the extent of vertical integration that we can still observe. We envision a decomposition and reconfiguration of existing value chains into flexible value networks, in which players exclusively concentrate on their core competencies and IT plays a vital role in value creation. We also believe that the industrialization of processes will be complemented by relentless attempts by leading financial institutions to offer customized, relationship-oriented financial services to their customers through multiple channels. Relationship banking will hence prosper even in the light of intensifying competition.

Research on the industrialization of the financial services industry in Germany has only become a matter of interest in recent years. Besides theoretical model-related research and technical prototyping, the EFL conducts both quantitative and qualitative empirical studies to support findings in this rapidly emerging area of interest. Empirical research is usually conducted according to the research cycle described in Figure 32.2 (based on Atteslander 2003). We introduce these phases in the following sections, focusing on phases four (analysis) and five (utilization) in the context of the research work within the EFL, in order to illustrate particularities of the generation and sharing of knowledge.

Figure 32.2. Empirical research cycle (based on Atteslander 2003)

32.3.1 Problem Definition

The theoretical problem definition phase refers to the formulation of a research question that restricts the problem to a verifiable circumstance and emphasizes the importance of the chosen research objective. The resulting question may change during the following phases but must be formulated first, in order to narrow the focus of the subsequent research. In this step, hypotheses are formulated (Bortz and Döring 2002). These hypotheses must meet strict scientific requirements such as empirical controllability, universal validity and falsifiability (Opp 1999). For a detailed description of the requirements for hypotheses see Bortz and Döring (2002) and Opp (1999). The logical combination of a set of hypotheses may lead to a theory that tries to explain and predict circumstances in the research area. If only vague hypotheses about the research object are available, an exploratory study should come first in order to derive a clear view of the research area and to obtain sound hypotheses.

32.3.2 Operationalization

After the definition of hypotheses and theories, the research object is narrowed down to a practical level in order to choose the right time frame, target group, and field access to systematize observations in categories (Atteslander 2003).

• Time frame: Which period will be addressed with the empirical research and how much time and how many funds are needed and available for a study? Empirical research can, for example, focus on a point in time or conduct a time series analysis.

• Target group: What kind of group of social events or people will the research focus on? Especially the selection of people to question within an organization can be important to receive appropriate results.
• Field access: Can the chosen interviewees be addressed or are there any security issues that may hinder an empirical study? For example, empirical studies in financial service organizations are often challenging because it is difficult to get access to management level decision makers.

The problem definition and operationalization phases are connected with each other. Hypotheses might, for example, change during the process of operationalization.

32.3.3 Application

After hypothesis generation and operationalization, the previously specified data can be gathered by means of the chosen research methods. Depending on the kind of research, the gathering will be more structured in the case of a quantitative study than in the case of a qualitative approach. Data can be collected through experiments within an environment designed by the researcher, through interviews, or by observing the natural environment of the research object (Atteslander 2003).

32.3.4 Analysis

The gathered data has to be prepared so that (statistical) tests and analyses can be run; which ones are appropriate depends on the chosen methods and the gathered data. As an example of data collection and analysis within the EFL, we introduce the EFL Tracker, a collection of software tools developed by the EFL. The EFL Tracker supports data collection and analysis in quantitative (e. g. surveys with Germany’s 1,000 largest banks) as well as qualitative (e. g. highly detailed case study analysis) empirical research. Hence, the EFL Tracker can be classified as a tool that supports phases three (application) and four (analysis) of the research cycle.

The E-Finance Universal Benchmarker (referred to below as the EFL Benchmarker), a primary tool in the portfolio of the EFL Tracker, is designed to support quantitative research approaches and has been successfully deployed in several cross-sectional and industry surveys (http://www.efinancelab.de/benchmarker). The EFL Benchmarker is web-based software that supports the collection and on-the-fly analysis of data, providing an automated, instant benchmark for the survey participants. In particular, this benchmarking feature proved to be a major incentive for firms to participate in surveys of the EFL and turned out to be a useful way to collect data for the Lab even after a particular survey has been finished. Participating firms are able to instantly compare their own data with data of the entire available sample of the

Figure 32.3. Administration interface of the EFL Benchmarker

industry peer group. From a researcher’s perspective, another especially useful feature of the EFL Benchmarker, besides a substantially shorter set-up time for Web-based surveys, is a drastic reduction of data errors. Until now, the EFL Benchmarker has been successfully deployed in eight empirical field surveys of the EFL. In total, the databases of the system hold detailed data on 1,005 corporations, comprising 123,524 data points. The user-friendly administration interface (see Figure 32.3) allows for a fast and simple configuration of the EFL Benchmarker as well as the implementation of nearly arbitrary questionnaire designs. An SPSS output interface has been implemented for seamless integration with statistics software for advanced data analysis. Even after completion of the data collection and initial analysis process, each participant has full personalized access to their data. Furthermore, participants not included in the initial sample may enter their data and receive an instant benchmark; the data collection process never stops. As an example, for the very first

empirical survey on Financial Chain Management, in addition to the original data set of 103 firms, 69 further firms have registered online and provided their data to enhance the sample.

For the support of qualitative empirical research, a second software tool under the umbrella of the EFL Tracker has been developed, called the EFL Case Study Database. Just as the EFL Benchmarker supports the collection and structuring of quantitative empirical data, the EFL Case Study Database provides support for structuring and analyzing qualitative data. Interviews and documents collected in case studies can be deposited in the database according to the methodological requirements associated with qualitative research. For data analysis, an instant comparison between different case studies or analytical units can be conducted. Currently the EFL Case Study Database contains detailed data of six in-depth case studies.

Raw empirical data from the EFL Benchmarker and the EFL Case Study Database have led to a number of research results published in national and international journals and conferences. In total, the data contained in the various databases of the EFL Tracker has been used in more than 35 mostly international publications. Also, taking only Cluster 1 as an example, the data is used in more than nine Ph.D. theses. Admittedly, the EFL Tracker cannot itself generate empirical data, but it greatly supports the gathering of data and the transition from raw and unstructured empirical data towards theoretically sound research findings and analytical models. By providing such support, the EFL Tracker has been established as a valuable tool for conducting empirical research, keeping track of fast changing environments in the financial services industry and at the same time giving the EFL a boost in the internationally highly competitive scientific field of empirical research.
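The instant benchmark described in this section, in which a participant compares its own data with the available peer sample, can be sketched in a few lines of Python. The sketch is purely illustrative: the chapter does not describe how the EFL Benchmarker is implemented, so the function name, the choice of median and mid-rank percentile as benchmark figures, and the sample values are all assumptions.

```python
from bisect import bisect_left, bisect_right
from statistics import median


def instant_benchmark(own_value: float, peer_values: list[float]) -> dict:
    """Compare one participant's answer to a single question with the peer sample."""
    if not peer_values:
        raise ValueError("peer sample is empty")
    ordered = sorted(peer_values)
    below = bisect_left(ordered, own_value)          # peers strictly below own value
    at_or_below = bisect_right(ordered, own_value)   # peers below or equal to own value
    # Mid-rank percentile: peers with the same value count half.
    percentile = 100.0 * (below + at_or_below) / (2 * len(ordered))
    return {
        "sample_size": len(ordered),
        "peer_median": median(ordered),
        "percentile": percentile,
    }


# Hypothetical answers to one survey question (for example a cost ratio in percent).
peers = [52.0, 61.0, 58.0, 70.0, 66.0]
result = instant_benchmark(61.0, peers)
print(result)

# A firm that registers later simply extends the sample for all future benchmarks.
peers.append(49.0)
later_result = instant_benchmark(61.0, peers)
print(later_result)
```

Because every new participant both receives a benchmark and enlarges the sample, a design of this kind keeps the data collection open after the formal end of a survey, which matches the behaviour reported above.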

32.3.5 Utilization

The utilization of research findings is as important as their generation. From a scientific point of view, these results represent only the position of the researchers responsible for them. Publication and discussion in scientific journals and at conferences help the researchers to discuss their findings with other experts and, where appropriate, to improve and extend the methods and the results. Additionally, problems not addressed so far may arise, and the cycle starts over again. Rynes et al. (2001) point out the importance of cooperation between science and practice. Research results in the EFL are therefore not only discussed with the scientific community. Knowledge transfer to the partners of the EFL has become a crucial success factor during the last years of cooperative work. Based on the scientific publications of the EFL, ten channels of knowledge transfer between academia and practice, in both directions, have been established so far (Figure 32.4).


Wolfgang Koenig et al.

• External Events: conferences hosted by the EFL (2); conferences with contributions of EFL members (manifold); monthly jours fixes (12); presentations of EFL staff (manifold)
• Cooperative Research Projects with Industry Partners (25)
• Participants of the Cooperative Ph.D. Program (9)
• Internal Events: in-house workshops at the industry partners (3); internal jours fixes (manifold)
• Print: booklet (3 issues); printed newsletters (4); progress reports (4)
• Marketing and PR: press releases (360); press reports (18)
• EFL online: electronic newsletter (4); website with publications; glossary; FAQ; short film “Credit process modernization”
• Software: Universal Benchmarker; rating tool “German-Z-Score”; portfolio optimization and algorithms backtesting software
• Databases: Universal Benchmarker (124,000 data points); literature database; case study database
• Publications (147)

Figure 32.4. Knowledge transfer channels (the numbers in brackets show the number of items realized in 2005)

32.3.5.1 Cooperative Research Projects with Industry Partners

Each cluster comprises 5–7 projects, which run for no more than 12 months and are re-evaluated in the course of the yearly budget planning process. Each project is described by answers to four questions:

• Who is responsible for the delivery?
• What has to be delivered, and when (e.g. paper, software, data, workshop)?
• What is the basic research approach?
• How is practice incorporated in the research process?

32.3.5.2 Cooperative Ph.D. Program

The participants in the cooperative Ph.D. program represent the main knowledge transfer channel towards the industry partners. Each industry partner can propose one person from its organization per year to become a member of the cooperative Ph.D. program, i.e. to participate in cooperative research projects together with researchers from the EFL. This cooperation covers all phases of the research cycle. The participants of the cooperative Ph.D. program are important disseminators of information, as they are familiar with the research areas, methods and results and can communicate them within their organizations in an appropriate form. In return, industry partners process the results and encourage the Ph.D. students to discuss their organizations' viewpoints and areas of interest, allowing a mutual exchange of knowledge and interests. In summary, participants of the cooperative Ph.D. program fruitfully bridge the gap between science and practice at the individual level.

32.3.5.3 External and Internal Events

The EFL hosts various events to share its results with the industry partners and a wider public. During the big biannual conferences in spring and autumn, with about 300–400 participants, EFL researchers and industry partners present new research results. In addition to the presentations given by EFL members, experts from practice, internationally renowned researchers and the public are invited to discuss their work and the EFL results. These big conferences are especially interesting for members of the EFL industry partners who are not closely involved in the daily research work but want to gain insights into the research results. The autumn conference also offers workshops for in-depth discussion of selected topics.

In addition to the big biannual conferences, the EFL researchers and industry partners meet once a month in so-called jours fixes, where research in progress is presented and discussed with internal and external participants. These public jours fixes are preceded by regular internal meetings of the research groups, where topics and cooperation between researchers are planned and discussed. The EFL also offers its industry partners the possibility to organize in-house workshops where the latest research results, customized to the individual needs of the organizations, are discussed. These in-house workshops are often the beginning of new research projects, as bringing together scientific work and practitioners' questions leads to new research questions.

32.3.5.4 Print, Marketing and Public Relations

The EFL issues printed information ranging from a quite generic level down to a highly specific level with detailed information about the status of the research. First of all, a booklet that introduces the EFL, its members, the research work and the results is issued and continuously updated. Regular press reports are released to reach the public. In 2005, 18 press reports generated 360 press releases in print, TV and online media. To complement this generic information, a printed newsletter is published on a quarterly basis. Each newsletter presents two of the latest research results as well as an editorial, an interview and summarized information about publications, thus offering a concise overview of selected EFL activities. Finally, two progress reports are published biannually. One informs the EFL members about all activities within the Lab: published articles and books, conferences attended, presentations given, lectures by guest scientists, and much more.


The second printed progress report contains a collection of the top three publications of each cluster.

32.3.5.5 EFL Online

The EFL issues a quarterly electronic newsletter to complement the printed newsletter with information about recent developments and short abstracts of current research projects, with links to the EFL website for further information. All information described above is available on the EFL website, which also contains a glossary explaining some important research terms. Interested people can request guidance, e.g. definitions of technical terms, via the Frequently Asked Questions System. Selected research results are also presented in short film format. For example, the 2005 short film “Credit Process Modernization” briefly explains the findings of an empirical study with 500 German banks.

32.3.5.6 Software and Databases

Software that applies scientific methods, such as the Universal Benchmarker described above, is another channel for communicating research results to the industry partners and to the public in an automated manner. Furthermore, based on the results of two empirical studies by “Kreditanstalt fuer Wiederaufbau” and “Bundesbank”, the EFL developed the Internet portal “German-Z-Score”, which enables medium-sized enterprises to determine an approximate rating based on financial ratios. Potential effects of different strategies can be determined by varying the input variables. The EFL also developed a tool called “Integration of Estimation Risk in the Context of Total Return Strategies in Modern Investment Management”, which covers a comprehensive library of optimization and performance evaluation tools for asset management. Databases with the raw data of qualitative and quantitative empirical studies and a literature database with more than 1,000 scientific papers dealing with e-finance are available exclusively to industry partners and EFL members. So far, the EFL databases contain detailed data on 1,005 corporations, including more than 123,000 data points.

32.3.5.7 Publications

Last but not least, for a research institute like the EFL, publications represent the single most important knowledge transfer channel. Of all 147 contributions of the EFL in 2005, 67% are written in English and 61% have been peer-reviewed. 22% of the research papers of the EFL have been published in (international) journals.
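To give a feeling for the absolute numbers behind the percentages quoted for 2005, the following minimal Python sketch converts the shares into approximate counts. It assumes that all three percentages refer to the same base of 147 contributions; the text is not fully explicit about this for the journal share, so the results are rough orientation values only. The function and variable names are chosen here for illustration.

```python
# Rough conversion of the 2005 publication shares quoted in the text into
# approximate counts. Assumption: every percentage refers to the same base
# of 147 contributions (the text does not state this explicitly for the
# journal share).
TOTAL_CONTRIBUTIONS = 147

shares = {
    "written in English": 0.67,
    "peer-reviewed": 0.61,
    "published in journals": 0.22,
}


def approximate_counts(total, shares):
    """Return the rounded number of contributions behind each share."""
    return {label: round(total * share) for label, share in shares.items()}


counts = approximate_counts(TOTAL_CONTRIBUTIONS, shares)
for label, count in counts.items():
    print(f"{label}: about {count} of {TOTAL_CONTRIBUTIONS}")
```

Under this assumption, the shares correspond to roughly 98 contributions in English, 90 peer-reviewed contributions and 32 journal publications.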


32.4 Added Value and Challenges of the Industry-Academic Research Partnership

An industry-academic research partnership comprises people and organizations in the university as well as in business practice. In the case of the EFL, the sponsors provide not only expertise and personal connections but also funds to employ researchers. The basic prerequisites and the advantages of close research cooperation between industry and university can be boiled down to two questions. First: What is the value of research results in practice? Put bluntly, if industries, or the key decision makers within them, do not appreciate the value of research for the development of their enterprises, there is almost no chance of convincing anyone of the value of a research cooperation. The second question is likewise linked to the requirement that both sides must enjoy cooperating with each other on a sustained basis: Does academia “please” the practitioners? In particular, application-oriented research approaches are an advantage for the knowledge transfer into practice. And because research results are very often complex and hard to understand, some so-called secondary virtues come into play: Do researchers seriously accept their practitioner counterparts as “buddies”? People sense this, for example, in the atmosphere of a meeting. Or: Do researchers seriously work as much as their cooperation partners in industry? Often, the latter work hard and long (exceptions prove the rule) and do not easily understand if, for instance, the researchers are not at work in the late afternoon or do not check their e-mails on weekends.

In detail, we distinguish the following added value of an industry-academic research partnership:

• Scientists need not reinvent the wheel. Experience shows that many concepts for solving an interesting problem have already been evaluated in industry. A research partnership helps to unearth these earlier evaluations of concepts and ideas, provided that academia allows situations in which the practitioners' advantage in experience over scientists becomes evident.
• In particular, practitioners help to shape the structure of empirical investigations by specifying questions and the way in which answers are obtained. Because of their industry knowledge, they also help to increase the response rate. In the case of the EFL, the quality of the empirical data is often outstanding. In international comparison, our response rate is regularly more than twice as high, which nicely supports our requirement (see the respective point under “challenges”) to publish in top-tier journals. To put it in other words: one long-term asset of the EFL is its unparalleled data collection, which we can use in subsequent research steps for numerous analytical research processes.


• The close cooperation between practice and academia also provides real-life problems (the data may of course be anonymized, but they are real-world data as opposed to mere “visions” of researchers) and, in the opposite direction, supplies the practitioners with profound methodological knowledge.
• The technology transfer from academia into practice is eased because industry people are involved in the specification of the research projects from the very beginning.
• We have top practitioners in the cooperative Ph.D. program, and it is simply a joy to work with them scientifically. They provide high-quality research results and publish these internationally.
• In this cooperative work environment, we create a stream of top-quality research results which differentiates us from many other research institutions and programs.
• The networking between the actors is of high importance and very valuable. The best brains on both sides connect with each other and easily exchange ideas as well as business. On this fertile ground, a lot of very good ideas and approaches are developed.

And what are the challenges of an industry-academic research partnership?

• The basic prerequisite is a congruence of interests on both sides. On the academic side, this means that you need to have solid research questions and to perform solid research procedures. On the industry side, you need to explain the advantages of particular research results to your cooperation partners. Often, this requires a dual presentation of results. One version is geared towards top scientific outlets, where sometimes more than 50% of an article is devoted to thoroughly describing the derivation of a research process. In the version for the industry, this part is replaced by a description of the new insights. Moreover, in the latter case one often has to translate the results into the language of the “customers”.
• Aside from all short-term necessities to put new research results into everyday business work, solid practitioners look for sustainable results, not results of the type “valid today and probably tomorrow, but after that we have to perform a new research project”. Again, this requires solid research questions and solid research procedures. In the long run, only results that pass the peer review of the scientific community count, even in practice (see, for example, the current university excellence initiative in Germany). In other words, top-quality publications are required (plus a second type of presentation of results, one that is geared towards business).
• So the “architecture” of the research design is of paramount importance. Take exciting questions from practice, come up with a solid research process, properly incorporate the experience of practitioners, and put all these things together properly, not only for a moment but for at least 2 years!
• In the EFL we execute only 1-year research projects, which gives the members of the board the opportunity to review the whole research design at least once a year; high-quality research is then continued in a second or even a third term. The board of the EFL is made up of the 5 cluster heads, who supervise the research in their clusters, plus 1 senior representative of each of the main sponsors (currently 10). The board meets in person at least twice a year but may in addition decide via e-mail.
• In order to support the complex process of aligning the interests of practice and academia, the EFL has agreed on and maintains over the years its “research priorities”, a document that tries to provide long-term guidance on exciting topics that span more than one year of attention. We have also pinpointed for each cluster the 15 most important scientific journals worldwide and the five most important conferences where the researchers want to present and discuss their results; the goal is “top notch publications”.
• Certainly, one needs a substantial share of “team players” in such a research group, on the academic as well as on the industry side. The group may “digest” the occasional excellent loner, but the team must work closely together.
• You need excellent people who are willing to work more than usual in order to also meet the technology transfer requirements without inhibiting delays.
• And, last but not least, you need resources so that all ambitious ideas really come true in the end. The EFL is very thankful for a budget of almost EUR 1.5 million in 2005 that funds more than 30 researchers. This group, for example, produced 147 publications in that year, half of them peer-reviewed contributions to highly ranked media and more than half of them in English.

In addition, all the other knowledge transfer channels have to be served (see Figure 32.4).

32.5 Conclusion

The EFL was founded as a means for practice and academia to cooperatively support the imminent changes in the German financial services industry. The mission of the EFL is to develop scientifically sound and relevant contributions to an industrialization of financial processes. Research contributing to this goal is done within five research clusters in cooperation with the industry partners. Many of the publications rely on empirical studies conducted according to high scientific standards within five phases (problem definition, operationalization, application, analysis and utilization). The EFL uses a specialized software toolkit to support and structure empirical analysis and data gathering in the context of quantitative and qualitative analyses. Other important facets of its success are the manifold channels of knowledge transfer between academia and industry. Aside from a large number of publications, cooperative research projects and a cooperative Ph.D. program are at the core of the knowledge transfer, complemented by a variety of printed materials such as newsletters and brochures and by a website with all information tailored to the needs of the industry partners.

As a result of the close cooperation and the mutual knowledge transfer, the research partnership offers added value to both practitioners and scientists. Practitioners provide real-life problems and help to shape, for example, the structure of empirical investigations with their knowledge of the industry, while researchers bring in sound scientific methods to investigate how concepts that have been developed and successfully applied in other industries may be adapted to the finance industry. Besides the added value, the partnership faces challenges such as the required congruence of interests on both sides, the long-term validity of research results and the dual presentation of results, depending on the audience.

References

Atteslander P (2003) Methoden der empirischen Sozialforschung. Walter de Gruyter, Berlin, New York
Bortz J, Döring N (2002) Forschungsmethoden und Evaluation für Human- und Sozialwissenschaftler. Springer, Berlin, Heidelberg, New York
Opp KD (1999) Methodologie der Sozialwissenschaften. Westdeutscher Verlag, Opladen
Rynes SL, Bartunek JM, Daft RL (2001) Across the great divide: knowledge creation and transfer between practitioners and academics. Academy of Management Journal 44(2): 340–355

Index

1 10-K filing 358, 362 A ABS 452, 466 absolute moments method 564 Abu Dhabi Securities Market 144−147, 159, 166 accounting 9, 19, 95, 99, 101, 103, 104, 107, 110, 111, 114, 116, 129, 131, 134, 135, 224, 267, 293, 308, 313, 351, 371, 389−391, 590, 595, 603, 604, 606, 723, 755, 768 ACD 550, 558, 559, 578 activity diagram 222, 227, 228 Adaptive Believe Systems 452 Adaptive Heuristic Critic 435 Adaptive Rational Equilibrium Dynamics 472 ADS 215 ADSM 144, 159, 166 Advanced Measurement Approaches 345 Agent Based Modeling 650 algorithm 358, 364, 368, 373, 395−397, 410, 430, 436, 437, 439−441, 455, 503, 507, 516, 573,

575, 629, 637, 641, 642, 644, 658, 662, 666, 668, 678, 680, 693 AMA 345 AMEX 600 antithetic variates 510 Application Service Provider 261 application software 3, 216 APT 465, 596−598 arbitrage pricing theory 465, 608−610 Arbitrage Pricing Theory 596 arbitrage-free 380, 515, 691, 697, 702 Architecture Description Standard 215 ARFIMA 562, 567−569 ARIMA 562, 563, 674 Artificial Neural Networks (ANN) 641 ASP 261 asset allocation 120, 201, 225, 381, 385, 587, 595, 608−611, 647, 656, 670 asset management 120, 782 asset pricing 444, 458, 465, 468, 491, 497−499, 544, 579, 592, 594, 595, 609−611, 622 AT&T 258, 372 Australian Stock Exchange 35


autoregressive conditional duration 550, 558, 581, 585 Autoregressive Fractional Integrated Moving Average 562 autoregressive integrated moving average 674 B B2C 18, 20, 21, 242 back-propagation neural networks 670−672, 674 Bahrain Stock Exchange 144−147, 151, 153, 167 balance sheet 7, 111, 223−225, 229, 358, 360−362, 364−367, 371−374, 390, 648 Balanced Scorecard 293, 310, 754−756, 758, 766 Basel Committee on Banking Supervision 76−81, 83, 92, 353, 670, 672, 677, 680, 724, 728 Basel II 4, 7, 76, 79, 92, 313, 333, 337−340, 343, 345, 346, 348, 349, 670, 672, 677, 680, 722, 724, 725, 727, 728, 774 Bayesian methods 381, 587, 607, 611 Bayesian Model Averaging 594, 605 Bayes-Nash equilibria 429 Bayes-Stein estimator 591 Behavioural finance 456 Black-Scholes 512, 515, 518, 541 BMA 153, 167, 605 BMW Group 4, 95, 104−108, 110, 111, 117, 121 BPEL 27, 48 BPEL4WS 216 BPN 674 BPP 299−301, 306 brokerage 3, 5, 35, 36, 152, 171, 172, 181 Brokered market 34

BS 515−522, 524, 525, 530, 532, 534, 535, 538, 541 BSE 144, 151−153 BTX-Service 709, 731 Bundesbank 349, 353, 782 Bürgerliches Gesetzbuch 732 business logic 3, 13, 14, 22−24, 216, 231 business process automation 3 Business Process Execution Language for Web Services 48, 216 business process management 6, 14, 303 business process performance 299−301 business processes 3, 7, 11, 14, 16, 20, 24, 25, 29, 48, 49, 52, 55, 66, 73, 81, 98, 99, 110, 121, 125, 126, 138, 292, 293, 296, 299−304, 306−308, 313, 322, 333, 335, 727, 773−775 Business risk 339 Business-to-Consumer 242 C C2B 242, 244 C2C 242, 246, 254 Capital Asset Pricing Model (CAPM) 648 capital markets 29−31, 37, 40, 49, 96, 102, 103, 106, 109, 110, 164, 353, 388, 608 CAPM 79, 458, 465, 466, 491, 592−597, 599, 603, 605, 609, 648 CARA 454, 466−468 cardinality constraints 645 case study 3−5, 7, 97, 115, 117, 118, 141, 161, 325, 334, 343, 348−350, 352, 355, 684, 777 case-based reasoning 670, 674 cash on delivery 249 Causal methods 78, 79 cause-effect relationships 315, 320


consulting 4, 5, 19, 193−198, 200−208, 256 Consumer-to-Business 242 Consumer-to-Consumer 242 contrarian trading strategy 486, 491 contrarians 380, 453, 475, 480 Control Objectives for Information and related Technology 339 control variate technique 510, 511 controlling model 3, 4 CORBA 15, 18, 216 corporate finance 104, 106, 109, 127 counting process 557 covariant risks 319 creation of synthetic events 315 Credit Process Modernization 782 credit risk 7, 119, 213, 382, 656, 666, 670, 672, 675, 677, 679, 684, 687 Credit Suisse 712 CRESTA 315 CRM 10, 99, 204−206, 293, 711, 713 CRRA 380, 454, 467, 468 customer relationship management 13, 99, 335 CVaR 346, 381, 614, 618−621, 624, 626, 628, 630−632, 647 Cyberbucks 239 CyberCash 239, 242, 243 Cybercoin 243 D Daniel-Titman model 596 data cleaning 549 Data Type Definition 47 data warehouse 64, 132, 304−306, 326 database 7, 21, 22, 43, 59, 62−64, 67, 97, 98, 110, 111, 117, 120, 122, 132, 136, 138, 203, 221, 231, 276, 305, 321, 323, 325, 329, 345, 351, 357, 358, 371, 375, 779, 782


DCF 295, 299, 308 DCOM 216 decision support system 212 DEM 327 deployment 21, 22, 24, 26, 48, 52, 59, 73, 99, 107, 123, 215, 231, 260, 262, 269, 275, 280, 281, 283, 284 Description Logics 339, 353 deterministic 84, 85, 315, 381, 388, 426, 429, 434, 453, 470, 480−482, 484−488, 490, 491, 501, 511, 512, 635−637, 641, 642, 648, 650, 690 Deutsche Terminbörse 555 DFM 143, 144, 158, 166 DFT 575 DIFX 5, 141, 160−164 DigiCash 239, 242, 243 Digital Elevation Models 327 Dijkstra 662 dimensionality reduction 381, 655, 656, 658, 659, 661, 663, 668, 669, 677, 679, 680, 682, 684, 686, 687 discounted cash flow 299 discrete choice model 473 Discrete Fourier Transforms 575 distributed computing 6, 15, 16, 235, 257, 259, 262, 265, 266, 268, 270, 271, 278, 279, 282, 283 DNS hijacking 737 document object model 7 Doha Securities Market 144−147, 157, 166, 168 DOM 7, 368, 369, 372, 373 domino effects 319 drift coefficient 502 DSM 144, 157, 166 DSS 97, 99, 100, 103, 122, 212, 213 DTB 555 DTD 47 Dubai Financial Market 143−147, 158, 166, 168

Dubai International Financial Exchange 5, 141, 160, 162, 164, 165 dynamical system 406−408, 414, 474, 480 E EAI 10, 20, 21, 24, 26, 28, 303, 309, 713 eBay 246 ECB 239, 254, 255, 333, 348, 349, 353 e-commerce 242, 244, 248, 254, 255, 327, 719, 724 EDGAR 7, 357−359, 362, 371, 373, 375 EDGAR2XML 358, 372−374 EDI 46, 236 efficient frontier 387, 620, 643 Efficient Market Hypothesis 445 e-finance 29, 782 E-Finance Lab 167, 710, 773 EFL 773−775, 777−785 electronic communication network 38 Electronic Data Gathering, Analysis, and Retrieval system 357 Electronic Data Interchange 46 electronic finance 29 electronic signature 737 electronic trading 5, 144, 153, 154, 160, 165, 172, 175, 543, 553, 555 EMH 445, 465 EMM 380, 515−517, 519−522, 530, 538, 541 encryption 243 enterprise application integration 23, 713 Enterprise Resource Planning 9 EPC 253 equilibrium 380, 383−385, 388, 389, 392−398, 411, 414, 421, 422, 428, 437, 438, 441, 445, 448, 454,


Fat tails 634 feature mapping 657, 661 Federal Reserve Economic Database 674 FEMA 314, 320 Feynman-Kac 516, 520, 530, 541 FIGARCH 567 financial data 7, 39, 127, 128, 357, 358, 371−373, 375, 544, 557, 570, 649, 656, 674, 677, 679, 736 financial derivatives 46, 96, 111, 515, 516 Financial Information eXchange 46 financial markets 5, 29, 30, 95, 96, 107, 167, 213, 235, 291, 379, 380, 412, 443, 444, 446−450, 452, 454, 456, 459, 460, 466−469, 471, 472, 492, 499, 543, 544, 577, 578, 650, 673, 679, 689 financial planning 5, 103, 108, 120, 127, 128, 202, 209−214, 216−218, 220, 222, 229−231, 233−235 financial portal 4, 123−125, 127, 128, 136−138 Financial Products Markup Language 46 financial ratios 224, 225, 228, 670, 672, 782 financial services 3, 4, 47, 49, 51, 52, 95, 105, 128, 172, 189, 261, 270, 288, 291, 292, 297, 335, 355, 374, 412, 711, 713, 733−735, 775, 779, 785 financial theory 379, 444, 446, 447, 449, 459, 460, 622 finite impulse response transformation 572 First Fundamental Theorem 516, 519, 520, 541 First Virtual 239 FIRT 572 fitness function 453, 472 FIX 46 FIXML 46


FK 169, 520 FMEA 725, 729 foreign exchange 4, 32, 33, 42, 95, 101, 106, 109, 114−116, 118, 121−124, 127−131, 136, 138, 382, 516, 521, 543, 546, 549, 553, 576−578, 580, 583, 584, 650, 675, 689−691, 699, 700, 735 FOREX ARBITRAGE 692−694 Form 10-K 357, 359 Form 10-Q 357 FpML 46 fractal processes 380, 544, 559, 575, 576 fractionally integrated GARCH 567 FRED 674 fundamentalists 450, 451, 453, 458, 461, 462, 475, 480 FX 116, 118, 119, 123, 131−135, 137, 138, 309, 513, 533, 541

G GA 430−432, 639, 673, 674 Game Theory 422, 440, 441, 461 GARCH 550−552, 567, 582, 609, 610, 649, 650, 653, 656, 674, 684 Gateway 21, 44, 762 Gaussian 380, 502, 503, 506, 510, 515−517, 523, 526, 527, 529, 541, 544, 545, 558, 560, 564−566, 570, 571, 574, 576, 577, 582−584, 620, 622, 623, 634, 668, 669, 674 gBm 515, 517, 518, 520, 530, 537, 541 GCC 5, 141−147, 151, 153, 157, 159, 160, 165, 168, 169 genetic algorithm 455, 685 Genetic algorithms 430 Genetic Algorithms (GA) 640, 641, 646 Genetic Algorithms (GAs) 638 Geographic Information Systems 314, 321 geometrical Brownian motion 515

German Civil Code 732, 734 GGF 278, 284 GILLARDON AG financial software 680 Girsanov’s Theorem 520, 541 GIS 314−318, 320−322, 324, 325, 327−329 Global Grid Forum 278, 284, 287, 289 Governor 44 gradient 636−638, 644 gradient search 636 graph theory 384 grid computing 6, 257−264, 267−281, 283−289 grid search 636 Grid Service Providers 261 GSM 207, 760, 761 GSPs 261 Gulf Cooperating Council 5, 141

H HARCH 551, 552 Hawk-Dove game 432 hazard 313−317, 322, 328, 351, 557, 558, 581 Hazard and Operability Study 725 HAZOP 725, 729 HBCI 733, 737, 738, 740, 742, 744, 747, 749 heavy tails 545, 582, 585 herd behaviour 380 herding behavior 479, 482, 486, 490−492, 580 heterogeneity 11, 284, 300, 301, 379, 445, 453, 466, 468, 469, 477, 479, 491, 560, 720 heterogeneous agents 379, 380, 426, 443, 466−468, 472, 475, 479, 497, 498 heterogeneous representative agents 380, 472 heuristic 36, 381, 382, 635, 637−643, 646−649, 651, 653


insurance 4, 6, 24, 51−54, 58, 62, 73, 74, 80, 82−88, 91, 212, 224, 229, 230, 270, 311−313, 315−318, 320, 322, 325−331, 335, 337, 339, 353, 380, 398, 413, 710, 712, 727, 753, 754, 757, 759−762, 770 integral representation 510 integration layer 14, 22 integration module 712 Inter-Enterprise Workflow 47 internet payment 255, 256 Internet-banking 709, 731 intertemporal capital asset pricing model 465 intra-daily data 380, 543−550, 552, 554, 555, 576 investor 5, 32, 128, 148−152, 160, 161, 171, 172, 182, 185−187, 189, 190, 357, 387, 452, 465, 470−472, 491, 497−499, 502, 515, 587−589, 591−597, 600−604, 606, 608, 642, 644, 647 IOCSO 163 IT strategy 339, 754 Itô formula 502, 504 Itô stochastic differential 502 J J2EE 15, 16, 18, 59 Java Native Interface 59 JNI 59 K Karush-Kuhn-Tucker 665 kernel methods 381, 655−658, 668−670, 673, 675−679 Kernel Principal Component Analysis 661 key success factor 101, 709 KKT 665 knowledge management 7, 333−336, 352−355

knowledge transfer 710, 779, 780, 782, 783, 785, 786 KonTraG 96 KPCA 658, 661−663, 669, 677−680 Kreditanstalt fuer Wiederaufbau 782 KSE 144, 155, 165 Kuwait Stock Exchange 144−147, 155, 165 L Lagrange multipliers 391, 664 LAN 268, 276, 585 Laser Induced Detection and Ranging 323 least-squares 563, 564, 572, 595, 671, 678 legacy system 54 legacy systems 13, 14, 18, 22, 26, 54, 60, 97−100, 104, 302 leptokurtosis 515 LGD 677 LIDAR 323 life-cycle scenarios 228 LIFFE 555 Limit order 33 linear fractional Lévy motion 571 linear fractional stable motion 570, 571, 584 linear regression 573, 641, 655, 664, 670, 677 Linux 58, 266, 269, 270, 277 liveness properties 720 LLE 661 Local Area Network 268 locally linear embedding 661, 684 London International Financial Futures Exchange 555 long-range dependence 380, 515, 544, 547, 556, 560, 563, 564, 566−570, 576, 582−584, 623 loss function 590, 664 Loss Given Default 677

loss patterns 320 LS 441, 572, 668, 671, 678, 679 M M/M/1 queue 269 M/M/m 269 Manhattan Project 428 manifold learning 661, 662, 679 Man-in-the-middle attack 737 marginal distribution 623 market data 35, 44, 113, 120, 133, 152, 163, 204, 206, 351, 380, 675, 682 market efficiency 445, 449, 498, 544, 610 market frictions 382, 518, 554, 692 market liquidity 148, 163 market maker 148, 164, 176, 180, 182, 183, 185−189, 447, 448, 451, 454, 461, 498, 554 market model 163, 188, 190, 382, 428, 441, 450, 456, 609, 693, 702, 703 Market order 33 market risk 120, 382, 577, 596, 627, 656, 670, 673, 675, 677 Markov games 436 mathematical programming 636, 643 maximum variance unfolding 661 Maximum Variance Unfolding 669 MC 286, 428, 429, 441, 516, 682 MDA 67 MDS 658−663 mean-variance 386, 387, 389, 390, 587, 588, 593, 596−598, 600, 608, 622, 642, 643 measure transformation method 510 Mercer’s condition 657, 658 micropayment system 243 middleware 10, 11, 17, 21, 29, 44, 45, 49, 258, 259, 261−263, 266, 273, 277, 279, 280 Milstein scheme 505

minimax probability machines 675 MIP 628, 629 MIPS 276 mixed-integer programming 628 model selection 381, 605, 608 model-driven architecture approach 67 momentum traders 475−477, 480 momentum trading strategy 476, 480, 481, 486 money market 30, 119, 200, 224, 230, 691, 735 Monte Carlo 120, 421, 426−429, 438, 440, 459, 492, 506, 509, 510, 512, 513, 516, 599, 607, 611, 619, 650 Monte Carlo Simulation 428 MPM 675 MSM 144, 156, 166 multidimensional scaling 658, 687 multiple linear regression (MLR) 648 Multiple Tier approaches 10 Muscat Securities Market 144−147, 156, 166, 168, 169 MVU 661, 669 N NASDAQ 33, 34, 37, 188 NATHAN 322 Natural Hazards Assessment Network 322 .NET 59, 67, 288, 304 network 18, 45−47, 75, 98, 102, 105, 111, 126, 163, 206, 207, 257−260, 262−267, 270, 271, 273, 276, 280, 281, 285, 304, 321, 322, 325, 329, 344, 383−389, 391, 394, 395, 397−400, 403, 407, 409, 410, 412−414, 430, 576, 580, 583, 585, 671−674, 712−714, 721, 723, 727, 760, 761, 763 network flow theory 384 network models 385

networking 49, 257, 258, 262, 263, 266, 272, 280, 281, 284−286, 325, 727, 784 noise trader 456, 463 NP-complete 382, 690, 694, 700, 705 NSF TeraGrid 279 NYSE 34, 553−556, 578, 579, 582, 585, 600 O OBI 46 Object Request Brokers 216 object-oriented 3, 10, 16 OGSI 264, 284 on-demand computing 258 OntoClean 343, 354 ontology 7, 333, 336−348, 352, 354 OntoRisk 7, 333, 336, 338, 343−345, 347, 348 Open Buying on the Internet 46 Open Grid Services Infrastructure 284 operating system 9, 16, 59, 266, 269 operational risk 7, 76−81, 117, 294, 313, 333−337, 339, 343−345, 349, 350, 353−355, 724, 725, 774 optimization 4, 59, 73, 100, 120−122, 194, 195, 200, 201, 204, 379, 381, 384−391, 394, 395, 402, 430, 450, 574, 593, 600, 613, 614, 619, 622, 625−630, 633, 635, 636, 641−643, 646−653, 684, 782 option pricing 380, 513, 522, 524, 542, 670, 675, 686 Oracle 59, 99, 258, 270, 276, 288 Order driven (auction) market 33 Order presentation systems 38 Order routing systems 38 OTC 173, 533 Over-extrapolation 482 over-pricing 482

overshooting 476, 477, 482−484, 486−488, 491, 492 over-the-counter 533 P P2P 16, 240−243, 267, 268, 280 PAC learning 669 partial differential equation 380, 516 PARTITION 693 payment systems 239−244, 247−250, 253−255 PayPal 240, 242, 243, 245−247, 250, 254 PCA 655, 656, 658−661, 663, 675−677, 680, 686 PCAOB 723 PDE 380, 515−517, 522, 530, 534, 538 peer-to-peer 16, 47, 268, 280, 283 Peer-to-Peer 240, 287 perceptron training rule 666 pharming 710, 731, 737, 739, 741, 746−749 phishing 710, 731, 737, 739, 741−744, 746−749, 751 PIN 254, 733, 737, 738, 740, 742, 744, 746−749 platform-specific models 67 point process 557, 585 population 638−640, 646−648 portal layer 14 portfolio 19, 63, 95, 117, 118, 121, 135, 149, 151, 194−202, 204, 208, 213, 229, 235, 236, 291, 326, 381, 383, 385−391, 398, 401, 412, 458, 467−470, 491, 502, 516, 518, 520, 521, 524, 528, 529, 531−533, 540, 541, 583, 587−603, 607−611, 613−630, 633, 634, 642−647, 651, 652, 675, 677, 682, 755, 762, 777

portfolio optimization 121, 381, 383, 385, 386, 388, 390, 391, 469, 470, 491, 608, 613, 614, 647 portfolio selection 195, 201, 202, 204, 208, 213, 235, 386, 458, 589, 590, 592, 594, 608−611, 622 premium differentiation 317, 328 principal component analysis 655, 658, 661, 673, 682, 684 Principle of Static Replication 380, 520, 521, 523, 524, 528, 529, 541 process design 66, 291 process portal architecture 712 product management 3, 52, 58, 70 pruning 636 PSM 67 Public Company Accounting Oversight Board 723 Q QoS 254, 286, 301 quadratic model 394 Quadratic Programming (QP) 643 Quality of Service 254, 286, 301 questioning techniques 77, 78, 725 Quote-driven dealer markets 33 R radial basis function 668 RAIC 258 rational expectation hypothesis 444 rational expectations 444, 446, 455, 466, 475 RBF 668, 671, 673, 675 receiver operating curve 669 Redundant Array of Inexpensive Computers 258 REE 455, 458 reference model 5, 47, 50, 209, 210, 214, 216, 218, 220, 233 regulation 5, 7, 31, 37, 58, 147, 148, 150, 151, 153, 156, 162, 168, 190, 239, 253, 459, 723, 735, 745, 774

REH 444 Reinforcement learning 432 Remote Procedure Call 9 representative agent theory 445 rescaled adjusted range method 560 Research Information Exchange Markup Language 47 return on investment 26 risk assessment 320, 322, 350 risk aversion 380, 452, 454, 457, 466−469, 593, 602−604 risk factor 648 risk management 7, 30, 32, 74, 96, 98, 99, 103, 106, 108, 109, 111, 121, 124, 128−131, 133, 136, 138, 139, 224, 228, 244, 251, 252, 294, 313, 320, 322, 323, 328−330, 333−338, 340, 341, 343, 345−348, 352, 353, 355, 382, 385, 412, 506, 544, 556, 577, 584, 621, 623, 633, 634, 656, 658, 666, 670, 673, 675, 677, 722−725, 727, 728 Risk Management Agency 326 Risk Neutral Measure 520 risk premium 196, 456, 463, 475, 476, 480, 484, 486, 491, 492, 596, 648 RIXML 47 RL 416, 432, 433, 438, 611, 681, 786 RMA 326 ROC 669, 678, 681 ROI 26 Runge-Kutta method 505 S safety properties 720 sales support 770 Santa Fe Artificial Stock Market 454, 462 SAR 323 Sarbanes-Oxley Act 54, 333, 340, 355, 728 SAS Non Life Subject Model 757

satellite image processing 323 Savage density 597, 598 SCD 558, 559 SCM 293 SDE 321, 501−504, 510−513 Second Fundamental Theorem on Asset Pricing 520 Secure Electronic Transaction 240 Securities and Exchange Commission 7, 192, 357, 375 securities trading 3, 34, 37, 39, 49, 151, 172, 173, 181 security 4, 5, 13, 20, 32−34, 37, 40, 41, 43, 45, 53, 73−77, 79, 82, 84−88, 91, 132, 148, 152, 178, 200, 206−208, 240, 243, 248, 262, 263, 267, 277, 278, 284, 325, 330, 337, 340, 342, 344, 351, 387, 498, 502, 515, 518−520, 627, 703, 709−722, 724−728, 731, 735, 738−745, 747, 748, 777 self organizing maps 675 self-controlling systems 710, 765, 770 self-similar processes 569, 570, 582 SEPA 6, 253, 254 sequence diagram 222, 227 service innovation 27 service layer 14, 22, 23, 304 Service orientation 11 Service-Oriented Architecture 3, 28, 309 service-oriented computing 3 SET 240, 243 SETCOVER 693, 694 Settlement and registry systems 38 SF-ASM 454−458 SGML 370 Sharpe ratio 620, 643, 644, 646 Simple Object Access Protocol 48, 216, 259, 284 Simple Workflow Access Protocol 47 Simulated Annealing 637, 638, 640, 644, 646, 652

Single European Payments Area 6, 253 Six Sigma 293, 294, 309, 310 skewness 515, 571, 573, 603, 609, 623 SOA 3, 6, 9−18, 20−28, 54, 215, 261, 300−304, 306, 308, 309, 763 SOAP 48, 216, 232, 233, 259, 284, 713, 762, 763 social network 412−414 software agents 357, 430 software system 3, 17, 215, 303, 713 solvency 106, 225, 313 Solvency II 313, 330, 337, 339, 354 SOX 54, 96, 723 Spatial Database Engine 321 SRM 99 SSL 207, 239, 243, 250, 744 stable distribution 545, 570, 573, 574, 576, 607, 620, 623 stable Paretian distributions 614, 622 stochastic 635, 637, 638, 641, 651 stochastic calculus 380, 501, 502, 520, 542 stochastic conditional duration 558, 578 stochastic differential equation 501, 517 stochastic games 436 Stochastic methods 78, 79 STP 19, 20, 26, 57, 58, 65, 308 Straight Through Processing 140, 308 straight-through-processing 19, 57, 58, 65, 66, 70, 102, 110 Strategic HR Management 766 Strategic Human Capital Planning 766 Stratified sampling 510 strong convergence 503, 504 Sun 234, 258, 270, 272, 279, 289, 380, 543, 545, 547, 559, 576, 584, 673, 676, 682, 687

supernetwork 413, 414 supplier relationship management 99 Supply Chain Management 293, 417 support vector machine 664, 680−687 surveillance 36, 38−41, 48, 148, 150, 739, 741 SVD 558, 560 SWIFT 45, 47, 163 SWIFT Transport Network 45 Synthetic Aperture Radar 323 T Tabu Search 640, 646 Tadawul Saudi Stock Market 144−146, 154, 165 talent management 767 TAN 733, 737, 738, 740, 742, 744, 746−749 Taylor approximation 505 Taylor expansion 504 TCO 60, 65, 285 technology risk 7, 333, 336−349, 352 terms and conditions 185, 297, 733, 734, 739, 741, 742, 745 TFLOPS 276 Threshold Accepting (TA) 637−639, 644, 646, 647, 649−653 time series 100, 443, 444, 446, 451, 453, 459, 474, 480, 483, 485, 486, 488, 489, 494−496, 543−549, 560−563, 565, 566, 576, 578, 580, 581, 583, 585, 607, 655, 656, 664, 670, 673−675, 677, 680, 683, 686, 687, 776 timing risk 6, 312 total cost of ownership 3, 18, 60 Trader Workplace 44 trading engine 36, 37, 39, 41−44 transaction costs 5, 148, 171, 182, 184, 188, 190, 200−202, 255, 383, 397, 398, 402−405, 407, 410−412, 517, 518, 625, 627, 643−647, 722 treasury 4, 30, 95, 97, 98, 100−102, 104−111, 113, 117−122, 128, 130, 134, 135, 348, 579 TSSM 144, 154, 165 U ubiquitous computing 711, 714, 721, 722, 727 UDDI 48, 216, 259, 284 UML 67, 68, 209, 214, 215, 222, 225, 234, 235 UMTS 206, 207 uncertainty 10, 312, 317, 388, 401, 587, 588, 590, 591, 593−595, 600, 602−604, 606−608, 614−616, 619, 647, 656, 723, 774, 775 underwriting risk 7, 312−314, 316, 318, 324, 329 Unified Transaction Modelling Language 216 UNIQA 760, 761 Universal Description, Discovery and Integration 48, 216, 259 US GAAP 7, 312, 357 use case 219−223, 225−228 utility computing 258, 262, 266, 268, 270, 279, 280, 286 UTML 216 V Value at Risk 647, 648, 653 Value-at-Risk 77−79, 580, 616, 618, 634, 653, 724 VaR 346, 616, 618, 620, 621, 626, 652, 677 VARHAC 553 variance reduction 510, 512, 513 variational inequalities 379, 384, 389 VC-dimension 666

virtual organization 260 Virtual Private Network 207 virtualization 73, 258, 261, 262, 266, 268, 269, 272, 280, 281, 283 virtualization technology 258 VMWare 258, 272 VO 260 volatility 31, 40, 380, 446, 466−468, 476, 479, 482−484, 486, 487, 489, 492, 511−517, 544, 546, 550−552, 555, 556, 558, 559, 567, 576−580, 582, 584, 585, 607, 609−611, 623, 624, 632, 643, 647−649, 674, 675, 681 volatility clustering 380, 446, 468, 483, 489, 492 Volatility clustering 489, 547 VPN 207 W Wagner-Platen expansion 504, 505, 507 Walrasian auctioneer 470 weak convergence 503, 504, 506, 513 wealth dynamics 380, 467, 468, 471, 481, 491, 497 web services 13, 14, 48, 216, 284, 286, 303, 762 Web Services Conversation Language 216 Web Services Description Language 13, 48, 216 WebSphere 763, 764 WFMS 47 Whittle estimator 565, 566, 571 Whittle method 566 Wiener process 501−503, 506 WLAN 206, 207 Workflow Management System 47 wrapper 364, 673 WSCL 216 WSDL 13, 48, 216, 232, 284, 713

X XBRL 358, 371, 375 XML 22, 23, 28, 46−48, 140, 216, 232, 284, 305, 358, 368, 370, 371, 713, 763

Z z/VM 266, 269 ZKA 733
