International Handbook on Information Systems
Series Editors: Peter Bernus, Jacek Błażewicz, Günter Schmidt, Michael Shaw
Titles in the Series

M. Shaw, R. Blanning, T. Strader and A. Whinston (Eds.)
Handbook on Electronic Commerce
ISBN 978-3-540-65882-1

J. Błażewicz, K. Ecker, B. Plateau and D. Trystram (Eds.)
Handbook on Parallel and Distributed Processing
ISBN 978-3-540-66441-3

H. H. Adelsberger, B. Collis and J. M. Pawlowski (Eds.)
Handbook on Information Technologies for Education and Training
ISBN 978-3-540-67803-8

C. W. Holsapple (Ed.)
Handbook on Knowledge Management 1: Knowledge Matters
ISBN 978-3-540-43527-3
Handbook on Knowledge Management 2: Knowledge Directions
ISBN 978-3-540-43527-3

J. Błażewicz, W. Kubiak, T. Morzy and M. Rusinkiewicz (Eds.)
Handbook on Data Management in Information Systems
ISBN 978-3-540-43893-9

P. Bernus, P. Nemes and G. Schmidt (Eds.)
Handbook on Enterprise Architecture
ISBN 978-3-540-00343-4

S. Staab and R. Studer (Eds.)
Handbook on Ontologies
ISBN 978-3-540-40834-5

S. O. Kimbrough and D. J. Wu (Eds.)
Formal Modelling in Electronic Commerce
ISBN 978-3-540-21431-1

P. Bernus, K. Mertins and G. Schmidt (Eds.)
Handbook on Architectures of Information Systems (Second Edition)
ISBN 978-3-540-25472-0

S. Kirn, O. Herzog, P. Lockemann and O. Spaniol (Eds.)
Multiagent Engineering
ISBN 978-3-540-31406-6

J. Błażewicz, K. Ecker, E. Pesch, G. Schmidt and J. Węglarz
Handbook on Scheduling
ISBN 978-3-540-28046-0

F. Burstein and C. W. Holsapple (Eds.)
Handbook on Decision Support Systems 1
ISBN 978-3-540-48712-8
Handbook on Decision Support Systems 2
ISBN 978-3-540-48715-9
Detlef Seese Christof Weinhardt Frank Schlottmann (Editors)
Handbook on Information Technology in Finance
Editors

Prof. Dr. Detlef Seese, University of Karlsruhe (TH), Institute AIFB, 76128 Karlsruhe, Germany
Dr. Frank Schlottmann, Gillardon AG financial software, Edisonstraße 2, 75015 Bretten, Germany
Prof. Dr. Christof Weinhardt, University of Karlsruhe (TH), IISM, 76218 Karlsruhe, Germany
Preface

Why do we need a handbook on Information Technology (IT) and Finance? First, because IT and finance are among the most prominent driving forces of our contemporary world. Second, because both areas are developing at tremendous speed, creating an urgent need for up-to-date information on recent developments. Third, because serious applications of IT in finance require specialists with professional training and knowledge in both areas.

Over the last decades the world has seen many changes in politics, economics, science and legislation. The driving forces behind many of these developments are technological in nature, and one of the key technologies in this respect is Information Technology. IT is the most prominent technology revolutionizing industrial development, from products and processes to services, as well as finance, which is itself one of the central pillars of modern economies. The explosive development of the Internet underscores the importance of IT, since it is today's key factor driving global access to and availability of information, and it enables the division of labour on an international scale, that is, globalization.

The profound transformation of finance and the financial industry over the last twenty years was driven by technological developments, e.g. the development of more efficient desktop computers (PCs) with sufficient storage capacity and speed to perform efficient computations at the front desk as well as in the back office, or network technology enabling intranets, the Internet, home banking and service-oriented architectures (SOAs), which allow new and more efficient forms of business processes and financial services. Other important developments are reflected in several "eWords" such as eCommerce, eBusiness and eOrganisation, as well as in electronic markets, digital money and intelligent software agents. These developments have led to whole new scientific areas such as Market Engineering and Knowledge Management.

The area of "pure" mathematical finance has also shown impressive developments, e.g. the groundbreaking discoveries of Black, Scholes and Merton in the early 1970s, honored in 1997 with the Nobel Prize, which have many applications to derivative financial instruments. Likewise, the stunning discoveries of Benoît Mandelbrot on the Fractal Geometry of Nature, with many applications in mathematics, physics and economics, explaining chaos as well as The (Mis)Behavior of Markets (cf. e.g. the corresponding books by Mandelbrot 1982 and by Mandelbrot & Hudson 2004), are a real milestone that contributed essentially to the creation of the new area of Econophysics. The latter discoveries in particular were influenced by the development of modern IT, since without powerful computers and efficient algorithms they simply would not have been possible.

The important influence of IT on finance can be observed in many ways, e.g. in efficient algorithms that realize breakthrough mathematical solutions to financial problems, in IT methods supporting new business processes, or "simply" in the IT-driven worldwide access to data and the visualization of discoveries. However, the development of modern IT and of contemporary business life, with its global connectivity and speed, leads to a growing complexity of tasks and questions which often have to be solved with diminishing resources. This increases the risk of many decisions and, in conjunction with turbulence in financial markets, underlines the necessity to improve and further develop the mathematical and methodical foundations of finance as well as the technological and methodological basis of IT. It has also caused the Basel Committee on Banking Supervision to issue recommendations that serve as the basis for several legislative initiatives to improve the risk and security management of financial institutions.

The present book is an attempt to cover both areas, modern IT and contemporary finance, and to demonstrate the many ways in which they interact. Its primary objective is twofold. First, up-to-date surveys of key areas are provided, containing new developments, essential methods and top results in both areas. Second, a series of case studies and applications illustrates the link between IT and finance.
The intended readers are students as well as experts in both areas, IT and finance, and of course all those working in or interested in applications of IT in the financial services sector. We are convinced that the book will be useful for academics as well as practitioners seeking information on areas outside their expertise or wishing to study relevant examples of professional applications. We have organized the book into three parts: the first on IT systems, infrastructure and applications in finance, the second on IT methods in finance and the third on trends and challenges in the financial sector. An introduction to the respective content is given at the beginning of each part.

The idea for this book was born when two of us, Detlef and Frank, met Günter Schmidt in 2004 at the International Workshop on Computational Management, Science, Economics, Finance and Engineering in Neuchâtel. During a conference break Günter Schmidt pointed out the need for a comprehensive handbook covering all aspects of the latest developments in IT and their impact on finance, which lead to new computational tools for implementing numerical computations more efficiently and enable new forms of markets, new organizational structures, new forms of cooperation, services and business, and new products. Later Detlef and Frank
continued this discussion during an excursion to Montreux, walking along the shore of Lac Léman. It was a beautiful spring day with bright sunshine, white clouds and glittering snow on the peaks of the Alps; the yellow mimosa trees and the pink Japanese cherry trees were in blossom, and the many colors of the tulips on the fresh green of the lawn enhanced our joy and optimism, which convinced us that the challenge of such a Handbook could be mastered and was worth all the effort. After returning to Karlsruhe the two of us contacted Christof and convinced him to join our team and to make every effort to bring the project to a successful end. Since then we have quite often had the feeling that the walk along Lac Léman probably raised our optimism too much, given the enormous effort necessary to realize the huge project of capturing the complete picture of IT in finance. Now that our work is done, we are convinced that it is probably impossible to capture such a vivid and dynamic world of research and applications in a single handbook. However, we are also convinced that we present an excellent choice of highlights, facets and snapshots giving the reader a bird's-eye view of promising areas as well as contemporary and prospective applications shaping the future of finance from the perspective of IT.

Of course, we are indebted to all those who contributed chapters to this Handbook (cf. the list of contributors following the table of contents). With their motivation, work and co-operation they create advances in the areas of IT and finance, and this Handbook reflects a selection of these advances in its contributions. Moreover, we are grateful to all those who helped us through the arduous process of completing this handbook. First, we thank Günter Schmidt for his encouragement to start the project. We thank Springer-Verlag for its support and especially Dr. Werner A.
Mueller for his valuable advice, his patience regarding the final delivery and his encouragement to proceed with this Handbook. We wish to express our special gratitude to Dirk Neumann, who contributed so many of his ideas and so much of his time and energy to the creation of this handbook that we can justly say that he is a co-editor of it. Furthermore, we wish to express our gratitude to the many colleagues we met at different conferences and meetings, e.g. at the conference FinanceCom in Regensburg in May 2005, the 10th Symposium on Finance, Banking, and Insurance in Karlsruhe in December 2005, the conferences Quantitative Methods in Finance in Sydney in 2005 and 2006 and the International Workshop on Computational and Financial Econometrics in Geneva in 2007, with whom we discussed the handbook project and from whom we received many valuable hints; to Tim Becker, Hagen Buchwald, Jianer Chen, Jörn Dermietzel, Mike Fellows, Mark M. Davydov, Hermann Göppl, Benjamin Graf, Volker Gruhn, Ralf Herrmann, Christian Hipp, Carsten Holtmann, Ming-Yang Kao, Gottfried Koch, Dirk Krause, Michael Lesko, Torsten Lüdecke, Benoît Mandelbrot, Daniel Minoli, Rolf Niedermeier, Martin Op’t Land, Fethi Abderrahmane Rabhi, Svetlozar T. Rachev, Hartmut Schmeck, Rudi Studer, Marliese Uhrig-Homburg and Ute Werner for comments, hints, reviews and/or recommendations; to Christian Reimsbach Kounatze and Carolin Michels for helping us with the transformation of the manuscripts from LaTeX to the final style in Word; to Jörn Dermietzel, Tobias Dietrich, Andreas Mitschele and Gabriele Seese for
their help with various technical tasks related to the project; to Ingeborg Götz for her office work related to the book; to GILLARDON AG financial software for providing an IT infrastructure which efficiently supported the difficult production of the final Handbook; and to Stefanie Altinger for supporting the final production. Detlef would like to thank Regina Betz, Fethi Abderrahmane Rabhi, Pradeep Kumar Ray and Nandan Parameswaran from the University of New South Wales, Sydney, and Stephan Chalup, University of Newcastle, for their invitation and support during Detlef’s research semester in winter 2006/2007. Christof and Detlef gratefully acknowledge support from the Deutsche Forschungsgemeinschaft (DFG), especially from the project Graduate School Information Management and Market Engineering (IME). Last but not least, it is our pleasure to thank our wives Gabriele, Renate and Bianca for their constant support, their love and their patience during the many nights, weekends and cancelled holidays we spent on the Handbook project. We thank all those concerned.
Table of Contents

PART I. IT SYSTEMS, INFRASTRUCTURE AND APPLICATIONS IN FINANCE
Introduction to Part I

CHAPTER 1
SOA in the Financial Industry – Technology Impact in Companies’ Practice
Heiko Paoli, Carsten Holtmann, Stephan Stathel, Olaf Zeitnitz, Marcel Jakobi
1.1 Introduction
1.2 SOA – Different Perspectives
1.3 SOA – Technical Details
  1.3.1 Elements and Layers
  1.3.2 Comparison with State-of-the-art Distributed Architecture Alternatives
1.4 Value Proposition of SOA
1.5 Case Study Union Investment
  1.5.1 Introduction to the Use Case Company
  1.5.2 SOA Implementation at the Use Case Company
1.6 Discussion
1.7 Conclusion

CHAPTER 2
Information Systems and IT Architectures for Securities Trading
Feras Dabous, Fethi Rabhi
2.1 Introduction
2.2 Financial Markets: Introduction and Terminology
  2.2.1 Introduction to Financial Markets
  2.2.2 Capital Markets Trading Instruments
  2.2.3 Equity Market
  2.2.4 Derivatives Market
  2.2.5 Types of Orders
  2.2.6 Classifications of Marketplaces
2.3 Information Systems for Securities Trading
  2.3.1 Securities Trading Information Flow
  2.3.2 Categories of Capital Markets’ Information Systems
2.4 Case Studies
  2.4.1 SMARTS
  2.4.2 X-STREAM
2.5 IT Integration Architectures and Technologies
  2.5.1 Transport Level
  2.5.2 Contents (Data) Level
  2.5.3 Business Process Level
2.6 Conclusion
CHAPTER 3
Product Management Systems in Financial Services Software Architectures
Christian Knogler, Michael Linsmaier
3.1 The Initial Situation
3.2 Weaknesses in Software Architectures and Business Processes
3.3 Objectives and Today’s Options
3.4 A ‘Product Server Based Business Application Architecture’
  3.4.1 Business Considerations
  3.4.2 Technical Considerations
3.5 Migrating into a New Architecture
  3.5.1 Blueprint
  3.5.2 Product Data Analysis
  3.5.3 Repository
  3.5.4 Consolidated Architecture
  3.5.5 Innovated Architecture
3.6 Impacts on Software Development
3.7 Conclusion
CHAPTER 4
Management of Security Risks – A Controlling Model for Banking Companies
Ulrich Faisst, Oliver Prokein
4.1 Introduction
4.2 The Risk Management Cycle
  4.2.1 Identification Phase
  4.2.2 Quantification Phase
  4.2.3 Controlling Phase
  4.2.4 Monitoring Phase
4.3 A Controlling Model for Security Risks
  4.3.1 Assumptions
  4.3.2 Determining the Optimal Security and Insurance Level
  4.3.3 Constraints and Their Impacts
4.4 Conclusion
CHAPTER 5
Process-Oriented Systems in Corporate Treasuries: A Case Study from BMW Group
Christian Ullrich, Jan Henkel
5.1 Introduction
5.2 The Technological Environment
5.3 Challenges for Corporate Treasuries
5.4 Portrait BMW Group
5.5 Portrait BMW Group Treasury
  5.5.1 Organization and Responsibilities
  5.5.2 Business Process Design
5.6 Business Process Support
  5.6.1 Best-of-Breed System Functionalities
  5.6.2 Decision-Support Functionalities
5.7 Final Remarks
CHAPTER 7
Capital Markets in the Gulf: International Access, Electronic Trading and Regulation
Peter Gomber, Marco Lutat, Steffen Schubert
7.1 Introduction
7.2 The Evolution of GCC Stock Markets and Its Implications
7.3 Requirements of International Investors
7.4 Systematic Analysis of the Exchanges in the Region
  7.4.1 Bahrain Stock Exchange (BSE)
  7.4.2 Tadawul Saudi Stock Market (TSSM)
  7.4.3 Kuwait Stock Exchange (KSE)
  7.4.4 Muscat Securities Market (MSM)
  7.4.5 Doha Securities Market (DSM)
  7.4.6 Dubai Financial Market (DFM)
  7.4.7 Abu Dhabi Securities Market (ADSM)
7.5 Approaches to Further Open up the Stock Markets to the International Community – The Dubai International Financial Exchange (DIFX) Case
7.6 Conclusion
7.7 Appendix
CHAPTER 8
Competition of Retail Trading Venues – Online-brokerage and Security Markets in Germany
Dennis Kundisch, Carsten Holtmann
8.1 Introduction
8.2 Service Components and Fees in Securities Trading
  8.2.1 Service Components in Securities Trading
  8.2.2 Access Fees for Participants at German Exchanges
  8.2.3 Fees for Price Determination Services
  8.2.4 Discussion of Services and Fees
8.3 Analysis of the Price Models of Online-brokers
  8.3.1 Forms of Appearance of Online-brokerage
  8.3.2 Price Models of Selected Online-brokers
  8.3.3 Comparison of the Price Models and Discussion
8.4 Outlook: Off-exchange-markets as Competitors
8.5 Conclusion
CHAPTER 9
Strategies for a Customer-Oriented Consulting Approach
Alexander Schöne
9.1 Introduction
9.2 Qualitative Customer Service Through Application of Portfolio Selection Theory
  9.2.1 Markowitz’ Portfolio Selection Theory
  9.2.2 Application to Real-World Business
  9.2.3 Consideration of Existing Assets
  9.2.4 Illiquid Asset Classes and Transaction Costs
  9.2.5 Conclusions for Daily Use
9.3 Support of Customer-Oriented Consulting Using Modern Consulting Software
  9.3.1 Architecture of a Customer-Oriented Consulting Software
  9.3.2 Mobile Consulting as a Result of Customer-Orientation
9.4 Conclusion
CHAPTER 10
A Reference Model for Personal Financial Planning
Oliver Braun, Günter Schmidt
10.1 Introduction
10.2 State of the Art
  10.2.1 Personal Financial Planning
  10.2.2 Systems and Models
  10.2.3 Reference Models
  10.2.4 Requirements for a Reference Model for Personal Financial Planning
10.3 Framework for a Reference Model for Personal Financial Planning
10.4 Analysis Model
  10.4.1 Dynamic View on the System: Use Cases
  10.4.2 Structural View on the System: Class Diagrams
  10.4.3 Use Case Realization: Sequence and Activity Diagrams
  10.4.4 Example Models
10.5 System Architecture
10.6 Conclusion and Future Trends
CHAPTER 11
Internet Payments in Germany
Malte Krueger, Kay Leibold
11.1 Introduction: The Long Agony of Internet Payments
11.2 Is There a Market for Innovative Internet Payment Systems?
  11.2.1 Use of Traditional Payment Systems: Internet Payments and the Role of Payment Culture
  11.2.2 The Role of Technology
  11.2.3 The Role of Different Business Models
11.3 Internet Payments in Germany: A View from Consumers and Merchants
  11.3.1 The Consumers’ View
  11.3.2 The Merchants’ View
11.4 SEPA
11.5 Outlook
CHAPTER 12
Grid Computing for Commercial Enterprise Environments
Daniel Minoli
12.1 Introduction
12.2 What is Grid Computing and What are the Key Issues?
12.3 Potential Applications and Financial Benefits of Grid Computing
12.4 Grid Types, Topologies, Components, Layers – A Basic View
12.5 Comparison with other Approaches
12.6 A Quick View at Grid Computing Standards
12.7 A Pragmatic Course of Investigation on Grid Computing
CHAPTER 13
Operational Metrics and Technical Platform for Measuring Bank Process Performance
Markus Kress, Dirk Wölfing
13.1 Introduction
13.2 Industrialisation in the 21st Century?
13.3 Process Performance Metrics as an Integral Part of Business Performance Metrics
  13.3.1 Work Processes, Qualitative Criteria and Measuring Process Value
13.4 Controls for Strategic Alignment and Operational Customer and Resource Management
13.5 Metrics Used for Input/Output Analysis
13.6 Output Metrics: Cash Inflows a Function of Output Quality
13.7 Cash Outflows a Function of Available Capacity
13.8 Time and Volume as Non-financial Metrics of Process Performance
  13.8.1 Fundamental Input/Output Process Performance Characteristics
13.9 Technical Realization
  13.9.1 Monitoring Challenges
13.10 Related Work
13.11 SOA Based Business Process Performance Monitoring
  13.11.1 Data Warehouse
  13.11.2 Complex Event Processing
CHAPTER 14
Risk and IT in Insurances
Ute Werner
14.1 The Landscape of Risks in Insurance Companies
14.2 IT’s Function for Risk Identification and Analysis
  14.2.1 Components of Underwriting Risk
  14.2.2 Assessing Hazard, Exposure, Vulnerability and Potential Loss
  14.2.3 Information Management – Chances and Challenges
14.3 IT as a Tool for Managing the Underwriting Risk
  14.3.1 Disaster Assistance
  14.3.2 Claims Handling
  14.3.3 Market Development
14.4 Conclusion
CHAPTER 15
Resolving Conceptual Ambiguities in Technology Risk Management
Christian Cuske, Tilo Dickopp, Axel Korthaus, Stefan Seedorf
15.1 New Challenges for IT Management
15.2 Knowledge Management Applications in Finance
  15.2.1 An Extended Perspective of Operational Risk Management
  15.2.2 Formal Ontologies as an Enabling Technology
  15.2.3 The Case for Ontology-based Technology Risk Management
15.3 Technology Risk Management Using OntoRisk
  15.3.1 The Technology Risk Ontology
  15.3.2 The OntoRisk Process Model
  15.3.3 System Architecture
15.4 Case Study: Outsourcing in Banking
  15.4.1 Motivation and Influencing Factors
  15.4.2 Trading Environment
  15.4.3 Analysis and Results
15.5 Conclusion
CHAPTER 16
Extracting Financial Data from SEC Filings for US GAAP Accountants
Thomas Stümpert
16.1 Introduction
16.2 Structure of 10-K and 10-Q Filings
  16.2.1 Structure of Document Elements
  16.2.2 Embedded Plain Text
  16.2.3 Embedded HTML
  16.2.4 Financial Reports Overview
16.3 Information Retrieval Within EASE
  16.3.1 Information Extraction from Traditional Filings
  16.3.2 Standard Vector Space Model
  16.3.3 Extended Vector Space Model
  16.3.4 Extraction of HTML Documents
16.4 The Role of XBRL
16.5 Empirical Results
16.6 Conclusion
PART II. IT METHODS IN FINANCE Introduction to Part II ......................................................................................... 379
CHAPTER 17 Networks in Finance ........................................................................................... 383 Anna Nagurney 17.1 Introduction ................................................................................... 383 17.2 Financial Optimization Problems .................................................. 385 17.3 General Financial Equilibrium Problems ...................................... 388 17.3.1 A Multi-Sector, Multi-Instrument Financial Equilibrium Model.......................................................... 389 17.3.2 Model with Utility Functions.......................................... 394 17.3.3 Computation of Financial Equilibria............................... 395 17.4 Dynamic Financial Networks with Intermediation........................ 397 17.4.1 The Demand Market Price Dynamics............................. 400 17.4.2 The Dynamics of the Prices at the Intermediaries .......... 401 17.4.3 Precursors to the Dynamics of the Financial Flows........ 401 17.4.4 Optimizing Behavior of the Source Agents .................... 402 17.4.5 Optimizing Behavior of the Intermediaries .................... 403 17.4.6 The Dynamics of the Financial Flows Between the Source Agents and the Intermediaries....................... 404 17.4.7 The Dynamics of the Financial Flows Between the Intermediaries and the Demand Markets .................. 405 17.4.8 The Projected Dynamical System................................... 406
17.4.9 A Stationary/Equilibrium Point ...................................... 406 17.4.10 Variational Inequality Formulation of Financial Equilibrium with Intermediation..................................... 407 17.4.11 The Discrete-Time Algorithm (Adjustment Process) .... 408 17.4.12 The Euler Method ........................................................... 408 17.4.13 Numerical Examples....................................................... 410 17.5 The Integration of Social Networks with Financial Networks ...... 412
CHAPTER 18 Agent-based Simulation for Research in Economics.......................................... 421 Clemens van Dinther 18.1 Introduction ................................................................................... 421 18.2 Simulation in Economics............................................................... 423 18.2.1 Benefits of Simulation .................................................... 423 18.2.2 Difficulties of Simulation ............................................... 424 18.3 Agent-based Simulation Approaches ............................................ 427 18.3.1 Pure Agent-based Simulation: The Bottom-up Approach ............................................... 427 18.3.2 Monte Carlo Simulation.................................................. 428 18.3.3 Evolutionary Approach................................................... 430 18.3.4 Reinforcement Learning ................................................. 432 18.4 Summary ....................................................................................... 438
CHAPTER 19 The Heterogeneous Agents Approach to Financial Markets – Development and Milestones.............................................................................. 443 Jörn Dermietzel 19.1 Introduction ................................................................................... 443 19.2 Fundamental Assumptions in Financial Theory ............................ 444 19.2.1 Rationality ...................................................................... 444 19.2.2 Efficient Market Hypothesis ........................................... 445 19.2.3 Representative Agent Theory ......................................... 445 19.3 Empirical Observations ................................................................. 446 19.4 Evolution of Models...................................................................... 447 19.4.1 Fundamentalists vs. Chartists.......................................... 447 19.4.2 Social Interaction ............................................................ 450 19.4.3 Adaptive Beliefs ............................................................. 452 19.4.4 Artificial Stock Markets.................................................. 454 19.5 Competition of Trading Strategies ................................................ 456 19.5.1 Friedman Hypothesis Revisited ...................................... 456 19.5.2 Dominant Strategies........................................................ 457 19.6 Multi-Asset Markets...................................................................... 458 19.7 Extensions and Applications ......................................................... 459 19.8 Conclusion and Outlook ................................................................ 459
CHAPTER 20 An Adaptive Model of Asset Price and Wealth Dynamics in a Market with Heterogeneous Trading Strategies........................................... 465 Carl Chiarella, Xue-Zhong He 20.1 Introduction ................................................................................... 465 20.2 Adaptive Model with Heterogeneous Agents................................ 468 20.2.1 Notation .......................................................................... 469 20.2.2 Portfolio Optimization Problem of Heterogeneous Agents................................................ 470 20.2.3 Market Clearing Equilibrium Price – A Growth Model............................................................. 471 20.2.4 Population Distribution Measure .................................... 471 20.2.5 Heterogeneous Representative Agents and Wealth Distribution Measure ................................... 472 20.2.6 Performance Measure, Population Evolution and Adaptiveness ............................................................ 472 20.2.7 An Adaptive Model ........................................................ 474 20.2.8 Trading Strategies ........................................................... 475 20.2.9 Fundamental Traders ...................................................... 475 20.2.10 Momentum Traders ........................................................ 475 20.2.11 Contrarian Traders .......................................................... 476 20.3 An Adaptive Model of Two Types of Agents ............................... 477 20.3.1 The Model for Two Types of Agents ............................. 477 20.3.2 Wealth Distribution and Profitability of Trading Strategies....................................................... 478 20.3.3 Population Distribution and Herd Behavior.................... 478 20.3.4 A Quasi-Homogeneous Model ....................................... 479 20.4 Wealth Dynamics of Momentum Trading Strategies .................... 481 20.4.1 Case: (L1,L2) = (3,5) ........................................................ 481 20.4.2 Other Lag Length Combinations .................................... 484 20.5 Wealth Dynamics of Contrarian Trading Strategies...................... 486 20.5.1 Case: (L1,L2) = (3,5) ........................................................ 487 20.5.2 Other Cases..................................................................... 489 20.6 Conclusion..................................................................................... 491 20.7 Appendix ....................................................................................... 492 20.7.1 Proof of Proposition 1..................................................... 492 20.7.2 Time Series Plots, Statistics and Autocorrelation Results............................................ 494
Strong and Weak Convergence ..................................................... 503 Strong Approximation Methods .................................................... 505 Weak Approximation Methods ..................................................... 506 Monte Carlo Simulation for SDEs................................................. 509 Variance Reduction ....................................................................... 510
CHAPTER 22 Foundations of Option Pricing............................................................................ 515 Peter Buchen 22.1 Introduction ................................................................................... 515 22.2 The PDE and EMM Methods ........................................................ 517 22.2.1 Geometrical Brownian Motion ....................................... 517 22.2.2 The Black-Scholes pde ................................................... 518 22.2.3 The Equivalent Martingale Measure............................... 519 22.2.4 Effect of Dividends......................................................... 521 22.3 Pricing Simple European Derivatives............................................ 521 22.3.1 Asset and Bond Binaries................................................. 522 22.3.2 European Calls and Puts ................................................. 524 22.4 Dual-Expiry Options ..................................................................... 525 22.4.1 Second-order Binaries..................................................... 525 22.4.2 Binary Calls and Puts...................................................... 527 22.4.3 Compound Options ......................................................... 528 22.4.4 Chooser Options ............................................................. 529 22.5 Dual Asset Options........................................................................ 530 22.5.1 The Exchange Option ..................................................... 531 22.5.2 A Simple ESO................................................................. 532 22.5.3 Other Two Asset Exotics ................................................ 532 22.6 Barrier Options .............................................................................. 533 22.6.1 PDE’s for Barrier Options .............................................. 533 22.6.2 Image Solutions for the BS-pde...................................... 534 22.6.3 The D/O Barrier Option.................................................. 535 22.6.4 Equivalent Payoffs.......................................................... 535 22.6.5 Call and Put Barriers....................................................... 536 22.7 Lookback Options ......................................................................... 537 22.7.1 Equivalent Payoffs for Lookback Options...................... 537 22.7.2 Generic Lookback Options ............................................. 539 22.7.3 Floating Strike Lookback Options .................................. 540 22.8 Summary ....................................................................................... 540
CHAPTER 23 Long-Range Dependence, Fractal Processes, and Intra-Daily Data ................... 543 Wei Sun, Svetlozar (Zari) Rachev, Frank Fabozzi 23.1 Introduction ................................................................................... 543 23.2 Stylized Facts of Financial Intra-daily Data .................................. 545 23.2.1 Random Durations .......................................................... 545
23.2.2 Distributional Properties of Returns ............................... 545 23.2.3 Autocorrelation and Seasonality ..................................... 546 23.2.4 Clustering........................................................................ 546 23.2.5 Long-range Dependence ................................................. 547 23.3 Computer Implementation in Studying Intra-daily Data ............... 547 23.3.1 Data Transformation ....................................................... 547 23.3.2 Data Cleaning ................................................................. 549 23.4 Research on Intra-Daily Data in Finance....................................... 550 23.4.1 Studies of Volatility ........................................................ 550 23.4.2 Studies of Liquidity ........................................................ 553 23.4.3 Studies of Market Microstructure ................................... 554 23.4.4 Studies of Trade Duration............................................... 556 23.5 Long-Range Dependence .............................................................. 560 23.5.1 Estimation and Detection of LRD in Time Domain ....... 560 23.5.2 Estimation and Detection of LRD in Frequency Domain...................................................... 564 23.5.3 Econometric Modeling of LRD ...................................... 566 23.6 Fractal Processes and Long-Range Dependence ........................... 569 23.6.1 Specification of the Fractal Processes............................. 569 23.6.2 Estimation of Fractal Processes ...................................... 571 23.6.3 Simulation of Fractal Processes ...................................... 574 23.6.4 Implications of Fractal Processes.................................... 575 23.7 Summary ....................................................................................... 576
CHAPTER 24 Bayesian Applications to the Investment Management Process ......................... 587 Biliana Bagasheva, Svetlozar (Zari) Rachev, John Hsu, Frank Fabozzi 24.1 Introduction ................................................................................... 587 24.2 The Single-Period Portfolio Problem ............................................ 587 24.3 Combining Prior Beliefs and Asset Pricing Models...................... 592 24.4 Testing Portfolio Efficiency .......................................................... 596 24.4.1 Tests Involving Posterior Odds Ratios............................ 597 24.4.2 Tests Involving Inefficiency Measures ........................... 599 24.5 Return Predictability...................................................................... 600 24.5.1 The Static Portfolio Problem .......................................... 601 24.5.2 The Dynamic Portfolio Problem..................................... 603 24.5.3 Model Uncertainty .......................................................... 604 24.6 Conclusion..................................................................................... 607
CHAPTER 25 Post-modern Approaches for Portfolio Optimization ......................................... 613 Borjana Racheva-Iotova, Stoyan Stoyanov 25.1 Introduction ................................................................................... 613 25.2 Risk and Performance Measures Overview................................... 614 25.2.1 The Value-at-Risk Measure ............................................ 616
25.2.2 Coherent Risk Measures ................................................. 616 25.2.3 The Conditional Value-at-Risk ....................................... 618 25.2.4 Performance Measures.................................................... 619 25.3 Heavy-tailed and Asymmetric Models for Assets Returns............ 621 25.3.1 One-dimensional Models................................................ 622 25.3.2 Multivariate Models........................................................ 623 25.4 Strategy Construction Based on CVaR Minimization................... 624 25.4.1 Long-only Active Strategy.............................................. 625 25.4.2 Long-short Strategy ........................................................ 627 25.4.3 Zero-dollar Strategy........................................................ 628 25.4.4 Other Aspects.................................................................. 629 25.5 Optimization Example................................................................... 629 25.6 Conclusion..................................................................................... 633
CHAPTER 26 Applications of Heuristics in Finance................................................................. 635 Manfred Gilli, Dietmar Maringer, Peter Winker 26.1 Introduction ................................................................................... 635 26.2 Heuristics....................................................................................... 636 26.2.1 Traditional Numerical vs. Heuristic Methods................. 636 26.2.2 Some Selected Approaches............................................. 637 26.2.3 Further Methods and Principles ...................................... 639 26.3 Applications in Finance................................................................. 642 26.3.1 Portfolio Optimization .................................................... 642 26.3.2 Model Selection and Estimation ..................................... 648 26.4 Conclusion..................................................................................... 651
CHAPTER 27 Kernel Methods in Finance................................................................................. 655 Stephan Chalup, Andreas Mitschele 27.1 Introduction ................................................................................... 655 27.2 Kernelisation ................................................................................. 657 27.3 Dimensionality Reduction ............................................................. 658 27.3.1 Classical Methods for Dimensionality Reduction........... 658 27.3.2 Non-linear Dimensionality Reduction ............................ 661 27.4 Regression ..................................................................................... 664 27.5 Classification ................................................................................. 666 27.6 Kernels and Parameter Selection................................................... 668 27.7 Survey of Applications in Finance ................................................ 670 27.7.1 Credit Risk Management ................................................ 670 27.7.2 Market Risk Management............................................... 673 27.7.3 Synopsis and Possible Future Application Fields ........... 675 27.8 Overview of Software Tools ......................................................... 678 27.9 Conclusion..................................................................................... 679
CHAPTER 28 Complexity of Exchange Markets ...................................................................... 689 Mao-cheng Cai, Xiaotie Deng 28.1 Introduction ................................................................................... 689 28.2 Definitions and Models ................................................................. 691 28.3 On Complexity of Arbitrage in a General Market Model ............. 694 28.4 Polynomial-Time Solvable Models of Exchange Markets ............ 697 28.5 Arbitrage on Futures Markets........................................................ 699 28.6 The Minimum Number of Operations to Eliminate Arbitrage ...... 702 28.7 Remarks and Discussion................................................................ 703
PART III. FURTHER ASPECTS AND FUTURE TRENDS Introduction to Part III ........................................................................................ 709
CHAPTER 29 IT Security: New Requirements, Regulations and Approaches .......................... 711 Günter Müller, Stefan Sackmann, Oliver Prokein 29.1 Introduction ................................................................................... 711 29.2 Online Services in the Financial Sector......................................... 712 29.2.1 Integration of Financial Services .................................... 712 29.2.2 Security Risks of Personalized Services ......................... 714 29.3 Security beyond Access Control.................................................... 714 29.3.1 Security: Protection Goals .............................................. 715 29.3.2 Security: Dependability .................................................. 717 29.4 Security: Policy and Audits ........................................................... 717 29.4.1 Security Policies ............................................................. 718 29.4.2 Logging Services ............................................................ 721 29.5 Regulatory Requirements and Security Risks ............................... 722 29.5.1 Data Protection Laws...................................................... 722 29.5.2 Regulatory Requirements ............................................... 723 29.5.3 Managing Security and Privacy Risks ............................ 724 29.6 Conclusion..................................................................................... 727
CHAPTER 30 Legal Aspects of Internet Banking in Germany.................................................. 731 Gerald Spindler, Einar Recknagel 30.1 Introduction ................................................................................... 731 30.2 Contract Law – Establishing Business Connections, Concluding Contracts, and Consumer Protection.......................... 732 30.2.1 Establishing the Business Connection ............................ 732 30.2.2 Contract Law .................................................................. 732 30.2.3 Terms and Conditions..................................................... 733 30.2.4 Consumer Protection – EU-Legislation and Directives .. 733
30.3 Liability – Duties and Obligations of Bank and Customer............ 736 30.3.1 Phishing and Its Variations – A Case of Identity............ 736 30.3.2 Legal Situation in Case of Phishing or Pharming ........... 737 30.3.3 Procedural Law and the Burden of Proof........................ 746
CHAPTER 31 Challenges and Trends for Insurance Companies............................................... 753 Markus Frosch, Joachim Lauterbach, Markus Warg 31.1 Starting Position ............................................................................ 753 31.2 Challenges and Trends .................................................................. 754 31.2.1 Business and IT Strategy in the Balanced Scorecard...... 754 31.2.2 Self-controlling Systems: Management Based on the Contribution Margin ....................................................... 756 31.2.3 Unique Selling Propositions: On Demand Insurance...... 759 31.2.4 Sales and Marketing Support: Access to Primary Data................................................... 761 31.2.5 Talent Management: Identify, Hire and Develop Talented Staff ............................................ 765 31.3 Summary and Outlook................................................................... 770
CHAPTER 32 Added Value and Challenges of Industry-Academic Research Partnerships – The Example of the E-Finance Lab .................................................................... 773 Wolfgang Koenig, Stefan Blumenberg, Sebastian Martin 32.1 Introduction ................................................................................... 773 32.2 Structure of the EFL ...................................................................... 773 32.3 Generation and Sharing of Knowledge Between Science and Practice: Research Cycle ........................................................ 775 32.3.1 Problem Definition ......................................................... 776 32.3.2 Operationalization........................................................... 776 32.3.3 Application ..................................................................... 777 32.3.4 Analysis .......................................................................... 777 32.3.5 Utilization ....................................................................... 779 32.4 Added Value and Challenges of the Industry-Academic Research Partnership ..................................................................... 783 32.5 Conclusion..................................................................................... 785 Index ................................................................................................................... 787
CONTRIBUTORS
Biliana Bagasheva, Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106–3110, USA
Stefan Blumenberg, Institute for Information Systems/E-Finance Lab, Frankfurt University, 60325 Frankfurt/Main, Germany
Oliver Braun, Lehrstuhl für Informations- und Technologiemanagement, Saarland University, Im Stadtwald 15, 66123 Saarbrücken, Germany
Peter Buchen, School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia
Mao-cheng Cai, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, P. R. China
Stephan K. Chalup, School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia
Carl Chiarella, University of Technology, Sydney, PO Box 123, Broadway, NSW 2007, Sydney, Australia
Christian Cuske, Protiviti, Taunusanlage 17, 60325 Frankfurt/Main, Germany
Feras T. Dabous, Senior Advisor, Smarts Group International, Level 4, 55 Harrington Street, The Rocks, Sydney NSW 2000, Australia
Xiaotie Deng, Department of Computer Science, City University of Hong Kong, Hong Kong, P. R. China
Jörn Dermietzel, Graduate School IME and Institute AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe, Germany
Tilo Dickopp, Lehrstuhl für Wirtschaftsinformatik III, University of Mannheim, 68131 Mannheim, Germany
Frank Fabozzi, Yale School of Management, 135 Prospect Street, Box 208200, New Haven, Connecticut, 06520-8200, USA
Ulrich Faisst, Department of Information Systems & Financial Engineering, Business School, University of Augsburg, Augsburg, Germany
Markus Frosch, Promerit AG, Wiesenhüttenstraße 1, 60329 Frankfurt/Main, Germany
Martin Glaum, Chair for General Business Administration, International Management Accounting and Auditing, Justus-Liebig-University of Giessen, 35390 Giessen, Germany
Manfred Gilli, Department of Econometrics, University of Geneva, 40, Bd du Pont d’Arve, CH-1211 Geneva 4, Switzerland
Peter Gomber, Chair of Business Administration, especially e-Finance, Goethe-University Frankfurt, Robert-Mayer-Straße 1, 60054 Frankfurt/Main, Germany
Xue-Zhong He, University of Technology, Sydney, PO Box 123, Broadway, NSW 2007, Sydney, Australia
Jan Henkel, BMW AG, 80788 München, Germany
Carsten Holtmann, Research Unit Information Process Engineering (IPE), FZI Forschungszentrum Informatik, Karlsruhe, Germany
John Hsu, Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110, USA
Marcel Jakobi, Union Investment, Wiesenhüttenstraße 10, 60329 Frankfurt/Main, Germany
Christian Knogler, msg systems ag, Robert-Bürkle-Str. 1, 85737 Ismaning, Germany
Wolfgang Koenig, Institute for Information Systems/E-Finance Lab, Frankfurt University, 60325 Frankfurt/Main, Germany
Axel Korthaus, Lehrstuhl für Wirtschaftsinformatik III, University of Mannheim, 68131 Mannheim, Germany
Markus Kress, Institute AIFB, University of Karlsruhe (TH), 76128 Karlsruhe, Germany
Malte Krueger, Lecturer, Institute for Economic Policy Research, University of Karlsruhe (TH), and Consultant, PaySys Consultancy GmbH, 76128 Karlsruhe, Germany
Dennis Kundisch, Department of Information Systems & Financial Engineering, Competence Center IT & Financial Services, University of Augsburg, 86159 Augsburg, Germany
Joachim Lauterbach, vwd Vereinigte Wirtschaftsdienste GmbH, Tilsiter Straße 1, 60487 Frankfurt/Main
Kay Leibold, Researcher, Institute for Economic Policy Research, University of Karlsruhe (TH), 76128 Karlsruhe, Germany
Michael Linsmaier, msg systems ag, Robert-Bürkle-Str. 1, 85737 Ismaning, Germany
Marco Lutat, Chair of Business Administration, especially e-Finance, Goethe-University Frankfurt, Robert-Mayer-Strasse 1, 60054 Frankfurt/Main, Germany
Dietmar Maringer, Centre for Computational Finance and Economic Agents (CCFEA), University of Essex, Wivenhoe Park, Colchester CO4 3SQ, United Kingdom
Sebastian Martin, Institute for Information Systems/E-Finance Lab, Frankfurt University, 60325 Frankfurt/Main, Germany
Daniel Minoli, Stevens Institute of Technology, Hoboken, NJ 07030-5991, USA
Andreas Mitschele, Institute AIFB, University of Karlsruhe (TH), 76128 Karlsruhe, Germany
Günter Müller, Institute of Computer Science and Social Studies, Department of Telematics, University of Freiburg, 79085 Freiburg, Germany
Anna Nagurney, Department of Finance and Operations Management, Isenberg School of Management, University of Massachusetts, Amherst, Massachusetts 01003, USA
Heiko Paoli, Research Unit Information Process Engineering (IPE), FZI Forschungszentrum Informatik, 76131 Karlsruhe, Germany
Eckhard Platen, School of Finance & Economics and Dept. of Mathematical Sciences, University of Technology Sydney, Australia
Oliver Prokein, Institute of Computer Science and Social Studies, Department of Telematics, University of Freiburg, 79085 Freiburg, Germany
Fethi Rabhi, School of Information Systems Technology and Management, The University of New South Wales, Sydney, NSW 2052, Australia
Svetlozar (Zari) Rachev, School of Economics and Business Engineering, University of Karlsruhe, Postfach 6980, 76128 Karlsruhe, Germany, and Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110, USA
Borjana Racheva-Iotova, Research and Development, FinAnalytica Inc., Milin Kamak 65, Sofia 1164, Bulgaria
Einar Recknagel, RAe Schulz Noack Bärwinkel, Baumwall 7, 20459 Hamburg, Germany
Stefan Sackmann, Institute of Computer Science and Social Studies, Department of Telematics, University of Freiburg, 79085 Freiburg, Germany
Günter Schmidt, Lehrstuhl für Informations- und Technologiemanagement, Saarland University, Im Stadtwald 15, 66123 Saarbrücken, Germany
Alexander Schöne, GILLARDON AG financial software, Edisonstr. 2, 75015 Bretten, Germany
Steffen Schubert, Dubai International Financial Exchange Ltd., Level 7, The Exchange Building, Gate District, Dubai International Financial Centre, P.O. Box 53536, Dubai, United Arab Emirates
Stefan Seedorf, Lehrstuhl für Wirtschaftsinformatik III, University of Mannheim, 68131 Mannheim, Germany
Gerald Spindler, Juristische Fakultät, Universität Göttingen, Platz der Göttinger Sieben 6, 37073 Göttingen, Germany
Stephan Stathel, Research Unit Information Process Engineering (IPE), FZI Forschungszentrum Informatik, Haid-und-Neu-Str. 10–14, 76131 Karlsruhe, Germany
Stoyan Stoyanov, FinAnalytica Inc., Milin Kamak 65, Sofia 1164, Bulgaria
Wei Sun, Institute for Statistics and Mathematical Economics, University of Karlsruhe, 76128 Karlsruhe, Germany
Thomas Stümpert, Institute AIFB, University of Karlsruhe (TH), 76128 Karlsruhe, Germany
Christian Ullrich, BMW AG, 80788 München, Germany
Clemens van Dinther, FZI – Research Center for Information Technologies, Universität Karlsruhe (TH), Haid-und-Neu-Str. 10–14, 76131 Karlsruhe, Germany
Hong Tuan Kiet Vo, Institute of Information Systems and Management IISM, Research Group Information & Market Engineering, University Karlsruhe (TH), 76137 Karlsruhe, Germany
Markus Warg, Deutscher Ring Versicherungsunternehmen, Ludwig-Erhard-Straße 22, 20459 Hamburg, Germany
Ute Werner, Lehrstuhl für Versicherungswissenschaft, Universität Karlsruhe (TH), 76133 Karlsruhe, Germany
Peter Winker, Department of Economics, University of Giessen, Licher Str. 64, 35394 Giessen, Germany
Dirk Wölfing, cirquent | BMW Group, Königsberger Straße 29, 60487 Frankfurt/Main
Remigiusz Wojciechowski, Bayer AG, Treasury Systems Manager, Building Q26, Room 3.308, 51368 Leverkusen, Germany
Olaf Zeitnitz, Union Investment, Wiesenhüttenstraße 10, 60329 Frankfurt/Main, Germany
PART I IT Systems, Infrastructure and Applications in Finance
Introduction to Part I
The sixteen chapters of this part cover essential aspects of IT systems and their relation to the infrastructure of the financial service industry. In the very first chapter Paoli, Holtmann, Stathel, Zeitnitz and Jakobi present the Service-Oriented Architecture (SOA) as an innovative concept for designing software infrastructures in the financial industry. The SOA promises benefits like decoupling of tasks and increased reuse of software components, since the SOA software system is divided into logical, self-sustaining parts (services). Services are, in contrast to e. g. objects in object-oriented architectures, loosely coupled to form the business logic of e. g. companies. SOA also has a very strong economic impact, as it allows more flexible ways of adapting business logic to new market requirements and simultaneously promises to reduce the total cost of ownership. In this chapter a case study from the German financial industry provides insights into the introduction of a SOA and reveals how much of the promised SOA impact and potential can be realized in daily practice today. Chapter 2 is devoted to information systems and IT architectures for securities trading. Here, Dabous and Rabhi give an introduction to the basic principles of securities trading by presenting some essential concepts such as financial instruments, order types and market structures. The chapter describes the securities trading cycle as consisting of a number of business processes within and across different institutions such as brokerage houses, investment banks, and markets. It classifies the different types of information and transaction systems and their role in the trading cycle. For illustration, the chapter focuses on two commercial systems, examines their architecture, and surveys the different levels of integration and technologies that are currently being used.
It also discusses current trends in utilizing service-oriented computing, business process automation and integration principles in order to achieve the development of stable, flexible and extensible business processes. The next chapter of Knogler and Linsmaier investigates product management systems in financial services software architectures and presents a controlling model for banking. Flexibility and adaptability of core business application software
are major drivers for both business and IT managers in the insurance industry. Extensively automated straight-through processing from the point of sale (or point of contact) into the back office systems and back again has to be enabled in order to cut the time and costs spent on sales and service processes. The authors present a concept of ‘Product Data Driven Business Application Software Architecture’ for the insurance industry, reflected by a product management system. Moreover, the scalability of services is a new approach that ranges from simple services (e. g. calculations) to product related business services which have a key role in nearly all core processes within the financial services and insurance industry. The chapter authors identify strategic IT-consulting and systems integration services as well as standard business applications as the ingredients for tomorrow’s business success. Faisst and Prokein discuss the management of security risks and introduce a controlling model for banking companies in Chapter 4. The increasing importance of information and communication technologies, new regulatory obligations (e. g. Basel II) and growing external risks (e. g. hacker attacks) put security risks in the management focus of banking companies. The management has to decide whether to carry security risks, to invest in technical security mechanisms in order to decrease the frequency of events, or to invest in insurance policies in order to lower the severity of events. Based on a presentation of the state-of-the-art in the management of security risks, this chapter develops an optimization model to determine the optimal amount to be invested in technical security mechanisms and insurance policies. Furthermore, the model considers budget and risk limits as constraints. This chapter is intended in particular to help practitioners in controlling security risks. In Chapter 5 Ullrich and Henkel present a case study on process-oriented systems in corporate treasuries.
Any kind of risk and transaction management conducted in a corporate treasury department requires technology support. The reason for this stems not only from the volume, complexity and time criticality of the necessary data processing, but also from the desire to make quicker and better-informed decisions. A practical conflict emerges from the need to consolidate IT systems on the one hand, while at the same time granting a certain degree of specification on the other. A case study provides insights into the treasury organization, financial services and treasury process of BMW Group, a multinational manufacturer of premium automobiles and motorcycles. In order to address the above-mentioned conflict, BMW’s treasury system functions and their capability of supporting BMW’s treasury operations are outlined. It is found that, given the current market environment for IT systems, BMW needs to rely on a combination of vendors in order to obtain a complete business solution. How to streamline foreign exchange management operations with corporate financial portals is investigated by Kiet Vo, Glaum and Wojciechowski in Chapter 6. This work outlines the characteristics and determinants of multinational financial processes and identifies the challenges financial management has to face when striving for operational effectiveness and efficiency. It is argued that these challenges can be effectively managed only with the support of IT. Hence, the concept of a corporate financial portal as a central hub to financial information, systems and transaction services is introduced and discussed. To complement the
theoretical discussion a corresponding application is introduced and the impact of a corporate financial portal implementation on the foreign exchange management process of Bayer AG is evaluated. Stock markets in the Gulf Region have experienced significant growth in the recent past. This area is discussed in Chapter 7 by Gomber, Lutat and Schubert, with a special emphasis on international access, electronic trading and regulation of this market, for which market performance, trading activity and market capitalization have increased by nearly triple digits annually. This development has two effects: on the one hand, the financial markets become more attractive to international investors, who for a long time hardly had access to them; on the other hand, the region’s exchanges are modernizing their market models, regulatory frameworks, functionalities and technical infrastructures. This chapter highlights recent stock market developments in the six Gulf Cooperation Council (GCC) countries against the background of international investors’ requirements. The region’s exchanges are characterized from an organizational and regulatory perspective with a focus on trading mechanisms. Recent approaches for modernizing and opening up GCC markets are investigated taking the Dubai International Financial Exchange (DIFX) as a case study. Kundisch and Holtmann investigate the competition of retail trading venues, online-brokerage and security markets in Germany in Chapter 8. Germany is one of the few countries with a vivid regional stock market structure. Some of the markets specifically address the retail investor with their offerings and compete on price as well as on quality. Retail investors cannot access these markets directly, but have to make use of intermediaries such as an online-broker.
The explicit transaction costs are one important factor that influences the decision of a retail investor whether and where to perform her transaction. This contribution analyses, on the one hand, the services that are affiliated with a stock market transaction and, on the other hand, the pricing schemes of four online-brokers. The research question addressed is whether a potentially existing price competition among the stock markets and other affiliated service providers is visible to the retail investor. Due to the pricing schemes of the online-brokers, which are primarily relevant to the retail investor, potentially existing price competition on the level of the stock markets does not become visible. However, differences can be reported in the distortions caused by the different pricing schemes of the analysed online-brokers. In Chapter 9 Schöne develops strategies for a customer-oriented consulting approach in sales processes for financial products. Following a phase where customer service was not considered a core business, banks have begun focusing on customer service again. The chapter discusses two central requirements for customer-oriented consulting. First, it highlights which methods are available for enhancing the functional quality of consulting services. Then it introduces the IT structures which modern consulting software should provide, especially to enable quick reactions to changing market requirements. Chapter 10 is devoted to reference models for financial planning. Here Braun and Schmidt present a reference model for personal financial planning based on an application architecture and a corresponding system architecture. Personal financial
planning is the process of meeting life goals through the management of finances. After an introduction the chapter provides a short review of the state-of-the-art in the field of reference modelling for personal financial planning and describes the framework of a suitable reference model. The chapter discusses essential parts of reference models for the analysis and architecture level. As an example, a prototype (FiXplan – IT-based personal financial planning) is described in order to evaluate the requirements compiled. Finally, the results are summarized and future research opportunities are discussed. Internet payments in Germany are the central focus of Chapter 11 by Krueger and Leibold. After discussing the historical development they take a look at the current state of the market and some of the important drivers. Then the role of technology and different business models is investigated. Section three of this chapter presents the special situation of Internet payment in Germany. It is followed by a discussion of the Single Euro Payments Area (‘SEPA’) and an outlook on possible future developments. Grid Computing is an evolving field of virtualized distributed computing environments, providing dynamic “runtime” selection, sharing, and aggregation of (geographically) distributed autonomous resources. Minoli’s Chapter 12 gives a survey of the Grid Computing field as it applies to corporate environments and focuses on the potential advantages of the technology in these environments. After a survey of the historical development it gives a profound introduction to grid computing and its key issues. An analysis of potential applications and financial benefits of grid computing and a survey of different grid types, topologies, components and layers are provided. This basic view is compared with other approaches.
A brief look at grid computing standards and a pragmatic course of investigation on grid computing close the chapter. In Chapter 13, Kress and Wölfing present operational metrics and a technical platform for measuring bank process performance. Industrial methods to improve productivity require metrics to control the performance of processes. For this purpose the end-to-end processes should be broken down into components, ideally reflecting the division of labour. The interfaces between the components of the process are the points of measurement. The coordination of the different workers is based on controlling information produced in a technical business process management platform. Technically, the business process management platform produces information by means of a monitoring system integrated in a SOA. Depending on the specific scenario, various technical approaches exist to build such a monitoring solution, which has to cope with increasing data volume as well as constant changes. The presented solution provides business driven monitoring in challenging technical environments. Werner gives a survey of special aspects of risk and IT in insurance in Chapter 14. Insurance companies run different risks, with underwriting and timing risk being the most typical ones. As insurance contracts cover natural and man-made hazards and the resulting damages, it is important for insurers to assess their interdependent effects. Geographical Information and Remote Sensing Systems are used for risk identification and analysis since they support the mapping of hazards,
exposure, vulnerability and risk. Insurance companies need this information for risk inspection and pricing. Accumulations of risks insured have to be controlled, too. This task may be enhanced by IT since it requires the identification and visualization of spatial and temporal contingencies which is important for risk selection and complex risk analyses. Geographic Information and Remote Sensing Systems have been used as tools for disaster assistance and insurance claims management. Another promising application area is market development, i. e. increasing the quantity and improving the quality of insurance business written. Both effects are able to reduce the underwriting risk of insurers. Cuske, Dickopp, Korthaus and Seedorf investigate in Chapter 15 how conceptual ambiguities in technology risk management can be resolved. Knowledge management is widely recognized as a business enabler, especially in large organizations depending on intangible assets. Accordingly, financial institutions need to acquire, store and apply knowledge for supporting their core business processes. This particularly holds for risk management, which crosscuts the financial service domain as one core competency. Unlike e. g. market or credit risk, operational risk suffers from an underlying conceptual ambiguity, creating special challenges for knowledge management. In this chapter the OntoRisk process model is presented, which is used for identifying, measuring and controlling technology risk, a subcategory of operational risk which is also in the focus of the Basel II regulation. The process is supported by an ontology-based platform implementing a suitable knowledge representation and application components supporting the process steps. The advantages over traditional approaches are analysed by means of a case study. 
The last chapter of this part contains a contribution by Stümpert who discusses how to extract financial data from SEC filings for US GAAP accountants as a key example of automatic information retrieval. U.S. domestic companies with publicly traded securities have to disseminate financial information in the EDGAR database of the Securities and Exchange Commission (SEC), a regulation authority with the task to protect investors and to maintain fair, honest and efficient markets. The most important financial information in this database is 10-K filings which contain the annual reports, i. e. balance sheets, cash flow statements and income statements. The chapter presents a software agent which extracts data from annual and quarterly reports and enables financial accountants to compare financial data of different companies and different years easily. The structure of 10-K filings varies between companies, and changes over time for individual companies as well. Moreover, their technical specification is comparably weak. Altogether, this makes automated data extraction a great challenge. The focus of this chapter is the recognition of balance sheets, that is, how to find relevant sections in a large document. A filing consists of either HTML or plain ASCII text. Regarding HTML encoded content the agent builds a DOM (Document Object Model) instance on top of a filing. This DOM-based approach revealed additional potential for navigation in these filings in order to detect financial information faster and reliably even when filings do not adhere strictly to syntactical conventions. For plain text, a modified vector space model has been developed. The agent detects 100% of balance sheet data from 10-K filings in a sample of the companies of the Dow Jones Industrial Average Index.
CHAPTER 1 SOA in the Financial Industry – Technology Impact in Companies’ Practice Heiko Paoli, Carsten Holtmann, Stephan Stathel, Olaf Zeitnitz, Marcel Jakobi
1.1 Introduction
Service-oriented Architecture (SOA) is, at its core, an innovative concept for designing software infrastructures. To better understand the idea and concepts of SOA it seems promising to look at how Information and Communication Technology (ICT) has developed since the beginning of the 1970s (see Table 1.1). Four major developments concerning architectural paradigms can be observed:
• Monolithic and host-based approaches: The history of ICT started with monolithic software systems on mainframes. Typical applications of that era were developed for accounting, payroll and inventory purposes. Software primarily concentrated on the functions it should provide. Systems were designed as building blocks and therefore the reuse of system elements or components was largely impossible. Functionality was implemented on hosts, while terminals provided the user interface to the systems. The next step in software architecture development was the era of client-server systems beginning in the 1980s.
• Client Server (CS) approaches: With the CS paradigm software increasingly started to address the needs of single user groups through applications for Enterprise Resource Planning (ERP), Human Resource Management (HR) or collaboration. Terminals were replaced by personal computers and software systems were divided into two parts: Servers built the backbone and provided the necessary business functionality, while the clients were used mainly for presentation purposes. The hardware and/or operating systems of client and server became independent of each other. Hence, server functionality was used by multiple clients, while the communication scheme was based on the technology known as Remote Procedure Call (RPC). Clients and servers were still coupled quite tightly.
• Multiple Tier approaches: In the 1990s the object-oriented software development paradigm became common and it became possible to divide complex systems into smaller parts (components and objects), which were able to communicate with each other to build the overall functionality. One central goal of the object-oriented paradigm was to ease the reuse of system parts and objects. To support the reuse of server objects in a distributed environment the concept of a communication middleware was introduced. It thus became possible to build complex systems, but as in client-server systems all parts of the distributed system were still tightly coupled through the middleware.
To solve the challenges of multiple tier systems, SOA started to take off in the early 2000s. Today SOA is one of the most frequently used terms in discussions about state-of-the-art software architecture paradigms. Nearly every important software vendor offers its own solutions to support SOA. Promises are huge and so is uncertainty about real benefits: What exactly distinguishes SOA from other paradigms for distributed software systems? What are the primary benefits, and which of them can be observed in real world scenarios? In the remainder of the chapter these questions will be addressed. After an introduction to the different understandings one could have of SOA, the often-mentioned benefits concerning technical and economic aspects are discussed, before the implementation of a SOA in a German Asset Management Company (AM) is illustrated and discussed with respect to both aspects.
Table 1.1. Architecture approaches (see Magic Software (2005))

Approach            | Applications                    | Range                     | Characteristics
Host Based          | Accounting, Payroll, Inventory  | Function (Discrete)       | Monolithic; very tightly coupled; no reuse possible
Client-Server       | ERP, HR, Collaboration (Office) | Employee (Internal)       | RPC; tightly coupled; little reuse possible
Multiple Tier (Web) | CRM, eBusiness, EAI             | Customer (External)       | Objects; tightly coupled; medium reuse possible
Enterprise Network  | BPM, CPM/BI                     | Business Process (Global) | Services; loosely coupled; reuse possible
1.2 SOA – Different Perspectives
The introduction provides a rough idea of the history of different ICT architecture paradigms and briefly summarizes their important characteristics. The question of what a SOA is was raised. The answer is simple: It depends. As in any emerging ICT field there is a huge heterogeneity of understandings of what SOA refers to. Typically none of them is able to sketch all aspects of the topic; some try to be broad and high level, others focus on special characteristics.¹ To illustrate different perspectives, three approaches (namely a software vendor’s, a consultant’s, and a common perspective) are briefly described before the basic architecture of a SOA is presented. A first way of understanding SOA could be characterized as a vendor’s perspective on the terms service, service orientation, service-oriented architecture and composite application, since it has been provided by IBM. Interestingly, the very basis of the understanding is the definition of a service as being a repeatable business task (like checking a customer’s credit or opening a new account). Service orientation is then understood primarily in non-technical terms, as a way of integrating business processes as a linked chain of services and the outcomes that they provide.² Hence, a service-oriented architecture is an IT architectural style that supports service orientation. These basic definitions of a software vendor interestingly stress the business aspect of services and SOA. Nothing is mentioned about the requirements on the infrastructure, middleware, or software architecture. In this definition business processes seem to be the most important aspect to focus on. Composite applications facilitate business processes across multiple functions and systems. A SOA infrastructure obviously has to be built upon the business needs; this would be in clear contrast to older concepts where the IT infrastructure dictated how the business tasks have to be handled.
Building applications on a SOA approach seems to be a task of decomposing business needs into functions and services. Composite applications can be described as a jigsaw puzzle where business services are connected by using middleware functionalities. Following these insights, a SOA has to be
• process-oriented,
• context-driven and dynamic, which means it accomplishes business goals, responds to business events and changes over time depending on changing business needs,
• user-centric, which means that it integrates and provides information requested by specific user groups or roles.
¹ Even manufacturers of the available software solutions which support this new paradigm have different understandings of SOA, leading to a diversity of SOA products with very different properties.
² A set of related and integrated services that support a business process built up on service-oriented architecture is dubbed a composite application.
A second perspective on SOA can be dubbed a consultant’s perspective. According to (Gartner 2005), a service-oriented architecture consists of some established formal interfaces between reusable business components and clients. Services are designed to be used across applications and organizations. Like the first perspective, this understanding concentrates on the business aspect of SOA, but it also integrates a more technical perspective and stresses the possibility of reusable business components. A third understanding can be dubbed a common understanding, as it is provided in the free encyclopaedia Wikipedia (Wikipedia 2005). As in the previous definitions, a service is defined as an abstract and self-contained part needed for the business, without considering how a service could be implemented. While additional questions about the business perspective remain open, this source at least clarifies
• how (technical) services could be used (through a stateless request/response interaction pattern), and
• how (technical) services interact with one another (orchestration, choreography).
For using a service at least two parties are necessary (provider, consumer). The relationship between a consumer and a provider can be dynamic, in which case the binding between the two is established through a service repository (discovery). This definition allows a provider of a specific service to be a consumer of another service at the same time. As these perspectives on SOA do not explain its fundamental nature, the technical aspects are described in more detail in the following.
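The provider, consumer and repository roles described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited sources: the class `ServiceRepository`, its methods `publish` and `discover`, and the two toy services are hypothetical names chosen for illustration, and a plain function call stands in for the network request.

```python
from typing import Callable, Dict

# Assumption: a service is modelled as a stateless function (request in, response out).
Service = Callable[[dict], dict]


class ServiceRepository:
    """Hypothetical in-memory registry used for discovery."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def publish(self, name: str, service: Service) -> None:
        # A provider registers its service under a name.
        self._services[name] = service

    def discover(self, name: str) -> Service:
        # A consumer looks the service up by name at call time.
        return self._services[name]


repository = ServiceRepository()


def check_address(request: dict) -> dict:
    # Provider only: a toy rule standing in for real address validation.
    return {"valid": bool(request.get("street")) and bool(request.get("city"))}


def open_account(request: dict) -> dict:
    # Provider of "open_account" and, at the same time, consumer of "check_address".
    address_service = repository.discover("check_address")  # dynamic binding
    if not address_service(request["address"])["valid"]:
        return {"status": "rejected"}
    return {"status": "opened", "customer": request["customer"]}


repository.publish("check_address", check_address)
repository.publish("open_account", open_account)

response = repository.discover("open_account")(
    {"customer": "C-1", "address": {"street": "Main St 1", "city": "Karlsruhe"}}
)
```

Because each call carries everything the service needs, no state is kept between requests, and `open_account` is bound to its address check only when it runs, so the provider behind `check_address` could be replaced in the repository without touching the consumer.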
1.3 SOA – Technical Details
To illustrate the idea of SOA in more detail, some insight into its basic elements and layers as well as a comparison with alternative state-of-the-art technologies is deemed promising. Both are provided on the following pages.
1.3.1 Elements and Layers
In contrast to the differing SOA understandings found in business scenarios, the technical elements and details of a SOA are largely a matter of consensus among authors. According to, e. g., (Krafzig et al. 2004) a SOA comprises
• services,
• application front-ends,
• a service repository, and
• a service bus.
Application front-ends are built upon services. The communication between services, service repository and application front-ends is enabled by the service bus.
Figure 1.1. Basic Elements of a SOA, see (Krafzig et al. 2004)
The service repository is needed to find and bind services dynamically to application front-ends and/or other services. A service can be divided into further parts (see Figure 1.1). Services can be accessed through their interfaces, which are publicly visible. Without knowing the interface it is not possible to access or use a service. Hence it is necessary to describe interfaces in an appropriate way.³ A service can have more than one interface. The advantage is that a service can then be used via different protocols and/or technologies. A service needs a concrete implementation which provides the business logic and the data to fulfil its tasks. Data is typically partitioned in such a way that every partition can only be manipulated through one service. Hence the data of a service can be interpreted as a private property of that single service, and if other services need to access that data, the associated service and its interface(s) have to be used. The implementation of a service is typically hidden from the consumers. Last but not least, a service also has a contract, although the idea of contracts is often disregarded in current implementations. Service contracts define what the service guarantees to its consumers (e. g. quality, security, performance …) and also state what is required to use the service (e. g. price, required authorization level …). One can distinguish business services from basic (or technical) services. The latter are typically applied to
• interact with technical infrastructure elements (send and receive messages, use of data buffers and queues),
• connect to already existing legacy systems or applications (e. g. sales system, customer relationship management systems),
• use the technical infrastructure (through wrappers, adapters, transformations …).

³ Web services, e. g., can be described by WSDL (Web Services Description Language).
Table 1.2. Technical elements in a SOA

Term                  | Definition/Comment
SOA                   | Entirety of application front-ends, services, service repository, and service bus.
Application front-end | Logical aggregation and presentation of services.
Service               | Logical unit of business logic, which can be used by application front-ends. A service can be used through its interface and assures the conditions from its contract.
Service Repository    | Optional part of a SOA for managing services. It allows dynamic coupling between services and/or application front-ends.
Service Bus           | Communication infrastructure between services, application front-ends, and the service repository.
Contract              | A set of conditions under which a service can be used, together with conditions the service will assure when it is used.
Implementation        | Business logic and data of a service needed to fulfil its contract.
Interface             | Description of a service to connect and use it.
Business logic        | Functionality of a service which is needed by application front-ends.
Data                  | Data which belongs to a service and can be accessed and managed by the business logic.
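How interface, contract, implementation and data relate to each other in Table 1.2 can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the names `Contract`, `CreditCheckInterface` and `CreditCheckService`, the two contract fields, and the score threshold are hypothetical and do not come from (Krafzig et al. 2004).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """What the service requires and what it assures (illustrative fields only)."""
    required_role: str           # condition for using the service
    max_response_seconds: float  # guarantee stated to consumers (not enforced here)


class CreditCheckInterface(ABC):
    """Interface: the publicly visible description of the service."""

    contract: Contract

    @abstractmethod
    def is_creditworthy(self, customer_id: str, caller_role: str) -> bool:
        ...


class CreditCheckService(CreditCheckInterface):
    """Implementation: business logic plus the data owned by this service."""

    contract = Contract(required_role="advisor", max_response_seconds=2.0)

    def __init__(self) -> None:
        # Private data partition: other services reach it only via the interface.
        self._scores = {"C-1": 720, "C-2": 480}

    def is_creditworthy(self, customer_id: str, caller_role: str) -> bool:
        if caller_role != self.contract.required_role:
            raise PermissionError("caller does not meet the contract")
        return self._scores.get(customer_id, 0) >= 600  # toy business rule


# Consumers depend on the interface type, not on the concrete implementation.
service: CreditCheckInterface = CreditCheckService()
```

A consumer would call `service.is_creditworthy("C-1", "advisor")`; a caller with a different role is refused, which is the kind of usage condition a contract states in addition to the bare interface.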
In contrast to basic services, business services provide business functionality to support business processes (often by using basic services), e. g. to check the creditworthiness of a customer. The business services can again be layered, meaning that there are simpler ones (e. g. to check the address of a customer) and more complex ones (e. g. to execute a securities order). The more complex business services typically call the simpler ones to fulfil their goals. The communication infrastructure between all services is called the enterprise service bus. Typically, a SOA is implemented by using a number of building blocks or layers (see Figure 1.2). The lowest layer (legacy layer) consists of already existing legacy systems and applications, typically providing back-office and persistence functionality. This is the core IT architecture. In the next layer (integration layer) these legacy systems and applications are encapsulated and integrated through basic services. On top of the basic services, in the business service layer (service layer), services are integrated, or orchestrated, to build business processes and business workflows. The uppermost layer (portal layer) is used for presentation purposes. This layer is the business process management portal, consisting of composite applications for controlling and monitoring the business processes. Referring to the basic description of a SOA, the integration and service layers can be implemented using different technologies; the best choice depends on the requirements to be met. Web services are independent of the platform (hardware, programming language and operating system), well supported by different providers and hence applied quite often.
Figure 1.2. Layers of a SOA implementation, adapted from (Eckwert and Saft 2006)
But they also have their shortcomings: the performance of web services is typically limited and traditional transaction protocols (e. g. two-phase commit) are difficult to support. CORBA is often considered a possible alternative because it is well-suited for handling transactions and provides high performance. On the other hand, it lacks strong support for B2B integration, so it seems best suited for internal processes; the same holds for the J2EE technology. Of course a mix of different technologies could also be used for a SOA implementation, as long as SOA goals such as loose coupling, reuse and usage of services are respected.
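The four layers of Figure 1.2 can be sketched in a few lines of Python, independently of whether web services, CORBA or J2EE would be used in practice. The sketch is illustrative only: the depot example, the semicolon-separated record format and all class and function names are hypothetical assumptions, and direct method calls stand in for communication over a service bus.

```python
class LegacyDepotSystem:
    """Legacy layer: stand-in for an existing back-office system with its own record format."""

    def fetch(self, key: str) -> str:
        records = {"D-1": "D-1;SAP;100", "D-2": "D-2;BMW;0"}
        return records[key]


class DepotBasicService:
    """Integration layer: basic service that encapsulates the legacy system."""

    def __init__(self, legacy: LegacyDepotSystem) -> None:
        self._legacy = legacy

    def holdings(self, depot_id: str) -> dict:
        # Adapter logic: translate the legacy record into a neutral structure.
        _, instrument, quantity = self._legacy.fetch(depot_id).split(";")
        return {"instrument": instrument, "quantity": int(quantity)}


class SellOrderBusinessService:
    """Service layer: business service that orchestrates basic services."""

    def __init__(self, depot: DepotBasicService) -> None:
        self._depot = depot

    def sell(self, depot_id: str, quantity: int) -> dict:
        available = self._depot.holdings(depot_id)["quantity"]
        status = "accepted" if 0 < quantity <= available else "rejected"
        return {"depot": depot_id, "status": status}


def portal_view(result: dict) -> str:
    """Portal layer: presentation only, no business logic."""
    return f"Order for depot {result['depot']}: {result['status']}"


business_service = SellOrderBusinessService(DepotBasicService(LegacyDepotSystem()))
message = portal_view(business_service.sell("D-1", 40))
```

Each layer talks only to the layer directly below it, so the legacy record format is known to `DepotBasicService` alone and could change without affecting the business service or the portal.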
1.3.2
Comparison with State-of-the-art Distributed Architecture Alternatives
To understand what a SOA is, it also helps to understand what it is not. Therefore a brief delimitation from other architecture alternatives is given.
• Client Server: The client-server approach is one of the best established architectures for distributed computing. The approach and the available technology are mature and stable. Systems of this type can be easily implemented and optimized for special application scenarios. On the other hand, once a client-server system is established it is difficult to change its functionality. Its flexibility is therefore limited, and the approach is best used if the business process is expected to be stable and changes are rare (see (Sadoski 1997)).
Heiko Paoli et al.
• J2EE: Java 2 Platform, Enterprise Edition (J2EE) is also a mature technology and widely used for distributed computing in object-oriented worlds. Objects are typically smaller than services in a SOA and they are more closely intertwined with each other. The technology is independent of the operating system, but of course not of the programming language. Objects can be inserted, modified or replaced without stopping and restarting the whole system, so J2EE increases the flexibility for adapting business processes compared to client-server systems. The expected reuse factor in a J2EE environment is as high as expected in a SOA. One main difference between objects of a J2EE architecture and services of a SOA is that distributed objects lack an explicit contract: they only define an interface. From the conceptual point of view objects are passive and can be used by any other object. Objects don’t commit their functionality to other objects like services do. It is possible to implement a SOA with J2EE (see (Modi 2005)).
• P2P: In a true peer-to-peer (P2P) architecture every participant is independent and equal. Central components or services which are used by all other participants should not exist. Every participant provides services to others and also consumes services from them. The whole infrastructure is very dynamic and self-organizing, which leads to a high level of flexibility, but on the other hand reliability cannot be guaranteed. The technology behind peer-to-peer networks is relatively young and mainly used in sharing scenarios. In practice most peer-to-peer architectures use some central components like index servers or registries for better performance, with the disadvantage of introducing single points of failure. Without central services it is very difficult to find the resources needed in a suitable amount of time. P2P networks are well suited for highly dynamic environments (see (Ratnasamy et al. 2000), (Stoica et al. 2001)).
• Multi Agent Systems: Multi agent systems typically consist of many individual units, so-called agents. The most important properties of an agent are that it is autonomous, aware of its environment and pro-active. The system infrastructure provides the communication technology for all agents in the system. The agents provide and consume services, which can be regulated through contracts, so an agent may have to pay for some services other agents provide. On the other hand, agents commit to the services they can provide and define the level of quality. A major difference between agents and services is that the former can ‘decide on their own’: agents are meant to behave like human beings. Therefore it is also possible that agents behave opportunistically, meaning that they may try to cheat others. Multi agent systems can be regarded as a mixture of peer-to-peer architectures and SOA, but because of the complexity of the infrastructure required, they seem to be seldom suited for huge internet applications or B2B integration (see (Peng et al. 1998)).
1.4
Value Proposition of SOA
Having introduced the basic elements, layers and particularities of SOA, we will take a closer look at the benefits this paradigm promises. Technical and economic aspects are introduced separately.
• Flexibility/Loose Coupling/Agility
Loose coupling of services is one of the most important goals of a SOA. Loose coupling is important as it minimizes the probability of having a single point of failure. Every service is as independent as it can be. A service should be able to run in parallel to all other services and should not be affected by failures of other services. If redundant services exist, it should even be possible to switch on the fly to another equivalent service. One major consequence of loose coupling is that services should be stateless, otherwise they are not independent. So normally a service gets all parameters which it needs to fulfil the business functionality at its instantiation. The result of its processing is then delivered to the consumer. One related problem is the difficulty of implementing traditional transaction protocols like a 2-Phase-Commit on the basis of stateless services.
• Application Integration
The design of SOA supports easy application integration. Applications can be regarded as independent services which are used by other services through a connector to the SOA communication framework (Enterprise Service Bus or ESB). The connector translates the data formats, messages and protocols into the common communication model of the ESB. Other services can virtually integrate functionality of applications and provide a combined view of the integrated functionality from all applications as a new composite application.
• Complexity Reduction
Another promise of SOA is the reduction of system complexity. A good indicator of the complexity of a software system is the number of point-to-point connections between all software components used.
SOA separates software functionality into technically oriented functionality, like storing data in databases, and business-related functionality, like providing a new customer account, while the required communication middleware is normally provided by a deployed SOA framework. This separation of concerns removes the need for software components to have many point-to-point connections between each other.
• Reuse
Developing software is complex, error-prone and cost-intensive. Therefore one of the oldest aims in developing software is reusability. In the past many attempts were made to support reuse of software and software components: At the beginning of software development the concept of procedural programming languages was developed. All the functionality was put into procedures and then bundled into software libraries. Reuse is then established by using already developed libraries to build new software. In the era of client-server systems the business functionality was put into the server, where it could be reused by many (different) clients over a network. The next attempt came with the idea of object orientation. Now whole objects instead of procedures could be reused, and with technologies like CORBA or J2EE this is also possible over a network. In a SOA this idea is applied to services, which are now the reusable part. Today over 70% of all data used is managed by mainframe systems (see (Datamonitor 2006)). To access these data in a SOA, suitable connectors are needed for the legacy systems. Today more than a hundred connectors for well established legacy systems are already available for SOA infrastructures.
After discussing the technical value proposition the economic potentials have to be addressed – these are cost savings and the flexibility to innovate or differentiate:
• Costs
One goal of introducing a new technology is the reduction of costs. Potential costs can be divided into total cost of ownership and costs for developing new functionality. With SOA it is expected that the total cost of ownership is reduced because there are fewer connections between the software components and less data redundancy. The SOA enterprise bus is the single, central communication platform that connects all software components (services); this leads to reduced costs for system maintenance.
• Economic Flexibility/Differentiation
The second important value proposition is the ability to easily adapt infrastructure to business challenges. If the requirements of the business change, the IT infrastructure should allow for technical adaptations. Increased flexibility makes it possible to defend or to gain competitive advantage.
The delay in adapting IT infrastructures to business challenges has to be minimized. This goal is supported by SOA because the architecture is designed for flexibility. Normally, new business ideas are realized by implementing new business services, and business services are self-contained and don’t have strong dependencies on each other. Therefore it is possible to develop new business services independently by using the technical basic services. This typically leads to shorter development times at lower costs.
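The statelessness and on-the-fly switching described under Flexibility/Loose Coupling/Agility above can be sketched as follows. This is only an illustration under stated assumptions: the provider functions, the `ServiceUnavailable` exception and the request fields are hypothetical, and a real ESB would handle routing and failover itself.

```python
class ServiceUnavailable(Exception):
    """Raised by a provider that cannot serve the request (hypothetical)."""


def call_with_failover(providers, request):
    """Try equivalent stateless providers in order and return the first result.

    Because each provider is stateless and receives every parameter it needs
    in the request, any of them can answer any call.
    """
    last_error = None
    for provider in providers:
        try:
            return provider(request)
        except ServiceUnavailable as exc:
            last_error = exc
    raise ServiceUnavailable("no provider available") from last_error


def order_value_primary(request):
    # Simulates a failed instance of the service.
    raise ServiceUnavailable("primary instance is down")


def order_value_backup(request):
    # Stateless: the result depends only on the request parameters.
    return request["units"] * request["price"]


value = call_with_failover(
    [order_value_primary, order_value_backup],
    {"units": 3, "price": 10},
)
print(value)  # 30
```

If the providers kept conversational state between calls, the switch to the backup would lose that state, which is why the text ties loose coupling to statelessness.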
1.5
Case Study Union Investment
1.5.1
Introduction to the Use Case Company
Union Investment (UI) is tightly integrated into the co-operative sector, where the local ‘Volks- und Raiffeisenbanken/Sparda-Banken’ serve the local clients (B2C) and two central banks, ‘DZ Bank’ and ‘WGZ-Bank’, provide national and international services for the Volks- und Raiffeisenbanken. A group of companies (‘Verbundunternehmen’) is responsible for offering financial products: the case company Union Investment is responsible for funds, whereas Bausparkasse Schwäbisch Hall concentrates on real estate savings/loans, R+V on insurance and so on. These companies are mainly owned by the central banks, which themselves are owned by local banks. Three computing centres (GAD eG, Fiducia IT AG and Sparda Datenverarbeitung (SDV)) provide IT services including banking applications for the local banks and their individual IT departments. The local banks – about 1,200 with 12,000 subsidiaries, more than 40,000 sales people and over 20 million clients – are independent in their decision to sell UI funds or products of competitors. Hence there is high pressure on UI to have a competitive product portfolio, an attractive fee structure and appropriate services concerning information, consulting/presales, after sales services and so on. The co-operative sales structure of UI differs from that of other German or European asset managers (AM). The main difference is that UI sells its funds only through the local bank channel and does not actively approach clients inside or outside the co-operative sector. UI supports the local sales channel with massive public relations efforts (TV, newspapers, ...) for the strong UI brand. Regarding assets under management UI ranks 3rd in Germany with EUR 137 billion (a step up from number 5 six years ago). Nearly all German AM offer depot services/client accounting for their customers. UI keeps around 7 million deposits for 4 million clients. 90% of the transactions originate from the B2B local bank channel; around 10% arrive through the direct/online channel from the client (these clients are also registered at a local bank and had contact with the local bank when they opened their account).
External B2B business processes like account opening or subscription/redemption of funds are key factors for UI. The fluctuating number of transactions over the day, week, month and year and the high influence of the overall stock market require a very efficient back office to handle huge numbers of faxes and letters (up to 40,000 fax pages per day have been reached, plus similar numbers of letters). UI tries to maximize the level of online transactions and to use a straight-through processing (STP) approach. The local banks use standard desktops and full service back office banking applications from their regional computing centre partners. Each computing centre typically runs different front and back office systems and uses different technological principles (3270, Java, HTML, etc.). The seamless integration into this external world is crucial for UI:
1. 90% of the transactions result from the B2B local bank side,
2. all required services have to be available at the local bank desktop and through web/internet front-ends,
3. highly standardized interfaces are required to keep development and maintenance manageable.
Besides the external B2B interface there are also the following B2C and internal interfaces which require the same or similar transactional services:
• UI call centre system for the agents servicing B2B and B2C calls,
• B2C web system for UI private clients.
In total there are six different front-ends, three from external computing centres and three UI-owned ones for B2B Web, B2C Web and the call centre. The core functionality as well as the core processes are implemented in the depot management system Daska. The satellite/external systems usually access the core system through service interfaces (see Section 3.2). In 1999/2000 the IT infrastructure of UI consisted of approximately 160 applications with more than 800 interfaces. Quite a number of these interfaces required intense orchestration. The boom of fund sales due to financial market developments and the success of new UI products increased the pressure on the internal IT department to handle ever increasing volumes. The management of IT-based business processes instead of single applications required higher adaptability and a more agile integration strategy. These basic conditions led to the strategic decision to introduce a central EAI platform. The most important objectives were:
• Faster adaptability to business process requirements
• Reduction of the system architecture’s complexity
• Increased interface quality
• Increased level of automation (STP)
• Decreased data replication efforts
• Decreased maintenance efforts
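The complexity objective can be made concrete with a small calculation. The sketch below compares the theoretical upper bound of point-to-point links among the roughly 160 applications with the number of connectors needed if every application is attached once to a central platform. Both formulas are simplifying assumptions (undirected links, exactly one connector per application); only the figures 160 and 800 come from the text.

```python
def max_point_to_point_links(n):
    """Upper bound of undirected point-to-point links among n applications."""
    return n * (n - 1) // 2


def hub_connectors(n):
    """Connectors needed when each application attaches once to a central hub."""
    return n


applications = 160        # from the text (1999/2000)
observed_interfaces = 800  # from the text ("more than 800")

print(max_point_to_point_links(applications))  # 12720
print(hub_connectors(applications))            # 160
print(observed_interfaces / hub_connectors(applications))  # 5.0
```

Under these assumptions the observed landscape averaged about five interfaces per application, far below the theoretical maximum but still five times the number of connectors a single central platform would require.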
The steps UI followed to introduce an EAI platform are described in more detail in the next section.
1.5.2
SOA Implementation at the Use Case Company
The development of UI’s SOA-based EAI infrastructure was realized in a dedicated process. Every phase increased the importance of the EAI platform for UI.
• Pilot Study
The very first step was the analysis of existing interfaces and systems. Core business processes were identified and modelled in ARIS. A total of 17 EAI tools were evaluated against a criteria catalogue that comprised criteria like ‘transaction security with two-phase commit’ and ‘universal tools for all integration levels’, to name just two. After the first check two tools were left that fulfilled UI’s core requirements. After additional nice-to-have criteria had also been considered, Vitria BusinessWare was selected.
• First services roll-outs
In the first EAI projects simple replication interfaces were implemented which transferred data from the depot management system into additional databases to ensure fast transfer to web front-ends. This project served as the proof of concept of the EAI platform. Additionally, new processes and standards (development, deployment process, etc.) had to be developed and adapted. The first services were established in 2002. Through standard interfaces the Volks- und Raiffeisenbanken were given the opportunity to access the depot management system of UI. Now transactions and depot openings can be processed. The process logic as well as the plausibility checks were implemented in the EAI. With this project UI used process integration within the middleware layer for the first time. Furthermore the status of the EAI platform changed to that of a real on-line system.
• Architecture Redesign
With the subproject KOS (consolidation of On-line systems) in 2004/2005 the SOA was completely redesigned. This became necessary because new front-end systems and external interfaces were added to the service interfaces and new complex services had to be implemented. The requirements for the platform changed (a hundred times more requests, 24 × 7 availability, etc.). The goal was to bring real-time data from the legacy back-end system to all connected front-ends. This had the major effect that UI got rid of the replicated databases and reduced the complexity within the EAI. The process logic was centralized and the integration of new external systems is now a simplified task. Every direct access to the database was removed and a service call via CTG (CICS Transaction Gateway) was implemented. The subproject was carried out in several steps:
Step I:
• Construction of the basic components of the SOA
• Implementation of 14 service methods
• Integration of the B2B front-end system into the SOA
• Integration of a database into the SOA
Step II:
• Extension of the SOA to approximately 6 services with 43 methods
• Integration of the B2C front-end system into the SOA
• Integration of the call centre front-ends into the SOA
Step III/IV:
• Merge of the B2B/B2C front-ends into a new, consolidated front-end system
• Extension of the SOA to 12 services with 105 methods
• Integration of the VR banks into the SOA
• New Architecture
The SOA is now divided into two functional blocks: The business services, on the one hand, are structured into customer services and deposit services. The second block encapsulates all back-ends within a technical service (every back-end has its own service implementation, e. g. the Oracle service). The business layer communicates with the back-end systems through the technical layer; the technical layer is transparent for the business layer and the front-ends. So UI is able to change the back-end systems without any implication for the front-ends. The methods within the services are classified into process and basic methods. Process methods encapsulate the business process and control the orchestration of one or more basic methods. The integration of a front-end system is based on process methods. The basic methods are rather lean and re-usable. Basic methods are used by the process methods or for internal functions. All methods are administered through the service repository (SR). The SR controls the status of every method (online/offline) and the dependencies between process and basic methods. In addition the SR implements an access control list for all known front-ends, which controls the set of methods each front-end may access.
The heart of the SOA is the service engine. Since the consolidation, the legally binding data is accessed on the host and no longer in replicated databases; hence an abstraction layer was implemented through CICS services. All inquiries of the SOA to the host are handed over via the CICS Transaction Gateway to CICS and are answered by the suitable programs. This makes it possible to use the whole logic on the host and to implement new logic without integrating SQL statements or business logic into the SOA. Communication with the outside world is done via XML structures. The goal of the service engine (SE) was to allow the interface between CICS and XML to be designed instead of implementing each mapping in code. The SE implements the technical error handling and an XML-based mapping description which is used to transform the XML into CICS and vice versa at runtime.
This reduces the implementation costs for a new service. The platform supports on-the-fly deployment of service configurations and detects changes in the service repository by checking it cyclically. All requests and replies are logged in a database. Approximately 200,000 requests per day, such as buy/sell orders for funds, are processed by the platform. The new architecture consists of the following components:
a. Service adaptor: Converts client-specific requests into internal requests; supported technologies include HTTP, MQ-Series, etc.
b. Dispatcher: Handles the processing of the requests and uses the service repository. The dispatcher is part of the enterprise service layer and part of the system integration layer.
c. Service repository: Contains information about services (name, type, access rights, etc.).
d. Enterprise service: Enterprise services contain the business logic and use integration services to access the legacy systems.
e. Integration service: Handles the access to the legacy systems.
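How a dispatcher and a service repository of this kind might interact can be sketched as follows. This is a toy model, not UI's implementation: the class and method names, the front-end identifiers and the sample methods are all hypothetical. It only mirrors the properties named in the text, namely an online/offline status per method, dependencies between process and basic methods, and an access control list per front-end.

```python
class AccessDenied(Exception):
    """Raised when a front-end is not allowed to call a method."""


class MethodOffline(Exception):
    """Raised when a method or one of its dependencies is offline."""


class ServiceRepository:
    def __init__(self):
        self._methods = {}  # method name -> {"func", "online", "uses"}
        self._acl = {}      # front-end -> set of allowed method names

    def register(self, name, func, uses=()):
        self._methods[name] = {"func": func, "online": True, "uses": tuple(uses)}

    def set_online(self, name, online):
        self._methods[name]["online"] = online

    def allow(self, front_end, *names):
        self._acl.setdefault(front_end, set()).update(names)

    def _require_online(self, name):
        entry = self._methods[name]
        if not entry["online"]:
            raise MethodOffline(name)
        for dependency in entry["uses"]:
            self._require_online(dependency)

    def lookup(self, front_end, name):
        if name not in self._acl.get(front_end, set()):
            raise AccessDenied(f"{front_end} may not call {name}")
        self._require_online(name)
        return self._methods[name]["func"]


class Dispatcher:
    def __init__(self, repository):
        self._repository = repository

    def dispatch(self, front_end, method_name, params):
        return self._repository.lookup(front_end, method_name)(**params)


# Basic methods: lean and re-usable.
def find_customer(customer_name):
    return {"id": "C1", "name": customer_name}


def place_order(customer_id, fund, units):
    return {"customer": customer_id, "fund": fund, "units": units, "status": "accepted"}


# Process method: orchestrates basic methods.
def buy_fund(customer_name, fund, units):
    customer = find_customer(customer_name)
    return place_order(customer["id"], fund, units)


repository = ServiceRepository()
repository.register("find_customer", find_customer)
repository.register("place_order", place_order)
repository.register("buy_fund", buy_fund, uses=("find_customer", "place_order"))
repository.allow("b2b_web", "buy_fund", "find_customer")
repository.allow("call_centre", "find_customer")

dispatcher = Dispatcher(repository)
order = {"customer_name": "A. Example", "fund": "FUND-1", "units": 5}
result = dispatcher.dispatch("b2b_web", "buy_fund", order)
print(result["status"])  # accepted
```

Taking a basic method offline with `repository.set_online("place_order", False)` makes every process method that depends on it unavailable as well, which is one way to read the dependency tracking attributed to the SR.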
1.6
Discussion
The benefits of the SOA approach that were realized at UI are now discussed in more detail.
• Flexibility/Loose Coupling/Agility
The main advantage of the new SOA for UI is the elimination of replication interfaces. This avoids data inconsistency and improves quality and response times. The business logic is decoupled and located only on the host, whereas it was distributed among several layers prior to the SOA introduction. In principle, changes of business logic and/or technical services should have become easier, faster and cheaper. Choosing the correct service for adaptations is still not always straightforward because of the high number of existing services. Once the service to revise has been found, most adaptations can be made by changing the configuration of its description, in contrast to changing the Java code itself as before. As a consequence these changes no longer require a compilation of the service implementation. They can be performed by staff other than IT developers. Loose coupling of the components was required for the orchestration of the services, but it is also UI’s strategy for improving the stability and reliability of the whole system. Most of the services (business and technical) depend on additional technical services. So far, it is not entirely clear whether these dependencies are really necessary or could at least be minimised by restructuring the technical service layer itself. These dependencies lead to complex release planning. Regarding the reliability of the whole infrastructure, it was noticed that availability increased after the SOA introduction. However, risk increased as well: with the service repository/the service dispatcher the architecture now has a new single point of failure. Prior to the introduction of the SOA, redundancy avoided single points of failure. For critical components additional investments in fallback solutions and load balancing became necessary.
One surprising result of the new architecture is performance. As XML is used intensively in the new system (which entails a lot of serialisation, de-serialisation and transformation effort for this text-based format), lower performance was expected. Surprisingly, the performance was ten times higher than in the old system.
• Application Integration
With the introduction of SOA seven applications have been consolidated and integrated. It became easier to build new application layers consisting of composite applications. Because of independent efforts in enterprise application integration, a prototype had already been established before the full roll-out of the final solution. Hence, these independent strategies fitted together nicely and alternatives for application integration weren’t discussed.
• Complexity Reduction
The centralisation of the business logic in the depot system reduced the interface of the SOA to its core capabilities. Multiple development efforts for ensuring business plausibility were no longer needed. Inside the SOA technical aspects and functions are wrapped and can be orchestrated through business processes. Hence, the complexity of the technical services and business plausibility is hidden from the user. It is now possible to add new front-ends and new services at minimal costs. On the other hand, additional effort for coordinating and testing is needed. Modifications to already installed and utilised services or basic functionalities are sensitive. Therefore, advanced planning prior to modifications and complex tests afterwards became necessary and have to be coordinated by central release planning. The total number of software components has more or less remained the same.
• Reuse
When a huge number of software projects use the same platform, reusability of components becomes an important issue. Every project can profit from the availability of service tools and service components. The first step for every new EAI project is to examine whether the existing components can be utilised at different points or whether different components/services have to be implemented. Analyses at the beginning of the SOA introduction showed that reusing the already installed software components from the old system was not feasible. Hence most functionality has been developed from scratch; one exception was the communication interfaces to the B2B partners. The newly developed components will be easily reusable in the further development of the system. During the last years many components, modules and service tools have been created which accelerate the development of new projects. Some of these components are:
• error processing: centralized, database-based error handling,
• deployment operations: fully automated, staging (E-T-A-P) support,
• data base XSL transformer: centralized service, can read XSLT documents dynamically from data base or file system,
• ...
In the beginning the re-use factor was rather low and limited to some source code or parts of the framework. With the introduction of basic functions and the SOA, the ratio of ‘newly created software parts’ to ‘re-used software parts’ improved a lot. Nearly 80% of the basic services can be re-used without customization. The rest is split into basic services where minor adaptations are necessary and a small part of new functions and services.
• Economic Aspects
Besides the technical observations, economic ones can be illustrated.
• Costs: In the case of UI the direct costs of the EAI system have not been reduced after introducing the SOA, but the SOA made it possible to consolidate the front-end systems, where the cost reduction was significant.
UI noticed that developing new functionality through services is not cheaper than adding new functionality to the old architecture. Pure implementation efforts for functionality as a service are lower, but there is increased planning effort because services should be reusable and must conform to the overall SOA strategy. Cost reduction is possible both for operating and maintaining the system, but maintenance became more complex in terms of dependencies on reused services.
• Economic Flexibility/Differentiation: As stated above, it turned out that the SOA itself is more flexible than the old solution, but it was also mentioned that this flexibility is hardly helpful for UI today. At the moment the time needed for implementing new functionality is nearly the same as it was prior to the introduction. This is because major changes of the functionality in the back-end system often also require changes in the front-end. Unfortunately these front-ends are administered by UI partners and are neither controlled by UI nor part of the SOA structure. With the new SOA, users of the system got direct access to real-time data and don’t have to work with replicated data. This increases user satisfaction. Because of the separation between business-oriented services and technical services it is easier to locate which services should be changed or adapted if the business requires this. It is often possible to change only the configuration of a business service instead of changing the implementation by means of programming. A good example of the new flexibility is the implementation of an OCR system which uses existing SOA services like name search or ordering. For the new technology only a few parameters had to be added to the services.
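The idea of changing a configuration instead of code can be illustrated with a mapping description in the spirit of the service engine from Section 1.5.2. The sketch is an assumption-laden simplification: the mapping format, the field names and widths, and the identifiers are invented for illustration, and a plain fixed-width text record stands in for the input area of a CICS program (real systems would also deal with encodings and numeric formats).

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-based mapping description: field order and widths of the record.
MAPPING_XML = """
<mapping record="ORDER">
  <field name="depot" width="10"/>
  <field name="fund" width="12"/>
  <field name="units" width="6"/>
</mapping>
"""


def load_mapping(xml_text):
    """Read (field name, width) pairs from the mapping description."""
    root = ET.fromstring(xml_text.strip())
    return [(f.get("name"), int(f.get("width"))) for f in root.findall("field")]


def xml_to_record(xml_text, mapping):
    """Transform an XML request into a fixed-width record."""
    root = ET.fromstring(xml_text)
    parts = []
    for name, width in mapping:
        value = root.findtext(name, default="")
        if len(value) > width:
            raise ValueError(f"value for {name!r} exceeds {width} characters")
        parts.append(value.ljust(width))
    return "".join(parts)


def record_to_xml(record, mapping, tag="order"):
    """Transform a fixed-width record back into XML."""
    root = ET.Element(tag)
    offset = 0
    for name, width in mapping:
        ET.SubElement(root, name).text = record[offset:offset + width].rstrip()
        offset += width
    return ET.tostring(root, encoding="unicode")


mapping = load_mapping(MAPPING_XML)
request = "<order><depot>4711</depot><fund>FUND-1</fund><units>25</units></order>"
record = xml_to_record(request, mapping)
print(len(record))                     # 28
print(record_to_xml(record, mapping))  # round-trips to the request XML
```

Adding a parameter to the service then means adding one `<field>` line to the mapping description; the two transformation functions stay unchanged.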
1.7
Conclusion
Business requirements drive the requirements of the IT infrastructure. A modern IT infrastructure has to support business processes. In a dynamic environment the IT architecture has to be prepared for continuously changing processes. One of the key success factors of today’s infrastructures is flexibility. The new development paradigm is to assemble new applications from existing components (composite applications). Programmers create services, but architects and business analysts orchestrate services to implement business processes. Therefore there are two closely intertwined levels: the business level and the IT level. In the case of UI the main goals for introducing a new architecture were
• reduction of costs,
• easier maintenance of the system,
• timeliness of data, and
• elimination of redundancy.
The introduced SOA is fast and flexible. The direct costs remained the same, but cost reduction was achieved by using the SOA for front-end consolidation.
The strategic decision to introduce a central EAI platform was mainly driven by requirements like agility and a high STP rate. Vendors of EAI/ESB systems typically promise fast deployment and high return on investment (ROI). UI realized that this only holds true for either build-from-scratch or very simple infrastructures where no legacy systems etc. have to be integrated. Even there the result still seems vague. SOA success requires time and expertise. Whether an economic analysis of the business case turns out positive or negative depends heavily on the technical situation (such as the existing infrastructure with respect to number of interfaces, STP requirements, BPO and so on) and the individual framework of the company (e. g. with respect to existing contracts). In a more or less static or simple environment, where changes are rare, it seems pretty hard to calculate a positive business case. SOA fits perfectly into complex infrastructures with time-dependent linked interfaces and a high rate of change. In such infrastructures data is commonly replicated, and (nearly) identical business services are developed and maintained several times. Minimizing data replication and reusing services will result in cost reductions. Technical challenges have to some extent been transferred into business challenges: management and planning of releases and testing became more complex, and the flexibility gained is not fully used because of external dependencies on the business partners. The new architecture allows more configuration and needs less programming than the old architecture, but a developer is still necessary for changing the configuration. The new architecture is more or less as complex as the old one; a reduction in complexity could not be achieved. But even if the total complexity is the same, the overall structure has been improved.
The old structure has grown historically and consisted of a number of completely different systems together with a lot of workarounds (spaghetti pattern). Now the different subsystems form an integrated structure and follow a common idea. This leads to an interesting new insight and psychological side effect of SOA, the employee ‘like to work’ with the new architecture. Development of the old system often meant to do boring and nasty work whereas the further development of the SOA architecture tends to be interesting and motivating. All in all the introduction of SOA was a success and the new system provides more flexibility, higher performance and reliability at the same costs. From the sight of Union Investment the possibility of cost reduction alone is not the main point for introducing a SOA. To further increase the momentum of the new technology at UI and in general continuous improvement with respects to technical services but also with respect to organisational aspects is needed – areas that themselves are still objects to scientific research: Two of the most important questions might be how to divide new functionality into appropriate services and how to innovate new services at all.
• How ‘big’ should a typical service be? What are the rules for designing a service? Today this question is answered only by rules of thumb, and more experience with SOA is surely needed. In practice most implementations tend to design services to mirror the business process. The driver for service granularity is therefore the business rather than some technical aspect. Artificial rules, such as a maximum number of methods a single service may have before it is split, make no sense. To decide how requirements could or should be divided into services and/or orchestrated, a detailed analysis is recommended. First, it should be ascertained which requirements arise from the business process (top-down approach); second, it has to be decided how these requirements can best be provided by business and technical services (bottom-up approach). At this point it also has to be checked whether services should be reused by systems that are already installed and integrated. Today’s guidelines at UI prefer thin business services with a high reuse rate.
• What can, or even should, service innovation look like? This process surely needs strong communication between a number of organisational units, e. g. business departments and system development. SOA scenarios share a problem with most software development processes, namely that business experts and system experts have to be more closely connected, and that methods, tools and languages have to be available that are easy for both to understand yet powerful enough to be of real help. So far UI has no continuous innovation cycle; there are no regularly planned milestones for checking and refactoring deployed services.
SOA has a high potential for improving companies’ infrastructures.
The momentum that can be reached surely depends on how well the different internal stakeholder groups can be supported and how well technical and economic questions can be addressed simultaneously.
References Bloomberg J (2005) The Four Pillars of Service-Oriented Development, zapthink, http://www.zapthink.com/report.html?id=ZAPFLASH-2005418, as seen on 2006-09-20 Datamonitor (2006) OpenConnect Exposes Mainframe Services by Listening, http://www.computerwire.com/industries/research/?pid=D72EDD51%2DE37 4%2D4F46%2D9904%2DB1C115B08624, as seen on 2006-09-20 Eckwert B, Saft F (2005) Konsolidierung Onlinesysteme – ein SOA-Praxisbericht, Talk at EAJ-Forum, Karlsruhe, 2005-04-20 Empolis (2006) Empolis official Hompage, http://www.empolis.com/, as seen on 2006-09-20 Fu X, Bultan T, Su J (2004) Analysis of Interacting BPEL Web Services, in: Proc. of the 13th Int. World Wide Web Conference, pages 621–630, New York, USA
Garlan D (2000) Software Architecture: a Roadmap, in: ICSE – Future of SE Track http://www.cs.ucl.ac.uk/staff/A.Finkelstein/fose/finalgarlan.pdf Gartner (2005) Gartner’s Positions on the Five Hottest IT Topics and Trends in 2005, David Cearley, D./Fenn, J./Plummer, D., http://www.gartner.com/ DisplayDocument?doc_cd=125868, as seen on 2006-09-20 He H (2003) What Is Service-Oriented Architecture, XML.COM, O’Reilly, http:// webservices.xml.com/pub/a/ws/2003/09/30/soa.html, as seen on 2006-09-20 Huhns M, Singh M (2005) Service-Oriented Computing: Key Concepts and Principles, in: IEEE Internet Computing, vol. 09, no. 1, pp. 75−81 Jones S, Morris M (2005) A METHODOLOGY FOR SERVICE ARCHI TECTURES, OASIS, http://www.oasis-open.org/committees/download.php/ 15071/A%20metho-dology%20for%20Service%20Architectures%201%202% 204%20-%20OASIS%20 Contribution.pdf, as seen on 2006-09-20 Kossmann D, Leymann F (2004) Web Services, in: Informatik Spektrum 27(2): 117−128 Krafzig D, Banke K, Slama, D (2004) Enterprise SOA, Service-Orientated Architecture Best Practices, Prentice Hall Magic Software (2005) EAI Competence Days, Mainz, Germany, http://www.magicsoftware.com Modi T (2005) Splitting Hairs: Web Services Vs. 
Distributed Objects, Web Services Pipeline, http://ie.webservicespipeline.com/57704023, as seen on 2006-09-20 Nickull D, McCabe F, MacKenzi M (2006) SOA Reference Model TC, OASIS http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=soa-rm, as seen on 2006-09-20 Peng Y, Finin T, Labrou Y, Chu B, Long J, Tolone W, Boughannam A (1998) A Multi-Agent system for Enterprise Integration, in: Proceedings of the 3rd International Conference on the Practical Applications of Agents and MultiAgent Systems (PAAM-98), pages 155−169, London, UK Ratnasamy S, Francis P, Handley M, Karp R, Shenker S (2000) A Scalable Content Addressable Network, Berkeley, CA, USA Sadoski D (1997) Client/Server Software Architectures – An Overview, Carnegie Mellon Software Engineering Institute, http://www.sei.cmu.edu/str/ descriptions/client-server_body.html, as seen on 2006-09-20 Stoica I, Morris R, Karger D, Kaashoek F, Balakrishnan H (2001) Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications, in: Proceedings of the 2001 ACM SIGCOMM Conference, pages149−160 Stojanovic Z, Dahanayake A (2005); Service-Orientated Software System Engineering, Challenges and Practices, Idea Group Publishing Wikipedia (2006) Service-oriented architecture, in the free encyclopaedia (http:// en.wikipedia.org/wiki/Service-oriented_architecture, as seen on 2006-09-20) Zimmermann O, Krogdahl P, Gee, C (2004) Elements of Service-Oriented Analysis and Design: An interdisciplinary approach for SOA project, IBM DeveloperWorks Web Services Zone, July 2004
CHAPTER 2
Information Systems and IT Architectures for Securities Trading
Feras Dabous, Fethi Rabhi
2.1
Introduction
Electronic finance (e-finance) is mainly concerned with the automation of traditional activities in the finance domain and their associated processes. In particular, trading has increasingly become a global activity because recent technological developments have facilitated the instantaneous exchange of information, securities and funds worldwide. Trading decisions have also become more complex, as they involve the cooperation of different participants interacting across wide geographical boundaries and different time zones. Although traditional communication media (such as phone, email and fax) are steadily being replaced by automated systems, there are still many areas that require human intervention. Developments in IT middleware and integration technologies (particularly those based on the Internet) are providing many opportunities for addressing such gaps (Banks 2001). One of the first challenges for software architects and IT professionals is acquiring sufficient knowledge about existing practices, business processes, systems and architectures. This chapter should be regarded primarily as a contribution to the understanding of the domain of capital markets trading, illustrating some of its associated architectures, technologies and systems. The remainder of this chapter is organized as follows. Section 2.2 introduces financial markets in general and reviews the concepts of capital market structures, trading instruments and trading order types. Section 2.3 discusses the information flow across the trading cycle in a capital market. It also addresses the different types of information systems that are in use across such a cycle. Section 2.4 takes a closer look at the software architecture of two selected systems. Section 2.5 discusses an integration model that illustrates the different levels of abstraction currently being used in capital markets information systems. Section 2.6 concludes the chapter.
2.2
Financial Markets: Introduction and Terminology
2.2.1 Introduction to Financial Markets
Financial markets have the role of facilitating the movement of funds among people and organizations. The surplus units (e. g. suppliers of funds) seek to generate returns (profits) by supplying their funds to the deficit units (e. g. borrowers). Financial markets have evolved in ways that enhance the efficiency with which trades take place and maximize the fairness of trading in order to attract investors. The evolution of financial markets is a continuous process in response to changing needs, new technologies and changes in public policy. Financial markets constantly introduce new financial instruments with various levels of risk management, which give market participants more confidence to conduct trades. Financial markets are classified into primary and secondary markets. The former are used by deficit units such as businesses, governments and households to raise funds for expenditure on goods, services and assets. The latter are used by surplus units such as investors to trade what has been bought in the primary market. Financial markets are also classified into money markets and capital markets (Viney 2000). Money markets bring market participants together to trade market-supported instruments offering high levels of liquidity. Australian money market participants include banks, superannuation funds, money market corporations, fund managers, building societies, credit unions, cash management trusts and companies. Money market instruments include exchange settlement account funds, treasury notes, intercompany loans, interbank loans, etc. Capital markets, on the other hand, operate both a primary and a secondary market. Companies that need to raise capital by offering new stocks use the primary market. These new stocks can include initial public offerings (IPOs), rights issues, placements and company options.
Once those newly issued stocks are sold in the primary market, they can continue to be traded through the secondary market, called the stock exchange market. Capital market participants include traders (e. g. investors), intermediaries (e. g. brokers), statutory authorities (e. g. securities commissions) and third parties such as commercial information vendors (e. g. Bloomberg or Reuters). The market structure is determined by the trading rules and trading systems used by that market. These rules and systems determine who can trade, what to trade, when, where and how they can trade. They also determine rules governing market information dissemination to traders and other market participants (Harris 2003). Understanding market regulations, mainly trading rules and systems, is crucial for conducting trades. Normally, trading strategies that are applied to one marketplace
are not applicable to most other markets that have different regulations and structures. Such differences stem from differing views and research findings on how to promote the fairness and efficiency of a marketplace. A framework for market characterization has been defined by Aitken et al. (1997). The framework has six crucial elements: technology, regulation, information, participants, instruments and trading protocol. It also presents four characteristics that determine market fairness and efficiency: liquidity, transparency, volatility and transaction costs. Market organizers have the role of finding the combination of the six elements that achieves the required levels of fairness and efficiency. In the remainder of this section, we illustrate key elements of market microstructure: trading instruments, types of trading orders, and types of market microstructure.
2.2.2 Capital Markets Trading Instruments
Capital markets have evolved by introducing a number of trading instruments that diversify investment opportunities and increase the markets’ attractiveness. These instruments can accommodate different trading styles for investors. They range from trading physical stocks (i. e. shares) to contracts on the willingness to trade stocks on a future date at a price determined today.
2.2.3 Equity Market
The equity market, commonly known as the share market, plays the role of both primary and secondary capital market. It is the place where financial assets are bought and sold. These assets can be new stocks issued in the primary market or existing stocks that are currently traded in the secondary market. Secondary market transactions do not directly affect the cash flow of the corporation whose shares are being traded. The cash flow of that corporation is directly affected when it issues new stocks in the primary market. However, it has been suggested that the existence of a well-developed secondary market is of great significance to a corporation that may be seeking to raise capital in the primary market (Viney 2000). The existence of an active, liquid, well-organized market in existing shares adds to the attractiveness of acquiring new shares once they are advertised in the market. Figure 2.1 (based on (Viney 2000)) summarizes the flow of capital and ownership rights of stocks in secondary capital markets in Australia and shows the participants’ roles in that process.
Figure 2.1. Roles of secondary capital market participants and the flow of funds in Australia
2.2.4 Derivatives Market
The derivatives market provides instruments that focus on risk management (interest rate, foreign exchange and price risks). These instruments, such as options and futures, derive their value from the underlying security in the equity market or, for index-based derivatives, from the value of a specified index. The unit of derivatives trading is a contract. Once the contract is written, it can be traded in the derivatives market at a price that depends on the movement of the underlying stock price or the underlying index value. Each contract is some variant of an obligation, or a right without obligation, to trade (sell/buy) the underlying stocks at a predetermined price on or before a predetermined date, at some cost or at no cost. Derivatives investors may be obliged to pay or receive margins at regular intervals during the life of the contract. Organized derivatives can also be based on an equity market index (index derivatives), which gives the investor exposure to a certain market index. The advantage here is exposure to the broad range of securities comprising an index rather than being limited to one particular company. Index derivatives are normally cash settled.
2.2.5 Types of Orders
There are a number of standardized orders that investors can use to inform their brokers about the order specification, and that traders or brokers can use to inform the market about the intention to sell (ask order) or buy (bid order). Some markets may not support certain types of orders; in this case, brokers may offer them as a service to their clients. Brokers may also support a more complicated order specification service. In both cases, the broker service implies that the order is executed as soon as all the requirements and conditions set by the trader are satisfied. Harris (2003) presents a number of common order types, some of which are:
• A market order is an instruction to trade at the best price currently available. This type of order is traded immediately, but the trading price is not fixed before trading.
• A limit order is an instruction to trade at a specified price or better. The price of a bid limit order is normally equal to or less than the current best bid price, while the price of an ask limit order is normally equal to or greater than the current best ask price. Such an order is traded only when the market price moves towards it.
• Any order can carry additional information such as validity, expiration, etc. Some of the validity and expiry instructions are: open-order, day-order, good-this-week, good-this-month, good-until, immediate-or-cancel, fill-or-kill, good-after, market-on-open and market-on-close. Quantity instructions may also be included, such as all-or-nothing or minimum-or-none orders. Other instructions may also be attached, such as hidden-volume (undisclosed) orders and substitution orders.
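The distinction between market and limit orders can be illustrated with a small data structure. The following is only a sketch under stated assumptions: the names `Side`, `Order` and `can_trade_at` are hypothetical and do not come from Harris (2003) or from any exchange interface, and the validity instruction is carried as a plain label without being enforced.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Side(Enum):
    BID = "bid"  # intention to buy
    ASK = "ask"  # intention to sell


@dataclass(frozen=True)
class Order:
    """Hypothetical order record; field names are illustrative only."""
    side: Side
    quantity: int
    limit_price: Optional[float] = None  # None means a market order
    validity: str = "day-order"          # label only, not enforced here

    @property
    def is_market(self) -> bool:
        return self.limit_price is None


def can_trade_at(order: Order, price: float) -> bool:
    """Return True if `order` may trade at `price`.

    A market order accepts any price. A limit order trades only at its
    limit price or better: lower for a bid, higher for an ask.
    """
    if order.is_market:
        return True
    if order.side is Side.BID:
        return price <= order.limit_price
    return price >= order.limit_price
```

For instance, a bid limit order at 10.0 may trade at 9.9 but not at 10.1, whereas a market order accepts either price.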
2.2.6 Classifications of Marketplaces
There are two main types of trading session: continuous and call. A continuous session allows brokers and traders to place orders at any time during the trading session, and the exchange executes trades as soon as a bid/ask match occurs. In a call session, by contrast, brokers and traders can also place orders at any time during the session, but only until the market is called. A classification of market microstructures (Harris 2003) shows the following types of markets:
• Quote-driven dealer markets: in this market type, dealers participate in each trade by quoting bid and ask prices when requested by brokers or traders. Traders request quotes from and negotiate with a number of the security’s dealers and normally select the dealer with the best quote that suits their order. If the submitted order is not matched immediately, it is processed according to its type and its routing instructions; for instance, it may be cancelled or may wait as an outstanding order. Outstanding orders can be matched through a negotiation process in which the dealer moves his offered price to match the outstanding order and/or the trader updates his order price to match the dealer’s price. NASDAQ, the London Stock Exchange (LSE) and the Reuters200 foreign exchange trading system are examples in which the quote-driven market is the dominant structure, organized by a dealer association and an exchange.
• Order-driven (auction) markets: these include oral auctions (open outcry), single-priced auctions, continuous rule-based two-sided auctions and crossing networks. In this type of market, traders’ orders for any particular security are centralized in a single orderbook, where buyers seek the lowest price and sellers seek the highest price. An order-driven market uses order precedence rules to match buyers to sellers and trade pricing
rules to price the resulting trades. The primary order precedence rule in all markets is the order price: higher bids and lower asks have higher priority for matching. Other precedence rules vary between markets and are used when a number of orders have the same price. These rules include placement time, where earlier submitted orders have higher priority; order size, where smaller or larger order volume has higher priority; public order precedence, where public traders have higher priority than floor traders; and display precedence, where disclosed orders have higher priority than undisclosed orders. Continuous auction markets, which maintain a real-time orderbook of outstanding orders for all securities, are used in most stock exchanges, for example in the ASX continuous trading session.
• Brokered markets: in such markets, brokers take over the exchange’s order-matching role under certain circumstances, which is normally called crossing. The broker’s role is to search for one or more counterparty traders for a trading request. This normally occurs for very large trading orders, which are called block orders. In a dealer market, the dealer sometimes refuses to conduct risky trades because of expected abnormal price movements or because the security is illiquid. In auction markets, block orders can have an impact on the market price, especially if the security is illiquid, and this may cause the market price to move away from the quoted price before the block order is matched. The ASX allows brokers, under some conditions, to arrange trades for block orders through in-market and off-market crossing trading rules.
• Hybrid markets: these combine characteristics of the previous three market types. Normally, the structural features of one market type are dominant, while some features of other market types are embedded. The NYSE, which is classified as an order-driven market, requires its specialist dealers to offer liquidity if no one else does.
The NASDAQ, which is classified as a quote-driven dealer market, requires its dealers to display or execute limit orders.
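The price and placement-time precedence rules of an order-driven market can be made concrete with a minimal sketch of a continuous two-sided auction for a single security. Everything in it is an assumption made for illustration: the class name `OrderBook`, the restriction to limit orders, and in particular the trade pricing rule (each trade is priced at the resting order's price), which real markets may define differently.

```python
import heapq
import itertools


class OrderBook:
    """Hypothetical single-security orderbook with price-time precedence."""

    def __init__(self):
        self._bids = []  # heap of (-price, sequence, [remaining quantity])
        self._asks = []  # heap of (price, sequence, [remaining quantity])
        self._seq = itertools.count()  # placement time: lower is earlier

    def submit(self, side, price, qty):
        """Submit a limit order; return the resulting trades as (price, qty)."""
        trades = []
        if side == "bid":
            # A bid matches the lowest ask first, as long as ask <= bid price.
            while qty > 0 and self._asks and self._asks[0][0] <= price:
                ask_price, _, rest = self._asks[0]
                traded = min(qty, rest[0])
                trades.append((ask_price, traded))  # assumed pricing rule
                qty -= traded
                rest[0] -= traded
                if rest[0] == 0:
                    heapq.heappop(self._asks)
            if qty > 0:  # the unmatched remainder rests in the book
                heapq.heappush(self._bids, (-price, next(self._seq), [qty]))
        else:
            # An ask matches the highest bid first, as long as bid >= ask price.
            while qty > 0 and self._bids and -self._bids[0][0] >= price:
                neg_bid, _, rest = self._bids[0]
                traded = min(qty, rest[0])
                trades.append((-neg_bid, traded))  # assumed pricing rule
                qty -= traded
                rest[0] -= traded
                if rest[0] == 0:
                    heapq.heappop(self._bids)
            if qty > 0:
                heapq.heappush(self._asks, (price, next(self._seq), [qty]))
        return trades
```

The heap key orders resting orders by price first and by sequence number second, so two asks at the same price are matched in the order in which they were placed. Size, public-order and display precedence are not modelled.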
2.3
Information Systems for Securities Trading
This section describes the information flow across the securities trading process. It also illustrates the different categories of information systems that can be utilized across such a process.
2.3.1 Securities Trading Information Flow
Market information includes real-time trading data and events such as orders, trades, quotes, indexes and announcements. It also includes historic information such as past trading data and companies’ periodic reports. Additional statistical data and analytics may also be available; however, participants such as brokers
Figure 2.2. Flow of trading information within the trading cycle
can generate their own analytics based on raw historic market data. As mentioned earlier, the dissemination of market information is crucial to the fairness and efficiency of a capital market; it therefore contributes to the market’s attractiveness to traders and encourages companies to list on it. Figure 2.2 illustrates the overall information flow across the trading processes for an order-driven market structure, derived from the way the Australian Stock Exchange (ASX) (www.asx.com.au) operates. It takes the shape of a continuous cycle in which we identify five phases: pre-trade analytics, trading, post-trade analytics, settlement and registry. Each phase is described below.
2.3.1.1 Pre-trade Analytics Phase
The trading cycle starts with a pre-trade analytics phase that is normally conducted by brokerage firms. In this phase, analysts use real-time information (trades, orders, volume, etc.) and correlate it with company announcements and historic information (including periodic company reports) to predict price movements with some degree of certainty. The purpose of this phase is thus to support buy or sell decisions in order to make capital gains or avoid capital losses, respectively. Most brokers provide the outcome of their analysis as a service to their clients (investors), normally for an extra fee. Investors can obtain historic information, delayed real-time market information and announcements from the
capital market web site or from third parties such as commercial information vendors, e. g. Bloomberg (Bloomberg.com) and Reuters (www.reuters.com). These services are offered free of charge or for a fee that depends on the level of detail and the timeliness of the data.
2.3.1.2 Trading Phase
The trading phase basically implements a market structure as discussed in the previous section. Buyers and sellers both signal their intentions in the form of orders (the different types of orders were also discussed earlier), and different markets have different methods for determining trades, depending on the order type, the market structure and the degree of automation of the trading process. At the heart of the trading system is the trading engine (TE).
2.3.1.3 Post-trade Analytics Phase
A capital market attracts investors by maximizing the fairness and efficiency of trading. Post-trade analytics focuses on detecting illegal trading behavior based on the available real-time and historic market information. Post-trade analysis can also use this information for other purposes, such as analyzing past and current market behavior and trading patterns and predicting future behavior. A capital market surveillance department is an example of a place where post-trade analytics is conducted. The major responsibility of a surveillance department is to detect illegal trading incidents and to generate real-time alerts for such incidents. A surveillance department should have real-time access to every piece of market information across all stages of the trading process, such as order and trading data, settlement and registry databases, and possibly brokerage firms’ databases and investors’ holdings and personal data. Because of the information privacy measures that are enforced, surveillance departments may not be able to access some of this information. Instead, they partially compensate by implementing intelligent techniques to detect illegal trading behavior.
Because of the heuristic nature of these techniques, surveillance is in constant need of improvement.
2.3.1.4 Settlement and Registry Phases
Settlement and registry is a transaction-based process that transfers securities from the seller’s profile to the buyer’s profile and money from the buyer’s account to the seller’s account. The transfer of securities and money is simultaneous, and it is irrevocable once the settlement process has started. The registry phase completes the settlement process by changing the ownership of the shares in the company’s registry from the seller to the buyer. Several brokers may be involved in a single trade; this happens, for example, when one large bid order in the equity market is matched with a number of smaller ask orders. The settlement process is complicated because it involves transferring money and ownership of securities between the different parties involved.
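The requirement that securities and money move simultaneously can be sketched as an all-or-nothing update. The example below is a simplified illustration, not a description of how any real settlement system works: the function `settle`, the dictionary layout and the use of integer cents are all assumptions. Both legs are validated before any balance is changed, so a failed settlement leaves every account untouched.

```python
class SettlementError(Exception):
    """Raised when a trade cannot be settled; no balances are changed."""


def settle(trade, cash, holdings):
    """Settle one trade so that both transfers happen, or neither does.

    trade:    dict with buyer, seller, security, quantity, price (in cents)
    cash:     dict mapping party -> cash balance in cents
    holdings: dict mapping (party, security) -> number of units held
    """
    buyer, seller = trade["buyer"], trade["seller"]
    security, qty = trade["security"], trade["quantity"]
    value = qty * trade["price"]

    # Validate both legs first so that nothing is changed on failure.
    if cash.get(buyer, 0) < value:
        raise SettlementError("buyer has insufficient funds")
    if holdings.get((seller, security), 0) < qty:
        raise SettlementError("seller has insufficient securities")

    # Money leg: buyer's account -> seller's account.
    cash[buyer] -= value
    cash[seller] = cash.get(seller, 0) + value
    # Securities leg: seller's holding -> buyer's holding.
    holdings[(seller, security)] -= qty
    holdings[(buyer, security)] = holdings.get((buyer, security), 0) + qty
```

A trade in which one large bid is matched against several smaller asks would be settled as several such transfers, one per counterparty.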
2.3.2 Categories of Capital Markets’ Information Systems
We now take a close look at the different types of computer-based systems that participate in the securities trading process. These systems include those that facilitate information dissemination to market participants and contribute to the market’s transparency. Other systems facilitate order routing to and from the exchange and across the trading cycle. Notably, each phase in the trading cycle is supported by one or more computer information systems. In the following classification (based on (Harris 2003)), one system may perform tasks related to more than one category.
2.3.2.1 Information Collection and Distribution Systems
These systems have replaced the clerks (market reporters) who used to observe and record all trading activities. Information collection includes recording all market activities such as orders, cancellations, amendments, trades, quotes and announcements. Most exchanges provide real-time market feeds that can be accessed by outside systems. The distribution of market information is directly related to the market’s transparency. Distribution can occur in real time or on a delayed basis, governed by market regulations and the type of requester. Most markets provide delayed quotes and companies’ historic information as a free service to the public. Real-time services are available only to registered members such as brokers and data vendors. Data vendors, such as Bloomberg and Reuters, often support query and broadcast services for their customers. Query services provide information on demand, while broadcast services provide continuous streams of information as it is received from the participating exchange. Broadcast services include trade reports (ticker tape) and/or quotes (quotation feeds). A broadcast service of delayed quotes is normally available free of charge on the market’s watch board, while query services for delayed quotes are also available free of charge on the Web sites of markets or data vendors.
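The difference between query and broadcast services can be sketched in a few lines. The class `MarketDataFeed` and its methods are hypothetical and do not reflect the interface of any exchange or data vendor: a broadcast pushes every new quote to all subscribers as it arrives, while a query returns the most recent quote only when a client asks for it.

```python
class MarketDataFeed:
    """Hypothetical quote distributor offering broadcast and query access."""

    def __init__(self):
        self._last_quote = {}   # symbol -> most recent quote
        self._subscribers = []  # callbacks that receive every quote

    def subscribe(self, callback):
        """Register a callback for the broadcast (quotation feed) service."""
        self._subscribers.append(callback)

    def publish(self, symbol, bid, ask):
        """Record a new quote and push it to every subscriber."""
        quote = {"symbol": symbol, "bid": bid, "ask": ask}
        self._last_quote[symbol] = quote
        for callback in self._subscribers:
            callback(quote)

    def query(self, symbol):
        """Query service: return the latest quote on demand, or None."""
        return self._last_quote.get(symbol)
```

A delayed service could be layered on top of this by holding quotes back for a fixed interval before publishing them, which the sketch does not attempt.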
2.3.2.2 Trading Engines
The trading engine (TE) is at the heart of an automated trading system. The main function of the TE is to match bid/ask orders using the order precedence rules set by the market regulations. In an order-driven market such as the Australian Stock Exchange (ASX), the TE implements a number of bid/ask order-matching algorithms that depend on the security concerned, the current trading phase (opening, continuous, closing, etc.) and the type of order. Once a security’s best bid price matches its best ask price, an irreversible trade is executed. Trading engines developed in-house are often the choice of large markets; examples are SYCOM at the Sydney Futures Exchange (www.sfe.com.au), SuperDot at the New York Stock Exchange (www.nyse.com) and SuperMontage at NASDAQ (www.nasdaq.com). Other markets choose a third-party trading engine. For example, SWX Swiss and Shanghai Stock Exchange use
OMX’s X-STREAM trading engine (www.omxgroup.com), which will be described in more detail later in this chapter. Another example is OMX’s CLICK trading system, which is used in a number of equity options markets such as the International Securities Exchange and the ASX.
2.3.2.3 Order Routing Systems
Order routing systems transmit orders between the appropriate market participants and systems. Traders use them to send orders to the broker or dealer. Brokers use them to transmit orders to a dealer or an exchange and to report back to the trader about the status of the order. Dealers and exchanges use them to report trades to the brokers and/or traders and to the registry and settlement systems. In the ASX, order routing systems use the SEATS message-based Open Interface (OI). Order routing systems of Australian brokers, such as IRESS IOS (www.iress.com.au), conform to the SEATS OI. In the US, ITS (Intermarket Trading Systems) facilitates trading across different equity markets. Some Electronic Communication Networks in the US also participate in order routing systems.
2.3.2.4 Order Presentation Systems
Order presentation systems present information about orders and quotes to the market participants (e. g. brokers). In order-driven markets, this information is presented on traders’ screens connected to the screen-based trading system. Brokerage firms support their own presentation/visualization systems that receive feeds from order routing systems. For example, the IRESS system has its own facility for presenting orders to brokers, whereas SMARTS (to be described later) has two subsystems called SPREAD and REPLAY, which provide order visualization facilities for market researchers and surveillance personnel. Dealer markets often use messaging systems that allow participants to communicate via messages.
2.3.2.5 Settlement and Registry Systems
Settlement and registry systems facilitate the registration of the trade volume to the buyer and the transfer of the trade value to the seller. We often distinguish between settlement and registry systems. Settlement is about the transfer of money in return for title, and the legal title is then often (though not always) held in what is known as a depository. Settlement and depository functions are often performed by the same organization, which is generally called a central securities depository (CSD). The registry system largely duplicates this information, but it is a specialized function because its role is to facilitate communication between a company and its shareholders (such as voting, annual meetings, and the distribution of annual reports and dividends). This role is becoming increasingly important as issues of corporate governance become more prominent. In the ASX, CHESS is the official settlement system and depository that acts upon receiving trades from
SEATS with the cooperation of the brokers participating in the trade. CHESS is capable of communicating with an issuer-sponsored registry if one of the trade counterparties is not CHESS sponsored.
2.3.2.6 Surveillance Systems
Market surveillance is facilitated by computer systems that try to detect illegal market behavior that violates the market regulations. Surveillance systems should have access to every piece of real-time or historic market information. For example, the Financial Services Authority (FSA) (www.fsa.gov.uk) in the UK is tendering for a supplier to upgrade its SABRE surveillance system. SMARTS (www.smarts-systems.com) is a universal surveillance system that is used in more than 16 countries.
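To give a flavour of what a heuristic alert might look like, the following sketch flags trades whose price deviates sharply from the recent average. It is a deliberately naive, hypothetical rule: the function name, the window size and the 5% threshold are assumptions, and it is not the method used by SMARTS, SABRE or any other surveillance system.

```python
from collections import deque


def price_jump_alerts(prices, window=3, threshold=0.05):
    """Return (index, price) for trades that deviate from the recent mean.

    A trade raises an alert when its price differs from the mean of the
    previous `window` trade prices by more than `threshold` (a fraction).
    """
    recent = deque(maxlen=window)  # sliding window of earlier prices
    alerts = []
    for index, price in enumerate(prices):
        if len(recent) == window:
            mean = sum(recent) / window
            if abs(price - mean) / mean > threshold:
                alerts.append((index, price))
        recent.append(price)
    return alerts
```

A rule of this kind produces false positives and misses subtler patterns, which is consistent with the observation that heuristic surveillance techniques need constant improvement.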
2.4
Case Studies
This section presents two selected systems that participate in the securities trading cycle, namely the SMARTS surveillance system and the X-STREAM trading engine.
2.4.1 SMARTS
SMARTS (www.smarts-systems.com) is a securities market surveillance and information system that provides easy-to-use alert management, reporting and research capabilities for exchanges. The system allows accurate alert algorithms to be created. SMARTS originated in the research labs at Sydney University, where it was developed under the direction of Prof. Mike Aitken and two PhD students, Thomas Jones and Tim Cooper. After the involvement of Computershare Ltd in 1996, the tool changed its focus from research to surveillance and compliance. SMARTS is now used by many exchanges, including Jakarta, Hong Kong, SWX Swiss Exchange, ASX, Tokyo Commodities Exchange, the Stock Exchange of Thailand, Singapore and Moscow.
The SMARTS architecture comprises the following important elements:
1. A number of application components that perform surveillance and research functions on real-time and historic financial data.
2. A data model called MPL, which stores all real-time and historic market information in an optimized manner.
3. A number of software modules (called engines) for the creation and manipulation of MPL data. The most commonly used are the tracking and alerting engines.
4. The Alice language, which is used to specify the rules and conditions that can cause alerts to be issued.
2.4.1.1 Application Components
SMARTS comprises a number of application components (or modules) which carry out predefined and autonomous tasks on behalf of users. There are different types of components. The surveillance-oriented components are: ACCESS (a benchmark visualization tool depicting the normal behavior of trading metrics representing liquidity, volatility, transparency and transaction costs), REPLAY (replays historical order and trade data), SPREAD (an interactive graph of trading behavior on price and time axes), ALDIT (Alice language editor), ALMAS (Alice Management System), and COLLUDE (depicts trading activity between brokers). Besides real-time detection of illegal behavior in capital markets (e.g. insider trading and market manipulation), SMARTS can also be used as a platform for capital market analysis and research. The analysis/research-oriented components are: EVENTS (market research), FACTS (market statistics reports), INFORM (an information editor for news such as company announcements), PORTFOLIO (visual representations of trading metrics), and REPORT (lists of transactions, such as trades and orders, pertaining to one or more securities, houses, traders and/or clients).
2.4.1.2 SMARTS MPL Data Model
The MPL data model manages all information relevant to SMARTS application components. As Figure 2.3 shows, there are two types of data: transient data, which is held in main memory, and persistent data, which is held on disk. The
Figure 2.3. SMARTS architecture
MPL data model can manage and process real-time transactions as they arrive from the trading engine. It also provides the infrastructure required for monitoring multiple markets (either different trading engines or different types of markets, such as options and futures). The MPL data model manages the following market-relevant data and concepts: trades, orders, quotes, historic indexes, interest rates, dividends and stock splits, static data such as company names, shares on issue and index weightings, special fields for options, futures, bills and bonds, broker house or trader cash positions, inventories of holdings, and news announcements. One feature of MPL is that new data (i.e. market transactions) is added to the track data (FAV format) in append mode only. This “never delete anything” policy preserves a full history of events so that, at any time, a surveillance department can regenerate the exact conditions as they occurred in the past for research and investigation purposes. Because data is only ever appended and never deleted, redundant copies can safely be created in different formats and with different indexes to meet performance needs. Examples are AM data (indexed by security, containing the best bid/ask, trade information, security status control information and liquidity information for each trading day) and Daytot data (daily summaries indexed by date, including open, close, maximum, minimum and average price, and turnover). MPL comprises a set of APIs that can be invoked from SMARTS application components, from C programs, or through Alice language commands. At the lowest level, MPL data is stored in proprietary binary formats that provide efficient access and storage.
2.4.1.3 MPL Manipulation Engines
The alerting engine is the most significant part of any surveillance system. This engine implements the specialized alerting language Alice.
Once the alerting engine is started (by the “alertserver” program), it interprets all text alert rules from the companion Alice code into a convenient data structure that directs the alerting engine’s processing for that trading day. Normally, the alerting engine examines only those alert rules that are activated by the type of FAV message being processed, or that are scheduled on a timed basis.
2.4.1.4 The Alice Language
SMARTS uses the in-house programming language Alice for coding alerting rules based on historical or real-time trading data. It is also used for generating reports from the same data. Alice programs make it possible to separate SMARTS functionality cleanly from the alerting rules. The alerting rules (i.e. the Alice program) can be changed at any time to incorporate new or modified rules without recompiling SMARTS. The alerting engine interprets the contents of the respective Alice program to build the in-memory structure that represents those rules. The alerting engine then uses this structure to control
when (i.e. following which events) rules are checked and which rules are checked. An event can be a trading event of a given type (i.e. the FAV message type currently being processed) or a timer event (e.g. every 5 minutes). A converter program needs to be written for every trading engine that SMARTS connects to. This converter generates FAV-formatted messages from the trading engine output (trades, orders, control information, announcements, etc.). Information from more than one trading board can be combined in a single FAV file. Conversely, the output of a trading engine can form more than one feed to SMARTS, with the data of each feed collected in a separate FAV file. Normally, order/trade/control data compose one feed, announcements (news) a second feed and index data a third feed.
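The append-only track data, the derived daily summaries and the event-driven checking of alert rules described in Sections 2.4.1.2 to 2.4.1.4 can be illustrated with a minimal Python sketch. It is not SMARTS code and does not use the FAV format or the Alice language: the `Message` fields, the `Track` and `AlertEngine` classes and the `large_trade` rule are all hypothetical names chosen for illustration. Timer-driven rules are omitted for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One hypothetical track message (field names are assumptions)."""
    day: str        # trading day
    kind: str       # message type, e.g. "TRADE" or "ORDER"
    security: str
    price: float
    volume: int


class Track:
    """Append-only store: messages are added, never changed or deleted."""

    def __init__(self):
        self._messages = []

    def append(self, msg):
        self._messages.append(msg)

    def replay(self):
        """Return the full history in arrival order."""
        return tuple(self._messages)

    def daily_summary(self, day, security):
        """Derive a Daytot-like summary; it can be rebuilt from the track at any time."""
        trades = [m for m in self._messages
                  if m.kind == "TRADE" and m.day == day and m.security == security]
        if not trades:
            return None
        prices = [m.price for m in trades]
        return {
            "open": prices[0], "close": prices[-1],
            "max": max(prices), "min": min(prices),
            "average": sum(prices) / len(prices),
            "turnover": sum(m.price * m.volume for m in trades),
        }


class AlertEngine:
    """Checks only the rules registered for the type of message being processed."""

    def __init__(self, track):
        self.track = track
        self.rules = defaultdict(list)   # message type -> [(rule name, predicate)]
        self.alerts = []

    def add_rule(self, kind, name, predicate):
        self.rules[kind].append((name, predicate))

    def process(self, msg):
        self.track.append(msg)
        for name, predicate in self.rules.get(msg.kind, ()):
            if predicate(msg, self.track):
                self.alerts.append((name, msg))


track = Track()
engine = AlertEngine(track)
# Hypothetical rule: flag any trade of 10,000 units or more.
engine.add_rule("TRADE", "large_trade", lambda m, t: m.volume >= 10_000)

feed = [
    Message("day-1", "ORDER", "XYZ", 10.0, 50_000),  # not checked: rule is for trades
    Message("day-1", "TRADE", "XYZ", 10.0, 500),
    Message("day-1", "TRADE", "XYZ", 10.5, 20_000),
]
for message in feed:
    engine.process(message)

summary = track.daily_summary("day-1", "XYZ")
```

Because rules live in a table keyed by message type, replacing the table changes the surveillance behavior without touching the engine, which mirrors the separation between SMARTS and its Alice rules. The summary is a redundant view that can always be regenerated from the track.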
2.4.2 X-STREAM
X-Stream, previously called ASTS (Automated Securities Trading System), is a system built and commercialized by Computershare (and recently purchased by OMX of Sweden). It is primarily designed as a trading engine and market management system, but it can also keep track of users’ accounts, forward market information, and provide a personal instant messaging service to users of the system. X-Stream can be customized for different markets and system configurations and is one of the most widely used trading systems in the world.
2.4.2.1 Orders and Instruments
A large number of different instrument types can be traded with X-Stream. It was developed to work seamlessly with securities, options, futures, bonds, swaps, commodities, foreign exchange, etc. As such, there must be a defined standard for how X-Stream treats these different types. With these different types of instruments comes the ability to trade in different ways. An order is an “instruction to a broker/dealer to buy, sell, deliver, or receive securities or commodities that commits the issuer of the ‘order’ to the terms specified.”1
2.4.2.2 The X-Stream Architecture
The X-Stream software architecture is broken up into several components. Figure 2.4 depicts the organization of these components, which are described next.
1 Computershare Technology Services PTY LTD. Programmer’s References for ASTS Client SDK. Version 1.6.2002
Figure 2.4. X-Stream architecture
2.4.2.3 Trading Engine
The Trading Engine is the main transaction component; its main function is to take orders and process the order book. It also manages all the user accounts on the system. On start-up, the Trading Engine requests all of the market and system data from the relational database server. It then stores this data in log files that are updated in real time. If the Trading Engine ever crashes, it can be restarted from the data in these log files.
2.4.2.4 Relational Database Server
All of the parameters in X-Stream (such as the number and type of users, which securities to trade, security boards, currencies, etc.) are held in the relational database server. The database server gives the system its flexibility, because all system parameters can be defined easily within the relational database.
2.4.2.5 Backup Trading Engine
The Backup Trading Engine processes the same orders as the Trading Engine. If the Trading Engine fails, the Backup Trading Engine can transparently take over as the primary trading engine. The Backup Trading Engine starts up after the main Trading Engine, so it simply downloads the market information from the Trading Engine’s log files. In all other respects, the Backup Trading Engine is the same as the main Trading Engine.
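The idea of restarting an engine, or bringing up a backup, from log files can be sketched as follows. This is a simplified illustration under stated assumptions, not X-Stream’s actual log format or API: the JSON-lines log, the `TradingEngine` class and its `submit` and `recover` methods are hypothetical, and order matching is left out.

```python
import json
import os
import tempfile


class TradingEngine:
    """Toy engine that logs every accepted order before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.book = {"BUY": [], "SELL": []}   # side -> list of (price, quantity)

    def submit(self, side, price, qty):
        order = {"side": side, "price": price, "qty": qty}
        # Write to the log first, so that the log always covers the in-memory state.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(order) + "\n")
        self._apply(order)

    def _apply(self, order):
        self.book[order["side"]].append((order["price"], order["qty"]))

    @classmethod
    def recover(cls, log_path):
        """Rebuild the order book by replaying the log from the beginning."""
        engine = cls(log_path)
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                engine._apply(json.loads(line))
        return engine


log_file = os.path.join(tempfile.mkdtemp(), "orders.log")
primary = TradingEngine(log_file)
primary.submit("BUY", 10.5, 100)
primary.submit("SELL", 10.75, 40)

# A backup started later (or a restarted primary) obtains the same state from the log.
backup = TradingEngine.recover(log_file)
```

Replaying the log in order is what makes the backup indistinguishable from the primary; any state that is not derivable from the log would be lost on failover.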
2.4.2.6 Governor
The Governor handles fault management for the whole system. After an initialization protocol with the Governor, the two trading engines stay synchronized. If either engine fails, communication between them breaks down and both attempt to contact the Governor; the one that succeeds becomes the only trading engine.
2.4.2.7 Gateway
As many client machines may be connected to the Trading Engine, the Gateway component is used to load-balance the transactions and queries and to distribute all market data to the clients. The Gateway processes the same transactions as the Trading Engine so that it maintains a complete picture of the market.
2.4.2.8 Trader Workplace
The Trader Workplace is the end-user interface to the X-Stream software developed by Computershare. Its functions depend on which user is logged into the terminal. Its main purposes are to display market information, submit orders, allow communication between brokers, submit market-sensitive information, and control and operate a market.
2.4.2.9 Client Developed Software
X-Stream also comes with a complete software development kit, called Comet SDK, for interfacing with all aspects of the system. Any client wishing to develop or modify its own client for the Trading Engine can interface through Comet. The main use of the SDK is to interface to the Gateway as a client application; however, because the SDK is so extensive, a client can connect to any part of the system. It is even flexible enough to allow a different Gateway or Governor to be developed. Software components communicate with each other using the AMP Messaging Protocol, X-Stream’s main messaging protocol for handling all transaction and query information, which is formally described in the message definition language ASN.1 (ISO/IEC 8824-1:2002).
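The arbitration role of the Governor described in Section 2.4.2.6 can be sketched as a first-come, first-served claim. The sketch below assumes a single in-process `Governor` object and a hypothetical `claim_primary` method; the real component communicates over the network using X-Stream’s own protocol, which is not modeled here.

```python
import threading


class Governor:
    """Grants the primary role to the first engine that reaches it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._primary = None

    def claim_primary(self, engine_name):
        # The lock guarantees that exactly one caller sees _primary as None.
        with self._lock:
            if self._primary is None:
                self._primary = engine_name
            return self._primary == engine_name


governor = Governor()
results = {}


def on_link_lost(engine_name):
    """Each engine calls this when it loses contact with its peer."""
    results[engine_name] = governor.claim_primary(engine_name)


threads = [threading.Thread(target=on_link_lost, args=(name,))
           for name in ("engine_a", "engine_b")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

winners = [name for name, won in results.items() if won]
```

Which engine wins is not determined in advance; what matters is that both engines agree on a single survivor, so the market never has two active order books.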
2.5
IT Integration Architectures and Technologies
We now turn our attention to the middleware technologies that are used to facilitate the integration between the different systems described earlier. As these technologies intervene at different levels of abstraction, we first define a layered conceptual model that will be used to classify the different technologies involved.
Figure 2.5. Conceptual integration model for middleware technologies
Figure 2.5 shows three possible levels of integration between two different systems X and Y. The business process level is the highest level in the hierarchy. It offers business services in a way that “hides” the underlying data formats and middleware platforms, and is the appropriate level for defining integrated services that correspond to particular business models. The content (data) level defines the structure and format of the messages exchanged between software components. The transport level defines the communication interfaces between the various software components. We now present a brief review of existing technologies classified in relation to this model.
2.5.1 Transport Level
For security reasons, most financial systems traditionally interact with each other using private or dedicated networks. An example is the SWIFT Transport Network (STN) based on the X.25 protocol. As communication software is non-standard, this is a very expensive solution. Other than for historical reasons, there is less and less interest in integration at this level because of the low-level nature of the corresponding programming interfaces. Even private networks are now moving towards IP-based solutions (Intranets and Extranets). For example, SWIFT has developed a new generation of interactive services (called SWIFTNet) using an IP-based network called the Secure IP Network (SIPN). Besides loosely coupled communication by means of messages and file transfers, systems can also interact with each other in a tightly coupled way using dedicated
Application Programming Interfaces (APIs) accessible through the network protocols. A number of possibilities exist for such interactions such as the use of remote procedure calls and remote objects.
2.5.2 Contents (Data) Level
The content layer provides languages and models to describe and organize information, enabling the participating applications to understand the semantics of content and the types of business documents. The objective of interactions at this layer is to achieve a seamless integration of data formats, data models, and languages (Medjahed et al. 2003). The two major enabling technologies at this layer are Electronic Data Interchange (EDI) and eXtensible Markup Language (XML)-based frameworks. EDI (www.edi-information.com) is well defined and well established in the industry in the sense that it provides both a defined syntax and defined vocabularies for the values of EDI message fields (Bussler 2001). Trading partners exchange business documents via value-added networks (VANs) using EDI standards. EDI is an established technique for B2B integration that focuses on interactions at the transport layer, where VANs handle message delivery and routing. EDI standards also provide a single homogeneous solution for interactions at the content layer, where the set of supported document types is limited. In addition, EDI standards, as currently defined, do not support interactions at the BP layer. EDI was in use long before the emergence of the Internet. It depends on a moderately sophisticated technology infrastructure based on proprietary and expensive VANs and on ad hoc development that requires direct communication between partners. It therefore remains unaffordable for the majority of the business community. The emergence of the Internet as a global standard for data exchange has influenced EDI to expand in many directions. Open Buying on the Internet (OBI) (www.openbuy.org) is an initiative that leverages EDI to define an Internet-based procurement framework, thereby using the freely available Internet instead of expensive VANs.
In parallel with XML developments, a number of international organisations were already working towards the adoption of common standards in the financial area. One such standard is the Financial Information eXchange (FIX) protocol (www.fixprotocol.org) for communicating securities transactions between two parties. Since FIX does not define how data is transported, an XML version of FIX called FIXML (www.fixml.org) has been defined. FIXML facilitates real-time delivery of messages and supports multiple currencies and financial instrument types. It also specifies how trading orders are advertised, acknowledged, placed, rejected and executed, and it can be used to manage settlement instructions between the parties to a trade. Another standard is the Financial Products Markup Language (FpML) (www.fpml.org), a business information exchange standard for the electronic dealing and processing of financial derivative instruments. It establishes a new protocol for sharing information and
dealing in financial derivatives over the Internet. FpML currently focuses on a subset of derivative instruments, such as interest rate swaps and forward rate agreements. Research Information Exchange Markup Language (RIXML) (www.rixml.org) is an initiative to develop an open industry standard for tagging and delivering investment research. It provides a structure for classifying financial research in a way that enables users to define specific sorting, filtering and personalization criteria across all research publishers. In each case, a working group defines a Document Type Definition (DTD) that specifies the XML tags and how they can legally be combined. These DTDs can then be used with XML parsers and other XML tools to rapidly create applications that process and display the stored information in whatever way is required. Other related standards include ISO 15022 (www.iso15022.org), which defines a standard for messages in banking, securities and related financial services. This standard is suitable for inclusion within an XML format. Another is SWIFT’s messaging service, FIN, which runs on its private network (STN) and enables financial institutions to exchange standardized financial messages worldwide.
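To make the content level more concrete, the following sketch encodes a simplified order in the classic FIX tag=value syntax and then renders the same fields as XML. The tag numbers used (8 BeginString, 9 BodyLength, 10 CheckSum, 35 MsgType, 38 OrderQty, 40 OrdType, 44 Price, 54 Side, 55 Symbol) are standard FIX tags, but the message deliberately omits header fields a real session would require, and the XML element and attribute names are illustrative assumptions, not the FIXML schema.

```python
import xml.etree.ElementTree as ET

SOH = "\x01"   # FIX field delimiter
TAG_NAMES = {"35": "MsgType", "55": "Symbol", "54": "Side",
             "38": "OrderQty", "40": "OrdType", "44": "Price"}


def checksum(data: bytes) -> str:
    """FIX CheckSum: byte sum modulo 256, as three digits."""
    return f"{sum(data) % 256:03d}"


def build(fields, begin_string="FIX.4.2"):
    """Frame body fields with BeginString, BodyLength and CheckSum."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    message = head + body
    return message + f"10={checksum(message.encode('ascii'))}{SOH}"


def parse(message):
    """Split a message into (tag, value) pairs."""
    return [tuple(field.split("=", 1)) for field in message.split(SOH) if field]


def is_valid(message):
    """Recompute the checksum over everything before the trailer field."""
    trailer_start = message.rindex(SOH + "10=") + 1
    declared = message[trailer_start + 3:-1]
    return checksum(message[:trailer_start].encode("ascii")) == declared


def to_xml(message):
    """Render known fields as attributes of a hypothetical <Order> element."""
    root = ET.Element("Order")
    for tag, value in parse(message):
        if tag in TAG_NAMES:
            root.set(TAG_NAMES[tag], value)
    return ET.tostring(root, encoding="unicode")


# 35=D is a new single order; 54=1 means buy; 40=2 means limit order.
order = build([("35", "D"), ("55", "XYZ"), ("54", "1"),
               ("38", "100"), ("40", "2"), ("44", "10.5")])
order_xml = to_xml(order)
```

The two representations carry the same content; what differs is the syntax that the receiving application must understand, which is exactly the concern of the content level.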
2.5.3 Business Process Level
The business process layer is concerned with the conversational interactions among the activities that comprise a BP. The objective of interactions at this layer is to allow autonomous and heterogeneous partners to come online, advertise their terms and capabilities, and engage in peer-to-peer interactions with any other partners (Medjahed et al. 2003). Interoperability at this higher level is challenging because it requires an understanding of the semantics of partner BPs. A few approaches have emerged to facilitate BP interactions. We focus here on two popular ones, namely workflows and Web services.
2.5.3.1 Workflows
Currently, BPs within an organization are integrated and managed using ERP systems such as SAP R/3, Baan or PeopleSoft, using workflow systems such as IBM’s MQ Series, or manually on demand. Many Workflow Management System (WFMS) products are available, each proposing a different specification, which makes them hard to replace and ineffective for the needs of B2B interactions (Medjahed et al. 2003). For this reason, a workflow reference model has been defined (Hollinsworth 1994), which has motivated standardization efforts such as jointFlow, the Simple Workflow Access Protocol (SWAP) (Bolcer and Kaiser 1999) and the Wf-XML message set (www.wfmc.org). However, these efforts provide little support for inter-enterprise BPs. The purpose of inter-enterprise workflows is to automate BPs that interconnect and manage communication among disparate systems. An Inter-Enterprise Workflow (IEWf) is a workflow consisting of a set of activities that are implemented by different enterprises; that is, it represents a BP that crosses organizational boundaries. Delivering
the next generation of IEWf systems is attracting attention, with the objective of interconnecting cross-organizational BPs while supporting the integration of diverse users, applications, and systems (Yang and Papazoglou 2000). However, no current approach seems to be used extensively in the financial sector.
2.5.3.2 Web Services
The Web services model (Alonso et al. 2004) is emerging steadily as a loosely coupled integration framework for Internet-based e-business applications. Unlike traditional workflows, Web services support inter-enterprise workflows in which BPs across enterprises interact in a loosely coupled fashion by exchanging XML document-based messages. The idea of the Web services model is to break down Web-accessible applications into smaller services. These services are accessible through electronic means, namely the Internet. They are self-describing and provide semantically well-defined functionality that allows users to access and perform the offered tasks. Such services can be distributed and deployed over a number of Internet-connected machines. A service provider can describe a service, publish it and allow it to be invoked by interested parties. A service requester may look up the location of a service through a service broker, which also supports service search. There are four emerging de facto standards for Web services. The first is the Simple Object Access Protocol (SOAP) (www.w3.org/SOAP), an XML-based protocol for exchanging document-based messages across the Internet. The second is the Web Services Description Language (WSDL) (www.w3.org/TR/wsdl), a general-purpose XML-based language for describing the interface, protocol bindings and deployment details of Web services. The third is Universal Description, Discovery and Integration (UDDI) (www.uddi.org), a set of specifications for efficiently publishing and discovering information about Web services.
The last is the Business Process Execution Language for Web Services (BPEL) (www.siebel.com/bpel), which emerged from a combined effort to merge Microsoft’s XLANG and IBM’s WSFL. The Web services model is not yet mature enough for wide acceptance, but it tries to leverage the three-layer interaction framework. At the transport layer, SOAP is used as a messaging protocol. At the content layer, a WSDL-based XML schema is used. At the business process layer, BPEL has yet to evolve to cope with complex inter-enterprise interaction issues. In a previous study (Dabous 2005), we showed with examples that some of the functionality of capital market systems can be reused to help automate many of the domain’s business processes. We therefore consider each such system as consisting of one or more e-services that can facilitate the enactment of domain business processes. These business processes relate to trading activities across the trading cycle discussed earlier. They include, but are not limited to, the placement of orders, online settlement and registry once a trade is matched, real-time surveillance and real-time information dissemination (announcements, quotes, trades, etc.).
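As an illustration of document-based messaging with SOAP, the sketch below wraps an order placement request in a SOAP 1.1 envelope and reads it back on the receiving side. The envelope namespace is the standard SOAP 1.1 one; the service namespace `urn:example:trading`, the `PlaceOrder` operation and its parameters are hypothetical and would in practice be fixed by the service’s WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
ORDER_NS = "urn:example:trading"                         # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{ORDER_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{ORDER_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Extract the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}", 1)[1]      # strip the namespace part
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


request_xml = build_request(
    "PlaceOrder", {"symbol": "XYZ", "side": "BUY", "quantity": 100, "price": 10.5})
operation, params = read_request(request_xml)
```

The receiver needs only the XML document and the agreed schema, not the sender’s platform or programming language, which is what makes the coupling loose.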
Figure 2.6. High level service oriented architecture for securities trading
In (Rabhi and Benatallah 2002), a preliminary service-oriented architecture for the integration of capital market financial services was proposed. In this architecture, each software component provides either a basic service or a composite (integrated) service. In a basic service, the service logic is provided entirely by the architectural component itself. In a composite service, the architectural component interacts with and requires other services (called outsourced services) to fulfill its role. A service-based approach therefore separates business concerns (i.e. the use of the service) from technological considerations (i.e. the computing and networking infrastructure that supports a service). Figure 2.6 shows the essential basic and integrated services associated with securities trading. Very few capital market systems are integrated at this level, so B2B integration at this level in the capital markets domain is still an active research area.
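The distinction between basic and composite services can be sketched in a few lines. The three services below are hypothetical examples and are not taken from Figure 2.6: two basic services whose logic is self-contained, and one composite service that relies on them as outsourced services.

```python
class QuoteService:
    """Basic service: its logic is provided entirely by the component itself."""

    def __init__(self, quotes):
        self._quotes = dict(quotes)

    def last_price(self, symbol):
        return self._quotes[symbol]


class RiskCheckService:
    """Basic service: approves an order if its value is within a limit."""

    def __init__(self, max_order_value):
        self.max_order_value = max_order_value

    def approve(self, quantity, price):
        return quantity * price <= self.max_order_value


class OrderPlacementService:
    """Composite service: fulfills its role by calling outsourced services."""

    def __init__(self, quote_service, risk_service):
        self.quote_service = quote_service
        self.risk_service = risk_service

    def place(self, symbol, quantity):
        price = self.quote_service.last_price(symbol)
        if not self.risk_service.approve(quantity, price):
            return {"status": "REJECTED", "symbol": symbol}
        return {"status": "ACCEPTED", "symbol": symbol,
                "quantity": quantity, "price": price}


placement = OrderPlacementService(
    QuoteService({"XYZ": 10.5}), RiskCheckService(max_order_value=5_000))
small_order = placement.place("XYZ", 100)      # value 1,050: within the limit
large_order = placement.place("XYZ", 1_000)    # value 10,500: over the limit
```

The caller of `OrderPlacementService` sees only the business operation; whether the outsourced services run in the same process or behind a network interface is a technological consideration that can change independently.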
2.6
Conclusion
The main purpose of this chapter is to provide software architects and IT professionals with sufficient knowledge about existing practices, business processes, systems and architectures in the area of securities trading. It introduced the concepts of market structures and order types, and described the information flow throughout the trading cycle and the different types of information systems that participate in this cycle. The chapter also paid attention to business process integration issues by reviewing relevant middleware technologies according to their level of abstraction. We have seen that most integration currently occurs at the transport and content layers. Heterogeneous development technologies and middleware platforms, together with the absence of dominant standards, have made it difficult to achieve an interoperable environment that facilitates integration among different systems. There are many opportunities yet to be exploited for integration at the business process (BP) level. This is essential for two reasons. On the one hand, we are witnessing dramatic changes in computing, data communication and middleware technologies. On the other hand, the BPs themselves are changing as a result of
regulations, global competition and new financial instruments or market structures. Future developments may see the emerging technologies mature, especially at the BP level, which will facilitate the integration of multiple markets worldwide.
Acknowledgements
We wish to acknowledge the Capital Markets Cooperative Research Centre (CMCRC) for funding this work. We are also grateful to staff at SMARTS and Computershare for their support, especially in providing information about their systems.
References
Aitken M, Frino A, Jarnecic E, McCorry M, Segara R, Winn R (1997) The microstructure of Australian stock exchange. Securities Industry Research Centre of Asia-Pacific (SIRCA)
Alonso G, Casati F, Kuno H, Machiraju V (2004) Web Services: Concepts, Architectures and Applications. Springer
Banks E (2001) e-Finance: The Electronic Revolution. John Wiley & Sons
Bolcer G, Kaiser G (1999) SWAP: Leveraging the Web to Manage Workflow. IEEE Internet Computing, 3(1), pp. 55-88
Bussler C (2001) B2B Protocol Standards and their Role in Semantic B2B Integration Engines. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 24(1), pp. 3-11
Dabous FT (2005) Pattern-Based Approach for the Architectural Design of e-Business Applications. PhD thesis, The University of New South Wales, Australia
Harris L (2003) Trading and Exchanges: Market Microstructure for Practitioners. Oxford University Press
Hollinsworth D (1994) The workflow reference model, #TC00-1003, Brussels, Belgium
Medjahed B, Benatallah B, Bouguettaya A, Ngu AHH, Elmagarmid AK (2003) B2B Interactions: Issues and Enabling Technologies. The VLDB Journal, pp. 59-85, Springer-Verlag, Apr
Rabhi FA, Benatallah B (2002) A Service-Based Architecture for Capital Markets Systems, 16(1), pp. 15-19
Yang J, Papazoglou M (2000) Interoperation Support for electronic business. Communications of ACM, 43(6), pp. 39-47
Viney C (2002) Financial Institutions Instruments and Markets. McGrath’s
CHAPTER 3
Product Management Systems in Financial Services Software Architectures
Christian Knogler, Michael Linsmaier
3.1
The Initial Situation
Consolidating and modernizing software architectures in the financial services industry has become a number one issue on the agenda of Chief Information Officers (CIOs). Common software architectures have matured over the past twenty years or more, but the business itself has changed so much that many companies, especially insurance carriers, have architectures that lag behind business innovation. Today, more and more CIOs agree that they have to spend major parts of their IT budget on maintenance and that the ratio between costs and innovation is no longer appropriate (e.g. cf. (Lier 2005)). (Hersh 2006) confirms this and predicts that reducing operational costs will remain a top priority for quite a while, because maintenance costs can range from uncomfortably high to downright crushing when carriers spend 80 percent or more of their budgets on maintenance rather than on new projects. What forces carriers in the financial services industry to innovate their business? First, customers expect products and services that meet their needs and that are delivered on time and within budget. From the customer’s point of view, products and services must have an optimum price/performance ratio. Whether a carrier focuses on ‘quality leadership’ or ‘price leadership’ depends on its business strategy. Delivery on time and within budget means that customers expect an offer or proposal to be delivered immediately at the point of sale and the printed contract to be delivered within 48 hours. Second, competitors have also perceived customers’ expectations and try to meet this challenge in their own way. Third, a 2004 study reported by (Hersh 2006) produced a result already familiar from fast-moving consumer goods markets (e.g. mobile phones): products that get to market quickly typically generate more revenue and greater profits.
Fourth, the value chain of a financial services (FS) company starts with product management, which therefore has a strong influence on subsequent elements of the value chain (e.g. underwriting, risk carrying/transformation,
claims management, etc.). Our conclusion is that business can be improved by improving product management and the subsequent core business processes (which must include IT issues). It is not our objective to discuss how an FS company arrives at a new business strategy. What we offer is a discussion of the weaknesses of current software architectures and business processes and of the opportunities for improvement.
3.2
Weaknesses in Software Architectures and Business Processes
When the product development or marketing department creates a new product, it ends up with a paper-based description of the product. Life insurance products are often ‘described’ in spreadsheets because these allow interactive calculation. Based on this information, the IT department implements a ‘piece of software’ and the product development department tests it. This is how software development works in general, one might say. Where are the weaknesses? The product development department has comprehensive product knowledge but lacks a clear method for defining a financial services product. In particular, every piece of information that has to be implemented as code needs a precise description. It is well known that no software specification is as perfect as software engineers would like. From our point of view, the traditional process based on division of labor has to be reconsidered. Trying to improve paper-based description and coding is, as (Watzlawick 1983) called it, ‘more of the same’. What the product development department needs is a tool-based approach to product definition that enables it to precisely define and test the calculations, the business rules, the properties of the product, etc. (Gruhn 2005) reports a comprehensive set of requirements for such a tool, for which he uses the acronym PMS (product management system). We follow Gruhn and use the term and acronym below. Finally, a PMS has to generate a ‘piece of software’ that includes all the product information needed by a carrier’s core business applications. Process weaknesses exist not only in the product implementation process; software deployment is also critical. If an IT department implements FS products in the traditional way, it has to implement the same product several times: first for the contract management system, second for the offline sales system, third (for example) for a broker portal, fourth for … etc.
The authors have seen insurance companies that have to deploy products to ten or more business applications. A quite common way to meet this challenge is to create a so-called ‘calculation box’ which contains the company’s set of calculations. Because of the variety of platforms, IT departments use ANSI C code (a version of
3 Product Management Systems in Financial Services Software Architectures
the programming language C standardized by the American National Standards Institute) to provide a multi-platform calculation box. Although this seems to be quite a good approach, it is not a silver bullet, because the process of definition has not been changed (as described above) and an FS product consists of much more than calculations. What about attributes, properties and business rules, for instance? Looking at the sales channels of FS companies, it is easy to see that today every company has many more sales channels than ten or fifteen years ago. On the one hand carriers cooperate with independent brokers and banks that operate nationwide or Europe-wide and sell their products ‘over their counter’. On the other hand they cooperate with big multinational groups and deliver customized FS products as add-on products to the group’s products (for instance a finance and insurance package for car buyers) or as part of the group’s benefit program (e. g. special homeowner loans, private financial security for future retirees, and so on). Common to all cooperation partners of a carrier is that they have their own IT platforms and business applications. Even today some carriers still provide only simple paper-based proposal forms to their business partners; others offer at least a calculation box (which normally does not include the special business rules for the partner). Media discontinuity and a lack of data validation at the point of sale (no wonder, since a paper form cannot validate its data) are among the major reasons why callbacks are still necessary and why processing a standard proposal form takes several weeks. If a partner asks for seamless process integration and automation, many carriers are taken by surprise. Remarkably, some carriers today do not have process integration and automation even between their own sales staff and their back office.
Process integration and automation is not only an issue in sales. Just take a look at claims management or, quite simply, a claims notification. Today an FS company has many points of service. Sooner or later every point of sale turns into a point of service. No matter what has been sold (a simple ‘Bausparvertrag’¹ or a new car with a leasing and insurance package), customers have questions or claims and come back to where they bought. Regardless of whether the service employee at the point of service calls the call center or fills out a paper form: media discontinuity, callbacks, needless idle time and so on are the unwanted effects. A lack of integration and automation is often a symptom of a more substantial weakness: inflexibility (of applications/components and sometimes in people’s minds). In most cases this goes hand in hand with the well-known ‘legacy concept’: undocumented business rules and logic, ineffectual code and so on. For example, product enhancements that appear quite straightforward to the product development department (‘small tweaks’ from an actuary’s point of view) take the IT department almost a year or more and millions of euros to make and test. Carriers who are listed on a US stock exchange and run a legacy system in the back office are often still struggling with the implementation of the Sarbanes-Oxley Act (SOX)². (Cyphert 2006) recently reported that some companies have repeatedly been unable to project their claims, new business, and the like and have made up for it with workarounds. Although he was referring to American carriers, the situation with legacy systems in Europe is, from our point of view, at least similar or even worse. This is also true of what Gartner recently found. The most problematic area for Property and Casualty (P&C) insurance companies is rating/quoting and new-product development, which means that carriers want to get their products to market faster and be more accurate in their pricing. Gartner says that they are trying new concepts, but the technology has not matured at the rate they would like. The second problem area is underwriting, which in Gartner’s opinion is an extension of the movement toward pricing and new-product development (Hyle 2006). A typical legacy architecture looks like this (the following list is not exhaustive):
• A separate contract administration system for every (or almost every) product line and a policy-centric view of data
• Code and tables for the products are an integral part of the system; separating them is not easy and takes considerable effort
• (Almost) no modularization/componentization (monolithic design)
• A large number of (sometimes old) technologies/platforms³
A common side effect of such legacy architectures is a skill-set problem with old technologies because the people who understand them have retired. If a carrier wants to rely on third-party outsourcers instead, they ask for up-to-date documentation, which in most cases is not available.
¹ A translation is quite difficult because a ‘Bausparvertrag’ is a special kind of (future) homeowner savings contract with the right to conclude a homeowner mortgage loan contract at a special fixed interest rate at the end of the savings period. This combination of a savings contract with an optional future loan contract is a common form of government promotion of private home ownership.
Christian Knogler, Michael Linsmaier
The old documentation no longer fits because the system has been customized, modified and degraded so extensively. How does an FS company know when to pull the trigger on replacing legacy systems? The following questions might help to clarify:
• Are the systems preventing the company from being competitive? Are the business requirements met today?
• Will the company be able to meet the business requirements in two or three years?
² We prefer the acronym SOX instead of SOA because in the IT world SOA is widely used for Service Oriented Architectures.
³ (Bozovic and Jakubik 2006) relate that they have seen companies “that internally have to juggle up to 30 different technology platforms”.
• Does the company have a clear vision of its future business? A clear strategy and the steps to implement it?
• Is it clear how the core business processes will run in the future? Which steps in each process will be supported or even automated by IT?
3.3 Objectives and Today’s Options
One of the biggest changes today is the range of options available to companies. “Companies are starting to change what their choices are for policy administration,” (Harris-Ferrante 2006) says. “It used to be companies were thinking about buy, build, or outsource – those were the three choices you had. A lot of companies now are looking at the different alternatives and broadening some of these options.” Such options include installing components rather than the entire system or adding different alternatives on top of the current system, which could improve the policy system without changing it. “We’ve seen companies start looking more and more at the option of adding things on top of the system – Business Process Management (BPM), workflow, and different types of supplemental systems you can add on top without replacing what you already have,” says Harris-Ferrante. “Companies are taking a variety of approaches to their policy system vs. the big-bang approach.” Our approach is similar but not identical. We also consider the risks of a big-bang approach to be ‘too high’ in most cases and prefer an approach that we call ‘smart consolidation’. What does this mean? The weaknesses described earlier in this article lead us to the following conclusions:
• The product development department has to consolidate and optimize⁴ its methods of defining FS products
• The IT department has to consolidate and centralize the way it provides product information to business applications inside and outside the company
Based on these conclusions we have established an architecture that we call ‘product server based business application architecture’. A product server is an application that provides every piece of product information that is required by a business application (inside or outside the company). In general a product server is independent of product line and platform/technology, and it is the runtime environment for the products defined and implemented using a PMS.
⁴ ‘Industrialize’ is sometimes used as an equivalent term.
3.4 A ‘Product Server Based Business Application Architecture’
The following diagram displays the main components of the suggested architecture:
[Figure 3.1. Product server based business application architecture. Components shown: Product Server; Contract Management System; Claims Management System; Commission & Charge System; Point of Sale / Service System; Misc. Supporting Systems (e.g. Business Partner, DWH, etc.); Intermediary 1, Intermediary 2, Intermediary n]
3.4.1 Business Considerations
From a professional point of view our architecture is able to provide a broad range of services in order to support the following processes:
Table 3.1. Processes and services
Processes | Services (examples)
quotations | calculations (premiums, sums, endowment values, installments, etc.); data validation; identification of needs/recommendation of products
offers/proposals | product structure (e. g. available coverage); additional calculations (e. g. loan redemption schedules, detailed coverage premiums, …); data validation; actuarial methods (e. g. contract alterations); business rules (both global and sales channel specific); document output (control of text modules)
contracts | additional calculations (e. g. reserves, commissions, …); product structure; additional actuarial methods (e. g. contract conversion); business rules (underwriting, straight-through-processing, …); document output (control of text modules)
claims | context sensitive claims notification questionnaire; calculations (loss reserve, compensation, installments, etc.); business rules (compensation yes/no, what, how, immediate claims adjustment, …); document output (control of text modules)
assistance services | calculations; data validation; business rules; document output (control of text modules)
The services are available throughout the entire process chain and are not limited to the company itself, which means that financial intermediaries are also able to use the services (both offline and online) provided by the product server.
Straight-through-Processing
The standardized ‘rollout of services’ enables a positive shift in process quality and speed. Anywhere and at any time, business applications can call for a service and get, for the same request, the same precise result. The service examples given in Table 3.1 go beyond a simple calculation/validation approach. This architecture is the basis for an all-round service at the point of sale/service and for straight-through-processing (STP). From our point of view STP is crucial to lowering costs and speeding up execution. While the case for STP may be changing, the goals remain the same: to reduce the number of exceptions and to allow as many proposals as possible to be executed, settled, and cleared without manual intervention. Considering the various sales channels of a carrier with different products, different prices and different business rules, we believe that execution costs and elapsed time cannot be kept at a competitive level without running a product server as part of an STP strategy. STP is not limited to sales processes; service processes are an even bigger case. As we have stated earlier in this article, every point of sale is a potential point of service. A process chain which is free of media discontinuity and which integrates service partners seamlessly is important in tomorrow’s competition, where productivity as well as quality and speed of service are crucial. Imagine a car repair shop or a homeowner who enters a carrier’s customer web portal and reports minor damage (car body damage, glass damage, etc.). If the attached documentation (photos, cost estimate, etc.) is sufficient and the business rules for minor damage allow instant settlement, the clients get their approval only hours later. It is clear that
an STP approach to service processes needs a powerful fraud detection system and, above all, human claims revision professionals who systematically re-check settled claims or check cases that were marked by the fraud detection system. Traeger (in (El Hage and Jara 2003)) already describes a scenario called the claims portal solution that integrates every service provider and business partner of an insurance carrier and uses a car breakdown to illustrate straight-through-processing in service processes.
Outsourcing
Is a product server based business application architecture compatible with an outsourcing approach? Outsourcing is a trend among FS companies, and we are sure that our architecture is the key to keeping major parts of the core business know-how in-house. The product development departments create, maintain and test the products using a PMS. At the end of this process they provide encapsulated product information in a platform-independent format. The outsourcing partner tests its integrity and deploys it to the business applications he/she is running for the company.
Collaboration
Using a product management system is also a sophisticated way of developing products collaboratively. The headquarters is able to determine general structures, calculations and business rules, and subsidiaries abroad add local value. New business strategies go one step further (e. g. cf. (Riemer 2005)). The separation of risk carrying from product selling to individuals is the core element of these strategies, and they imply some sort of ‘wholesale-retail scenario’ with insurance factories that create ‘white label products’ and sell them via other insurance companies or financial intermediaries (brokers, banks, etc.) who apply their own brand to these products. Again, a PMS is the professional basis for implementing such a strategy. If two FS companies use the same PMS it is easy to share (basic) parts of their products. As long as the shared elements do not influence the interface to the business applications, this is a good way to exchange product development experience or ‘product kernels’ that are expected to be used in both companies.
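The minor-damage scenario described under Straight-through-Processing can be sketched as a small rule check. This is only an illustration under stated assumptions: the thresholds, the field names and the `fraud_score` input are hypothetical and are not taken from the architecture described here.

```python
from dataclasses import dataclass

# Hypothetical limits; a real carrier would define them as business rules in the PMS.
MINOR_DAMAGE_LIMIT = 1500.0   # assumed maximum amount for instant settlement
FRAUD_SCORE_LIMIT = 0.3       # assumed threshold of a separate fraud detection system


@dataclass
class ClaimNotification:
    claim_id: str
    damage_type: str          # e.g. "glass" or "car_body"
    estimated_cost: float
    has_photos: bool
    has_cost_estimate: bool
    fraud_score: float        # assumed to be delivered by the fraud detection system


def decide(claim: ClaimNotification) -> str:
    """Return 'settle', 'review' or 'callback' for a claims notification."""
    # Incomplete documentation forces a callback, which STP tries to avoid.
    if not (claim.has_photos and claim.has_cost_estimate):
        return "callback"
    # Cases marked by fraud detection always go to a human claims professional.
    if claim.fraud_score >= FRAUD_SCORE_LIMIT:
        return "review"
    # Only minor damage of known types is settled without manual intervention.
    if claim.damage_type in {"glass", "car_body"} and claim.estimated_cost <= MINOR_DAMAGE_LIMIT:
        return "settle"
    return "review"
```

Keeping the limits as data rather than as constants in application code would let the product development department change them without touching the calling applications.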
3.4.2 Technical Considerations
Based on our architectural approach we have derived some important technical requirements for a PMS. Since one of the core points of the suggested architecture is straight-through-processing, a PMS will have to support the core components of the architecture. We found the following characteristics to be crucial for a state-of-the-art PMS.
Operating System Independence
Many carriers use different operating systems. There are Windows-based notebooks (or even handhelds) at the point of sale, open source/Linux-based workstations at the point of service, Unix-based web applications for e. g. brokers and finally z/OS-based (or any other classic OS) mainframe back office systems. Since the runtime environment must show the same behavior and compute the same results on all platforms, the runtime environment has to use operating system independent technologies when different platforms in (potentially) different companies have to be served.
Database System Independence
The technologies for data storage also differ depending on which application/component is used. For example, the back office system may use a DB2 database while the web application uses, e. g., an Oracle database or even MySQL. On the other hand, handhelds running the point of sale system should not need a separate database management system but should use only the file system. This means that a PMS cannot be based on one database system only; any kind of (product) data storage must be supported.
Serving a Broad Array of Technologies
There are as many different interface requirements as there are components that have to use the PMS. On the mainframe side (e. g. a COBOL administration system) program libraries have to be integrated, in a workstation environment .NET or the Java Native Interface (JNI) is used, and for web applications web services or interfaces for J2EE (Java 2 Platform, Enterprise Edition) applications are needed. From our point of view a PMS is not ready for integration and cannot be a central part of the suggested architecture until it is able to connect to and serve the systems of FS carriers.
Deployment
Any advantages gained from the support of different operating systems, databases and interfaces are lost if the product data itself cannot be distributed efficiently. During the implementation of a new product, it is important to ensure that the product data is available to all locations/applications in the value chain as quickly as possible.
A product implementation based on the classic cascading phases of component design, creation of source code, compilation, (technical) testing, integration, etc. (the ‘traditional long running software development process’) will not be sufficient. A smarter approach has to be chosen, since one of the major success factors for process optimization is the separation of the requirements of product development from those of software development. In general there are two options available: first, a code-based development approach, such as J2EE components, that was designed for simple deployment (among other requirements, of course), or second, an interpreter-based, data-driven approach. We prefer the data-driven approach because it ensures platform independence across all platforms. Thus the product data and product functionality can be distributed (deployed) quickly and without any conversions. (We are aware that potential performance issues have to be taken care of.)
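To make the interpreter-based, data-driven option more concrete, the following minimal sketch holds a product purely as data and evaluates it with a tiny interpreter. The JSON layout, the operator names and all numbers are assumptions made for illustration; they do not describe the format of any actual PMS.

```python
import json

# A product definition held purely as data (hypothetical structure and numbers).
PRODUCT_DATA = json.loads("""
{
  "product": "term_life_demo",
  "premium": ["*", ["var", "sum_insured"],
                   ["+", ["const", 0.001],
                         ["*", ["var", "age"], ["const", 0.00005]]]]
}
""")


def evaluate(expr, inputs):
    """Interpret a nested-list expression against the given input values."""
    op = expr[0]
    if op == "const":
        return expr[1]
    if op == "var":
        return inputs[expr[1]]
    # Binary operators: evaluate both operands first.
    left, right = evaluate(expr[1], inputs), evaluate(expr[2], inputs)
    if op == "+":
        return left + right
    if op == "*":
        return left * right
    raise ValueError(f"unknown operator: {op}")


# The same data file could be shipped unchanged to every platform running the interpreter.
premium = evaluate(PRODUCT_DATA["premium"], {"sum_insured": 100000, "age": 40})
```

In this design only the interpreter has to be ported to each platform; a new product is then a matter of distributing new data.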
Scalability
Carriers require a PMS to be scalable. There are multiple scenarios, such as:
• A workstation/notebook environment
• Working as part of a system controlled by a transaction monitor (e. g. CICS, IBM’s Customer Information Control System) with a large number of concurrent users
• Working as part of a mainframe batch environment that processes millions of units
• Working on servers in a web environment with thousands of requests within one hour⁵
Hence the internal technical architecture of a PMS must provide flexible scalability not only on single-processor systems but also on multi-processor machines.
Integration
The ability to integrate is different from the ability to be integrated into business applications. Ability to integrate means that existing product components, tables, calculation boxes (please refer also to ‘migration methods’ below), etc. can be integrated or wrapped. This enables a PMS to be used as a commoditized interface (using plug-in mechanisms, user exits, etc.) for any external product-related components.
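The idea of wrapping existing calculation boxes behind one interface can be sketched with a simple plug-in registry. The registry, the decorator and the sample routine below are hypothetical; in practice the wrapped routine might be a call into an existing C library rather than Python code.

```python
from typing import Callable, Dict

# Registry of externally supplied calculation routines (hypothetical plug-in mechanism).
_PLUGINS: Dict[str, Callable[..., float]] = {}


def register(name: str):
    """Decorator that exposes an existing routine under a standard name."""
    def wrap(func):
        _PLUGINS[name] = func
        return func
    return wrap


# Stand-in for a legacy 'calculation box'.
@register("annuity_installment")
def legacy_annuity(loan: float, rate: float, periods: int) -> float:
    """Constant installment of an annuity loan per period."""
    if rate == 0:
        return loan / periods
    return loan * rate / (1 - (1 + rate) ** -periods)


def call(name: str, **params) -> float:
    """Single entry point used by every business application."""
    if name not in _PLUGINS:
        raise KeyError(f"no calculation registered under '{name}'")
    return _PLUGINS[name](**params)
```

A business application would then use, for example, `call("annuity_installment", loan=1000.0, rate=0.1, periods=1)` without knowing where the routine comes from.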
3.5 Migrating into a New Architecture
Earlier in this article we described the most important reasons for a migration. From our point of view a migration to the suggested architecture is likely to result in a lower total cost of ownership (TCO). Whatever a CIO’s reasons are in detail, TCO is always on his mind. The following figure shows the two major options for a carrier to transform its business application architecture. Before a company decides to go for a new architecture, the major questions we listed earlier in this article have to be answered clearly. Considering the given constraints (technical, business, organizational, etc.) and the actual requirements, our experience is that many companies come to the conclusion that a smart consolidation of their legacy systems based on our approach of a product server based business application architecture limits the risks of a migration better than introducing a completely new ensemble of core business applications (e. g. contract administration system plus business partner plus commission and charge plus … etc. etc.). In our view a company will rather accept the moderately increased costs of the consolidation approach than the risk of a much more expensive total failure of a new system landscape.
[Figure 3.2. Costs, risks and benefits for consolidation and innovation. Panel titles: ‘consolidate or innovate?’ and ‘... consolidate, then innovate!’. Labels: costs, risk, benefit; original architecture, consolidated architecture, innovated architecture (blueprint); PMS. Legend: original component, renovated component, (re)developed component.]
The following figure displays the main milestones in our program scope from the blueprint to an innovated architecture:
[Figure 3.3. The milestones and activities to the drafted blueprint. Title: ‘5 milestones to an innovated product server based architecture’. Labels along the time axis (t): blueprint, product analysis, repository, consolidated architecture, innovated architecture; analyze, product data reengineering, renovate components, substitute or enhance components.]
⁵ In this context carriers require so-called ‘24 × 7’ availability, which means that the system must be up and running 24 hours a day, 7 days a week and 365 days a year.
3.5.1 Blueprint
The blueprint⁶ of the future IT architecture displays the company’s specific ‘interpretation’ of the ‘product server based business application architecture’. Based on the blueprint the following questions have to be answered clearly for each component:
1. Is there a need to retain the component in the new architecture?
2. Is it possible to compensate for the component’s deviations from new requirements by renovation?
3. If the answer to question 2 is ‘No’: Are the costs and risks of a substitution (make or buy) appropriate to the expected benefits?
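The three blueprint questions can be read as a small decision procedure per component. The sketch below is one possible reading; the outcome labels and the `reassess` fallback (for the case that a substitution is not justified either) are assumptions, since the text does not name them.

```python
def blueprint_decision(needed: bool,
                       renovation_sufficient: bool,
                       substitution_justified: bool) -> str:
    """Map the answers to questions 1-3 to an outcome label (labels are hypothetical)."""
    # Question 1: is the component still needed in the new architecture?
    if not needed:
        return "drop"
    # Question 2: can renovation compensate for deviations from new requirements?
    if renovation_sufficient:
        return "renovate"
    # Question 3: are costs and risks of a substitution (make or buy) appropriate?
    if substitution_justified:
        return "substitute"
    # Not covered by the three questions; assumed here to require a new assessment.
    return "reassess"


# Example: a component that is needed but cannot be renovated economically.
outcome = blueprint_decision(needed=True, renovation_sufficient=False,
                             substitution_justified=True)
```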
3.5.2 Product Data Analysis
The second milestone is fundamental for the architectural transformation laid down in the blueprint. It is the “metadata” or “product data” analysis. The existing components are analyzed with a clear focus on the product-based metadata they contain, referred to hereafter as product data. Such product data is usually scattered throughout databases or is an integral part of the program code of the components. The product data found in the databases usually includes product module information such as product names or product selling periods, quotation rates or even simple business rules. The program code itself contains metadata or functionality which cannot be stored in database structures or should not be stored there for performance reasons (e. g. financial or insurance calculation instructions). Some of these program code units are so-called calculation boxes. In this case it is sometimes reasonable to reuse the calculation box within the PMS.
3.5.3 Repository
This milestone is a basis for the consolidated (renovation scenario) as well as the innovated architecture (substitution or enhancement scenario). The repository is vital and comprises the entire (formerly scattered) product data. In general the design of the repository has to meet the current and future business requirements.
⁶ A blueprint is, conceptually, a plan or design documenting an architecture or an engineering design. While this may be as simple as a quick sketch or outline using pencil and paper, a blueprint is traditionally reflected as the contact printing process of cyanotype. And it is cyanotype which produces the familiar white lines on blue background. The term is often used metaphorically to mean any detailed plan (Wikipedia 2006).
Redesign the Product Model?
We have learned in the past that professionals love to discuss this issue. In our opinion, decisions on this issue should be driven primarily by the business requirements. If the business requirements can be satisfied by consolidation (and probably some enhancements) for the next 3 to 5 years, the existing contract administration systems are the benchmark. If not, the company has to work carefully on its future product model design because it has to satisfy the business needs for the next ten or more years. Everything in between is a compromise or temporary solution. It is well known that such solutions live longer than anyone expected. In general a compromise implies advantages and disadvantages. Complex interfaces and the resulting performance problems during batch processing are a possible disadvantage in this case. In our experience performance losses occur mainly when writing to and reading from the product server (the runtime environment of a PMS). Instead of a bad compromise we prefer the following way:
• Create a product model based on the consolidated architecture.
• Define a product model that satisfies your future business requirements.
• Lay one upon the other and analyze the differences.
Consider elements of the future model which are not part of the current model as harmless enhancements of the current model. Allow for small workarounds for existing model concepts which could affect the future usability of the product model. Furthermore, a product design guideline has to ensure that newly developed products fit easily into the future model.
Redesign the Products?
Once the product model has been fixed, the next decision deals with the products themselves. The question is: do we have to (re-)implement every product or only new products? We believe that a full migration of all products would be the optimum, but in fact business constraints are the real driver of this decision. Therefore, both the current and future product portfolios have to be analyzed and the following questions have to be answered:
• Is there a small number of products in the portfolio which are too expensive to migrate? Are there ‘exotic’ products which would also be too expensive? Is it possible to convert the policies that use these products? (If not, consider a termination of the policy.)
• Are there any policies in the database which cannot be sold or adjusted anymore? Are there products with a very small number of active policies? If so, their products need not be migrated. It is cheaper to manage these policies manually. The workaround in this case is to create a ‘one size fits all’ product⁷ and link it to the ‘read-only’ policy.
⁷ This is some sort of ‘dummy’ product without quotation rates or any other product-specific business functions.
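The portfolio questions above can be sketched as a triage function. The field names, the cut-off for ‘a very small number’ of policies and the order in which the questions are checked are assumptions; the text leaves these to the individual carrier.

```python
from dataclasses import dataclass

FEW_POLICIES = 50  # assumed cut-off for 'a very small number of active policies'


@dataclass
class PortfolioProduct:
    name: str
    active_policies: int
    still_sold_or_adjustable: bool
    migration_too_expensive: bool
    policies_convertible: bool


def triage(product: PortfolioProduct) -> str:
    """Return a migration decision for one product (labels are hypothetical)."""
    # Closed or tiny portfolios: link the policies to a dummy product, handle manually.
    if not product.still_sold_or_adjustable or product.active_policies < FEW_POLICIES:
        return "dummy_product"
    # Exotic or otherwise too expensive products: convert the policies if possible.
    if product.migration_too_expensive:
        return "convert_policies" if product.policies_convertible else "consider_termination"
    # Everything else is migrated into the PMS.
    return "migrate"


portfolio = [
    PortfolioProduct("Term Life 1985", 12, False, False, False),
    PortfolioProduct("Exotic Unit-Linked", 900, True, True, True),
    PortfolioProduct("Motor Standard", 250000, True, False, False),
]
decisions = {p.name: triage(p) for p in portfolio}
```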
All other products have to be migrated into the PMS and, if necessary, adapted to additional/future business requirements (e. g. classification for the data warehouse system).
A Direct Access to the Repository?
From our point of view components should be modified only during the next step (consolidated architecture). Therefore, at this stage we do not access the repository directly via its interface. The question is how the components can use the repository anyway. In general, database tables contain the majority of the product metadata, and these tables can be maintained from the repository in this way:
• The product development department maintains the repository.
• A small SQL (structured query language) product data export tool transfers the data from the repository into the database tables.
The advantages are (a) a standardized definition and maintenance of product data and (b) central storage. Argument (a) also improves quality, as every certified quality assurance professional knows from experience. And last but not least (c): via a standard interface other applications/systems could access the repository information (e. g. business intelligence/data warehouse).
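A product data export tool of the kind described above might look like the following sketch. It uses an in-memory SQLite database as a stand-in for the database of an existing component; the table layout, column names and sample rows are invented for illustration.

```python
import sqlite3

# Repository content as maintained by the product development department (made-up rows).
REPOSITORY = [
    {"product_id": "P001", "name": "Term Life Basic", "valid_from": "2006-01-01", "rate": 0.0031},
    {"product_id": "P002", "name": "Homeowner Savings", "valid_from": "2006-07-01", "rate": 0.0150},
]


def export_products(conn: sqlite3.Connection, rows) -> int:
    """Replace the contents of the product table with the repository rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product "
        "(product_id TEXT PRIMARY KEY, name TEXT, valid_from TEXT, rate REAL)"
    )
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM product")
        conn.executemany(
            "INSERT INTO product (product_id, name, valid_from, rate) "
            "VALUES (:product_id, :name, :valid_from, :rate)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]


conn = sqlite3.connect(":memory:")  # stand-in for the component's database
exported = export_products(conn, REPOSITORY)
```

Because the export replaces the table contents in one transaction, the existing component keeps reading its familiar tables and needs no code change at this stage.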
3.5.4 Consolidated Architecture
We have already pointed out that most carriers want to reduce their project risks or at least spread them. The ‘5 milestones to an innovated product server based architecture’ are our suggestion for spreading the risks. As stated earlier in this article, the blueprint milestone determines (among other things) which components can be retained and, if necessary, renovated. Common renovation issues are:
Mathematical Subroutines
Are mathematical subroutines (‘calculation boxes’) reusable within the PMS? For several reasons FS companies often use encapsulated subroutines for mathematical instructions. Sometimes it is reasonable, sometimes it is even a (temporary) ‘must’ to reuse these instructions. In general a state-of-the-art PMS is able to integrate such encapsulated subroutines via a standard interface. The first benefit is that the business applications just call the product server. If a carrier has several subroutines that have to be reused, a PMS is a smart option for replacing the different interfaces with one standard interface to the product server. Second, subroutines that were not available before can be made available to every business application that runs an OS which is appropriate for the subroutines. Third, within the PMS every method is able to use every subroutine that is integrated.
Replacing Embedded Functionality
What about embedded functionality that could be replaced by a service provided by the product server? What are the risks, what are the benefits? It is well known that cutting off embedded functionality is risky because in most cases in-depth adaptations follow. The quality of the adaptations in question has to be considered carefully. In case of doubt we prefer the safe road to the consolidated architecture and bundle this kind of adaptation for a later implementation when we go for the last milestone, the innovated architecture.
Minor Process Improvements
Even today it cannot be taken for granted that business applications prevent wrong user input, wrong selections, clicks on wrong buttons and so on. A product server provides services that are not difficult to integrate and that significantly improve the usability of a process. Some examples:
• Immediate validation of every user input and, in case of a mistake, a suggestion for a context-specific correction
• Delivery of results as soon as they can be calculated, without clicking the ‘calc’ button
• Context-specifically filtered selection lists: only currently valid items can be selected, invalid items are not visible
• Context-specific disabling of controls (input fields, lists, buttons, etc.) if their availability makes no sense
Improvements of this kind also have to be considered carefully because it depends on the application how easy or difficult it is to implement them. Again, in case of doubt we prefer to bundle these modifications with a later implementation when components are replaced or enhanced.
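Two of the minor process improvements listed above, immediate validation with a suggested correction and context-specific selection lists, can be sketched as follows. The rule set and its numbers are hypothetical; in the suggested architecture they would be delivered by the product server rather than hard-coded.

```python
# Hypothetical product rules; assumed to be served by the product server.
RULES = {
    "entry_age": {"min": 18, "max": 65},
    # payment mode -> minimum entry age for which the mode is offered (made-up values)
    "payment_modes": {"monthly": 18, "yearly": 18, "single": 30},
}


def validate_entry_age(age: int):
    """Return (ok, suggestion); the suggestion is the nearest valid value."""
    lowest, highest = RULES["entry_age"]["min"], RULES["entry_age"]["max"]
    if lowest <= age <= highest:
        return True, None
    return False, lowest if age < lowest else highest


def selectable_payment_modes(age: int):
    """Context-specific selection list: only modes valid for this age are offered."""
    return sorted(mode for mode, min_age in RULES["payment_modes"].items() if age >= min_age)
```

A point of sale application could call `validate_entry_age` on every input event and rebuild its selection list from `selectable_payment_modes` whenever the age changes.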
3.5.5 Innovated Architecture
This is the last milestone in our roadmap to an innovative architecture. All in all, the main objective of an innovation is to satisfy future business requirements and to reduce TCO. A consolidated architecture is a preliminary stage for the implementation of future business requirements such as straight-through-processing⁸ (STP). During the initial step the company has developed its blueprint based on the architectural requirements:
• Business processes
• Straight-through-Processing
• Outsourcing
• Collaboration
⁸ Although the management of a company recognizes the benefits of STP immediately, people with underwriting skills do not always do so. It is no secret that underwriters often maintain that ‘their work is an art and not a science’. Gaining commitment from the underwriters will be crucial for the project’s success and is therefore a No. 1 duty for the project managers.
Christian Knogler, Michael Linsmaier
Regardless of a make-or-buy decision, the new application that will replace an existing one has to meet these functional requirements. So let’s have a look at how we can satisfy these requirements during this step:

Service requests to the product server are handled by a single generic mechanism⁹. Future enhancements of the business functionality provided by the product server are easier to implement and do not force program code modifications in the requesting application. Therefore the broad range of services we mentioned in “business considerations” can be provided.

The standardized product data enable a common language for the interaction of components. Furthermore, these components get product data that were previously not available at all or not available in that quality. Since, in addition, the same product services are available all along the process chain, the components can fulfill the straight-through-processing requirement.

Because the new product model is the benchmark for newly developed components, there is no need for conversions (or at least only a few, if they are inevitable) between the requesting application and the product server. Thus the overall performance of processes will be enhanced.

A PMS is the appropriate development and maintenance environment that enables carriers to implement new products or product enhancements, thus satisfying the product-related business needs. For satisfying requirements that concern business processes, a PMS is important and more than half the battle. As we have exemplified in Table 1 (processes and services), a PMS is able to provide a variety of services; the initial step towards process improvement, however, is process design.

When a company has created a clear vision of the innovated process and the project team is sure that this process will satisfy the business requirements, it derives the software requirements and differentiates between services delivered by the product server and functionality implemented in applications/components. Regardless of a make-or-buy decision, the new application that will replace an existing one has to meet these functional requirements. Bozovic and Jakubik (2006) give us a word of caution: “If IT can’t provide the agility not just to support but to help promote business innovation, the gap between the two will widen and then the business can and will tank.” We have experienced that this approach to process innovation and implementation is unfamiliar to companies that have not worked with a ‘product server based business application architecture’ before, and hence we want to sum up the important PMS aspects in this context:
• The product server provides a bunch of new services and makes available product data that were previously not available at all or not available in that quality.
• Service requests to the product server are handled by a single generic mechanism.¹⁰ Future enhancements of the business functionality provided by the product server are easier to implement and do not force program code modifications in the requesting application.
• The future product model is the benchmark for newly developed components. Having no conversions (or at least only a few, if they are inevitable) between the requesting application and the product server is a strong requirement.

⁹ Because not only calculation rules etc. are product-specific but also the product structures (e. g. attributes) or even the types of product functions, the communication with a PMS must be built as a generic mechanism. This means that the mechanism has to use flexible structures for communication.
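A minimal sketch of such a single generic request mechanism is given below. The class `ProductServer`, the request layout (a dictionary with `service` and `args`) and the premium rule are hypothetical and only illustrate the idea of flexible structures; they are not the interface of any particular PMS.

```python
from typing import Any, Callable, Dict

Request = Dict[str, Any]    # flexible structure: service name plus arbitrary arguments
Response = Dict[str, Any]


class ProductServer:
    """Hypothetical product server with one generic entry point."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[Request], Response]] = {}

    def register(self, name: str, handler: Callable[[Request], Response]) -> None:
        # New business functionality is added here, not in the requesting application.
        self._services[name] = handler

    def handle(self, request: Request) -> Response:
        """Single generic mechanism through which every service request passes."""
        handler = self._services.get(request.get("service"))
        if handler is None:
            return {"status": "error", "message": "unknown service"}
        return {"status": "ok", **handler(request)}


def calculate_premium(request: Request) -> Response:
    # Illustrative rule only: premium as a flat rate on the sum insured.
    args = request["args"]
    return {"premium": round(args["sum_insured"] * args["rate"], 2)}


server = ProductServer()
server.register("calculate_premium", calculate_premium)
reply = server.handle(
    {"service": "calculate_premium", "args": {"sum_insured": 100_000, "rate": 0.002}}
)
```

Because the requesting application only ever calls `handle`, registering a further service on the server side changes nothing in the caller's code, which is the property the bullets above ask for.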
3.6 Impacts on Software Development
The Model-driven Approach and Product Data

It’s obvious that software development has reached a new level of evolution. This evolution can be compared to the evolution in the textile industry, where materials are no longer made at home but by mechanized looms in factories. In the same way, software development has moved from pure programming activities with CASE tool support to today’s model-driven architecture (MDA) approach. The goal of the MDA approach is to define technical specifications in a platform-independent model. They are then transferred to the corresponding concrete implementation by using platform-specific models (PSM), e. g. via model code generators for a .NET platform. This is an integrated path from a formally described technical specification (via UML¹¹) to executable components.

The approach of a PMS is quite similar. A formal business model is implemented instead of traditional (database) programming in a common language/tool. This business model describes the structure of the (real) product data. In contrast to the MDA approach, the intermediate step of a platform-specific model is skipped, since the user of a PMS is (in general) a member of the product development or actuarial department. Hence “executable” product data have to be generated directly. A product model is able to provide a business description of an object, e. g. a “fund contract”. A technical class, by contrast, for example one described with UML, describes an object abstractly, based on its technical structure. Thus, from a technical point of view, the product data illustrate an operative object more concretely than a “technical” class.

¹⁰ Because not only calculation rules etc. are product-specific but also the product structures (e. g. attributes) or even the types of product functions, the communication with a PMS must be built as a generic mechanism. This means that the mechanism has to use flexible structures for communication.

¹¹ The Unified Modeling Language (UML) is a non-proprietary object modeling and specification language used in software engineering. UML includes a standardized graphical notation that may be used to create an abstract model of a system: the UML model. While UML was designed to specify, visualize, construct, and document software-intensive systems, UML is not restricted to modeling software. UML has its strengths at higher, more architectural levels and has been used for modeling hardware (engineering systems) and is commonly used for business process modeling, systems engineering modeling, and representing organizational structure (Wikipedia 2006).
Figure 3.4. Product data as a description of operative objects
Unified Modeling Language as a Product Definition Language?

From our point of view UML is not appropriate for business users to specify the business aspects of a product. Why? On the one hand, UML is a modeling language/tool for software engineers rather than for FS product development specialists. On the other hand, UML would give rise to a myriad of problems that business users do not have to face without UML (e. g. large product libraries could comprise thousands of product modules). Last but not least, integrating the product development process into a standard software development process is very difficult, as we have learned from the past. This led the authors at the beginning of the 1990s to the conclusion that business users in an FS product development department need a specific business-oriented development environment.

Different Development Processes

Earlier in this chapter we pointed out that the traditional way of implementing FS products is not sufficient. Introducing a PMS enables the IT department to concentrate on technical issues while the product development department provides ready-to-run software modules that encapsulate the entire product-specific information and the required business/actuarial functionality. Although this seems to make things easier for the IT department, it has to cope with a new kind of requirement: application components have to be enabled to act in accordance with context-sensitive results, outcomes and messages provided by the product server, or even to delegate functionality to the product server. The product development department, on the other hand, delivers all the business/actuarial functionality.
[Figure: two parallel tracks. Software development process (software engineer): platform independent model, platform specific model, platform code, application component. Product development process (product expert): model-oriented language, product data, interpretable product data information, product management system component.]
Figure 3.5. Different development processes
An important question that IT departments and product development departments ask is: “What is our functional and data scope?” As a rule of thumb, everything that is product-specific should be implemented by the product development department using a PMS and provided by the product server.
3.7 Conclusion
To cope with today’s challenges IT must provide a flexible, nimble architecture that not only offers tangible benefits today but also adapts to whatever tomorrow brings. A product server based business application architecture offers many tangible benefits:
• A well-integrated PMS is a considerably faster tool for developing products than existing methods.
• The integration means much less involvement by IT staff to take products from concept to rollout.
• This architecture is well suited to a smart consolidation approach when a carrier wants to migrate several contract administration systems into one, because the entire product management is vital to an FS company and has to be consolidated first.
• After a successful consolidation of the product management, the company can start to renovate or replace other components in its architecture and faces far fewer risks than before, because there is one platform for all new and existing products.
• One platform for all products and their related services is a core element of a straight-through-processing strategy that, in combination with a consolidated administration environment, offers enormous potential cost savings, since many carriers currently rely on multiple systems for their broad array of products and for core processes that have to be automated.

There is no other way to process standard business with the speed and quality that clients expect while satisfying shareholders who expect an increase in productivity, revenue and profit on the one side and a proportional decrease in costs on the other.
References
Bozovic A, Jakubik M (2006) Time to Say Goodbye. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1059
Cyphert M in Hyle RR (2006) Taking the Big Step. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1061
Gruhn V (2005) Agieren statt reagieren. vb-versicherungsbetriebe 4/2005, 20–21
Lier M (2005) Dies und das aus der elektronischen Versicherungswelt. Die IT-Budgets regiert der Rotstift. Versicherungswirtschaft 7/2005, 480–487
Hahn C-P (2005) Metadaten, Application Understanding und Software Migration, Gesellschaft für Informatik e. V. Softwaretechnik Trends Band 25 Heft 4, ISSN 0720-8928, 28–29
Harris-Ferrante K in Hyle RR (2006) Taking the Big Step. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1061
Hersh C (2006) Power Toolbox. Retrieved January 5, 2006 from http://www.nationalunderwriter.com/tech/news/viewFeatures.asp?articleID=1060
Openmdx.org: openMDX introduction: Version 1.0. http://www.openmdx.org/documents/introduction/openMDX_Introduction.pdf
Riemer O (2005) Vorsorge Lebensversicherung AG: Die Versicherungsfabrik. I.VW Management-Information 4/2005, 14–19
Traeger M (2004) Automatisierung/Industrialisierung im Schadenmanagement. In: El Hage B, Jara M (eds): Schadenmanagement. Grundlagen, Methoden und Instrumente, Praktische Erfahrungen. I.VW Management-Information 6/2003, 94–112
Watzlawick P (1983) Anleitung zum Unglücklichsein. Piper, München
Wikipedia contributors (2006) Unified Modeling Language. Wikipedia, The Free Encyclopedia. Retrieved February 25, 2006 from http://en.wikipedia.org/w/index.php?title=Unified_Modeling_Language&oldid=41083251
CHAPTER 4 Management of Security Risks – A Controlling Model for Banking Companies Ulrich Faisst, Oliver Prokein
4.1 Introduction
The increasing virtualization of business processes and the growing adoption of the ICT involved increase the importance of security risks. The “Electronic Commerce Enquête IV” inquiry carried out in August 2004 concluded that the majority of German banking companies plan to increase their investments in ICT within the next two years (Sackmann and Strüker 2005). However, the rising deployment of ICT implies growing security risks. To lower security risks, banking companies may invest in technical security mechanisms and insurance policies. Generally, the more a banking company invests in technical security mechanisms and insurance policies, the lower are the expected losses and the opportunity costs of the economic capital charge, and vice versa (Faisst 2004). Overall, a trade-off exists between the expected losses and the opportunity costs of the economic capital charge on the one hand and the investments in technical security mechanisms and insurance policies on the other.

In practice, such investment decisions depend on explicit responsibilities within a banking company. Where an explicit responsibility exists, the decision-maker might tend to make every possible investment within his budget, although from a holistic point of view not every investment is profitable. If no explicit responsibility exists, the decision-maker might tend to minimize costs and therefore neglect further investments, although such investments are profitable from a holistic point of view.

This contribution¹ aims at developing an optimization model that is able to map the described trade-off between the expected losses and the opportunity costs of the economic capital charge on the one hand and the investments in security mechanisms and insurance policies on the other in a decision calculation. Moreover, the model helps to allocate available budgets to security mechanisms (ex-ante prevention) and to insurance policies (ex-post risk transfer) in an efficient way. In order to present the state of the art of the management of security risks, we will portray the risk management cycle.

¹ This contribution is an enhanced and revised version of the following already published contribution: Faisst U, Prokein O (2005) An Optimization Model for the Management of Security Risks in Banking Companies. In: Müller G, Lin K-J (eds.) Proceedings of the Seventh IEEE International Conference on E-Commerce Technology CEC 2005, Munich, IEEE Computer Society Press, Los Alamitos, CA, pp. 266–273.
4.2 The Risk Management Cycle
The following risk management cycle in Figure 4.1 illustrates the main activities to handle security risks. The cycle contains four phases: identification, quantification, controlling and monitoring (Piaz 2001).
Figure 4.1. The risk management cycle
4.2.1 Identification Phase
Within the scope of the identification phase the security risks are identified and classified. Security in ICT covers a wide range, from the physical protection of the hardware to the protection of personal data against deliberate attacks (Müller et al. 2003). In an open information system like the Internet, one cannot assume that all parties involved (such as communication partners, service providers etc.) trust or even know each other (Müller and Rannenberg 1999). Therefore, the analysis of security risks requires not only the observation of external attackers but also the inclusion of all parties involved as potential attackers. The concept of multilateral security (Rannenberg 1998) considers the security requirements of all parties involved. The security risks result from threats to the four so-called protection goals of multilateral security according to Müller and Rannenberg (1999):
• Confidentiality: Firm-specific data or personal data of individual users must not become known to unauthorized users:
  - Message contents have to be confidential with respect to all parties apart from the communication partner,
  - communication partners (sender and/or recipients) have to be able to remain anonymous to each other, and it should not be possible for other parties to observe them,
  - neither potential communication partners nor other parties should be able to determine the current location of a mobile terminal or its user without his consent.
• Integrity: A change of messages by unauthorized people must not go unnoticed, i. e. manipulation of messages must be detected.
• Accountability: Accountability of communication transactions is indispensable for the required internalization of actions. Otherwise, responsibilities and liabilities, efficient incentive schemes, and the definition and transfer of property rights cannot be established:
  - The recipient of a message should be able to prove to a third party that a certain communication partner has sent the message,
  - the sender of a message should be able to prove and verify the sending of a message together with its actual content. Beyond that, it should be possible for the sender to prove that the message was received,
  - the payment for the used services cannot be withheld from the provider; at least the provider receives sufficient evidence that a service was requested. Conversely, the service provider can only require payment for services that are correctly provided.
• Availability: A restricted availability can lead to material or immaterial impacts, i. e. the communication network has to enable communication between all partners who require it.
The concept of multilateral security is one possible concept for defining security risks. Moreover, to quantify the security risks it is necessary to consider the attacks that threaten and violate the four protection goals. According to the CSI/FBI Computer Crime and Security Survey (CSI/FBI 2005), viruses, unauthorized access and the theft of proprietary information are the most severe attacks. The following table illustrates the protection goals of multilateral security, selected attacks and potential economic impacts. The probability of loss occurrence arises from the observed attacks, the amount of losses from the economic impacts:

Table 4.1. Economic impacts of attacks

| Protection goal | Selected attacks | Potential economic impact |
|---|---|---|
| Confidentiality | Theft of data, access misuse etc. | Loss of competitive advantage, third-party liability claims, penalties etc. |
| Integrity | Sabotage, man-in-the-middle attack, computer bug etc. | Loss of data, business interruption, sales shortfall etc. |
| Accountability | IP spoofing, social hacking, inadequate access control etc. | Loss of image, business interruption, third-party liability claims etc. |
| Availability | Distributed denial-of-service attack, computer virus, server failure etc. | Loss of recovery, loss of market share etc. |
The threats can be classified into different categories. The German “Bundesamt für Sicherheit in der Informationstechnik (BSI)” distinguishes between the categories act of nature beyond control (e. g. fire), organizational deficiencies (e. g. insufficient maintenance), human error (e. g. incorrect data input), technical failure (e. g. server failure) and deliberate act (e. g. theft of hardware) (BSI 2004). According to Basel II, banks have to hold capital for operational risk. Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events. This definition includes legal risk, but excludes strategic and reputational risk (Basel Committee on Banking Supervision 2001). Security risks are a subset of operational risks. Figure 4.2 compares the loss event type classification according to Basel II with the classification according to BSI and shows that security risks are included in a large number of categories of operational risk.
[Figure: mapping between the loss event type classification according to Basel II (internal fraud; external fraud; employment practices and workplace safety; clients, products & business practices; damage to physical assets; business disruption and system failures; execution, delivery & process management) and the loss event type classification according to BSI (act of nature beyond control; organizational deficiencies; human error; technical failure; deliberate act). Examples shown include fire, insufficient maintenance, theft of hardware, careless erasure of objects, computer virus, theft of data, insufficient net capacity, misuse of confidential information, server switch-off in going concern, breakdown of electric power supply, terror attack, negligent demolition of hardware, incorrect data input and server failure.]
Figure 4.2. Security risks as a subset of operational risks
4.2.2 Quantification Phase
The identified security risks are measured by the use of different methods within the quantification phase (Cruz 2002). So far, no quantification model has been developed specifically for the measurement of the security risks defined above. For the measurement of operational risk, of which security risks are a subset, the Basel Committee on Banking Supervision (2004) suggests five different quantification methods in order to determine the economic capital charge. The methods range from simple, factor-based approaches to complex stochastic loss distribution models based on the Value-at-Risk (Basel Committee on Banking Supervision 2004). Beyond that, further methods exist for the quantification of operational risks, for instance questioning techniques or causal methods such as Bayesian Belief Networks (Faisst and Kovacs 2003). Selected methods are described in more detail in the following.

4.2.2.1 Questioning Techniques

Questioning techniques subsume methods like expert interviews and self-assessments by responsible managers. Operational risks are identified and quantified by using structured interview guidelines as well as management workshops. Besides the identification and quantification of operational risks, questioning techniques are used to raise the awareness of operational risk within the company.
4.2.2.2 Indicator Approaches

Indicator approaches use a specific indicator (single indicator approaches) or a set of key indicators (key indicator approaches) to determine the amount of operational risk indirectly. Such indicators are selected on the basis of empirical surveys as well as expert opinions, provided that a relationship to operational risk can be assumed. An example of a single indicator approach is the Basic Indicator Approach (Basel Committee on Banking Supervision 2001), in which the gross income of a banking company is used as an exposure indicator and multiplied by a factor to determine the economic capital charge. Key indicator approaches consider a set of specific indicators. These comprise key indicators for historic losses and company-specific risk indicators, such as down-time of systems, employee loyalty or the number of transactions. Such indicators may finally be gathered in a scorecard for operational risk.
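As an illustration of the single indicator idea, the sketch below computes a capital charge as a fixed factor times gross income. The default factor `alpha = 0.15` and the averaging over the years with positive gross income follow the final Basel II framework rather than the 2001 consultative document cited above; the function name and the income figures are invented for this example.

```python
def basic_indicator_charge(gross_income_by_year, alpha=0.15):
    """Capital charge = alpha * average of the positive annual gross incomes.

    Years with zero or negative gross income are excluded from both the
    numerator and the denominator.
    """
    positive = [gi for gi in gross_income_by_year if gi > 0]
    if not positive:
        return 0.0
    return alpha * sum(positive) / len(positive)


# Invented figures (in million EUR) for the last three years.
charge = basic_indicator_charge([120.0, -10.0, 80.0])
```

The simplicity is the point of the approach: no loss data are needed, but the charge does not react to the security level actually achieved by the bank.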
4.2.2.3 Stochastic Methods

Stochastic methods use distribution functions to describe the level of operational risk. One of these methods is the operational Value-at-Risk. This approach is based on the general Value-at-Risk approach, which was originally developed for market risks (Beeck and Kaiser 2000). The frequency and severity of events are normally forecast by running simulations based on historical loss data.
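A simulation of this kind can be sketched as follows. The choice of a Poisson frequency, a lognormal severity, the 99.9 % confidence level and all parameter values are assumptions made for illustration; in practice they would be fitted to historical loss data. The Poisson sampler uses Knuth's multiplication method so that only the standard library is needed.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw a Poisson-distributed count (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_annual_losses(lam, mu, sigma, n_years=20_000, seed=42):
    """Simulate aggregate annual losses: Poisson frequency, lognormal severity."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_years):
        n_events = poisson_sample(rng, lam)
        losses.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return losses


def value_at_risk(losses, confidence=0.999):
    """Empirical quantile of the simulated loss distribution."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]


# Invented parameters: 12 loss events per year on average, lognormal severities.
losses = simulate_annual_losses(lam=12.0, mu=9.0, sigma=1.2)
expected_loss = sum(losses) / len(losses)
op_var = value_at_risk(losses)
unexpected_loss = op_var - expected_loss
```

The difference between the quantile and the mean, here `unexpected_loss`, is the part of the loss distribution that a capital charge is usually meant to cover.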
4.2.2.4 Causal Methods

Causal methods are used to analyse the relationship between the sources and drivers of operational risk and the resulting losses. One such method is the Bayesian Belief Network. Bayesian Belief Networks can be used to connect historical data on past events on the one hand with expert opinions on future events on the other hand.

Table 4.2 classifies the described quantification methods. Risk managers at banking companies face the problem of selecting and implementing approaches to quantify operational risk. Single indicator approaches appear to be suitable only for small banking companies, for which the usage of more advanced approaches is too costly. Large banking companies have usually implemented stochastic methods, such as the operational Value-at-Risk, in connection with questioning techniques. This helps banking companies to determine the level of operational risk on the basis of a large number of historic data and expert opinions. Further development of quantification methods for operational risk is still required. An integrated and consistent method is needed to quantify operational risk for the banking company as a whole and for its business units. The creation of interfaces to other methods and risk types as well as a consistent aggregation of risk types is still a challenge.
Table 4.2. Overview of quantification methods²

² Basel II approaches are based on indicator approaches (such as the Basic Indicator Approach, the Standardized Approach, the Internal Measurement Approach or the Scorecard Approach) as well as a stochastic method (the Loss Distribution Approach based on the operational Value-at-Risk).

4.2.3 Controlling Phase

Based on the identified and quantified operational risks, decisions on carrying, decreasing, avoiding and transferring the security risks are made within the controlling phase. There are internal and external controlling instruments for operational risks:
• Internal controlling instruments are focused on the sources and drivers of operational risk and are used to prevent loss events caused by operational risks.
• External controlling instruments aim at transferring operational risks out of the company. External parties carry potential losses caused by operational risks. Such instruments are insurance policies, outsourcing of processes or systems and alternative risk transfer instruments, such as Operational Risk Linked Bonds.
4.2.4 Monitoring Phase
The monitoring phase encompasses all procedures and techniques that are necessary for a continuous monitoring of operational risk. It is analyzed whether
• all the events that occurred had previously been identified as possible events,
• the distribution of probabilities of occurrence of events and the distribution of severities of losses were anticipated within the quantification phase,
• the selected controlling measures have led to the desired results.

To summarize the state of the art in each of the four presented phases of Operational Risk Management, Table 4.3 provides an overview of research questions in selected contributions:

Table 4.3. Overview of phases and research in selected contributions

| Phase | Research questions | Method | Source |
|---|---|---|---|
| Identification | Which methods can be used to identify operational risks? How can identified operational risks be categorised? How can sources and drivers of operational risks be analysed? | Descriptive analysis | (Basel Committee on Banking Supervision 2001; Brink 2000; Cruz 2002; Eller et al. 2002; Jörg 2002; Faisst and Kovacs 2003; Lokarek-Junge and Hengmith 2003; Marshall 2001; Piaz 2001) |
| | | Case study (N = 20 banking companies) | (Hoffman 2002) |
| Quantification | Which methods can be used to quantify operational risks? How can operational risks be aggregated? How can rare events with large severity be quantified? | Descriptive analysis | (Brink 2000; Buhr 2000; Cruz 2002; Eller et al. 2002; Faisst and Kovacs 2003; Jörg 2002; Marshall 2001; Piaz 2001) |
| | | Case study (N = 20 banking companies) | (Hoffman 2002) |
| | Which quantification method is suitable for which implementation area? | Descriptive analysis | (Faisst and Kovacs 2003) |
| | Which amount of losses has been caused by operational risk (based on historical data)? | Empirical study (N = 89 banking companies) | (Basel Committee on Banking Supervision 2004) |
| | | Simulation model | (Beeck and Kaiser 2000) |
| | How much does a business process contribute to the aggregated operational risk? | Simulation model | (Ebnöther et al. 2001) |
| Controlling | Which instruments can be used to control the level of operational risk? Which impact do these controlling instruments have on the frequency and severity of events caused by operational risk? | Descriptive analysis | (Brink 2000; Cruz 2002; Eller et al. 2002; Jörg 2002; Lokarek-Junge and Hengmith 2003; Marshall 2001; Piaz 2001) |
| | | Case study | (Spahr 2001) |
| | | Case study (N = 20 banking companies) | (Hoffman 2002) |
| Monitoring | Which procedures and methods should be implemented to monitor operational risk? Which requirements have to be considered to monitor operational risk in the most relevant business processes? | Descriptive analysis | (Basel Committee on Banking Supervision 2001; Basel Committee on Banking Supervision 2004; Brink 2000; Cruz 2002; Eller et al. 2002; Lokarek-Junge and Hengmith 2003; Marshall 2001; Piaz 2001) |
| | | Case study (N = 20 banking companies) | (Hoffman 2002) |
This contribution focuses on the controlling phase and the steering of security risks by implementing security mechanisms to prevent loss events (ex-ante) and by using insurance policies to transfer parts of the losses in case of an event (ex-post). It analyses which combinations of ex-ante and ex-post controlling measures lead to an efficient solution.
4.3 A Controlling Model for Security Risks
The model aims to solve the described trade-off between the expected losses and the opportunity costs of the economic capital charge on the one hand and the investments in security mechanisms and insurance policies on the other. In doing so, the amount to be invested in security mechanisms and insurance policies is optimized.
4.3.1 Assumptions
The time horizon accounts for a single period. 4.3.1.1 Assumption 1: Independence of a Single Information System A single information system is regarded, neglecting possible dependencies to other information systems. 4.3.1.2 Assumption 2: Relevant Cashflow Parameters The expected total negative cashflow μ of the information system is composed of the items expected losses due to security risks E(L), the opportunity costs of the economical capital charge OCC, the investments in security mechanisms ISM and the investments in insurance policies IIns. μ = E(L) + OCC + ISM + I Ins
(4.1)
The cashflow items E(L), OCC, ISM and IIns are estimated ex-ante. 4.3.1.3 Assumption 2a: Expected Losses The expected loss E(L) arises as a result of multiplying the expected frequency of attempted attacks λ by the expected loss given events LGE (with LGE > 0). For simplicity, we assume constant LGE. We also assume, that investments in security mechanisms can reduce the expected frequency of attempted attacks λ ex-ante by the so called security level (SL = 1−a). The security level represents the percentage of prevented attacks through the implementation of technical security mechanisms. In order to make allowance for the impacts of these
mechanisms, the expected frequency of attempted attacks λ is multiplied by the factor a (where 0 < a ≤ 1), which represents the percentage of successful attacks. We further assume that investments in insurance policies can reduce the amount of losses LGE by the so-called insurance level (IL = 1 − b). The insurance level represents the percentage of the loss given events that is transferred, i.e. insured. In order to make allowance for the impacts of insurance policies, the expected loss given events LGE is multiplied by the factor b (where 0 < b ≤ 1), which represents the percentage of the loss given events LGE that is not insured. Thus, the expected losses E(L) are:

$$ E(L) = E\left[ (a \cdot N) \cdot (b \cdot LGE) \right] = (a \cdot E(N)) \cdot (b \cdot LGE) = (a \cdot \lambda) \cdot (b \cdot LGE) \tag{4.2} $$

with:
E(N) = λ: expected frequency of attempted attacks N,
a: percentage of successful attacks,
LGE: expected loss given events,
b: percentage of not insured loss given events.

We assume that the frequency of successful attacks Q is Poisson distributed. For a Poisson distribution, the variance equals the expected value. The variance σ_Q² is therefore:

$$ \sigma_Q^2 = E(Q) = a \cdot \lambda \tag{4.3} $$

For constant LGE, the standard deviation of losses σ_L is given by:

$$ \sigma_L = \sqrt{a \cdot \lambda} \cdot (b \cdot LGE) \tag{4.4} $$
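The two loss moments in (4.2) and (4.4) can be evaluated directly. The following is a minimal sketch; the function names and the input values are illustrative assumptions and are not taken from the chapter.

```python
import math

def expected_loss(a, lam, b, lge):
    """E(L) = (a * lambda) * (b * LGE), equation (4.2)."""
    return (a * lam) * (b * lge)

def loss_std(a, lam, b, lge):
    """sigma_L = sqrt(a * lambda) * (b * LGE), equation (4.4).

    Relies on the Poisson assumption (variance of the number of
    successful attacks equals a * lambda) and on a constant LGE.
    """
    return math.sqrt(a * lam) * (b * lge)

# Hypothetical inputs: half of all attacks succeed, 20 % of each loss is retained.
a, lam, b, lge = 0.5, 0.3, 0.2, 1000.0
el = expected_loss(a, lam, b, lge)
sd = loss_std(a, lam, b, lge)
print(f"E(L) = {el:.2f}, sigma_L = {sd:.2f}")
```

Because the frequency is Poisson distributed, the standard deviation grows only with the square root of a · λ, while the expected loss grows linearly in it.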
4.3.1.4 Assumption 2b: Opportunity Costs of Economical Capital Charge

Under assumption 2a (Poisson distributed frequency of loss events and constant LGE), the economical capital charge ECC can be determined as follows:

$$ ECC = \gamma \cdot E(L) \tag{4.5} $$

The economical capital charge ECC is obtained by multiplying E(L) by the so-called gamma-factor γ (footnote 3). With an opportunity interest rate r, the opportunity costs of the economical capital charge OCC are given by:

$$ OCC = r \cdot ECC = r \cdot \gamma \cdot E(L) \tag{4.6} $$

Footnote 3: The gamma-factor translates the estimate of expected losses into an estimate for the unexpected losses to be covered by capital charge (Basel Committee on Banking Supervision 2001).
Ulrich Faisst, Oliver Prokein
The opportunity costs of the economical capital charge OCC are deterministic. Their standard deviation σ_OCC is therefore:

$$ \sigma_{OCC} = 0 \tag{4.7} $$
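Equations (4.5) and (4.6) amount to two multiplications. As a hedged sketch, the snippet below plugs in r = 0.1 and γ = 7.3 together with the expected loss E(L) = 13.18 that Example 4.1 reports; the function names are hypothetical.

```python
def economic_capital_charge(gamma, exp_loss):
    """ECC = gamma * E(L), equation (4.5)."""
    return gamma * exp_loss

def opportunity_costs(r, gamma, exp_loss):
    """OCC = r * ECC = r * gamma * E(L), equation (4.6)."""
    return r * economic_capital_charge(gamma, exp_loss)

# Values as reported in Example 4.1 of the chapter.
ecc = economic_capital_charge(gamma=7.3, exp_loss=13.18)
occ = opportunity_costs(r=0.1, gamma=7.3, exp_loss=13.18)
print(f"ECC = {ecc:.2f}, OCC = {occ:.2f}")
```

The result for OCC agrees with the value of 9.62 stated in Example 4.1.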
4.3.1.5 Assumption 2c: Investments in Security Mechanisms

According to assumption 2a, increasing investments in security mechanisms I_SM imply decreasing expected losses E(L), and vice versa. We assume that the frequency of attempted attacks λ can be reduced ex-ante by implementing security mechanisms:

$$ I_{SM} = \frac{\lambda \cdot LGE}{\left[ a \cdot \lambda \cdot LGE \right]^{\beta}} - 1 \tag{4.8} $$

with: I_SM = 0 for a = β = 1; I_SM > 0 for 0 < a < 1 and 0 < β < 1.

According to assumption 2a and Equation (4.8), the investments in security mechanisms I_SM are inversely related to the percentage of successful attacks a for a constant calibration factor β: I_SM + 1 is proportional to a^(−β). The calibration factor determines the sensitivity of this relationship (where 0 < β ≤ 1). Like the opportunity costs of the economical capital charge, the investments in security mechanisms I_SM are deterministic. Their standard deviation σ_SM is therefore:

$$ \sigma_{SM} = 0 \tag{4.9} $$
4.3.1.6 Assumption 2d: Investments in Insurance Policies

According to assumption 2a, increasing investments in insurance policies I_Ins imply decreasing E(L), and vice versa. If the security risks fulfil the criteria of insurability, banking companies are able to reduce the extent of damage ex-post by investing in insurance policies:

$$ I_{Ins} = \frac{\lambda \cdot LGE}{\left[ b \cdot \lambda \cdot LGE \right]^{\delta}} - 1 \tag{4.10} $$

with: I_Ins = 0 for b = δ = 1; I_Ins > 0 for 0 < b < 1 and 0 < δ < 1.

Analogous to the investments in security mechanisms, we assume an inverse relationship between the investments in insurance policies I_Ins and the percentage of not insured loss given events b for a constant calibration factor δ: I_Ins + 1 is proportional to b^(−δ). The calibration factor determines the sensitivity of this relationship (where 0 < δ ≤ 1).
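The two investment functions (4.8) and (4.10) share the same shape. The sketch below implements both and evaluates them at the rounded factors a = 0.42 and b = 0.1 from Example 4.1; the function names are assumptions made for illustration, and results computed from rounded factors will differ slightly from the figures printed in the example.

```python
def invest_security(a, lam, lge, beta):
    """I_SM = lambda*LGE / (a*lambda*LGE)**beta - 1, equation (4.8)."""
    m = lam * lge
    return m / (a * m) ** beta - 1.0

def invest_insurance(b, lam, lge, delta):
    """I_Ins = lambda*LGE / (b*lambda*LGE)**delta - 1, equation (4.10)."""
    m = lam * lge
    return m / (b * m) ** delta - 1.0

# Parameters of Example 4.1 with the rounded factors a = 0.42 and b = 0.1.
i_sm = invest_security(a=0.42, lam=0.3, lge=1000.0, beta=0.2)
i_ins = invest_insurance(b=0.1, lam=0.3, lge=1000.0, delta=0.6)
print(f"I_SM = {i_sm:.2f}, I_Ins = {i_ins:.2f}")

# Boundary case stated in the text: no investment for a = beta = 1.
print(invest_security(a=1.0, lam=0.3, lge=1000.0, beta=1.0))
```

Lowering a (or b) raises the required investment, which is the trade-off against the expected losses that the optimization in Section 4.3.2 balances.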
4 Management of Security Risks – A Controlling Model for Banking Companies
The investments in insurance policies I_Ins are deterministic, and their standard deviation is given by:

$$ \sigma_{Ins} = 0 \tag{4.11} $$

4.3.1.7 Assumption 3: Solution Space with Continuous σ and Its Transformation on μ(σ)

We assume that every σ ∈ (0, ∞) exists and that the corresponding cashflows can be mapped through the continuous function μ(σ) (footnote 4). Only one σ can be realized; combinations are not possible.
4.3.2
Determining the Optimal Security and Insurance Level
In order to determine the optimal security and insurance level (SL*, IL*) and the corresponding optimal amounts to be invested in technical security mechanisms I*_SM and insurance policies I*_Ins, we assume a risk-neutral decision maker who aims to minimize the expected total negative cashflow μ. The expected total negative cashflow is obtained by substituting (4.2), (4.6), (4.8) and (4.10) into (4.1):

$$ \mu = (1 + r \cdot \gamma) \cdot (a \cdot \lambda) \cdot (b \cdot LGE) + \frac{\lambda \cdot LGE}{(a \cdot \lambda \cdot LGE)^{\beta}} + \frac{\lambda \cdot LGE}{(b \cdot \lambda \cdot LGE)^{\delta}} - 2 \tag{4.12} $$
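The derivative of equation (4.12) with respect to a is given by:

$$ \frac{\partial \mu}{\partial a} = \lambda \cdot (1 + r \cdot \gamma) \cdot (b \cdot LGE) - \beta \cdot \frac{\lambda \cdot LGE}{(\lambda \cdot LGE)^{\beta} \cdot a^{\beta + 1}} \tag{4.13} $$

Equation (4.13), set equal to zero, is the necessary condition for a minimum of the expected total negative cashflow; together with (4.14), the sufficient conditions are fulfilled as well. Setting (4.13) to zero and solving for a leads to:

$$ a = \left[ \frac{\beta}{(1 + r \cdot \gamma) \cdot b \cdot (\lambda \cdot LGE)^{\beta}} \right]^{\frac{1}{1 + \beta}} $$

Footnote 4: Thus, it is assumed that any value of σ can be obtained. In reality, only a finite number of discrete values of σ exist. A further simplification is the assumption that a continuous function μ(σ) exists for every σ.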
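The corresponding expression for the variable b is obtained analogously from the derivative of (4.12) with respect to b and is given by:

$$ b = \left[ \frac{\delta}{(1 + r \cdot \gamma) \cdot a \cdot (\lambda \cdot LGE)^{\delta}} \right]^{\frac{1}{1 + \delta}} $$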
The minimum of the expected total negative cashflow μ*(a*, b*) is obtained by substituting equations (4.17) and (4.18) into equation (4.12). In doing so, it is also possible to determine the optimal amounts to be invested in security mechanisms I*_SM and insurance policies I*_Ins:

$$ I^{*}_{SM} = \frac{\lambda \cdot LGE}{\left[ (\lambda \cdot LGE)^{\frac{2 \cdot \delta}{(1+\delta) \cdot \beta + \delta}} \cdot \left( \frac{\beta^{(1+\delta)}}{\delta \cdot (1 + r \cdot \gamma)^{\delta}} \right)^{\frac{1}{(1+\delta) \cdot \beta + \delta}} \right]^{\beta}} - 1 \tag{4.23} $$

$$ I^{*}_{Ins} = \frac{\lambda \cdot LGE}{\left[ (\lambda \cdot LGE)^{\frac{2 \cdot \beta}{(1+\delta) \cdot \beta + \delta}} \cdot \left( \frac{\delta^{(1+\beta)}}{\beta \cdot (1 + r \cdot \gamma)^{\beta}} \right)^{\frac{1}{(1+\delta) \cdot \beta + \delta}} \right]^{\delta}} - 1 \tag{4.24} $$
Figure 4.3. Minimum of the expected total negative cashflow (footnote 5)
The surface plotted in Figure 4.3 shows all possible μ(a, b) combinations and the corresponding security and insurance levels (SL, IL) for given values of β and δ. There is, however, only one minimum μ*(a*, b*) and therefore only one efficient (SL*, IL*) solution.

Example 4.1: Consider the following instance of the model: β = 0.2; δ = 0.6; r = 0.1; LGE = 1,000; λ = 0.3; γ = 7.3. The optimal security and insurance levels are SL* = 0.58 (with a* = 0.42) and IL* = 0.9 (with b* = 0.1). A banking company would therefore invest I*_SM = 112.99 in technical security mechanisms and I*_Ins = 36.99 in insurance policies. The expected loss is E(L) = 13.18 and the opportunity costs of the economical capital charge are OCC = 9.62. The minimum of the expected total negative cashflow is therefore μ* = 172.78.

Result 4.1: For any given values of (β, δ), only one minimum of the expected total negative cashflow μ*(a*, b*), and hence only one (SL*, IL*) solution, exists.
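The figures of Example 4.1 can be reproduced numerically. The sketch below does not use the chapter's numbered closed-form equations; instead it assumes an interior optimum, takes logarithms of the two first-order conditions of (4.12), and solves the resulting linear system in ln a and ln b. Function and variable names are illustrative assumptions.

```python
import math

def total_cashflow(a, b, beta, delta, r, gamma, lam, lge):
    """Expected total negative cashflow mu(a, b), equation (4.12)."""
    m = lam * lge
    return ((1.0 + r * gamma) * a * b * m
            + m / (a * m) ** beta
            + m / (b * m) ** delta
            - 2.0)

def optimal_factors(beta, delta, r, gamma, lam, lge):
    """Solve d(mu)/da = 0 and d(mu)/db = 0 for (a*, b*).

    In logs the two conditions are linear:
        (1 + beta) * ln a + ln b = c1
        ln a + (1 + delta) * ln b = c2
    Assumes the solution lies inside (0, 1]; this is not checked here.
    """
    m = lam * lge
    cap = 1.0 + r * gamma
    det = (1.0 + delta) * beta + delta
    c1 = math.log(beta) - beta * math.log(m) - math.log(cap)
    c2 = math.log(delta) - delta * math.log(m) - math.log(cap)
    ln_a = ((1.0 + delta) * c1 - c2) / det
    ln_b = ((1.0 + beta) * c2 - c1) / det
    return math.exp(ln_a), math.exp(ln_b)

params = dict(beta=0.2, delta=0.6, r=0.1, gamma=7.3, lam=0.3, lge=1000.0)
a_star, b_star = optimal_factors(**params)
m = params["lam"] * params["lge"]

exp_loss = a_star * b_star * m
occ = params["r"] * params["gamma"] * exp_loss
i_sm = m / (a_star * m) ** params["beta"] - 1.0
i_ins = m / (b_star * m) ** params["delta"] - 1.0
mu_star = total_cashflow(a_star, b_star, **params)

print(f"a* = {a_star:.2f}, b* = {b_star:.2f}")
print(f"E(L) = {exp_loss:.2f}, OCC = {occ:.2f}")
print(f"I_SM = {i_sm:.2f}, I_Ins = {i_ins:.2f}, mu* = {mu_star:.2f}")
```

Under these assumptions the output agrees with the values stated in Example 4.1 up to rounding in the last digit. The determinant of the log-linear system, (1 + δ) · β + δ, is the same expression that appears in the exponents of (4.23) and (4.24).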
4.3.3
Constraints and Their Impacts
In practice, constraints such as a limited economical capital charge and a limited budget affect the controlling of security risks. We analyze the impacts of such constraints below.

Footnote 5: According to assumption 2a, the domains of a, b and SL, IL are a, b ∈ (0, 1] and SL, IL ∈ [0, 1), respectively. The closer SL and IL get to 1, the greater the expected total negative cashflow μ; μ is therefore unbounded. For illustration, the plotted graph shows all (SL, IL) combinations within the domain [0, 0.99].
First, we express the expected total negative cashflow as a function of the standard deviation. Considering (4.3), (4.7), (4.9) and (4.11), the standard deviation of the expected total negative cashflow σ_ETNC is:

$$ \sigma_{ETNC} = \sigma_L = \sqrt{a \cdot \lambda} \cdot (b \cdot LGE) \tag{4.25} $$

Solving for a · λ gives:

$$ a \cdot \lambda = \frac{\sigma_{ETNC}^{2}}{(b \cdot LGE)^{2}} \tag{4.26} $$
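In the following, σ_ETNC is denoted as σ. Substituting (4.26) into (4.2), (4.6), (4.8) and (4.10) gives E(L), OCC, I_SM and I_Ins as functions of σ:

$$ E(L) = \frac{\sigma^{2}}{b \cdot LGE} \tag{4.27} $$

$$ OCC = r \cdot \gamma \cdot \frac{\sigma^{2}}{b \cdot LGE} \tag{4.28} $$

$$ I_{SM} = \frac{\sigma^{2}}{a \cdot b^{2} \cdot LGE \cdot \left( \frac{\sigma^{2}}{b^{2} \cdot LGE} \right)^{\beta}} - 1 \tag{4.29} $$

$$ I_{Ins} = \frac{\sigma^{2}}{a \cdot b^{2} \cdot LGE \cdot \left( \frac{\sigma^{2}}{a \cdot b \cdot LGE} \right)^{\delta}} - 1 \tag{4.30} $$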
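The expected total negative cashflow μ(σ) is obtained from (4.27), (4.28), (4.29) and (4.30):

$$ \mu = \frac{\sigma^{2}}{b \cdot LGE} \cdot \left( 1 + r \cdot \gamma + \frac{1}{a \cdot b} \cdot \left( \left( \frac{b^{2} \cdot LGE}{\sigma^{2}} \right)^{\beta} + \left( \frac{a \cdot b \cdot LGE}{\sigma^{2}} \right)^{\delta} \right) \right) - 2 \tag{4.31} $$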
Equation (4.31) describes μ(σ) as a continuous function of σ. μ(σ) is well defined on the domain σ ∈ (0, ∞) and takes values μ(σ) ∈ (μ*, ∞). Figure 4.4 illustrates the trade-off between the expected losses E(L) and the opportunity costs of the economical capital charge OCC on the one hand and the investments in security mechanisms I_SM and insurance policies I_Ins on the other. The expected total negative cashflow μ(σ) has only one minimum, at μ*(σ*).

Example 4.2: Analogous to Example 4.1, we consider the following instance of the model: β = 0.2; δ = 0.6; r = 0.1; LGE = 1,000; λ = 0.3; γ = 7.3; SL* = 0.58 (with a* = 0.42); IL* = 0.9 (with b* = 0.1) and μ* = 172.78. Substituting these values in (29) leads to an optimal risk level of σ* = 36.17.
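A quick way to check the substitution leading to (4.31) is to confirm numerically that it returns the same value as (4.12) once σ is computed from (4.25). The sketch below does this for an arbitrary, hypothetical pair (a, b); the function names are assumptions.

```python
import math

def mu_from_factors(a, b, beta, delta, r, gamma, lam, lge):
    """mu(a, b) as in equation (4.12)."""
    m = lam * lge
    return ((1.0 + r * gamma) * a * b * m
            + m / (a * m) ** beta
            + m / (b * m) ** delta
            - 2.0)

def sigma_from_factors(a, b, lam, lge):
    """sigma = sqrt(a * lambda) * (b * LGE), equation (4.25)."""
    return math.sqrt(a * lam) * (b * lge)

def mu_from_sigma(sigma, a, b, beta, delta, r, gamma, lge):
    """mu expressed through sigma as in equation (4.31)."""
    s2 = sigma ** 2
    bracket = (1.0 + r * gamma
               + (1.0 / (a * b)) * ((b ** 2 * lge / s2) ** beta
                                    + (a * b * lge / s2) ** delta))
    return s2 / (b * lge) * bracket - 2.0

# Hypothetical factors; the remaining parameters follow Example 4.1.
a, b = 0.5, 0.2
common = dict(beta=0.2, delta=0.6, r=0.1, gamma=7.3, lge=1000.0)
lam = 0.3

sigma = sigma_from_factors(a, b, lam, common["lge"])
mu_direct = mu_from_factors(a, b, lam=lam, **common)
mu_via_sigma = mu_from_sigma(sigma, a, b, **common)
print(f"sigma = {sigma:.2f}, mu (4.12) = {mu_direct:.4f}, mu (4.31) = {mu_via_sigma:.4f}")
```

Note that λ no longer appears explicitly in (4.31); it enters only through σ.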
Figure 4.4. Trade-off between E(L) and OCC on the one hand and I_SM and I_Ins on the other [plot of μ(σ) against σ showing the curves E(L), OCC and I = I_SM + I_Ins, with the minimum μ* at σ*]
As mentioned above, the minimum of the expected total negative cashflow μ*(a*, b*), or the (SL*, IL*) solution respectively, is derived from equation (4.12) in connection with (4.19) and (4.20). The corresponding optimal risk level σ* can be determined from equation (4.31) in conjunction with μ*.

4.3.3.1 Constraint 1: Risk Limits of the Economical Capital Charge
Setting limits on the economical capital charge determines the amount of risk a banking company is prepared to carry. We assume that a banking company defines a limit of the economical capital charge LECC for a single information system. In doing so, the set of feasible solutions is restricted. In order to illustrate the impacts on the expected total negative cashflow, we consider two different limits of the economical capital charge LECCi (i = 1, 2). The two limits are shown in Figure 4.5. LECC1 cuts the expected total negative cashflow curve at (μ1, σ1). In this case, the banking company can still realize the global minimum of the expected total negative cashflow at (μ*, σ*). The limit LECC1 is therefore not fully used (σ* < σ1). In the second case, however, the limit LECC2 cuts the curve at the suboptimal solution (μ2, σ2). The expected total negative cashflow μ2 is greater than μ*, and the corresponding risk σ2 is smaller (σ2 < σ*).

Result 4.2: If the LECC exceeds the optimal solution (μ*, σ*), i.e. the limit lies to the right of σ*, a banking company can still realize the minimal expected total negative cashflow μ*. If, however, the LECC is smaller than σ*, a banking company can only realize suboptimal solutions.
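The effect described in Result 4.2 can be illustrated with a small constrained optimization. The sketch below rests on assumptions that go beyond the text: the limit is expressed as a cap on ECC = γ · E(L) from (4.5) rather than on σ, the optimum is assumed to be interior, and the binding case is solved by minimizing (4.12) along the boundary γ · a · b · λ · LGE = LECC. All names and the two limit values are hypothetical.

```python
import math

P = dict(beta=0.2, delta=0.6, r=0.1, gamma=7.3, lam=0.3, lge=1000.0)

def mu(a, b, p):
    """Expected total negative cashflow, equation (4.12)."""
    m = p["lam"] * p["lge"]
    return ((1.0 + p["r"] * p["gamma"]) * a * b * m
            + m / (a * m) ** p["beta"] + m / (b * m) ** p["delta"] - 2.0)

def unconstrained(p):
    """Interior minimum of (4.12) from the log-linear first-order conditions."""
    m, cap = p["lam"] * p["lge"], 1.0 + p["r"] * p["gamma"]
    det = (1.0 + p["delta"]) * p["beta"] + p["delta"]
    c1 = math.log(p["beta"]) - p["beta"] * math.log(m) - math.log(cap)
    c2 = math.log(p["delta"]) - p["delta"] * math.log(m) - math.log(cap)
    return (math.exp(((1.0 + p["delta"]) * c1 - c2) / det),
            math.exp(((1.0 + p["beta"]) * c2 - c1) / det))

def with_capital_limit(p, lecc):
    """Minimize (4.12) subject to gamma * E(L) <= lecc (assumed form of the limit)."""
    a, b = unconstrained(p)
    m = p["lam"] * p["lge"]
    if p["gamma"] * a * b * m <= lecc:
        return a, b                      # limit not binding: global minimum stays feasible
    e = lecc / p["gamma"]                # expected loss allowed by the binding limit
    # On the boundary b = e / (a * m); the first-order condition in a gives:
    a = (p["beta"] * e ** p["delta"]
         / (p["delta"] * m ** p["beta"])) ** (1.0 / (p["beta"] + p["delta"]))
    return a, e / (a * m)                # no check here that a, b stay within (0, 1]

a_star, b_star = unconstrained(P)
mu_star = mu(a_star, b_star, P)
a_loose, b_loose = with_capital_limit(P, lecc=150.0)   # hypothetical wide limit
a_tight, b_tight = with_capital_limit(P, lecc=50.0)    # hypothetical tight limit
mu_tight = mu(a_tight, b_tight, P)
print(f"mu* = {mu_star:.2f}, mu under tight limit = {mu_tight:.2f}")
```

With the wide limit the global minimum remains feasible; with the tight limit the company is pushed to a lower-risk combination with a higher expected total negative cashflow, which mirrors the two cases LECC1 and LECC2 in the text.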
Figure 4.5. Risk limits of economical capital charge [plot of μ(σ) against σ with the limits LECC2 and LECC1; marked values are μ2, μ1, μ* on the μ axis and σ2, σBL, σ*, σ1 on the σ axis]
Analogous to the limits of the economical capital charge, budget limits restrict the set of feasible solutions. Their impacts are analyzed in the following.

4.3.3.2 Constraint 2: Budget Limits
Budget limits BL limit the payments in business areas or, in our case, in information systems. Budgeting generally considers the expected losses E(L) ex-ante. However, L is a random variable that can exceed the defined budget limit ex-post. The excess amount can be covered through equity capital. Figure 4.6 illustrates the impact of a budget limit BL by way of example. A budget limit BL cuts the expected total negative cashflow curve at the suboptimal solution (μBL, σBL). In this case, a banking company can still realize the optimal solution (μ*, σ*). If, however, the budget limit BL is smaller than the optimal solution (μ*, σ*), i.e. BL < μ*, the information system cannot be operated.

Figure 4.6. Budget limits [plot of μ(σ) against σ with the budget limit BL; marked values are μ2, μ1, BL, μ* on the μ axis and σBL, σ* on the σ axis]

Result 4.3: If the budget limit BL is at least as large as the minimal expected total negative cashflow (BL ≥ μ*), a risk-neutral decision maker will still choose the optimal solution (μ*, σ*).
4.4
Conclusion
The decision model developed here maps, in a common framework, the trade-off between the expected losses and the opportunity costs of the economical capital charge on the one hand and the investments in security mechanisms and insurance policies on the other. The model thereby optimizes the investments in ex-ante and ex-post controlling mechanisms. Only one (a*, b*) combination, or (SL*, IL*) combination respectively, minimizes the expected total negative cashflow. Furthermore, the model shows that the constraints, i.e. limits of the economical capital charge and budget limits, can lead to suboptimal solutions under certain conditions. Normally, a banking company sets its limits centrally for each individual information system. Limits of the economical capital charge and budget limits that are not fully tapped cannot be exchanged. This can lead to inefficiencies from a holistic point of view; exchanging potential that is not fully tapped could yield greater utility. Further research questions arise from the assumptions made:

• The model considers an isolated information system that is assumed to be independent of all other systems. Correlations with other information systems are not considered. Taking correlations into account can lead to different results.
• We further assume that investments in security mechanisms and insurance policies can be mapped within the domain σ ∈ (0, ∞). This follows from the continuity of the function μ(σ). In reality, presumably only discrete action alternatives exist.
• For simplicity, we assumed constant loss given events. If the standard deviation of the loss given events is taken into account, the expected total negative cashflow will be affected. A further research topic is therefore the modeling of the loss given events as random variables.
References

Basel Committee on Banking Supervision (2001) Working Paper on the Regulatory Treatment of Operational Risk. Retrieved December 12, 2005, from www.bis.org/publ/bcbs_wp8.pdf
Basel Committee on Banking Supervision (2004) Internal Convergence of Capital Measurement and Capital Standards. Retrieved December 12, 2005, from http://www.bis.org/publ/bcbs107.pdf
Beeck H, Kaiser T (2000) Quantifizierung von Operational Risk. In: Johanning L, Rudolph B (eds.) Handbuch Risikomanagement. UHLENBRUCH Verlag, Bad Soden/Ts, pp. 633−654
Brink J van den (2000) Operational Risk: Wie Banken das Betriebsrisiko beherrschen. Dissertation, St. Gallen, Switzerland
BSI (2004) IT-Grundschutzhandbuch. Retrieved March 12, 2004, from http://www.bsi.de/gshb/deutsch/download/GSHB2004.pdf
Buhr R (2000) Messung von Betriebsrisiken – ein methodischer Ansatz. Die Bank 40: 202−206
Cruz A (2002) Modeling, Measuring and Hedging Operational Risk. John Wiley & Sons, Chichester
CSI/FBI (2005) Computer Crime and Security Survey 2005. Retrieved April 9, 2005, from http://www.usdoj.gov/criminal/cybercrime/FBI2005.pdf
Ebnöter S, Vanini P, McNeil A, Antolinez-Fehr P (2001) Modelling Operational Risk. Working Paper, ETH Zürich
Eller R, Gruber W, Reif M (2002) Handbuch Operationelle Risiken – Aufsichtsrechtliche Anforderungen, Quantifizierung und Management, Praxisbeispiele. Schäffer-Poeschel Verlag, Stuttgart
Faisst U (2004) Ein Modell zur Steuerung operationeller Risiken in IT-unterstützten Bankprozessen. Banking and Information Technology (BIT) – Sonderheft zur Multikonferenz Wirtschaftsinformatik 2004 in Essen 1: 35−50
Faisst U, Kovacs M (2003) Quantifizierung operationeller Risiken – ein Methodenvergleich. Die Bank 43: 342−349
Gemela J (2001) Financial analysis using Bayesian networks. Applied Stochastic Models in Business and Industry 17: 57−67
Hoffman DG (2002) Managing Operational Risk – 20 Firmwide Best Practice Strategies. John Wiley & Sons, Chichester
Jörg M (2002) Operational Risk – Herausforderung bei der Implementierung von Basel II. Diskussionsbeiträge zur Bankbetriebslehre, Hochschule für Bankwirtschaft, Frankfurt, Germany
Locarek-Junge H, Hengmith L (2003) Management des operationalen Risikos der Informationswirtschaft in Banken. In: Geyer-Schulz A, Taudes A (eds.) Informationswirtschaft: Ein Sektor mit Zukunft – Symposium. Köllen Druck + Verlag, Bonn
Marshall CL (2001) Measuring and Managing Operational Risk in Financial Institutions. John Wiley & Sons, Chichester
Müller G, Eymann T, Kreutzer M (2003) Telematik- und Kommunikationssysteme in der vernetzten Wirtschaft. Oldenbourg, München
Müller G, Rannenberg K (1999) Multilateral Security in Communications. Vol. 3: Technology, Infrastructure, Economy. Addison-Wesley-Longman, New York
Piaz JM (2001) Operational Risk Management bei Banken. Versus-Verlag, Zürich
Rannenberg K (1998) Zertifizierung Mehrseitiger Sicherheit: Kriterien und organisatorische Rahmenbedingungen. Vieweg, Braunschweig, Wiesbaden
Sackmann S, Strüker J (2005) 10 Jahre E-Commerce – Eine stille Revolution in deutschen Unternehmen. Konradin Verlag, Leinfelden
Spahr R (2001) Steuerung operationaler Risiken im Electronic und Investment Banking. Die Bank 41: 660−663
CHAPTER 5
Process-Oriented Systems in Corporate Treasuries: A Case Study from BMW Group
Christian Ullrich, Jan Henkel
5.1
Introduction
Today, most organizations in all sectors of industry, commerce and government are fundamentally dependent on their information systems. In the words of (Rockart 1988): “Information technology has become inextricably intertwined with business.” In industries such as telecommunications, media, entertainment and financial services, where the product is digitized, the existence of an organization depends on the effective application of information technology (IT). The treasury management function of multinational companies provides another interesting example. Treasury management has developed during the last two decades from a simple function of protecting and enhancing cash flows, to a complex one, where all aspects pertaining to modern cash management, foreign exchange management, investment management, exposure management, risk-return analysis, and accounting are handled. The reason for the changing role of corporate treasuries can be found in the changing financial landscape itself. Changes in financial markets over the past decade have been rapid and profound. New computing and telecommunication technologies, along with the removal of legal and regulatory barriers to entry, have led to greater competition among a wider variety of institutions, in broader geographic areas, and across an expanding array of instruments. The result has been an increase in the efficiency with which financial markets channel funds from borrowers to lenders. Moreover, there has been considerable improvement in the ability of market players to adjust their risk-return profiles. Among the key technological innovations are those that have enabled the development of databases critical to pricing and managing the risks of financial instruments. The ability to value risky financial assets laid the foundation for understanding the risks embedded in those assets and managing them on an aggregated, portfolio basis. 
While the value of risk unbundling has been known for decades (Markowitz 1971), the ability to create sophisticated instruments that
could be effective in a dynamic market had to await the last decade’s development of computer and telecommunications technologies. In fact, technological changes have tremendously contributed to the evolution of financial markets, since more people and institutions have a greater number of alternatives for supplying or acquiring funds. In addition, both lenders and borrowers have a much wider range of instruments from which to choose the risk profile of their financial positions and to adjust that profile in response to changes in their own situations or in the overall financial and economic environment (BIS 2005; Bodnar et al. 1999). According to (Greenspan 1999), “… the extraordinary development and expansion of financial derivatives …” has been “by far the most significant event in finance during the past decade”. For example, it has helped to reduce the cost of transferring savings from anywhere around the globe into investments anywhere around the globe. It has also reduced the reliance of particular classes of borrowers and lenders on particular types of financial intermediaries. However, it has also challenged financial and technology service providers to identify the value they add to the intermediation process. Although these changes have created potential for more robust financial markets, in reality the shifts have not been without difficulties. Some significant disturbances in the past decade have challenged market participants’ and regulators’ understanding of risk and revealed considerable weaknesses in risk management tools and practices. For example, in 1997 and 1998, capital flows to a number of Asian countries were disrupted, with severe effects on their economies. In 1998, the global financial system was rocked by the Russian debt default and the troubles of Long-Term Capital Management. 
Those events were followed by the worst excesses of the stock market bubble, including not only the overvaluation of many stocks, but also the lapses in corporate governance as revealed by a range of corporate disasters which have marked that era (see e.g. Benston et al. 2003; Culp and Miller 1994; Mello and Parsons 1994). New technologies and changing market structures require regulatory bodies to review their legal frameworks constantly. In recent years, a range of regulatory requirements (FAS, IFRS, KonTraG, SOX) has been put into practice in order to promote a greater flow of accurate information and thereby enable private participants to make better informed decisions. Today, we observe that companies are not merely trying to fulfil regulatory requirements, but strive to integrate regulatory demands into a corporate value-oriented context. This tendency has become particularly obvious in firms’ financial risk management practices. As creators of value, treasuries seek to progress from pure measurement of financial risk and return to effective decision-support analyses (footnote 1). In addition, they play a pioneering role in passing risk management expertise on to other, core-business-related areas in order to establish enterprise-wide risk management frameworks. In order to fulfil this role, treasuries face the challenge of moving from monitoring and managing accounts locally, regionally or globally to providing detailed analytical reports on the financial position of the company. These new tasks can hardly be accomplished without technology support.

In this chapter, we present the current status of how IT systems support BMW’s treasury activities. We take different views according to the notions of (Zachman 1987) (footnote 2). The contribution is organized in seven sections as follows. In Section 5.2 we give an overview of the technological options available. Section 5.3 presents the major challenges that multinational corporations are confronted with today when optimizing their treasury procedures. In Section 5.4 we portray BMW, its business strategy and its current position on the international market for automobiles and motorcycles (“ballpark view”). In Section 5.5 an “owner’s perspective” of BMW’s treasury process is taken. We describe the role and function of BMW’s treasury department by examining its organizational structure and responsibilities, as well as its processes and the flow of information (“architect’s view”). In Section 5.6, special emphasis is put on the functionality and architecture of BMW’s treasury IT (“designer’s view”). How IT supports the company’s treasury operations is shown within a case study. Section 5.7 gives the conclusion.

Footnote 1: Note that this behaviour is contrary to the theoretical findings of (Modigliani and Miller 1958), which basically state that under the assumption of perfect capital markets, firms need not manage their financial risks since investors can do it for themselves.
5.2
The Technological Environment
The market for treasury IT systems is characterized by a large number of differently positioned system manufacturers, which offer the whole range from simple plug-and-play solutions to complicated integrated systems with a comprehensive range of functionalities. In the following, we survey the technological possibilities that exist. We focus on four key technological options that have emerged to shape the topography of information architectures over the past three decades:

• legacy systems
• best-of-breed solutions
• enterprise resource planning systems (ERP)
• decision-support systems (DSS)

Footnote 2: In 1987, John Zachman published an influential approach to the elements of system development. Instead of representing the process as a series of steps, he organized it around the points of view taken by the various players within a company. These players included (1) the CEO or whoever is setting the agenda and strategy for an organization (“ballpark view”), (2) the business people who run the organization (“owner’s view”), (3) the systems analyst who wants to represent the business in a disciplined form (“architect’s view”), (4) the designer, who applies specific technologies to solve the problems of the business (“designer’s view”), (5) the IT experts who choose a particular language, program listings, database specifications, networks, etc. (“builder’s view”), and finally, (6) the system itself.
A brief look back in history demonstrates that companies were deploying large numbers of internally developed, agglomerated smaller systems isolated by function – so called legacy systems – to support critical business processes. Little thought was given to integrating these applications with each other and existing systems. As a consequence, these resulting information architectures made it difficult for decision-makers to combine and gain insight from information housed in unrelated systems. The delivery of sophisticated treasury management system (TMS) solutions has progressed hand in hand with the availability of the necessary network, server and processor power. Until the mid 1990s, sophisticated risk management was the preserve of banks whose massive IT budgets enabled them to deploy the mainframe computers, which offered the necessary storage capacity and processing power. IT budgets for corporate treasuries were comparatively modest – despite the volume of money which they process and the various risks that this implies. The advent of networked minicomputers in the early 1990s was the first big advance, as it enabled low-cost PCs to share information to complete such processes as the aggregation of different kinds of treasury deals. This provided much of the raw data needed for risk analysis in a timely and convenient way. However, early networked PCs were significantly constrained by disk capacity, network speed, processor power and memory capacity. As the demand for treasury systems grew, it was cheerfully met by programmers who produced more and more automated solutions in order to limit operational risks. Still, success was inhibited by the slowness of the technology on which they had to operate. 
In recent years, these constraints have been eliminated by the exponential improvement in capacity and performance, coupled with the downward pressure on prices, which finally led to the adoption of business-process-segment-specific, best-of-breed TMS in many corporations, especially large multinational ones. This is also reflected in a range of industry surveys which confirm that the majority of industrial corporations today use TMS for supporting treasury processes (Ernst & Young 2005; PricewaterhouseCoopers 2003). Very generally, a TMS can be characterized as a “… complex piece of software which can be used to automate, record, and control many core treasury functions. TMSs, sometimes referred to as ‘Treasury Workstations’ also act as a central database for information flowing in and out of the treasury, …” (Unknown 2006). The TMS vendor business itself has evolved into a fragmented industry in which numerous providers exhibit strength in different regions and among different customer segments, but no competitor is capable of dominating the field globally. Well-recognized TMS providers on a global basis include FXpress, Reuters, SAP, SimCorp, SunGard, Trema and XRT-CERG. While the TMS competitive landscape changes slowly, the shifts that are taking place arise largely from consolidation, as the recent acquisitions of Integrity Treasury by SunGard, Selkirk by Thomson and Richmond by Trema demonstrate. Still, TMS providers have focused on providing internet-based tools that facilitate the extension of transaction processing beyond the central treasury operation, including foreign business subsidiaries (customers) and banks (suppliers). Web browsers on desktops and
cheaper bandwidths create an environment in which the deployment of internet-based tools is a viable option for corporate treasuries. Over the last decade, ERP systems have been widely developed to replace aging legacy systems, as well as to better integrate and automate core business processes, such as manufacturing, logistics, distribution, inventory, quality, human resources, shipping, invoicing, and accounting. Thus, all functional departments that are involved in operations or production are integrated in one system. ERP systems are often called back office systems, indicating that customers and the general public are not directly involved. This is contrasted with front office systems like customer relationship management (CRM) systems, eFinance systems, or supplier relationship management (SRM) systems that deal with companies’ external business partners. ERP systems are cross-functional and enterprise-wide, and are marked by a seamless design that reduces the time spent maintaining and upgrading separate systems and interfaces. Treasury and risk management functionalities are not represented as single systems but as integrated modules. This may remove organizational, geographic and political barriers. For instance, a single-vendor package of enterprise solutions for a range of global transaction processing needs brings the advantages of simplicity, integration and single-contact points for sales and support. Vendors whose packages provide the greatest functional coverage include SAP, Oracle and PeopleSoft. However, although ERP systems are designed to handle heavy transaction-processing loads, they have not been universally esteemed for their performance management features. Reporting and analysis capabilities are pre-built and usually not customizable. The complexity involved in creating new reports or coding changes has accelerated the adoption of third-party products that hide back-end complexity while simplifying reporting and analysis activities.
Furthermore, the difficulty of extracting, combining and analyzing information from multiple disparate sources has led to an explosion of business intelligence tools, such as DSS, in recent years.³ DSS can take many different forms and the term can be used in many different ways (Alter 1980). For example, Finlay (1994) considers a DSS broadly as “a computer-based system that aids the process of decision making.” More precisely, Turban (1995) defines it as “an interactive, flexible, and adaptable computer-based information system, especially developed for supporting the solution of a non-structured management problem for improved decision making. It utilizes data, provides an easy-to-use interface, and allows for the decision maker’s own insights.” At the conceptual level, Power (2002) differentiates model-driven, communication-driven, data-driven, document-driven, and knowledge-driven DSS:
³ According to Keen and Morton (1978), the concept of decision support has evolved from two main areas of research: the theoretical studies of organizational decision making done at the Carnegie Institute of Technology during the late 1950s and early 1960s, and the technical work on interactive computer systems, mainly carried out at the Massachusetts Institute of Technology in the 1960s. It is considered that the concept of decision-support systems became an area of research of its own in the middle of the 1970s, before gaining in intensity during the 1980s.
Christian Ullrich, Jan Henkel
• A model-driven DSS emphasizes access to and manipulation of a statistical, financial, optimization, or simulation model. Model-driven DSS use data and parameters provided by DSS users to aid decision makers in analyzing a situation, but they are not necessarily data intensive.
• A communication-driven DSS supports more than one person working on a shared task.
• A data-driven DSS or data-oriented DSS emphasizes access to and manipulation of a time series of internal company data and, sometimes, external data.
• A document-driven DSS manages, retrieves and manipulates unstructured information in a variety of electronic formats.
• A knowledge-driven DSS provides specialized problem solving expertise stored as facts, rules, procedures, or in similar structures.
Moreover, at the system level, (Power 1997) differentiates enterprise-wide DSS and desktop DSS. Enterprise-wide DSS are linked to large data warehouses and serve many users in a company, whereas desktop single-user DSS are small systems that reside on individual users’ PCs. In the field of treasury, DSS tools are mostly model- and data-driven desktop solutions, which involve the special subjects of financial market prediction, scenario analysis and risk-return optimization. Figure 5.1 characterizes all four system types in terms of their functional breadth (which refers to the system’s functional reach within the company) and their functional depth (which refers to the degree of specialization). Typically, systems with high functional breadth, such as ERP, are marked by low functional depth. Vice versa, systems with high functional depth such as legacy systems or DSS have low functional breadth.
Figure 5.1. Categorization of system solutions [Diagram: functional depth (low to high) plotted against functional breadth (Treasury, Cost Management, HR, Materials Management, Supply Chain Management). Legacy and decision support systems appear at high functional depth, Enterprise Resource Planning (ERP) systems at low functional depth, and best-of-breed systems between them.]
5.3 Challenges for Corporate Treasuries
For treasury operations today, the application and integration of IT systems has become a key success factor in order to meet a range of internal and external challenges (see Figure 5.2):
• enhancing transaction processing
• providing timely, accurate, periodic management information
• complying with business controls and corporate governance
• integrating decision-support mechanisms into the existing IT systems infrastructure
Transaction Processing: Treasury management has developed during the last two decades from a simple function of protecting and enhancing cash flows to a broader one, where all aspects pertaining to modern cash management, foreign exchange management, investment management, risk-return and exposure management and analysis, and accounting are handled. Naturally, these activities are being accompanied by increasing transaction volumes. Since the flow of data is faster than ever, the data itself is more detailed, and it has no regional boundaries, data management is becoming more sophisticated. Consider a global corporate treasury operation that comes with a whole taxonomy of data around it. For instance, if the company has 1 million Euros in an account, that figure is associated with a currency, a legal entity, a bank, a business unit, geographic data, a date, etc. Consider now that the same firm has thousands of accounts with many banks all over the globe. This may result in voluminous financial transactions and hence huge amounts of information across thousands of accounts that have to be sliced, diced, aggregated and segmented in many different ways. The collection of the necessary data from a range of sources within a company in order to compile management and treasury position reports as well as financial forecasts may thus be a very time-consuming process. Consequently, there is a technological need to quickly process financial transactions from initiation to settlement in an automated and secure fashion. According to Figure 5.2, the main challenges that multinational firms are confronted with are still efficiency-related. The main purpose herein is to exploit economies of scale that result from efficient transaction processing. Among system vendors, the term “transaction processing”, understood as an efficient, instantaneous flow of information within a system in order to realize economies of scale, is often referred to as straight-through-processing.
Figure 5.2. The changing role of corporate treasury (Source: PricewaterhouseCoopers) [Chart: value contributions of treasury, on a scale from 0 to 100, today and tomorrow, across four layers: transaction processing (cost savings through economies of scale and IT systems support), management information, business controls/corporate governance (business process reengineering to reflect environmental changes in the organization), and decision-support (search for value enhancement, deployment of expert knowledge).]
Management Information: Often, financial processes lack transparency because the relevant data does not come from one single source, but is distributed over a variety of systems. Missing interfaces make straight-through-processing and exchange of real-time data difficult. As a result, firms encounter contradictory data, with the consequence that questions about the liquidity reserves in a particular country, legal entity, or business unit, or about the financial obligations towards a certain business partner, often cannot be answered. In order to avoid such problems, automated systems are necessary that provide precisely timed access to accurate information, and thus allow for real-time financial performance measurement, risk-return controlling and financial reporting. The degree of straight-through-processing becomes a particular issue during those times when management reporting cycles coincide with important market shifts or roll-over dates and rate fixings.
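To make the idea of slicing and aggregating account data concrete, the following minimal Python sketch attaches the dimensions named above (currency, legal entity, bank, business unit, geography, date) to each balance and sums amounts along any combination of them. All class, field and function names, as well as the sample figures, are hypothetical and chosen purely for illustration; they do not describe any particular treasury system.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Balance:
    """One account balance together with its descriptive dimensions."""
    amount: Decimal
    currency: str
    legal_entity: str
    bank: str
    business_unit: str
    country: str
    value_date: date


def aggregate(balances, *dimensions):
    """Sum balance amounts grouped by the given attribute names.

    Amounts in different currencies are not converted, so "currency"
    should normally be one of the grouping dimensions.
    """
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for balance in balances:
        key = tuple(getattr(balance, name) for name in dimensions)
        totals[key] += balance.amount
    return dict(totals)


# Hypothetical sample data.
SAMPLE = [
    Balance(Decimal("1000000"), "EUR", "Entity A", "Bank X", "Sales", "DE", date(2005, 12, 31)),
    Balance(Decimal("250000"), "EUR", "Entity A", "Bank Y", "Production", "DE", date(2005, 12, 31)),
    Balance(Decimal("400000"), "USD", "Entity B", "Bank X", "Sales", "US", date(2005, 12, 31)),
]

by_currency = aggregate(SAMPLE, "currency")
by_country_and_currency = aggregate(SAMPLE, "country", "currency")
by_bank_and_currency = aggregate(SAMPLE, "bank", "currency")
```

A question such as "what are the liquidity reserves in a particular country?" then reduces to a lookup in `by_country_and_currency`, provided that all balances have first been collected into one consistent structure, which is exactly the difficulty described above.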
In addition, since managers rarely rely on a single source of information, the focus of information systems planning has moved towards the integration of individual systems into coherent sources of management information. This involves information analysis techniques such as data modelling and entity analysis to devise new ways of organizing and delivering information. Moreover, advances in communication capability facilitate new and complex forms of organization embodying distributed network structures and processes.
Business Controls/Corporate Governance: It is nothing new that the treasury function in global corporations also faces a highly competitive and demanding regulatory environment that mandates a seamlessly integrated corporate treasury solution. The recent spate of corporate scandals and failures, including Enron, WorldCom, and Global Crossing in the United States, Daewoo Group in Korea, and HIH in Australia, has raised serious questions about the way public corporations are governed around the world. When managerial self-dealings are excessive and left unchecked, they can have serious negative effects on corporate values and the proper functioning of capital markets. In fact, there is growing consensus around the world that it is important to strengthen corporate governance in order to protect the rights of shareholders, curb managerial excesses, and restore confidence in capital markets.⁴ Recent years have seen a strong push for corporate governance recommendations and regulations, such as Sarbanes-Oxley in the US or IFRS and hedge accounting rules (IAS 39) in Europe. This regulatory demand, coupled with stakeholder pressure for more financial transparency and control over treasury transactions, is making increased tracking and justification of treasury transactions not only desirable but necessary. Since spreadsheet calculations are prone to error due to their low degree of automation, there is now a demand to create financial figures using a controlled, visible and audited process. As a consequence, more and more organizations turn to treasury automation with a system solution to the problems of clarity and accountability in the production of financial figures. A related trend is the demand from corporations for their TMS to not only control and automate treasury processes, but also document them in order to prove that a clear and controlled process exists. As a consequence, treasury technology providers must now include compliance analytics and metrics, such as verification that profit/loss entries have been double-checked, or of payment authorities and approvals.
⁴ The term “corporate governance” is understood here as “the economic, legal and institutional framework, in which corporate control and cash flow rights are distributed among shareholders, managers, and other stakeholders of the company” (Eun and Resnick 2004, p. 472). From the view of regulatory authorities the resulting problem to be solved can be stated as “how to best protect outside investors from expropriation by the controlling insiders so that the former can receive fair returns on their investments” (Eun and Resnick 2004, p. 474). How authorities deal with this problem has enormous implications for shareholder welfare, corporate allocation of resources, financing and valuation, development of capital markets, and economic growth.
Decision-Support: It is anticipated that in the next years there will be a shift in focus from efficiently “running the business” to effectively “managing and optimizing the business” (Ernst & Young 2005; PricewaterhouseCoopers 2003). Accordingly, resources are expected to be redirected from back-office transaction processing to front-office decision-support and performance management related activities. Two important resources that aid users in managerial decision making within an organization are models and data.⁵ Prerequisites for making better decisions faster consist not only in having access to clean and complete data. Greater connectivity and access to increased variety and volume of information generate larger data volumes and also greater informational complexity.
⁵ Note that we view a “model” as a computer-executable procedure that may require data inputs, a view that is widely held in the DSS literature (Gachet 2004; Power 2000).
Since data offer an intrinsic value proposition, it has to be determined what data are relevant for the underlying decision context. Data Mining, also known as Knowledge-Discovery in Databases (KDD), is important for “the nontrivial extraction of implicit, previously unknown, and potentially useful information from data” (Frawley et al. 1992). Furthermore, computational methods from the field of Artificial Intelligence (AI), such as Fuzzy Logic (Nguyen and Walker 2005), Machine Learning (Mitchell 1997) and Evolutionary Computation (Ashlock 2006) can transform data into information, and, at peak performance, into intelligence. As literature on applied informatics has shown, these areas can be helpful for solving decision problems involving corporate financial planning (Pacheco et al. 2000), risk management (Chan et al. 2002), and financial market forecasting (Peramunetilleke and Wong
2002; Ullrich et al. 2005). It is important to note that the opportunities in this operational area of treasury management clearly drive the demand for broader and technically better educated people whose value-orientated assignments should not be blocked by repetitive actions. While manual preparation of relevant information may take a lot of time and lead to decisions that are based on outdated or incomplete data, automated applications with a high degree of interconnection to market and intra-company data sources are required to enable operational units to make quicker and more informed decisions. We observe today that multinational corporations are not only trying to fulfil regulatory requirements, but strive to integrate them in a corporate value-orientated context. The need for consolidating and integrating treasury systems stems from the desire to improve business controls through visible processes and the installation of transparent management information systems. The purpose here is not only to close the functional gaps that naturally exist due to aging legacy systems and heterogeneously organized decision-support solutions. It is also about minimizing departmental deficits in terms of redundant and complicated flows of information. Hence, the ability to gather transactional data from remote locations through automated data feeds to the corporate finance engine without additional interface requirements would certainly represent a significant win. For these reasons, it is quite obvious that a consolidated treasury system landscape can only be realized by companies if they are heading towards single-vendor ERP systems. However, the barriers to streamlining information flows in treasury can be significant. It is important to note here that treasury is not a priority in ERP systems, whose vendors are slow to respond to changing circumstances that impact treasury.
The slow uptake of the new Financial Accounting Standards (FAS) rules on accounting for derivatives and hedges is one such example. The lack of flexibility in interfacing to external information sources, due to the need for strict interfacing rules, inflexible reporting capabilities, and the inability to perform complex treasury analysis are all arguments used to accuse ERP of failing to offer a comprehensive treasury solution. In addition, the increasing demand for value creation requires highly specialized and context-dependent decision-support solutions that supply decision-makers with superior information. For these reasons, the choice of IT systems that suit a company’s business needs is an interesting practical problem that is far from trivial. In the following we will provide insights into the treasury organization, process and system support of BMW Group (“BMW”), a multinational manufacturer of premium automobiles and motorcycles.
5.4 Portrait BMW Group
The BMW Group is one of the ten largest car manufacturers in the world and possesses, with its BMW, MINI and Rolls-Royce brands, three of the strongest premium brands in the car industry. The BMW Group also has a strong market position in the motorcycle sector and operates successfully in the area of financial
services. The BMW Group aims to generate profitable growth and above-average returns by focusing on the premium segments of the international automobile markets. With this in mind, a wide-ranging product and market initiative was launched back in 2001, which has resulted, over the past years, in the BMW Group expanding its product range considerably and strengthening its worldwide market position. The BMW Group will continue in this vein in the coming years. The BMW Group is the only automobile company worldwide to operate with all its brands exclusively in the premium segments of the automobile market, from the small car to the absolute top segment. The BMW Group’s vehicles provide outstanding product substance in terms of aesthetic design, dynamic performance, cutting-edge technology and quality, underlining the company’s leadership in technology and innovation. As an international corporation, the BMW Group currently has 22 production and assembly plants in 12 countries. Within the framework of its sales strategy, the BMW Group has, since the 1970s, been consistently pursuing its objective of having its own sales subsidiaries in all the major markets around the world. Thus in 1981, BMW was the first European manufacturer to establish a sales subsidiary in Japan. Today, the sales network consists of 35 group-owned sales companies and 3,000 dealerships. Approximately 100 more countries are handled by local importers. This means that the BMW Group is represented in over 150 countries on all five continents. In 2005 the BMW brand exceeded its previous sales record for the fifth year in a row since 2001, reaching 1,327,992 units. For the first time, more than 200,000 MINI brand cars were sold in a single year, with the number of cars delivered increasing by 8.7% to 200,428 units, as compared to 2004. The Rolls-Royce brand confirmed its position at the top of the absolute luxury class. 796 Phantoms were delivered to customers in 2005, marginally more than the 792 sold in the previous year.
In the business year 2005, the BMW Group employed approximately 105,800 people. Three quarters of all BMW employees are located in Germany. According to IAS, the BMW Group reported revenues of 46.7 billion Euros and a net profit of 2.2 billion Euros. The BMW ordinary share has been listed since 1926. In 1999 BMW Group introduced the 1 Euro par value share. Concurrently BMW moved to the collective custody account procedure. Physical share certificates are no longer available. As of 31 December 2005, the subscribed capital of BMW AG amounted to 674 million Euros and comprised 622,227,918 ordinary shares and 52,196,162 preferred shares.
5.5 Portrait BMW Group Treasury
There are many dimensions for determining the structure, role and function of a treasury organization. Generally, corporate treasury practices can vary according to size, geographical reach, and process complexity. At one end there are treasuries that focus on managing daily cash flows and optimizing working capital through process excellence. At the other end of the spectrum, there are treasuries
that are strategic to the business and whose objective is to support the financial performance of the company. Clearly any organization can exercise its choice on the scope of the treasury functions that it undertakes and where it positions itself between these two extremes. In doing this, it may be governed by a variety of considerations. It may choose to handle only those needs driven by utilitarian motives such as liquidity support or, on the other hand, it may consider treasury as a “core” organizational process and hence handle the full range of services. It may also choose to outsource portions of the activities required or it may choose to foster these capabilities in-house. Independently, an organization can also decide on the extent of centralization of treasury management. For instance, it may be efficient to centralize transaction processing, while the front office may need to be decentralized to aid speedy local decision-making. It may also be important to have a common risk management strategy, while execution may be decentralized. Many large corporations manage their business entities in a decentralized fashion to promote local ownership of business decisions and accountability for results which is in turn favoured by many investors.
5.5.1 Organization and Responsibilities
At BMW, the treasury division is responsible for all of the company’s relations with the capital markets. This includes pension funds management, financial marketing, and corporate finance. The main goals are to minimize the cost of capital to the company, to protect the solvency of the company and of each of its individual subsidiaries, and to safeguard BMW’s financial independence in light of strategic decisions along with the requirements for profitability and value contribution by managing risk-return. The question of appropriately aligning the Treasury within the enterprise organization, whether as a cost, service or profit centre, has a very special meaning with regard to the definition of the individual Treasury functional areas and of course the overall definition of treasury goals and strategy. PricewaterhouseCoopers (2003) reports that the great majority of corporate treasuries are organized as service centre models. Only some companies characterize their treasuries as cost centres and even fewer as profit centres. The main features of a service centre include the management of financial risks and liquidity within narrow guidelines, a restricted scope of action for entering open positions, as well as the idea of serving and supporting operational subsidiaries in order to optimize performance with regard to foreign exchange and liquidity questions. Since conformity with BMW goals requires both efficient processes and positive financial performance as compared to financial market developments, the company may consider its treasury function as a hybrid one. It is organized according to the service centre principle but does not possess all of its characteristics, such as narrow operational guidelines. Instead, the BMW Group Treasury also contains notions of the profit centre concept, especially with regard to its risk management practices, and of the cost centre concept.
Figure 5.3. Vertical perspective: BMW Group Treasury
Figure 5.3 depicts the organizational structure of BMW Group Treasury. In this article, we will not cover the BMW Group Treasury as a whole, but rather focus on those divisions which are responsible for the execution of treasury functions as commonly known. These functions are attributed to the Corporate Finance department along with local Treasury Centres (TC). The purpose of the latter is to align local treasuries with the BMW Group’s treasury strategy and to represent the BMW Group Treasury in the local markets. This allows the local subsidiaries (Sales, Financial Services, and Production companies) to benefit from uniform standards, methods, and expertise. TCs are located in Europe (Belgium, Netherlands, Great Britain and Austria) and the United States. Recently, the company has decided to build an Asia-Pacific TC in Singapore. This step enables BMW Group Treasury to cover the entire eastern hemisphere and to realize economies of scale from optimized processes in treasury management. In spite of the basic functional structure of the organization, a decentralized leadership philosophy prevails at BMW. This leadership philosophy is marked by central guidance coupled with decentralized responsibility and extends through all parts of the BMW organization. Thus, decisions on the overall treasury policy, procedures and guidelines (including strategy, concepts, standards, principles and structuring) are made centrally in Munich. Local TCs are wholly responsible for the way in which the treasury policy and strategy are executed. This helps BMW to ensure a uniform approach to the financial markets. It also enables BMW to offset contrary positions and evaluate executed transactions via the application of a centralized treasury system, which we will refer to below.
Furthermore, the core benefits of decentralization hold: local proximity, operating in the same time zone, knowledge of local culture, regulations and accounting practices, as well as existing opportunities for local support. The fact that BMW has neither a totally centralized nor a totally decentralized treasury is not an unusual phenomenon in German multinational companies. PricewaterhouseCoopers (2003) reports that 62% of the responding companies pursue a similar approach to treasury management. This can be attributed to the fact that, traditionally, expansion into new markets has been accompanied by deployment of local treasury staff to handle funding, investment, liquidity and exposure management in unfamiliar business environments, especially in those where change is rapid and the organization has
significant business and/or financial exposure. At present, financial transactions are conducted exclusively via BMW TCs as long as the BMW Group is not affected by adverse legal, tax or economic effects.
5.5.2 Business Process Design
In this subsection the current status of the operational work and information flows at BMW is examined throughout the treasury process chain. This is important because the functional representation of the Treasury as shown in Figure 5.3 is not valid from a business process and information flow perspective. As depicted in Figure 5.4, the treasury process chain as a whole further affects other parts of Group Finance (namely the Financial Controlling and Financial Reporting divisions) and other divisions of the company (Corporate Sales). This horizontal, cross-functional flow model of a business is better suited for identifying simplifications of processes, standardization of procedures across diverse business units and their integration, both within the organization and with those of external organizations, including customers and suppliers. Intuitively, these aspects provide an important starting-point for IT systems design. An effective separation of duties is ensured by grouping single process activities into process units or process classes. A typical treasury is divided into the following process units: front office, middle office and back office. Due to the decentralization of treasury activities at BMW, desks are separated by market, i.e. there is a front and back office in each of BMW’s financial entities. Middle office tasks are managed centrally. The treasury process starts off at the local operating units, which estimate and adjust their future sales volumes periodically according to the rolling window principle. Based on product- and market-related sales forecasts, important middle office responsibilities include periodic determination of exposures and quantification of risks which, above all, provide the foundation for front office activities as will be described below. Middle office is further responsible for medium- and long-term financial planning, including profit/loss and budget rates.
Trading activities are managed by setting counterparty limits, which represent the maximum exposure to a particular counterparty or group of related counterparties. Limit calculation is based on equity, a rating factor (as delivered by Standard & Poor’s, Moody’s and Fitch) and a country factor. Compliance with counterparty limits is monitored very closely. Non-compliance is not tolerated and repeated non-compliance is a dismissible offence. Exposure limits, which represent maximum exposure to particular market rates, or stop loss limits, which represent the maximum loss that a particular deal or strategy may incur before it must be closed out, are not relevant. Middle office’s second area of responsibility, “risk controlling”, includes benchmarking, performance analysis and reporting, and is aligned to the prevailing risk management objectives of the front office. Performance measurement provides information about the value contribution that treasury has produced according to pre-determined benchmarks and shows, by means of management reports, whether and how strategic requirements and goals have been reached. Hence, middle office represents the process unit that is responsible for covering the beginning and the end of the treasury process chain.
Figure 5.4. Horizontal perspective: BMW Treasury Process Chain [Diagram: process chain running from sales-orientated planning through financial planning, risk measurement and limit management; risk analysis and strategy development; implementation and documentation; confirmation, settlement and reconciliation; and accounting; to performance analysis and reporting. Process units shown: middle office, front office, back office. Organizational areas shown: Controlling, Treasury, Reporting. Entities shown: AG, TC Europe, TC America, TC Asia.]
BMW’s front office is divided into several different desks with each desk focusing on a specific market. With regard to the corporate finance departments, front office desks at BMW include the following: foreign exchange, fixed income, commodities, money markets and capital markets. Consider the risk management process, for instance: when risk models suggest that the trader should get ready to enter the market, hedging strategies are developed, analyzed and finally executed. In the case of foreign exchange, the primary factor that affects the risk manager’s decision in the longer term is macroeconomic data. The results of market research as conducted by banks and independent research institutes, global and domestic events, political and social conditions, fiscal and monetary policy, chart analyses and rumours are only important for the shorter term. Traders at BMW consider these factors and sources of information and translate them into a range of hedging strategies that are designed to match BMW’s overall objectives. The best strategy that is also in line with BMW’s hedging policy is finally chosen. In this context, it is important to note that decisions on financial risk management strategy and tactics are seldom made independently and entirely by the front office itself. Due to the complexity of interacting decision-relevant parameters as well as for reasons of supervision, risk management decisions at BMW are often discussed and made within a committee, which includes not only risk management officers but also financial controlling experts along with corporate finance and treasury executives. Upon execution, the deal has to be documented and finally recorded. Deal records are required for both correct revaluation and settlement.
It should be noted that online trading platforms are not relevant for BMW. Trading is not done continuously, but occasionally and hence it is sufficient to use phone and email as communication media.
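The counterparty limit mechanism described in this subsection for the middle office can be illustrated with a short Python sketch. The text states only that the limit is based on equity, a rating factor and a country factor; the multiplicative combination used here, the factor values and all names are assumptions made for illustration and are not BMW's actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counterparty:
    name: str
    equity: float          # counterparty equity, e.g. in millions of Euros
    rating_factor: float   # hypothetical weight derived from agency ratings
    country_factor: float  # hypothetical weight reflecting country risk


def counterparty_limit(cp: Counterparty) -> float:
    """Maximum permitted exposure; the product form is an assumption."""
    return cp.equity * cp.rating_factor * cp.country_factor


def breaches(exposures: dict[str, float], counterparties: list[Counterparty]) -> list[str]:
    """Return the names of counterparties whose exposure exceeds their limit."""
    limits = {cp.name: counterparty_limit(cp) for cp in counterparties}
    return sorted(name for name, exposure in exposures.items() if exposure > limits[name])


# Hypothetical counterparties and current exposures.
BANKS = [
    Counterparty("Bank A", equity=4000.0, rating_factor=0.25, country_factor=1.0),
    Counterparty("Bank B", equity=2000.0, rating_factor=0.5, country_factor=0.5),
]
EXPOSURES = {"Bank A": 900.0, "Bank B": 650.0}

limit_a = counterparty_limit(BANKS[0])   # 4000 * 0.25 * 1.0
limit_b = counterparty_limit(BANKS[1])   # 2000 * 0.5 * 0.5
violations = breaches(EXPOSURES, BANKS)
```

Because non-compliance is not tolerated, a check of this kind would in practice be run before a deal is executed rather than afterwards; the sketch only shows the comparison itself.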
The nature of the back office’s business can simply be summarized as transaction processing. This involves checking and verifying the accuracy of trades as entered by the front office. Back office further has to ascertain that all deals have been confirmed by the counterparty and that all resulting cash flows from these transactions will be available for cash management, settlement and accounting. Cash management as part of the back office is responsible for reconciling deals with bank statements. Only once forecast cash flows have been reconciled with the bank account are they posted via interface to the accounting system. Standard agreements regarding settlement, payment instructions, confirmations, etc. are set up with bank partners.
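The reconciliation rule just described, under which only reconciled cash flows are passed on to accounting, can be sketched as follows. This is a simplified illustration: the record layout, the exact-match rule and the posting format are assumptions, and in particular real bank statements may not carry a deal identifier.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Cashflow:
    deal_id: str
    value_date: date
    currency: str
    amount: Decimal


def reconcile(forecast, statement):
    """Split forecast cash flows into reconciled items and open items.

    A forecast item counts as reconciled when the bank statement contains
    an identical entry. Exact matching is a simplifying assumption.
    """
    unmatched = list(statement)
    reconciled, open_items = [], []
    for cashflow in forecast:
        if cashflow in unmatched:
            unmatched.remove(cashflow)  # a statement line can match only once
            reconciled.append(cashflow)
        else:
            open_items.append(cashflow)
    return reconciled, open_items


def post_to_accounting(reconciled):
    """Stand-in for the accounting interface: returns the posting lines."""
    return [
        f"{cf.value_date.isoformat()} {cf.deal_id} {cf.currency} {cf.amount}"
        for cf in reconciled
    ]


# Hypothetical forecast and bank statement.
FORECAST = [
    Cashflow("FX-001", date(2006, 1, 16), "USD", Decimal("500000")),
    Cashflow("MM-007", date(2006, 1, 16), "EUR", Decimal("120000")),
]
STATEMENT = [
    Cashflow("FX-001", date(2006, 1, 16), "USD", Decimal("500000")),
    Cashflow("MM-007", date(2006, 1, 16), "EUR", Decimal("119000")),  # amount differs
]

reconciled, open_items = reconcile(FORECAST, STATEMENT)
postings = post_to_accounting(reconciled)
```

The second deal stays in `open_items` and is not posted until the difference with the bank statement has been resolved.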
5.6 Business Process Support
IT systems are both drivers and enablers for changes in treasury processes. IT is a driver because computational developments pertaining to hardware and software open up new opportunities for changing strategies and processes. However, it is also an enabler, since its installation and application enables companies to support existing strategies and business processes. In the following, we will focus on the role of IT as an enabler. A key consideration for companies when selecting a treasury management system is the functional requirements that the organization needs to deliver. If the treasury function is aligned more operationally, a system is required which is closely integrated with the planning, forecasting, and budgeting tools within the business and with external sources of data, such as banks and capital markets. The foundation for value-orientation then consists in a high degree of straight-through-processing, i.e. transactions that are recorded only once and are available afterwards for other subsystems or are automatically booked in the ledger. However, for treasuries that are strategic to their business, information extraction is the primary concern because such treasuries are required to perform complex and sophisticated analysis on business data in order to support financial decision-making. Since the BMW Group Treasury can neither be characterized as a purely operationally aligned treasury nor as one that is totally strategic to its business, a more differentiated view is required that segregates operational workflows from decision-making procedures. In BMW Group Treasury’s framework of organizational structure and workflows, the objective is the integration of all of BMW’s financial positions as a basis for a central management of these positions in real-time and the standardization of treasury processes across all BMW entities. Figure 5.5 illustrates the IT system architecture deployed by BMW.
It combines the strengths of an ERP as the general finance engine (SAP), interfacing into a best-of-breed treasury application (Trema) that is in turn interfaced to back-office satellite systems, a real-time news service (Reuters), a historical market rate database (Zentrale Kursdatenbank, ZKDB) and specialized DSSs which are not yet fully implemented.
5 Process-Oriented Systems in Corporate Treasuries
Figure 5.5. IT systems support for treasury process and business process units
[Figure labels: treasury process stages Sales-orientated Planning; Financial Planning, Risk Measurement, Limit Mgmt.; Risk Analysis, Strategy Development; Implementation, Documentation; Confirmation, Settlement, Reconciliation. Business process units: TC Asia, TC America, TC Europe, AG.]
5.6.1 Best-of-Breed System Functionalities
Whereas the ERP solution SAP has so far been used for data collection and high-level reporting only, the central component supporting BMW’s treasury activities is Finance KIT, a best-of-breed TMS from the Swedish system vendor Trema. Finance KIT uses a standard client-server architecture along with a portable window environment. The system is used Group-wide, mainly to guarantee operational integrity by embracing front, middle, and back office functionalities. Intrinsic to the design are automated transaction flows that reduce redundant keying. However, Finance KIT also supports certain functionalities within trading and risk management. Systematic coverage and evaluation of all corporate financial risks, as well as market evaluation of balance sheet positions and financial derivatives, are provided Group-wide. All BMW Group entities, namely BMW AG (Internal Audit, Financial Controlling, Group Reporting, and Financial Processes Integration), BMW Financial Services, BMW overseas entities (TCs) and network partners (Auditing, System partners), are connected via Finance KIT. The system is available 24 hours a day, seven days a week. It connects approximately 200 users worldwide, who access Finance KIT via an internet browser. The central database is located in Munich. The system handles an annual transaction volume of approximately 680 billion euros. About 7000 user enquiries per year, with an increasing tendency, have to be handled by BMW’s internal Finance KIT support and Treasury Consulting. Finance KIT includes an interface to local systems for accounting and payments, as well as standardized reporting and a uniform system layout.
Christian Ullrich, Jan Henkel
Figure 5.6. Information flow in Finance KIT (Trema 2002)
The information flow in Finance KIT consists of four types of information (Trema 2002): static information, market information, transaction and cashflow management information, and calculated information. Figure 5.6 provides an overview of the different kinds of information and their flow in Finance KIT.
Static information is information that usually does not change. It includes the definition of currencies, counterparties, issuers, countries, rules, etc. and is used by most calculations. Editors are used to define and modify static information. There is a specific editor for each type of static information. Currencies, for instance, are defined in the Currency Editor. A report generator is attached to each editor and allows reports to be created from the information defined in that editor.
Market information refers to the market quotes of currencies, instruments, yield curves, equities, volatilities, etc. which are stored in Finance KIT. Market information is used to measure and evaluate the risk of positions. This information is typically imported from market information providers, such as Reuters, but can also be entered manually. The Finance KIT rate board can be used to view and modify the market information. Market rate reports can be produced via a dedicated market rate reporting engine.
Transaction information is information that is essential to define a transaction, for example, transaction and cashflow amounts, transaction counterparties and deal rates. Calculated information, such as unrealized results, market value, etc., is not part of this information. Transaction boards are used to enter, view or modify this information. For example, deals are entered in the Deal Capture transaction board. Some reports, such as the Transaction Log Report, the Daily Deals Report and the Cashflow Report, are designed to report this kind of information.
A primary role of Finance KIT also consists in calculating figures and displaying them in different applications.
In order to do these calculations, information and instructions are required from other parts of Finance KIT, such as static data, market data and transaction data.
Figure 5.7. Finance KIT principles (Trema 2002)
In Finance KIT, there are five basic principles that support the treasury process chain from a systems point of view: “Transaction Flow”, “Cashflow”, “Portfolio Hierarchy”, “Valuation”, and “Real-Time”. The entire information flow in Finance KIT is based on these principles. Because of their strong impact on the system’s performance, as well as on its various applications, we describe them in more detail, following Figure 5.7. We start in the front office. Every trade that is entered in Deal Capture is regarded as a transaction. The process of transaction handling and control from initial input to final acceptance is referred to as “Transaction Flow”. The Transaction Flow is used to monitor the movement of transactions from one state to another. Different applications, mainly transaction boards, are configured to control and manage the transactions in different parts of the flow, and access rights to these applications are limited to those people who actually use them. When a transaction is committed in a transaction board, it receives a state and status. The
Table 5.1. Entering a bond and a foreign exchange forward via Deal Capture

Bond
  Instrument Name               Bond A
  Nominal Amount (Face Value)   USD 10,000,000
  Price                         105.043
  Portfolio                     Bond Portfolio
  Counterparty                  Bank A

FX Forward
  Currency Pair                 EUR/USD
  Purchased Amount              EUR 1,000,000
  Rate                          1.1500 + 150 bps
  Portfolio                     FX Forward Portfolio
  Counterparty                  Bank B
  Value Date                    March 15, 2007
state of the transaction determines where it stands in the Transaction Flow. All transactions must have a state and can only have one of the following states at a time: “Open”, “Trader Verify”, “Back Office Verify” or “Final”. In the following we consider two imaginary transactions, a bond purchase and a foreign currency purchase that will be followed from trade entry in the front office, through the middle office, to bookkeeping in the back office. Let us imagine that on the morning of January 22, 2006, the trader (located in TC America) decides to buy a government bond as well as Euros for US-Dollars. The trader enters the transactions via “Deal Capture” in Finance KIT. Table 5.1 shows all the information the trader needs to specify in order to make these transactions. As soon as they are saved in Finance KIT, these transactions are split into their resulting cashflows (“Cashflow” principle). A cashflow is a certain amount of money in a particular currency that occurs at a specific point in time. Thus, each cashflow has a value date, necessary for valuation, payment settlement and bookkeeping. The nature and processing rules of various cashflows are dependent on the type of financial instrument that is recorded. Generally, cashflows are divided into four groups. “Simple Cashflows” are those cashflows that actually take place, for example a coupon or redemption. “Pseudo Cashflows” are used for calculating the actual settlement before fixing takes place, for example in forward rate agreements. “Proxy Cashflows” are used for the valuation of floating rate instruments. “Conditional Cashflows” are used for options where the magnitude of the cashflows is known but the delivery is uncertain. In order to administer various cashflows it is necessary to further classify cashflows into different types. All cashflow types are defined during implementation. The six main cashflow types are: “Principal”, “Interest”, “P/L”, “Fee”, “Payment”, and “Balance”. 
Each cashflow type contains subtypes. For example the cashflow type Principal has the subtypes “Principal”, “Amortization”, “Expiration”, and “Redemption”. These subtypes are used in money settlement applications and to define accounting rules. All cashflows can be analyzed and valued together as one position or divided into several components by various grouping parameters. Table 5.2 depicts the cashflows for the bond transaction as generated by Finance KIT. Furthermore, Table 5.3 depicts the cashflows for the foreign exchange forward contract as generated by Finance KIT.
Table 5.2. Cashflows of the bond transaction

Date           Cashflow Amount   Currency   Description
Jan 29, 2006    -10,504,300.00   USD        The principal payment for the bond.
Jan 29, 2006       -525,000.00   USD        The accrued interest of the bond paid to the previous owner.
Jun 03, 2008        900,000.00   USD        The future coupon of the bond.
Jun 03, 2009        900,000.00   USD        The future coupon of the bond.
Jun 03, 2009     10,000,000.00   USD        The redemption payment of the bond.
Table 5.3. Cashflows of the foreign exchange forward transaction

Date           Cashflow Amount   Currency   Description
Mar 15, 2007      1,150,000.00   EUR        The Euros bought.
Mar 15, 2007     -1,000,000.00   USD        The US Dollars paid for the Euros bought.
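The “Cashflow” principle, under which a saved deal is split into dated cashflows, can be illustrated with a minimal Python sketch. It is not Finance KIT code: the `Cashflow` class and the helper `split_bond_purchase` are hypothetical names, the accrued interest and coupon amounts are taken as inputs from Table 5.2 rather than derived, and the assignment of coupons to the cashflow type “Interest” is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Cashflow:
    value_date: date
    amount: Decimal      # negative = payment made, positive = payment received
    currency: str
    cf_type: str         # e.g. "Principal" or "Interest" (type mapping is assumed)
    description: str


def split_bond_purchase(nominal, price_pct, accrued_interest, coupon_amount,
                        settlement, coupon_dates, maturity, currency="USD"):
    """Split a bond purchase into its cashflows (hypothetical helper)."""
    flows = [
        Cashflow(settlement, -(nominal * price_pct / 100), currency,
                 "Principal", "Principal payment for the bond"),
        Cashflow(settlement, -accrued_interest, currency,
                 "Interest", "Accrued interest paid to the previous owner"),
    ]
    for coupon_date in coupon_dates:
        flows.append(Cashflow(coupon_date, coupon_amount, currency,
                              "Interest", "Future coupon of the bond"))
    flows.append(Cashflow(maturity, nominal, currency,
                          "Principal", "Redemption payment of the bond"))
    return flows


# Inputs taken from Tables 5.1 and 5.2.
bond_flows = split_bond_purchase(
    nominal=Decimal("10000000"),
    price_pct=Decimal("105.043"),
    accrued_interest=Decimal("525000"),
    coupon_amount=Decimal("900000"),
    settlement=date(2006, 1, 29),
    coupon_dates=[date(2008, 6, 3), date(2009, 6, 3)],
    maturity=date(2009, 6, 3),
)

for cf in bond_flows:
    print(cf.value_date, f"{cf.amount:,.2f}", cf.currency, cf.description)
```

`Decimal` is used instead of floating point so that the principal payment is computed exactly as nominal amount times price.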
When the trader commits these transactions in Deal Capture, they obtain the transaction state Open. Trade tickets are printed automatically in order to document conformity to market conditions at the time of the trade. The nominal amount of the transaction is verified against the trader limit. When the trader commits the transactions, they move forward in the Transaction Flow. If the transaction exceeds the limit a warning appears on the trader’s screen and the exceeded limit is logged. The state of the transaction defines which transactions are visible in which transaction board. For example, committed transactions no longer appear in Deal Capture, unless they are sent back to the Open state from another transaction board. In the configuration of this case study, the next state for these transactions is Trader Verify. When transactions are in this state, each trader must review the transactions they have entered. If the transaction is incorrect, the trader sends it back to the Open state in order to modify or delete the transaction. Transactions reach the back office after they have been verified by the front office traders. A back office clerk views the two transactions that are now in the state Back Office Verify in the “Back Office Verification” transaction board. If an error is identified, the respective transaction is sent back to the Open state for correction. The states of the transactions change to Final as soon as back office receives transaction confirmations from the deal counterparties. If the transaction data is correct and the counterparty confirmation is received, the back office clerk confirms the transaction using the “Confirmation” transaction board and transfers the payment to the bank. The money is automatically transferred to the bank by either fax or file. In this context, different payment rules are defined, which enable the system to select the correct transfer method for each payment. 
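The Transaction Flow described above behaves like a small state machine. The following Python sketch shows one way to model it, assuming the four states and the send-back-to-Open behaviour described in the text. The class and method names are hypothetical, and the real system is configurable in ways this sketch ignores (limits, access rights, transaction status).

```python
from enum import Enum


class State(Enum):
    OPEN = "Open"
    TRADER_VERIFY = "Trader Verify"
    BACK_OFFICE_VERIFY = "Back Office Verify"
    FINAL = "Final"


# Forward path as configured in the case study.
FORWARD = {
    State.OPEN: State.TRADER_VERIFY,
    State.TRADER_VERIFY: State.BACK_OFFICE_VERIFY,
    State.BACK_OFFICE_VERIFY: State.FINAL,
}


class Transaction:
    """Hypothetical transaction that is always in exactly one state."""

    def __init__(self, name):
        self.name = name
        self.state = State.OPEN
        self.history = [State.OPEN]

    def _move(self, new_state):
        self.state = new_state
        self.history.append(new_state)

    def advance(self, counterparty_confirmed=False):
        """Move one step forward in the flow."""
        if self.state is State.FINAL:
            raise ValueError("transaction is already Final")
        if self.state is State.BACK_OFFICE_VERIFY and not counterparty_confirmed:
            raise ValueError("counterparty confirmation required before Final")
        self._move(FORWARD[self.state])

    def send_back(self):
        """Return an incorrect transaction to Open for correction."""
        if self.state not in (State.TRADER_VERIFY, State.BACK_OFFICE_VERIFY):
            raise ValueError("only transactions under verification can be sent back")
        self._move(State.OPEN)


bond = Transaction("Bond A purchase")
bond.advance()                              # Open -> Trader Verify
bond.send_back()                            # trader spots an error
bond.advance()                              # corrected and committed again
bond.advance()                              # Trader Verify -> Back Office Verify
bond.advance(counterparty_confirmed=True)   # confirmation received -> Final
print([s.value for s in bond.history])
```

Keeping the allowed moves in one table makes it easy to see which transaction board would show a transaction at any point in the flow.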
The next day the bank account statement is collected via file and is automatically reconciled against the payments sent from Finance KIT. At the end of each business day, the bookkeeping entries from transactions in the Final state are automatically generated. This also requires the definition of rules that enable Finance KIT to identify the correct bookkeeping account for each cashflow. Bookkeeping entries are made on the value date of each cashflow. For the government bond, the principal payment and the interest accrued to the previous owner are booked on January 30, 2006. The generated entries are as follows:
Table 5.4. Bond transaction bookkeeping entries

Bookkeeping Account     Entry Type   Amount
Cash Account (USD)      Credit       11,029,300.00
Bond Account            Debit        10,504,300.00
Bond Interest Account   Debit           525,000.00
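The rule-based generation of the entries in Table 5.4 can be sketched as follows. This is an illustration only: the rule table `ACCOUNT_RULES` and the function `bookkeeping_entries` are hypothetical, and the sketch handles only outgoing payments that share one value date.

```python
from decimal import Decimal

# Hypothetical rule table: cashflow type -> account to debit.
ACCOUNT_RULES = {
    "Principal": "Bond Account",
    "Interest": "Bond Interest Account",
}


def bookkeeping_entries(cashflows, cash_account="Cash Account (USD)"):
    """Build entries for payments made on one value date.

    `cashflows` is a list of (cashflow_type, amount) pairs, where a negative
    amount is a payment made (the only case handled in this sketch).
    """
    debits = []
    total_paid = Decimal("0")
    for cf_type, amount in cashflows:
        paid = -amount
        debits.append((ACCOUNT_RULES[cf_type], "Debit", paid))
        total_paid += paid
    # A single credit on the cash account balances all debits.
    return [(cash_account, "Credit", total_paid)] + debits


entries = bookkeeping_entries([
    ("Principal", Decimal("-10504300.00")),
    ("Interest", Decimal("-525000.00")),
])

for account, entry_type, amount in entries:
    print(f"{account:<24}{entry_type:<8}{amount:>16,.2f}")

total_debit = sum(a for _, kind, a in entries if kind == "Debit")
total_credit = sum(a for _, kind, a in entries if kind == "Credit")
```

Comparing `total_debit` with `total_credit` is the usual double-entry check that the generated entries balance.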
Once the FX forward transaction reaches its value date, the cashflows resulting from that transaction are booked. Thus, if 1 Euro equals 1.1700 US-Dollars on March 15, 2007, the bookings read according to Table 5.5:
Table 5.5. FX Forward bookkeeping entries
In this example, the booking results in a debit entry of 854,700.85 Euros, which is 145,299.15 Euros less than the amount at the transaction exchange rate. This difference is booked as a foreign exchange loss in the foreign exchange revaluation account of the US operating unit. The result from the forward points is booked into the interest rate profit/loss account. Entries in the closing books can be made at an arbitrary date. They are similar to bookkeeping entries, except that the items booked in the closing books are not actual cashflows but unrealized items (for example accrued interest, net market value). The unrealized items need to be allocated to the correct accounting period, for which entries are made at the end of every month. The effect of the two transactions can be seen in different back office reports, including the “Delivery Report”, the “Daily Deals” and the “Outstanding Positions” Report. The effect of sample transactions on the liquidity position is seen in real-time in “Treasury Monitor” as soon as the transactions are entered.
There are three major benefits that arise from this procedure of handling financial transactions. First, all transaction types are properly documented and recorded in a uniform way by entering data once only. This allows for better accuracy and correctness, since bookkeeping entries and payments are not made from transactions which have not been sufficiently verified. Second, communications in the
business process are clearly defined, which significantly improves error handling by reducing miscommunication and the time spent on correcting data inconsistencies. Third, data migration to spreadsheets and neighbouring systems is done from a single-source, central database. This guarantees a substantial reduction of operational risk by eliminating the human error that would arise from loading data manually.
The operations that take place in the middle office relate to risk controlling and the profit/loss supervision of the BMW Group Treasury’s positions. Positions are administered via so-called portfolios, which can be defined both with regard to competencies and on a product group level (“Portfolio Hierarchy” principle). For example, let us first consider the BMW Group portfolio of competencies from a top-down perspective. The Group portfolio itself is the parent portfolio for each of its regions. Regions form a second parent level in the top-down hierarchy and consist of entity sub-portfolios, i. e. sub-portfolios for every BMW subsidiary assigned to a specific region. Furthermore, every entity consists of a range of businesses, which are represented by a third level of sub-portfolios. The overall treasury position is monitored by viewing the entire BMW Group Treasury portfolio, while the American Region position is monitored via the American Region’s portfolio. Since the same data and the same transactions will often be viewed from different perspectives, one portfolio can also belong to several hierarchies. For example, on a product group level, the currency spot risk can be transferred from the currency forward portfolio to the spot portfolio. Figure 5.8 displays the portfolio hierarchy as defined for the BMW case study. The transactions are entered into the bond and forward portfolios.
To view the positions of both transactions at the same time, the risk manager must choose to view the “MAIN” portfolio. It is possible to view both transactions because both portfolios are underlying portfolios of the MAIN portfolio. Similarly, if the risk
Figure 5.8. Example of portfolio hierarchies at a product group level
manager selects the foreign exchange (FX) portfolio, only foreign exchange transactions are included in the position results, because only foreign exchange transactions are located in the foreign exchange portfolio and its underlying portfolios. The portfolio tree structure thus offers powerful means of consolidating and aggregating positions, which enables the efficient control of risks and results on various organizational levels and according to various criteria. When the risk manager has selected the position to be monitored, Finance KIT retrieves the relevant transaction cashflows and calculates the requested key figures based on the cashflows and market information it contains.
In Finance KIT, risk and unrealized result calculations are based on the market value of specific cashflows at the time of valuation. Therefore, when a transaction is entered, it is valued with mark-to-market rates in order to manage risks and calculate realized and unrealized results (“Valuation” principle). According to the Valuation principle, the market value of a position is the present value of the future cashflows that make up the position. To obtain the market value of a cashflow, the cashflow has to be discounted at a specific discount rate. The discount rate is defined separately for each instrument and can be either a direct mark-to-market quotation of the instrument or a specific yield curve. By default, market value is expressed in the base currency of the portfolio. Thus, present values denominated in a foreign currency are converted into the base currency using the current spot exchange rate. All future cashflows are discounted to the date from which the position is monitored. This produces the present value of the cashflows on which risk figures are based. The most recent market information is always used when valuing positions. The discount rate depends on the definitions of the instrument from which the cashflow results.
The rate can be any direct market quotation, yield curve or zero-coupon rate available in Finance KIT. These quotations and yield curves are usually obtained from external market information sources. The zero-coupon rate is calculated from swap yield curves or instruments. Similarly, other instrument-specific definitions, such as the date basis and the price type for the interest (periodic or yield), affect the calculations. Middle office can view the case study in different reports, for example in the key-figure report and in the periodic P/L report. The risk manager can request reports in which only the transactions that have reached a certain transaction state are included. In the example used in this case study, the risk manager keeps Treasury Monitor open on the computer screen at all times in order to monitor the overall risk position of the treasury and to see the change in the risk figures in real-time as new transactions and market information are received. Limit Follow-Up must also be open in order to see how the defined portfolio and instrument limits are used in real-time. If the risk figures reach an unsatisfactory level, the user can examine the position in more detail in Limit Follow-Up in order to determine the origin of the excess risk. To identify the root of the problem, the user can split each key figure by counterparty, instrument, portfolio, transaction, etc. Based on this information, the risk manager can inform the traders of possible hedging or advise other actions. As the position is monitored in real-time, the discrepancies from the desired levels of risk can be
Table 5.6. Position Monitoring

Market Value        10,500,200.00 USD
Net Market Value        -4,100.00 USD
IR Exposure             82,420.00 USD
quickly identified. In this case study, the user monitors the money market (MM) portfolio. The bond transaction is part of the money market position and the figures in Table 5.6 are the key figures of this transaction. They can be seen in reports and in Treasury Monitor. The market value is the present value of the future cashflows of the bond. The net market value is the difference between the purchase price and the current market value. Interest rate (IR) exposure is the change in the market value (for example profit or loss) if the discount rate rises by a defined percentage. The risk manager can view similar profit/loss figures for the FX transaction. It is also possible to see the effect of these transactions on the credit risk of the treasury by printing a credit risk report.
Deal Capture continuously receives the latest market information. This information is used in valuations in order to obtain current and reliable risk and result figures (“Real-Time” principle). Whether it is a new deal entered into the system or updated market quotes for currencies, instruments, yield curves or volatilities, Finance KIT automatically updates all new or modified information in real-time for all its applications without it having to be requested. Real-Time means that all new information is available instantly, i. e. risks, market value and other key figures are recalculated as soon as there is a change in a market variable and/or a position. Recalculation occurs continuously, in line with the updating cycles of the market rates. Market rates are obtained from Reuters, an external news provider. For every position, Finance KIT calculates the P/L for the chosen periodicity (daily, monthly, quarterly, annually) and displays realized and unrealized results, break-even rates and net present value. Revenues are displayed by rate and by currency. Reports are not updated in real-time, which means that they must be updated manually in order to reflect new information.
The main advantage of real-time position updating is that it enables increased trading and risk control. All mathematical calculations and formulae for trade entry, valuation and risk calculations are programmed in so called “Financial Objects”. Financial Objects are procedures, which Finance KIT stores and then uses to make calculations. Financial Objects and instrument information determine how Finance KIT handles cashflows.
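The “Portfolio Hierarchy” and “Valuation” principles can be combined in a short Python sketch: a parent portfolio aggregates the cashflows of its underlying portfolios, each cashflow is discounted and then converted into the base currency at the spot rate. Several simplifications are assumptions of this sketch and not Finance KIT behaviour: one flat, annually compounded discount rate per currency instead of a yield curve, invented cashflows and market data, and a sign convention in which a loss is negative. Apart from “MAIN”, the portfolio names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Portfolio:
    name: str
    cashflows: list = field(default_factory=list)   # (years_ahead, amount, currency)
    children: list = field(default_factory=list)    # underlying portfolios

    def all_cashflows(self):
        """Own cashflows plus those of all underlying portfolios."""
        flows = list(self.cashflows)
        for child in self.children:
            flows.extend(child.all_cashflows())
        return flows


def market_value(portfolio, rates, spot_to_base):
    """Present value of all future cashflows, expressed in the base currency."""
    return sum(
        amount / (1 + rates[ccy]) ** years * spot_to_base[ccy]
        for years, amount, ccy in portfolio.all_cashflows()
    )


def ir_exposure(portfolio, rates, spot_to_base, shift=0.01):
    """Change in market value if every discount rate rises by `shift`."""
    shifted = {ccy: rate + shift for ccy, rate in rates.items()}
    return (market_value(portfolio, shifted, spot_to_base)
            - market_value(portfolio, rates, spot_to_base))


# Hypothetical positions and market data (base currency USD).
bond = Portfolio("BOND", cashflows=[(1.0, 900_000.0, "USD"), (2.0, 10_900_000.0, "USD")])
fx = Portfolio("FX", cashflows=[(0.5, 1_000_000.0, "EUR"), (0.5, -1_150_000.0, "USD")])
main = Portfolio("MAIN", children=[bond, fx])

rates = {"USD": 0.05, "EUR": 0.03}
spot_to_base = {"USD": 1.0, "EUR": 1.17}

mv_bond = market_value(bond, rates, spot_to_base)
mv_fx = market_value(fx, rates, spot_to_base)
mv_main = market_value(main, rates, spot_to_base)
bond_exposure = ir_exposure(bond, rates, spot_to_base)

# Net market value as defined in the text: market value minus purchase price,
# using the market value from Table 5.6 and the principal payment from Table 5.2.
net_market_value = 10_500_200.00 - 10_504_300.00

print(f"MAIN market value: {mv_main:,.2f} USD")
print(f"Bond IR exposure for +1%: {bond_exposure:,.2f} USD")
print(f"Net market value: {net_market_value:,.2f} USD")
```

Selecting `main` values both transactions at once, while selecting `fx` restricts the result to the foreign exchange cashflows, which mirrors the behaviour described for the portfolio tree.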
5.6.2 Decision-Support Functionalities
TMS often lack in-depth risk analysis capabilities that fit the very specific nature of single risk factor exposures. This can be attributed to slow updating cycles from TMS system vendors, their reluctance to address single customer needs, and
functionally restricted applications due to a high degree of standardization. In order to replace individual spreadsheet-based solutions, large efforts are currently being made at BMW to complement the best-of-breed environment with statistical decision-support system functionalities in the areas of currency hedging, interest rate and asset management. Even if suitable system providers are selected, this is a complex and time-consuming task because systems integration has to be achieved on a variety of levels, such as complementing or even redefining existing processes, incorporating corporate goals and restrictions, user training, etc. In Section 5.2 we distinguished between different kinds of decision-support systems, namely model-driven, communication-driven, document-driven, and knowledge-driven ones. In addition, academics and practitioners have discussed building decision-support systems in terms of four major components: (a) the user interface, (b) the database, (c) the model and analytical tools, and (d) the decision-support system architecture (Power 2002). At BMW, decision-support system functionalities are intended to be model- and data-driven desktop solutions for the use of a few specialists only. Although these solutions are generally model-independent, they are not data-independent from Finance KIT. All financial planning data is centrally saved in Finance KIT. Thus, automated interfaces have to be established that enable the transfer of predefined corporate financial planning data from Finance KIT. In addition, risk factor market data, such as rates, volatilities and correlations, need to be transferred to the system from an external database. Automation is important in order to eliminate human error and to react quickly to market events. As soon as the systems are supplied with the required data, the treasury process is continued according to front and middle office responsibilities (see Section 5.2).
For example, for the purpose of strategy development, analysis and implementation, a front office trader may simulate a range of alternative financial instruments via historical or Monte Carlo simulation methods in order to compare their risk-return profiles. Implementation of intelligent optimization procedures is necessary in order to obtain optimal solutions to decision problems, such as finding the optimal asset allocation or determining the optimal mix of hedging instruments, in a reasonable amount of time. Once a decision is made and a deal has been executed – no matter if it was recommended by the system or not – it is entered in Finance KIT’s Deal Capture, and all references are updated. Otherwise, for the purpose of middle office responsibilities, market risk and opportunities can be quantified by evaluating exposures with respective market factor scenarios which serves as a foundation for risk reporting and strategy development.
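As an illustration of how a Monte Carlo simulation can compare the risk-return profiles of alternative hedging choices, the following Python sketch simulates the Euro proceeds of a US-Dollar exposure under different forward hedge ratios. It is not BMW’s model: the function name, the driftless lognormal exchange rate model and all parameter values are assumptions made for the example.

```python
import math
import random
import statistics


def simulate_eur_proceeds(usd_exposure, spot, forward, vol, horizon_years,
                          hedge_ratio, n_paths=10_000, seed=42):
    """Mean and standard deviation of the EUR value of a USD exposure.

    EUR/USD is quoted as US-Dollars per Euro. The future spot rate follows a
    driftless lognormal model, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        future_spot = spot * math.exp(
            -0.5 * vol ** 2 * horizon_years + vol * math.sqrt(horizon_years) * z
        )
        hedged = hedge_ratio * usd_exposure / forward             # locked in today
        unhedged = (1 - hedge_ratio) * usd_exposure / future_spot  # converted later
        results.append(hedged + unhedged)
    return statistics.mean(results), statistics.pstdev(results)


# Hypothetical parameters for a one-year USD 1 million exposure.
params = dict(usd_exposure=1_000_000.0, spot=1.15, forward=1.15,
              vol=0.10, horizon_years=1.0)

profiles = {
    ratio: simulate_eur_proceeds(hedge_ratio=ratio, **params)
    for ratio in (0.0, 0.5, 1.0)
}

for ratio, (mean_eur, std_eur) in profiles.items():
    print(f"hedge ratio {ratio:.0%}: mean {mean_eur:,.0f} EUR, std {std_eur:,.0f} EUR")
```

Because every hedge ratio is simulated with the same seed, the differences between the profiles stem from the hedge ratio alone and not from sampling noise.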
5.7
Final Remarks
The subject of information systems architecture and process support in corporate treasuries is receiving considerable attention. Over the years, decentralized governance structures have resulted in highly fragmented, duplicative information architectures that complicate information integration and increase ongoing support
and maintenance costs. In contrast, today’s fast-paced business environment requires companies not only to simplify information flows, but also to improve their knowledge and information architectures. Thus, a conflict has to be resolved that arises from the parallel need to consolidate IT systems on the one hand, while at the same time allowing for a certain degree of IT systems specialization on the other. Since these two objectives are somewhat contradictory, the problem of determining the optimal IT systems architecture becomes a very complex one. We demonstrated in detail how BMW copes with this conflict by considering the current state of how IT systems support the BMW Group’s treasury process. Uniform data records, single data sources and standardized information flows between process partners have enabled BMW to streamline its business processes and reduce operational risks originating from human error. This consolidation provides the foundation for the current implementation of context-dependent decision-support systems in the area of risk management. Given the technological opportunities offered by system vendors, we argue that no single vendor is able to comprehensively meet the challenges that BMW faces today. As the market stands now, BMW needs to rely on a combination of vendors in order to obtain a complete business solution.
References
Ashlock D (2006) Evolutionary computation for modelling and optimization, Springer
Alter SL (1980) Decision support systems: current practice and continuing challenges, Addison-Wesley, Reading/Mass
Benston G, Bromwich M, Litan RE, Wagenhofer A (2003) Following the Money: The Enron failure and the state of corporate disclosure. AEI-Brookings Joint Center for Regulatory Studies, Washington DC
Bank for International Settlements BIS (2005) Triennial central bank survey – foreign exchange and derivatives market activity in 2004
Bodnar G, Marston RC, Hayt G (1999) Survey of derivatives usage by U.S. nonfinancial firms. Financial Management, 27:70–91
Chan M-C, Wong C-C, Cheung BK-S, Tang GY-N (2002) Genetic algorithms in multi-stage portfolio optimization system. In proceedings of the eighth international conference of the Society for Computational Economics, Computing in Economics and Finance, Aix-en-Provence, France
Culp C and Miller M (1994) Risk management lessons from Metallgesellschaft, Journal of Applied Corporate Finance
Ernst & Young (2005) Treasury operations survey 2005 results
Eun CS and Resnick BG (2004) International financial management. Third edition, Irwin McGraw Hill, New York
Finlay PN (1994) Introducing decision support systems, NCC Blackwell, Blackwell Publishers, Oxford/UK Cambridge/Mass
Frawley W, Piatetsky-Shapiro G, Matheus C (1992) Knowledge discovery in databases: An overview. AI Magazine, Fall 1992, 213–228
Gachet A (2004) Building model-driven decision support systems with dicodess, VDF, Zurich
Greenspan A (1999) Financial derivatives. Speech, before the Futures Industry Association, Boca Raton, Florida, March 19
Keen PGW and Morton MSSS (1978) Decision support systems: an organizational perspective, Addison-Wesley Publishing, Reading/Mass
Markowitz HM (1971) Portfolio selection: efficient diversification of investments, Yale University Press
Mello AS and Parsons JE (1994) Maturity structure of a hedge matters. Lessons from MG AG, Derivatives Quarterly, Fall, 7–15
Mitchell TM (1997) Machine Learning, McGraw-Hill
Modigliani F and Miller M (1958) The cost of capital, corporation finance and the theory of investment. American Economic Review, 48:261–97
Nguyen HT and Walker EA (2005) A first course in fuzzy logic, Third edition, Chapman & Hall
Pacheco MA, Noronha M, Vellasco M, Lopes C (2000) Cash flow planning and optimization through genetic algorithms. In: Computing in Economics and Finance 2000, Society for Computational Economics (eds)
Peramunetilleke D, Wong RK (2002) Currency exchange rate forecasting from news headlines. In proceedings of the thirteenth Australasian conference on database technologies, 131–139, Australian Computer Society, Inc
Power DJ (1997) What is a DSS? The On-Line Executive Journal for Data-Intensive Decision Support, October 21, vol 1, no 3, http://dssresources.com/papers/dssarticles.html
Power DJ (2000) Web-based and model-driven decision support systems: concepts and issues. In proceedings of the Americas conference on information systems, Long Beach, California
Power DJ (2002) Decision support systems: concepts and resources for managers, Quorum Books, Westport/CT
PricewaterhouseCoopers (2003) Corporate treasury in Deutschland, Industriestudie
Rockart J (1988) The line takes leadership – IS management in a wired society, Sloan Management Review, Summer, 57–64
Trema (2002) Introducing Finance KIT for Unix and Windows, Version 8.0.8
Turban E (1995) Decision support and expert systems: management support systems, Prentice Hall, Englewood Cliffs/NJ
Ullrich CM, Seese D, and Chalup S (2005) Predicting foreign exchange rate return directions with support vector machines. In proceedings of the fourth Australasian data mining conference, Simoff SJ, Williams GJ, Galloway J and Kolyshkina I (eds), 221–240, Sydney, Australia
Unknown (2006) Treasury management systems – choosing a TMS. Treasury Today, April, 7
Zachman JA (1987) A framework for information systems architecture, IBM Systems Journal, 26, no 3
CHAPTER 6
Streamlining Foreign Exchange Management Operations with Corporate Financial Portals
Hong Tuan Kiet Vo, Martin Glaum, Remigiusz Wojciechowski
6.1
Introduction
Global competition and decreasing profit margins force companies to reduce their operational costs. This puts pressure on all business units and operations, including financial management. One approach to increasing operational efficiency is the automation of existing processes by eliminating unnecessary manual interventions. Alternatively, one can design and implement new, improved processes. This is a challenging task that becomes even more complex if financial management has to deal with the heterogeneous environments of a multinational corporation. Information systems are therefore required that help to reduce the complexity of the integration, communication, coordination, and collaboration challenges. Thus, financial management is in search of the equivalent of the global information system postulated by Chou (1999). In this work, we present and discuss the concept of the corporate financial portal as a global financial information system and management reporting tool. We discuss the extent to which corporate financial portals can help to meet the information management challenges of today’s multinational financial management. To complement the theoretical argumentation, we present the foreign exchange management practices as they are currently conducted by Bayer AG, a German company with core competencies in the business areas of health care, nutrition, and high-tech materials. By means of a corporate financial portal and the respective service implementations, Bayer’s foreign exchange (FX) management practices have been fundamentally restructured and simplified. We evaluate the impact of the corporate financial portal implementation on Bayer’s foreign exchange management practices by analyzing the transaction data from 2002 to 2006. The chapter is structured as follows: Section 6.2 briefly outlines the challenges the financial management of multinational corporations has to face.
In today’s international environment, financial management on a corporate level is faced with
integration, coordination and collaboration challenges when striving for operational efficiency. In Section 6.3, we present the concept of the corporate financial portal and discuss how such systems can be of use in the context of financial management. In Section 6.4, we describe the Corporate Financial Portal (CoFiPot) project at Bayer. We explain how the information and transaction services offered by the portal led to major improvements in the efficacy, efficiency and quality of foreign exchange management at Bayer. The chapter concludes with a summary of the lessons learned and an outline of further research questions regarding this topic.
6.2
Challenges of Multinational Financial Management
Corporate financial management has to fulfill three primary functions. First, it has to guarantee the financing of long-term business activities throughout the organization. Second, it has to manage short-term liquidity needs and optimize corporate cash flows and cash balances. Third, it has to design and implement financial risk management strategies.1 While already demanding in a domestic setting, these tasks become highly complex in a multinational context. Multinational corporations (MNCs) operate in several countries and are therefore confronted with a multitude of economic, financial, legal and tax environments. Moreover, MNCs face additional risks, in particular exchange rate risks and political risks. The complexity of multinational financial management is increased by the fact that MNCs are usually organized as groups of legally independent firms, with local subsidiaries in various foreign countries under the common control of the parent company. On the one hand, the MNC can use the group structure to exploit differences in national legal, tax and other systems. On the other hand, the structure sets further restrictions for management, as each unit is subject to the legal requirements of its respective host country (Glaum and Brunner 2003). Geographical distance, different time zones, and language and cultural differences further contribute to the challenges. As mentioned before, in this setting financial management has to meet integration, coordination and collaboration demands.
Today, MNCs are subject to fierce global competition. Decreasing profit margins force them to reduce their operational costs. This also puts pressure on financial management to cut costs and increase the efficiency of its financial operations. The automation of existing processes by eliminating unnecessary manual interventions is one way to increase operational efficiency. An alternative approach is to design and implement new, improved processes. The prerequisite in each case is the integration of the involved information systems. The financial industry is well known for its highly heterogeneous systems landscape (Weitzel et al. 2003). Because the growth of MNCs is often based on mergers and acquisitions, MNCs are frequently confronted with the need to integrate a highly diverse systems landscape with regard to global business processes.
A further challenge for financial management is the coordination of financial activities in the multinational business environment. As noted above, geographical distances, different time zones and local holidays make coordination and communication within the financial organization difficult. In particular, if the corporate centre has to provide in-house banking services (Richtsfeld 1994) to the dispersed affiliates, it will want to establish mechanisms and processes that enable services on a global scale, regardless of time zones and geographical restrictions.
In addition to the coordination requirements, collaboration also poses challenges in this context. The major goals and strategies of the MNC’s financial management are developed at the corporate centre. However, local managers may pursue their own, personal goals. The MNC thus may have to deal with principal-agent problems (Hommel and Dufey 1996) that affect the willingness to collaborate and the quality of collaboration within the organization. Yet, the quality of the decisions made by the corporate financial centre strongly depends on the timely delivery of accurate financial reports provided by the operational and legal sub-units. The efficient execution of decisions also depends on the willingness of local management to follow the MNC’s overall financial strategies. Thus, the financial management of the MNC has to design and implement incentive mechanisms and control structures that enforce and ensure collaboration and compliance within the organization.
1 Refer to (Cooper 2000; Eun and Resnick 2004; Eiteman, Moffett et al. 2006; Shapiro 2006) for a detailed discussion of (multinational) financial management functions; a brief summary is presented by Glaum and Brunner (2003).
In order to meet these challenges, financial management has to make use of the benefits that information systems based on Internet technology offer. The following section therefore presents the concept of the corporate financial portal and discusses to what extent this tool can support the main financial management functions.
6.3
Corporate Financial Portals
Corporate portals are applications built upon and accessed through Internet technology. Their primary function is to offer a single point of access to personalized, business-relevant information, applications and services (e.g. Collins 2001; Dias 2001; Vo et al. 2006). Depending on the actual field of application, the literature distinguishes between a multitude of different types of corporate portals, including knowledge portals, process portals, and the enterprise information portal (Dias 2001). This work focuses on the concept of corporate financial portals that address the information management requirements of the corporate financial management
functions on a global scale. Such a portal serves as a central hub for financial information, applications and services on the corporate level as well as on the level of the individual affiliates. In the following, we discuss key characteristics of corporate portals in general, as well as the role of the portal concept in the context of financial management of MNCs.
6.3.1
Key Characteristics of Corporate Portals
Today, most companies have an efficient, always-on communications channel: the Intranet. This network can be accessed from virtually everywhere in the world, provided that access to the Internet is available. Applications built upon Internet technology are called Web applications. Every corporate portal implementation is a special form of Web application and can consequently be accessed from everywhere by means of a standard Web browser. In contrast to traditional applications, Web applications offer various advantages to users and operators alike. From the operators’ perspective, Web applications have to be installed and maintained on one central application server only, which reduces the complexity of software administration and maintenance tasks (Balzert 2000, p. 946) and is particularly helpful in a global operating environment. Further, it is guaranteed that users will always access the latest version of the application without having to deal with installation and software updates. From the end-users’ perspective, the primary function of corporate portals is to act as a hub for information, applications and services. For this purpose, the integration of data and systems is a prerequisite (cf. Davydov 2001; Linthicum 2003). Corporate portal initiatives have to be accompanied by associated integration initiatives. Preferably, the integration efforts result in a reusable business integration architecture (Vo et al. 2005), which forms the basis for any portal service or application, as it allows portal services to easily access the data and system functionality needed for further processing and transformation. A corporate portal will not per se solve the technical issues of system and data integration: these problems have to be addressed within separate integration initiatives.
However, the corporate portal can foster integration on the business process level: corporate portal services can access the data and system functionality of one system and make them available to the user within a specific business context. More importantly, the corporate portal architecture supports the implementation of application components and services that can easily combine, transform, and process data and functionality from various information sources. This facilitates the streamlining of existing business processes as well as the development of new ones. With the corporate portal, these services will be available instantly throughout the organization. The corporate portal therefore not only “opens the door” to existing integration, but also facilitates process integration on a global level.
With the corporate financial portal, financial management is offered a tool that addresses the coordination needs within the organization. The corporate financial portal can provide frequently requested information and services regardless of time-zone and geographical restrictions. In this context, the corporate financial portal can be regarded as a one-stop shop (Österle 2001) that fulfills the user’s financial information and transaction needs. Thus, the objective is to establish among the financial community the habit of searching the corporate financial portal first for the needed information or service. To support this target, it is important to inform portal users on a regular basis about new services that are available and to monitor their wishes. As a result, the portal will help to reduce the communication and coordination needs of standard processes. Moreover, in a multinational environment, financial management processes will depend on a variety of information that is dispersed throughout the organization. The corporate financial portal and the accompanying integration activities provide the means to access, process, and analyze the financial data throughout the organization with regard to specific business scenarios. As an example, a corporate financial portal can implement services that periodically (e.g. on a daily basis) evaluate the corporate foreign exchange exposure level and automatically notify the persons in charge in case critical limits are reached. Aside from the information requirements, the effective and efficient coordination of the information flows is a critical determinant of multinational financial management operations. From a corporate perspective, information processes have to be coordinated between subsidiaries and the corporate centre.
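The periodic exposure check mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `Breach`, `find_limit_breaches` and `format_notification`, the data layout, and all figures are assumptions made for this example and do not describe an actual portal implementation. A scheduled job could run such a check once a day and pass the resulting messages to whatever notification channel is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breach:
    currency: str
    net_exposure: float  # signed net exposure in units of the foreign currency
    limit: float         # absolute limit configured for that currency


def find_limit_breaches(net_exposure, limits):
    """Return a Breach for every currency whose absolute net exposure
    exceeds its limit. Currencies without a configured limit are ignored."""
    breaches = []
    for currency, exposure in sorted(net_exposure.items()):
        limit = limits.get(currency)
        if limit is not None and abs(exposure) > limit:
            breaches.append(Breach(currency, exposure, limit))
    return breaches


def format_notification(breach):
    """Build the message a scheduled job could send to the person in charge."""
    return (f"FX exposure alert: {breach.currency} net exposure "
            f"{breach.net_exposure:,.0f} exceeds limit {breach.limit:,.0f}")


# Hypothetical daily snapshot and limits (illustrative figures only).
daily_exposure = {"USD": 12_500_000.0, "JPY": -300_000_000.0, "GBP": 1_200_000.0}
exposure_limits = {"USD": 10_000_000.0, "JPY": 500_000_000.0, "GBP": 2_000_000.0}

for breach in find_limit_breaches(daily_exposure, exposure_limits):
    print(format_notification(breach))
```

Comparing the absolute value of the net exposure against the limit treats long and short positions symmetrically; whether limits are symmetric in practice is a policy decision that this sketch does not settle.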
As the quality of financial management decisions depends significantly on the quality of the underlying data, it is important to provide the financial managers of the local subsidiaries with the necessary incentives to supply data of the required quality. With a corporate portal, the corporate finance centre can offer value-adding reporting services for these local subsidiaries. Because the quality of the reports depends on the quality of the original data provided by the local managers, the local managers have an incentive to provide high-quality data for their own benefit. The financial planning activities on the corporate level, which require the collection of planning figures from the dispersed local affiliates, are one potential scenario for the application of a corporate portal as a reporting hub. In summary, the corporate financial portal provides the instruments to improve the effectiveness and efficiency of the financial processes on a global scale. It fosters the management of data, information and knowledge regardless of time zones and geographical boundaries. Still, one has to keep in mind that a corporate financial portal is not a product that is bought off the shelf. Rather, it is an information system that follows an evolutionary development process and thus grows in functionality with the number of services provided. In the following section, we describe a corporate financial portal implementation and evaluate its impact on the financial management processes of the company concerned.
6.4
Bayer’s Corporate Foreign Exchange Management with the Corporate Financial Portal
Bayer is a global enterprise with 110,200 employees and core competencies in three business areas: health care, nutrition, and high-tech materials. The Bayer Group comprises three business areas2 that operate independently under the leadership of the management holding company Bayer AG. The German headquarters supervises and coordinates the activities of 350 companies worldwide. In 2005, the Bayer Group generated net sales of 27.3 billion Euros and an operating result (EBIT) of 2.8 billion Euros.3 In 2000, Bayer decided to reorganize the group’s financial management structure. The decision was to centralize the financial management activities at the Corporate Center in Leverkusen, Germany, thereby reducing the number of dispersed regional financial services centers worldwide. Currently, the Bayer Group’s Corporate Finance Center comprises the Corporate Finance, Corporate Treasury, Credit Risk Management, and Corporate Financial Controlling departments. It takes the role of a global financial services centre and bears global responsibility for the financial management of the Bayer Group. The strategic reorganization imposes new requirements and challenges on information management for Bayer’s financial management. On the one hand, the Corporate Finance Center has to provide the worldwide subsidiaries with the required financial information and services. On the other hand, group-level financial management decisions require the Corporate Finance Center to collect, aggregate and process the relevant financial data from the individual subsidiaries worldwide. To address these information management challenges, the Corporate Financial Portal project was initiated in 2001. The objective was to provide the global financial community of the Bayer Group with a central point of entry for all manner of financial services by means of an Intranet corporate portal, the CoFiPot.
Currently, the CoFiPot provides services that support the key tasks of corporate treasury management, corporate financing, corporate controlling, and corporate financial planning. Furthermore, a knowledge base and a number of general services address the requirements of Bayer’s financial community. Figure 6.1 shows a screenshot of the homepage of the Corporate Financial Portal. In this section, we describe and discuss the impact of the implementation of the corporate financial portal on Bayer’s foreign exchange risk management practices. We first describe the corporate foreign exchange management processes and then elaborate on the portal services that were introduced to reengineer key aspects of the corporate foreign exchange management.
Figure 6.1. The homepage of the Corporate Financial Portal (as of February 2007)
6.4.1
Corporate Foreign Exchange Management
Companies with international business operations are exposed to foreign exchange risk inasmuch as the cash flows from their projects depend on exchange rates and future exchange rate changes cannot be fully anticipated. Broadly speaking, one can distinguish three concepts to measure the effect of exchange rate changes on the firm. The accounting exposure concept (translation or book exposure) measures the impact of parity changes on accounting profits and owners’ equity. The transaction exposure concept concentrates on contractual commitments which involve the actual conversion of currencies. Finally, the economic exposure concept also takes into account the long-term effects of exchange rate changes on a firm’s competitiveness in its input and output markets.4 Surveys among MNCs show that the majority of companies base their foreign exchange risk management on the hedging of transaction exposure (e.g. Glaum 2000; Marshall 2000; Pramborg 2005). Companies first try to determine their exposures on the basis of their open foreign-currency denominated contracts (receivables, payables, loans, etc.). Sometimes, foreign-currency cash flows which are expected in the short run, typically over the budgeting horizon, are included in the determination of the foreign exchange risk exposure. Depending on the firm’s hedging strategy, financial management can then neutralize the exposure by entering into counter-balancing forward contracts. For example, an exporting firm that expects US-dollar inflows from US export receivables can sell the expected future US-dollar inflow in the forward market. The effects of exchange rate changes on the receivables and on the forward market position will now cancel each other out; the home currency value of the future cash flow is fixed (US-dollar amount times the forward rate).
4 Refer to (Glaum 2001) for a detailed analysis of the three exposure concepts.
Bayer’s foreign exchange risk management process is depicted in Figure 6.2:5
[Figure 6.2: process diagram. Framing element: Corporate FX management practices and guidelines (e.g. hedging limits and restrictions, benchmark definition, …). Process steps: Measuring FX Exposure, Hedging FX Exposure, Controlling FX Hedging, with feedback.]
Figure 6.2. Bayer’s foreign exchange risk management process
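The offsetting effect of a forward sale, as described in the export example earlier in this section, can be checked numerically. The sketch below is an illustration under simplifying assumptions: the receivable of USD 1,000,000 and the forward rate of 0.80 EUR per USD are hypothetical, and hedging costs, discounting and credit risk are ignored. The function name `hedged_home_value` is likewise chosen for this example only.

```python
from fractions import Fraction


def hedged_home_value(usd_amount, forward_rate, spot_at_maturity):
    """Home-currency value of a USD receivable that was sold forward.

    Rates are quoted as home currency per USD. The receivable is converted
    at the spot rate prevailing at maturity; the forward sale pays the
    difference between the agreed forward rate and that spot rate.
    """
    receivable_value = usd_amount * spot_at_maturity
    forward_payoff = usd_amount * (forward_rate - spot_at_maturity)
    return receivable_value + forward_payoff


usd_inflow = Fraction(1_000_000)  # hypothetical export receivable
forward = Fraction(80, 100)       # hypothetical forward rate, EUR per USD

# Whatever the spot rate turns out to be, the two effects cancel out.
for spot in (Fraction(70, 100), Fraction(80, 100), Fraction(95, 100)):
    value = hedged_home_value(usd_inflow, forward, spot)
    print(f"spot {float(spot):.2f} -> hedged value EUR {float(value):,.0f}")
```

In every scenario the hedged value equals the US-dollar amount times the forward rate, which is the fixed home-currency value referred to in the text. Exact fractions are used instead of floating-point numbers so that the cancellation is exact rather than approximate.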
In order to gain from economies of scale and consequently reduce the cost of risk management, MNCs tend to manage foreign currency risk on a global level rather than on a domestic level. More precisely, firms try to identify their overall net positions in any given currency by subtracting expected cash outflows (“short” positions) from expected cash inflows (“long” positions) over all of their sub-units. Since the effects of exchange rate changes on long and short positions cancel each other out, only the net position is effectively exposed to exchange risk. Hence, the netting of foreign currency exposure over the parent company and all subsidiaries results in a substantially reduced volume of hedging transactions. In order to enable netting on a global, group-wide level, corporate financial management has to deploy a central treasury management system which retrieves the local exposure data from the subsidiaries and implements mechanisms and functionality to support the multilateral netting process (Cooper 2000).
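The netting logic described above can be illustrated with a minimal sketch. The entities, currencies and amounts below are hypothetical, and the function names `net_positions` and `gross_positions` are assumptions introduced for this example; the sketch shows only the arithmetic of netting, not the functionality of a treasury management system.

```python
from collections import defaultdict


def net_positions(reports):
    """Sum signed exposures per currency over all reporting entities.

    Positive amounts are expected inflows ("long"), negative amounts are
    expected outflows ("short").
    """
    totals = defaultdict(int)
    for _entity, currency, amount in reports:
        totals[currency] += amount
    return dict(totals)


def gross_positions(reports):
    """Sum absolute exposures per currency, i.e. the volume that would be
    hedged if every entity hedged its own position separately."""
    totals = defaultdict(int)
    for _entity, currency, amount in reports:
        totals[currency] += abs(amount)
    return dict(totals)


# Hypothetical exposure reports: (entity, currency, signed amount).
reports = [
    ("Entity A", "USD", 5_000_000),     # long: expected USD inflow
    ("Entity B", "USD", -3_500_000),    # short: expected USD outflow
    ("Entity B", "JPY", 200_000_000),
    ("Entity C", "JPY", -150_000_000),
]

net = net_positions(reports)
gross = gross_positions(reports)
for currency in sorted(net):
    print(f"{currency}: gross {gross[currency]:,} -> net {net[currency]:,}")
```

In this example, only the net positions would have to be hedged externally, which is considerably less than the sum of the individual positions in each currency.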
6.4.2
Bayer’s Foreign Exchange Management Practice
The primary goal of foreign exchange management at Bayer is to reduce the risks that arise from exchange rate fluctuations. This is achieved through the elimination of the effects of exchange rates on booked and anticipated foreign exchange cash flows, such that the impact of currency rate changes on receivables and payables is fully offset (except for hedging costs). Financial risk management is focused solely on exposures which have an impact on cash flows. This means that purely accounting-based risks, for example gains and losses stemming from foreign exchange translation, are not hedged. Foreign currency exposures of booked cash flows are completely hedged, while currency exposures from anticipated cash flows are only partially hedged. The hedge ratios for the anticipated foreign-currency exposures are determined once a year by corporate financial management. Bayer distinguishes between two perspectives on FX management: the group-level perspective and the legal-entity perspective. From the group perspective, the corporate financial headquarters hedges the group net exposure with external bank counterparties, usually by executing FX forward contracts. This practice ensures that the FX risk of the Bayer Group is hedged according to the corporate financial guidelines. Yet, the legal entities do not automatically participate in the group-level hedging activities, in the sense that the hedging results are not automatically attributed to the respective corporate entities and their exposure. The legal entities are free to decide on their individual FX management strategy, with the option to hedge their FX risk or to speculate on improving their earnings by betting on favorable exchange rate developments. If they decide to safeguard their own P&L from the effects of FX rate fluctuations, they may hedge their exposure with the corporate financial service centre (CFSC) by means of internal FX hedges. Here, the CFSC acts as an in-house bank and quotes market-oriented prices for the requested hedging transaction.
5 Refer to (Glaum and Brunner 2003) for a detailed discussion of the individual process steps.
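The hedging rule stated above (booked exposures fully hedged, anticipated exposures hedged only at a ratio fixed once a year) amounts to a simple calculation. The following sketch is illustrative: the function name `target_hedge_amount`, the amounts, and in particular the hedge ratio of 0.5 are assumptions, since the text does not disclose the ratios actually used.

```python
from decimal import Decimal


def target_hedge_amount(booked, anticipated, anticipated_hedge_ratio):
    """Amount to be hedged in one currency.

    Booked exposure is hedged in full; anticipated exposure is hedged only
    at the given ratio (a value between 0 and 1).
    """
    if not Decimal("0") <= anticipated_hedge_ratio <= Decimal("1"):
        raise ValueError("hedge ratio must lie between 0 and 1")
    return booked + anticipated_hedge_ratio * anticipated


# Hypothetical figures; the ratio of 0.5 is an assumption for illustration.
booked_usd = Decimal("4000000")
anticipated_usd = Decimal("10000000")
ratio = Decimal("0.5")

target = target_hedge_amount(booked_usd, anticipated_usd, ratio)
print(f"Target hedge: USD {target:,.0f}")
```

With these assumed figures, the full booked exposure of USD 4 million plus half of the anticipated USD 10 million would be hedged.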
6.4.3
Redesigning Foreign Exchange Risk Management with CoFiPot
In this section, we present the redesigned foreign exchange risk management process at Bayer as it has been implemented using the means offered by the CoFiPot. Owing to the possibilities offered by the portal, financial processes at Bayer remain in a state of constant evolution. The discussion follows the steps of the foreign exchange risk management process presented in Section 6.4.1. We exclude the task of determining the corporate FX guidelines from the subsequent discussion, as the definition of corporate financial management policies is not within the scope of the presented project.
6.4.3.1 Identification and Measuring FX Exposure
To determine the Bayer Group net exposure, the data on the booked foreign exchange exposure has to be reported by the legal entities on a regular basis. For the majority of entities, this is done automatically on the basis of a daily data import process. The remaining entities, which lack the technical requirements to participate in the automated process, have to report their exposure data manually, either via e-mail or by directly entering the data using an exposure upload service provided by the Corporate Financial Portal ((1) in Figure 6.3). In either case, the reported data is stored in a central exposure database that is managed and used primarily by the financial headquarters for the purpose of group-level exposure management.
[Figure 6.3: process diagram. (1) Report Exposure: the legal entities (Entity A, Entity B) report their local exposure to the Corporate Exposure Database at the Corporate Center, either as an automated exposure report or as a manual exposure report via e-mail or CoFiPot. (2) Monitor Exposure: exposure reports are requested on demand through the FX Exposure Monitor.]
Figure 6.3. Measuring FX exposure at Bayer with CoFiPot
[Figure 6.4: screenshot. Annotation: FX Exposure reports offer different levels of aggregation (from group level to the individual transaction).]
Figure 6.4. CoFiPot FX Exposure Monitor service
The role of the FX Exposure Monitor is to access the exposure database, extract the relevant data, and prepare the required exposure reports ((2) in Figure 6.3). Personalization services provided by the CoFiPot application framework enable the FX Exposure Monitor to provide exposure reports at different levels of detail and aggregation, depending on the user’s preferences and security level. Group-level FX managers have access to all exposure data, while local FX managers only get the exposure reports for the entities within their responsibility. By means of the FX Exposure Monitor, the local FX managers have access to, and rely on, the same exposure reporting capabilities as the group-level FX managers. The FX Exposure Monitor offers on-demand reports of the current FX exposure that automatically highlight critical exposure levels. The exposure report provides three levels of aggregation: the group net exposure, the entity level, and the transaction level. Moreover, the application offers basic data warehouse slice-and-dice capability with regard to the dimensions region, sub group, entity and date (cf. Figure 6.4).
6.4.3.2 Hedging FX Exposure
On the basis of the FX exposure level reported by the CoFiPot, the corporate centre decides on the subsequent hedging actions on the group level. In the past, this process was carried out on a monthly basis. However, owing to systems integration and process automation, a weekly evaluation of the current corporate foreign currency exposure could be implemented. The process of evaluating the foreign currency exposure, deciding on the hedging needs, and subsequently engaging in hedging transactions on the global FX market has been automated to a large extent. The CoFiPot not only supports affiliates in their data reporting tasks, but also assists them in their individual risk management needs.
If the local FX manager identifies the need to safeguard the legal entity’s profit and loss from FX rate fluctuations, he may do so by entering into internal FX forward contracts with the CFSC. He can either rely on the traditional voice trade channel (i.e. telephone trading) or choose the CoFiPot FX Trade Entry application (Figure 6.5) to enter into valid FX transactions around the clock (24/7). With the FX Trade Entry, the FX manager has to specify the trade type, the settlement type, and the currency pair with the buy or sell amount, respectively. The application relies on personalization mechanisms to reduce the interaction complexity by pre-selecting suitable options and omitting irrelevant ones. Using the “get quote” functionality, the FX manager obtains a quote for the requested transaction that is based on the current market rate as provided by an external market data provider. The quote is valid for five minutes. Figure 6.6 describes the FX exposure hedging process. After the FX manager executes the FX trade, the CoFiPot FX Trade Entry application directly saves the trade details as specified into the corporate treasury management system (TMS) and notifies the staff at the CFSC for final agreement and confirmation. It is important to note that, regardless of the actual time the trade is processed at the CFSC, the trade conditions on execution are guaranteed. Upon agreement and confirmation, the FX manager automatically receives a confirmation letter as a contract for this hedge. The FX manager then confirms the trade either by replying to the e-mail with the confirmation letter attached or via the CoFiPot. Furthermore, documentation that complies with the IFRS hedge accounting requirements and guidelines6 is automatically created and stored for the transaction. This documentation is necessary for the subsequent controlling requirements.
[Figure 6.5: screenshot. Annotation: Quote from market data provider. Mid rate between bid and ask. Valid for 5 minutes.]
Figure 6.5. CoFiPot FX Trade Entry service
[Figure 6.6: process diagram. The legal entity (local exposure, hedging demand) uses the CoFiPot FX Trade Entry to (1) request a deal 24/7 and (2) get valid market quotes from the external market data provider; (3) the trade is saved in the Corporate Treasury Management System; (4) confirmation and settlement take place with the Corporate Center.]
Figure 6.6. Hedging FX risk process (intercompany)
6 The reader is referred to (Deloitte 2006) for a comprehensive discussion on the IFRS hedge accounting guidelines and requirements.
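The quoting rule of the FX Trade Entry described in this section (a mid rate between bid and ask that is valid for five minutes) can be sketched as follows. This is a simplified illustration and not the CoFiPot implementation: the class and function names, the EUR/USD rates, and the choice to treat the five-minute boundary as inclusive are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal

QUOTE_VALIDITY = timedelta(minutes=5)  # validity period stated in the text


@dataclass(frozen=True)
class Quote:
    currency_pair: str
    rate: Decimal
    issued_at: datetime

    def is_valid(self, now):
        """A quote may be used from its issue time until the validity ends."""
        return self.issued_at <= now <= self.issued_at + QUOTE_VALIDITY


def get_quote(currency_pair, bid, ask, now):
    """Quote the mid rate between the bid and ask of the market data feed."""
    if bid > ask:
        raise ValueError("bid must not exceed ask")
    return Quote(currency_pair, (bid + ask) / 2, now)


def execute_trade(quote, amount, now):
    """Fix the trade conditions at execution time; reject expired quotes."""
    if not quote.is_valid(now):
        raise ValueError("quote expired; request a new quote")
    return {"currency_pair": quote.currency_pair, "amount": amount,
            "rate": quote.rate, "executed_at": now}


# Hypothetical market data and timestamps.
issued = datetime(2007, 2, 1, 9, 0, 0)
quote = get_quote("EUR/USD", Decimal("1.2998"), Decimal("1.3002"), issued)
trade = execute_trade(quote, Decimal("1000000"), issued + timedelta(minutes=3))
print(trade["rate"], trade["executed_at"])
```

Storing the rate on the trade record at execution mirrors the guarantee mentioned in the text that the trade conditions hold regardless of when the trade is later processed at the CFSC.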
6.4.3.3 Controlling FX Hedging Activities
Every FX transaction that is closed with the CFSC is stored and managed by the corporate treasury management system TMS. Consequently, these trades automatically participate in the TMS workflows, which include the valuation of the individual trades. The TMS provides FX reporting capabilities that originally were only available to the employees at the CFSC. In this regard, the primary objective of the CoFiPot FX Trades application is to provide convenient access to these FX trade reports as provided by the TMS, not only for the CFSC but for every FX manager within Bayer’s financial community. With the FX Trades application, the FX manager can now request an overview of all FX trades that he is responsible for, with the respective market values either in Euro or in local currency. Following the general CoFiPot reporting standards, the FX Trades application allows for a slice and dice of the data according to the dimensions entity, portfolio, date, type, and current trade status (Figure 6.7). It is important to note that the level of detail provided by the CoFiPot FX Trades reports satisfies the requirements of Bayer’s accounting principles (HGB, IFRS, US-GAAP).
[Figure 6.7: screenshot. Annotations: new internal/external FX transactions can be closed; documentation according to IAS 39 is automatically created for new FX transactions.]
Figure 6.7. CoFiPot FX Trades service
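The slice-and-dice reporting described in this section can be pictured as filtering and grouping trade records along the named dimensions. The sketch below is a hypothetical illustration: the `Trade` record, its field names, and the sample data are assumptions and do not reflect the data model of the TMS or of the FX Trades application.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Trade:
    entity: str
    portfolio: str
    trade_date: date
    trade_type: str
    status: str
    market_value_eur: int


def slice_trades(trades, **criteria):
    """Keep the trades whose attributes equal all given criteria."""
    return [t for t in trades
            if all(getattr(t, name) == value for name, value in criteria.items())]


def total_by(trades, dimension):
    """Sum the market values in Euro, grouped by one dimension."""
    totals = {}
    for trade in trades:
        key = getattr(trade, dimension)
        totals[key] = totals.get(key, 0) + trade.market_value_eur
    return totals


# Hypothetical trade records.
trades = [
    Trade("Entity A", "Hedging", date(2006, 3, 1), "forward", "confirmed", 12_000),
    Trade("Entity A", "Hedging", date(2006, 3, 8), "forward", "open", -4_000),
    Trade("Entity B", "Hedging", date(2006, 3, 8), "forward", "confirmed", 7_500),
]

# A local FX manager sees only the trades of the entity he is responsible for.
own_trades = slice_trades(trades, entity="Entity A")
print(total_by(own_trades, "status"))
print(total_by(trades, "entity"))
```

Restricting the input to the trades of one entity before grouping corresponds to the statement that an FX manager obtains an overview of the trades he is responsible for.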
6.4.4
Lessons Learned
In the following, we evaluate the effects of the corporate financial portal on the foreign exchange risk management process of Bayer. The discussion follows the steps of the risk management process.
The integration of the transactional data from the local/regional ERP systems and the automated daily import of this data into the corporate exposure database have significantly reduced the need for coordination and communication with the respective subsidiaries. More importantly, the exposure database is updated on a daily basis, offering the CFSC a timelier and more accurate overview of the group-level exposure. Currently, virtually all of the relevant affiliates are integrated into the exposure upload process.
The process of managing foreign currency risk has also improved significantly, as the integration efforts laid the basis for the automation of this process. The information stored in the exposure database is assessed and used to automatically determine and execute hedging activities on the corporate level. As stated in Section 6.4.3.2, this is done on a weekly basis, which would not be manageable with manual processing. In fact, automation would allow for an even higher frequency of the risk management process. Yet, it still has to be evaluated whether this would result in any further significant improvement of the risk management process.
Traders of subsidiaries have the option to close foreign exchange transactions using the traditional means of communication (e.g. telephone, fax) or the corporate financial portal. Figure 6.8 illustrates the development of the hedging transactions per month for the regarded periods and shows whether the telephone or the CoFiPot was used as a transaction channel. Furthermore, the line illustrates the development of the percentage of CoFiPot transactions. In 2005, on average 82% of all internal transactions were closed using CoFiPot. From February 2002 until June 2006, a total of 8,966 internal trades were closed using the financial portal (1,694 using the telephone trading channel). We determined through observation that the average time to execute an internal transaction is about 4 minutes using the telephone. Even under the assumption that closing a transaction using the portal takes approximately the same amount of time, one has to consider that with the telephone two persons are involved in the trading process.
[Figure 6.8: chart. Monthly number of internal transactions closed via CoFiPot (# CoFiPot) and via telephone (# Telephone), January 2002 to May 2006, with the percentage of CoFiPot transactions (Percentage CoFiPot, 0 to 100) plotted as a line.]
Figure 6.8. CoFiPot vs. telephone transactions (intercompany)
The analysis of the transaction data further shows that since the introduction of the CoFiPot FX management services in January 2002 the number of transactions has increased from an average of 60 trades per month in 2001 to an average of 235 trades per month in 2005. Figure 6.8 further outlines the different trading channels used for the transactions. The data reveals that, following the introduction phase, CoFiPot has replaced traditional telephone trading for most transactions. As of July 2002, the number of CoFiPot FX Trade Entry trades surpasses that of telephone trades. There are at least two reasons for the strong increase in total hedging transactions (cf. Figure 6.9). First, the average number of hedges per entity has increased from 4 to 8 trades per month. Second, and more important, the increase in hedges stems from an increased number of entities that hedge their exposure. Figure 6.9 shows that this number has increased from 14 entities in 2001 to 35 entities in 2005. Both effects explain the strong increase in hedges. Fluctuations in this number are due to external business events. As an example, the Lanxess spin-off in January 2005 was followed by the departure of the respective entities, which consequently led to a significant drop in the hedging activities for the subsequent periods.
To further support the subsidiaries with their FX management operations, the CFSC offers them the option to participate in a so-called Autohedging process.
This initiative is another example that illustrates the role of the corporate financial portal as an enabler of new financial practices. The idea is to automatically create internal hedges whenever the foreign exchange exposure of a participating subsidiary changes substantially. The only prerequisite is that the subsidiary feeds the corporate exposure database with its exposure data. The corporate centre will continuously observe the currency exposure of these subsidiaries and execute
Figure 6.9. CoFiPot vs. telephone transactions per entity; number of active entities (series: average number of telephone trades per entity per month, average number of CoFiPot trades per entity per month, number of active hedging entities; annotated events: CoFiPot Trade Entry, Autohedging, Lanxess spin-off)
Hong Tuan Kiet Vo et al.
hedging activities when needed. The financial portal can be used to monitor these hedging activities. Following the active promotion of this service in 2004, the Autohedging service currently accounts for one half of all FX exposure hedges. Setting aside effects from external business events, CoFiPot currently accounts for an average of 80 percent of all hedging transactions.
6.5
Conclusion
This paper identifies and discusses the challenges that the financial management of multinational corporations has to meet. Under the pressure to continuously improve financial business processes, financial management not only has to find ways to streamline existing processes but also has to look for new ways of doing business by designing and implementing new processes. This demand can only be met with the help of modern information systems that reduce the complexity of the integration, communication, coordination and collaboration challenges, which can be especially demanding in a multinational business environment. We describe the concept of the corporate financial portal as a central hub to financial information, systems and services. We argue that with this information system some of the above-mentioned demands can be met. Moreover, we believe that a corporate financial portal project not only helps to solve existing operational needs but, more importantly, also enables new financial practices that improve the quality of financial management functions. Finally, we describe and evaluate the corporate financial portal implementation of Bayer AG. On the basis of Bayer’s foreign exchange risk management practices we evaluated the role of the corporate financial portal within the context of redesigning the underlying processes of this key function. System integration and process automation enabled us to design and implement new financial practices such as hedging currency exposure on a weekly rather than a monthly basis. Furthermore, we were able to better meet the foreign exchange risk management demands of the local affiliates by giving them the possibility to use the corporate financial portal to execute and monitor inter-company foreign exchange transactions. Following the process reengineering, we observed a strong increase in the hedging activities of the affiliates since the introduction of this Web-based trading service.
CHAPTER 7
Capital Markets in the Gulf: International Access, Electronic Trading and Regulation
Peter Gomber, Marco Lutat, Steffen Schubert
7.1
Introduction
In 1981, the Gulf Cooperation Council (GCC) was formed by six Arab countries: Kuwait, Bahrain, Qatar, Saudi Arabia, Oman and the United Arab Emirates (UAE). The GCC aims to promote cooperation between its member states, mainly in the fields of economy and industry. So far, the GCC has succeeded in a number of areas. Citizens of GCC countries can move freely among the six countries without visas, and no customs duties are levied within the GCC (although this applies only to products made within the GCC). More significantly, professionals of one GCC state have since been free to work in another. Nationals of GCC member states are also allowed to own shares in companies that operate within the group.
The academic literature offers little information on Arab stock markets, although these markets have developed rapidly in the recent past and are currently aiming to establish world-class stock exchange standards. In the following, an insight into those markets is provided by describing regional macroeconomic and exchange business developments. The chapter is structured as follows: Section 7.1 will give an overview of the member countries of the GCC by presenting recent macroeconomic developments, while Section 7.2 will characterize the exchange business in the region with trading figures, market capitalization etc. International investors’ requirements will be investigated in Section 7.3 and the exchanges of the region are analyzed in Section 7.4. The most recent exchange in the Gulf, the Dubai International Financial Exchange (DIFX), will serve in Section 7.5 as a case study of the approaches to further open up the region’s markets. Findings will be summarized and conclusions presented in the last section.
While the combined GCC population was at a level of 21.89 million in 1990 and 25.99 million in 1995, it reached more than 35 million in 2005 according to estimates, which makes the GCC one of the fastest growing regions in the world. The Kingdom of Saudi
Peter Gomber et al.
Table 7.1. Population in the GCC countries (Source: Central Intelligence Agency 2005)

Country     Population (million)
Bahrain      0.69
KSA         26.42
Kuwait       2.34
Oman         3.00
Qatar        0.86
UAE          2.56
Total       35.87

Table 7.2. GDP for GCC countries, purchasing power parity, compared to Germany and the UK¹ (Source: Central Intelligence Agency 2005)

Country          GDP for 2004 ($ billion)
Bahrain             13.01
KSA                310.2
Kuwait              48.0
Oman                38.09
Qatar               19.49
UAE                 63.67
Total GCC          492.46
Germany          2,362.00
United Kingdom   1,782.00
Arabia (KSA), by far the biggest GCC nation, accounts for more than 73 percent of the total population, while Bahrain, the smallest of these countries, contributes less than 2 percent. In terms of population, the whole GCC region is comparable to Canada, whose population is estimated at 32.8 million for July 2005 (Central Intelligence Agency 2005). The discrepancy in the size of the countries’ populations is also reflected in the national gross domestic products (GDP) which, for 2004, range between $13.01 billion for Bahrain and $310.2 billion for the Kingdom of Saudi Arabia. Compared to those of Germany and the UK, even the GDP for the whole Gulf region is low. Relative to population, the highest GDP per capita is found in the United Arab Emirates and the lowest in the Kingdom of Saudi Arabia, despite its oil wealth. In 2003, GDP growth rates increased significantly in the GCC countries compared to the average annual growth rate from 1998 to 2002 (except for Oman). Since then, growth rates have fallen slightly in most countries but are still remarkable.

¹ In the following tables, we will provide comparisons to Germany, UK and/or the US and the World respectively without listing the comparison countries explicitly in the headers of the tables.
7 Capital Markets in the Gulf
Table 7.3. GDP per capita of GCC countries, purchasing power parity (Source: Central Intelligence Agency 2005)

Country          GDP per capita for 2004 ($)
Bahrain          19,200
KSA              12,000
Kuwait           21,300
Oman             13,100
Qatar            23,200
UAE              25,200
Germany          28,700
United Kingdom   29,600
Table 7.4. Real GDP growth rates, annual change in percentage (Source: IMF, Middle East and Central Asia Regional Economic Outlook, September 2005, p. 41 and IMF, World Economic Outlook, September 2005, p. 205−206)
Bahrain KSA Kuwait Oman Qatar UAE Germany UK USA World
Against the background of these macroeconomic data and their comparison with developed economies, the following section provides an overview of the development of the stock markets in the region.
7.2
The Evolution of GCC Stock Markets and Its Implications
Stock exchanges in the GCC region do not have a long history compared to markets in the US or Europe. Local stock exchanges have existed, both formally and informally, since the 1970s, but organized trading activity started only after the 1980s. Since the inauguration of the Dubai Financial Market (DFM) in March 2000, all six member states of the GCC have officially regulated stock exchanges (Dubai Financial Market 2006). Following the commencement of the Dubai Financial Market, the Abu Dhabi Securities Market (ADSM) opened in November 2000, making the UAE the only GCC member country that is home to more than one stock exchange. The following table provides the commencement dates of regulated stock exchanges for the GCC countries.

Table 7.5. Stock exchange commencement in the GCC countries (Source: FINCORP, GCC Markets – A special review, November 10, 2005, p. 2)

Country   Name of the stock market               Year of commencement
Kuwait    Kuwait Stock Exchange (KSE)            1977
Oman      Muscat Securities Market (MSM)         1988
Bahrain   Bahrain Stock Exchange (BSE)           1988
KSA       Tadawul Saudi Stock Market (TSSM)      1989–2001*
Qatar     Doha Securities Market (DSM)           1997
UAE       Abu Dhabi Securities Market (ADSM)     2000
UAE       Dubai Financial Market (DFM)           2000

* Automated clearing system was introduced in 1989 and the electronic trading system Tadawul started in 2001

Going hand in hand with the growing economies in the region (see Section 7.1), market indicators show that GCC stock markets have developed positively in recent years, i.e. they have grown in market capitalization, trading activity and stock performance. In terms of absolute market capitalization the region is dominated by the exchange of Saudi Arabia, which is by far the biggest market, followed by the aggregated market capitalization of both UAE markets; the smallest markets are those of Oman and Bahrain. The following table compares the GCC market capitalization to Germany and the United Kingdom.²

Table 7.6. Market capitalization for GCC countries, Germany and the UK (Source: World Bank, World Development Indicators 2005, Table 5.4 – Stock markets; GCC exchanges websites, Federation of Euro-Asian Stock Exchanges (FEAS) 2005)

Country          Market capitalization at the end of 2004 (US-$ millions)
Bahrain             13,500
KSA                306,248
Kuwait              75,200
Oman                 9,317
Qatar               40,452
UAE                 84,088
Total GCC          528,805
Germany          1,079,026
United Kingdom   2,412,434

² Due to data availability the following statistics are listed until the end of 2004.
Table 7.7. Market capitalization as a percentage of national GDP

Country          Market cap as a percentage of GDP (2004)
Bahrain          103.8
KSA               98.7
Kuwait           156.7
Oman              24.5
Qatar            207.6
UAE              132.1
Germany           45.7
United Kingdom   135.4
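The ratios in Table 7.7 can be reproduced from the GDP figures in Table 7.2 and the market capitalization figures in Table 7.6. The following Python sketch shows the arithmetic. It assumes that the ratio is simply market capitalization divided by GDP after converting GDP from billions to millions of US dollars; the names `GDP_2004_USD_BN`, `MARKET_CAP_2004_USD_M` and `market_cap_to_gdp_pct` are illustrative and do not stem from the cited sources.

```python
# GDP for 2004 in $ billion (Table 7.2) and market capitalization at the
# end of 2004 in US-$ millions (Table 7.6).
GDP_2004_USD_BN = {
    "Bahrain": 13.01, "KSA": 310.2, "Kuwait": 48.0, "Oman": 38.09,
    "Qatar": 19.49, "UAE": 63.67, "Germany": 2362.00, "United Kingdom": 1782.00,
}
MARKET_CAP_2004_USD_M = {
    "Bahrain": 13_500, "KSA": 306_248, "Kuwait": 75_200, "Oman": 9_317,
    "Qatar": 40_452, "UAE": 84_088, "Germany": 1_079_026,
    "United Kingdom": 2_412_434,
}


def market_cap_to_gdp_pct(country):
    """Return market capitalization as a percentage of GDP for one country."""
    gdp_usd_m = GDP_2004_USD_BN[country] * 1_000  # billions -> millions
    return MARKET_CAP_2004_USD_M[country] / gdp_usd_m * 100


for name in GDP_2004_USD_BN:
    print(f"{name:15s} {market_cap_to_gdp_pct(name):6.1f}")
```

Rounded to one decimal place, the computed values agree with the figures listed in Table 7.7, for example 207.6 percent for Qatar and 45.7 percent for Germany.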
Table 7.8. Annual market capitalization growth rates in percentage (Source: Exchanges’ websites and Federation of Euro-Asian Stock Exchanges, 2005)

Name of the stock exchange      2002     2003     2004     2002−2004 average
Bahrain Stock Exchange          14.54    29.13    40.7      27.67
Tadawul Saudi Stock Market       2.0    110.14    94.7      60.99
Kuwait Stock Exchange           27.0     72.0     22.2      38.71
Muscat Securities Market        15.6     40.65    29.9      28.30
Doha Securities Market          44.0    152.7     51.4      76.62
Abu Dhabi Securities Market    252.8     49.1     82.7     112.61
Dubai Financial Market          20.33    50.26   145.47     64.33
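The source does not state how the 2002−2004 averages were calculated. The listed values appear to be consistent with a compound (geometric) average of the annual growth rates rather than an arithmetic mean; this is an inference from the figures, not a statement by the authors. A minimal Python sketch under that assumption, with the illustrative function name `compound_average_growth`:

```python
def compound_average_growth(annual_rates_pct):
    """Geometric average of annual growth rates, in percent.

    Each rate r is turned into a growth factor (1 + r/100); the factors are
    multiplied and the n-th root of the product is taken.
    """
    if not annual_rates_pct:
        raise ValueError("at least one annual rate is required")
    factor = 1.0
    for rate in annual_rates_pct:
        if rate <= -100.0:
            raise ValueError("growth rates must be greater than -100 percent")
        factor *= 1.0 + rate / 100.0
    return (factor ** (1.0 / len(annual_rates_pct)) - 1.0) * 100.0


# Annual growth rates for 2002, 2003 and 2004 taken from Table 7.8.
bahrain = compound_average_growth([14.54, 29.13, 40.7])
abu_dhabi = compound_average_growth([252.8, 49.1, 82.7])
print(f"Bahrain Stock Exchange:      {bahrain:.2f}")
print(f"Abu Dhabi Securities Market: {abu_dhabi:.2f}")
```

For comparison, the arithmetic mean of the Bahrain figures would be about 28.12 percent, whereas the table lists 27.67 percent.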
Market capitalization in the region as a whole is less than half the German and less than one fourth of the UK market capitalization. Concerning the ratio of market capitalization to GDP, many countries in the GCC region show quite high ratios, led by Qatar with a market capitalization of 207.6 percent of its GDP. Beginning in 2002, several GCC markets have recorded triple-digit growth rates in their market capitalization, which can be explained by two factors: First, a successful wave of privatization has begun in these countries, e.g. in the telecommunication sector. This development has met with strong interest from local investors, which led to initial public offerings regularly being over-subscribed many times; the IPO of Ettihad Etisalat, for example, was over-subscribed 51 times in Saudi Arabia in November 2004 (Jafarkhan 2004). Second, index performance soared by double digits in most markets in recent years. Taking a closer look at the market capitalization growth rates, it is noticeable that the most positive developments have been experienced by countries with high oil production, which are benefiting from soaring oil prices. Another explanation for the comfortable environment for stock business is the repatriation of money. Money that was invested in markets in the US and the UK has been reinvested in the home markets. This started after the attacks of September 2001 as Arab investors feared that
Table 7.9. Index performance in percentage (Source: Exchanges’ websites and FEAS 2005)

Name of the stock exchange      2002     2003     2004     2002−2004 average
Bahrain Stock Exchange           3.41    28.81    30.17    20.14
Tadawul Saudi Stock Market       4.0     76.23    84.93    50.21
Kuwait Stock Exchange           24.1     63.9     11.9     31.54
Muscat Securities Market        26.2     42.1     23.8     30.45
Doha Securities Market          37.3     69.8     64.5     56.53
Abu Dhabi Securities Market      7.8     28.6     74.8     34.32
Dubai Financial Market          21.1     45.4     49.0     37.92
Table 7.10. Annual growth rates of trade volume in percentage (Source: GCC exchanges’ websites and FEAS 2005)

Name of the stock exchange      2002     2003      2004     2002−2004 average
Bahrain Stock Exchange           7.21     26.4      70.71    32.25
Tadawul Saudi Stock Market      60.0     346.0     197.0    176.74
Kuwait Stock Exchange           87.6     142.7      −6.0     62.36
Muscat Securities Market        41.02    156.64     24.12    65.00
Doha Securities Market         113.8     264.6      97.0    148.56
Abu Dhabi Securities Market    150.0     176.0     343.0    212.67
Dubai Financial Market         157.5      49.3    1337.8    280.93
the US government, among others, might take it from them or freeze their accounts. As a further reason it is argued that the peg of the GCC currencies to the US dollar and low interest rates in the region provide little incentive to save and instead encourage borrowing money to speculate on better returns (Rutledge 2005). Besides the initial public offerings, companies already listed release new shares as there is a positive climate for secondary offerings. Cumulatively, those factors are reflected in an increase in trading activity. Table 7.10 and Table 7.11 show the development of trade volume growth rates and the increase in the number of trades respectively for the years 2002 to 2004. Despite these statistics on the positive evolution of GCC stock market business, one can argue that, compared to traditional markets like the US or Europe, markets in the region are still developing with regard to exchange activity. This can be shown by taking a closer look at the respective turnover ratios, i.e. the quotient of traded volume and market capitalization, and by comparing them to other countries. Except for Saudi Arabia, all GCC markets show a low turnover ratio compared to the markets in Germany, the UK and the US. Qatar in particular can be seen as an outlier, as its trade value accounts for only 0.16 percent of its overall market capitalization.
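The turnover ratio described above is a simple quotient. The following Python sketch illustrates it; the function name `turnover_ratio_pct` and the input values are hypothetical and serve only to show the calculation, they are not market data.

```python
def turnover_ratio_pct(traded_value, market_capitalization):
    """Turnover ratio in percent: traded value relative to market capitalization.

    Both arguments must be expressed in the same currency and unit.
    """
    if market_capitalization <= 0:
        raise ValueError("market capitalization must be positive")
    return traded_value * 100.0 / market_capitalization


# Hypothetical market: 50 units traded against a capitalization of 1,000 units.
print(turnover_ratio_pct(50.0, 1_000.0))
```

A ratio of 0.16 percent, as reported for Qatar, would correspond to a traded value of 1.6 units per 1,000 units of market capitalization.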
Table 7.11. Annual increase in the number of trades in percentage (Source: GCC exchanges’ websites and FEAS 2005)

Name of the stock exchange      2002     2003     2004      2002−2004 average
Bahrain Stock Exchange          −2.87    12.73     7.62       5.62
Tadawul Saudi Stock Market      71.0    264.0    254.0      180.35
Kuwait Stock Exchange           46.8    107.5     −2.3       43.84
Muscat Securities Market        58.67    92.92    33.44      59.85
Doha Securities Market          88.9    352.2    111.3      162.31
Abu Dhabi Securities Market     97.0    126.0    n/a**      111.00 (2002/2003)
Dubai Financial Market          84.2      0.1    754.3      150.68

** Annual report 2004 contains only statistics for 2003
Table 7.12. Turnover ratios defined as trade volume relative to market capitalization (Source: World Bank, World Development Indicators 2005; GCC exchanges’ websites) Country Bahrain KSA Kuwait Oman Qatar UAE Germany United Kingdom USA
Two major implications can be derived from the recent evolution of GCC stock markets: First, the rapid increase in listed companies, market capitalization and trading activity requires the regional exchanges to provide professional and reliable infrastructures and functionalities to their investors. As local investors develop and improve their skills in stock trading, they will call for regulation, market integrity, trading models, information services and operational reliability according to international standards where these do not yet exist. Second, due to their boost in stock performance and the resulting attractiveness, GCC markets will experience broader interest from international investors seeking high-profit investments. Stock exchanges in the region as well as local investors should have an interest in attracting foreign investments as this will lead to additional liquidity. If obstacles to foreign investment in the countries of the region are too high, additional liquidity from other GCC and non-GCC investors will stay away from local markets.
7.3
Requirements of International Investors
Attracting international investors is important for developing economies, because foreign investment is one way to fuel economic growth. The following analysis of international investors’ requirements discusses (i) transaction costs and market liquidity, (ii) the instrument scope, (iii) information dissemination, (iv) market regulation and trading surveillance, (v) trading systems and their mechanisms and (vi) investment restrictions.

(i) Transaction Costs and Market Liquidity
A well-organized financial market is characterized by low costs for all aspects related to a security transaction (Reilly 1989, pp. 75−76). When making a transaction, any investor needs to take total transaction costs into account, as they have a negative influence on the investment’s performance. The different cost components of a transaction can be classified as either explicit or implicit. Cost components which are known and calculable before a trade occurs are referred to as explicit, while costs induced by the trade itself and the market conditions at the time of the transaction are referred to as implicit. Implicit trading costs are closely linked to market liquidity and are not easily observable (Davydoff et al. 2002, p. 4). The explicit costs are determined directly by the exchange itself and by the intermediary serving the investor, while the implicit costs result from various factors, e.g. the form of the market system or the existence of liquidity providers (Degryse and Van Achter 2002). Possible explicit cost components for a transaction are: broker commissions, market maker commissions, exchange fees, clearing and settlement costs.³ In most cases the investor will not face explicit exchange fees directly, unless he has direct non-intermediated access to a market, as he is charged a commission by the intermediary. Nevertheless, there is an indirect relationship between the exchange fees and the commission which an investor is charged. Market-related cost components are often displayed in bundled form, e.g. exchange, clearing and settlement fees as an aggregated cost component if those services are offered from one source. International investors in particular will be able to compare costs between international markets and potential new markets. Market liquidity is widely seen as the most decisive criterion for market quality, as it gives investors the opportunity to buy and sell equities at justifiable costs.
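The distinction between explicit and implicit costs can be illustrated with a small Python sketch. It rests on an assumption that is not part of the text: the implicit cost is approximated by the difference between the execution price and a pre-trade benchmark price, multiplied by the traded quantity. The function name `transaction_cost_breakdown`, the cost labels and all figures are hypothetical.

```python
def transaction_cost_breakdown(side, quantity, execution_price,
                               benchmark_price, explicit_costs):
    """Split the total cost of one trade into explicit and implicit parts.

    explicit_costs maps a label (e.g. "broker_commission") to an amount that
    is known before the trade. The implicit cost is approximated (assumption)
    as the adverse deviation of the execution price from a pre-trade
    benchmark price.
    """
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    explicit = sum(explicit_costs.values())
    # A buyer loses when paying more than the benchmark, a seller when
    # receiving less than the benchmark.
    direction = 1.0 if side == "buy" else -1.0
    implicit = direction * (execution_price - benchmark_price) * quantity
    total = explicit + implicit
    notional = benchmark_price * quantity
    return {
        "explicit": explicit,
        "implicit": implicit,
        "total": total,
        "total_bps": total / notional * 10_000,
    }


# Hypothetical purchase of 1,000 shares, executed 0.05 above the benchmark.
example = transaction_cost_breakdown(
    side="buy",
    quantity=1_000,
    execution_price=10.05,
    benchmark_price=10.00,
    explicit_costs={
        "broker_commission": 20.0,
        "exchange_fee": 5.0,
        "clearing_and_settlement": 3.0,
    },
)
print(example)
```

In this hypothetical case the implicit component exceeds the explicit one, which illustrates why market liquidity matters for the overall cost of a transaction.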
But liquidity can be influenced only indirectly by the design of an exchange, as liquidity results from various other factors which have to be fulfilled first, e.g. market transparency, market integrity and a broad scope of investment instruments (Gomber 2000, p. 81). Then a process may set in whereby the liquidity already existing in a market attracts further liquidity. Exchanges can improve their market quality by engaging market makers to provide (artificial) liquidity on a continuous basis. For international investors, liquidity in developing markets might be even more important than in established markets, as the perceived investment risk is higher in these markets and the fear of not being able to exit positions when necessary is more pronounced. It is important to note that liquidity is rarely a critical issue for average-size orders and during phases of normal trading activity, but more so for block orders and under extreme market conditions (Radcliffe 1973 and Reilly and Wright 1984).

³ For a definition of transaction costs and their components see Pagano and Röell 1990, p. 75.

(ii) Instrument Scope
Investors reduce their risk exposure by holding a well-balanced and diversified portfolio (Davis and Steil 2001, p. 52). This requires the market to offer a broad variety of possible investments across different company sectors and different instrument types, i.e. equities, futures, options, bonds etc. Access to derivatives is important to both investors and market makers as it allows them to reduce trading risk by hedging their positions.

(iii) Information Dissemination
Information in a securities market can be divided into market-endogenous and market-exogenous information (Gomber 2000, p. 19). Data which originates directly from trading is called market endogenous (Füss 2002, p. 9): this includes real-time data on price development, traded volumes, order book depth etc. This category also includes up-to-date ex-post market statistics. Real-time endogenous information is especially important for decisions on the order execution style and also essential for the evolution and implementation of new technologies: e.g. algorithmic trading depends on market-endogenous information as it analyzes current and past market conditions in order to optimize the execution style for an order (Economist 2005, p. 14). Information originating from the companies listed on an exchange is denoted as exogenous. It includes regular financial reports and ad-hoc news on the respective company. As analyst research is based on exogenous information, its dissemination is indispensable for investment decisions. The availability of both endogenous and exogenous data and access to this information for all market participants contribute to market transparency, and a high degree of market transparency is necessary for a market’s information efficiency (Fama 1970, p. 383). Both types of information should be easily accessible to any investor, independent of his/her location in the world. Market fairness requires that all information is distributed in such a way that all investors have the chance to learn of it simultaneously, thus enabling a level playing field. Consequently, missing, incorrect, incomplete or time-lagged data potentially lead to wrong investment decisions. If investors thereby feel treated unfairly, they will stay away from those markets (Schiereck 1995, p. 30). Information from exchanges as well as listed companies should be made available in English; in a region of the world where Arabic is the main language and script, this is worth mentioning.

(iv) Market Regulation and Trading Surveillance
Insider trading is a problem for every securities market; it is generally unwanted and “no longer regarded as socially acceptable behavior” (Sunder 1992, p. 12). Repeated public cases of insider trading reduce international investors’ trust in the markets. In order to avoid such cases, trading surveillance and firm market regulation are indispensable. In market regulation the following principles regarding a local regulator should be followed (International Organization of Securities Commissions 2005): As a principle, the regulator should be operationally independent from governments and the regulated exchanges. Furthermore, it should be accountable for the exercise of its functions and powers. This requires the regulator to act at all times according to the highest professional standards, including those for confidentiality. The regulator’s responsibilities should be clearly stated so that every market participant is aware of the situations in which the regulator may be called upon. It is necessary that the regulator is equipped with adequate powers and capacities to perform its duties, which include inspection, investigation and surveillance of the market and its participants. As international investors are usually located far from their investment markets and the companies they invest in, they depend on reliable and functional market regulation, which should ensure that those investors are not discriminated against in comparison with locally based investors.

(v) Trading System and Its Mechanisms
The trading system and its mechanisms can be seen as the core part of an exchange. In the 1990s several studies on the preferences of institutional investors (including questions on their preferred trading system) showed that investors favor computer-based over floor-based trading (Economides and Schwartz 1994 and Schiereck 1995, p. 122). Electronic markets offer various advantages over traditional floor-based trading, such as anonymity, pre- and post-trade transparency and a reduced risk of pre-trade information leakage via intermediating brokers. Independent of the design of the trading system, electronic or floor-based, market participants need fast, reliable and unrestricted access to their investment markets. In order to offer fair trading services, all investors, local as well as international, and the brokers acting as their agents need a level playing field concerning market access. Trading hours of an exchange are of importance to international investors, especially if they are located on a different continent or their investment securities are cross-listed. In this case they need to coordinate different time zones (Miller and Puthenpurackal 2005, p. 12). Trading days which differ for cultural reasons present a further obstacle, and public holidays may also differ from those at the international investor’s location.
(vi) Investment Restrictions
When investors trade in an international environment they have to deal with different legal and regulatory frameworks, possibly even including investment restrictions. When talking about foreign investments one has to differentiate between foreign direct investment (FDI) and portfolio investment. While the International Monetary Fund states on FDI that it “… is made to acquire a lasting interest in an enterprise operating in an economy, other than that of the investor, the investor’s purpose being to have an effective voice in the management of the enterprise”, portfolio investment is described as “gaining ownership without control” and is of short-term interest (Italy and Razin 2005, p. 2). Possible restrictions on foreign investments are limitations on foreign ownership, screening or notification procedures, and management and operational restrictions (Golub 2003, p. 7). Limitations on foreign ownership can in turn take two forms: the typical form of “limiting the share of companies’ equity capital in a target sector that non-residents are allowed to hold” or a prohibition of any foreign ownership (Golub 2003, p. 7). An international investor will only invest in a market as long as the cost and risk caused by access restrictions are justifiable for him/her.
7.4
Systematic Analysis of the Exchanges in the Region
This section examines the seven exchanges of the GCC countries with regard to key aspects of infrastructure, market functionality and regulation. An overview of the tradable instrument types, the form of the market mechanisms, information dissemination and the services offered along the securities trading value chain will be provided. Individual pricing models are investigated with a focus on the incentives for liquidity provision and the distribution of commissions resulting from a transaction. Finally, an insight into the regulatory framework and possible restrictions on foreign ownership will be given. In order to introduce the structure of the analysis, the Bahrain Stock Exchange will be described in greater detail first. This provides the structure for the remaining six exchanges.⁴ They will be presented in tabular form and follow the analysis scheme of the Bahrain Stock Exchange.
7.4.1
Bahrain Stock Exchange (BSE)
On the Bahrain Stock Exchange investors are able to trade four types of tradable instruments, namely equities, government and corporate bonds as well as mutual funds on Bahraini equities. Derivatives are not offered.

⁴ The analysis is based on information available until January 2006.
The trading mechanisms/system can be described as follows: The BSE provides a hybrid market mechanism, i.e. it is both order and quote driven. Market makers provide (artificial) liquidity. The Bahrain Automated Trading System (BATS) was established in 1999 and is based upon Computershare’s Horizon system. Supported order types are market and limit orders, including a restriction on the order validity period (BSE 1999, article 17−18). Investors can also submit iceberg orders (BSE 1999, article 20). Price determination takes place according to the following criteria (in sequence): price priority, source of order priority, time priority, cross priority and random factor priority (BSE 1999, article 23). The source of order priority distinguishes between Bahraini and non-Bahraini orders, with orders from Bahraini investors being prioritized. A trading safeguard applies to equities only and prevents price fluctuations of more than 10 percent from the last closing price (BSE 1999, article 15). Bonds and mutual funds are not subject to price fluctuation limits. The market is accessed via brokers, who have to use the automated trading system to input their orders. The automated trading system is available via terminals on the trading floor and, since 2001, via remote access (BSE 2001, article 1). It permits two hours of continuous trading per day, from 10 a.m. to 12 noon, Sundays through Thursdays; a pre-opening phase takes place every morning from 9.30 a.m. to 10 a.m., followed by an opening auction (BSE 1999, article 25−26).
Information dissemination of real-time market data is provided via BATS, which features an open order book. Historical market statistics are provided by the BSE in English, with publication frequencies ranging from daily to annual. Listed companies are, among other obligations, required to publish quarterly, semi-annual and annual financial reports (Bahrain Monetary Agency 2003, article 34).
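The priority sequence and the price safeguard described above can be sketched in Python as follows. This is a simplified illustration, not the BATS implementation: cross priority is omitted because the text does not define it, the random factor is modelled as a seeded tie-breaker, and all names (`Order`, `within_price_band`, `rank_orders`) as well as the order data are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    side: str           # "buy" or "sell"
    limit_price: float
    is_bahraini: bool   # source of order
    timestamp: float    # entry time, e.g. seconds since session start


def within_price_band(price, last_close, band=0.10):
    """Check the equity safeguard: at most 10 percent from the last close."""
    return abs(price - last_close) <= band * last_close + 1e-12


def rank_orders(orders, seed=0):
    """Rank orders of one side of the book by the priorities named in the text.

    Sequence: price, source of order (Bahraini first), time, random factor.
    Cross priority is left out (assumption, see lead-in).
    """
    rng = random.Random(seed)

    def key(order):
        # Best price first: highest limit for buys, lowest limit for sells.
        price_rank = -order.limit_price if order.side == "buy" else order.limit_price
        source_rank = 0 if order.is_bahraini else 1
        return (price_rank, source_rank, order.timestamp, rng.random())

    return sorted(orders, key=key)


buy_orders = [
    Order("A", "buy", 1.00, False, 1.0),
    Order("B", "buy", 1.00, True, 5.0),
    Order("C", "buy", 1.02, False, 9.0),
    Order("D", "buy", 1.00, True, 2.0),
]
print([order.order_id for order in rank_orders(buy_orders)])
```

In this example order C ranks first because of its better price, although it arrived last. Among the orders at the same price, the Bahraini orders D and B precede the non-Bahraini order A, even though A was entered earlier.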
As services along the value chain, BSE provides trading facilities and a clearing and settlement unit including a central counterparty (CCP). The net position of trades in a security is settled via the broker’s clearance account two days after the trade’s occurrence (T+2). Afterwards, brokers are responsible for settlement with their clients. BSE’s pricing model differentiates between equities and bonds. Brokerage commissions are negotiable (between the investor and the broker) but the exchange fee is a fixed rate in a prescribed brokerage commission model with commissions decreasing depending on the amount of a transaction. The pricing schedule for equity trading is shown in Table 7.13:
Table 7.13. Pricing schedule for equity trading at the BSE (Source: BSE website)
Value of Transaction (BD*) | Broker (negotiable) | BSE | Total (depending on negotiations)
1 – 20,000 | 24 bps | 6 bps | 30 bps
20,001 – 50,000 | 16 bps | 4 bps | 20 bps
Above 50,000 | 8 bps | 2 bps | 10 bps
* BD 1 = US-$ 2.65259
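The schedule in Table 7.13 can be turned into a small calculation. The sketch below is only an illustration: the function and variable names are hypothetical, the broker rate is taken at its listed (negotiable) value, and it is assumed that the rate of the bracket applies to the whole transaction value rather than only to the part exceeding the bracket boundary, which the source does not state explicitly.

```python
# (upper bound of bracket in BD, broker rate in bps, BSE rate in bps), from Table 7.13
BSE_SCHEDULE = [
    (20_000, 24, 6),
    (50_000, 16, 4),
    (float("inf"), 8, 2),
]


def bse_commission(value_bd: float) -> dict:
    """Commission in BD for an equity trade of the given value.

    Assumption: the bracket rate applies to the whole transaction value.
    """
    if value_bd <= 0:
        raise ValueError("transaction value must be positive")
    for upper, broker_bps, bse_bps in BSE_SCHEDULE:
        if value_bd <= upper:
            break
    broker = value_bd * broker_bps / 10_000   # 1 bp = 0.01 percent
    exchange = value_bd * bse_bps / 10_000
    return {"broker": broker, "exchange": exchange, "total": broker + exchange}
```

For a trade of BD 10,000 this gives BD 24 for the broker and BD 6 for the exchange, i.e. BD 30 in total, of which the exchange receives 20 percent.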
7 Capital Markets in the Gulf
The BSE pricing schedule does not offer any pricing incentives for liquidity provision, neither for market makers nor for investors submitting limit orders. As Table 7.13 shows, 80 percent of the overall commission goes to the broker and 20 percent to the exchange. The regulatory framework for Bahrain’s capital market was set by the Bahrain Monetary Agency (BMA) in 2002 (Bahrain Monetary Agency 2006). The BMA is also responsible for Bahrain’s monetary policy and for the regulation of all financial institutions in the kingdom. Since 1999, restrictions on foreign ownership have been reduced, allowing GCC nationals to own up to 100 percent (formerly 49 percent) of the shares of a listed Bahraini company, while non-GCC nationals may hold up to 49 percent (formerly 24 percent) (Bahrain Stock Exchange 2006). Table 7.14 summarizes the BSE analysis and provides the framework for the other markets.
Table 7.14. Characteristics of the Bahrain Stock Exchange (BSE)
Category | Results
Tradable instruments | Equities, government and corporate bonds, mutual funds
Trading mechanisms/system | Continuous, order-driven with liquidity providers; BATS electronic trading system based on Computershare’s Horizon since 1999; opening hours: Sunday – Thursday, pre-opening 9.30 a.m. – 10 a.m., continuous trading 10 a.m. – 12 p.m.
Supported order types | Limit and market orders (extended by special conditions), hidden volume, order validity restrictions
Safeguards | 10 percent of fluctuation from the last closing price (for equities only)
Information dissemination | Market statistics ranging from daily to annual; company financial statements quarterly, semi-annual and annual
Services along the value chain | Trading, clearing with CCP, settlement (T+2)
Pricing model | Broker’s commission negotiable, while exchange fee is not; linear fee model with discounts for higher trading values
Regulatory framework | Regulated by the Bahrain Monetary Agency (BMA) since 2002; BMA regulates all financial institutions and serves as central bank
Restriction on foreign ownership | Discrimination between GCC and non-GCC nationals; GCC nationals allowed to hold up to 100 percent, non-GCC nationals 49 percent of a company’s shares
7.4.2
Tadawul Saudi Stock Market (TSSM)
Table 7.15. Characteristics of the Tadawul Saudi Stock Market
Category | Results
Tradable instruments | Equities
Trading mechanisms/system | Continuous, order-driven without liquidity provision; Tadawul electronic trading system since 2001 (based on Computershare’s Horizon); trading hours: Saturday – Wednesday, 10 a.m. – 12 p.m., 4 p.m. – 6.30 p.m.; Thursday: 10 a.m. – 12 a.m.
Supported order types | Limit and market orders (extended by special conditions), hidden volume, order validity restrictions
Safeguards | None
Information dissemination | Market statistics ranging from daily to annual; company financial statements quarterly
Services along the value chain | Trading, clearing, settlement
Pricing model | Linear model with floor of 15 Saudi Riyal* for total commission, otherwise a maximum of 15 bps of trade value (negotiable, see appendix)
Regulatory framework | Regulated by the autonomous Capital Market Authority (CMA) based on the Capital Market Law (CML) (Capital Market Authority 2006)
Restriction on foreign ownership | Foreign participation in the stock market only through mutual funds (Saudi Arabia General Investment Authority 2006)
* 1 Saudi Riyal = USD 0.26666
Additional information on TSSM: Bonds are currently traded via telephone, but in October 2005 the Saudi government announced plans to “… establish a secondary market for government bonds” (AlJasser and Banafe 2003, p. 182 and Banker 2005, p. 74). Mutual funds on Saudi equities are offered by investment companies.
7.4.3
Kuwait Stock Exchange (KSE)
Table 7.16. Characteristics of the Kuwait Stock Exchange
Category | Results
Tradable instruments | Equities, options, forwards, futures
Trading mechanisms/system | Continuous, order-driven without liquidity provision; Kuwait Automated Trading System (KATS); access only via brokers; trading hours: auction at 8.50 a.m., 9 a.m. – 12.30 p.m. continuous equity trading
Supported order types | Limit and market orders
Safeguards | None
Information dissemination | Market statistics available via KSE website in Arabic language only, company information: n/a
Services along the value chain | Trading; clearing and settlement is performed by the Kuwait Clearing Company (KCC)
Pricing model | Linear model with discount for higher transaction values (see appendix for detailed information)
Regulatory framework | Regulated by the government-owned Kuwait Investment Authority (KIA)
Restriction on foreign ownership | Foreign participation in Kuwaiti companies is limited to 49 percent, expatriates within Kuwait are allowed to hold 100 percent
7.4.4
Muscat Securities Market (MSM)
Table 7.17. Characteristics of the Muscat Securities Market
Category | Results
Tradable instruments | Equities, commercial and government bonds
Trading mechanisms/system | Since December 2005 electronic, continuous and order-driven market, no liquidity provision; Atos Euronext trading system; trading hours: Sunday – Thursday, regular market 10 a.m. – 11 a.m., parallel & bond markets 11:30 a.m. – 12:30 p.m., third market 12:35 p.m. – 12:55 p.m.
Supported order types | Limit/market orders and various extensions in restrictions and execution conditions
Safeguards | No information available
Information dissemination | Real-time market information dissemination through the automated trading system; quarterly financial company statements
Services along the value chain | Trading; clearing and settlement is performed by the Muscat Depository & Securities Registration Company (Muscat Securities Market 2006a), settlement at T+3
Pricing model | MSM’s commission is a fixed percentage of the order value, the broker’s commission is subject to negotiation for equities trading (see appendix for details)
Regulatory framework | In 1998 regulation was transferred from MSM to a governmental Capital Market Authority (CMA)
Restriction on foreign ownership | No restrictions (Muscat Securities Market 2006b)
7.4.5
Doha Securities Market (DSM)
Table 7.18. Characteristics of the Doha Securities Market
Category | Results
Tradable instruments | Equities
Trading mechanisms/system | Electronic order-driven market based on Computershare’s Horizon system, access only personally via local brokers; trading hours: no information available
Supported order types | Limit/market orders
Safeguards | No information available
Information dissemination | Market statistics on a monthly basis; quarterly financial statements by companies; weekly publication of main financial indicators, e.g. dividend yield, p/e ratios
Services along the value chain | Trading; clearing house organized by the settlement bank, T+3 cycle
Pricing model | Exchange fee is a fixed rate of 30 percent of the broker’s commission
Regulatory framework | No information available
Restriction on foreign ownership | Since April 2005 foreigners (including GCC nationals) are allowed to hold up to 25 percent of a listed company’s shares (Doha Securities Market 2006)
7.4.6
Dubai Financial Market (DFM)
Table 7.19. Characteristics of the Dubai Financial Market
Category | Results
Tradable instruments | Equities, bonds; mutual funds on equities and fixed income existent
Trading mechanisms/system | Order-driven, automated screen-based trading system (Computershare’s Horizon), access via local brokers; trading hours: pre-opening at 9.30 a.m.; continuous trading from 10 a.m. – 1 p.m.
Supported order types | Limit/market orders
Safeguards | No information available
Information dissemination | Real-time and historic market endogenous information through the trading system screens on the trading floor; market statistics publications ranging from daily to annual; companies’ financial statements have to be published on a quarterly, semi-annual and annual basis (Emirates Securities and Commodities Authority 2000, article 36)
Services along the value chain | Trading, clearing and settlement
Pricing model | Commissions resulting from a transaction are distributed among broker, DFM, and the market regulator (see appendix for details)
Regulatory framework | Regulated by the financially and administratively autonomous Emirates Securities and Commodities Authority (ESCA) which is led by government officials; exchange rules for market conduct
Restriction on foreign ownership | None
7.4.7
Abu Dhabi Securities Market (ADSM)
Table 7.20. Characteristics of the Abu Dhabi Securities Market
Category | Results
Tradable instruments | Equities (bonds are announced)
Trading mechanisms/system | Automated central limit order matching trading system (Computershare’s Horizon), access via local brokers; trading hours: pre-opening at 10 a.m.; continuous trading from 10.30 a.m. – 12.30 p.m.
Supported order types | Limit/market orders
Safeguards | No information available
Information dissemination | Real-time and historic market information through the trading system screens on the trading floor; market statistics publications ranging from daily to annual; companies’ financial statements have to be published on a quarterly, semi-annual and annual basis (Emirates Securities and Commodities Authority 2000, article 36)
Services along the value chain | Trading, clearing and settlement
Pricing model | Commissions resulting from a transaction are distributed among broker, ADSM, and the market regulator; exchange fee: percentage of trade value with a floor for total commission (for detailed information see appendix)
Regulatory framework | Regulated by the financially and administratively autonomous Emirates Securities and Commodities Authority (ESCA) which is led by government officials
Restriction on foreign ownership | No restrictions
As presented above, the exchanges in all six countries – except for Saudi Arabia, which allows foreign participation via mutual funds only – have opened up their markets to foreigners. Some markets discriminate between GCC and non-GCC nationals. While the markets in Oman and the UAE impose no restrictions on non-GCC investments, the markets in Bahrain, Kuwait and Qatar allow non-GCC nationals to hold only a minority of a listed company’s shares.
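The ownership limits for non-GCC nationals collected in Tables 7.14 to 7.20 can be summarized in a small lookup. The following Python sketch is only an illustration: the dictionary and function names are hypothetical, and it assumes that each limit applies to the aggregate share of a listed company held by non-GCC nationals, which the tables do not state explicitly. For Saudi Arabia the limit is set to zero because direct holdings are not possible (participation only through mutual funds).

```python
# Maximum share (in percent) of a listed company that non-GCC nationals may hold,
# as reported in Tables 7.14 to 7.20 (status: January 2006).
NON_GCC_CAP_PERCENT = {
    "BSE": 49,    # Bahrain
    "TSSM": 0,    # Saudi Arabia: only indirectly via mutual funds
    "KSE": 49,    # Kuwait
    "MSM": 100,   # Oman: no restrictions
    "DSM": 25,    # Qatar (limit also covers GCC nationals)
    "DFM": 100,   # Dubai: none
    "ADSM": 100,  # Abu Dhabi: no restrictions
}


def non_gcc_headroom(market: str, current_non_gcc_percent: float) -> float:
    """Remaining percentage of a company's shares that non-GCC investors could buy.

    Assumption: the cap is an aggregate limit per listed company.
    """
    cap = NON_GCC_CAP_PERCENT[market]
    return max(0, cap - current_non_gcc_percent)
```

For example, if non-GCC nationals already hold 40 percent of a company listed on the BSE, a further 9 percent could still be acquired under this reading of the limit.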
7.5
Approaches to Further Open up the Stock Markets to the International Community − The Dubai International Financial Exchange (DIFX) Case
The previous sections identified the underlying infrastructure of the markets in the region. These markets developed over a relatively short period of time. Looking at the drivers, the underlying economies of the Gulf markets developed strongly on the back of oil and gas. This development created a significant local cash reserve, and any surplus liquidity was invested in business opportunities abroad. Over time the local economy created more business opportunities and the net export of cash slowed down. A significant number of these businesses had been financed by debt before the need for a stock market arose. The markets are located in comparatively small countries which have traditionally been closed to outsiders for investment purposes. However, the next step in the development of a stock market lies in widening the investor base. The underlying legal framework hinders rapid expansion into non-GCC areas. Whilst most markets are fully electronic in the best sense of the word, market participants cannot be located in jurisdictions other than the home market. All markets have introduced electronic trading systems, but market access to these systems in the GCC markets differs from access to electronic markets in the US and Europe, where brokers have remote access to the exchanges’ trading systems, i.e. access without restrictions on the broker’s location. In some cases trading firms actually have to be present “on the floor” even though trading is fully electronic: trades are entered right there into the central system and matched by Central Limit Order Books (CLOB). Today, there are two ways to accommodate growth in stock markets:
(i) Evolutionary Regional Market Development
The development of markets in the GCC shows many similarities with market development in Europe during the past 30 years. Harmonization of the regulatory framework of local markets is a cornerstone of capital market evolution. Here there are many parallels to the EU, from the quest to connect markets to the establishment of a single currency. The overall geographical market of the region, however, is less harmonized and will need more development to come even close to the environment currently existing in Europe. Recently, markets of the region and beyond have started discussing ways to connect their markets technically in the hope that this will lead to increased liquidity. Before technical connectivity can pave the way, it is more important that the legal and regulatory systems of these markets are harmonized. Striving towards a common currency will also lead to a more integrated capital market structure.
These developments will build an essential foundation which will ultimately allow local markets to fuse in an evolutionary way.
(ii) Leapfrog Development
The revolutionary development of markets is preferred in the region, as time seems to be the least available commodity. The DIFX is a prime example of such a development, as it benefits from the concept of an international financial center. The DIFX was set up within the Dubai International Financial Centre (DIFC) and specifically targets an international investor, issuer and intermediary community; the following section therefore describes it as a case study of an approach to further open up the markets in the region. Dubai (and thus the UAE) did not want to follow the evolutionary approach of changing its stock exchanges over years and decades, and therefore designed the concept of financial free zones. The generic free zone concept has been established over the years for trade; it allows foreigners or international organizations to build a business as 100% owners, with the right of full profit repatriation and quite often with a beneficial tax regime. Taking this concept one step further to the creation of a financial centre required many changes and additions to the underlying idea.
The Financial Centre
The Financial Centre creates a framework which allows leapfrogging the evolutionary capital market development. A capital market will only be as efficient or as international as its framework allows it to be. Three major components make up this environment:
1. The legal framework and the market oversight (a regulator)
2. The technical and organizational infrastructure (a “municipality”)
3. The rule of law and the court system.
Hence, the International Financial Centre in Dubai started with the creation of the legal framework. It took on experts from markets ranging from New Zealand via Europe to Canada and created a legacy-free system of interlinking laws.
This effort effectively created a country inside a country, in which the newly created regulator works to the standards of the International Organization of Securities Commissions (IOSCO) and takes over the roles traditionally carried out by the central bank and/or a country regulator. In the case of Dubai, a close working relationship with the organizations of the federal government ensures enforcement in the home country but outside the Centre. Commercial and civil laws were recreated by the Centre using international best practice, whilst only the criminal laws are retained from the Federation. More importantly, the UAE government allowed the Financial Centre to choose the language of its laws. The use of English for all applicable new laws creates an environment which leaves the market participants with a readily recognizable legal framework where content does not get lost in translation.6
The creation of a legal framework was followed by the second step, the creation of a physical center which – in the long run – aims at rivaling the main financial centers around the globe in its provision of technical infrastructure and international connectivity. The Financial Centre’s master plan states that it wants to create an environment which will provide office premises as well as residential and retail areas for 60,000 people.7 Finally, the Financial Centre created a court system with international judges, which completes the basic structure of this new environment.8
Dubai International Financial Exchange
A key component of this new structure is the Dubai International Financial Exchange (DIFX). It was set up on the framework of the DIFC and built in less than two years. All relevant systems, services, rules and regulations aim to provide an internationally accessible capital market with a straight through processing environment.9 The DIFX, which went live in September 2005, aims at achieving the following goals in order to attract international investments:
• The DIFX positions itself strategically as a complementary market to the national markets in the region. National markets perform an important task in their home countries which the DIFX is not meant to copy. The DIFX was created to complement these markets with an additional value proposition which ranges from product types to remote members. Thus it aims at keeping companies from the region that would otherwise seek international capital through European or US markets closer to their home market. The DIFX’s structure is built in such a way that it can be used in the form of a ‘hub and spoke’ model for the enhancement of local markets in their ongoing development.
• At the heart of the DIFX rests the concept of the ‘authorized’ and the ‘recognized’ member. An authorized member is located inside the Centre and is regulated by the exchange regulator. What differentiates the DIFX from the national markets is the embedded concept of the recognized or remote member. It allows a trading member to be located in another jurisdiction as long as its legal environment is accepted by the exchange and its regulator. The market aims at providing a platform on which internationally active trading members and institutional investors can trade with lesser known regional banks and brokers who move into the highly regulated environment around the DIFX.
6 See the DFSA website for the whole range of laws: http://www.dfsa.ae
7 For the DIFC setup and framework refer to the DIFC website: http://www.difc.ae/about_us/index.html
8 For full details see the DIFC Courts website: http://www.difccourts.ae/
9 For background information on the establishment of the DIFX see its homepage: http://www.difx.ae
• The challenge of remote membership of the DIFX is two-fold. Firstly, it lies in the design of the legal framework of the exchange and its subsidiaries and its acceptance in the home jurisdictions of the trading members. Secondly, the technological aspect is driven by the large region the DIFX will cover and the focus on connectivity of a remote trading member to the exchange. This has to be done in a way which does not create a disadvantage for some members due to latency.
• The DIFX created an infrastructure which is known as a “vertical silo”. In its development it will provide a trading system for equity and derivatives, a clearing house with a central counter party (CCP), a Central Securities Depository (CSD) and a registry, all under one roof. To complement this structure for international connectivity, the settlement organization will connect to an International Central Securities Depository (ICSD) system in Europe. The way the whole market is structured allows the DIFX to offer and perform certain tasks in clearing and settlement for the local markets as they develop outside their home markets. It further facilitates the creation of dual listings, which is one way for the local markets to expand their reach across their borders and to enhance local market liquidity.
• The DIFX further initiated the payment clearing of its trading currency – the US Dollar – to ensure currency availability during the trading day for margin calls and settlement instructions.
• For issuers, the DIFX aims to provide a streamlined listing system. Oversight of the market rests with the regulator, whilst the listing authority has been delegated to the exchange. The exchange has created a self-regulatory function for market oversight and listing. Listings are not hindered by the access or ownership restrictions for institutional investors that exist, in one form or another, in most local markets in the region.
• The market model of the DIFX provides for a Central Limit Order Book and will integrate market makers to provide liquidity for the less liquid products. For the trading system for equity and derivative products the ATOS-Euronext system NCS was chosen. For the clearing, settlement and registry function TCS of India delivered the software. These core systems were enhanced by a collection of supporting systems, from the delivery of corporate actions to the dissemination of real-time data. The network of the exchange builds on GLTrade and SWIFT connectivity. The system conglomerate works to the latest IOSCO standards and aims at providing a modern straight through processing environment.
• International investors and intermediaries demand a readily available system for data from the market (market endogenous information) as well as from the issuers on the market, e.g. on corporate actions (market exogenous information). The DIFX maintains a comprehensive data dissemination concept which allows over 700,000 professional users worldwide to access the market data.
• The rules of the exchange demand high listing standards from the issuers and education with exams from the traders on the market. To achieve this, the DIFX has built the DIFX Academy, which teamed up with institutions like the University of Frankfurt or FT Knowledge/New York Institute of Finance.10 Courses include general market know-how, specialist listing requirements on corporate governance, and trader education and examination.
Following the tabular description of stock markets from Section 7.4, the DIFX can be characterized as follows:
Table 7.21. Characteristics of the Dubai International Financial Exchange
Category | Results
Tradable instruments | Equities, GDRs, bonds, sukuks, certificates, ETF’s, options and futures planned for 2006
Trading mechanisms/system | Order-driven, automated screen-based trading system with market maker interaction (Atos Euronext trading system); market access via local and international brokers; trading hours: Monday-Friday 2 p.m. – 5 p.m.
Supported order types | Limit/market orders with various extensions in restrictions and execution conditions
Safeguards | Volatility interruption
Information dissemination | Real-time dissemination via the major data vendors
Services along the value chain | Trading, clearing with central counter party (CCP), settlement and registry
Pricing model | Based on international standards with different pricing levels for liquidity providers and liquidity takers
Regulatory framework | Regulated by the financially and administratively autonomous Dubai Financial Services Authority (DFSA)
Restriction on foreign ownership | None
The many facets of the markets started to adjust to international best practices and standards and the impact is now visible in all national markets. This development can propel the region’s capital markets forward in the coming years and move them closer to the established markets which will gradually close the gap between the Asian and the European time zones.
10 For the development of its trader exam the DIFX cooperated with the Chair of e-Finance at Frankfurt University, for details see: http://efinance.wiwi.uni-frankfurt.de
7.6
Conclusion
The previous sections gave an overview of the six GCC countries and the region as a whole. It was shown that the economies and the stock market business in the region have developed very dynamically in recent years. Along with increased stock trading activity, market performance was consistently positive. Compared to markets in the US or Europe, exchanges in the Gulf countries have a shorter history and, consequently, trading has so far been limited to equities in most cases. International investors require markets to offer a high degree of liquidity, free market access, transparency, reliability and no obstacles to foreign ownership. All GCC exchanges have migrated from manual trading to electronic trading systems, but market access is mostly available via local brokers only, which may limit international investments. In some countries barriers to foreign ownership still exist, e.g. in Saudi Arabia, which allows no direct foreign ownership, while mainly the smaller countries in the region have tried to open up and lift restrictions. The Dubai International Financial Exchange is an approach to further open up the securities market and attract international investments by applying standards that can also be found in the US or Europe. Exchanges in the Gulf have already undergone modernization and started a process to attract foreign capital to their markets, and they are catching up fast with the functionalities and the organizational structure of their counterparts in international stock markets.
7.7
Appendix
The pricing model of the Tadawul Saudi Stock Market (TSSM):
Floor | SR 15
Maximum total commission (negotiable) | 15 bps
SR 1 = US-$ 0.26666
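Read together with Table 7.15, the TSSM model amounts to a negotiated rate of at most 15 bps of the trade value, with a floor of SR 15 for the total commission. A minimal Python sketch of this reading follows; the function name and the parameter `rate_bps` are hypothetical.

```python
def tssm_commission(trade_value_sr: float, rate_bps: float = 15) -> float:
    """Total commission in SR: negotiated rate (at most 15 bps), floor of SR 15."""
    if not 0 <= rate_bps <= 15:
        raise ValueError("negotiated rate must lie between 0 and 15 bps")
    variable = trade_value_sr * rate_bps / 10_000  # 1 bp = 0.01 percent
    return max(15.0, variable)
```

At the maximum rate the floor is binding for all trades below SR 10,000, since 15 bps of SR 10,000 equal exactly SR 15.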
The pricing model of the Kuwait Stock Exchange (KSE):
Transaction Value (KD) | Total Commission
1 – 50,000 | 12.5 bps
> 50,000 | 10 bps of the excessive amount
KD 1 = US-$ 3.42571
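The KSE schedule can be read as a marginal model. The sketch below assumes that 12.5 bps are charged on the first KD 50,000 and 10 bps on the amount exceeding KD 50,000; the table only states the rate for the excessive amount, so the treatment of the first KD 50,000 of a larger trade is an assumption. The function name is hypothetical.

```python
KSE_THRESHOLD_KD = 50_000


def kse_commission(value_kd: float) -> float:
    """Total commission in KD under a marginal reading of the KSE schedule."""
    if value_kd <= 0:
        raise ValueError("transaction value must be positive")
    base = min(value_kd, KSE_THRESHOLD_KD)        # part charged at 12.5 bps
    excess = max(value_kd - KSE_THRESHOLD_KD, 0)  # part charged at 10 bps
    return base * 12.5 / 10_000 + excess * 10 / 10_000
```

Under this reading a trade of KD 80,000 costs KD 62.5 for the first KD 50,000 plus KD 30 for the remaining KD 30,000, i.e. KD 92.5 in total.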
The pricing model of the Muscat Securities Market (MSM):
50 bps of the excessive amount 10 bps of exc. amount
16 bps
10 bps of the excessive amount 2 bps of exc. amount
4 bps 20 bps
RO 1 = US-$ 2.5974
The pricing model of the Doha Securities Market (DSM): Brokerage commission is negotiable and exchange fee amounts to 30 percent of this commission.
The pricing model of the Dubai Financial Market (DFM):
 | Shares | Bonds | Funds & Warrants
Broker | 30 bps | 3 bps | 15 bps
DFM | 10 bps | 1 bps | 10 bps
Clearance | 5 bps | 0.5 bps | –
ESCA | 5 bps | 0.5 bps | –
Total Variable | 50 bps | 5 bps | 25 bps
Fixed Charge | AED 10 (AED75) | AED 10 (AED75) | AED 10 (AED75)
AED 1 = US-$ 0.27227
The pricing model of the Abu Dhabi Securities Market (ADSM):
 | Trade Value ≤ AED 15,000 | Trade Value > AED 15,000
Broker | AED 45 | 30 bps
ADSM | AED 22.5 | 15 bps
Authority | AED 7.5 | 5 bps
Total | AED 75 | 50 bps
AED 1 = US-$ 0.27227
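The ADSM schedule combines a fixed charge for small trades with a proportional charge for larger ones. The following Python sketch shows how the total commission and its distribution among broker, exchange and regulator could be computed from the figures above; the function and key names are hypothetical.

```python
ADSM_THRESHOLD_AED = 15_000

# Fixed amounts (AED) for trades up to the threshold and rates (bps) above it.
ADSM_FIXED = {"broker": 45.0, "adsm": 22.5, "authority": 7.5}
ADSM_BPS = {"broker": 30, "adsm": 15, "authority": 5}


def adsm_commission(trade_value_aed: float) -> dict:
    """Commission in AED per recipient, plus the total."""
    if trade_value_aed <= 0:
        raise ValueError("trade value must be positive")
    if trade_value_aed <= ADSM_THRESHOLD_AED:
        parts = dict(ADSM_FIXED)
    else:
        parts = {k: trade_value_aed * bps / 10_000 for k, bps in ADSM_BPS.items()}
    parts["total"] = sum(parts.values())
    return parts
```

The two regimes join without a jump: 50 bps of AED 15,000 equal AED 75, which is exactly the fixed total charged at or below the threshold.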
Acknowledgement
The authors gratefully acknowledge the support of the E-Finance Lab. For further information on the E-Finance Lab, please see the chapter “Added Value and Challenges of Industry-Academic Research Partnerships – The Example of the E-Finance Lab” in this volume.
References Al-Jasser MS, Banafe A (2002) “The development of debt markets in emerging economies: the Saudi Arabian experience”, BIS Papers No. 11, June 2002 Bahrain Monetary Agency (2003) “Disclosure Standards”, December 2003. Retrieved January 5, 2006, from http://www.bahrainstock.com/downloads/ law/BMA_Disclosure_Standrads_Book.pdf Bahrain Monetary Agency (2006) “Regulations and Supervision”. Retrieved Jan 5, 2006, from http://www.bma.gov.bh/cmsrule/index.jsp?action=article&ID=898 Bahrain Stock Exchange (1999) “Resolution No. (4) of 1999”. Retrieved January 5, 2006, from http://www.bahrainstock.com/downloads/law/Trading%20Rules %20and%20Procedures.pdf Bahrain Stock Exchange (2001) “Resolution No. (6) of 2001”. Retrieved January 5, 2006, fromhttp://www.bahrainstock.com/downloads/law/Trading%20Rules %20and%20Procedures.pdf Bahrain Stock Exchange (2006) “Foreign Ownership”. Retrieved January 6, 2006, from http://www.bahrainstock.com/bahrainstock/bse/foreignownership.html Banker (2005) “Saudi’s new bond market beckons”, The Banker, November 7, 2005, p. 74 Capital Market Authority (2006) Retrieved January 6, 2006, from http://www.cma.org.sa/cma_en/default.aspx Central Intelligence Agency (2005) “The World Factbook 2005”. Retrieved November 22, 2005, from http://www.cia.gov/cia/publications/factbook/ Davis EP, Steil B (2001) “Institutional Investors“, The MIT Press, Cambridge, Massachusetts; London, England Davydoff D, Gajewski JF, Gresse C, Grillet-Aubert L (2002) “Trading costs analysis: A comparison of Euronext Paris and the London Stock Exchange”, OEE research paper Degryse H, Van Achter M (2002) “Alternative Trading Systems and Liquidity”, SUERF-colloquium book “Technology and Finance: challenges for financial markets, business strategies and policymakers”, M. Balling, F. Lierman, A. 
Mullineux, editors; Chapter 10; Routledge, London, New York Doha Securities Market (2006) Retrieved January 9, 2006, from http://english.dsm.com.qa/site/topics/static.asp?cu_no=2&lng=1&template_ id=33&temp_type=42&parent_id=16
168
Peter Gomber et al.
Dubai Financial Market (2006) Retrieved January 5, 2006, from http://www.dfm.co.ae Economides N, Schwartz R (1995) Equity Trading Practices and Market Structure: Assessing Asset Manager’s Demand for Immediacy, Financial Markets, Institutions & Instruments 4, 1995, pp. 1−46 Economist (2005) “The march of the robo-traders”, September 17, 2005, pp. 12−13 Emirates Securities and Commodities Authority (2000) “Resolution No. 3/2000”. Retrieved January 9, 2006, from http://www.dfm.co.ae/dfm/ ESCARulesRegulations/ESCA_Rules/Transparency.DOC Fama EF (1970) “ Efficient Capital Markets. A Review of Theory and Empirical Work “, Journal Of Finance, Vol. 25, May, pp. 383−417 Federation of Euro-Asian Stock Exchanges (2005) Retrieved November 25, 2005, from http://www.feas.org/ Fincorp (2005) “GCC Markets – A special review”, November 10, 2005 Füss R (2002) “The Financial Characteristics between ‘Emerging’ and ‘Developed’ Equity Markets”, EcoMod 2002-International Conference on Policy Modeling, Proceedings Brussels, July 4−6, Brussels 2002 Golub S (2003) “Measures Of Restrictions on Inward Foreign Direct Investment for OECD Countries“, OECD Economics Department Working Papers, No.357, OECD Publishing Gomber, P. (2000) “Elektronische Handelssysteme – Innovative Konzepte und Technologien”, Physica-Verlag, Heidelberg IMF (2005a) “Middle East and Central Asia Regional Economic Outlook”, September 2005. Retrieved November 22, 2005, from http://www.imf.org/external/pubs/ft/reo/2005/eng/meca0905.pdf IMF (2005b) “World Economic Outlook”, September 2005. Retrieved November 22, 2005, from http://www.imf.org/external/pubs/ft/weo/2005/02/ International Organization Of Securities Commissions “Objectives and Principles of Securities regulation”. Retrieved December 27, 2005, from http://riskinstitute.ch Italy G, Razin A (2005) “Foreign Direct Investment vs. Foreign Portfolio Investment” NBER Working Papers 11047, National Bureau of Economic Research, Incp. 
2 Jafarkhan KK (2004) “Ettihad Etisalat share trading to begin tomorrow”, Khaleej Times Newspaper Online Edition, December 19, 2004 Miller D, Puthenpurackal J (2005) “Security fungibility and the cost of capital” Working Paper Series, 426, European Central Bank, January 2005 Muscat Securities Market (2006a) Retrieved January 7, 2006, from http://www.msm.gov.om/pages/default.aspx?c=100 Muscat Securities Market (2006b) Retrieved January 7, 2006, from http://www.msm.gov.om/pages/default.aspx?c=171&tid=24 Pagano M, Röell A (1990) “Trading Systems in European Stock Exchanges: Current Performance and Policy Options”. Economic Policy, 10, pp. 65−115
Radcliffe RC (1973) “Liquidity Costs and Block Trading”, Financial Analysts Journal, 29, July−August, pp. 73−80 Reilly FK (1989) “Investment Analysis and Portfolio Management, 3rd ed., Dryden Press, Chicago Reilly FK, Wright DJ (1984) “Block Trading and Aggregate Stock Price Volatility”, Financial Analysts Journal, 40, no. 2, March−April, pp. 54−60 Rutledge E (2005) “Is oil alone fuelling GCC stock markets?”, Khaleej Times Newspaper Online Edition, February 23, 2005 Saudi Arabia General Investment Authority (2006) “Why Saudi Arabia”. Retrieved January 6, 2006, from http://www.sagia.gov.sa/innerpage.asp?ContentID=3 Schiereck D (1995) “Internationale Börsenplatzentscheidungen institutioneller Investoren”, Gabler Verlag, Wiesbaden Sunder S (1992) “Insider Information and its Role in Securities Markets”, Carnegie Mellon University, Working Paper No.1992-18, Pittsburgh World Bank (2005) “World Development Indicators 2005”. Retrieved November 23, 2005, from http://www.worldbank.org/data/wdi2005/
CHAPTER 8
Competition of Retail Trading Venues − Online-brokerage and Security Markets in Germany
Dennis Kundisch, Carsten Holtmann
8.1 Introduction
German regional stock exchanges compete horizontally with one another for the order flow of investors.1 In particular, they try to be an attractive option for private retail investors and to differentiate themselves from the competition. Differentiation often comes in terms of ‘price quality’ and the attempt to reduce implicit transaction costs for investors. To this end, many exchanges have begun to extend their established trading rules. Under different labels and with service attributes that differ slightly in detail, best-execution guarantees, for example, have been introduced at most markets.2 These offerings have in common that the investor is guaranteed a transaction price at least as good as the price at a predefined, often more liquid, reference market. For retail investors, not only the implicit transaction costs but also the explicit ones3 are relevant for the order-routing decision.4 Explicit costs are the sum of the commissions and fees charged by the service providers that participate in routing and matching an order as well as in clearing a deal.
1 See e. g. (Picot et al. 1996; Röhrl 1996; Rudolph and Röhrl 1997) for different forms of appearances of exchanges. Concerning the competition between exchanges see e. g. (Hammer 2004). Germany is one of only eleven countries in the world that has more than five national exchanges, see (Clayton et al. 2006).
2 E. g. Duesseldorf: „Quality Trading“, Hamburg/Hanover: „Execution Guarantee“, Munich: „maxone“, Stuttgart: „Best-Price-Principle“. On the potential effects of best execution offerings (so called cream skimming) see e. g. (Easley et al. 1996; Battalio 1997).
3 For a distinction between implicit and explicit transaction costs see e. g. (Lüdecke 1996; Gomber 2000).
4 See e. g. (Barber et al. 2005) with respect to load fees of mutual funds. See also (Picot et al. 1996), p. 117, who state that a trade in a global financial market is always executed at the place with the lowest transaction costs.
Kirschner analyzed these fees for all German regional exchanges as well as for XETRA, the Swiss exchange, and the Wiener Börse (Vienna exchange). He finds substantial differences in fees between these trading venues.5 Interestingly, these differences have not led access intermediaries, such as online-brokers, to offer access only to selected (regional) exchanges.6 It is left to the investor, the retail trader, to choose between the numerous trading options, taking the heterogeneous terms and services into account. Because online-brokers use different price models, the differing price models of exchanges and other participating service providers are hard to compare or not visible at all. Moreover, most retail investors may not even know which service providers participate in routing, matching, and clearing a transaction. Hence, the objective of this contribution is to identify the relevant services and their fees for routing and executing an order at an exchange and to integrate these data with the price models of established online-brokers in the German financial services market. The guiding research questions are:
• What are the relevant components of the service “exchange transaction” and which fees are charged for these components?7
• Is there competition among exchanges based on the fees charged? If so, do the price models of the online-brokers distort this competition, or can a retail investor recognize it and take it into account in the order-routing decision?
To answer these questions, the primary service components of a stock market transaction are identified and the respective fees are evaluated. Subsequently, we analyze whether and how these fees are passed on to retail investors through the price models of online-brokers.
We focus our analysis on the German regional exchanges (Berlin8, Duesseldorf, Hamburg, Hanover, Frankfurt (FWB), Munich, and Stuttgart) as well as the electronic trading system XETRA. The structure of this contribution9 is as follows: First, the service components of securities trading and their respective fees are introduced. Thereafter, forms of appearance of online-brokerage are discussed and a short market overview is provided. Then, the price models of four relevant online-brokers are briefly portrayed. Finally, the price models and the fees for service components are integrated and conclusions are drawn. The contribution closes with a summary and an outlook on current developments in the market, with particular emphasis on OTC offerings.

5 See (Kirschner 2003).
6 One exception is the online-broker Fimatex, which currently does not offer access to the regional exchanges at Hamburg and Hanover (as at March 2006).
7 The internal processes of a banking institution such as account services will not be covered in this analysis.
8 The initiative „Nasdaq Germany“ was closed in autumn 2003 after having operated for only around half a year; currently, there is no trading at the exchange in Bremen. Hence, this exchange will not be analyzed further. At the end of 2005 the Bremen exchange was bought by SWX Swiss Exchange. In 2007, trading of derivative securities is scheduled to start on a fully electronic exchange system.
9 This contribution is an extended and revised version of the contribution: Kundisch D, Henneberger M, Holtmann C (2005) Börsenwettbewerb über explizite Transaktionskosten auch beim Privatanleger?, in: Österreichisches BankArchiv, Vol. 53, No. 2, 2005, pp. 117–129.
8.2 Service Components and Fees in Securities Trading
8.2.1 Service Components in Securities Trading
Market participants receive the complex service offering “exchange transaction” when trading on exchange markets. This offering is composed of different components that are provided by a number of players. Private investors typically do not have direct access to the market, but have to use financial intermediaries to route their orders to the markets and to hold their deposits and accounts. Both services constitute the core service offering of online-brokers, which can hence be described as the provision of market access. Typically, private investors have a direct business relation only with the broker; they are indirect customers of any other service provider involved in exchange transactions. Those indirect services are charged by the broker in the name of the providers.10 Depending on the type and transparency of the broker’s business and fee model, the prices of the individual service components are either hidden or observable, and in the latter case relevant to the investors’ decisions. The following service components, which can be grouped by the phases of a market transaction, extend the core offering of the broker:11
• Information services: Information services are used before as well as after a transaction. Before the trade, the investment decision may be based on information about the current market situation. After the trade, information may be needed to monitor and, if necessary, to revise the decision. Market information is typically acquired from specialized companies like Reuters or Bloomberg. Brokers usually provide it for free or for a monthly fee; very detailed information (e. g. research or real-time quotes) is sometimes charged additionally.
• Price determination services: In Germany, exchange transactions can be closed at any of the regional markets or within the automated trading system XETRA that is run by Deutsche Börse AG. The fees that are charged differ between the markets and depend on the service providers involved, e. g. the exchange broker. Exchanges charge their fees depending on the transaction volume, starting with a minimum amount. Exchange brokers receive special commissions, also referred to as ‘courtage’, for finding partners with corresponding trading interests. Moreover, exchanges charge annual fees for market access.
• Clearing and settlement services: Clearing and settlement is provided by specialized companies in combination with the federal state central banks, which hold the deposits and accounts of the brokers.
In the remainder, the fee models of the price determination and the clearing and settlement providers are discussed before those of the brokers are analyzed in more detail.

10 Consequently, the broker sells a kind of system or composite good; see (Matutes and Regibeau 1988).
11 See (Picot et al. 1996) for a phase model of market transactions; see (Holtmann 2004) for an analysis of exchanges as service providers.
8.2.2 Access Fees for Participants at German Exchanges
Online-brokers need an admission to every single market they plan to route orders to. This admission is typically charged with a one-time accreditation fee as well as recurring annual participation fees (see the following table).

Table 8.1. Accreditation and annual fees12

| Exchange | Annual participation fee, min. (in €) | Annual participation fee, max. (in €) | Comments | Multiplier | Accreditation fee (one time) |
|---|---|---|---|---|---|
| Berlin | 1,500 | 6,000 | 4 steps at 1,500 € each | 100% | Equals the corresponding annual participation fee |
| Duesseldorf | 1,000 | 22,500 | Not defined explicitly | 100%* | Equals twice the participation fee when not already a participant at another regional exchange; otherwise it equals the participation fee |
| Frankfurt (FWB) | 7,500 | 15,000 | Participants accessing exclusively via Xontro 7,500 €, participants on the floor 15,000 € with additional 1,500 € for every trader | Multiplier not existent | For lead brokers (‘Skontroführer’) 20,000 €, none for the others |
| … | … | … | exchange brokers 5,000 € additional | not existent | 35,000 € for lead brokers (‘Skontroführer’) |
| XETRA: two direct connections | 37,500 | 37,500 | Fees for XETRA | Multiplier not existent | None |
| XETRA: one direct connection, one Internet connection | 28,500 | 28,500 | Fees for XETRA | Multiplier not existent | None |
| XETRA: Internet connection | 10,500 | 10,500 | Fees for XETRA | Multiplier not existent | None |
| @XETRA Workstation | 13,500 | 13,500 | Fees for XETRA | Multiplier not existent | None |

* In Duesseldorf the multiplier is typically not applied; instead the fees are negotiated on an individual basis within the given interval

12 Information taken from (Börse Berlin-Bremen 2004; Börse Duesseldorf 2004; Börse Munich 2004b; Börse Stuttgart 2004; Hanseatische Wertpapierbörse Hamburg 2004; Niedersächsische Börse zu Hannover 2003; Deutsche Börse AG 2003; Deutsche Börse AG 2004) as well as from emails and phone calls with the market providers.
The annual participation fees in particular are regularly revised and vary significantly between the markets, depending on the actual and/or anticipated turnover volume of the broker at the respective market.13 Differences between the markets are immense, and Berlin, Duesseldorf, and Munich define a multiplier in their scales of charges and fees that can easily be adapted to current developments. The market operators also charge for the provision of the technical infrastructure. Brokers and other direct market participants pay monthly fees for using the standardized Xontro system. Xontro is the system that allows for electronic trading and order processing at the German regional exchanges; it is provided by Deutsche Börse. Market participants can choose between different forms of accessing Xontro:14
• Dial-in connection: The dial-in connection is the least expensive and easiest way to get the full range of functionalities for order processing. It is typically used by brokers with a small or medium order processing volume, or as a backup solution by brokers with a higher volume.
• Permanent connection: Market participants that have high(er) processing volumes and run their own order processing systems might want a permanent connection to the systems of Deutsche Börse AG. For such a connection they are charged a monthly fee of 7,500 €, regardless of the markets at which the participant is registered.15

13 See e. g. § 2 para. 2 in the scale of charges and fees of Börse Stuttgart.
8.2.3 Fees for Price Determination Services
Market maker commissions, which have to be paid for most transactions on regional exchanges, are calculated in basis points (bp) of the order volume. XETRA is not only an electronic but also an automated trading system in the sense that the matching of orders and the determination of prices take place without human intervention; orders compete directly with each other. Instead of market maker commissions, XETRA system fees are charged. Partial executions of orders are handled as separate transactions that are normally charged individually (XETRA transactions can be handled differently if executed within one day16). Table 8.2 summarizes the amounts that are charged for trading stocks; trading in bonds is calculated differently and is not illustrated here. If transactions are executed and documented in Xontro, contract notes are created that provide information on trading partners, price, date, fees, etc. Contract notes document the terms of any contract that has been concluded at an exchange. The generation and transmission of contract notes are charged by the exchanges; rates differ between the regional markets (see Table 8.3). Contract note fees are charged by Deutsche Börse on behalf of the different regional exchanges.17 Normally there are no corresponding charges on XETRA; a voluntary contract note for a XETRA contract costs 0.06 € per transaction. In addition, market participants have to pay a monthly fee of 55 € for every market they access via Xontro.
14 See (BrainTrade 2004b), p. 12.
15 Munich runs the proprietary trading system maxone that can be accessed through the Xontro connection; see (Deutsche Börse Systems, BrainTrade 2004), pp. 8–9.
16 (Deutsche Börse AG 2003).
17 (BrainTrade 2004a).
Table 8.2. Market maker commission and (XETRA) system fees, respectively, for order execution18

| Exchange | Variable allowance | Exception for German blue chips (DAX) | Minimum/maximum allowance in € for stocks | Notes |
|---|---|---|---|---|
| Berlin | 8 bp | 4 bp | 0.75/– | |
| Duesseldorf | 8 bp | 4 bp | 0.75/– | |
| Frankfurt (FWB) | 8 bp | 4 bp | 0.75/–* | |
| Hamburg | 8 bp | 4 bp | 0.75/8.00 | |
| Hanover | 8 bp | 4 bp | 0.75/– | |
| Munich | 8 bp | 4 bp | 0.75/– | |
| Stuttgart | 8 bp | 4 bp | 0.75/12.00 | |
| XETRA: High Volume | 0.56 bp | None | 0.70/21.00 | 20,000 € minimum fee |
| XETRA: Medium Vol. | 0.588 bp | None | 0.73/22.05 | 5,000 € minimum fee |
| XETRA: Low Volume | 0.644 bp | None | 0.80/24.15 | 2,000 € minimum fee |

* For derivative leverage products and for derivative investment products, there is a maximum allowance of 3 € and 12 €, respectively
Table 8.3. Contract Note Fees19

| Exchange | Contract note fee in € | Monthly fee for Xontro access to the respective market in € |
|---|---|---|
| Berlin | 1.75 | 55.00 |
| Duesseldorf | 1.75 | 55.00 |
| Frankfurt | 1.75 | 55.00 |
| Hanover | 1.70 | 55.00 |
| Munich | 1.00 | 55.00 |
| Stuttgart | 1.74 | 55.00 |
| Hamburg | 1.70 | 55.00 |

18 Table 8.2 is based on the references already given in Table 8.1. The internalization system XETRA Best is not illustrated.
19 (BrainTrade 2004a).
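The fee rules in Tables 8.2 and 8.3 can be made concrete with a small calculation. The following Python sketch is illustrative only. It assumes that the commission is the order volume times the rate in basis points, bounded by the minimum and, where one exists, the maximum allowance, that amounts are rounded to the cent, and that the contract note fee is simply added for regional exchanges. The function names, the rounding rule, and the selection of venues are assumptions made for this example; the reduced 4 bp rate for DAX stocks is not modelled.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Rates copied from Tables 8.2 and 8.3 for a few venues (illustrative selection).
# "maximum" is None where the table shows no cap; XETRA has no mandatory contract note.
VENUES = {
    "Berlin": {"bp": "8", "minimum": "0.75", "maximum": None, "contract_note": "1.75"},
    "Munich": {"bp": "8", "minimum": "0.75", "maximum": None, "contract_note": "1.00"},
    "Stuttgart": {"bp": "8", "minimum": "0.75", "maximum": "12.00", "contract_note": "1.74"},
    "XETRA: Low Volume": {"bp": "0.644", "minimum": "0.80", "maximum": "24.15", "contract_note": "0"},
}


def commission(volume, bp, minimum, maximum=None):
    """Volume-based fee in €: volume * bp / 10,000, bounded by minimum and maximum."""
    fee = Decimal(str(volume)) * Decimal(str(bp)) / Decimal(10000)
    fee = max(fee, Decimal(str(minimum)))
    if maximum is not None:
        fee = min(fee, Decimal(str(maximum)))
    # Rounding to the cent is an assumption of this sketch.
    return fee.quantize(CENT, rounding=ROUND_HALF_UP)


def variable_cost(venue, volume):
    """Commission (or XETRA system fee) plus contract note fee for one execution."""
    v = VENUES[venue]
    return commission(volume, v["bp"], v["minimum"], v["maximum"]) + Decimal(v["contract_note"])


if __name__ == "__main__":
    for name in VENUES:
        print(name, variable_cost(name, 5000))
```

Under these assumptions, a 5,000 € order costs 5.75 € at Berlin and 5.00 € at Munich, while the XETRA low-volume tier charges only its 0.80 € minimum.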
Fees for Clearing and Settlement20
After a deal has been closed at an exchange, the obligations of the participants regarding delivery and payment are registered and fulfilled in the clearing and settlement phase. Besides these basic services, clearing and settlement providers may offer additional services (e. g. custody). Trades on national exchanges are reported to Clearstream Banking AG as one of the main clearing and settlement service providers. As the central depository for securities, Clearstream Banking AG holds the deposits of the banks and brokers involved in securities market trading. Clearstream charges basic fees for the holding of accounts and deposits and additional fees per transaction. Basic fees have to be paid monthly; they are calculated in bp of the individual volumes. Different fee scales apply to different security types. Table 8.4 illustrates the pricing scheme for stocks, mutual funds, and similar securities. The fee is calculated by summing up the fees for the security values in the individual brackets (e. g. for a deposit with a volume of € 150 Mio., the first € 100 Mio. are charged at 0.2 bp and the remaining € 50 Mio. at 0.175 bp).

Table 8.4. Annual Fees of Clearstream Banking AG

Deposit (market value in Mio. €)
0 to 100
From 100 to 250
From 250 to 500
From 500 to 1,000
From 1,000 to 5,000
From 5,000 to 10,000
From 10,000 to 25,000
From 25,000 to 100,000
From 100,000
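The bracket-wise calculation described above can be sketched as follows. Only the two rates named in the text (0.2 bp up to € 100 Mio. and 0.175 bp from € 100 Mio. to € 250 Mio.) are used. The function name and the decision to reject larger deposits are choices made for this illustration and are not part of the Clearstream fee schedule.

```python
from decimal import Decimal

MIO = Decimal(1_000_000)

# (upper bound of the bracket in €, rate in bp). Only the first two brackets
# are given with their rates in the text; higher brackets are deliberately omitted.
TIERS = [
    (100 * MIO, Decimal("0.2")),
    (250 * MIO, Decimal("0.175")),
]


def deposit_fee(volume):
    """Sum the fee bracket by bracket: each slice of the deposit pays its own rate."""
    volume = Decimal(volume)
    if volume > TIERS[-1][0]:
        raise ValueError("rate for this bracket is not given in the text")
    fee = Decimal(0)
    lower = Decimal(0)
    for upper, bp in TIERS:
        if volume <= lower:
            break
        portion = min(volume, upper) - lower  # part of the deposit inside this bracket
        fee += portion * bp / Decimal(10000)  # 1 bp = 0.01 %
        lower = upper
    return fee


if __name__ == "__main__":
    # Example from the text: 100 Mio. at 0.2 bp plus 50 Mio. at 0.175 bp
    print(deposit_fee(150_000_000))
```

For the € 150 Mio. deposit from the text, the sketch yields 2,000 € for the first bracket plus 875 € for the second, i.e. 2,875 € in total.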
Transaction-based fees are also charged on a monthly basis. For a standard trade they typically amount to around € 1.15 (tax free) for each of the two partners involved in a trade. Depending on the monthly trading volume, Clearstream grants discounts and rebates, which are illustrated in Table 8.5.21

Table 8.5. Discount Scheme for transaction-based Fees of Clearstream Banking AG

| Number of bookings per month | Rebate | Fee per trade and participant |
|---|---|---|
| From 5,000 | 5.0% | 1.09 € |
| From 10,000 | 10.0% | 1.04 € |
| From 20,000 | 12.5% | 1.01 € |
| From 50,000 | 15.0% | 0.98 € |

Since 2003 Eurex Clearing has acted as a Central Counterparty (CCP) for transactions in stocks at the regional exchange in Frankfurt and in XETRA. This allows for the netting of trades (only the net positions of the participants have to be cleared and settled), lower risks, and more anonymity in trading. Eurex Clearing AG thus becomes the trading partner of both parties involved in a trade and ensures its timely clearing (Clearstream Banking AG remains the settlement service provider). The fees charged by Eurex Clearing depend on which of the two procedures, gross or net, is applied. In the first case a transaction fee of € 0.70 has to be paid; in the latter a more complex pricing scheme is applied to the so-called netting units. In both cases partial executions are regarded as single executions and prices are listed without taxes:22

Table 8.6. Pricing scheme of Eurex Clearing AG

Number of transactions per “settlement netting unit” and day
From 1 to 1,000
From 1,001 to 2,500
From 2,501 to 5,000
From 5,001 to 10,000
From 10,001 to 20,000
From 20,000

20 The numbers given in the next paragraph are taken from (Clearstream Banking AG 2003) and (Eurex Clearing AG 2004).
21 Additional services of Clearstream and the corresponding fees are not discussed here in more detail.
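The rebate schedule in Table 8.5 is a simple threshold lookup. The following Python sketch reproduces the listed fees per trade from the base fee of about € 1.15. The function name is hypothetical, and rounding to the cent (half up) is an assumption that happens to match the table; how Clearstream applies the rebate across a month's bookings is not specified in the text and is not modelled here.

```python
from bisect import bisect_right
from decimal import Decimal, ROUND_HALF_UP

BASE_FEE = Decimal("1.15")  # approximate fee per standard trade and participant, in €

# Thresholds ("From ...") and rebates from Table 8.5; below 5,000 bookings no rebate.
THRESHOLDS = [5_000, 10_000, 20_000, 50_000]
REBATES = [Decimal("0"), Decimal("0.05"), Decimal("0.10"), Decimal("0.125"), Decimal("0.15")]


def fee_per_trade(bookings_per_month):
    """Fee per trade after applying the rebate for the given monthly booking count."""
    # bisect_right counts how many thresholds have been reached or exceeded.
    rebate = REBATES[bisect_right(THRESHOLDS, bookings_per_month)]
    return (BASE_FEE * (1 - rebate)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for n in (1_000, 5_000, 10_000, 20_000, 50_000):
        print(n, fee_per_trade(n))
```

With these inputs the sketch returns 1.09 €, 1.04 €, 1.01 €, and 0.98 € at the four thresholds, which agrees with the last column of Table 8.5.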
8.2.4 Discussion of Services and Fees
Diverse specialized companies contribute to the provision of the service offering “exchange transaction”. The individual fees charged for this complex service differ significantly between the providers:
• Fees for market access: Access fees differ substantially between providers and markets. This is because (i) the interval in the official scales of charges and fees of the analyzed markets is very wide (e. g. Hanover from 50 € to 20,000 €) and (ii) the differences between the markets are also large (e. g. Duesseldorf 22,500 € at the maximum vs. Berlin 6,000 € at the maximum). Access and accreditation fees can be regarded as fixed costs for the brokers that are (mostly) independent of individual transactions. The same holds for Xontro fees.
• Price determination services: Market maker commissions paid to the exchange broker are variable costs for the broker. At present, fee structures do not differ between the markets, except for a few caps. Differences do exist between XETRA and the regional exchanges: in some cases the theoretical XETRA fees amount to only 7% of the respective fees of the regional exchanges. This theoretical value has to be put into perspective, however, because for most trades the minimum commission of 0.70 €, 0.73 €, or 0.80 € has to be paid (most retail trades have a volume below 12,000 €; see Table 8.7). The apparent price advantage of XETRA is therefore smaller than anticipated. Fees for contract notes are also variable costs that can easily be passed on. Munich in particular is less expensive (1 €) than all other exchanges (1.70 € to 1.75 €). The fee for clearing and settlement connectivity neither varies between markets nor is it correlated with the number of orders. These 55 € per month can be regarded as fixed costs for the broker.
• Clearing and settlement services: Fees are charged by Clearstream Banking AG and Eurex Clearing AG for holding deposits and for clearing and settlement services in a narrower sense. The latter can be directly associated with single trades and can therefore be regarded as variable costs for an online-broker. Differences between exchanges arise from the central counterparty (CCP) that exists at the markets in Frankfurt (regional market and XETRA). Trades involving the CCP allow for significantly lower costs.23

Table 8.7. Market maker commission and order volume

| Bp | Volume of orders up to which the minimum market maker commission has to be paid (in €) |
|---|---|
| 8 | 937.50 |
| 4 | 1,875.00 |
| 0.644 | 12,422.36 |
| 0.588 | 12,414.97 |
| 0.56 | 12,500.00 |

22 A netting unit is a number of similar buy and sell orders that shall be netted before settlement. The definition of the netting units as well as the decision which trades may be settled in the net or gross procedure is left to each participant.
23 It has to be mentioned, though, that the introduction of the CCP requires significant investments in hardware and software as well as in additional infrastructure components on the part of the market participants. Additionally, the CCP is not yet available for all products. Hence traditional and new processes and infrastructures have to be run in parallel; see (Kalbhenn 2002).
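The thresholds in Table 8.7 follow directly from the rates and minimum fees in Table 8.2: the minimum applies as long as the volume times the rate stays below it. The short Python sketch below recomputes them and also checks the 7% figure quoted above. The function name is an assumption made for illustration.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def breakeven_volume(bp, minimum_fee):
    """Order volume in € at which volume * bp / 10,000 equals the minimum fee."""
    return (Decimal(minimum_fee) * 10000 / Decimal(bp)).quantize(CENT)


# (rate in bp, minimum fee in €) as given in Table 8.2
RATES = [("8", "0.75"), ("4", "0.75"), ("0.644", "0.80"), ("0.588", "0.73"), ("0.56", "0.70")]

# Lowest XETRA rate relative to the regional 8 bp rate, ignoring minimum fees
XETRA_SHARE = Decimal("0.56") / Decimal("8")

if __name__ == "__main__":
    for bp, minimum in RATES:
        print(bp, "bp:", breakeven_volume(bp, minimum))
    print("XETRA high-volume rate as share of 8 bp:", XETRA_SHARE)
```

The ratio of 0.56 bp to 8 bp is exactly 0.07, which is the theoretical 7% mentioned in the text. Below the break-even volumes, however, the minimum fees (0.70 € to 0.80 € on XETRA against 0.75 € on the regional exchanges) determine the cost, so the gap narrows considerably for small orders.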
Having discussed the services and fees associated with an exchange transaction, we now illustrate the offerings and charges of four online-brokers in more detail. We analyze whether, how, and how transparently online-brokers pass the prices they have to pay on to their customers, in order to discuss whether and to what extent private investors can base their trading decisions on information about the prices that the different service providers charge for their individual offerings.
8.3 Analysis of the Price Models of Online-brokers
8.3.1 Forms of Appearance of Online-brokerage
Companies that provide access to markets offer securities trading services to their customers. As members of the so-called sell-side, they trade securities on behalf of their customers; in contrast, dealers such as market makers trade with their customers.24 To put it differently, online-brokers do business in their own name but for the account of their customers.25 Since the beginning of the 1990s their appearance has changed, particularly due to the influence of information technology.26 Today the term broker, frequently specified as discount or direct broker, is often used for different kinds of securities trading service companies that give retail investors the opportunity to trade securities by offering access to markets. As ‘access intermediaries’ they largely offer technical infrastructures that allow for the entry and routing of customer orders as well as the custody of deposits and accounts.27 With the widespread use of the Internet, the expression online-broker (often used synonymously with e-broker) has become an umbrella term for these different types. Over time it has embraced characteristics of differing stages of technical evolution and service offerings. After the first direct brokers, which had a clear focus on cost leadership, online-brokers with a broader service offering and fewer discount characteristics entered the German market. Most of these brokers are subsidiaries of German or international private banks (see Table 8.8). In the following, we refer to an online-broker as an intermediary that focuses its business primarily on offering market access and order routing via online channels. Moreover, we concentrate on the part of the service offering that retail investors use for Internet-based securities trading at a German exchange (national Internet-brokerage).
24 See (Harris 2002).
25 See also §2 para. 3 Securities Trading Act (Wertpapierhandelsgesetz).
26 See (SEC 1999).
27 See (Weinhardt et al. 1999).
Table 8.8. Online-brokers, their parent companies and number of online accounts

| Name of the broker | Parent company | Nationality of parent company | Number of online securities accounts in Germany in 2005 (estimations) |
|---|---|---|---|
| 1822direkt | Frankfurter Sparkasse | Germany | > 160,000 |
| Citibank | Citigroup | USA | > 330,000** |
| comdirect bank | Commerzbank | Germany | > 550,000 |
| Cortal Consors | BNP Paribas | France | > 500,000 |
| DAB bank | Hypovereinsbank/Unicredito | Germany/Italy | > 900,000* |
| ING Diba | ING Group | Netherlands | > 458,000 |
| Easytrade | Postbank | Germany | > 430,000 |
| E*TRADE | E*TRADE FINANCIAL | USA | > 10,000 |
| Fimatex | Société Générale | France | > 22,500 |
| maxblue | Deutsche Bank | Germany | > 400,000 |
| S Broker | German Savings Banks’ Association | Germany | > 100,000 |

* Including around 450,000 accounts of the FondsServiceBank GmbH in the B2B business
** Some of which may only be used offline
8.3.2 Price Models of Selected Online-brokers
The following analysis concerns the fees that are relevant for closing a deal at an exchange. We therefore look at the fees a retail investor is charged by the online-broker. The analysis covers only orders in stocks that are entered via the Internet28 and routed to a German exchange. These fees are called explicit transaction costs.29 A substantial part of these transaction costs are the order commissions. In general, the order commission depends on the volume of a transaction. Moreover, some online-brokers charge a so-called trading place fee. Depending on the price model, there are also fees for placing a limit order, for canceling or changing an order, or for partial executions of orders. In addition to these costs, which are described in the brokers’ tariffs (also called terms & conditions or rates & fees on the respective websites), so-called external allowances, which are generally not described in detail in the tariffs, may also be charged to the customer. They may include the market maker commission, XETRA fees, clearing fees, or a contract note fee. Trades in registered securities may also result in additional fees for the entry in the register of shareholders.
In the following, the price models of selected online-brokers (Cortal Consors, comdirect, DAB bank, and maxblue) are briefly sketched. These brokers offer a comparable service spectrum concerning market information, market access, and tools for market analysis. Moreover, all of them have existed for at least five years and hold more than 400,000 online accounts in Germany. The data refer to the standard tariffs; special conditions, e. g. for active traders, were not considered.30 The following tables briefly summarize the four price models.

Table 8.9. Price models (in €)31
Order commission for a national Internet stock trade
* This fee applies only if the order is not executed on the same day
+ These are minimum fees. For orders with a volume above 100,000 €, the fee is 0.0015% and 0.0025%, respectively
# The order commission is waived for the second and any further partial execution

Table 8.10. Charged external allowances (in €)

| Broker | Market maker commission | Clearing & Settlement | Contract note fee | Registration fee for registered securities |
|---|---|---|---|---|
| comdirect | ✓ | – | – | 0.93 |
| Cortal Consors | ✓ | – | – | 1.95 |
| DAB bank | ✓ | – | – | 0.93 |
| maxblue | ✓ | ✓ | ✓ | 0.93 |

28 These results can broadly be applied also to publicly traded securities such as warrants, exchange traded funds, or derivative investment certificates. Concerning bonds, there are generally at least different market maker commissions.
29 Implicit transaction costs include e. g. the spread or the price impact of an order (see e. g. (Gomber 2000)).
30 Often, special customer segments such as actively trading customers are grouped into a community. Examples of such communities that get special conditions are „Startrader“ and „Platinumtrader“ at Cortal Consors or „comdirect first“ at comdirect. See (Kundisch and Krammer 2006) for some insights concerning attitudes of retail investors dependent on their trading activity.
31 Moreover, additional fees may be charged in specific cases, e. g. for the custody of foreign stocks. For the sake of simplicity, these fees are not covered in the following. For the detailed price models see (Comdirect bank 2006; Cortal Consors 2005; DAB bank 2004; Deutsche Bank Privat- und Geschäftskunden AG 2006).
8.3.3 Comparison of the Price Models and Discussion
8.3.3.1 General Comparison of the Price Models
The overview of the price models reveals that each price model on its own is quite straightforward. A comparison is difficult, however, because each price model consists of different service components with their respective fees and because different fees are charged for the same service component. The following figure illustrates the comparison of transaction costs for a limit order in XETRA. While Cortal Consors has the lowest explicit transaction costs for a trade volume of up to 5,500 €, maxblue is the least expensive provider for order volumes between 5,500 € and 10,000 € and above 11,700 €. Between 10,000 € and 11,700 €, Cortal Consors is again the best choice among the big four online-brokers. It is not the objective of this contribution to analyze these apparent differences in detail. Rather, the question is whether these numerous fees and commissions become visible and transparent to retail investors.

[Figure: executed XETRA order with limit; x-axis: order volume in Euro (0 to 30,000); y-axis: explicit transaction costs in Euro (0 to 80); series: comdirect, Cortal Consors, DAB bank, maxblue]
Figure 8.1. Comparison of explicit transaction costs for a limit order in XETRA
8.3.3.2 Transparency of the Fees for Service Components for Investors
The components of the price models of the covered online-brokers do not allow for an exact mapping to the service and fee components that exchanges or other service providers charge the online-broker to route, execute, and clear a trade. Therefore, the differentiation into market access, price discovery, and clearing and settlement fees cannot be applied one-to-one in the following discussion.
8 Online-brokerage and Security Markets in Germany
All online-brokers have to pay accreditation and annual fees at the exchanges. These fees are independent of single orders[32] and may vary between different exchanges as well as between different online-brokers. Theoretically, an online-broker could distribute the annual participation fee for each exchange uniformly according to the expected order volume that will be routed to this market. In contrast, accreditation fees may be regarded as sunk costs since they were paid once and should not influence the cost calculation of online-brokers today. In practice, however, the considered online-brokers behave differently. According to the answers to our inquiries concerning this issue, at least one broker sums up the annual participation fees for all German trading places, divides this sum by all expected national orders, and considers the result as a fixed cost allocation within its order commission. Hence, potentially different fees of the exchanges will not become visible to a retail investor. At the time the analysis was conducted, all market makers at German regional exchanges charged similar commissions. These commissions are passed through to the retail investor by all of the brokers considered. Hence, competition between the exchanges would be visible to an investor, but for this service component there is no differentiation in the market, with the exception of caps. Some market segments at specific exchanges have a cap on market maker commissions (at Stuttgart 12 € for all retail derivatives (EUWAX segment), domestic stocks, and mutual funds; at Frankfurt 3 € and 8 € for derivative leverage products and derivative investment products, respectively (Smart Trading segment); at Hamburg 12 € for all stock orders).
EUWAX and Smart Trading in particular use these caps intensively as a sales argument in their marketing strategies.[33] So-called XETRA fees are only passed through by maxblue, while the other three brokers charge a flat fee (and in most cases benefit from this). Specifically, comdirect charges a minimum fee of 1.50 €. Additionally, for a trade volume higher than 100,000 €, a volume-dependent fee of 0.0015% is applied. Order volumes of this size, however, will be very rare exceptions among retail investors. Whether the XETRA fee at Cortal Consors, comdirect, and DAB bank includes fixed cost allocations or is a sum of other variable costs is visible neither to an investor nor to the authors.
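The two mechanisms described above, the allocation of annual participation fees to single orders and a cap on the market maker commission, are simple enough to write down. The Python sketch below is illustrative only: the 12 € cap is the Stuttgart figure quoted in the text, whereas the annual fees, the expected number of orders and the commission rate of 8 basis points are invented placeholders.

```python
def fixed_cost_per_order(annual_fees, expected_orders):
    """Sum the annual participation fees of all trading places and spread
    them evenly over the expected number of orders."""
    return sum(annual_fees.values()) / expected_orders

def market_maker_commission(volume, rate, cap=None):
    """Volume-proportional commission, optionally limited by a cap."""
    fee = volume * rate
    return fee if cap is None else min(fee, cap)

# Hypothetical annual participation fees in Euro (assumed values).
annual_fees = {"exchange_1": 20_000.0, "exchange_2": 12_000.0, "exchange_3": 8_000.0}
allocation = fixed_cost_per_order(annual_fees, expected_orders=2_000_000)
print(f"fixed cost allocation per order: {allocation:.3f} EUR")

rate = 0.0008   # assumed rate of 8 basis points, not taken from the text
for volume in (5_000, 25_000):
    uncapped = market_maker_commission(volume, rate)
    capped = market_maker_commission(volume, rate, cap=12.0)  # Stuttgart cap
    print(volume, round(uncapped, 2), round(capped, 2))
```

Because the allocation is the same for every order regardless of the exchange, any difference between the participation fees of the exchanges disappears from the investor's view, whereas a cap only matters once the order is large enough to reach it.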
[32] According to the terms and conditions of the exchanges, the annual participation fees are set with regard to the overall relevance and importance of a participant for the exchange. In the long run, these fixed costs in fact depend in a sense on the number and the volume of orders that are routed to a specific exchange.
[33] In addition, in case of incorrect information by an online-broker, a retail customer may come to the conclusion that, with respect to market maker commissions, one exchange is less expensive than another. For example, one broker told us that the market maker commission in Berlin is 4 bp on all stocks instead of just the DAX stocks (email as of 04/17/2004). Incorrect information was also provided by several online-brokers with respect to the charges for the recordal of registered shares.
In addition to the market maker commission at regional exchanges, Cortal Consors, comdirect and DAB bank charge a regional exchange fee. It is not visible at all to a retail investor which service and fee components are included in this fixed fee. maxblue is the only broker in the sample that passes through all external allowances. Hence, differing contractual note fees (see Table 8.3), which in the worst case amount to a moderate 0.75 € per trade, will only become visible to maxblue customers in the sample. Clearing and settlement is centrally coordinated by Clearstream Banking AG and in some cases (stock trades at FWB or via XETRA) by Eurex Clearing AG. Competition can only be expected between FWB and XETRA on the one hand and all the other regional exchanges on the other hand. According to the hotlines of the brokers in the sample, only maxblue passes through this fee. Thus, a potential advantage of XETRA and FWB will in most cases not become visible to a retail investor.
The handling of partial executions is directly relevant to retail investors. Cortal Consors and DAB bank waive their order commission for the second and any further partial execution but still charge trading place fees and external allowances, whereas maxblue and comdirect charge each partial execution as if it were a single and separate order, except for XETRA trades that are executed on the same day. Thus, trades at maxblue or comdirect may become substantially more expensive than trades at Cortal Consors or DAB bank due to partial executions. One should note that partial executions generally only occur on XETRA. Regional exchanges in most cases execute an order completely in one step; e.g. Börse Munich advertises a partial execution rate of below 0.2% of all orders.
8.3.3.3 Discussion and Limitations
A number of service providers are involved in the complex offering “exchange transaction”.
At the level of online-brokers, there are in particular different fees for participation at the different exchanges (see Table 8.1). Another aspect is the diversity in fees for contractual notes (see Table 8.3). For all other service and fee components, there is only competition on two levels:
• Generally, the market maker commission is identical for all regional exchanges, except for a few exchanges that offer a cap on this commission, and differs significantly from the XETRA fee. The XETRA fee is substantially lower than the market maker commission; however, in most cases the minimum fee will be triggered for a retail trade. Thus, the difference to the market maker commission is put into perspective in most instances.
• Concerning the clearing and settlement of trades, the fees of the FWB and XETRA may be lower compared to the other regional exchanges due to the introduction of the Central Counterparty (CCP), leaving aside the one-time fixed costs for the adaptation of the intermediaries’ infrastructures. A retail investor may benefit from the scale, in terms of the number of
transactions and the consolidated order volume, of his intermediary, since the fees depend on the number and the volume of the intermediary’s transactions in total. These discounts are independent of the exchanges, though.
Normally, the moderate differences in the fees between the competing exchanges do not become visible to retail investors due to the price model of their broker. However, there are differences between these models: On the one hand, Cortal Consors and DAB bank refrain from charging variable costs for most of the external allowances but charge a fixed fee per market. On the other hand, maxblue passes through all external allowances that may be attributed to a specific trade. Whether the price models affect investors’ behaviour cannot be observed. Additionally, one important aspect should be mentioned here: none of the analyzed brokers lists the different costs for the different exchanges. Moreover, it took even the authors of this contribution a number of weeks to gather all the relevant data. Thus, most of the time, an investor will only learn about the fees at a trading place ex post from his contract note and will not have the opportunity at that time to compare these costs with the costs the same trade would have triggered at a different trading place.
Some limitations accompany the presented analysis and should be taken into account when discussing the conclusions above:
• We only looked at transactions in national stocks at German exchanges. There may be substantial differences in the service components and fees for foreign exchanges or trades in foreign securities.
• We only analyzed the price models of four established online-brokers in the German market. There may be substantially differing terms and conditions at other brokers that would potentially lead to other conclusions. Citibank, for example, recently introduced a flat rate model at 9.99 € per trade without market maker commissions, trading place fees or limit fees.
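The effect of the two ways of handling partial executions described in Section 8.3.3.2 can be made concrete with a small calculation. The Python sketch below is an illustration under assumed numbers: the order commission of 10 € and the trading place fee of 1.50 € per execution are placeholders, further external allowances are left out, and the same-day exception for XETRA trades is deliberately ignored.

```python
def cost_commission_waived(n_executions, order_commission, place_fee):
    """Policy attributed in the text to Cortal Consors and DAB bank:
    the order commission is charged once, the trading place fee for
    each partial execution."""
    return order_commission + n_executions * place_fee

def cost_per_execution(n_executions, order_commission, place_fee):
    """Policy attributed in the text to maxblue and comdirect: every
    partial execution is charged like a separate order."""
    return n_executions * (order_commission + place_fee)

ORDER_COMMISSION = 10.00   # assumed, in Euro
PLACE_FEE = 1.50           # assumed, in Euro per execution

for n in (1, 2, 3):
    waived = cost_commission_waived(n, ORDER_COMMISSION, PLACE_FEE)
    separate = cost_per_execution(n, ORDER_COMMISSION, PLACE_FEE)
    print(n, waived, separate)
```

For an order executed in one step both policies cost the same; with every additional partial execution the gap grows by one full order commission.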
8.4 Outlook: Off-exchange-markets as Competitors
So far, only markets in Germany under the official market and legal supervision of the state-run Exchange Supervisory Authority have been analyzed as trading venues. These official exchanges provide pre- and post-trade transparency and offer order-driven market models. In recent years, additional trading venues have gained importance for retail investors: off-exchange trading on specific alternative trading systems (ATS). Bypassing the traditional exchanges, retail investors can communicate directly via their brokers with the issuers of retail derivatives or with market makers. Investors can request ‘a quote’ (meaning the sell and the buy price at which the counterparty is willing to trade) for securities to be bought or sold. The market maker or issuer sends a quote to this specific customer, who may then decide with a mouse click, within
a specified time interval (typically 3 to 8 seconds), whether he wants to buy (or sell) the securities at the posted price. This market model is called request-for-quote and belongs to the quote-driven trading mechanisms.[34] According to a current study by the authors, primarily retail derivatives are traded in these markets, accounting for around 80% of the total turnover. Due to the lack of transparency, one has to rely on expert estimates for all turnover figures, prices and volumes of executed trades in this market. In interviews with German brokers conducted in 2005, experts estimated the turnover via ATS in investment retail derivatives at around 50% of the total turnover of 41 billion Euro in this product category in 2004, whereas the turnover via ATS in levered retail derivatives accounts for around 60% of the total turnover of 52 billion Euro in 2004. In 2005, not only did the market itself see another double-digit increase, but trading via ATS also became ever more popular among private investors. Retail derivatives are not the only products traded via ATS: a number of German bonds, most German stocks, European blue chips, all NASDAQ and Dow Jones stocks as well as some selected Asian stocks can be traded via ATS, too.
Here, competition between trading places really becomes visible to the customer. However, it seems that this is mostly due to the price models of the online-brokers. Regardless of the real payment streams between brokers and issuers or market makers, the explicit transaction costs for the customer are lower at all analyzed brokers compared to a trade that is executed at an exchange.
• maxblue offers the greatest discount compared to its prices for an exchange-executed order. Generally, the broker commission is 5 € lower and no external allowances are charged.[35]
• DAB bank offers a substantial discount on the broker commission only on trades with three[36] selected issuers and market makers (so-called “star partners”). For these, there is a flat fee without external allowances. For all other issuers and market makers, the broker commission is the same and there is a fixed trading place fee of 0.80 €, which is substantially lower than the regional exchange fee of 2.90 € or the XETRA fee of 1.50 €. No other external allowances are charged.[37]
• comdirect and Cortal Consors just waive all external allowances but charge the same broker commission as if the order were routed to an exchange.[38]
[34] See e.g. (Holtmann 2004).
[35] The only external allowance is a so-called liquidity provision fee which is charged when trading with the market maker Lang & Schwarz.
[36] Out of more than 20.
[37] The only external allowance is a so-called liquidity provision fee which is charged when trading with the market maker Lang & Schwarz.
[38] Again, the only external allowance is a so-called liquidity provision fee which is charged when trading with the market maker Lang & Schwarz.
Additionally, all of the brokers listed above offer free-trade campaigns with selected issuers for a limited time span (usually one to three months) or marketing campaigns with reduced broker commissions for all trades via ATS, sometimes only for specific (community) groups of customers. To sum up, competition between exchanges on the one hand and specialized off-exchange markets on the other hand is much more visible to the retail investor than the competition among the different exchanges. The caps on the market maker commission at EUWAX and Smart Trading can be interpreted as a direct answer to the price competition described above.
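The request-for-quote mechanism described in this section can be sketched in a few lines. The Python example below is a simplified illustration, not a description of any real trading system: the class `Quote`, the function `accept`, the instrument name and the prices are hypothetical, and the validity of 5 seconds is merely one value from the range of 3 to 8 seconds mentioned above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quote:
    instrument: str
    bid: float          # price at which the counterparty buys
    ask: float          # price at which the counterparty sells
    issued_at: float    # time the quote was sent, in seconds
    valid_for: float    # validity window in seconds

def accept(quote, side, quantity, now):
    """Return the trade value if the customer clicks within the validity
    window, otherwise None (the quote has expired)."""
    if now - quote.issued_at > quote.valid_for:
        return None
    price = quote.ask if side == "buy" else quote.bid
    return price * quantity

# Hypothetical quote for a made-up instrument.
quote = Quote("hypothetical warrant", bid=1.98, ask=2.00, issued_at=0.0, valid_for=5.0)
print(accept(quote, "buy", 1000, now=3.0))    # accepted in time
print(accept(quote, "sell", 1000, now=9.0))   # too late, prints None
```

In this model the whole quantity is traded at the quoted price or not at all, so a single order never splits into several executions.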
8.5 Conclusion
German regional exchanges compete among themselves and against the fully electronic cash market trading system XETRA for the order flow of retail customers. Additional competition can be observed with the off-exchange markets. In this contribution, we first described what the fee models of these exchanges mean for direct market participants, such as access intermediaries like online-brokers. Next, we analyzed whether any price competition between the exchanges and other service providers that contribute to the execution and settlement of a trade is traceable and seems decision-relevant for the retail customer as an indirect market participant. Four established online-brokers in the German market were selected and their price models were analyzed with regard to the question: Do the price models of online-brokers distort or even blur existing price competition (if any) between German exchanges? Online-brokers were chosen for at least one major reason: they focus their business on providing information about and access to markets, order routing and account-related services. Thus, cross-subsidization will hopefully be much less pronounced than at other financial services providers, specifically branch-based banks. Finally, a brief outlook on the off-exchange markets, an increasingly recognized alternative for self-directed retail investors to execute an order, was provided.
All in all, the cost competition between the German exchanges can be classified as moderate. Only the participation fees at an exchange and the contractual note fees differ substantially between the considered trading venues. With respect to clearing and settlement services, small cost advantages can be stated for XETRA and FWB compared to the other regional exchanges due to the introduction of the CCP. Often, these costs are included in the market-independent broker commissions. In some cases, a fee is charged that merely differentiates between XETRA on the one hand and regional exchanges on the other. However, there are some differences in the price models of the four analyzed online-brokers. While Cortal Consors and DAB bank in particular do not charge external allowances except for the market maker commission, maxblue passes through all external allowances. Whether this results in different investment and trading behavior on the part of the retail investor is questionable, yet not observable.
To anticipate explicit transaction costs, it seems advisable to look not only at the tariffs of all the participating service providers but also at the market model of the exchanges, which defines the trading rules. For retail investors, who generally trade with comparably low volumes per transaction, it might be favorable to choose trading places where partial executions are avoided. Particularly on XETRA, partial executions are quite common and may result in substantially higher transaction costs compared to an order that is executed at a regional exchange, even if the commission and fees for a single transaction are lower on XETRA. As for market models, the quote-driven off-exchange markets in Germany have gained importance in recent years. Most established online-brokers meanwhile offer access not only to all German exchanges but also to at least one off-exchange market where retail investors can trade directly with issuers or market makers, bypassing the exchanges. The fees for trading off-exchange provide (partially substantial) discounts compared to orders routed to and executed at exchanges. Moreover, due to the quote-driven business model, partial executions are impossible. With respect to clearing and settlement, there are some new developments that might lead to further competition in the future. Since February 2004, clearing and settlement have been allowed at any licensed central depository for securities.[39] However, Clearstream Banking AG is still the only player in this market. In summary, inter-exchange competition is moderate with respect to costs and mostly does not become visible to the retail investor due to the distorting price models of online-brokers. However, price competition between exchanges on the one hand and ATS without governmental supervision on the other hand seems to be the real battlefield of the years to come. It remains to be seen which consequences the current re-regulation of the securities market and the implementation of the Markets in Financial Instruments Directive (MiFID) will bring.
References
Barber B, Odean T, Zheng L (2005) Out of Sight, Out of Mind: The Effects of Expenses on Mutual Fund Flows, forthcoming in: Journal of Business, 2005, vol. 78, no. 6, pp. 2095–2119
Battalio R (1997) Third Market Broker-Dealers: Cost Competitors or Cream Skimmers? in: Journal of Finance, 1997, vol. 52, pp. 341–352
Börse Berlin-Bremen (2004) Gebührenordnung der Börse Berlin-Bremen, as at 1.1.2004
Börse Duesseldorf (2004) Gebührenordnung
Börse München (2004a) Jahresbericht 2003/2004 – Mit Vielfalt und Wettbewerb Marktchancen erschließen und Zukunft sichern
[39] See o.V. (2004).
Börse München (2004b) Gebührenordnung für die Börse München. In Kraft seit 1.10.2003
Börse Stuttgart AG (2003) Jahresbericht 2002 der Baden-Württembergischen Wertpapierbörse
Börse Stuttgart AG (2004) Gebührenordnung, as at 1.1.2004
BrainTrade (2004a) Gebühren für XONTRO
BrainTrade (2004b) Kreditinstitute – Anmeldung XONTRO, as at 25.2.2004
Clayton MJ, Jorgensen BN, Kavajecz KA (2006) On the presence and market-structure of exchanges around the world, in: Journal of Financial Markets, 2006, vol. 9, pp. 27–48
Clearstream Banking AG (2003) Preisverzeichnis Inland für Kunden der Clearstream Banking AG, Frankfurt, as at 17.11.2003
Comdirect bank (2006) Preis- und Leistungsverzeichnis, as at 01.03.2006
Cortal Consors (2005) Preis- und Leistungsverzeichnis, as at 01.07.2005
DAB bank (2004) Preis- Leistungsverzeichnis der DAB bank, as at 01.11.2004
Deutsche Bank Privat- und Geschäftskunden AG (2006) Preis- und Leistungsverzeichnis, as at 01.01.2006
Deutsche Börse AG (2003) Preisverzeichnis ab 01.04.2003 für die Nutzung des elektronischen Handelssystems Xetra
Deutsche Börse AG (2004) Gebührenordnung für die Frankfurter Wertpapierbörse, as at 02.01.2004, FWB12
Deutsche Börse Systems, BrainTrade (2004) Kreditinstitute – Technische Anbindung, Version 4.0, as at 19.1.2004
Easley D, Kiefer N, O’Hara M (1996) Cream-Skimming or Profit Sharing? The Curious Role of Purchased Order Flow, in: Journal of Finance, 1996, vol. 51, pp. 811–833
Eurex Clearing AG (2004) Preisverzeichnis, as at 23.02.2004
Gomber P (2000) Elektronische Handelssysteme – Innovative Konzepte und Technologien, Physica, Heidelberg
Hammer T (2004) Kleiner Handel, großer Gewinn, in: Die Zeit, no. 22, 19.05.2004, p. 29
Hanseatische Wertpapierbörse Hamburg (2004) Gebührenordnung, as at March 2004
Harris LE (2002) Trading and Exchanges, Oxford University Press, New York, N.Y.
Holtmann C (2004) Organisation von Märkten – Market Engineering für den elektronischen Wertpapierhandel, Dissertation, Fachbereich Wirtschaftswissenschaften, Universität Karlsruhe (TH), Karlsruhe
Kalbhenn C (2002) Der unbeliebte Kontrahent, in: Börsenzeitung, no. 112, 14.06.2002, p. 8
Kirschner S (2003) Transaktionskosten der Börsen im deutschsprachigen Raum, in: Bankarchiv, no. 11, 2003, pp. 819–828
Kort K (2004) Internet-Broker setzen Kundenberater ein, in: Handelsblatt, no. 53, 16.03.2004, p. 20
Kundisch D, Krammer A (2006) Transaktionshäufigkeit als Indikator für die Angebotsgestaltung bei deutschen Online-Brokern, in: Der Markt, vol. 46, no. 1, 2006
Lüdecke T (1996) Struktur und Qualität von Finanzmärkten, Gabler, Wiesbaden
Niedersächsische Börse zu Hannover (2003) Gebührenordnung, as at 11.8.2003
Matutes C, Regibeau P (1988) Mix and Match: Product Compatibility without Network Externality, in: RAND Journal of Economics, vol. 19, pp. 221–234
Picot A, Bortenlänger C, Röhrl H (1996) Börsen im Wandel, Fritz Knapp, Frankfurt a.M.
o.V. (2004) Deutsche Börse ermöglicht Rivalen Abwicklung, in: Handelsblatt, no. 42, 01.03.2004, p. 22
Röhrl H (1996) Börsenwettbewerb: Die Organisation der Bereitstellung von Börsenleistungen, Deutscher Universitäts-Verlag, Wiesbaden
Rudolph B, Röhrl H (1997) Grundfragen der Börsenorganisation aus ökonomischer Sicht, in: Hopt KJ, Rudolph B, Baum H (eds.): Börsenreform – Eine ökonomische, rechtsvergleichende und rechtspolitische Untersuchung, Schäffer-Poeschel, Stuttgart, pp. 143–285
SEC (1999) On-Line Brokerage: Keeping Apace of Cyberspace. Securities and Exchange Commission (SEC), New York, N.Y.
Weinhardt C, Gomber P, Holtmann C, Groffmann HD (1999) Online-Brokerage als Teil des Online-Banking – Phasenintegration als strategische Chance, in: Locarek-Junge H, Walter B (eds.): Banken im Wandel: Direktbanken und Direct Banking, Bd. 18. Berlin-Verlag 2000, Berlin, pp. 99–120
CHAPTER 9
Strategies for a Customer-Oriented Consulting Approach
Alexander Schöne
9.1 Introduction
Current publications and projects in the field of banking show that banks have begun focusing on customer service again. Following a phase in which customer service was not considered a core business, and was even delegated to a separate company, many banks have once again started focusing their activities on this field. There are many reasons for this. For one, the success of various financial distribution companies has shown that customer service, when carried out consistently and comprehensively, does indeed offer high potential for financial service providers. Consulting that satisfies the customer often leads to long-term customer commitment, which forms a reliable and predictable basis for financial service providers. However, financial service providers face many challenges in the field of customer service, especially when it comes to legal requirements, such as the implementation of the EC broker guidelines or the German Pension Tax law. Lately, there is also a trend for banks to extend customer service to all of their customer segments. The focus has especially been placed on the private customer and wealthy private customer segments. Due to the low number of such customers, banks often neglected these segments and let specialized banks handle the matter. Recently, different models have shown how a successful consulting offer can be set up even with a low number of potential customers. The segment of retail customers has also turned into an interesting consulting field for banks due to more in-depth standardization and industrialization. In order to keep customers from going over to the competition, banks have to actively support the move back to enhanced customer service, not only for wealthy private customers but for all customer segments. Many banks have the wrong expectations for their consulting services. They only accompany the trend toward increased customer service with related marketing activities. In reality, however, such an approach is doomed to fail. Even the wrong
approach toward customers can cause customers to lose interest in the consulting offered by a bank, for example if a bank focuses on acquiring new customers instead of consistently using existing relationships. All in all, it can be said that consulting is mostly just appearance at the moment and that it is rarely actually lived and practiced in banks. To successfully move toward customer-oriented consulting, certain prerequisites have to be met in many areas. In addition to organization and processes, consulting software is also an important aspect of this change. Concentrating efforts on a single area, organization or software, will generally not achieve the desired effect. Since the interaction between the two fields is quite significant, the success of this movement can only be ensured when both fields are addressed. And it is not only in the customers’ interest, but in the bank’s as well, to drive this movement forward. The remainder of the chapter will discuss two central requirements for customer-oriented consulting. First, it will highlight which methods are available for enhancing the functional quality of consulting services. Then it will introduce the structures that modern consulting software should provide, especially to enable quick reactions to changing requirements in the market.
9.2 Qualitative Customer Service Through Application of Portfolio Selection Theory
One focus when practicing qualitative customer support is the analysis and derivation of a customer-specific recommendation. Customers want to be acknowledged as individuals regardless of their financial situation, and do not want to be given impersonal support. Thus, recommendations must be given special significance when providing support. However, the reality of the situation is quite different. At the moment, two approaches dominate consulting:
• The standard recommendation: The institute defines a recommendation for a certain type of customer. During the consultation, the consultant assigns the customer to one of the customer types and uses the recommendations defined for that type. This allows optimization tools and specialists in the back office to optimize the recommendation ahead of time. However, such recommendations do not necessarily address the customer’s specific situation.
• The subjective recommendation: During the consultation, the consultant puts together an individual recommendation for the customer using the bank’s product portfolio. Even though the customer’s situation is given special attention, the consultant cannot really optimize the recommendation due to the complex interactions between the individual investment instruments.
Both approaches have advantages and disadvantages, yet neither really works in the customer’s best interest. Special consulting tools, such as those already used in the back office for enhancing recommendations, are better suited for customer service. In terms of quality, they should provide the same benefits as in the back office, but should be easier to use.
9.2.1 Markowitz’ Portfolio Selection Theory
One of the most widespread optimization methods for determining the best asset structure is Markowitz’ portfolio selection theory. Three parameters form the basis of the optimization in portfolio selection theory:
• Return, meaning the performance of an investment.
• Risk, meaning the fluctuation margin of an investment around the expected return.
• Correlation, meaning the interdependency of the performance of two investments.
Whereas return and risk are pretty common terms for investors, the term “correlation” is not very widely used in consulting. In portfolio selection theory, however, the correlation plays a central role. It describes the connection between two investment instruments. For example, when considering the development of the DAX compared to the Dow Jones, a tendency toward a similar development can be seen. If the DAX goes up, the Dow Jones tends to go up as well, and vice versa. This is called a positive correlation. However, if the DAX is compared to the development of the price of gold, a counter-development will be seen. This is called a negative correlation.
Each investment instrument is described using these three parameters. Risk and return are visualized in a risk-return diagram. The correlation does not come into play until two or more investment instruments are combined. For example, if two investments, let’s call them A and B, were purchased for the same amount, one would intuitively expect the resulting risk-return combination to be exactly in the middle between the two investments. However, this is generally not the case. Whereas the return of the combination does lie in the middle, its risk is generally lower than the average of the two individual risks, because the two investments do not move in perfect lockstep. Thus, diversification, i.e. the distribution of the assets over different investment instruments, is very important. It is used not only to place the risk on numerous “shoulders”, but more specifically to reduce the overall risk. The central message of portfolio selection theory, for which Harry M. Markowitz received the Nobel prize in economics in 1990 (shared with two other laureates), is based on the recognition that all possible combinations of a number of investment instruments form a contiguous area in the risk-return diagram. To select the best portfolio, one only has to consider those combinations whose resulting risk-return position lies on the upper edge of this area. All other portfolios are inefficient, meaning that they are not optimal in the sense that other combinations can
be found that provide a higher return for the same risk or a lower risk for the same return. For asset consulting, this means that the consultant does not have to consider all possible combinations of investment instruments, but can concentrate on those that lie on the edge of the area, the so-called efficiency curve. The selection of the best portfolio for an individual customer situation is thus based on different possibilities:
• The minimum risk portfolio is the portfolio with the lowest risk of all possible combinations. It is especially suited for customers who want the lowest risk possible.
• The optimal risk portfolio is the portfolio with the lowest risk for a given return. It is especially suited for customers who have a certain target return.
• The optimal return portfolio is the portfolio with the greatest return for a given risk. It is especially suited for customers who do not want to exceed a given risk.
• The portfolio with the maximum risk premium is especially suited for customers who want to invest economically and who do not define a target risk or return but instead want a portfolio that delivers the highest return above the current market interest rate in relation to the risk taken.
Using these four portfolios, an optimal portfolio can be selected for every customer situation in the field of asset consulting. No other combination represents a sensible alternative for a customer.
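The diversification effect for two investments A and B follows from the standard two-asset formulas of portfolio theory: the portfolio return is the weighted average of the individual returns, while the portfolio variance is w_A²σ_A² + w_B²σ_B² + 2·w_A·w_B·ρ·σ_A·σ_B, with weights w, risks σ and correlation ρ. The Python sketch below evaluates these formulas; the returns, risks and correlations used are invented for illustration and are not taken from the chapter.

```python
import math

def two_asset_portfolio(w_a, ret_a, ret_b, risk_a, risk_b, corr):
    """Expected return and risk (standard deviation) of a portfolio that
    holds the share w_a in investment A and 1 - w_a in investment B."""
    w_b = 1.0 - w_a
    ret = w_a * ret_a + w_b * ret_b
    variance = ((w_a * risk_a) ** 2 + (w_b * risk_b) ** 2
                + 2.0 * w_a * w_b * corr * risk_a * risk_b)
    return ret, math.sqrt(variance)

# Assumed example values: A = 4% return / 5% risk, B = 8% return / 20% risk,
# both bought for the same amount (weight 0.5 each).
for corr in (1.0, 0.3, -0.5):
    ret, risk = two_asset_portfolio(0.5, 0.04, 0.08, 0.05, 0.20, corr)
    print(f"correlation {corr:+.1f}: return {ret:.2%}, risk {risk:.2%}")
```

With a correlation of 1 the risk equals the average of the two individual risks; for any lower correlation it falls below that average while the return stays unchanged.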
Figure 9.1. Selection of the optimal portfolio (x-axis: risk; y-axis: return; marked portfolios: minimum risk portfolio, maximum risk premium, optimal risk portfolio, optimal return portfolio; example values: target return 7.00%, target risk 21.20%)
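The four portfolios described above can be illustrated with a small sketch. It does not use the optimization algorithms referred to in this chapter; instead it simply enumerates all long-only weight combinations of three asset classes on a 1% grid and picks the four portfolios from that list. All input values (returns, risks, correlations, targets and the market interest rate) are hypothetical and chosen only for illustration.

```python
import math
from itertools import product

# Hypothetical inputs for three asset classes; illustrative values only,
# not the benchmark data behind Figure 9.1.
NAMES = ("stocks", "bonds", "real_estate_funds")
RETURNS = (0.08, 0.04, 0.045)   # expected annual returns
RISKS = (0.20, 0.05, 0.04)      # standard deviations of the returns
CORR = (
    (1.0, 0.2, 0.1),
    (0.2, 1.0, 0.0),
    (0.1, 0.0, 1.0),
)


def portfolio_return(weights):
    """Weighted average of the expected returns."""
    return sum(w * r for w, r in zip(weights, RETURNS))


def portfolio_risk(weights):
    """Standard deviation from weights, risks and pairwise correlations."""
    variance = sum(
        weights[i] * weights[j] * RISKS[i] * RISKS[j] * CORR[i][j]
        for i in range(len(weights))
        for j in range(len(weights))
    )
    return math.sqrt(variance)


def candidates(step_percent=1):
    """All long-only combinations on a percentage grid: (weights, return, risk)."""
    grid = range(0, 101, step_percent)
    result = []
    for a, b in product(grid, grid):
        if a + b <= 100:
            weights = (a / 100, b / 100, (100 - a - b) / 100)
            result.append((weights, portfolio_return(weights), portfolio_risk(weights)))
    return result


def select_portfolios(target_return, target_risk, market_rate):
    """Pick the four portfolios described in the text from the grid."""
    pool = candidates()
    return {
        "minimum_risk": min(pool, key=lambda p: p[2]),
        "optimal_risk": min((p for p in pool if p[1] >= target_return), key=lambda p: p[2]),
        "optimal_return": max((p for p in pool if p[2] <= target_risk), key=lambda p: p[1]),
        "maximum_risk_premium": max(pool, key=lambda p: (p[1] - market_rate) / p[2]),
    }


selection = select_portfolios(target_return=0.06, target_risk=0.10, market_rate=0.03)
for label, (weights, ret, risk) in selection.items():
    print(f"{label:22s} weights={weights} return={ret:.4f} risk={risk:.4f}")
```

A grid search is only practical for a handful of asset classes; it is used here because it keeps the selection rules visible. If a target cannot be reached by any combination, the corresponding `min` or `max` call raises a `ValueError`.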
9 Strategies for a Customer-Oriented Consulting Approach
9.2.2 Application to Real-World Business
How these insights are used can be shown through a few simple examples. In these examples it is important to remember that the aim of asset consulting is the strategic selection of investment instruments. Thus, the consulting is done at the asset class level, such as European stocks, Eurozone bonds, etc., and not at the level of specific securities. Relatively few and relatively general asset classes were selected for the examples. In actual use, however, any number of asset classes can be used, with any amount of detail. The starting point is a customer with an asset structure as shown in Table 9.1. The customer’s goal is to reduce his risk. If possible, he would still like to earn the same return as before. To keep the example simple, we will assume that the institute does not offer any other asset classes. First, the customer is shown a visualization of the positions of the individual asset classes and of his own assets in the risk-return diagram. The numbers shown in Figure 9.2 are based on benchmarks of the last 10 years. A return of 6.6% with a risk of 12.4% is calculated for the customer’s assets. Next, the efficiency curve is calculated and displayed. It can be seen immediately that the customer can clearly improve his situation by selecting the optimal risk portfolio (cf. Figure 9.2). If the return of 6.6% is retained, a shift can reduce the risk by more than half (to 6.0%). The structure of the corresponding portfolio is listed in Table 9.2. Due to space limitations, we are not able to describe the results for all portfolios on the efficiency curve. However, an analysis of their composition makes clear that in the lower risk range, with a return between 4% and 6.5%, the combination is mostly determined by the relationship between the public real estate funds and the bonds. In the upper risk range, with a return between 6.5% and 8%, the relationship between the real estate funds and the stocks is the determining factor.
Even though public real estate funds have a risk-return structure similar to that of bonds (cf. Figure 9.2), they are a sensible addition to any portfolio because of their correlation with the other asset classes. An analysis of the optimal portfolios on the efficiency curve also shows that not all eight asset classes contribute the same amount to the result. Limiting the consulting to three investment instruments (European stocks, Eurozone bonds and public real estate funds) reduces the complexity of the consulting without a significant loss of quality for the customer. This is illustrated by comparing the efficiency curves of all eight and of the three selected asset classes (cf. Figure 9.3). The same holds for product portfolios with a noticeably higher number of products. Although 30 or more asset classes are generally offered in consulting, fewer than ten are usually enough for high-quality customer support. All other asset classes unnecessarily increase the complexity for the consultant as well as for the customer. Strong consulting concentrates on the relevant products and asset classes.
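Why an asset class with a similar risk-return profile can still improve a portfolio follows from the standard two-asset risk formula, in which the correlation enters directly. The sketch below mixes two hypothetical asset classes, each with a risk of 5%, in equal parts; the figures are illustrative and are not the benchmark values behind Figure 9.2.

```python
import math


def two_asset_risk(weight_a, risk_a, risk_b, correlation):
    """Risk (standard deviation) of a mix of two asset classes."""
    weight_b = 1.0 - weight_a
    variance = (
        (weight_a * risk_a) ** 2
        + (weight_b * risk_b) ** 2
        + 2 * weight_a * weight_b * risk_a * risk_b * correlation
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


# Two hypothetical asset classes with the same individual risk of 5%.
for rho in (1.0, 0.5, 0.0):
    mixed = two_asset_risk(0.5, 0.05, 0.05, rho)
    print(f"correlation {rho:.1f}: portfolio risk {mixed:.4f}")
```

With perfect correlation the mix keeps the full 5% risk; the lower the correlation, the lower the combined risk, down to about 3.5% at a correlation of zero, although neither component has become less risky on its own.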
Figure 9.2. Efficiency curves in the risk-return diagram (x-axis: risk, 0.00% to 30.00%; y-axis: return, 0.00% to 10.00%; plotted points: USA stocks, European stocks, World stocks, Asian stocks, public real estate funds, international bonds, corporate bonds, Eurozone bonds, original structure, recommended structure)
Table 9.1. Original asset structure
Asset Class
European stocks
USA stocks
Asian stocks
World stocks
Eurozone bonds
International bonds
Corporate bonds
Public real estate funds

Table 9.2. Optimized asset structure
Asset Class                  Amount
European stocks              9,760 EUR
USA stocks                   13,680 EUR
Public real estate funds     46,560 EUR
Figure 9.3. Comparison of the efficiency curves (x-axis: risk, 0.00% to 30.00%; y-axis: return, 0.00% to 10.00%; curves: eight asset classes, three asset classes)
9.2.3 Consideration of Existing Assets
During an interview, customers often decide to restructure only part of their assets and not their entire portfolio. The correct approach in this kind of situation will be explained using the data from the previous example. The customer has 70,000 EUR from an inheritance and would now like to invest it. However, the existing assets totaling 70,000 EUR (see Table 9.1) should not be touched. The target return is 7%. Consultant A comes up with the following solution: since the existing 70,000 EUR will not be restructured, he simply ignores them in his suggestion. His task is then the same as in the first example: 70,000 EUR are to be distributed among the eight available asset classes. He finds the optimal risk portfolio with a return of 7% on the efficiency curve and recommends the following investments: 12,177 EUR in European stocks, 25,867 EUR in USA stocks and 31,956 EUR in public real estate funds. Consultant B, however, thinks a little further ahead. He knows that even if the original assets are not restructured, they still have a significant influence on the overall result because of their correlation with the newly added asset classes. Combined with the existing assets, which return 6.6%, consultant A’s proposal yields a return of only 6.8% on the overall assets of 140,000 EUR. Consultant A would therefore not achieve the customer’s specified goal. Consultant B determines the correct efficiency curve for the total assets of 140,000 EUR, taking into account that 70,000 EUR are already firmly allocated to asset classes and 70,000 EUR can be freely distributed. In order to achieve the specified target of 7% while minimizing the risk, the customer would have to invest 48,000 EUR in USA stocks and 22,000 EUR in public real estate funds.
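The return figures in this example can be checked with simple arithmetic. The sketch below uses the amounts and returns given in the text (70,000 EUR of existing assets returning 6.6%, 70,000 EUR of new money, a target of 7%); the function names are illustrative. It covers only the return side, since the risk of the combined assets additionally depends on the correlations, which are not reproduced here.

```python
def blended_return(existing_amount, existing_return, new_amount, new_return):
    """Return on the total assets when existing and new assets are combined."""
    total = existing_amount + new_amount
    return (existing_amount * existing_return + new_amount * new_return) / total


def required_new_return(existing_amount, existing_return, new_amount, target_return):
    """Return the new money must earn so that the total assets reach the target."""
    total = existing_amount + new_amount
    return (target_return * total - existing_amount * existing_return) / new_amount


# Consultant A: new money invested at 7%, existing assets ignored.
overall_a = blended_return(70_000, 0.066, 70_000, 0.07)
print(f"overall return with consultant A's proposal: {overall_a:.3%}")

# Consultant B: what the new money has to earn for 7% on all 140,000 EUR.
needed_b = required_new_return(70_000, 0.066, 70_000, 0.07)
print(f"return required on the new 70,000 EUR: {needed_b:.3%}")
```

The first value reproduces the 6.8% stated in the text. The second shows that the newly invested 70,000 EUR would have to return 7.4% for the total assets to reach 7%, which is why consultant B’s proposal has a different composition than consultant A’s.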
The results of this example also apply to a greater number of asset classes in consulting. A new amount to be invested should never be considered apart from the overall assets. The existing assets have a significant influence on the result, even if they themselves are not changed. The “wrong” result not only falls short of the customer’s specifications, but is inefficient as well, meaning that a better combination exists which offers a lower risk at the same return. The number of recommended asset classes is also visibly higher with the “wrong” recommendation. If the existing assets are correctly integrated into the consulting, the recommendation is automatically reduced to the central investment instruments, since the existing assets already have a basic risk combination.
9.2.4 Illiquid Asset Classes and Transaction Costs
The previous examples focus on liquid asset classes alone. Just as existing security assets have to be integrated into the analysis, so do illiquid asset classes, such as real estate or equity investments. They are easy to handle, i.e. in general they are treated as a fixed block during the consulting, but they still influence the total risk and the total return of the customer’s assets, usually significantly. Transaction costs can be disregarded for the liquid asset classes without a major influence on the result, since they are generally minor and are the same for all liquid asset classes. Studies have shown that transaction costs no longer play a role for liquid asset classes when the investment volume is 10,000 EUR or more. However, transaction costs cannot be ignored when dealing with illiquid asset classes. They can be considerable, in which case they do have a significant influence on the result. Thus, in order not to exclude illiquid asset classes from the recommendation, a planning horizon has to be defined, which is then used for the optimization. However, Markowitz’ original portfolio theory and the related calculation algorithms fail when it comes to transaction costs and multi-period planning horizons. Alternative approaches exist which solve the problem using modern algorithms. These algorithms do not offer as high a performance as conventional algorithms, which means they are only partially suitable for use in customer service. The possible uses of these algorithms will be explained in the following example. A customer has the asset structure shown in Table 9.3. The variable transaction costs are 1.0% for the asset class “money market”, 1.5% for “securities”, 3.5% for “real estate” and 5% for “closed participations”. Savings deposits are not subject to transaction costs. Transaction costs are added to the purchase price when buying and subtracted from the revenue when selling.
The customer’s goal is an optimal return portfolio, meaning he wants to retain the risk, but to maximize his return over a period of 5 years. The best restructuring recommendation for such customers is shown in Table 9.4.
Table 9.3. Asset structure before optimization
Asset class              Financial instrument
Liquidity                Savings deposits
                         Money market
Securities stocks        European stocks
                         USA stocks
Securities annuities     Eurozone bonds
                         International bonds
Illiquid bonds           Real estate
                         Equity investments
These shifts create transaction costs of 4,903 EUR. Yet after 5 years the customer can count on total assets of 439,479 EUR, compared to 432,570 EUR for the structure before the shift, and without an increased risk. Thus, even after the transaction costs, he still increases his assets by about 6,900 EUR (439,479 EUR − 432,570 EUR = 6,909 EUR). With a planning horizon of 10 years, the customer could even achieve an advantage of over 25,000 EUR.
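The cost rule used in this example (costs are subtracted from the revenue when selling and added to the purchase price when buying) and the role of the planning horizon can be sketched as follows. The cost rates are those quoted in the text; the annual returns of 4% and 6% in the usage example are hypothetical, and the sketch is a simple single-shift comparison, not the multi-period optimization algorithm mentioned above.

```python
# Variable transaction cost rates as quoted in the text.
COST_RATES = {
    "savings deposits": 0.0,
    "money market": 0.010,
    "securities": 0.015,
    "real estate": 0.035,
    "closed participations": 0.05,
}


def sale_proceeds(market_value, asset_class):
    """Costs are subtracted from the revenue when selling."""
    return market_value * (1.0 - COST_RATES[asset_class])


def purchasable_value(cash, asset_class):
    """Costs are added to the purchase price when buying."""
    return cash / (1.0 + COST_RATES[asset_class])


def shift(market_value, from_class, to_class):
    """Return (value invested in the new class, total transaction costs)."""
    invested = purchasable_value(sale_proceeds(market_value, from_class), to_class)
    return invested, market_value - invested


def worth_after(value, annual_return, years):
    """Value after compounding for a number of years."""
    return value * (1.0 + annual_return) ** years


def break_even_years(market_value, from_class, to_class,
                     return_from, return_to, max_years=50):
    """First full year in which the shifted position is worth more, or None."""
    invested, _ = shift(market_value, from_class, to_class)
    for years in range(1, max_years + 1):
        if worth_after(invested, return_to, years) > worth_after(market_value, return_from, years):
            return years
    return None


invested, costs = shift(10_000, "real estate", "securities")
print(f"invested after costs: {invested:.2f} EUR, costs: {costs:.2f} EUR")
# Hypothetical returns: 4% for the old position, 6% for the new one.
print("break-even after years:",
      break_even_years(10_000, "real estate", "securities", 0.04, 0.06))
```

The sketch ignores the costs of a later sale at the end of the horizon and treats returns as certain. It is only meant to show why a shift that loses money at first can still pay off once the planning horizon is long enough.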
9.2.5 Conclusions for Daily Use
Markowitz’ portfolio selection theory is a model for optimizing the asset situation. An optimal portfolio combination is created based on the parameters return, risk and correlation. The existing high-performance algorithms can be used in customer service as well. Thus, the portfolio selection theory is a sensible basis for neutral and individual customer support. Consequences for the consulting can also be derived from the application of the portfolio selection theory. The customer consulting should not be based on real products, but on general asset classes. In addition, the underlying asset classes can be reduced to the central classes without a significant disadvantage for the customers. A reduction of the offered product range to central asset classes is also
important in consideration of the transaction costs. The consulting must consider the existing assets; otherwise it will produce incorrect results. Since the consulting is performed at the asset class level and not for real products, only very little data is required for the actual application in customer support. The values for return, risk and correlation form the basis for the results. The back office should provide these values and update them on a regular basis, usually quarterly. Portfolio selection theory covers only one aspect of the consulting, meaning that consultants should not use it as an independent tool, but should integrate it into their consulting software. These requirements can easily be covered by web-based applications. The data is stored and the calculations are performed on a high-performance server, whereas the results are displayed on the consultant’s local computer. Web-based applications also offer the flexibility of integrating the functionality of portfolio selection theory into an existing application as a specialized calculation core, or of using it as a comprehensive, specialized consulting solution. However, portfolio selection theory has faced some criticism. Many studies have tried to disprove or confirm the assumptions of the theory. Without going into detail, the following can be said: a theory does not portray reality, it only offers an approach for better understanding reality. Portfolio selection theory fulfills that purpose. It should not be overrated and used as the only basis for consulting. Rather, it is a useful approach for neutral and objective customer support.
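The small amount of input data mentioned above (return, risk and correlation per asset class, delivered by the back office and refreshed roughly every quarter) could be held in a structure like the following. This is a minimal sketch under stated assumptions: the class name, the field names and the 92-day staleness threshold are hypothetical and do not describe an existing system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AssetClassParameters:
    """One parameter delivery from the back office (hypothetical structure)."""
    as_of: date
    names: tuple
    returns: tuple
    risks: tuple
    correlations: tuple  # square matrix as a tuple of rows

    def is_stale(self, today, max_age_days=92):
        """True if the delivery is older than roughly one quarter."""
        return (today - self.as_of).days > max_age_days

    def problems(self):
        """List of consistency problems; empty if the delivery looks usable."""
        n = len(self.names)
        found = []
        if len(self.returns) != n or len(self.risks) != n:
            found.append("returns and risks need one value per asset class")
        if any(risk < 0 for risk in self.risks):
            found.append("risks must not be negative")
        if len(self.correlations) != n or any(len(row) != n for row in self.correlations):
            found.append("correlation matrix must have one row and column per asset class")
            return found
        for i in range(n):
            if abs(self.correlations[i][i] - 1.0) > 1e-9:
                found.append(f"correlation of {self.names[i]} with itself must be 1")
            for j in range(i + 1, n):
                value = self.correlations[i][j]
                if not -1.0 <= value <= 1.0:
                    found.append(f"correlation {self.names[i]}/{self.names[j]} outside [-1, 1]")
                if abs(value - self.correlations[j][i]) > 1e-9:
                    found.append(f"correlation {self.names[i]}/{self.names[j]} is not symmetric")
        return found


delivery = AssetClassParameters(
    as_of=date(2006, 1, 2),
    names=("stocks", "bonds"),
    returns=(0.08, 0.04),
    risks=(0.20, 0.05),
    correlations=((1.0, 0.2), (0.2, 1.0)),
)
print(delivery.problems(), delivery.is_stale(date(2006, 6, 1)))
```

Checking the delivery once on the server keeps inconsistent parameters out of every consultation that uses them.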
9.3 Support of Customer-Oriented Consulting Using Modern Consulting Software
As explained at the start of this chapter, functional content and the corresponding technical support by suitable consulting software are inseparable components of customer-oriented consulting. The second section of this article describes the software-related aspects of customer-oriented consulting. It deals with the fundamental architecture of the corresponding software and introduces possibilities for its targeted application, even in mobile sales channels.
9.3.1 Architecture of a Customer-Oriented Consulting Software
Until recently, comprehensive financial planning solutions, which consider all aspects of a customer’s situation at once, were considered the alpha and omega of consulting. However, actual use has shown that the complexity of comprehensive financial planning is too great for it to be used efficiently in all customer segments. The effort and resulting costs are not justified for many customers.
Moreover, comprehensive financial planning consulting is inappropriate in larger segments with retail banking and private banking customers. Recently, the trend has been moving more toward topic-oriented systems, which achieve comprehensive consulting by other means. A strong consulting solution meets several requirements:
• It must consider the institute’s wide range of customers and must be able to flexibly address individual consulting situations.
• At the same time, it must enable neutral, high-quality consulting without omitting product sales.
• It must strike a balance between a uniform approach and topic-oriented consulting.
• In addition, it must be possible to use it for all sales channels.
The specific structure of such a consulting solution can be seen in Figure 9.4.
Figure 9.4. Example of the architecture of a customer-oriented consulting software (components: consulting levels Self Assessment, Quick Analysis and Complete Analysis; CRM/Workflow; Consulting Modules A to D; Back Office; Central Database; Market Data)
Modern consulting software enables the flexible adjustment to individual consulting situations. Each inquiry made to the customer consulting is processed in a separate module. These inquiries may address topics such as pension schemes, construction financing, asset analyses or asset replacements. This reduces the complexity of the consulting, since only the data relevant for a specific topic has to be entered. However, standalone data must be avoided in order to ensure the holistic approach of the consulting. The entered data must be stored centrally. Thus, all consulting modules have to be based on the same database and data, such as income,
asset values, marital status, etc. only have to be entered once, after which they are available in all modules. Each additional consulting increases the amount of data, so that consultants can constantly gain more information about their customers. The consulting time and effort will still be reduced, since only topic-relevant data is used for each module. Thus, the focus remains on the customer and an efficient consulting is ensured. Not all consulting has to take place on the same level. Modern consulting software offers the possibility to individually address customers by selecting a consulting level. The lowest consulting level could be a short analysis offering easy access to complex topics and requiring only four or five entries. The next level for example could be a first qualified recommendation achieved by using the quick analysis. In doing so, even a minor amount of input data can provide an overview of the consulting situation. Tax-related effects are also considered in a clear manner. If the consultant realizes that there is a greater need for consulting, the consulting can be intensified at any time up to a complete, comprehensive data collection of the overall customer situation in the top consulting level. When using different consulting levels, it is important that the same module has to be active in the background regardless of the level of consulting. This allows all customers to be treated with the same quality – and with the investment of a justifiable amount of time. The central component of modern consulting software is an integrated action and product recommendation, which generates a customer-specific, high-quality recommendation independent of the consultant. Different recommendation logics can be selected for doing so: static decision trees, optimization algorithms (such as Markowitz’ portfolio selection model described above), fuzzy algorithms or knowledge-based systems. 
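The principle that every module works on the same central data, so that a value entered once is available everywhere, can be sketched in a few lines. The class names, field names and customer identifier below are hypothetical and only illustrate the idea; they do not describe a particular product.

```python
class CentralCustomerStore:
    """Single place where customer data is kept for all consulting modules."""

    def __init__(self):
        self._records = {}

    def update(self, customer_id, **fields):
        self._records.setdefault(customer_id, {}).update(fields)

    def get(self, customer_id):
        # Return a copy so that modules cannot change the store by accident.
        return dict(self._records.get(customer_id, {}))


class ConsultingModule:
    """Topic-oriented module that only asks for the data relevant to its topic."""

    def __init__(self, topic, required_fields, store):
        self.topic = topic
        self.required_fields = tuple(required_fields)
        self.store = store

    def missing_fields(self, customer_id):
        known = self.store.get(customer_id)
        return [name for name in self.required_fields if name not in known]

    def enter(self, customer_id, **fields):
        self.store.update(customer_id, **fields)


store = CentralCustomerStore()
pension = ConsultingModule("pension scheme", ("income", "marital_status", "age"), store)
assets = ConsultingModule("asset analysis", ("income", "asset_values"), store)

# First consultation: data is entered in the pension module only.
pension.enter("customer-1", income=52_000, marital_status="married", age=41)

# A later asset analysis only has to ask for what is still unknown.
print(assets.missing_fields("customer-1"))
```

Because both modules share one store, the asset analysis in this example only needs the asset values; income was already captured during the pension consultation.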
To convince the customer that the recommendation is right, the recommendation should be derived in three steps:
• Neutral structure recommendation based on asset classes.
• Conversion proposal based on the existing products.
• Reasoning and portrayal of the products.
In contrast to a mere product recommendation, the first step generates a recommendation for asset classes, i.e. for neutral “products”. During an asset analysis, the customer would, for example, receive a recommendation as to which asset classes he should invest in and which he should reduce. Specific products from the institute’s product portfolio should not be introduced into the recommendation until the second step. A conscious transition from a neutral recommendation to specific products is easier for customers to understand. To increase the customers’ trust in the recommendation, the transition is explained for each customer during the third step. The customer is informed as to why each of the proposed products supports his optimal strategy. Interfaces to, for example, an institute’s CRM, existing portfolio and/or market data systems, as well as to the subsequent administration, can be used to ensure that the consulting software is not a standalone system but is efficiently integrated into the institute’s consulting process and workflow.
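The three-step derivation of a recommendation can be sketched as a small pipeline. The target structure below uses the amounts of the optimized structure in Table 9.2; the customer’s current holdings, the product names and all function names are hypothetical and serve only as an illustration.

```python
def structure_recommendation(current, target):
    """Step 1: change per asset class (positive = buy, negative = sell)."""
    classes = sorted(set(current) | set(target))
    changes = {c: target.get(c, 0) - current.get(c, 0) for c in classes}
    return {c: amount for c, amount in changes.items() if amount != 0}


def conversion_proposal(changes, catalog):
    """Step 2: attach one product of the institute to each asset class to buy."""
    return [
        {"asset_class": c, "product": catalog[c], "amount": amount}
        for c, amount in changes.items()
        if amount > 0 and c in catalog
    ]


def reasoning(proposal):
    """Step 3: explain each product by the asset class it implements."""
    return [
        f"{item['product']} covers the recommended {item['amount']:,} EUR "
        f"in {item['asset_class']}."
        for item in proposal
    ]


# Hypothetical current holdings of the customer (EUR).
current = {"European stocks": 30_000, "Eurozone bonds": 40_000}
# Target structure taken from Table 9.2 (EUR).
target = {
    "European stocks": 9_760,
    "USA stocks": 13_680,
    "Public real estate funds": 46_560,
}
# Hypothetical product catalog of the institute.
catalog = {
    "USA stocks": "Example US Equity Fund",
    "Public real estate funds": "Example Real Estate Fund",
}

changes = structure_recommendation(current, target)
proposal = conversion_proposal(changes, catalog)
for line in reasoning(proposal):
    print(line)
```

Keeping the asset-class step separate from the product step means the neutral recommendation can be shown to the customer before any product name appears.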
9.3.2 Mobile Consulting as a Result of Customer-Orientation
Customer-oriented consulting must focus on a customer’s goals. This means that consulting has to be available not only during regular banking hours in an office, but also at the customer’s home, for example. Such services have become more important not only for wealthy private customers, but also for “normal” bank customers. More and more banks have already taken advantage of this possibility or are planning to do so. However, a frequent error is to give customers different consulting services at home than in the bank’s offices. For cost-related reasons, consultations in a customer’s home environment should not be treated as small-talk sessions or as a way to meet informational needs, but as full-value consultations offering additional sales opportunities. Solid consulting must offer the customer consistent, high-quality support regardless of the location. The consulting software used must be suitable for mobile as well as stationary sales. However, this is seldom the case. Even consulting software which has been installed on a laptop and can thus be used for mobile consulting presents the financial institute with high financial and organizational hurdles. The core problem in such cases is the data storage. Customer data must be exchanged between the consultant’s laptop and the central solution. This requires a high replication effort, especially when the central solution has been integrated into the institute’s in-force business system or the CRM. Thus, many institutes decide not to use a central consulting solution, but to work with the solutions installed on the consultant’s laptop both in the office and at the customer’s location. This means that the customer data is not stored centrally, but as standalone data on the consultant’s laptop. This, in turn, prevents a consistent enforcement of the sales strategy.
Regardless of whether data is exchanged between mobile and stationary solutions or is only stored in the mobile solution, the institute will always have to perform organizational tasks. Software updates, market and product data, as well as recommendation data have to be distributed to the consultant laptops on a regular basis. Additional hidden costs arise from the increased maintenance and hardware requirements of the consultant laptops; these are often ignored when calculating the profitability of implementing a consulting solution. The use of conventional consulting solutions in stationary and mobile business presents the institute with technical, organizational and financial challenges and makes the consultant’s daily work with the consulting solution more difficult.
9.3.2.1 Web-Based Consulting Software as a Basis for Stationary and Mobile Sales
Web-based software offers many advantages for mobile and stationary sales. The software is installed on a central server and the consultants can access it using
a common internet browser, such as Microsoft Internet Explorer or Netscape Navigator. It does not matter whether the server is located in an external computer center, the company headquarters or a branch office. Consultants can access the solution from any workplace using the browser, without having to install additional software. Access is possible from the intranet or from an external, secured network. When at the customer’s location, consultants can use, for example, a GPRS, UMTS or WLAN connection to access the consulting software from their laptops. In doing so, they always have access to the latest data. Technically, this type of connection is similar to an internet connection using a dial-up telephone line. In contrast to common telephone connections, however, the duration of the connection is irrelevant: the user only pays for the amount of data transferred. Regardless of whether the consultation lasts 30 minutes or 3 hours, the decisive factor for the costs is only the amount of data transferred during the meeting. The advantages of mobile access are obvious. The requirements placed on the laptop are minimal: only an internet browser needs to be installed. Moreover, the institute does not incur any costs for maintaining the mobile consulting solution. Consultants always have access to the latest customer, market and product data. The interfaces available in the consulting software, such as those to CRM or market data systems, can be used in mobile consulting as well, without having to install additional programs on the consultant’s laptop. These advantages are often weighed against expectations of high costs and the fear of security problems. However, both of these concerns can be resolved in actual use.
9.3.2.2 Minimal Additional Costs in Mobile Consulting
Modern consulting software focuses on keeping the required data transfers as low as possible.
For example, graphics are no longer generated and transferred in a standard, resource-intensive graphic format, but as so-called vector graphics. These offer the advantage that they can be scaled as needed without loss of quality and require only a minimum of memory. The majority of screens in consulting are text-oriented, which significantly reduces the data volumes that have to be transferred. Another advantage of web-based consulting software lies in the fact that only the results have to be transferred from the server to the consultant’s laptop, regardless of the complexity of the calculations or the quantity of required data. It does not matter whether a specific analysis is performed for a retail customer with 10 asset values or a wealthy private customer with 200 asset values. All necessary data is located on the central server, where the calculations are performed and the concluding page with the results is generated. Only the relatively small result page is transferred. Thus, the performance of the mobile consulting solution no longer depends on the laptop’s hardware, but on the bandwidth of the transmission. This means that the speed of mobile consulting is comparable to the speed of stationary consulting.
Table 9.5. Data volumes and transmission costs for mobile consulting
                                                          Simple Support    Comprehensive Support
Data volumes (compressed, without consulting synopsis)    450 Kbytes        1,200 Kbytes
Cost                                                      0.74 EUR          1.98 EUR
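The relationship between data volume and cost in Table 9.5 can be reproduced with a volume-based rate. The rate of 0.00165 EUR per Kbyte used below is not stated in the text; it is derived from the two table entries and should be read as an assumption. The page count and page size in the usage example are likewise illustrative.

```python
# Rate implied by Table 9.5 (0.74 EUR for 450 Kbytes, 1.98 EUR for 1,200 Kbytes).
# The text does not state it, so it is treated as an assumption here.
EUR_PER_KBYTE = 0.00165


def consultation_volume_kbytes(pages, kbytes_per_page):
    """Estimated data volume of a consultation from pages and page size."""
    return pages * kbytes_per_page


def transmission_cost_eur(volume_kbytes, eur_per_kbyte=EUR_PER_KBYTE):
    """Volume-based cost; the duration of the connection plays no role."""
    return round(volume_kbytes * eur_per_kbyte, 2)


# Reproducing the two table entries.
print(transmission_cost_eur(450))     # simple support
print(transmission_cost_eur(1_200))   # comprehensive support

# Illustrative estimate: 150 pages at 8 Kbytes per compressed page.
volume = consultation_volume_kbytes(150, 8)
print(volume, transmission_cost_eur(volume))
```

Since only the transferred volume enters the calculation, a consultation of three hours costs the same as one of thirty minutes if the same pages are opened.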
A typical page from standard, web-based consulting software requires an average data volume of approximately 40 Kbytes. Using data compression for the transmission and local proxy storage of the transferred contents, this can generally be reduced to 5 to 10 Kbytes. The number of pages opened during the consultation depends mostly on the complexity of the consulting. Simple consulting generally requires no more than 40 pages, whereas comprehensive consulting for a wealthy customer can require 150 pages or more. Table 9.5 shows the average values determined for the total data volume transferred during actual consultations and the resulting costs. The costs vary depending on the provider and the type of contract which has been signed.¹ As can be seen, an average of one to two euros is incurred per consultation. Compared to the consultant’s personnel costs, this is only a fraction of the total costs. When compared to the alternative costs incurred for, say, higher maintenance and better laptops if solutions have to be installed on a consultant’s laptop, the advantages of mobile consulting using a Web-based solution are more than obvious.
9.3.2.3 Highest Data Security in Mobile Consulting
The security of mobile consulting is a core concern. A distinction has to be made between the authentication and authorization of the consulting software, and the security of the data transmission (encryption) between the terminal and the server. Web-based consulting software may require the client to have a Web browser with an IP connection to the consulting server. Regardless of which mobile communication standard is used, e.g. GSM, GPRS, UMTS or WLAN, the internet is generally used as an open network between the mobile network operator and the server. Figure 9.5 shows the principle of a connection by GPRS (General Packet Radio Service) from a cell phone.
To counteract any electronic eavesdropping within the involved networks, a secured point-to-point connection is formed between the mobile terminal (e.g. the consultant’s laptop) and the server. This can be achieved either by setting up a VPN (Virtual Private Network) or an SSL connection between the browser and the Web server. According to the current state of technology, these connections are secure against eavesdropping.
¹ The calculation is based on the business rates offered by a major German cellular network provider (at the beginning of 2006). It is suitable for consultants with five to ten mobile consultations a month. Lower rates are available for a greater number of mobile consultations, which would reduce the costs by approximately 30%.
Figure 9.5. Mobile consulting with laptop and cell phone (diagram elements: mobile consulting, stationary consulting, firewall, central server)
With this approach, there is no difference in the security of the transmission and authentication between a mobile terminal connection and a terminal connection accessing the server directly via the internet. Unauthorized persons can neither eavesdrop on the customer data during the consultation nor access the consulting solution. Thus, Web-based consulting solutions offer the same service scope in stationary and mobile use at low cost and with little organizational effort. Mobile consulting guarantees the same data security as consultations within a company’s offices. This means that a Web-based consulting solution is well-suited for all financial institutes which would like to offer stationary and/or mobile consulting.
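The secured connection between the consultant’s terminal and the server can be illustrated on the client side with Python’s standard `ssl` module. The chapter speaks of SSL; the sketch configures TLS, its successor, and demands at least version 1.2, which is a choice made for this example and not a requirement taken from the text. No connection is opened, and the server address in the comment is hypothetical.

```python
import ssl


def build_client_context():
    """Client-side settings for an encrypted, authenticated server connection."""
    # The default context verifies the server certificate against the
    # system's trusted certificate authorities and checks the host name.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)

# A request to a (hypothetical) consulting server would pass the context on, e.g.:
#   import urllib.request
#   urllib.request.urlopen("https://consulting.example.com/", context=context)
```

The same settings apply regardless of whether the laptop is connected via a mobile network or a branch office network, which mirrors the point made above that the transport path does not change the security of the connection.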
9.4 Conclusion
In this chapter two central requirements for customer-oriented consulting were discussed. The discussion revealed that available methods and tools from back office solutions can be used in consulting services to enhance the quality and service for the customer. Applying the portfolio selection theory of Markowitz enables the consultant to derive an individual proposal for the customer’s needs which is superior to other consulting approaches. Secondly, it was shown that modern software architecture enables consultants to serve their customers not only in the bank office but also at the customer’s home. Both are key fundamentals of a customer-oriented consulting approach.
References
Elton E, Gruber J (1995) Modern Portfolio Theory and Investment Analysis, John Wiley & Sons, New York
Markowitz H (1952) Portfolio Selection, Journal of Finance 7 (1), pp. 77–91
Markowitz H (1997) Portfolio Selection: Efficient Diversification of Investments, second edition, Blackwell Publishers, Cambridge MA
CHAPTER 10
A Reference Model for Personal Financial Planning
Oliver Braun, Günter Schmidt
10.1 Introduction
We base the discussion of the business requirements on the application architecture LISA, in which four views on models are defined (Schmidt 1999):
1. the granularity of the model, differing between industry model, enterprise model, and detailed model;
2. the elements of the model, differing between data, function, and coordination;
3. the life cycle of modelling, differing between analysis, design, and implementation;
4. the purpose of modelling, differing between problem description and problem solution.
We will present an analysis model as part of the application architecture showing the granularity of an industry model. It contains data, function, and coordination models related to the purpose of modelling. The language we use to develop the analysis model is the Unified Modeling Language (UML). An intrinsic part of the proposed system architecture is the usage of web technologies, as the global Internet and the World Wide Web are the primary enabling technologies for delivering customized decision support. We refer to the reference model as a combination of the analysis model and the system architecture. We believe that combining analysis model and system architecture could enhance the usability of the reference model. We understand personal financial planning as the process of meeting life goals through the management of finances (Certified Financial Planner’s (CFP) Board of Standards 2005). Our reference model serves two purposes: first, the analysis model is a conceptual model that can serve financial planners as a decision support tool. Second, system developers can map the analysis model to the system architecture at the design stage of system development. Furthermore, the reference model serves as a capture of existing knowledge in the field of IT-supported personal financial planning. The model also addresses interoperability assessments by the concept of platform-independent usage of personal financial planning tools.
In Section 10.2 we provide definitions and a short review of the state of the art in the field of reference modelling for personal financial planning research. Furthermore, we outline some features displaying the advantages of the reference model. In Section 10.3 we describe the framework of our reference model, and in Sections 10.4 and 10.5 (parts of) the reference models for the analysis and architecture level are described. We have developed a prototype, FiXplan (IT-based personal financial planning), in order to evaluate the requirements compiled in Section 10.2. Finally, Section 10.6 summarizes the results and discusses future research opportunities within the field of reference models for personal financial planning.
10.2 State of the Art We will briefly review concepts of personal financial planning in Section 10.2.1, and systems and models in Section 10.2.2. Section 10.2.3 discusses core concepts of reference modelling, and Section 10.2.4 is dedicated to the discussion of requirements for a reference model for personal financial planning.
10.2.1 Personal Financial Planning The field of personal financial planning is well supplied with textbooks, for example (Böckhoff and Stracke 2001; Keown 2003; Nissenbaum et al. 2004; Schmidt 2006; Woerheide 2002). Most of these books can be used as guides for handling personal financial problems, e. g. maximizing wealth, achieving various financial goals, determining emergency savings, or maximizing retirement plan contributions. There are also papers in journals ranging from the popular press to academic journals. (Braun and Kramer 2004) give a review of software for personal financial planning in German-speaking countries. We will start our discussion with some definitions related to the world of personal financial planning. The Certified Financial Planner’s (CFP) Board of Standards defines (personal) financial planning as follows: Definition 10.2.1 (CFP Board of Standards 2005) Financial planning is the process of meeting your life goals through the proper management of your finances. Life goals can include buying a home, saving for your child’s education or planning for retirement. Financial planning provides direction and meaning to your financial decisions. It allows you to understand how each financial decision you make affects other areas of your finances. For example, buying a particular investment product might help you pay off your mortgage faster or it might delay your retirement significantly. By viewing each financial
10 A Reference Model for Personal Financial Planning
decision as part of a whole, you can consider its short and long-term effects on your life goals. You can also adapt more easily to life changes and feel more secure that your goals are on track. The draft ISO/DIS 22 222-1 of the Technical Committee ISO/TC 222, Personal financial planning, of the International Organization for Standardization defines personal financial planning as follows: Definition 10.2.2 (ISO/TC 222 2004) Personal financial planning is an interactive process designed to enable a consumer/client to achieve their personal financial goals. In the same draft, Personal Financial Planner, consumer, client and financial goals are defined as follows: Definition 10.2.3 (ISO/TC 222 2004) A Personal Financial Planner is an individual practitioner who provides financial planning services to clients and meets all competence, ethics and experience requirements contained in this standard. A consumer is an individual or a group of individuals, such as a family, who have shared financial interests. A client of a Personal Financial Planner is an individual who has accepted the terms of engagement by entering into a contract of services. A financial goal is a quantifiable outcome aimed to be achieved at some future point in time or over a period of time. Certified Financial Planner’s (CFP) Board of Standards and the Technical Committee ISO/TC 222, Personal financial planning, of the International Organization for Standardization define the personal financial planning process as follows: Definition 10.2.4 (CFP Board of Standards 2005; ISO/TC 222 2004) The personal financial planning process shall include, but is not limited to, six steps that can be repeated throughout the client and financial planner relationship. The client can decide to end the process before having passed all the steps. 
The process involves gathering relevant financial information, setting life goals, examining your current financial status and coming up with a strategy or plan for how you can meet your goals given your current situation and future plans. The financial planning process consists of the following six steps:
1. Establishing and defining the client-planner relationship. The financial planner should clearly explain or document the services to be provided to the client and define both his and his client’s responsibilities. The planner should explain fully how he will be paid and by whom. The client and the planner should agree on how long the professional relationship should last and on how decisions will be made.
2. Gathering client data and determining goals and expectations. The financial planner should ask for information about the client’s financial situation. The client and the planner should mutually define the client’s personal and financial goals, understand the client’s time frame for results and discuss, if relevant, how the client feels about risk. The financial planner should gather all the necessary documents before giving the advice the client needs.
3. Analyzing and evaluating the client’s financial status. The financial planner should analyze the client’s information to assess the client’s current situation and determine what the client must do to meet his goals. Depending on what services the client has asked for, this could include analyzing the client’s assets, liabilities and cash flow, current insurance coverage, investments or tax strategies.
4. Developing and presenting financial planning recommendations and/or alternatives. The financial planner should offer financial planning recommendations that address the client’s goals, based on the information the client provides. The planner should go over the recommendations with the client to help the client understand them so that the client can make informed decisions. The planner should also listen to the client’s concerns and revise the recommendations as appropriate.
5. Implementing the financial planning recommendations. The client and the planner should agree on how the recommendations will be carried out. The planner may carry out the recommendations or serve as the client’s “coach,” coordinating the whole process with the client and other professionals such as attorneys or stockbrokers.
6. Monitoring the financial planning recommendations. The client and the planner should agree on who will monitor the client’s progress towards his goals. If the planner is in charge of the process, he should report to the client periodically to review his situation and adjust the recommendations, if needed, as the client’s life changes.
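The six steps of Definition 10.2.4, together with its two side conditions (steps can be repeated, and the client can end the process early), can be sketched as a small tracker. This is only an illustration: the class and member names (`PlanningStep`, `PlanningProcess`, `perform`, `end`, `remaining`) are assumptions made for this sketch and are not part of the CFP or ISO definitions.

```python
from __future__ import annotations

from enum import IntEnum


class PlanningStep(IntEnum):
    """The six steps named in Definition 10.2.4 (identifiers are hypothetical)."""
    ESTABLISH_RELATIONSHIP = 1
    GATHER_DATA_AND_GOALS = 2
    ANALYZE_FINANCIAL_STATUS = 3
    DEVELOP_RECOMMENDATIONS = 4
    IMPLEMENT_RECOMMENDATIONS = 5
    MONITOR_RECOMMENDATIONS = 6


class PlanningProcess:
    """Hypothetical tracker for one client-planner engagement."""

    def __init__(self) -> None:
        self.history: list[PlanningStep] = []
        self.ended = False

    def perform(self, step: PlanningStep) -> None:
        # Steps may be repeated, so any step is accepted while the engagement is open.
        if self.ended:
            raise RuntimeError("the client has ended the process")
        self.history.append(step)

    def end(self) -> None:
        # The client may stop before all six steps have been passed.
        self.ended = True

    def remaining(self) -> list[PlanningStep]:
        # Steps that have not been performed at least once, in their defined order.
        done = set(self.history)
        return [step for step in PlanningStep if step not in done]
```

A tracker like this deliberately does not enforce a strict order, because the definition only says that the steps can be repeated throughout the relationship.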
10.2.2 Systems and Models Models are simplifications of reality. Models exist at various levels of abstraction. A level of abstraction indicates how far removed from the reality a model is. High levels of abstraction represent the most simplified models. Low levels of abstraction have close correspondence with the system they are modelling. An exact 1:1 correspondence is no longer a model but rather a transformation of one system to another. Systems are defined as follows: Definition 10.2.5 (Rumbaugh et al. 2005) A system is a collection of connected units organized to accomplish a purpose. A system can be described by one or more models, possibly from different viewpoints. The complete model describes the whole system. The personal financial planning process can be supported by financial Decision Support Systems (DSS). In general, a Decision Support System can be described as follows. Definition 10.2.6 (Turban 1995) A Decision Support System (DSS) is an interactive, flexible, and adaptable computer-based information system, specially developed for supporting the solution of a non-structured management problem for improved decision making. It utilizes data, provides an easy-to-use interface, and allows for the decision maker’s own insights. Continuous interaction between system and decision-makers is important (Briggs 1997). Interactive decision-making has been accepted as the most appropriate way to obtain the correct preferences of decision-makers (Mathieu and Gibson 1993; Mukherjee 1994). If this interaction is to be supported by a web-based system, then there is a need to manage the related techniques (or models), to support the data needs, and to develop an interface between the users and the system. Advances in web technologies and the emergence of e-business have strongly influenced the design and implementation of financial DSS. As a result, improved global accessibility, in terms of integration and sharing of information, means that obtaining data from the Internet has become more convenient. Growing demand for fast and accurate information sharing increases the need for web-based financial DSS (Chou 1998; Dong et al. 2002; Dong et al. 2004; Fan et al. 1999, 2000; Hirschey et al. 2000; Li 1998; Mehdi and Mohammad 1998). The application of DSS to some specific financial planning problems is described in several articles. In (Schmidt and Lahl 1991) a DSS based on expert system technology is discussed. The focus of the system is portfolio selection. (Samaras et al. 2005) also focus on portfolio selection. They propose a DSS which includes three techniques of investment analysis: Fundamental Analysis, Technical Analysis and Market Psychology. (Dong et al. 2004) likewise focus on portfolio selection and report on the implementation of a Web-based DSS for Chinese financial markets. (Zahedi and Palma-dos-Reis 1999) describe a personalized intelligent financial DSS. (Zopounidis et al. 1997) give a survey on the use of knowledge-based DSS in financial management.
The basic characteristic of such systems is the integration of expert systems technology with models and methods used in the decision support framework. They survey some systems applied to financial analysis, portfolio management, loan analysis, credit granting, assessment of credit risk, and assessment of corporate performance and viability. These systems provide the following features:
• support all stages of the decision making process, i. e. structuring the problem, selecting alternative solutions and implementing the decision;
• respond to the needs and the cognitive style of different decision makers based on individual preferences;
• incorporate a knowledge base to help the decision maker to understand the results of the mathematical models;
• ensure objectiveness and completeness of the results by comparing expert estimations and results from mathematical models;
• significantly reduce the time and cost of the decision making process while increasing the quality of the decisions.
Special emphasis on web-based decision support is given in (Bhargava et al. 2006). Even though the above efforts in developing DSS have been successful in solving some specific financial planning problems, they do not lead to a general
framework for solving any financial planning problem. There is no DSS which covers the whole process of financial planning. Models can be defined as follows: Definition 10.2.7 (Rumbaugh et al. 2005) A model is a semantically complete description of a system. According to that definition, a model is an abstraction of a system from a particular viewpoint. It describes the system or entity at the chosen level of precision and viewpoint. Different models provide more-or-less independent viewpoints that can be manipulated separately. A model may comprise a containment hierarchy of packages in which the top-level package corresponds to the entire system. The contents of a model are the transitive closures of its containment (ownership) relationships from top-level packages to model elements. A model may also include relevant parts of the system’s environment, represented, for example, by actors and their interfaces. In particular, the relationship of the environment to the system elements may be modelled. The primary aim of modelling is the reduction of complexity of the real world in order to simplify the construction process of information systems (Frank 1999).
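The statement that the contents of a model are the transitive closure of its containment relationships can be made concrete with a short sketch. The classes `Element` and `Package`, the function `contents`, and the example element names are hypothetical and chosen only for illustration; they are not taken from the UML metamodel.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """A model element identified by name (hypothetical, simplified)."""
    name: str


@dataclass
class Package(Element):
    """A package owns further elements, some of which may be packages."""
    owned: list[Element] = field(default_factory=list)


def contents(package: Package) -> list[str]:
    """Names of all elements reachable through ownership, depth first."""
    found: list[str] = []
    for child in package.owned:
        found.append(child.name)
        if isinstance(child, Package):
            # Follow the containment relationship transitively.
            found.extend(contents(child))
    return found


# The top-level package corresponds to the entire system.
model = Package("PersonalFinancialPlanning", [
    Package("Analysis", [Element("BalanceSheet"), Element("IncomeStatement")]),
    Element("Client"),
])
print(contents(model))
# prints ['Analysis', 'BalanceSheet', 'IncomeStatement', 'Client']
```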
10.2.3 Reference Models (Mišić and Zhao 1999) define reference models as follows: Definition 10.2.8 (Mišić and Zhao 1999) A reference model describes a standard decomposition of a known problem domain into a collection of interrelated parts, or components, that cooperatively solve the problem. It also describes the manner in which the components interact in order to provide the required functions. A reference model is a conceptual framework for describing system architectures, thus providing a high-level specification for a class of systems. Reference modelling can be divided into two groups (Fettke and Loos 2003): investigation of methodological aspects (e. g. Lang et al. 1996; Marshall 1999; Remme 1997; Schütte 1998), and construction of concrete reference models (e. g. Becker and Schütte 2004; Fowler 1997; Hay 1996; Scheer 1994). To the best of our knowledge, there is no reference model for personal financial planning. Our reference model for personal financial planning consists of an analysis model for the system analysis of a personal financial planning system and of corresponding system architecture of an information system for personal financial planning. 10.2.3.1 Analysis Model Surveys of reference models at the analysis stage of system development (i. e. conceptual models) are given in various papers, (e. g. Scholz-Reiter 1990; Marent 1995; Mertens et al.1996; Fettke and Loos 2003; Mišić and Zhao 1999). For the construction of models, languages are needed. We will use the Unified Modeling Language (UML) for our analysis model.
Definition 10.2.9 (Rumbaugh et al. 2005) The Unified Modeling Language (UML) is a general-purpose modelling language that is used to specify, visualize, construct, and document the artefacts of a software system. UML was developed by the Rational Software Corporation to unify the best features of earlier methods and notations. The UML models can be categorized into three groups: • State models (that describe the static data structures) • Behaviour models (that describe object collaborations) • State change models (that describe the allowed states for the system over time) UML also contains a few architectural constructs that allow modularizing the system for iterative and incremental development. We will use UML 2.0 (Rumbaugh, Jacobson, & Booch, 2005) for our analysis model. 10.2.3.2 System Architecture In general, architecture can be defined as follows: Definition 10.2.10 (Rumbaugh et al. 2005) An architecture is the organizational structure of a system, including its decomposition into parts, their connectivity, interaction mechanisms, and the guiding principles that inform the design of a system. Definition 10.2.11 (Bass et al. 1998) The term system architecture denotes the description of the structure of a computer-based information system. The structures in question comprise software components, the externally visible properties of those components, and the relationships among them. IBM (Youngs et al. 1999) has put forth an Architecture Description Standard (ADS), which defines two major aspects: functional and operational. The IBM ADS leverages UML to help express these two aspects of the architecture. This standard is intended to support harvesting and reuse of reference architectures. For a comprehensive overview of architectures, languages, methods, and techniques for analysing, modelling, and constructing information systems in organisations, we refer to the Handbook on Architectures of Information Systems (Bernus et al. 2005). 
The reference architecture we use to derive our system architecture is a service oriented architecture (SOA) proposed by the World Wide Web Consortium (W3C) (Booth et al. 2005) for building Web-based information systems. According to the Gartner group, by 2007 service oriented architectures will be the dominant strategy (more than 65%) of developing information systems. The architecture we propose supports modular deployment of both device specific user interfaces through multichannel delivery of services and new or adapted operational processes and strategies. Service oriented architectures refer to an application software topology which separates business logic from user interaction logic represented in one or multiple software components (services), exposed to access via well-defined formal interfaces. Each service provides its functionality to the rest of the system as a well-defined interface described in a formal markup language, and the communication between services is platform and language independent. Thus, modularity and reusability are ensured, enabling several different configurations and achieving multiple business goals. The first service oriented architectures were based on DCOM or on Object Request Brokers (ORBs) following the CORBA specification. A service is a function that is well-defined, self-contained, and does not depend on the context or state of other services. Services are what you connect together using Web Services. A service is the endpoint of a connection. Also, a service has some type of underlying computer system that supports the connection offered. The combination of services, internal and external to an organization, makes up a service oriented architecture. The technology of Web services is the most likely connection technology of service oriented architectures. Web services essentially use XML to create a robust connection. Web Services technology is currently the most promising methodology of developing web information systems. Web Services allow companies to reduce the cost of doing e-business, to deploy solutions faster and to open up new opportunities. The key to reaching these goals is a common program-to-program communication model, built on existing and emerging standards such as HTTP, Extensible Markup Language (XML, see also Wüstner et al. 2005), Simple Object Access Protocol (SOAP), Web Services Description Language (WSDL) and Universal Description, Discovery and Integration (UDDI).
However, real Business-to-Business interoperability between different organizations requires more than the aforementioned standards. It requires long-lived, secure, transactional conversations between Web Services of different organizations. To this end, a number of standards are under way, e. g. the Web Services Conversation Language (WSCL) and the Business Process Execution Language for Web Services (BPEL4WS). With respect to the description of a service’s supported and required characteristics, the WS-Policy Framework is under development by IBM, Microsoft, SAP, and other leading companies in the area. In addition, other non-standard languages and technologies have been proposed for composing services and modelling the transactional behaviour of complex service compositions. Such proposals include the Unified Transaction Modelling Language (UTML) and the SWORD toolkit.
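The separation of business logic from user interaction logic behind a message-based interface can be illustrated with a minimal sketch. It is not SOAP and uses no WSDL; it only shows a self-contained, stateless service that accepts and returns XML, so that the caller depends on the message format alone. The function names (`surplus_service`, `show_surplus`) and the element names in the messages are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def surplus_service(request_xml: str) -> str:
    """Hypothetical service: XML in, XML out, no shared state with the caller."""
    request = ET.fromstring(request_xml)
    income = float(request.findtext("income"))
    expenses = float(request.findtext("expenses"))
    response = ET.Element("surplusResponse")
    ET.SubElement(response, "surplus").text = f"{income - expenses:.2f}"
    return ET.tostring(response, encoding="unicode")


def show_surplus(income: float, expenses: float) -> str:
    """User-interaction side: builds the request message and formats the reply."""
    request = ET.Element("surplusRequest")
    ET.SubElement(request, "income").text = str(income)
    ET.SubElement(request, "expenses").text = str(expenses)
    reply_xml = surplus_service(ET.tostring(request, encoding="unicode"))
    reply = ET.fromstring(reply_xml)
    return f"Monthly surplus: {reply.findtext('surplus')}"


print(show_surplus(3000.0, 2400.0))  # prints "Monthly surplus: 600.00"
```

In a real deployment the direct function call would be replaced by a network transport such as HTTP; the caller would not need to change as long as the message format stays the same.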
10.2.4 Requirements for a Reference Model for Personal Financial Planning Our reference model for personal financial planning should fulfil the following requirements concerning the input, data processing, and output: the input has to be collected in a complete, i. e. adequate to the purpose, way. The input data have to
be collected from the client and/or from institutions that are involved in this process. Such institutions may be savings banks, mortgage banks, life insurance companies, car insurance companies, providers of stock market quotations, pension funds, etc. This stage is time consuming, and the quality of the financial plan depends on the completeness and correctness of the collected input data. Data processing (i. e. data evaluation and data analysis) has the task of processing the input data in a correct, individual, and networked way. Correct means that the results have to be precise and error-free according to accepted methods of financial planning. Individual means that the client’s concrete situation with all its facets has to be at the centre of the planning, and that generalizations are not permitted. Networked means that all effects and interdependencies of the data have to be considered and structured into a financial plan. The output has to be coherent and clear. 10.2.4.1 Analysis Model The main task of an analysis model for personal financial planning is to help financial planners advise individuals on how to manage their personal finances properly. Proper personal financial planning is so important because many individuals lack a working knowledge of financial concepts and do not have the tools they need to make the decisions most advantageous to their economic well-being. For example, the Federal Reserve Board’s Division of Consumer and Community Affairs (2002) stated that “financial literacy deficiencies can affect an individual’s or family’s day-to-day money management and ability to save for long-term goals such as buying a home, seeking higher education, or financial retirement. Ineffective money management can also result in behaviours that make consumers vulnerable to severe financial crises”.
Furthermore, the analysis model should fulfil the following requirements:
• Maintainability (the analysis model should be understandable and alterable)
• Adaptability (the analysis model should be adaptable to specific user requirements)
• Efficiency (the analysis model should solve the personal financial planning problem as quickly and with as little effort as possible)
• Standards oriented (the analysis model should be oriented at ISO/DIN/CFP Board standards; it should fulfil e. g. the CFP financial planning practice standards (Certified Financial Planner’s (CFP) Board of Standards 2005))
10.2.4.2 System Architecture An architecturally significant requirement is a system requirement that has a profound impact on the development of the rest of the system. Architecture gives the rules and regulations by which the software has to be constructed. The architect has the sole responsibility for defining and communicating the system’s architecture. The architecture’s key role is to define the set of constraints placed on the
design and implementation teams when they transform the requirements and analysis models into the executable system. These constraints contain all the significant design decisions and the rationale behind them. Defining an architecture is an activity of identifying what is architecturally significant and expressing it in the proper context and viewpoint. The system architecture for an IT-based personal financial planning system should fulfil the following requirements:
• Maintainability (the architecture should be understandable and alterable)
• Adaptability (the architecture should be adaptable to specific user requirements)
• Efficiency (the architecture should deliver a framework that allows a Personal Financial Planner to solve personal financial planning problems as quickly and with as little effort as possible, e. g. by automatic data gathering)
• Standards oriented (the architecture should be oriented at well established standards, e. g. service oriented architectures)
• Reliability and robustness (the architecture should support a reliable and robust IT-system)
• Portability (the architecture should support a platform independent usage of the personal financial planning system)
In the following sections, we describe a reference model for personal financial planning which meets the requirements specified in this section. In Section 10.3 we describe the framework for our reference model, and in Sections 10.4 and 10.5 (parts of) the reference models for the analysis and architecture level are described.
10.3 Framework for a Reference Model for Personal Financial Planning Our reference model for personal financial planning consists of two parts (see Figure 10.1): 1. Analysis model which can help financial planners to do personal financial planning. 2. System architecture which can help system developers to develop logical models at the design stage of system development. Our reference model for personal financial planning has two purposes: first, the analysis model can help financial planners to do personal financial planning in a systematic way. Second, at the design stage of system development, system developers can take the analysis model produced during system analysis and apply the system architecture to it. An intrinsic part of our reference model at the architecture level is the usage of web technologies, as web technologies have already strongly influenced, and will continue to influence, the design and implementation of financial information systems in general.
Figure 10.1. Analysis model and system architecture
Design starts with the analysis model and the software architecture document as the major inputs (Conallen 2002). Design is where the abstraction of the business takes its first step into the reality of software. At design level, the analysis model is refined such that it can be implemented with the components that obey the rules of the architecture. As with analysis, design activities revolve around the class and interaction diagrams. Classes become more defined, with fully qualified properties. As this happens, the level of abstraction of the class shifts from analysis to design. Additional classes, mostly helper and implementation classes, are often added during design. In the end, the resulting design model is something that can be mapped directly into code. This is the link between the abstractions of the business and the realities of software. Analysis and design activities help transform the requirements of the system into a design that can be realized in software. Analysis begins with the use case model, the use cases and their scenarios, and the functional requirements of the system that are not included in the use cases. The analysis model is made up of classes and collaborations of classes that exhibit the dynamic behaviours detailed in the use cases and requirements. The analysis model and the design model are often the same artefact. As the model evolves, its elements change levels of abstraction from analysis to detailed design. Analysis-level classes represent objects in the business domain. Analysis
focuses on the functional requirements of the system, ignoring the architectural constraints of the system. Use case analysis comprises those activities that take the use cases and functional requirements to produce an analysis model of the system. Analysis focuses on the functional requirements of the system, so the fact that some or all of the system will be implemented with Web technologies is beside the point. Unless the functional requirements state the use of a specific technology, references to architectural elements should be avoided. The analysis model is made up of classes and collaborations of classes that exhibit the dynamic behaviours detailed in the use cases and the requirements. The model represents the structure of the proposed system at a level of abstraction beyond the physical implementation of the system. The classes typically represent objects in the business domain, or problem space. The level of abstraction is such that the analysis model could be applied equally to any architecture. Important processes and objects in the problem space are identified, named, and categorized during analysis. The emphasis is on ensuring that all functional requirements, as expressed by the use cases and other documents, are realized somewhere in the system. Ideally, each use case is linked to the classes and packages that realize it. This link is important in establishing the traceability between requirements and use cases and the classes that will realize them. The analysis model is an input for the design model and can be evolved into the design model.
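The traceability link between use cases and the classes that realize them can be checked mechanically once it is recorded. The following sketch assumes the link is kept as a simple mapping; the function name `unrealized_use_cases` and the class names in the mapping are hypothetical, while the three use case names are those of the subsystem Core Personal Financial Planning described in Section 10.4.4.

```python
def unrealized_use_cases(use_cases, realizations):
    """Return the use cases that no class is recorded as realizing."""
    return [uc for uc in use_cases if not realizations.get(uc)]


use_cases = [
    "Determining the client's financial status",
    "Determining a feasible to-be concept",
    "Determining planning steps",
]

# Hypothetical traceability record: use case -> realizing classes.
realizations = {
    "Determining the client's financial status": ["BalanceSheet", "IncomeStatement"],
    "Determining a feasible to-be concept": [],
}

for uc in unrealized_use_cases(use_cases, realizations):
    print("not yet realized:", uc)
```

With the record above, the second and third use cases are reported, since one has an empty list of classes and the other has no entry at all.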
10.4 Analysis Model The analysis model of our reference model for personal financial planning consists of the following parts: Use Cases build the dynamic view of the system. Class diagrams build the structural view of the system. Both are combined in the behavioural view of the system with activity diagrams and sequence diagrams.
10.4.1 Dynamic View on the System: Use Cases Use cases, use case models and use case diagrams are defined as follows: Definition 10.4.1 (Rumbaugh et al. 2005) A use case is the specification of sequences of actions, including variant sequences and error sequences, that a system, subsystem, or class can perform by interacting with outside objects to provide a service of value. A use case is a coherent unit of functionality provided by a classifier (a system, subsystem, or class) as manifested by sequences of messages exchanged among the system and one or more outside users (represented as actors), together with actions performed by the system. The purpose of a use case is to define a piece of behaviour of a classifier (including a subsystem or the entire system), without revealing the internal structure of the
classifier. A use case model is a model that describes the functional requirements of a system or other classifier in terms of use cases. A use case diagram is a diagram that shows the relationships among actors and use cases within a system.
10.4.2 Structural View on the System: Class Diagrams The hierarchy of the dynamic view of the system (use cases) may provide a start but usually falls short when defining the structural view of the system (classes). The reason is that it is likely that certain objects participate in many use cases and logically cannot be assigned to a single use case package. At the highest level, the packages are often the same, but at the lower levels of the hierarchy, there are often better ways to divide the packages. Analysis identifies a preliminary mapping of required behaviour onto structural elements, or classes, in the system. At the analysis level, it is convenient to categorize all discovered objects into one of three stereotyped classes: «boundary», «control», and «entity», as suggested by (Jacobson et al.1992), see Figure 10.2.
Figure 10.2. Structural elements (the stereotypes «boundary», «control», and «entity», each shown with its icon)
Definition 10.4.2 (Jacobson et al. 1992) Boundary classes connect users of the system to the system. Control classes map to the processes that deliver the functionality of the system. Entity classes represent the persistent things of the system, such as the database. Entities are shared and have a life cycle outside any one use of the system, whereas control instances typically are created, executed, and destroyed with each invocation. Entity properties are mostly attributes, although also operations that organize an entity’s properties can often be found on them. Controllers do not specify attributes. Initially, their operations are based on verb phrases in the use case specification and tend to represent functionality that the user might invoke. In general, class diagrams can be defined as follows: Definition 10.4.3 (Rumbaugh et al. 2005) A class diagram is a graphic representation of the static view that shows a collection of declarative (static) model elements, such as classes, types, and their contents and relationships.
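The three stereotypes can be illustrated with a minimal sketch in which each class plays exactly one role. The class names (`Client`, `DetermineFinancialStatus`, `StatusForm`) and the calculation inside the controller are assumptions chosen for this example; the sketch keeps the entity in memory and leaves persistence out.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """«entity»: mostly attributes, with a life cycle beyond one use of the system."""
    name: str
    assets: float
    liabilities: float


class DetermineFinancialStatus:
    """«control»: maps to one process and holds no attributes of its own."""

    def run(self, client: Client) -> float:
        # Illustrative calculation only: assets minus liabilities.
        return client.assets - client.liabilities


class StatusForm:
    """«boundary»: connects the user of the system to the system."""

    def submit(self, name: str, assets: float, liabilities: float) -> str:
        client = Client(name, assets, liabilities)
        # The controller is created, executed, and discarded with each invocation.
        result = DetermineFinancialStatus().run(client)
        return f"{client.name}: net worth {result:.2f}"


print(StatusForm().submit("A. Client", 500.0, 200.0))  # prints "A. Client: net worth 300.00"
```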
10.4.3 Use Case Realization: Sequence and Activity Diagrams
Construction of behavioural diagrams that form the model’s use case realizations helps to discover and elaborate the analysis classes. A use case realization is an expression of a use case’s flow of events in terms of objects in the system. In UML, this event flow is expressed with sequence diagrams. Definition 10.4.4 (Rumbaugh et al. 2005) A sequence diagram is a diagram that shows object interactions arranged in time sequence. In particular, it shows the objects participating in an interaction and the sequences of messages exchanged. A sequence diagram can realize a use case, describing the context in which the invocation of the use case executes. The sequence diagram describes the objects that are created, executed, and destroyed in order to execute the functionality described in the use case. It does not include object relationships. Sequence diagrams provide a critical link of traceability between the scenarios of the use cases and the structure of the classes. These diagrams can express the flow in a use case scenario in terms of the classes that will implement it. An additional diagram that is useful for mapping analysis classes into the use case model is the activity diagram. Definition 10.4.5 (Rumbaugh et al. 2005) An activity diagram is a diagram that shows the decomposition of an activity into its constituents. Activities are executions of behaviour. Activities are behavioural specifications that describe the sequential and concurrent steps of a computational procedure. Activities map well to controller classes, where each controller executes one or more activities. We create one activity diagram per use case.
10.4.4 Example Models
The basic analysis use case model (see Figure 10.3) is oriented to the definition of the personal financial planning process given by the Certified Financial Planner’s (CFP) Board of Standards and ISO/TC 222 (see Def. 2.4). In the following, we will describe the subsystem Core Personal Financial Planning in detail. Core Personal Financial Planning includes three use cases, Determining the client’s financial status, Determining a feasible to-be concept, and Determining planning steps, which are elaborated in more detail in Sections 10.4.4.1 through 10.4.4.3.
10 A Reference Model for Personal Financial Planning
Figure 10.3. Basic use case model
10.4.4.1 Determining the Client’s Financial Status
Determining the client’s financial status means answering the question “How is the current situation defined?” We will pick the following scenarios as the main success scenarios.
Determining the client’s financial status
1. Data gathering. The Financial Planner asks the client for information about the client’s financial situation. This includes information about assets, liabilities, income and expenses.
2. Data evaluation. The Financial Planner uses two financial statements: balance sheet and income statement.
3. Data analysis. The Financial Planner analyses the results from balance sheet and income statement. He can investigate, for example, the state of the provisions and pension plan, risk management, tax charges, return on investments and other financial ratios.
In the following, the scenario is described in more detail.
Data gathering: assets and liabilities
The first category of assets is cash, which refers to money in accounts such as checking accounts, savings accounts, money market accounts, money market mutual funds, and certificates of deposit with a maturity of less than one year. The second category of assets is fixed assets, also called fixed-principal assets, fixed-return assets, or debt instruments. This category includes investments that represent a loan made by the client to someone else. Fixed-return assets are primarily corporate and government bonds, but may also be mortgages held, mutual funds that invest primarily in bonds, fixed annuities, and any other monies owed to the client. The third category of assets is equity investments. Examples of equity investments are common stock, mutual funds that hold mostly common stock, variable annuities, and partnerships. The fourth category of assets is the value of pensions and other retirement accounts. The last category of assets is usually personal or tangible assets, also known as real assets. This category includes assets bought primarily for comfort, such as a home, car, furniture, clothing, jewellery, and any other personal articles with a resale value. Liabilities are debts that a client currently owes. Liabilities may be charge account balances, personal loans, auto loans, home mortgages, alimony, child support, and life insurance policy loans. In general, assets are gathered at their market value, and debts are gathered at the amount a client would owe if he were to pay the debt off today.
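The categories gathered in this step can be captured in a small data model. The following Python sketch is illustrative only: the names `AssetCategory`, `Asset`, `Liability` and `total_by_category`, as well as the sample figures, are assumptions and not part of the reference model. Assets carry their market value and liabilities the amount needed to pay the debt off today, as described above.

```python
from dataclasses import dataclass
from enum import Enum


class AssetCategory(Enum):
    """The five asset categories named in the text (identifiers are illustrative)."""
    CASH = "cash"
    FIXED_RETURN = "fixed-return assets"
    EQUITY = "equity investments"
    RETIREMENT = "pensions and retirement accounts"
    PERSONAL = "personal (tangible) assets"


@dataclass(frozen=True)
class Asset:
    name: str
    category: AssetCategory
    market_value: float  # assets are gathered at their market value


@dataclass(frozen=True)
class Liability:
    name: str
    payoff_amount: float  # amount owed if the debt were paid off today


def total_by_category(assets):
    """Sum the market values per asset category."""
    totals = {category: 0.0 for category in AssetCategory}
    for asset in assets:
        totals[asset.category] += asset.market_value
    return totals


# Hypothetical client data, for illustration only.
assets = [
    Asset("checking account", AssetCategory.CASH, 2_500.0),
    Asset("government bond", AssetCategory.FIXED_RETURN, 10_000.0),
    Asset("equity fund", AssetCategory.EQUITY, 7_500.0),
    Asset("car", AssetCategory.PERSONAL, 12_000.0),
]
liabilities = [Liability("auto loan", 8_000.0)]

totals = total_by_category(assets)
```

Keeping the category as an explicit field is one possible design choice; it lets later steps group or order assets without parsing free-text names.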
Data gathering: income and expenses
Income includes wages and salaries, income from investments such as bonds, stocks, certificates of deposit, and rental property, and distributions from retirement accounts. Expenses may be expenses for housing, food, and transportation. Expenses can be grouped by similarity; for example, one can choose insurance premiums as a general category and then include life insurance, auto insurance, health insurance, homeowner’s or renter’s insurance, disability insurance, and any other insurance premiums one pays in this general category. The acquisition of personal assets (such as buying a car) is also treated as an expense.
Data evaluation: balance sheet
A balance sheet has three sections. The first section is a statement of assets (things that one owns). The second section is a statement of liabilities (a list of one’s debts). The third section is called net worth. It represents the difference between one’s assets and one’s liabilities. Businesses regularly construct balance sheets following a rigorous set of accounting rules. Personal balance sheets are constructed much less formally, although they should follow some of the same general rules incorporated into business balance sheets. The assets on a personal balance sheet may be placed in any order. The most common practice is to list assets in descending order of liquidity. Liquidity refers to an asset’s ability to be converted to cash quickly with little or no loss in value. Liabilities are normally listed in order of increasing maturity. The net worth on a balance sheet is the difference between the total value of assets owned and the total amount of liabilities owed. Net worth is the measure of a person’s financial wealth.
Data evaluation: income statement
Whereas the personal balance sheet is like a snapshot of the client’s net worth at a particular point in time, an income statement is a statement of income and expenses over a period of time. The time frame is usually the past year. The first section of an income statement is the income. The second section is a statement of expenses. Expenses are usually listed in order of magnitude, with the basic necessities being listed first. The third section is a summary that represents the contribution to savings. Contribution to savings is the difference between the total of income and the total of expenses. A positive contribution to savings would normally show up in the purchase of investments, a reduction in debt, or some combination of the two. A negative contribution to savings shows up in the liquidation of assets, an increase in debt, or some combination of the two.
Data analysis
The net worth number is significant in and of itself, since the larger the net worth, the better off a person is financially, but balance sheet and income statement contain even more useful information that a financial planner should consider. The traditional approach to analyzing financial statements is the use of financial ratios. Financial ratios combine numbers from the financial statements to measure a specific aspect of the personal financial situation of a client. Financial ratios are used 1.
to measure changes in the quality of the financial situation over time, 2. to measure the absolute quality of the current financial situation, and 3. as guidelines for how to improve the financial situation. We will group the ratios into the following eight categories: liquidity, solvency, savings, asset allocation, inflation protection, tax burden, housing expenses, and insolvency/credit (see also DeVaney et al. 1994; Greninger et al. 1994). For example, the Solvency Ratio is defined as
Solvency Ratio = Total Assets / Total Liabilities
and the Savings Rate is defined as
Savings Rate = Contribution to Savings / Income.
UML diagrams
In this section we will elaborate the use case Determining the client’s financial status: data evaluation as an example of the UML diagrams.
The class diagram in Figure 10.4 shows a first-pass analysis model for the use case Determining the client’s financial status: data evaluation. The most important elements are the class names, the types, and the relationships. This basic class diagram can be extended with the operations and attributes described in the use case specification to obtain an elaborated class diagram (Figure 10.5).
Figure 10.4. Basic use case analysis model elements (diagram labels: ASSET, LIABILITY, INCOME, EXPENSE, BUILD BALANCE SHEET, BUILD INCOME STATEMENT, DATA EVALUATION)
Figure 10.5. Elaborated analysis model elements
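As a rough illustration of the data evaluation and data analysis steps, the following Python sketch mirrors the entity classes (asset, liability, income, expense) and the two control classes labelled in Figure 10.4. It is a sketch under stated assumptions rather than the FiXplan implementation: the attribute names, the numeric `liquidity_rank` and `maturity_years` fields, the handling of a client without liabilities, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    value: float
    liquidity_rank: int  # hypothetical: 1 = most liquid


@dataclass(frozen=True)
class Liability:
    name: str
    amount: float
    maturity_years: float  # hypothetical: time until the debt falls due


@dataclass(frozen=True)
class CashFlow:  # used for both income and expense items
    name: str
    amount: float


def build_balance_sheet(assets, liabilities):
    """Control class 'build balance sheet', written as a plain function."""
    total_assets = sum(a.value for a in assets)
    total_liabilities = sum(l.amount for l in liabilities)
    return {
        # assets in descending order of liquidity
        "assets": sorted(assets, key=lambda a: a.liquidity_rank),
        # liabilities in order of increasing maturity
        "liabilities": sorted(liabilities, key=lambda l: l.maturity_years),
        "total_assets": total_assets,
        "total_liabilities": total_liabilities,
        "net_worth": total_assets - total_liabilities,
    }


def build_income_statement(income, expenses):
    """Control class 'build income statement', written as a plain function."""
    total_income = sum(i.amount for i in income)
    total_expenses = sum(e.amount for e in expenses)
    return {
        "total_income": total_income,
        "total_expenses": total_expenses,
        "contribution_to_savings": total_income - total_expenses,
    }


def solvency_ratio(sheet):
    """Total assets / total liabilities (infinite if nothing is owed; an assumption)."""
    if sheet["total_liabilities"] == 0:
        return float("inf")
    return sheet["total_assets"] / sheet["total_liabilities"]


def savings_rate(statement):
    """Contribution to savings / income; assumes a positive total income."""
    return statement["contribution_to_savings"] / statement["total_income"]


# Hypothetical figures, for illustration only.
balance_sheet = build_balance_sheet(
    [Asset("home", 200_000.0, 3), Asset("checking account", 5_000.0, 1),
     Asset("equity fund", 45_000.0, 2)],
    [Liability("mortgage", 120_000.0, 20.0), Liability("credit card", 5_000.0, 0.1)],
)
income_statement = build_income_statement(
    [CashFlow("salary", 60_000.0)],
    [CashFlow("housing", 24_000.0), CashFlow("food", 12_000.0),
     CashFlow("transportation", 9_000.0)],
)
```

With these sample figures the net worth is 125,000, the Solvency Ratio is 2.0 and the Savings Rate is 0.25.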
Figure 10.6 shows a simple sequence diagram that indicates a basic flow of the use case Determining the client’s financial status: data evaluation. Messages corresponding to operations on the classes are drawn so that they match the main narrative text alongside the scenario. Figure 10.7 shows the same use case scenario as the sequence diagram in Figure 10.6, but with the System expressed in terms of the analysis classes in Figure 10.5. Figure 10.8 shows the basic activity diagram for the use case Determining the client’s financial status: data evaluation elaborated with classes from Figure 10.6.
Figure 10.6. Basic sequence diagram
Figure 10.7. Sequence diagram elaborated with analysis objects
Figure 10.8. Basic activity diagram
10.4.4.2 Determining a Feasible To-be Concept
The use case scenario is as follows:
Determining a feasible To-be concept
1. Data gathering and feasibility check. The Financial Planner asks for information about the client’s future financial situation. This includes information about the client’s future incomes and expenses and the client’s assumptions on the performance of his assets. Future incomes and expenses are determined by life-cycle scenarios. The Financial Planner asks the client for his requirements on a feasible To-be concept. A To-be concept is feasible if and only if all requirements are fulfilled at every point in time.
2. Data evaluation. The Financial Planner uses two financial statements: (planned) balance sheets and (planned) income statements.
3. Data analysis. The Financial Planner analyses the results from balance sheets and income statements. He can investigate, for example, the state of the provisions and pension plan, risk management, tax charges, return on investments and other financial ratios.
Data evaluation and Data analysis are carried out in much the same way as in Determining the client’s financial status. In this section, we will concentrate our discussion on the first step: Data gathering.
Future incomes and expenses are determined by financial goals. Financial goals are not just monetary goals, although such goals are obvious and appropriate. Goals can be short term, intermediate term, and long term. Short-term goals refer to the next twelve months. What does a client want his personal balance sheet to look like in twelve months? What would he like to do during the coming year: where would he like to spend his vacation? How much would he like to spend on the vacation? What assets, such as a new computer or a new car, would he like to acquire? Intermediate goals are usually more financial in nature and less specific in terms of activities and acquisitions. As one thinks further into the future, one cares more about having the financial ability to do things and less about the details of what one will do. Most intermediate goals would likely include more than just wealth targets. For example, a client may target a date for buying his first home. Intermediate goals often address where one wants to be in five or twenty years. For most people, long-term goals include their date of retirement and their accumulated wealth at retirement. What sort of personal balance sheet does a client want to have at retirement?
The goal of a system for personal financial planning is to help a Personal Financial Planner handle the financial situation of a client in such a way that the client can maximize his or her quality of life by using money in the way that suits him or her best. Goals are about the management, protection, and ultimate use of wealth. There is a strong propensity to associate wealth or standard of living with quality of life. Wealth is normally defined as assets owned less any outstanding debts. Standard of living is measured in terms of annual income. For some people, wealth or standard of living may be the correct measure of their quality of life. For many, that is not the case.
First, many people give away substantial amounts of wealth during their lifetimes (to places of worship, charities, or non-profit organizations such as alma maters). People give gifts because they derive pleasure from it or because they have a sense of obligation. Second, many people hold jobs noted for low pay (for example religious professions such as priests) and earn little compared with what their level of educational training could earn them in other professions. A lot of people pass up lucrative corporate incomes in order to be self-employed. Some people seek jobs with no pay. Full-time, stay-at-home parents are examples of people who accept unpaid jobs based on the benefits they perceive from managing the household. Obviously, many people regularly make choices that are not wealth- or income-maximizing, and these choices are perfectly rational and important to their lives. Proper financial planning seeks to help people make the most of their wealth and income given the non-pecuniary choices they make in their lives, not merely to maximize their wealth and income.
Examples of life-cycle scenarios (see Braun 2006) are applying for a personal loan, purchasing a car, purchasing and financing a home, auto insurance, homeowner insurance, health and disability insurance, life insurance, retirement planning, and estate planning. Requirements for a feasible To-be concept as introduced in (Braun 2007) are primarily requirements on balance sheets, such as the composition of the entire portfolio,
a cash reserve, etc. By analogy with the financial goals of a company, a client may have the following two main requirements for a feasible To-be concept: assurance of liquidity (i.e. having enough money at all times to cover his consumer spending) and prevention of personal insolvency (i.e. avoiding the inability to pay).
Assurance of liquidity (liquidity management) means that one has to prepare for anticipated cash shortages in any future month by ensuring that enough liquid assets are available to cover the deficiency. Some of the more liquid assets include a checking account, a savings account, a money market deposit account, and money market funds. The more funds are maintained in these types of assets, the more liquidity a person will have to cover cash shortages. Even if one does not have sufficient liquid assets, one can cover a cash deficiency by obtaining short-term financing (such as using a credit card). If adequate liquidity is maintained, one will not need to borrow every time one needs money. In this way, a person can avoid major financial problems and therefore be more likely to achieve financial goals.
Prevention of personal insolvency can be carried out by purchasing insurance. Property and casualty insurance insures assets (such as car and home), health insurance covers health expenses, and disability insurance provides financial support in case of disability. Life insurance provides the family members or other named beneficiaries with financial support if they lose their breadwinner. Thus, insurance protects against events that could reduce income or wealth. Retirement planning ensures that one will have sufficient funds for the old-age period. Key retirement planning decisions involve choosing a retirement plan, determining how much to contribute, and allocating the contributions.
If the feasibility check fails, Personal Financial Planners can apply their knowledge and experience to balance the financial situation.
They can take actions that adjust the financial structure by changing the requirements for a feasible To-be concept, by changing future expenses (such as expenses for a car), or by financing expenses in alternative ways (for example by increasing revenues). These actions may be necessary to ensure that the main financial goals are fulfilled.
10.4.4.3 Measures to Take
The financial goals of a client determine the planning steps as gathered in Section 10.4.4.2 (Determining a feasible To-be concept) and the activities needed to achieve a feasible To-be concept.
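The feasibility check of Section 10.4.4.2 can be sketched as a projection over future months: a To-be concept is feasible if and only if every requirement holds at every point in time. The Python sketch below is a simplification under stated assumptions. The single liquidity requirement (liquid assets never fall below a chosen cash reserve), the monthly time grid, and all function names and figures are hypothetical and are not taken from FiXplan.

```python
from itertools import accumulate


def project_liquid_assets(start, monthly_income, monthly_expenses):
    """Liquid assets at the end of each future month."""
    net_flows = [i - e for i, e in zip(monthly_income, monthly_expenses)]
    # accumulate(..., initial=...) needs Python 3.8+; drop the starting value itself
    return list(accumulate(net_flows, initial=start))[1:]


def liquidity_requirement(cash_reserve):
    """Requirement: liquid assets must not fall below the cash reserve."""
    return lambda liquid_assets: liquid_assets >= cash_reserve


def is_feasible(projection, requirements):
    """Feasible iff all requirements are fulfilled at every point in time."""
    return all(req(state) for state in projection for req in requirements)


def first_violation(projection, requirements):
    """Index of the first month that violates a requirement, or None."""
    for month, state in enumerate(projection):
        if not all(req(state) for req in requirements):
            return month
    return None


# Hypothetical plan: a car purchase in the third month causes a cash shortage.
requirements = [liquidity_requirement(cash_reserve=1_000)]
income = [4_000] * 4
plan = project_liquid_assets(3_000, income, [3_500, 3_500, 9_000, 3_500])

# One possible measure: change the future expense (a cheaper car).
adjusted = project_liquid_assets(3_000, income, [3_500, 3_500, 5_000, 3_500])
```

In this example the original plan fails in the third month (index 2), while the adjusted plan keeps the cash reserve intact in every month. Further requirements, such as limits on portfolio composition, could be added as additional predicates over a richer monthly state.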
10.5 System Architecture
Architecture influences just about everything in the process of developing software. Architecture gives the rules and regulations by which the software has to be constructed. Figure 10.9 shows the general architecture of our prototype FiXplan (IT-based Personal financial planning), and the data flow between the User Interfaces, the Application Layer, and the Sources (see also Braun and Schmidt 2005).
Figure 10.9. General three-tier architecture of FiXplan (tiers: Clients, Server, Sources; diagram labels: Excel, Visual Basic, web browser, FiXplan Web Services Toolbox, Java classes, Java Server Pages, FiXplan Main System, financial applications, PHP classes, SQL, database server, external web sites; connections: SOAP/HTTP, HTML/HTTP)
Basically, the system contains the following components:
• Clients: Used by Personal Financial Planners or clients to access our main financial planning system and our Web Services. User interfaces may be web browsers such as Internet Explorer or Mozilla, or applications such as Microsoft Excel.
• Server: As described in detail in the analysis model. The Web Services Toolbox provides useful tools for personal financial planning.
• Sources: Used as a repository for the tools of the personal finance framework.
The three-tier architecture of FiXplan supports modular deployment of both device-specific user interfaces through multi-channel delivery of services and new or adapted operational processes and strategies. All connectivity interfaces are based on standard specifications. At the client side, the web browser and PHP scripts handle the user interface and the presentation logic. At the server side, the web server receives all HTTP requests from the web user and passes them to the application server, which implements the business logic of all the services for personal financial planning. At the database server side, transactional or historical data of the day-to-day operations are stored in the system database by an RDBMS. The application server sends the query to the database server and receives the result set. It prepares the response after a series of processing steps and returns it to the web server, which propagates it back to the client side. A registered user can log in and pick an existing session or create a new session according to his preferences, such as the level of user protection, access rights and level of complexity. Based on the three-tier structure, FiXplan can run in both internet and intranet environments. The user can, for example, use a web browser to access a web server via HTML and HTTP. The kernel of the system is placed on the web server.
FiXplan uses a collection of Web Services. These services communicate with each other. The communication can involve either simple data passing or two or more services coordinating some activity. Some means of connecting services to each other is needed. Figure 10.10 illustrates a basic service oriented architecture. It shows a service consumer (below) sending a service request message to a service provider (above). The service provider returns a response message to the service consumer. The request and subsequent response connections are defined in some way that is understandable to both the service consumer and the service provider. A service provider can also be a service consumer.
Figure 10.10. Basic service oriented architecture (a service consumer sends an XML service request based on WSDL to a service provider, which returns an XML service response based on WSDL; both are sent using SOAP)
All the messages shown in the above figure are sent using SOAP. SOAP generally uses HTTP, but other means of connection such as Simple Mail Transfer Protocol (SMTP) may be used. HTTP is the familiar connection we all use for the Internet. In fact, it is the pervasiveness of HTTP connections that will help drive the adoption of Web Services. SOAP can be used to exchange complete documents or to call a remote procedure. SOAP provides the envelope for sending Web Services messages over the Internet/Intranet. The envelope contains two parts:
• An optional header providing information on authentication, encoding of data, or how a recipient of a SOAP message should process the message.
• The body that contains the message.
These messages can be defined using the WSDL specification. WSDL uses XML to define messages. XML has a tagged message format. Both the service provider and service consumer use these tags.
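The envelope structure described above (an optional header plus a body of tagged values) can be illustrated with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch and not the FiXplan code: the application namespace `urn:example:fixplan`, the operation name `GetBalance`, the header field `user` and the helper names are assumptions, and neither WSDL validation nor HTTP transport is shown. Only the SOAP 1.1 envelope namespace is taken from the SOAP specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:fixplan"  # hypothetical application namespace


def build_envelope(operation, params, header=None):
    """Wrap an operation and its tagged parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header:  # the header is optional
        header_el = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        for tag, value in header.items():
            ET.SubElement(header_el, f"{{{APP_NS}}}{tag}").text = str(value)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for tag, value in params.items():
        ET.SubElement(op, f"{{{APP_NS}}}{tag}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_body(xml_text):
    """Return the operation name and its values, looked up by tag name."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    operation = body[0]

    def local(tag):
        # strip the "{namespace}" prefix that ElementTree puts before each tag
        return tag.split("}", 1)[1]

    return local(operation.tag), {local(child.tag): child.text for child in operation}


# Hypothetical request with an optional header entry.
request = build_envelope("GetBalance", {"accNr": "12345"}, header={"user": "planner"})
name, values = read_body(request)
```

Because `read_body` collects the values by tag name, the consumer does not depend on the position of an element inside the body.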
In fact, the service provider could send the data shown at the bottom of this figure in any order. The service consumer uses the tags and not the order of the data to get the data values. FiXplan is both a Web Service provider and a Web Service consumer, as shown in Figure 10.11.
Figure 10.11. Service provider and service consumer (tiers: Clients, Server, Sources; diagram labels: FiXplan Excel sheet CreditBalance.xls, Web Services Toolbox, Java classes, ShowBalance.php, FiXplan Main System, PHP classes, Bank Giro Service GetSaldo.jws, WS provider, WS consumer; messages: accNr, balance; connections: SOAP/HTTP, SOAP/HTTPS)
In this example, a Visual Basic function called from an Excel-Sheet sends a SOAP message to the PHP Web Service ShowBalance.php. ShowBalance.php calls the function GetBalance.jsp that gets the balance via a SOAP message from the Web Service of a Bank Giro Service. ShowBalance.php retrieves the balance of account accNr and constructs a SOAP message that can be transferred by a Visual Basic function in the Excel Sheet.
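The call chain of this example can be mimicked with plain functions to show how one component acts as provider and consumer at the same time. This is a deliberately simplified Python sketch: the function names, the in-memory account table, and the dictionary messages standing in for SOAP envelopes are all assumptions, and no network transport or real bank interface is involved.

```python
# Hypothetical stand-in for the bank's account data.
_BANK_ACCOUNTS = {"12345": 1_250.75}


def bank_giro_service(request):
    """WS provider at the bank: returns the balance for an account number."""
    acc_nr = request["accNr"]
    if acc_nr not in _BANK_ACCOUNTS:
        return {"fault": f"unknown account {acc_nr}"}
    return {"accNr": acc_nr, "balance": _BANK_ACCOUNTS[acc_nr]}


def show_balance(request, bank=bank_giro_service):
    """Provider towards the client, consumer towards the bank service."""
    response = bank({"accNr": request["accNr"]})  # consumer role
    if "fault" in response:
        return response
    # provider role: pass the balance back to the caller
    return {"accNr": response["accNr"], "balance": response["balance"]}


def excel_client(acc_nr, service=show_balance):
    """Stands in for the Visual Basic function called from the Excel sheet."""
    response = service({"accNr": acc_nr})
    return response.get("balance")  # None if the service reported a fault
```

Passing the downstream service as a parameter is one way to keep the roles separable, so that the middle component can be exercised with a stub in place of the bank service.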
10.6 Conclusion and Future Trends
We have presented a reference model for personal financial planning consisting of an analysis model and a corresponding system architecture. An intrinsic part of our reference model at the architecture level is the usage of web technologies. Our reference model for personal financial planning serves two purposes: first, the analysis model is a conceptual model that can serve financial planners as a decision support tool. Second, at the design stage of system development, system developers can take the analysis model and apply the system architecture to it. The reference model combines business concepts of personal financial planning with technical concepts from information technology. We evaluated our reference model by building the IT system FiXplan based on this reference model. The prototype shows that our reference model fulfils maintainability, adaptability, efficiency, reliability, and portability requirements. Furthermore, it is based on ISO and CFP Board standards. Further research is needed to extend the reference model with respect to the automation of the To-be concept feasibility check. Here, classical mathematical
programming models as well as fuzzy models may deliver a solution to that problem. While in the case of classical models the vague data is replaced by average data, fuzzy models offer the opportunity to model the subjective notions of the client or the Personal Financial Planner as precisely as they are able to describe them. Thus the risk of applying a wrong model of reality and selecting wrong solutions that do not reflect the real problem could be reduced.
References
Bass L, Clements P, Kazman R (1998) Software Architecture in Practice. The SEI Series in Software Engineering. Addison Wesley, Reading
Becker J, Schütte R (2004) Handelsinformationssysteme. Redline Wirtschaft, Frankfurt am Main
Bernus P, Mertins K, Schmidt G (2005) Handbook on Architectures of Information Systems. Springer, Berlin
Bhargava HK, Power DJ, Sun D (2006) Progress in web-based decision support technologies. Decision Support Systems, to appear
Böckhoff M, Stracke G (2001) Der Finanzplaner. Sauer, Heidelberg
Booth D, Haas H, McCabe F, Newcomer E, Champion M, Ferris C et al. (eds) Web Services Architecture, W3C. Retrieved August 5, 2005, from http://www.w3c.org/TR/ws-arch/
Braun O (2006) Lebensereignis- und Präferenzorientierte Persönliche Finanzplanung – Ein Referenzmodell für das Personal Financial Planning. In: Schmidt G (ed) Neue Entwicklungen im Financial Planning. Hochschule Liechtenstein
Braun O (2007) IT-gestützte Persönliche Finanzplanung. Habilitationsschrift, Saarland University
Braun O, Kramer S (2004) Vergleichende Untersuchung von Tools zur Privaten Finanzplanung. In: Geberl S, Weinmann S, Wiesner DF (eds) Impulse aus der Wirtschaftsinformatik. Physica, Heidelberg, pp. 119−133
Braun O, Schmidt G (2005) A service oriented architecture for personal financial planning. In: Khosrow-Pour M (ed) Managing modern organizations with information technology, Proceedings of the 2005 IRMA International Conference. Idea Group Inc, pp. 1032−1034
Briggs RO, Nunamaker JF, Sprague RH (1997) 1001 Unanswered research questions in GSS. Journal of Management Information Systems 14: 3−21
Certified Financial Planner’s (CFP) Board of Standards Inc (2005) Retrieved August 14, 2005, from http://www.cfp.net
Chou SCT (1998) Migrating to the web: a web financial information system server. Decision Support Systems 23: 29−40
Conallen J (2002) Building Web Applications with UML, 2nd edn. Pearson Education, Boston
DeVaney S, Greninger SA, Achacoso JA (1994) The Usefulness of Financial Ratios as Predictors of Household Insolvency: Two Perspectives
Dong JC, Deng SY, Wang Y (2002) Portfolio selection based on the internet. Systems engineering: Theory and Practice 12: 73−80
Dong JC, Du HS, Wang S, Chen K, Deng X (2004) A framework of Web-based Decision Support Systems for portfolio selection with OLAP and PVM. Decision Support Systems 37: 367−376
Fan M, Stallaert J, Whinston AB (1999) Implementing a financial market using Java and Web-based distributed computing. IEEE Computer 32: 64−70
Fan M, Stallaert J, Whinston AB (2000) The internet and the future of financial markets. Communications of the ACM 43: 82−88
Federal Reserve Board Division of Consumer and Community Affairs (2002) Financial Literacy: An Overview of Practice, Research, and Policy. Retrieved on October 24, 2006 from http://www.federalreserve.gov/pubs/bulletin/2002/1102lead.pdf
Fettke P, Loos P (2003) Multiperspective Evaluation of Reference Models – Towards a Framework. In: Jeusfeld MA, Pastor O (eds) ER 2003 Workshops, LNCS 2814. Springer, Heidelberg, pp. 80−91
Fowler M (1997) Analysis Patterns: Reusable Object Models. Menlo Park
Frank U (1999) Conceptual Modelling as the Core of the Information Systems Discipline – Perspectives and Epistemological Challenges. In: Haseman WD, Nazareth DL (eds) Proceedings of the 5th Americas Conference on Information Systems (AMCIS 1999). Milwaukee, Wisconsin, pp. 695−697
Greninger SA, Hampton VL, Kitt KA (1994) Ratios and Benchmarks for Measuring the Financial Well-Being of Families and Individuals. Financial Counselling and Planning 5: 5−24
Hay DC (1996) Data Model Patterns – Conventions of Thought. New York
Hirschey M, Richardson VJ, Scholz S (2000) Stock price effects of internet buy-sell recommendations: the Motley Fool Case. The Financial Review 35: 147−174
International Organization for Standardization (ISO/TC 222) (2004) ISO/DIS 22 222-1 of the Technical Committee ISO/TC 222, Personal financial planning
Jacobson I, Christerson M, Jonsson P, Overgaard G (1992) Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley, Boston
Keown AJ (2003) Personal Finance: Turning Money into Wealth. Prentice Hall, Upper Saddle River
Lang K, Taumann W, Bodendorf F (1996) Business Process Reengineering with Reusable Reference Process Building Blocks. In: Scholz-Reiter B, Stickel E (eds) Business Process Modelling. Berlin, pp. 264−290
Li ZM (1998) Internet/Intranet technology and its development. Transaction of Computer and Communication 8: 73−78
Marent C (1995) Branchenspezifische Referenzmodelle für betriebswirtschaftliche IV-Anwendungsbereiche. Wirtschaftsinformatik 37(3): 303−313
Marshall C (1999) Enterprise Modeling with UML: Designing Successful Software through Business Analysis. Reading
Mathieu RG, Gibson JE (1993) A methodology for large scale R&D planning based on cluster analysis. IEEE Transactions on Engineering Management 30: 283−291
Mehdi RZ, Mohammad RS (1998) A web-based information system for stock selection and evaluation. Proceedings of the International Workshop on Advance Issues of E-Commerce and Web-Based Information Systems
Mertens P, Holzer J, Ludwig P (1996) Individual- and Standardsoftware: tertium datur? Betriebswirtschaftliche Anwendungsarchitekturen mit branchen- und betriebstypischen Zuschnitt (FORWISS-Rep. FR-1996-004). Erlangen, München, Passau
Mišić VB, Zhao JL (1999) Reference models for electronic commerce. Proceedings of the 9th Hong Kong Computer Society Database Conference – Database and Electronic Commerce. Hong Kong, pp. 199−209
Mukherjee K (1994) Application of an interactive method for MOLIP in project selection decision: a case from Indian coal mining industry. International Journal of Production Economics 36: 203−211
Nissenbaum M, Raasch BJ, Ratner C (2004) Ernst & Young’s Personal Financial Planning Guide, 5th edn. John Wiley & Sons
Remme M (1997) Konstruktion von Geschäftsprozessen – Ein modellgestützter Ansatz durch Montage generischer Prozesspartikel. Wiesbaden
Rumbaugh J, Jacobson I, Booch G (2005) The Unified Modelling Language Reference Manual, 2nd edn. Addison-Wesley, Boston
Samaras GD, Matsatsinis NF, Zopounidis C (2005) Towards an intelligent decision support system for portfolio management. Foundations of Computing and Decision Sciences 2: 141−162
Scheer A-W (1994) Business Process Engineering – Reference Models for Industrial Companies, 2nd edn. Berlin
Schmidt G, Lahl B (1991) Integration von Expertensystem- und konventioneller Software am Beispiel der Aktienportfoliozusammenstellung. Wirtschaftsinformatik 33: 123−130
Schmidt G (1999) Informationsmanagement. Springer, Heidelberg
Schmidt G (2006) Persönliche Finanzplanung – Modelle und Methoden des Financial Planning. Springer, Heidelberg
Scholz-Reiter B (1990) CIM – Informations- und Kommunikationssysteme. München, Wien
Schütte R (1998) Grundsätze ordnungsmäßiger Referenzmodellierung – Konstruktion konfigurations- und anpassungsorientierter Modelle. Wiesbaden
Turban E (1995) Decision Support and Expert Systems, 4th edn. Prentice-Hall, Englewood Cliffs
Woerheide W (2002) Core concepts of Personal Finance. John Wiley & Sons
Wüstner E, Buxmann P, Braun O (2005) The Extensible Markup Language and its Use in the Field of EDI. In: Bernus P, Mertins K, Schmidt G (eds) Handbook on Architectures of Information Systems, 2nd edn. Springer, Heidelberg, pp. 391−419
Youngs R, Redmond-Pyle D, Spaas P, Kahan E (1999) A Standard for Architecture Description. IBM Systems Journal 38. Available at http://www.research.ibm.com/journal/sj/381/youngs.html
Zahedi F, Palma-dos-Reis A (1999) Designing personalized intelligent financial decision support systems. Decision Support Systems 26: 31−47
Zopounidis C, Doumpos M, Matsatsinis NF (1997) On the use of knowledge-based decision support systems in financial management: a survey. Decision Support Systems 20: 259−277
CHAPTER 11 Internet Payments in Germany Malte Krueger, Kay Leibold
11.1 Introduction: The Long Agony of Internet Payments
Ten years ago, internet payments seemed to be a new and quickly expanding universe, ready to be taken over by new and fanciful payment methods. New companies were taking, if not the market, then at least the headlines by storm. The year 1994 saw the foundation of new upstarts such as DigiCash, CyberCash and First Virtual, and 1994 was also the birth year of SSL (the current standard for encoding confidential information sent over the internet). Entrepreneurs were not the only ones busy in this year. Regulators had also become aware of innovations in the payment system. In particular, DigiCash’s pilot with ‘Cyberbucks’ that could be sent over the internet as a means of payment triggered high hopes and grave fears. Evangelists of the New Economy regarded Cyberbucks as THE future internet payment system, monetary reformers hoped that systems such as Cyberbucks would free money users from the malicious influence of the state and, as a mirror image, central bankers feared the loss of monetary control. Thus, it is not surprising that the European Monetary Institute (the precursor of the European Central Bank (ECB)) came out with its ‘Report on Electronic Money’ as early as 1994 (EMI 1994). Subsequently, many more reports were written. The EU Commission prepared a regulatory scheme for e-money and, after a long period of haggling between the European Commission, member states and the ECB, the ‘e-money directive’ was passed (EU Directive 2000).¹
In the US, the authorities were less worried about internet payment systems. In particular, the Fed took a much more relaxed view of e-money (Greenspan 1997). However, the image of a laissez-faire regime in the US – as compared to tight regulation in the EU – is somewhat mistaken (Krueger 2001). In the US, e-payment providers have to comply with state regulations. These regulations may be less severe than the EU
240
Malte Krueger, Kay Leibold
Today, the frenzy about internet payments has died down. Most of the pioneers of internet payments went into bankruptcy. Big schemes with powerful supporters (for instance Secure Electronic Transaction (SET) championed by the credit card organisations) never got any where. One of the few exceptions is the Peer-to-Peer (P2P) payment scheme PayPal which has taken the US market by storm and is currently expanding into other regions of the world. Apart from PayPal there are few success stories. There are three important factors that may help to explain why innovative new internet payment schemes did not take off. • First, traditional means of payment – used in the offline world – proved surprisingly adept for internet usage. • Second, internet payments in general and security in particular were seen as technical problems awaiting technical solutions. • Third, due to the pervasive availability of free content, the market for internet payments – in particular for micro payments – grew much more slowly than expected. Below we will take a look at the current state of the market and some of the important drivers.
11.2 Is There a Market for Innovative Internet Payment Systems? 11.2.1 Use of Traditional Payment Systems: Internet Payments and the Role of Payment Culture What could be observed over the last ten years was the successful adaptation of real world payment systems to the internet. One of the results of this trend is that the supposedly borderless internet exhibits surprisingly large national differences when it comes to payments. In retail payments, usually, a distinction is made between giro and check countries. Typical check countries are the US and the UK whereas many continental European countries are typical giro countries (see the following table). When looking at Point of Sale (POS) payments, in the typical check country, the main payment media are cash, checks and credit cards (with debit cards catching up fast, at the moment). In remote retail payments, the check is clearly dominating. In particular, it is still used for most recurring payments such as rent, electricity payments or salaries. In giro countries, the main POS payment media are cash and e-money regulation. However, the EU e-money directive has the advantage of providing a ’passport’ for the whole of the EU.
11 Internet Payments in Germany
241
Table 11.1. Relative importance of cashless payment instruments in 2002 Check Germany Belgium France Italy Netherlands Sweden Switzerland United Kingd. United States
Card-based e-money 0.2 5.4 0.1 0.6 3.9 Nav 1.8 Nav Neg
Percentage of total volume of cashless transactions. Source: (BIS 2006)
debit cards with credit cards a distant third. For remote payments, people use either credit transfers, standing orders or direct debits – all of which can be summarised under the label ‘giro payment’ or ‘Automatic Clearing House (ACH) payment’. Over the last 20 years, the move from paper to electronic initiation and processing of payments has been notably different in the two payment cultures. Whereas check payments are still mostly paper-based, giro countries have been very successful in reducing the share of paper-based payments. The processing is almost completely electronic and even the initiation of payments is relying less and less on paper as an increasing number of customers uses phone banking or online banking. In Germany, for instance, there are more than 33 million online accounts (out of a total of 85 million giro accounts). From the point of view of customers, the giro system has a number of advantages. Recurrent payments do not require any intervention by the payer and it is possible to initiate electronic P2P payments by phone or PC.
Table 11.2. Relative importance of various internet payment systems

Country           Bank cards
Belgium           50
Denmark           10
France            51
Germany           24
Netherlands       26
United Kingdom    60
United States     > 80

[The columns "Direct debits", "Giro transfer", "Cash upon del." and "Other" cannot be aligned with countries in this copy. Surviving values, in extraction order: Giro transfer 33; Cash upon del. 9; 30, 23; 6, 6, 4; Other 8; 90, 21, 30; 43, 19, 47, 10.]

Sources: Adams (2003), BIS (2004), Celent (2002), FEVAD (2006), insites (2004a) and (2004b), Krueger et al. (2005) and own estimates
The observed differences in payment cultures in the offline world have had a strong impact on payment patterns in the online world. In giro countries, credit transfers and direct debits have gained a large share in Business-to-Consumer (B2C) e-commerce. By contrast, in check countries the credit card clearly dominates with a share of 60% and more. Even more striking is the difference in the area of P2P payments. Whereas in giro countries credit transfers can easily be used for P2P payments, in check countries there is no widely accessible payment medium for electronic P2P payments. That has opened a market for newcomers such as PayPal. When it comes to international payments, there is more convergence between the two payment cultures in the area of Consumer-to-Business (C2B) payments. X-border payments are generally dominated by the credit card. However, with the EU directive on x-border payments, credit transfers have become an attractive option for both C2B and Consumer-to-Consumer (C2C) payments within the EU. On the whole, an innovative payment provider may find a very different payment environment depending on the geographic market it wants to enter. With the spread of online banking and the increasing use of online credit transfers, the European market poses a bigger challenge for suppliers of alternative payment systems than the US market.
11.2.2 The Role of Technology

The early developments of internet payment systems were very much technology-driven.2 Security and, in some cases, anonymity were seen as key elements of a successful internet payments product. One example is the ECash system offered by DigiCash. ECash was designed as an almost cash-like means of payment for the internet. It allows users to generate and store ‘one-way tokens’ on their PC which can be spent almost like cash. To provide anonymity, ECash used blind signatures, a technique that had been developed by David Chaum. After trials in 1994, the first commercial roll-out was undertaken by Mark Twain Bank in the US. Subsequently, DigiCash licensed the technology to a number of banks in various countries (Merita Bank in Finland, the Swedish Post or Advance Bank in Australia). In Germany, Deutsche Bank acquired an ECash license and started a pilot in 1997.

2 A comprehensive overview of the early systems can be found in (Furche and Wrightson 1997).

Another prominent example is CyberCash, a US payment service provider. CyberCash Inc. co-operated with Landesbank Sachsen and Dresdner Bank to make CyberCash the German internet payment standard. To use CyberCash, both consumer and merchant have to download and install CyberCash software. This software allows them to encrypt all payment messages sent between consumer and merchant and between merchant and payment gateway. In Germany, CyberCash and its partners offered three distinct types of internet payment: Cybercoin (a prepaid micropayment system), electronic direct debit and credit card payments.

Maybe the most prominent example of these early developments is SET. Security concerns with respect to the use of credit cards in open networks prompted the two large credit card companies EuroCard/MasterCard and Visa to develop the SET standard. The system is based on digital signatures and heavy encryption. It was piloted in 1997 and subsequently promoted as a secure standard for credit card payments on the internet.

Although all of these solutions found the support of large players in the field of payments, none succeeded. DigiCash (in 1998) and CyberCash (in 2001) had to declare bankruptcy, and SET has been revised and re-branded over and over again without achieving critical mass. What went wrong? All of these systems relied on users to install extra software on their PCs. This not only proved time-consuming and cumbersome; the wallet approach also limited portability. Given the strong preference of payment system users for convenience, this approach did not meet with enthusiasm among customers. Merchants often found it difficult to integrate the new solutions into their shop systems. Moreover, while anonymity may be positive from the point of view of customers, merchants want to know their customers. Finally, the new systems were based on a narrow concept of ‘security’. Security was defined as protection against interference by third parties (‘hackers’) and as irrevocability of payments. However, a system with these traits leaves an important risk with customers: the risk that the merchant does not deliver. A second wave of internet payment systems tries to remedy the shortcomings of the first wave (Böhle 2001).
In particular, in order to provide portability, a server-based approach is used and SSL has become the encryption standard. Since SSL is integrated into standard browser software, customers no longer need to download additional encryption software. In payment systems with a server-based approach, customer data and account information are stored on a central server. Customers can access this server from any entry point to the internet. Authentication and authorisation are usually carried out with user names and passwords. Prominent examples are PayPal, FirstGate and the paysafecard. The latter is particularly interesting because it shows that anonymous internet payments can be provided with a system based on fairly basic technologies (i.e. a simple number tied to a prepaid online account). Companies in the second wave also follow a more focussed approach. For instance, PayPal initially concentrated strongly on the P2P segment and on payments in connection with online auctions. FirstGate is focussing on digital goods and offers merchants a variety of additional services. paysafecard is eyeing the segments of the online market where anonymity is valued and seeks to position its system as a marketing tool.
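The blind-signature technique mentioned above in connection with ECash can be illustrated with textbook RSA. The following is a minimal sketch and not DigiCash's actual protocol: the tiny key, the function names and the token value are assumptions chosen for readability, and unpadded RSA with such small numbers offers no real security. It only shows the principle that the bank signs a token it cannot see, while the resulting signature still verifies against the unblinded token.

```python
import math
import secrets

# Toy RSA key of the issuing bank (tiny primes, for illustration only).
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (modular inverse)

def blind(token):
    """Customer hides the token with a random factor r before sending it to the bank."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:        # r must be invertible modulo N
            return (token * pow(r, E, N)) % N, r

def sign_blinded(blinded):
    """Bank signs the blinded value without learning the token itself."""
    return pow(blinded, D, N)

def unblind(blind_sig, r):
    """Customer removes the blinding factor and obtains a signature on the token."""
    return (blind_sig * pow(r, -1, N)) % N

def verify(token, sig):
    """Anyone can check the bank's signature with the public key (N, E)."""
    return pow(sig, E, N) == token % N

token = 1234                           # hypothetical token serial number, must be < N
blinded, r = blind(token)
signature = unblind(sign_blinded(blinded), r)
print(verify(token, signature))
```

Because the bank only ever sees the blinded value, it cannot later link the token presented by a merchant to the withdrawal in which it was signed; this is the property that made the approach attractive for cash-like anonymity.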
11.2.3 The Role of Different Business Models

11.2.3.1 Two Types of C2B E-commerce

In the real world, it can be observed that different goods are paid for with different payment systems. Bread is mostly paid for in cash, rent is often paid by bank transfer or check, and car hire is generally settled by credit card. In the world of e-commerce, a particularly important distinction is between soft and hard goods (or digital and ‘real’ goods). Soft goods can be bought online and immediately delivered over the internet. Hard goods have to be mailed or picked up in person. Soft goods are also often of low value compared to hard goods.

From the point of view of payments, e-commerce in hard goods does not pose any new problem. After all, it is just another form of distance trade. Long before the internet, distance trade was well established, for instance in the form of mail order business. Mail order merchants have successfully dealt with the problem that customers are far away and that a direct exchange of money and goods is impossible. Thus, the early claim that e-commerce requires new types of payment and cannot flourish without such innovative e-payment systems seems unfounded. In fact, in many countries, mail order merchants have successfully embraced e-commerce. Conversely, pure e-tailers are growing in size and sophistication and are increasingly employing the risk management techniques required for distance trade. Such techniques allow them to make use of post-payment systems such as bank transfer or check payment after delivery. Thus, in the segment of hard goods there does not seem to be an urgent demand for innovative internet payment schemes.

Many suppliers of innovative internet payment solutions seem to have set their eyes on e-commerce in soft goods. However, the wide availability of free goods has kept demand for payment products in check.
It has repeatedly been argued that the ‘free lunch culture’ of the internet is inefficient and unlikely to survive. The laws of economic theory are invoked which supposedly state that the user has to bear the costs. Thus, it is expected that the share of goods that have to be paid for will rise in the future as the internet becomes a ‘normal’ market place. When looking at the current situation, it is indeed true that in the area of digital content web surfers can find a wide choice of free offerings. In Germany, for instance, companies offering digital content on the internet derive 95% of their earnings from advertising and 5% from paying customers (see Figure 11.1). However, such figures do not prove that the business model is unsustainable or likely to change. Indeed, a look at the structure of earnings of the content industry in the real world shows that a high share of earnings from advertising is not unusual. To be sure, the share of advertising is not as high in the offline world as in the online world. Thus, there is hope for e-payment providers.

Figure 11.1. Digital content in the online and offline world (Germany 2003), Source: VDZ

However, when judging the prospective market two factors have to be taken into account. First, the charging model can be ‘per click’ or subscription. Subscription can be handled very well with established payment systems (for instance standing order or direct debit). Thus, this segment of the market does not create a demand for innovative internet payment systems. The problem, from the point of view of e-payment providers, is that subscription can cover a fairly large chunk of the market. Looking at print media in Germany, 50−90% of sales revenue comes from subscription. US figures for online paid content equally show a high fraction of subscription: 80% of sales revenues in the first half of 2005 (OPA 2006).

What about economic theory? Should not the user be the one who pays? Well, there does not seem to be any economic law that says it is inefficient if a third party (the advertiser) voluntarily pays. What economic theory does say is that price should equal marginal costs. On that count, a price close to zero does not seem to be wide of the mark. Producing content is a fixed cost business. Marginal costs mainly consist of the costs of distribution. Thus, a structure as in Figure 11.1, with a higher share of advertising revenue in the online world than in the offline world, does not seem inefficient.

It should not be overlooked, however, that the future development can be very different, depending on the particular market segment. Music is a prominent example. The potential market for online sales is huge. The current losses due to file sharing etc. are substantial. For years, the music industry has been grappling with the problem without finding a solution. Now Apple has come up with the iPod and created a new business model: downloading music at a price of 99 cents per song. Apple’s sales are impressive. Up to September 2006, iTunes has sold over 1.5 billion sound files worldwide (200 million in Europe). The example seems to show that there is life in the ‘pay per click’ model after all. However, nagging questions remain. First, Apple’s success may comfort the digital content industry but not innovative payment providers. So far, Apple is simply using the credit card as payment instrument.
Recently, PayPal has been added as another payment option. To keep costs in check, Apple is aggregating credit card transactions. Thus, this is in fact another example of how adaptive established payment methods such as the credit card are. Second, for the moment, Apple seems to provide a possible new business model for the industry as a whole. However, at 99 cents the price of a song is way above marginal costs. In such a situation economic forces should drive down the price. The only thing that could prevent this is an effective copy protection that is not hacked within weeks or months. Falling prices would be bad news for the music industry. But they might ultimately trigger demand for some type of micro payment scheme that is capable of dealing effectively with transactions well below 1 US$. A German player that seems well positioned to benefit from this development is FirstGate. Other potential contenders are T-Pay and WEB.DE.

11.2.3.2 C2C Payments and the Success of PayPal

PayPal is by far the most successful new internet payment scheme. In the US, it has grown phenomenally and it is currently expanding into other countries. The question is whether PayPal will be as successful in these countries as in the US. As has been discussed above, in giro countries there are already electronic C2C payment mechanisms. Thus, in these countries, there is no gap in the payments market, as there is in the US, that needs to be filled. For instance, the common means of payment for eBay purchases is the credit transfer, mostly initiated via the internet. Thus, for eBay transactions, PayPal does not fill a gap but competes with established means of payment. As in other markets, the essential parameters of competition are service and price. With respect to prices, PayPal has already been forced to go below the price level in the US. With respect to service, PayPal seems to offer more convenience to its users than most traditional payment methods used on the internet. Most notably, the payment process can be integrated into the eBay purchasing process, allowing the seller to automatically fill in the PayPal payment order for the buyer. Another important ‘plus’ is the immediate notification of the payee that payment has been made. Finally, PayPal also offers buyer protection.
PayPal has to compete with the credit transfer, a payment mechanism that is well known to European users. Furthermore, although a credit transfer takes longer (2−3 days) for funds to travel from payer to payee, the funds are credited directly to the giro account of the seller. In the case of a PayPal payment, they still have to be transferred from the PayPal account to the giro account. Finally, banks and payment service providers are implementing systems that allow the credit transfer to be integrated into the shopping process.3 Given the competition from new innovative internet payment systems and established payment methods, it is still open whether PayPal can carry its momentum from the US to continental Europe. At the moment, the use of PayPal in the German market is still limited, but growing.

Table 11.3. PayPal results Q3 2006

Total revenues (USD million)               340
Number of accounts (million), total        122.5
Number of accounts (million), Germany      > 3
Active accounts (million)*                 30.9
Total payment volume (USD million)         9,123

Source: eBay Inc. financial results and own estimates
* Accounts that made or received at least one payment in the last quarter
11.3 Internet Payments in Germany: A View from Consumers and Merchants

Like every market, the payment services market has a supply and a demand side. On the supply side there are payment service providers (banks, e-money institutions and others). On the demand side there are merchants and consumers. From the merchants’ point of view, payment instruments should include a payment guarantee, should be user-friendly and cheap, and should provide a wide reach (in terms of customers). Consumers desire safe and simple payment methods which are widely accepted and secured by the legal framework. When designing and implementing an internet payment system, payment solution providers and payment service providers have to take the sometimes conflicting interests of these two groups into account.
11.3.1 The Consumers’ View

The information on consumers’ views provided in this section is based on an online survey carried out by the Institute for Economic Policy Research in 2005. 15,342 participants took part in the 8th online survey ‘Internet payment systems: The consumers’ view’. Most of the participants are male (72%), relatively experienced in using the net (93%), have been online for at least three years (88%) and are highly educated (37% have a university degree); the largest age group is 26 to 35 years (29%). The survey reveals that when it comes to internet payments, traditional payment methods known from the offline world dominate (see Table 11.4). Almost 78% have used the online transfer (credit transfer via internet banking), more than 62% have used cash on delivery, around 55% have used debits and 53% a paper-based credit transfer. Finally, 47% have used credit cards. New internet payment systems such as collection/billing, e-mail systems, prepaid systems or mobile payments have been used much less.

3 The German internet payment system ‘giropay’ (based on online banking) has been launched by a group of German banks in co-operation with PayPal. It can be used to fund PayPal transactions.
Table 11.4. Which internet payment system have you been using when purchasing on the internet?

Payment system                   %
Mobile phone                     1.5
Prepaid systems                  9.0
E-mail systems                   18.1
Collection/Billing               29.6
Paper-based debit                36.3
Credit card                      47.3
Paper-based credit transfer      52.9
Online debit                     54.2
Cash on delivery                 62.5
Online credit transfer           77.6

Source: (Krueger et al. 2005). N = (12518−13234), in per cent of participants, multiple entries possible
The survey also shows that consumer behaviour towards hard and soft goods differs. Whereas 97% of participants have bought hard goods, only 65% have bought soft goods. But it is noteworthy that the willingness to pay for digital goods has been rising in 2005. In particular, the very experienced participants are willing to pay for digital goods (87%). As shown in Figure 2, the credit card is used most often for paying for digital goods. Hard goods, by contrast, were mostly paid for by credit transfer, with direct debits in second place. The low average price of many soft goods (songs, articles, pictures, etc.) and the possibility to make a payment and receive the purchased goods instantaneously seem to favour collection/billing systems, which come second. High-value e-goods were more often paid for by online banking or credit card.

The security of e-commerce and e-payments is a much discussed topic. Our analysis suggests that lack of appropriate payment methods may not be a big issue. This seems to contradict the common perception that customers are reluctant to purchase via the internet because they are concerned about the security of internet payment systems. 26% of participants of IZV8 have had negative experiences with the online purchase process. This contrasts with ‘only’ 8% who have had negative experiences with the online payment process (see Figure 11.2). Fortunately, both percentages have been decreasing over time. Thus, the security of internet payments does not seem as bad as is often assumed. Still, in spite of these results, the subjective impression of the security of e-payments is much less reassuring: 61% of the participants feel unsafe when buying goods online. To be sure, the respondents in the survey are mostly experienced internet users. Thus, the results are not representative of the population as a whole.
However, they demonstrate that the lack of trust with respect to e-commerce and the internet may reflect low confidence in the internet in general rather than a specific reservation against particular payment products. Thus, once people get used to using the net, they also become more confident about carrying out transactions over the net. If this is true, new payment systems may not be required to enhance consumer trust. Rather than looking for new ways of payment, innovators should seek to make existing payments more user-friendly. After all, the payments industry is, to a large extent, a service industry.

Figure 11.2. Negative experience when purchasing/payment on the internet, Source: (Krueger et al. 2005)
11.3.2 The Merchants’ View

The merchants’ view is based on data from the third merchant survey conducted by ECC-Handel and the Institute for Economic Policy Research in 2005 (see Van Baal 2005). In this survey, 449 companies took part. With respect to payment methods offered, the merchant survey confirms the results of the consumer survey. Most merchants offer payment methods which are known from the offline world. Special internet payment systems are rare. Pre-paid credit transfer (82%), cash on delivery (COD) (64%), credit transfer after invoice (49%) and direct debit (47%) are named most often. Around 30% of the shops accept credit cards (SSL or CVV/CVC-based) or PayPal. Most of the dedicated internet payment systems, for instance telephone-based systems, collection/billing systems or prepaid systems, are only accepted by a small fraction of internet merchants. However, in some segments of the market, these payment systems may have a higher penetration rate.

Figure 11.3. Which methods of payment do you accept? (n = 368), Source: (Van Baal 2005)
[Stacked bar chart. Answer categories: Yes; No, but planned for 2006; No, not planned. Payment methods shown: payment in advance, cash on delivery, invoice, debit entry, credit card with SSL encryption, PayPal, credit card verified by CVC/CVV, credit card with verification, other, collection/billing systems, online transfer, credit card without encryption, prepaid solutions, payment platform, mobile payment, GeldKarte, payment per telephone.]

As in the consumer survey, the adoption of a payment system depends on the type of goods (see Figure 11.4). Shops which sell digital goods, e.g. music, in particular require dedicated online payment systems. Digital goods can be delivered ‘per click’ and therefore payments should be just as fast. Therefore, the share of direct debits, collection/billing and credit cards is relatively high, whereas credit transfers (for prepayment or after invoice) and COD are not widely offered by merchants selling digital goods. When selling physical goods, the rate of diffusion is more important than the speed of payment. Therefore invoice, prepayment and COD are more common in this area.
Figure 11.4. Does your company offer the following payment methods for sales on the internet? Share of companies answering ‘yes’, Source: (Van Baal 2005) (n1 = 321 suppliers of physical goods, n2 = 54 suppliers of digital goods, n3 = 74 suppliers of both types of goods; * sign. at 90%, ** sign. at 95%, *** sign. at 99%)
[Grouped bar chart, scale 0 to 100%. Series: physical goods; digital goods; physical and digital goods. Payment methods shown: payment in advance***, cash on delivery***, invoice***, debit entry, credit card, PayPal**, other, collection/billing systems***, online transfer***, prepaid solutions***, payment platform***, mobile payment***, GeldKarte*, payment per telephone***.]
In the media, a lot of attention has been given to the risks consumers face when purchasing over the internet. However, in many cases, merchants are the ones facing most of the risk. Many merchants offer delivery before payment has been received, or even before it has been initiated. In addition, they often accept means of payment that allow payment reversals by the payer. Therefore, from the point of view of the merchant, risk management is essential. The survey shows that risk is, indeed, an issue and that merchants use different strategies to deal with risk. The typical merchant participating in the survey reports a loss rate between 0.6% and 1% (Figure 11.5). Considered on its own, this rate is not as dramatic as media reports suggest. However, it compares unfavourably with the offline world, where the loss rate for direct debits initiated at the POS is 0.12%. Moreover, a substantial portion of online traders experience far higher loss rates: 8% suffer losses above 5%, and 2.5% even suffer losses of more than 10%. These results show how important proper risk management is.
Figure 11.5. How large are internet payment losses for your company in percent of your turnover? (n = 282; estimates of participants; for example, 18.1% of the participants have payment losses of 1.1 to 3% of their turnover), Source: (Van Baal 2005)
[Bar chart. Share of participants per loss bracket: 0−0.5%: 37.6%; 0.6−1%: 23.8%; 1.1−3%: 18.1%; 3.1−5%: 12.4%; 5.1−10%: 5.7%; more than 10%: 2.5%.]
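The shares in Figure 11.5 can be summarised with a few lines of arithmetic: which bracket contains the median merchant, and what share of merchants reports losses above 5% of turnover. The sketch below uses the values from the figure; the variable and function names are illustrative, and reading the "typical" loss rate as the median bracket is an assumption rather than the survey's stated method.

```python
# Share of merchants (in %) per loss bracket, as reported in Figure 11.5.
LOSS_BRACKETS = [
    ("0 - 0.5%", 37.6),
    ("0.6 - 1%", 23.8),
    ("1.1 - 3%", 18.1),
    ("3.1 - 5%", 12.4),
    ("5.1 - 10%", 5.7),
    ("more than 10%", 2.5),
]

def median_bracket(brackets):
    """Return the first bracket at which the cumulative share reaches 50%."""
    cumulative = 0.0
    for label, share in brackets:
        cumulative += share
        if cumulative >= 50.0:
            return label
    raise ValueError("shares do not reach 50%")

def share_from(brackets, first_label):
    """Sum the shares of first_label and of all higher brackets."""
    labels = [label for label, _ in brackets]
    start = labels.index(first_label)
    return sum(share for _, share in brackets[start:])

# Bracket that contains the median merchant.
print(median_bracket(LOSS_BRACKETS))
# Share of merchants with losses above 5% of turnover, rounded to one decimal.
print(round(share_from(LOSS_BRACKETS, "5.1 - 10%"), 1))
```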
Figure 11.6. The following instruments are suitable to restrict the risk of payment losses. Which does your company use? (251 ≥ n ≥ 217), * cash on delivery
[Stacked bar chart, scale 0 to 100%. Answer categories: Used; Not used. Instruments shown: selective use of COD* or advance payment, checking of customer data, internal check of blocking list, delivery generally only with cash on delivery, address check, check of customer credit-worthiness, amount limit for new customers, own scoring for risk valuation, risk-dependent offer of payment systems, payment guarantee by payment system, identification of customer by bank, insurance for payment losses by 3rd party.]
Risk management starts with the selection of appropriate payment methods. In addition, to minimize risk, other instruments like third party guarantees, address verification or selective strategies can be implemented. According to the results (Figure 11.6), many merchants prefer simple solutions. Selective use of COD or payment in advance was mentioned most often (70%). Checking account or card numbers comes second (55%). Use of external data and scoring models is much less popular.
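A simple strategy of this kind, such as the selective use of COD or advance payment combined with an amount limit for new customers, can be written down as a small rule set. The sketch below is purely illustrative: the thresholds, field names and the function `offered_methods` are hypothetical and are not taken from the survey.

```python
from dataclasses import dataclass

# Illustrative grouping: methods where the merchant is paid before or on delivery,
# and methods where the merchant delivers first and bears the payment risk.
SAFE_METHODS = ["payment in advance", "cash on delivery"]
RISKY_METHODS = ["direct debit", "invoice"]

@dataclass
class Order:
    amount: float            # order value in EUR
    is_new_customer: bool
    address_verified: bool
    on_blocking_list: bool

def offered_methods(order, new_customer_limit=100.0):
    """Return the payment methods a merchant might offer for this order."""
    # Customers on the internal blocking list only get pre-payment.
    if order.on_blocking_list:
        return ["payment in advance"]
    # Unverified address: safe methods only.
    if not order.address_verified:
        return list(SAFE_METHODS)
    # New customer above the amount limit: safe methods only.
    if order.is_new_customer and order.amount > new_customer_limit:
        return list(SAFE_METHODS)
    # Otherwise post-payment methods are offered as well.
    return SAFE_METHODS + RISKY_METHODS

print(offered_methods(Order(250.0, True, True, False)))
print(offered_methods(Order(40.0, False, True, False)))
```

The design choice here mirrors the survey finding that merchants favour simple instruments: every rule is a plain check on data the shop already holds, with no external scoring involved.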
11.4 SEPA

For a long time, retail payments were not an area of particular interest to regulators. However, this has been changing over the years. Increasingly, consumer protection laws, antitrust laws and prudential supervision are becoming relevant for retail payments. Moreover, in the EU, the move towards a single market has become increasingly important for the area of payments. The EU Commission has set the goal of a Single Euro Payments Area (‘SEPA’) and is putting a lot of pressure on the payments industry to abolish all differences between intra-EU payments and national payments.

While trade in goods and services has become ever more integrated within the EU, payment systems have mostly been confined within national borders. Even after the introduction of the Euro, x-border payments within the EU remained expensive and cumbersome. The EU Commission became increasingly irritated by this state of affairs and finally resorted to regulation to change it. In 2001 it introduced Regulation (EC) No 2560/2001 on cross-border payments in EUR to mandate equal pricing for national and x-border payments denominated in EUR. Regulation 2560/2001 has become something like a wake-up call for the EU banking industry. In 2002, 42 banks and the three European Credit Sector Associations launched the European Payments Council (EPC). The task of the EPC is to coordinate the activities of the banking industry to achieve the integration of the euro area payment system. The main activities are the creation of a European credit transfer, a European direct debit system and European standards for card and cash payments. Additionally, the EPC works on internet payments and mobile payments. The EPC has set an ambitious timeline that envisions the introduction of new European payment instruments by 2008 and the gradual adoption of these instruments until 2010. The EPC co-operates with the European Central Bank and the EU Commission.
Both institutions are putting a lot of pressure on banks to stay within the proposed time framework. It is still a point of debate whether European and national solutions should coexist or whether ultimately, there should only be European solutions. The banks favour an approach that allows for co-existence. But the EU Commission and the European Central Bank are strongly in favour of discontinuing all national solutions. It remains to be seen whether shutting down all national solutions will always benefit customers. Thus, few expect that a European debit card scheme will be as
cheap as the Dutch PIN-system. Consequently, Dutch merchants will not see themselves as winners of SEPA. Equally, a European direct debit may offer less protection to bank customers than the current German system, which is highly popular with German account holders. The drive towards SEPA also has important implications for internet payments. As has been shown above, the world of internet payments mirrors the offline world in many respects. Thus, the only truly pan-European means of payment is the credit card. Most other methods are more or less national. To some extent, specialised service providers have tried to overcome this situation. Companies like Bibit/RBS Worldpay and GlobalCollect have tried to connect to a large number of national systems in order to let merchants offer customers in various countries their local payment instruments. This has worked up to a point. However, such a procedure obviously comes at a cost. Moreover, the EU Commission as well as the ECB are pressing for a genuinely European system that does not involve any cross-country differences. Thus, as the standard payment systems of the offline world are transformed, the same will have to happen to internet payment systems. This process will partly be automatic. For instance, once there is a European direct debit, merchants can offer direct debit throughout Europe. This definitely is a plus. However, it remains to be seen how consumers will react, because the European direct debit will not be quite the same as the direct debit to which they have grown accustomed. Thus, in countries like Germany, consumers may well end up with fewer rights with respect to direct debit payments.
11.5 Outlook The payment requirements of e-commerce have been met to an astonishing degree by existing payment systems. In some cases, existing systems were used without any change (paper check, cash on delivery, credit transfer). In other cases, there were smaller modifications (credit card or debit payment without signature). Finally, there are cases of more complex adaptations to the internet (Verified by Visa, Secure Code, the integrated online credit transfer). The strong performance of existing payment systems has made it difficult for innovative newcomers to enter the market. This is particularly true for the giro countries, which found it easier to adapt a largely paperless system to the new requirements imposed by e-commerce. But even in the check countries, a similar development may follow. Moving from paper check to e-check may ultimately lead to a system that looks very much like an electronic debit system. In addition, ACH payments are gaining market share. Thus, in the long run, there may be convergence between the two systems. Where would that leave new internet payment schemes? Strictly speaking, if the US moved towards a giro system, there would be much less need for a C2C system such as PayPal. However, in the US, PayPal has reached critical mass already and may therefore be able to survive, in particular if it can manage to gain a high reputation for Quality of Service (QoS).
That would leave only micropayments as a potential market for innovative new payment schemes. The problem is that the size of this market is constrained by the prevalence of advertising finance and of subscription. Moreover, aggregation can be used to reduce the per-transaction costs of conventional payment methods. Still, many new firms have set their hopes on micropayments and the ‘pay per click’ model. In Germany, for instance, there are Firstgate click&buy, T-Pay and Web.cent, to name but a few. These payment systems offer convenient ways of billing small amounts. The other side of the coin is the low diffusion of these systems and their restricted application area. Thus, it remains to be seen whether such systems will be able to gain critical mass.
References
Adams J (2003) How Europe pays. European Card Review March/April: 11−16
Bank for International Settlement (BIS) (2006) Statistics on payment and settlement systems in selected countries (‘Red Book’). Basel
Bank for International Settlement (BIS) (2004) Survey of e-money and internet and mobile payments. Basel
Böhle K (2001) The Potential of Server-based Internet Payment Systems – An attempt to assess the future of Internet payments – ePSO Background Paper No. 3. Institute for Prospective Technological Studies, Sevilla (http://epso.jrc.es/Docs/Backgrnd-3.pdf)
Celent (2002) Innovation in Internet Payments: The Plot Against Credit Cards. Celent Communication, New York
European Central Bank (ECB) (2004) Payment and securities settlement systems in the European Union (‘Blue Book’). Frankfurt/M
EU Directive (2000) Directive 2000/46/EC of the European Parliament and of the Council of 18 September 2000 on the taking up, pursuit and prudential supervision of the business of electronic money institutions. Official Journal of the European Communities L 275 of 27 October
EU Regulation (2001) Regulation (EC) No 2560/2001 of the European Parliament and of the Council of 19 December 2001 on cross-border payments in euro. Official Journal L 344 of 28 December 2001
European Monetary Institute (EMI) (1994) Report to the Council of the European Monetary Institute on Prepaid Cards, by the Working Group on EU Payment Systems. Frankfurt/M
FAVED (2006) Chiffres Clés. Vente à distance e-commerce. (www.faved.com)
Furche A, Wrightson G (1997) Computer Money. Internet- und Kartensysteme. Ein systematischer Überblick. dpunkt, Heidelberg
Furche A, Wrightson G (2000) Why do stored value systems fail? Netnomics 2:1, March: 37−47
Greenspan A (1997) Fostering Financial Innovation: The Role of Government. In: Dorn JA (ed) The Future of Money in the Information Age. Cato Institute, Washington, D.C.: 45−50
Insites (2004a) 6.000.000 nederlanders kochten het voorbije jaar via internet. (www.insites-consulting.com) (download 2004)
Insites (2004b) 2.250.000 Belgian made online purchases in the past year. (www.insites-consulting.com) (download 2004)
Krueger M, Leibold K, Smasal D (2005) Internet Payment Systems: The consumer’s view – results of the 8th online survey. University of Karlsruhe. (http://www.iww.uni-karlsruhe.de/izv8)
Krueger M (2001a) Innovation and Regulation – The Case of E-money Regulation in the EU – ePSO Background Paper No. 5. Institute for Prospective Technological Studies, Sevilla. (http://epso.jrc.es/Docs/Backgrnd-5.pdf)
Krueger M (2001b) Offshore E-Money Issuers and Monetary Policy, First Monday 6:10 (October) (http://firstmonday.org/issues/issue6_10/krueger/index.html)
Online Publishers Association (OPA) (2005) Online Paid Content US Market Spending Report, October
Van Baal S, Krueger M, Hinrichs J-W (2005) Internet-Zahlungssysteme aus Sicht der Händler: Ergebnisse der Umfrage IZH3. Institut für Handelsforschung, Universität Köln, September
CHAPTER 12 Grid Computing for Commercial Enterprise Environments Daniel Minoli
12.1 Introduction Grid Computing¹ (or, more precisely, a “Grid Computing System”) is a virtualized distributed computing environment. Such an environment aims at enabling the dynamic “runtime” selection, sharing, and aggregation of (geographically) distributed autonomous resources based on the availability, capability, performance, and cost of these computing resources, and, simultaneously, also based on an organization’s specific baseline and/or burst processing requirements. When people think of a grid, the idea of an interconnected system for the distribution of electricity, especially a network of high-tension cables and power stations, comes to mind. In the mid-1990s the grid metaphor was (re)applied to computing, by extending and advancing the 1960s concept of “computer time sharing”. The grid metaphor strongly illustrates the relation to, and the dependency on, a highly-interconnected networking infrastructure. This chapter is a survey of the Grid Computing field as it applies to corporate environments, and it focuses on the potential advantages of the technology in these environments. The goal is to serve as a familiarization vehicle for the interested Information Technology (IT) professional. It should be noted at the outset, however, that no claim is made herewith that there is a single or unique solution to a given computing problem; Grid Computing is one of a number of available solutions in support of optimized distributed computing. Corporate IT professionals, for whom this text is intended, will have to perform appropriate functional, economic, business-case, and strategic analyses to determine which computing approach ultimately is best for their respective organizations. Furthermore, it should be noted that Grid Computing is an evolving field and, so, there is not always one canonical, normative, universally-accepted, or axiomatically-derivable view of “everything
¹ The material in this chapter is summarized from the book (Minoli D 2005) A Networking Approach to Grid Computing, Wiley, New York.
grid-related”; it follows, that occasionally we present multiple views, multiple interpretations, or multiple perspectives on a topic, as it might be perceived by different stakeholders of communities of interest. This chapter begins the discussion by providing a sample of what a number of stakeholders define Grid Computing to be, along with an assay of the industry. The precise definition of what exactly this technology encompasses is still evolving and there is not a globally-accepted normative definition that is perfectly nonoverlapping with other related technologies. Grid Computing emphasizes (but does not mandate) geographically-distributed, multi-organization, utility-based, outsourcer-provided, networking-reliant computing methods. For a more complete treatment of the topic the reader is referred to the textbook [34]. Virtualization is a well known concept in networking, from Virtual Channels in Asynchronous Transfer Mode, to Virtual Private Networks, to Virtual LANs, and Virtual IP Addresses. However, an even more fundamental type of virtualization is achievable with today’s ubiquitous networks: machine cycle and storage virtualization through the auspices of Grid Computing and IP storage. Grid Computing is also known as utility computing, what IBM calls on-demand computing. Grid computing is a virtualization technology that was talked about in the 1980s and 1990s and entered the scientific computing field in the past ten years. The technology is now beginning to make its presence felt in the commercial computing environment. In the past couple of years there has been a lot of press and market activity, and a number of proponents see major penetration in the immediate future. IBM, Sun, Oracle, AT&T, and others are major players in this space. Grid computing cannot really exist without networks (the “grid”), since the user is requesting computing or storage resources that are located miles or continents away. 
The user need not be concerned about the specific technology used in delivering the computing or storage power: all the user wants and gets is the requisite “service”. One can think of grid computing as middleware that shields the user from the raw technology itself. The network delivers the job requests anywhere in the world and returns the results, based on an established service level agreement. The advantages of grid computing are that different hardware can be mixed and matched in the network; that the cost is lower because there is a better, statistically-averaged utilization of the underlying resources; and that availability is higher because, if a processor fails, another processor is automatically switched into service. Think of an environment of a Redundant Array of Inexpensive Computers (RAIC), similar to the concept of RAID. Grid Computing is intrinsically network-based: resources are distributed all over an intranet, an extranet, or the Internet. Users can also get locally-based virtualization by using middleware such as VMware that allows a multitude of servers right in the corporate Data Center to be utilized more efficiently. Typically, corporate servers are utilized for less than 30 to 40% of their available computing power. Using virtualization mechanisms, the firm can improve utilization, increase availability, reduce costs, and make use of a plethora of mix-and-match processors; at a minimum, this leads to server consolidation.
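The effect of statistically-averaged utilization can be illustrated with a small calculation. The following is a minimal sketch in Python; the three applications and their hourly load figures are invented for illustration and are not data from this chapter. It compares the capacity needed when each application owns enough servers for its own peak with the capacity needed when all applications draw on one shared pool.

```python
# Hypothetical hourly load (in CPU units) for three applications whose
# peaks fall at different times; the numbers are illustrative only.
loads = {
    "trading":   [80, 20, 10, 10],
    "reporting": [10, 70, 20, 10],
    "batch":     [10, 10, 20, 90],
}


def combined_load(loads):
    """Total load per hour across all applications."""
    return [sum(hour) for hour in zip(*loads.values())]


def dedicated_capacity(loads):
    """Capacity needed when every application is sized for its own peak."""
    return sum(max(series) for series in loads.values())


def shared_capacity(loads):
    """Capacity needed when one shared pool is sized for the combined peak."""
    return max(combined_load(loads))


def average_utilization(loads, capacity):
    """Mean combined load divided by the installed capacity."""
    combined = combined_load(loads)
    return sum(combined) / (len(combined) * capacity)


dedicated = dedicated_capacity(loads)
shared = shared_capacity(loads)
print("dedicated capacity:", dedicated)
print("shared capacity:", shared)
print("utilization, dedicated:", average_utilization(loads, dedicated))
print("utilization, shared:", average_utilization(loads, shared))
```

Because the peaks in this invented data set do not coincide, the shared pool needs less than half the capacity of the dedicated servers, and its average utilization is correspondingly higher. With loads that peak at the same time the gain would be much smaller.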
Security is a key consideration in grid computing. The user wants to get its services in a trustworthy and confidential manner. Then there is the desire for guaranteed levels of service and predictable, reduced costs. Finally, there is the need for standardization, so that a user with appropriate middleware client software can transparently reach any registered resource in the network. Grid Computing supports the concept of the Service Oriented Architecture, where clients obtain services from loosely-coupled service-provider resources in the network. Web Services based on the Simple Object Access Protocol (SOAP) and Universal Description, Discovery and Integration (UDDI) protocols are now key building blocks of a grid environment [35].
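To give a feel for what a SOAP-based service request looks like, the sketch below builds a SOAP 1.1 envelope with Python's standard library. Only the envelope namespace is part of SOAP itself; the `urn:example:grid` namespace, the `SubmitJob` operation, and its `JobName` and `CpuHours` fields are hypothetical names invented for this illustration and do not belong to any real grid middleware. Service discovery through UDDI is not shown.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
GRID_NS = "urn:example:grid"  # hypothetical namespace, for illustration only


def build_job_request(job_name, cpu_hours):
    """Return a SOAP 1.1 envelope (as text) for a hypothetical SubmitJob call."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # "SubmitJob" and its child elements are invented for this sketch.
    request = ET.SubElement(body, f"{{{GRID_NS}}}SubmitJob")
    ET.SubElement(request, f"{{{GRID_NS}}}JobName").text = job_name
    ET.SubElement(request, f"{{{GRID_NS}}}CpuHours").text = str(cpu_hours)
    return ET.tostring(envelope, encoding="unicode")


message = build_job_request("risk-simulation", 12)
print(message)
```

The client only states what it wants done; which machine eventually runs the job is left to the service provider, which is the loose coupling the paragraph above refers to.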
12.2 What is Grid Computing and What are the Key Issues? In its basic form, the concept of Grid Computing is straightforward: with Grid Computing an organization can transparently integrate, streamline, and share dispersed, heterogeneous pools of hosts, servers, storage systems, data, and networks into one synergistic system, in order to deliver agreed-upon service at specified levels of application efficiency and processing performance. Additionally, or, alternatively, with Grid Computing an organization can simply secure commoditized “machine cycles” or storage capacity from a remote provider, “on-demand”, without having to own the “heavy iron” to do the “number crunching”. Either way, to an end-user or application this arrangement (ensemble) looks like one large, cohesive, virtual, transparent computing system ([11]; [30]). Broadband networks play a fundamental enabling role in making Grid Computing possible and this is the motivation for looking at this technology from the perspective of communication. According to IBM’s definition ([48]; [24]), “a grid is a collection of distributed computing resources available over a local or wide area network that appear to an end user or application as one large virtual computing system. The vision is to create virtual dynamic organizations through secure, coordinated resource-sharing among individuals, institutions, and resources. Grid Computing is an approach to distributed computing that spans not only locations but also organizations, machine architectures, and software boundaries to provide unlimited power, collaboration, and information access to everyone connected to a grid.” “… The Internet is about getting computers to talk together; Grid Computing is about getting computers to work together. 
Grid will help elevate the Internet to a true computing platform, combining the qualities of service of enterprise computing with the ability to share heterogeneous distributed resources – everything from applications, data, storage and servers.” Another definition, this one from The Globus Alliance (a research and development initiative focused on enabling the application of grid concepts to scientific and engineering computing), is as follows [28]: “The grid refers to an infrastructure that enables the integrated, collaborative use of high-end computers, networks,
databases, and scientific instruments owned and managed by multiple organizations. Grid applications often involve large amounts of data and/or computing and often require secure resource sharing across organizational boundaries, and are thus not easily handled by today’s Internet and Web infrastructures.” Yet another industry-formulated definition of Grid Computing is as follows ([14]; [15]): “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. A grid is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose. The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. This sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization (VO).” Whereas the Internet is a network of communication, Grid Computing is seen as a network of computation: the field provides tools and protocols for resource sharing of a variety of IT resources. Grid Computing approaches are based on coordinated resource sharing and problem solving in dynamic, multi-institutional VOs.
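The notion of sharing rules that state what is shared, who is allowed to share, and under which conditions can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not a description of any actual grid security mechanism: the `SharingRule` class, the `may_use` function, the member and resource names, and the CPU-hour limit used as the "condition" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingRule:
    """One hypothetical VO rule: what is shared, with whom, under what condition."""
    resource: str               # what is shared
    allowed_members: frozenset  # who is allowed to share
    max_cpu_hours: int          # condition under which sharing occurs


def may_use(rules, member, resource, cpu_hours):
    """Return True if at least one rule permits this request."""
    return any(
        rule.resource == resource
        and member in rule.allowed_members
        and cpu_hours <= rule.max_cpu_hours
        for rule in rules
    )


# A made-up VO in which two institutions share one cluster.
vo_rules = [
    SharingRule("cluster-a", frozenset({"bank-x", "university-y"}), 100),
]

print(may_use(vo_rules, "bank-x", "cluster-a", 40))   # member, within the limit
print(may_use(vo_rules, "bank-z", "cluster-a", 40))   # not a member of the VO
print(may_use(vo_rules, "bank-x", "cluster-a", 500))  # exceeds the agreed limit
```

In this sketch the set of members named in the rules is what defines the virtual organization; a real deployment would add authentication and far richer conditions.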
A (short) list of examples of possible VOs includes: application service providers, storage service providers, machine-cycle providers, and members of industry-specific consortia. These examples, among others, represent an approach to computing and problem solving based on collaboration in data-rich and computation-rich environments ([18]; [37]). The enabling factors in the creation of Grid Computing systems in recent years have been the proliferation of broadband (optical-based) communications, the Internet, and the World Wide Web infrastructure, along with the availability of low-cost, high-performance computers using standardized (open) Operating Systems ([9]; [14]; [18]). The role of communications as a fundamental enabler will be emphasized throughout this chapter. Prior to the deployment of Grid Computing, a typical business application had a dedicated platform of servers and an anchored storage device assigned to each individual server. Applications developed for such platforms were not able to share resources, and from an individual server’s perspective it was not possible, in general, to predict, even statistically, what the processing load would be at different times. Consequently, each instance of an application needed to have its own excess capacity to handle peak usage loads. This predicament typically resulted in higher overall costs than would otherwise need to be the case [23]. To address these lacunae, Grid Computing aims at exploiting the opportunities afforded by the synergies, the economies of scale, and the load smoothing that result from the ability to share and aggregate distributed computational capabilities, and deliver
these hardware-based capabilities as a transparent service to the end-user.² To reinforce the point, the term “synergistic” implies “working together so that the total effect is greater than the sum of the individual constituent elements.” From a service provider perspective, Grid Computing is somewhat akin to an Application Service Provider (ASP) environment, but with a much higher level of performance and assurance [5]. Specialized ASPs, known as Grid Service Providers (GSPs), are expected to emerge to provide grid-based services, including, possibly, “open-source outsourcing services.” Grid Computing started out as the simultaneous application of the resources of many networked computers to a single (scientific) problem [14]. Grid Computing has been characterized as the “massive integration of computer systems” [45]. Computational grids have been used for a number of years to solve large-scale problems in Science and Engineering. The noteworthy fact is that, at this juncture, the approach can already be applied to a mix of mainstream business problems. Specifically, Grid Computing is now beginning to make inroads into the commercial world, including financial services operations, making the leap forward from such scientific venues as research labs and academic settings [23]. The possibility exists, according to the industry, that with Grid Computing companies can save as much as 30% of certain key line items of the operations budget (in an ideal situation), which is typically a large fraction of the total IT budget [41]. Companies yearly spend, on average, 6%³ of their top line revenues on IT services; for example, a $10B/yr Fortune 500 company might spend $600M/yr on IT. Grid middleware vendors assert that cluster computing (aggregating processors in parallel-based configurations) yields reductions in IT costs and costs of operations that are expected to reach 15% by 2005 and 30% by 2007–8 in most early adopter sectors.
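The budget arithmetic quoted above can be reproduced in a few lines. The sketch below is in Python; the revenue figure, the 6% IT share, and the 15% and 30% savings rates come from the text, whereas the share of IT spending that counts as operations budget is not given in the text, so the 50% used here is purely an assumption for illustration.

```python
def it_spend(revenue, it_percent=6):
    """Annual IT spending implied by revenue and the percentage spent on IT."""
    return revenue * it_percent / 100


def grid_savings(spend, operations_percent, savings_percent):
    """Savings when the savings rate applies only to the operations part of IT spending."""
    return spend * operations_percent / 100 * savings_percent / 100


revenue = 10_000_000_000   # $10B/yr, the example used in the text
spend = it_spend(revenue)  # 6% of revenue
OPERATIONS_PERCENT = 50    # ASSUMPTION for illustration; not a figure from the text

print(f"IT spend: ${spend:,.0f}")
for rate in (15, 30):
    saved = grid_savings(spend, OPERATIONS_PERCENT, rate)
    print(f"{rate}% savings on the operations budget: ${saved:,.0f}")
```

Changing `it_percent` to 2 or 12 shows how sensitive the result is to the share of revenue a company spends on IT, and `OPERATIONS_PERCENT` should be replaced by the organization's own figure before drawing any conclusion.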
Use of enterprise grids (middleware-based environments to harvest unused “machine cycles”, thereby displacing otherwise-needed growth costs) is expected to result in a 15% savings in IT costs by the year 2007–8, growing to a 30% savings by 2010 to 2012 [10]. This potential savings is what this chapter is all about. In 1994 this author published the book Analyzing Outsourcing, Reengineering Information and Communication Systems (McGraw-Hill), calling attention to the possibility that companies could save 15−20% or more in their IT costs by considering outsourcing; the trends of the mid-2000s have, indeed, validated this (then) timely assertion [32]. At this juncture we call early attention to the fact that the possibility exists for companies to save as much as 15−30% of certain key line items of the IT operations (run-the-engine) budget by using Grid Computing
² As implied in the opening paragraphs, a number of solutions in addition to Grid Computing (e.g., virtualization) can be employed to address this and other computational issues; Grid Computing is just one approach.
³ The range is typically 2% to 12%. See Minoli D (2007) Enterprise Architecture A thru Z: Frameworks, Business Process Modeling, SOA, and Infrastructure Technology – Migrating to State-of-the-Art Environments, in Enterprises with IT-intensive Assets: Platform, Storage, and Network Virtualization, Taylor & Francis Group.
and/or related computing/storage virtualization technologies. In effect, Grid Computing, particularly the utility computing aspect, can be seen as another form of outsourcing. Perhaps utility computing will be the next phase of outsourcing and be a major trend in the 2010-decade. As we discuss later in the book, the evolving Grid Computing standards can be used by companies to deploy a next-generation kind of “open source outsourcing” that has the advantage of offering portability, enabling companies to easily move their business among a variety of pseudocommodity providers. This chapter explores practical advantages of Grid Computing and what is needed by an organization to migrate to this new computing paradigm, if it so chooses. The chapter is intended for practitioners and decision-makers in organizations (not necessarily for software programmers) who want to explore the overall business opportunities afforded by this new technology. At the same time, the importance of the underlying networking mechanism is emphasized. For any kind of new technology corporate and business decision-makers typically seek answers to questions such as these: (i) “What is this stuff?”; (ii) “How widespread is its present/potential penetration?”; (iii) “Is it ready for prime time?”; (iv) “Are there firm standards?”; (v) “Is it secure?”; (vi) “How do we bill it, since it’s new?”; (vii) “Tell me how to deploy it (at a macro level)”; and (viii) “Give me a self-contained reference... make it understandable and as simple as possible.” Table 12.1 summarizes these and other questions that decision-makers, CIOs, CTOs, and planners may have about Grid Computing. Table 12.2 lists some of the concepts embodied in Grid Computing and other related technologies. 
Grid Computing is also known by a number of other names (although some of these terms have slightly different connotations), such as “grid” (the term “the grid” was coined in the mid-1990s to denote a proposed distributed computing infrastructure for advanced science and engineering); “computational grid”; “computing-on-demand”; “On Demand computing”; “just-in-time computing”; “platform computing”; “network computing”; “computing utility” (the term used by this author in the late 1980s [33]); “utility computing”; “cluster computing”; and “high performance distributed computing”. With regard to nomenclature, in this text, besides the term Grid Computing, we will also interchangeably use the terms grid and computational grid. In this text, we use the term grid technology to describe the entire collection of Grid Computing elements, middleware, networks, and protocols. To deploy a grid, a commercial organization needs to assign computing resources to the shared environment and deploy appropriate grid middleware on these resources, enabling them to play the various roles that need to be supported in the grid (e.g., scheduler, broker, etc.). Some (minor) application re-tuning and/or parallelization may, in some instances, be required; data accessibility will also have to be taken into consideration. A security framework will also be required. If the organization subscribes to the service provider model, then grid deployment would mean establishing adequate access bandwidth to the provider, some possible application re-tuning, and the establishment of security policies (the assumption being that the provider will itself have a reliable security framework).
The concept of providing computing power as a utility-based function is generally attractive to end-users requiring fast transactional processing and “scenario modeling” capabilities. The concept may also be attractive to IT planners looking to control costs and reduce Data Center complexity. The ability to have a cluster, an entire Data Center, or other resources spread across a geography connected by the Internet (and/or, alternatively, connected by an intranet or extranet), operating as a single transparent virtualized system that can be managed as a service, rather than as individual constituent components, likely will, over time, increase business agility, reduce complexity, streamline management processes, and lower operational costs [23]. Grid technology allows organizations to utilize numerous computers to solve problems by sharing computing resources. The problems to be solved might involve data processing, network bandwidth, or data storage, or a combination thereof. In a grid environment, the ensemble of resources is able to work together cohesively because of defined protocols that control connectivity, coordination, resource allocation, resource management, security, and chargeback. Generally, the protocols are implemented in the middleware. The systems “glued” together by a computational grid may be in the same room, or may be distributed across the globe; they may be running on homogeneous or heterogeneous hardware platforms; they may be running on similar or dissimilar operating systems; and they may be owned by one or more organizations. The goal of Grid Computing is to provide users with a single view and/or single mechanism that can be utilized to support any number of computing tasks: the grid leverages its extensive informatics capabilities to support the “number crunching” needed to complete the task, and all the user perceives is, essentially, a large virtual computer undertaking his or her work (developerWorks).
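How middleware can present heterogeneous resources as one virtual computer may be easier to see in a toy example. The Python sketch below is a deliberately simplified illustration, not the algorithm of any particular grid product: the `Resource` and `Job` classes, the `broker` and `schedule` functions, the selection rule (cheapest resource that has enough free CPUs and the required operating system), and all names and prices are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int            # availability
    os: str                   # capability (platform)
    cost_per_cpu_hour: float  # cost


@dataclass
class Job:
    name: str
    cpus: int
    os: str


def broker(job, resources):
    """Pick the cheapest resource able to run the job, or None if none qualifies."""
    candidates = [r for r in resources
                  if r.free_cpus >= job.cpus and r.os == job.os]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.cost_per_cpu_hour)


def schedule(jobs, resources):
    """Place jobs in order, reserving CPUs on each chosen resource."""
    placement = {}
    for job in jobs:
        chosen = broker(job, resources)
        if chosen is not None:
            chosen.free_cpus -= job.cpus
        placement[job.name] = chosen.name if chosen else None
    return placement


# Hypothetical resources owned by different sites, and a small job queue.
resources = [
    Resource("frankfurt-cluster", 8, "linux", 0.10),
    Resource("london-desktops", 4, "windows", 0.02),
    Resource("newyork-cluster", 16, "linux", 0.25),
]
jobs = [
    Job("var-calc", 8, "linux"),
    Job("pricing", 4, "linux"),
    Job("report", 2, "windows"),
    Job("legacy", 1, "solaris"),
]

placement = schedule(jobs, resources)
for job_name, resource_name in placement.items():
    print(job_name, "->", resource_name)
```

The user submits jobs to `schedule` and never names a machine. Production middleware would additionally handle security, data movement, failures, and chargeback, all of which are omitted here.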
In recent years one has seen an increasing roster of published articles, conferences, tutorials, resources, and tools related to this topic (developerWorks). Distributed virtualized Grid Computing technology, as we define it today, is still fairly new, being only a decade in the making. However, as already implied, a number of the basic concepts of Grid Computing go back as far as the mid-1960s and early 1970s. Recent advances, such as ubiquitous high-speed networking in both private and public venues (e.g., high-speed intranets/high-speed Internet), make the technology more deployable at the practical level, particularly when looking at corporate environments.
Table 12.1. Issues of interest and questions that CIOs/CTOs/Planners have about grid computing
What is Grid Computing and what are the key issues?
Grid Benefits and Status of Technology
Motivations For Considering Computational Grids
Brief History of Grid Computing
Is Grid Computing Ready for Prime Time?
Early Suppliers and Vendors
Challenges
Future Directions
Table 12.1. Continued
What are the Components of Grid Computing Systems/Architectures?
Portal/user interfaces
User Security
Broker function
Scheduler function
Data management function
Job management and resource management
Are there stable Standards Supporting Grid Computing?
What is OGSA/OGSI?
Implementations of OGSI
OGSA Services
Virtual Organization Creation and Management
Service Groups and Discovery Services
Choreography, Orchestration, and Workflow
Transactions
Metering Service
Accounting Service
Billing and Payment Service
Grid System Deployment Issues and Approaches
Generic Implementations: Globus Toolkit
Security Considerations – Can Grid Computing be Trusted?
What are the Grid Deployment/Management Issues?
Challenges and Approaches
Availability of Products by Categories
Business Grid Types
Deploying a basic computing grid
Deploying more complex computing grid
Grid Operation
What are the Economics of Grid Systems?
The Chargeable Grid Service
The Grid Payment System
How does one pull it all together?
Communication and networking Infrastructure
Communication Systems for Local Grids
Communication systems for national grids
Communication Systems for Global Grids
Table 12.2. Definition of some key terms
Grid Computing
(Virtualized) distributed computing environment that enables the dynamic “runtime” selection, sharing, and aggregation of (geographically) distributed autonomous (autonomic) resources based on the availability, capability, performance, and cost of these computing resources, and, simultaneously, also based on an organization’s specific baseline and/or burst processing requirements.
Enables organizations to transparently integrate, streamline, and share dispersed, heterogeneous pools of hosts, servers, storage systems, data, and networks into one synergistic system, in order to deliver agreed-upon service at specified levels of application efficiency and processing performance.
An approach to distributed computing that spans multiple locations and/or multiple organizations, machine architectures, and software boundaries to provide power, collaboration, and information access.
Infrastructure that enables the integrated, collaborative use of computers, supercomputers, networks, databases, and scientific instruments owned and managed by multiple organizations.
A network of computation: namely, tools and protocols for coordinated resource sharing and problem solving among pooled assets … allows coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.
Simultaneous application of the resources of many networked computers to a single problem … concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.
Decentralized architecture for resource management, and a layered hierarchical architecture for implementation of various constituent services.
Combines elements such as distributed computing, high-performance computing, and disposable computing, depending on the application.
Local, metropolitan, regional, national, or international footprint.
Systems may be in the same room, or may be distributed across the globe; they may be running on homogeneous or heterogeneous hardware platforms; they may be running on similar or dissimilar operating systems; and they may be owned by one or more organizations.
Daniel Minoli
Table 12.2. Continued

Grid Computing (continued)
• Types: (i) Computational grids: machines with set-aside resources stand by to “number-crunch” data or provide coverage for other intensive workloads; (ii) Scavenging grids: commonly used to locate and exploit machine cycles on idle servers and desktop machines for use in resource-intensive tasks; and (iii) Data grids: a unified interface for all data repositories in an organization, through which data can be queried, managed, and secured.
• Computational grids can be local, Enterprise grids (also called Intragrids), and Internet-based grids (also called Intergrids).
• Enterprise grids are middleware-based environments to harvest unused “machine cycles”, thereby displacing otherwise-needed growth costs.
• Other terms (some with slightly different connotations): “computational grid”; “computing-on-demand”; “On Demand computing”; “just-in-time computing”; “platform computing”; “network computing”; “computing utility”; “utility computing”; “cluster computing”; and “high performance distributed computing”.

Virtualization
• An approach that allows several operating systems to run simultaneously on one (large) computer (e. g., IBM’s z/VM operating system lets multiple instances of Linux coexist on the same mainframe computer).
• More generally, it is the practice of making resources from diverse devices accessible to a user as if they were a single, larger, homogeneous, appear-to-be-locally-available resource.
• Dynamically shifting resources across platforms to match computing demands with available resources: the computing environment can become dynamic, enabling autonomic shifting of applications between servers to match demand.
• The abstraction of server, storage, and network resources in order to make them available dynamically for sharing by IT services, both internal to and external to an organization.
• In combination with other server, storage, and networking capabilities, virtualization offers customers the opportunity to build more efficient IT infrastructures.
• Virtualization is seen by some as a step on the road to utility computing.

Clusters
• Aggregating of processors in parallel-based configurations, typically in a local environment (within a Data Center); all nodes work cooperatively as a single unified resource.
• Resource allocation is performed by a centralized resource manager and scheduling system.
Table 12.2. Continued

Clusters (continued)
• Comprised of multiple interconnected independent nodes that cooperatively work together as a single unified resource; unlike grids, cluster resources are typically owned by a single organization.
• All users of clusters have to go through a centralized system that manages allocation of resources to application jobs.
• Cluster management systems have centralized control, complete knowledge of system state and user requests, and complete control over individual components.

(Basic) Web Services (WS)
• Web Services provide standard infrastructure for data exchange between two different distributed applications (grids provide an infrastructure for aggregation of high-end resources for solving large-scale problems).
• Web Services are expected to play a key constituent role in the standardized definition of Grid Computing, since Web Services have emerged as a standards-based approach for accessing network applications.

Peer-to-peer (P2P)
• P2P is concerned with the same general problem as Grid Computing, namely, the organization of resource sharing within virtual communities.
• The Grid community focuses on aggregating distributed high-end machines such as clusters, whereas the P2P community concentrates on sharing low-end systems such as PCs connected to the Internet.
• Like P2P, Grid Computing allows users to share files (many-to-many sharing). With grid, the sharing is not only in reference to files, but also to other IT resources.
As far back as 1987 this researcher was advocating the concept of Grid Computing in internal Bell Communications Research White Papers (e. g., in Special Reports SR-NPL-000790, an extensive plan written by the author listing progressive data services that could be offered by local telcos and RBOCs, entitled A Collection of Potential Network-Based Data Services [33]). In a section called “Network for a Computing Utility” it was stated: “The proposed service provides the entire apparatus to make the concept of the Computing Utility possible. This includes as follows: (1) the physical network over which the information can travel, and the interface through which a guest PC/workstation can participate in the provision of machine cycles and through which the service requesters submit jobs; (2) a load sharing mechanisms to invoke the necessary servers to complete a job; (3) a reliable security mechanism; (4) an effective accounting mechanism to invoke the billing system; and, (5) a detailed directory of servers... Security is one of the major issues for this service, particularly if the PC is not fully dedicated to this function, but also used for other local activities. Virus threats, infiltration and corruption of
data, and other damage must be appropriately addressed and managed by the service; multi-task and robust operating systems are also needed for the servers to assist in this security process … The Computing Utility service is beginning to be approached by the Client/Server paradigm now available within a Local Area Network (LAN) environment … This service involves capabilities that span multiple 7-layer stacks. For example, one stack may handle administrative tasks, another may invoke the service (e. g., Remote Operations), still another may return the results (possibly a file), and so on … Currently no such service exists in the public domain. Three existing analogues exist, as follows: (1) timesharing service with a centralized computer; (2) highly-parallel computer systems with hundreds or thousands of nodes (what people now call cluster computing), and (3) gateways or other processors connected as servers on a LAN. The distinction between these and the proposed service is the security and accounting arenas, which are much more complex in the distributed, public (grid) environment … This service is basically feasible once a transport and switching network with strong security and accounting (chargeback) capabilities is deployed, as shown in Figure... A high degree of intelligence in the network is required … a physical network is required … security and accounting software is needed … protocols and standards will be needed to connect servers and users, as well as for accounting and billing. These protocols will have to be developed before the service can be established …”
12.3 Potential Applications and Financial Benefits of Grid Computing

Grid proponents take the position that Grid Computing represents a “next step” in the world of computing, and that Grid Computing promises to move the Internet evolution to the next logical level. According to some ([46], [47], [48], among others), “utility computing is a positive, fundamental shift in computing architecture,” and many businesses will be completely transformed over the next decade by using grid-enabled services, as these businesses integrate not only applications across the Internet but also raw computer power and storage. Furthermore, proponents predict that infrastructure will appear that can connect multiple regional and national computational grids, creating a universal source of pervasive and dependable computing power that will support new classes of applications [3]. Most researchers, however, see Grid Computing as an evolution, not a revolution. In fact, Grid Computing can be seen as the latest and most complete evolution of more familiar developments, such as distributed computing, the Web, peer-to-peer (P2P) computing, and virtualization technologies [27].

Some applications of Grid Computing, particularly in the scientific and engineering arena, include, but are not limited to, the following [6]:
• Distributed Supercomputing/Computational science;
• High-Capacity/Throughput Computing: large-scale simulation/chip design and parameter studies;
• Content sharing, e. g., sharing digital contents among peers;
• Remote software access/renting services: ASPs and Web Services;
• Data-intensive computing: drug design, particle physics, stock prediction;
• On-demand, real-time computing: medical instrumentation and mission critical initiatives;
• Collaborative Computing (eScience, eEngineering, …): collaborative design, data exploration, education, e-learning;
• Utility Computing/Service-Oriented Computing: new computing paradigm, new applications, new industries, and new business.
The benefits gained from Grid Computing can translate into competitive advantages in the marketplace. For example, the potential exists for grids to ([27], [9]):
• Enable resource sharing
• Provide transparent access to remote resources
• Make effective use of computing resources, including platforms and data sets
• Reduce significantly the number of servers needed (25–75%)
• Allow on-demand aggregation of resources at multiple sites
• Reduce execution time for large-scale data processing applications
• Provide access to remote databases and software
• Provide load smoothing across a set of platforms
• Provide fault tolerance
• Take advantage of time zone and random diversity (in peak hours, users can access resources in off-peak zones)
• Provide the flexibility to meet unforeseen emergency demands by renting external resources for a required period instead of owning them
• Enable the realization of a Virtual Data Center

(Naturally, there are also challenges associated with a grid deployment, this field being new and evolving.)

As implied by the last bullet point, there is a discernible IT trend afoot toward virtualization and on-demand services. Virtualization⁴ (and supporting technology) is an approach that allows several operating systems to run simultaneously on one (large) computer. For example, IBM’s z/VM operating system lets multiple instances of Linux coexist on the same mainframe computer. More generally, virtualization is the practice of making resources from diverse devices accessible to a user as if they were a single, larger, homogeneous, appear-to-be-locally-available resource. Virtualization supports the concept of dynamically shifting resources across platforms to match computing demands with available resources: the computing environment can become dynamic, enabling autonomic shifting of applications between servers to match demand [38].
⁴ Virtualization can be achieved without Grid Computing; but many view virtualization as a step towards the goal of deploying Grid Computing infrastructures.

There are well-known advantages in sharing resources, as a routine assessment of the behavior of the M/M/1 queue (memoryless/memoryless/1 server queue) versus the M/M/m queue (memoryless/memoryless/m servers queue) demonstrates: a single more powerful queue is more efficient than a group of discrete queues of comparable aggregate power. Grid Computing represents a development in virtualization: as we have stated, it enables the abstraction of distributed computing and data resources such as processing, network bandwidth, and data storage to create a single system image; this grants users and applications seamless access (when properly implemented) to a large pool of IT capabilities. Just as an Internet user views a unified instance of content via the Web, a Grid Computing user essentially sees a single, large virtual computer [27]. “Virtualization”, the driving force behind Grid Computing, has been a key factor since the earliest days of electronic business computing.

Studies have shown that when problems can be parallelized, such as in the case of data mining, records analysis, and billing (as may be the case in a bank, securities company, financial services company, insurance company, etc.), then significant savings are achievable. Specifically, when a classical model may require, say, $100 K to process 100 K records, a grid-enabled environment may take as little as $20 K to process the same number of records. Hence, the bottom line is that Fortune 500 companies have the potential to save 30% or more in the Run-the-Engine costs on the appropriate line item of their IT budget.

Grid Computing can also be seen as part of a larger re-hosting initiative and underlying IT trend at many companies (where alternatives such as Linux® or possibly Windows Operating Systems could, in the future, be the preferred choice over the highly-reliable, but fairly costly UNIX solutions). While each organization is different and the results vary, the directional cost trend is believable. Vendors engaged in this space include (but are not limited to) IBM, Hewlett-Packard, Sun, and Oracle.
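As an aside, the queueing comparison cited above (M/M/1 versus M/M/m) can be checked numerically. The sketch below uses the standard mean-response-time formulas for M/M/1 and M/M/m (Erlang C) queues; the function names and the sample rates (four servers, 75% load) are illustrative assumptions and do not come from the chapter.

```python
from math import factorial


def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system for an M/M/1 queue: W = 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arrival has to wait in an M/M/m queue (offered_load = lambda / mu)."""
    if offered_load >= servers:
        raise ValueError("unstable queue: offered load must be below the server count")
    waiting_term = (offered_load ** servers / factorial(servers)) * servers / (servers - offered_load)
    no_wait_terms = sum(offered_load ** k / factorial(k) for k in range(servers))
    return waiting_term / (no_wait_terms + waiting_term)


def mmm_response_time(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Mean time in system for an M/M/m queue: W = C / (m*mu - lambda) + 1 / mu."""
    wait_probability = erlang_c(servers, arrival_rate / service_rate)
    return wait_probability / (servers * service_rate - arrival_rate) + 1.0 / service_rate


def compare_designs(total_arrival_rate: float, service_rate: float, servers: int) -> dict:
    """Compare three ways of deploying the same aggregate service capacity."""
    return {
        # m isolated servers, each with its own queue and 1/m of the traffic
        "separate": mm1_response_time(total_arrival_rate / servers, service_rate),
        # m servers fed from one shared queue
        "pooled": mmm_response_time(total_arrival_rate, service_rate, servers),
        # one server that is m times as fast
        "single_fast": mm1_response_time(total_arrival_rate, servers * service_rate),
    }
```

Under these assumed rates, `compare_designs(3.0, 1.0, 4)` gives the isolated servers the longest mean response time, the shared queue a shorter one, and the single fast server the shortest, which is the ordering the text appeals to when arguing for pooled resources.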
IBM uses “on-demand” to describe its initiative; HP has its Utility Data Center (UDC) products; Sun has its N1 Data-center Architecture; and Oracle has the 10g family of “grid-aware” products. Several software vendors also have a stake in Grid Computing, including but not limited to Microsoft, Computer Associates, Veritas Software, and Platform Computing [2]. Commercial interest in Grid Computing is on the rise, according to market research published by Gartner. The research firm estimated that 15% of corporations adopted a utility (grid) computing arrangement in 2003, and that the market for utility services in North America would increase from US$8.6 billion in 2003 to more than US$25 billion in 2006. By 2006, 30% of companies would have some sort of utility computing arrangement, according to Gartner [2]. Based on published statements, IBM expected the sector to move into “hypergrowth” in 2004, with “technologies … moving ‘from rocket science to business service’ ”, and the company had a target of doubling its grid revenue during that year [12]. According to economists, proliferation of high performance cluster and Grid Computing and Web Services (WSs) applications will yield substantial productivity gains in the United States and worldwide over the next decade [10]. A recent report from research firm IDC concluded that 23% of IT services will be delivered from offshore centers by 2007 [31]. Grid Computing may be a mechanism to enable companies to reduce costs, yet keep the jobs, the intellectual capital, and the data from migrating abroad. While distributed computing does enable the idea
of “remoting” functions, with Grid Computing this “remoting” can be done to some in-country regional location rather than to a 3rd World location (just as electric and water grids have regional or near-country scope, rather than far-flung 3rd World scope). IT jobs that migrate abroad (particularly if terabytes of data about U.S. citizens and our government become the resident ownership of politically unstable 3rd World countries) have, in the opinion of this researcher, National Security/Homeland Security risk implications in the long term. While there are advantages with grids (e. g., potential reduction in the number of servers by 25 to 50% and related run-the-engine costs), some companies have reservations about immediately implementing the technology. Some of this hesitation relates to the fact that the technology is new, and in fact may be overhyped by the potential providers of services. Other issues may be related to “protection of turf”: eliminating vast arrays of servers implies reduction in Data Center space, reduction in management span of control, reduction in operational staff, reduction in budget, etc. This is the same issue that was faced in the 1990s regarding outsourcing (e. g., see [32]). Other reservations may relate to the fact that infrastructure changes are needed, and this may have a short-term financial disbursement implication. Finally, not all situations, environments, and applications are amenable to a grid paradigm.
12.4 Grid Types, Topologies, Components, Layers – A Basic View

Grid Computing embodies a combination of a decentralized architecture for resource management, and a layered hierarchical architecture for implementation of various constituent services (GRID Infoware). A Grid Computing system can have local, metropolitan, regional, national, or international footprints. In turn, the autonomous resources in the constituent ensemble can span a single organization, multiple organizations, or a service provider space. Grids can be focused on the pooled assets of one organization or span virtual organizations that use a common suite of protocols to enable grid users and applications to run services in a secure, controlled manner [37]. Furthermore, resources can be logically aggregated for a long period of time (say, months or years), or for a temporary period of time (say, minutes, days, or weeks). Grid Computing often combines elements such as distributed computing, high-performance computing, and disposable computing, depending on the application of the technology and the scale of the operation. Grids can, in practical terms, create a virtual supercomputer out of existing servers, workstations, and even PCs, to deliver processing power not only to a company’s own stakeholders and employees, but also to its partners and customers. This metacomputing environment is achieved by treating IT resources such as processing power, memory, storage, and network bandwidth as pure commodities. Like an electricity or water network,
computational power can be delivered to any department or any application where it is needed most at any given time, based on specified business goals and priorities. Furthermore, Grid Computing allows charge-back on a per-usage basis rather than for a fixed infrastructure cost [23]. Present-day grids encompass the following types [11]:
• Computational grids, where machines with set-aside resources stand by to “number-crunch” data or provide coverage for other intensive workloads;
• Scavenging grids, commonly used to find and harvest machine cycles from idle servers and desktop machines for use in resource-intensive tasks (scavenging is usually implemented in a way that is unobtrusive to the owner/user of the processor); and,
• Data grids, which provide a unified interface for all data repositories in an organization, and through which data can be queried, managed, and secured.

As already noted, no claim is made herewith that there is a single solution to a given problem; Grid Computing is one of the available solutions. For example, while some of the machine-cycle inefficiencies can be addressed by virtual servers/re-hosting (e. g., VMWare, MS VirtualPC and VirtualServer, LPARs from IBM, partitions from Sun and HP, which do not require a grid infrastructure), one of the possible approaches to this inefficiency issue is, indeed, Grid Computing. Grid Computing does have an emphasis on geographically-distributed, multi-organization, utility-based (outsourced), networking-reliant methods, while clustering and re-hosting have more (but not exclusively) of a datacenter-focused, single organization-oriented approach. Organizations will need to perform appropriate functional, economic, and strategic assessments to determine which approach is, in the final analysis, best for their specific environment. (This text is on Grid Computing; hence, our emphasis is on this approach, rather than other possible approaches, such as virtualization.)
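To make the scavenging-grid type listed above more concrete, here is a minimal sketch of how idle machines might be selected and given work without disturbing their owners. The `Machine` record, the 10% utilization threshold, and the round-robin assignment are all hypothetical choices made for illustration; they do not describe any particular grid product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Machine:
    name: str
    cpu_utilization: float  # fraction between 0.0 and 1.0
    owner_active: bool      # True while a local user is working on the machine


def idle_machines(machines: List[Machine], max_utilization: float = 0.10) -> List[Machine]:
    """Machines whose spare cycles can be harvested unobtrusively (assumed criteria)."""
    return [
        m for m in machines
        if not m.owner_active and m.cpu_utilization <= max_utilization
    ]


def assign_tasks(tasks: List[str], machines: List[Machine],
                 max_utilization: float = 0.10) -> Dict[str, List[str]]:
    """Spread tasks round-robin over the currently idle machines."""
    idle = idle_machines(machines, max_utilization)
    if not idle:
        return {}  # nothing to scavenge right now
    plan: Dict[str, List[str]] = {m.name: [] for m in idle}
    for index, task in enumerate(tasks):
        plan[idle[index % len(idle)].name].append(task)
    return plan
```

A real scavenger would also have to withdraw work when an owner returns; the sketch only covers the selection step.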
Figures 12.1, 12.2, and 12.3 provide a pictorial view of some Grid Computing environments. Figure 12.1 depicts the traditional computing environment where a multitude of often-underutilized servers support a disjoint set of applications and data sets. As implied by this figure, the typical IT environment prior to Grid Computing operated as follows: a business-critical application runs on a designated server. While the average utilization may be relatively low, during peak cycles the server in question can get overtaxed. As a consequence of this momentary overload, the application can slow down, experience a halt, or even stall. In this traditional instance, the large data set that this application is analyzing exists only in a single data store (note that while multiple copies of the data could exist, it would not be easy with the traditional model to synchronize the databases if two programs independently operated aggressively on the data at the same time). Server capacity and access to the data store place limitations on how quickly desired results can be returned. Machine cycles on other servers cannot be constructively utilized, and available disk capacity remains unused [27].
[Figure content: Users 1–4 submit Job Requests 1–4 over the corporate intranet (bridged/routed LAN segments and VLANs) and receive Results 1–4. In the data center, SAN Server A runs at 33% utilization, SAN Server B at 45%, SAN Server C at 20%, and SAN Server D at 52%, with Data Sets 1–4.]
Figure 12.1. Standard computing environment
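The utilization figures shown in Figure 12.1 (33%, 45%, 20%, and 52%) allow a rough consolidation estimate. The sketch below assumes, purely for illustration, that the four servers have equal capacity and that their workloads can be moved freely; the 75% and 60% target utilizations are hypothetical, and peak loads, which the text notes can overtax a server, are ignored.

```python
from typing import List

# Average utilizations (in percent) read from Figure 12.1
FIGURE_12_1_UTILIZATION = [33, 45, 20, 52]


def average_utilization(percentages: List[int]) -> float:
    """Mean utilization across the servers, in percent."""
    return sum(percentages) / len(percentages)


def servers_needed(percentages: List[int], target_percent: int) -> int:
    """Equal-capacity servers required to carry the same average load at the target utilization."""
    if not 0 < target_percent <= 100:
        raise ValueError("target utilization must be between 1 and 100 percent")
    total_load = sum(percentages)  # 100 == one fully busy server
    # Integer ceiling division avoids floating-point rounding surprises
    return max(1, -(-total_load // target_percent))
```

With these inputs the average utilization is 37.5%, and a 75% target needs two servers instead of four. A more cautious 60% target needs three, which shows how sensitive such estimates are to the headroom reserved for peaks.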
Figure 12.2 depicts an organization-owned computational grid; here, a middleware application running on a Grid Computing Broker manages a small(er) set of processors and an integrated data store. A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. In a grid environment, workload can be broken up and sent in manageable pieces to idle server cycles. Not all applications are necessarily instantly migratable to a grid environment without at least some re-design. While legacy business applications may, a priori, fit such a class of applications, a number of Fortune 500 companies are indeed looking into how such legacy applications can be modified and/or re-tooled such that they can be made to run on grid-based infrastructures. A scheduler sets rules and priorities for routing jobs on a grid-based infrastructure. When servers and storage are enabled for Grid Computing, copies of the data can be stored in formerly unused space and easily made available [27]. A grid also provides mechanisms for managing the distributed data in a seamless way [39], [19]. Grid middleware provides facilities that allow applications and users to use the grid. Middleware such as Globus [17], Legion [22], and UNICORE (UNiform Interface to Computer Resources) (Huber 2001) provides software infrastructure to handle various challenges of computational and data grids [39]. Figure 12.3 depicts the utility-oriented implementation of a computational grid. This concept is analogous to the electric power network (grid), where power generators are distributed, but the users are able to access electric power without concerning themselves about the source of energy and its pedestrian operational management (URL gridcomputing).

[Figure labels: User 1 and User 2 with Job Requests 1–3 and Results 1–3; Middleware; Corporate Intranet (bridged/routed LAN segments and VLANs); Enterprise Grid and/or InterGrid]

[Figure labels: Application A; Application B; Grid resource brokers; Grid information services; Database; Routers R1, R2, R3, R5, R6]
Figure 12.4. Pictorial view of World Wide InterGrid

As suggested by these figures, Grid Computing
aims to provide seamless and scalable access to distributed resources. Computational grids enable the sharing, selection, and aggregation of a wide variety of geographically distributed computational resources (such as supercomputers, computing clusters, storage systems, data sources, instruments, and developers), and present them as a single, unified resource for solving large-scale computing- and data-intensive applications (e. g., molecular modeling for drug design, brain activity analysis, and high-energy physics). An initial grid deployment at a company can be scaled over time to bring in additional applications and new data. This allows gains in speed and accuracy without significant cost increases. Several years ago, this author coined the phrase “the corporation is the network” [36]. Grid Computing supports this concept very well: with Grid Computing, all a corporation needs to run its IT apparatus is a reliable high-speed network to connect it to the distributed set of virtualized computational resources not necessarily owned by the corporation itself. Grid Computing started out as a mechanism to share computational resources distributed all over the world for basic science applications, as illustrated pictorially by Figure 12.4 ([39], [14]). But other types of resources, such as licenses or specialized equipment, can now also be virtualized in a Grid Computing environment. For example, if an organization’s software license agreement limits the number of users that can be using the license simultaneously, license management tools operating in grid mode could be employed to keep track of how many
concurrent copies of the software are active. This prevents the number from exceeding the allowed limit, and allows jobs to be scheduled according to priorities defined by the automated business policies. Specialized equipment that is remotely deployed on the network could also be managed in a similar way, thereby reducing the need for the organization to purchase multiple devices, in much the same way that people in the same office today share Internet access or printing resources across a LAN [23]. Some grids focus on data federation and availability; other grids focus on computing power and speed. Many grids involve a combination of the two. For end users, all infrastructure complexity stays hidden [27]. Data (database) federation makes disparate corporate databases look as if the constituent data were all in the same database. Significant gains can be secured if one can work on all the different databases, including selects, inserts, updates, and deletes, as if all the tables existed in a single database.⁵ Almost every organization has significant unused computing capacity, widely distributed among a tribal arrangement of PCs, midrange platforms, mainframes, and supercomputers. For example, if a company has 10,000 PCs, at an average computing power of 333 MIPS, this equates to an aggregate 3 tera (10¹²) floating-point operations per second (TFLOPS) of potential computing power. As another example, in the U.S. there are an estimated 300 million computers; at an average computing power of 333 MIPS, this equates to a raw computing power of 100,000 TFLOPS. Mainframes are generally idle 40% of the time; Unix servers are actually “serving” something less than 10% of the time; most PCs do nothing for 95% of a typical day [27]. This is an inefficient situation for customers.
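The aggregate-capacity figures quoted above can be reproduced with a few lines of arithmetic. Note that the text equates MIPS with floating-point operations; the sketch below follows that rough equivalence (one instruction counted as one operation), which is an assumption rather than a measured conversion. The function names are illustrative.

```python
INSTRUCTIONS_PER_MIPS = 10**6  # one MIPS is a million instructions per second
TERA = 10**12


def aggregate_tera_ops(machines: int, mips_each: int) -> float:
    """Raw aggregate capacity in tera-operations per second (MIPS treated as operations)."""
    return machines * mips_each * INSTRUCTIONS_PER_MIPS / TERA


def harvestable_tera_ops(machines: int, mips_each: int, idle_fraction: float) -> float:
    """Portion of the aggregate capacity that sits idle and could in principle be harvested."""
    return aggregate_tera_ops(machines, mips_each) * idle_fraction
```

For 10,000 PCs at 333 MIPS this gives about 3.33 tera-operations per second (the "3 TFLOPS" in the text), and for 300 million computers about 99,900 (the "100,000 TFLOPS"). With PCs idle 95% of the day, roughly 3.16 of the 3.33 would be unused.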
⁵ The “federator” system operates on the tables in the remote systems, the “federatees”. The remote tables appear as virtual tables in the “federator” database. Client application programs can perform operations on the virtual tables in the “federator” database, but the real persistent storage is in the remote database. Each “federatee” views the “federator” as just another database client connection. The “federatee” is simply servicing client requests for database operations. The “federator” needs client software to access each remote database. Client software for IBM Informix®, Sybase, Oracle, etc. would need to be installed to access each type of federatee (IBM Press Releases).

TFLOPS-speeds that are possible with Grid Computing enable scientists to address some of the most computationally-intensive scientific tasks, from problems in protein analysis that will form the basis for new drug designs, to climate modeling, to deducing the content and behavior of the cosmos from astronomical data [45]. Many scientific applications are also data-intensive, in addition to being computationally-intensive. By 2006, several physics projects will produce multiple petabytes (10¹⁵ bytes) of data per year. This has been called “peta-scale” data. PCs now ship with up to 100 gigabytes (GB) of storage (as much as an entire 1990 supercomputer center) [13]: one petabyte would equate to 10,000 of these PCs, or to the PC base of a “smaller” Fortune 500 company. Data grids also have some immediate commercial application. Grid-oriented solutions are the way to manage this sort of storage requirement, particularly from a data access perspective (more than just from a physical storage perspective). As grids evolved, the management of “peta-scale” data became burdensome. The confluence and combination of large dataset size, geographic distribution of users and resources, and computationally-intensive scientific analyses prompted the development of data grids, as noted earlier ([39], [8]). Here, a data middleware (usually part of general purpose grid middleware) provides facilities for data management. Various research communities have developed successful data middleware such as Storage Resource Broker (SRB) [1], Grid Data Farm [42], and European Data Grid Middleware. These middleware tools have been effective in providing a framework for managing high volumes of data, but they are often incompatible.

The key components of Grid Computing include the following (developerWorks):
• Resource management: the grid must be aware of what resources are available for different tasks;
• Security management: the grid needs to take care that only authorized users can access and use the available resources;
• Data management: data must be transported, cleansed, parceled, and processed; and,
• Services management: users and applications must be able to query the grid in an effective and efficient manner.

More specifically, Grid Computing can be viewed as comprising a number of logical hierarchical layers. Figure 12.5 depicts a first view of the layered architecture of a grid environment. At the base of the grid stack, one finds the grid fabric, namely, the distributed resources that are managed by a local resource manager with a local policy; these resources are interconnected via local-, metropolitan-, or wide-area networks. The grid fabric includes and incorporates networks; computers such as PCs and processors using operating systems such as Unix, Linux, or Windows; clusters using various operating systems; resource management systems; storage devices; and databases.
The security infrastructure layer provides secure and authorized access to grid resources. Above this layer, the core grid middleware provides uniform access to the resources in the fabric; the middleware is designed to hide the complexities of partitioning, distributing, and load balancing. The next layer, the user-level middleware layer, consists of resource brokers or schedulers responsible for aggregating resources. The grid programming environments and tools layer includes the compilers, libraries, development tools, and so on, that are required to run the applications (resource brokers manage execution of applications on distributed resources using appropriate scheduling strategies, while grid development tools are used to grid-enable applications). The top layer consists of grid applications themselves [9].

Figure 12.5. One view of Grid Computing Layers

Building on this intuitive idea of layering, it would be advantageous if industry consensus were reached on the series of layers. Architecture standards are now under development by the Global Grid Forum (GGF). The GGF is an industry advocacy group; it supports community-driven processes for developing and documenting new standards for Grid Computing. The GGF is a forum for exchanging information and defining standards relating to distributed computing and grid technologies. GGF is fashioned after the Grid Forum, the eGrid European Grid Forum, and the Grid Community in the Asia-Pacific. GGF is focusing on open grid architecture standards [48]. Technical specifications are being developed for architecture elements: e. g., security, data, resource management, and information. Grid architectures are being built based on Internet protocols and services (for example, communication, routing, name resolution, etc.). The layering approach is used to the extent possible because it is advantageous for higher-level functions to use common lower-level functions. The GGF’s approach has been to propose a set of core services as basic infrastructure, as shown in Figure 12.6. These core services are used to construct high-level, domain-specific solutions. The design principles are: keep participation cost low; enable local control; support adaptation; use the “IP hourglass” model of many applications using a few core services to support many fabric elements (e. g., Operating Systems).

Figure 12.6. Layered Grid Architecture, Global Grid Forum (This presentation is licensed for use under the terms of the Globus Toolkit Public License. See http://www.globus.org/toolkit/download/license.html for the full text of this license.)
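The “IP hourglass” principle mentioned above (many applications, a few core services, many fabric elements) can be illustrated with a small sketch. Everything here is hypothetical: `FabricResource`, `CoreServices`, and the two resource classes are invented for illustration and are not part of any GGF specification or of the Globus Toolkit.

```python
from itertools import cycle
from typing import Iterable, Protocol


class FabricResource(Protocol):
    """Bottom of the hourglass: any fabric element able to execute a job (assumed interface)."""

    def run(self, job: str) -> str: ...


class LinuxCluster:
    def run(self, job: str) -> str:
        return f"cluster ran {job}"


class WindowsDesktop:
    def run(self, job: str) -> str:
        return f"desktop ran {job}"


class CoreServices:
    """Narrow waist of the hourglass: one small interface shared by every application."""

    def __init__(self, resources: Iterable[FabricResource]) -> None:
        # Simple round-robin placement stands in for real brokering and scheduling
        self._next_resource = cycle(list(resources))

    def submit(self, job: str) -> str:
        return next(self._next_resource).run(job)


# Top of the hourglass: applications only ever talk to the core services
def risk_report(core: CoreServices) -> str:
    return core.submit("risk-report")


def billing_run(core: CoreServices) -> str:
    return core.submit("billing-run")
```

The design point is that adding a new application or a new kind of fabric resource leaves the middle layer unchanged, which is why the layered approach favors a small set of common lower-level functions.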
In the meantime, Globus Toolkit™ has emerged as the de facto standard for several important Connectivity, Resource, and
Collective protocols. The toolkit, a “middleware plus” capability, addresses issues of security, information discovery, resource management, data management, communication, fault detection, and portability (The Global Grid Forum).
12.5 Comparison with other Approaches
It is important to note that certain IT computing constructs are not grids, as we discuss next. In some instances, these technologies are the optimal solution for an organization’s problem; in other cases, Grid Computing is the best solution, particularly if in the long term one is especially interested in supplier-provided utility computing. The distinction between clusters and grids relates to the way resources are managed. In the case of clusters (aggregations of processors in parallel-based configurations), resource allocation is performed by a centralized resource manager and scheduling system; also, the nodes work together cooperatively as a single unified resource. In the case of grids, each node has its own resource manager, and such a node does not aim at providing a single system view [5]. A cluster comprises multiple interconnected independent nodes that work together cooperatively as a single unified resource. This means that all users of a cluster have to go through a centralized system that manages the allocation of resources to application jobs. Unlike grid resources, cluster resources are almost always owned by a single organization. In fact, many grids are constructed by using clusters or traditional parallel systems as their nodes, although this is not a requirement. An example of a grid that contains clusters as its nodes is the NSF TeraGrid [20]; another example is the World Wide Grid, many of whose nodes are clusters located in organizations such as NRC Canada, AIST-Japan, N*Grid Korea, and the University of Melbourne.
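The difference in resource management described above can be illustrated with a minimal Python sketch. It is not taken from any grid or cluster product: the class names, the "most free CPUs" placement rule, and the `max_share` local policy are all hypothetical choices made for illustration. The cluster scheduler sees every node and places the job itself, whereas the grid broker can only ask each site, and each site's own resource manager decides whether to accept.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A cluster node; its state is fully visible to the central scheduler."""
    name: str
    free_cpus: int


class ClusterScheduler:
    """Centralized manager (hypothetical): knows all nodes and decides placement."""

    def __init__(self, nodes):
        self.nodes = nodes

    def submit(self, cpus):
        # Complete knowledge of system state: pick the node with most free CPUs.
        best = max(self.nodes, key=lambda n: n.free_cpus)
        if best.free_cpus < cpus:
            return None  # no node can host the job
        best.free_cpus -= cpus
        return best.name


class GridSite:
    """A grid node with its own resource manager and local policy (hypothetical)."""

    def __init__(self, name, free_cpus, max_share):
        self.name = name
        self.free_cpus = free_cpus
        self.max_share = max_share  # largest external request the site accepts

    def offer(self, cpus):
        # The site, not the broker, decides whether the job is accepted.
        if cpus <= self.max_share and cpus <= self.free_cpus:
            self.free_cpus -= cpus
            return True
        return False


def grid_submit(sites, cpus):
    """Broker asks each site in turn; it cannot force a placement."""
    for site in sites:
        if site.offer(cpus):
            return site.name
    return None
```

In this sketch a site with plenty of idle capacity may still refuse a job because of its local policy, which is the kind of local control that a centrally managed cluster does not have.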
While cluster management systems such as Platform’s Load Sharing Facility, Veridian’s Portable Batch System, or Sun’s Sun Grid Engine can deliver enhanced distributed computing services, they are not grids themselves; these cluster management systems have centralized control, complete knowledge of system state and user requests, and complete control over individual components (such features tend not to be characteristic of a grid proper) [15]. Grid Computing also differs from basic Web Services, although it now makes use of these services. Web Services have become an important component of distributed computing applications over the Internet (Grid Computing using .NET and WSRF.NET 2004). The World Wide Web is not (yet, in itself) a grid: its open, general-purpose protocols support access to distributed resources but not the coordinated use of those resources to deliver negotiated qualities of service [15]. So, while the Web is mainly focused on communication, Grid Computing enables resource sharing and collaborative resource interplay toward common business goals. Web Services provide a standard infrastructure for data exchange between two different distributed applications, while grids provide an infrastructure for the aggregation of high-end resources for solving large-scale problems in science, engineering, and commerce. However, there are similarities as well as dependencies:
(i) similar to the case of the World Wide Web, Grid Computing keeps complexity hidden: multiple users enjoy a single, unified experience; and, (ii) Web Services are utilized to support Grid Computing mechanisms: these Web Services will play a key constituent role in the standardized definition of Grid Computing, since Web Services have emerged in the past few years as a standards-based approach for accessing network applications. The recent trend is to implement grid solutions using Web Services technologies; for example, the Globus Toolkit 3.0 middleware is being implemented using Web Services technologies. In this context, low-level Grid Services are instances of Web Services (a Grid Service is a Web Service that conforms to a set of conventions that provide for controlled, fault-resilient, and secure management of stateful services) ([21]; [19]). Both peer-to-peer (P2P) and Grid Computing are concerned with the same general problem, namely, the organization of resource sharing within VOs. As is the case with peer-to-peer environments, Grid Computing allows users to share files; but unlike P2P, Grid Computing allows many-to-many sharing; furthermore, with grids the sharing extends beyond files to other resources as well. The grid community generally focuses on aggregating distributed high-end machines such as clusters, while the P2P community concentrates on sharing low-end systems such as PCs connected to the Internet [9]. Both disciplines take the same general approach to solving this problem, namely the creation of overlay structures that coexist with, but need not correspond in structure to, underlying organizational structures.
Each discipline has made technical advances in recent years, but each also has, in current instantiations, a number of limitations: there are complementary aspects regarding the strengths and weaknesses of the two approaches, which suggests that the interests of the two communities are likely to grow closer over time [16]. P2P networks can amass computing power, as does the SETI@home project, or share contents, as Napster and Gnutella have done in the recent past. Given the number of grid and P2P projects and forums that began worldwide at the turn of the decade, it is clear that interest in the research, development, and commercial deployment of these technologies is burgeoning [9]. Grid Computing also differs from virtualization. Resource virtualization is the abstraction of server, storage, and network resources in order to make them available dynamically for sharing by IT services, both inside and outside an organization. Virtualization is a step along the way on the road to utility computing (Grid Computing) and, in combination with other server, storage, and networking capabilities, offers customers the opportunity to build, according to advocates, an IT infrastructure “without” hard boundaries or fixed constraints [25]. Virtualization has somewhat more of an emphasis on local resources, while Grid Computing has more of an emphasis on geographically distributed inter-organizational resources. The universal problem that virtualization is solving in a data center is that of dedicated resources. While dedicating resources does address performance, the method lacks fine granularity. Typically, IT managers take an educated guess at how many dedicated servers they will need to handle peaks, purchase extra servers, and later find out that a significant portion of these servers are grossly underutilized. A typical data center has a large amount of idle infrastructure, bought and
set up online to handle peak traffic for different applications. Virtualization offers a way of moving resources from one application to another dynamically. However, specifics of the desired virtualizing effect depend on the specific application deployed [47]. Three representative press-time products are HP’s Utility Data Center, EMC’s VMware, and Platform Computing’s Platform LSF. With virtualization, the logical functions of the server, storage, and network elements are separated from their physical functions (e.g., processor, memory, I/O, controllers, disks, switches). In other words, all servers, storage, and network devices can be aggregated into independent pools of resources. Some elements may even be further subdivided (server partitions, storage LUNs) to provide an even more granular level of control. Elements from these pools can then be allocated, provisioned, and managed, manually or automatically, to meet the changing needs and priorities of one’s business. Virtualization can span the following domains [25]:
i. Server virtualization for horizontally and vertically scaled server environments. Server virtualization enables optimized utilization, improved service levels, and reduced management overhead.
ii. Network virtualization, enabled by intelligent routers, switches, and other networking elements supporting virtual LANs. Virtualized networks are more secure and more able to support unforeseen spikes in customer and user demand.
iii. Storage virtualization (server, network, and array-based). Storage virtualization technologies improve the utilization of current storage subsystems, reduce administrative costs, and protect vital data in a secure and automated fashion.
iv. Application virtualization enables programs and services to be executed on multiple systems simultaneously. This computing approach is related to horizontal scaling, clusters, and Grid Computing, where a single application is able to cooperatively execute on a number of servers concurrently.
v. Data center virtualization, whereby groups of servers, storage, and network resources can be provisioned or re-allocated on-the-fly to meet the needs of a new IT service or to handle dynamically changing workloads (Hewlett-Packard Company 2002).
Grid Computing deployment, although potentially related to a re-hosting initiative, is not just re-hosting. As Figure 12.7 depicts, re-hosting implies the reduction of a typically large number of servers (possibly using some older and/or proprietary OS) to a smaller set of more powerful and more modern servers (possibly running open source OSs). This is certainly advantageous from an operations, physical maintenance, and power and space perspective. There are savings associated with re-hosting. However, applications are still assigned to specific servers. Grid Computing, on the other hand, permits the true virtualization of the computing function, as seen in Figure 12.7. Here applications are not pre-assigned a server; rather, the “runtime” assignment is made based on real-time considerations. (Note: in the bottom diagram, the hosts could be collocated or spread all over the world. When local
Figure 12.7. A comparison with re-hosting
hosts are aggregated in tightly-coupled configurations, they generally tend to be of the cluster parallel-based computing type; such processors, however, can also be nonparallel-computing-based grids, e.g., by running the Globus Toolkit. When geographically dispersed hosts are aggregated in distributed computing configurations
they generally tend to be of the Grid Computing type and not running in a clustered arrangement; Figure 12.7 does not show geography, and the reader should assume that the hosts are arranged in a Grid Computing arrangement.) In summary, like clusters and distributed computing, grids bring computing resources together. Unlike clusters and distributed computing, which need physical proximity and operating homogeneity, grids can be geographically distributed and can be heterogeneous. Like virtualization technologies, Grid Computing enables the virtualization of IT resources. Unlike virtualization technologies, which virtualize a single system, Grid Computing enables the virtualization of broad-scale and disparate IT resources [27]. Scientific community deployments such as the distributed data processing system being deployed internationally by “Data Grid” projects (e.g., GriPhyN, PPDG, EU DataGrid), NASA’s Information Power Grid, the Distributed ASCI Supercomputer system that links clusters at several Dutch universities, the DOE Science Grid and DISCOM Grid that link systems at DOE laboratories, and the TeraGrid mentioned above, being constructed to link major U.S. academic sites, are all bona fide examples of Grid Computing. Multi-site schedulers such as Platform’s MultiCluster can reasonably be called (first-generation) grids. Other examples of Grid Computing include the distributed computing systems provided by Condor, Entropia, and United Devices, which make use of idle desktops; peer-to-peer systems such as Gnutella, which support file sharing among participating peers; and a federated deployment of the Storage Resource Broker, which supports distributed access to data resources [18].
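The contrast drawn in the re-hosting comparison, static assignment of applications to servers versus assignment at run time, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the application names, host names, and utilization figures are hypothetical, and "least loaded host" stands in for whatever real-time considerations an actual grid scheduler would use.

```python
def rehosted_placement(app, static_map):
    """Re-hosting: each application is still bound to one specific server."""
    return static_map[app]


def grid_placement(load_by_host):
    """Grid: pick a host at run time from current conditions.

    Here the only condition considered is current load (an assumption);
    the host with the lowest utilization is chosen.
    """
    return min(load_by_host, key=load_by_host.get)


# Hypothetical application-to-server table and utilization snapshot.
static_map = {"risk": "srv1", "pricing": "srv2"}
load = {"srv1": 0.90, "srv2": 0.35, "srv3": 0.10}
```

With the snapshot above, `rehosted_placement("risk", static_map)` always returns `srv1` regardless of load, while `grid_placement(load)` returns whichever host is least loaded at the moment of the call, so its answer changes as the load figures change.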
12.6 A Quick View at Grid Computing Standards
One of the challenges with any computing technology is getting the various components to communicate with each other. Nowhere is this more critical than when trying to get different platforms and environments to interoperate. It should, therefore, be immediately evident that the Grid Computing paradigm requires standard, open, general-purpose protocols and interfaces. Standards for Grid Computing are now being defined and are beginning to be implemented by the vendors ([11], [7]). To make the most effective use of the computing resources available, these environments need to utilize common protocols [4]. Standards are the “holy grail” for Grid Computing. Regarding this issue, proponents make the case that we are now indeed entering a new phase of Grid Computing where standards will define grids in a consistent way, by enabling grid systems to become easily-built “off-the-shelf” systems. Standards-based grid systems have been called by some “Third-Generation Grids,” or 3G Grids. First-generation “1G Grids” involved local “Metacomputers” with basic services such as distributed file systems and site-wide single sign-on, upon which early-adopter developers created distributed applications with custom communications protocols. Gigabit testbeds extended 1G Grids across distance, and attempts to create “Metacenters” explored issues of inter-organizational integration.
1G Grids were totally custom-made proofs of concept [7]. 2G Grid systems began with projects such as Condor, I-WAY (the origin of Globus), and Legion (the origin of Avaki), where underlying software services and communications protocols could be used as a basis for developing distributed applications and services. 2G Grids offered basic building blocks, but deployment involved significant customization and filling in many gaps. Independent deployments of 2G Grid technology today involve enough customized extensions that interoperability among 2G Grid systems is very difficult. This is why the industry needs 3G Grids [7]. By introducing standard technical specifications, 3G Grid technology will have the potential to allow both competition and interoperability not only among applications and toolkits, but among implementations of key services. The goal is to mix and match components; but this potential will only be realized if the grid community continues to work at defining standards [7]. The Global Grid Forum community is applying lessons learned from 1G and 2G Grids and from Web services technologies and concepts to create 3G architectures [7]. GGF has driven initiatives such as the Open Grid Services Architecture (OGSA). OGSA is a set of specifications and standards that integrate and leverage the worlds of Web services and Grid Computing (Web services are viewed by some as the “biggest technology trend” in the last five years [29]). With this architecture, a set of common interface specifications supports the interoperability of discrete, independently developed services. OGSA brings together Web standards such as XML (eXtensible Markup Language), WSDL (Web Services Description Language), UDDI (Universal Description, Discovery, and Integration), and SOAP (Simple Object Access Protocol), with the standards for Grid Computing developed by the Globus Project [30].
The Globus Project is a joint effort on the part of researchers and developers from around the world who are focused on grid research, software tools, testbeds, and applications. As TCP/IP (Transmission Control Protocol/Internet Protocol) forms the backbone for the Internet, the OGSA is the backbone for Grid Computing. The recently released Open Grid Services Infrastructure (OGSI) service specification is the keystone in this architecture [7]. In addition to making progress on the standards front, Grid Computing as a service needs to address various issues and challenges. Besides standardization, some of these issues and challenges include: security; true scalability; autonomy; heterogeneity of resource access interfaces; policies; capabilities; pricing; data locality; dynamic variation in availability of resources; and complexity in the creation of applications (GRID Infoware).
12.7 A Pragmatic Course of Investigation on Grid Computing
In order to identify possible benefits to their organizations, planners should understand Grid Computing concepts and the underlying networking mechanisms.
Practitioners interested in Grid Computing are asking basic questions (developerWorks): What do we do with all of this stuff? Where do we start? How do the pieces fit together? What comes next? As implied by the introductory but rather encompassing discussion above, Grid Computing is applicable to enterprise users at two levels:
1. Obtaining computing services over a network from a remote computing service provider; and,
2. Aggregating an organization’s dispersed set of uncoordinated systems into one holistic computing apparatus.
As noted, with Grid Computing organizations can optimize computing and data resources, pool such resources to support large capacity workloads, share the resources across networks, and enable collaboration [27]. Grid technology allows the IT organization to consolidate and reduce the number of platforms that need to be kept operating. Practitioners should look at networking as not only the best way to understand what Grid Computing is, but, more importantly, the best way to understand why and how Grid Computing is important to IT practitioners (rather than “just another hot technology” that gets researchers excited but never has much effect on what network professionals see in the real world). One can build and deploy a grid of a variety of sizes and types: for large or small firms; for a single department or the entire enterprise; and for enterprise business applications or for scientific endeavors. Like many other recent technologies, however, Grid Computing runs the risk of being over-hyped. CIOs need to be careful not to be oversold on Grid Computing: a sound, reliable, conservative economic analysis is, therefore, required that encompasses the true Total Cost of Ownership (TCO) and assesses the true risks associated with this approach. Like the Internet, Grid Computing has its roots in the scientific and research communities. After about a decade of research, open systems are poised to enter the market.
Coupled with a rapid drop in the cost of communication bandwidth, commercial-grade opportunities are emerging for Fortune 500 IT shops searching for new ways to save money. All of this has to be properly weighed against the commoditization of machine cycles (just buy more processors and retain the status quo), and against reduced support cost by way of sub-continent outsourcing of said IT support and related application development (just move operations abroad, but otherwise retain the status quo). During the past ten years or so, a tour-de-force commoditization has been experienced in the computing hardware platforms that support IT applications at businesses of all sizes. Some have encapsulated this rapidly occurring phenomenon with the phrase “IT doesn’t matter”. The price-disrupting predicament brought about by commoditization affords new opportunities for organizations, although it also has conspicuous process- and people-dislocating consequences. Grid Computing is but one way to capitalize on this Moore’s-Law-driven commoditization. Already we have noted that at its core Grid Computing is, and must be, based on an open set of standards and protocols that enable communication across heterogeneous, geographically dispersed environments. Therefore, planners should track the work that the standards groups are doing. It is important to understand the significance of standards in Grid Computing, how they affect capabilities and facilities, what standards exist, and how they can be applied to the problems of distributed computation [4]. There is effort involved with resource management and scheduling in a grid environment. When enterprises need to aggregate resources distributed within their organization and prioritize the allocation of resources to different users, projects, and applications based on their Quality of Service (QoS) requirements (call these Service Level Agreements), these enterprises need to be concerned about resource management and scheduling. User QoS-based allocation strategies enhance the value delivered by the utility. The need for QoS-based resource management becomes significant whenever more than one competing application or user needs to utilize shared resources [21]. Regional economies will benefit significantly from Grid Computing technologies, as suggested earlier in the chapter, assuming that two activities occur [10]: first, the broadband infrastructure needed to support Grid Computing and Web services must be developed in a timely fashion (this being the underlying theme of this text); and, second, states and regions must attract (or grow from within) a sufficient pool of skilled computer and communications professionals to fully deploy and utilize the new technologies and applications. One can ask: what has been learnt in the past 10 years related to grids that can be taken forward? What are the valid approaches we make normative going forward? What is enough to make it good enough for the commercial market? What is economically viable for the commercial market? What is sufficient to convey the value of grid technology to a prospective end user?
What are the performance levels for businesses: Tera(FLOPS, bps, B) or Mega(FLOPS, bps, B)? Tera or Giga? Peta or Exa? What are the requisite networking mechanisms, without which Grid Computing cannot happen? For a more complete treatment of this topic the reader is referred to the book by D. Minoli, A Networking Approach to Grid Computing, Wiley, 2005, New York.
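The QoS-based resource management discussed earlier in this section, in which competing users and applications are served according to their service-level priority, can be sketched as follows. This is a minimal illustration and not a scheduler from any grid toolkit: the function name `allocate`, the numeric priority scale (lower number means more urgent), and the job names in the usage note are all hypothetical.

```python
import heapq


def allocate(requests, capacity):
    """Grant CPU requests in priority order until capacity runs out.

    requests: list of (name, priority, cpus); a lower priority number is
              treated as more urgent (an assumption of this sketch).
    capacity: total CPUs available in the shared pool.
    Returns (granted, deferred) as two lists of request names.
    """
    # The submission index breaks ties so equal priorities are served first-come.
    heap = [(prio, idx, name, cpus)
            for idx, (name, prio, cpus) in enumerate(requests)]
    heapq.heapify(heap)

    granted, deferred = [], []
    while heap:
        _, _, name, cpus = heapq.heappop(heap)
        if cpus <= capacity:
            capacity -= cpus
            granted.append(name)
        else:
            deferred.append(name)  # does not fit in what is left
    return granted, deferred
```

For example, with a pool of 100 CPUs and the hypothetical requests `[("batch-report", 3, 40), ("trading-risk", 1, 60), ("intraday-pricing", 2, 30)]`, the two higher-priority jobs are granted and `batch-report` is deferred, even though it was submitted first.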
References
1. Baru C, Moore R, Rajasekar A, Wan M (1998) The SDSC storage resource broker. Proceedings of IBM Centers for Advanced Studies Conference. IBM
2. Bednarz A, Dubie D (2003) How to: How to get to utility computing. Network World, 12/01/03
3. Berman F, Fox G, Hey AJ (2003) Grid Computing: Making the Global Infrastructure a Reality, May 2003
4. Brown MC (2003) Grid Computing – moving to a standardized platform. August 2003, IBM’s Developerworks Grid Library, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, www.ibm.com
5. Buyya R, Frequently Asked Questions. Grid Computing Info Centre, GridComputing Magazine
6. Buyya R (2003) Delivering Grid Services as ICT Commodities: Challenges and Opportunities. Grid and Distributed Systems (GRIDS) Laboratory, Dept. of Computer Science and Software Engineering, The University of Melbourne, Melbourne, Australia, IST (Information Society Technologies), Milan, Italy
7. Catlett C (2003) The Rise Of Third-Generation Grids. Global Grid Forum Chairman, Grid Connections, Fall 2003, Volume 1, Issue 3, The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA
8. Chervenak A, Foster I, Kesselman C, Salisbury C, Tuecke S (1999) The Data Grid: Towards An Architecture For The Distributed Management And Analysis Of Large Scientific Datasets
9. Chetty M, Buyya R (2002) Weaving Computational grids: How Analogous Are they With Electrical grids?, Computing In Science & Engineering, July/August 2002, IEEE
10. Cohen RB, Feser E (2003) Grid Computing, Projected Impact in North Carolina’s Economy & Broadband Use Through 2010. Rural Internet Access Authority, September 2003
11. developerWorks staff (2003) Start Here to learn about Grid Computing, August 2003, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States
12. Fellows W (2003) IBM’S Grid Computing Push Continues. Gridtoday, Daily News & Information for the Global Grid Community, October 6, 2003: VOL. 2 NO. 40, Published by Tabor Communications Inc, 8445 Camino Santa Fe, San Diego, California 92121, (858) 625-0070
13. Foster I (2002) The Grid: A New Infrastructure for 21st Century Science. Physics Today, 55 (2) pp. 42-47
14. Foster I, Kesselman C (eds) (1999) The Grid: Blueprint for a Future Computing Infrastructure. Morgan Kaufmann Publishers, San Francisco, CA
15. Foster I (2002) What is the Grid? A Three Point Checklist. Argonne National Laboratory & University of Chicago, July 20, 2002, Argonne National Laboratory, 9700 Cass Ave, Argonne, IL, 60439, Tel: 630 252-4619, Fax: 630 252-5986, mailto:[email protected]
16. Foster I, Iamnitchi A (2003) On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing. 2nd International Workshop on Peer-to-Peer Systems (IPTPS’03), February 2003, Berkeley, CA
17. Foster I, Kesselman C (1997) Globus: A Metacomputing Infrastructure Toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 11 (2) pp. 115-128
18. Foster I, Kesselman C, Tuecke S (2001) The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of High Performance Computing Applications, 15 (3) pp. 200-222
19. Fox G, Pierce M, Gannon RGD, Thomas M (2002) Overview of Grid Computing Environments. GFD-I.9, Feb 2003, Copyright (C) Global Grid Forum (2002). The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA
20. Grid Computing using .NET and WSRF.NET (2004) Tutorial. GGF11, Honolulu, June 6, 2004
21. GRID Infoware Grid Computing, Answers to the Enterprise Architect Magazine Query. Grid Computing Info Centre (GRID Infoware), Enterprise Architect Magazine: http://www.cs.mu.oz.au/~raj/GridInfoware/gridfaq.html
22. Grimshaw AS, Wulf WA, the Legion Team (1997) The Legion Vision of a Worldwide Virtual Computer. Communications of the ACM, 40 (1) pp. 39-45, January 1997
23. Haney M (2003) Grid Computing: Making inroads into financial services. 24 Apr 2003, Issue No: Volume 4, Number 5, IBM’s Developerworks Grid Library, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, www.ibm.com
24. Hawk T (2003) IBM Grid Computing General Manager. Grid Computing Planet Conference and Expo, San Jose, 17 June 2002. Also as quoted by Globus Alliance, Press Release, July 1, 2003
25. Hewlett-Packard Company (2002) HP Virtualization: Computing without boundaries or constraints, Enabling an Adaptive Enterprise. HP Whitepaper, 2002, Hewlett-Packard Company, 3000 Hanover Street, Palo Alto, CA 94304-1185 USA, Phone: (650) 857-1501, Fax: (650) 857-5518, http://www.hp.com
26. Huber V (2001) UNICORE: A Grid Computing environment for distributed and parallel computing. Lecture Notes in Computer Science, 2127 pp. 258-266
27. IBM Press Releases. IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, http://www.ibm.com
28. Kesselman C, Globus Alliance, Press Releases. USC/Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292-6695, Tel: 310 822-1511 x338, Fax: 310 823-6714, mailto:[email protected] http://www.globus.org, mailto:[email protected]
29. Lehmann M (2003) Who Needs Web Services Transactions? Oracle Magazine, http://otn.oracle.com/oraclemagazine, November/December 2003
30. McCommon M (2003) Letter from the Grid Computing Editor: Welcome to the new developerWorks Grid Computing resource. Editor, Grid Computing resource, IBM, April 7, 2003
31. McDougall P (2003) Offshore Hiccups In An Irreversible Trend. Dec. 1, 2003, InformationWeek, CMP Media Publishers, Manhasset, NY
32. Minoli D (1994/5) Analyzing Outsourcing, Reengineering Information and Communication Systems. McGraw-Hill
33. Minoli D (1987) A Collection of Potential Network-Based Data Services. Bellcore/Telcordia Special Report, SR-NPL-000790, Piscataway, NJ
34. Minoli D (2005) A Networking Approach to Grid Computing. Wiley, New York
35. Minoli D (2006) The Virtue of Virtualization. Networkworld, Feb. 27
36. Minoli D, Schmidt A (1997) Client/Server Applications On ATM Networks. Prentice-Hall/Manning
37. Myer T (2003) Grid Computing: Conceptual flyover for developers. May 2003, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States
38. Otey M (2004) Grading Grid Computing. SQL Server Magazine, January 2004, Published by Windows & .Net Magazine Network, a Division of Penton Media Inc., 221 E. 29th St., Loveland, CO 80538
39. Padala P (2003) A Survey of Grid File Systems. Editor, GFS-WG (Grid File Systems Working Group), Global Grid Forum, September 19, 2003. The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA
40. Smetanikov M (2003) HP Virtualization a Step Toward Planetary Network. Web Host Industry Review (theWHIR.com) April 15, 2003. Web Host Industry Review, Inc. 552 Church Street, Suite 89, Toronto, Ontario, Canada M4Y 2E3, (p) 416-925-7264, (f) 416-925-9421
41. Sun Networks Press Release (2003) Network Computing Made More Secure, Less Complex With New Reference Architectures, Sun Infrastructure Solution. September 17, 2003. Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054, Phone: 1-800-555-9SUN or 1-650-960-1300
42. Tatebe O, Morita Y, Matsuoka S, Soda N, Sekiguchi S (2002) Grid datafarm architecture for petascale data intensive computing. In Henri E. Bal, Klaus-Peter Lohr, and Alexander Reinefeld, editors, Proceedings of the Second IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid2002), Berlin, Germany, IEEE Computer Society pp. 102-110
43. The Global Grid Forum, 9700 South Cass Avenue, Bldg 221/A142, Lemont, IL, 60439, USA, http://www.ggf.org
44. URL grid computing, http://www.gridcomputing.com/
45. Waldrop M (2002) Hook enough computers together and what do you get? A new kind of utility that offers supercomputer processing on tap. MIT Enterprise Technology Review, May 2002, Technology Review, One Main Street, 7th Floor, Cambridge, MA, 02142, Tel: 617-475-8000, Fax: 617-475-8042
46. Yankee Group (2003) Enterprise Computing & Networking Report on Utility Computing in Next-Gen IT Architectures, Yankee Group, August 2003. Yankee Group, 31 St. James Avenue (aka the Park Square Building), Boston MA 02116, (617) 956-5000
47. Yankee Group (2004) Enterprise Computing & Networking Report on Performance Management Road Map for Utility Computing, Yankee Group, February 2004. Yankee Group, 31 St. James Avenue (aka the Park Square Building), Boston MA 02116, (617) 956-5000
48. Zhang LJ, Chung JY, Zhou Q (2002) Developing Grid Computing applications, Part 1: Introduction of a Grid architecture and toolkit for building Grid solutions. October 1, 2002, Updated November 20, 2002, IBM Corporation, 1133 Westchester Avenue, White Plains, New York 10604, United States, http://www.ibm.com (IBM’s developerworks/views/grid/articles archives)
CHAPTER 13 Operational Metrics and Technical Platform for Measuring Bank Process Performance Markus Kress, Dirk Wölfing
13.1 Introduction
The progressive globalization of financial markets is requiring major adaptation on the part of market participants to move beyond national-level competition and achieve international and global competitiveness. The entire banking industry is now focusing on major performance enhancements and gains in domestic market share as a springboard to successful international expansion. Banks are concentrating their efforts on market segments offering the potential for growth and rising profits, resulting in a reorientation within the overall financial services sector. New types of banks, including distribution, processing and portfolio banks, are evolving as the market consolidates due to mergers and acquisitions. This dual trend toward specialization and consolidation is forging banks that will be able to compete in international and global markets. Performance enhancement efforts are aimed at a complete realignment of internal and inter-corporate processes. In contrast to the trend in recent years, the focus is no longer on cutting costs alone, but rather on simultaneously improving services to customers. Not only must processes become more efficient, they must be made more customer-friendly as well. Attempts are being made to transfer approaches that have proven effective in other industries, particularly manufacturing, to the financial sector. Industrialisation is the buzzword for boosting performance by applying industrial methods. What is this formula for success being borrowed from industry? Can industrialisation really mean progress in the 21st century, or is it all just another round of hype to be eventually debunked by management pundits? Are industrialised forms of process organisation truly superior to those conventionally employed by service firms?
Industry has seldom relied on intuition to address these types of questions, developing instead measurement techniques to provide answers on the basis of objective criteria, in the true engineering tradition. The same approach is now being applied to process design and realignment in a banking environment and is the
subject of the first part of the article. After an introductory discussion on measurement systems for business operational performance tracking, we next undertake to categorise the various types of measurement systems employed. This categorisation provides a basis for the definition of the concept of performance. We then outline a process model employing metrics in line with this concept of performance. The second part of the chapter then presents a model of a service-oriented IT architecture, into which we plug a number of elements for the measurement and reporting of process performance at financial services organisations.
13.2 Industrialisation in the 21st Century?
The principal idea underlying industrial production is the division of labour. An industrialised organisation generates a collective product through a series of controlled process workflows (e.g., assembly line car production). The division of labour occurs within industrial business processes (Richter-von Hagen and Stucky 2004). Industrialisation arose from the analysis of complex workflows based on the principle of the division of labour. This analysis focuses on measuring and rendering manageable discrete steps of production labour. This led to the invention of the assembly line, an era in which manufactured products were initially highly standardised with minimal automation. Each production step involved minutely specific, pre-planned physical labour, which is why factory work has been seen as monotonous for a long time. Industrial manufacturing was likewise thought to largely preclude any customisation.
Simplifying and standardising workflows made possible the introduction of mechanised and/or automated production technology. Industrialisation is thus often equated with automation. Automation is a central but not the sole constitutive element of industrialisation, as automation can also be employed in manual labour processes. The essence of industrialisation is the intelligent organisation and control of workflows within a process of discrete production steps. Modern controls now allow industrial production to be tailored to a wide variety of individual customer requirements. The mechanical assembly line of yore has been replaced by complex IT systems for controlling the manufacturing process. Knowledge and rule-based high performance systems allow unprecedented flexibility for process customisation. Upon submitting an order, customers are now able to specify the relevant production information and track production and delivery status online. The production process thus starts and ends with the customer.
The industrial idea of coordinating labour within a segmented production process was initially conceived for the manufacture of goods. Today, however, we see the same principle being employed at call centres in a service environment. The service in question is broken down into individual tasks. Incoming calls are directed to provide specialised services of a defined quality within a specific period of time. Planning and controls allow capacity to be adapted to meet customer requirements. The principle of division of labour can transcend the borders of the individual enterprise, however. In the global markets, processes are designed and managed
extending from the customer to the supplier. Companies establish overarching, multi-organisation process chains controlled via purchasing (Supply Chain Management, SCM) and sales (Customer Relationship Management, CRM) systems. Whereas originally industrialisation was concerned with organising processes in order to simplify activities¹, managing complex process steps has now become the key to successful process optimisation in the 21st century. This means abandoning a “radical cure” approach (Hammer and Champy 2003) in favour of ongoing adaptation of business processes to changing external factors by means of optimised process controls (Osterloh and Frost 2003).
13.3 Process Performance Metrics as an Integral Part of Business Performance Metrics
Processes are complex and largely interdependent systems, in which minor changes in parameters (e.g. capacity) may significantly impact overall process performance. These interdependencies and the need for ongoing adaptation to changing market and competitive conditions require an effective system of controls. Operational control systems involve a quantitative presentation of the status quo versus a given target, as well as instruments for analysis of factors influencing the complex system we refer to as “processes”. Discussion of process management today revolves mainly around challenges to technical process support, the integration of technical process steps into operational systems and the quality of technical control functions in an information technology context (see for example Kano et al. 2005 and Ray et al. 2005 for further reading). The potential that these technical capabilities offer as control parameters from an operations management perspective has not yet been considered. In operations management, process performance controls are viewed in the overall context of business performance metrics, quantitative reporting of process performance being seen as an integral part thereof (Sandt 2004; Siebert 1998).
Figure 13.1 provides an overview of concepts of metrics systems. Classic management systems employing financial and managerial accounting used at the start of the 20th century were revised and augmented toward the century’s close. Value driver concepts were developed on the financial metrics side based on the idea of shareholder value. A number of different quality management concepts have been developed in addition to financial concepts (e.g. Six Sigma). The Balanced Scorecard method integrates financial and qualitative aspects into a comprehensive control and management system. Methodological concepts based upon this method are currently being discussed (Mende 1995).
¹ Such as Frederick Winslow Taylor, who was the first to break down the labour process into discrete activities, identifying major potential for increasing productivity.
[Figure: diagram titled “Metrics concepts for business management”. Labels shown: financial metrics; non-financial metrics; accounting systems (DuPont system, ZVEI system, etc.); quality management concepts; value driver concepts; Balanced Scorecard; metrics management concept; concepts involving select metrics.]
Figure 13.1. Metrics concepts according to (Sandt 2004)
Managing risk is the primary managerial concern in the banking industry, where process-oriented thinking is less developed than in manufacturing. Initial attempts are currently being made to develop quality management systems based on methods such as Six Sigma (Moormann 2006). Process-oriented thinking in a risk management context is subsumed under the concept of operational risk. Frances Frei of the Harvard Business School has been pursuing an interesting value-based approach. In her studies, customer value is defined as the primary metric for multi-channel retail bank distribution. The modern concept of performance encompasses both financial and qualitative yardsticks such as customer retention data and process flexibility, incorporating forward-looking as well as historical elements (Schrank 2002). In the following passage we attempt to integrate this expanded notion of process performance into a value-oriented approach in order to develop a model system of process performance metrics.
13.3.1 Work Processes, Qualitative Criteria and Measuring Process Value
In the academic discussion (cf. Mende 1995), a distinction is made between metrics for process performance and for process execution. Process performance metrics break down into four categories: time, quality, costs and flexibility. Some examples of process execution metrics are: throughput times, defect rate, process reliability, customer relevancy, operational impact, information flow, directability, efficiency, know-how, information systems, motivation and innovative capability. This distinction between process performance and process configuration allows a further distinction between operative concepts and types of metrics:
1. Processes may be viewed either as a series of black box-type segments with inputs and outputs, or in terms of specific workflows.
2. Metrics may be strictly financial or may also incorporate non-financial aspects (including synthetic metrics such as scoring, etc.) relevant to analysing workflows.
The input/output concept and financial metrics involve a greater level of abstraction, making them particularly useful for simplification in an “end-to-end” view. Within such a view, overall process performance may be broken down into a manageable number of process segments. A detailed analysis could then concentrate on individual segments while remaining focused on the greater context. Figure 13.2 shows a matrix of these conceptual views with examples of each.
[Figure: matrix of process concept against metric type.
Input/output concept. Non-financial metrics: amount, time, quality, disruptive factors etc. Financial metrics: cost-income ratio, ROI etc.
Mapped workflows. Non-financial metrics: processing/set-up times, defects/rejects per machine, defects per employee. Financial metrics: ROI per machine, cost per workstation.]
Figure 13.2. Cluster diagram of various metrics systems with examples
Financial metrics based on the input/output concept can be employed to ascertain the value created through a given process. The “inputs” used are the cash outflows resulting from the process and the “outputs” are the cash inflows generated.
The present value of these cash flows represents the contribution of a given process to the sum total of added value created by an enterprise.² The presentation below mainly follows the input/output concept, because it abstracts from the details of any specific process.
13.4 Controls for Strategic Alignment and Operational Customer and Resource Management
Metrics are useful in delineating a starting situation (current), an objective (target) and the trajectory from current to target levels. Metrics are thus configured as a function of both the trajectory to be traversed and the envisioned target. Three levels of target definition can be identified:
1. The market-oriented strategic level, at which the medium- to long-range alignment of the enterprise is defined. The primary focus here is on adapting business processes so as to attain the strategic targets set. Strategic business objectives (market, cost, quality objectives etc.) are transformed into metrics and pursued on this level.
2. The customer-oriented level of products or services, at which performance outputs for customers are defined. Processes extending from and to the customer are defined on this level for each service provided. Controls are exercised within the framework of a defined product life cycle.
3. The resource level. The objective on this level is to optimally adapt capacity to short- and medium-term market demand (capacity management). Capacity is a function of personnel skills and IT systems.
13.5 Metrics Used for Input/Output Analysis
Process performance can be tracked by abstracting from concrete workflows and employing an input/output concept. Breaking down end-to-end processes into process segments is also beneficial for analytical purposes. Process segments exhibit the same characteristics as the end-to-end macro-process, namely inputs and outputs described in terms of the financial criterion cash flow, in terms of output quality and of available capacity. Process steps are viewed primarily in terms of duration and volume. Figure 13.3 presents a model process and process segments, along with the criteria required for measuring process performance.
[Figure: a process segment with its input (cash flows, i.e. expenses, based on available IT and personnel capacity), its output (cash flows, i.e. income, based on output quality) and the dimensions duration/time and volume.]
Figure 13.3. Process segment from an input/output viewpoint
² The discounted cash flow (DCF) method is employed for valuing cash flows. DCF has the advantage over traditional profitability measures that it provides a macro-level view over a number of periods without having to arbitrarily posit a particular method of depreciation.
13.6 Output Metrics: Cash Inflows as a Function of Output Quality
The concept of quality is key to the generation of cash inflows. The term ‘quality’ denotes two separate things.
Quality as a characteristic of financial services
Services differ in terms of type and delivery. A particular banking product represents a specific type of service, while delivery concerns the manner in which a banking product is made available to customers (e.g. channel, location, speed, etc.). Banking product features represent quality characteristics that can be described nearly³ entirely in terms of specific numeric attributes. Service levels are now generally used for the detailed outlining and measurement of service delivery. Time is another quality characteristic of financial services, as speed of availability makes it possible to differentiate otherwise commodity-like money products. The rapidity of loan approvals and customer-friendly features of lending and investment products (especially in the online segment) are major factors in selecting a banking partner. “Time-to-market” and “time-to-customer” are key for success in the banking business today. “Real-time” products are set to be the next major step for enhanced customer service. Time considerations are thus a critical factor and standard to most service levels. The fulfilment of quality standards is the precondition for generating cash inflows. Measurement systems are relied on to document adherence to service quality standards. Income generated is thus equal to the number of qualitatively equivalent services provided multiplied by the market prices obtained for these.
Defects as a lack of specific characteristics
The dual connotations of the term ‘quality’ are effectively explained if we consider defects as a “lack of a (promised) characteristic”. With defects, the defined “target” characteristic differs from the “actual” characteristic, i.e., the banking product or the service related to said product does not correspond to specifications or expectations. Defects are process failures that have an impact on cash inflows and the usage of resources. Quality management concepts provide a framework for the in-depth analysis of these interdependent phenomena. Measurement systems must be able to track and document defects/errors. Corrections to cash inflows arising from identified quality defects are made outside the process itself.
³ The few surviving physical products like cash are an exception. Other exceptions are contractual provisions governing work to be performed (terms and conditions, etc.).
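The relationship between fulfilled quality standards and income stated in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `ServiceDelivery`, the service-level threshold `max_hours` and the rule that a late or defective delivery earns no income inside the process are hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass
class ServiceDelivery:
    """One delivered service (hypothetical record layout)."""
    turnaround_hours: float  # measured time from request to delivery
    defect: bool             # True if a promised characteristic is missing


def meets_service_level(delivery: ServiceDelivery, max_hours: float) -> bool:
    """A delivery conforms if it is free of defects and delivered on time."""
    return not delivery.defect and delivery.turnaround_hours <= max_hours


def income(deliveries, price: float, max_hours: float) -> float:
    """Income = number of conforming services x market price (assumed rule)."""
    conforming = sum(1 for d in deliveries if meets_service_level(d, max_hours))
    return conforming * price


deliveries = [
    ServiceDelivery(turnaround_hours=20.0, defect=False),  # conforming
    ServiceDelivery(turnaround_hours=60.0, defect=False),  # too slow
    ServiceDelivery(turnaround_hours=10.0, defect=True),   # defective
]
print(income(deliveries, price=150.0, max_hours=48.0))  # only one service counts
```

Keeping the conformity check in its own function mirrors the point made in the text that measurement systems document adherence to service levels separately from the valuation of the resulting cash inflows.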
13.7 Cash Outflows as a Function of Available Capacity
The concept of “capacity” relates to the input side of the equation in the same way that “quality” relates to the delivery aspect (the “output” of a financial service process). Bank process capacity is a function of two resources: personnel and information technology. Capacity depends on the amount of resources made available for a process. IT resources are those IT functionalities required for a specific activity, while personnel resources are the available man hours of a specific personnel function. Available but unused capacity is referred to as idle capacity. Providing capacity typically generates cash outflows.
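The notions of available capacity, idle capacity and the resulting cash outflows can be made concrete with a small sketch. The figures, the hourly cost rate and the assumption that outflows depend on the capacity provided rather than on the capacity used are illustrative only.

```python
def idle_capacity(available_hours: float, used_hours: float) -> float:
    """Available but unused capacity, in hours."""
    if used_hours > available_hours:
        raise ValueError("used capacity cannot exceed available capacity")
    return available_hours - used_hours


def capacity_outflow(available_hours: float, cost_per_hour: float) -> float:
    """Cash outflow for providing capacity, used or not (assumption)."""
    return available_hours * cost_per_hour


def utilisation(available_hours: float, used_hours: float) -> float:
    """Share of the provided capacity that is actually used."""
    return used_hours / available_hours


# Hypothetical personnel capacity of a loan-processing team for one week
available, used = 400.0, 310.0
print(idle_capacity(available, used))          # hours of idle capacity
print(capacity_outflow(available, 55.0))       # outflow for the week
print(round(utilisation(available, used), 3))  # utilisation ratio
```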
13.8 Time and Volume as Non-financial Metrics of Process Performance
Time
Time as a process characteristic differs from time as a delivery requirement in connection with service levels. As a process characteristic, time is an indicator of process performance. Turnaround time, processing time, down time, make-ready time, waiting time and other times are examples of non-financial work process characteristics used to describe process performance. The technical documentation of the start and finish of a given processing event depends on the specific organisation of workflows. The concept of processing time must first be defined and applied to the work process before technical documentation can take place.
Volume
Volume represents the number of business events within a larger process or within individual process segments. This figure always refers to business events or services to the customer. Modular process segments make it possible to provide various service packages consisting of standardised individual services. The number of business events per process segment is thus an important measure for the standardisation of services. Volume is another non-financial process performance characteristic.
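As a sketch of how time and volume could be derived once start and finish events have been documented, the following Python example computes both from a small event log. The log layout, the segment names and the definitions used (turnaround as first start to last finish, waiting time as turnaround minus processing time, which presumes segments that do not overlap) are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

# (business event id, process segment, start, finish): hypothetical log records
log = [
    ("loan-1", "data entry", datetime(2007, 3, 1, 9, 0), datetime(2007, 3, 1, 9, 20)),
    ("loan-1", "approval", datetime(2007, 3, 1, 11, 0), datetime(2007, 3, 1, 11, 30)),
    ("loan-2", "data entry", datetime(2007, 3, 1, 9, 30), datetime(2007, 3, 1, 9, 45)),
]


def minutes(delta):
    """Convert a timedelta to minutes."""
    return delta.total_seconds() / 60


def time_metrics(log, event_id):
    """Turnaround, processing and waiting time (minutes) for one business event."""
    records = [r for r in log if r[0] == event_id]
    turnaround = minutes(max(r[3] for r in records) - min(r[2] for r in records))
    processing = sum(minutes(r[3] - r[2]) for r in records)
    return {"turnaround": turnaround, "processing": processing,
            "waiting": turnaround - processing}


def volume_per_segment(log):
    """Number of business events handled in each process segment."""
    return Counter(segment for _, segment, _, _ in log)


print(time_metrics(log, "loan-1"))
print(volume_per_segment(log))
```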
13.8.1 Fundamental Input/Output Process Performance Characteristics
Information on performance quality and quantity, type and amount of resources provided and process time and volume allow for the documenting of process performance and creation of added value. The following productivity measures represent fundamental metrics:
$$\frac{\text{Unit of time}}{\text{Unit of performance}} \quad \text{or} \quad \frac{\text{Unit of performance}}{\text{Unit of time}} \quad \text{or} \quad \frac{\text{Unit of performance}}{\text{Resource}}$$
and value-creation metrics including:
$$\frac{\text{Income}}{\text{Unit of volume}}, \qquad \frac{\text{Income}}{\text{Unit of time}}, \qquad \frac{\text{Cash outflows}}{\text{Unit of volume}}, \qquad \frac{\text{Cash outflows}}{\text{Unit of time}}$$
The Discounted Cash Flow (DCF) method is applied to determine the value of cash flows generated by a given process. In a value-based approach, the resulting figure sums up process performance. Performance may also incorporate forward-looking elements by factoring in the present value of projected cash flows. Process segments can also be further defined to include other non-financial metrics within an “end-to-end” view. Process performance data obtained on the basis of these fundamental metrics permits studying causalities and potential performance improvements as well as the mapping of more complex interrelationships (such as are denoted by the concept of flexibility).
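The value-based view can be illustrated with a short sketch that discounts the net cash flows of a process and derives two of the ratio metrics listed above. The discount rate, the cash flow figures and the convention that cash flows arrive at the end of each period are assumptions chosen for the example, not values from the chapter.

```python
def present_value(cash_flows, rate):
    """Discount cash flows received at the end of periods 1, 2, ... to today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


def process_value(inflows, outflows, rate):
    """Present value of the net cash flows of a process or process segment."""
    net = [i - o for i, o in zip(inflows, outflows)]
    return present_value(net, rate)


def per_unit(amount, units):
    """Generic ratio metric, e.g. income per unit of volume or per unit of time."""
    return amount / units


# Hypothetical figures for three periods
inflows = [120_000.0, 130_000.0, 140_000.0]   # income
outflows = [100_000.0, 100_000.0, 100_000.0]  # cash outflows for capacity
volume = [4_000, 4_200, 4_500]                # business events per period

print(round(process_value(inflows, outflows, rate=0.08), 2))
print(per_unit(inflows[0], volume[0]))   # income per unit of volume, period 1
print(per_unit(outflows[0], volume[0]))  # cash outflows per unit of volume, period 1
```

Projected cash flows can be passed to `process_value` in the same way as historical ones, which is how the forward-looking element mentioned above would enter the figure.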
13.9 Technical Realization
While the sections above discussed the business view of business process performance (BPP), the last part of this article addresses the technical aspects,
especially the monitoring of BPP. This section analyzes how the business requirements for this kind of monitoring can be met by a technical implementation. The resulting technical monitoring approach depends on the system landscape in which the business processes run. It makes a considerable difference whether a process is implemented end-to-end in one application or distributed among various non-connected systems, since it is obviously more difficult to extract and relate data from different data sources. The monitoring approach therefore needs to consider and address the specific system architecture. As a service-oriented architecture (SOA) is a promising architecture paradigm due to the flexibility and agility it provides, the monitoring approach introduced in this paper is designed specifically for SOAs. There is currently a clear trend toward SOA, so the decision to examine monitoring in SOAs can be seen as market- or practice-driven. The terms monitoring and business process are used here in a technical context, so their meaning differs from that used above. Monitoring here means measuring the key performance indicators (KPIs) of business processes by extracting process data and performing calculations to obtain the KPI values. Business processes consist of a number of activities which require human interaction or automatic processing. In a SOA these activities are related to services, so a business process can be defined as a set of orchestrated services. Through this orchestration and the standardized management of services, a SOA separates the what (processes and metrics) from the how (resources) (O’Connel et al. 2006). To recapitulate, the technical monitoring of BPP has to implement the business requirements stated above within a SOA.
13.9.1 Monitoring Challenges
Monitoring business processes implemented in a SOA environment poses various challenges. The main challenge of performance monitoring within a SOA lies in its distributed nature. SOA monitoring has to span different components as well as architectural layers, which makes data collection difficult or at least labor-intensive. Data required for KPI calculations has to be retrieved from these different sources. In practice it is often complicated to correlate the data, e.g. with the corresponding business process instance. Furthermore, there is no standardized reference architecture. There are many different SOA stacks with various types of services and service hierarchies. In practice, SOAs are implemented in many different ways, depending on the existing system landscape. Legacy systems as well as strategically chosen tools affect the architecture. These heterogeneous service implementations are the second major challenge that must be dealt with. This heterogeneity may cause difficulties in data collection, e.g. when the services use different data formats and encodings. Generally, monitoring of business process performance requires a comparison between target and actual data. While target data is defined manually in advance,
actual data accrues during the execution of business processes. The technical solution must be able to collect this actual data automatically. Specific data transformations like format changes, mappings and aggregations may be necessary to provide the business view of the monitoring data in an adequate way. The key questions that need to be addressed are:
• what data must be monitored
• where is the data stored
• how can the data be retrieved
• how can data from different sources be correlated to each other (data redundancies and bad data quality can make the correlation even more challenging)
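The comparison between manually defined target data and automatically collected actual data can be sketched as follows. The KPI names, the limits and the use of a simple average as the aggregation step are hypothetical choices made for this illustration.

```python
from statistics import mean

# Target data, defined manually in advance (hypothetical KPIs and limits)
targets = {
    "turnaround_hours": {"limit": 48.0, "better": "lower"},
    "first_pass_rate": {"limit": 0.95, "better": "higher"},
}


def compare(targets, actuals):
    """Return, per KPI, the actual value, the target and whether it is met."""
    report = {}
    for name, target in targets.items():
        actual = mean(actuals[name])  # aggregation of the collected raw values
        if target["better"] == "lower":
            met = actual <= target["limit"]
        else:
            met = actual >= target["limit"]
        report[name] = {"actual": actual, "target": target["limit"], "met": met}
    return report


# Actual data, as it might be collected during process execution
actuals = {
    "turnaround_hours": [30.0, 52.0, 41.0],
    "first_pass_rate": [0.97, 0.92, 0.93],
}
report = compare(targets, actuals)
for name, row in report.items():
    print(name, row)
```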
13.10 Related Work
As SOA-based business process management (BPM) is a relatively new approach, monitoring concepts are still in development and information on this topic is scarce. The majority of existing SOA monitoring approaches are technically motivated and therefore mainly consider non-functional aspects. This kind of monitoring comprises the measurement of technical response times, error localization and the determination of the technical Quality of Service (QoS), among others. Various techniques and protocols can be applied for this kind of monitoring, e.g. URL probing,⁴ SNMP requests and traps,⁵ or NetFlow.⁶ Besides these technical aspects, measurement of BPP additionally requires the monitoring of sequences of service calls, as processes are orchestrations of services, as described in the technical introduction. Only a few articles consider the measurement of business process performance based on business-driven KPIs in a SOA. One such approach is described in (Kano et al. 2005). However, these approaches do not address the special characteristics of SOA monitoring, especially its heterogeneity and distributed nature, nor do they address the different possibilities for implementing a SOA. As stated before, these SOA-specific characteristics lead to various challenges. A multitude of monitoring approaches that are not necessarily SOA-specific exists, such as business activity monitoring (BAM), OASIS Common Base Event Infrastructure and IBM Common Event Infrastructure, among others, which could be applied for implementing SOA monitoring. BAM in particular is increasingly used in BPM and SOA projects.
⁴ Monitoring of URLs to make sure that the response is as expected.
⁵ Standardized protocol for monitoring and controlling agents from central management stations.
⁶ Open (but proprietary) Cisco protocol for collecting IP traffic information.
13.11 SOA Based Business Process Performance Monitoring
As mentioned above, the technical solution presented in this paper addresses performance monitoring of business processes running in SOA environments.
Figure 13.4. SOA performance monitoring architecture
The monitoring approach is explained by means of a service-based reference architecture that has been successfully applied in several projects. This architecture comprises five layers, each representing specific functionalities and system types (see Figure 13.4). The presentation layer is the interface between user and system. On this layer the appearance of the graphical user interface, such as masks and forms, is defined. The business process layer describes the business process with a focus on the business view. The two layers below represent the service hierarchy, which is based here on an order acceptance example. There are two kinds of services: enterprise services and integration services. While enterprise services encapsulate business functionalities, integration services have a more technical focus, such as data transformations and data access. The application and persistence layer contains all systems and databases existing in the enterprise. The layers in Figure 13.4 are split vertically in order to illustrate the business process and monitoring-specific aspects. The left side shows the architectural design for implementing business processes without any monitoring aspects. These are depicted on the right side. As
the monitoring approach does not require any business process execution itself, the right-hand business process layer is empty. The main advantages of this reference architecture, and of SOA in general, are the clear separation of different functionalities and the loose coupling between components. The loose coupling of business processes and legacy systems is achieved by the two service layers, which provide the required abstraction. This loose coupling, together with the existing standards, leads to the adaptability and flexibility mentioned earlier. Due to its adaptable, flexible style of architecture, a SOA provides the foundation for shorter time-to-market and for reduced costs and risks in development and maintenance (Pallos 2001). Furthermore, so-called “service enabling” can leverage existing investments in legacy applications. SOA should not be equated with web services, as these are only one way of implementing a SOA. Web services are widely used as they are based on platform-independent standards. They are therefore the enabling technology and driver for the SOA trend mentioned earlier. Although SOA is based on standards, there is a multitude of vendor-specific so-called SOA stacks, which consist of different service hierarchies. Many vendors are mainly driven by their existing tool set. Furthermore, the service types within a SOA are not standardized; again, there are proprietary vendor-specific definitions of service types. Services can be divided into business and technical services, or into business, abstraction, and access services. Besides these classifications, there are others that use different service definitions. For example, the IBM SOA reference architecture uses services such as interaction or partner services, among others (High et al. 2005). The focus here is not on the specific tools and platforms for implementing such an architecture, but on the different functionalities each layer represents.
For example, there are special software systems called business process management systems (BPMS) for the technical implementation and support of business processes. These systems can be defined as “specialist rapid application development software for the automation of rules based processes …” (Pyke 2005). They are also able to work with services, so they can be used in a SOA environment. Our monitoring approach does not necessarily require a BPMS. Despite the importance of BPM technology (Miers 2005), many companies do not yet have a BPMS in use. If the process is implemented technically, EAI (enterprise application integration) or portal software is most commonly used instead. In view of this situation, our approach makes the use of a BPMS optional. Within this architecture the monitoring solution must be able to collect the data generated during business process executions. Which data can be monitored without much effort depends on the software tools in use, as many tools provide monitoring capabilities. For example, most BPMSs provide sophisticated monitoring support on the workflow layer but lack it on the other service layers. Using an architecture as described enables detailed and flexible monitoring over all layers, even if the business process is not implemented completely end-to-end in one software system. In our technical solution there is a special monitoring channel (see Figure 13.4) for the collection of monitoring data. Services have to publish messages
containing the monitoring data on this channel.7 This is done by either using a special monitoring service or by publishing the data on the channel directly. This message-based asynchronous communication ensures that the business process execution is independent from the monitoring application. If the monitoring application fails the business process execution can proceed undisturbed. Standards like JMS8 can be used for performing read or write operations on the channel. In the non-Java world a separate monitoring service or special products like the .NET9 based ActiveJMS10 could be used. The message-driven approach delivers data to the monitoring application in real time, ensuring real time monitoring. As business processes are an orchestration of services that may consist of services themselves – depending on the service hierarchy, context information must be provided to each service call. This context information must contain all information required for relating service calls to the corresponding business process, sub process, and or activity. This context information solves the data relating problem described above. For example context information could comprise an unique ID for identifying the business process instance the service call belongs to (and of course it relates the data that is exchanged via the service call). Using this approach an error situation can be determined if it occurs. If a proactive monitoring approach is required, services must be instrumented for it. For example business processes or transactions must be executable in a test mode performing service calls for tracing only. One such approach is described in (Roch 2006). The monitoring solution has various advantages, as it: • • • • •
works in heterogeneous system landscapes solves the data correlation difficulties is independent from the concrete SOA implementation provides real time monitoring can be extended to gain a proactive business performance management instrument
Also there are some disadvantages: • higher network load • existing services must be adapted • If process steps are not implemented as services, monitoring data has to be extracted from systems rather than in the service layer Once the monitoring data is extracted it has to be stored somewhere. In practice there is often only a daily delivery of data after close of business when huge amount of data is transferred into a data warehouse in a batch run. The use of 7
8 9
10
for better visibility not all arcs between the services and the monitoring channel are drawn. Java Message Service, messaging standard for exchanging messages. Development and execution environment for creating Microsoft Windows-based applications. Open source software providing JMS functionality to .NET clients.
13 Measuring Bank Process Performance
a data warehouse to store the monitoring data is the traditional approach. Another, newer approach is based on complex event processing (CEP). If required, the CEP and data warehouse services (both illustrated in Figure 13.4) can use other services to retrieve further data (this is indicated by the dashed arc in Figure 13.4). This could be the case if not all business process steps are implemented as services, so that the data cannot be published on the monitoring channel. Both the data warehouse approach and the CEP approach are explained in the following sections.
13.11.1 Data Warehouse

A data warehouse is a special kind of database for storing historical data in a way that provides fast data access and reporting functionality. The data warehouse is loaded with monitoring data via an ETL process (extract, transform, and load). In our approach the extraction of monitoring data is done by the services, which have to publish monitoring data on the channel as mentioned above. The monitoring application is responsible for the transformation and loading. To accomplish this, it has to collect the monitoring data from the channel, transform it in accordance with the requirements and send it to the data warehouse. The actual loading technique depends on the data warehouse in use: there are file-based and web service-based approaches as well as completely proprietary methods. Within the data warehouse additional transformations may take place. Once the data is finally stored in the data warehouse it can be used for reports, queries or dashboards. Different standards for accessing data stored in a data warehouse exist, like XMLA11.

According to (Fingerhut 2006) a pure data warehouse approach based on fixed data models is not adequate for real-time monitoring in a fast changing environment, as adapting the underlying data models is costly. A more adequate approach can be complex event processing.
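The division of labour in this ETL process can be illustrated with a minimal Python sketch. It is not the implementation used by the authors: the in-memory queue stands in for the monitoring channel (for example a JMS topic), an in-memory SQLite database stands in for the data warehouse, and all names (publish, transform, load_all, service_calls, the process ID loan-42) are hypothetical.

```python
import json
import queue
import sqlite3

# Hypothetical stand-in for the monitoring channel (e.g. a JMS topic).
channel = queue.Queue()

def publish(process_id, activity, started, ended):
    """Extract step: a service publishes monitoring data plus context information."""
    channel.put(json.dumps({
        "process_id": process_id,  # context: business process instance
        "activity": activity,
        "started": started,        # seconds relative to an arbitrary reference
        "ended": ended,
    }))

def transform(raw_message):
    """Transform step: parse the message and derive the call duration."""
    msg = json.loads(raw_message)
    return (msg["process_id"], msg["activity"], msg["started"],
            msg["ended"], msg["ended"] - msg["started"])

def load_all(connection):
    """Load step: drain the channel into a warehouse table (SQLite as a stand-in)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS service_calls ("
        "process_id TEXT, activity TEXT, started REAL, ended REAL, duration REAL)")
    loaded = 0
    while not channel.empty():
        connection.execute("INSERT INTO service_calls VALUES (?, ?, ?, ?, ?)",
                           transform(channel.get()))
        loaded += 1
    connection.commit()
    return loaded

# Two service calls of the same (made-up) process instance.
publish("loan-42", "credit_check", 0.0, 2.5)
publish("loan-42", "approval", 10.0, 11.0)

warehouse = sqlite3.connect(":memory:")
rows_loaded = load_all(warehouse)
total_duration = warehouse.execute(
    "SELECT SUM(duration) FROM service_calls WHERE process_id = 'loan-42'"
).fetchone()[0]
```

Because the services only put messages on the queue, a failure in transform or load_all would not affect them, which mirrors the decoupling argument made for the monitoring channel.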
13.11.2 Complex Event Processing

Complex Event Processing (CEP), originally developed by David Luckham at Stanford University, is now being adopted by more and more commercial tool vendors (Tibco, Progress Software, etc.). CEP is a promising approach. Its starting point is a so-called event cloud, which is the result of event-generating IT systems. As events do not necessarily arrive in a predefined order and may also have complex relationships, they form this cloud. CEP focuses on the extraction of information from this cloud by using special algorithms that consider cause and time of
11 XML for Analysis defines message interfaces for the interaction between clients and analytical data providers based on an industry standard language.
Markus Kress, Dirk Wölfing
events. Special maps and filters can be applied to provide different abstractions and views on the data, e. g. for determining existing event patterns. Another related approach (a subset of CEP) is event stream processing (ESP), which assumes that events are ordered by time. As the events are ordered by time they form a stream, and not a CEP-like cloud. A streaming database is a special kind of database designed for ESP to capture data and time very efficiently. In our scenario a streaming database could be used instead of a data warehouse. In contrast to ESP, CEP is somewhat more costly in terms of processing time, as the event order has to be determined. Still, CEP is supposed to be the more efficient approach for real-time monitoring of huge amounts of data. A data warehouse service could provide additional data, so CEP does not necessarily replace a data warehouse. This depends on the specific scenario. The disadvantage of CEP and ESP compared to the data warehouse approach is the missing standardization.

After introducing and explaining the architecture for monitoring BPP in a SOA environment, especially how data can be extracted and related from different sources, the implementation of a measurement environment is discussed. The most important measurement categories listed in the first part of this paper are time, number, capacity, and quality. Flexibility was also stated, but as it cannot be monitored automatically we will ignore it here. In the following sections we discuss which factors can be measured within a SOA and under which prerequisites.
13.11.3 Measuring “Time”

It was mentioned that three types of time have to be measured: cycle time, processing time, and idle time. Under the assumption that all technical as well as manual processing steps are implemented as services, the cycle time is calculated as the difference between the time the last service completed and the time the first service was called. The monitoring tool must be able to determine when the last service call has been performed. If this is not possible, business processes never finish according to the monitoring reports.

The measurement of processing and idle times must be distinguished for humans and technical systems. The calculation for humans is more difficult. It is not possible to determine the real time a case worker is working on a specific case, for two reasons. Firstly, offline working time cannot be measured. This is the time the case worker works on a case without using any technical system. Secondly, even if a case worker has started to use a technical system, it cannot be assured that he works on the corresponding case all the time until he completes the processing. For example, he could have opened a mask for a case and then gone to lunch. Idle times can be calculated as the difference between the working time and the processing time. As there is a measurement error in the processing time
it will affect the idle time as well. Like the processing time, the idle time cannot be measured directly. The processing time is the basis for the calculation of the load factor of resources. Legal aspects have to be considered as well when the work of employees is monitored. In some countries it may be illegal to monitor employees directly, while anonymous role-based monitoring may be allowed.

The processing time and idle time of services are simple to measure, as the start and end of a service call are known. The processing time corresponds exactly to the service call duration.
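For technical services, the cycle time and processing time defined above can be derived directly from the monitoring records. The following Python sketch assumes a hypothetical record layout (process ID, service name, call start, call end, with times in seconds) and assumes that the service calls of one process instance do not overlap, so that their durations can simply be added up.

```python
from collections import defaultdict

# Hypothetical monitoring records: (process_id, service, call_start, call_end),
# with times given in seconds.
calls = [
    ("p1", "open_account", 0, 5),
    ("p1", "check_identity", 20, 30),
    ("p1", "send_contract", 45, 50),
    ("p2", "open_account", 10, 14),
]

def time_metrics(records):
    """Return {process_id: (cycle_time, processing_time)}."""
    grouped = defaultdict(list)
    for process_id, _service, start, end in records:
        grouped[process_id].append((start, end))
    metrics = {}
    for process_id, spans in grouped.items():
        first_call = min(start for start, _ in spans)
        last_completion = max(end for _, end in spans)
        # Cycle time: last service completed minus first service called.
        cycle_time = last_completion - first_call
        # Processing time of services: sum of the service call durations.
        processing_time = sum(end - start for start, end in spans)
        metrics[process_id] = (cycle_time, processing_time)
    return metrics

metrics = time_metrics(calls)
```

The sketch deliberately leaves out human processing time, since, as explained above, it cannot be measured reliably from system data alone.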
13.11.4 Measuring “Volume”

For measuring the volume, the number of business processes running or finished must be determined. As each service call has to contain context information identifying the business process it is related to, it is relatively easy to calculate the number of business processes, sub processes or activities. Depending on the context information provided, the number per department, product level, or other category of interest can be determined as well. If there is no context information at all and the service cannot be altered, the measurement can be difficult, as already stated. As the volume has to be correlated with time, e. g. when the number of active processes in a specific period of time has to be measured, the timestamp of each measurement has to be stored.

Based on the volume measurement a reuse level of services can be calculated. The reuse level determines how often a service was called from different clients or organizational units. The measurement of business events per sub process or activity in combination with the reuse level can serve as an indicator for standardization.
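Both volume figures can be sketched in a few lines of Python. The record layout and the sample data are hypothetical. As simplifying assumptions, a process instance counts as active from its first service call to its last completed call, and the reuse level of a service is taken to be the number of distinct clients that called it.

```python
# Hypothetical monitoring records:
# (process_id, service, client, call_start, call_end), times in seconds.
calls = [
    ("p1", "check_identity", "retail", 0, 10),
    ("p1", "score_credit", "retail", 20, 30),
    ("p2", "check_identity", "corporate", 25, 40),
    ("p3", "check_identity", "retail", 50, 60),
]

def process_spans(records):
    """Return {process_id: (first_call_start, last_call_end)}."""
    spans = {}
    for process_id, _service, _client, start, end in records:
        first, last = spans.get(process_id, (start, end))
        spans[process_id] = (min(first, start), max(last, end))
    return spans

def active_processes(records, window_start, window_end):
    """Count process instances whose span overlaps the given time window."""
    return sum(1 for first, last in process_spans(records).values()
               if first < window_end and last > window_start)

def reuse_level(records):
    """Return {service: number of distinct clients that called it}."""
    clients = {}
    for _process_id, service, client, _start, _end in records:
        clients.setdefault(service, set()).add(client)
    return {service: len(callers) for service, callers in clients.items()}

active_in_window = active_processes(calls, 15, 35)
reuse = reuse_level(calls)
```

Storing the timestamps with each record is what makes the window query possible, which is the point made above about correlating volume with time.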
13.11.5 Measuring “Capacity”

Both human and technical resources are considered in the capacity measurement. The capacity of human resources corresponds to the number of case workers and their working time. These case workers can be assigned to specific roles or to specific business process activities, depending on the system in use. For example, in role-based systems a role determines which process steps a case worker is allowed and able to perform. The capacity can be given as the amount of case workers (in hours or persons) per business process, business step, or role, among others.

The capacity of technical resources is calculated based on the technical equipment in use. The equipment can be a single server or a server farm, the number of processors, or the memory, among others. As different services may run on the same server(s), the share of resource usage of each specific service must be calculated.
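The text does not prescribe how the share of a service in a shared server is to be determined. One simple possibility, shown here purely as an assumption, is to allocate the server in proportion to the measured processing time of each service. The service names and figures are made up.

```python
def resource_shares(processing_seconds):
    """Split a shared server among services in proportion to processing time.

    The proportional rule is an assumption for illustration, not a method
    taken from the chapter.
    """
    total = sum(processing_seconds.values())
    if total == 0:
        return {service: 0.0 for service in processing_seconds}
    return {service: seconds / total
            for service, seconds in processing_seconds.items()}

# Hypothetical processing time per service on one server (seconds per day).
usage = {"check_identity": 300.0, "score_credit": 100.0}
shares = resource_shares(usage)
```

Other allocation keys, such as memory consumption or number of calls, could be substituted without changing the structure of the calculation.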
13.11.6 Measuring “Quality”

The term quality can be understood in two ways: as a characteristic and as a divergence from predefined objectives. Each business process is related to at least one product, which is processed by that business process. Quality in the first sense describes which characteristics the product must have in certain states of the business process. If the characteristics can be described using attributes, plausibility checks can be applied to assure them. It could be monitored how often plausibility checks failed and how much time had to be invested in rework (with a measurement error, as mentioned above for the processing time).

The second aspect of quality is handled in most cases by defining SLAs. An SLA defines the target settings a business process must keep. The quality measurement makes use of the other performance measurements like time and amount. These measurements correspond to the actual data, the predefined SLAs to the target data. To measure the quality, a comparison between actual and target data must be made.
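The comparison of actual and target data can be sketched as follows. The KPI names and values are hypothetical, and the sketch assumes that every SLA target is an upper bound, so that a positive deviation means the SLA was missed.

```python
# Hypothetical SLA targets (upper bounds) and measured actual values.
sla_targets = {"cycle_time_hours": 48.0, "failed_checks_per_100_cases": 5.0}
actuals = {"cycle_time_hours": 52.0, "failed_checks_per_100_cases": 3.0}

def sla_deviations(actual, target):
    """Return actual minus target per KPI; positive values exceed the target."""
    return {kpi: actual[kpi] - target[kpi] for kpi in target}

deviations = sla_deviations(actuals, sla_targets)
violated = sorted(kpi for kpi, delta in deviations.items() if delta > 0)
```

For targets that are lower bounds (for example a minimum share of cases processed without rework), the sign of the comparison would have to be reversed.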
13.11.7 Measuring “Costs” or “Discounted Cash Flows” (DCF)

The costs or DCFs are not measured directly. Instead, costs and DCFs are calculated ex post based on the measured volume (KPI “volume”) and the capacities provided (KPI “capacity”). The volume provides information on how often and to what extent resources (either human or technical systems) were used. In combination with the related cost information, activity-based costing can be performed. Expense streams caused by the execution of business processes are the result of this kind of accounting technique.
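A minimal activity-based costing calculation along these lines might look as follows in Python. The activities, volumes, resource usage per execution and hourly rates are invented for illustration; the chapter itself gives no figures.

```python
# Hypothetical inputs per activity: measured volume, measured resource usage
# per execution (hours) and the cost rate of the resource (per hour).
activities = {
    "check_identity": {"volume": 200, "hours_per_execution": 0.25, "hourly_rate": 60.0},
    "score_credit": {"volume": 400, "hours_per_execution": 0.125, "hourly_rate": 16.0},
}

def activity_costs(data):
    """Cost per activity = volume x resource usage per execution x cost rate."""
    return {name: a["volume"] * a["hours_per_execution"] * a["hourly_rate"]
            for name, a in data.items()}

costs = activity_costs(activities)
total_cost = sum(costs.values())
```

Here the first activity costs 200 x 0.25 x 60 = 3000 and the second 400 x 0.125 x 16 = 800, giving a total of 3800 for the period.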
13.12 Conclusion

The continuous increase of productivity in the financial industry requires more sophisticated methods than the traditional automation of single functions like Straight Through Processing (STP). The continuous improvement of processes through more specialized and standardized process segments, combined with quality and capacity management, requires metrics to control the process performance. A very practical method to measure the process performance is based on components of the end-to-end processes. Each component has input and output interfaces and is valued with financial indicators like cost or incoming and outgoing cash flows. For detailed analysis, non-financial process indicators are required. By this
methodology the overall metrics of the processes are split into parts that work together to produce the quality of service the customer expects. The information needed to control the process technically can be produced by a monitoring system integrated in a SOA. Depending on the specific scenario, various technical approaches exist to build such a monitoring solution, which has to cope with increasing data volume as well as constant changes. The presented solution provides business-driven monitoring in challenging technical environments. Currently the implementation effort in heterogeneous environments tends to be enormous, but it can be anticipated that more sophisticated tools will become available in the near future that will simplify such implementations.
References

Achenbach W, Lieber K, Moormann J, Six Sigma in der Finanzbranche, 2. Auflage, Bankakademie Verlag, Frankfurt
Fingerhut S (2006) Achieving Operational Awareness With Business Activity Monitoring (BAM), Business Integration Journal, May/June 2006, from http://www.bijonline.com/index.cfm?section=article&aid=272
Frei FX, Harker PT (2000) “Value Creation and Process Management: Evidence from Retail Banking.” In Creating Value in Financial Services, edited by Melnick E, Nayyar P, Pinedo M and Seshadri S, Kluwer Academic Publishers
Frei F, Kalakota R, Leone A, Marx L (1999) Process Variation as a Determinant of Bank Performance: Evidence from the Retail Banking Study, Management Science, Vol. 45, No. 9, pp. 1210−1220
Hammer M and Champy J (2003) Reengineering the Corporation, New York
High J, Kinder S, Graham S (2005) IBM’s SOA Foundation, White Paper, from http://download.boulder.ibm.com/ibmdl/pub/software/dw/webservices/ws-soa-whitepaper.pdf
Richter-von Hagen C and Stucky W (2004) Business-Process- und Workflow-Management: Prozessverbesserung durch Prozessmanagement, Teubner Verlag, Reihe Wirtschaftsinformatik, Stuttgart
Kahno M, Koide A, Liu T, Ramachandran B (2005) Analysis and Simulation of Business Solutions in a Service-oriented Architecture, IBM Systems Journal, Vol. 44, No. 4
Mende M (1995) Ein Führungssystem für Geschäftsprozesse, Dissertation at Hochschule St. Gallen, Bamberg
Miers D (2005) BPM: driving business performance, White Paper, BP Trends, from http://www.filenet.com/English/Products/White_Papers/bpmdbp.pdf
O’Connel J, Pyke J, Whitehead R (2006) Mastering your organization’s processes, Cambridge University Press
Osterloh M, Frost J (2003) Kernkompetenz Prozessmanagement, Gabler Verlag, Wiesbaden
Pallos MS (2001) Service-Oriented Architecture: A Primer, EAI Journal, from http://www.eaijournal.com/PDF/SOAPallos.pdf
Pyke J (2005) Waking up the BPM paradigm, from http://www.itweb.co.za/sections/business/2005/0508190822.asp
Ray G, Muhanna W, Barney J (2005) Information Technology and the Performance of the Customer Service Process: A Resource Based Analysis, MIS Quarterly, Vol. 29, No. 4, December
Roch E (2006) Monitoring and Management within SOA: Lessons from a SOA Early Adopter, from http://blogs.ittoolbox.com/eai/business/archives/monitoring-and-managementwithin-soa-lessons-from-a-soa-early-adopter-8473
Sandt J (2004) Management mit Kennzahlen und Kennzahlensystemen, Wiesbaden
Schrank R (2002) Neukonzeption des Performance Measurements, Verlag Wissenschaft und Praxis, Mannheim
Siebert G (2002) Performance Management, Leistungssteigerung mittels Benchmarking, Balanced Scorecard und Business Excellence-Modell, Deutscher Sparkassenverlag, Stuttgart
Wölfing D (2006) Six Sigma und Business Process Management im Kontext industrieller Bankprozesse, in: Achenbach W, Lieber K, Moormann J, Six Sigma in der Finanzbranche, 2. Auflage, Bankakademie Verlag, Frankfurt
CHAPTER 14
Risk and IT in Insurances
Ute Werner

14.1 The Landscape of Risks in Insurance Companies
The risk landscape of insurance companies is quite diverse: it stretches from insurance and investment risks to operational and regulatory risks, not to mention the risks (and chances) involved in generating integrated information technology infrastructures.1 In this paper, multiple levels of risks are discussed with the most aggregate one comprising the risks of insurance companies and their management. Risks transferred from customers to be managed by insurers make up the least aggregate level.
Figure 14.1. Risk landscape of insurance companies (labels in the figure: Solvency II regulations; collective savings and investment; asset-liability management; underwriting and timing risk; operational risk)
1 See e. g. the survey of 44 leading insurers from around the world done by (PricewaterhouseCoopers 2004).
Risk management in insurance companies works partly as a bottom-up integration of risks: the individual risks of customers are combined in risk groupings based on an analysis of these individual risks and their assessment in relation to all the other risks accepted. The premiums charged by the insurer for accepting the risk transferred from the insured are calculated in due consideration of the expected frequency and extent of damages in the risk group. Additionally, the financial cover provided to the insured party as laid out in the respective insurance contract is considered. However, the less experience, knowledge, or data insurers have regarding any particular risk class, the less reliable their estimates will be. This uncertainty is reflected by supplementary risk loadings.

Differences between the total of premiums collected for each line of business and the payments due after the risk has materialized for a number of customers in a given period make up the insurance risk of the company. It is especially significant for property and casualty or reinsurance business, since the uncertainty regarding the realization of risks (underwriting risk)2 is high compared to life insurance: in life insurance, the triggering event, e. g. death, is certain to happen, and the main share of the payments due at that moment has been laid out in the insurance contract. It is, therefore, known beforehand. However, timing risk, i.e. the uncertainty about when an insured event will occur, still requires very sophisticated investment planning.

Returns on capital investments make up an important part of the overall economic performance of insurance companies. They are generated by reserving and investing that share of the premiums collected which has not yet been spent on reimbursing the insured. Moreover, collective savings and investments are an essential element of cash value life insurance or whole life health insurance. They definitely constitute a risk for the insurance company if a certified minimum return has been guaranteed to the customers. Therefore, market, credit and liquidity risks have to be controlled carefully to maintain the ability of the firm to satisfy its duties. In recent years this has become more difficult following declines in equity and fixed income markets. Nevertheless, matching assets and liabilities according to their values, timing, and inherent risk is central to the insurance business. This also affects the management of underwriting, investment and pricing3 as well as their interdependencies, or capital allocation according to the risk carried by different business units.
2 Will an event occur at all? What is the loss amount to be expected from possible damages? The terminology used to describe ‘insurance risk’ varies. By differentiating between underwriting and timing risk we follow the standard setting approaches of FAS 113 within US GAAP and the discussion laid out in (IAIS 2005).
3 The practice of cash-flow underwriting, e.g., implies that coverage is provided for a premium level that is actuarially less than necessary to pay claims and expenses. In this case, the investment profit on the premiums is used (partially) to compensate for the underwriting loss.
Furthermore, it includes decisions on how to fulfil the requirements of insurance regulatory authorities regarding the solvency4 of insurance companies. Solvency II regulations in the EU will extend from minimum capital provisions to detailed rules for asset investments, and they will be supplemented by comprehensive risk management and publication principles (European Commission 2003). These principles should include operational risks as well, even if their quantification might be difficult (Linder and Ronkainen 2004), since they occur in multifarious forms ranging from risks related to human behaviour following diverging interests to data analysis procedures or data management. Basically all IT-supported critical business processes can be understood as operational risks, especially if they are vulnerable to accidents or attacks from inside or outside the company.5

Surveys by (Tillinghast-Towers Perrin 2002)6 and (PricewaterhouseCoopers 2004) indicate that most of the insurance companies sampled still have to develop technology infrastructures that are integrated enough to allow accounting for interactions and correlations among risk sources. This, however, is fundamental for designing a holistic, enterprise-wide management of risks.

In the following, the focus is on underwriting risks as the most typical risks of insurance companies. Section 14.2 refers to catastrophe modelling in order to distinguish between hazard, exposure, vulnerability and loss as the main components of underwriting risk. This perspective is useful for identifying and analysing single risks in their natural, technical or social context; at the same time it allows interrelations between risks to be accounted for as far as they are reflected in the business of insurers. It will be shown how information technology is put into service to fulfil these tasks. Section 14.3 then introduces IT as a tool for managing the underwriting risk of the company. Several applications in disaster assistance, claims handling and market development are presented. Section 14.4 concludes with first thoughts on IT’s costs and benefits in insurance.
4 The meaning of the term has been extended during the past years. Since its introduction in the 1970s, ‘solvency’ designates minimum capital requirements that are derived from generalized assumptions about the risk involved in the kind and amount of insurance business written. This focus on insurance risk changed when the Solvency II project was launched in 2000. Its objective is to create a prudential framework for the management of all the risks insurance companies are facing. This is in line with international developments in solvency, risk management and accounting. For an overview refer to (Linder and Ronkainen 2004); the latest work of the US Capital Adequacy Task Force is listed under www.naic.org/committees_e_capad.htm.
5 Since the Basel II accord, operational risk is commonly defined as potential loss resulting from inadequate or failed internal processes, people and systems, or from external events. Interesting approaches for assessing operational risks in insurance companies have been developed by (Dexter et al. 2006) and (Tripp et al. 2004).
6 The report is based on a Web survey conducted in 2002. Executives from 94 companies headquartered around the world and working in all sectors of the insurance industry responded.
14.2 IT’s Function for Risk Identification and Analysis

14.2.1 Components of Underwriting Risk

Among natural scientists, engineers and economists, risk is modelled as the function of an interaction between four basic components: (1) certain hazards, (2) elements exposed to hazardous events with specified characteristics, (3) susceptibility of the exposed elements to the hazardous impact, and (4) resulting consequences (Mechler 2004; Sinha and Goyal 2004; UN/ISDR 2004).
Figure 14.2. Structure of risk and of catastrophe models (Grossi et al. 2005; Borst et al. 2006). Labels in the figure: (1) Hazard, (2) Exposure, (3) Vulnerability, (4) Loss, Event Creation, Damage Function.
Computer-based catastrophe models7 show a similar structure; they have been used since the late 1980s for assessing loss potentials due to extreme natural events such as earthquakes or hurricanes. In the aftermath of the terrorist attacks of 9/11/2001 the modelling has been extended to include human-induced disasters. This was made possible by continuing advances in information technology. Geographic Information Systems (GIS) facilitate the linking and integration of various kinds of data:
• scientific measurements and concepts regarding natural hazards,
• engineering knowledge on structural characteristics of the built environment,
• models of human goal-oriented behaviour,
• and statistics or reports on historical occurrences of hazardous events.

7 FEMA (www.fema.gov/hazus/) offers an open-source multi-hazard catastrophe model (HAZUS-MH MR1); proprietary modeling software and services have been developed by AIR Worldwide (www.air-worldwide.com), EQECAT (www.eqecat.com), and RMS (www.rms.com). For a short history of catastrophe modeling cf. (Grossi et al. 2005).
Remote sensing, on the other hand, is able to deliver pictures or imagery useful for a rapid updating of GIS-based maps, e. g. after a disastrous event. The sensors are either based on the ground or mounted on aircraft and spacecraft. Data acquired from these systems are combined with related information to produce up-to-the-minute maps, track weather events, or create databases of urban infrastructure and much more. GIS, therefore, have become an ideal environment for conducting multifaceted, spatially oriented studies of risk. This induced a renaissance of risk mapping. The mapping focus, however, shifted from displaying (historical) evidence to up-to-date analysis and inspection information for various businesses.8

The hazard component of risk is usually modelled based on cause-effect relationships. In the case of (extreme) natural events this can be founded on detailed scientific and geophysical data collected by measurements, and on observations of losses registered. If human-induced events are to be investigated, data is quite scarce, especially for small-scale events, and the information available might not be representative of other events to come (Kunreuther et al. 2005). This is especially true for malicious attacks, as human actions can be directed to produce damages of a certain extent while taking into account potential protection measures. Moreover, data collection processes for human-induced events are not standardized.

In catastrophe modelling, the exposure module contains geographic data on the location of the elements at risk (address, postal code, CRESTA zones9, etc.) as well as information on typical physical characteristics of the exposed objects and subjects. This includes property data such as the construction type of houses, the year they were built, or the number of stories. Regarding the population, data such as age and gender distribution, average daytime population, and maximum capacity of schools, hospitals, shopping centres or stadia have to be collected (Balmforth et al. 2005).

The creation of synthetic events is possible via probabilistic models and computer simulation or by the deterministic approach of scenario planning. Probabilistic models are used to generate damage distributions or a set of events that can be used to calculate such a distribution. Scenario models provide a smaller array of estimates, but can easily be adapted for individualized loss assessments. In conjunction with the hazard module, both approaches are able to differentiate between various qualities of events (is it a hurricane or a tornado, an industrial accident or vicious sabotage?), related intensities (e. g. in terms of wind speed or pressure waves of a bomb blast) and estimated frequency (Borst et al. 2006).
8 E. g., real estate: assessment of property values or online listings with aerial photos; utility and telecommunication planning: review of new housing development; agriculture and forestry: appraisal of drought damage; insurance: prospective exposure analysis and claims management, to name just a few.
9 CRESTA means ‘Catastrophe Risk Evaluation and Standardizing Target Accumulations’. This working program was set up by the insurance industry in 1977; it aims at establishing a globally uniform system for the accumulation risk control of natural hazards. GfK MACON is the official supplier of CRESTA maps comprising 13222 zones and 455 subzones of all countries in the world (www.cresta.org or www.gfk-macon.com/cresta).
Damage functions represent a quantitative estimate of the impact produced by a hazard phenomenon on the elements at risk. Damage functions reflect a thorough and detailed understanding of the local conditions and practices (Clark 2002). These are “determined by physical, social, economic, and environmental factors or processes, which increase the susceptibility of a community to the impact of hazards” (UN/ISDR 2004, vol. II), and thus its vulnerability. The mapping of locations is therefore essential for any kind of vulnerability assessment.

(UN/ISDR 2004) defines risk as “the probability of harmful consequences, or expected losses (deaths, injuries, property, livelihoods, economic activity disrupted or environment damaged) resulting from interactions between natural or human-induced hazards and vulnerable conditions”. This refers to direct and indirect10, economic and human losses including latent (environmental) conditions that may represent future threats. All personal and commercial insurance lines, from property and casualty to life, accident and health, or reinsurance, are therefore affected. The following examples illustrate how insurers employ GIS and other information technology for assessing their underwriting risk.
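How hazard, exposure, vulnerability and loss interact in a probabilistic catastrophe model can be illustrated with a deliberately small Monte Carlo sketch in Python. It is not taken from any of the models cited in this chapter. All assumptions are hypothetical: at most one event per year, an intensity drawn uniformly between 0 and 1, a quadratic damage function capped at total loss, and a made-up portfolio of two insured objects.

```python
import random

def damage_ratio(intensity, vulnerability):
    """Hypothetical damage function: share of the insured value destroyed."""
    return min(1.0, vulnerability * intensity ** 2)

def simulate_annual_losses(exposures, years, event_probability, seed=0):
    """Simulate one loss figure per year.

    exposures: list of (insured_value, vulnerability) tuples.
    """
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    losses = []
    for _ in range(years):
        loss = 0.0
        if rng.random() < event_probability:        # hazard: does an event occur?
            intensity = rng.random()                # hazard: event intensity
            for value, vulnerability in exposures:  # exposure: elements at risk
                # vulnerability translates intensity into a loss amount
                loss += value * damage_ratio(intensity, vulnerability)
        losses.append(loss)
    return losses

# Made-up portfolio: (insured value, vulnerability factor).
portfolio = [(250_000.0, 0.8), (400_000.0, 0.3)]
annual_losses = simulate_annual_losses(portfolio, years=10_000,
                                       event_probability=0.02)
expected_annual_loss = sum(annual_losses) / len(annual_losses)
```

The list annual_losses is a simulated damage distribution in the sense described above; a scenario model would instead fix the event and its intensity and evaluate the damage function once.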
14.2.2 Assessing Hazard, Exposure, Vulnerability and Potential Loss

Only risks that are sufficiently known can be assessed regarding the consequences for the risk carrier. Before a (financial) risk is transferred to an insurance company it will, therefore, be evaluated according to its individual characteristics and as to its fit with other risks accepted so far. This includes analyses of spatial interdependencies.11

Thanks to advances in information technology the century-old craft of risk mapping for insurance purposes has been recreated: fire insurance maps originated in London in the late 18th century, as insurers attempted to gain accurate information about insured buildings (Nehls 2003). As in other parts of the world, fire was one of the most troublesome problems: the large number of wooden buildings, constructed in close proximity to one another, primitive lighting systems, and unsafe ovens and furnaces all contributed to an environment where fires were frequent and often devastating (Ericson 1985). In 1867, D.A. Sanborn founded a mapping company in New York that still exists, holding over 1.2 million historic and updated maps of approximately 12,000 American cities and towns. Sanborn employed surveyors in each state of the U.S. and systematized the map-producing process, introducing standards for accuracy and design that the company published in 1905. The large-scale maps (1:600) detailed the location, footprint and material composition of all buildings, mainly in the downtown areas, including framing, flooring, and roofing materials, the number of stories,

10 Indirect losses such as business interruption are a consequence of the originating physical destruction. For a detailed discussion cf. Chapters 2 and 3 of (NRC 1999).
11 The actuarial assumption of independent risk groupings is needed for mathematical reasons; empirical features of risk groupings include space- and time-dependent accumulation effects.
the location of windows, doors, water and gas mains, fire walls as well as building use – thus depicting important components of the fire hazard. This was complemented with information on nearby baker’s ovens, stored kerosene, blacksmith forges or the location of gas lines. The surveyors also noted vulnerability-related characteristics like street and sidewalk widths, prevailing wind directions, number and type of fire hydrants, railroad corridors and nearby rivers as well as the strength of local fire departments (www.edrnet.com/reports/key.pdf). These ‘Sanborn Maps’ served the same purpose as GIS does for property and casualty insurance today: Individual objects can be inspected without personal examinations of the buildings and their surroundings. This is useful when underwriting new risks, or for renewals with expanded coverage. Even an underwriter with no knowledge of the territory would be provided with the information necessary to make an informed decision about the risk in question. Hazard maps are created to show, e. g., landslide and fire zones or areas with more or less criminal activity, thus serving as a quick identifier of the hazard potential for a certain address or zip code. Moreover, geographical data can be combined with economic data such as typical asset values, thus displaying economic aspects of exposure; or it may be blended with statistics and reports on previous damaging events to back up the imagination about vulnerability issues as to certain hazards. Underwriting can be facilitated further by calculating indicators that sum up the totality of risks identified. (Munich Re 2002), e. g., demonstrates an approach for assessing the risk of megacities regarding material losses due to various natural hazards. This macro- and mesoscale information may be supplemented by more detailed information, e. g. on fire and police stations in the area. 
Remote sensing is useful for identifying the composition of roofs and the types of building materials used on a structure. All this assists in deciding whether to insure a particular property against fire, burglary, etc., and what premiums to charge. More detailed and reliable data about the risks accepted reduces uncertainty, which in turn allows the safety loadings on the calculated net premiums to be reduced. It also facilitates premium differentiation according to the relative risk in the areas covered. Hazardous installations may be examined with the focus on their surroundings. As (Munich Re 2002) points out, the exposure accumulation within a 5-km radius of a chemical plant can be specified, for instance regarding physical, economic, or human and social effects. And if the prevailing wind directions are registered as in the Sanborn maps, hitherto underestimated loss scenarios can be identified and simulated, which might supply important information for pricing. Figure 14.3 illustrates this approach by implicitly referring to health risks of people living in an area with a high density of chemical plants, the Rhine-Neckar-Triangle in Germany. This information could easily be complemented and enhanced by focusing on the addresses of persons insured in life and health lines. Another part of the company’s financial exposure in case of hazardous incidents could be outlined by emphasizing nearby critical facilities or zones of heightened vulnerability due to heavy daytime use, like schools or shopping districts: such a situation might translate into environmental liability compensations if one or more of the chemical companies have signed up for this kind of insurance.
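The accumulation query described above can be sketched in a few lines. The following Python example is only a minimal illustration, not the method used by Munich Re: it assumes that every risk has already been geocoded to latitude and longitude, measures distance with the haversine formula, and works on a hypothetical portfolio whose coordinates and sums insured are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accumulation_within(centre, risks, radius_km=5.0):
    """Return the total sum insured and the list of risks within radius_km of centre."""
    lat0, lon0 = centre
    hits = [r for r in risks
            if haversine_km(lat0, lon0, r["lat"], r["lon"]) <= radius_km]
    return sum(r["sum_insured"] for r in hits), hits


# Hypothetical plant location and portfolio (illustrative values only).
plant = (49.50, 8.45)
portfolio = [
    {"id": "A", "lat": 49.51, "lon": 8.46, "sum_insured": 2_000_000},
    {"id": "B", "lat": 49.52, "lon": 8.40, "sum_insured": 1_500_000},
    {"id": "C", "lat": 49.80, "lon": 8.45, "sum_insured": 9_000_000},
]

total, hits = accumulation_within(plant, portfolio)
print(total, [r["id"] for r in hits])
```

In practice the same query would run against a spatial database index rather than a Python list, and the circle could be replaced by a plume-shaped area if prevailing wind directions are to be taken into account.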
Ute Werner
Figure 14.3. Chemical plants in the Rhine-Neckar-Triangle, Germany
Geographic data and its integration with other data relevant to the insurance of objects, subjects and liabilities enable insurance companies to visualize their exposure in terms of risk accumulations and total coverage in certain areas. As another example, Figure 14.4 shows the location of embassies in Berlin, Germany. They are represented according to a vulnerability index for terrorist attacks which takes into account permanent political characteristics as well as current critical international situations (Borst et al. 2006). It is evident that some of the embassies with a higher degree of vulnerability are located close to each other, thus generating a ‘hot field’ that radiates to the surrounding areas. If an insurance company wants to judge its exposure regarding buildings in the vicinity of embassies, it has to consider, however, that the more attractive potential targets are to terrorists, the more protection will be provided to fend off attacks. This may heighten the danger for other embassies as long as they are less protected (Kunreuther and Heal 2003). In this case, GIS can only be a starting point for a more thorough analysis of underwriting risk. Investment risk, on the other hand, might be related more directly to the vicinity of real estate and ‘hot fields’.

Figure 14.4. Locations of embassies in Berlin, Germany

Both kinds of risk may induce risk selection processes on the part of the insurer since the risk capital allocated to each business segment has to be placed in the most beneficial way. (Munich Re 2002; Munich Re 2003) explain this with the example in Figure 14.5, which has been extended for our purposes: The reinsurer uses a rating tool for geocoding new risks based on their addresses in order to compare them with risks already accepted. The probable maximum loss amount in the vicinity of the new risks is indicated by the broken line circle. In this way, accumulation situations can be identified in a timely manner; they can be calculated in advance and avoided, if necessary. Even certain investment risks might be taken into account as long as they are connected to geographical locations.

Data from various sources is also required for assessing covariant risks and domino effects such as lifeline and business interruption, e.g. following disastrous events in urban areas. Past events and their evolution may serve as background data
Figure 14.5. Accumulation control of underwriting and investment risk (legend, kind of risk: new underwriting risk; accepted underwriting risk; investment risk (real estate))
Figure 14.6. Flood risk identification map provided by the state of Baden-Württemberg; (http://rips-dienste.lfu.baden-wuerttemberg.de/rips/hwgk/Default.aspx)
pool for complex risk analyses aiming at the detection of loss patterns and cause-effect relationships. These observations could be combined with current data, e.g. satellite imagery on traffic situations or land use, to simulate the likely propagation of damages under present conditions. This may also foster the understanding of non-observable effects and assist in the development of predictive models.

GIS have been installed and used by a broad range of industries as well as governmental agencies and the military.12 Insurers introduced them mainly in cooperation with consultants who, at the same time, provided the data necessary for detailed risk assessments. In general, the insurance business still needs to refine the granularity of data collected before an extended employment of geographical underwriting is possible. In the past, business was often written on the basis of large rating zones on the state or county level, or numerous individual risks with multiple locations were drawn together in one contract (Munich Re 2002). This aimed, of course, at reducing the cost of data collection, management and storage. Geocoding, on the contrary, implies that not only the addresses or locations of each risk transferred have to be recorded but also other situational attributes relevant for its coverage. Ideally, data should be collected and stored in a standardized way, in cooperation with clients and reinsurers. For assessing human-induced hazards such as industrial accidents or terrorist attacks, a fine resolution of data on the street or address level is necessary (Munich Re 2003). This also applies to weather hazards like flooding, where a few meters on the ground or even centimetres of water level can make a big difference in the damages to be expected. Figure 14.6 shows the potentially flooded areas (marked by the dark line) and the corresponding water levels (the darker the higher) for the historic city of Heidelberg. For other hazards such as earthquake or windstorm, data for 5-digit postcodes are suitable (Munich Re 2003).

Insurance companies have to consider carefully whether they want to rely on external expertise for risk assessment or assume this part of their business themselves. Modern information technology provides a number of flexible, user-friendly applications and services that assist in linking risk assessment with the risk management process.

12 A very well-known non-proprietary application is HAZUS-MH, developed by the U.S. Federal Emergency Management Agency (FEMA) under contract with the National Institute of Building Sciences (NIBS). It contains models for estimating potential structural and economic losses from earthquakes, floods, and hurricanes as well as impacts on the population. Another GIS-based tool, developed by the U.S. Defense Threat Reduction Agency (DTRA), FEMA, and Science Applications International Corporation (SAIC), is CATS – the ‘Consequences Assessment Tool Set’. It helps government agencies and military institutions to estimate and analyze effects of extreme natural events and human-induced disasters, including industrial accidents and terrorist attacks.
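The point made above about flooding, that small differences in ground elevation or water level decide whether a building is affected, can be illustrated with a short calculation. The Python sketch below is an assumption-laden toy example: the function name, the water level, the ground elevations and the floor heights are all hypothetical, and a real application would read elevations from a terrain model and water levels from a flood map.

```python
def flood_depth_m(water_level_m, ground_elev_m, floor_height_m=0.0):
    """Depth of water above the ground floor in metres (0.0 if the floor stays dry)."""
    return max(0.0, water_level_m - (ground_elev_m + floor_height_m))


# Hypothetical values: one flood water level, two neighbouring buildings.
water_level = 110.20  # metres above sea level
buildings = {
    "riverside 1": {"ground": 109.60, "floor": 0.30},
    "riverside 2": {"ground": 110.00, "floor": 0.30},
}

depths = {
    name: round(flood_depth_m(water_level, b["ground"], b["floor"]), 2)
    for name, b in buildings.items()
}
print(depths)
```

With these invented numbers the two neighbours differ by only 40 cm in ground elevation, yet one ground floor is flooded and the other stays dry, which is why rating zones coarser than the address level tend to misjudge flood risk.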
14.2.3 Information Management – Chances and Challenges

The term GIS denotes either just the computer systems capable of capturing, storing, manipulating, updating, analyzing, and displaying geographically referenced information (USGS), or these systems together with the graphic data13 and associated attributes collected in relational database systems. The overall function of Geographic Information Systems, however, is to support the management of data acquisition, its integration and distribution. Insurance companies with their large databases, mixed network computing environments and distributed processing needs are typical users of full-blown GIS with intra- and internet-based services. A central data repository and a development tool such as SDE (Spatial Database Engine from ESRI) allow the professional user to store and manipulate all variations of geographic features while sharing the data in a client/server environment over large networks or even via (standard) web browsers. PC-based graphic user interfaces facilitate model calculations, displays, and risk assessments and make it possible to view GIS data from anywhere in the world in a matter of minutes. This is extremely valuable for a business that covers extended geographic areas.

An enterprise-wide application of GIS has been introduced by PartnerRe, a global reinsurer headquartered in Bermuda (www.esri.com/library/fliers/pdfs/cspartnerre.pdf). A centralized system set up for all risks and offices guarantees worldwide consistent data and calculation methods. Conservative loss scenarios are used for evaluating single risks and the total exposure of the company. Underwriters, therefore, are able to proceed based on real-time information about used and remaining reinsurance capacity by peril zone. Another goal of the implementation project was to provide timely post-event loss estimations. This can be achieved by a satellite data acquisition and analysis process, which has been developed to overlay an event’s geographic spread and intensity on the company’s exposures.14

Other reinsurers with longstanding international experience have continuously been updating the internet access to their digital atlases, libraries and services. Swiss Re, e.g., addresses its business partners via a catastrophe network that disseminates data using an ArcIMS application. The worldwide natural hazard atlas provides information on about 650,000 locations and is available over the internet. Other electronic services include, inter alia, a life and health underwriting system intended to automate the application process for regular risks, or an e-business solution for property and casualty lines with online access to reinsurance quotes and capacity, as well as account management.15 Munich Re, too, has installed a new internet-accessible platform to connect with its cedants; specific versions for brokers and other business partners are currently being designed (www.munichre.com/assets/pdf/connect/2005_07_connect_flyer_eng.pdf). The platform offers a wide variety of analytical and transaction tools for product development, underwriting, risk management, or claims settlement. The reinsurer also created an interactive online tool for its “Natural Hazards Assessment Network (NATHAN)”, which is published in two versions – the ‘light’ one without insurance statistics is open to the public at the company’s website. It covers about the same topics and range of locations as Swiss Re’s digital atlas. Both systems allow the user to drill down through superimposed layers of data to discover the information provided and data associated with a designated location – for example earthquake and windstorm hazards within 30 km of a preselected city.

13 Point (also called node), line (or arc) and area (or polygon) in either vector or raster form which represent a geometry of topology, size, shape, position and orientation. Spatial features are stored with longitude and latitude assigned to each location. This is called geocoding or georeferencing.
Beyond its use for reaching out to business partners, IT and especially worldwide digital maps have the potential to raise the interest of the general public in issues related to geography.16 The technology, therefore, can support the marketing and communication of risk mitigation measures, which in turn contribute to risk reduction on the individual and societal level. This approach has been used by insurance companies, NGOs and local, state or federal authorities. It has to be carefully planned and prepared in cooperation with IT professionals, risk experts and data providers: On the one hand, lots of potentially useful data for risk assessment is still missing, not publicly available or hard to obtain. To name just a few examples: Research on day-time distributions of populations around hazardous installations has just started (Balmforth et al. 2005), and access to data on elements sensitive to terrorist actions such as critical infrastructure is restricted (Kunreuther et al. 2005); the same can be experienced with data on technological hazards needed to judge possibilities of damage propagation due to supply chains or other interdependencies between businesses.

In contrast to this shortage, there is an overwhelmingly large amount of primary data being collected from such diverse media as overhead imagery, real-time global positioning feeds and electronic sensors. These data come in various formats and different scales. The sources providing them are quite diverse, too, ranging from authorities (official census data or nation-wide mapping projects17) to research institutes or private companies offering online library and other web-based support services18. A strategic agenda for integrating these multi-source geographic data into existing databases and models of decision support and risk management is, therefore, necessary. In any case it has to be guaranteed that the data delivered correspond to certain quality standards, and legacy databases might need an update for enabling geospatial analyses. Special consideration should be given to data that was collected as paper maps, CAD drawings or observations from field surveys since a conversion to digital format is possible. Other services offered by professional providers of geospatial solutions include geocoding of addresses, joint analysis of two- and three-dimensional remote sensing data such as Synthetic Aperture Radar (SAR) and Light Detection and Ranging (LIDAR), satellite image processing19 and interpretation, as well as workflow design. The IT infrastructure could be prepared for applications like live mapping, dynamic spatial modelling, automated change detection, and location-based services. Alternatively, subscriptions with professional providers of these functions might be used. In this way, only relevant data are transmitted, even automatically, which reduces the more time-consuming queries on demand (Abler and Richardson 2003).

14 It is conceivable that a continued integration of GIS with business intelligence solutions such as SAS BI further improves the efficiency and transparency of business processes. This and the enhancement of reporting capabilities was also a goal of PartnerRe’s project.

15 For technical aspects cf. Bayerl (2005).

16 Google Earth and NASA’s World Wind, e.g., are open to everybody and employ visualization tools in 3-D that allow zooming from satellite altitude to earth. Additional guides and features can be accessed through a simplified menu (www.earth.google.com or http://worldwind.arc.nasa.gov).

17 Projects like “The National Map” combine partners from USGS, federal agencies, state and local governments, NGOs and private industry (U.S. Department of the Interior/U.S. Geological Survey: The National Map: The Nation’s Topographic Map for the 21st century. http://nationalmap.gov/nmreports.html, viewer at http://nmviewogc.cr.usgs.gov/viewer.htm).

18 GlobeXplorer, for instance, has a large commercial library of georeferenced digital aerial imagery. It is distributed over the internet using a proprietary platform that dynamically searches, retrieves, compacts, and delivers images from an online database (www.globexplorer.com). This company cooperates with CDS Business Mapping, which provides risk analysis tools for insurers, e.g., automated property lookups for underwriting (www.riskmeter.com).

19 This involves georeferencing, orthorectification, color balancing, resolution merging across satellite platforms and more, to ensure the desired quality in terms of clarity, color and resolution, even over larger areas.
14.3
IT as a Tool for Managing the Underwriting Risk
14.3.1 Disaster Assistance

Not all disasters come by surprise, though the warning times differ considerably. Extreme natural events such as volcanic eruptions and wildfires can be observed as they develop. Figure 14.7 displays a fire watch map provided by the Australian Department of Land Information. It shows the locations of wildfires (bright stars in the centre) as mapped from satellite imagery on April 16, 2006 for Northwestern Australia, and a large number of lightning events (dark squares) registered for the same day. The data is available in near-real time, i.e. within two hours of the satellite passing over, and can be used in disaster modelling. In this case, the topographical and vegetation (fuel) data are fed to a combustion model, which simulates the spread of the fire from its origin, depending on further data input such as wind speed and direction as received over the internet from weather services. The simulation results are displayed with GIS over the topography to assist in assessing the consequences (elements exposed and their vulnerability), in managing the vegetation and other mitigation measures, and in evacuation planning. Insurance companies may use this information system as a support tool for deciding where to set up catastrophe operations and how many loss adjusters or additional helpers are needed.
Figure 14.7. Fire watch map of Northwestern Australia, 04-16-2006; (http://firewatch.dli.wa.gov.au/landgate_firewatch_public.asp)
Larger insurance companies employ designated teams of specialists who assist in response efforts by establishing temporary claims offices in the field. After the 2004 tsunami event in Southeast Asia, Allianz Group set up clearly identifiable information booths on the island of Phuket, Thailand, and opened a 24-hour care centre for tsunami victims. It supported local help organizations in their efforts to supply emergency shelter, medication and clean water (www.allianz.com/azcom/dp/cda/0,,610273-44,00.html). And during the Katrina catastrophe in summer 2005, AXA Art Insurance took care of art objects in the New Orleans area which otherwise would have been destroyed or heavily damaged.20 As a result of lessons learned from Hurricane Andrew, which devastated parts of Florida in 1992, the insurer Travelers developed mobile claim headquarters (www.travelers.com/spt01portalmain.asp?startpage=/tcom/search/searchredirect.asp?attr1=disaster, Gallagher 2002). These custom-built vehicles are equipped with onboard computers and databases, printers, photocopiers, cell phones, fax machines – even two generators to restore electricity when the power goes out. Thus, debit cards can be issued on site and handed to clients in distress.

Claims assessors may carry cell phones, pagers and digital cameras, and use Personal Digital Assistants, tablet PCs21 or notebooks with wireless networking capabilities to connect to a central server for information retrieval and update in real time. With the GPS toolbar, GPS measurements can be captured directly into a (central) geo-database. This is extremely useful when working in heavily destroyed areas where no street signs or landmarks are left: thus, even the locations of buildings that have been blown away or damaged beyond recognition can be retrieved, and policyholders identified. Moreover, GPS indicates the proximity to resources for human assistance. The functionality of the variety of devices that an individual field force member uses may even be extended through a personal area network (PAN) as supported by Bluetooth. IT, therefore, enables loss adjusters to process all the information necessary for damage estimates right on site, which allows clients to file claims swiftly. On the other hand, the insurer’s GIS can be updated in real time by incorporating actual claim information as it becomes available. This is valuable input for early loss assessments, for damage zoning, and reviews of loss reserving. Furthermore, a map with optimal routing and driving directions to the locations of claims made can be set up for the adjusters.

(Munich Re 2003) gives an example of how satellite data may be applied for loss estimations during disasters. Figure 14.8 is an adapted version of Munich Re’s image and shows a small section of an area flooded during the August 2002 events in Europe. It is hatched in dark grey. The light-grey borders and captions relate to postcode areas, and the individual risks are represented by white dots. Thus it is possible to distinguish clearly between policyholders that are affected by the flooding and those that are not. The claims made are pictured as white triangles, which is helpful for deriving loss zones. Claims relating to risks located outside of these zones can be identified and clarified in subsequent investigations.

Figure 14.8. Linking up flooding information with portfolio data: individual risks as white dots and claims made as white triangles; adapted from (Munich Re 2003)

20 The company hired private security teams and helicoptered them to over 90 locations in the area. They also installed generators to control humidity and removed art to other safe locations (www.axa-art.com/about/m_092905.html).

21 These come with handwriting recognition and speech-recording functionality; some of them are designed specifically for field use, have outdoor viewable screens and have been ruggedized to survive dropping and resist moisture and dust. For a case study and tutorial on investigating insurance claims using ArcGIS 9 and a tablet PC, see www.esri.com/news/arcuser/0205/files/tablet_pc.pdf.
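The overlay of a flood footprint with portfolio data, as in Figure 14.8, boils down to a point-in-polygon test for every insured location. The Python sketch below shows the idea under simplifying assumptions: the flooded area is a single polygon in planar coordinates, the test uses the standard ray-casting method, and the zone, policy identifiers and claims are hypothetical. It is not a description of Munich Re’s tool.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; points exactly on an edge
    are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) to the right cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical flooded area, insured risks and reported claims.
flood_zone = [(0, 0), (4, 0), (4, 3), (0, 3)]
risks = {"P1": (1, 1), "P2": (5, 1), "P3": (3, 2), "P4": (2, 5)}
claims = {"P1", "P4"}

affected = {pid for pid, (x, y) in risks.items()
            if point_in_polygon(x, y, flood_zone)}
claims_outside = sorted(claims - affected)  # candidates for closer investigation
no_claim_yet = sorted(affected - claims)    # affected risks without a claim so far
print(sorted(affected), claims_outside, no_claim_yet)
```

The two derived lists correspond to the two uses named in the text: claims from outside the flooded zone can be clarified in subsequent investigations, while affected risks without a claim help to estimate losses still to be reported.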
14.3.2 Claims Handling

The mapping of claims made with the help of IT is also useful during non-disaster periods since their spatial patterns can indicate risk sources not yet considered. It is further necessary for optimized service management and for fraud prevention. The Risk Management Agency (RMA) of the U.S. Department of Agriculture employs data mining and remote sensing in its fight against abuse in the crop insurance program (www.rma.usda.gov/news/2005/12/1230compliance.html): A centralized data warehouse provides all crop insurance-related data collected over time so that producers whose claiming patterns appear to be atypical can be identified. These clients are informed and some of them inspected during the growing season. This has led to a continual and substantial reduction in indemnities paid during the past years. To facilitate the spot-checking, investigators are trained to acquire high resolution Landsat imagery from the USDA Image Archive22 and make preliminary determinations either to approve a crop insurance claim, or forward it to a remote sensing expert for further investigation.

22 Cf. (www.fas.usda.gov/pecad/remote.html) for information on the Production Estimates and Crop Assessment Division with links to other interesting projects.

In Germany, RapidEye (www.rapideye.de) is preparing for geo-information services and analyses based on a combination of satellite technology, IT and e-commerce. The system consists of 5 satellites with multi-spectral imagers (MSI) mounted on them, ground stations, a data processing centre and internet-based information distribution, everything designed and built by an international team of technology providers. The satellites will be launched in 2007 and are then expected to be able to provide images of each point on earth on a daily basis. This high frequency and extended coverage guarantees up-to-date information, even despite temporary cloudy conditions. Insurance companies may use the data provided by RapidEye in multiple ways, either directly or via geographic information services: As far as claims handling in agricultural insurance is concerned, a continuous remote monitoring of crop type and plant vitality during the growing season will be possible, ideally in cooperation with the producers. This facilitates an early determination of yield loss since unwelcome alterations of the produce can be discerned through comparisons of multi-temporal data sets. Other conceivable applications include the monitoring of temporary changes in regional traffic intensity (e.g. due to holiday seasons), which is helpful for roadside assistance programs. In general, call centres of insurance companies using GIS can support customers in finding the preferred providers for auto repair and other necessities in case of a car breakdown on the road. Additionally, web-based interactive information systems can be created for clients equipped with GPS functionality. Remote sensing also allows the tracking of mobile objects such as vessels and vehicles, including their cargo, which should prevent fraud in transport insurance lines.
Since tracking a truck implies tracking the driver and his or her driving behaviour, the technology could even help to reduce the number of accidents due to excess hours of driving or speeding. It may further assist in the recovery of losses in case of stolen cars as long as these cars have GPS devices installed.
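The screening for atypical claiming patterns mentioned at the beginning of this section can be illustrated with a very simple statistical filter. The Python sketch below is a stand-in chosen for brevity, not the data mining procedure actually used by the RMA: it assumes that a loss ratio (indemnities divided by premiums) is available for each producer in a peer group and flags those lying more than a chosen number of standard deviations above the group mean. Producer identifiers, loss ratios and the threshold are hypothetical.

```python
from statistics import mean, stdev


def flag_atypical(loss_ratios, threshold=2.0):
    """Return producers whose loss ratio exceeds the peer mean by more than
    `threshold` sample standard deviations (simple z-score screen)."""
    values = list(loss_ratios.values())
    if len(values) < 2:
        return []  # no meaningful spread with fewer than two peers
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # all peers identical, nothing stands out
    return sorted(p for p, v in loss_ratios.items() if (v - mu) / sigma > threshold)


# Hypothetical loss ratios for producers growing the same crop in one county.
peers = {
    "P1": 0.40, "P2": 0.50, "P3": 0.60, "P4": 0.50,
    "P5": 0.45, "P6": 0.55, "P7": 0.50, "P8": 3.00,
}

flagged = flag_atypical(peers)
print(flagged)
```

A flag of this kind is only a starting point for spot checks; because a single extreme value also inflates the standard deviation, robust measures such as the median would be a natural refinement.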
14.3.3 Market Development

Remote sensing further serves innovative approaches in insurance marketing, e.g. when fine-tuning premium rates according to the individual risk transferred. As mentioned before, flooding risk needs to be assessed on a small-scale basis, and available two-dimensional information on flood zones is too coarse since it does not capture risk characteristics like a building’s elevation compared to the surrounding area. Here, Digital Elevation Models (DEM) employed in conjunction with remote sensing data are helpful for distinguishing between risks at individual address levels and their susceptibility to flooding.

Auto insurance is another part of business where individual risk ratings are feasible thanks to IT. The popularity of GPS in cars for its navigational and safety features can be used for premium differentiation provided the driving history is recorded. Monitoring driving behaviour delivers quantitative measures of subjective risk factors that might be more reliable than the rating parameters applied so far: It is not unreasonable to expect that data on when, how long, and where a car is actually driven gives a better representation of the accident risk than knowledge about the owner’s profession, age and sex, or where the car is garaged (Holdredge 2005). Good drivers will be rewarded by lower premiums, and insurance companies using this individual risk rating approach will be rewarded by attracting good drivers.

GIS software and associated demographic and socio-economic data packages further allow insurers to profile existing customers not only according to their contract or claims history and other data recordable along with the insurance business23 but also regarding data related to their geographical characteristics. These are, e.g., purchase patterns or average income for the zip code area they live in. Data mining then may result in distinguishable groups of customers with specific claims expectations. This can be mapped for further investigations. Consequently, new business will be sought in areas and among persons with characteristics similar to the ‘good risks’ discovered before. Microgeographic marketing developed about ten years ago based on the hypothesis that neighbours must have more in common than just the zip code or street address. Maps may further be used to visualize the context for insurance buying behaviour by representing the kind and extent of hazards existing around clients’ residences. Other important information includes socio-economic data like income or age group as well as risk mitigation measures on the individual, local, and regional level. This is basic to locating target markets with the intent of marketing to target customers. These customers are the models to which insurance products, accompanying services and their prices are fitted.

Information on preferred insurance acquisition channels is valuable for the (re)engineering of distribution networks ranging from insurance representatives and cooperating banks or retail business to intermediaries like employers, sports clubs, and the internet. A mapping of these agents and their networks might display geographic areas where more, less or different distribution support is needed. Once again, GIS can be used to optimize the routing of individual sales agents and to coordinate the sales areas allotted to them. Moreover, agents may be matched with clients following demographic or socio-economic criteria as well as kind of business. This is, of course, helpful for enhancing the communication between agent and client. Risk mapping has additional benefits since the visualization provided makes it easier for the agent to explain hazard, exposure, vulnerability and risk to the customer.

23 Which channel was selected for buying insurance – internet, phone, or agent? What is known about the personality of the client (call centers do check on this!)? Which insurance lines have been signed by an individual, her family? What is the customer’s satisfaction level with insurance and other services? What about her price sensitivity? Is there information available regarding risk taking behavior or risk management?
Market development will increase the quantity and improve the quality of insurance business written. Both effects are able to reduce the underwriting risk of insurers.
14.4
Conclusion
This article could only give an overview of the topic, illustrated by some exemplary cases. Though it is definitely not an exhaustive treatment of the many applications IT has to offer for risk analysis and risk management in insurance, it hopefully illustrates the broad array of possible implementations. Each investment in technology, human and organizational resources has to be considered carefully, of course, regarding cost and benefit. Business reengineering in particular involves time-intensive processes whose costs have to be balanced against benefits that are hard to assess. The usability of IT as discussed here is, however, steadily improving, which reduces the costs of implementation on all levels of the enterprise. This includes partners of an extended GIS network and clients as well.
CHAPTER 15
Resolving Conceptual Ambiguities in Technology Risk Management
Christian Cuske, Tilo Dickopp, Axel Korthaus, Stefan Seedorf
15.1 New Challenges for IT Management
Although the industrialization of financial products and services has been at the top of the agenda in the financial service domain for many years now, there is still considerable pressure to lower the cost-income ratio. This continuing trend stimulates restructuring of the IT environments (ECB 2004) and thus leads to new types of risk. From the corporate perspective, a major challenge lies in coping with rapid technological change and the increasing complexity of the IT landscape (BCBS 2003). At the same time, the restructuring of business processes and application systems is externally triggered by international regulations, such as the International Accounting Standards (IASB 2004) and the Sarbanes-Oxley Act (Sarbanes-Oxley Act 2002). These significant changes are further accompanied by the required integration of IT management activities into the operational risk management process with the advent of Basel II (BCBS 2005a). These internal and external factors are the driving forces behind operational technology risk management. Naturally, the stronger the influence of IT on a firm’s economic success, the greater the associated risks (BIS 2005). This poses new challenges to financial enterprises with regard to controlling and governing their risks more systematically. The most damaging technology (or IT/system) risks are often “those that are not well understood” (ITGI 2002). Given these assertions, the need for innovative approaches and new concepts offering integrated support is more evident than ever before.
This chapter is structured as follows: We first identify risk management as a largely uncovered area for knowledge management applications in finance and suggest the application of ontologies for supporting the knowledge-intensive risk management processes. We present a technology risk ontology which constitutes the backbone of the proposed OntoRisk method. The technology risk management process is divided into three phases, namely identification, measurement and controlling. Finally, the system architecture of a tool supporting OntoRisk is described, and the applicability as well as the advantages of the OntoRisk method are demonstrated by means of a case study.
15.2 Knowledge Management Applications in Finance
In this section, we investigate the impact of knowledge management in the financial service domain as a means to sustain and improve the competitiveness of financial organizations. In particular, we elaborate on the potential for improvement regarding one of their core competencies: operational risk management. We discuss how ontologies can be applied as a knowledge representation technique to address common problems in that field.
15.2.1 An Extended Perspective of Operational Risk Management
Knowledge management has been receiving increasing attention in recent years. The motivation for organizations to tap the full potential of knowledge management lies in an increasing focus on their core competencies and the acknowledgement of knowledge as a strategic resource (Prahalad and Hamel 1990; Probst et al. 1996). This trend is underlined by empirical studies (Skyrme and Amidon 1997) stating that active management of knowledge is critical for achieving a sustainable competitive advantage. Despite the existence of various definitions (Holsapple and Joshi 2004), knowledge is frequently distinguished from information and data. It refers to the collection of cognitions and capabilities that individuals or social groups use for problem solving, and it resides predominantly in people’s heads (Probst et al. 2003). From a technology-centric perspective, however, knowledge can also be codified, stored and distributed to be leveraged on the organizational level. From this point of view, a “codification strategy” for knowledge management is considered most valuable, whereas the second major perspective found in the literature, the human-oriented perspective, stresses a “personalization strategy” building on socialization and internalization (Hansen et al. 1999; Swan 2003).
Knowledge management can generally be described as being “any process or practice of creating, acquiring, capturing, sharing and using knowledge […] to enhance learning and performance in organizations” (Quintas et al. 1997). Both exploration and exploitation of new and existing knowledge can be seen as the outcome of a successful holistic knowledge management approach. Such an approach will affect the enterprise in various dimensions, such as organization, employees, technology, and strategy.
Financial organizations can be regarded as knowledge-intensive firms since they rely on creating, capturing, transferring, and applying specialized knowledge in their core business processes. In recent years, substantial parts of this industry have been undergoing structural changes that led to a focus on core competencies as well as the industrialization of financial services and products (Lamberti 2005). In this dynamic business setting, knowledge management can be an enabler for process improvements and a more efficient management of the firm’s most valuable assets. Today, a variety of applications already support knowledge-intensive business processes in finance. Examples include customer relationship management and creditworthiness ratings (Westenbaum 2003). Although knowledge management solutions are in place at many banks and insurance companies (Chong et al. 2000; Spies et al. 2005), a few core competencies are still not sufficiently supported. Risk management in particular is a key process that affects almost all lines of business in the financial service industry. As Marshall et al. (1996) put it: “If knowledge management is of growing importance to every kind of business, its impact is perhaps most obvious in the financial service industry. This is because effective management of knowledge is key to managing risk. And the driving issue in both financial services and corporate treasuries today is risk management.” Thus, risk reduction should be viewed as a new key objective of knowledge management, in addition to those listed by Abecker and van Elst (2004): the improvement of innovation, quality, cost-effectiveness and time-to-market.
Losses stemming from the insufficient management of various types of risk, such as market, credit and operational risk, may lead to bankruptcies or even destabilize the overall economy. A series of risk management failures that hit the headlines in the 1990s, e.g. the downfall of Barings Bank, can partly be traced back to the failure of controls but was also caused by the ineffective management of organizational knowledge (Holsapple and Joshi 2004). Since then, increasing complexity due to accelerating technological change has created even higher demands for effective risk management in the financial service domain. As a consequence, knowledge management is essential for addressing the problems encountered in risk management in general and, due to its fuzziness, in operational risk management in particular. The ability to cope with financial risks involves
• a clear understanding of the company’s assets and processes and their role in the creation of value,
• awareness of the significant threats and their impacts, and
• the application of appropriate policies and procedures helping to measure and control risks.
However, ambiguities due to terminological confusion and uncertainties in foreseeing the future are severe problems when operational risk management is applied in practice. Moreover, information relevant for risk management activities may reside in different organizational units and may have to be collected, analyzed and interpreted in a meaningful way, even if the data show inconsistencies. This requires a common language among the participants of the risk management process, which makes it possible to share their knowledge, specify an explicit model and transform it into quantitative figures useful for decision making. Against this background, it becomes clear that operational risk management in particular can benefit strongly from the results of knowledge management research. For example, knowledge representation techniques such as formal ontologies can help to agree on shared vocabularies. Consequently, the above challenges should be addressed by an appropriate fusion of knowledge and risk management. In the following sections, we propose the ontology-based method OntoRisk, which combines the modeling of organizational knowledge with a quantification approach for resolving ambiguities in technology risk management.
15.2.2 Formal Ontologies as an Enabling Technology
Although formal ontologies have a long tradition in knowledge representation and artificial intelligence (AI), they have not yet been widely adopted in financial applications. A widespread definition states that “an ontology is an explicit specification of a conceptualization” (Gruber 1993). Ontologies provide a structure for a commonly agreed understanding of a problem domain by specifying sets of concepts, relations and axioms. A shared vocabulary of terms can be regarded as an ontology in its simplest form. The attribute “formal” refers to the specification in a formal language. The main purpose of ontologies is to foster a systematic understanding of real-world phenomena and to create a reusable representation as a model. With the emerging vision of the Semantic Web (Berners-Lee et al. 2001), new technologies have been developed that facilitate the embedding of ontologies in information systems. In this respect, the role of ontologies can be explained in a more technical way: “Ontologies have been developed to provide machine-processable semantics of information sources that can be communicated between different agents” (Fensel 2004).
The uses of ontologies in information systems are manifold. In general, they serve as a unifying framework for different viewpoints and aim at reducing conceptual and terminological confusion (Uschold and Gruninger 1996). Here, we highlight two general advantages of ontologies:
• Support of communication between the involved stakeholders by developing an enterprise-wide understanding of a specific problem domain;
• Provision of interoperability between information systems with ontologies as part of their architecture. This way, formalized knowledge can be more easily transferred and translated to different perspectives.
In the context of enterprise-wide information systems, two applications can be identified. First, ontologies facilitate the integration of enterprise information when applied to enterprise modeling (Fox and Gruninger 1998). Second, ontologies may serve as the leveraging element in knowledge management (Maedche 2003; O’Leary 1998), since they respond well to structuring and modeling problems and support knowledge processes by flexibly integrating information from various sources (Staab et al. 2001).
These characteristics qualify ontologies for risk management applications, especially in those areas that lack an explicit, shared understanding and strongly depend on domain knowledge. An ontology-based approach not only makes it possible to provide precise definitions but also to unify different methods and user perspectives. After describing the case for ontology-based technology risk management, we will introduce a concrete technology risk ontology which helps resolve the conceptual ambiguities.
15.2.3 The Case for Ontology-based Technology Risk Management
The management of operational risk (and technology risk as a subcategory) has become increasingly important recently, especially after its integration into the regulatory requirements established by the 2005 Basel capital adequacy framework (BCBS 2005a). This integration was a reaction to the increasing dependence on the IT environment in the financial service domain. Similarly, external regulations for insurance companies are beginning to take shape with Solvency II (EC 2004). The general understanding of technology risk is embedded in the definition of operational risk: the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events (BCBS 2005a). Accordingly, the management of technology risks represents an integral part of financial risk management. Technology-related risk comprises system failures, technology-related security issues, systems integration aspects or system availability concerns (BCBS 2003). In the following, we use the general term “technology risk” instead of the terms “system risk” or “IT risk” sometimes encountered in the literature in order to indicate the broader meaning and to include risk types that are related to the technical infrastructure and IT management tasks as well. Since a clear shared understanding of technology risk does not yet exist on a general basis, there is a fundamental need for conceptual disambiguation. To this end, the idea of applying formal ontologies as one way of facilitating the understanding of technology-related risk in the banking industry has been proposed (Cuske et al. 2005c).
Two main objectives for the effective management of operational technology risk inspire an ontology-based approach. According to the second pillar of the Basel II framework, senior management “[…] is responsible for understanding the nature and level of risk being taken” (BCBS 2005a). On the one hand, this implies a profound, unified understanding of the character and implications of risks and the use of a common language. Such a “shared understanding” should be based upon explicit knowledge about the corporate structure across functional divisions as one key element. On the other hand, the management of risks becomes an integral part of strategic management. In order to account for Basel II compliance, the risk management approach has to be comprehensible and clearly specified and the information base has to be made explicit. Ontologies are capable of providing a holistic approach to the described problems because they help to clarify the fuzzy understanding of risks and provide a suitable representation for the formal modeling of organizational knowledge. In addition, their ability to integrate different perspectives helps to combine different risk management approaches.
15.3 Technology Risk Management Using OntoRisk
In order to achieve the two risk management objectives “understanding” and “management”, a formal but extensible knowledge model is required, which reflects a generally accepted interpretation of the relevant concepts and their interrelations and supports the management process. In the following, we introduce our OntoRisk method (cf. Figure 15.1), which is built upon a technology risk ontology. This ontology not only captures a shareable conceptual understanding but is also used in the OntoRisk method to create concrete, quantitative models of the technology risks an enterprise is exposed to at the time of modeling. It incorporates the regulatory perspective taken by, for example, Basel II as well as existing approaches to IT Governance as part of a Corporate Governance strategy. Subsequently, we describe the steps of the OntoRisk risk management process as depicted below. According to general risk management practices, the risk management process is divided into a minimum of three different phases (BCBS 2003; Culp 2002). Here, we distinguish identification, measurement and controlling: the identification phase involves the creation of the knowledge base, which is then incrementally enriched during the measurement and controlling phases. Throughout the process, the knowledge base is transformed into different stakeholder perspectives. Finally, we describe the system architecture of a software tool for supporting the OntoRisk method.
Figure 15.1. Technology Risk Management Process in OntoRisk
15.3.1 The Technology Risk Ontology
In order to specify an ontology for technology risk, we first have to survey its fundamentals, including the notions of “business risk” and “operational risk”. Since a commonly agreed terminology for business, operational or technology risk does not exist, the precise specification of these concepts and their interrelationships is a challenging task. In recent years, however, the development of the Basel II framework has led to a partial convergence of operational risk concepts within the banking sector (BCBS 2003). Such a level of agreement has not yet been reached in the insurance sector, but with Solvency II a similar development can be observed. However, the conceptualization of business and operational risk is only the first step towards a formal model of technology risk. Thus, we carry out an analysis of the regulatory and the corporate view of technology risk to eliminate remaining conceptual ambiguities. Then, we construct our extensible ontology using Description Logics as formal notation (Baader et al. 2004).
Business risk is an integral part of the value-adding process and is thus embedded into every business activity. It is best characterized as a random deviation from a preset monetary target (Knight 1965), which results in a negative financial effect. Operational risk used to be residually defined as financial risk that cannot be categorized as market, credit or interest rate risk (EC 1999). Following a “definitio ex positivo” of the term (GO30 1993), we understand operational risk as the risk of possible financial losses stemming from the failure of resources in operating the value chain. As stated before, the Basel II accord categorizes the causes of operational risk as internal processes, people, systems or external events. Subsequently, we limit the scope of our ontology to the technology risk category. For some risk scenarios it is difficult to draw the line between different risk types.
Clear definitions of technology risk categories can foster a deeper understanding of the scope and inner structure of the encountered risks. For example, information systems risks can be subdivided into general risks (e.g. gaining unauthorized access), application risks (e.g. malfunctioning information processing) and user-driven risks (e.g. misuse of a human-machine interface) (van den Brink 2001). Wolf proposes a broader understanding of technology-related risk, which further includes IT strategy and external events affecting technology resources (Wolf 2005). A detailed analysis of all technology resources and their fundamental properties is beyond the scope of this chapter. Following the German implementation of the second pillar of Basel II, we base our technology risk model on a minimum of three categories: IT components, IT management tasks and application systems. Basic IT components include hardware and software components, which are the building blocks of more complex application systems directly supporting the value-creating business activities. IT management tasks refer to the system life cycle and include, for example, the operation and migration of application systems.
In recent years, general standards for IT management have been gaining importance. For example, ITGI’s “Control Objectives for Information and related Technology” (COBIT) (ITGI 2002) and BSI’s “IT Baseline Protection” (BSI 2003) propose normative guidelines for IT governance and security management. The BSI IT baseline protection manual provides a structured overview of threat scenarios and safeguard instructions. Information resources are grouped into modules such as general components, infrastructure, networked systems, etc. A threat catalog describes the typical vulnerabilities of information resources. Depending on the particular threat scenario, availability, integrity or confidentiality of information may be compromised. The COBIT framework takes a complementary approach because it provides IT management guidelines for minimizing risks and increasing overall performance from a process-oriented perspective. In the following, we draw on BSI and COBIT for the identification of relevant risk factors and essential properties of technology resources. As depicted in Table 15.1, the properties defined by BSI and COBIT are assigned to the three generic resource types mentioned above.
Table 15.1. Classification of resource properties relevant to technology risk
Properties of Technology Resources: Security of IT Components | Compliance of Application Systems | Soundness of IT Management
Sources: BSI 2003 | ITGI COBIT
Special attention needs to be paid to the properties describing compliance of application systems. They very much depend on national regulations, for example IDW (IDW 2002) or Sarbanes-Oxley (Sarbanes-Oxley Act 2002). Here, we follow a widely used classification scheme comprising completeness, timeliness and correctness as a least common denominator.
Before we start to specify our technology risk ontology, let us formulate our objective (called a “motivating scenario” by Grüninger and Fox 1995), which concisely states the development goal: To establish an integrated (that is, internally and regulatorily accepted) understanding of operational technology risk and its formalization as a model to enable integrated risk management. Derived from this objective are the so-called “competency questions” that our ontology must be “competent” to answer (Grüninger and Fox 1995). They describe the requirements on the ontology:
1. What are the special characteristics of technology risk and how does it relate to the more general concepts of operational and business risk?
2. How does a generic model of technology risk represent the understanding coined by Basel II? How can the model be integrated into IT management and IT governance approaches?
3. What is a feasible categorization of the risk-relevant resources and what is their contribution to the creation of value?
4. Which properties are relevant to technology risk management?
Figure 15.2 summarizes the main concepts of the technology risk ontology, which are subsequently described in more detail. In the first step of our ontology formalization, we distinguish three groups of technology resources according to their basic properties and vulnerabilities. The top-tier technology resources are categorized as atomic IT components (ITComponent), application systems (ApplicationSystem) and operative tasks of IT management (ITManagementTask). As mentioned before, we describe these concepts and their relationships using Description Logic: (15.1)
Figure 15.2. Technology Risk Ontology
ITComponents are essentially physical computer parts (Hardware), executable computer programs (Software), communication devices (Network), and auxiliary physical infrastructure (Infrastructure):¹ (15.2)
Unlike elementary IT components, application systems fulfill a business function and directly contribute to the creation of value. The relationship between application systems and IT components is expressed by composition (see the relation form in Figure 15.2). Because of complex information flows on the systems level, there is a functional dependency (depend) between different application systems. (15.3)
As stated before, management activities potentially influence technology risk. Mandatory management tasks (ITManagementTask) comprise planning and controlling of the development, introduction, operation, migration, and termination of application systems, either on a per-project basis or recurrently. The degree of maturity of these tasks influences the compliance of the respective application systems. (15.4)
An operational loss is accompanied by an event in which a relevant property of a technology resource (Property) is compromised. Following the classification scheme in Table 15.1, we specify three basic properties, which are described in more detail below: security of IT components, compliance of application systems and soundness of IT management tasks. (15.5)
Within the scope of IT components, risk is defined as a lack of security (cf. Hammer 1999). IT security is further subdivided into the properties integrity, availability and confidentiality. The effect of violating an IT component’s required property constraints may cascade through the IT topology and affect the application systems, finally resulting in financial losses. In order to ensure compliance of an application system and to guarantee that the value chain is properly operated, at least three property constraints have to be fulfilled: correctness, completeness and timeliness. However, subsequent extensions to this schema are permitted. IT management tasks are included in the ontology because the control functions on a managerial level help to address technology-related risks. Sound management activities as part of the system life cycle have a positive effect on the quality (Effectiveness), cost (Efficiency) and adherence to deadlines (Schedule). (15.6)
¹ To avoid cluttering up the diagram, the subset relationships in Formulas 15.2, 15.4 and 15.6 were left out in Figure 15.2.
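The cascading effect described above, where a compromised IT component affects the application systems built from it and then the systems that functionally depend on those, can be illustrated with a small sketch. This is not the authors' implementation: the class layout, the function `affected_systems` and all instance names (`db-server`, `ledger`, and so on) are hypothetical, and composition and dependency are modeled simply as Python lists.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ITComponent:
    name: str
    kind: str  # "Hardware", "Software", "Network" or "Infrastructure"


@dataclass
class ApplicationSystem:
    name: str
    components: list = field(default_factory=list)  # composition: ITComponent objects
    depends_on: list = field(default_factory=list)  # dependency: ApplicationSystem objects


def affected_systems(failed, systems):
    """Return the names of application systems reached by a failed IT component."""
    # Directly affected: systems composed of the failed component.
    affected = {s.name for s in systems if failed in s.components}
    # Indirectly affected: follow functional dependencies until nothing changes.
    changed = True
    while changed:
        changed = False
        for s in systems:
            if s.name not in affected and any(d.name in affected for d in s.depends_on):
                affected.add(s.name)
                changed = True
    return affected


# Hypothetical example topology (illustrative names only).
db = ITComponent("db-server", "Hardware")
lan = ITComponent("office-lan", "Network")
ledger = ApplicationSystem("ledger", components=[db])
reporting = ApplicationSystem("reporting", components=[lan], depends_on=[ledger])
crm = ApplicationSystem("crm", components=[lan])

print(sorted(affected_systems(db, [ledger, reporting, crm])))  # ['ledger', 'reporting']
```

The fixed-point loop is chosen for readability; for a large IT landscape a graph traversal over an explicit adjacency structure would be the more usual choice.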
Finally, a definition of technology risk as a subcategory of business and operational risk is derived. Technology risk is characterized as a potential loss which results from the failure of a technology resource to support a business activity. (15.7)
In this ontology, both the internal and the external perspective on technology risk are reflected by explicitly stating the dependencies between technology resources and their risk-relevant properties. Moreover, application of the ontology can help to better comply with specific regulatory requirements. In the banking sector, for example, a mapping of business activities to Basel II business lines (BusinessLine) and of resource properties to event types (EventType) can be defined using our ontology.
Evaluation plays a key role in ontology development. An analysis of the taxonomic relationships in the complete specification was conducted using the OntoClean method (Guarino and Welty 2002). Moreover, two key measures for assessing the ontology’s quality are precision and coverage with respect to the intended models (Guarino and Persidis 2003). On an abstract level, the proposed ontology covers all relevant domain concepts needed to implement sound technology risk management practices. The precise specification of the relevant terms using Description Logic was regarded as equally important. The ontology’s applicability in a real-world scenario is surveyed by means of a case study described below.
The ontology presented here helps to resolve the conceptual ambiguities still residing in a traditionally fuzzy problem domain. Based on a consolidation of accepted approaches, it provides a common language that permits the formalization of the organization’s knowledge on the IT system structure in general and technology risk phenomena in particular. The ontology thus contributes to a shared understanding of technology risk. The proposed model can be adapted to meet enterprise-specific and regulatory requirements due to the extensibility of ontology hierarchies. In the following sections, we will elaborate on how it can be applied in all phases of the risk management life cycle.
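As a rough orientation only, the taxonomic part of the prose description in this section could be written in Description Logic along the following lines. This is a sketch derived from the text, not the authors' numbered formulas: the concept names TechnologyResource, Security, Compliance, Soundness, Integrity, Availability, Confidentiality, Correctness, Completeness and Timeliness are assumed here for illustration, as is the choice of equivalence rather than subsumption in the first axiom.

```latex
\begin{align*}
\mathit{TechnologyResource} &\equiv \mathit{ITComponent} \sqcup \mathit{ApplicationSystem} \sqcup \mathit{ITManagementTask} \\
\mathit{Hardware} \sqcup \mathit{Software} \sqcup \mathit{Network} \sqcup \mathit{Infrastructure} &\sqsubseteq \mathit{ITComponent} \\
\mathit{Security} \sqcup \mathit{Compliance} \sqcup \mathit{Soundness} &\sqsubseteq \mathit{Property} \\
\mathit{Integrity} \sqcup \mathit{Availability} \sqcup \mathit{Confidentiality} &\sqsubseteq \mathit{Security} \\
\mathit{Correctness} \sqcup \mathit{Completeness} \sqcup \mathit{Timeliness} &\sqsubseteq \mathit{Compliance} \\
\mathit{Effectiveness} \sqcup \mathit{Efficiency} \sqcup \mathit{Schedule} &\sqsubseteq \mathit{Soundness}
\end{align*}
```

Subsumption is used for the lower levels because the text explicitly permits later extensions of the property schema, so the listed subconcepts need not be exhaustive.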
15.3.2 The OntoRisk Process Model
The integration of the ontology into the risk management process is described according to the model depicted in Figure 15.1. This risk management process should be viewed as an iterative task.
15.3.2.1 Identification
“Banks should identify the operational risk inherent in all types of products, activities, processes and systems” (BCBS 2003). As a preparation, a model of all relevant technology resources is developed based upon the technology risk ontology. During the identification phase, the domain experts select the most risk-sensitive properties and bind estimated risk event frequencies to them. The resulting Systems & Threats Model (cf. Figure 15.1) serves as an input for subsequent measurement. It is obtained in a top-down approach as described below:
1. Discovery and documentation of the application systems, IT components and management tasks within the predefined scope.
2. Insertion of the relevant dependencies between IT components, management tasks and application systems.
3. Identification of the potential threat scenarios for application systems as well as for the underlying IT components and IT management tasks.
4. Assignment of threat scenarios to properties in conjunction with self-assessment by the domain experts.
The granularity of the modeled instances is at the discretion of the risk manager. However, a balance has to be struck between the level of detail of the model and its maintainability. The relevant threats are identified during interviews with domain experts. This process can be systematized if catalogues of potential hazards (and measures) are utilized, as provided by the IT security management guidelines BSI and COBIT (van den Brink 2001; Wolf 2005). In OntoRisk, identification is supported by the ability to auto-generate questionnaires from an extension of the ontology. In order to achieve this goal, the technology resources are subdivided into more fine-grained objects (BSI 2003). The key issue here is the mapping from technology resources to generic terms defined in the above standards.
In the following excerpt, a BSI data transmission component (DataTransmissionComponent) is introduced as a subclass of a network component: (15.8) In the second step, we define additional restrictions on the risk-relevant properties: (15.9) During identification, domain experts judge the Frequency of a threat scenario derived from the above-mentioned standards: (15.10)
The presented approach provides the means for the iterative development of the Systems & Threats Model and simultaneously allows for the instant evaluation by domain experts. Thereby, the ontology acts as an enabler for an enterprise-wide understanding of various potential hazards and ultimately leads to increased risk awareness.
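The four identification steps can be illustrated with a small data model. The following is only a sketch under stated assumptions: the class names `Resource` and `ThreatScenario`, the function `build_questionnaire` and the frequency value "M" are hypothetical and not part of OntoRisk, which derives its questionnaires from the OWL ontology rather than from Python objects. The resource names are borrowed from the case study.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ThreatScenario:
    description: str                  # hazard, e.g. taken from a BSI or COBIT catalogue
    affected_property: str            # risk-relevant property of the resource
    frequency: Optional[str] = None   # expert judgement "H", "M" or "L"; None = not yet assessed


@dataclass
class Resource:
    name: str
    kind: str                         # "ApplicationSystem", "ITComponent" or "ITManagementTask"
    depends_on: List[str] = field(default_factory=list)
    threats: List[ThreatScenario] = field(default_factory=list)


def build_questionnaire(resources):
    """Return one question for every threat scenario that has no frequency yet."""
    questions = []
    for res in resources:
        for threat in res.threats:
            if threat.frequency is None:
                questions.append(
                    f"{res.name} ({res.kind}), property {threat.affected_property}: "
                    f"how often do you expect '{threat.description}'? [H/M/L]"
                )
    return questions


# Steps 1 to 3: document resources, dependencies and threat scenarios.
reuters = Resource(
    "Reuters", "ITComponent",
    threats=[ThreatScenario("Failure of an external service", "Availability")],
)
market_data = Resource("MarketData", "ApplicationSystem", depends_on=["Reuters"])
model = [reuters, market_data]

questions = build_questionnaire(model)

# Step 4: a domain expert answers the question (hypothetical answer).
reuters.threats[0].frequency = "M"
```

Once every threat scenario carries a frequency, `build_questionnaire` returns an empty list, which mirrors the iterative character of the identification phase.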
15.3.2.2 Measurement
The development of the Quantification Model for risk measurement (cf. Figure 15.1) represents an integral part of risk management. In the context of operational risk, many quantification approaches can be identified, varying in both sophistication and risk sensitivity. A thorough analysis of the existing approaches is carried out in (Cuske et al. 2005a). For example, in the Basel II standardized approach the risk capital charge is calculated on the basis of the annual gross income in eight business lines (BCBS 2005a). The more advanced Advanced Measurement Approaches (AMA) include self-assessment, loss database, extreme value theory and early indicator approaches. The functional correlation approach, which was first proposed by (Kühn and Neu 2003) and further refined by (Leippold and Vanini 2005), is particularly promising since it takes the internal enterprise structure into account. The operational risk environment is modeled as a graph with functional dependencies between assets to which risk factors are attached. Corresponding risk events may lead to chain reactions influencing the overall loss. When no analytical solution exists, a simulation approach is applied (Kühn and Neu 2003). By including dependencies in the simulation model, the structural knowledge is leveraged and the nature of operational technology risk is adequately represented.
The general adoption of the functional-dependency model for the ontology-based quantification of technology risk is described in (Cuske et al. 2005b). Here, we focus on how the technology risk ontology is applied as part of the OntoRisk method. The Systems & Threats Model serves as a starting point for measurement. Subsequently, we transform it into the simulation model, which is based on the functional-dependency approach. The following statements define the mapping of application systems to assets (Asset). Assets are characterized as directly contributing to the value chain.
The risk-relevant properties (States) of assets are mapped onto elements (Element). Stochastic behavior of an element is injected by a supporting function (formula) which is represented as a text string.
(15.11)
Moreover, IT components (ITComponent) and management tasks (ITManagementTask) are defined as risk factors (Riskfactor). These stochastic parameters have an impact on assets via the relations form and influence. The properties of risk factors act as input parameters for the simulation model (Input) and are mapped accordingly. In this way, the identified frequencies of threat scenarios are integrated into the simulation model. (15.12) Loss functions (LossFunction) are introduced to calculate the effects of a failure of assets during operation of the value chain (operate). Financial losses are integrated as outputs into the simulation model. (15.13) The overall loss is calculated using a stochastic process defined by the individual outputs. The resulting risk potential is then retrieved through risk measures such as Value at Risk (VaR) or Conditional VaR (CVaR).
By defining the mapping into the measurement view, we are able to combine the advantages of a shared understanding achieved by an ontology-based approach with a quantification of technology risk. This provides an elegant solution for system support, and diminishes the semantic gap between domain knowledge and its representation in a simulation model.
15.3.2.3 Controlling
The controlling phase embraces population of the Analysis Model (cf. Figure 15.1). In the following, we single out selected aspects, such as compliance with general risk management standards (BCBS 2005a). In this context, a basic regulatory requirement for risk management is backtesting of the quantitative model with historical loss data. This requires that occurring loss events are stored in loss databases in order to be compared to the calculated losses on a regular basis. The proposed ontology enables verification based on loss data, since the simulation results are written into the knowledge base. The risk potential computed by means of simulation has to be validated against empirical loss data on the basis of the predefined business line or event type schema. The business lines can be mapped via the map relation: (15.14) Moreover, Basel II demands a classification into EventTypes. The assignment to event types is also specified using the map relation. The next statement shows the first level of Basel II event type classification (BCBS 2005a):
(15.15)
A case for decision support is also found in the alternatives of hedging or avoiding technology risk (Cruz 2002; King 2001). As a rule of thumb, especially those risk events with a low frequency but high severity are to be prevented beforehand. Here, we introduce new concepts like HighSeverity to obtain risk factors and systems causing high severity losses.
15.3.3 System Architecture
The OntoRisk process is supported by a software application, which is built around the ontology presented so far (cf. Figure 15.3). In this section, we briefly highlight the key aspects of our implementation. A more in-depth discussion can be found in (Cuske et al. 2005b). Throughout the application we use the Web Ontology Language (OWL) (W3C 2004) as the standardized format for ontology serialization. A graphical OWL-Editor (n) is used to edit both the technology risk ontology (o) and the views (p), which contain the mappings that have been formalized in Description Logic in our discussion on the OntoRisk process above. A reasoner (p), i. e. an inference engine (W3C 2004), transforms the technology risk ontology into the perspectives specific to the respective process steps. The graphical editors, report engines and calculations which cooperate to support a certain step in the risk management
process are implemented separately (q) but share an in-memory model of the (transformed) ontology as a common data representation format. The result of these efforts is a tightly integrated set of three tools – one for each process step – which supports the user through questionnaire generation (r), a simulation module (s) and a graphical analysis and report interface (t).
Figure 15.3. Architecture of the OntoRisk Platform (components shown: Ontology Editor; Technology Risk Ontology in OWL; Reasoning; OWL Systems & Threats View, OWL Quantification View and OWL Analysis View; View-Specific Code; Identification, Measurement and Controlling; marked with the labels n to t)
15.4 Case Study: Outsourcing in Banking
To demonstrate the benefits of the OntoRisk approach in a real-world application, a case study in an area of special interest is introduced: IT management considering the outsourcing of a substantial part of the value chain. The area under examination is one business line in banking, namely trading and sales. The case study does not aim at giving decision support on IT outsourcing itself. Instead, its purpose is to illustrate the effects of our ontology-based approach on the technology risk management process. In the identification phase, we analyze common threat scenarios. Within the measurement phase, two alternatives for quantification are compared: the Basel II standard measurement approach and the OntoRisk approach, which utilizes the proposed ontology-based quantification. The controlling phase is covered within the discussion of the simulation results.
15.4.1 Motivation and Influencing Factors
The permanent cost pressures on financial institutions have been nurturing the discussion on lowering vertical integration, especially in the German banking industry, where the cost/income ratio is higher than the EU average (ECB 2005b). In this context, the outsourcing of core activities is seen as one way to reduce the vertical integration and realize long-term cost benefits. From a regulatory perspective, outsourcing is defined “[...] as a regulated entity’s use of a third party (either an affiliated entity within a corporate group or an entity that is external to the corporate group) to perform activities on a continuing basis that would normally be undertaken by the regulated entity, now or in the future” (BCBS 2005b). Various business models exist, but intra-group outsourcing is often preferred (ECB 2004). Service Level Agreements (SLA) are used to contractually define the quality of the provided service (e. g. availability) together with penalty provisions for non-fulfillment. It is important to note, however, that outsourcing of IT itself leads to new types of risk (Foit 2005). A recent study in the banking industry shows that the most significant risks in outsourcing are the loss of direct control, an increase in operational risks and the loss of internal skills (ECB 2004).
In our case study, we restrict the outsourcing scenario to trading and sales. Following the Basel II understanding, trading and sales includes proprietary trading, trading on the account of corporate clients as well as market making and treasury (BCBS 2005a). This business line has been selected because it proves to be particularly sensitive to operational risks, as indicated by the relatively high capital charge (β = 18%) imposed by the Basel II standardized approach. In this business line, technology risk takes a high share within operational risk due to the strong dependence on information technology. To guarantee a high degree of generality, our case is based on publicly available data sources summarized in Table 15.2. The percentage of technology risk events in trading and sales varies between 33% and 66%. Furthermore, the income achieved in trading and sales has a substantial share of the overall return of a bank.
Table 15.2. Overview of data sources used in the case study (BCBS 2002a, 2002b; ECB 2005a; Federal Reserve Bank of Boston 2004)
Public reports | Relevant data sources | Year
Quantitative Impact Study II – Tranche 2 | Collection of loss events between 1998 and 2000 at 30 international banks | 2002
2002 Loss Data Collection Exercise | Quantitative Impact Study II extended survey of 2001, 89 international banks | 2002
2004 Loss Data Collection Exercise | Survey of individual losses – 2004 of 27 US-banks by US Federal Bank | 2004
EU banking statistics | Profitability of EU banks in 2004 | 2005
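To make the capital charge under the Basel II standardized approach concrete, the following sketch computes it from gross income per business line. The beta factors are those of the Basel II framework (BCBS 2005a); the function name `standardized_charge`, the dictionary keys and the gross income figures are hypothetical and serve only as an illustration.

```python
# Beta factors of the Basel II standardized approach per business line.
BETA = {
    "corporate_finance": 0.18,
    "trading_and_sales": 0.18,
    "retail_banking": 0.12,
    "commercial_banking": 0.15,
    "payment_and_settlement": 0.18,
    "agency_services": 0.15,
    "asset_management": 0.12,
    "retail_brokerage": 0.12,
}


def standardized_charge(gross_income_by_year):
    """Average over the given years of max(sum(beta * gross income), 0).

    gross_income_by_year: one dict {business line: gross income} per year
    (three years in the Basel II framework).
    """
    yearly_charges = []
    for year in gross_income_by_year:
        weighted = sum(BETA[line] * income for line, income in year.items())
        yearly_charges.append(max(weighted, 0.0))  # a negative year counts as zero
    return sum(yearly_charges) / len(yearly_charges)


# Hypothetical gross income of the trading and sales business line (in millions).
charge = standardized_charge([
    {"trading_and_sales": 100.0},
    {"trading_and_sales": 120.0},
    {"trading_and_sales": 80.0},
])
print(round(charge, 2))
```

Because the charge depends on gross income only, it does not change when the underlying IT environment changes, which is why the case study contrasts it with a simulation-based quantification.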
15.4.2 Trading Environment
The exemplary IT environment serves to support a generic trading process, which follows a clear separation of business functions, adopted from the German implementation of the second pillar of Basel II (Bundesbank 2005). Figure 15.4 visualizes the system topology, which comprises six application systems, each contributing to different tasks. MarketData supplies all necessary information for trading activities executed in the FrontOffice. Every transaction is validated by a Confirmation and forwarded to the BackOffice. The trading limits are checked by the LimitController. All transactions are finally forwarded to the GeneralLedger. In this scenario, we choose Confirmation as a candidate for outsourcing. As a result, the firm is no longer directly responsible for inspection of transactions and the back office now depends on an external provider. This results in different risk assessments and potentially influences the required capital charge. Potential losses are triggered by malfunctioning IT components and management tasks. In the example case, the relevant risks affecting IT components are systematically identified by analyzing potential threat scenarios as described in the BSI baseline protection manual. For the IT management tasks we refer to the COBIT standard. The concrete impacts of these scenarios on the risk-relevant
properties are displayed in Table 15.3. Domain experts are responsible for judging the estimated frequency of a risk event. Further, the affected property constraints of application systems are included in the Systems & Threats Model. Before measurement, the qualitative judgment of risk events by domain experts has to be translated into quantitative figures. In the simulation model, the loss events are modeled as a Bernoulli distribution with the average frequency as stochastic parameter. At this stage, loss databases sometimes provide a supplementary tool for risk assessment. Loss severity is modeled independently of the frequency distribution (BCBS 2005a). For our case study, we consider loss drivers such as the missed return in trading or settlement errors resulting from the failure of supporting application systems. Following common operational risk practices (Cruz 2002), a lognormal distribution is applied and fitted using appropriate data from the banking statistics. In the outsourcing scenario, losses resulting from the confirmation system are covered by an SLA and set to zero.
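The combination of Bernoulli loss events and independently drawn lognormal severities can be sketched as a small Monte Carlo simulation. This is not the OntoRisk simulation module: all probabilities and severity parameters below are hypothetical, each risk factor is assumed to affect exactly one application system (chain reactions along further dependencies are left out), and the quantile estimators in `var_cvar` are simple empirical ones.

```python
import math
import random

# Hypothetical inputs: risk factor -> (loss-event probability per period, affected system).
RISK_FACTORS = {
    "Reuters": (0.10, "MarketData"),
    "PowerSupply": (0.02, "FrontOffice"),
    "IBMDB2": (0.05, "Confirmation"),
    "OracleDB": (0.05, "BackOffice"),
}
# Hypothetical lognormal severity parameters (mu, sigma) per application system.
SEVERITY = {
    "MarketData": (11.0, 1.0),
    "FrontOffice": (13.0, 1.2),
    "Confirmation": (12.0, 1.0),
    "BackOffice": (12.0, 0.8),
}


def simulate_losses(n_trials, covered=(), seed=1):
    """Simulate the total loss per period; losses of `covered` systems are set to zero."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_trials):
        total = 0.0
        for p_event, system in RISK_FACTORS.values():
            event = rng.random() < p_event            # Bernoulli loss event
            mu, sigma = SEVERITY[system]
            # Severity is drawn independently of the event, and in every trial,
            # so that both scenarios consume the same random numbers.
            severity = rng.lognormvariate(mu, sigma)
            if event and system not in covered:
                total += severity
        losses.append(total)
    return losses


def var_cvar(losses, level=0.99):
    """Empirical Value at Risk and Conditional VaR at the given confidence level."""
    ordered = sorted(losses)
    index = min(math.ceil(level * len(ordered)) - 1, len(ordered) - 1)
    tail = ordered[index:]
    return ordered[index], sum(tail) / len(tail)


in_house = simulate_losses(10_000)
outsourced = simulate_losses(10_000, covered=("Confirmation",))
print("in-house   VaR, CVaR:", var_cvar(in_house))
print("outsourced VaR, CVaR:", var_cvar(outsourced))
```

Using the same seed for both scenarios means that each simulated period differs only in the losses covered by the SLA, so the outsourced loss never exceeds the in-house loss in any trial.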
Figure 15.4. Application systems with dependencies and property constraints
Table 15.3. Identification of risk factors (each hazard is additionally rated with one frequency mark ■ in a column H, M or L; the column of each mark is not recoverable here)
Description of hazard | Risk factor | Application system
Failure of an external service: After a disconnection to Reuters, the market data is temporarily unavailable (cf. BSI – T 4.31). | Reuters → Availability | MarketData → Correctness
Disruption of power supply: Power supply for the trading system is interrupted (cf. BSI – T 4.1). | PowerSupply → Availability | FrontOffice → Timeliness
Deficient database security: Inadequate security mechanisms open up the possibility for intrusion (cf. BSI – T 2.38). | IBMDB2 → Confidentiality | Confirmation → Correctness
Bad facility management: Insufficient management of facilities leads to undetected errors in confirmation processing (cf. COBIT – DS12). | FacilityManagement → Effectiveness | Confirmation → Correctness
Database failure: A failure of hard- or software potentially causes a loss of data integrity (cf. BSI – T 4.26). | OracleDB → Integrity | BackOffice → Correctness
Inadequate data management: Shortcomings result in incomplete data processing/storage (cf. COBIT – DS11). | DataManagement → Effectiveness | BackOffice → Completeness
Software vulnerabilities/errors: If the standard software is too complex, a software error may occur (cf. BSI – T 4.22). | ValuationSoftware → Integrity | LimitController → Completeness
Managing projects: Weak project management increases the pressure on running tasks performed in accounting (cf. COBIT – PO10). | ProjectManagement → Efficiency | GeneralLedger → Correctness
15.4.3 Analysis and Results
In the case study, comparative figures were used to analyze alternative scenarios with sourcing strategies ranging from an in-house solution to a transfer to an external provider. The most interesting results are summarized in Figure 15.5. On the left-hand side of the figure, the original risk potential of the in-house solution is shown, while the right-hand side depicts the reduced risk potential after outsourcing the confirmation system. For completeness it should be noted that the latter scenario assumes a full coverage of resulting losses by the external provider. The ex-ante analysis of competing scenarios can support IT management in the process of decision making. Additionally, an ex-post analysis can serve as a starting point for the controlling phase.
Without Outsourcing VaR: 4729154 CVaR: 5109822
With Outsourcing VaR: 3071509 CVaR: 3401466
Figure 15.5. Exemplary results for two outsourcing scenarios
15.5 Conclusion
By incorporating results from the area of knowledge management research into risk management, we were able to contribute new solutions to some of the prevalent challenges in that field. In detail, we applied formal ontologies to create a model for a shared understanding of the conceptual building blocks of technology risk management. This technology risk ontology forms the foundation for both our proposed process model and the supporting software toolset. In this way, we were not only able to unravel conceptual ambiguities inherent in this domain, but also to establish the basis for an integrated risk management approach as part of IT governance.
References
Abecker A, van Elst L (2004) Ontologies for Knowledge Management. In: Staab S, Studer R (eds.): Handbook on Ontologies. Springer 435−454
Baader F, Horrocks I, Sattler U (2004) Description Logics. In: Staab S, Studer R (eds.): Handbook on Ontologies. Springer 3−28
Bank for International Settlements (BIS 2005) 75th Annual Report
Basel Committee on Banking Supervision (BCBS 2002a) The 2002 Loss Data Collection Exercise for Operational Risk: Summary of the Data Collected
Basel Committee on Banking Supervision (BCBS 2002b) The Quantitative Impact Study for Operational Risk: Overview of Individual Loss Data and Lessons Learned
Basel Committee on Banking Supervision (BCBS 2003) Sound Practices for the Management and Supervision of Operational Risk
Basel Committee on Banking Supervision (BCBS 2005a) International Convergence of Capital Measurement and Capital Standards – A Revised Framework
Basel Committee on Banking Supervision (BCBS 2005b) Outsourcing in Financial Services
Berners-Lee T, Hendler J, Lassila O (2001) The Semantic Web. Scientific American, May
Bundesamt für Sicherheit in der Informationstechnik (BSI 2003) Baseline Protection Manual
Bundesbank (2005) Mindestanforderungen an das Risikomanagement (MaRisk)
Chong CW, Holden T, Wilhelmij P, Schmidt RA (2000) Where does knowledge management add value? Journal of Intellectual Capital 1 366−380
Cruz MG (2002) Modeling, measuring and hedging operational risk. Wiley, Chichester
Culp CL (2002) The ART of risk management: alternative risk transfer, capital structure, and the convergence of insurance and capital markets. Wiley, New York; Chichester
Cuske C, Dickopp T, Schader M (2005a) Konzeption einer Ontologie-basierten Quantifizierung operationeller Risiken bei Banken. Banking and Information Technology 6 61−74
Cuske C, Dickopp T, Seedorf S (2005b) JOntoRisk: An Ontology-Based Platform for Knowledge-Based Simulation Modeling in Financial Risk Management. Proceedings of the European Simulation and Modeling Conference (ESM), Porto, Available at SSRN: http://ssrn.com/abstract=802665
Cuske C, Korthaus A, Seedorf S, Tomczyk P (2005c) Towards Formal Ontologies for Technology Risk Measurement in the Banking Industry. 1st Workshop Formal Ontologies Meet Industry (FOMI) Proceedings, Available at SSRN: http://ssrn.com/abstract=744404
European Central Bank (ECB 2004) Report on EU Banking Structure
European Central Bank (ECB 2005a) EU banking sector stability
European Central Bank (ECB 2005b) EU Banking Structures
European Commission (EC 1999) Review of regulatory capital for credit institutions and investment firms
European Commission (EC 2004) Framework for Consultation on Solvency II.
Federal Reserve Bank of Boston (2004) Results of the 2004 Loss Data Collection Exercise for Operational Risk
Fensel D (2004) Ontologies: a silver bullet for knowledge management and electronic commerce. Springer, Berlin; Heidelberg
Foit M (2005) Management operationeller IT-Risiken bei Banken. Universitätsverlag Regensburg, Regensburg
Fox MS, Gruninger M (1998) Enterprise Modelling. AI Magazine 19 109−121
GO30 (1993) Special Report on Global Derivatives – Derivatives: Practices & Principles. The Group of Thirty Consultative Group on International Economic and Monetary Affairs, Inc.
Gruber TR (1993) A translation approach to portable ontology specifications. Knowl. Acquis. 5 199−220
Grüninger M, Fox M (1995) Methodology for the Design and Evaluation of Ontologies. IJCAI’95, Workshop on Basic Ontological Issues in Knowledge Sharing, April 13
Guarino N, Persidis A (2003) Evaluation Framework for Content Standards. Ontoweb Deliverable 3.5, IST-2000-29243
Guarino N, Welty C (2002) Evaluating ontological decisions with OntoClean. Commun. ACM 45 61–65
Hammer V (1999) Die 2. Dimension der IT-Sicherheit: Verletzlichkeitsreduzierende Technikgestaltung am Beispiel von Public-Key-Infrastrukturen. Vieweg, Braunschweig; Wiesbaden
Hansen MT, Nohria N, Tierney T (1999) What’s your strategy for managing knowledge? Harvard Business Review 77 106−116
Holsapple CW, Joshi KD (2004) A formal knowledge management ontology: Conduct, activities, resources, and influences: Research Articles. J. Am. Soc. Inf. Sci. Technol. 55 593−612
Institut der Wirtschaftsprüfer (IDW 2002) Stellungnahme zur Rechnungslegung FAIT 1 − Grundsätze ordnungsgemäßer Buchführung bei Einsatz von Informationstechnologie
International Accounting Standards Committee (IASB 2004) International Accounting Standards
IT Governance Institute (ITGI 2002) Control Objectives for Information and Related Technology
King JL (2001) Operational Risk: Measurement and Modelling. Wiley, Chichester
Knight FH (1965) Risk, Uncertainty and Profit. Harper & Row
Kühn R, Neu P (2003) Functional correlation approach to operational risk in banking organizations. Physica A 322 650−666
Lamberti H-J (2005) Kernelemente der Industrialisierung in Banken aus Sicht einer Großbank. In: Sokolovsky, Löschenkohl (eds.): Handbuch Industrialisierung der Finanzwirtschaft 59−82
Leippold M, Vanini P (2005) The Quantification of Operational Risk. Journal of Risk 8 59−85
Maedche A, Motik B, Stojanovic L, Studer R, Volz R (2003) Ontologies for Enterprise Knowledge Management. IEEE Intelligent Systems 18 26−33
Marshall C, Prusak L, Shilberg D (1996) Financial risk and the need for superior knowledge management. California management review 38 77−101
O’Leary DE (1998) Using AI in Knowledge Management: Knowledge Bases and Ontologies. IEEE Intelligent Systems 13 34−39
Prahalad CK, Hamel G (1990) The core competence of the corporation. Harvard Business Review 68 79−91
Probst G, Buchel B, Raub S (1996) Knowledge as a Strategic Resource. Ecole des Hautes Etudes Commerçiales, Université de Genève
Probst G, Raub S, Romhardt K (2003) Wissen managen: wie Unternehmen ihre wertvollste Ressource optimal nutzen. Gabler, Wiesbaden
Quintas P, Lefrere P, Jones G (1997) Knowledge management: a strategic agenda. Long Range Planning 30 385−391
Sarbanes-Oxley Act of 2002 (2002) H.R.3763
Skyrme DJ, Amidon DM (1997) Creating the Knowledge-Based Business. Business Intelligence, London
Spies M, Clayton A, Noormohammadian M (2005) Knowledge management in a decentralized global financial services provider: a case study with Allianz Group. Knowledge Management Research & Practice 3 24−36
Staab S, Studer R, Schnurr H-P, Sure Y (2001) Knowledge Processes and Ontologies. IEEE Intelligent Systems 16 26−34
Swan J (2003) Knowledge Management in Action? In: Holsapple CW (ed.): Handbook on Knowledge Management, Vol. 1. Springer, Berlin; Heidelberg 271−296
Uschold M, Gruninger M (1996) Ontologies: principles, methods, and applications. Knowledge Engineering Review 11 93−155
van den Brink GJ (2001) Operational Risk: The New Challenge for Banks. Palgrave Macmillan
World Wide Web Consortium (W3C 2004) Web Ontology Language (OWL).
Westenbaum A (2003) Bankbetriebliches Wissensmanagement: Entwicklung, Akquisition und Transfer der Unternehmensressource Wissen in Kreditinstituten. Lang, Frankfurt am Main
Wolf E (2005) IS risks and operational risk management in banks. Eul, Lohmar
CHAPTER 16 Extracting Financial Data from SEC Filings for US GAAP Accountants Thomas Stümpert
16.1 Introduction
Since financial statements are vital for decision makers in the professional investment world, quick access to this data and its automated import are essential for the financial industry. To work with this data, an investor or analyst requires a comprehensive database with the figures from financial statements for all covered companies over a reasonably long period of time. Investment companies pay service providers such as Reuters (http://www.reuters.com) and Compustat (http://www.compustat.com) for access to refined financial statements. A popular source of information is the collection of filings held in the EDGAR database. EDGAR, the Electronic Data Gathering, Analysis, and Retrieval system, performs automated collection, validation, indexing, acceptance, and forwarding of submissions by companies and others who are required by law to file forms with the U.S. Securities and Exchange Commission SEC (U.S. Securities and Exchange Commission SEC 2007). Its primary purpose is to increase the efficiency and fairness of the securities market for the benefit of investors, corporations, and the economy by accelerating the receipt, acceptance, dissemination, and analysis of time-sensitive corporate information filed with the agency. Registered companies are required to submit certain financial reports in prescribed formats, in so-called forms. Unfortunately, the data available in the SEC EDGAR database is weakly structured in technical terms. Filings contain mixed content of natural language text and semi-structured financial tables. Form 10-K is the annual report that most reporting companies file with the SEC, containing annual financial statements. Form 10-Q is the corresponding quarterly update. Filings consist of plain text
and/or HTML. As long as these filings are the standard and most up-to-date source of information for professional and private investors, software agents that transform them into an exchangeable format are of great interest. For data extraction methods in general, see e. g. (Baeza-Yates and Ribeiro-Neto 1999) or (Salton 1988).
The software agent described here continues prior explorations of data extraction using agents, formerly named EDGAR2XML, see (Leinemann et al. 2001). EDGAR2XML showed the basic ideas of transforming financial data from 10-K and 10-Q filings into machine-readable XML format. One of the shortcomings of EDGAR2XML was its extensive use of regular expressions, which made the project hard to maintain. Large regular expressions were needed because the underlying data format of 10-K and 10-Q filings often varies. What a filing should look like is described in the EDGAR filer manual (EDGAR Filer Manual 2007), but many companies deviate from the rules described there. One reason for not following these rules may be that accountants do not care about data formats; e. g. only a few companies support XBRL (Extensible Business Reporting Language XBRL 2007). The software agent EASE described below (EASE stands for Extraction Agent for SEC’s EDGAR database) uses term weights to identify interesting sections in 10-K and 10-Q forms. EASE is able to cope with these deviations, i. e. the agent takes changing document formats into account.
Before we delve into the details of the EASE project, we take a look at existing implementations with similar objectives. Currently several EDGAR agents exist, e. g. EDGAR online, see (EDGAR Online 2007) and (FreeEDGAR 2007), 10 k Wizard (10 k Wizard: SEC filings 2007) and FRAANK, see (Financial Reporting and Auditing Agent with Net Knowledge FRAANK 2007) and (Kogan; Nelson; Srivastava; Vasarhelyi; Lu 2000). EdgarScan from PriceWaterhouseCoopers pulls filings from the SEC’s servers and parses them automatically to find key financial tables and normalize financial positions to a common set of items that is comparable across companies (EdgarScan 2007). The normalization makes EdgarScan superior to other implementations. When filings are available only in HTML, EdgarScan apparently converts the HTML documents into plain text and parses these documents. Traditional EDGAR agents do not always detect a balance sheet in a 10-K filing correctly. EASE uses two different approaches that take into account the changed document formats of the EDGAR filer manual (EDGAR filer manual 2007), which now supports embedded HTML and PDF documents. We describe an algorithm which extracts key financial information at a high level for conventional text files.
The chapter is organized as follows: The first section describes content and structure of SEC 10-K and 10-Q filings. In the following main part we describe the EASE agent approaches for extraction of financial data from these filings. This part consists of two subsections, one of which is dedicated to the traditional, textual filings, and the other to the newer HTML-encoded documents. The next section describes the role of XBRL, an XML-based standard for exchange of financial information. The final section contains results and a conclusion.
16.2 Structure of 10-K and 10-Q Filings
A filing is a text document that contains tags in order to structure the parts it contains. The specification for filings can be found in the EDGAR filer manual (EDGAR filer manual 2007). These tags are specific to SEC filings, but similar to HTML tags in terms of syntactical rules. A filing consists of a single root element named <SEC-DOCUMENT>, which contains a single child named <SEC-HEADER> and one or more children named <DOCUMENT>. The following excerpt shows this structure, which is common to all filings (indentations are added for illustration purposes):
<SEC-DOCUMENT>
  <SEC-HEADER>
    CONFORMED SUBMISSION TYPE: 10-K
    CONFORMED PERIOD OF REPORT: 20041231
    FILED AS OF DATE: 20050328
    FILER:
      COMPANY DATA:
        CENTRAL INDEX KEY: 0123456789
        …
      FILING VALUES:
        FORM TYPE: 10-K
        …
  </SEC-HEADER>
  <DOCUMENT>
    <TYPE>10-K
    <TEXT>
      SECURITIES AND EXCHANGE COMMISSION
      Washington, D.C: 20549
      Form 10-K
      …
    </TEXT>
  </DOCUMENT>
  …
</SEC-DOCUMENT>
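Splitting a filing at this top level can be sketched in a few lines. The sketch assumes that the header is closed by </SEC-HEADER> and that embedded documents are delimited by <DOCUMENT> and </DOCUMENT>; the sample filing, the function name `split_filing` and the flat treatment of header lines are illustrative assumptions, and this is not the EASE implementation.

```python
import re

# Shortened, made-up filing that follows the layout described above.
SAMPLE_FILING = """<SEC-DOCUMENT>
<SEC-HEADER>
CONFORMED SUBMISSION TYPE: 10-K
CONFORMED PERIOD OF REPORT: 20041231
FILED AS OF DATE: 20050328
FILER:
COMPANY DATA:
CENTRAL INDEX KEY: 0123456789
</SEC-HEADER>
<DOCUMENT>
<TYPE>10-K
<TEXT>
SECURITIES AND EXCHANGE COMMISSION
Form 10-K
</TEXT>
</DOCUMENT>
</SEC-DOCUMENT>
"""


def split_filing(raw):
    """Return (header fields as dict, list of raw embedded documents)."""
    header = {}
    match = re.search(r"<SEC-HEADER>(.*?)</SEC-HEADER>", raw, re.DOTALL)
    if match:
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            # Lines such as "FILER:" only open a group and carry no value.
            if sep and value.strip():
                header[key.strip()] = value.strip()
    documents = re.findall(r"<DOCUMENT>(.*?)</DOCUMENT>", raw, re.DOTALL)
    return header, documents


header, documents = split_filing(SAMPLE_FILING)
print(header["CONFORMED SUBMISSION TYPE"], len(documents))
```

Nested header groups are flattened here for brevity; a fuller parser would use the indentation or the group lines to keep the hierarchy.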
16.2.1
Structure of Document Elements
The common structure of all embedded documents is a sequence of named properties followed by a single text node. An example: 10-K ANNUAL REPORT
360
Thomas Stümpert
actual contents In this example, the first two lines specify two document properties and . The text node encloses the actual contents which is either an embedded document of type HTML, plain text with tags, or even some graphical type. Parsing filings in order to obtain the structure up to this level is comparably easy. Embedded documents which contain financial information are nearly always either HTML or plain text, and sometimes a mixture. The format of these documents varies a lot between companies, and may change over time for a single company.
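The outer structure described above can be recovered with very little code. The following is a minimal sketch, assuming that the header and each embedded document are delimited by <SEC-HEADER> and <DOCUMENT> tag pairs and that header properties have the form "NAME: value". The class and method names are hypothetical and are not taken from EASE.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper with a hypothetical name; not the EASE implementation. */
final class FilingSplitter {

    // DOTALL lets '.' match line breaks; the reluctant '.*?' stops at the first closing tag.
    private static final Pattern HEADER =
            Pattern.compile("<SEC-HEADER>(.*?)</SEC-HEADER>", Pattern.DOTALL);
    private static final Pattern DOCUMENT =
            Pattern.compile("<DOCUMENT>(.*?)</DOCUMENT>", Pattern.DOTALL);

    /** Header lines of the form "NAME: value", keyed by name in document order. */
    static Map<String, String> headerProperties(String filing) {
        Map<String, String> properties = new LinkedHashMap<>();
        Matcher header = HEADER.matcher(filing);
        if (!header.find()) {
            return properties;
        }
        for (String line : header.group(1).split("\\R")) {
            int colon = line.indexOf(':');   // the first colon ends the property name
            if (colon <= 0) {
                continue;                    // not a property line
            }
            properties.put(line.substring(0, colon).trim(),
                           line.substring(colon + 1).trim());
        }
        return properties;
    }

    /** Raw contents of every embedded document, in order of appearance. */
    static List<String> documents(String filing) {
        List<String> result = new ArrayList<>();
        Matcher document = DOCUMENT.matcher(filing);
        while (document.find()) {
            result.add(document.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String filing = "<SEC-DOCUMENT>\n<SEC-HEADER>\n"
                + "CONFORMED SUBMISSION TYPE: 10-K\n"
                + "FILED AS OF DATE: 20050328\n"
                + "</SEC-HEADER>\n"
                + "<DOCUMENT>\n<TYPE>10-K\n<TEXT>\nForm 10-K\n</TEXT>\n</DOCUMENT>\n"
                + "</SEC-DOCUMENT>\n";
        System.out.println(headerProperties(filing));
        System.out.println(documents(filing).size() + " embedded document(s)");
    }
}
```

Group headings such as "FILER:" end up as properties with an empty value in this sketch; a fuller parser would probably use the indentation to rebuild the nesting.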
16.2.2 Embedded Plain Text

Plain text documents are best viewed with browsers that display the contents in a mono-spaced font. Line and page breaks as well as the length of character runs, including white space, are significant for the visible structure of sections, paragraphs and tables. The following excerpt illustrates this.

                2006    2005
    -----------------------
    <S>         <C>     <C>
    ASSETS
    Cash        27      85

Rendering this example in a proportional font without respecting line breaks produces something like

    2006 2005 --------------------------------<S> <C> <C> ASSETSCash 27 85

which clearly illustrates the importance of the properties stated above. That said, we could easily draw a table by placing a pipe character immediately in front of the “less than” characters of the column tags, and would obtain this visible result:

            |2006   |2005
    --------|-------|---
    ASSETS  |       |
    Cash    |27     |85
Here is an example of how a balance sheet should appear in a filing (we call this well-formed or standard):

    ...
    <page>
    ...
    <TABLE>
    <CAPTION>
    CONSOLIDATED BALANCE SHEET
    IN MILLIONS OF DOLLARS, EXCEPT PER SHARE AMOUNTS
                            2006        2005
    ------------------------------------------
    <S>                     <C>         <C>
    ASSETS
    Cash                    27          85
    Investments             0           1,487
    ...
    Total Assets            9,456       9,154
    LIABILITIES
    Debts                   812         756
    Treasury stock          (1.2)       (.8)
    ...
    Total Liabilities       9,456       9,154
    </TABLE>
    notes to the consolidated balance sheet at p 16
    ...

This example illustrates a kind of canonical table in a traditional document. First, the document uses <page> tags to separate pages. Second, the table is enclosed in an opening and a closing <TABLE> tag. Third, the table uses <CAPTION> to identify its purpose. Fourth, it uses a sequence of <S> and <C> tags, which determines the column specification (<S> stands for stub and <C> stands for column). Finally, the table entries are placed correctly within the text, that is, they start at their respective horizontal positions. Filings often deviate from the example above in a number of points, and some tags are frequently omitted. This makes it difficult to generalize the parsing process.
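For a well-formed table, the column specification is enough to cut every row into cells. The sketch below assumes that each <S> or <C> tag marks the first character position of its column and that entries never start to the left of their tag. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch: slices fixed-width table rows using the <S>/<C> specification line. */
final class ColumnSlicer {

    private static final Pattern COLUMN_TAG =
            Pattern.compile("<[SC]>", Pattern.CASE_INSENSITIVE);

    /** Character offsets at which the columns start, read from the specification line. */
    static int[] columnStarts(String specLine) {
        List<Integer> starts = new ArrayList<>();
        Matcher tag = COLUMN_TAG.matcher(specLine);
        while (tag.find()) {
            starts.add(tag.start());
        }
        int[] result = new int[starts.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = starts.get(i);
        }
        return result;
    }

    /** Cuts one row at the column offsets; short rows yield empty trailing cells. */
    static List<String> slice(String row, int[] starts) {
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < starts.length; i++) {
            int from = Math.min(starts[i], row.length());
            int to = i + 1 < starts.length
                    ? Math.min(starts[i + 1], row.length())
                    : row.length();
            cells.add(row.substring(from, to).trim());
        }
        return cells;
    }

    public static void main(String[] args) {
        // String.format pads the cells so that the offsets are easy to verify.
        String spec = String.format("%-24s%-12s%s", "<S>", "<C>", "<C>");
        int[] starts = columnStarts(spec);
        System.out.println(slice(String.format("%-24s%-12s%s", "Cash", "27", "85"), starts));
        System.out.println(slice("ASSETS", starts));
    }
}
```

This only works for rows that respect the specification. The deviations mentioned above, such as omitted tags or shifted entries, are the reason why a fallback based on runs of white space is also needed.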
16.2.3 Embedded HTML

Embedded HTML documents are in no way special to SEC filings when compared with other real-world HTML content published on the web. They are enclosed in <HTML> tags and use <BODY>, <TABLE> and other common tags. These documents should be viewed with contemporary HTML browsers only. Figure 16.1 shows the beginning of a balance sheet.

    (Dollars in millions except per share amounts)
    AT DECEMBER 31:                                    NOTES (D, F, E, P)
    ASSETS
    Current assets
      Cash and cash equivalents
      Marketable securities
      Notes and accounts receivable-trade, net of allowances
      Short-term financing receivables
      Other accounts receivable
      Inventories
      Deferred taxes
      Prepaid expenses and other current assets
    Total current assets
    Plant, rental machines and other property

Figure 16.1. Excerpt of a balance sheet from a 10-K filing in HTML format
16.2.4 Financial Reports Overview

In order to parse 10-K and 10-Q filings and to detect balance sheet data, we have to analyze the basic structure of these filings. The balance sheet itself is found in the financial statements part. Financial reports consist of natural language text disclosing various aspects of a company, and of figures in tabular form. Annual financial reports, the submission of which is prescribed in form 10-K of the SEC, are sectioned as follows (EDGAR Filer Manual 2007):

Item 1. Business
Item 2. Properties
Item 3. Legal Proceedings
Item 4. Submission of Matters to a Vote of Security Holders
Item 5. Market for Registrant’s Common Equity and Related Stockholder Matters
Item 6. Selected Financial Data
Item 7. Management’s Discussion and Analysis of Financial Condition and Results of Operations
Item 8. Financial Statements and Supplementary Data
Item 9. Changes in and Disagreements With Accountants on Accounting and Financial Disclosure
Item 10. Directors and Executive Officers of the Registrant
Item 11. Executive Compensation
Item 12. Security Ownership of Certain Beneficial Owners and Management
Item 13. Certain Relationships and Related Transactions
Item 14. Principal Accountant Fees and Services
Quarterly updates, prescribed by form 10-Q, contain the following sections, not all of which must be present if they are not applicable (EDGAR Filer Manual 2007):

PART I: FINANCIAL INFORMATION
Item 1. Financial Statements
Item 2. Management’s Discussion and Analysis of Financial Condition and Results of Operations
Item 3. Quantitative and Qualitative Disclosures About Market Risk
Item 4. Controls and Procedures

PART II: OTHER INFORMATION
Item 1. Legal Proceedings
Item 2. Unregistered Sales of Equity Securities and Use of Proceeds
Item 3. Defaults Upon Senior Securities
Item 4. Submission of Matters to a Vote of Security Holders
Item 5. Other Information
Item 6. Exhibits
Financial statements in quarterly reports are typically condensed, offering a coarser granularity of financial positions compared to financial statements in annual reports. They are located in Part I, Item 1 in quarterly updates, and in Item 8 in annual reports.
16.3 Information Retrieval Within EASE

Information retrieval (IR) deals with the representation, storage, organization of, and access to information items. The representation and organization of the information items should provide users with easy access to the information in which they are interested. The three basic models in information retrieval are the Boolean, the vector, and the probabilistic model (Baeza-Yates and Ribeiro-Neto 1999). In the Boolean model, documents and queries are represented as sets of index terms. In the vector model, documents and queries are represented as vectors in an n-dimensional space. In the probabilistic model, the framework for modeling document and query representations is based on probability theory.

The vector model recognizes that the use of binary weights is too limiting and proposes a framework in which partial matching is possible. This is accomplished by assigning non-binary weights to index terms in queries and in documents. These term weights are ultimately used to compute the degree of similarity between each document stored in the system and the user query. By sorting the retrieved documents in decreasing order of this degree of similarity, the vector model also takes into consideration documents that match the query terms only partially. The main effect is that the ranked answer set matches the user’s information need considerably better than the answer set retrieved by the Boolean model.

Information extraction (IE) is a technology dedicated to the extraction of structured information from texts to fill pre-defined templates. Most contemporary work focuses on either machine learning of IE patterns or wrapper generation, cf. (Baeza-Yates and Ribeiro-Neto 1999), (Kogan; Nelson; Srivastava; Vasarhelyi; Lu 2000) and others, and is geared towards web search engines. We, in contrast, deal with a predefined structure for the information, namely the financial statements of US-GAAP compliant financial reports. The guidelines in the filer manual specify what information must be present, but the technical specification is comparatively weak.
16.3.1 Information Extraction from Traditional Filings

A filing is often not well-formed; for example, tags within a table are often omitted. For this reason, detecting a balance sheet with large regular expressions is not maintainable (Leinemann et al. 2001). A Boolean model may not be the best solution either. We therefore use a modified version of the vector space model, cf. (Salton 1988), to identify financial tables. We split a filing into segments, which we use as document equivalents. This is necessary because we are dealing with a single document whose parts need to be evaluated. The query is defined as the search for the balance sheet; the formal details are explained below. The aim of these modifications is to obtain a value that ranks the document segments by the likelihood that they contain a balance sheet (or another financial table).

First, three typical excerpts of a filing are presented in the example below. Then the classical vector space algorithm is applied (Salton 1988). Because this algorithm does not detect the balance sheet itself in this example, a refined vector space algorithm is presented afterwards. This refined algorithm is used in EASE.

16.3.1.1 Example Documents Used for Query

We will refer to the following three text excerpts from real-world filings in order to compare the results of the standard vector space model with those of our extended version. The second excerpt contains the balance sheet itself.

Document d1:
<page> The status of the pension plans follows.
Balance Sheet from December 31, 1998 is on page 12.
16.3.2 Standard Vector Space Model

The standard vector space model creates a weighted mapping of terms to documents so that the similarity of documents and search queries can be measured. First, we define a set T of n search terms

\[ T = \{ t_1, \dots, t_n \}, \]

each of which is a literal term like “balance sheet” or “assets” or a tag like <TABLE> or <page>. Document segments are defined as the set

\[ D = \{ d_1, \dots, d_m \}. \]

The query weight vector

\[ q = ( q_1, \dots, q_n )' \]

is a vector of query weights associated with the terms of T, and

\[ w_j = ( w_{j,1}, \dots, w_{j,n} )' \in \mathbb{R}^n \]

is a binary vector with components $w_{j,k} = 1$ if term $t_k$ exists in $d_j$, and 0 otherwise. The similarity measure $s_j$ for document $d_j$ is then defined as

\[ s_j = \sum_{k=1}^{n} w_{j,k} \, q_k . \]
Higher values of the similarity $s_j$ indicate a higher probability that a segment contains the balance sheet. The term weights used in this model were chosen on the basis of observations from dozens of tests (see Table 16.1). Using the search terms T = {<page>, <TABLE>, Balance Sheet, Assets} and the query weights q = (10, 15, 55, 20), we obtain the following similarity measures for the documents d1, d2 and d3, whose binary weights are listed in Table 16.2:

s1 = 1·10 + 1·15 + 0·55 + 1·20 = 45 for d1,
s2 = 1·10 + 1·15 + 1·55 + 0·20 = 80 for d2,
s3 = 0·10 + 1·15 + 1·55 + 1·20 = 90 for d3.

The highest similarity measure, s3, identifies document d3. But this document does not contain the balance sheet, which is in document d2. Hence this similarity measure does not necessarily lead to a successful detection of the balance sheet.

Table 16.1. Defined query weights for detection of a balance sheet

Query   Term            Weight
Q       <page>          10
        <TABLE>         15
        balance sheet   55
        assets          20

Table 16.2. Result for the search query of documents d1, d2 and d3

Document segment   Term            Weight
D1                 <page>          1
                   <TABLE>         1
                   balance sheet   0
                   assets          1
D2                 <page>          1
                   <TABLE>         1
                   balance sheet   1
                   assets          0
D3                 <page>          0
                   <TABLE>         1
                   balance sheet   1
                   assets          1
In this example the standard vector space model does not assign the highest similarity measure to the segment that contains the balance sheet. This result motivates a new, extended version of the vector space model. The main idea is that the order in which the individual search terms occur matters.
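The arithmetic of the standard model is easy to check in code. The following sketch recomputes the three similarity measures from the query weights q = (10, 15, 55, 20) and the binary vectors used in the calculation above. The second search term is assumed to be the <TABLE> tag, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Illustrative sketch of the standard vector space similarity with binary occurrence weights. */
final class StandardVectorSpace {

    /** Binary vector: component k is 1 if term k occurs in the segment (case-insensitive). */
    static int[] occurrence(String segment, String[] terms) {
        String haystack = segment.toLowerCase(Locale.ROOT);
        int[] w = new int[terms.length];
        for (int k = 0; k < terms.length; k++) {
            w[k] = haystack.contains(terms[k].toLowerCase(Locale.ROOT)) ? 1 : 0;
        }
        return w;
    }

    /** s_j = sum over k of w_{j,k} * q_k. */
    static int similarity(int[] w, int[] q) {
        int s = 0;
        for (int k = 0; k < q.length; k++) {
            s += w[k] * q[k];
        }
        return s;
    }

    public static void main(String[] args) {
        int[] q = {10, 15, 55, 20};      // <page>, <TABLE> (assumed), balance sheet, assets
        int[][] w = {
            {1, 1, 0, 1},                // d1
            {1, 1, 1, 0},                // d2, the segment with the balance sheet
            {0, 1, 1, 1},                // d3
        };
        for (int j = 0; j < w.length; j++) {
            System.out.println("s" + (j + 1) + " = " + similarity(w[j], q));
        }
        // prints s1 = 45, s2 = 80, s3 = 90
    }
}
```

Because the measure ignores where the terms occur, d3 wins merely by containing three of the four terms, which is the weakness the extended model addresses.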
16.3.3 Extended Vector Space Model

To take the order of the search terms into account, we replace the vectors of the equations above with matrices whose components reflect the order of the terms. This leads to the following modification: let Q be an n×n term-term matrix

\[ Q = \begin{pmatrix} q_{1,1} & \cdots & q_{1,n} \\ \vdots & \ddots & \vdots \\ q_{n,1} & \cdots & q_{n,n} \end{pmatrix}, \]

where $q_{i,k}$ denotes the weight of term $t_i$ being followed by term $t_k$. $W_j$ denotes the matrix of occurrences in document $d_j$,

\[ W_j = \begin{pmatrix} w_{1,1} & \cdots & w_{1,n} \\ \vdots & \ddots & \vdots \\ w_{n,1} & \cdots & w_{n,n} \end{pmatrix}, \]

where $w_{i,k}$ equals 1 if term $t_i$ precedes $t_k$ in $d_j$, and 0 otherwise. As similarity measure for $d_j$ we use the sum of weighted occurrences, i. e.

\[ s_j = \sum_{i=1}^{n} \sum_{k=1}^{n} w_{j,i,k} \, q_{i,k}, \]

where $w_{j,i,k}$ is the entry $w_{i,k}$ of $W_j$.
A high value of the similarity measure again indicates a high likelihood that the segment contains a balance sheet. We apply the modified model to the document segments above. In Table 16.3 the query weight of 75 stands for “Balance Sheet is followed by Assets”.

Table 16.3. Query weights for the extended vector space model

Q               <page>   <TABLE>   Balance Sheet   Assets
<page>          0        25        65              30
<TABLE>         0        0         70              35
Balance Sheet   0        70        0               75
Assets          0        25        65              30
Table 16.4. Example for term and order weights for both queries and segments

d1              <page>   <TABLE>   Balance Sheet   Assets
<page>          0        1         0               0
<TABLE>         0        0         0               1
Balance Sheet   0        0         0               0
Assets          0        0         0               0

d2              <page>   <TABLE>   Balance Sheet   Assets
<page>          0        1         0               0
<TABLE>         0        0         1               0
Balance Sheet   0        0         0               0
Assets          0        0         0               0

d3              <page>   <TABLE>   Balance Sheet   Assets
<page>          0        0         0               0
<TABLE>         0        0         1               0
Balance Sheet   0        0         0               0
Assets          0        0         1               0
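The extended similarity measure can be recomputed in a few lines. The sketch below is an illustration under two assumptions: the second search term is the <TABLE> tag, and “precedes” means that two search terms occur directly after one another in the segment, which is what the occurrence matrices of d1 and d2 suggest. Only these two segments are reproduced. The class and method names are hypothetical.

```java
/** Illustrative sketch of the order-aware similarity measure; not the EASE implementation. */
final class ExtendedVectorSpace {

    // Term indices: 0 = <page>, 1 = <TABLE> (assumed), 2 = Balance Sheet, 3 = Assets.
    // Q[i][k] is the weight of term i being followed by term k, as read from Table 16.3.
    static final int[][] Q = {
        {0, 25, 65, 30},
        {0,  0, 70, 35},
        {0, 70,  0, 75},
        {0, 25, 65, 30},   // row "Assets": not exercised by d1 and d2 below
    };

    /** Builds W_j from the order in which the search terms occur in a segment. */
    static int[][] occurrenceMatrix(int[] termSequence, int n) {
        int[][] w = new int[n][n];
        for (int p = 0; p + 1 < termSequence.length; p++) {
            w[termSequence[p]][termSequence[p + 1]] = 1;   // term p is followed by term p+1
        }
        return w;
    }

    /** s_j = sum over i and k of w_{j,i,k} * q_{i,k}. */
    static int similarity(int[][] w, int[][] q) {
        int s = 0;
        for (int i = 0; i < q.length; i++) {
            for (int k = 0; k < q[i].length; k++) {
                s += w[i][k] * q[i][k];
            }
        }
        return s;
    }

    public static void main(String[] args) {
        int[][] w1 = occurrenceMatrix(new int[] {0, 1, 3}, 4);   // d1: <page>, <TABLE>, Assets
        int[][] w2 = occurrenceMatrix(new int[] {0, 1, 2}, 4);   // d2: <page>, <TABLE>, Balance Sheet
        System.out.println("s1 = " + similarity(w1, Q));         // 25 + 35 = 60
        System.out.println("s2 = " + similarity(w2, Q));         // 25 + 70 = 95
    }
}
```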
We obtain a similarity measure of s1 = 60 for segment d1, s2 = 95 for segment d2, and s3 = 70 for segment d3, so the extended model ranks d2, the segment that contains the balance sheet, highest.

The extraction algorithm now tries to capture individual items by their names and splits rows into columns based on the column specification, the number of consecutive separating white space characters, and the appearance of isolated numbers. First we identify the relevant table and determine where it starts. Next we extract the multiplier and the currency of the numbers. After that we extract the data for the individual item labels. The details of the extraction are beyond the scope of this chapter.
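Although the details are out of scope, the basic steps can be illustrated. The sketch below is one possible reading, not the EASE algorithm: it treats two or more consecutive white space characters as a column separator, reads parenthesized figures as negative numbers, and derives the multiplier from phrases such as "IN MILLIONS". All names and rules in it are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch of row splitting and number parsing for plain text tables. */
final class RowParser {

    private static final Pattern MULTIPLIER = Pattern.compile(
            "\\bIN\\s+(THOUSANDS|MILLIONS|BILLIONS)\\b", Pattern.CASE_INSENSITIVE);

    /** Multiplier announced in a table caption, or 1 if none is found. */
    static long multiplier(String caption) {
        Matcher m = MULTIPLIER.matcher(caption);
        if (!m.find()) {
            return 1L;
        }
        switch (m.group(1).toUpperCase(Locale.ROOT)) {
            case "THOUSANDS": return 1_000L;
            case "MILLIONS":  return 1_000_000L;
            default:          return 1_000_000_000L;
        }
    }

    /** Splits a row at runs of two or more white space characters. */
    static List<String> cells(String row) {
        return Arrays.asList(row.trim().split("\\s{2,}"));
    }

    /** Parses "1,487", ".8" or "(1.2)"; parentheses denote a negative amount. */
    static double number(String cell) {
        String s = cell.trim();
        boolean negative = s.startsWith("(") && s.endsWith(")");
        if (negative) {
            s = s.substring(1, s.length() - 1);
        }
        double value = Double.parseDouble(s.replace(",", ""));
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        long factor = multiplier("IN MILLIONS OF DOLLARS, EXCEPT PER SHARE AMOUNTS");
        List<String> row = cells("Investments             0           1,487");
        System.out.println(row.get(0) + ": " + number(row.get(2)) * factor);
    }
}
```

Labels such as "Treasury stock" survive because single spaces are not treated as separators. Rows in which columns are separated by only one space would need the column specification or the isolated-number heuristic mentioned above.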
16.3.4 Extraction of HTML Documents

Since SEC filings consist of HTML parts and other predefined tags mixed with plain text, a filing does not conform to the XML specification, and the extraction from HTML-flavored documents requires absolute HTML paths that point to the data item to be extracted. The extraction process for HTML-flavored documents is twofold. In the first stage, a parser builds a document tree with an HTML DOM root node for each embedded HTML document. In the second stage, the extractor uses this structure to locate financial statements and extract them into the target object.

The input for the document builder is the raw character sequence containing the entire document. The output is an object that contains a reference to the raw text and a root node, which represents the root of the document hierarchy. As stated before, a filing consists of multiple parts with varying syntax. The main parser first splits the document into a header and a number of documents and delegates the actual parsing to specialized parsers, a header parser and a document parser, respectively. The header and each document are added as nodes to the root node.

Splitting the entire document into its constituent parts is realized with simple regular expressions that search for the respective opening and closing tags of <SEC-HEADER> and <DOCUMENT> in a single call. The header can be split into lines, and each line can be split into a property name and its value simply by finding the first occurrence of a colon. Each property is added as a single node to the document tree. The document parser uses a single regular expression to split the document contents into the leading property values and the final text element. The former are attached as properties to the document node, and the text node is added as the only child of the document node. The content of this text element is then passed on to an implementation of an HTML parser, which adds the document as a child named <HTML> to the text node.

The resulting document tree can be viewed with a special end-user application implemented for demonstration purposes (see the screenshot in Figure 16.2). Navigation through the entire document is possible using the methods that the node implementation provides, among them searching by content and/or attribute values, and addressing nodes using path expressions. The address of the “filer”, for example, would be “/SEC-DOCUMENT/SEC-HEADER/filer”. Moreover, the entire document can be processed in a consistent way regardless of the input type.

Figure 16.2. Screen of a DOM tree

The main challenge for the HTML parser stems from the fact that most real-world HTML documents do not follow the W3C recommendations strictly. In contrast to XML, which enforces strict adherence to the standard, HTML documents have always been quite sloppy in this respect. Web browsers have been very tolerant of incorrect syntax since their invention, which led to the proliferation of HTML editors that produce code not in compliance with the recommendations.

From a theoretical point of view, XML is described by a type-2 grammar in Chomsky’s hierarchy. As such, it requires a finite state engine with a stack and thus exceeds the capabilities that pure regular expressions provide. HTML is a proper application of XML or SGML only when it complies perfectly with the HTML recommendations. In practice, most HTML documents violate the recommendations frequently. Capturing the document tree reliably despite potential violations almost certainly requires a Turing-complete engine. And in fact, writing such a parser proved to be challenging.

The basis for the HTML parser is a skeletal XML parser with special precautions for various expected and unexpected violations. The parser contains two regular expressions. The first is used in the parser’s main loop to find comments, opening tags, closing tags, and character data tags. The second is used to parse attributes in opening tags. The parser keeps a reference to the node that has been opened most recently. Every time an opening tag is detected, a new node is created, which becomes the current top node. When a closing tag is detected, the current top node is closed, and its parent is made the current top node.
The reference to the current top node plays the role of the stack required for type-2 parsers. The current top node is the top element of the stack. Creating a new node and making it the current top node corresponds to the “push” operation of a stack, and making the parent of the current top node the new top node corresponds to the “pop” operation.

As stated before, this plain approach must fail for real-world HTML documents and HTML-based filings. First, broken tags (opening tags without a closing counterpart) must not become the top node. Second, closing tags are omitted in some cases; an opening tag then forces previously found opening tags to be closed. Third, a closing tag may be misplaced at some later position in the document, in which case it is ignored.

The extraction process is similar to the concept of XPath, which describes how to address a node in an XML document. For example, in order to find out the type of a filing (10-K or 10-Q), the extractor searches for the header node and, within it, for the node named submission type. The following code illustrates this:

    Node root; // given: root node of the parsed filing
    Node header = root.find("SEC-HEADER");
    Node submissionType = header.find("CONFORMED SUBMISSION TYPE");
    String typeName = submissionType.getText();
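The three precautions for the stack-based parser described earlier in this section can be made concrete with a small tree builder. This is a minimal sketch under simplifying assumptions, not the EASE parser: it ignores attributes, comments and character data, treats a fixed set of tags as never having a closing counterpart, and uses a small hand-written table of tags whose opening implicitly closes others. All names and both tables are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch of a tolerant, stack-like tag tree builder. */
final class TolerantTagParser {

    static final class Node {
        final String name;
        final Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }

        /** Compact rendering such as ROOT(TABLE(TR(TD))). */
        String outline() {
            if (children.isEmpty()) {
                return name;
            }
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(children.get(i).outline());
            }
            return sb.append(')').toString();
        }
    }

    private static final Pattern TAG = Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");
    // Assumed list of tags that never have a closing counterpart.
    private static final Set<String> NEVER_CLOSED = new HashSet<>(Arrays.asList("BR", "HR", "IMG"));
    // Assumed table: opening the key tag closes any of the listed tags that are still open.
    private static final Map<String, Set<String>> IMPLIED_CLOSE = new HashMap<>();
    static {
        IMPLIED_CLOSE.put("TD", new HashSet<>(Arrays.asList("TD")));
        IMPLIED_CLOSE.put("TR", new HashSet<>(Arrays.asList("TD", "TR")));
        IMPLIED_CLOSE.put("P", new HashSet<>(Arrays.asList("P")));
    }

    static Node parse(String html) {
        Node root = new Node("ROOT", null);
        Node top = root;                                   // plays the role of the stack
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toUpperCase();
            if (m.group(1).isEmpty()) {                    // opening tag
                Set<String> closes = IMPLIED_CLOSE.getOrDefault(name, Collections.emptySet());
                while (top != root && closes.contains(top.name)) {
                    top = top.parent;                      // omitted closing tag: pop implicitly
                }
                Node node = new Node(name, top);
                if (!NEVER_CLOSED.contains(name)) {
                    top = node;                            // push
                }
            } else {                                       // closing tag
                Node match = top;
                while (match != null && !match.name.equals(name)) {
                    match = match.parent;
                }
                if (match != null && match != root) {
                    top = match.parent;                    // pop up to the matching node
                }                                          // otherwise misplaced: ignore
            }
        }
        return root;
    }

    public static void main(String[] args) {
        String html = "<TABLE><TR><TD>Cash<TD>27<BR><TR><TD>Total</B></TABLE><P>note";
        System.out.println(parse(html).outline());
        // prints ROOT(TABLE(TR(TD,TD(BR)),TR(TD)),P)
    }
}
```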
The path and search expressions vary from company to company, although there is a set of default or standard methods for identifying a certain node. The extractor looks into its configuration database to see whether it finds a specialized path for a given item and company, and falls back to a generic implementation if none is available. This procedure makes it possible to extend the range of covered companies simply by adding specialized entries for a particular company to the configuration database. Our implementation differs from XPath in that it supports regular expressions. Using XPath and regular expressions, a balance sheet can be detected with the following expressions:

    XPath: SEC-DOCUMENT/DOCUMENT
    XPath: /TEXT/HTML/BODY//TABLE[count(TR)>7]/TR/TD
    Regex: .*liabilities.*equity.*
    XPath: ../..
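The effect of these expressions can be imitated on a simplified table model. The sketch below assumes that every HTML table has already been reduced to a list of rows of cell texts, that the keyword test is case-insensitive, and that the configuration database can be pictured as a map keyed by company and item. None of this is taken from the EASE code, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Illustrative sketch of balance sheet detection and of the configuration fallback. */
final class BalanceSheetLocator {

    private static final Pattern KEYWORDS =
            Pattern.compile(".*liabilities.*equity.*", Pattern.CASE_INSENSITIVE);

    /** Index of the first table with more than seven rows and a matching cell, or -1. */
    static int find(List<List<List<String>>> tables) {
        for (int t = 0; t < tables.size(); t++) {
            List<List<String>> rows = tables.get(t);
            if (rows.size() <= 7) {
                continue;                                  // mirrors TABLE[count(TR)>7]
            }
            for (List<String> row : rows) {
                for (String cell : row) {
                    if (KEYWORDS.matcher(cell).matches()) {
                        return t;
                    }
                }
            }
        }
        return -1;
    }

    /** Company-specific path if one is configured, the generic path otherwise. */
    static String pathFor(Map<String, String> config, String cik, String item, String genericPath) {
        return config.getOrDefault(cik + "/" + item, genericPath);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("0123456789/balance sheet", "/TEXT/HTML/BODY//TABLE[3]");   // made-up entry
        System.out.println(pathFor(config, "0123456789", "balance sheet", "generic"));
        System.out.println(pathFor(config, "0000000001", "balance sheet", "generic"));
        System.out.println(find(Arrays.asList(
                Arrays.asList(Arrays.asList("Total liabilities and equity", "10")))));
    }
}
```

In this reading the row-count condition filters out small tables that merely mention the keywords, such as summary or footnote tables.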
16.4 The Role of XBRL

The most important effort in standardizing the exchange of financial information is the Extensible Business Reporting Language (Extensible Business Reporting Language XBRL 2007), which comprises a set of XML schemas, describing how to present items, and taxonomies, describing which items make up a certain type of financial report. The standard is governed by a not-for-profit international consortium (XBRL International Incorporated) of more than 300 organizations, including regulators, government agencies, intermediaries and software vendors.

Registrants may choose to submit documents in XBRL format voluntarily. However, only documents submitted to the EDGAR system in either plain text or HTML are official filings; all XBRL documents are unofficial submissions and are considered furnished, not filed. XBRL is being rapidly adopted by a wide range of regulators to replace both paper-based and legacy electronic financial data collection. It is used for the disclosure of financial performance information by companies, notably in voluntary SEC filings.

XBRL is somewhat different from many other XML standards published by consortia. Generally, when two parties wish to exchange information using XML, they need to have prior access to all of the element definitions. Accounting, however, involves the dissemination of information, the vast majority of which conforms to reporting norms, but some of which is unique to the circumstances of a particular company.
16.5 Empirical Results

The basis for the following observations was a sample of 10-K and 10-Q filings of the 30 Dow Jones Industrial Average companies. The performance benchmark was the former EDGAR2XML implementation, which needed about two minutes to extract financial data from comparable filings. This package utilized the GNU regular expression implementation.

Table 16.5. Overview of detected balance sheets within 10-Q filings

Columns: Company, Detected Balance Sheets, Number of Filings

Companies: 3M Co.; Alcoa Inc.; Altria Group Inc.; American Express Co.; AT&T Corp.; Boeing Co.; Caterpillar Inc.; Citigroup Inc.; Coca-Cola Co.; E.I. DuPont de Nemours & Co.; Eastman Kodak Co.; Exxon Mobil Corp.; General Electric Co.; General Motors Corp.; Hewlett-Packard Co.; Home Depot Inc.; Honeywell International Inc.; IBM Corp.; Intel Corp.; International Paper Co.; J.P. Morgan Chase Co.; Johnson & Johnson; McDonald’s Corp.; Merck & Co. Inc.; Microsoft Corp.; Procter & Gamble Co.; SBC Communications Inc.; United Technologies Corp.; Wal-Mart Stores Inc.; Walt Disney Co.; Total

The DOM parser was able to create the DOM instance for all filings in the sample that contained embedded HTML. Parsing took only about 1.5 seconds for an average 1 MB filing on a 1.5 GHz single-processor machine with 512 MB of RAM. Navigation within the document tree proved to be very fast.

The plain text parser recognized 598 of the 615 text-based balance sheets in the 10-Q filings of the sample (see Table 16.5), and it detected the balance sheet data correctly in all of the 206 10-K filings. Processing took about 0.5 seconds per filing with an average size of 400 kB.

The reasons for not detecting some balance sheets in 10-Q filings correctly are deviations from the well-defined structure. The filing for CIK 732717 in 1996 contained no tags at all. The consolidated balance sheets for CIK 40730 included those of subsidiaries, which confused the EASE agent. The filing for CIK 40545 for the year 2000 contained a mix of HTML and plain text formatting, which the agent cannot deal with. For CIK 18230 in 1997, the agent found patterns that were too similar in other segments of the filing. In order to detect financial items in an extracted balance sheet, we worked with synonym lists (see Appendix).
16.6 Conclusion

The focus of this chapter is the recognition of balance sheets within 10-K and 10-Q filings. The major challenge in extracting balance sheet data is to deal with varying data formats. The EDGAR Filer Manual describes the conceptual structure of 10-K and 10-Q filings for filers, but not for data format specialists. As a consequence, tags are omitted and the underlying data format may change. Using only regular expressions to detect the relevant financial data is not maintainable, cf. the former EDGAR2XML project (Leinemann; Schlottmann; Seese; Stümpert 2001). A more maintainable approach was a mixture of the vector space algorithm (Salton 1988) and regular expressions. Unfortunately, this approach did not necessarily detect the balance sheet itself, i. e. other document segments of a 10-K or 10-Q filing appeared to have higher similarity measures than the balance sheet. This led to the idea of extending the vector space algorithm by the order of pairs of search terms, e. g. Balance Sheet is followed by Assets. The first results from a sample of the companies within the Dow Jones Industrial Average index are very promising, i. e. in this sample we detected all balance sheets of the 10-K filings.

The filing format is either ASCII-based or HTML-based. An HTML-based filing can be transformed into the ASCII-based format, and the extended vector space model can then be applied. This transformation was first used by EdgarScan. Another approach is to deal with HTML-based documents directly, e. g. to segment an HTML-enriched filing and create a DOM representation of it. Segmentation uses structural information, e. g. a <page> tag starts a new DOM entry. The special feature of this approach is not the DOM representation itself, but the navigation within the DOM tree using both XPath and regular expressions, e. g. detecting a balance sheet by finding all tables with more than seven rows that contain the keywords liabilities and equity, as described in the section on the extraction of HTML-based documents.

In summary, both the ASCII-based extension of Salton’s vector space algorithm and the HTML-based navigation within the DOM representation of filings are new and lead to a robust and easy detection of balance sheets even when structural information is omitted.
Acknowledgements

The author thanks his students Özkan Cetinkaya and Ralf Spöth for technical support. Furthermore, he thanks Frank Schlottmann and Detlef Seese for their ideas and work on the former EDGAR2XML project, which contributed the basic ideas to the EASE project.
Appendix: Synonym List for Balance Sheet Data

current assets=total current assets
financial investments=financing assets
financial investments=financial investments
financial investments=investment in affiliates
financial investments=investments
financial investments=investments and long.term receivables
financial investments=investments in equity affiliates
financial investments=total financial services assets
financial investments=total loans
fixed assets=fixed assets
fixed assets=land.? buildings.? and equipment
fixed assets=land.? buildings.? machinery.? and equipment
fixed assets=net properties
fixed assets=parks.? resorts.? and other property
fixed assets=plant.? rental machines.? and other property..net
fixed assets=plants.? properties.? and equipment
fixed assets=properties.? plants.? and equipment
fixed assets=property and equipment
fixed assets=property.? plant.? and equipment
intangible assets=intangible assets
total assets=total assets
current liabilities=total current liabilities
current liabilities=total deposits
debt=long.term debt
debt=long.term liabilities
debt=long.term borrowings
other liabilities=other liabilities
other liabilities=total deferred credits and other noncurrent liabilities
#equity=total (common)?shareowners. equity
equity=total (common)?stock.?holders.? equity
equity=total (common)?share.?holders.? equity
equity=total (common)?share.?owners.? equity
total liabilities=total liabilities \\s*(and|&)\\s*(common)?\\s*stock.?holder.?s.?\\s*equity
total liabilities=total liabilities \\s*(and|&)\\s*(common)?\\s*share.?owner.?s.?\\s*equity
total liabilities=total liabilities \\s*(and|&)\\s*(common)?\\s*share.?holder.?s.?\\s*equity
total liabilities=total liabilities\\s*(and|&)\\s*(common)?\\s*equity
total liabilities=total.liabilities ..preferred.stock.of.subsidiary.and.stockholders..equity
total liabilities=total liabilities and
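Each entry of the list pairs a normalized item name with a regular expression for a label as it may appear in a balance sheet. The following sketch shows one way such a list could be applied. It assumes that matching is case-insensitive and covers the whole label, that lines starting with "#" are comments, and that the first matching entry wins. The class is hypothetical, and entries written with doubled backslashes would have to be unescaped first, which the sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch: maps balance sheet labels to normalized item names via a synonym list. */
final class SynonymNormalizer {

    private static final class Rule {
        final String canonical;
        final Pattern pattern;

        Rule(String canonical, Pattern pattern) {
            this.canonical = canonical;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Reads lines of the form "normalized name=regular expression". */
    static SynonymNormalizer fromLines(List<String> lines) {
        SynonymNormalizer normalizer = new SynonymNormalizer();
        for (String line : lines) {
            String entry = line.trim();
            int eq = entry.indexOf('=');
            if (entry.isEmpty() || entry.startsWith("#") || eq <= 0) {
                continue;                                  // blank line, comment or malformed entry
            }
            normalizer.rules.add(new Rule(
                    entry.substring(0, eq).trim(),
                    Pattern.compile(entry.substring(eq + 1).trim(), Pattern.CASE_INSENSITIVE)));
        }
        return normalizer;
    }

    /** Normalized name of the first rule whose pattern matches the whole label, or null. */
    String normalize(String label) {
        String candidate = label.trim();
        for (Rule rule : rules) {
            if (rule.pattern.matcher(candidate).matches()) {
                return rule.canonical;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SynonymNormalizer normalizer = fromLines(Arrays.asList(
                "fixed assets=property.? plant.? and equipment",
                "debt=long.term debt",
                "equity=total (common)?stock.?holders.? equity"));
        System.out.println(normalizer.normalize("Property, plant and equipment"));
        System.out.println(normalizer.normalize("Total stockholders' equity"));
    }
}
```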
References

10 k Wizard: SEC filings (2007) url: http://www.tenkwizard.com
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison Wesley
EDGAR Filer Manual (2007) url: http://www.sec.gov/info/edgar/filermanual.htm
EDGAR Online (2007) url: http://www.edgar-online.com
EdgarScan (2007) url: http://edgarscan.pwcglobal.com
Extensible Business Reporting Language XBRL (2007) url: http://www.xbrl.org
Friedl JEF (2002) Mastering regular expressions. O’Reilly, Beijing Cambridge Farnham Köln Paris Sebastopol Taipei Tokyo
Financial Reporting and Auditing Agent with Net Knowledge FRAANK (2007) url: http://fraank.eycarat.ku.edu/
FreeEDGAR (2007) url: http://www.freeedgar.com
Kogan A, Nelson K, Srivastava R, Vasarhelyi M, Lu H (2000) Virtual auditing agents: The EDGAR agent challenge. Decision Support Systems 28:241−253
Leinemann C, Schlottmann F, Seese D, Stümpert T (2001) Automatic extraction and analysis of financial data from the EDGAR database. South African Journal of Information Management 3
Salton G (1988) Automatic text processing: The transformation, analysis, and retrieval of information by computer. Addison Wesley
Stümpert T, Seese D, Cetinkaya Ö, Spöth R (2004) EASE – a software agent that extracts financial data from the SEC’s EDGAR database. Proceedings of the Fourth International ICSC Symposium on Engineering of Intelligent Systems EIS’2004, University of Madeira, Funchal, Portugal, February 29th – March 2, 2004, http://www.icsc-naiso.org, pp. 128; full paper in the electronic proceedings, pp. 1–7
U.S. Securities and Exchange Commission SEC (2007) url: http://www.sec.gov
PART II IT Methods in Finance
Introduction to Part II
This part contains 12 chapters and is devoted to IT methods in finance. There are essentially two ways where IT enters and influences methods used in finance. The first are of course methods which originate in IT and informatics, which are just applied directly or in a suitable adapted form. The second are methods which originate in finance, e. g. mathematical finance, but which need to their efficient implementation a suitable use of IT. This part includes both aspects and it adds as a third the investigation of problems from finance in computer science. Nagurney gives a survey on networks in finance in Chapter 17. The chapter traces the history of networks in finance and overviews a spectrum of relevant methodologies for their formulation, analysis, and solution ranging from optimization techniques to variational inequalities and projected dynamical systems. Numerical examples are provided for illustration purposes. In Chapter 18 van Dinther discusses the role of agent-based simulation for research in economics. Agent-based simulations have produced a variety of interesting contributions to economic research in the past years. Nevertheless, the methodology of economic simulations in general and agent-based simulations in particular is still in question. This is due to the fact that simulations are conducted without considering the underlying theory and links to standard economic approaches. This contribution discusses the use of simulations in economics and reviews the four most common agent-based approaches and their link to economic theory. Agents are also the central theme of the next chapter by Dermietzel. He gives a survey on developments and milestones on the heterogeneous agents approach to financial markets. New models of financial markets have been developed over the last 25 years, which are able to explain empirical phenomena observed in real markets. 
In contrast to classical financial theory a key feature of these models is the heterogeneity of traders. The models developed from very stylized models in the 1980s to complex dynamical models with realistic trading behaviour. Today these models are used to reproduce and explain the dynamics of financial markets and to investigate improvements and regulations of these sensitive economic mechanisms. This chapter discusses the discrepancies between financial theory
and real market data and provides an overview of important financial market models with heterogeneous traders ranging from 1980 to 2007. Chiarella and He develop in Chapter 20 an adaptive model of asset price and wealth dynamics in a financial market with heterogeneous agents and examine the profitability of momentum and contrarian trading strategies. In order to characterise asset prices, wealth dynamics and rational adaptiveness arising from the interaction of heterogeneous agents with constant relative risk aversion (CRRA) utility, an adaptive discrete time equilibrium model in terms of return and wealth proportions (among heterogeneous representative agents) is established. Taking trend followers and contrarians as the main heterogeneous agents in the model, the profitability of momentum and contrarian trading strategies is analysed. The authors show the capability of the model to characterize some of the existing evidence on many of the anomalies observed in financial markets, including the profitability of momentum trading strategies over short time intervals and of contrarian trading strategies over long time intervals, rational adaptiveness of agents, overconfidence and underreaction, overreaction and herd behaviour, excess volatility, and volatility clustering. A survey on simulation methods for stochastic differential equations is given in Chapter 21 by Platen. Stochastic differential equations are becoming the modelling framework in finance, insurance, economics and many other areas of social sciences. Only in rare cases does one have explicit solutions. Furthermore, the modelling of market segments or entire markets often involves a large number of factor processes where many quantitative methods fail to work. For these reasons simulation methods have become essential tools in quantitative finance. This chapter provides a brief introduction to the area of simulation methods for stochastic differential equations with a focus on finance.
The foundations of option pricing are important to many topics in finance. They are discussed by Buchen in Chapter 22. Modern approaches to pricing financial options and derivatives are concerned with the notion of no arbitrage – i. e. the concept “there are no free lunches in efficient markets”. The chapter explores the two principal methods of arbitrage-free pricing and shows how they can be applied to price a wide range of example contracts. The two methods alluded to are: the PDE (partial differential equation) method of Black and Scholes, and the EMM (equivalent martingale measure) method of Harrison and Pliska. Both methods have their origins in the stochastic calculus and ultimately, under certain restrictions, are equivalent, as indeed they must be if the ‘Law of One Price’ is to hold. When used together with other analytical tools such as the Gaussian Shift Theorem, Principle of Static Replication and Parity Relations, one is able to price a range of exotic options with both path independent and path dependent payoffs. Sun, Rachev and Fabozzi provide a survey on long-range dependence, fractal processes, and intra-daily data in Chapter 23. With the availability of intra-daily price data, researchers have focused more attention on market microstructure issues to understand and help formulate strategies for the timing of trades. The purpose of this chapter is to provide a brief survey of the research employing intra-daily price data. Specifically, the chapter reviews stylized facts of intra-daily data, econometric issues of data analysis, application of intra-daily data in volatility and
liquidity research, and the applications to market microstructure theory. Long-range dependence is observed in intra-daily data. Because fractal processes or fractional integrated models are usually used to model long-range dependence, we also provide a review of fractal processes and long-range dependence in order to consider them in future research using intra-daily data. Bayesian applications to the investment management process are the central theme of Chapter 24 by Bagasheva, Rachev, Hsu and Fabozzi. The usual investment management practice is to assume that the parameter values estimated from the available data sample are the population parameter values. Failing to take into account the errors intrinsic in sample estimates, however, could lead to suboptimal asset allocations. Bayesian methods provide the theoretical framework and tools to account for estimation risk. Moreover, they allow a researcher or a practitioner to incorporate their subjective views about the model parameters into the asset allocation process. In this chapter, an overview of some aspects of the investment management process within a Bayesian setting is presented. The chapter highlights the advantages that this setting provides over the classical (frequentist) estimation framework. Racheva-Iotova and Stoyanov discuss a unified framework for constructing optimal portfolio strategies in Chapter 25. It is based on an optimization problem involving the CVaR risk measure and an appropriately selected multivariate model for stock returns. The authors provide an empirical example using the Russell 2000 universe. In Chapter 26 Gilli, Maringer and Winker give a survey on applications of heuristics in finance. Optimisation is crucial in many areas of finance, but often quite demanding because the type of objective functions and the constraints that have to be considered cannot be handled with traditional optimization techniques. A rather novel group of methods are heuristics.
These methods are less restricted in their applicability, and they perform search and optimisation in a non-deterministic fashion by repeatedly updating one or a whole set of candidate solutions and by incorporating stochastic elements. Frequently, these methods are inspired by principles found in nature, e. g., evolution, they are typically not tailored for a certain type of problem, and they are flexible enough to be adapted to complex optimization problems. This chapter presents some of the most popular of these methods and demonstrates how they can be applied to financial problems including portfolio optimization and model selection. These methods are found not only to work reliably, but also to allow a better analysis of complex problems that could not be addressed with traditional methods. Another class of heuristic approaches to solve several complex problems in finance stems from the area of machine learning. They are surveyed by Chalup and Mitschele in Chapter 27 on kernel methods. Kernel methods are a class of powerful machine learning algorithms which are able to solve non-linear tasks. The chapter presents a concise overview of a selection of relevant machine learning methods and a survey of applications to show how kernel methods have been applied in finance. The overview of learning concepts addresses methods for dimensionality reduction, regression, and classification. The concept of kernelisation which can be used in order to transform classical linear machine learning methods into non-linear kernel methods is emphasised. The survey of applications of kernel
methods in finance covers the areas of credit risk management and market risk management, and discusses possible future application fields. It concludes with a brief overview of relevant software toolboxes. In recent years, several related results have been developed showing that special problems from finance are provably complex, i. e. their complexity is so high that one is in need of heuristic methods to approach solutions. Some of these are results on the complexity of decision problems in foreign exchange markets. These problems are discussed in Chapter 28 by Cai and Deng. They study the computational complexity problem of finding arbitrage in foreign exchange markets, taking market frictions into consideration. The problem is approached through several steps, refining the computational complexity issue with respect to arbitrage. First, the computational complexity of detecting the existence of an arbitrage opportunity in a general market model is investigated. This problem has a negative solution, since it turns out to be NP-complete to identify arbitrage. Secondly, market models are described for which polynomial time algorithms exist to locate an arbitrage opportunity. It is shown that two important market structure models belong to this category. Thirdly, the impact of futures on the complexity of finding arbitrage is explored. Finally, the minimal number of necessary transactions to bring the exchange system back to some non-arbitrage states is studied. The results show that different exchange systems exhibit different computational complexities for finding arbitrage. These complexity results help to shed new light on how monetary system models are adopted and evolve in reality.
CHAPTER 17 Networks in Finance Anna Nagurney
17.1 Introduction Finance is concerned with the study of capital flows over space and time in the presence of risk. As a subject, it has benefited from numerous mathematical and engineering tools that have been developed and utilized for the modeling, analysis, and computation of solutions in the present complex economic environment. Indeed, the financial landscape today is characterized by the existence of distinct sectors in economies, the proliferation of new financial instruments, with increasing diversification of portfolios internationally, various transaction costs, the increasing growth of electronic transactions through advances in information technology and, in particular, the Internet, and different types of governmental policy interventions. Hence, rigorous methodological tools that can capture the complexity and richness of financial decision-making today and that can take advantage of powerful computer resources have never been more important and needed for financial quantitative analyses. In this chapter, the focus is on financial networks as a powerful tool and medium for the modeling, analysis, and solution of a spectrum of financial decision-making problems ranging from portfolio optimization to multi-sector, multi-instrument general financial equilibrium problems, dynamic multi-agent financial problems with intermediation, as well as the financial engineering of the integration of social networks with financial systems. Throughout history, the emergence and evolution of various physical networks, ranging from transportation and logistical networks to telecommunication networks and the effects of human decision-making on such networks have given rise to the development of rich theories and scientific methodologies that are network-based (cf. Ford and Fulkerson 1962; Ahuja et al. 1993; Nagurney 1999; Geunes and Pardalos 2003).
The novelty of networks is that they are pervasive, providing the fabric of connectivity for our societies and economies, while, methodologically, network theory has developed into a powerful and dynamic medium for
abstracting complex problems, which, at first glance, may not even appear to be networks, with associated nodes, links, and flows. The topic of networks as a subject of scientific inquiry originated in the paper by (Euler 1736), which is credited with being the earliest paper on graph theory. By a graph in this setting is meant, mathematically, a means of abstractly representing a system by its depiction in terms of vertices (or nodes) and edges (or arcs, equivalently, links) connecting various pairs of vertices. Euler was interested in determining whether it was possible to stroll around Königsberg (later called Kaliningrad) by crossing the seven bridges over the River Pregel exactly once. The problem was represented as a graph in which the vertices corresponded to land masses and the edges to bridges. (Quesnay 1758), in his Tableau Economique, conceptualized the circular flow of financial funds in an economy as a network and this work can be identified as the first paper on the topic of financial networks. Quesnay’s basic idea has been utilized in the construction of financial flow of funds accounts, which are a statistical description of the flows of money and credit in an economy (see Cohen 1987). The concept of a network in economics, in turn, was implicit as early as the classical work of (Cournot 1838), who not only seems to have first explicitly stated that a competitive price is determined by the intersection of supply and demand curves, but had done so in the context of two spatially separated markets in which the cost associated with transporting the goods was also included. (Pigou 1920) studied a network system in the form of a transportation network consisting of two routes and noted that the decision-making behavior of the users of such a system would lead to different flow patterns. 
Hence, the network of concern therein consists of the graph, which is directed, with the edges or links represented by arrows, as well as the resulting flows on the links. (Copeland 1952) recognized the conceptualization of the interrelationships among financial funds as a network and asked the question, “Does money flow like water or electricity?” Moreover, he provided a “wiring diagram for the main money circuit.” Kirchhoff is credited with pioneering the field of electrical engineering by being the first to have systematically analyzed electrical circuits and with providing the foundations for the principal ideas of network flow theory. Interestingly, (Enke 1951) had proposed electronic circuits as a means of solving spatial price equilibrium problems, in which goods are produced, consumed, and traded, in the presence of transportation costs. Such analog computational devices were soon to be superseded by digital computers along with advances in computational methodologies, that is, algorithms, based on mathematical programming. In this chapter, we further elaborate upon historical breakthroughs in the use of networks for the formulation, analysis, and solution of financial problems. Such a perspective allows one to trace the methodological developments as well as the applications of financial networks and provides a platform upon which further innovations can be made. Methodological tools that will be utilized to formulate and solve the financial network problems in this chapter are drawn from optimization, variational inequalities, as well as projected dynamical systems theory. We begin with a discussion of financial optimization problems
within a network context and then turn to a range of financial network equilibrium problems.
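The graph abstraction used in Euler's bridge problem, with land masses as vertices and bridges as edges, can be made concrete in a few lines. The following is a minimal sketch and not part of the original chapter: the vertex labels A to D and the function names are illustrative assumptions, and the criterion applied is the standard one for connected multigraphs (a walk using every edge exactly once exists only if the number of odd-degree vertices is 0 or 2).

```python
from collections import Counter

# Seven bridges as edges between four land masses (labels are hypothetical):
# A = island, B = north bank, C = south bank, D = eastern land mass.
KONIGSBERG_BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_vertices(edges):
    """Return the sorted list of vertices touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Degree test for a walk that uses every edge exactly once.

    Assumes the multigraph is connected; connectivity is not checked here.
    """
    return len(odd_degree_vertices(edges)) in (0, 2)


print(odd_degree_vertices(KONIGSBERG_BRIDGES))
print(has_euler_walk(KONIGSBERG_BRIDGES))
```

All four land masses have odd degree in this encoding, which is why the stroll Euler asked about is impossible.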
17.2 Financial Optimization Problems Network models have been proposed for a wide variety of financial problems characterized by a single objective function to be optimized as in portfolio optimization and asset allocation problems, currency translation, and risk management problems, among others. This literature is now briefly overviewed with the emphasis on the innovative work of (Markowitz 1952, 1959) that established a new era in financial economics and became the basis for many financial optimization models that exist and are used to this day. Although many financial optimization problems (including Markowitz’s) had an underlying network structure, and the advantages of network programming were becoming increasingly evident (cf. Charnes and Cooper 1958), not many financial network optimization models were developed until some time later. Some exceptions are several early models due to (Charnes and Miller 1957) and (Charnes and Cooper 1961). It was not until the last years of the 1960s and the first years of the 1970s that the network setting started to be extensively used for financial applications. Among the first financial network optimization models that appear in the literature were a series of currency translating models. (Rutenberg 1970) suggested that the translation among different currencies could be performed through the use of arc multipliers. Rutenberg’s network model was multiperiod with linear costs on the arcs (a characteristic common to the earlier financial network models). The nodes of such generalized networks represented a particular currency in a specific period and the flow on the arcs the amount of cash moving from one period and/or currency to another. (Christofides et al. 1979) and (Shapiro and Rutenberg 1976), among others, introduced related financial network models. In most of these models, the currency prices were determined according to the amount of capital (network flow) that was moving from one currency (node) to the other.
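To illustrate the idea of arc multipliers in a generalized network, the sketch below pushes a cash amount along a path whose nodes are (currency, period) pairs; the flow arriving at the head of an arc equals the flow entering it times the arc's multiplier. This is only an illustration under stated assumptions: the currencies, the multiplier values, and the names `MULTIPLIERS` and `push_flow` are hypothetical and are not taken from Rutenberg's model.

```python
# Each arc maps (tail node, head node) to a multiplier.
# A node is a (currency, period) pair. All numbers are made up for illustration.
MULTIPLIERS = {
    (("USD", 1), ("EUR", 1)): 0.90,  # hypothetical exchange rate within period 1
    (("EUR", 1), ("EUR", 2)): 1.01,  # hypothetical growth from carrying cash forward
    (("EUR", 2), ("USD", 2)): 1.10,  # hypothetical exchange rate within period 2
}


def push_flow(amount, path, multipliers):
    """Return the amount that arrives at the end of `path`.

    On every arc the entering flow is scaled by that arc's multiplier,
    which is what distinguishes a generalized network from a pure one.
    """
    for tail, head in zip(path, path[1:]):
        amount *= multipliers[(tail, head)]
    return amount


route = [("USD", 1), ("EUR", 1), ("EUR", 2), ("USD", 2)]
print(push_flow(100.0, route, MULTIPLIERS))
```

A pure network flow model is the special case in which every multiplier equals one.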
Figure 17.1. Network structure of classical portfolio optimization
(Barr 1972) and (Srinivasan 1974) used networks to formulate a series of cash management problems, with a major contribution being the introduction by (Crum 1976) of a generalized linear network model for the cash management of a multinational firm. The links in the network represented possible cash flow patterns and the multipliers incorporated costs, fees, liquidity changes, and exchange rates. A series of related cash management problems were modeled as network problems in subsequent years by (Crum and Nye 1981) and (Crum et al. 1983), and others. These papers further extended the applicability of network programming in financial applications. The focus was on linear network flow problems in which the cost on an arc was a linear function of the flow. (Crum, Klingman, and Tavis 1979), in turn, demonstrated how contemporary financial capital allocation problems could be modeled as an integer generalized network problem, in which the flows on particular arcs were forced to be integers. In many financial network optimization problems the objective function must be nonlinear due to the modeling of the risk function and, hence, typically, such financial problems lie in the domain of nonlinear, rather than linear, network flow problems. (Mulvey 1987) presented a collection of nonlinear financial network models that were based on previous cash flow and portfolio models in which the original authors (see e. g. Rudd and Rosenberg 1979 and Soenen 1979) did not realize, and, thus, did not exploit the underlying network structure. Mulvey also recognized that the (Markowitz 1952, 1959) mean-variance minimization problem was, in fact, a network optimization problem with a nonlinear objective function. The classical Markowitz models are now reviewed and cast into the framework of network optimization problems. See Figure 17.1 for the network structure of such problems.
Additional financial network optimization models and associated references can be found in (Nagurney and Siokos 1997) and in the volume edited by (Nagurney 2003). Markowitz’s model was based on mean-variance portfolio selection, where the average and the variability of portfolio returns were determined in terms of the mean and covariance of the corresponding investments. The mean is a measure of an average return and the variance is a measure of the distribution of the returns around the mean return. Markowitz formulated the portfolio optimization problem as associated with risk minimization with the objective function:

$$\text{Minimize } V = X^T Q X \tag{17.1}$$

subject to constraints, representing, respectively, the attainment of a specific return, a budget constraint, and that no short sales were allowed, given by:

$$R = \sum_{i=1}^{n} X_i r_i \tag{17.2}$$

$$\sum_{i=1}^{n} X_i = 1 \tag{17.3}$$

$$X_i \geq 0, \quad i = 1,\ldots,n. \tag{17.4}$$
Here $n$ denotes the total number of securities available in the economy, $X_i$ represents the relative amount of capital invested in security $i$, with the securities being grouped into the column vector $X$, $Q$ denotes the $n \times n$ variance-covariance matrix on the return of the portfolio, $r_i$ denotes the expected value of the return of security $i$, and $R$ denotes the expected rate of return on the portfolio. Within a network context (cf. Figure 17.1), the links correspond to the securities, with their relative amounts $X_1,\ldots,X_n$ corresponding to the flows on the respective links: $1,\ldots,n$. The budget constraint and the nonnegativity assumption on the flows are the network conservation of flow equations. Since the objective function is that of risk minimization, it can be interpreted as the sum of the costs on the $n$ links in the network. Observe that the network representation is abstract and does not correspond (as in the case of transportation and telecommunication) to physical locations and links. Markowitz suggested that, for a fixed set of expected values $r_i$ and covariances of the returns of all assets $i$ and $j$, every investor can find an $(R, V)$ combination that better fits his taste, solely limited by the constraints of the specific problem. Hence, according to the original work of (Markowitz 1952), the efficient frontier had to be identified and then every investor had to select a portfolio through a mean-variance analysis that fitted his preferences. A related mathematical optimization model (see Markowitz 1959) to the one above, which can be interpreted as the investor seeking to maximize his returns while minimizing his risk, can be expressed by the quadratic programming problem:

$$\text{Maximize } \alpha R - (1 - \alpha) V \tag{17.5}$$

subject to:

$$\sum_{i=1}^{n} X_i = 1 \tag{17.6}$$

$$X_i \geq 0, \quad i = 1,\ldots,n, \tag{17.7}$$
where $\alpha$ denotes an indicator of how risk-averse a specific investor is. This model is also a network optimization problem with the network as depicted in Figure 17.1, with Equations (17.6) and (17.7) again representing the conservation of flow equations. A collection of versions and extensions of Markowitz’s model can be found in (Francis and Archer 1979), with $\alpha = 1/2$ being a frequently accepted value. A recent interpretation of the model as a multicriteria decision-making model along with theoretical extensions to multiple sectors can be found in (Dong and Nagurney 2001), where additional references are available. References to multicriteria decision-making and financial applications can also be found in (Doumpos, Zopounidis, and Pardalos 2000). A segment of the optimization literature on financial networks has focused on variables that are stochastic and have to be treated as random variables in the optimization procedure. Clearly, since most financial optimization problems are of large size, the incorporation of stochastic variables made the problems more complicated and difficult to model and compute. (Mulvey 1987) and (Mulvey and Vladimirou 1989, 1991), among others, studied stochastic financial networks, utilizing a series of different theories and techniques (e. g., purchase power priority, arbitrage theory, scenario aggregation) that were then utilized for the estimation of the stochastic elements in the network in order to be able to represent them as a series of deterministic equivalents. The large size and the computational complexity of stochastic networks, at times, limited their usage to specially structured problems where general computational techniques and algorithms could be applied. See (Rudd and Rosenberg 1979, Wallace 1986, Rockafellar and Wets 1991, and Mulvey et al. 2003) for a more detailed discussion on aspects of realistic portfolio optimization and implementation issues related to stochastic financial networks.
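As an illustration of the deterministic quadratic program (17.5)–(17.7), the sketch below treats the link flows $X_1,\ldots,X_n$ as unknowns and solves the problem by projected gradient steps onto the feasible set defined by the budget and nonnegativity constraints. This is not the solution method of the chapter, only one simple technique that fits the constraint structure. The function names, the fixed step size, the iteration count, and the sample data for $Q$ and $r$ are assumptions made for the example; the step size must be small relative to the entries of $Q$ for the iteration to converge.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x_i >= 0, sum(x) = 1} (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the largest k that passes the test
    return [max(x - theta, 0.0) for x in v]


def portfolio_return(X, r):
    """R = sum_i X_i r_i, Equation (17.2)."""
    return sum(x * ri for x, ri in zip(X, r))


def portfolio_variance(X, Q):
    """V = X^T Q X, Equation (17.1)."""
    n = len(X)
    return sum(X[i] * Q[i][j] * X[j] for i in range(n) for j in range(n))


def solve_mean_variance(Q, r, alpha, step=0.5, iterations=2000):
    """Maximize alpha*R - (1-alpha)*V over the simplex by projected gradient.

    Equivalent to minimizing (1-alpha)*X^T Q X - alpha*r^T X; Q is assumed symmetric.
    """
    n = len(r)
    X = [1.0 / n] * n  # start from the equally weighted portfolio
    for _ in range(iterations):
        grad = [
            2.0 * (1.0 - alpha) * sum(Q[i][j] * X[j] for j in range(n)) - alpha * r[i]
            for i in range(n)
        ]
        X = project_to_simplex([X[i] - step * grad[i] for i in range(n)])
    return X


# Hypothetical data for three securities.
Q = [[0.04, 0.01, 0.00],
     [0.01, 0.09, 0.02],
     [0.00, 0.02, 0.16]]
r = [0.05, 0.08, 0.12]
X = solve_mean_variance(Q, r, alpha=0.5)
print(X, portfolio_return(X, r), portfolio_variance(X, Q))
```

Setting `alpha=1.0` ignores risk and places all capital in the security with the highest expected return, while smaller values of `alpha` weight the variance term more heavily and spread the flow over more links.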
17.3 General Financial Equilibrium Problems We now turn to networks and their utilization for the modeling and analysis of financial systems in which there is more than a single decision-maker, in contrast to the above financial optimization problems. It is worth noting that (Quesnay 1758) actually considered a financial system as a network. (Thore 1969) introduced networks, along with the mathematics, for the study of systems of linked portfolios. His work benefited from that of (Charnes and Cooper 1967) who demonstrated that systems of linked accounts could be represented as a network, where the nodes depict the balance sheets and the links depict the credit and debit entries. Thore considered credit networks, with the explicit goal of providing a tool for use in the study of the propagation of money and credit streams in an economy, based on a theory of the behavior of banks and other financial institutions. The credit network recognized that these sectors interact and its solution made use of linear programming. (Thore 1970) extended the basic network model to handle holdings of financial reserves in the case of uncertainty. The approach utilized two-stage linear programs under uncertainty introduced by (Ferguson and Dantzig 1956) and (Dantzig and Madansky 1961). See (Fei 1960) for a graph theoretic approach to the credit system. More recently, (Boginski et al. 2003) presented a detailed study of the stock market graph, yielding a new tool for the analysis of market structure through the classification of stocks into different groups, along with an application to the US stock market. (Storoy et al. 1975), in turn, developed a network representation of the interconnection of capital markets and demonstrated how decomposition theory of mathematical programming could be exploited for the computation of equilibrium. The utility functions facing a sector were no longer restricted to being linear functions. 
(Thore 1980) further investigated network models of linked portfolios, financial intermediation, and decentralization/decomposition theory. However, the computational techniques at that time were not sufficiently well-developed to handle such problems in practice.
(Thore 1984) later proposed an international financial network for the Eurodollar market and viewed it as a logistical system, exploiting the ideas of (Samuelson 1952) and (Takayama and Judge 1971) for spatial price equilibrium problems. In this paper, as in Thore’s preceding papers on financial networks, the microbehavioral unit consisted of the individual bank, savings and loan, or other financial intermediary and the portfolio choices were described in some optimizing framework, with the portfolios being linked together into a network with a separate portfolio visualized as a node and assets and liabilities as directed links. The above contributions focused on the use and application of networks for the study of financial systems consisting of multiple economic decision-makers. In such systems, equilibrium was a central concept, along with the role of prices in the equilibrating mechanism. Rigorous approaches that characterized the formulation of equilibrium and the corresponding price determination were greatly influenced by the Arrow-Debreu economic model (cf. Arrow 1951, Debreu 1951). In addition, the importance of the inclusion of dynamics in the study of such systems was explicitly emphasized (see, also, Thore and Kydland 1972). The first use of finite-dimensional variational inequality theory for the computation of multi-sector, multi-instrument financial equilibria is due to (Nagurney et al. 1992), who recognized the network structure underlying the subproblems encountered in their proposed decomposition scheme. (Hughes and Nagurney 1992 and Nagurney and Hughes 1992) had, in turn, proposed the formulation and solution of estimation of financial flow of funds accounts as network optimization problems. Their proposed optimization scheme fully exploited the special network structure of these problems.
(Nagurney and Siokos 1997) then developed an international financial equilibrium model utilizing finite-dimensional variational inequality theory for the first time in that framework. Finite-dimensional variational inequality theory is a powerful unifying methodology in that it contains, as special cases, such mathematical programming problems as: nonlinear equations, optimization problems, and complementarity problems. To illustrate this methodology and its application in general financial equilibrium modeling and computation, we now present a multi-sector, multi-instrument model and an extension due to (Nagurney et al. 1992) and (Nagurney 1994), respectively. For additional references to variational inequalities in finance, along with additional theoretical foundations, see (Nagurney and Siokos 1997) and (Nagurney 2001, 2003).
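For orientation, the standard textbook statement of the finite-dimensional variational inequality problem is recalled below, together with two of the special cases named above. It is given here as background and is not quoted from this chapter; the symbols $F$, $\Omega$, $N$, and $f$ are generic placeholders and are unrelated to the feasible sets defined later.

```latex
% VI(F, \Omega): \Omega \subseteq R^N is closed and convex, F : \Omega \to R^N is continuous.
\text{Find } X^* \in \Omega \text{ such that }
\langle F(X^*),\, X - X^* \rangle \geq 0 \quad \forall X \in \Omega.

% Optimization as a special case: if F = \nabla f with f convex and
% continuously differentiable, then X^* solves VI(\nabla f, \Omega)
% if and only if X^* minimizes f over \Omega.

% Nonlinear equations as a special case: if \Omega = R^N,
% then X^* solves VI(F, R^N) if and only if F(X^*) = 0.
```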
17.3.1 A Multi-Sector, Multi-Instrument Financial Equilibrium Model
Recall the classical mean-variance model presented in the preceding section, which is based on the pioneering work of (Markowitz 1959). Now, however, assume that there are m sectors, each of which seeks to maximize his return and, at the same time, to minimize the risk of his portfolio, subject to the balance accounting and
nonnegativity constraints. Examples of sectors include: households, businesses, state and local governments, banks, etc. Denote a typical sector by $j$ and assume that there are liabilities in addition to assets held by each sector. Denote the volume of instrument $i$ that sector $j$ holds as an asset by $X_i^j$, and group the (nonnegative) assets in the portfolio of sector $j$ into the column vector $X^j \in R_+^n$. Further, group the assets of all sectors in the economy into the column vector $X \in R_+^{mn}$. Similarly, denote the volume of instrument $i$ that sector $j$ holds as a liability by $Y_i^j$, and group the (nonnegative) liabilities in the portfolio of sector $j$ into the column vector $Y^j \in R_+^n$. Finally, group the liabilities of all sectors in the economy into the column vector $Y \in R_+^{mn}$. Let $r_i$ denote the nonnegative price of instrument $i$ and group the prices of all the instruments into the column vector $r \in R_+^n$. It is assumed that the total volume of each balance sheet side of each sector is exogenous. Recall that a balance sheet is a financial report that demonstrates the status of a company’s assets, liabilities, and the owner’s equity at a specific point of time. The left-hand side of a balance sheet contains the assets that a sector holds at a particular point of time, whereas the right-hand side accommodates the liabilities and owner’s equity held by that sector at the same point of time. According to accounting principles, the sum of all assets is equal to the sum of all the liabilities and the owner’s equity. Here, the term “liabilities” is used in its general form and, hence, also includes the owner’s equity. Let $S^j$ denote the financial volume held by sector $j$. Finally, assume that the sectors under consideration act in a perfectly competitive environment.
17.3.1.1 A Sector’s Portfolio Optimization Problem Recall that in the mean-variance approach for portfolio optimization, the minimization of a portfolio’s risk is performed through the use of the variance-covariance matrix. Hence, the portfolio optimization problem for each sector $j$ is the following:

$$\text{Minimize } \begin{pmatrix} X^j \\ Y^j \end{pmatrix}^T Q^j \begin{pmatrix} X^j \\ Y^j \end{pmatrix} - \sum_{i=1}^{n} r_i \left( X_i^j - Y_i^j \right) \tag{17.8}$$

subject to:

$$\sum_{i=1}^{n} X_i^j = S^j \tag{17.9}$$

$$\sum_{i=1}^{n} Y_i^j = S^j \tag{17.10}$$

$$X_i^j \geq 0, \quad Y_i^j \geq 0, \quad i = 1, 2,\ldots,n, \tag{17.11}$$
where $Q^j$ is a symmetric $2n \times 2n$ variance-covariance matrix associated with the assets and liabilities of sector j. Moreover, since $Q^j$ is a variance-covariance matrix, one can assume that it is positive definite and, as a result, the objective function of each sector’s portfolio optimization problem, given by (17.8), is strictly convex. Partition the symmetric matrix $Q^j$ as

$$Q^j = \begin{pmatrix} Q_{11}^j & Q_{12}^j \\ Q_{21}^j & Q_{22}^j \end{pmatrix},$$
where $Q_{11}^j$ and $Q_{22}^j$ are the variance-covariance matrices for only the assets and only the liabilities, respectively, of sector j. These submatrices are each of dimension $n \times n$. The submatrices $Q_{12}^j$ and $Q_{21}^j$, in turn, are identical since $Q^j$ is symmetric. They are also of dimension $n \times n$. These submatrices are, in fact, the symmetric variance-covariance matrices between the assets and the liabilities of sector j. Denote the i-th column of matrix $Q_{(\alpha\beta)}^j$ by $Q_{(\alpha\beta)i}^j$, where α and β can take on the values of 1 and/or 2.
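As an illustration of how the objective in (17.8) is evaluated, the following minimal Python sketch stacks a sector's assets and liabilities and computes the risk term minus the net return term. The function name `sector_objective` and the numerical data (two instruments, an identity variance-covariance matrix) are hypothetical and chosen only for illustration.

```python
def sector_objective(Q, X, Y, r):
    """Value of objective (17.8) for one sector.

    Q is a 2n x 2n list of lists, X and Y are the length-n asset and
    liability vectors, and r is the length-n price vector.
    """
    z = list(X) + list(Y)  # stacked vector (X^j, Y^j)
    size = len(z)
    # Quadratic risk term (X^j, Y^j)^T Q^j (X^j, Y^j).
    risk = sum(z[a] * Q[a][b] * z[b] for a in range(size) for b in range(size))
    # Linear term sum_i r_i (X_i^j - Y_i^j).
    net_return = sum(r_i * (x_i - y_i) for r_i, x_i, y_i in zip(r, X, Y))
    return risk - net_return


# Hypothetical data: n = 2 instruments, identity matrix, S^j = 1.
Q = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
value = sector_objective(Q, X=[1.0, 0.0], Y=[0.0, 1.0], r=[0.5, 0.25])
print(value)
```

Both portfolio sides in this toy case sum to 1, so the point satisfies the accounting constraints (17.9) and (17.10); whether it is optimal is a separate question that the sketch does not address.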
17.3.1.2 Optimality Conditions
The necessary and sufficient conditions for an optimal portfolio for sector j are that the vector of assets and liabilities, $(X^{j*}, Y^{j*}) \in K^j$, where $K^j$ denotes the feasible set for sector j, given by (17.9)-(17.11), satisfies the following system of equalities and inequalities: for each instrument i; i = 1,…, n, we must have that:

$$2(Q_{(11)i}^j)^T \cdot X^{j*} + 2(Q_{(21)i}^j)^T \cdot Y^{j*} - r_i^* - \mu_1^j \ge 0,$$

$$2(Q_{(22)i}^j)^T \cdot Y^{j*} + 2(Q_{(12)i}^j)^T \cdot X^{j*} + r_i^* - \mu_2^j \ge 0,$$

$$X_i^{j*} \left[ 2(Q_{(11)i}^j)^T \cdot X^{j*} + 2(Q_{(21)i}^j)^T \cdot Y^{j*} - r_i^* - \mu_1^j \right] = 0,$$

$$Y_i^{j*} \left[ 2(Q_{(22)i}^j)^T \cdot Y^{j*} + 2(Q_{(12)i}^j)^T \cdot X^{j*} + r_i^* - \mu_2^j \right] = 0,$$
where $\mu_1^j$ and $\mu_2^j$ are the Lagrange multipliers associated with the accounting constraints, (17.9) and (17.10), respectively. Let $\mathcal{K}$ denote the feasible set for all the asset and liability holdings of all the sectors and all the prices of the instruments, where $\mathcal{K} \equiv \{K \times R_+^n\}$ and $K \equiv \prod_{j=1}^{m} K^j$. The network structure of the sectors’ optimization problems is depicted in Figure 17.2.
Figure 17.2. Network structure of the sectors’ optimization problems
17.3.1.3 Economic System Conditions
The economic system conditions, which relate the supply and demand of each financial instrument and the instrument prices, are given by: for each instrument i; i = 1,…, n, an equilibrium asset, liability, and price pattern, $(X^*, Y^*, r^*) \in \mathcal{K}$, must satisfy:

$$\sum_{j=1}^{m} \left( X_i^{j*} - Y_i^{j*} \right) \begin{cases} = 0, & \text{if } r_i^* > 0 \\ \ge 0, & \text{if } r_i^* = 0. \end{cases} \tag{17.12}$$
The definition of financial equilibrium is now presented along with the variational inequality formulation. For the derivation, see (Nagurney et al. 1992) and (Nagurney and Siokos 1997). Combining the above optimality conditions for each sector with the economic system conditions for each instrument, we have the following definition of equilibrium. Definition 17.1: Multi-Sector, Multi-Instrument Financial Equilibrium A vector ( X ∗ , Y ∗ , r ∗ ) ∈ K is an equilibrium of the multi-sector, multi-instrument financial model if and only if it satisfies the optimality conditions and the economic system conditions (17.12), for all sectors j ; j = 1,…, m , and for all instruments i ; i = 1,…, n , simultaneously. The variational inequality formulation of the equilibrium conditions, due to (Nagurney et al. 1992) is given by:
Theorem 17.1: Variational Inequality Formulation for the Quadratic Model
A vector of assets and liabilities of the sectors, and instrument prices, $(X^*, Y^*, r^*) \in \mathcal{K}$, is a financial equilibrium if and only if it satisfies the variational inequality problem:

$$\begin{aligned}
&\sum_{j=1}^{m} \sum_{i=1}^{n} \left[ 2(Q_{(11)i}^j)^T \cdot X^{j*} + 2(Q_{(21)i}^j)^T \cdot Y^{j*} - r_i^* \right] \times \left[ X_i^j - X_i^{j*} \right] \\
&+ \sum_{j=1}^{m} \sum_{i=1}^{n} \left[ 2(Q_{(22)i}^j)^T \cdot Y^{j*} + 2(Q_{(12)i}^j)^T \cdot X^{j*} + r_i^* \right] \times \left[ Y_i^j - Y_i^{j*} \right] \\
&+ \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ X_i^{j*} - Y_i^{j*} \right] \times \left[ r_i - r_i^* \right] \ge 0, \quad \forall (X, Y, r) \in \mathcal{K}.
\end{aligned} \tag{17.13}$$
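For completeness, the standard form of the variational inequality is now presented. For additional background, see (Nagurney 1999). Define the N-dimensional column vector $Z \equiv (X, Y, r) \in \mathcal{K}$, and the N-dimensional column vector $F(Z)$ such that:

$$F(Z) \equiv D \begin{pmatrix} X \\ Y \\ r \end{pmatrix}, \quad \text{where} \quad D = \begin{pmatrix} 2Q & B \\ -B^T & 0 \end{pmatrix},$$

Q is the $2mn \times 2mn$ matrix formed from the blocks $Q_{(\alpha\beta)}^j$ of all the sectors, $B^T = (-I \; \ldots \; -I \;\; I \; \ldots \; I)$ is of dimension $n \times 2mn$, and I is the $n \times n$-dimensional identity matrix. It is clear that variational inequality problem (17.13) can be put into standard variational inequality form: determine $Z^* \in \mathcal{K}$, satisfying:

$$\langle F(Z^*)^T, Z - Z^* \rangle \ge 0, \quad \forall Z \in \mathcal{K}. \tag{17.14}$$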
17.3.2 Model with Utility Functions
The above model is a special case of the financial equilibrium model due to Nagurney (1994) in which each sector j seeks to maximize his utility function,

$$U^j(X^j, Y^j, r) = u^j(X^j, Y^j) + r^T \cdot (X^j - Y^j),$$

which, in turn, is a special case of the model with a sector j’s utility function given by the general form: $U^j(X^j, Y^j, r)$. Interestingly, it has been shown by (Nagurney and Siokos 1997) that, in the case of utility functions of the form $U^j(X^j, Y^j, r) = u^j(X^j, Y^j) + r^T \cdot (X^j - Y^j)$, of which the above described quadratic model is an example, one can obtain the solution to the above variational inequality problem by solving the optimization problem:

$$\text{Maximize} \quad \sum_{j=1}^{m} u^j(X^j, Y^j) \tag{17.15}$$

subject to:

$$\sum_{j=1}^{m} \left( X_i^j - Y_i^j \right) = 0, \quad i = 1, \ldots, n, \tag{17.16}$$

$$(X^j, Y^j) \in K^j, \quad j = 1, \ldots, m, \tag{17.17}$$
with Lagrange multiplier $r_i^*$ associated with the i-th “market clearing” constraint (17.16). Moreover, this optimization problem is actually a network optimization problem as revealed in (Nagurney and Siokos 1997). The structure of the financial system in equilibrium is as depicted in Figure 17.3.
Figure 17.3. The network structure at equilibrium
17.3.3 Computation of Financial Equilibria
In this section, an algorithm for the computation of solutions to the above financial equilibrium problems is recalled. The algorithm is the modified projection method of (Korpelevich 1977). The advantage of this computational method in the context of the general financial equilibrium problems is that the original problem can be decomposed into a series of smaller and simpler subproblems of network structure, each of which can then be solved explicitly and in closed form. The realization of the modified projection method for the solution of the financial equilibrium problems with general utility functions is then presented. The modified projection method can be expressed as:

Step 0: Initialization
Select $Z^0 \in \mathcal{K}$. Let τ := 0 and let γ be a scalar such that $0 < \gamma \le 1/L$, where L is the Lipschitz constant (see Nagurney and Siokos (1997)).

Step 1: Computation
Compute $\bar{Z}^\tau$ by solving the variational inequality subproblem:

$$\langle (\bar{Z}^\tau + \gamma F(Z^\tau)^T - Z^\tau)^T, Z - \bar{Z}^\tau \rangle \ge 0, \quad \forall Z \in \mathcal{K}. \tag{17.18}$$

Step 2: Adaptation
Compute $Z^{\tau+1}$ by solving the variational inequality subproblem:

$$\langle (Z^{\tau+1} + \gamma F(\bar{Z}^\tau)^T - Z^\tau)^T, Z - Z^{\tau+1} \rangle \ge 0, \quad \forall Z \in \mathcal{K}. \tag{17.19}$$

Step 3: Convergence Verification
If $\max_b |Z_b^{\tau+1} - Z_b^\tau| \le \varepsilon$, for all b, with ε > 0, a prespecified tolerance, then stop; else, set τ := τ + 1, and go to Step 1.

An interpretation of the modified projection method as an adjustment process, given by (Nagurney 1999), is now provided. In particular, at an iteration, the sectors in the economy receive all the price information on every instrument from the previous iteration. They then allocate their capital according to their preferences. The market reacts to the decisions of the sectors and derives new instrument prices. The sectors then improve upon their positions through the adaptation step, whereas the market also adjusts during the adaptation step. This process continues until no one can improve upon his position, and the equilibrium is reached, that is, the above variational inequality is satisfied with the computed asset, liability, and price pattern. The financial optimization problems in the computation step and in the adaptation step are equivalent to separable quadratic programming problems of special network structure. Each of these network subproblems can then be solved, at an iteration, simultaneously, and exactly in closed form. The exact equilibration algorithm (see, e.g., Nagurney and Siokos 1997) can be applied for the solution of the asset and liability subproblems, whereas the prices can be obtained using explicit formulae. A numerical example is now presented for illustrative purposes and solved using the modified projection method, embedded with the exact equilibration algorithm. For further background, see (Nagurney and Siokos 1997).
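The steps above can be sketched in a few lines of Python. The sketch uses the fact that each variational inequality subproblem of this form is solved by a projection, for example $\bar{Z}^\tau = P_{\mathcal{K}}(Z^\tau - \gamma F(Z^\tau))$. To keep the projection trivial, it assumes that the feasible set is the nonnegative orthant, which is simpler than the feasible set of the financial model (there, the exact equilibration algorithm would handle the accounting constraints). The affine map `F_toy` and all numerical values are hypothetical and not taken from the chapter.

```python
def project_nonneg(z):
    """Projection onto the nonnegative orthant (the assumed toy feasible set)."""
    return [max(0.0, v) for v in z]


def modified_projection(F, z0, gamma, eps=1e-10, max_iter=10_000):
    """Steps 0-3 of the modified projection method with K = R^N_+."""
    z = list(z0)
    for _ in range(max_iter):
        # Step 1 (computation): intermediate point using F at the current iterate.
        Fz = F(z)
        z_bar = project_nonneg([zi - gamma * fi for zi, fi in zip(z, Fz)])
        # Step 2 (adaptation): step again from z, using F at the intermediate point.
        Fb = F(z_bar)
        z_new = project_nonneg([zi - gamma * fi for zi, fi in zip(z, Fb)])
        # Step 3 (convergence verification): largest componentwise change.
        if max(abs(a - b) for a, b in zip(z_new, z)) <= eps:
            return z_new
        z = z_new
    return z


def F_toy(z):
    # Hypothetical monotone affine map F(z) = A z + b with
    # A = [[2, 1], [-1, 2]] and b = (-3, -1); F vanishes at z = (1, 1).
    return [2.0 * z[0] + z[1] - 3.0, -z[0] + 2.0 * z[1] - 1.0]


# The Lipschitz constant of F_toy is sqrt(5), so gamma = 0.3 satisfies gamma <= 1/L.
solution = modified_projection(F_toy, z0=[0.0, 0.0], gamma=0.3)
print(solution)
```

Since `F_toy` vanishes at the interior point (1, 1), that point solves the toy variational inequality, and the iterates should approach it.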
17.3.3.1 Example 1: A Numerical Example Assume that there are two sectors in the economy and three financial instruments. Assume that the “size” of each sector is given by S 1 = 1 and S 2 = 2 . The variance-covariance matrices of the two sectors are: ⎛ 1 .25 ⎜ ⎜ .25 1 ⎜ .3 1 Q1 = ⎜ 0 ⎜ 0 ⎜ 0 0 ⎜⎜ 0 ⎝ 0
The modified projection method was coded in FORTRAN. The variables were initialized as follows: $r_i^0 = 1$, for all i, with the financial volume $S^j$ equally distributed among all the assets and among all the liabilities for each sector j. The γ parameter was set to 0.35. The convergence tolerance ε was set to $10^{-3}$. The modified projection method converged in 16 iterations and yielded the following equilibrium pattern:
Equilibrium Prices:

$$r_1^* = .34039, \quad r_2^* = .23805, \quad r_3^* = .42156,$$

Equilibrium Asset Holdings:

$$X_1^{1*} = .27899, \quad X_2^{1*} = .31803, \quad X_3^{1*} = .40298,$$

$$X_1^{2*} = .79662, \quad X_2^{2*} = .60904, \quad X_3^{2*} = .59434,$$

Equilibrium Liability Holdings:

$$Y_1^{1*} = .37081, \quad Y_2^{1*} = .43993, \quad Y_3^{1*} = .18927,$$

$$Y_1^{2*} = .70579, \quad Y_2^{2*} = .48693, \quad Y_3^{2*} = .80729.$$
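The reported pattern can be checked directly against the accounting constraints (17.9) and (17.10) and the economic system conditions (17.12). The short Python check below uses only the values printed above; the variable names are illustrative.

```python
# Reported equilibrium values, indexed [sector][instrument].
S = [1.0, 2.0]
X = [[0.27899, 0.31803, 0.40298], [0.79662, 0.60904, 0.59434]]
Y = [[0.37081, 0.43993, 0.18927], [0.70579, 0.48693, 0.80729]]

# Accounting constraints (17.9) and (17.10): each balance sheet side sums to S^j.
asset_gap = [sum(X[j]) - S[j] for j in range(2)]
liability_gap = [sum(Y[j]) - S[j] for j in range(2)]

# Economic system conditions (17.12): excess of assets over liabilities
# for each instrument, summed over the sectors.
excess = [sum(X[j][i] - Y[j][i] for j in range(2)) for i in range(3)]

print(asset_gap, liability_gap, excess)
```

The balance sheet gaps are at the level of rounding in the printed digits, and the market clearing residuals are at most about $10^{-3}$ in magnitude, which is in line with the convergence tolerance used.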
The above results show that the algorithm yielded optimal portfolios that were feasible. Moreover, the market cleared for each instrument, since the price of each instrument was positive. Other financial equilibrium models, including models with transaction costs, with hedging instruments such as futures and options, as well as, international financial equilibrium models, can be found in (Nagurney and Siokos 1997), and the references therein. Moreover, with projected dynamical systems theory (see the book by Nagurney and Zhang 1996) one can trace the dynamic behavior prior to an equilibrium state (formulated as a variational inequality). In contrast to classical dynamical systems, projected dynamical systems are characterized by a discontinuous right-hand side, with the discontinuity arising due to the constraint set underlying the application in question. Hence, this methodology allows one to model systems dynamically which are subject to limited resources, with a principal constraint in finance being budgetary restrictions. (Dong et al. 1996) were the first to apply the methodology of projected dynamical systems to develop a dynamic multi-sector, multi-instrument financial model, whose set of stationary points coincided with the set of solutions to the variational inequality model developed in (Nagurney 1994); and then to study it qualitatively, providing stability analysis results. In the next section, the methodology of projected dynamical systems is illustrated in the context of a dynamic financial network model with intermediation (cf. Nagurney and Dong 2002).
17.4 Dynamic Financial Networks with Intermediation In this section, dynamic financial networks with intermediation are explored. As noted earlier, the conceptualization of financial systems as networks dates to (Quesnay 1758) who depicted the circular flow of funds in an economy as a network. His basic idea was subsequently applied to the construction of flow of
funds accounts, which are a statistical description of the flows of money and credit in an economy (cf. Board of Governors 1980, Cohen 1987, Nagurney and Hughes 1992). However, since the flow of funds accounts are in matrix form, and, hence, two-dimensional, they fail to capture the dynamic behavior on a micro level of the various financial agents/sectors in an economy, such as banks, households, insurance companies, etc. Furthermore, as noted by the (Board of Governors 1980) on page 6 of that publication, “the generality of the matrix tends to obscure certain structural aspects of the financial system that are of continuing interest in analysis,” with the structural concepts of concern including financial intermediation. (Thore 1980) recognized some of the shortcomings of financial flow of funds accounts and developed network models of linked portfolios with financial intermediation, using decentralization/decomposition theory. Note that intermediation is typically associated with financial businesses, including banks, savings institutions, investment and insurance companies, etc., and the term implies borrowing for the purpose of lending, rather than for nonfinancial purposes. Thore also constructed some basic intertemporal models. However, the intertemporal models were not fully developed and the computational techniques at that time were not sufficiently advanced for computational purposes. In this section, we address the dynamics of the financial economy which explicitly includes financial intermediaries along with the “sources” and “uses” of financial funds. Tools are provided for studying the disequilibrium dynamics as well as the equilibrium state. Also, transaction costs are considered, since they bring a greater degree of realism to the study of financial intermediation. 
Transaction costs had been studied earlier in multi-sector, multi-instrument financial equilibrium models by (Nagurney and Dong 1996 a,b) but without considering the more general dynamic intermediation setting. The dynamic financial network model is now described. The model consists of agents with sources of funds, agents who are intermediaries, as well as agents who are consumers located at the demand markets. Specifically, consider m agents with sources of financial funds, such as households and businesses, involved in the allocation of their financial resources among a portfolio of financial instruments which can be obtained by transacting with distinct n financial intermediaries, such as banks, insurance and investment companies, etc. The financial intermediaries, in turn, in addition to transacting with the source agents, also determine how to allocate the incoming financial resources among distinct uses, as represented by o demand markets with a demand market corresponding to, for example, the market for real estate loans, household loans, or business loans, etc. The financial network with intermediation is now described and depicted graphically in Figure 17.4. The top tier of nodes in Figure 17.4 consists of the agents with sources of funds, with a typical source agent denoted by i and associated with node i . The middle tier of nodes in Figure 17.4 consists of the intermediaries, with a typical intermediary denoted by j and associated with node j in the network. The bottom tier of nodes consists of the demand markets, with a typical demand market denoted by k and corresponding to the node k .
Figure 17.4. The Network Structure of the Financial Economy with Intermediation and with Non-Investment Allowed
For simplicity of notation, assume that there are L financial instruments associated with each intermediary. Hence, from each source of funds node, there are L links connecting such a node with an intermediary node with the l -th such link corresponding to the l -th financial instrument available from the intermediary. In addition, the option of non-investment in the available financial instruments is allowed and to denote this option, construct an additional link from each source node to the middle tier node n + 1 , which represents non-investment. Note that there are as many links connecting each top tier node with each intermediary node as needed to reflect the number of financial instruments available. Also, note that there is an additional abstract node n + 1 with a link connecting each source node to it, which, as shall shortly be shown, will be used to “collect” the financial funds which are not invested. In the model, it is assumed that each source agent has a fixed amount of financial funds. From each intermediary node, construct o links, one to each “use” node or demand market in the bottom tier of nodes in the network to denote the transaction between the intermediary and the consumers at the demand market. Let xijl denote the nonnegative amount of the funds that source i “invests” in financial instrument l obtained from intermediary j . Group the financial flows associated with source agent i , which are associated with the links emanating from the top tier node i to the intermediary nodes in the logistical network, into the column vector xi ∈ R+nL . Assume that each source has, at his disposal, an amount of funds Si and denote the unallocated portion of this amount (and flowing on the link joining node i with node n + 1 ) by si . Group then the xi s of all the source agents into the column vector x ∈ R+mnL . 
Associate a distinct financial product k with each demand market, bottom-tiered node k and let $y_{jk}$ denote the amount of the financial product obtained by
consumers at demand market k from intermediary j . Group these “consumption” quantities into the column vector y ∈ R+no . The intermediaries convert the incoming financial flows x into the outgoing financial flows y . The notation for the prices is now given. Note that there will be prices associated with each of the tiers of nodes in the network. Let ρ1ijl denote the price associated with instrument l as quoted by intermediary j to source agent i and group the first tier prices into the column vector ρ1 ∈ R+mnL . Also, let ρ 2 j denote the price charged by intermediary j and group all such prices into the column vector ρ 2 ∈ R+n . Finally, let ρ3k denote the price of the financial product at the third or bottom-tiered node k in the network, and group all such prices into the column vector ρ3 ∈ R+o . We now turn to describing the dynamics by which the source agents adjust the amounts they allocate to the various financial instruments over time, the dynamics by which the intermediaries adjust their transactions, and those by which the consumers obtain the financial products at the demand markets. In addition, the dynamics by which the prices adjust over time are described. The dynamics are derived from the bottom tier of nodes of the network on up since it is assumed that it is the demand for the financial products (and the corresponding prices) that actually drives the economic dynamics. The price dynamics are presented first and then the dynamics underlying the financial flows.
17.4.1 The Demand Market Price Dynamics
We begin by describing the dynamics underlying the prices of the financial products associated with the demand markets (see the bottom-tiered nodes). Assume, as given, a demand function d k , which can depend, in general, upon the entire vector of prices ρ3 , that is, d k = d k ( ρ3 ), ∀k .
(17.20)
Moreover, assume that the rate of change of the price $\rho_{3k}$, denoted by $\dot{\rho}_{3k}$, is equal to the difference between the demand at the demand market k, as a function of the demand market prices, and the amount available from the intermediaries at the demand market. Hence, if the demand for the product at the demand market (at an instant in time) exceeds the amount available, the price of the financial product at that demand market will increase; if the amount available exceeds the demand at the price, then the price at the demand market will decrease. Furthermore, it is guaranteed that the prices do not become negative. Thus, the dynamics of the price $\rho_{3k}$ associated with the product at demand market k can be expressed as:

$$\dot{\rho}_{3k} = \begin{cases} d_k(\rho_3) - \sum_{j=1}^{n} y_{jk}, & \text{if } \rho_{3k} > 0 \\ \max\{0, \, d_k(\rho_3) - \sum_{j=1}^{n} y_{jk}\}, & \text{if } \rho_{3k} = 0. \end{cases} \tag{17.21}$$
17.4.2 The Dynamics of the Prices at the Intermediaries
The prices charged for the financial funds at the intermediaries, in turn, must reflect supply and demand conditions as well (and as shall be shown shortly also reflect profit-maximizing behavior on the part of the intermediaries who seek to determine how much of the financial flows they obtain from the different sources of funds). In particular, assume that the price associated with intermediary j, $\rho_{2j}$, and computed at node j lying in the second tier of nodes, evolves over time according to:

$$\dot{\rho}_{2j} = \begin{cases} \sum_{k=1}^{o} y_{jk} - \sum_{i=1}^{m} \sum_{l=1}^{L} x_{ijl}, & \text{if } \rho_{2j} > 0 \\ \max\{0, \, \sum_{k=1}^{o} y_{jk} - \sum_{i=1}^{m} \sum_{l=1}^{L} x_{ijl}\}, & \text{if } \rho_{2j} = 0, \end{cases} \tag{17.22}$$
where $\dot{\rho}_{2j}$ denotes the rate of change of the j-th intermediary’s price. Hence, if the amount of the financial funds desired to be transacted by the consumers (at an instant in time) exceeds that available at the intermediary, then the price charged at the intermediary will increase; if the amount available is greater than that desired by the consumers, then the price charged at the intermediary will decrease. As in the case of the demand market prices, it is guaranteed that the prices charged by the intermediaries remain nonnegative.
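The two price adjustment rules (17.21) and (17.22) share the same structure: the price moves with excess demand, except that a zero price cannot fall further. A minimal Python sketch of this rule follows. The explicit Euler step is an assumed discretization used here only to illustrate the continuous-time dynamics, and the function names are hypothetical.

```python
def projected_rate(price, excess_demand):
    """Right-hand side of (17.21) or (17.22) for a single price.

    For (17.21), excess_demand is d_k(rho_3) minus the amount supplied to
    market k; for (17.22), it is the outflow from intermediary j minus the inflow.
    """
    if price > 0.0:
        return excess_demand
    return max(0.0, excess_demand)  # at a zero price, only an increase is allowed


def euler_price_step(price, excess_demand, step):
    """One explicit Euler step (an assumed discretization), kept nonnegative."""
    return max(0.0, price + step * projected_rate(price, excess_demand))


print(projected_rate(1.0, -2.0))   # positive price, excess supply: price falls
print(projected_rate(0.0, -2.0))   # zero price, excess supply: price stays at zero
print(projected_rate(0.0, 3.0))    # zero price, excess demand: price rises
```

The outer `max` in `euler_price_step` is needed because a finite step from a small positive price could otherwise overshoot below zero, something that cannot happen in continuous time.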
17.4.3 Precursors to the Dynamics of the Financial Flows First some preliminaries are needed that will allow the development of the dynamics of the financial flows. In particular, the utility-maximizing behavior of the source agents and that of the intermediaries is now discussed. Assume that each such source agent’s and each intermediary agent’s utility can be defined as a function of the expected future portfolio value, where the expected value of the future portfolio is described by two characteristics: the expected mean value and the uncertainty surrounding the expected mean. Here, the expected mean portfolio value is assumed to be equal to the market value of the current portfolio. Each agent’s uncertainty, or assessment of risk, in turn, is based on a variance-covariance matrix denoting the agent’s assessment of the standard deviation of the prices for each instrument/product. The variance-covariance matrix associated with source agent i ’s assets is denoted by Q i and is of dimension nL × nL , and is associated with vector xi , whereas intermediary agent j ’s variance-covariance matrix is denoted by Q j , is of dimension o × o , and is associated with the vector y j .
17.4.4 Optimizing Behavior of the Source Agents
Denote the total transaction cost associated with source agent i transacting with intermediary j to obtain financial instrument l by cijl and assume that: cijl = cijl ( xijl ), ∀i, j, l .
(17.23)
The total transaction costs incurred by source agent i, thus, are equal to the sum of all the agent’s transaction costs. His revenue, in turn, is equal to the sum of the price (rate of return) that the agent can obtain for the financial instrument times the total quantity obtained/purchased of that instrument. Recall that $\rho_{1ijl}$ denotes the price associated with agent i/intermediary j/instrument l. Assume that each such source agent seeks to maximize net return while, simultaneously, minimizing the risk, with source agent i’s utility function denoted by $U^i$. Moreover, assume that the variance-covariance matrix $Q^i$ is positive semidefinite and that the transaction cost functions are continuously differentiable and convex. Hence, one can express the optimization problem facing source agent i as:

$$\text{Maximize} \quad U^i(x_i) = \sum_{j=1}^{n} \sum_{l=1}^{L} \rho_{1ijl} x_{ijl} - \sum_{j=1}^{n} \sum_{l=1}^{L} c_{ijl}(x_{ijl}) - x_i^T Q^i x_i, \tag{17.24}$$

subject to $x_{ijl} \ge 0$, for all j, l, and to the constraint:

$$\sum_{j=1}^{n} \sum_{l=1}^{L} x_{ijl} \le S^i, \tag{17.25}$$
that is, the allocations of source agent i’s funds among the financial instruments made available by the different intermediaries cannot exceed his holdings. Note that the utility function above is concave for each source agent i. A source agent may choose to not invest in any of the instruments. Indeed, as shall be illustrated through subsequent numerical examples, this constraint has important financial implications. Clearly, in the case of unconstrained utility maximization, the gradient of source agent i’s utility function with respect to the vector of variables $x_i$ and denoted by $\nabla_{x_i} U^i$, where $\nabla_{x_i} U^i = \left( \frac{\partial U^i}{\partial x_{i11}}, \ldots, \frac{\partial U^i}{\partial x_{inL}} \right)$, represents agent i’s idealized direction, with the jl-component of $\nabla_{x_i} U^i$ given by:

$$\left( \rho_{1ijl} - 2 Q_{z_{jl}}^i \cdot x_i - \frac{\partial c_{ijl}(x_{ijl})}{\partial x_{ijl}} \right), \tag{17.26}$$

where $Q_{z_{jl}}^i$ denotes the $z_{jl}$-th row of $Q^i$, and $z_{jl}$ is the indicator defined as: $z_{jl} = (l - 1)n + j$. We return later to describe how the constraints are explicitly incorporated into the dynamics.
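To make the indexing in (17.26) concrete, the following Python sketch computes the jl-component of the gradient using the indicator $z_{jl} = (l - 1)n + j$. It assumes that the entries of $x_i$ and of the first-tier prices are stored in the order given by $z_{jl}$. The function names, the quadratic transaction cost, and the numbers are hypothetical.

```python
def z_index(j, l, n):
    """Indicator z_jl = (l - 1) n + j, with j and l counted from 1."""
    return (l - 1) * n + j


def idealized_direction(rho1_i, Q_i, x_i, marginal_cost, j, l, n):
    """jl-component (17.26) of the gradient of U^i.

    rho1_i and x_i are lists ordered by z_jl; Q_i is an nL x nL list of
    lists; marginal_cost(j, l, x) returns the derivative of c_ijl at x.
    """
    z = z_index(j, l, n) - 1  # shift to Python's zero-based indexing
    risk_term = 2.0 * sum(q * x for q, x in zip(Q_i[z], x_i))
    return rho1_i[z] - risk_term - marginal_cost(j, l, x_i[z])


# Hypothetical data: n = 2 intermediaries, L = 1 instrument, Q^i = identity,
# and transaction costs c_ijl(x) = 0.5 x^2, so the marginal cost equals x.
Q_i = [[1.0, 0.0], [0.0, 1.0]]
x_i = [0.5, 0.25]
rho1_i = [1.0, 1.0]
d1 = idealized_direction(rho1_i, Q_i, x_i, lambda j, l, x: x, j=1, l=1, n=2)
d2 = idealized_direction(rho1_i, Q_i, x_i, lambda j, l, x: x, j=2, l=1, n=2)
print(d1, d2)
```

In this toy case the first component is negative and the second is positive, so, ignoring the constraints, the agent would wish to shift funds from the first intermediary toward the second.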
17.4.5 Optimizing Behavior of the Intermediaries
The intermediaries, in turn, are involved in transactions both with the source agents, as well as with the users of the funds, that is, with the ultimate consumers associated with the markets for the distinct types of loans/products at the bottom tier of the financial network. Thus, an intermediary conducts transactions both with the “source” agents as well as with the consumers at the demand markets. An intermediary j is faced with what is termed a handling/conversion cost, which may include, for example, the cost of converting the incoming financial flows into the financial loans/products associated with the demand markets. Denote this cost by $c_j$ and, in the simplest case, one would have that $c_j$ is a function of $\sum_{i=1}^{m} \sum_{l=1}^{L} x_{ijl}$, that is, the handling/conversion cost of an intermediary is a function of how much he has obtained from the various source agents. For the sake of generality, however, allow the function to depend also on the amounts held by other intermediaries and, therefore, one may write:

$$c_j = c_j(x), \quad \forall j. \tag{17.27}$$
The intermediaries also have associated transaction costs in regard to transacting with the source agents, which are assumed to be dependent on the type of instrument. Denote the transaction cost associated with intermediary j transacting with source agent i associated with instrument l by $\hat{c}_{ijl}$ and assume that it is of the form

$$\hat{c}_{ijl} = \hat{c}_{ijl}(x_{ijl}), \quad \forall i, j, l. \tag{17.28}$$
Recall that the intermediaries convert the incoming financial flows x into the outgoing financial flows y . Assume that an intermediary j incurs a transaction cost c jk associated with transacting with demand market k , where c jk = c jk ( y jk ), ∀j, k .
(17.29)
The intermediaries associate a price with the financial funds, which is denoted by $\rho_{2j}$, for intermediary j. Assuming that the intermediaries are also utility maximizers with the utility functions for each being comprised of net revenue maximization as well as risk minimization, then the utility maximization problem for intermediary agent j with his utility function denoted by $U^j$, can be expressed as:

$$\begin{aligned}
\text{Maximize} \quad U^j(x_j, y_j) = {} & \sum_{i=1}^{m} \sum_{l=1}^{L} \rho_{2j} x_{ijl} - c_j(x) - \sum_{i=1}^{m} \sum_{l=1}^{L} \hat{c}_{ijl}(x_{ijl}) - \sum_{k=1}^{o} c_{jk}(y_{jk}) \\
& - \sum_{i=1}^{m} \sum_{l=1}^{L} \rho_{1ijl} x_{ijl} - y_j^T Q^j y_j,
\end{aligned} \tag{17.30}$$
subject to the nonnegativity constraints: $x_{ijl} \ge 0$, and $y_{jk} \ge 0$, for all i, l, and k. Here, for convenience, we have let $x_j = (x_{1j1}, \ldots, x_{mjL})$. The above objective function expresses that the difference between the revenues minus the handling cost and the transaction costs and the payout to the source agents should be maximized, whereas the risk should be minimized. Assume now that the variance-covariance matrix $Q^j$ is positive semidefinite and that the transaction cost functions are continuously differentiable and convex. Hence, the utility function above is concave for each intermediary j. The gradient $\nabla_{x_j} U^j = \left( \frac{\partial U^j}{\partial x_{1j1}}, \ldots, \frac{\partial U^j}{\partial x_{mjL}} \right)$ represents agent j’s idealized direction in terms of $x_j$, ignoring the constraints, for the time being, whereas the gradient $\nabla_{y_j} U^j = \left( \frac{\partial U^j}{\partial y_{j1}}, \ldots, \frac{\partial U^j}{\partial y_{jo}} \right)$ represents his idealized direction in terms of $y_j$. Note that the il-th component of $\nabla_{x_j} U^j$ is given by:

$$\left( \rho_{2j} - \rho_{1ijl} - \frac{\partial c_j(x)}{\partial x_{ijl}} - \frac{\partial \hat{c}_{ijl}(x_{ijl})}{\partial x_{ijl}} \right), \tag{17.31}$$

whereas the jk-th component of $\nabla_{y_j} U^j$ is given by:

$$\left( - \frac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} - 2 Q_k^j \cdot y_j \right). \tag{17.32}$$
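However, since both source agent i and intermediary j must agree in terms of the $x_{ijl}$s, the direction (17.26) must coincide with that in (17.31), so adding both gives us a “combined force,” in which the first-tier price $\rho_{1ijl}$ cancels and which, after algebraic simplification, yields:

$$\left( \rho_{2j} - 2 Q_{z_{jl}}^i \cdot x_i - \frac{\partial c_{ijl}(x_{ijl})}{\partial x_{ijl}} - \frac{\partial c_j(x)}{\partial x_{ijl}} - \frac{\partial \hat{c}_{ijl}(x_{ijl})}{\partial x_{ijl}} \right). \tag{17.33}$$

17.4.6 The Dynamics of the Financial Flows Between the Source Agents and the Intermediaries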
We are now ready to express the dynamics of the financial flows between the source agents and the intermediaries. In particular, define the feasible set $K_i \equiv \{x_i \mid x_{ijl} \ge 0, \forall i, j, l, \text{ and (17.25) holds}\}$. Let also K be the Cartesian product given by $K \equiv \prod_{i=1}^{m} K_i$ and define $F_{ijl}^1$ as minus the term in (17.33) with $F_i^1 = (F_{i11}^1, \ldots, F_{inL}^1)$. Then the best realizable direction for the vector of financial instruments $x_i$ can be mathematically expressed as:

$$\dot{x}_i = \Pi_{K_i}(x_i, -F_i^1), \tag{17.34}$$
where $\Pi_K(Z, v)$ is defined as:

$$\Pi_K(Z, v) = \lim_{\delta \to 0} \frac{P_K(Z + \delta v) - Z}{\delta}, \tag{17.35}$$

and $P_K$ is the norm projection defined by

$$P_K(Z) = \operatorname{argmin}_{Z' \in K} \| Z' - Z \|. \tag{17.36}$$

17.4.7 The Dynamics of the Financial Flows Between the Intermediaries and the Demand Markets
In terms of the financial flows between the intermediaries and the demand markets, both the intermediaries and the consumers must be in agreement as to the financial flows y. The consumers take into account in making their consumption decisions not only the price charged for the financial product by the intermediaries but also their transaction costs associated with obtaining the product. Let $\hat{c}_{jk}$ denote the transaction cost associated with obtaining the product at demand market k from intermediary j. Assume that this unit transaction cost is continuous and of the general form:

$$\hat{c}_{jk} = \hat{c}_{jk}(y), \quad \forall j, k. \tag{17.37}$$

The consumers take the price charged by the intermediaries, which was denoted by $\rho_{2j}$ for intermediary j, plus the unit transaction cost, in making their consumption decisions. From the perspective of the consumers at the demand markets, one can expect that an idealized direction in terms of the evolution of the financial flow of a product between an intermediary/demand market pair would be:

$$(\rho_{3k} - \hat{c}_{jk}(y) - \rho_{2j}). \tag{17.38}$$
On the other hand, as already derived above, one can expect that the intermediaries would adjust the volume of the product to a demand market according to (17.32). Combining now (17.32) and (17.38), and guaranteeing that the financial products do not assume negative quantities, yields the following dynamics:

$$\dot{y}_{jk} = \begin{cases} \rho_{3k} - \hat{c}_{jk}(y) - \rho_{2j} - \frac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} - 2 Q_k^j \cdot y_j, & \text{if } y_{jk} > 0 \\ \max\{0, \, \rho_{3k} - \hat{c}_{jk}(y) - \rho_{2j} - \frac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} - 2 Q_k^j \cdot y_j\}, & \text{if } y_{jk} = 0. \end{cases} \tag{17.39}$$
17.4.8 The Projected Dynamical System
Consider now the dynamic model in which the demand prices evolve according to (17.21) for all demand markets k, the prices at the intermediaries evolve according to (17.22) for all intermediaries j; the financial flows between the source agents and the intermediaries evolve according to (17.34) for all source agents i, and the financial products between the intermediaries and the demand markets evolve according to (17.39) for all intermediary/demand market pairs j, k. Let now Z denote the aggregate column vector $(x, y, \rho_2, \rho_3)$ in the feasible set $\mathcal{K} \equiv K \times R_+^{no+n+o}$. Define the column vector $F(Z) \equiv (F^1, F^2, F^3, F^4)$, where $F^1$ is as has been defined previously; $F^2 = (F_{11}^2, \ldots, F_{no}^2)$, with component

$$F_{jk}^2 \equiv \left( 2 Q_k^j \cdot y_j + \frac{\partial c_{jk}(y_{jk})}{\partial y_{jk}} + \hat{c}_{jk}(y) + \rho_{2j} - \rho_{3k} \right), \quad \forall j, k;$$

$F^3 = (F_1^3, \ldots, F_n^3)$, where $F_j^3 \equiv \left( \sum_{i=1}^{m} \sum_{l=1}^{L} x_{ijl} - \sum_{k=1}^{o} y_{jk} \right)$, and $F^4 = (F_1^4, \ldots, F_o^4)$, with $F_k^4 \equiv \left( \sum_{j=1}^{n} y_{jk} - d_k(\rho_3) \right)$.
Then the dynamic model described by (17.21), (17.22), (17.34), and (17.39) for all k, j, i, l can be rewritten as the projected dynamical system defined by the following initial value problem:

$$\dot{Z} = \Pi_{\mathcal{K}}(Z, -F(Z)), \quad Z(0) = Z_0, \tag{17.40}$$
where $\Pi_{\mathcal{K}}$ is the projection operator of $-F(Z)$ onto $\mathcal{K}$ at Z and $Z_0 = (x^0, y^0, \rho_2^0, \rho_3^0)$ is the initial point corresponding to the initial financial flows and the initial prices. The trajectory of (17.40) describes the dynamic evolution of and the dynamic interactions among the prices and the financial flows. The dynamical system (17.40) is non-classical in that the right-hand side is discontinuous in order to guarantee that the constraints are satisfied, which, in the context of the above model, are not only nonnegativity constraints on the variables, but also a form of budget constraints. Here this methodology is applied to study financial systems in the presence of intermediation. A variety of dynamic financial models, but without intermediation, formulated as projected dynamical systems can be found in the book by Nagurney and Siokos (1997).
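The effect of the projection operator in (17.35) and (17.36) can be seen in a small numerical sketch. The Python code below approximates the limit in (17.35) by a finite difference with a small δ and assumes, for simplicity, that the feasible set is the nonnegative orthant; the budget constraints (17.25) of the actual model are not represented. The function names and data are hypothetical.

```python
def project_nonneg(z):
    """Norm projection (17.36) when the feasible set is the nonnegative orthant."""
    return [max(0.0, v) for v in z]


def pds_direction(z, v, delta=1e-6):
    """Finite-difference approximation of (17.35) for K = R^N_+."""
    shifted = project_nonneg([zi + delta * vi for zi, vi in zip(z, v)])
    return [(s - zi) / delta for s, zi in zip(shifted, z)]


# At the boundary point (0, 1) the direction (-1, -1) points out of the
# feasible set in its first component only.
direction = pds_direction([0.0, 1.0], [-1.0, -1.0])
print(direction)
```

The first component of the result is zero, since that variable is already at its bound and would otherwise become negative, while the second component is approximately -1 and is left unchanged. This is the source of the discontinuity in the right-hand side of (17.40).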
17.4.9 A Stationary/Equilibrium Point
The stationary point of the projected dynamical system (17.40) is now discussed. Recall that a stationary point $Z^*$ is a point that satisfies

$$\dot{Z} = 0 = \Pi_{\mathcal{K}}(Z^*, -F(Z^*)),$$

and, hence, in the context of the dynamic financial model with intermediation, a point at which there is no change in the financial flows and no change in the prices. Moreover, as established in (Dupuis and Nagurney 1993), since the feasible set $\mathcal{K}$ is
a convex polyhedron, the set of stationary points of the projected dynamical system of the form given in (17.40) coincides with the set of solutions to the variational inequality problem given by: determine $Z^* \in \mathcal{K}$, such that

$$\langle F(Z^*)^T, Z - Z^* \rangle \geq 0, \quad \forall Z \in \mathcal{K}, \qquad (17.41)$$

where in the model $F(Z)$ and $Z$ are as defined above, and recall that $\langle \cdot, \cdot \rangle$ denotes the inner product in $N$-dimensional Euclidean space, where here $N = mnL + no + n + o$.
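As a small numerical illustration of (17.41), the sketch below evaluates the inner product $\langle F(Z^*)^T, Z - Z^* \rangle$ over a grid of feasible points. The affine map `toy_F` and the choice of a two-dimensional nonnegative orthant as the feasible set are assumptions made only for this example; they are not the $F$ and $\mathcal{K}$ of the financial model. A grid check of this kind is a necessary test on the sampled points, not a proof.

```python
from itertools import product


def toy_F(z):
    """Hypothetical affine map F(Z) = M Z + q with M positive definite."""
    return [2.0 * z[0] + z[1] - 4.0, z[0] + 2.0 * z[1] + 1.0]


def min_vi_value(F, z_star, grid):
    """Smallest value of <F(z_star), z - z_star> over the sampled points z."""
    f_star = F(z_star)
    return min(
        sum(fb * (zb - sb) for fb, zb, sb in zip(f_star, z, z_star))
        for z in grid
    )


# Sample points of the nonnegative orthant: 0, 0.5, ..., 5 in each coordinate.
grid = list(product([0.5 * t for t in range(11)], repeat=2))

print(min_vi_value(toy_F, [2.0, 0.0], grid) >= 0.0)  # this candidate passes on the grid
print(min_vi_value(toy_F, [1.0, 0.0], grid) >= 0.0)  # this candidate fails
```

For the toy map, $F(2, 0) = (0, 3)$, so the inner product reduces to $3Z_2$, which is nonnegative on the orthant; at $(1, 0)$ the inequality is violated, for instance by $Z = (5, 0)$.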
17.4.10 Variational Inequality Formulation of Financial Equilibrium with Intermediation

In particular, variational inequality (17.41) here takes the form: determine $(x^*, y^*, \rho_2^*, \rho_3^*) \in \mathcal{K}$, satisfying:

$$\begin{aligned}
&\sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{l=1}^{L} F^1_{ijl}(Z^*) \times \left[ x_{ijl} - x^*_{ijl} \right] \\
&+ \sum_{j=1}^{n} \sum_{k=1}^{o} \left[ 2Q^{j}_{k} \cdot y^*_j + \frac{\partial c_{jk}(y^*_{jk})}{\partial y_{jk}} + \hat{c}_{jk}(y^*) + \rho^*_{2j} - \rho^*_{3k} \right] \times \left[ y_{jk} - y^*_{jk} \right] \\
&+ \sum_{j=1}^{n} \left[ \sum_{i=1}^{m} \sum_{l=1}^{L} x^*_{ijl} - \sum_{k=1}^{o} y^*_{jk} \right] \times \left[ \rho_{2j} - \rho^*_{2j} \right] \\
&+ \sum_{k=1}^{o} \left[ \sum_{j=1}^{n} y^*_{jk} - d_k(\rho_3^*) \right] \times \left[ \rho_{3k} - \rho^*_{3k} \right] \geq 0, \quad \forall (x, y, \rho_2, \rho_3) \in \mathcal{K},
\end{aligned} \qquad (17.42)$$

where $F^1_{ijl}(Z^*)$ denotes the component of $F^1$, as defined previously, that corresponds to $x_{ijl}$, evaluated at $Z^*$; $\mathcal{K} \equiv \{K \times R_+^{no+n+o}\}$; and $Q^{i}_{z_{jl}}$ is as was defined following (17.26). In (Nagurney and Ke 2001) a variational inequality of the form (17.42) was derived in a manner distinct from that given above for a static financial network model with intermediation, but with a slightly different feasible set where it was assumed that the constraints (17.25) had to be tight, that is, to hold as an equality. (Nagurney and Ke 2003), in turn, demonstrated how electronic transactions could be introduced into financial networks with intermediation by adding additional links to the network in Figure 17.4 and by including additional transaction costs and prices and expanding the objective functions of the decision-makers accordingly. We discuss electronic financial transactions subsequently, when we describe the financial engineering of integrated social and financial networks with intermediation.
17.4.11 The Discrete-Time Algorithm (Adjustment Process)

The projected dynamical system (17.40) is a continuous-time adjustment process. However, in order to further fix ideas and to provide a means of “tracking” the trajectory of (17.40), we present a discrete-time adjustment process, in the form of the Euler method, which is induced by the general iterative scheme of (Dupuis and Nagurney 1993). The statement of the Euler method is as follows:

Step 0: Initialization
Start with a $Z^0 \in \mathcal{K}$. Set $\tau := 1$.

Step 1: Computation
Compute $Z^\tau$ by solving the variational inequality problem:

$$Z^\tau = P_{\mathcal{K}}(Z^{\tau-1} - \alpha_\tau F(Z^{\tau-1})), \qquad (17.43)$$

where $\{\alpha_\tau; \tau = 1, 2, \ldots\}$ is a sequence of positive scalars such that $\sum_{\tau=1}^{\infty} \alpha_\tau = \infty$, $\alpha_\tau \to 0$, as $\tau \to \infty$ (which is required for convergence).

Step 2: Convergence Verification
If $|Z_b^{\tau} - Z_b^{\tau-1}| \leq \epsilon$ for every component $b$ of $Z$, for some $\epsilon > 0$, a prespecified tolerance, then stop; otherwise, set $\tau := \tau + 1$, and go to Step 1.

The statement of this method in the context of the dynamic financial model takes the form:
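17.4.12 The Euler Method

Step 0: Initialization Step
Set $(x^0, y^0, \rho_2^0, \rho_3^0) \in \mathcal{K}$. Let $\tau = 1$, where $\tau$ is the iteration counter, and set the sequence $\{\alpha_\tau\}$ so that $\sum_{\tau=1}^{\infty} \alpha_\tau = \infty$, $\alpha_\tau > 0$, $\alpha_\tau \to 0$, as $\tau \to \infty$.

Step 1: Computation Step
Compute $Z^\tau = (x^\tau, y^\tau, \rho_2^\tau, \rho_3^\tau) \in \mathcal{K}$ by solving the variational inequality subproblem, which in the compact notation of (17.43) reads:

$$\langle (Z^\tau + \alpha_\tau F(Z^{\tau-1}) - Z^{\tau-1})^T, Z - Z^\tau \rangle \geq 0, \quad \forall Z \in \mathcal{K}.$$

Step 2: Convergence Verification
If $|x^\tau_{ijl} - x^{\tau-1}_{ijl}| \leq \epsilon$, $|y^\tau_{jk} - y^{\tau-1}_{jk}| \leq \epsilon$, $|\rho^\tau_{2j} - \rho^{\tau-1}_{2j}| \leq \epsilon$, and $|\rho^\tau_{3k} - \rho^{\tau-1}_{3k}| \leq \epsilon$, for all $i = 1, \ldots, m$; $j = 1, \ldots, n$; $l = 1, \ldots, L$; $k = 1, \ldots, o$, with $\epsilon > 0$, a pre-specified tolerance, then stop; otherwise, set $\tau := \tau + 1$, and go to Step 1.

The variational inequality subproblem encountered in the computation step at each iteration of the Euler method can be solved explicitly and in closed form since it is actually a quadratic programming problem and the feasible set is a Cartesian product consisting of the product of $K$, which has a simple network structure, and the nonnegative orthants, $R_+^{no}$, $R_+^{n}$, and $R_+^{o}$, corresponding to the variables $x$, $y$, $\rho_2$, and $\rho_3$, respectively.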
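The iteration (17.43) can be sketched in a few lines of Python. The sketch below assumes that the feasible set is a nonnegative orthant, so that the projection reduces to a componentwise maximum with zero, and it uses a hypothetical two-variable affine map `toy_F` in place of the model's $F$. It does not treat the network-structured set $K$, whose subproblems require a separate network algorithm. The step sizes follow the pattern $0.1\{1, \frac{1}{2}, \frac{1}{2}, \frac{1}{3}, \frac{1}{3}, \frac{1}{3}, \ldots\}$, which satisfies the conditions in Step 0.

```python
from itertools import count


def step_sizes(scale=0.1):
    """Yield scale * {1, 1/2, 1/2, 1/3, 1/3, 1/3, ...}: positive, -> 0, sum = inf."""
    for k in count(1):
        for _ in range(k):
            yield scale / k


def project_orthant(z):
    """Projection onto the nonnegative orthant (assumed feasible set)."""
    return [max(0.0, zb) for zb in z]


def euler_method(F, z0, eps=1e-4, max_iter=100_000):
    """Iterate Z^tau = P_K(Z^{tau-1} - alpha_tau F(Z^{tau-1})) until the
    largest componentwise change is at most eps (Step 2)."""
    z = list(z0)
    alphas = step_sizes()
    for tau in range(1, max_iter + 1):
        alpha = next(alphas)
        f = F(z)
        z_next = project_orthant([zb - alpha * fb for zb, fb in zip(z, f)])
        if max(abs(a - b) for a, b in zip(z_next, z)) <= eps:
            return z_next, tau
        z = z_next
    raise RuntimeError("no convergence within max_iter iterations")


def toy_F(z):
    """Hypothetical affine map; its stationary point on the orthant is (2, 0)."""
    return [2.0 * z[0] + z[1] - 4.0, z[0] + 2.0 * z[1] + 1.0]


z_final, iterations = euler_method(toy_F, [0.0, 0.0])
print(z_final, iterations)
```

Because the step sizes shrink, the stopping rule in Step 2 can be met while the iterate is still a small distance from the stationary point; a tighter `eps` trades more iterations for a closer approximation.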
Figure 17.5. The financial network structure of the numerical examples
17.4.13 Numerical Examples

In this section, the Euler method is applied to two numerical examples. The algorithm was implemented in FORTRAN. For the solution of the induced network subproblems in $x$, we utilized the exact equilibration algorithm, which fully exploits the simplicity of the special network structure of the subproblems. The convergence criterion used was that the absolute value of the difference in the flows and prices between two successive iterations was no more than $10^{-4}$. For the examples, the sequence was $\{\alpha_\tau\} = 0.1\{1, \frac{1}{2}, \frac{1}{2}, \frac{1}{3}, \frac{1}{3}, \frac{1}{3}, \ldots\}$, which is of the form given in the initialization step of the algorithm in the preceding section. The numerical examples had the network structure depicted in Figure 17.5 and consisted of two source agents, two intermediaries, and two demand markets, with a single financial instrument handled by each intermediary.

The algorithm was initialized as follows: since there was a single financial instrument associated with each of the intermediaries, we set $x_{ij1} = S^i / n$ for each source agent $i$. All the other variables, that is, the initial vectors $y$, $\rho_2$, and $\rho_3$, were set to zero. Additional details are given in (Nagurney and Dong 2002).

17.4.13.1 Example 2
The data for this example were constructed for easy interpretation purposes. The supplies of the two source agents were: $S^1 = 10$ and $S^2 = 10$. The variance-covariance matrices $Q^i$ and $Q^j$ were equal to the identity matrices for all source agents $i$ and all intermediaries $j$. The transaction cost functions faced by the source agents associated with transacting with the intermediaries were given by:

$$c_{111}(x_{111}) = 0.5 x_{111}^2 + 3.5 x_{111}, \quad c_{121}(x_{121}) = 0.5 x_{121}^2 + 3.5 x_{121},$$
$$c_{211}(x_{211}) = 0.5 x_{211}^2 + 3.5 x_{211}, \quad c_{221}(x_{221}) = 0.5 x_{221}^2 + 3.5 x_{221}.$$
The handling costs of the intermediaries, in turn, were given by: 2
The demand functions at the demand markets were:

$$d_1(\rho_3) = -2\rho_{31} - 1.5\rho_{32} + 1000, \quad d_2(\rho_3) = -2\rho_{32} - 1.5\rho_{31} + 1000,$$
and the transaction costs between the intermediaries and the consumers at the demand markets were given by:

$$\hat{c}_{11}(y) = y_{11} + 5, \quad \hat{c}_{12}(y) = y_{12} + 5, \quad \hat{c}_{21}(y) = y_{21} + 5, \quad \hat{c}_{22}(y) = y_{22} + 5.$$
It was assumed for this and the subsequent example that the transaction costs as perceived by the intermediaries and associated with transacting with the demand markets were all zero, that is, $c_{jk}(y_{jk}) = 0$, for all $j, k$. The Euler method converged and yielded the following equilibrium pattern:

$$x^*_{111} = x^*_{121} = x^*_{211} = x^*_{221} = 5.000, \quad y^*_{11} = y^*_{12} = y^*_{21} = y^*_{22} = 5.000.$$

The vector $\rho_2^*$ had components: $\rho^*_{21} = \rho^*_{22} = 262.6664$, and the computed demand prices at the demand markets were: $\rho^*_{31} = \rho^*_{32} = 282.8106$. The optimality/equilibrium conditions were satisfied with good accuracy. Note that in this example, the budget constraint was tight for both source agents, that is, $s_1^* = s_2^* = 0$, where $s_i^* = S^i - \sum_{j=1}^{n} \sum_{l=1}^{L} x^*_{ijl}$, and, hence, there was zero flow on the links connecting node 3 with top tier nodes 1 and 2. Thus, it was optimal for both source agents to invest their entire financial holdings in the instruments made available by the two intermediaries.

17.4.13.2 Example 3
The following variant of Example 2 was then constructed to create Example 3. The data were identical to those in Example 2 except that the supply for each source agent was increased so that $S^1 = S^2 = 50$. The Euler method converged and yielded the following new equilibrium pattern:

$$x^*_{111} = x^*_{121} = x^*_{211} = x^*_{221} = 23.6832, \quad y^*_{11} = y^*_{12} = y^*_{21} = y^*_{22} = 23.7247.$$

The vector $\rho_2^*$ had components: $\rho^*_{21} = \rho^*_{22} = 196.0174$, and the demand prices at the demand markets were: $\rho^*_{31} = \rho^*_{32} = 272.1509$. It is easy to verify that the optimality/equilibrium conditions, again, were satisfied with good accuracy. Note, however, that unlike the solution for Example 2, neither source agent 1 nor source agent 2 invested its entire financial holdings. Indeed, each opted to not invest the amount $s_i^* = 50 - 2(23.6832) = 2.6336$, and this was the volume of flow on each of the two links ending in node 3 in Figure 17.5. Since the supply of financial funds increased, the price for the instruments charged by the intermediaries decreased from 262.6664 to 196.0174. The demand prices at the demand markets also decreased, from 282.8106 to 272.1509.
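The reported equilibria can be checked with simple arithmetic. The sketch below, whose helper names are hypothetical, computes the uninvested amount $s_i^* = S^i - \sum_{j} \sum_{l} x^*_{ijl}$ of a source agent and compares the total flow into a demand market with the demand $d_k(\rho_3^*)$ at the reported prices. Small discrepancies are to be expected, since the algorithm stops at a tolerance rather than at the exact equilibrium.

```python
def unallocated_funds(supply, flows):
    """s_i = S^i minus the total amount the source agent invests."""
    return supply - sum(flows)


def demand(rho_own, rho_other):
    """Demand function of the examples: d_k = -2*rho_3k - 1.5*rho_3k' + 1000."""
    return -2.0 * rho_own - 1.5 * rho_other + 1000.0


# Reported values: supply S^i, flow x* on each agent link, flow y* on each
# intermediary/market link, and demand price rho_3* (equal at both markets).
examples = {
    "Example 2": {"S": 10.0, "x": 5.000, "y": 5.000, "rho3": 282.8106},
    "Example 3": {"S": 50.0, "x": 23.6832, "y": 23.7247, "rho3": 272.1509},
}

for name, e in examples.items():
    slack = unallocated_funds(e["S"], [e["x"], e["x"]])   # two intermediaries
    inflow = 2 * e["y"]                                   # two intermediaries
    gap = inflow - demand(e["rho3"], e["rho3"])
    print(f"{name}: uninvested = {slack:.4f}, market-clearing gap = {gap:.4f}")
```

For Example 3 the uninvested amount evaluates to $50 - 2(23.6832) = 2.6336$ per source agent, while for Example 2 it is zero, in line with the tight budget constraint noted there.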
17.5 The Integration of Social Networks with Financial Networks

As noted by (Nagurney et al. 2004), globalization and technological advances have made major impacts on financial services in recent years and have allowed for the emergence of electronic finance. The financial landscape has been transformed through increased financial integration, increased cross-border mergers, and lower barriers between markets. Moreover, as noted by several authors, boundaries between different financial intermediaries have become less clear (cf. Claessens and Jansen 2000, Claessens et al. 2003, G-10 2001). For example, during the period 1980–1990, global capital transactions tripled, with telecommunication networks and financial instrument innovation being two of the empirically identified major causes of globalization with regard to international financial markets (Kim 1999). The growing importance of networks in financial services and their effects on competition have also been addressed by (Claessens et al. 2003). (Kim 1999) has argued for the necessity of integrating various theories, including portfolio theory with risk management, and flow theory in order to capture the underlying complexity of the financial flows over space and time.

At the same time that globalization and technological advances have transformed financial services, researchers have identified the importance of social networks in a plethora of financial transactions (cf. Nagurney et al. 2004 and the references therein), notably, in the context of personal relationships. The relevance of social networks within an international financial context needs to be examined both theoretically and empirically. It is clear that the existence of appropriate social networks can affect not only the risk associated with financial transactions but also transaction costs.
Given the prevalence of networks in the discussions of globalization and international financial flows, it seems natural that any theory for the illumination of the behavior of the decision-makers involved in this context, as well as the impacts of their decisions on the financial product flows, prices, appreciation rates, etc., should be network-based. Recently, (Nagurney et al. 2004) adopted a network perspective for the theoretical modeling, analysis, and computation of solutions to international financial networks with intermediation in which they explicitly integrated the social network component. They also captured electronic transactions within the framework since that aspect is critical in the modeling of international financial flows today. Here, that model is highlighted. This model generalizes the model of (Nagurney and Cruz 2003) to explicitly include social networks. For further background on the integration of social networks and financial networks, see (Nagurney et al. 2006).

As in the model of (Nagurney and Cruz 2003), the model consists of L countries, with a typical country denoted by $l$ or $\hat{l}$; I “source” agents in each country with sources of funds, with a typical source agent denoted by i; and J financial intermediaries, with a typical financial intermediary denoted by j. As noted earlier, examples of source agents are households and businesses, whereas examples of financial intermediaries include banks, insurance companies, investment companies, and brokers, where now we include electronic brokers, etc. Intermediaries in the framework need not be country-specific but, rather, may be virtual. Assume that each source agent can transact directly electronically with the consumers through the Internet and can also conduct his financial transactions with the intermediaries either physically or electronically in different currencies. There are H currencies in the international economy, with a typical currency being denoted by h. Also, assume that there are K financial products, which can be in distinct currencies and in different countries, with a typical financial product (and associated with a demand market) being denoted by k. Hence, the financial intermediaries in the model, in addition to transacting with the source agents, also determine how to allocate the incoming financial resources among distinct uses, which are represented by the demand markets, with a demand market corresponding to, for example, the market for real estate loans, household loans, or business loans, etc., which, as mentioned, can be associated with a distinct country and a distinct currency combination. Let m refer to a mode of transaction, with m = 1 denoting a physical transaction and m = 2 denoting an electronic transaction via the Internet.

The depiction of the supernetwork (see also, e.g., Nagurney and Dong 2002) is given in Figure 17.6. As this figure illustrates, the supernetwork is comprised of the social network, which is the bottom level network, and the international financial network, which is the top level network. Internet links to denote the possibility of electronic financial transactions are denoted in the figure by dotted arcs. In
Figure 17.6. The multilevel supernetwork structure of the integrated international financial network/social network system
addition, dotted arcs/links are used to depict the integration of the two networks into a supernetwork. The supernetwork in Figure 17.6 consists of a social network and an international financial network with intermediation. Both networks consist of three tiers of decision-makers. The top tier of nodes consists of the agents in the different countries with sources of funds, with agent i in country l being referred to as agent il and associated with node il. There are IL top-tiered nodes in the network. The middle tier of nodes in each of the two networks consists of the intermediaries (which need not be country-specific), with a typical intermediary j associated with node j in this (second) tier of nodes in the networks. The bottom tier of nodes in both the social network and in the financial network consists of the demand markets, with a typical demand market for product k in currency h and country $\hat{l}$ associated with node $kh\hat{l}$. There are, as depicted in Figure 17.6, J middle (or second) tiered nodes corresponding to the intermediaries and KHL bottom (or third) tiered nodes in the international financial network. In addition, we add a node J + 1 to the middle tier of nodes in the financial network only in order to represent the possible non-investment (of a portion or all of the funds) by one or more of the source agents, as also done in the model in the previous section.

The network in Figure 17.6 includes classical physical links as well as Internet links to allow for electronic financial transactions. Electronic transactions are possible between the source agents and the intermediaries, the source agents and the demand markets, as well as the intermediaries and the demand markets. Physical transactions can occur between the source agents and the intermediaries and between the intermediaries and the demand markets.

(Nagurney et al. 2004) describe the behavior of the decision-makers in the model, and allow for multicriteria decision-making, which consists of profit maximization, risk minimization (with general risk functions), as well as the maximization of the value of relationships. Each decision-maker is allowed to weight the criteria individually. The dynamics of the interactions are discussed and the projected dynamical system derived. The Euler method is then used to track the dynamic trajectories of the financial flows (transacted either physically or electronically), the prices, as well as the relationship levels until the equilibrium state is reached.

Fascinatingly, it has recently been shown by (Liu and Nagurney 2005) that financial network problems with intermediation can be reformulated as transportation network equilibrium problems and that electric power networks can be as well (cf. Nagurney et al. 2005). Hence, we now can answer Copeland’s question in that money and electricity are alike in that they both flow on networks, which are, in turn, transformable into transportation networks.
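The tier structure of the international financial network described above can be sketched as a small data structure. The Python sketch below is a simplification under stated assumptions: it creates one link per node pair and transaction mode and ignores the currency dimension on the links between source agents and intermediaries; the tuple layout and function names are hypothetical and are not taken from (Nagurney et al. 2004).

```python
from itertools import product

PHYSICAL, ELECTRONIC = 1, 2  # transaction modes m = 1 and m = 2


def financial_network_nodes(I, L, J, K, H):
    """Return the three tiers of nodes of the financial network."""
    top = [("agent", i, l) for i, l in product(range(1, I + 1), range(1, L + 1))]
    middle = [("intermediary", j) for j in range(1, J + 1)]
    middle.append(("non_investment", J + 1))  # extra node J + 1
    bottom = [
        ("market", k, h, l)
        for k, h, l in product(range(1, K + 1), range(1, H + 1), range(1, L + 1))
    ]
    return top, middle, bottom


def financial_network_links(top, middle, bottom):
    """Enumerate links as (origin, destination, mode) triples."""
    intermediaries = [n for n in middle if n[0] == "intermediary"]
    non_investment = [n for n in middle if n[0] == "non_investment"]
    links = []
    for agent in top:
        for j in intermediaries:  # physical and electronic transactions
            links += [(agent, j, PHYSICAL), (agent, j, ELECTRONIC)]
        for market in bottom:     # direct transactions are electronic only
            links.append((agent, market, ELECTRONIC))
        for node in non_investment:
            links.append((agent, node, None))  # funds left uninvested
    for j in intermediaries:
        for market in bottom:
            links += [(j, market, PHYSICAL), (j, market, ELECTRONIC)]
    return links


top, middle, bottom = financial_network_nodes(I=2, L=2, J=3, K=2, H=2)
links = financial_network_links(top, middle, bottom)
print(len(top), len(middle), len(bottom), len(links))
```

With these hypothetical sizes the tiers contain IL = 4, J + 1 = 4, and KHL = 8 nodes, matching the node counts stated in the text.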
Acknowledgments

The writing of this chapter was supported, in part, by NSF Grant No. IIS 0002647 under the MKIDS Program. This support is gratefully acknowledged. Also, the
writing of this chapter was done, in part, while the author was a Science Fellow at the Radcliffe Institute for Advanced Study at Harvard University. This support is also gratefully acknowledged. Finally, the author thanks the anonymous reviewers and editors for helpful comments and suggestions. This chapter is a condensed and adapted version of (Nagurney 2005).
References

Ahuja RK, Magnanti TL, Orlin JB (1993) Network Flows, Prentice Hall, Upper Saddle River, New Jersey
Arrow KJ (1951) “An Extension of the Basic Theorems of Classical Welfare Economics,” Econometrica: 1305−1323
Barr RS (1972) “The Multinational Cash Management Problem: A Generalized Network Approach,” Working Paper, University of Texas, Austin, Texas
Board of Governors (1980) Introduction to Flow of Funds, Flow of Funds Section, Division of Research and Statistics, Federal Reserve System, Washington, DC, June
Boginski V, Butenko S, Pardalos PM (2003) “On Structural Properties of the Market Graph,” in Innovations in Financial and Economic Networks, A Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 29−42
Charnes A, Cooper WW (1958) “Nonlinear Network Flows and Convex Programming over Incidence Matrices,” Naval Research Logistics Quarterly 5: 231−240
Charnes A, Cooper WW (1961) Management Models and Industrial Applications of Linear Programming, John Wiley & Sons, New York
Charnes A, Cooper WW (1967) “Some Network Characterizations for Mathematical Programming and Accounting Approaches to Planning and Control,” The Accounting Review 42: 24−52
Charnes A, Miller M (1957) “Programming and Financial Budgeting,” Symposium on Techniques of Industrial Operations Research, Chicago, Illinois, June
Christofides N, Hewins RD, Salkin GR (1979) “Graph Theoretic Approaches to Foreign Exchange Operations,” Journal of Financial and Quantitative Analysis 14: 481−500
Claessens S, Jansen M, Editors (2000) Internationalization of Financial Services, Kluwer Academic Publishers, Boston, Massachusetts
Claessens S, Dobos G, Klingebiel D, Laeven L (2003) “The Growing Importance of Networks in Finance and Their Effects on Competition,” in Innovations in Financial and Economic Networks, A Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 110−135
Cohen J (1987) The Flow of Funds in Theory and Practice, Financial and Monetary Studies 15, Kluwer Academic Publishers, Dordrecht, The Netherlands
Copeland MA (1952) A Study of Moneyflows in the United States, National Bureau of Economic Research, New York
Cournot AA (1838) Researches into the Mathematical Principles of the Theory of Wealth, English Translation, Macmillan, London, England, 1897
Crum RL (1976) “Cash Management in the Multinational Firm: A Constrained Generalized Network Approach,” Working Paper, University of Florida, Gainesville, Florida
Crum RL, Klingman DD, Tavis LA (1979) “Implementation of Large-Scale Financial Planning Models: Solution Efficient Transformations,” Journal of Financial and Quantitative Analysis 14: 137−152
Crum RL, Klingman DD, Tavis LA (1983) “An Operational Approach to Integrated Working Capital Planning,” Journal of Economics and Business 35: 343−378
Crum RL, Nye DJ (1981) “A Network Model of Insurance Company Cash Flow Management,” Mathematical Programming Study 15: 86−101
Dafermos SC, Sparrow FT (1969) “The Traffic Assignment Problem for a General Network,” Journal of Research of the National Bureau of Standards 73B: 91−118
Dantzig GB, Madansky A (1961) “On the Solution of Two-Stage Linear Programs under Uncertainty,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1, University of California Press, Berkeley, California, pp. 165−176
Debreu G (1951) “The Coefficient of Resource Utilization,” Econometrica 19: 273−292
Dong J, Nagurney A (2001) “Bicriteria Decision-Making and Financial Equilibrium: A Variational Inequality Perspective,” Computational Economics 17: 29−42
Dong J, Zhang D, Nagurney A (1996) “A Projected Dynamical Systems Model of General Financial Equilibrium with Stability Analysis,” Mathematical and Computer Modelling 24: 35−44
Doumpos M, Zopounidis C, Pardalos PM (2000) “Multicriteria Sorting Methodology: Application to Financial Decision Problems,” Parallel Algorithms and Applications 15: 113−129
Dupuis P, Nagurney A (1993) “Dynamical Systems and Variational Inequalities,” Annals of Operations Research 44: 9−42
Enke S (1951) “Equilibrium Among Spatially Separated Markets,” Econometrica 19: 40−47
Euler L (1736) “Solutio Problematis ad Geometriam Situs Pertinentis,” Commentarii Academiae Scientiarum Imperialis Petropolitanae 8: 128−140
Fei JCH (1960) “The Study of the Credit System by the Method of Linear Graph,” The Review of Economics and Statistics 42: 417−428
Ferguson AR, Dantzig GB (1956) “The Allocation of Aircraft to Routes,” Management Science 2: 45−73
Ford LR, Fulkerson DR (1962) Flows in Networks, Princeton University Press, Princeton, New Jersey
Francis JC, Archer SH (1979) Portfolio Analysis, Prentice Hall, Englewood Cliffs, New Jersey
G-10 (2001) “Report on Consolidation in Financial Services,” Bank for International Settlements, Switzerland
Geunes J, Pardalos PM (2003) “Network Optimization in Supply Chain Management and Financial Engineering: An Annotated Bibliography,” Networks 42: 66−84
Hughes M, Nagurney A (1992) “A Network Model and Algorithm for the Estimation and Analysis of Financial Flow of Funds,” Computer Science in Economics and Management 5: 23−39
Korpelevich GM (1977) “The Extragradient Method for Finding Saddle Points and Other Problems,” Matekon 13: 35−49
Kim HM (1999) Globalization of International Financial Markets, Ashgate, Hants, England
Liu Z, Nagurney A (2005) “Financial Networks with Intermediation and Transportation Network Equilibria: A Supernetwork Equivalence and Reinterpretation of the Equilibrium Conditions with Computations,” to appear in Computational Management Science
Markowitz HM (1952) “Portfolio Selection,” The Journal of Finance 7: 77−91
Markowitz HM (1959) Portfolio Selection: Efficient Diversification of Investments, John Wiley & Sons, Inc., New York
Mulvey JM (1987) “Nonlinear Networks in Finance,” Advances in Mathematical Programming and Financial Planning 1: 253−271
Mulvey JM, Simsek KD, Pauling B (2003) “A Stochastic Network Optimization Approach for Integrated Pension and Corporate Financial Planning,” in Innovations in Financial and Economic Networks, A Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 110−135
Mulvey JM, Vladimirou H (1989) “Stochastic Network Optimization Models for Investment Planning,” Annals of Operations Research 20: 187−217
Mulvey JM, Vladimirou H (1991) “Solving Multistage Stochastic Networks: An Application of Scenario Aggregation,” Networks 21: 619−643
Nagurney A (1994) “Variational Inequalities in the Analysis and Computation of Multi-Sector, Multi-Instrument Financial Equilibria,” Journal of Economic Dynamics and Control 18: 161−184
Nagurney A (1999) Network Economics: A Variational Inequality Approach, Second and Revised Edition, Kluwer Academic Publishers, Dordrecht, The Netherlands
Nagurney A (2001) “Finance and Variational Inequalities,” Quantitative Finance 1: 309−317
Nagurney A, Editor (2003) Innovations in Financial and Economic Networks, Edward Elgar Publishing, Cheltenham, England
Nagurney A (2005) “Financial Networks,” Invited Chapter for Handbook of Financial Engineering, C. Zopounidis, M. Doumpos, and P. M. Pardalos, Editors, Springer, Berlin, Germany, to appear
Nagurney A, Cruz JM (2003) “International Financial Networks with Electronic Transactions,” in Innovations in Financial and Economic Networks, A. Nagurney, Editor, Edward Elgar Publishing, Cheltenham, England, pp. 135−167
Nagurney A, Cruz JM, Wakolbinger T (2004) “The Co-Evolution and Emergence of Integrated International Financial Networks and Social Networks: Theory, Analysis, and Computations,” to appear in Globalization and Regional Economic Modeling, RJ Cooper, KP Donaghy, GJD Hewings, Editors, Springer, Berlin, Germany
Nagurney A, Dong J (1996a) “Network Decomposition of General Financial Equilibria with Transaction Costs,” Networks 28: 107−116
Nagurney A, Dong J (1996b) “General Financial Equilibrium Modelling with Policy Interventions and Transaction Costs,” Computational Economics 9: 363−384
Nagurney A, Dong J (2002) Supernetworks: Decision-Making for the Information Age, Edward Elgar Publishers, Cheltenham, England
Nagurney A, Dong J, Hughes M (1992) “Formulation and Computation of General Financial Equilibrium,” Optimization 26: 339−354
Nagurney A, Hughes M (1992) “Financial Flow of Funds Networks,” Networks 22: 145−161
Nagurney A, Ke K (2001) “Financial Networks with Intermediation,” Quantitative Finance 1: 441−451
Nagurney A, Ke K (2003) “Financial Networks with Electronic Transactions: Modeling, Analysis, and Computations,” Quantitative Finance 3: 71−87
Nagurney A, Liu Z, Cojocaru M-G, Daniele P (2005) “Dynamic Electric Power Supply Chains and Transportation Networks: An Evolutionary Variational Inequality Formulation,” to appear in Transportation Research E
Nagurney A, Siokos S (1997) “Variational Inequalities for International General Financial Equilibrium Modeling and Computation,” Mathematical and Computer Modelling 25: 31−49
Nagurney A, Wakolbinger T, Zhao L (2006) “The Evolution and Emergence of Integrated Social and Financial Networks with Electronic Transactions: A Dynamic Supernetwork Theory for the Modeling, Analysis, and Computation of Financial Flows and Relationship Levels,” Computational Economics 27: 353−393
Nagurney A, Zhang D (1996) Projected Dynamical Systems and Variational Inequalities with Applications, Kluwer Academic Publishers, Boston, Massachusetts
Pigou AC (1920) The Economics of Welfare, Macmillan, London, England
Quesnay F (1758) Tableau Economique, 1758, reproduced in facsimile with an introduction by H. Higgs by the British Economic Society, 1895
Rockafellar RT, Wets RJ-B (1991) “Scenarios and Policy in Optimization under Uncertainty,” Mathematics of Operations Research 16: 1−29
Rudd A, Rosenberg B (1979) “Realistic Portfolio Optimization,” TIMS Studies in the Management Sciences 11: 21−46
Rutenberg DP (1970) “Maneuvering Liquid Assets in a Multi-National Company: Formulation and Deterministic Solution Procedures,” Management Science 16: 671−684
Samuelson PA (1952) “Spatial Price Equilibrium and Linear Programming,” American Economic Review 42: 283−303
Shapiro AC, Rutenberg DP (1976) “Managing Exchange Risks in a Floating World,” Financial Management 16: 48−58
Soenen LA (1979) Foreign Exchange Exposure Management: A Portfolio Approach, Sijthoff and Noordhoff, Germantown, Maryland
Srinivasan V (1974) “A Transshipment Model for Cash Management Decisions,” Management Science 20: 1350−1363
Storoy S, Thore S, Boyer M (1975) “Equilibrium in Linear Capital Market Networks,” The Journal of Finance 30: 1197−1211
Takayama T, Judge GG (1971) Spatial and Temporal Price and Allocation Models, North-Holland, Inc., Amsterdam, The Netherlands
Thore S (1969) “Credit Networks,” Economica 36: 42−57
Thore S (1970) “Programming a Credit Network under Uncertainty,” Journal of Money, Credit and Banking 2: 219−246
Thore S (1980) Programming the Network of Financial Intermediation, Universitetsforlaget, Oslo, Norway
Thore S (1984) “Spatial Models of the Eurodollar Market,” Journal of Banking and Finance 8: 51−65
Thore S, Kydland F (1972) “Dynamic for Flow-of-Funds Networks,” in Applications of Management Science in Banking and Finance, S Eilon and TR Fowkes, Editors, Gower Press, England, pp. 259−276
Wallace S (1986) “Solving Stochastic Programs with Network Recourse,” Networks 16: 295−317
CHAPTER 18
Agent-based Simulation for Research in Economics
Clemens van Dinther
18.1 Introduction

Financial theory and financial markets are complex constructs that are not always easy to understand. It is not even possible to explain the functioning of financial markets on the basis of theoretical analysis, since theory normally assumes fully rational agents. [37] have presented the so-called “no trade theorem”, stating that in a state of efficient equilibrium there will be no trade on markets as long as all participants are fully rational (no noise traders or other uninformed traders) and the structure of information acquisition is common knowledge. It is obvious that these assumptions do not hold in real markets, since we observe asymmetrically informed participants or technical traders who base their decisions on the analysis of market movements (and not on the assessment of the underlying value of a stock). Thus, apart from theoretical analysis, science uses additional methods for the study of financial markets. Empirical studies are one way, but simulation also shows interesting results. Already in 1996, [3] successfully applied agent-based simulation to the analysis of financial markets in their famous “Santa Fe Artificial Stock Market”. Besides agent-based simulation, Monte Carlo techniques are frequently used to model basic financial market movements (see e.g. [33]). The present chapter deals with the application of simulation in economics and in particular with agent-based simulations. It discusses the use of simulation and its application and presents some general approaches of agent-based simulation and existing links to economic analysis.

Research in social science was for long affected by two commonly accepted approaches for gaining insight: induction and deduction. With induction one derives general principles from particular cases, e.g. through pattern discovery in empirical data or observation of certain rules. In the deductive approach a set of axioms is
defined and the consequences deriving from these assumptions are proven [5]. This approach can also be described as inferring the particular from the general. These techniques are applied in many fields of economic research. Game theory uses deduction to solve for equilibria; macroeconomics uses deduction to explain equilibria, e.g. on money or goods markets. Empirical surveys (inductive reasoning) are used for macroeconomic analysis or in econometric theory. Most economic theories apply the method of induction and/or deduction in a static way. Game theory, for example, studies static equilibria. The model dynamics that establish the equilibrium are studied in extensive form games, but it is not always possible to develop the extensive form of a game. This is the starting point for Computational Economics (CE). This fairly new branch of economic research is, firstly, interested in the system’s dynamics and the processes leading to observed phenomena, which it studies by conducting simulations. Secondly, CE provides computational techniques to support, e.g., the solution for equilibria.

Real world systems or system models can be scrutinised by experimental methods, or by building formal mathematical models for analytic evaluation and deductive reasoning. One reason for applying simulation is the complexity of the models under investigation. Mathematical and experimental approaches are feasible for the analysis of simple models. Adding more detail to refine simple models increases the complexity and thus makes it more difficult to find analytic solutions. Models for experimental studies are often very simple so as not to ask too much of the participants. This is where computational techniques can support researchers in finding solutions and explanations in the study of complex systems. In that sense, “Computational Economics can help to study qualitative effects quantitatively” [29].
One popular computational approach is to implement heterogeneous agent models, also known as agent-based computational economics (ACE). [26] presents a number of reasons for the increasing popularity of ACE. One of the most interesting arguments in [26] is the idea of heterogeneous expectations. There would be no trade in a fully rational world where it is common knowledge that all agents are rational, since agents with superior private information would not be able to profit from it. This is because all other traders realize that such an agent’s trade request is based on private information, and thus they will not trade with him.1 Another argument is the insight (e. g. from laboratory experiments) that individuals often do not behave rationally. Software agents are suited to model bounded rationality. This chapter is concerned with agents used for simulation studies in economics. Agent-based approaches have become popular in recent years, but economists still criticise the missing link between agent-based approaches and traditional economic theory. Section 18.2 discusses the application of simulation in economics in general. Section 18.3 presents the most common agent-based approaches and links them to standard economic theory. The chapter closes with a summary of the key issues.
1
For literature on the no trade theorem see also [37, 19].
18 Agent-based Simulation for Research in Economics
18.2 Simulation in Economics
In general, simulation in economics does not differ from simulation in other fields. It is understood as a particular type of modelling the real world or other systems [20]. Simulations are an imitation of real world processes over time [7] and support the study of these systems. Therefore, simulation models represent real world systems in a less detailed and less complex way, described by input and output variables and certain system states. An artificial history of the system is generated during the course of the simulation. This artificial history is used for observation and to draw inferences concerning the objective of the study. [20] understand simulation as a method for theory development, since a specific theory can be studied within a simulation and the results can be used to refine the theory. This process is also depicted in Figure 18.1.
Figure 18.1. Simulation for theory development (recoverable figure label: Refinement)
Simulations are often understood as an alternative approach to doing research in social science. [42] argues that social science theory can be expressed either mathematically, using formalized expressions, or verbally. In addition, he regards simulation as a third option, in which theoretical propositions are expressed in computer models. This can be understood as using simulation as a third methodological approach apart from deduction and induction. In the context of evaluating electronic markets, [39] identifies simulation as a third research method besides axiomatic reasoning (cp. deduction) and experimental techniques (cp. induction). Conducting a simulation starts, comparable to deduction, with the determination of assumptions. In a deductive approach, consequences are proven from these assumptions. Simulations cannot prove theorems, but they generate data that can be analysed inductively. In this respect, simulations use techniques from both induction and deduction, although the data on which the induction is based are not observed in the real world but in a model based on assumptions. Hence, simulations can also be understood as thought experiments in which the assumptions may be simple but the consequences may not be obvious (see e. g. [5]). In the following, advantages and disadvantages of simulation are discussed briefly.
18.2.1 Benefits of Simulation
The main advantage of simulation compared to other research methods is that the simulation designer controls every parameter and can adapt it to problem-specific circumstances. This allows for both normative and descriptive studies.
A well designed simulation system can help to understand and explain real world systems and to describe observed phenomena by comparing different simulation settings. Descriptive applications of simulation are especially useful for teaching and training. Additionally, normative studies can be conducted by changing a particular parameter and studying the consequences for the system’s development and its output. This helps to diagnose systems. The simulation speed can be varied (in effect compressing or expanding time), which enables the study of processes in slow motion as well as the study of long periods of time. In comparison to other research methods such as experiments or surveys, simulations are less expensive, because there are no expenses for resource acquisition or for reward payments to participants. Agent-based simulations in particular provide some additional advantages. [6] remarks that agents allow for modelling heterogeneous behaviour and limited rationality. This is not possible in mathematical theorizing.
18.2.2 Difficulties of Simulation
On the other hand, simulations face diverse problems. It is difficult to build appropriate models which accurately represent the real systems. Therefore, it is essential to verify the correctness of the simulation model. In some cases it is difficult to identify mistakes in a simulation model, so that wrong conclusions are drawn from an incorrect model and taken for granted. Thus, the quality and the correctness of simulation results is a sensitive issue. The development of good models is often not only difficult but also time consuming. It may also be difficult to interpret the collected data. One of the most difficult problems in the development of simulation models is to decide which methodological approach to follow. A variety of methods is available, such as system dynamics, queuing models, agent-based simulation, or stochastic models. Only a few of these will produce appropriate results when applied to a specific problem. The use of simulation in economics has grown rapidly in recent years, and some work has been done on the question of how to apply certain methods (e. g. [20]). Agent-based Computational Economics (ACE) is one specific branch within CE. It describes the “computational study of economies modelled as evolving systems of autonomous interacting agents” [52]. This research approach builds upon an understanding of the economy not from the macro level perspective of a system of aggregated demand and supply, but from the micro level, as the result of the interaction of numerous independent individuals. [32] notes in a talk2 that economics can be understood as the study of those phenomena emerging from the interactions among intelligent, self-interested individuals. ACE conducts interdisciplinary work at the interface of Cognitive Science,
2
The talk was given to the European Association for Evolutionary Political Economy and is published as an essay.
Figure 18.2. Agent-based computational economics, source: T. Eymann at http://www.econ.iastate.edu/tesfatsi/ace.htm (figure labels: Evolutionary Economics, Agent-based Computational Economics (ACE), Cognitive Science, Computer Science)
Computer Science, and Evolutionary Economics (see also Figure 18.2). Agent-based simulation itself provides several mechanisms such as genetic algorithms, stochastic components, or other evolutionary techniques. Thus, the decision on which method to apply has to be made on this level as well. In recent years more and more effort has been made to link agent-based simulation approaches to economic research. Economists are still suspicious of agent-based simulation approaches, since the theory underlying a simulation model is often not linked to existing economic theory. Agent-based economics researchers have started to close this gap, although there are still open issues. To the author’s knowledge, there is no summary of possible agent-based simulation techniques and their existing links to economic theory. Since such an overview on the one hand facilitates the work of agent-based economics researchers, and on the other hand makes ACE more credible among economists, this chapter presents the main agent-based modelling techniques and summarises the literature on bridging the gap to economic theory. [6] presented a valuable analysis of the use of agent-based simulation with respect to the solvability of the underlying theoretical model. He argues that there are three distinct alternatives for the usage of agent-based techniques: i. fully solvable models, ii. partially solvable models, and iii. insolvable models. In the first class of models (i), agent-based simulations are applied to models that are completely solvable. In this case simulations serve for model validation, for exemplification of the solution, or for the application of stochastic simulations. For example, it is often the case that solutions to mathematical models are only found in numerical form. If numerical samples are available, an agent-based model can be applied to validate the results.
In those cases in which there is not only a numerical solution but also a solution in closed form,3 simulations can be applied to better present the solution or to study the dynamics of the model. Many models are based on stochastic functions, e. g. the output of the model depends on stochastic input. The functional relation between input and output may be determinable. In such cases stochastic sampling on the basis of agent-based simulations can be applied, which randomly draws input values in order to determine the output. Stochastic sampling in general is known as “Monte Carlo simulation” and is a common approach in science. In many cases it is only possible to partially solve a mathematical model (ii), e. g. if a solution is only possible under restrictive assumptions that make the results ambiguous, or if the model cannot be solved. In such cases agent-based simulation can be applied as a complement to mathematical theorizing. [6] remarks that one has to distinguish between simulation as an instantiation of a mathematical model and simulation as a supplement to mathematical models. The former case is equivalent to (i). In order to simplify such complex models, assumptions are necessary. Agent-based models make it possible to relax such assumptions, since heterogeneous agents can be implemented. This makes it possible, in a first stage, to apply agents in simple models that are restricted by many assumptions and to validate the outcome against the partly available solutions or numerical results. In a second stage, the assumptions can be relaxed and the (same) agents can act in the more complex environment. This facilitates a structured analysis of the model. Insolvable models (iii) are rarely analysed in scientific papers since they do not yield new insights into a given problem. Nevertheless, agent-based simulation might be a starting point for better understanding models that are not mathematically solvable.
The described classification has a problem-oriented character, giving more insight into the question of which problem/model is handled in agent-based approaches. As argued earlier, there is also a need for a method-oriented view on agent-based simulation, answering the question of how to apply agent-based simulation and which techniques to use. Different techniques are applied in agent-based simulation. The four most popular approaches found in the literature are pure agent-based simulation, Monte Carlo techniques, evolutionary (genetic) approaches, and reinforcement learning techniques. The first approach builds agent societies on the basis of simple rules. Monte Carlo simulations are used either to sample complex models or to model stochastic processes, e. g. non-deterministic behaviour. The main criticism to which agent-based simulation is exposed is its insufficient theoretical basis. In recent years, researchers have started to link agent-based approaches to well founded theory. Evolutionary mechanisms modelled on biological evolution have been widely used to model
3
The term closed form denotes mathematical solutions of functions or equations in a general and explicit form with a finite number of operations. For example, the integral $F(x) = \int f(x)\,dx$ with $f(x) = x^2 + cx + 1$ can be solved in general, giving $F(x) = x^3/3 + c\,x^2/2 + x + k$ with an integration constant $k$. Sometimes it is only possible to provide an approximation for an integral and to compute numerical examples. Such solutions are not considered “closed form”.
evolving systems. Today, evolutionary techniques are applied in line with well founded game theoretic approaches such as evolutionary game theory. Besides evolutionary mechanisms, it is popular to apply reinforcement learning methods, which are related to Markov theory and stochastic games. Certainly, it is not possible to provide an unambiguous classification, since the discussed approaches can also be combined. Nevertheless, these four approaches appear to be the most widely applied techniques. Therefore, the following sections briefly summarize these four methods.
18.3 Agent-based Simulation Approaches
The basic idea of agent-based simulation approaches is to build simple models of a real world system in which agents interact on the basis of social rules. This makes it possible to study macro behaviour from a micro behavioural view. The first work in this area was done by [58] with the theory of self-reproducing automata. It was the 2005 Nobel prize laureate Thomas Schelling who introduced one of the first agent models in order to explain cultural segregation in American housing areas [47]. In his models of segregation he observes the phenomenon of cultural segregation in American residential districts into areas of white and coloured citizens. At that time computational power was limited. During the last decade, computational progress has made large scale agent populations possible. That is also why most of the work on ACE was done in the last ten to fifteen years. This section briefly presents the four main agent-based approaches used in economics. Since economists are still suspicious of agent-based simulation, more and more attempts are made to link agent-based approaches to standard economic theory. Evolutionary techniques in particular show parallels to economic theory, and thus are presented in the following besides pure agent-based techniques and Monte Carlo simulation.
18.3.1 Pure Agent-based Simulation: The Bottom-up Approach
Schelling [47] studies spatial arrangements in which agents of different characteristics prefer to live in a neighbourhood of a certain characteristic, e. g. the colour of the neighbours. Agents prefer that at least a particular fraction of the neighbourhood has their own colour. Changing these preferences leads to different compositions of the neighbourhood. In what is called the pure agent-based approach, the model consists of several agents, the environment the agents act in, and interaction rules. These interaction rules are fixed for the agent, meaning that the agents’ action sets are not changed, as it
is the case when learning mechanisms such as reinforcement learning or genetic algorithms are used. [11, p. 7280] presents a very simple but striking example that shows how the model dynamics change when fixed interaction rules are changed: In a group of 10 to 40 agents each agent is assigned a protector and an opponent randomly and secretly. Each agent tries to hide from the opponent behind the protector. This simple rule results in a highly dynamic system of agents moving around in a seemingly random fashion. If the rule is changed such that the agents have to place themselves between opponent and protector, the system ends up with everyone clustering in a tight knot. This example shows how simple rules result in complex dynamic systems. [15] incrementally developed Sugarscape, a model of a growing artificial society that received much attention in the agent-based research community. Agents are born into a spatial distribution of sugar, the Sugarscape. Agents have to eat sugar in order to survive and thus move around on the landscape to find sugar. Starting with simple movement rules, more and more social rules such as sexual reproduction, death, conflicts, history, diseases, and finally trade are introduced. This results in a quite complex society that allows the study of several social and economic aspects. A last example is the computational market model of [50], which builds a small society of agents that can produce either food or gold. Each agent needs food for survival and is equipped with a certain skill level for producing gold or food. Food can be traded for gold. In such a simple model, agents switch from producing food to gold or vice versa if the exchange rate passes their individual threshold beyond which it is more lucrative for them to produce the other product. This results in a cyclic trade behaviour with large amplitudes of volume and price. When speculators are added, i. e. agents that are capable of storing gold and food, the price stabilises significantly.
The described examples give an introduction to the use of agent models in their simplest form. It is difficult to analyse the described models analytically. These models exploit the normative use of agent simulation: particular rules are controlled or added and their impact on the model’s behaviour is studied. Often, economists are interested not only in the dynamics of a model, but also in its stability. Economists want to discover equilibrium states and optimal strategies that lead to equilibrium. Pure agent models are not well suited for finding new strategies. This is why learning techniques or stochastic components have to be used. Such approaches are introduced in the next sections.
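The fixed-rule character of pure agent-based models can be illustrated with a minimal Python sketch in the spirit of the segregation model of [47]. It is not the original model: the square grid, the eight-cell neighbourhood, the rule that discontent agents jump to a random empty cell, and all names (`make_grid`, `like_share`, `step`, `mean_like_share`) are assumptions made for illustration only.

```python
import random

EMPTY = 0  # marker for a vacant cell; the two agent colours are 1 and 2


def make_grid(size, empty_share, rng):
    """Random size x size grid with two colours and some empty cells."""
    cells = []
    for _ in range(size * size):
        if rng.random() < empty_share:
            cells.append(EMPTY)
        else:
            cells.append(rng.choice((1, 2)))
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def like_share(grid, r, c):
    """Share of occupied neighbouring cells with the same colour as (r, c)."""
    size = len(grid)
    same = occupied = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < size and 0 <= cc < size):
                continue
            if grid[rr][cc] != EMPTY:
                occupied += 1
                same += grid[rr][cc] == grid[r][c]
    # An agent without neighbours is treated as content (assumption).
    return 1.0 if occupied == 0 else same / occupied


def step(grid, threshold, rng):
    """Fixed rule: agents below the threshold move to a random empty cell.

    Discontent is evaluated once at the start of the step.
    Returns the number of agents that moved.
    """
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    empties = [(r, c) for r, c in cells if grid[r][c] == EMPTY]
    unhappy = [(r, c) for r, c in cells
               if grid[r][c] != EMPTY and like_share(grid, r, c) < threshold]
    rng.shuffle(unhappy)
    moves = 0
    for r, c in unhappy:
        if not empties:
            break
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], EMPTY
        empties[i] = (r, c)  # the vacated cell becomes available
        moves += 1
    return moves


def mean_like_share(grid):
    """Average like-coloured neighbour share over all agents."""
    size = len(grid)
    shares = [like_share(grid, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] != EMPTY]
    return sum(shares) / len(shares) if shares else 0.0
```

Calling `step` repeatedly with different values of `threshold` and comparing `mean_like_share` across runs corresponds to the normative use described above: one rule parameter is changed and its effect on the composition of the neighbourhoods is observed.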
18.3.2 Monte Carlo Simulation
In general, the term Monte Carlo (MC) denotes methods for mathematical experiments using random numbers. Monte Carlo methods became known explicitly through the Manhattan Project, which developed the nuclear bomb during the Second World War. The research group around John von Neumann, namely the researchers Ulam, Metropolis and Fermi, developed this method and applied random number experiments.
The name Monte Carlo is ascribed to Metropolis, who alludes to Ulam’s passion for poker by referring to the famous casino of Monte Carlo. A first article on stochastic experiments with the title The Monte Carlo Method was published by Metropolis and Ulam in 1949 [36]. In fact, stochastic processes for experiments were known earlier,4 but the method was systematically developed primarily in the late forties of the 20th century by Ulam, Fermi, von Neumann and Metropolis, and has thus been available for application in manifold scientific fields since then. The problems studied by Monte Carlo methods can be divided into probabilistic and deterministic problems [23]. Probabilistic problems are cases where random variables are used to model real stochastic processes, e. g. in queuing theory. One speaks of deterministic Monte Carlo cases if formal theoretical models exist that are impossible or hard to solve numerically. In these cases, the Monte Carlo method helps to find solutions by applying random numbers, e. g. in computing the number π.5 In the social sciences, the Monte Carlo method is applied to different problem structures. In Operations Research (OR) probability distributions are used within models, e. g. in queuing theory. In other fields, e. g. in finance, stochastic processes are used to simulate order flows in stock markets. In agent-based simulation, Monte Carlo methods are used for both probabilistic and deterministic models. In most agent-based simulation models, probabilistic MC methods are applied, e. g. to assign valuations to agents or to simulate noise. A well known example is the contribution of [21], in which a double auction market is simulated by agents with zero intelligence. Instead of implementing fixed strategies, agents in this market draw their bid values from uniform distributions regardless of their true valuation. This strategy is used to represent boundedly rational agents.
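As a hedged illustration of the probabilistic use of random numbers, the following Python sketch lets traders draw their quotes from a uniform distribution, as described for the zero-intelligence agents. It is not the market mechanism of [21]: the single clearing round, the price range `[0, max_price]`, the matching of the highest bids with the lowest asks, and the midpoint price are simplifying assumptions, and the function name is hypothetical.

```python
import random


def zero_intelligence_round(n_buyers, n_sellers, max_price, rng):
    """One clearing round with uniformly random bids and asks.

    Quotes ignore any valuation; they are drawn from U(0, max_price).
    Returns the list of trade prices.
    """
    bids = sorted((rng.uniform(0, max_price) for _ in range(n_buyers)),
                  reverse=True)
    asks = sorted(rng.uniform(0, max_price) for _ in range(n_sellers))
    trades = []
    # Match the best remaining bid with the best remaining ask
    # as long as the buyer offers at least what the seller asks.
    for bid, ask in zip(bids, asks):
        if bid < ask:
            break
        trades.append((bid + ask) / 2)  # midpoint price (assumption)
    return trades
```

Repeating the round many times with different seeds yields a distribution of prices and volumes, which is the kind of artificial history that is then analysed inductively.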
[13] apply a deterministic MC approach to sequential, (possibly) multiunit, sealed-bid auctions. The agents sample the valuation space of the opponents and solve the incomplete information game that results from each sample. The results of the samples are aggregated into a policy. This approach is used since the “straightforward expansion of the game is intractable even for very small problems and it is beyond the capability of the current algorithms to solve for the Bayes-Nash equilibria” [13, p.154]. As explained earlier, stochastic sampling is an appropriate approach if the problem is hard or impossible to solve. The next section introduces evolutionary approaches for agent-based simulation.
4
[23] refer to work by [22] that describes the computation of π by means of random processes.
5
A needle of length l is dropped on a board with parallel lines of distance t (with l ≤ t). The number of trials N and the number of trials R in which the needle crosses a line are counted. The probability of the needle crossing a line is directly related to the value of π. The needle crosses a line if the distance of the needle’s centre to the closer line is smaller than $0.5\,l \sin(\theta)$, with $\theta$ being the angle between the needle and the lines. The probability of the needle crossing a line can then be computed as $2l/(t\pi)$, which is approximated by the ratio $R/N$. Hence, π can be estimated as $\pi \approx 2lN/(tR)$.
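The needle experiment from the footnote is a compact example of a deterministic Monte Carlo problem and can be sketched in Python as follows. The function name and the default values are assumptions. Note that the sketch draws the angle using `math.pi`, so it illustrates the estimator rather than computing π from scratch.

```python
import math
import random


def buffon_pi(n_trials, needle_len=1.0, line_dist=2.0, rng=None):
    """Estimate pi as 2*l*N / (t*R) from simulated needle drops."""
    if needle_len > line_dist:
        raise ValueError("the formula assumes needle_len <= line_dist")
    rng = rng or random.Random()
    crossings = 0  # R: number of drops in which the needle crosses a line
    for _ in range(n_trials):
        # Distance of the needle's centre to the closer line.
        centre = rng.uniform(0.0, line_dist / 2)
        # Acute angle between the needle and the lines.
        theta = rng.uniform(0.0, math.pi / 2)
        if centre <= 0.5 * needle_len * math.sin(theta):
            crossings += 1
    if crossings == 0:
        return math.inf  # no crossing observed, estimate undefined
    return 2 * needle_len * n_trials / (line_dist * crossings)
```

The estimate converges slowly: the error shrinks roughly with the square root of the number of trials, which is typical for Monte Carlo methods.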
18.3.3 Evolutionary Approach
Starting in 1975, John Holland was the first to transfer the idea of biological genetic evolution to artificial adaptive systems [24]. In biology, the genotype is stored on chromosomes that are combined during reproduction. These biological improvement mechanisms are used in computer science as a search technique. Genetic algorithms (GA) have become popular in economic simulations for modelling the learning of economic agents. GA are applied to find solutions to optimization problems. The search space is encoded in a genetic representation which is explored by means of an “intelligent” random search. GA are intelligent in that they use historical information to improve the current best strategies in the direction of local or global optima. They do so by applying Charles Darwin’s proposition of “the survival of the fittest”, which states that the fittest individuals survive the battle for scarce resources. Basically, actions or sequences of actions are assigned values, e. g. performance, utility, or payoff. The actions are genetically encoded as strategies, e. g. in string or binary representation. These chromosome representations build the genetic strategy set. A subset of the strategy set forms a population, also called a generation. Each strategy of a population is assigned a value, also called fitness, representing how well the strategy performed in the past. Each time a strategy out of the pool of one generation is played, it earns a reward that serves as feedback on the performance of this particular strategy. The fitness is adapted according to the reward. Some strategies will perform better than others. In order to improve the population of strategies, means of innovation, experimentation and replication are applied. GA are widely used in science and industry, e. g.
they are applied in industry for the optimization of production schedules, to minimize size in the circuit design of computer chips, for improving the efficiency of machines, or for efficient network design [41]. In social science simulation, GA are used as learning algorithms to model boundedly rational behaviour of software agents.
18.3.3.1 The Learning Model
GA can be understood as models of strategic interaction. That is, economic agents try to find the best strategy in the given environment. Thus, “GA are models of social learning, which in fact is a way of learning by interaction” [46, p.48]. Basically, two components are necessary: (1) the genetic representation of the solution space, and (2) an appropriate fitness function that is used to evaluate the solution space. Each possible solution (strategy) is represented by a gene, e. g. an array of bits or characters. The fitness of a gene represents its ability to compete and derives from payoffs achieved in the past. As such, GA build beliefs about rules in response to experience. The frequency of representation in a population and the fitness of a given strategy indicate the performance of that strategy. In order to determine optimal strategies, it is necessary to search the strategy set. This is done by manipulating the population with the three basic genetic modification mechanisms: selection (also understood as reproduction or imitation), crossover (also understood as communication), and mutation (also understood as innovation or learning by experiment).
Figure 18.3. Adaption of strategies through crossover
Selection simply means that the strongest strategies of a population in terms of fitness are reproduced identically into the next generation. The probability that a strategy i is reproduced from the current population n into the next period’s population depends only on its relative fitness $R(i/n)$, i. e. the individual’s payoff in ratio to the population’s aggregate payoff [46, p. 48]. During crossover, two chromosomes are drawn from the population and split at a randomly drawn position. Then the parts of one chromosome are recombined with those of the other. The crossover operator is also understood as a communication mechanism. This might cause a dramatic change from an individual perspective, but does not introduce completely new strategies into the population. Figure 18.3 depicts an example of crossover. Mutation is an operator that randomly manipulates the chromosome by flipping single bits or characters with a (small) mutation probability μ. This is also understood as an operator for experimentation, since it is the only way of introducing abrupt changes. Mutation is just a small change to the individual strategy, but might bring a completely new strategy into the population. GA have been widely used for simulating adaptive and boundedly rational agents in economics. Early applications were described in [38] and in [4], who applied GA to the prisoner’s dilemma; [1] simulated auctions under human-like behaviour. [25] give a brief overview of artificial adaptive agents in economic theory. [2] remarks that one of the basic questions in applying GA to economic problems is how well suited GA are to discover Nash equilibria and to model individual and social learning.
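The three operators selection, crossover, and mutation can be sketched in a few lines of Python. The sketch makes several assumptions that are not taken from the text: strategies are bit strings of equal length (at least two bits), the fitness function returns strictly positive values, selection is implemented as fitness-proportional sampling with replacement, and all function names are hypothetical.

```python
import random


def select(population, fitness, rng):
    """Draw a new population with probabilities given by relative fitness."""
    weights = [fitness(s) for s in population]  # must be positive
    return rng.choices(population, weights=weights, k=len(population))


def crossover(a, b, rng):
    """Split two chromosomes at one random position and swap the tails."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(chromosome, mu, rng):
    """Flip each bit independently with mutation probability mu."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < mu else bit
                   for bit in chromosome)


def next_generation(population, fitness, mu, rng):
    """One GA step: selection, pairwise crossover, then mutation."""
    parents = select(population, fitness, rng)
    children = []
    for i in range(0, len(parents) - 1, 2):
        children.extend(crossover(parents[i], parents[i + 1], rng))
    if len(parents) % 2:  # an unpaired parent is copied unchanged
        children.append(parents[-1])
    return [mutate(child, mu, rng) for child in children]
```

A possible usage is `next_generation(pop, lambda s: 1 + s.count("1"), 0.01, random.Random(0))`, where the fitness simply counts the ones in the string. In an economic model the fitness would instead be derived from the payoffs a strategy earned in the past.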
18.3.3.2 Evolutionary Games
Game theory studies strategic situations under the assumption of fully rational agents. The analysis of such models is often difficult or impossible. Agent-based simulation using GA models the economy as boundedly rational agents that try to adapt to their environment. Bringing together both views is an important step towards a better understanding of the economy. [31] argues for a middle way between pure agent theory and game theory. Advances in this direction have been made, e. g. by [35], who compares evolutionary processes with learning models, or by [46], who compares evolutionary game theory and GA learning. The latter author is able to show
links between GA and evolutionary game theory, i. e. GA can be understood as “a specific form of a repeated evolutionary game” [46, p. 58]. The game theoretic concepts of evolutionary stable states and Nash equilibria are applied to GA, but are mainly valid for single population GA. An evolutionary game is understood in the words of [18, p. 16]: “By an evolutionary game, I mean any formal model of strategic interaction over time in which (a) higher payoff strategies tend over time to displace lower payoff strategies, but (b) there is some inertia, and (c) players do not systematically attempt to influence other players’ future actions.” The most common example is the Hawk-Dove game, where hawks and doves compete for a food source of value v. If two hawks meet at the source, they fight each other and each wins with a probability of 50%, thus receiving an expected payoff6 of (v − c)/2. If a hawk meets a dove at the source, the hawk drives the dove away and gets the whole source for itself (payoff = v), whereas the dove earns 0. Two doves share the source equally and thus each receive a payoff of v/2. One solution concept in evolutionary games is that of evolutionary stable strategies/states (ESS). A strategy σ is called an ESS if a population playing σ cannot be invaded by a mutant strategy μ. Thus, if $F_0$ is the initial fitness, $F(s)$ is the total fitness of an individual following strategy s, $\Delta F(s_1, s_2)$ is the change in fitness for playing $s_1$ while the opponent plays $s_2$, and p is the proportion of the population following the mutant strategy μ, then we write:
$$F(\sigma) = F_0 + (1 - p)\,\Delta F(\sigma, \sigma) + p\,\Delta F(\sigma, \mu)$$
$$F(\mu) = F_0 + (1 - p)\,\Delta F(\mu, \sigma) + p\,\Delta F(\mu, \mu)$$
Since p is close to 0, it follows that $F(\sigma) > F(\mu)$ if either $\Delta F(\sigma, \sigma) > \Delta F(\mu, \sigma)$, or $\Delta F(\sigma, \sigma) = \Delta F(\mu, \sigma)$ and $\Delta F(\sigma, \mu) > \Delta F(\mu, \mu)$.
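The ESS condition can be checked mechanically for the Hawk-Dove payoffs. The following Python sketch is restricted to the two pure strategies (mixed strategies are not considered), and the function names and the labels `"H"` and `"D"` are assumptions made for illustration.

```python
def hawk_dove_payoffs(v, c):
    """Payoff of the first strategy when meeting the second one.

    v is the value of the source, c the injury cost of a fight.
    """
    return {
        ("H", "H"): (v - c) / 2,  # fight, each hawk wins with probability 1/2
        ("H", "D"): v,            # hawk takes the whole source
        ("D", "H"): 0.0,          # dove retreats
        ("D", "D"): v / 2,        # doves share equally
    }


def is_ess(payoffs, sigma, strategies=("H", "D")):
    """Check the ESS condition for sigma against every pure mutant mu."""
    for mu in strategies:
        if mu == sigma:
            continue
        strictly_better = payoffs[(sigma, sigma)] > payoffs[(mu, sigma)]
        tie_then_better = (payoffs[(sigma, sigma)] == payoffs[(mu, sigma)]
                           and payoffs[(sigma, mu)] > payoffs[(mu, mu)])
        if not (strictly_better or tie_then_better):
            return False
    return True
```

With `v = 4` and `c = 2` the check reports Hawk as evolutionary stable and Dove as not stable. With `v = 2` and `c = 4` neither pure strategy passes the check, which is the case in which only a mixture of both behaviours can be stable.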
Disadvantages of this solution concept are that, on the one hand, strategies in evolutionary games are repeatedly compared pairwise and, on the other hand, an explicit formulation of the selection process underlying the concept does not exist [46, p. 52]. In contrast, GA learning compares each single strategy to the aggregate of the other strategies. [46, p. 51] directly links this fact to Nash equilibria: “A Nash strategy is defined as the best strategy given the strategies of the competitors […] As such a genetic population can be interpreted as a population of agents, each trying to play Nash.” This enables researchers to apply all analytical tools available for evolutionary game theory to GA. In this respect, the findings of [46] are an important step towards closing the gap between evolutionary game theory and computational techniques. The next section introduces reinforcement learning theory.
18.3.4 Reinforcement Learning
Reinforcement learning (RL) is a goal directed, trial and error learning procedure. According to [30], RL has relations to cybernetics, statistics, psychology, neuroscience, and computer science. [51, p. 16] point to its origin in two threads that developed independently. The first thread derives from learning by trial and error, inspired by research in psychology. This approach was already pursued in the early work on artificial intelligence. The second main thread concentrates on the problem of optimal control and its solution through value functions and dynamic programming. The merging of both research branches turns RL into an important research approach that is theoretically well founded from a (human) behavioural point of view as well as on the basis of economic theory. The “Law of Effect” states that, out of a set of possible actions, the action which results in the largest satisfaction is the most likely to be chosen [53, 57].7 The largest satisfaction is determined by trial and error, and in the course of time the action with the best outcome in the past is chosen. This effect combines two characteristic and important aspects, search (selection) and memory (association). RL is selectional since it tries alternatives and selects the best one by comparing consequences. It is also associative because the action selection depends on particular states. [16] additionally consider the “Power Law of Practice” that was introduced by [10]. The law states that learning is more intensive at the beginning of the process and slows down in the course of time. The optimal control branch was initiated, among others, by [8], who designed a controller that minimizes a measure of a dynamic system’s behaviour over time. His approach implements an optimal return function and uses (dynamic) system states. These two factors define the dynamic programming equation that is also known as the Bellman equation8 for discrete values. The discrete stochastic version of the dynamic control problem became known as the Markov Decision Process (MDP) [9], which specifies a course of state transitions that are independent of state transitions in the past.
6
c are the injury costs.
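For orientation, the Bellman equation for a discrete MDP is commonly written in the following reward-maximising form. The notation is an assumption and is not taken from the chapter: $S$ is the set of states, $A$ the set of actions, $R(s,a)$ the expected immediate reward, $P(s' \mid s, a)$ the transition probability, $\gamma \in [0,1)$ a discount factor, and $V^*$ the optimal return (value) function.

```latex
V^*(s) \;=\; \max_{a \in A} \Big[ R(s,a) \;+\; \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^*(s') \Big]
\qquad \text{for all } s \in S
```

The right-hand side depends only on the current state $s$ and not on the path that led to it, which reflects the independence from past state transitions that characterises the MDP. A cost-minimising formulation, as in the controller design mentioned above, replaces the maximum by a minimum over costs.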
[27] contributed the policy iteration method, which was modified by [45] and [44]. These contributions are the basis for RL theory. The idea of the MDP links the computer science theory of RL to the economic theory of stochastic games, as shown in the paragraph about Markov games. The development of RL from psychology on the one hand and mathematics on the other turns RL into a well-qualified foundation for agent-based simulation. It is possible to model human behaviour, as e. g. [16] have shown. Additionally, the models have a mathematical foundation that can be linked to economic theory. The next paragraph describes the RL mechanism. Paragraph “Markov Games” introduces the link to stochastic game theory.
⁷ It was Edward L. Thorndike who studied this effect in animal experiments during the years 1898 to 1911. [14] remarks that Thorndike published several articles ([53, 54, 55, 56, 57]) in which he implicitly refers to the law of effect without explicitly defining it. The term is used for the first time in the 1907 monograph. This makes it difficult to determine the date of origin for correct citation.
⁸ Bellman’s work is based on work of the Irish scientist William Rowan Hamilton (1805–1865) and the German mathematician Carl Gustav Jakob Jacobi (1804–1851). The Hamilton-Jacobi equation is a non-linear differential equation for the description of a system’s behaviour.
Clemens van Dinther
18.3.4.1 The Learning Model
Reinforcement learning focuses on goal-directed learning of agents in an uncertain and possibly dynamic environment. Agents seek to achieve goals by performing actions that result in (delayed) rewards. The agents can learn from rewards received in the past by assigning the rewards to actions that were performed in a particular state. At the beginning of the learning process, agents do not have preferences for certain actions. Thus, agents proceed by trial and error in order to discover the best action.
The model is characterised by the environment and by agents that interact with the environment in order to achieve a particular goal. Agents are not certain about their environment since they do not possess all information. Instead, agents have a perception of the environment that gives an indication of the actual state, which consists of environmental and (optionally) agent-internal state parameters. Agents face the problem of deciding which action to take in a particular state. This decision problem is often described as the n-armed bandit problem. An n-armed bandit can also be understood as n one-armed bandits.⁹ Since the reward is not deterministic, it is difficult for the agent to determine an optimal policy. The agent faces the choice of either exploiting its current knowledge by choosing actions with high estimated values, or exploring the action space by trying actions with low estimated values. For each action, the probability of a win is the same on every trial, regardless of that action's performance in the past. The decision problem of exploration versus exploitation applies to all reinforcement learning problems. Reinforcement learning is well suited to determining an appropriate action-selection mechanism by optimizing the direct feedback from the environment. The agent identifies certain environmental states and maps its perception of these states into particular actions.
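The trade-off between exploration and exploitation in the n-armed bandit problem can be illustrated with a minimal sketch. It assumes Bernoulli payouts (win 1 or nothing) and an epsilon-greedy selection rule; the function name `run_bandit`, the win probabilities and the parameter values are hypothetical choices for illustration and are not taken from the chapter.

```python
import random

def run_bandit(win_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy play on an n-armed bandit with Bernoulli payouts.

    win_probs[k] is the (unknown to the agent) win probability of arm k.
    """
    rng = random.Random(seed)
    n = len(win_probs)
    counts = [0] * n      # how often each arm has been pulled
    values = [0.0] * n    # running average reward observed per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                         # explore
        else:
            arm = max(range(n), key=lambda k: values[k])   # exploit
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean for the chosen arm
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

counts, values = run_bandit([0.2, 0.5, 0.8])
```

With a small exploration rate the agent spends most pulls on the arm it currently estimates to be best, while the occasional random pull keeps the estimates of the other arms from going stale.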
The tuple (S, A, P, r) describes a set of environmental states S, a set of possible actions A, a transition probability P: S × S × A → [0,1] for moving from state s1 to state s2 by choosing action a, and a scalar reward function (reinforcement) r: S × A → R. The reinforcement is non-deterministic in most cases, i. e. taking action a in the same state at two different points in time might not result in the same reward and transition. Each agent holds a policy π that maps the current state s to the action to be performed. The optimal policy maximises the sum of the expected rewards. The present value of a state is determined as the discounted sum of future rewards received by following some fixed policy π from state s, i. e.
$$V_\gamma^\pi(s) = E\left( \sum_{t=0}^{\infty} \gamma^t \, r_{s,t}^\pi \right).$$
Here, $r_{s,t}^\pi$ is the random variable for the reward received t time steps after policy π is started in state s, and γ (0 ≤ γ < 1) is the discount rate (cp. [48, p.266]). The value function maps states to state values. It can be approximated by any kind of function approximator, such as a look-up table or a multi-layered perceptron. The problem of finding the agent’s optimal behaviour is equivalent to finding the optimal value function.
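The discounted value defined above can be approximated by averaging discounted returns over sampled reward sequences. The following is a minimal sketch under stated assumptions: episodes are truncated to finite reward lists, and the helper names `discounted_return`, `estimate_value` and `noisy_episode` are hypothetical.

```python
import random

def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one observed (finite) reward sequence."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

def estimate_value(sample_episode, gamma, episodes=1000):
    """Monte Carlo estimate: average discounted return over sampled episodes.

    sample_episode is any callable that returns a list of rewards obtained
    by following a fixed policy from the state of interest.
    """
    returns = [discounted_return(sample_episode(), gamma) for _ in range(episodes)]
    return sum(returns) / len(returns)

# Hypothetical usage: rewards of 0 or 1 with equal probability for 50 steps.
rng = random.Random(0)

def noisy_episode():
    return [rng.choice([0.0, 1.0]) for _ in range(50)]

estimate = estimate_value(noisy_episode, gamma=0.9)
```

Because γ < 1, late rewards contribute very little, which is why truncating the infinite sum after a finite horizon is a tolerable approximation in this sketch.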
⁹ A one-armed bandit denotes a gambling machine with one arm that can be pulled at a specific cost. The one-armed bandit stochastically pays a reward greater than or equal to zero.
Table 18.1. Reinforcement learning approaches for model-based and model-free settings
Model-based: Value Iteration, Policy Iteration, Modified Policy Iteration
Different methods for finding optimal behaviour have been proposed in the literature. [30] distinguish model-based and model-free approaches, which are summarized in Table 18.1. Here, the term “model” refers to knowledge about the state transition probabilities P: S × S × A → [0,1] and about the reinforcement function r: S × A → R. In most economic models the state transition is not known to the agents. One speaks of a model-based approach if estimates for the model (P and r) are learned and used to find an optimal policy. In model-free approaches, the optimal policies are learned without building a model.
Q-learning is the most widely used method in economic models. It is often used in ACE, since the approach can be linked to Markov games as one branch of game theory. For this reason it is explained in more detail in the remainder of this section; Markov games are presented in Section 18.3.4.2. Q-learning was proposed by [59] and is an asynchronous dynamic programming approach. In general, dynamic programming provides algorithms to compute optimal policies in Markov Decision Processes (MDP). [27] defines an MDP by a set of states, a set of actions, and a transition function. An MDP satisfies the Markov property, which states that transitions from the current state are independent of previous states. That is, the history of how a particular state was reached has no impact on the transition probabilities to further states. Reinforcement learning models that satisfy the Markov property are Markov Decision Processes [51]. Note that MDP theory assumes a stationary environment, i. e. the probabilities of state transitions or of receiving a specific reward do not change over time. Q-learning can be understood as a controlled Markov process: agents control the state transitions by performing actions.
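For the model-based case, in which P and r are known or have been estimated, value iteration is one of the methods listed in Table 18.1. The following is a minimal sketch on a hypothetical two-state MDP; the data layout (`P[s][a]` as a list of `(probability, next_state)` pairs, `R[s][a]` as the expected reward) and all numbers are assumptions made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Compute optimal state values and a greedy policy for a known model."""
    n_states = len(P)
    V = [0.0] * n_states

    def q_values(s):
        # one-step look-ahead using the known transition model
        return [
            R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
            for a in range(len(P[s]))
        ]

    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_values(s))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = [max(range(len(P[s])), key=lambda a: q_values(s)[a])
              for s in range(n_states)]
    return V, policy

# Hypothetical deterministic model: action 0 stays, action 1 switches state.
P = [
    [[(1.0, 0)], [(1.0, 1)]],   # transitions from state 0
    [[(1.0, 1)], [(1.0, 0)]],   # transitions from state 1
]
R = [
    [0.0, 1.0],                 # rewards in state 0
    [2.0, 0.0],                 # rewards in state 1
]
V, policy = value_iteration(P, R, gamma=0.5)
```

In this toy model the agent should move from state 0 to state 1 and then stay there; with γ = 0.5 the fixed point is V(0) = 3 and V(1) = 4.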
For each policy π, an action-value Q is defined as the “expected discounted reward for executing action a at state x and following policy π thereafter” [60, p. 280]. For the optimal policy, these action-values satisfy
$$Q^*(x, a) = R_x(a) + \gamma \sum_y P_{xy}(a)\, V^*(y) \tag{18.1}$$
where $R_x(a)$ is the expected immediate reward for performing action a in state x, γ is the discount factor for future rewards, $V^*(y)$ is the optimal value of state y, and $P_{xy}(a)$ is the probability of a transition from state x to state y when action a is performed. Since $V^*(y) = \max_a Q^*(y, a)$, the Q-learning rule can be defined recursively:
$$Q(x, a) \leftarrow Q(x, a) + \alpha \left( r + \gamma \max_{a'} Q(x', a') - Q(x, a) \right) \tag{18.2}$$
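The update rule (18.2) can be sketched as tabular Q-learning. This is a minimal illustration under stated assumptions: the environment is a hypothetical deterministic two-state process (`toy_step`), actions are selected epsilon-greedily, and the learning rate is held constant for simplicity. None of these choices come from the chapter.

```python
import random

def q_learning(step, n_states, n_actions, episodes=2000, horizon=20,
               alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning; step(x, a) returns (reward, next_state)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        x = rng.randrange(n_states)              # random start state
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)     # explore
            else:
                a = max(range(n_actions), key=lambda k: Q[x][k])  # exploit
            r, x_next = step(x, a)
            # rule (18.2): move Q(x, a) towards r + gamma * max_a' Q(x', a')
            Q[x][a] += alpha * (r + gamma * max(Q[x_next]) - Q[x][a])
            x = x_next
    return Q

def toy_step(x, a):
    """Hypothetical environment: action 0 stays, action 1 switches state."""
    if x == 0:
        return (0.0, 0) if a == 0 else (1.0, 1)
    return (2.0, 1) if a == 0 else (0.0, 0)

Q = q_learning(toy_step, n_states=2, n_actions=2)
greedy = [max(range(2), key=lambda k: Q[x][k]) for x in range(2)]
```

The agent never sees the transition model; it only observes the tuples (x, a, r, x′) produced by `toy_step`, which is what makes the approach model-free.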
The tuple (x, a, r, x′) describes the experienced transition from state x to x′ while performing action a and receiving reward r. It can be shown for bounded rewards and learning rates 0 ≤ α < 1 that decrease appropriately over time that Q(x, a) → Q*(x, a) with probability 1 if action a is played in state x infinitely often [60, p.282]. Having converged to the optimal Q*, it is optimal for agents to act greedily, i. e. to choose in state x the action with the highest Q-value.
Improvements to this algorithm have been proposed. An important contribution was made by [16], who developed a descriptive model of human-like learning behaviour in strategic environments. The reinforcement learning algorithms presented by Erev and Roth can be used for stateless learning. The performance of the Erev-Roth algorithm was tested on models from experimental economics, and the simulation results were compared to the experimental results obtained with human subjects. It was shown that the computational results were similar to the experimental results. The basic difference to Q-learning lies in the action selection. The Q-learning algorithm applies an exploration rate in order to ensure that actions are explored. To this end, the agent decides at random whether to explore or to exploit. During exploitation the action with the maximum Q-value is chosen. The Erev-Roth algorithm instead introduces propensities in action selection. A propensity $q_{nk}$ to choose a particular action k is assigned to each possible action k of agent n. This propensity is adjusted according to the achieved result (reward). The probability $p_{nk}(t)$ of choosing action k at time t results from
$$p_{nk}(t) = \frac{q_{nk}(t)}{\sum_j q_{nj}(t)}.$$
Since this algorithm on the one hand simulates human behaviour and on the other hand builds upon the fundamental work on reinforcement learning, with good properties regarding convergence to stable states, it is used in many social studies, e. g. in market simulations in [40].
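Propensity-based action selection can be sketched as follows. This is a deliberately simplified illustration: it assumes non-negative payoffs, positive initial propensities, and an update that adds the received payoff to the propensity of the chosen action only. Further parameters of the model in [16], such as experimentation and forgetting, are omitted, and the names `choice_probabilities`, `erev_roth_step` and the payoff table are hypothetical.

```python
import random

def choice_probabilities(propensities):
    """p_k = q_k / sum_j q_j for one agent."""
    total = sum(propensities)
    return [q / total for q in propensities]

def erev_roth_step(propensities, payoff_of, rng):
    """Draw an action from the propensities and reinforce the chosen action."""
    probs = choice_probabilities(propensities)
    k = rng.choices(range(len(propensities)), weights=probs)[0]
    propensities[k] += payoff_of(k)   # simplified update: chosen action only
    return k

# Hypothetical stateless setting with two actions and fixed payoffs.
rng = random.Random(0)
payoffs = {0: 0.2, 1: 1.0}
propensities = [1.0, 1.0]             # no initial preference
for _ in range(2000):
    erev_roth_step(propensities, payoffs.get, rng)
probs = choice_probabilities(propensities)
```

Unlike epsilon-greedy Q-learning, there is no separate exploration switch here: every action keeps a positive probability, and actions that paid well in the past are simply chosen more often.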
The next section introduces the theory of stochastic games and links reinforcement learning techniques to game theoretic approaches.
18.3.4.2 Markov Games
Markov games (MG) are a subfield of game theory. Since the seminal work of [49], MG have been widely studied; they are also known as stochastic games (SG) [43]. An MG can be described as a tuple $\langle n, S, A_{1,\ldots,n}, T, R_{1,\ldots,n} \rangle$ where n is the number of players, S is a set of states, $A_1, \ldots, A_n$ are the action sets of the players, T is a transition function $S \times A_1 \times \ldots \times A_n \times S \to [0,1]$ that is controlled by the current state and the performed actions, and $R_i$ is agent i’s reward function $S \times A_1 \times \ldots \times A_n \to \mathbb{R}$. Each player i wants to maximise the expected sum of discounted rewards
$$E\left( \sum_{j=0}^{\infty} \gamma^j \, r_{i,t+j} \right).$$
A policy π maps states to actions, S → A; thus, the goal is to find an optimal policy for each state. A policy is said to be optimal if there is no other policy that achieves a better result at any state. Comparing stochastic games and classical game theory, it can be remarked that stochastic games are an extension of matrix games to multiple states [12], i. e. each state can be described as a matrix game. Matrix games are determined by the tuple $\langle n, A_{1,\ldots,n}, R_{1,\ldots,n} \rangle$ where n is
the number of players, $A = A_1 \times \ldots \times A_n$ is the joint action set, with $A_i$ the action set available to player i, and $R_i$ is player i’s payoff function. Alternatively, an MG can be understood as the extension of a Markov decision problem (MDP) to multiple players. MDPs have at least one optimal policy [34]. In Markov games the performance depends on the opponents’ actions. Thus, it is not simple to determine optimal policies. A common approach is to evaluate each policy assuming the opponent’s best choice, i. e. to maximize the reward in the worst case. [34, p. 158] states that each Markov game has a non-empty set of optimal policies that might be probabilistic.
In order to solve stochastic games for optimal policies, several algorithms have been proposed. For the analysis, one has to distinguish between zero-sum (also called purely competitive) and general-sum games. In zero-sum games, the agents’ payoffs sum to zero, whereas in general-sum games the sum of payoffs can differ from zero. Almost all algorithms developed so far apply to zero-sum games. [49] has shown that there are equilibrium solutions for zero-sum games, and [17] were able to demonstrate the same for general-sum games.
Both reinforcement learning theory and game theory have developed algorithms in order to find optimal policies. Nash equilibrium solutions are one solution concept in game theory. In a Nash equilibrium, each agent’s strategy is a best response to the other agents’ strategies. Each finite bimatrix game has at least one Nash equilibrium in mixed strategies. The fundamental difference between game theory solution concepts and reinforcement learning is that game theory assumes that the model of the game is known to the players. Reinforcement learning is basically applied to unknown models. Nevertheless, the approaches are very similar.
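The best-response condition behind the Nash equilibrium can be checked directly for pure strategies in a two-player matrix game. The sketch below is illustrative only: the function name `pure_nash_equilibria` and the two payoff tables (a prisoner's dilemma and matching pennies) are assumptions chosen as examples, with `R1[i][j]` and `R2[i][j]` the payoffs of players 1 and 2 when player 1 plays row i and player 2 plays column j.

```python
def pure_nash_equilibria(R1, R2):
    """Return all pure-strategy Nash equilibria (i, j) of a bimatrix game."""
    rows, cols = len(R1), len(R1[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            # player 1 cannot gain by switching rows, given column j
            best_for_1 = all(R1[i][j] >= R1[k][j] for k in range(rows))
            # player 2 cannot gain by switching columns, given row i
            best_for_2 = all(R2[i][j] >= R2[i][l] for l in range(cols))
            if best_for_1 and best_for_2:
                equilibria.append((i, j))
    return equilibria

# General-sum example: prisoner's dilemma (0 = cooperate, 1 = defect).
pd_1 = [[3, 0], [5, 1]]
pd_2 = [[3, 5], [0, 1]]

# Zero-sum example: matching pennies (player 2 receives the negative payoff).
mp_1 = [[1, -1], [-1, 1]]
mp_2 = [[-1, 1], [1, -1]]

pd_equilibria = pure_nash_equilibria(pd_1, pd_2)
mp_equilibria = pure_nash_equilibria(mp_1, mp_2)
```

Matching pennies has no equilibrium in pure strategies, so the search returns an empty list; its only equilibrium is in mixed strategies, which is why optimal policies in Markov games may have to be probabilistic.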
[49] proposes to compute the quality of each action a in state s as
$$Q(s, a) \leftarrow R(s, a) + \gamma \sum_{s'} T(s, a, s')\, V(s'),$$
and the value V(s) of a state as the value of the matrix game at state s. The value operator is given by the minimax value of the game. The difference between this algorithm and value iteration for MDPs is that the max operator is replaced by the value operator. Following this approach, [34] proposes the minimax-Q algorithm. It replaces the max operator in Q-learning with the value operator:
$$Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left( r + \gamma V(s') \right) \tag{18.3}$$
and
$$V(s) = \max_{\pi(s, \cdot) \in \Pi(A_1)} \; \min_{a_2 \in A_2} \; \sum_{a_1 \in A_1} \pi(s, a_1)\, Q(s, a_1, a_2) \tag{18.4}$$
In the two-player case, a in (18.3) stands for the joint action $(a_1, a_2)$.
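Equations (18.3) and (18.4) can be sketched for a small zero-sum game. The sketch makes a strong simplifying assumption: the learning player has exactly two actions, so the maximin over mixed strategies in (18.4) can be found exactly by checking the end points and the crossing points of the opponent's payoff lines. In general a linear program is needed. The function names and the data layout `Q[s][a1][a2]` are hypothetical.

```python
def matrix_game_value(Q_s):
    """Maximin value of a 2 x m zero-sum matrix game for the row player.

    Q_s[a1][a2] is the row player's payoff. With probability p on row 0,
    the worst case over columns is concave and piecewise linear in p, so
    its maximum lies at p = 0, p = 1, or where two column lines cross.
    """
    top, bottom = Q_s
    m = len(top)
    candidates = {0.0, 1.0}
    for j in range(m):
        for k in range(j + 1, m):
            denom = (top[j] - bottom[j]) - (top[k] - bottom[k])
            if denom != 0:
                p = (bottom[k] - bottom[j]) / denom
                if 0.0 <= p <= 1.0:
                    candidates.add(p)

    def worst_case(p):
        return min(p * top[j] + (1 - p) * bottom[j] for j in range(m))

    best_p = max(candidates, key=worst_case)
    return worst_case(best_p), best_p

def minimax_q_update(Q, V, s, a1, a2, r, s_next, alpha=0.1, gamma=0.9):
    """One minimax-Q step: rule (18.3), then re-evaluate V(s) via (18.4)."""
    Q[s][a1][a2] = (1 - alpha) * Q[s][a1][a2] + alpha * (r + gamma * V[s_next])
    V[s], _ = matrix_game_value(Q[s])

# Check on matching pennies: the value is 0 with p = 0.5 on each action.
value, p = matrix_game_value([[1.0, -1.0], [-1.0, 1.0]])

# One update on a hypothetical single-state game.
Q = {0: [[0.0, 0.0], [0.0, 0.0]]}
V = {0: 0.0}
minimax_q_update(Q, V, s=0, a1=0, a2=0, r=1.0, s_next=0)
```

The only structural change compared with single-agent Q-learning is the line that recomputes V(s): the plain maximum over the agent's own actions is replaced by the value of the matrix game stored in Q[s].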
It can be shown that in the zero-sum case this algorithm converges to the true values V*. Consequently, it is straightforward to use Q-learning mechanisms for learning in stochastic games. This is possible for a single-agent learning environment. [28] have introduced an algorithm for multi-agent learning in two-player games. It is assumed that agents observe not only their own actions and payoffs but also the competing agents’ immediate payoffs. Under these assumptions the
algorithm converges to Nash equilibrium strategies. Unfortunately, the assumption of observing the competitors’ immediate payoffs is not applicable in many cases. This section has presented theoretical work from both fields, game theory and computer science, which builds the foundation for solving Markov games. Nevertheless, much theoretical work is still needed in order to prove convergence of Q-learning algorithms in multi-agent learning environments.
18.4 Summary
This chapter deals with the application of simulations in economic research. To this end, it reflects on the methodology of simulation and its applicability to economic models. Agent-based simulations have become more and more popular in recent years, but there is no common understanding of their different methodological approaches and their appropriateness for studying specific problems. Some work has been done to close the gap between simulation approaches and economic theory. The present paper identifies four main techniques used in agent-based simulation, presents a brief review of the methodology and discusses the known similarities to economic theory. Certainly, the four approaches can also be combined according to the economic problem under investigation.
The first technique is the fundamental pure-agent approach that implements agents, the environment and sets of simple, (mostly) static interaction rules. It follows the agent-based philosophy of studying macro behaviour from a micro-oriented model. Pure agent-based modellers are not so much interested in a direct link to standard economic approaches and tools, but try to understand and explain real economic observations as a result of interacting agents. Another approach is the Monte Carlo technique, which can be used either probabilistically to model real-world processes, or deterministically in order to find solutions by sampling the solution space. In the context of agent-based simulations, Monte Carlo techniques are often combined with other simulation approaches, e. g. to model stochastic processes or to simplify learning by stochastic search. Evolutionary methods such as genetic algorithms are applied to search for stable states and to identify optimal strategies. Early work was done (see for example [35, 46]) in order to link computational methods to economic approaches such as evolutionary game theory.
A better theoretical base is essential for economists to gain more confidence in agent-based techniques, but the cited work is a valuable contribution towards closing the gap. The theoretical foundation of Reinforcement Learning is developed by bridging game theory and computer science theory on the basis of Markov decision problems and Markov games. Some work exists on proving the convergence of RL methods to equilibria of economic games. Still, the assumptions in the mathematical foundations are strong, so that the link is only well-founded for zero-sum Markov games. It is still necessary to prove the convergence to optimal behaviour in the more common case of general-sum games. Nevertheless, the progress and the results made in this area are promising.
The presented classification of agent-based approaches cannot claim to be comprehensive. Certainly, there are several other techniques known in computer science, such as neural networks or classifier systems. It is difficult to find a strict classification, since many simulation models combine the presented techniques. Nevertheless, the present paper presents the four most common techniques and their links to economic theory.
References
1. Andreoni JA, Miller JH (1990) Auctions with adaptive artificially intelligent agents. Santa Fe Institute Working Paper, No. 90-01-004
2. Arifovic J (1994) Genetic algorithm learning and the cobweb model. Journal of Economic Dynamics and Control, 18:3–28
3. Arthur WB, Holland J, LeBaron B, Palmer R, Tayler P (1996) Asset pricing under endogenous expectations in an artificial stock market. Technical report, Santa Fe Institute
4. Axelrod R (1987) Evolving new strategies – the evolution of strategies in the iterated prisoner’s dilemma. In Davis L, editor, Genetic Algorithms and Simulated Annealing, Pitman and Morgan Kaufmann, pp. 32–41
5. Axelrod R (2003) Advancing the art of simulation in the social science. Japanese Journal for Management Information System, 12(3), Special Issue on Agent-Based Modeling
6. Axtell R (2000) Why agents? On the varied motivations for agent computing in the social sciences. In Proceedings of the Workshop on Agent Simulation: Applications, Models and Tools, IL, Argonne National Laboratory
7. Banks J (1998) Principles of simulation. In Banks J, editor, Handbook of Simulation, New York, John Wiley & Sons, Inc., pp. 3–31
8. Bellman RE (1957) Dynamic Programming, Princeton University Press, Princeton
9. Bellman RE (1957) A Markov decision process, Journal of Mathematical Mechanics, 6:679–684
10. Blackburn JM (1936) Acquisition of skill: An analysis of learning curves, IHRB Report, No. 73
11. Bonabeau E (2002) Agent-based modeling: Methods and techniques for simulating human systems, in Adaptive Agents, Intelligence, and Emergent Human Organization: Capturing Complexity Through Agent-Based Modeling, Proceedings of the National Academy of Sciences of the USA (PNAS), 99 (3), Washington, DC, National Academy of Sciences of the USA, pp. 7280–7287
12. Bowling M, Veloso M (2000) An analysis of stochastic game theory for multiagent reinforcement learning, Technical Report CMU-CS-00-165, Computer Science Department, Carnegie Mellon University, http://www.cs.ualberta.ca/ bowling/papers/00tr.pdf
13. Cai G, Wurman PR (2005) Monte Carlo approximation in incomplete information, sequential auction games, Decision Support Systems, Special Issue: Decision Theory and Game Theory in agent design, 39(2):153–168
14. Catania C (1999) Thorndike’s legacy: Learning, selection, and the law of effect, Journal of The Experimental Analysis of Behavior, 72(3):425–428
15. Epstein J, Axtell R (1996) Growing Artificial Societies – Social Science From the Bottom Up, MIT Press, Boston
16. Erev I, Roth AE (1998) Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria, The American Economic Review, 88(4):848–881
17. Filar J, Vrieze K (1997) Competitive Markov Decision Processes, Springer Verlag, New York
18. Friedman D (1998) On economic applications of evolutionary game theory. Journal of Evolutionary Economics, 8(1):15–43
19. Fudenberg D, Tirole J (1991) Game Theory, MIT Press, Cambridge
20. Gilbert N, Troitzsch K (2005) Simulation for the Social Scientist, Open University Press, McGraw-Hill Education, Berkshire, England, second edition
21. Gode D, Sunder S (1993) Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality, Journal of Political Economy, 101:119–137
22. Hall (1873) On an experimental determination of pi, Messeng. Math., 2:113–114
23. Hammersley JM, Handscomb DC (1964) Monte Carlo Methods, Spottiswoode, Ballantyne & Co Ltd, London and Colchester
24. Holland JH (1992) Adaptation in Natural and Artificial Systems, MIT Press, Boston, 1st MIT-Press edition (book was originally published 1975)
25. Holland JH, Miller JH (1991) Artificial adaptive agents in economic theory, American Economic Review, Papers and Proceedings, 81:365–370
26. Hommes CH (2006) Heterogeneous agent models in economics and finance, in: Tesfatsion L, Judd KL, editors, Handbook of Computational Economics
27. Howard RA (1960) Dynamic Programming and Markov Processes, The MIT Press, Cambridge, MA
28. Hu J, Wellman MP (1998) Multiagent reinforcement learning: Theoretical framework and an algorithm, in: Proceedings of the 15th International Conference on Machine Learning
29. Judd KL (1997) Computational economics and economic theory: Substitutes or complements? Journal of Economic Dynamics and Control, 21(6):907–942
30. Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: A survey, Journal of Artificial Intelligence Research, 4:237–285
31. Kirman A, The structure of economic interaction: Individual and collective rationality, in: Bourgine P, Nadal JP, editors, Cognitive Economics: An Interdisciplinary Approach, Chapter 18, Springer, Heidelberg, pp. 293–312
32. Krugman P (1996) What economists can learn from evolutionary theorists, talk given to the European Association for Evolutionary Political Economy, http://www.mit.edu/krugman/evolute.html
33. Kunzelmann M, Mäkiö J (2005) Pegged and bracket order as a success factor in stock exchange competition, in: Proceedings of the 2nd Conference FinanceCom05, IEEE
34. Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning, in: Proceedings of the 11th International Conference on Machine Learning, San Francisco, CA, pp. 157–163
35. Marimon R (1993) Adaptive learning, evolutionary dynamics and equilibrium selection in games, European Economic Review, 37:603–611
36. Metropolis N, Ulam S (1949) The Monte Carlo method, Journal of the American Statistical Association, 44(247):335–341
37. Milgrom P, Stokey NL (1982) Information, trade, and common knowledge, Journal of Economic Theory, 26(1):17–27
38. Miller JH (1986) A genetic model of adaptive economic behaviour, Working paper, University of Michigan, Ann Arbor
39. Neumann D (2004) Market Engineering – A Structured Design Process for Electronic Markets, PhD thesis, Universität Karlsruhe (TH), Karlsruhe, Germany
40. Nicolaisen J, Petrov V, Tesfatsion L (2001) Market power and efficiency in a computational electricity market with discriminatory double-auction pricing. IEEE Transactions on Evolutionary Computation, 5(5):504–523, http://www.econ.iastate.edu/tesfatsi/mpeieee.pdf
41. Optimatics (2004) Genetic Algorithm Technology for the Water Industry, information on http://www.optimatics.com/GAhistory.htm
42. Ostrom TM (1988) Computer simulation: The third symbol system, Journal of Experimental Social Psychology, 24(5):381–392
43. Owen G (1982) Game Theory. The MIT Press, Cambridge, MA, second edition
44. Puterman ML (1994) Markov Decision Processes – Discrete Stochastic Dynamic Programming, Wiley Series in Probability and Statistics, John Wiley & Sons Inc., New York
45. Puterman ML, Shin MC (1978) Modified policy iteration algorithms for discounted Markov decision processes, Management Science, 24:1127–1137
46. Riechmann T (2002) Genetic algorithm learning and economic evolution, in: Chen SH, editor, Evolutionary Computation in Economics and Finance, Physica-Verlag, Heidelberg, pp. 45–60
47. Schelling TC (1969) Models of segregation, The American Economic Review, 59(2):488–493
48. Sen S, Weiss G (1999) Learning in multiagent systems, in: Weiss G, editor, Multiagent Systems, The MIT Press, Cambridge (Massachusetts), pp. 259–298
49. Shapley LS (1953) Stochastic games, Proceedings of the National Academy of Sciences of the United States of America, 39(10):1095–1100
50. Steiglitz K, Honig ML, Cohen LM (1996) A computational market model based on individual action, in: Clearwater S, editor, Market-Based Control: A Paradigm for Distributed Resource Allocation, World Scientific, Hong Kong, see http://www.cs.princeton.edu/ ken/scott.ps
51. Sutton RS, Barto AG (2002) Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA
52. Tesfatsion L (2002) Agent-based computational economics: Growing economies from the bottom-up, ISU Economics Working Paper No. 1, February 2002, http://www.econ.iastate.edu/tesfatsi/
53. Thorndike EL (1898) Animal intelligence: An experimental study of the associative processes in animals, Psychological Review Monograph Supplement, II(4, Whole No. 8)
54. Thorndike EL (1901) The Human Nature Club; an Introduction to the Study of Mental Life, Longman, Green and Co., New York, 2nd, enl. edition
55. Thorndike EL (1907) The Elements of Psychology, A.G. Seiler, New York, 2nd edition
56. Thorndike EL (1909) Darwin’s contribution to psychology, University of California Chronicle, 12:65–80
57. Thorndike EL (1911) Animal Intelligence: Experimental Studies, The Animal Behavior Series. Macmillan, New York
58. von Neumann J (1966) Theory of Self-Reproducing Automata, University of Illinois Press, Urbana and London
59. Watkins C (1989) Learning from Delayed Rewards, PhD thesis, University of Cambridge, England
60. Watkins C, Dayan P (1992) Q-learning, Machine Learning, 8(3–4):279–292
CHAPTER 19
The Heterogeneous Agents Approach to Financial Markets – Development and Milestones
Jörn Dermietzel
19.1 Introduction
For economists, financial markets are one of the most interesting mechanisms to investigate. One reason is the availability of vast amounts of data, ranging from high-frequency data to long-term observations covering several decades. On the basis of these huge data sets it is possible to study the dynamics of asset prices balancing supply and demand. A second reason is that there is a sound theoretical basis, which has mainly been developed in the last century. The results from this theory are based on strict assumptions in order to enable researchers to derive and to solve analytical models for financial markets. Since (Mandelbrot 1963), researchers have discovered numerous statistical properties in real market time series that contradict the theoretical results. These so-called stylized facts, together with the paradigm shift away from the completely rational, representative agent to boundedly rational, heterogeneous agents, motivated researchers to model financial markets as complex evolving systems of heterogeneous interacting traders. These models can provide explanations for some of the observed stylized facts, as well as for historical events like the internet bubble in the late 1990s and the stock crashes of 1987 and 1990.
The purpose of this chapter is to provide a survey of financial market models with heterogeneous traders for newcomers and to spark the interest of researchers in related fields. It presents a brief summary of the development from very stylized models more than 25 years ago to recent complex models, which are able to reproduce and explain many empirically observed anomalies in financial markets. The selected models are examples of pioneering works which influenced this development significantly. In order to preserve the introductory character of this article, the description of the models is kept short, and some details are omitted where they are not necessary to understand the idea behind the model or the analysis.
For additional overviews with many references and detailed discussions of related models the reader is referred to the surveys by (LeBaron 2000; LeBaron 2006; Hommes 2006).
The structure of the chapter is as follows: Section 19.2 presents and discusses the basic assumptions of financial theory. Section 19.3 summarizes the most important statistical properties of real-world financial time series, which contradict the results of financial theory. In Sections 19.4 to 19.6 the most important models of financial markets with heterogeneous traders are briefly described. Section 19.7 presents related questions in financial markets, and Section 19.8 concludes and provides an outlook on further research activities.
19.2 Fundamental Assumptions in Financial Theory
Financial theory is based on strong assumptions on the behaviour of traders and on the market environment. These assumptions enable researchers to model and to predict financial markets by using mathematical tools.
19.2.1 Rationality
Rationality is one of the most important assumptions in economics, meaning that every economic individual has rational expectations and acts optimally in an economic sense given these expectations (see (Sargent 1993)). (Friedman 1953) claims that the behaviour of economic individuals in markets can be described as if they behave rationally, since the Friedman Hypothesis states that biological competition will eliminate all non-rational traders. The concept of rationality also implies that individuals have unlimited knowledge about their environment, form beliefs that perfectly reflect this knowledge, and are able to derive optimal decisions based on these beliefs, no matter how complex the tasks are. Following this argumentation, the rational expectation hypothesis (REH) was introduced by (Muth 1961; Lucas 1971). It states that beliefs are always perfectly consistent with realizations, i. e. rational individuals do not make systematic forecasting errors. According to the Friedman hypothesis these assumptions exclude rules of thumb and market psychology as profitable trading strategies, whereas (Keynes 1936) argues that investor sentiment and market psychology play an important role in financial markets. (Simon 1957) claims that individuals are boundedly rational and use simple rules of thumb, because they are limited in their knowledge about the environment as well as in their computational skills. The most important argument by Simon is that in reality individuals have to face search costs for obtaining sophisticated information. Indeed, it seems highly unrealistic that individuals are able to have full information and to predict every impact arbitrary events can have on the market. A formal proof of the complexity of financial markets is given in (Aspnes et al. 2001). The authors present a simple asset pricing model and show that for
a reasonable heterogeneity among trading strategies the task of predicting the future asset price is not solvable in polynomial time. This underlines the theoretical character of the concept of full rationality. The complexity of financial markets forces traders to use imperfect rules of thumb in order to react fast enough to the changing market environment.
19.2.2 Efficient Market Hypothesis
The Efficient Market Hypothesis (EMH) dates back to (Fama 1965). One distinguishes between weak, semi-strong and strong market efficiency, which means that market prices reflect either all information contained in past prices, all public information, or all public and private information, respectively. This implies that past prices do not contain any usable information that is not represented by current prices. The argument for efficient markets is that every profit opportunity in an inefficient market would be exploited by rational arbitrageurs and thereby disappear, which drives the price immediately to its fundamental value. Thus, in efficient markets there is no forecastable structure in asset returns. Here again, the criticism of (Simon 1957) comes into play. (Grossman and Stiglitz 1980) argue that if information is costly, e. g. due to search costs, there is no incentive in an efficient market to obtain the costly information. But the question is then: how did the market become informationally efficient?
19.2.3 Representative Agent Theory
The assumption of rationality leads to the representative agent theory, which aggregates the behaviour of a population of economic individuals into one representative agent who acts perfectly rationally. Since it is assumed that all economic individuals are perfectly rational and share common knowledge, they will all behave the same. Thus one can describe the behaviour of a population of rational individuals as if it were one rational individual. The representative agent theory implies the no trade theorem of (Milgrom and Stokey 1982). In a market of homogeneous traders prices always reflect the fundamental price of the underlying asset. No one is willing to sell at lower prices and no one is willing to buy at higher prices. At equilibrium prices there is no arbitrage opportunity. Thus there is no incentive to trade at equilibrium prices. Of course one can assume that there are always some less rational traders around who provide liquidity around equilibrium prices, but empirical observations show heavy trading, which contradicts the no trade theorem. A possible explanation of excess trading volumes is heterogeneity among traders. This explanation is supported by questionnaires conducted among financial practitioners. The models presented in this article are based on the assumption that traders are neither perfectly rational nor homogeneous in their expectations.
Jörn Dermietzel
19.3 Empirical Observations
In the 70s and 80s the assumptions presented above, namely representative agents, rational expectations and the efficient market hypothesis, became the dominant paradigms in finance and in economics in general. But almost in parallel to the development of this theory, empirical observations led to increasing criticism of these strong assumptions. (Mandelbrot 1963) was the first to discover empirical properties in financial time series that contradicted financial theory. Other studies use the statistical properties found in financial time series to show that market prices are not as unpredictable as suggested by the efficient market hypothesis. Examples are (Campbell and Shiller 1988; Lo and MacKinlay 1988; Dacorogna et al. 1995). Their results are strengthened by (Brock et al. 1992), who test a number of simple, frequently used technical trading rules on real time series and find that these can generate significant positive returns. Other researchers, like (Hansen and Singleton 1983; Frankel and Froot 1986; Mehra and Prescott 1988; Cutler et al. 1989), call the strong connection between financial markets and macroeconomic fundamentals into question. Today a number of stylized statistical properties are known. The most important are:
Excess volatility: The volatility of asset prices is higher than the volatility of the underlying economic fundamentals.
Volatility clustering: The variance of asset prices is not stationary. There are switches between phases of high volatility and phases of low volatility.
Fat tails: The distribution of price changes has more weight in the extremes and a higher peak around the mean than the normal distribution suggests.
More stylized facts and extended discussions of these findings are provided e. g. in (Cont 2001; Shiller 2003). In addition to the analysis of real world time series and the underlying fundamentals, researchers have conducted questionnaires among financial practitioners, e. g. (Allen and Taylor 1990; Ito 1990; Taylor and Allen 1992) and Frankel and Froot, who published a whole series of papers with their data in the late 80s and early 90s (see (Frankel and Froot 1986, 1987a, b, 1990a, b)). These studies show that financial practitioners indeed use chart analysis and similar trading techniques in addition to fundamental analysis. In contrast to the Friedman hypothesis these alternative trading techniques did not vanish. Instead their importance has increased since then, which is a hint at their success in terms of significant excess returns. As a result of the discrepancies between real markets and theoretical models, heterogeneous agent models and bounded rationality became increasingly popular in economics and finance in the late 80s and 90s. The developments in mathematics, physics and computer science regarding nonlinear dynamics, chaos and complex
systems motivated economists to apply these tools to financial markets. This led to the interpretation of financial markets as complex evolving systems, as emphasized e. g. in (Anderson et al. 1988). The growing availability of computational tools also stimulated the development of this field enormously.
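Two of the stylized facts listed in this section, fat tails and volatility clustering, can be checked on any return series with simple sample statistics. The following is a minimal sketch, not a procedure taken from the cited studies: the function names and the synthetic test series (alternating calm and turbulent phases) are assumptions made for illustration only. Excess volatility is not covered, because it requires data on fundamentals.

```python
import numpy as np

def excess_kurtosis(returns):
    """Sample excess kurtosis; values above 0 indicate fatter tails than the normal distribution."""
    r = np.asarray(returns, dtype=float)
    centered = r - r.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given positive lag."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    return np.sum(centered[lag:] * centered[:-lag]) / np.sum(centered ** 2)

def volatility_clustering(returns, lag=1):
    """Autocorrelation of squared returns; clearly positive values indicate volatility clustering."""
    return autocorrelation(np.asarray(returns, dtype=float) ** 2, lag)

# Hypothetical test series: calm and turbulent phases alternate in blocks of 500 periods.
rng = np.random.default_rng(0)
sigma = np.repeat([0.5, 2.0, 0.5, 2.0], 500)
returns = sigma * rng.standard_normal(sigma.size)
print(excess_kurtosis(returns), volatility_clustering(returns))
```

Applied to real data, `returns` would be replaced by observed log price changes; the synthetic series merely mimics switches between phases of low and high volatility.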
19.4 Evolution of Models
19.4.1 Fundamentalists vs. Chartists
Heterogeneous expectations of traders can have different reasons, e. g. different information accuracy, different interpretation of information, or different preferences. Most analytical models concentrate on the two typical types of traders found in financial practice, namely fundamental traders and chart traders.
Fundamental traders believe that the price of an asset will return to its fundamental value in the medium to long run. Therefore they invest in undervalued assets, i. e. assets whose price is below the fundamental value. Analogously they sell assets that are overvalued. With their demand and supply they drive the prices of the assets towards their fundamental values. This is the behaviour usually assumed in financial theory, because it is solely based on the development of economic news and the current price that reflects this news.
Chart traders believe that there is useful information in past prices, which contradicts financial theory. Their trading is based on the extrapolation of price trends, expecting a trend to persist in the short run. Accordingly they buy an asset if its price shows an upward trend and they sell if decreasing prices are expected. By prolonging observed trends, which were initially started by changes of the fundamental value, these traders can produce substantial deviations from fundamental prices. The interaction of fundamental and chart traders can generate complex dynamics that exhibit statistical properties similar to those of real world price series.
One of the first models with fundamental and chart traders was developed in (Beja and Goldman 1980). They present a very stylized model that is designed to explain the implications of speculative behaviour for prices, i. e. whether past prices contain information that traders can exploit, which is a basic requirement for chart trading. Although the model lacks any micro foundation, it is able to produce substantial deviations of prices from the underlying fundamental price.
This observation contradicts the assumption of financial theory that market prices always reflect the fundamental price of an asset. The model is based on the assumption that prices adjust to excess demand or supply with a given finite speed, instead of adjusting instantaneously. This method, applied in (Beja and Goldman 1980), is today called the market maker mechanism. In real markets market makers provide liquidity. They hold an inventory from which they offer stocks at a fixed sell price, and they buy stocks at a fixed buy price. These prices are permanently adjusted to the market, i. e. if there is excess demand
the market maker raises the prices, if there is excess supply he lowers the prices. By keeping the spread between sell and buy price small, the market maker guarantees liquidity around equilibrium prices. Financial market models often use a stylized version of a market maker, who adjusts the price to observed demand and supply, in order to determine the market price. Accordingly the price rises as a reaction to excess demand and falls if excess supply is observed. Another well known model using a market maker to determine asset prices is presented in (Day and Huang 1990). The performance of automated market makers in financial markets is analyzed in (Hakansson et al. 1985). Usually one assumes the price change to be a linear function of excess demand with a constant speed of adjustment. Recently (Farmer 2002) as well as (Farmer and Joshi 2002) presented a model with nonlinear price adjustment. They define a log-linear price impact function which is closely related to the linear market maker. In their model Beja and Goldman use a linear market maker that adjusts prices to the aggregated excess demand of fundamental traders and chart traders, the so-called speculators. With a dot denoting the derivative with respect to time, that is
\[
\dot{P}(t) = \underbrace{a\,[P^f(t) - P(t)]}_{ED^f(t)} + \underbrace{b\,[\psi(t) - r(t)]}_{ED^c(t)} + \varepsilon(t),
\tag{19.1}
\]
where $P(t)$ is the logarithm of the current asset price and $P^f(t)$ is the logarithm of the exogenously given fundamental price, i. e. the price that perfectly reflects the underlying fundamentals of the asset. The excess demand of fundamental traders $ED^f(t)$ is a linear function of the difference between the fundamental price and the current price. Thus fundamental excess demand drives the price back to its fundamental value. The excess demand of speculators $ED^c(t)$ is determined by the difference between the anticipated price trend of the asset $\psi(t)$ and the return $r(t)$ of an exogenously given alternative investment opportunity. The anticipated price trend can be seen as the average of the heterogeneous expectations of speculators and is revised according to
\[
\dot{\psi}(t) = c\,[\dot{P}(t) - \psi(t)],
\tag{19.2}
\]
where c is a positive constant that determines how flexibly the anticipated trend adapts to the current price change. While the parameters a, b and c are invariant, the variables $P^f(t)$ and $r(t)$ may change over time by following a stochastic process. (Beja and Goldman 1980) analyze the stability of their model for constant exogenous parameters, i. e. $P^f(t) = P^f$, $r(t) = 0$ and $\varepsilon(t) = 0$ for all t. They show that the system is stable if and only if
\[
a > c\,(b - 1)
\tag{19.3}
\]
and it is oscillating if and only if
\[
\left(1 - \sqrt{a/c}\right)^2 < b < \left(1 + \sqrt{a/c}\right)^2 .
\tag{19.4}
\]
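Conditions 19.3 and 19.4 can be cross-checked numerically from the eigenvalues of the linear system. The sketch below assumes the reading of the model in which the anticipated trend adapts to the current price change, i. e. dψ/dt = c (dP/dt − ψ), and uses constant exogenous parameters as in the text. The function names are hypothetical and not part of the original model description.

```python
import numpy as np

def classify_by_eigenvalues(a, b, c):
    """Classify the fixed point of the deterministic Beja-Goldman system.

    With x = P - P^f, r = 0 and eps = 0 the system reads
        dx/dt   = -a*x + b*psi
        dpsi/dt = c*(dx/dt - psi) = -a*c*x + c*(b - 1)*psi
    Returns (stable, oscillating).
    """
    jacobian = np.array([[-a, b],
                         [-a * c, c * (b - 1.0)]])
    eigenvalues = np.linalg.eigvals(jacobian)
    stable = bool(np.all(eigenvalues.real < 0))                # all real parts negative
    oscillating = bool(np.any(np.abs(eigenvalues.imag) > 1e-12))  # complex eigenvalue pair
    return stable, oscillating

def classify_by_conditions(a, b, c):
    """Same classification using conditions (19.3) and (19.4) directly."""
    root = np.sqrt(a / c)
    stable = a > c * (b - 1.0)
    oscillating = (1.0 - root) ** 2 < b < (1.0 + root) ** 2
    return bool(stable), bool(oscillating)

# Illustrative parameter values (assumed, not taken from the source).
for a, b, c in [(1.0, 1.5, 1.0), (1.0, 3.0, 1.0), (1.0, 5.0, 1.0), (4.0, 0.5, 1.0)]:
    print(a, b, c, classify_by_eigenvalues(a, b, c), classify_by_conditions(a, b, c))
```

For parameter values away from the boundaries both functions should agree, which gives a quick plausibility check of the two conditions.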
Figure 19.1. Stability conditions in (Beja and Goldman 1980), with a/c on the horizontal axis and b on the vertical axis: The dotted area represents stable parameter constellations. Parameters in the grey area lead to oscillating prices with increasing resp. decreasing amplitudes
That means a high amount of speculative demand, i. e. high b, or a high flexibility in the anticipation of trends, i. e. high c, leads to instability. These properties are visualized in Figure 19.1. The analysis of the convergence to the fundamental price by Beja and Goldman shows that speculation can accelerate convergence, but it can also lead to oscillating prices and instability. With their model Beja and Goldman show not only that chart trading can be successful but also that speculative behaviour plays an important role in the dynamics of asset prices. Both observations are in strong contrast to classical financial theory, which states that past prices do not contain useful information and that there is thus no substantial chart trading. In the early ’90s the ongoing debate on market efficiency and the random walk model, together with the market crash of ’87, motivated researchers to intensify the search for alternative models of financial markets. In this context (Chiarella 1992) extended the model of (Beja and Goldman 1980). He replaced the linear function that determines the excess demand of speculators in Equation 19.1 by a non-linear, s-shaped function h(·). This changes Equation 19.1 into
\[
\dot{P}(t) = \underbrace{a\,[P^f(t) - P(t)]}_{ED^f(t)} + \underbrace{h(\psi(t) - r(t))}_{ED^c(t)} + \varepsilon(t).
\tag{19.5}
\]
The following properties of h(·) qualify this function to represent the aggregated excess demand of many traders with different levels of wealth, time horizons and preferences.
1. h′(x) > 0 for all x,
2. h(0) = 0,
3. there exists an x* with h′′(x) < 0 (> 0) for all x > x* (< x*),
4. lim_{x→±∞} h′(x) = 0.
The analysis of the model shows that for the unstable case there exists a stable limit cycle along which the model fluctuates. In addition (Chiarella 1992) presents a discrete time version of the model which is able to generate chaotic motion around the fundamental price. This is an important step towards the replication of realistic price fluctuations. Chiarella points out that the techniques of dynamic optimization theory are not sophisticated enough to analyze a model with a detailed microstructure that could generate similar dynamics, although such a model would be desirable. Therefore the properties of h(·) are an approximation of the aggregation of such microscopic behaviour. Besides the development of a detailed microstructure he also suggests incorporating endogenous changes of the parameters a, b and c, which would lead to temporary regime switching between convergence to fundamental prices, limit cycle motion and chaotic fluctuations.
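The bounded fluctuation in the unstable case can be illustrated with a short simulation. The sketch below is an assumption-laden illustration, not Chiarella's own specification: it uses h(y) = k·tanh(b·y/k), which satisfies the four properties above with x* = 0 and h′(0) = b, sets r(t) = 0 and ε(t) = 0, reads the trend revision as dψ/dt = c (dP/dt − ψ), and integrates with a simple Euler scheme. The function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_chiarella(a=1.0, b=3.0, c=1.0, k=1.0, x0=0.01, dt=0.01, steps=40000):
    """Euler integration of the deterministic model with an s-shaped demand function.

    x is the deviation of the log price from the (constant) log fundamental price,
    psi is the anticipated price trend.
    """
    x, psi = x0, 0.0
    path = np.empty(steps)
    for n in range(steps):
        price_change = -a * x + k * np.tanh(b * psi / k)   # dP/dt with h(psi)
        # Both updates use the values of x and psi from the previous step.
        x, psi = x + dt * price_change, psi + dt * c * (price_change - psi)
        path[n] = x
    return path

unstable = simulate_chiarella(b=3.0)   # violates a > c*(b - 1): fixed point unstable
stable = simulate_chiarella(b=1.5)     # satisfies a > c*(b - 1): fixed point stable
print(np.max(np.abs(unstable[-10000:])), np.max(np.abs(stable[-10000:])))
```

With the linear demand function of Beja and Goldman the unstable case would diverge; here the saturation of h(·) keeps the price deviation bounded, so the trajectory keeps oscillating around the fundamental price instead of exploding.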
19.4.2 Social Interaction
The above models focus on the economic interaction of different groups of traders, i. e. fundamentalists and chartists. Another aspect that may play an important role in real markets is social interaction among traders. (Kirman 1991; Kirman 1993) describe the local interaction in ant populations and suggest carrying these dynamics over to the behaviour of traders in financial markets. (Lux 1995) introduces social interaction into a financial market model. He divides the group of chartists into optimistic and pessimistic chartists. The transitions between these two subgroups are determined by transition laws based on the synergetics literature of (Haken 1993). Accordingly the probabilities of pessimists becoming optimists and vice versa are given by
\[
\pi_{+-} = \upsilon \exp(U),
\tag{19.6}
\]
\[
\pi_{-+} = \upsilon \exp(-U),
\tag{19.7}
\]
with
\[
U = \alpha_1 x + \alpha_2 \frac{\dot{p}}{\upsilon}.
\tag{19.8}
\]
The speed of change is given by υ. While the first part of the exponent represents the social interaction, the second part represents the current price process.
These two forces are weighted by $\alpha_1$ and $\alpha_2$. Here x is an opinion index ranging from −1, i. e. all chartists are pessimistic, to +1, i. e. all chartists are optimistic. That means the more chartists are optimistic, the more pessimists are converted into optimists, and vice versa. Assuming that each pessimistic chartist sells a fixed amount $s^c$ of asset shares and each optimistic chartist buys $s^c$ asset shares, the excess demand of chartists sums to
\[
ED^c = (n_+ - n_-)\, s^c,
\tag{19.9}
\]
where $n_+$ and $n_-$ denote the numbers of optimistic and pessimistic chartists. The excess demand of fundamentalists is determined by the difference between the fundamental value $p^f$ and the actual price p. That is
\[
ED^f = s^f\,(p^f - p).
\tag{19.10}
\]
Together 19.9 and 19.10 determine the excess demand ED the market maker has to face. The latter adjusts the price according to
\[
\dot{p} = \beta\, ED,
\tag{19.11}
\]
with
\[
ED = ED^c + ED^f .
\tag{19.12}
\]
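The building blocks 19.6 to 19.12 can be collected in a few lines of code. This is a minimal sketch under assumptions: the function names, the default parameter values and the example numbers are chosen for illustration only and are not taken from (Lux 1995).

```python
import math

def transition_rates(x, price_trend, v=0.5, alpha1=1.0, alpha2=0.5):
    """Transition rates between pessimistic and optimistic chartists, cf. (19.6)-(19.8).

    x is the opinion index in [-1, 1], price_trend the current price change per unit of time.
    Returns (rate pessimist -> optimist, rate optimist -> pessimist).
    """
    u = alpha1 * x + alpha2 * price_trend / v
    return v * math.exp(u), v * math.exp(-u)

def market_maker_price_change(n_plus, n_minus, p, p_f, s_c=1.0, s_f=1.0, beta=0.1):
    """Price adjustment of the market maker, cf. (19.9)-(19.12)."""
    ed_chartists = (n_plus - n_minus) * s_c      # optimists buy, pessimists sell
    ed_fundamentalists = s_f * (p_f - p)         # buy below, sell above fundamental value
    return beta * (ed_chartists + ed_fundamentalists)

# Illustrative state: a majority of optimists and a price equal to the fundamental value.
print(transition_rates(x=0.2, price_trend=0.0))
print(market_maker_price_change(n_plus=60, n_minus=40, p=10.0, p_f=10.0))
```

In this example the optimistic majority raises the rate at which pessimists become optimists and pushes the price upwards although it already equals the fundamental value, which is the self-reinforcing mechanism behind bubbles in this class of models.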
Note that the total numbers of chartists and fundamentalists are given exogenously and stay constant. (Lux 1998) extends the model by allowing traders to switch between both strategies. Switching between optimistic or pessimistic chartists and fundamentalists is based on the profit differential between these strategies. In addition an exit/entry process of traders is modeled that leads to permanent fluctuations. Another extension is made in (Lux 1997). By adding noise to the excess demand observed by the market maker, equation 19.12 becomes
\[
ED = ED^c + ED^f + \varepsilon,
\tag{19.13}
\]
where ε is i.i.d. with mean 0 and has a symmetric probability density function φ(ε). Assuming that there exists a minimal price step, e. g. cents or pence, one can then determine the probabilities of the price to increase or decline by this minimal step:
\[
\pi_{\uparrow p} = \beta \int_{-ED}^{\infty} (ED + \varepsilon)\,\varphi(\varepsilon)\,d\varepsilon,
\tag{19.14}
\]
\[
\pi_{\downarrow p} = \beta \int_{-\infty}^{-ED} (-ED - \varepsilon)\,\varphi(\varepsilon)\,d\varepsilon .
\tag{19.15}
\]
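A short worked step shows how the two probabilities relate to the deterministic price adjustment 19.11. It is a sketch under the assumption that ED inside the integrals denotes the noise-free part ED^c + ED^f and that the mean of ε exists and equals zero, as stated above.

```latex
\begin{align*}
\pi_{\uparrow p} - \pi_{\downarrow p}
  &= \beta \int_{-ED}^{\infty} (ED + \varepsilon)\,\varphi(\varepsilon)\,d\varepsilon
   - \beta \int_{-\infty}^{-ED} (-ED - \varepsilon)\,\varphi(\varepsilon)\,d\varepsilon \\
  &= \beta \int_{-\infty}^{\infty} (ED + \varepsilon)\,\varphi(\varepsilon)\,d\varepsilon \\
  &= \beta \left( ED + E[\varepsilon] \right) = \beta\, ED .
\end{align*}
```

Under these assumptions the expected price change per minimal step therefore coincides with the deterministic rule, while the noise only adds dispersion around it.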
All three models produce time series that exhibit major stylized facts. While (Lux 1995) concentrates on the occurrence of bubbles and crashes, (Lux 1998) points out the fat tails of the return distribution and (Lux 1997) focuses on the time variation of second moments of the return distribution.
19.4.3 Adaptive Beliefs
(Brock and Hommes 1998) introduce a new class of models called Adaptive Belief Systems (ABS). These are based on the cobweb framework of (Brock and Hommes 1997). The basic idea is to provide a framework for models of financial markets where strategies differ only by the beliefs of the different types of traders, i. e. by the information set and its interpretation. Once the forecast based on this information is made, all traders act optimally given their individual forecast. There are two assets to invest in: one risky asset with endogenous price $p_t$ paying a stochastic dividend $d_t$ and one risk free asset paying a constant interest rate r = R − 1. Traders are assumed to be myopic mean variance maximizers with risk aversion γ, who try to maximize their expected wealth, which evolves according to
\[
W_{t+1} = \underbrace{R\,(W_t - p_t z_t)}_{\text{risk free investment}} + \underbrace{(p_{t+1} + d_{t+1})\, z_t}_{\text{risky investment}}
= R\,W_t + (p_{t+1} + d_{t+1} - R\,p_t)\, z_t,
\tag{19.16}
\]
where zt is the number of stock shares the trader holds. Thus the first part of equation 19.16 is the return of the wealth-fraction invested in the risk free asset, while the second part is the return from the risky asset. Suppose there is a type of traders called h. Given the conditional expectation Eht and variance Vht these traders face the maximization problem
\[
\max_{z_{ht}} \left\{ E_{ht} W_{h,t+1} - \frac{\gamma}{2}\, V_{ht} W_{h,t+1} \right\}.
\tag{19.17}
\]
Assuming that all traders hold the same expected conditional variance, i. e. $V_{ht} = \sigma^2$ for all types h, equation 19.17 leads to the optimal demand of shares
\[
z_{ht} = \frac{E_{ht}(p_{t+1} + d_{t+1} - R\,p_t)}{\gamma \sigma^2}.
\tag{19.18}
\]
The model of (Brock and Hommes 1998) assumes that the market is always cleared, i. e. the outside supply of shares per investor $z_{st}$ equals the aggregated demand of traders. That is
\[
\sum_h n_{ht}\, \frac{E_{ht}(p_{t+1} + d_{t+1} - R\,p_t)}{\gamma \sigma^2} = z_{st},
\tag{19.19}
\]
where $n_{ht}$ denotes the fraction of investors of type h. By setting $n_{ht} = 1$ and $z_{st} = 0$, i. e. there is only one type of traders and no outside supply, one can derive the fundamental price
\[
p_t^f = \frac{d}{r},
\tag{19.20}
\]
and thereby define the deviation of the price from its fundamental value
\[
\Delta p_t = p_t - p_t^f .
\tag{19.21}
\]
This deviation of the price from its fundamental value is one part of the dynamic system Brock and Hommes analyze. The second part consists of the fractions of investors of the different types. In order to define the dynamics of these fractions one first has to introduce the beliefs of the different types of traders. These are of the form
\[
E_{ht}(p_{t+1} + d_{t+1}) = E_t(p^f_{t+1} + d_{t+1}) + f_h(\Delta p_{t-1}, \ldots, \Delta p_{t-L}).
\tag{19.22}
\]
Thus all traders share the first term of the forecast, while the second term, the deterministic function $f_h(\cdot)$, can differ across trader types h. The selection of trading strategies is based on a fitness function $\Pi_{ht}$, which is determined by realized profits. This leads to the discrete choice probabilities (see footnote 1)
\[
n_{ht} = \frac{\exp(\upsilon\,\Pi_{h,t-1})}{U_t},
\tag{19.23}
\]
with
\[
U_t = \sum_h \exp(\upsilon\,\Pi_{h,t-1}),
\tag{19.24}
\]
where υ is the intensity of choice. (Brock and Hommes 1998) analyze the dynamics for different initial populations consisting of up to four types of traders. The beliefs of these types are characterized by simple linear functions of the form
\[
f_{ht} = a_h\,\Delta p_{t-1} + b_h,
\tag{19.25}
\]
where $a_h$ is the trend parameter and $b_h$ represents a bias. Given this form of beliefs one can model a number of typical trading behaviours, e. g. fundamentalists and chartists. (Brock and Hommes 1998) analyze the stability of systems with two, three and four different types of traders and show that heterogeneity leads to persistent deviations from the fundamental price. A high intensity of switching between strategies leads to irregular, chaotic price fluctuations. In particular, the setup with a population mixture of fundamentalists, trend chasers, contrarians and traders with biased beliefs produces time series that exhibit important stylized facts.
Footnote 1: This is a simplified representation. The model of Brock and Hommes (1998) contains more parameters, e. g. the memory strength, which are not taken into account in their analysis.
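The discrete choice probabilities 19.23 and 19.24 and the linear belief rules 19.25 translate directly into code. The following is a minimal sketch: the function names and the example fitness values are assumptions for illustration, and the realized-profit fitness measure itself is taken as given rather than computed.

```python
import numpy as np

def type_fractions(fitness, intensity):
    """Fractions n_h = exp(v * Pi_h) / sum_k exp(v * Pi_k), cf. (19.23)-(19.24)."""
    scaled = intensity * np.asarray(fitness, dtype=float)
    scaled -= scaled.max()          # a common shift leaves the fractions unchanged but avoids overflow
    weights = np.exp(scaled)
    return weights / weights.sum()

def linear_belief(a_h, b_h, last_deviation):
    """Belief rule f_h = a_h * dp_{t-1} + b_h, cf. (19.25)."""
    return a_h * last_deviation + b_h

# Hypothetical fitness values of two trader types in the previous period.
fitness = [0.2, 0.5]
for intensity in (0.0, 1.0, 10.0):
    print(intensity, type_fractions(fitness, intensity))

# Fundamentalist (a = 0, b = 0) versus trend chaser (a > 0, b = 0) for a deviation of 2.0.
print(linear_belief(0.0, 0.0, 2.0), linear_belief(1.2, 0.0, 2.0))
```

With intensity 0 all types receive equal weight regardless of performance; as the intensity grows, almost all traders switch to the most successful type, which is the regime in which the text reports irregular price fluctuations.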
There exist many extensions and variations of the framework presented by (Brock and Hommes 1998). For example, (Chiarella and He 2001) replace the constant absolute risk aversion (CARA) utility of (Brock and Hommes 1998) by constant relative risk aversion (CRRA) utility in order to link the relative wealth of the traders to their demand and to the price function. They set up a model with growing price and wealth processes that is stationary in terms of return and wealth proportions. In this paper traders are not allowed to switch between strategies. The switching between strategies is added to the model in the chapter by (Chiarella and He) in this handbook. Other extensions of (Brock and Hommes 1998) are (Chiarella and He 2002; Chiarella and He 2003). In the first article the robustness of the results of Brock and Hommes is tested. To this end the authors relax the assumptions of homogeneous degrees of risk aversion and homogeneous expected conditional variances. They also investigate the effects of different memory lengths on the dynamics. In the latter article the authors replace the equilibrium model by a more realistic market mechanism, i. e. a linear market maker as introduced in (Beja and Goldman 1980; Day and Huang 1990). A very recent extension of the model is (Chang 2007), in which the impact of social interaction in the framework of (Brock and Hommes 1998) is analyzed.
19.4.4 Artificial Stock Markets
With the availability of computational power, researchers were able to construct simulation models and thereby to overcome the limits of mathematical modelling and analytical tractability. One of the most important models of financial markets, namely the Santa Fe Artificial Stock Market (SF-ASM), is such a simulation model. It combines the framework and parts of the theory from financial economics with techniques from computer science, especially from artificial intelligence. The basic model was presented in (LeBaron et al. 1999; Arthur et al. 1997). The economic framework is similar to (Brock and Hommes 1998). Traders can invest in one risky and one risk free asset. While the risky asset pays a stochastic dividend $d_t$, the risk free one pays a constant interest rate r. The price of the risky asset is determined endogenously by balancing supply and demand analytically. Similar to (Brock and Hommes 1998) traders are myopic mean variance maximizers, who seek to maximize their expected one period wealth. Each trader forecasts the future return of the risky asset according to
\[
E_t(p_{t+1} + d_{t+1}) = a_{ij}\,(p_t + d_t) + b_{ij}.
\tag{19.26}
\]
The parameters $a_{ij}$ and $b_{ij}$ are not only heterogeneous among traders but also depend on the current status of the market, i. e. current fundamental and technical signals. These signals are encoded in a market vector of twelve bits, as illustrated in Table 19.1.
Table 19.1. Market vector (left) and trading rule (right) in the SF-ASM (cf. LeBaron et al. 1999)
Market vector:
Bit   Condition
1     pt*r/dt > 1/4
2     pt*r/dt > 1/2
3     pt*r/dt > 3/4
4     pt*r/dt > 7/8
5     pt*r/dt > 1
6     pt*r/dt > 9/8
7     pt > 5-period Moving Average
8     pt > 10-period MA
9     pt > 100-period MA
10    pt > 500-period MA
11    Always 1
12    Always 0
Each trader is able to store parameter tuples for 100 market situations. Each of these trading rules consists of a condition part, a forecasting part and a fitness measure, as shown in Table 19.1. The condition part of rule j consists of twelve symbols (mj,1, … , mj,12), each taken from {0, 1, #}. In order to identify the current market situation, these symbols are compared to the current market vector, where 1 matches 1, 0 matches 0 and # matches both. If all positions match, the rule becomes active. There is at least one active rule, namely the one with m1 = m2 = … = m12 = #, which matches every market status, but there might also be several active rules. Among these the rule with the highest fitness is used to forecast the future return as given in equation 19.26. The expected variance of the forecast corresponds to the fitness of the chosen rule. Traders learn in two ways. First, the fitness of all active trading rules is updated at the end of each trading period in order to reflect the recent forecasting accuracy of the rules. The second mechanism is a genetic algorithm, which is activated repeatedly. The genetic algorithm replaces the 20 rules with the lowest fitness of each agent by mutations and recombinations of well performing rules. Thereby traders repeatedly adjust their rule set to recent market observations. There exist numerous extensions and alternative designs of the model. Most of these are reviewed in (LeBaron 2006). (LeBaron 2002) gives an inside view of the design decisions the developers of the model had to face. The complexity of the model and the many design alternatives give rise to criticism, because the large number of parameters that have to be set makes it difficult to pin down clear causalities. Nevertheless the authors show that for slow learning, i. e. the genetic algorithm is activated every 200 periods on average, the arising price dynamics correspond to the rational expectations equilibrium (REE), i. e. the risk adjusted fundamental price. For faster learning, i. e. the genetic algorithm is activated every 50 periods on average, the price dynamics exhibit major stylized facts, and phases of substantial deviation from the rationally expected equilibrium price are observed.
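The matching of condition parts against the market vector and the selection of the forecasting rule can be sketched as follows. This is an illustration under assumptions, not the SF-ASM implementation: rules are represented as hypothetical dictionaries, the market vector as a string of twelve bits, and all numerical values are invented for the example. Fitness updating and the genetic algorithm are left out.

```python
def rule_matches(condition, market_vector):
    """A rule is active if every position is '#' or equals the corresponding market bit."""
    return all(c == "#" or c == m for c, m in zip(condition, market_vector))

def select_rule(rules, market_vector):
    """Among all active rules, pick the one with the highest fitness."""
    active = [rule for rule in rules if rule_matches(rule["condition"], market_vector)]
    return max(active, key=lambda rule: rule["fitness"])

def forecast(rule, price, dividend):
    """Linear forecast of next period's price plus dividend, cf. (19.26)."""
    return rule["a"] * (price + dividend) + rule["b"]

# Hypothetical market state: bits 1-4 and 7-8 are set, bit 11 is always 1, bit 12 always 0.
market_vector = "111100110010"

# Hypothetical rule set; the first rule matches every market state.
rules = [
    {"condition": "############", "a": 1.00, "b": 0.0, "fitness": 0.1},
    {"condition": "1111########", "a": 0.95, "b": 1.0, "fitness": 0.5},
    {"condition": "#####1######", "a": 1.10, "b": 0.0, "fitness": 0.9},
]

chosen = select_rule(rules, market_vector)
print(chosen["condition"], forecast(chosen, price=100.0, dividend=5.0))
```

The third rule has the highest fitness but requires bit 6 to be set, so it stays inactive in this market state and the second rule is used. Because the all-# rule is always active, `select_rule` never receives an empty list.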
In the SF-ASM model the traders follow standard fundamental and technical trading patterns. (Daniel et al. 2005) use human subject experiments in order to compare the performance and the impact of very heterogeneous trading approaches. To this end, humans from different backgrounds program and update their own traders, which act in an artificial stock market model. This method allows the authors to analyse a huge variety of non-standard trading strategies directly specified by humans.
19.5 Competition of Trading Strategies
Different trading strategies have a huge impact on the performance of financial market models. In the models described in the previous sections, traders use strategies based on reasonable beliefs. These are mostly fundamental signals and trend analysis. If a selective framework is applied, these strategies coexist in the market, with temporary dominance of one of the two strategies. But neither of the two is eliminated permanently from the market. Thus, there are phases when one strategy works better than the other, but on average both strategies perform well. Nevertheless it is important to analyze the returns of different trading strategies in order to identify dominant strategies and the motivation for strategy selection. One important aspect is the impact of strategies on market dynamics and thereby on the returns of other traders.
19.5.1 Friedman Hypothesis Revisited
Behavioural finance investigates the impact of boundedly rational behaviour on financial markets, in the belief that this helps to understand some important financial phenomena and offers an explanation for the risk premium puzzle. A good survey of behavioural finance is given by (Barberis and Thaler 2003). One important aspect of this field is the noise trader approach, which investigates the impact of obviously wrong information on the market. The Friedman hypothesis states that irrational traders will realize lower returns than rational arbitrageurs and will disappear from the market due to evolutionary pressure. An example of such irrational traders are so-called noise traders, who erroneously believe that they have special information about an asset and trade on that wrong information (see (Kyle 1985; Black 1986)). In contrast to the Friedman hypothesis, noise traders do not necessarily perform worse than rational traders. This is shown by (Delong et al. 1990). In their evolutionary model rational arbitrageurs trade against noise traders. The authors show that a significant impact of noise traders may drive asset prices away from their fundamental values and thereby induce additional risk. Rational arbitrageurs take the increased risk into account and, since the opinion of noise traders is generally unpredictable, they leave the
market. In the selective model of (Delong et al. 1990) this enables noise traders to realize higher returns than rational traders: the higher the risk aversion of rational traders, the larger the difference in returns. In this scenario it is even possible that rational traders are completely driven out of the market by noise traders. It is noteworthy that noise traders can earn higher returns only because they are willing to bear more risk than the rational traders, whose risk aversion is expressed by their utility function. In terms of utility the rational traders always perform better than the noise traders. Thus, if selection is based on utility instead of returns, noise traders cannot survive in the long run.
19.5.2 Dominant Strategies
Not only wrong information can lead to decreasing returns for sophisticated traders; chart trading can also lower the overall expected return of the market. This is shown in (Joshi et al. 1998). The authors analyze the returns of traders in the SF-ASM model, depending on the use of chart information or combined chart and fundamental information. They identify a multi-person prisoner’s dilemma in the selection of strategies. Table 19.2 shows the final wealth of an agent for the four scenarios A to D. If all traders, including the agent under observation, abandon technical trading (situation D), the payoff is fairly high. This is the social optimum a regulator would desire. If only the agent under observation uses technical trading (B), he has an even higher payoff. If the number of traders is high, the change of this single agent’s strategy has a negligible impact on the average payoff of all other traders, i. e. it stays approximately at the level of situation D. Thus each individual trader has an incentive to include technical trading if all other traders are not doing so. In situation A all traders, including the agent under observation, include technical trading. This leads to a fairly low payoff level, which is clearly below the payoff in situation D. If all traders but the one under observation include technical trading (situation C), the payoff of this one is even below the level of A. Thus there is no incentive for individual traders to abandon technical trading if all other traders are using this method. For each individual trader it is a dominant strategy to include technical trading, but if all traders follow this dominant strategy the overall payoff is lowered and the market is locked in an inefficient equilibrium (A).
Table 19.2. Strategy selection leads to a prisoner’s dilemma (cf. (Joshi et al. 1998))
                                 All other traders
The agent under observation      include technical trading    exclude technical trading
includes technical trading       A: ~113                      B: ~154
excludes technical trading       C: ~97                       D: ~137
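The dominance argument can be verified mechanically from the four approximate payoffs in Table 19.2. The sketch below is only a check of the reasoning in the text; the dictionary layout and function names are assumptions made for illustration.

```python
# Approximate final wealth of the observed agent, keyed by (own strategy, strategy of all others).
PAYOFF = {
    ("include", "include"): 113,   # situation A
    ("include", "exclude"): 154,   # situation B
    ("exclude", "include"): 97,    # situation C
    ("exclude", "exclude"): 137,   # situation D
}
STRATEGIES = ("include", "exclude")

def best_response(others):
    """Strategy of the observed agent with the highest payoff, given what all others do."""
    return max(STRATEGIES, key=lambda own: PAYOFF[(own, others)])

def is_dominant(strategy):
    """A strategy is dominant if it is the best response to every behaviour of the others."""
    return all(best_response(others) == strategy for others in STRATEGIES)

print(is_dominant("include"))                                            # individual incentive
print(PAYOFF[("include", "include")] < PAYOFF[("exclude", "exclude")])   # collective outcome
```

Including technical trading is the best response in both columns (113 > 97 and 154 > 137), yet the resulting situation A yields less than situation D, which is exactly the structure of a prisoner's dilemma.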
According to (Joshi et al. 2002) a similar situation arises for the optimal frequency of revising the forecasting rules. As shown for the SF-ASM model in (LeBaron et al. 1999), slow learning leads to prices close to the REE, whereas fast learning leads to substantial deviations from the REE prices, i. e. the market becomes inefficient with fast learning. (Joshi et al. 2002) use this model to show that each individual trader has an incentive to increase the frequency of revising her rules. Again the individually optimal decision, i. e. fast learning, leads to a suboptimal outcome, because it increases the amount of technical trading and noise in the market. Thereby it becomes more difficult for all traders to predict future prices.
19.6 Multi-Asset Markets
A natural extension of the models with one risky asset is to allow traders to split their investment among several risky assets. But more assets make the mathematical determination of optimal allocations, i. e. portfolios, more complicated. Therefore the first multi-asset models restrict their traders and thereby avoid the problem of portfolio building. A first example is (Westerhoff 2004). In his model fundamental traders are restricted to investing in one particular asset. Thus the number of fundamentalists who are active in each market is fixed. Chartists are allowed to switch between different assets, but are restricted to investing in one risky asset at a time. Thus they do not build portfolios with several risky positions. Chartists choose the market with the strongest trend signal for their risky investment. This changes the proportions of chartists to fundamentalists in the markets and leads to complex dynamics, e. g. co-movement of stocks and the replication of stylized facts. Other examples of multi-asset markets are (Cincotti et al. 2005; Cincotti et al. 2006). The first one avoids the portfolio selection by using zero-intelligence traders, i. e. traders determine their portfolios randomly. In (Cincotti et al. 2006) traders are seen as nodes of a sparsely connected graph so that each agent is influenced by neighbouring agents. The authors point out that this is the first model that reproduces univariate and multivariate stylized facts at the same time. Very recently (Chiarella et al. 2007) presented a multi-asset model that is based on (Brock and Hommes 1998; Chiarella and He 2001; Chiarella and He 2003). The authors extend the framework with a single risky asset to n risky assets. They apply the standard one-period capital asset pricing model (CAPM), developed simultaneously by (Sharpe 1964; Lintner 1965; Mossin 1966), in order to determine the optimal demand for shares of the risky assets. In order to keep the analysis simple (Chiarella et al. 2007) excluded strategy switching in their model. Nevertheless it provides a general framework for multi-asset markets as (Brock and Hommes 1998) does for the classical case of one risky and one risk-free asset. This model might become the basis for numerous extensions and investigations of multi-asset market dynamics.
19 The Heterogeneous Agents Approach to Financial Markets
19.7 Extensions and Applications
The approach of modelling financial markets as evolving systems of heterogeneous interacting agents has been successful in explaining empirical phenomena which were not addressed by classical financial theory. Recently researchers have started to extend the established models of this field in order to investigate related questions in financial markets. One application of these models is the regulation of financial markets. An interesting instrument to reduce destabilizing speculation is the use of Tobin-like transaction taxes. The impact of such instruments on the market is investigated e. g. in (Westerhoff 2003; Mannaro et al. 2005; Westerhoff and Dieci 2006). Another interesting extension is the interaction of markets. E. g. (Sallans et al. 2003) analyse the interaction of a financial market with a consumer market. Basically this means replacing the stochastic dividend process by a microstructural model of the underlying company, i. e. its actions, decisions and, most importantly, its cash flows. Naturally such an extension comes at the cost of more complexity and many new parameters. (Sallans et al. 2003) apply the Markov chain Monte Carlo method in order to analyze the impact of the most important parameters systematically. This method not only shows the impact of parameters, it also enables modellers to fine-tune many parameters simultaneously in order to calibrate models to real-world time series. The new insights into financial markets and the traders therein are also used in the new field of Market Engineering, which analyzes and optimizes electronic market platforms with respect to their micro-, infra- and business structure. An introduction to this field is provided e. g. by (Roth 2002; Weinhardt et al. 2003) and (Holtmann et al. 2002).
19.8 Conclusion and Outlook
The survey presented above shows how the field of financial market models has evolved from very stylized models, addressing single issues, to very complex models that are able to replicate many stylized facts simultaneously. These models are very successful in explaining important puzzles in the dynamics of financial markets, and thereby reduce the gap between empirical findings in real markets and the corresponding theoretical models. Although recent models are very complex and are able to explain certain aspects of market dynamics, they are by no means realistic. Recall that most models are limited to one risky and one risk-free asset. One of the most important tasks for future research is to extend these models, e. g. to multi-asset markets where traders can build realistic portfolios. Another important step is to connect the financial market with the activities of the underlying economy, e. g. individual companies or industrial sectors. This might shed some light on the interaction and dependencies of financial markets and thereby lead to a better understanding of financial markets as a mechanism which is embedded in an economy. The existing stylized frameworks have to be filled with proper microeconomic details in order to derive more realistic general assumptions explaining the behaviour of traders and markets. Furthermore, the results from financial market models have to find their way into financial theory. These models may help traders and regulators to understand the dynamics of financial markets better than today, and to increase the benefits from these important economic instruments.
References

Allen H, Taylor MP (1990) Charts, noise and fundamentals in the London foreign exchange market. Economic Journal 100(400):49–59
Anderson PW, Arrow KJ, Pines D (1988) The Economy as an evolving Complex System. Addison Wesley Longman Publishing Co
Arthur WB, Holland J, LeBaron B, Taylor P (1997) Asset pricing under endogenous expectations in an artificial stock market. In: Arthur WB, Durlauf S, Lane D (eds) The economy as an evolving complex system II. Addison-Wesley, Reading, MA, pp. 15–44
Aspnes J, Fischer D, Fischer M, Kao M, Kumar A (2001) Towards understanding the predictability of stock markets from the perspective of computational complexity. In: Proc. of 12th annual ACM SIAM Symposium of Discrete Algorithms, pp. 745–754
Barberis N, Thaler R (2003) A survey of behavioural finance. In: Constantinides GM, Harris M, Stulz R (eds) Handbook of the Economics of Finance. Elsevier
Beja A, Goldman MB (1980) On the dynamic behaviour of prices in disequilibrium. Journal of Finance 35:235–248
Black F (1986) Noise. Journal of Finance 41:529–543
Brock WA, Lakonishok J, LeBaron B (1992) Simple technical trading rules and the stochastic properties of stock returns. Journal of Finance 47:1731–1764
Brock WA, Hommes C (1997) A rational route to randomness. Econometrica 65:1059–1095
Brock WA, Hommes C (1998) Heterogeneous beliefs and routes to chaos in a simple asset pricing model. Journal of Economic Dynamics and Control 22:1235–1274
Campbell JY, Shiller R (1988) The dividend-price ratio and expectations of future dividends and discount factors. Review of Financial Studies 1:195–227
Chang SK (2007) A simple asset pricing model with social interactions and heterogeneous beliefs. Journal of Economic Dynamics and Control 31:1300–1325
Chiarella C (1992) The dynamics of speculative behaviour. Annals of Operations Research 37:101–123
Chiarella C, Dieci R, He X (2007) Heterogeneous expectation and speculative behaviour in a dynamic multi-asset framework. Journal of Economic Behaviour and Organization 62(3):408–427
Chiarella C, He X (2007) An adaptive model on asset pricing and wealth dynamics in a market with heterogeneous trading strategies. In this Handbook
Chiarella C, He X (2001) Asset pricing and wealth dynamics under heterogeneous expectations. Quantitative Finance 1:509–526
Chiarella C, He X (2002) Heterogeneous beliefs, risk and learning in a simple asset pricing model. Computational Economics 19:95–132
Chiarella C, He X (2003) Heterogeneous beliefs, risk and learning in a simple asset pricing model with a market maker. Macroeconomic Dynamics 7:503–536
Cincotti S, Ponta L, Pastore S (2006) Information-based multi-asset artificial stock market with heterogeneous agents. 1st International Conference on Economic Sciences with Heterogeneous Interacting Agents (WEHIA 2006), University of Bologna, Italy
Cincotti S, Ponta L, Raberto M (2005) A multi-assets artificial stock market with zero-intelligence traders. 10th Annual Workshop on Economic Heterogeneous Interacting Agents (WEHIA 2005), University of Essex, UK
Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance 1:223–236
Cutler DM, Poterba JM, Summers LH (1989) What moves stock prices? Journal of Portfolio Management 15:4–12
Dacorogna MM, Müller UA, Jost C, Pictet OV, Olsen RB, Ward JR (1995) Heterogeneous real time trading strategies in the foreign exchange market. European Journal of Finance 1:383–403
Daniel G, Muchnik L, Solomon S (2005) Traders imprint themselves by adaptively updating their own avatar. In: Mathieu P, Beaufils B, Brandouy O (eds) Artificial Economics – Agent-Based Methods in Finance, Game Theory and Their Applications. Lecture Notes in Economics and Mathematical Systems, Vol. 564, Springer Berlin Heidelberg, pp. 27–38
Day RH, Huang WH (1990) Bulls, bears, and market sheep. Journal of Economic Behaviour and Organization 14:299–330
De Long JB, Shleifer A, Summers LH, Waldmann RJ (1990) Noise trader risk in financial markets. Journal of Political Economy 98:703–738
Fama EF (1965) The behaviour of stock market prices. Journal of Business 38:34–105
Farmer JD (2002) Market force, ecology and evolution. Industrial and Corporate Change 11:895–953
Farmer JD, Joshi S (2002) The price dynamics of common trading strategies. Journal of Economic Behaviour and Organization 49:149–171
Frankel JA, Froot KA (1986) Understanding the US dollar in the eighties: The expectations of chartists and fundamentalists. Economic Record, Special issue: pp. 24–38
Frankel JA, Froot KA (1987a) Short-term and long-term expectations of the yen/dollar exchange rate: evidence from survey data. Journal of the Japanese and International Economies 1:249–274
Frankel JA, Froot KA (1987b) Using survey data to test standard propositions regarding exchange rate expectations. American Economic Review 77:133–153
Frankel JA, Froot KA (1990a) Chartists, fundamentalists and the demand for dollars. In: Courakis AS, Taylor MP (eds) Private behaviour and government policy in interdependent economies. Oxford University Press, New York, pp. 73–126
Frankel JA, Froot KA (1990b) The rationality of the foreign exchange rate: Chartists, fundamentalists and trading in the foreign exchange market. American Economic Review 80(2):181–185
Friedman M (1953) The case for flexible exchange rates. In: Essays in Positive Economics. University of Chicago Press
Grossman S, Stiglitz J (1980) On the impossibility of informationally efficient markets. American Economic Review 70:393–408
Hakansson NH, Beja A, Kale J (1985) On the feasibility of automated market making by a programmed specialist. Journal of Finance 40(1):1–20
Haken H (1993) Synergetics – An Introduction. Springer Series in Synergetics, Berlin: Springer, 3rd edn
Hansen L, Singleton K (1983) Stochastic consumption, risk aversion, and the temporal behaviour of asset returns. Journal of Political Economy 91:249–265
Holtmann C, Neumann D, Weinhardt C (2002) Market engineering – towards an interdisciplinary approach. Technical report, Department of Economics and Business Engineering, Chair for Information Management and Systems, Universität Karlsruhe (TH)
Ito K (1990) Foreign exchange rate expectations. American Economic Review 80:434–449
Joshi S, Parker J, Bedau M (1998) Technical trading creates a prisoner’s dilemma: Results from an agent-based model. Research in Economics, Number 98-12-115e, Santa Fe Institute
Joshi S, Parker J, Bedau M (2002) Financial markets can be at sub-optimal equilibria. Computational Economics 19:5–23
Keynes JM (1936) The general theory of employment, interest and money. Harcourt, Brace and World, New York
Kirman A (1991) Epidemics of opinion and speculative bubbles in financial markets. In: Taylor M (ed) Money and Financial Markets, Macmillan
Kirman A (1993) Ants, rationality and recruitment. Quarterly Journal of Economics 108:137–156
Kyle AS (1985) Continuous auctions and insider trading. Econometrica 47:1315–1336
LeBaron B (2000) Agent-based computational finance: Suggested readings and early research. Journal of Economic Dynamics and Control 24:679–702
LeBaron B (2002) Building the Santa Fe Artificial Stock Market, Working paper, Santa Fe Institute
LeBaron B (2006) Computational finance. In: Tesfatsion L, Judd KL (eds) Handbook of Computational Economics, vol 2. Elsevier
LeBaron B, Arthur WB, Palmer R (1999) Time series properties of an artificial stock market. Journal of Economic Dynamics and Control 23:1487–1516
Lintner J (1965) The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. Review of Economics and Statistics 47:13–37
Lo AW, MacKinlay AC (1988) Stock prices do not follow random walks: Evidence from a simple specification test. Review of Financial Studies 1:41–66
Lucas RE (1971) Econometric testing of the natural rate hypothesis. In: Eckstein O (ed) The Econometrics of Price Determination Conference. Board of Governors of the Federal Reserve System and Social Science Research Council
Lux T (1995) Herd behaviour, bubbles and crashes. The Economic Journal 105(431):881–896
Lux T (1997) Time variation of second moments from a noise trader/infection model. Journal of Economic Dynamics and Control 22:1–38
Lux T (1998) The socio-economic dynamics of speculative markets: interacting agents, chaos, and the fat tails of return distribution. Journal of Economic Behaviour and Organization 33:143–165
Mandelbrot BB (1963) The variation of certain speculative prices. Journal of Business 36:394–419
Mannaro K, Marchesi M, Setzu A (2005) Using an artificial financial market for assessing the impact of Tobin-like transaction taxes. Technical report, DIEE, University of Cagliari
Mehra R, Prescott EC (1988) The equity risk premium: A solution? Journal of Monetary Economics 22:133–136
Milgrom PR, Stokey N (1982) Information, trade and common knowledge. Journal of Economic Theory 26:17–27
Mossin J (1966) Equilibrium in a capital asset market. Econometrica 34:768–783
Muth JF (1961) Rational expectations and the theory of price movements. Econometrica 29:315–335
Roth AE (2002) The economist as an engineer: Game theory, experimental economics and computation as tools of design economics. Econometrica 70(4):1341–1378
Sallans B, Pfister A, Karatzoglou A, Dorfner G (2003) Simulation and validation of an integrated markets model. Journal of Artificial Societies and Social Simulation, 6(4)
Sargent TJ (1993) Bounded Rationality in Macroeconomics. Clarendon Press, Oxford
Sharpe WF (1964) Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance 19:425–442
Shiller RJ (2003) From efficient market theory to behavioural finance. Journal of Economic Perspectives 17:83–104
Simon HA (1957) Models of Man. Wiley, New York
Taylor MP, Allen H (1992) The use of technical analysis in the foreign exchange market. Journal of International Money and Finance 11:304–314
Weinhardt C, Holtmann C, Neumann D (2003) Market Engineering. Wirtschaftsinformatik 45(6):635–640
Westerhoff F (2003) Heterogeneous traders and the Tobin tax. Journal of Evolutionary Economics 13:53–70
Westerhoff F (2004) Multi-asset market dynamics. Macroeconomic Dynamics 8:596–616
Westerhoff F, Dieci R (2006) The effectiveness of Keynes-Tobin transaction taxes when heterogeneous agents can trade in different markets: A behavioural finance approach. Journal of Economic Dynamics and Control 30:193–322
CHAPTER 20 An Adaptive Model of Asset Price and Wealth Dynamics in a Market with Heterogeneous Trading Strategies Carl Chiarella, Xue-Zhong He
20.1 Introduction

The traditional asset-pricing models – such as the capital asset pricing model (CAPM) of [42] and [34], the arbitrage pricing theory (APT) of [40], or the intertemporal capital asset pricing model (ICAPM) of [38] – have investor homogeneity as one of their important assumptions. In particular the paradigm of the representative agent assumes that all agents are homogeneous with regard to their preferences, their expectations and their investment strategies.¹ However, as already argued by Keynes in the 1930s, agents do not have sufficient knowledge of the structure of the economy to form correct mathematical expectations that would be held by all agents. The other important paradigm underpinning these models, the efficient market hypothesis (EMH), maintains that the current price contains all available information and past prices cannot help in predicting future prices. However, there is evidence that markets are not always efficient and that there are periods when real data show significantly higher autocorrelation of returns than would be expected under the EMH. Over the last decade, a large volume of empirical work (e. g., [8, 25, 26, 3, 41, 2, 17, 39, and 29]) has documented a variety of ways in which asset returns can be predicted based on publicly available information, and many of the results can be thought of as belonging to one of two broad categories of phenomena.² On the one hand, returns appear to exhibit continuation, or momentum, over short to medium time intervals, which may imply the profitability of momentum trading strategies over short to medium time intervals. On the other hand, there is also a tendency toward reversals over long time intervals, leading to possible profitability of contrarian strategies.

¹ Financial support from the Australian Research Council under Discovery Grant DP0450526 is gratefully acknowledged. An early version of this paper was presented at the 7th Asia Pacific Finance Association Annual Conference, Shanghai, July 24–26, 2000, the 2nd CeNDEF Workshop on Economic Dynamics, University of Amsterdam, 2001, the 19th Economic Theory Workshop, University of Technology, Sydney, 2001 and seminars at the University of Melbourne, the University of New South Wales, and the University of Bielefeld. In particular, we would like to thank Volker Bohm, Cars Hommes, Thomas Lux, Willi Semmler and Jan Wenzelburger for stimulating comments and discussion. The authors are responsible for any remaining errors in this paper.

The traditional models of finance theory seem to have difficulty in explaining this growing set of stylized facts. As a result, there is a growing dissatisfaction with (i) models of asset price dynamics based on the representative agent paradigm, as expressed for example by [27], and (ii) the extreme informational assumptions of rational expectations. In order to extend the traditional models of finance theory so as to accommodate some of the aforementioned stylized facts, a literature has developed over the last decade that involves some departure from the classical assumptions of strict rationality and unlimited computational capacity, and introduces heterogeneity and bounded rationality of agents. This strand of literature seeks to explain the existing evidence on many of the anomalies observed in financial markets as the result of the dynamic interaction of heterogeneous agents. In financial markets, individuals are imperfectly rational. They seek to learn about the market from their trading outcomes; as a result, the market may fluctuate around the fully rational equilibrium. A number of recent models use this approach to characterize the interactions of heterogeneous agents in financial markets (e. g. [21, 16, 9, 35, 5, 6, 7, 12, 13, 18, 19, 20, 36, 28, and 23]). To avoid the constraints of analytical tractability, many of these authors use computer simulations to explore a wider range of economic settings.
A general finding in many of these studies is that long-horizon agents frequently do not drive short-horizon agents out of financial markets, and that populations of long- and short-horizon agents can create patterns of volatility and volume similar to actual empirical patterns. [5], [6] propose to model economic and financial markets as an adaptive belief system (ABS), which is essentially an evolutionary competition among trading strategies. A key aspect of these models is that they exhibit expectations feedback and adaptiveness of agents. Agents adapt their beliefs over time by choosing from different predictors or expectations functions, based upon their past performance as measured by realized profits. Agents have the standard constant absolute risk aversion (CARA) utility function of the CAPM world but are boundedly rational, in the sense that they do not know the distribution of future returns. The evolutionary model generates endogenous price fluctuations with similar statistical properties to those observed in financial markets. The model of Brock and Hommes has been extended in [12] by allowing agents to have different risk attitudes and different expectation formation schemes for both first and second moments of the price distribution.
² A detailed discussion of, and references to, the related empirical work are provided in Section 20.2.
Because of the underlying CARA utility function, investors’ optimal decisions depend only on the asset price and do not directly involve their wealth. The resulting separation of asset price and wealth dynamics greatly simplifies the analysis of the model. However, a consequence of this separation is that the model cannot generate the type of growing price process that is observed in the market. [32] and [31] consider a more realistic model where investors’ optimal decisions depend on their wealth (as a result of an underlying constant relative risk aversion (CRRA) utility function) and both price and wealth processes are intertwined and thus growing. Using numerical simulations and comparing the stock price dynamics in models with homogeneous and heterogeneous expectations, they conclude that the homogeneous expectations assumption leads to a highly inefficient market with periodic (and therefore predictable) booms and crashes while the introduction of heterogeneous expectations leads to much more realistic dynamics and more efficient markets. [11] develop a theoretical model of interaction of portfolio decisions and wealth dynamics with heterogeneous agents having CRRA utility function. A growth equilibrium model of both the asset price and wealth is obtained. To characterize the interaction of heterogeneous agents in financial markets and conduct a theoretical analysis, stationary models in terms of return and wealth proportions (among different types of agents) are then developed. As a special case of the general heterogeneous model, these authors consider models of homogeneous agents and of two heterogeneous agents without switching of strategies. It is found that, in these cases, the heterogeneous model can have multiple steady states and the convergence to the steady states follows an optimal selection principle – the return and wealth proportions tend to the steady state which has relatively higher return. 
The model developed displays the volatility clustering of the returns and the essential characteristics of the standard asset price dynamics model of continuous time finance in that the asset price fluctuates around a geometrically growing trend.

The aim of the current chapter is twofold. The first is to establish an adaptive model of asset price and wealth dynamics in an economy of heterogeneous agents that extends the model in [11] to allow agents to switch amongst different types of trading strategies. The second is to characterize the profitability of the two most popular trading strategies in real markets – momentum and contrarian trading strategies. According to [5, 6] and the references cited therein, a financial market is an interaction of heterogeneous agents who adapt their beliefs from time to time. In our model, based on certain fitness measures, such as realized wealth, the agents are allowed to switch from one strategy to another from time to time. Consequently a model with adaptive beliefs is established where evolutionary dynamics across predictor choice is coupled with the dynamics of the endogenous variables. Empirical studies provide some evidence that momentum trading (or trend following) strategies are more profitable over short time intervals, while contrarian trading strategies are more profitable over long time intervals (see for instance [8, 25, 26, 3, 2, 17, 41, 39, 29, and 30]). To characterize the profitability of momentum and contrarian trading strategies, a quasi-homogeneous model is introduced, in which agents use exactly the same trading strategies except for different time horizons. Our results in general support the empirical findings on the profitability of momentum and contrarian trading strategies. In addition, the model also exhibits various anomalies observed in financial markets, including overconfidence and underreaction, overreaction and herd behavior, excess volatility, and volatility clustering.

This chapter is organized as follows. Section 20.2 establishes an adaptive model of asset price and wealth dynamics with heterogeneous beliefs amongst agents. It is shown how the distributions of the wealth and population across heterogeneous agents are measured. As a simple case, a model of two types of agents is then considered in Section 20.3. To characterize the profitability of momentum and contrarian trading strategies, a quasi-homogeneous model is also introduced as a special case of the model of two types of agents in Section 20.3. The profitability of momentum and contrarian trading strategies is then analyzed in Sections 20.4 and 20.5, respectively. Section 20.6 concludes.
20.2 Adaptive Model with Heterogeneous Agents

This section is devoted to establishing an adaptive model of asset price and wealth dynamics with heterogeneous beliefs amongst agents. The model can be treated as a generalization and extension of some asset pricing models involving the interaction between heterogeneous agents, for example, [31, 4, 6, 15, 24 and 11]. The key characteristics of this modelling framework are the adaptiveness, the heterogeneity and the interaction of the economic agents. The heterogeneity is expressed in terms of different views on expectations of the distribution of future returns on the risky asset. The modelling framework of this paper extends that of the earlier cited works by focusing on the interaction of both asset price and wealth dynamics. The framework of the adaptive model developed here is similar to the one in [31] and [11]. Our hypothetical financial market contains two investment choices: a stock (or index of stocks) and a bond. The bond is assumed to be a risk-free asset and the stock is a risky asset. The model is developed in the discrete time setting of standard portfolio theory in that agents are allowed to revise their portfolios over each time interval, the new element being the heterogeneity and adaptiveness of agents and the way in which they form expectations on the return distributions. The use of CARA utility functions has been standard in much of asset pricing theory. It has the characteristic of leading to demands that do not depend on the agents’ wealth, but this dependence turns out to be quite crucial in developing a model exhibiting a growing price trend. A CRRA utility function is sufficient to capture the interdependence of price and wealth dynamics.

The selection of logarithmic utility in the model developed here is based on a number of experimental and empirical studies, summarized in [33], which conclude that “it is reasonable to assume decreasing absolute risk aversion (DARA) and constant relative risk aversion (CRRA)” (p. 65). They show that the only utility functions with the DARA and CRRA properties are the power utility functions, of which the logarithmic utility function is a special case.
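The DARA and CRRA properties of logarithmic utility can be checked directly. The short derivation below uses the standard Arrow–Pratt measures of absolute and relative risk aversion, $A(W) = -U''(W)/U'(W)$ and $\mathcal{R}(W) = W\,A(W)$; these definitions and symbols are not spelled out in the chapter and are introduced here only for illustration.

```latex
\begin{align*}
U(W) = \log W:\quad & U'(W) = \frac{1}{W}, \qquad U''(W) = -\frac{1}{W^{2}}, \\
& A(W) = -\frac{U''(W)}{U'(W)} = \frac{1}{W} \quad \text{(decreasing in } W\text{)}, \\
& \mathcal{R}(W) = W\,A(W) = 1 \quad \text{(constant)}.
\end{align*}
```

For comparison, an exponential utility $U(W) = -e^{-aW}$ with $a > 0$ gives $A(W) = a$, which does not depend on wealth; this is the CARA case in which demands are independent of the agents' wealth.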
For the standard portfolio optimization problem, a model in terms of price and wealth is first established in this section. However, it turns out that the resulting model is non-stationary in that both the price and wealth are growing processes. So, in order to reduce the growth model to a stationary model, the return on the risky asset and the wealth proportions (among heterogeneous investors), instead of price and wealth, are used as state variables. Based on certain performance (or fitness) measures, an adaptive mechanism is then introduced, leading to the general adaptive model. The final model includes the dynamics of both the asset price and wealth and it characterizes three important and related issues in the study of financial markets: heterogeneity, adaptiveness, and interaction of agents.
20.2.1 Notation
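Denote
$p_t$: the price per share of the risky asset at time $t$;
$y_t$: the dividend per share of the risky asset at time $t$;
$R$: the gross return on the risk-free asset, with risk-free rate $r = R - 1$;
$N$: the total number of shares of the risky asset;
$H$: the total number of agents (investors);
$N_{i,t}$: the number of shares of the risky asset held by agent $i$ at time $t$;
$W_{i,t}$: the wealth of agent $i$ at time $t$;
$\pi_{i,t}$: the proportion of the wealth of agent $i$ invested in the risky asset at time $t$.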
It is assumed that³ all the agents have the same attitude to risk with the same utility function $U(W) = \log(W)$. Following the above notation, the return on the risky asset at period $t$ is then defined by⁴

$$\rho_t = \frac{p_t - p_{t-1} + y_t}{p_{t-1}}. \tag{20.1}$$

³ To make the following analysis more tractable and transparent, the assumption that all agents have the same utility function $U(W) = \log(W)$ is maintained in this paper. However, the analysis can be generalized to the case of utility functions that allow agents to have different risk coefficients, say, $U_i(W) = (W^{\gamma_i} - 1)/\gamma_i$ with $0 < \gamma_i < 1$. As shown by [12], the dynamics generated by the difference in risk aversion coefficients is an interesting and important issue that for the present model is left for future work.

⁴ The return can also be defined by the difference of logarithms of the prices. It is known that the difference between these two definitions becomes smaller as the time interval is reduced (say, from monthly to weekly or daily).
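The remark that the simple return of (20.1) and the log-difference return move closer together for shorter intervals can be illustrated numerically. This is a minimal sketch: the function names are hypothetical, smaller price moves merely stand in for shorter time intervals, and adding the dividend to the current price in the log version is an assumption of the sketch rather than something stated in the text.

```python
import math


def simple_return(p_prev, p_now, dividend=0.0):
    """Return on the risky asset as in (20.1): (p_t - p_{t-1} + y_t) / p_{t-1}."""
    return (p_now - p_prev + dividend) / p_prev


def log_return(p_prev, p_now, dividend=0.0):
    """Log-difference alternative; including the dividend here is an assumption."""
    return math.log((p_now + dividend) / p_prev)


# Hypothetical price moves of decreasing size, standing in for shorter intervals.
for p_now in (110.0, 102.0, 100.5):
    gap = simple_return(100.0, p_now) - log_return(100.0, p_now)
    print(f"p_t = {p_now:6.1f}  simple - log = {gap:.6f}")
```

The printed gap is positive and shrinks as the price move gets smaller, in line with the remark in the footnote.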
20.2.2 Portfolio Optimization Problem of Heterogeneous Agents
The framework is that of the standard one-period portfolio optimization problem. The wealth of agent (or investor) $i$ at time period $t+1$ is given by

$$W_{i,t+1} = (1 - \pi_{i,t}) W_{i,t} R + \pi_{i,t} W_{i,t} (1 + \rho_{t+1}) = W_{i,t}[R + \pi_{i,t}(\rho_{t+1} - r)]. \tag{20.2}$$

As in [6] and [31], a Walrasian scenario is used to derive the demand equation, that is, each trader is viewed as a price taker and the market is viewed as finding (via the Walrasian auctioneer) the price $p_t$ that equates the sum of these demand schedules to the supply. That is, the agents treat the period $t$ price, $p_t$, as parametric when solving their optimisation problem to determine $\pi_{i,t}$. Denote by $F_t = \{p_{t-1}, p_{t-2}, \dots; y_t, y_{t-1}, y_{t-2}, \dots\}$ the information set⁵ formed at time $t$. Let $E_t$, $V_t$ be the conditional expectation and variance, respectively, based on $F_t$, and $E_{i,t}$, $V_{i,t}$ be the “beliefs” of investor $i$ about the conditional expectation and variance. Then it follows from (20.2) that

$$E_{i,t}(W_{i,t+1}) = W_{i,t}[R + \pi_{i,t}(E_{i,t}(\rho_{t+1}) - r)], \qquad V_{i,t}(W_{i,t+1}) = W_{i,t}^{2}\, \pi_{i,t}^{2}\, V_{i,t}(\rho_{t+1}). \tag{20.3}$$
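The second equality in (20.2) and the two moments in (20.3) follow in a few lines. The worked step below assumes $R = 1 + r$, which is what the two sides of (20.2) imply, and treats $W_{i,t}$ and $\pi_{i,t}$ as known at time $t$; it is an illustrative reconstruction, not a quotation from the chapter.

```latex
\begin{align*}
W_{i,t+1} &= (1 - \pi_{i,t})\, W_{i,t}\, R + \pi_{i,t}\, W_{i,t}\, (1 + \rho_{t+1}) \\
          &= W_{i,t}\,\bigl[ R + \pi_{i,t}\,(1 + \rho_{t+1} - R) \bigr] \\
          &= W_{i,t}\,\bigl[ R + \pi_{i,t}\,(\rho_{t+1} - r) \bigr]
             \qquad \text{since } R = 1 + r, \\[4pt]
E_{i,t}(W_{i,t+1}) &= W_{i,t}\,\bigl[ R + \pi_{i,t}\,\bigl( E_{i,t}(\rho_{t+1}) - r \bigr) \bigr], \\
V_{i,t}(W_{i,t+1}) &= \bigl( W_{i,t}\, \pi_{i,t} \bigr)^{2}\, V_{i,t}(\rho_{t+1})
                    = W_{i,t}^{2}\, \pi_{i,t}^{2}\, V_{i,t}(\rho_{t+1}).
\end{align*}
```

Only $\rho_{t+1}$ is random in $W_{i,t+1}$, so the expectation passes through the affine expression and the variance picks up the squared coefficient $W_{i,t}\,\pi_{i,t}$.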
Consider investor $i$, who faces a given price $p_t$, has wealth $W_{i,t}$ and believes that the asset return is conditionally normally distributed with mean $E_{i,t}(\rho_{t+1})$ and variance $V_{i,t}(\rho_{t+1})$. This investor chooses a proportion $\pi_{i,t}$ of his/her wealth to be invested in the risky asset so as to maximize the expected utility of the wealth at $t+1$, as given by $\max_{\pi_{i,t}} E_{i,t}[U(W_{i,t+1})]$. It follows that⁶ the optimum investment proportion at time $t$, $\pi_{i,t}$, is given by

$$\pi_{i,t} = \frac{E_{i,t}(\rho_{t+1}) - r}{V_{i,t}(\rho_{t+1})}. \tag{20.4}$$
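The rule (20.4) and the wealth update (20.2) can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers are hypothetical, and the sketch takes $R = 1 + r$, which is what the second equality in (20.2) requires.

```python
def optimal_proportion(expected_return, variance, r):
    """Equation (20.4): share of wealth placed in the risky asset."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return (expected_return - r) / variance


def next_wealth(wealth, proportion, realized_return, r):
    """Equation (20.2), assuming the gross risk-free return is R = 1 + r."""
    gross_riskfree = 1.0 + r
    return wealth * (gross_riskfree + proportion * (realized_return - r))


# Illustrative numbers only (not taken from the chapter).
r = 0.04
pi = optimal_proportion(expected_return=0.10, variance=0.20, r=r)
w_next = next_wealth(wealth=100.0, proportion=pi, realized_return=0.12, r=r)
```

A belief of a 6% excess return with variance 0.20 puts about 30% of wealth in the risky asset; a larger believed variance lowers the proportion in inverse proportion.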
Heterogeneous beliefs are introduced via the assumption that

$$E_{i,t}(\rho_{t+1}) = f_i(\rho_{t-1}, \ldots, \rho_{t-L_i}), \qquad V_{i,t}(\rho_{t+1}) = g_i(\rho_{t-1}, \ldots, \rho_{t-L_i}) \tag{20.5}$$

for $i = 1, \ldots, H$, where the $L_i$ are integers and $f_i$, $g_i$ are deterministic functions which can differ across investors. Under this assumption, both $E_{i,t}(\rho_{t+1})$ and $V_{i,t}(\rho_{t+1})$ are functions of the past prices up to $t-1$, which in turn implies that the optimum wealth proportion $\pi_{i,t}$, defined by (20.4), is a function of the history of the prices $(p_{t-1}, p_{t-2}, \ldots)$.⁷

⁵ Because of the Walrasian scenario, the hypothetical price $p_t$ at time $t$ is included in the information set to determine the market clearing price. However, agents form their expectations by using the past prices up to time $t-1$.
⁶ See Appendix A.1 in [11] for details.
20.2.3 Market Clearing Equilibrium Price – A Growth Model
The optimum proportion of wealth invested in the risky asset, $\pi_{i,t}$, determines the number of shares at price $p_t$ that investor $i$ wishes to hold: $N_{i,t} = \pi_{i,t} W_{i,t}/p_t$. Summing the demands of all agents gives the aggregate demand. The total number of shares in the market, denoted by $N$, is assumed to be fixed, and hence the market clearing equilibrium price $p_t$ is determined by $\sum_{i=1}^{H} N_{i,t} = N$, i.e.,

$$\sum_{i=1}^{H} \pi_{i,t} W_{i,t} = N p_t. \tag{20.6}$$

Thus, equations (20.2) and (20.6) show that, in this model, as in real markets, the equilibrium price $p_t$ and the wealth of investors, $(W_{1,t}, \ldots, W_{H,t})$, are determined simultaneously. The optimum demands of agents are functions of the price and their wealth distribution. Also, as observed in financial markets, the model implies that both the price and the wealth are growing processes in general.
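The market clearing condition (20.6) can be illustrated with a small Python sketch. It treats the wealth levels as given, whereas in the model price and wealth are determined simultaneously, so this is a simplification for illustration; the function names and numbers are hypothetical.

```python
def clearing_price(proportions, wealths, total_shares):
    """Price p_t solving sum_i pi_i * W_i = N * p_t, equation (20.6)."""
    if total_shares <= 0:
        raise ValueError("total_shares must be positive")
    invested = sum(p * w for p, w in zip(proportions, wealths))
    return invested / total_shares


def share_demands(proportions, wealths, price):
    """Number of shares N_i = pi_i * W_i / p_t each investor wishes to hold."""
    return [p * w / price for p, w in zip(proportions, wealths)]


# Two investors, ten shares outstanding (illustrative numbers).
proportions = [0.5, 0.25]
wealths = [100.0, 200.0]
price = clearing_price(proportions, wealths, total_shares=10.0)
demands = share_demands(proportions, wealths, price)
```

At the clearing price the individual demands add up to the fixed supply $N$, which is the first form of the condition stated above.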
⁷ In [31], the hypothetical price $p_t$ is included in the above conditional expectations on the return and variance. In this case, the market clearing price is solved implicitly and is much more involved mathematically. The approach adopted here is the standard one in deriving the price via the Walrasian scenario and also keeps the mathematical analysis tractable. A similar approach has been adopted in [6] and [12]. Of course other market clearing mechanisms are possible, e.g., a market-maker. It turns out that the type of market clearing mechanism used does affect the dynamics; on this point see [13].

20.2.4 Population Distribution Measure
Now suppose all the agents can be grouped in terms of their conditional expectations of the mean and variance of returns of the risky asset. That is, within a group, all the agents follow the same expectation schemes on the conditional mean and variance of the return $\rho_{t+1}$, and hence the optimum wealth proportions $\pi_{i,t}$ invested in the risky asset are the same for all agents in the group. Assume all the agents can be grouped into $h$ types (or groups) and that group $j$ has $\ell_{j,t}$ agents at time $t$, with $j = 1, \ldots, h$; then $\ell_{1,t} + \cdots + \ell_{h,t} = H$. Denote by $n_{j,t}$ the proportion of the number of agents in group $j$, at time $t$, relative to the total number of investors, $H$; that is, $n_{j,t} = \ell_{j,t}/H$, so that $n_{1,t} + \cdots + n_{h,t} = 1$.

Some simple examples on return and wealth dynamics when the proportions $n_{j,t}$ of different types of agents are fixed over time are given in [11]. However, this is a highly simplifying assumption and it would be more realistic to allow agents to adjust their beliefs from time to time, based on some performance or fitness measures (say, for example, the realized returns or errors, as in [6]). In this way, one can account for investor psychology and herd behavior.⁸ As a consequence, the proportions of different types of agents become endogenous state variables. The vector $(n_{1,t}, n_{2,t}, \ldots, n_{h,t})$ therefore measures the population distribution among different types of heterogeneous agents. The change in the distribution over time can be used to measure herd behavior among heterogeneous agents, in particular during highly volatile periods in financial markets.
20.2.5 Heterogeneous Representative Agents and Wealth Distribution Measure
Because agents' behavior is adaptive, agents may switch among different groups from time to time. Tracking the wealth evolution of each individual agent is certainly an interesting and important issue, but it is a rather difficult problem within the current framework. However, by introducing heterogeneous representative agents (HRAs), the model established here is capable of characterizing their performance in terms of wealth distribution over time. By HRAs, we mean that such agents can become a fraction so that the sum of the fractions of all those agents is equal to 1. A more precise construction of such HRAs is given as follows.

Assume that all agents are grouped into $h$ types (according to their beliefs) and that group $j$ has $\ell_{j,t}$ ($j = 1, 2, \ldots, h$) agents at time $t$. Here $h$ is assumed to be fixed, while $\ell_{j,t}$ can vary from period to period. At time $t$, let $\bar W_{j,t}$ be the average wealth of agents within group $j$, so that $\ell_{j,t} \bar W_{j,t}$ gives the total wealth of group $j$. Denote by $\bar w_{j,t}$ the average wealth proportion of group $j$ relative to the total average wealth $\bar W_t$ at time $t$, that is,

$$\bar w_{j,t} = \frac{\bar W_{j,t}}{\bar W_t}, \qquad \text{with} \quad \bar W_t = \sum_{j=1}^{h} \bar W_{j,t}. \tag{20.7}$$

Then the vector $(\bar w_{1,t}, \bar w_{2,t}, \ldots, \bar w_{h,t})$ corresponds to the wealth proportion distribution among HRAs of different types; it measures the average wealth levels associated with different trading strategies.
⁸ See more discussion on this aspect in the next section.

20.2.6 Performance Measure, Population Evolution and Adaptiveness
Following [5], [6], a performance measure or fitness function, denoted $(\Phi_{1,t}, \ldots, \Phi_{h,t})$, is publicly available to all agents. Based on the performance measure, agents make a (boundedly) rational choice among the predictors. This results in the Adaptive Rational Equilibrium Dynamics, introduced by [5], an evolutionary dynamics across predictor choice which is coupled to the dynamics of the endogenous variables. In the limit as the number of agents goes to infinity, the probability that an agent chooses trading strategy $j$ is given by the well-known discrete choice model or ‘Gibbs’ probabilities⁹

$$n_{j,t} = \exp[\beta(\Phi_{j,t-1} - C_j)]/Z_t, \qquad Z_t = \sum_{j=1}^{h} \exp[\beta(\Phi_{j,t-1} - C_j)], \tag{20.8}$$
where $C_j \ge 0$ measures the cost of strategy $j$ for $j = 1, 2, \ldots, h$. The crucial feature of (20.8) is that the higher the fitness of trading strategy $j$, the more traders will select that strategy. The parameter $\beta$, called the intensity of choice or switching intensity, plays an important role and can be used to characterize various psychological effects, such as overconfidence and underreaction, overreaction and herd behavior, as discussed by [22].

On the one hand, when individuals are overconfident, they do not change their beliefs as much as a rational Bayesian would in the face of new evidence. This may result from either the high cost of processing new information or individuals' reluctance to admit to having made a mistake (so that the new evidence is under-weighted). Both overconfidence and underreaction can be partially captured by a small value of the switching intensity parameter $\beta$. In the extreme case when $\beta = 0$, there is no switching among strategies and the population of agents is evenly distributed across all trading strategies.¹⁰

On the other hand, if the environment is volatile, or agents are less confident about their beliefs, agents may be less reluctant to switch to beliefs which generate better outcomes. This effect can be captured by a high value of the switching intensity parameter $\beta$. An increase in the switching intensity $\beta$ represents an increase in the degree of rationality with respect to evolutionary selection of trading strategies. In the extreme case when $\beta$ is very large (close to infinity), a large proportion of traders are willing to switch more quickly to successful trading strategies. In such a situation, market overreaction and herd behavior may be observed.

A natural performance measure can be taken as a weighted average of the realized wealth return on the proportion invested in the risky asset among the $h$ HRAs, given by
$$\Phi_{j,t} = \phi_{j,t} + \gamma \Phi_{j,t-1}; \qquad \phi_{j,t} = \pi_{j,t-1} \frac{\bar W_{j,t} - \bar W_{j,t-1}}{\bar W_{j,t-1}} = \pi_{j,t-1} [r + (\rho_t - r)\pi_{j,t-1}]$$

for $j = 1, \ldots, h$, where $0 \le \gamma \le 1$ and $\phi_{j,t}$ is the realized wealth return invested in the risky asset in period $t$. Here $\gamma$ is a memory parameter measuring how strongly the past realized fitness is discounted for strategy selection, so that $\Phi_{j,t}$ may be interpreted as the accumulated discounted return on the proportion of wealth invested by group $j$ in the risky asset.

⁹ See [37] and [1] for extensive discussion of discrete choice models and their applications in economics.
¹⁰ See [11] for models with fixed, but not evenly distributed, population proportions among different types of trading strategies.
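The discrete choice probabilities (20.8) and the fitness recursion can be sketched as follows. The function names and numbers are hypothetical, and subtracting the largest score before exponentiating is a numerical safeguard added here, not part of the model; it leaves the probabilities unchanged.

```python
import math


def choice_probabilities(fitness, costs, beta):
    """Equation (20.8): n_j = exp(beta * (Phi_j - C_j)) / Z."""
    scores = [beta * (f - c) for f, c in zip(fitness, costs)]
    top = max(scores)  # shift by the maximum to avoid overflow
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def update_fitness(prev_fitness, proportion, realized_return, r, gamma):
    """Phi_t = phi_t + gamma * Phi_{t-1}, with phi_t = pi * [r + (rho - r) * pi]."""
    phi = proportion * (r + (realized_return - r) * proportion)
    return phi + gamma * prev_fitness


# beta = 0: no switching, agents spread evenly over the strategies.
even = choice_probabilities([0.2, 0.1], [0.0, 0.0], beta=0.0)
# Large beta: almost everyone moves to the fitter strategy.
sharp = choice_probabilities([0.2, 0.1], [0.0, 0.0], beta=50.0)
fitness_next = update_fitness(0.5, proportion=0.6, realized_return=0.10,
                              r=0.04, gamma=0.5)
```

The two calls reproduce the limiting cases described in the text: an even split for $\beta = 0$ and near-complete herding onto the better-performing strategy for large $\beta$.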
20.2.7 An Adaptive Model
The above growth model is rendered stationary by formulating it in terms of the risky asset return and the average wealth proportions among the investors, instead of the stock price $p_t$ and the wealth $W_t$. It should be made clear that the term “stationary” is not being used in the econometric sense of a stationary stochastic process. Rather, the term is used to refer to a stationary dynamical system that has fixed points (as opposed to growing trends) as steady state solutions. It may well be that the dynamical system to be analyzed below generates time series that are non-stationary in the econometric sense; this will depend on both the local stability/instability properties of the dynamical system and how noise is processed by the nonlinear system. When such a steady state exists, it follows from (20.1) that it generates a geometrically growing price process. The dynamical system describing the evolution of the average wealth proportions of HRAs and the risky asset return is given by the following proposition.

Proposition 20.1. For group $i$, formed at time period $t-1$, the average wealth proportions at the next time period $t$ evolve according to
$$\bar w_{i,t} = \frac{\bar w_{i,t-1} [R + (\rho_t - r)\pi_{i,t-1}]}{\sum_{j=1}^{h} \bar w_{j,t-1} [R + (\rho_t - r)\pi_{j,t-1}]} \qquad (i = 1, \ldots, h) \tag{20.9}$$

with return $\rho_t$ given by¹¹

$$\rho_t = r + \frac{\sum_{i=1}^{h} \bar w_{i,t-1} [(1 + r)(n_{i,t-1}\pi_{i,t-1} - n_{i,t}\pi_{i,t}) - \alpha_t n_{i,t-1}\pi_{i,t-1}]}{\sum_{i=1}^{h} \pi_{i,t-1} \bar w_{i,t-1} (n_{i,t}\pi_{i,t} - n_{i,t-1})}, \tag{20.10}$$

where $\alpha_t$ denotes the dividend yield defined by $\alpha_t = y_t/p_{t-1}$, and the population proportions $n_{i,t}$ evolve according to

$$n_{i,t} = \exp[\beta(\Phi_{i,t-1} - C_i)]/Z_t, \tag{20.11}$$

in which the fitness functions are defined by

$$\Phi_{i,t} = \phi_{i,t} + \gamma \Phi_{i,t-1}, \quad 0 \le \gamma \le 1; \qquad \phi_{i,t} = \pi_{i,t-1} \frac{\bar W_{i,t} - \bar W_{i,t-1}}{\bar W_{i,t-1}} = \pi_{i,t-1} [r + (\rho_t - r)\pi_{i,t-1}],$$

$$Z_t = \sum_{i=1}^{h} \exp[\beta(\Phi_{i,t-1} - C_i)],$$

and the constants $C_i \ge 0$ measure the cost of the strategy for $i = 1, 2, \ldots, h$.

Proof. See Appendix 20.7.1.

¹¹ It is easy to check that $\rho_t \equiv r$ is a trivial solution. As a necessary condition for investing in the risky asset, it is assumed that $E(\rho_t) > r$.
Equations (20.9) and (20.10) constitute a difference equation system for $\bar w_{j,t}$ and $\rho_t$, the order of which depends on the choice by agents of the $L_j$ in equation (20.5). It should be stressed that in the process of rendering the model stationary it has become necessary to reason in terms of the dividend yield ($\alpha_t$) rather than the dividend ($y_t$) directly. It is easy to see that, when $h \le H$, each group contains at least one agent ($\ell_j \ge 1$) and $\beta = 0$ for $j = 1, \ldots, h$, Proposition 20.1 leads to the model in [11] with fixed proportions $n_{j,t} = n_j$ ($j = 1, \ldots, h$) of heterogeneous agents.
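One step of the system (20.9) and (20.10) can be sketched in Python as below. The function and argument names are hypothetical, $R = 1 + r$ is assumed, and the proportions $\pi$ and $n$ are passed in as given numbers instead of being computed from past returns and fitness.

```python
def risky_return(w_prev, n_prev, n_now, pi_prev, pi_now, r, alpha):
    """Return rho_t from equation (20.10); inputs are per-group sequences."""
    groups = list(zip(w_prev, n_prev, n_now, pi_prev, pi_now))
    numerator = sum(
        w * ((1 + r) * (n0 * p0 - n1 * p1) - alpha * n0 * p0)
        for w, n0, n1, p0, p1 in groups
    )
    denominator = sum(p0 * w * (n1 * p1 - n0) for w, n0, n1, p0, p1 in groups)
    if denominator == 0:
        raise ZeroDivisionError("equation (20.10) is undefined for these inputs")
    return r + numerator / denominator


def wealth_proportions(w_prev, pi_prev, rho, r):
    """Average wealth proportions from equation (20.9), with R = 1 + r."""
    grown = [w * (1 + r + (rho - r) * p) for w, p in zip(w_prev, pi_prev)]
    total = sum(grown)
    return [g / total for g in grown]


# Two groups with identical, constant beliefs (illustrative numbers).
w = [0.5, 0.5]
n = [0.5, 0.5]
pi = [0.6, 0.6]
rho = risky_return(w, n, n, pi, pi, r=0.01, alpha=0.02)
w_next = wealth_proportions(w, pi, rho, r=0.01)
```

In this symmetric case the return reduces to $r + \alpha/(1 - \pi)$ and the wealth proportions stay where they started, which is a convenient sanity check on an implementation.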
20.2.8 Trading Strategies
The adaptive model established in Proposition 20.1 is incomplete until the conditional expectations of agents on the mean and variance of returns are specified. Different trading strategies can be incorporated into this general adaptive model, as indicated by equation (20.5). To illustrate various features of the model, only three simple, but well-documented, types of agents, termed fundamentalists, momentum traders and contrarians, are considered in this paper.¹² None of these types is fully rational in the sense used in the rational expectations literature. The information on the dividends and realized prices is publicly available to all agent types.
20.2.9 Fundamental Traders
The fundamentalists make forecasts of the risk premium level based on both public information and their private information about future fundamentals. It is assumed that

$$E_{F,t}(\rho_{t+1}) = r + \delta_F, \tag{20.12}$$

where $E_{F,t}$ denotes the fundamentalists' estimate of the conditional mean of $\rho_{t+1}$ for the next period $t+1$ and $\delta_F$ is their estimate of the risk premium.¹³ That is, the fundamentalists believe that the excess conditional expected return for the risky asset (above the risk-free rate) is given by the risk premium $\delta_F$ that they may have estimated from a detailed analysis of the risky asset (earnings reports, market prospects, political factors etc.).
20.2.10 Momentum Traders
Momentum traders, in contrast to the fundamental traders, do condition on the past prices. Momentum, or positive feedback, trading has several possible motivations, one being that agents form expectations of future prices by extrapolating trends. They buy into price trends and exaggerate them, leading to overshooting. As a result, excess volatility may appear.

¹² To simplify the analysis, we focus on the conditional mean estimation by assuming that the subjective estimate of the variance of all the agents is given by a constant.
¹³ A constant risk premium is a simplifying assumption. In practice, the risk premium may not necessarily be constant but could also be a function of, for example, the variance.
Empirical studies have given support to the view that momentum trading strategies yield significant profits over short time intervals (e.g. [3, 25, 26, 29, 39 and 41]). Although these results have been well accepted, the source of the profits and the interpretation of the evidence are widely debated. In addition, there does not exist in the literature a quantitative model to clarify and give theoretical support to such evidence. As a first step, this issue is discussed in the next section within the framework of the adaptive heterogeneous model outlined in Proposition 20.1. For momentum traders, it is assumed in this paper that their forecasts are “simple” functions of the history of past returns. More precisely, it is assumed that

$$E_{M,t}(\rho_{t+1}) = r + \delta_M + d_M \bar\rho_{M,t}, \qquad \bar\rho_{M,t} = \frac{1}{L_M} \sum_{k=1}^{L_M} \rho_{t-k}, \tag{20.13}$$

where $E_{M,t}(\rho_{t+1})$ denotes the momentum traders' estimate of the conditional mean of $\rho_{t+1}$ for the next period $t+1$, $\delta_M$ is their risk premium estimate and $d_M > 0$ corresponds to the extrapolation rate of the momentum trading strategy. The integer $L_M \ge 1$ corresponds to the memory length of momentum traders. Equation (20.13) states that the expected excess return (above the risk-free rate) of momentum traders has two components: their estimated risk premium $\delta_M$ and the trend extrapolation $d_M \bar\rho_{M,t}$, which is positively proportional to the moving average of the returns over the last $L_M$ time periods.
20.2.11 Contrarian Traders
The profitability of contrarian investment strategies is now a well-established empirical fact in the finance literature (see, for example, [30]). Empirical evidence suggests that over long time intervals, contrarian strategies generate significant abnormal returns (see, for example, [2, 17, and 8]). Some evidence suggests that, because of overreaction, aggregate stock market value measures such as the dividend yield can be used to predict future market returns, so that contrarian investment strategies are on average profitable. In spite of the apparent robustness of such strategies, the underlying rationale for their success remains a matter of lively debate in both academic and practitioner communities. In the following section, the role of expectational errors in explaining the profitability of contrarian strategies is examined. For contrarian traders, it is assumed that

$$E_{C,t}(\rho_{t+1}) = r + \delta_C - d_C \bar\rho_{C,t}, \qquad \bar\rho_{C,t} = \frac{1}{L_C} \sum_{k=1}^{L_C} \rho_{t-k}, \tag{20.14}$$

where $E_{C,t}(\rho_{t+1})$ denotes the contrarian agents' estimate of the conditional mean of $\rho_{t+1}$ for the next period $t+1$, $\delta_C$ is their risk premium estimate and $d_C > 0$ corresponds to their extrapolation rate. The integer $L_C \ge 1$ corresponds to the memory length of contrarian agents. Equation (20.14) states that contrarian traders believe that the difference between the excess conditional mean and the risk premium, $[E_{C,t}(\rho_{t+1}) - r] - \delta_C$, is negatively proportional to the moving average of the returns over the last $L_C$ time periods.

In addition to the different trading strategies used by different types of traders, there are various other ways to introduce agent heterogeneity, such as through different risk premia, extrapolation rates and memory lengths.
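The three forecasting rules (20.12), (20.13) and (20.14) can be sketched side by side. The function names and the sample return history are hypothetical; the history is assumed to be ordered from oldest to newest, so that its last element is $\rho_{t-1}$.

```python
def moving_average(returns, length):
    """Mean of the last `length` returns: rho_{t-1}, ..., rho_{t-length}."""
    if length < 1 or length > len(returns):
        raise ValueError("length must be between 1 and len(returns)")
    return sum(returns[-length:]) / length


def fundamentalist_forecast(r, delta_f):
    """Equation (20.12): risk-free rate plus estimated risk premium."""
    return r + delta_f


def momentum_forecast(returns, r, delta_m, d_m, length_m):
    """Equation (20.13): extrapolate the recent average return (d_m > 0)."""
    return r + delta_m + d_m * moving_average(returns, length_m)


def contrarian_forecast(returns, r, delta_c, d_c, length_c):
    """Equation (20.14): lean against the recent average return (d_c > 0)."""
    return r + delta_c - d_c * moving_average(returns, length_c)


# Illustrative return history, oldest first.
history = [0.01, 0.03, 0.02, 0.04]
f = fundamentalist_forecast(r=0.04, delta_f=0.02)
m = momentum_forecast(history, r=0.04, delta_m=0.02, d_m=0.5, length_m=2)
c = contrarian_forecast(history, r=0.04, delta_c=0.02, d_c=0.5, length_c=2)
```

With the same risk premium and a positive recent average return, the momentum forecast lies above the fundamentalist forecast and the contrarian forecast lies below it.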
20.3 An Adaptive Model of Two Types of Agents
In the remainder of this chapter, the focus is on a simple model of just two types of agents: momentum traders and contrarian traders. In this case, the adaptive model developed in Section 20.2 can be reduced to a simple form, as indicated below. To examine the profitability of momentum and contrarian trading strategies over different time intervals, a special case of the model, termed the quasi-homogeneous model, is then considered. Detailed discussion of the dynamics of such quasi-homogeneous models, including profitability, herd behavior, price overshooting and statistical patterns of returns, is then undertaken in the subsequent sections.
20.3.1 The Model for Two Types of Agents
Assume that there are only two different types of trading strategies. Let $\bar w_t$, $n_t$ be the differences of the average wealth proportions and population proportions of type 1 and type 2 agents; that is,

$$\bar w_t = \bar w_{1,t} - \bar w_{2,t}, \qquad n_t = n_{1,t} - n_{2,t}. \tag{20.15}$$

Then it follows from $\bar w_{1,t} + \bar w_{2,t} = 1$ and $n_{1,t} + n_{2,t} = 1$ that

$$\bar w_{1,t} = \frac{1 + \bar w_t}{2}, \quad \bar w_{2,t} = \frac{1 - \bar w_t}{2} \qquad \text{and} \qquad n_{1,t} = \frac{1 + n_t}{2}, \quad n_{2,t} = \frac{1 - n_t}{2}.$$
Correspondingly, the adaptive model in Proposition 20.1 can be reduced to a simple form given by Proposition 20.2.

Proposition 20.2. The difference of the average wealth proportions $\bar w_t$ evolves according to

$$\bar w_{t+1} = \frac{f_1 - f_2}{f_1 + f_2} \tag{20.16}$$

with return $\rho_{t+1}$ given by

$$\rho_{t+1} = r + \frac{g_{11} + g_{12}}{g_{21} + g_{22}}, \tag{20.17}$$
where

$$f_1 = (1 + \bar w_t)[1 + r + (\rho_{t+1} - r)\pi_{1,t}],$$
$$f_2 = (1 - \bar w_t)[1 + r + (\rho_{t+1} - r)\pi_{2,t}],$$
$$g_{11} = (1 + \bar w_t)[(1 + r - \alpha_{t+1})(1 + n_t)\pi_{1,t} - (1 + r)(1 + n_{t+1})\pi_{1,t+1}],$$
$$g_{12} = (1 - \bar w_t)[(1 + r - \alpha_{t+1})(1 - n_t)\pi_{2,t} - (1 + r)(1 - n_{t+1})\pi_{2,t+1}],$$
$$g_{21} = (1 + \bar w_t)\pi_{1,t}[(1 + n_{t+1})\pi_{1,t+1} - (1 + n_t)],$$
$$g_{22} = (1 - \bar w_t)\pi_{2,t}[(1 - n_{t+1})\pi_{2,t+1} - (1 - n_t)]$$

and $\pi_{j,t}$ ($j = 1, 2$) are defined by (20.4). The difference of population proportions $n_t$ evolves according to the two-type case of (20.11), with the fitness functions defined as in Proposition 20.1, and $C_j \ge 0$ measures the cost of strategy $j$ for $j = 1, 2$.
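How the law of motion for $n_t$ follows from (20.11) can be sketched as below. This is a reconstruction from (20.11) and the fitness recursion of Proposition 20.1, shifted one period forward to match the timing of (20.16) and (20.17); it is not a quotation of the proposition's own display, and the shorthand $a_j$ is introduced here for illustration only. Writing $a_j = \beta(\Phi_{j,t} - C_j)$ for $j = 1, 2$,

$$n_{t+1} = n_{1,t+1} - n_{2,t+1} = \frac{e^{a_1} - e^{a_2}}{e^{a_1} + e^{a_2}} = \tanh\!\left(\frac{a_1 - a_2}{2}\right) = \tanh\!\left\{\frac{\beta}{2}\left[(\Phi_{1,t} - \Phi_{2,t}) - (C_1 - C_2)\right]\right\},$$

with

$$\Phi_{j,t+1} = \pi_{j,t}[r + (\rho_{t+1} - r)\pi_{j,t}] + \gamma \Phi_{j,t}, \qquad j = 1, 2.$$

In this form $n_{t+1}$ always lies strictly between $-1$ and $1$, it equals zero when the two strategies have the same net fitness or when $\beta = 0$, and it approaches $\pm 1$ as $\beta$ grows for a given fitness difference.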
20.3.2 Wealth Distribution and Profitability of Trading Strategies
The average wealth distribution among the two types of agents (following different trading strategies) is now characterized by $\bar w_t$, the difference of the average wealth proportions. Over a certain time period, if $\bar w_t$ stays above (below) the initial value $\bar w_o$ and increases (decreases) significantly as $t$ increases, then, on average, type 1 agents accumulate more (less) wealth than type 2 agents, and one may say that the type 1 trading strategy is more (less) profitable than the type 2 trading strategy. Otherwise, if the difference is not significantly different from $\bar w_o$, then there is no evidence that on average either trading strategy is more profitable than the other.
20.3.3 Population Distribution and Herd Behavior
The distribution of populations using different types of trading strategies is now characterized by the difference of the population proportions $n_t$. At time period $t$, if $n_t$ is positive (negative), then this indicates that there are more (fewer) agents using the type 1 trading strategy than the type 2 trading strategy. Moreover, if $n_t$ is significantly different from zero, then this could be taken as an indication of herd behavior. This is, in particular, frequently observed to be the case when the switching intensity $\beta > 0$ is high.
When there is evidence on the profitability of type 1 (type 2) trading strategy and a clear indication of herding behavior using type 1 (type 2) trading strategy over the time period, we say type 1 (type 2) trading strategy dominates the market.
20.3.4 A Quasi-Homogeneous Model
As a special case of the adaptive model with two types of agents, consider the case, termed the quasi-homogeneous model, where both types of agents use exactly the same trading strategies except that they use different memory lengths. The trading strategies for both types of agents can be unified by writing

$$E_{i,t}(\rho_{t+1}) = r + \delta_i + d_i \bar\rho_{i,t}, \qquad \bar\rho_{i,t} = \frac{1}{L_i} \sum_{k=1}^{L_i} \rho_{t-k}, \tag{20.20}$$

for $i = 1, 2$, where $L_i \ge 1$ is an integer and $r$ ($> 0$), $\delta_i$ ($> 0$) and $d_i \in \mathbb{R}$ are constants. For the quasi-homogeneous model, it is further assumed that $\delta_1 = \delta_2 = \delta$, $d_1 = d_2 = d$ but $1 \le L_1 \le L_2$.

In the following discussion, assume that the conditional variances of agents are given by a constant $\sigma^2$. It is convenient to standardize both the risk premium $\delta$ and the extrapolation rate $d$ by this variance, that is, to replace $\delta$ by $\delta/\sigma^2$ and $d$ by $d/\sigma^2$. Correspondingly, the optimal demand of type $j$ agents in terms of the wealth proportion invested in the risky asset is given by

$$\pi_{j,t} = \delta_j + d_j \bar\rho_{j,t}.$$

It is also assumed that the dividend yield process has the form

$$\alpha_t = \alpha_o + q N(0, 1), \tag{20.21}$$
where $N(0, 1)$ is the standard normal distribution and $q$ is a measure of the volatility of the dividend process.¹⁴

¹⁴ The normal distribution has been chosen for convenience; it has the disadvantage that the dividend yield $\alpha_t$ could be negative. However, for the parameters used in the simulations this probability is extremely low, so the distribution is truncated at 0.

Because of the highly nonlinear nature of the adaptive model, theoretical analysis (even of the steady states) seems intractable and thus the model is analyzed numerically. However, the results for the non-adaptive model established in [11] underlie the dynamics of the adaptive model established here. In the presence of heterogeneous agents, the non-adaptive model can have multiple steady states, and the convergence to such steady states follows an optimal selection principle: the return and wealth proportions tend to the steady state which has a relatively high return. More importantly, heterogeneity can generate instability which, under the
stochastic noise processes, results in switching of the return among different states, such as steady states and periodic and aperiodic cycles, from time to time. One would expect the adaptive model to display even richer dynamics.

20.3.4.1 Existence of Steady-state Returns
If $\alpha_t = \alpha_o$ is a constant, the system (20.16)–(20.20) becomes a deterministic dynamical system. The return time series generated by the adaptive model is the outcome of the interaction of this deterministic dynamical system with external noise processes (here the dividend yield process). A first step to understanding the possible dynamical behavior of the noise-perturbed dynamical system is an understanding of the underlying dynamics of the deterministic system, such as the existence of steady states, their stability and bifurcation. When $\alpha_t = \alpha_o$ is a constant, the quasi-homogeneous model has the same steady state as the homogeneous model (in which $L_1 = L_2$). The existence of such steady states is studied in [11] and the results may be summarized as follows.
(i) The steady state of the wealth proportions stays at the initial level, while the steady state of the return depends on the extrapolation rate $d$.
(ii) When agents are fundamentalists ($d = 0$), there is a unique steady-state return, called the fundamental steady-state return. Moreover, high risk premia $\delta$ correspond to high levels of the steady-state return.
(iii) There exist two steady-state returns when agents are contrarians ($d < 0$). One of the steady-state returns is negative while the other is positive. We call the positive steady-state return the contrarian steady-state return. More importantly, with the same risk premium, when agents act as contrarians, the contrarian steady-state return is pushed below the fundamental steady-state return.
(iv) When agents are momentum traders ($d > 0$) and they extrapolate weakly, there exist two steady-state returns. However, when $d$ is close to zero, only one of the steady states is bounded and this steady-state return is called the momentum steady-state return. Furthermore, given the same risk premium, compared to the fundamental equilibrium, a weakly homogeneous momentum trading strategy leads to a higher level of steady-state return.
An aim of the following analysis is to determine to what extent the adaptive model with two types of agents reflects these characteristics.

20.3.4.2 Parameters and Initial Value Selection
Using data for the United States during the period 1926−94, as reported by Ibbotson Associates, the annual risk-free interest rate, $r = 3.7\%$, corresponds to the average rate during that period. The initial history of rates of return on the stock consists of a distribution with a mean of 12.2% and a standard deviation of 20.4%. A mean dividend yield of $\alpha_o = 4.7\%$ corresponds to the historical average yield on the S&P 500. The initial share price is $p_o = \$10.00$. The analysis in the following sections selects the annual risk-free rate $r$, standard deviation $\sigma$ and the mean dividend yield $\alpha_o$ as indicated above. For the
simulations, the time period between each trade is one day and simulations are conducted over 20 years. Parameters and initial values are selected as follows, unless stated otherwise:

$$\delta = 0.6, \quad \beta = 0.5, \quad \gamma = 0.5, \quad C_1 = C_2 = 0 \tag{20.22}$$

and

$$\bar w_o = 0, \quad n_o = 0, \quad \Phi_{1,o} = \Phi_{2,o} = 0.5, \quad p_o = \$10. \tag{20.23}$$

Furthermore, annualized returns for risk-free and risky assets are used in the fitness functions $\Phi_{j,t}$ for $j = 1, 2$.
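The baseline parameters and the dividend yield process (20.21) can be collected in a short Python sketch. Several choices here are assumptions: the dictionary keys are hypothetical names, "truncated at 0" is read as flooring negative draws at zero, the noise level `q = 0.03` is only an illustrative value, and the conversion of annual quantities to the daily trading interval is left to the caller.

```python
import random

# Values quoted in the text: (20.22), (20.23) and the calibration paragraph.
PARAMS = {
    "r": 0.037,        # annual risk-free rate
    "alpha_o": 0.047,  # mean dividend yield
    "delta": 0.6, "beta": 0.5, "gamma": 0.5,
    "C1": 0.0, "C2": 0.0,
    "w_o": 0.0, "n_o": 0.0, "Phi1_o": 0.5, "Phi2_o": 0.5, "p_o": 10.0,
}


def dividend_yield(alpha_o, q, rng):
    """One draw of alpha_t = alpha_o + q * N(0, 1), floored at zero.

    Flooring is an assumed reading of 'truncated at 0'.
    """
    return max(alpha_o + q * rng.gauss(0.0, 1.0), 0.0)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
yields = [dividend_yield(PARAMS["alpha_o"], 0.03, rng) for _ in range(1000)]
constant = dividend_yield(PARAMS["alpha_o"], 0.0, rng)  # q = 0: no noise
```

Setting `q = 0` recovers the constant dividend yield $\alpha_o$, which is the deterministic case examined first in the simulations.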
20.4 Wealth Dynamics of Momentum Trading Strategies
This section considers the quasi-homogeneous model with $d_1 = d_2 = d > 0$ and $1 \le L_1 < L_2$; that is, both types of agents follow the same momentum trading strategy except for having different memory lengths. The simulations address the question as to which type of agent dominates the market over time. As discussed in Section 20.2, some empirical studies seem to support the view that momentum strategies are profitable over short time intervals, but not over long time intervals. The following discussion examines different combinations of $(L_1, L_2)$ and analyzes the effect of lag length on the wealth dynamics.¹⁵ The results indicate that, in general, the strategy with the short memory length dominates the market by accumulating more wealth and attracting more of the trading population. The adaptive model outlined in this paper is thus capable of characterizing some broad features found in empirical studies.
¹⁵ The selection of various combinations of lag lengths is arbitrary. However, more extensive simulations (not reported) indicate some robustness of the results presented in this paper.

20.4.1 Case: $(L_1, L_2) = (3, 5)$
We consider first the dynamics of the underlying deterministic system (i.e. $q = 0$ in (20.21)) and then subsequently consider the impact of the noisy dividend process on the dynamics.

20.4.1.1 No-noise Case
For $d = 0.5$, initial population proportion $n_o = 0$ and any initial wealth proportion $\bar w_o$, numerical simulations show that

$$\rho_t \to \rho^* = 15.45\%\ \text{p.a.}, \qquad \bar w_t \to \bar w_o, \qquad n_t \to 0.$$
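A rough cross-check of this limit can be sketched under stated assumptions. Holding $\alpha_t = \alpha_o$, $\rho_t$, $n_{i,t}$ and $\pi_{i,t} = \delta + d\rho$ constant in (20.10) gives $\rho = r + \alpha_o/(1 - \delta - d\rho)$, a quadratic in $\rho$. This derivation, the conversion of annual rates to daily rates by dividing by `K = 250` trading days, and the function name are all assumptions made here for illustration; the chapter itself refers to [11] for the steady-state analysis.

```python
import math


def steady_state_return(r, alpha, delta, d):
    """Bounded root of (rho - r) * (1 - delta - d * rho) = alpha."""
    b = 1.0 - delta + d * r
    c = r * (1.0 - delta) + alpha
    disc = b * b - 4.0 * d * c
    if disc < 0:
        raise ValueError("no real steady state for these parameters")
    # This form stays finite as d -> 0 and selects the bounded root.
    return 2.0 * c / (b + math.sqrt(disc))


K = 250  # assumed number of trading days per year
r_day, alpha_day = 0.037 / K, 0.047 / K

rho_momentum = K * steady_state_return(r_day, alpha_day, delta=0.6, d=0.5)
rho_fundamental = K * steady_state_return(r_day, alpha_day, delta=0.6, d=0.0)
rho_contrarian = K * steady_state_return(r_day, alpha_day, delta=0.6, d=-0.5)
```

Under these assumptions the momentum case comes out close to the 15.45% p.a. reported above, and the ordering of the contrarian, fundamental and momentum steady-state returns agrees with items (iii) and (iv) of Section 20.3.4.1.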
By changing various parameters and initial values, the following results on the momentum trading strategies from the quasi-homogeneous model have been obtained.

Risk premium and over-pricing. It is found that, ceteris paribus, for $\delta = 0.35$, $\rho_t \to \rho^* = 10.94\%$, while for $\delta = 0.6$, $\rho^* = 15.45\%$. In general, a higher risk-adjusted premium leads to a higher return, and a higher price as well. In fact, for the given parameters, there exists $\delta_o \in (0.69, 0.7)$, called a bifurcation value,¹⁶ such that the returns converge to fixed values for $\delta < \delta_o$ and diverge for $\delta > \delta_o$, leading to price explosion.

Over-extrapolation and overshooting. Momentum traders form expectations of future prices by extrapolating trends. However, when the prices or returns are over-extrapolated, stocks are over-priced and, as a result, overshooting takes place. Based on the parameters selected, simulations indicate that there exists $d_o \in (0.573, 0.574)$ such that, ceteris paribus, returns converge to fixed values for $d < d_o$ and diverge for $d > d_o$, leading prices to exhibit overshooting.

No herding behavior. For either fixed $n_o \ne 0$ and a range of $\bar w_o$ (say, $n_o = -0.3$ and $\bar w_o \in (-0.5, 0.3)$), or fixed $\bar w_o \ne 0$ and a range of $n_o$ (say, $\bar w_o = -0.3$ and $n_o \in (-1, 0.3)$), simulations show that

$$\rho_t \to \rho^* = 15.45\%\ \text{(annual)}, \qquad \bar w_t \to \bar w_o + \varepsilon, \qquad n_t \to 0$$

with $\varepsilon \approx 10^{-6}$. Also, the switching intensity parameter $\beta$ has almost no effect on the results (as long as the return series converge to constants). This implies that, without the noise from the dividend yield process, in terms of profitability the type 1 trading strategy is slightly better than type 2, but not significantly so. In addition, the populations of agents using different strategies become evenly distributed. In other words, neither of the momentum trading strategies dominates the market, even though both wealth and population are not evenly distributed initially. Therefore, there is no herd behavior.
¹⁶ As in Brock and Hommes [6], the dynamics of the system through various types of bifurcation can be analyzed and are of interest. However, in this paper, we focus on the dynamics of the stochastic system when the return process of the underlying deterministic system is stable.

20.4.1.2 Effect of Noise
We select the annualized standard deviation of the noisy dividend yield process, $q = 0.03 = 3\%$. When a noisy dividend process is added to the adaptive system, the general features of the corresponding deterministic system (without the noise) still hold. However, the noise has a significant impact on the dynamics of the system, such as wealth and population distributions, autocorrelation of returns, volatility of returns and prices etc., as indicated below. In particular, the dynamics of the model are greatly affected by agents' behavior, which is measured by their extrapolation
rate, $d$, and switching intensity, $\beta$. The following discussion focuses on the dynamics of the system for various combinations of these two parameters $d$ and $\beta$.

Wealth distribution is largely influenced by agents' extrapolation and strategy switching activity. Simulations show that, in general, a strong extrapolation leads the type 1 trading strategy (with lag 3) to accumulate more wealth than the type 2 trading strategy does. In other words, the type 1 trading strategy (with lag of 3) is more profitable than type 2 (with lag of 5) under the noisy dividend process. Furthermore, as the switching intensity $\beta$ increases, the profitability of the type 1 trading strategy is improved significantly. This result is unexpected and interesting, and it is optimal in the sense that the overall outcome is independent of the initial wealth and population distributions.

When the wealth and population are evenly distributed across the two types of strategies initially (i.e. $\bar w_o = 0$, $n_o = 0$), on average, the type 1 strategy accumulates more wealth (about 5% to 6%) than the type 2 strategy over the whole period, as indicated by the time series plot for the average wealth proportion difference ($\bar w_t$) in Figure 20.1. Also, as the extrapolation rate increases (i.e. as $d$ increases), the type 1 strategy accumulates more wealth than the type 2 strategy (say, about 2% to 3% more for $d = 0.5$, compared to 5−6% more for $d = 0.53$). This suggests that, when both types of strategies start with the same level of wealth and have the same number of traders, the type 1 trading strategy is more profitable under the noisy dividend process. This result still holds when the initial wealth is not so evenly distributed. However, on average, when the type 1 strategy starts with more initial wealth than type 2 (say $\bar w_o = 0.2$), the prices can be pushed immediately to very high levels, so that any further trend chasing from the type 1 strategy can cause the price to overshoot, leading to an explosion of the price.
For a fixed initial wealth proportion w_0 (say w_0 = 0) and a range of n_0 (say, n_0 ∈ (−1, 0.5)), w_t increases in t. But for large n_0 (say n_0 = 0.6), prices are pushed to explosion. This indicates that the type 1 strategy accumulates more wealth over the period, even when the population of type 2 agents is high initially. However, an initial over-concentration of type 1 agents can lead to overshooting of the price.
Herding behavior is measured by the population proportion difference n_t and the switching intensity parameter β. For β = 0, there is no switching between the two trading strategies. However, when agents are allowed to switch (i.e., β > 0), they switch between the two strategies frequently, as indicated by the time series plot of the population (n_t) in Figure 20.1. In general, because of the profitability of the type 1 strategy, more agents switch from type 2 to type 1, as indicated by the mean and standard deviation of the population n_t in Table 20.1. Also, simulations (not reported here) show that the frequency of such switching increases with the switching intensity β. Furthermore, as β increases, both prices and returns become more volatile, as indicated by the time series plots of returns (ρ_t) and prices (p_t) in Figure 20.1.
Excess volatility and volatility clustering – As indicated by the time series plot of returns ρ_t in Figure 20.1, adding the noisy dividend process causes an otherwise stable return series to fluctuate. This fact itself is not unexpected. What is of interest is the contrast between the simple normally distributed dividend process that is the input to the system and the return process that is the output of the system.
Carl Chiarella, Xue-Zhong He
Figure 20.1. Time series plots for returns (top left), wealth (top right) and population (bottom left) distributions, and prices (bottom right) when the same momentum trading strategies with different lags (L1,L2) = (3,5) are used. Here, δ = 0.6, d = 0.53 and q = 0.03
With an increase in the standard deviation q of the noise process, the agents' extrapolation rate d, or the switching intensity β, both returns and prices become more volatile. Moreover, volatility clustering is also observed.
Autocorrelation – Significantly positive autocorrelation coefficients (ACs) are found for lags 1 and 2, negative ones for lags 3 to 8, and positive ones for lags 9 to 14, as indicated by Table 20.4. However, as the lag length increases, the ACs become less significant.
Overshooting – Related simulations (not reported here) indicate that strong extrapolation (corresponding to high d), high volatility of the dividend yield process (q), or a high switching intensity (β) can cause the price to overshoot and lead to a price explosion. Numerical simulations also show that, to avoid price overshooting, a minimum level of risk premium (δ) is required.
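The autocorrelation patterns reported in Table 20.4 are sample autocorrelation coefficients of the simulated return series. As a minimal sketch of how such coefficients and a rough significance band could be computed, the following uses the standard sample estimator. The function names and the white-noise input are illustrative assumptions, not the authors' code or data.

```python
import math
import random


def autocorrelation(series, max_lag):
    """Return sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0.0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / variance
        for lag in range(1, max_lag + 1)
    ]


def significance_band(n):
    """Approximate 95% band for the ACs of white noise: +/- 1.96 / sqrt(n)."""
    return 1.96 / math.sqrt(n)


# Illustrative input only: i.i.d. Gaussian noise stands in for simulated returns.
rng = random.Random(0)
returns = [rng.gauss(0.0, 0.03) for _ in range(2000)]

acs = autocorrelation(returns, 36)          # lags 1 to 36, as in the appendix
band = significance_band(len(returns))
significant_lags = [lag for lag, ac in enumerate(acs, start=1) if abs(ac) > band]
```

Applied to a return series from the model, `significant_lags` would list the lags whose coefficients fall outside the white-noise band, which is one common (though not the only) reading of "significant" ACs.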
20.4.2 Other Lag Length Combinations
This section addresses how the above results are affected by different lag length combinations. For (L1, L2) = (3, 7), the general dynamic features are similar to the case of (L1, L2) = (3, 5), except for the following differences.
(i) The underlying deterministic system is stable over a wider range of extrapolation rates d ∈ [0, d*); d* ≈ 0.57 for L2 = 5 and d* ≈ 0.773 for L2 = 7.
Figure 20.2. Time series plots for wealth and population distributions, returns and prices when the same momentum trading strategies with different lags (L1,L2) = (10,26) are used. Here, δ = 0.6, d = 1.2, q = 0.03 and ω0 = 0.6.
(ii) The trading strategy with the short lag dominates the market. However, compared to the previous case, for the same set of parameters and initial values, both the profitability and the herd behavior decrease, as indicated by the time series plots for wealth and population in Figure 20.5 and the corresponding statistical results in Table 20.1.
(iii) The ACs are significantly positive for lags 1 and 2, positive or negative (but not significantly so) for lag 3, negative for lags 4 to 9, and positive for lags 10 to 15, as indicated in Table 20.4.
For (L1, L2) = (10, 14), compared to the previous two cases, the following differences have been observed.
(i) The upper bound d* for returns of the underlying deterministic system to be stable increases to d* ≈ 1.57.
(ii) With the noisy dividend yield process added, the trading strategy with memory length 10 accumulates more wealth than the one with lag length 14. However, in contrast to the previous cases, for the same set of parameters and initial values, the profitability and herd behavior of the strategy with lag 10 relative to the one with lag 14 are much less significant, as indicated by the corresponding statistical results in Table 20.2, although an increase in extrapolation improves the profitability of the strategy with lag 10, as indicated by the corresponding statistics in Table 20.2 for d = 1.2.
(iii) The ACs oscillate and become less significant when agents extrapolate weakly (say, for d = 0.5), but become more significant when agents extrapolate strongly (say, d = 1.1), as indicated by Table 20.4.
(iv) There is less herding behavior than in the previous cases. This is partly because the profitability advantage of the strategy with lag 10 over the one with lag 14 is less significant.
For (L1, L2) = (10, 26), the upper bound d* for returns of the underlying deterministic system to be stable increases to d* ∈ (2.2, 2.3). With the noisy dividend yield process added and the same parameters and initial values, the profitability of the type 1 trading strategy (with lag length 10) becomes questionable, as indicated by the time series plots in Figure 20.2 and the corresponding statistics in Table 20.3. As demonstrated by Table 20.4, the ACs show less of a pattern and do not die out as lags increase.
In summary, we obtain the following results when both types of agents follow the same momentum trading strategy, but with different memory lengths.
(i) Without the noisy dividend yield process, an increase in the lag length of either trading strategy stabilizes the return series of the underlying deterministic system and enlarges the range of the extrapolation coefficient for which the market does not explode. However, for the same set of parameters, the profitability of the trading strategies and herd behavior become less significant.
(ii) Adding the noisy dividend process in general improves the profitability of the trading strategies with short lag length L1 (say, (L1, L2) = (3, 5) and (3, 7)). However, such profitability becomes less significant when the short lag length L1 increases, and may even disappear (say, (L1, L2) = (10, 14) and (10, 26)).
(iii) When trading strategies become profitable, agents tend to adopt herding behavior: more agents switch to the more profitable strategy over the time period. However, over-concentration (in terms of the initial average wealth or population proportion) or over-extrapolation (in terms of high extrapolation rates and improper risk premium levels) can cause overshooting of the price and push prices to explosion, leading to a market crash.
(iv) Momentum trading strategies can push prices to a very high level and make returns more volatile, exhibiting volatility clustering.
(v) ACs follow certain patterns when one of the trading strategies becomes profitable, and they die out as lags increase. However, such patterns become less significant when the profitability of the trading strategy becomes less significant, and may even disappear.
(vi) Price levels are determined more by the risk premium levels than by other parameters (such as the extrapolation rate and switching intensity).
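To make the role of the switching intensity β in the herding results more concrete, here is a minimal sketch of a discrete-choice (logit) switching rule of the kind used in the heterogeneous-agent literature cited in this chapter. It is an assumption for illustration only: the chapter's own switching rule is defined in an earlier section and may differ in its fitness measure and details. The function names and fitness values are hypothetical.

```python
import math


def logit_fractions(fitness_1, fitness_2, beta):
    """Population fractions proportional to exp(beta * fitness)."""
    shift = max(beta * fitness_1, beta * fitness_2)  # avoids overflow in exp
    e1 = math.exp(beta * fitness_1 - shift)
    e2 = math.exp(beta * fitness_2 - shift)
    return e1 / (e1 + e2), e2 / (e1 + e2)


def population_difference(fitness_1, fitness_2, beta):
    """Difference n1 - n2 of the two logit fractions.

    Algebraically this equals tanh(beta * (fitness_1 - fitness_2) / 2).
    """
    return math.tanh(0.5 * beta * (fitness_1 - fitness_2))


# Hypothetical fitness values: strategy 1 has performed slightly better.
for beta in (0.0, 0.5, 2.0, 10.0):
    n = population_difference(0.3, 0.1, beta)
    print(f"beta = {beta:4.1f}  ->  n1 - n2 = {n:.4f}")
```

Under this assumed rule, β = 0 leaves the population difference at zero regardless of performance, while larger β moves more of the population towards the better-performing strategy, which is the qualitative herding effect described in item (iii).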
20.5 Wealth Dynamics of Contrarian Trading Strategies
This section considers the quasi-homogeneous model with d1 = d2 = d < 0 and 1 ≤ L1 < L2; that is, both types of agents follow the same contrarian trading strategy except for having different memory lengths. As discussed in section 20.3, some empirical studies suggest that contrarian trading strategies are more profitable over long periods. The results in this section are broadly consistent with this view and show that the adaptive model presented in this chapter is capable of characterizing some features found in empirical studies. Furthermore, wealth and population distributions, statistical properties of returns (such as volatility clustering and autocorrelations), and herd behavior are discussed.
20.5.1 Case: (L1,L2) = (3,5)
With the selection of the parameters and initial values in (20.22)−(20.23), we consider first the dynamics of the underlying deterministic system and then the impact of the noisy dividend yield process on the dynamics.
20.5.1.1 No-noise Case
Let q = 0. For d = −0.4, an initial difference of population proportions n_0 = 0 and any initial wealth proportion w_0, we obtain
$$\rho_t \to \rho^* = 15.45\%\ \text{(p.a.)}, \qquad w_t \to w_0, \qquad n_t \to 0.$$
By changing parameters and initial values, the following results are obtained.
(i) It is found that, ceteris paribus, for δ (= δ/σ²) = 0.4, ρ_t → ρ* = 11.52%, while for δ = 0.6, ρ* = 15.45%. In general, a high level of risk-adjusted premium leads to a high return and, correspondingly, a high price. In fact, for the given parameters, there exists a bifurcation value δ_0 ∈ (0.6, 0.7) such that the returns converge to fixed values for δ < δ_0 and diverge for δ > δ_0, leading prices to explode.
(ii) Based on the parameters selected, there exists d_0 ∈ (−0.53, −0.52) such that, ceteris paribus, returns converge to fixed values for 0 > d > d_0 and diverge for d < d_0, leading prices to overshoot. As with the momentum trading strategies, over-extrapolation under contrarian trading strategies also causes overshooting of prices.
(iii) Unlike the case of the momentum trading strategies, wealth distributions of the deterministic system are affected differently by the extrapolation rate d, the switching intensity, and the initial wealth and population distributions. In general, as d (< 0) decreases towards the bifurcation value, the profitability of the trading strategy with the long lag (L = 5) improves significantly, say from 5% for d = −0.454, to 25% for d = −0.48, and to 50% for d = −0.5. However, for fixed d < 0, say d = −0.5, as β increases, the profitability of trading strategy 2 becomes less significant, say from 45% for β = 0.1 to 20% for β = 2. This is different from the case of momentum trading strategies. For fixed n_0 ≠ 0 and a range of w_0 (say, n_0 = 0.3 and w_0 ∈ (−0.5, 0.5)), ρ_t → ρ* = 15.45% p.a., w_t → w_0 − ε and n_t → 0, with ε ≈ 10^−6. This implies that agents' wealth is distributed according to their initial wealth distribution, although populations are not evenly distributed initially. For fixed w_0 < 0 (say, w_0 = −0.3) and n_0 ∈ [−1, 1], the type 2 strategy accumulates more wealth than the type 1 strategy over a very short period, but the difference is not significant (about 1%). In other words, when the initial average wealth of the type 2 strategy exceeds that of the type 1 strategy, neither of the contrarian trading strategies can make a significant profit over the other, no matter how the initial populations are distributed. For fixed w_0 > 0 (say, w_0 = 0.3) and n_0 ∈ [−1, 1], the type 2 strategy accumulates more wealth than the type 1 strategy over a very short period, and the difference becomes more significant (up to 37%) as more agents use the type 2 trading strategy initially. This implies that, when the initial average wealth of the type 1 strategy is higher than that of the type 2 strategy, the contrarian strategy with the long memory length (L2 = 5) is able to accumulate more wealth over a very short period than the same strategy with the short memory length (L1 = 3). In addition, the profitability becomes more significant when more agents use the strategy with the long memory length initially. This is different from the case when agents use momentum strategies.
(iv) The dynamics display no significant differences for different values of the switching intensity parameter β when the return process of the underlying deterministic system is stable. However, as d nears the bifurcation value, herd behavior is also observed.
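The bifurcation values above are reported as brackets, such as δ_0 ∈ (0.6, 0.7) and d_0 ∈ (−0.53, −0.52). The chapter does not say how these brackets were obtained. One plausible numerical approach is to bisect between a parameter value for which simulated returns stay bounded and one for which they diverge. The sketch below illustrates the idea on a toy linear map, not on the chapter's model; `stays_bounded`, `bracket_bifurcation` and `toy_step` are hypothetical names, and the finite simulation horizon makes the result only approximate.

```python
def stays_bounded(step, param, x0=0.01, n_steps=20000, bound=1e3):
    """Iterate x -> step(x, param) and report whether |x| stays below bound."""
    x = x0
    for _ in range(n_steps):
        x = step(x, param)
        if abs(x) > bound:
            return False
    return True


def bracket_bifurcation(step, stable, unstable, tol=1e-3):
    """Shrink [stable, unstable] by bisection until it is narrower than tol.

    `stable` must give a bounded trajectory and `unstable` a diverging one.
    """
    if not stays_bounded(step, stable) or stays_bounded(step, unstable):
        raise ValueError("need one bounded and one diverging parameter value")
    while abs(unstable - stable) > tol:
        mid = 0.5 * (stable + unstable)
        if stays_bounded(step, mid):
            stable = mid
        else:
            unstable = mid
    return stable, unstable


def toy_step(x, c):
    """Toy stand-in for the return dynamics: converges for |c| < 1."""
    return c * x


lo, hi = bracket_bifurcation(toy_step, stable=0.5, unstable=1.5)
print(f"bifurcation value bracketed in [{lo:.4f}, {hi:.4f}]")
```

For the actual model, `toy_step` would be replaced by one iteration of the deterministic return dynamics, with the other parameters held fixed (ceteris paribus) while d or δ is varied.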
20.5.1.2 Effect of Noise
With the noisy dividend yield process set at q = 3%, we obtain the following results.
Effect of the initial wealth distribution – When wealth and population are initially evenly distributed between the two types of trading strategies (i.e. w_0 = 0, n_0 = 0), the type 2 strategy accumulates more average wealth (about 5% to 7%) than the type 1 strategy over the whole period, as indicated by the time series plots of wealth and population in Figure 20.3. Also, as extrapolation strengthens (i.e. as d decreases), the type 2 strategy accumulates even more wealth than the type 1 strategy (say, about 5% for d = −0.45, and about 45% for d = −0.5). This suggests that, when both types of strategies start with the same level of wealth and have an equal number of traders, the strategy with the long lag length (L2 = 5) accumulates more wealth than the one with the short lag length (L1 = 3). In other words, the type 2 strategy benefits significantly from the noisy dividend yield process. This result still holds when the initial wealth is not so evenly distributed (say, w_0 ∈ (−0.6, 0.3) for d = −0.5).
Effect of the initial population distribution – As in the case without noise, the effect on the wealth distribution depends on the initial wealth levels. For fixed w_0 < 0 and a range of n_0 (say, w_0 = −0.3 and n_0 ∈ (−0.8, 0.65)), the profitability of trading strategy 2 does not change much for different n_0. However, for fixed w_0 > 0 and a range of n_0 (say, w_0 = 0.3 and n_0 ∈ (−0.9, 0.9)), the profitability of trading strategy 2 increases significantly as more and more agents use trading strategy 2. Price overshooting is possible when the populations are over-concentrated in the use of one of the trading strategies.
Herding behavior is also observed for changing values of the parameter β. Given the profitability of the trading strategy with the long lag length, more agents tend to switch to this more profitable strategy, as indicated by the time series plot of population in Figure 20.3 and the corresponding statistics in Table 20.5. Furthermore, as β increases, both prices and returns become more volatile, leading to excess volatility.
Figure 20.3. Time series plots for wealth and population distributions, returns and prices when the same contrarian trading strategies with different lags (L1,L2) = (3,5) are used. Here, δ = 0.6, d = −0.45, and q = 0.03.
Excess volatility and volatility clustering – The addition of a noisy dividend yield process causes an otherwise stable return series to fluctuate. As in the case of momentum trading strategies, an increase in either the standard deviation q of the dividend yield noise process or the strength of agents' extrapolation d makes both returns and prices more volatile. Volatility clustering is observed, as illustrated by the time series plot of the returns in Figure 20.3 and the corresponding statistics in Table 20.5. The ACs are significantly negative for odd lags and positive for even lags across all lags, as indicated in Table 20.6. As with the momentum trading strategies discussed in section 20.4, the noisy dividend yield process has a significant impact on prices. An increase in q can push prices to significantly high levels. Strong extrapolation (corresponding to low d), high risk premia δ, or a high switching intensity β can have the same effect and cause prices to explode.
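Volatility clustering, as described above, means that calm and turbulent stretches of returns tend to group together. A simple diagnostic is a rolling standard deviation of the return series. The sketch below is an illustration under stated assumptions: the function name `rolling_std`, the window length and the synthetic two-regime series are hypothetical and are not taken from the chapter's simulations.

```python
import math
import random


def rolling_std(series, window):
    """Sample standard deviation over each consecutive window of the series."""
    if window < 2 or window > len(series):
        raise ValueError("window must be between 2 and len(series)")
    out = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        mean = sum(chunk) / window
        var = sum((x - mean) ** 2 for x in chunk) / (window - 1)
        out.append(math.sqrt(var))
    return out


# Synthetic stand-in for returns: a calm regime followed by a turbulent one.
rng = random.Random(1)
calm = [rng.gauss(0.0, 0.01) for _ in range(200)]
turbulent = [rng.gauss(0.0, 0.05) for _ in range(200)]
returns = calm + turbulent

vol = rolling_std(returns, window=50)
print(f"rolling volatility: first window {vol[0]:.4f}, last window {vol[-1]:.4f}")
```

On a simulated return series from the model, persistent stretches of high and low values in `vol` would be consistent with the clustering visible in the return plots.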
20.5.2 Other Cases
For (L1, L2) = (3, 7), the general dynamic features are similar to the above case of (L1, L2) = (3, 5), except for the following differences.
(i) The underlying deterministic system is stable over a wider range of extrapolation rates d ∈ (d*, 0], with d* ∈ (−0.65, −0.6) for L2 = 7, in contrast with d* ∈ (−0.53, −0.52) for L2 = 5.
(ii) Similar to the previous case, the trading strategy with the longer lag L2 = 7 dominates the market, in particular when d is near the bifurcation value. However, for the same set of parameters, compared with the case of L2 = 5, the profitability is reduced slightly. On the other hand, agents can extrapolate over a wider range of the parameter d. Similar impacts of the initial wealth and population distributions, the switching intensity, and the standard deviation of the noisy dividend process on the dynamics can be observed over a wider range of the parameters.
For (L1, L2) = (10, 14), the lower bound d* for the return process of the underlying deterministic system to be stable decreases to d* ∈ (−1.6, −1.5). With the noisy dividend yield process added and the same parameters and initial values, the profitability of the type 2 trading strategy (with lag length 14) becomes questionable, as indicated by the corresponding statistics in Table 20.5. The patterns of the ACs are maintained, but they become less significant (for the same parameter d = −0.45), as shown in Table 20.6. Compared with the previous cases, there is less herding behavior, as indicated by the corresponding statistics in Table 20.5. This is partly due to the less significant (or even absent) profitability advantage of the strategy with lag 14 over the one with lag 10.
Figure 20.4. Time series plots for wealth and population distributions, returns and prices when the same contrarian trading strategies with different lags (L1,L2) = (10,26) are used. Here, δ = 0.5, d = −0.4, and q = 0.01 .
For (L1, L2) = (10, 26), the lower bound d* (on d) for the return process of the underlying deterministic system to be stable decreases to d* ∈ (−2.2, −2.1). With the noisy dividend yield process added, the trading strategy with lag length 26 accumulates more wealth than the one with lag length 10, as shown in Figure 20.4 and Table 20.5. However, compared to the previous cases (L1, L2) = (3, 5) and (3, 7), the profitability advantage of the strategy with lag 26 over the one with lag 10 becomes much less significant (about 0.01% to 0.04% more for d = −0.45), although a strong extrapolation rate can improve the profitability of the strategy with lag 26 (to about 5% to 7% for d = −2.0). The ACs become less significant when agents extrapolate weakly (say, d = −0.45), as indicated in Table 20.6, and more significant when agents extrapolate strongly (say, d = −2.0).
In summary, we obtain the following results when both types of agents follow the same contrarian trading strategy, but with different lag lengths.
(i) Without the noisy dividend process, for a given set of parameters, an increase in the lag length of the trading strategies stabilizes the return series of the underlying deterministic system. As both d and β are near their bifurcation values, profitability of trading strategies and herd behavior are observed, in general.
(ii) Adding a noisy dividend yield process, in general, improves the profitability of the trading strategies with long lag lengths (say, L2 = 5, 7, 26). However, such profitability becomes less significant when the relative difference between the two lag lengths is small (say, L1 = 10, L2 = 14).
(iii) Similar to the case of momentum trading strategies, herding behavior is observed when one of the trading strategies becomes (significantly) profitable. Also, over-concentration (in terms of the initial average wealth and population proportion) or over-extrapolation (in terms of low extrapolation rates and improper risk premium levels) can cause overshooting of the price and lead to a price explosion and a market crash. Price levels are determined more by the risk premium levels than by the other parameters (such as the extrapolation rate and switching intensity).
(iv) The ACs are significantly negative for odd lags and positive for even lags when the lag lengths of the contrarian trading strategies are small. However, they become less significant when the lag lengths increase.
20.6 Conclusion
This chapter has taken the basic two-date portfolio optimization model that is the basis of asset pricing theories (such as CAPM) and added investor heterogeneity to it by allowing agents to form different views on the expected value of the return distribution of the risky asset. The outcome is an adaptive model of asset price and wealth dynamics with agents using various trading strategies. As a special case, a quasi-homogeneous model of two types of agents using either momentum or contrarian trading strategies is introduced to analyze the profitability of the trading strategies with different time horizons. It is found that agents with different time horizons coexist. Our results shed light on the empirical finding that momentum trading strategies are more profitable over short time intervals, while contrarian trading strategies are more profitable over long time intervals. It should be pointed out that this is an unexpected result given the traditional foundations of the adaptive model. Even though the quasi-homogeneous model is one of the simplest cases of the adaptive model, it generates various phenomena observed in financial markets, including rational adaptiveness of agents, overconfidence and under-reaction, overreaction and price overshooting, herding behavior, excess volatility, and volatility clustering. The model also displays the essential characteristics of the standard asset price dynamics model assumed in continuous time finance, in that the asset price fluctuates around a geometrically growing trend.
Our analysis in this chapter is based on a simplified quasi-homogeneous model. Further analysis and extensions can be found in [14] and [10]. A more extensive analysis of the adaptive model is necessary in order to explore the potential explanatory power of the model. Firstly, one extension is to consider models of two or three different types of trading strategies, to analyze the profitability of different trading strategies, and to examine the stylized facts of the return distribution. Secondly, the attitudes of agents towards extrapolation and the risk premium change when the market environment changes, and this change should be made endogenous. Thirdly, there should be a more extensive simulation study of these richer models once they are developed. In fact, a proper Monte Carlo analysis is required to determine whether the models can generate, with a high frequency, the statistical characteristics of major indices such as the S&P500. These extensions are interesting problems which are left to future research.
20.7 Appendix
20.7.1 Proof of Proposition 1
Proof. At time period $t-1$, assume that there are $\ell_{j,t-1}$ agents belonging to group $j$. Then, within the group, their optimum demands, in terms of the wealth proportion to be invested in the risky asset, are the same, denoted by $\pi_{j,t-1}$. It follows from (20.2) and (20.7) that their average wealth proportion, which corresponds to the wealth proportion of the $j$-th HRA, at the next time period $t$ is given by
$$\bar{w}_{j,t} = \frac{\bar{W}_{j,t}}{\bar{W}_t} = \frac{\bar{W}_{j,t-1}\,[R + (\rho_t - r)\pi_{j,t-1}]}{\bar{W}_t} = \frac{\bar{w}_{j,t-1}\,[R + (\rho_t - r)\pi_{j,t-1}]}{\bar{W}_t/\bar{W}_{t-1}}. \tag{20.24}$$
Note that
$$\frac{\bar{W}_t}{\bar{W}_{t-1}} = \frac{\sum_{j=1}^{h} \bar{W}_{j,t}}{\bar{W}_{t-1}} = \sum_{k=1}^{h} \bar{w}_{k,t-1}\,[R + (\rho_t - r)\pi_{k,t-1}]. \tag{20.25}$$
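Equations (20.24) and (20.25) together say that each group's wealth proportion is its previous proportion scaled by its own gross portfolio return and then renormalized by the sum of these scaled proportions. A minimal sketch of this update follows. It assumes R = 1 + r for the gross risk-free return (R is defined earlier in the chapter and is not restated here), and the function name and numerical inputs are hypothetical.

```python
def update_wealth_proportions(w_prev, pi_prev, rho_t, r):
    """One step of the wealth-proportion update in (20.24)-(20.25).

    w_prev  : wealth proportions of the groups at t-1 (should sum to 1)
    pi_prev : proportions of wealth each group invested in the risky asset at t-1
    rho_t   : return on the risky asset over (t-1, t]
    r       : risk-free rate; R = 1 + r is ASSUMED here
    """
    R = 1.0 + r
    scaled = [w * (R + (rho_t - r) * p) for w, p in zip(w_prev, pi_prev)]
    growth = sum(scaled)                  # right-hand side of (20.25)
    w_new = [s / growth for s in scaled]  # (20.24)
    return w_new, growth


# Hypothetical inputs: two groups with equal wealth but different risky holdings.
w_new, growth = update_wealth_proportions(
    w_prev=[0.5, 0.5], pi_prev=[0.2, 0.8], rho_t=0.10, r=0.05
)
print(w_new, growth)
```

Because the risky return exceeds the risk-free rate in this example, the group holding more of the risky asset ends the period with the larger wealth proportion, and the new proportions again sum to one.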
Then both (20.24) and (20.25) imply that the time $t$ average wealth proportion for the $j$-th group, formed at $t-1$, is given by (20.9). With the notations introduced in section 20.2, the market clearing equilibrium price equation (20.6) can be rewritten as:
$$\sum_{j=1}^{h} n_{j,t}\,\pi_{j,t}\,\bar{W}_{j,t} = N p_t / H. \tag{20.26}$$
Note that
$$W_t = \sum_{i=1}^{H} W_{i,t} = \sum_{j=1}^{h} \ell_{j,t}\,\bar{W}_{j,t} = H \sum_{j=1}^{h} n_{j,t}\,\bar{W}_{j,t}. \tag{20.27}$$
It follows from (20.26) and (20.27) that the market clearing price equilibrium equation (20.26) becomes
$$W_t \sum_{j=1}^{h} n_{j,t}\,\pi_{j,t}\,\bar{w}_{j,t} = N p_t \sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t}. \tag{20.28}$$
From (20.28)
$$\frac{W_t}{W_{t-1}} = (1 + \rho_t - \alpha_t)\, \frac{\sum_{j=1}^{h} n_{j,t-1}\,\pi_{j,t-1}\,\bar{w}_{j,t-1}}{\sum_{j=1}^{h} n_{j,t}\,\pi_{j,t}\,\bar{w}_{j,t}}\; \frac{\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t}}{\sum_{j=1}^{h} n_{j,t-1}\,\bar{w}_{j,t-1}}. \tag{20.29}$$
Note that
$$\frac{W_t}{W_{t-1}} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar{W}_{j,t}}{\sum_{j=1}^{h} n_{j,t-1}\,\bar{W}_{j,t-1}} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t-1}\,[R + (\rho_t - r)\pi_{j,t-1}]}{\sum_{j=1}^{h} n_{j,t-1}\,\bar{w}_{j,t-1}}. \tag{20.30}$$
Substituting (20.30) into (20.29),
$$\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t-1}\,[R + (\rho_t - r)\pi_{j,t-1}] = (1 + \rho_t - \alpha_t)\, \frac{\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t}}{\sum_{j=1}^{h} n_{j,t}\,\pi_{j,t}\,\bar{w}_{j,t}} \sum_{j=1}^{h} n_{j,t-1}\,\pi_{j,t-1}\,\bar{w}_{j,t-1}. \tag{20.31}$$
Also, using (20.9),
$$\sum_{j=1}^{h} n_{j,t}\,\pi_{j,t}\,\bar{w}_{j,t} = \frac{\sum_{j=1}^{h} n_{j,t}\,\pi_{j,t}\,\bar{w}_{j,t-1}\,[R + (\rho_t - r)\pi_{j,t-1}]}{\sum_{k=1}^{h} \bar{w}_{k,t-1}\,[R + (\rho_t - r)\pi_{k,t-1}]}, \tag{20.32}$$
$$\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t} = \frac{\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t-1}\,[R + (\rho_t - r)\pi_{j,t-1}]}{\sum_{k=1}^{h} \bar{w}_{k,t-1}\,[R + (\rho_t - r)\pi_{k,t-1}]}. \tag{20.33}$$
Substitution of (20.32) and (20.33) into (20.31) and simplification of the corresponding expression leads to the equation
$$\sum_{j=1}^{h} n_{j,t}\,\bar{w}_{j,t-1}\,\pi_{j,t}\,[R + (\rho_t - r)\pi_{j,t-1}] = [(\rho_t - r) + (1 + r - \alpha_t)] \Big( \sum_{j=1}^{h} n_{j,t-1}\,\pi_{j,t-1}\,\bar{w}_{j,t-1} \Big). \tag{20.34}$$
Solving for $\rho_t$ from (20.34), one obtains (20.10) for the return $\rho_t$.
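Equation (20.34) is linear in the return, so the final step of the proof amounts to collecting the terms in (ρ_t − r). The sketch below carries out that algebra numerically and checks that the solved return satisfies (20.34). It assumes R = 1 + r and is not claimed to reproduce the exact arrangement of terms in (20.10), which is stated earlier in the chapter. The function name and all input values are hypothetical.

```python
def solve_return(n_t, n_prev, w_prev, pi_t, pi_prev, r, alpha_t):
    """Solve the linear equation (20.34) for rho_t, ASSUMING R = 1 + r.

    Writing
        A = sum_j n_{j,t}   w_{j,t-1} pi_{j,t}
        B = sum_j n_{j,t}   w_{j,t-1} pi_{j,t} pi_{j,t-1}
        C = sum_j n_{j,t-1} w_{j,t-1} pi_{j,t-1}
    (20.34) reads R*A + (rho_t - r)*B = (rho_t - r)*C + (R - alpha_t)*C.
    """
    R = 1.0 + r
    A = sum(n * w * p for n, w, p in zip(n_t, w_prev, pi_t))
    B = sum(n * w * p * q for n, w, p, q in zip(n_t, w_prev, pi_t, pi_prev))
    C = sum(n * w * q for n, w, q in zip(n_prev, w_prev, pi_prev))
    if B == C:
        raise ZeroDivisionError("(20.34) does not determine rho_t when B == C")
    return r + ((R - alpha_t) * C - R * A) / (B - C)


# Hypothetical two-group inputs, for illustration only.
n_t, n_prev = [0.5, 0.5], [0.5, 0.5]
w_prev = [0.5, 0.5]
pi_t, pi_prev = [0.3, 0.5], [0.4, 0.6]
r, alpha_t = 0.05, 0.02

rho_t = solve_return(n_t, n_prev, w_prev, pi_t, pi_prev, r, alpha_t)

# Residual of (20.34) at the solved return; it should vanish up to rounding.
R = 1.0 + r
lhs = sum(n * w * p * (R + (rho_t - r) * q)
          for n, w, p, q in zip(n_t, w_prev, pi_t, pi_prev))
rhs = ((rho_t - r) + (1.0 + r - alpha_t)) * sum(
    n * w * q for n, w, q in zip(n_prev, w_prev, pi_prev))
residual = lhs - rhs
```

The inputs are arbitrary numbers chosen to exercise the algebra, so the resulting return has no economic meaning; only the vanishing residual matters here.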
20.7.2 Time Series Plots, Statistics and Autocorrelation Results
For both momentum and contrarian trading strategies with different combinations of lag lengths (L1, L2), this appendix provides (i) time series plots for wealth (w_t, the difference of wealth proportions), population (n_t, the difference of population proportions), returns (ρ_t), and prices (p_t); (ii) numerical comparative statics for wealth (WEA), population (POP), and returns (RET); and (iii) the autocorrelation coefficients (AC) for return series with lags from 1 to 36.
Figure 20.5. Time series plots for wealth and population distributions, returns and prices when the same momentum trading strategies with different lags (L1,L2) = (3,7) are used. Here, δ = 0.6, d = 0.53, and q = 0.03
Table 20.1. Statistics of time series of wealth (WEA), population (POP) and returns (RET) for momentum trading strategies with (L1,L2) = (3,5), (3,7) and δ = 0.6, d = 0.53 and q = 0.03
Table 20.2. Statistics of time series of wealth, population and returns for momentum trading strategies with (L1,L2) = (10,14) and δ = 0.6, q = 0.03, d = 0.53 for (a), and d = 1.2 for (b)
Table 20.3. Statistics of time series of wealth, population and returns for momentum trading strategies with (L1,L2) = (10,26) and q = 0.03, and δ = 0.6, d = 0.53 for (a), and δ = 0.6, d = 1.2 for (b)
Table 20.4. Autocorrelation coefficients (AC) of returns for momentum trading strategies with (L1,L2) = (3,5), (3,7), (10,14) and (10,26). The parameters are δ = 0.6, d = 0.53 and q = 0.03 for (3,5), (3,7), (10,14) and (10,26); δ = 0.6, d = 1.2, q = 0.03 for (10,14) (a); and δ = 0.6, d = 1.2, q = 0.03 for (10,26) (a)
Table 20.5. Statistics of time series of wealth, population and returns for contrarian trading strategies with parameters δ = 0.6, d = −0.45 and q = 0.03 and lag length combinations (L1,L2) are (3,5) for (a), (3,7) for (b), (10,14) for (c) and (10,26) for (d)
Table 20.6. Autocorrelation coefficients (AC) of returns for contrarian trading strategies with parameters δ = 0.6, d = −0.45, β = 0.5 and q = 0.03 and lag length combinations (L1,L2) = (3,5), (3,7), (10,14) and (10,26)
References
1. Anderson S, de Palma A, Thisse J (1993) Discrete Choice Theory of Product Differentiation, MIT Press, Cambridge, MA
2. Arshanapali B, Coggin D, Doukas J (1998) ‘Multifactor asset pricing analysis of international value investment strategies’, Journal of Portfolio Management (4), 10–23
3. Asness C (1997) ‘The interaction of value and momentum strategies’, Financial Analysts Journal, 29–36
4. Barberis N, Shleifer A, Vishny R (1998) ‘A model of investor sentiment’, Journal of Financial Economics, 307–343
5. Brock W, Hommes C (1997) ‘A rational route to randomness’, Econometrica, 1059–1095
6. Brock W, Hommes C (1998) ‘Heterogeneous beliefs and routes to chaos in a simple asset pricing model’, Journal of Economic Dynamics and Control, 1235–1274
7. Bullard J, Duffy J (1999) ‘Using Genetic Algorithms to Model the Evolution of Heterogeneous Beliefs’, Computational Economics, 41–60
8. Capaul C, Rowley I, Sharpe W (1993) ‘International value and growth stock returns’, Financial Analysts Journal (July/Aug.), 27–36
9. Chiarella C (1992) ‘The dynamics of speculative behaviour’, Annals of Operations Research, 101–123
10. Chiarella C, Dieci R, Gardini L (2006) ‘Asset price and wealth dynamics in a financial market with heterogeneous agents’, Journal of Economic Dynamics and Control, 1755–1786
11. Chiarella C, He X (2001) ‘Asset pricing and wealth dynamics under heterogeneous expectations’, Quantitative Finance, 509–526
12. Chiarella C, He X (2002) ‘Heterogeneous beliefs, risk and learning in a simple asset pricing model’, Computational Economics, 95–132
13. Chiarella C, He X (2003) ‘Heterogeneous beliefs, risk and learning in a simple asset pricing model with a market maker’, Macroeconomic Dynamics (4), 503–536
14. Chiarella C, He X (2005) An asset pricing model with adaptive heterogeneous agents and wealth effects, pp. 269–285 in Nonlinear Dynamics and Heterogeneous Interacting Agents, Vol. 550 of Lecture Notes in Economics and Mathematical Systems, Springer
15. Daniel K, Hirshleifer D, Subrahmanyam A (1998) ‘A theory of overconfidence, self-attribution, and security market under- and over-reactions’, Journal of Finance, 1839–1885
16. Day R, Huang W (1990) ‘Bulls, bears and market sheep’, Journal of Economic Behavior and Organization, 299–329
17. Fama E, French K (1998) ‘Value versus growth: The international evidence’, Journal of Finance, 1975–1999
18. Farmer J (1999) ‘Physicists attempt to scale the ivory towers of finance’, Computing in Science and Engineering, 26–39
19. Farmer J, Lo A (1999) ‘Frontier of finance: Evolution and efficient markets’, Proceedings of the National Academy of Sciences, 9991–9992
20. Franke R, Nesemann T (1999) ‘Two destabilizing strategies may be jointly stabilizing’, Journal of Economics, 1–18
21. Frankel F, Froot K (1987) ‘Using survey data to test propositions regarding exchange rate expectations’, American Economic Review, 133–153
22. Hirshleifer D (2001) ‘Investor psychology and asset pricing’, Journal of Finance, 1533–1597
23. Hommes C (2001) ‘Financial markets as nonlinear adaptive evolutionary systems’, Quantitative Finance, 149–167
24. Hong H, Stein J (1999) ‘A unified theory of underreaction, momentum trading, and overreaction in asset markets’, Journal of Finance, 2143–2184
25. Jegadeesh N, Titman S (1993) ‘Returns to buying winners and selling losers: Implications for stock market efficiency’, Journal of Finance, 65–91
26. Jegadeesh N, Titman S (2001) ‘Profitability of momentum strategies: an evaluation of alternative explanations’, Journal of Finance, 699–720
27. Kirman A (1992) ‘Whom or what does the representative agent represent?’, Journal of Economic Perspectives, 117–136
28. LeBaron B (2000) ‘Agent based computational finance: suggested readings and early research’, Journal of Economic Dynamics and Control, 679–702
29. Lee C, Swaminathan B (2000) ‘Price momentum and trading volume’, Journal of Finance, 2017–2069
30. Levis M, Liodakis M (2001) ‘Contrarian strategies and investor expectations: the U.K. evidence’, Financial Analysts Journal (Sep./Oct.), 43–56
31. Levy M, Levy H (1996) ‘The danger of assuming homogeneous expectations’, Financial Analysts Journal (3), 65–70
32. Levy M, Levy H, Solomon S (1994) ‘A microscopic model of the stock market’, Economics Letters, 103–111
33. Levy M, Levy H, Solomon S (2000) Microscopic Simulation of Financial Markets, from investor behavior to market phenomena, Academic Press, Sydney
34. Lintner J (1965) ‘The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets’, Review of Economics and Statistics, 13–37
35. Lux T (1995) ‘Herd behaviour, bubbles and crashes’, Economic Journal, 881–896
36. Lux T, Marchesi M (1999) ‘Scaling and criticality in a stochastic multi-agent model of a financial market’, Nature (11), 498–500
37. Manski C, McFadden D (1981) Structural Analysis of Discrete Data with Econometric Applications, MIT Press
38. Merton R (1973) ‘An intertemporal capital asset pricing model’, Econometrica, 867–887
39. Moskowitz T, Grinblatt M (1999) ‘Do industries explain momentum?’, Journal of Finance, 1249–1290
40. Ross S (1976) ‘The arbitrage theory of capital asset pricing’, Journal of Economic Theory, 341–360
41. Rouwenhorst K G (1998) ‘International momentum strategies’, Journal of Finance, 267–284
42. Sharpe W (1964) ‘Capital asset prices: A theory of market equilibrium under conditions of risk’, Journal of Finance, 425–442
CHAPTER 21 Simulation Methods for Stochastic Differential Equations Eckhard Platen
21.1 Stochastic Differential Equations

As one tries to build more realistic models in finance, stochastic effects need to be taken into account. In finance, the randomness in the dynamics is in fact the essential phenomenon to be modeled. More than a hundred years ago, Bachelier used in (Bachelier 1900) what we now call Brownian motion or the Wiener process to model stock prices observed at the Paris Bourse. Einstein, in his work on Brownian motion, employed in (Einstein 1906) an equivalent construct. Wiener then developed the mathematical theory of Brownian motion in (Wiener 1923). Itô laid in (Itô 1944) the foundation of stochastic calculus, which represents a stochastic generalization of the classical differential calculus. It allows one to model in continuous time the dynamics of stock prices or other financial quantities. The corresponding stochastic differential equations (SDEs) generalize ordinary deterministic differential equations. A most striking example is given by modern derivative pricing theory. The Nobel prize winning works in (Merton 1973) and (Black and Scholes 1973) initiated an entire derivatives industry. In our competitive world it is vital to improve continuously the understanding of the underlying stochastic dynamics of financial markets, and to calculate efficiently relevant financial quantities such as derivative prices and risk or performance measures. This will be a major driving force in the development of appropriate numerical methods for the solution of SDEs in finance.

This chapter provides a very basic introduction, as well as a brief overview of the area of simulation methods for SDEs with a focus on finance. An attempt has been made to highlight key approaches and main ideas.

Let us consider a stochastic differential equation (SDE) of the form

$$dX_t = a(X_t)\,dt + b(X_t)\,dW_t \tag{21.1}$$

for $t \in [0,T]$, with initial value $X_0 \in \Re$. The stochastic process $X = \{X_t,\ 0 \le t \le T\}$ is assumed to be a unique solution of the SDE (21.1), which consists of a slowly varying component governed by the drift coefficient $a(\cdot)$ and a rapidly fluctuating random component characterized by the diffusion coefficient $b(\cdot)$. The second differential in (21.1) is an Itô stochastic differential with respect to the Wiener process $W = \{W_t,\ 0 \le t \le T\}$. Over a short period of time of length $\Delta > 0$ the increment of the solution of the SDE (21.1) is approximately of the form

$$X_{t+\Delta} - X_t \approx a(X_t)\,\Delta + b(X_t)\,(W_{t+\Delta} - W_t).$$
Here the increment $W_{t+\Delta} - W_t$ of the Wiener process $W$ is Gaussian distributed with zero mean and variance $\Delta$. Note that the Itô integral is the limit of Riemann sums where the integrand is always taken at the left hand point of the time discretization interval. For an introduction to SDEs we refer to (Øksendal 1998) or (Kloeden and Platen 1999).

Itô SDEs have natural applications to finance. In particular, the gains process of a portfolio resembles an Itô integral because the investor has to fix the holdings in a security at the beginning of an investment period. Then the self-financing portfolio value changes according to the changes in the underlying security.

The path of a Wiener process is not differentiable. Consequently, stochastic calculus differs in its properties from classical calculus. This is most obvious in the stochastic chain rule, the Itô formula, which for a twice continuously differentiable function $f$ has the form

$$df(X_t) = \left( f'(X_t)\,a(X_t) + \tfrac{1}{2}\, f''(X_t)\,b^2(X_t) \right) dt + f'(X_t)\,b(X_t)\,dW_t \tag{21.2}$$

for $0 \le t \le T$. We remark that the extra term $\tfrac{1}{2} f'' b^2$ in the drift function of the resulting SDE (21.2) is characteristic of stochastic calculus. This has a substantial impact on the dynamics of financial models and derivative prices, as well as on numerical methods for SDEs.
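As a worked illustration of the Itô formula (21.2), consider the linear coefficients $a(x) = \mu x$ and $b(x) = \sigma x$ with constants $\mu$ and $\sigma > 0$. These coefficients, and the choice $f(x) = \ln x$, are assumptions made only for this example; $f$ is twice continuously differentiable on the positive half line, where the solution stays when $X_0 > 0$.

```latex
\begin{align*}
f'(x) &= \frac{1}{x}, \qquad f''(x) = -\frac{1}{x^{2}}, \\
d\ln X_t &= \Big( \frac{1}{X_t}\,\mu X_t
            - \tfrac{1}{2}\,\frac{1}{X_t^{2}}\,\sigma^{2} X_t^{2} \Big)\,dt
            + \frac{1}{X_t}\,\sigma X_t\,dW_t \\
         &= \big( \mu - \tfrac{1}{2}\sigma^{2} \big)\,dt + \sigma\,dW_t .
\end{align*}
```

The drift of $\ln X_t$ is $\mu - \tfrac{1}{2}\sigma^2$ rather than $\mu$, which is exactly the contribution of the extra term in (21.2). Integrating gives $X_T = X_0 \exp\big((\mu - \tfrac{1}{2}\sigma^2)T + \sigma W_T\big)$.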
21.2 Approximation of SDEs

Since analytical solutions of SDEs are rare, numerical approximations have been developed. These are typically based on a time discretization with points $0 = \tau_0 < \tau_1 < \dots < \tau_n < \dots < \tau_N = T$ in the time interval $[0,T]$, using, for instance, an equal step size $\Delta = T/N$. More general time discretizations can be used. Simulation methods for SDEs have been presented in the monographs (Kloeden and Platen 1999), (Kloeden et al. 2003), (Milstein 1995), (Jäckel 2002), (Glasserman 2004) and (Platen and Heath 2006), where the last three books are related to finance.
To keep our formulae simple, we discuss in the following only the case of the above one-dimensional SDE. Discrete time approximations for SDEs with a jump component or SDEs that are driven by Lévy processes have been studied, for instance, in (Platen 1982), (Mikulevicius and Platen 1988), (Maghsoodi 1996), (Li and Liu 1997), (Kubilius and Platen 2002) and (Protter and Talay 1997). There is also a developing literature on the approximation of SDEs and their functionals involving absorption or reflection at boundaries, see (Gobet 2000) or (Hausenblas 2000). Most of the numerical methods that we mention have been generalized to multi-dimensional state variables $X_t$ and Wiener processes $W_t$ and to the case with jumps. Jumps can be covered, for instance, by discrete time approximations of SDEs if one adds to the time discretization the random jump times for the modeling of events and treats the jumps separately at these times, see (Platen 1982) and (Bruti-Liberati and Platen 2006).

The Euler approximation is the simplest example of a discrete time approximation $Y$ of an SDE. It is given by the recursive equation

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n \tag{21.3}$$

for $n \in \{0, 1, \dots, N-1\}$ with $Y_0 = X_0$. Here $\Delta W_n = W_{\tau_{n+1}} - W_{\tau_n}$ denotes the increment of the Wiener process in the time interval $[\tau_n, \tau_{n+1}]$ and is represented by an independent $N(0, \Delta)$ Gaussian random variable with mean zero and variance $\Delta$. To simulate a realization of the Euler approximation, one needs to generate the independent random variables involved. In practice, linear congruential pseudo-random number generators are often available in common software packages. An introduction to this area is given in (Fishman 1996). There also exist natural uniform random number generators that do not generate any cycles, as described in (Pivitt 1999).
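The recursion (21.3) translates almost line by line into code. The following is a minimal sketch, not a reference implementation: the function name `euler_path` and the test coefficients are hypothetical, and the Gaussian increments come from Python's standard `random` module, which uses a Mersenne Twister generator rather than a linear congruential one.

```python
import math
import random


def euler_path(a, b, x0, T, n_steps, rng):
    """Return the Euler approximation Y_0, ..., Y_N of dX = a(X) dt + b(X) dW."""
    dt = T / n_steps              # equal step size Delta = T / N
    sqrt_dt = math.sqrt(dt)
    y = x0
    path = [y]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, sqrt_dt)      # N(0, Delta) Wiener increment
        y = y + a(y) * dt + b(y) * dW     # recursion (21.3)
        path.append(y)
    return path


if __name__ == "__main__":
    rng = random.Random(42)
    mu, sigma = 0.05, 0.2         # hypothetical drift and volatility parameters
    path = euler_path(lambda x: mu * x, lambda x: sigma * x, 1.0, 1.0, 250, rng)
    print(len(path), path[-1])
```

Passing the coefficients as functions keeps the routine independent of any particular model; a seeded `random.Random` instance makes a scenario reproducible.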
21.3 Strong and Weak Convergence

The most general and also most flexible quantitative method for SDEs is the simulation of their trajectories. It is most important that, for simulations, one specifies the class of problems that one wishes to investigate before starting to apply a simulation algorithm. There are two major types of convergence to be distinguished. These are the strong convergence for simulating scenarios and the weak convergence for simulating functionals. The simulation of sample paths, as, for instance, the generation of a stock price scenario, a filter estimate for some hidden unobserved variable, the testing of a statistical estimator or a hedge simulation, requires the simulated sample path to be close to that of the solution of the original SDE and, thus, relates to strong convergence.
A discrete time approximation $Y$ of the exact solution $X$ of an SDE converges in the strong sense with order $\gamma \in (0, \infty)$ if there exists a constant $K < \infty$ such that

$$E\,|X_T - Y_N| \le K\,\Delta^{\gamma} \tag{21.4}$$

for all step sizes $\Delta \in (0,1)$. Much computational effort is still being wasted, in practice, on many simulations by overlooking that in a large variety of problems in finance a pathwise approximation of the solution $X$ of an SDE is not required. If one aims to compute, for instance, a moment, a probability, an expected value of a future cash flow, an option price, a risk measure or, in general, a functional of the form $E(g(X_T))$, then only a weak approximation is required. A discrete time approximation $Y$ of a solution $X$ of an SDE converges in the weak sense with order $\beta \in (0, \infty]$ if for any polynomial $g$ there exists a constant $K_g < \infty$ such that

$$|E(g(X_T)) - E(g(Y_N))| \le K_g\,\Delta^{\beta} \tag{21.5}$$
for all step sizes $\Delta \in (0,1)$. Numerical methods which can be constructed with respect to this weak convergence criterion are much easier to implement than those required by the strong convergence criterion (21.4).

The key to the construction of most higher order numerical approximations is a Taylor type expansion for SDEs, which is known as the Wagner-Platen expansion, see (Platen and Wagner 1982). It is obtained by iterated applications of the Itô formula (21.2) to the integrands in the integral version of the SDE (21.1). For example, one obtains the expansion

$$X_{t_{n+1}} = X_{t_n} + a(X_{t_n})\,\Delta + b(X_{t_n})\,I_{(1)} + b(X_{t_n})\,b'(X_{t_n})\,I_{(1,1)} + R_{t_0,t}, \tag{21.6}$$

where $R_{t_0,t}$ represents some remainder term consisting of higher order multiple stochastic integrals. Multiple stochastic integrals of the type

$$\begin{aligned}
I_{(1)} &= \Delta W_n = \int_{\tau_n}^{\tau_{n+1}} dW_s, \\
I_{(1,1)} &= \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_2} dW_{s_1}\,dW_{s_2} = \tfrac{1}{2}\left( (I_{(1)})^2 - \Delta \right), \\
I_{(0,1)} &= \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_2} ds_1\,dW_{s_2}, \qquad
I_{(1,0)} = \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_2} dW_{s_1}\,ds_2, \\
I_{(1,1,1)} &= \int_{\tau_n}^{\tau_{n+1}} \int_{\tau_n}^{s_3} \int_{\tau_n}^{s_2} dW_{s_1}\,dW_{s_2}\,dW_{s_3}
\end{aligned} \tag{21.7}$$

on the interval $[\tau_n, \tau_{n+1}]$ form the random building blocks in a Wagner-Platen expansion. Algebraic relationships exist between multiple stochastic integrals, which have been described, for instance, in (Platen and Wagner 1982) and (Kloeden and Platen 1999). The above type of stochastic Taylor expansion can be applied in many ways in finance, for instance, in the calculation of sensitivities for derivative prices.
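The two criteria (21.4) and (21.5) can be compared numerically when an explicit solution is available. The sketch below is an illustration under stated assumptions: it uses the linear SDE $dX_t = \mu X_t\,dt + \sigma X_t\,dW_t$, whose solution $X_T = X_0 \exp((\mu - \tfrac12\sigma^2)T + \sigma W_T)$ can be evaluated on the same Wiener path as the Euler approximation, and takes $g(x) = x$ so that $E(g(X_T)) = X_0 e^{\mu T}$. The function name and parameter values are hypothetical.

```python
import math
import random


def euler_gbm_errors(mu, sigma, x0, T, n_steps, n_paths, seed=0):
    """Estimate the strong error E|X_T - Y_N| and the weak error
    |E(X_T) - E(Y_N)| of the Euler scheme for dX = mu X dt + sigma X dW."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    abs_err_sum = 0.0
    y_sum = 0.0
    for _ in range(n_paths):
        y, w = x0, 0.0
        for _ in range(n_steps):
            dW = rng.gauss(0.0, sqrt_dt)
            y += mu * y * dt + sigma * y * dW   # Euler step
            w += dW                             # accumulate W_T on the same path
        x_exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * w)
        abs_err_sum += abs(x_exact - y)
        y_sum += y
    strong = abs_err_sum / n_paths
    weak = abs(x0 * math.exp(mu * T) - y_sum / n_paths)
    return strong, weak


if __name__ == "__main__":
    for n_steps in (4, 16, 64):
        strong, weak = euler_gbm_errors(0.5, 0.5, 1.0, 1.0, n_steps, 2000)
        print(n_steps, strong, weak)
```

The strong error estimate is a sample average of pathwise distances, while the weak error estimate compares two means and is therefore also affected by the statistical error of the sample average, so it is noisy for moderate numbers of paths.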
21.4 Strong Approximation Methods

If from the Wagner-Platen expansion (21.6) we select only the first three terms, then we obtain the Euler approximation (21.3), which, in general, has strong order $\gamma = 0.5$. By taking one additional term in the expansion (21.6) we obtain the Milstein scheme

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + b(Y_n)\,b'(Y_n)\,I_{(1,1)},$$

proposed in (Milstein 1974), where the double Itô integral $I_{(1,1)}$ is given in (21.7). In general, this scheme has strong order $\gamma = 1.0$. For multi-dimensional driving Wiener processes, the double stochastic integrals appearing in the Milstein scheme have to be generated separately, involving stochastic area integrals, or should be approximated, unless the drift and diffusion coefficients fulfil a commutativity condition, see (Milstein 1995), (Kloeden and Platen 1999) and (Gaines and Lyons 1994). In (Platen and Wagner 1982) it has been described which terms of the expansion have to be chosen to obtain any desired higher strong order of convergence. The strong Taylor approximation of order $\gamma = 1.5$ has the form

$$\begin{aligned}
Y_{n+1} = {} & Y_n + a\,\Delta + b\,\Delta W_n + b\,b'\,I_{(1,1)} + b\,a'\,I_{(1,0)} + \left( a\,a' + \tfrac{1}{2}\,b^2 a'' \right) \frac{\Delta^2}{2} \\
& + \left( a\,b' + \tfrac{1}{2}\,b^2 b'' \right) I_{(0,1)} + b \left( b\,b'' + (b')^2 \right) I_{(1,1,1)},
\end{aligned}$$

where we suppress the dependence of the coefficients on $Y_n$ and use the multiple stochastic integrals mentioned in (21.7). For higher order schemes one requires, in general, adequate smoothness of the drift and diffusion coefficients, but also sufficient information about the driving Wiener processes, which is contained in the multiple stochastic integrals.

A disadvantage of higher order strong Taylor approximations is the fact that derivatives of the drift and diffusion coefficients have to be calculated at each step. As a first attempt at avoiding derivatives in the Milstein scheme, one can use the following $\gamma = 1.0$ strong order method

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + \left( b(\hat{Y}_n) - b(Y_n) \right) \frac{1}{2\sqrt{\Delta}} \left( (\Delta W_n)^2 - \Delta \right),$$

with $\hat{Y}_n = Y_n + b(Y_n)\sqrt{\Delta}$. In (Burrage et al. 1997) a $\gamma = 1.0$ strong order two-stage Runge-Kutta method has been developed, where

$$Y_{n+1} = Y_n + \left( \bar{a}(Y_n) + 3\,\bar{a}(\bar{Y}_n) \right) \frac{\Delta}{4} + \left( b(Y_n) + 3\,b(\bar{Y}_n) \right) \frac{\Delta W_n}{4} \tag{21.8}$$

with $\bar{Y}_n = Y_n + \tfrac{2}{3} \left( \bar{a}(Y_n)\,\Delta + b(Y_n)\,\Delta W_n \right)$, using the function

$$\bar{a}(x) = a(x) - \tfrac{1}{2}\,b(x)\,b'(x). \tag{21.9}$$
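One step of the Milstein scheme and of its derivative-free variant can be sketched as follows, using the identity $I_{(1,1)} = \tfrac12((\Delta W_n)^2 - \Delta)$ from (21.7). The function names are hypothetical, and the derivative-free step assumes the supporting value $\hat Y_n = Y_n + b(Y_n)\sqrt{\Delta}$, so that the difference quotient $(b(\hat Y_n) - b(Y_n))/\sqrt{\Delta}$ stands in for $b(Y_n)\,b'(Y_n)$.

```python
import math
import random


def milstein_step(y, a, b, db, dt, dW):
    """One Milstein step; db is the derivative b' of the diffusion coefficient."""
    return y + a(y) * dt + b(y) * dW + 0.5 * b(y) * db(y) * (dW * dW - dt)


def derivative_free_step(y, a, b, dt, dW):
    """Strong order 1.0 step with b b' replaced by a finite difference in b."""
    sqrt_dt = math.sqrt(dt)
    y_hat = y + b(y) * sqrt_dt                       # supporting value
    return (y + a(y) * dt + b(y) * dW
            + (b(y_hat) - b(y)) * (dW * dW - dt) / (2.0 * sqrt_dt))


if __name__ == "__main__":
    mu, sigma = 0.05, 0.4        # hypothetical linear test coefficients
    a = lambda x: mu * x
    b = lambda x: sigma * x
    db = lambda x: sigma
    rng = random.Random(1)
    dt = 0.01
    y1 = y2 = 1.0
    for _ in range(100):
        dW = rng.gauss(0.0, math.sqrt(dt))
        y1 = milstein_step(y1, a, b, db, dt, dW)
        y2 = derivative_free_step(y2, a, b, dt, dW)
    print(y1, y2)
```

For a diffusion coefficient that is linear in $x$ the finite difference reproduces $b\,b'$ exactly, so the two steps coincide up to rounding; for nonlinear $b$ they differ by higher order terms.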
The advantage of the method (21.8) is that the principal leading error has been minimized within a class of one stage first order Runge-Kutta methods.

In the context of simulation of SDEs it must be emphasized that before any properties of higher order convergence can be exploited the question of numerical stability of a scheme has to be satisfactorily answered, see (Kloeden and Platen 1999). Many problems in financial risk management turn out to be multi-dimensional. Systems of SDEs with extremely different time scales easily arise in modeling, which can cause numerical instabilities for most explicit methods. Implicit methods can resolve some of these problems. We mention here a family of drift implicit Milstein schemes, described in (Milstein 1995), where

$$Y_{n+1} = Y_n + \left\{ \alpha\,\bar{a}(Y_{n+1}) + (1 - \alpha)\,\bar{a}(Y_n) \right\} \Delta + b(Y_n)\,\Delta W_n + b(Y_n)\,b'(Y_n)\,\frac{(\Delta W_n)^2}{2}.$$

Here $\alpha \in [0,1]$ represents the degree of implicitness and $\bar{a}$ is given in (21.9). This scheme is of strong order $\gamma = 1.0$.

In a strong scheme it is almost impossible to construct implicit expressions for the noise terms. However, in finance the diffusion terms are typically of multiplicative nature and can cause numerical instability. In (Milstein et al. 1998) one overcomes part of the problem by suggesting the family of balanced implicit methods given in the form

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + (Y_n - Y_{n+1})\,C_n,$$

where $C_n = c^0(Y_n)\,\Delta + c^1(Y_n)\,|\Delta W_n|$ and $c^0$, $c^1$ represent positive real valued uniformly bounded functions. This method is of strong order $\gamma = 0.5$. It demonstrates excellent numerical stability properties for a number of applications in finance and filtering, see (Fischer and Platen 1999).
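Because $Y_{n+1}$ enters the balanced implicit method linearly, the step can be solved in closed form: collecting terms gives $Y_{n+1} = Y_n + (a(Y_n)\Delta + b(Y_n)\Delta W_n)/(1 + C_n)$. The sketch below assumes $C_n = c^0(Y_n)\Delta + c^1(Y_n)|\Delta W_n|$ with the absolute value of the increment; the function name and the weight functions chosen in the demo are hypothetical.

```python
import math
import random


def balanced_step(y, a, b, c0, c1, dt, dW):
    """One balanced implicit step, solved explicitly for Y_{n+1}."""
    c_n = c0(y) * dt + c1(y) * abs(dW)       # C_n >= 0 for nonnegative c0, c1
    return y + (a(y) * dt + b(y) * dW) / (1.0 + c_n)


if __name__ == "__main__":
    mu, sigma = 0.1, 2.0                     # strongly multiplicative noise
    a = lambda x: mu * x
    b = lambda x: sigma * x
    c0 = lambda x: mu                        # hypothetical constant weights
    c1 = lambda x: sigma
    rng = random.Random(2)
    dt = 0.05
    y_bal = y_eul = 1.0
    min_bal = min_eul = 1.0
    for _ in range(200):
        dW = rng.gauss(0.0, math.sqrt(dt))
        y_bal = balanced_step(y_bal, a, b, c0, c1, dt, dW)
        y_eul = y_eul + a(y_eul) * dt + b(y_eul) * dW    # explicit Euler
        min_bal = min(min_bal, y_bal)
        min_eul = min(min_eul, y_eul)
    print("smallest balanced value:", min_bal)
    print("smallest Euler value:   ", min_eul)
```

With these weights each balanced step multiplies $Y_n$ by $(1 + 2\mu\Delta + \sigma|\Delta W_n| + \sigma\Delta W_n)/(1 + \mu\Delta + \sigma|\Delta W_n|)$, which is positive for every increment, whereas the explicit Euler factor $1 + \mu\Delta + \sigma\Delta W_n$ can become negative for large negative increments.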
21.5 Weak Approximation Methods

For many applications in finance, in particular, for Monte Carlo simulation of derivative prices, it is not necessary to use strong approximations. Under the weak convergence criterion (21.5) the random increments $\Delta W_n$ of the Wiener process can be replaced by simpler random variables $\Delta \tilde{W}_n$. By substituting the $N(0, \Delta)$ Gaussian distributed random variable $\Delta W_n$ in the Euler approximation (21.3) by an independent two-point distributed random variable $\Delta \tilde{W}_n$ with $P(\Delta \tilde{W}_n = \pm\sqrt{\Delta}) = 0.5$ we obtain the simplified Euler method

$$Y_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta \tilde{W}_n, \tag{21.10}$$

which converges with weak order $\beta = 1.0$. The Euler approximation (21.3) can be interpreted as the order 1.0 weak Taylor scheme.
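A minimal sketch of the simplified Euler method follows. It assumes two-point increments taking the values $+\sqrt{\Delta}$ and $-\sqrt{\Delta}$ with probability 0.5 each, and offers Gaussian increments as an alternative for comparison. The function name and the test coefficients are hypothetical.

```python
import math
import random


def weak_euler_mean(a, b, g, x0, T, n_steps, n_paths, rng, two_point=True):
    """Monte Carlo estimate of E(g(Y_N)) for the Euler recursion.

    two_point=True uses increments +sqrt(dt) or -sqrt(dt) with probability
    0.5 each; two_point=False uses Gaussian N(0, dt) increments instead.
    """
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        y = x0
        for _ in range(n_steps):
            if two_point:
                dW = sqrt_dt if rng.random() < 0.5 else -sqrt_dt
            else:
                dW = rng.gauss(0.0, sqrt_dt)
            y += a(y) * dt + b(y) * dW
        total += g(y)
    return total / n_paths


if __name__ == "__main__":
    mu, sigma = 0.05, 0.2        # hypothetical test coefficients
    for flag in (True, False):
        est = weak_euler_mean(lambda x: mu * x, lambda x: sigma * x,
                              lambda x: x, 1.0, 1.0, 20, 4000,
                              random.Random(1), two_point=flag)
        print("two-point" if flag else "Gaussian", est, "exact", math.exp(mu))
```

The two-point variable needs only one uniform draw and a comparison per step, which is the practical appeal of the weak criterion: the individual paths are no longer close to Wiener-driven paths, but the estimated expectation converges with the same weak order.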
One can select additional terms from a Wagner-Platen expansion to obtain weak Taylor schemes of higher order. The order $\beta = 2.0$ weak Taylor scheme has the form

$$\begin{aligned}
Y_{n+1} = {} & Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta W_n + b(Y_n)\,b'(Y_n)\,I_{(1,1)} + b(Y_n)\,a'(Y_n)\,I_{(1,0)} \\
& + \left( a(Y_n)\,b'(Y_n) + \tfrac{1}{2}\,b^2(Y_n)\,b''(Y_n) \right) I_{(0,1)} + \left( a(Y_n)\,a'(Y_n) + \tfrac{1}{2}\,b^2(Y_n)\,a''(Y_n) \right) \frac{\Delta^2}{2},
\end{aligned} \tag{21.11}$$

which was first proposed in (Milstein 1978). We still obtain a scheme of weak order $\beta = 2.0$ if we replace the random variable $\Delta W_n$ in (21.11) by $\Delta \hat{W}_n$, the double integrals $I_{(1,0)}$ and $I_{(0,1)}$ by $\tfrac{\Delta}{2}\,\Delta \hat{W}_n$ and the double Wiener integral $I_{(1,1)}$ by $\tfrac{1}{2}\left( (\Delta \hat{W}_n)^2 - \Delta \right)$. Here $\Delta \hat{W}_n$ can be a three-point distributed random variable with

$$P\left( \Delta \hat{W}_n = \pm\sqrt{3\Delta} \right) = \frac{1}{6} \quad \text{and} \quad P\left( \Delta \hat{W}_n = 0 \right) = \frac{2}{3}. \tag{21.12}$$
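A short calculation shows why the three-point variable in (21.12) is an adequate substitute for a Gaussian increment in a weak second order scheme: its moments agree with those of an $N(0, \Delta)$ random variable up to order five. The comparison below is a worked check added for illustration; it writes $\xi$ for the three-point variable.

```latex
\begin{align*}
E(\xi) &= E(\xi^{3}) = E(\xi^{5}) = 0 \quad \text{(by symmetry)}, \\
E(\xi^{2}) &= 2 \cdot \tfrac{1}{6} \cdot 3\Delta = \Delta, \\
E(\xi^{4}) &= 2 \cdot \tfrac{1}{6} \cdot 9\Delta^{2} = 3\Delta^{2}, \\
E(\xi^{6}) &= 2 \cdot \tfrac{1}{6} \cdot 27\Delta^{3} = 9\Delta^{3}
            \;\neq\; 15\Delta^{3} = E\big((\Delta W_n)^{6}\big).
\end{align*}
```

The first mismatch appears at the sixth moment. By the same calculation a two-point variable taking the values $\pm\sqrt{\Delta}$ with probability 0.5 each has fourth moment $\Delta^2$ instead of $3\Delta^2$, so it matches the Gaussian moments only up to order three.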
As in the case of strong approximations, higher order weak schemes can be constructed only if there is adequate smoothness of the drift and diffusion coefficients and a sufficiently rich set of random variables involved for approximating the multiple stochastic integrals. A weak second order Runge-Kutta approximation, see (Kloeden and Platen 1999), that avoids derivatives in $a$ and $b$ is given by the algorithm

$$\begin{aligned}
Y_{n+1} = {} & Y_n + \left( a(\hat{Y}_{n+1}) + a(Y_n) \right) \frac{\Delta}{2} + \left( b(\hat{Y}^{+}_{n+1}) + b(\hat{Y}^{-}_{n+1}) + 2\,b(Y_n) \right) \frac{\Delta \hat{W}_n}{4} \\
& + \left( b(\hat{Y}^{+}_{n+1}) - b(\hat{Y}^{-}_{n+1}) \right) \frac{(\Delta \hat{W}_n)^2 - \Delta}{4\sqrt{\Delta}}
\end{aligned} \tag{21.13}$$

with $\hat{Y}_{n+1} = Y_n + a(Y_n)\,\Delta + b(Y_n)\,\Delta \hat{W}_n$ and $\hat{Y}^{\pm}_{n+1} = Y_n + a(Y_n)\,\Delta \pm b(Y_n)\sqrt{\Delta}$, where $\Delta \hat{W}_n$ can be chosen as in (21.12).

Implicit methods have better numerical stability properties than most explicit schemes, and turn out to be better suited to simulation tasks with potential stability problems. The following family of weak implicit Euler schemes, converging with weak order $\beta = 1.0$, can be found in (Kloeden and Platen 1999):

$$Y_{n+1} = Y_n + \left\{ \xi\,a_\eta(Y_{n+1}) + (1 - \xi)\,a_\eta(Y_n) \right\} \Delta + \left\{ \eta\,b(Y_{n+1}) + (1 - \eta)\,b(Y_n) \right\} \Delta \tilde{W}_n. \tag{21.14}$$

Here $\Delta \tilde{W}_n$ can be chosen as in (21.10), and we have to set $a_\eta(y) = a(y) - \eta\,b(y)\,b'(y)$ for $\xi \in [0,1]$ and $\eta \in [0,1]$. For $\xi = 1$ and $\eta = 0$, (21.14) leads to the drift implicit Euler scheme. The choice $\xi = \eta = 0$ provides the simplified Euler scheme, whereas for $\xi = \eta = 1$ we have the fully implicit Euler scheme.
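The stabilizing effect of drift implicitness can be seen on a stiff linear example. The sketch below assumes the SDE $dX_t = -\lambda X_t\,dt + \sigma\,dW_t$ with additive noise, which is not taken from the text. For this drift, and with $\eta = 0$, the implicit equation in (21.14) is linear in $Y_{n+1}$ and can be solved by hand: $Y_{n+1} = \big((1 - (1-\xi)\lambda\Delta)\,Y_n + \sigma\,\Delta\tilde W_n\big)/(1 + \xi\lambda\Delta)$. The function name and parameter values are hypothetical.

```python
import math
import random


def implicit_euler_linear(y0, lam, sigma, dt, increments, xi):
    """Family (21.14) with eta = 0 for dX = -lam X dt + sigma dW.

    xi = 0 gives the explicit scheme, xi = 1 the drift implicit Euler scheme.
    """
    y = y0
    for dW in increments:
        y = ((1.0 - (1.0 - xi) * lam * dt) * y + sigma * dW) / (1.0 + xi * lam * dt)
    return y


if __name__ == "__main__":
    rng = random.Random(5)
    lam, sigma, dt, n_steps = 50.0, 0.1, 0.1, 20     # lam * dt = 5: a stiff setting
    sqrt_dt = math.sqrt(dt)
    # two-point increments as in the simplified Euler method
    incs = [sqrt_dt if rng.random() < 0.5 else -sqrt_dt for _ in range(n_steps)]
    print("explicit (xi = 0):      ", implicit_euler_linear(1.0, lam, sigma, dt, incs, 0.0))
    print("drift implicit (xi = 1):", implicit_euler_linear(1.0, lam, sigma, dt, incs, 1.0))
```

With $\lambda\Delta = 5$ the explicit recursion multiplies $Y_n$ by $-4$ at every step and grows without bound, while the drift implicit recursion multiplies it by $1/6$ and remains bounded for any step size.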
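A family of implicit weak order 2.0 schemes has been proposed in (Milstein 1995) with

$$\begin{aligned}
Y_{n+1} = {} & Y_n + \left\{ \xi\,a(Y_{n+1}) + (1 - \xi)\,a(Y_n) \right\} \Delta + b(Y_n)\,b'(Y_n) \left( (\Delta \hat{W}_n)^2 - \Delta \right) / 2 \\
& + \left\{ b(Y_n) + \tfrac{1}{2} \left( b'(Y_n) + (1 - 2\xi)\,a'(Y_n) \right) \Delta \right\} \Delta \hat{W}_n \\
& + (1 - 2\xi) \left\{ \beta\,a'(Y_n) + (1 - \beta)\,a'(Y_{n+1}) \right\} \frac{\Delta^2}{2},
\end{aligned}$$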
where $\Delta \hat{W}_n$ is chosen as in (21.12). For an implicit method one usually has to solve an algebraic equation at each time step. A weak second order predictor-corrector method, see (Platen 1995), avoids such a calculation and is given by the corrector

$$Y_{n+1} = Y_n + \frac{\Delta}{2} \left( a(\hat{Y}_{n+1}) + a(Y_n) \right) + \psi_n$$

with

$$\psi_n = \left\{ b(Y_n) + \tfrac{1}{2} \left( a(Y_n)\,b'(Y_n) + \tfrac{1}{2}\,b^2(Y_n)\,b''(Y_n) \right) \Delta \right\} \Delta \hat{W}_n + b(Y_n)\,b'(Y_n) \left( (\Delta \hat{W}_n)^2 - \Delta \right) / 2$$

and the predictor

$$\hat{Y}_{n+1} = Y_n + a(Y_n)\,\Delta + \psi_n + \left( a(Y_n)\,a'(Y_n) + \tfrac{1}{2}\,b^2(Y_n)\,a''(Y_n) \right) \frac{\Delta^2}{2} + b(Y_n)\,a'(Y_n)\,\Delta \hat{W}_n\,\Delta / 2,$$
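The predictor-corrector method needs only function evaluations and no equation solving, which the following sketch makes explicit. It is a transcription under stated assumptions: $\psi_n$ is taken as $\{b + \tfrac12(a b' + \tfrac12 b^2 b'')\Delta\}\Delta\hat W_n + \tfrac12 b b'((\Delta\hat W_n)^2 - \Delta)$, the increment is the three-point variable with values $\pm\sqrt{3\Delta}$ (probability 1/6 each) and 0 (probability 2/3), and all function and argument names are hypothetical.

```python
import math
import random


def three_point_increment(dt, rng):
    """Sample the three-point random variable used in weak second order schemes."""
    u = rng.random()
    if u < 1.0 / 6.0:
        return math.sqrt(3.0 * dt)
    if u < 1.0 / 3.0:
        return -math.sqrt(3.0 * dt)
    return 0.0


def predictor_corrector_step(y, a, da, d2a, b, db, d2b, dt, dW):
    """One weak order 2.0 predictor-corrector step (derivatives passed explicitly)."""
    psi = ((b(y) + 0.5 * (a(y) * db(y) + 0.5 * b(y) ** 2 * d2b(y)) * dt) * dW
           + 0.5 * b(y) * db(y) * (dW * dW - dt))
    y_pred = (y + a(y) * dt + psi
              + 0.5 * (a(y) * da(y) + 0.5 * b(y) ** 2 * d2a(y)) * dt * dt
              + 0.5 * b(y) * da(y) * dW * dt)                 # predictor
    return y + 0.5 * dt * (a(y_pred) + a(y)) + psi            # corrector


if __name__ == "__main__":
    mu, sigma = 0.05, 0.2        # hypothetical linear test coefficients
    a, da, d2a = (lambda x: mu * x), (lambda x: mu), (lambda x: 0.0)
    b, db, d2b = (lambda x: sigma * x), (lambda x: sigma), (lambda x: 0.0)
    rng = random.Random(4)
    n_steps, n_paths, T = 10, 4000, 1.0
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        y = 1.0
        for _ in range(n_steps):
            y = predictor_corrector_step(y, a, da, d2a, b, db, d2b, dt,
                                         three_point_increment(dt, rng))
        total += y
    print(total / n_paths, "exact", math.exp(mu * T))
```

The drift is evaluated once at the predicted value and once at the current value, which mimics the trapezoidal rule of deterministic numerical analysis while leaving the noise terms explicit.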
where $\Delta \hat{W}_n$ is as in (21.12).

For the weak second order approximation of the functional $E(g(X_T))$ a Richardson extrapolation of the form

$$V_{g,2}^{\Delta}(T) = 2\,E\left( g\left( Y^{\Delta}(T) \right) \right) - E\left( g\left( Y^{2\Delta}(T) \right) \right) \tag{21.15}$$

was proposed in (Talay and Tubaro 1990), where $Y^{\delta}(T)$ denotes the value at time $T$ of an Euler approximation with step size $\delta$. Using Euler approximations with step sizes $\delta = \Delta$ and $\delta = 2\Delta$, and then taking the difference (21.15) of their respective functionals, the leading error coefficient cancels out, and $V_{g,2}^{\Delta}(T)$ ends up being a weak second order approximation. Further weak higher order extrapolations can be found in (Kloeden and Platen 1999). For instance, one obtains a weak fourth order extrapolation method using

$$V_{g,4}^{\Delta} = \frac{1}{21} \left[ 32\,E\left( g\left( Y^{\Delta}(T) \right) \right) - 12\,E\left( g\left( Y^{2\Delta}(T) \right) \right) + E\left( g\left( Y^{4\Delta}(T) \right) \right) \right],$$

where $Y^{\delta}(T)$ is the value at time $T$ of the weak second order Runge-Kutta scheme (21.13) with step size $\delta$. Extrapolation methods can only be successfully applied if there is a wide range of time step sizes where the underlying discrete time approximation works efficiently.
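The cancellation of the leading error term in (21.15) can be checked without any sampling in one special case. The sketch below assumes the linear SDE $dX_t = \mu X_t\,dt + \sigma X_t\,dW_t$ and $g(x) = x$, neither of which is prescribed by the text. Taking expectations in the Euler recursion gives $E(Y_{n+1}) = (1 + \mu\delta)\,E(Y_n)$, so $E(Y^{\delta}(T)) = X_0 (1 + \mu\delta)^{T/\delta}$ exactly, and the exact value is $E(X_T) = X_0 e^{\mu T}$. In a real application the two expectations would be Monte Carlo estimates.

```python
import math


def euler_mean_linear(mu, x0, T, dt):
    """Exact E(Y^dt(T)) of the Euler scheme for the linear SDE with g(x) = x."""
    n_steps = round(T / dt)
    return x0 * (1.0 + mu * dt) ** n_steps


def richardson(mu, x0, T, dt):
    """Extrapolation (21.15): 2 E(g(Y^dt(T))) - E(g(Y^{2 dt}(T)))."""
    return (2.0 * euler_mean_linear(mu, x0, T, dt)
            - euler_mean_linear(mu, x0, T, 2.0 * dt))


if __name__ == "__main__":
    mu, x0, T = 1.0, 1.0, 1.0                # hypothetical parameters
    exact = x0 * math.exp(mu * T)
    for dt in (0.1, 0.05, 0.025):
        err_euler = abs(exact - euler_mean_linear(mu, x0, T, dt))
        err_extra = abs(exact - richardson(mu, x0, T, dt))
        print(dt, err_euler, err_extra)
```

Halving the step size roughly halves the Euler error and roughly quarters the extrapolated error, in line with weak orders 1.0 and 2.0.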
21.6 Monte Carlo Simulation for SDEs

By a discrete time weak approximation $Y$ one can form a straightforward Monte Carlo estimate for a functional of the form $u = E(g(X_T))$, see (Boyle 1977) and (Kloeden and Platen 1999), using the sample average

$$u_{N,\Delta} = \frac{1}{N} \sum_{k=1}^{N} g\left( Y_T(\omega_k) \right),$$

with $N$ independent simulated realizations $Y_T(\omega_1), Y_T(\omega_2), \dots, Y_T(\omega_N)$. Monte Carlo methods are very general and work in almost all circumstances. However, the computational efficiency is often limited. For high-dimensional functionals it is sometimes the only method of obtaining a result. One pays for this generality by the large sample sizes required to achieve reasonably accurate estimates. To see this we can introduce the mean error $\hat{\mu}$ in the form

$$\hat{\mu} = u_{N,\Delta} - E(g(X_T)).$$

We can decompose $\hat{\mu}$ into a systematic error $\mu_{sys}$ and a statistical error $\mu_{stat}$, such that

$$\hat{\mu} = \mu_{sys} + \mu_{stat}, \tag{21.16}$$

where

$$\begin{aligned}
\mu_{sys} &= E(\hat{\mu}) \\
&= E\left( \frac{1}{N} \sum_{k=1}^{N} g\left( Y_T(\omega_k) \right) \right) - E(g(X_T)) \\
&= E(g(Y_T)) - E(g(X_T)).
\end{aligned} \tag{21.17}$$
For a large number $N$ of simulated independent sample paths of $Y$ the statistical error $\mu_{stat}$ becomes, by the Central Limit Theorem, asymptotically Gaussian with mean zero and variance of the form

$$\mathrm{Var}(\mu_{stat}) = \mathrm{Var}(\hat{\mu}) = \frac{1}{N}\,\mathrm{Var}(g(Y_T)). \tag{21.18}$$

This shows a significant disadvantage of Monte Carlo methods, because the deviation

$$\mathrm{Dev}(\mu_{stat}) = \sqrt{\mathrm{Var}(\mu_{stat})} = \frac{1}{\sqrt{N}} \sqrt{\mathrm{Var}(g(Y_T))} \tag{21.19}$$

decreases at only the slow rate $1/\sqrt{N}$ as $N \to \infty$, which determines the length of corresponding confidence intervals.
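The slow decay in (21.19) is easy to observe in a small experiment. The sketch below estimates both the sample average and its deviation from the same sample, replacing the unknown variance by the sample variance; the function name and the lognormal stand-in for $g(Y_T)$ are hypothetical.

```python
import math
import random


def monte_carlo(sample, n_paths, rng):
    """Return the sample average and its estimated deviation, as in (21.19)."""
    values = [sample(rng) for _ in range(n_paths)]
    mean = sum(values) / n_paths
    var = sum((v - mean) ** 2 for v in values) / (n_paths - 1)   # sample variance
    return mean, math.sqrt(var / n_paths)


if __name__ == "__main__":
    # Hypothetical stand-in for g(Y_T): a lognormal terminal value with mean one.
    def sample(rng):
        return math.exp(-0.02 + 0.2 * rng.gauss(0.0, 1.0))

    rng = random.Random(3)
    for n in (100, 400, 1600, 6400):
        mean, dev = monte_carlo(sample, n, rng)
        # 1.96 * dev is the half-width of an approximate 95% confidence interval
        print(n, round(mean, 4), "+/-", round(1.96 * dev, 4))
```

Each fourfold increase of the sample size only halves the half-width of the confidence interval, so one additional decimal digit of accuracy costs roughly a hundred times more paths.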
21.7 Variance Reduction

One can increase the efficiency in Monte Carlo simulation for SDEs considerably by using variance reduction techniques. These primarily reduce the variance of the random variable actually simulated. The method of antithetic variates, see (Law and Kelton 1991), repeatedly uses the random variables originally generated in some symmetric pattern to construct sample paths that offset each other's noise to some extent. Stratified sampling is another technique that has been widely used, see (Fournie et al. 1997). It divides the whole sample space into disjoint sets of events, and an unbiased estimator is constructed that is conditional on the occurrence of these events. The measure transformation method, see (Milstein 1995), (Kloeden and Platen 1999) and (Hofmann et al. 1992), introduces a new probability measure via a Girsanov transformation and can achieve considerable variance reductions. An efficient technique that uses integral representation has been suggested in (Heath and Platen 2002). Very useful is the control variate technique, which is based on the selection of a random variable $Y$ with known mean $E(Y)$ that allows one to construct an unbiased estimator with smaller variance, see (Newton 1994) and (Heath and Platen 1996).

The control variate technique is a very flexible and powerful variance reduction method. Let $X = \{X_t = (X_t^1, \dots, X_t^d)^T,\ t \in [0,T]\}$ be the solution of a $d$-dimensional SDE with initial value $x = (x^1, \dots, x^d)^T \in \Re^d$. We wish to compute the following expected value

$$u = E(H(X_T)), \tag{21.20}$$

where $H : \Re^d \to \Re$ is some payoff function. Our aim will be to find a control variate by exploiting the particular structure of the underlying SDE. This will enable us to obtain an accurate and fast estimate of $E(H(X_T))$. The control variate technique aims to find a suitable random variable $Y$ with known mean $E(Y)$. It then estimates $E(H(X_T))$ by computing the expected value of

$$Z = H(X_T) - \alpha\,(Y - E(Y))$$

for some suitable choice of $\alpha \in \Re$. The parameter $\alpha$ can be chosen to minimize the variance of $Z$. Because

$$E(Z) = E(H(X_T)),$$

the average

$$u_N = \frac{1}{N} \sum_{i=1}^{N} Z(\omega_i)$$

is an unbiased estimator for $E(H(X_T))$. One assumes that both $H(X_T)$ and $Y$ can be evaluated for any realization $\omega \in \Omega$. We call the random variable $Y$ a control variate for the estimation of $E(H(X_T))$.

As an example, we consider a stochastic volatility model with a vector process $X = \{X_t = (S_t, \sigma_t)^T,\ t \in [0,T]\}$ that satisfies the two-dimensional SDE

$$\begin{aligned}
dS_t &= \sigma_t\,S_t\,dW_t^1 \\
d\sigma_t &= \gamma\,(\kappa - \sigma_t)\,dt + \xi\,\sigma_t\,dW_t^2
\end{aligned} \tag{21.21}$$
for $t \in [0,T]$ with initial values $S_0 = s > 0$, $\sigma_0 = \sigma > 0$ and $\gamma, \kappa, \xi > 0$. Here the short rate is set to zero and the pricing is performed under an equivalent risk neutral probability measure. We take $W^1$ and $W^2$ to be two independent standard Wiener processes. Stochastic volatility models of similar type have been widely used in the literature, see (Stein and Stein 1991), (Heston 1993) or (Heath et al. 2001). Let us consider the payoff of a European call option

$$H(X_T) = (S_T - K)^{+},$$

where $K > 0$ is the strike and $T$ is the maturity. Now, $\hat{X} = \{\hat{X}_t = (\hat{S}_t, \hat{\sigma}_t)^T,\ t \in [0,T]\}$ is an adjusted asset price process with deterministic volatility, which evolves according to the system of SDEs

$$\begin{aligned}
d\hat{S}_t &= \hat{\sigma}_t\,\hat{S}_t\,dW_t^1 \\
d\hat{\sigma}_t &= \gamma\,(\kappa - \hat{\sigma}_t)\,dt
\end{aligned} \tag{21.22}$$

for $t \in [0,T]$ with the same initial values $\hat{S}_0 = s$ and $\hat{\sigma}_0 = \sigma$. The adjusted price $\hat{u}$ for the European call payoff $Y = (\hat{S}_T - K)^{+}$ is obtained as the expectation

$$\hat{u} = E(Y) = E\left( (\hat{S}_T - K)^{+} \right). \tag{21.23}$$
Since the volatility process is deterministic, this expectation can be evaluated by using the well-known Black-Scholes formula, see (Black and Scholes 1973). Consequently, the mean $\hat{u}$ that enters the random variable

$$\begin{aligned}
Z &= (S_T - K)^{+} - \alpha \left( (\hat{S}_T - K)^{+} - E\left( (\hat{S}_T - K)^{+} \right) \right) \\
&= (S_T - K)^{+} - \alpha\,(Y - \hat{u})
\end{aligned}$$

is easily computed without any simulation. To price the option under consideration, one now estimates $E(Z)$, which is the control variate estimator, with $Y$ playing the role of the control variate. This means that one simulates $\sigma_t$, $S_t$ and $\hat{S}_t$ for $t \in [0,T]$. To do this, one could use the previously mentioned weak second order predictor-corrector scheme, for example.

The above method can be very powerful since it takes advantage of the structure of the underlying SDE. Variance reduction techniques can be combined and can achieve several thousand times variance reduction if appropriately designed.
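The control variate construction for the stochastic volatility model (21.21) can be sketched in a few functions. Several simplifications are assumptions of this sketch rather than the method recommended above: it uses the plain Euler scheme instead of the predictor-corrector scheme, it estimates $\alpha$ from the same sample (which introduces a small bias), and it evaluates $\hat u$ with the Black-Scholes formula for zero short rate and integrated variance $\int_0^T \hat\sigma_t^2\,dt$, where $\hat\sigma_t = \kappa + (\sigma - \kappa)e^{-\gamma t}$ solves the second equation of (21.22). All function names and parameter values are hypothetical.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price_deterministic_vol(s, K, total_var):
    """Black-Scholes call price, zero short rate, integrated variance total_var."""
    sd = math.sqrt(total_var)
    d1 = (math.log(s / K) + 0.5 * total_var) / sd
    return s * norm_cdf(d1) - K * norm_cdf(d1 - sd)


def integrated_variance(sigma0, gamma, kappa, T):
    """Closed form of the time integral of (kappa + (sigma0 - kappa) e^{-gamma t})^2."""
    d = sigma0 - kappa
    return (kappa ** 2 * T
            + 2.0 * kappa * d * (1.0 - math.exp(-gamma * T)) / gamma
            + d ** 2 * (1.0 - math.exp(-2.0 * gamma * T)) / (2.0 * gamma))


def simulate_payoffs(s, sigma0, gamma, kappa, xi, K, T, n_steps, n_paths, rng):
    """Euler paths of (S, sigma) and (S_hat, sigma_hat) driven by the same W^1."""
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    h_vals, y_vals = [], []
    for _ in range(n_paths):
        S, vol = s, sigma0              # stochastic volatility pair
        S_hat, vol_hat = s, sigma0      # adjusted pair, deterministic volatility
        for _ in range(n_steps):
            dW1 = rng.gauss(0.0, sqrt_dt)
            dW2 = rng.gauss(0.0, sqrt_dt)
            S, vol = (S + vol * S * dW1,
                      vol + gamma * (kappa - vol) * dt + xi * vol * dW2)
            S_hat, vol_hat = (S_hat + vol_hat * S_hat * dW1,
                              vol_hat + gamma * (kappa - vol_hat) * dt)
        h_vals.append(max(S - K, 0.0))
        y_vals.append(max(S_hat - K, 0.0))
    return h_vals, y_vals


def control_variate_estimate(h_vals, y_vals, y_mean):
    """Return the estimate of E(Z) and the sample variances of H and Z."""
    n = len(h_vals)
    h_bar = sum(h_vals) / n
    y_bar = sum(y_vals) / n
    cov = sum((h - h_bar) * (y - y_bar) for h, y in zip(h_vals, y_vals)) / (n - 1)
    var_y = sum((y - y_bar) ** 2 for y in y_vals) / (n - 1)
    alpha = cov / var_y                 # sample version of the minimizing alpha
    z_vals = [h - alpha * (y - y_mean) for h, y in zip(h_vals, y_vals)]
    z_bar = sum(z_vals) / n
    var_h = sum((h - h_bar) ** 2 for h in h_vals) / (n - 1)
    var_z = sum((z - z_bar) ** 2 for z in z_vals) / (n - 1)
    return z_bar, var_h, var_z


if __name__ == "__main__":
    s, sigma0, gamma, kappa, xi, K, T = 1.0, 0.2, 1.0, 0.25, 0.3, 1.0, 1.0
    rng = random.Random(8)
    h_vals, y_vals = simulate_payoffs(s, sigma0, gamma, kappa, xi, K, T, 50, 2000, rng)
    u_hat = call_price_deterministic_vol(s, K, integrated_variance(sigma0, gamma, kappa, T))
    price, var_h, var_z = control_variate_estimate(h_vals, y_vals, u_hat)
    print("estimate:", price, "variance ratio:", var_h / var_z)
```

Driving $S$ and $\hat S$ with the same increments of $W^1$ is what makes the two payoffs strongly correlated, and the variance of $Z$ shrinks by the factor $1 - \rho^2$, where $\rho$ is their correlation. Note that the Euler-discretized $\hat S_T$ has a mean payoff that differs slightly from the Black-Scholes value $\hat u$, which adds a small systematic error to this sketch.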
References

Bachelier L (1900) Théorie de la spéculation. Annales de l’Ecole Normale Supérieure, Series 3 17, pp. 21–86
Black F, Scholes M (1973) The pricing of options and corporate liabilities. J. Political Economy 81, pp. 637–654
Boyle P (1977) A Monte Carlo approach. J. Financial Economics 4, pp. 323–338
Bruti-Liberati N, Platen E (2006) Strong approximations of stochastic differential equations with jumps. J. Comput. Appl. Math. (forthcoming)
Burrage K, Burrage P, Belward J (1997) A bound on the maximum strong order of stochastic Runge-Kutta methods for stochastic ordinary differential equations. BIT 37(4), pp. 771–780
Einstein A (1906) Zur Theorie der Brownschen Bewegung. Ann. Phys. IV 19, pp. 371
Fischer P, Platen E (1999) Applications of the balanced method to stochastic differential equations in filtering. Monte Carlo Methods Appl. 5(1), pp. 19–38
Fishman GS (1996) Monte Carlo: Concepts, Algorithms and Applications. Springer Ser. Oper. Res. Springer
Fournie E, Lebuchoux J, Touzi N (1997) Small noise expansion and importance sampling. Asymptot. Anal. 14(4), pp. 331–376
Gaines JG, Lyons TJ (1994) Random generation of stochastic area integrals. SIAM J. Appl. Math. 54(4), pp. 1132–1146
Glasserman P (2004) Monte Carlo Methods in Financial Engineering, Volume 53 of Appl. Math. Springer
Gobet E (2000) Weak approximation of killed diffusion. Stochastic Process. Appl. 87, pp. 167–197
Hausenblas E (2000) Monte-Carlo simulation of killed diffusion. Monte Carlo Methods Appl. 6(4), pp. 263–295
Heath D, Hurst SR, Platen E (2001) Modelling the stochastic dynamics of volatility for equity indices. Asia-Pacific Financial Markets 8, pp. 179–195
Heath D, Platen E (1996) Valuation of FX barrier options under stochastic volatility. Financial Engineering and the Japanese Markets 3, pp. 195–215
Heath D, Platen E (2002) A variance reduction technique based on integral representations. Quant. Finance 2(5), pp. 362–369
Heston SL (1993) A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financial Studies 6(2), pp. 327–343
Hofmann NE, Platen E, Schweizer M (1992) Option pricing under incompleteness and stochastic volatility. Math. Finance 2(3), pp. 153–187
Itô K (1944) Stochastic integral. Proc. Imp. Acad. Tokyo 20, pp. 519–524
Jäckel P (2002) Monte Carlo Methods in Finance. Wiley
Kloeden PE, Platen E (1999) Numerical Solution of Stochastic Differential Equations, Volume 23 of Appl. Math. Springer. Third corrected printing
Kloeden PE, Platen E, Schurz H (2003) Numerical Solution of SDE’s Through Computer Experiments. Universitext. Springer. Third corrected printing
Kubilius K, Platen E (2002) Rate of weak convergence of the Euler approximation for diffusion processes with jumps. Monte Carlo Methods Appl. 8(1), pp. 83–96
Law AM, Kelton WD (1991) Simulation Modeling and Analysis (2nd ed.). McGraw-Hill, New York
Li CW, Liu XQ (1997) Algebraic structure of multiple stochastic integrals with respect to Brownian motions and Poisson processes. Stochastics Stochastics Rep. 61, pp. 107–120
Maghsoodi Y (1996) Mean-square efficient numerical solution of jump-diffusion stochastic differential equations. SANKHYA A 58(1), pp. 25–47
Merton RC (1973) Theory of rational option pricing. Bell J. Econ. Management Sci. 4, pp. 141–183
Mikulevicius R, Platen E (1988) Time discrete Taylor approximations for Ito processes with jump component. Math. Nachr. 138, pp. 93–104
Milstein GN (1974) Approximate integration of stochastic differential equations. Theory Probab. Appl. 19, pp. 557–562
Milstein GN (1978) A method of second order accuracy integration of stochastic differential equations. Theory Probab. Appl. 23, pp. 396–401
Milstein GN (1995) Numerical Integration of Stochastic Differential Equations. Mathematics and Its Applications. Kluwer
Milstein GN, Platen E, Schurz H (1998) Balanced implicit methods for stiff stochastic systems. SIAM J. Numer. Anal. 35(3), pp. 1010–1019
Newton NJ (1994) Variance reduction for simulated diffusions. SIAM J. Appl. Math. 54(6), pp. 1780–1805
Øksendal BK (1998) Stochastic Differential Equations. An Introduction with Applications (5th ed.). Universitext. Springer
Pivitt B (1999) Accessing the Intel Random Number Generator with CDSA. Technical report, Intel Platform Security Division
Platen E (1982) An approximation method for a class of Itô processes with jump component. Liet. Mat. Rink. 22(2), pp. 124–136
Platen E (1995) On weak implicit and predictor-corrector methods. Math. Comput. Simulation 38, pp. 69–76
Platen E, Heath D (2006) A Benchmark Approach to Quantitative Finance. Springer Finance. Springer. 700 pp. 9 illus., Hardcover, ISBN10 3-540-26212-1
Platen E, Wagner W (1982) On a Taylor formula for a class of Itô processes. Probab. Math. Statist. 3(1), pp. 37–51
Protter P, Talay D (1997) The Euler scheme for Lévy driven stochastic differential equations. Ann. Probab. 25(1), pp. 393–423
Stein EM, Stein J (1991) Stock price distributions with stochastic volatility: An analytic approach. Rev. Financial Studies 4, pp. 727–752
Talay D, Tubaro L (1990) Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Anal. Appl. 8(4), pp. 483–509
Wiener N (1923) Differential space. J. Math. Phys. 2, pp. 131–174
CHAPTER 22 Foundations of Option Pricing Peter Buchen
22.1 Introduction

In this chapter we lay down the foundations of option and derivative security pricing in the classical Black-Scholes (BS) paradigm. The two key assumptions in this approach are the absence of arbitrage and the modelling of asset prices by geometric Brownian motion (gBm). No arbitrage is the driving mechanism for much of modern finance and still plays a dominant role in models extended beyond the BS framework. It is generally recognised that gBm, which implies Gaussian log-returns of asset prices, while good for analysis, is rather a poor description of reality. Financial data very often show stylised features in their log-return distributions that are highly non-Gaussian. These include heavy tails (leptokurtosis), skewness, stochastic volatility and long-range dependence. Unfortunately, no model of asset prices capturing all, or even some, of these features is widely accepted by either theorists or practitioners.

One reason that the BS model has maintained its popularity to the present day is its connection to market completeness (defined later in the text). A single asset with dynamics following gBm, coupled with a derivative security on that asset (or even a riskless asset, such as a non-defaultable zero coupon bond), together admit a complete market. This in turn leads to a unique arbitrage-free price for any derivative security written on the asset. Much recent research is devoted to option and derivative pricing under alternative stochastic models of the asset price, many of which lead to market incompleteness. In such cases there is no unique arbitrage-free price for derivatives, unless further assumptions, possibly involving investor preferences or investor utility, are imposed.

There are two distinct methods for pricing financial derivatives in the BS framework. The first is the PDE approach and the second is called the EMM approach.
As shown by Black and Scholes (Black and Scholes 1973) in their now celebrated paper, derivative prices in the continuous time framework satisfy a diffusion type partial differential equation. We shall derive this PDE using basically their dynamic hedging method. In the EMM approach, derivatives are priced as an expectation under the Equivalent Martingale Measure. Martingales are the mathematical entities that capture the notion of a fair game, or in the present context, no arbitrage. The EMM method is a consequence of the First Fundamental Theorem of Asset Pricing. While the PDE method is specifically tied to the BS model, the EMM method is generally valid for a much wider class of models. The two pricing methods are connected by another celebrated formula, well known to theoretical physicists as the Feynman-Kac Formula.

Practitioners often model asset prices with stochastic volatility or other departures from BS standards. Pricing options and derivatives for such models can no longer be done analytically, so the emphasis becomes one of computational expediency. The EMM approach is particularly useful in this context as it immediately lends itself to a Monte Carlo (MC) algorithm. Indeed, the great majority of all industry computation in financial derivatives today is by MC simulation. Does this mean that BS pricing is obsolete? Not at all. That the BS model has survived to the present day is testimony to its extraordinary applicability and robustness. Monte Carlo methods perform best when run with control variates, which may considerably reduce standard errors of price estimates. Control variates with analytic representations are obviously much preferred, and BS prices admirably perform this role. For example, a practitioner in the foreign exchange market may wish to price a certain barrier option. The exchange rate is assumed to follow a process with stochastic volatility and stochastic interest rates. The barrier option can be priced by MC simulation, using a closed form BS representation of the barrier price (with constant volatility and interest rates) as a control variate.

In this chapter we develop methods for pricing many different types of options and derivatives within the BS framework. These include: vanilla European calls and puts, dual expiry options such as compound and chooser options, two-asset rainbow options such as exchange options, and path-dependent options such as barrier and lookback options. Our approach, while strongly connected to both the PDE and EMM methods, uses in addition a powerful set of related analytical tools. In a nutshell, we employ the following general approach:

• Represent the derivative or exotic option as a portfolio of elementary contracts
• Price the elementary contracts using the EMM method; this is often complemented with a Gaussian Shift Theorem
• Employ the Principle of Static Replication or parity relations to obtain the arbitrage-free price of the derivative

The details of this approach are described in many of the examples considered in the text that follows.
22.2 The PDE and EMM Methods
The standard BS model makes the following assumptions: the market is frictionless (i.e. no transaction costs or taxes and no penalties for short selling); the market operates continuously; the risk-free interest rate $r$ is a known constant; the asset price $X_t$ follows gBm with constant volatility $\sigma > 0$ and pays no dividends; options and derivatives are European (i.e. no early exercise) and expire at time $T$ with a payoff that depends only on $X_T$; the market is arbitrage free. European options are therefore path independent. We shall also price several exotic options with path dependent payoffs, including barrier and lookback options.
22.2.1 Geometrical Brownian Motion

Under the assumption of gBm, the asset price $X_t$ satisfies a stochastic differential equation (sde) of the form

$$dX_t = X_t(\mu\,dt + \sigma\,dB_t); \qquad X_0 = x_0 \tag{22.1}$$

where $\mu$ is the growth rate of the asset. The term $B_t$ is a standard Brownian motion under a measure $P$ (called the real-world measure), with the following properties:

1. For $t > 0$, $B_t$ is continuous with $B_0 = 0$.
2. $B_t$ has stationary and independent increments, so that $B_t - B_s \overset{d}{=} B_{t-s}$. Here we use $\overset{d}{=}$ to mean 'equal in distribution'.
3. For fixed $t$, $B_t \overset{d}{=} N(0,t)$.
4. From 2. and 3. above, $\mathrm{cov}\{B_s, B_t\} = \min(s,t)$ for any $s, t > 0$.

Thus, while the increments of Brownian motion are independent, its values at two distinct times are not. Since $B_t$ is Gaussian with zero mean and variance $t$, it is convenient in calculations to write $B_t \overset{d}{=} \sqrt{t}\,Z_t$ where $Z_t \overset{d}{=} N(0,1)$ is a standard Gaussian random variable (rv). We also take the liberty to drop the subscript and simply write $Z$ for $Z_t$.

Theorem 22.1 (Itô's Lemma). Let $X_t$ satisfy the sde $dX_t = \alpha(x,t)\,dt + \beta(x,t)\,dB_t$, where $x = X_t$, and let $V(x,t)$ be any $C^{2,1}$ function. Then $V(X_t,t)$ satisfies the sde

$$dV = \left[V_t + \alpha V_x + \tfrac12\beta^2V_{xx}\right]dt + \beta V_x\,dB_t \tag{22.2}$$
where subscripts on $V$ denote partial derivatives. It is sometimes more instructive to write this last sde in the equivalent form

$$dV = \left[V_t + \tfrac12\beta^2V_{xx}\right]dt + V_x\,dX_t.$$

Itô's Lemma is the main tool used to solve sde's. For example, the sde (22.1) for gBm can be solved by taking $Y_t(X_t) = \log(X_t/x_0)$. The sde for $Y_t$ then becomes $dY_t = (\mu - \tfrac12\sigma^2)\,dt + \sigma\,dB_t$, and this is readily integrated to give the representation

$$X_t \overset{d}{=} x_0\exp\{(\mu - \tfrac12\sigma^2)t + \sigma\sqrt{t}\,Z\}; \qquad Z \sim N(0,1) \tag{22.3}$$
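The representation (22.3) can be sampled directly, which gives a quick numerical check of the Itô drift correction $-\tfrac12\sigma^2$. The following is a minimal sketch using only the Python standard library; the function names and the parameter values are illustrative assumptions, not part of the text.

```python
import math
import random


def sample_gbm(x0, mu, sigma, t, rng):
    """Draw one value of X_t from representation (22.3)."""
    z = rng.gauss(0.0, 1.0)  # Z ~ N(0, 1)
    return x0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)


def mean_log_return(x0, mu, sigma, t, n_paths=100_000, seed=1):
    """Sample mean of log(X_t / x0); it should be close to (mu - sigma^2 / 2) t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += math.log(sample_gbm(x0, mu, sigma, t, rng) / x0)
    return total / n_paths


# Illustrative parameters (assumed, not taken from the chapter).
estimate = mean_log_return(x0=100.0, mu=0.08, sigma=0.2, t=1.0)
exact = (0.08 - 0.5 * 0.2**2) * 1.0
print(f"sample mean of log-return {estimate:.4f}, drift of Y_t {exact:.4f}")
```

The sample mean of the log-return settles near $(\mu - \tfrac12\sigma^2)t$ rather than $\mu t$, which is the content of the sde for $Y_t$.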
22.2.2 The Black-Scholes pde

Consider a portfolio, established at time $t$, containing one contract of any derivative security, and short $h$ units of the underlying asset $X$. Let $V = V(x,t)$, where $x = X_t$, denote the price of the derivative; then the portfolio has present value $P = V(x,t) - hx$. Assume the portfolio is self-financing over the time interval $(t, t+dt)$. This means that changes in portfolio value occur only through changes in derivative and asset prices, but not through changes in the hedge ratio $h$. Then the change in $P$ will, by Itô's Lemma, be

$$dP = dV - h\,dX_t = \left[V_t + \tfrac12\sigma^2x^2V_{xx}\right]dt + (V_x - h)\,dX_t$$

Now choose $h = V_x$ so the last term above vanishes. The expression for $dP$ becomes non-stochastic and hence riskless. In order to avoid arbitrage, the portfolio $P$ must therefore earn the risk-free rate of return $r$. That is,

$$dP = rP\,dt = r(V - xV_x)\,dt.$$

Equating the two expressions for $dP$, we obtain the celebrated BS-pde for the derivative price $V(x,t)$

$$V_t = rV - rxV_x - \tfrac12\sigma^2x^2V_{xx} \tag{22.4}$$
This pde describes the price evolution of any derivative in the BS framework and, remarkably, is independent of the parameter $\mu$. The strategy under which it was derived is called 'dynamic hedging', because holders of the derivative must hedge their positions by either short selling stock if $h > 0$, or purchasing stock if $h < 0$, to continuously eliminate risk. In practice, of course, such continuous hedging is impossible to implement, even leaving aside the accumulated transaction costs and other real market frictions. Any general European option with expiry $T$ payoff $f(x)$ must satisfy the pde (22.4) for all $t < T$ with the expiry value $V(x,T) = f(x)$. For a vanilla European
call option (the right to buy one unit of the asset for $k$ at time $T$), the payoff function is $f(x) = (x-k)^+$. For a vanilla European put option (the right to sell one unit of the asset for $k$ at time $T$), the payoff function is $f(x) = (k-x)^+$. In theory, one could price any European derivative security by transforming this pde into a standard initial value (IV) problem for the heat equation. For example, setting $V(x,t) = e^{-r\tau}u(y,\tau)$ with $y = \log x + (r - \tfrac12\sigma^2)\tau$ and $\tau = T - t$, the pde for $u(y,\tau)$ becomes $u_\tau = \tfrac12\sigma^2u_{yy}$ with initial value $u(y,0) = f(e^y)$. Such problems can be solved by Green's Function methods, as indeed employed by BS in their original paper. However, we shall adopt a somewhat more sophisticated approach, which saves considerable effort when pricing more complex exotic (i.e. non-vanilla) options. Our approach requires more technical machinery, considered in the next section.
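The reduction to the heat equation can be checked term by term. The following worked substitution is a sketch that uses only the change of variables stated in the text and the no-dividend pde (22.4); derivatives with respect to $t$ are taken at fixed $x$.

```latex
\begin{aligned}
V(x,t) &= e^{-r\tau}u(y,\tau), \qquad y = \log x + (r - \tfrac12\sigma^2)\tau, \qquad \tau = T - t,\\[2pt]
V_t &= -\frac{d}{d\tau}\Bigl[e^{-r\tau}u(y,\tau)\Bigr]
     = e^{-r\tau}\Bigl[ru - (r - \tfrac12\sigma^2)u_y - u_\tau\Bigr],\\[2pt]
xV_x &= e^{-r\tau}u_y, \qquad x^2V_{xx} = e^{-r\tau}\,(u_{yy} - u_y),\\[2pt]
rV - rxV_x - \tfrac12\sigma^2x^2V_{xx}
     &= e^{-r\tau}\Bigl[ru - (r - \tfrac12\sigma^2)u_y - \tfrac12\sigma^2u_{yy}\Bigr].
\end{aligned}
```

Equating the second and last lines, as (22.4) requires, leaves $u_\tau = \tfrac12\sigma^2u_{yy}$. At $\tau = 0$ we have $y = \log x$, so the expiry condition $V(x,T) = f(x)$ becomes $u(y,0) = f(e^y)$.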
22.2.3 The Equivalent Martingale Measure
Mathematically, we say that a stochastic process $Y_t$ is a martingale with respect to a measure $Q$ and a filtration $\mathcal{F}_t$ (think of $\mathcal{F}_t$ as all the information available up to time $t$) if $Y_t = E_Q\{Y_s \mid \mathcal{F}_t\}$ for all $t \le s$. For example, it is well known that if $B_t$ is a standard Brownian motion, then $B_t$, $B_t^2 - t$ and $\exp\{\sigma B_t - \tfrac12\sigma^2t\}$ are all martingales.

Definition 22.1. The Equivalent Martingale Measure (EMM) is a measure $Q$, equivalent¹ to the real-world measure $P$, under which the discounted asset price $\tilde{X}_t = X_te^{-rt}$ is a martingale.

This means there exists a measure $Q$ such that $X_t = e^{-r(s-t)}E_Q\{X_s \mid \mathcal{F}_t\}$. The First Fundamental Theorem of Asset Pricing (see (Harrison and Pliska 1981) for a formal proof) tells us that this is not only true for the asset price, but for any derivative security as well.

Theorem 22.2 (First Fundamental Theorem). If the market is arbitrage free, the discounted price of any derivative security is a martingale under the Equivalent Martingale Measure.
This theorem immediately leads to a pricing formula for the derivative, since $e^{-rt}V(X_t,t) = E_Q\{e^{-rs}V(X_s,s) \mid \mathcal{F}_t\}$ implies, with $s = T$ and $V(X_T,T) = f(X_T)$,

$$V(x,t) = e^{-r(T-t)}E_Q\{f(X_T) \mid X_t = x\} \tag{22.5}$$

Note that since Itô processes are Markov, conditioning on $\mathcal{F}_t$ is equivalent to conditioning on $X_t = x$.

¹ Measures $P$ and $Q$ are equivalent if they have the same null sets. That is, $P(A) = 0 \Leftrightarrow Q(A) = 0$.
The Second Fundamental Theorem of Asset Pricing states that if the market is complete, then $Q$ is unique. Market completeness means that it is possible to dynamically replicate the riskless security (via dynamic hedging) with any derivative and its underlying asset. Indeed one can also dynamically replicate any derivative, using just the underlying asset and a riskless security. Under $Q$, which is also known as the Risk Neutral Measure, the asset price follows gBm with drift rate $r$. That is, $dX_t = X_t(r\,dt + \sigma\,dB_t^Q)$. However, we saw in (22.1) that under the real-world measure $P$, $dX_t = X_t(\mu\,dt + \sigma\,dB_t^P)$. This does not mean that the stock price earns the risk-free rate in the real economy. It does however mean that when pricing derivatives in an arbitrage free way, stock prices must be transformed to the new measure $Q$, under which they do earn the risk-free rate. The asset price sde's under the two measures are compatible if and only if the two Brownian motions are related by

$$dB_t^Q = dB_t^P + \left(\frac{\mu - r}{\sigma}\right)dt \tag{22.6}$$
That such a relationship exists is the result of another well known theorem in stochastic calculus known as Girsanov's Theorem (e.g. see (Klebaner 1998)). We shall not explore this theorem here, but point out that from a practical point of view, we could have simply replaced $\mu$ by $r$ at the outset: a rule sometimes called the Correspondence Principle. Under $Q$, we can therefore replace (22.3) for the asset price by

$$X_T = x\exp\{(r - \tfrac12\sigma^2)\tau + \sigma\sqrt{\tau}\,Z\}; \qquad \tau = T - t \tag{22.7}$$

where $Z = Z_t^Q \overset{d}{=} N(0,1)$.

While the EMM approach leading to the representation (22.5) for the derivative price was originally obtained through financial arguments, it could also have been obtained directly from the Feynman-Kac (FK) formula (e.g. see (Friedman 1975)). Basically, the FK formula expresses the solution of diffusion equations of the type (22.4) as a terminal value expectation, with respect to a measure determined from the transition probability density of a certain stochastic process. This measure is precisely the EMM of the First Fundamental Theorem. In fact, it is a simple matter to show by direct substitution that (22.5), with $X_T$ given by (22.7), satisfies the pde (22.4). The following is a useful and powerful pricing adjunct.

Theorem 22.3 (Principle of Static Replication). If the payoff of a European style derivative can be expressed as a portfolio of elementary contracts, then the arbitrage free price of the derivative is the present value of this portfolio.

Proof. While this theorem is virtually self-evident, the proof follows from either arbitrage principles (the Law of One Price) or from the linearity of both the BS-pde (22.4) and the expectation (22.5).
We remark that static replication is quite different from dynamic replication used in deriving the BS-pde. In static replication, the portfolio is created once at time t and then held to its expiry date. No continuous re-balancing to eliminate risk is required. Both methods however, have their origins in the concept of no arbitrage and both are used as pricing tools.
22.2.4 Effect of Dividends

We have tended to think of the underlying asset as an equity or common stock. In practice, many equities pay dividends which have so far been ignored. While the payment of dividends is a subject of some complexity, we shall adopt a very simple approach and assume dividends are paid continuously at a constant yield $q$. It transpires that all the previous analysis still applies, not to the asset price $X_t$, but rather to the dividend corrected asset price $Y_t = X_te^{-qt}$. The BS pde is modified to

$$V_t = rV - (r-q)xV_x - \tfrac12\sigma^2x^2V_{xx} \tag{22.8}$$

The EMM formula (22.5) still applies, but the expression (22.7) for $X_T$ must now be replaced by

$$X_T = x\exp\{(r - q - \tfrac12\sigma^2)\tau + \sigma\sqrt{\tau}\,Z\} \tag{22.9}$$

There is another benefit of including dividends. While strictly related only to the equity market, the presence of dividends allows us, with only a slight change in interpretation, to apply all the formulae derived in this chapter to other markets as well. Thus we capture the futures market with the choice $X_t$ = futures price and $q = r$; the foreign exchange market with $X_t$ = exchange rate and $r = r_d$, $q = r_f$ (the domestic and foreign risk free rates); the commodities market with $X_t$ = commodity price and $q$ = the convenience yield; and of course the original BS equity market with $q = 0$. We shall assume constant dividends as described for the remainder of this chapter.
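The pricing formula (22.5), combined with the risk-neutral representation (22.9), translates almost line for line into a Monte Carlo estimator. The sketch below is one possible implementation using only the standard library; the function name `mc_price`, the sample size and the parameter values are assumptions made for illustration. It is checked on two payoffs whose values are known: $f(x) = 1$, which must return the discount factor $e^{-r\tau}$, and $f(x) = x$, whose estimate should approach $xe^{-q\tau}$.

```python
import math
import random


def mc_price(payoff, x, r, q, sigma, tau, n_paths=100_000, seed=7):
    """Estimate e^{-r tau} E_Q[f(X_T) | X_t = x], with X_T sampled from (22.9)."""
    rng = random.Random(seed)
    drift = (r - q - 0.5 * sigma**2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for _ in range(n_paths):
        x_T = x * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += payoff(x_T)
    return math.exp(-r * tau) * total / n_paths


# Illustrative parameters (assumed).
x, r, q, sigma, tau = 100.0, 0.05, 0.02, 0.2, 1.0
bond = mc_price(lambda s: 1.0, x, r, q, sigma, tau)  # payoff f(x) = 1
asset = mc_price(lambda s: s, x, r, q, sigma, tau)   # payoff f(x) = x
print(f"bond {bond:.6f}  vs  exp(-r tau) {math.exp(-r * tau):.6f}")
print(f"asset {asset:.3f}  vs  x exp(-q tau) {x * math.exp(-q * tau):.3f}")
```

Any other European payoff can be passed in the same way; only the statistical error changes.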
22.3 Pricing Simple European Derivatives
We show here how to price European derivatives with fairly simple payoff functions f ( x) . First, we price some elementary building block contracts called bond and asset binaries. These are also known as bond and asset digitals. Then we show how some popular traded contracts, such as vanilla calls and puts, can be expressed as portfolios of these elementary contracts. We finally invoke the Principle of Static Replication to price the calls and puts.
22.3.1 Asset and Bond Binaries

Probably the simplest non-trivial contract is a zero coupon bond (ZCB) which pays one unit of cash (a nominal \$1, say) at a fixed time $T$ in the future. This payment is made with certainty, regardless of the economic 'states of the world' at time $T$. In the BS framework, we may regard the ZCB as a derivative with expiry payoff $f(x) = 1$. Whether one uses the PDE method or the EMM method, the present value (pv) of the bond is found to be $B(x,t) = e^{-r\tau}$ with $\tau = T - t$. Not surprisingly, this price is independent of the underlying asset price $x$. Another simple contract is one which pays one unit of the asset (e.g. one share of stock) at $T$, corresponding to an expiry payoff $f(x) = x$. Again, it is a simple matter to derive the present value of this contract as $A(x,t) = xe^{-q\tau}$. Of particular interest in pricing more complex derivatives are the associated up (if $s = +$) and down (if $s = -$) type asset and bond² binaries, with time $T$ payoffs³

$$\begin{aligned}
\text{Asset binary:}\quad & A^s_\xi(x,T) = x\,I(sx > s\xi)\\
\text{Bond binary:}\quad & B^s_\xi(x,T) = I(sx > s\xi)
\end{aligned} \tag{22.10}$$
The parameter $\xi > 0$ denotes an exercise price. In the case of the asset binary, the holder receives one unit of the asset at time $T$, but only if the asset is above (or below) the exercise price. For the bond binary, the payout is \$1 rather than one unit of asset. It is a greater challenge to price these contracts, which are fundamental building blocks for many complex derivatives. Before doing so, we define two functions which play a ubiquitous role in BS option pricing. They are

$$[d_\xi, d'_\xi](x,\tau) = \frac{\log(x/\xi) + (r - q \pm \tfrac12\sigma^2)\tau}{\sigma\sqrt{\tau}} \tag{22.11}$$
Observe that $d'_\xi(x,\tau) = d_\xi(x,\tau) - \sigma\sqrt{\tau}$. Starting with the up-type bond binary, using (22.5), we can write its pv as

$$B^+_\xi(x,t) = e^{-r\tau}E\{I(X_T > \xi)\}$$

But from (22.9), $X_T > \xi$ implies $Z > -d'_\xi$, which leads to

$$B^+_\xi(x,t) = e^{-r\tau}E\{I(Z > -d'_\xi)\}$$

Since $Z \sim N(0,1)$, it has normal cumulative distribution function

$$N(d) = E\{I(Z < d)\} = E\{I(Z > -d)\}$$

² Bond binaries are also known as cash binaries or cash digitals.
³ The function $I(x > \xi)$ is the indicator function, equal to 1 if $x > \xi$, and equal to zero otherwise.
Symmetry allows us to replace $Z$ by $-Z$ in any expectation. The function $N(d)$ is related to the standard error function by $N(d) = \tfrac12[1 + \mathrm{erf}(d/\sqrt{2})]$. Hence, we obtain $B^+_\xi(x,t) = e^{-r\tau}N(d'_\xi)$ and similarly, $B^-_\xi(x,t) = e^{-r\tau}N(-d'_\xi)$. We express both results in a single formula as

$$B^s_\xi(x,t) = e^{-r\tau}N(sd'_\xi) \tag{22.12}$$
The calculation for the up-asset binary follows along similar lines, and results in the representation

$$A^+_\xi(x,t) = xe^{(-q - \frac12\sigma^2)\tau}E\{e^{\sigma\sqrt{\tau}Z}\,I(Z > -d'_\xi)\}$$

Calculations of expectations of the above type are greatly facilitated by the following.

Theorem 22.4 (Uni-variate Gaussian Shift Theorem). If $Z \sim N(0,1)$ is a standard Gaussian rv with zero mean and unit variance, $c$ is any real constant and $F(z)$ a measurable function, then

$$E\{e^{cZ}F(Z)\} = e^{\frac12c^2}E\{F(Z + c)\} \tag{22.13}$$
Proof. The proof follows immediately from the identity $\Phi(x - c) = e^{cx - \frac12c^2}\Phi(x)$, where $\Phi(x) = e^{-\frac12x^2}/\sqrt{2\pi}$ is the probability density function (pdf) of the standard normal.

It follows from the Gaussian Shift Theorem (GST) that

$$A^+_\xi(x,t) = xe^{-q\tau}E\{I(Z + \sigma\sqrt{\tau} > -d'_\xi)\} = xe^{-q\tau}N(d_\xi)$$
Similarly, we find $A^-_\xi(x,t) = xe^{-q\tau}N(-d_\xi)$, so as a single formula we obtain

$$A^s_\xi(x,t) = xe^{-q\tau}N(sd_\xi) \tag{22.14}$$
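The Gaussian Shift Theorem (22.13) is easy to confirm numerically for a particular choice of $F$. The sketch below evaluates both sides by composite Simpson quadrature over a truncated range; the quadrature helper `gauss_expect`, the test function $F$ and the value of $c$ are all arbitrary assumptions chosen for illustration.

```python
import math


def gauss_expect(g, lo=-12.0, hi=12.0, n=4000):
    """Approximate E[g(Z)], Z ~ N(0,1), by composite Simpson's rule on [lo, hi]."""
    h = (hi - lo) / n  # n must be even
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * g(z) * math.exp(-0.5 * z * z)
    return total * h / (3.0 * math.sqrt(2.0 * math.pi))


def F(z):
    """An arbitrary smooth test function (assumed for the check)."""
    return math.tanh(z) + z * z


c = 0.6
lhs = gauss_expect(lambda z: math.exp(c * z) * F(z))       # E{e^{cZ} F(Z)}
rhs = math.exp(0.5 * c * c) * gauss_expect(lambda z: F(z + c))  # e^{c^2/2} E{F(Z+c)}
print(f"lhs {lhs:.10f}  rhs {rhs:.10f}")
```

The two sides agree to quadrature accuracy. In the asset binary calculation the same identity is applied with $c = \sigma\sqrt{\tau}$ and $F$ an indicator function.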
This completes the derivation of the arbitrage free prices of asset and bond binaries. Later we shall meet second-order versions of these binaries, which can be priced using a bi-variate version of the Gaussian Shift Theorem.

22.3.1.1 Parity Relations

From (22.10) it is obviously true that $A^+_\xi(x,T) + A^-_\xi(x,T) = x$ and $B^+_\xi(x,T) + B^-_\xi(x,T) = 1$, so by the Principle of Static Replication, the up-down parity relations for asset and bond binaries,

$$\begin{aligned}
A^+_\xi(x,t) + A^-_\xi(x,t) &= xe^{-q\tau}\\
B^+_\xi(x,t) + B^-_\xi(x,t) &= e^{-r\tau}
\end{aligned} \tag{22.15}$$
will be valid for all t ≤ T . Parity relations such as these pervade the subject of exotic option pricing, so it is well to keep them in mind in order to avoid unnecessary calculations and also to sometimes simplify mathematical formulae obtained.
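The binary prices and their parity relations can be put into a few lines of code. The sketch below takes the asset binary as $xe^{-q\tau}N(sd_\xi)$ and the bond binary as $e^{-r\tau}N(sd'_\xi)$, the forms consistent with the parity relations (22.15); the function names and the numerical inputs are illustrative assumptions, with the sign $s$ passed as `+1` or `-1`.

```python
import math


def norm_cdf(d):
    """Standard normal cdf, N(d) = (1 + erf(d / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def d_pair(x, xi, r, q, sigma, tau):
    """Return (d_xi, d'_xi) as defined in (22.11)."""
    sq = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma**2) * tau) / sq
    return d, d - sq


def asset_binary(s, x, xi, r, q, sigma, tau):
    """Up (s=+1) or down (s=-1) asset binary price."""
    d, _ = d_pair(x, xi, r, q, sigma, tau)
    return x * math.exp(-q * tau) * norm_cdf(s * d)


def bond_binary(s, x, xi, r, q, sigma, tau):
    """Up (s=+1) or down (s=-1) bond binary price."""
    _, d_prime = d_pair(x, xi, r, q, sigma, tau)
    return math.exp(-r * tau) * norm_cdf(s * d_prime)


# Illustrative parameters (assumed): spot 100, exercise price 110.
args = (100.0, 110.0, 0.05, 0.02, 0.2, 1.0)
a_up, a_dn = asset_binary(+1, *args), asset_binary(-1, *args)
b_up, b_dn = bond_binary(+1, *args), bond_binary(-1, *args)
print(f"A+ + A- = {a_up + a_dn:.6f}, B+ + B- = {b_up + b_dn:.6f}")
```

The two sums reproduce $xe^{-q\tau}$ and $e^{-r\tau}$, as the up-down parity relations require.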
22.3.2 European Calls and Puts

Recall that a European call option of strike price $k$ gives its holder the right to buy one unit of asset at time $T$ for $k$ dollars. The payoff for the contract is $f(x) = (x-k)^+ = \max(x-k, 0)$. The plus function captures the 'option' part of the contract: the holder is not obliged to buy the asset. Indeed, if at expiry the asset price satisfies $x < k$, then the holder will not exercise the option and the contract will expire worthless. It follows that the payoff has the equivalent representation

$$f(x) = (x-k)\,I(x > k) = x\,I(x > k) - k\,I(x > k)$$

This payoff is the same as that of a portfolio containing one up-type asset binary and short $k$ up-type bond binaries. From (22.10),

$$f(x) = A^+_k(x,T) - kB^+_k(x,T)$$

The Principle of Static Replication then tells us that a European call option can be statically replicated at any time before expiry by a portfolio of asset and bond binaries. Holding the portfolio is financially equivalent to holding the call option. It follows that the present value of the call option, for all $t < T$, is given by $C_k(x,t) = A^+_k(x,t) - kB^+_k(x,t)$, and by virtue of (22.12) and (22.14), this can be expressed as

$$C_k(x,t) = xe^{-q\tau}N(d_k) - ke^{-r\tau}N(d'_k) \tag{22.16}$$
This is the celebrated BS formula for the price of a European call option. The corresponding formula for a European put option is derived from its payoff function

$$P_k(x,T) = (k-x)^+ = kB^-_k(x,T) - A^-_k(x,T)$$

yielding the result

$$P_k(x,t) = ke^{-r\tau}N(-d'_k) - xe^{-q\tau}N(-d_k) \tag{22.17}$$

for all $t < T$. The asset and bond binary parity relations (22.15) also imply the well known put-call parity relation

$$C_k(x,t) - P_k(x,t) = xe^{-q\tau} - ke^{-r\tau} \tag{22.18}$$

When we need to specify the expiry date $T$ of the option explicitly, we shall use a third parameter and write $C_k(x,t;T)$ and $P_k(x,t;T)$ for the call and put price. We also apply this rule to other options, e.g. $A^s_k(x,t;T)$ and $B^s_k(x,t;T)$.
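The call and put formulas (22.16) and (22.17) are short enough to code directly. The sketch below also checks put-call parity (22.18) and, by central finite differences, that the call price satisfies the pde (22.8). Function names, step sizes and parameter values are illustrative assumptions; prices are written as functions of $\tau = T - t$, so that $V_t = -\partial V/\partial\tau$.

```python
import math


def norm_cdf(d):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def d_pair(x, k, r, q, sigma, tau):
    """Return (d_k, d'_k) as defined in (22.11)."""
    sq = sigma * math.sqrt(tau)
    d = (math.log(x / k) + (r - q + 0.5 * sigma**2) * tau) / sq
    return d, d - sq


def bs_call(x, k, r, q, sigma, tau):
    """European call price (22.16)."""
    d, dp = d_pair(x, k, r, q, sigma, tau)
    return x * math.exp(-q * tau) * norm_cdf(d) - k * math.exp(-r * tau) * norm_cdf(dp)


def bs_put(x, k, r, q, sigma, tau):
    """European put price (22.17)."""
    d, dp = d_pair(x, k, r, q, sigma, tau)
    return k * math.exp(-r * tau) * norm_cdf(-dp) - x * math.exp(-q * tau) * norm_cdf(-d)


def pde_residual(price, x, k, r, q, sigma, tau, hx=0.05, ht=1e-4):
    """Finite-difference residual of the pde (22.8); near zero for a solution."""
    v = price(x, k, r, q, sigma, tau)
    up = price(x + hx, k, r, q, sigma, tau)
    dn = price(x - hx, k, r, q, sigma, tau)
    v_x = (up - dn) / (2.0 * hx)
    v_xx = (up - 2.0 * v + dn) / hx**2
    # t = T - tau, so V_t = -dV/dtau
    v_t = -(price(x, k, r, q, sigma, tau + ht) - price(x, k, r, q, sigma, tau - ht)) / (2.0 * ht)
    return v_t - (r * v - (r - q) * x * v_x - 0.5 * sigma**2 * x**2 * v_xx)


# Illustrative at-the-money example (assumed parameters, no dividends).
x, k, r, q, sigma, tau = 100.0, 100.0, 0.05, 0.0, 0.2, 1.0
call = bs_call(x, k, r, q, sigma, tau)
put = bs_put(x, k, r, q, sigma, tau)
parity_gap = call - put - (x * math.exp(-q * tau) - k * math.exp(-r * tau))
residual = pde_residual(bs_call, x, k, r, 0.02, sigma, tau)  # with a dividend yield
print(f"call {call:.4f}, put {put:.4f}, parity gap {parity_gap:.2e}, pde residual {residual:.2e}")
```

For these inputs the call and put are approximately 10.4506 and 5.5735.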
22.3.2.1 Threshold Calls and Puts

These contracts are similar to standard calls and puts, except the strike price $k$ differs from the exercise price $h$. The payoffs at expiry are

$$C_{h,k}(x,T) = (x-k)\,I(x > h) \qquad\text{and}\qquad P_{h,k}(x,T) = (k-x)\,I(x < h) \tag{22.19}$$

Normally, for the threshold call option $h > k$, while for the threshold put option $h < k$. Static replication immediately determines their present values, which can be expressed in terms of asset and bond binaries as

$$\begin{aligned}
C_{h,k}(x,t) &= A^+_h(x,t) - kB^+_h(x,t)\\
P_{h,k}(x,t) &= kB^-_h(x,t) - A^-_h(x,t)
\end{aligned} \tag{22.20}$$

The corresponding parity relation is

$$C_{h,k}(x,t) - P_{h,k}(x,t) = xe^{-q\tau} - ke^{-r\tau} \tag{22.21}$$

Note that the right hand side is independent of the exercise price $h$.
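Static replication of the threshold options amounts to combining two binary prices. The following sketch assumes the binary formulas $A^s_h = xe^{-q\tau}N(sd_h)$ and $B^s_h = e^{-r\tau}N(sd'_h)$; names and parameter values are illustrative. It confirms the parity relation (22.21) when the call and put share the same exercise price $h$, and that the threshold call collapses to the vanilla call when $h = k$.

```python
import math


def norm_cdf(d):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def binaries(s, x, xi, r, q, sigma, tau):
    """Return (A, B): first-order asset and bond binary prices, s = +1 or -1."""
    sq = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma**2) * tau) / sq
    asset = x * math.exp(-q * tau) * norm_cdf(s * d)
    bond = math.exp(-r * tau) * norm_cdf(s * (d - sq))
    return asset, bond


def threshold_call(x, h, k, r, q, sigma, tau):
    """C_{h,k} = A_h^+ - k B_h^+, as in (22.20)."""
    a_up, b_up = binaries(+1, x, h, r, q, sigma, tau)
    return a_up - k * b_up


def threshold_put(x, h, k, r, q, sigma, tau):
    """P_{h,k} = k B_h^- - A_h^-, as in (22.20)."""
    a_dn, b_dn = binaries(-1, x, h, r, q, sigma, tau)
    return k * b_dn - a_dn


# Illustrative parameters (assumed): strike 100, exercise price 110.
x, h, k, r, q, sigma, tau = 100.0, 110.0, 100.0, 0.05, 0.02, 0.2, 1.0
c_hk = threshold_call(x, h, k, r, q, sigma, tau)
p_hk = threshold_put(x, h, k, r, q, sigma, tau)
parity_rhs = x * math.exp(-q * tau) - k * math.exp(-r * tau)
print(f"C - P = {c_hk - p_hk:.6f}, x e^(-q tau) - k e^(-r tau) = {parity_rhs:.6f}")
```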
22.4 Dual-Expiry Options
There are many exotic options whose payoffs depend on more than just a single asset price at a single future expiry date, as assumed in the standard BS framework. Examples include compound and chooser options whose payoffs depend on a single asset price at two future dates, and so-called two asset rainbow options whose payoffs depend on two asset prices at a single future date. We consider in this section examples of the former type and consider the two-asset case in the next section. There are strong similarities between the two types.
22.4.1 Second-order Binaries

We first price some further elementary binary contracts and use these to price the exotics by static replication. However, the binary contracts needed here are somewhat more complicated.

Definition 22.2 (Second-order asset and bond binaries). Let $T_1$ and $T_2$ with $T_1 < T_2$ be two future exercise dates. A second-order asset binary is a contract that pays at time $T_2$ one unit of the underlying asset, but only if $x_1 = X_{T_1}$ is above (or below) a given exercise price $\xi_1$ and $x_2 = X_{T_2}$ is above (or below) a given exercise price $\xi_2$. A second-order bond binary is identical except that it pays \$1 at time $T_2$ rather than one unit of asset.
There are four types of second-order binaries: up-up, up-down, down-up and down-down, depending on whether exercise at $T_1$ and $T_2$ is respectively above or below the given price levels. Let $s_{1,2}$ be sign indicators ($\pm$) for these four types and $\xi_{1,2}$ the exercise prices at $T_{1,2}$. Then the expiry $T_2$ payoffs are respectively

$$\begin{aligned}
A^{s_1s_2}_{\xi_1\xi_2}(x_1, x_2; T_2) &= x_2\,I(s_1x_1 > s_1\xi_1)\,I(s_2x_2 > s_2\xi_2)\\
B^{s_1s_2}_{\xi_1\xi_2}(x_1, x_2; T_2) &= I(s_1x_1 > s_1\xi_1)\,I(s_2x_2 > s_2\xi_2)
\end{aligned}$$

Since $x_1$ is actually known (and remains fixed) for all $T_1 \le t \le T_2$, we can also write down the equivalent payoffs at time $T_1$. They are

$$\begin{aligned}
A^{s_1s_2}_{\xi_1\xi_2}(x_1, T_1) &= A^{s_2}_{\xi_2}(x_1, T_1; T_2)\,I(s_1x_1 > s_1\xi_1)\\
B^{s_1s_2}_{\xi_1\xi_2}(x_1, T_1) &= B^{s_2}_{\xi_2}(x_1, T_1; T_2)\,I(s_1x_1 > s_1\xi_1)
\end{aligned}$$

Second-order binaries are compound binaries: i.e. they are binaries of first-order binary contracts.

Theorem 22.5. The present values, for all $t < T_1 < T_2$, of second-order asset and bond binaries are given respectively by

$$\begin{aligned}
A^{s_1s_2}_{\xi_1\xi_2}(x,t) &= xe^{-q\tau_2}N(s_1d_1, s_2d_2; s_1s_2\rho)\\
B^{s_1s_2}_{\xi_1\xi_2}(x,t) &= e^{-r\tau_2}N(s_1d'_1, s_2d'_2; s_1s_2\rho)
\end{aligned} \tag{22.22}$$

where $\rho = \sqrt{\tau_1/\tau_2}$, $\tau_i = T_i - t$, $N(d, d'; \rho)$ is the bi-variate cumulative normal distribution function with correlation coefficient $\rho$, and for $i = 1, 2$

$$[d_i, d'_i] = \frac{\log(x/\xi_i) + (r - q \pm \tfrac12\sigma^2)\tau_i}{\sigma\sqrt{\tau_i}}$$

Proof. We present only an outline for the up-up type asset binary. Its present value is given by

$$A^{++}_{\xi_1\xi_2}(x,t) = e^{-r\tau_2}E\{X_2\,I(X_1 > \xi_1)\,I(X_2 > \xi_2)\}$$

where, from (22.9),

$$X_i = x\exp\{(r - q - \tfrac12\sigma^2)\tau_i + \sigma\sqrt{\tau_i}\,Z_i\}; \qquad Z_i \sim N(0,1;\rho)$$

The correlation coefficient is easily derived from the covariance structure of Brownian motion considered in Section 22.2.1. To calculate the above expectation we apply a bi-variate version of the Gaussian Shift Theorem (see below) and employ the bi-variate normal definition

$$N(a, b; \rho) = E\{I(Z_1 < a)\,I(Z_2 < b)\}; \qquad Z_{1,2} \sim N(0,1;\rho)$$
Theorem 22.6 (Multi-variate Gaussian Shift Theorem). If $Z \sim N(0,1;R)$ is a standard Gaussian random vector with zero means, unit variances and correlation matrix $R$, $c$ is any real constant vector and $F(Z)$ a measurable function, then

$$E\{e^{c^{\top}Z}F(Z)\} = e^{\frac12c^{\top}Rc}\,E\{F(Z + Rc)\} \tag{22.23}$$
We do not present a proof, which in any case is a straightforward extension of the uni-variate GST (Theorem 22.4). We might also remark that the multi-variate Gaussian Shift Theorem above is the key to pricing many multi-period, multi-asset exotic options, a topic beyond the scope of this exposition. For the bi-variate case under consideration, recall that the correlation matrix has the simple two-dimensional representation

$$R = \begin{pmatrix} 1 & \rho\\ \rho & 1 \end{pmatrix}.$$

22.4.1.1 Parity Relations

Second-order asset and bond binaries satisfy the parity relations

$$\begin{aligned}
A^{+s_2}_{\xi_1\xi_2}(x,t) + A^{-s_2}_{\xi_1\xi_2}(x,t) &= A^{s_2}_{\xi_2}(x,t;T_2)\\
B^{+s_2}_{\xi_1\xi_2}(x,t) + B^{-s_2}_{\xi_1\xi_2}(x,t) &= B^{s_2}_{\xi_2}(x,t;T_2)
\end{aligned} \tag{22.24}$$
The terms on the right are first-order asset and bond binaries which expire at time T2 .
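Second-order binaries need a bi-variate normal distribution function, which the Python standard library does not provide. The sketch below therefore substitutes a simple quadrature: it uses the conditional form $N(a,b;\rho) = \int_{-\infty}^{a}\varphi(z)\,N\bigl((b-\rho z)/\sqrt{1-\rho^2}\bigr)\,dz$, with $\varphi$ the standard normal pdf, evaluated by Simpson's rule on a truncated range. This is an illustrative choice, not the method of the text, and production code would normally use a dedicated routine. All function names and parameters are assumptions; the check is the parity relation (22.24), with $\rho = \sqrt{\tau_1/\tau_2}$.

```python
import math


def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def bvn_cdf(a, b, rho, n=2000, lo=-10.0):
    """N(a, b; rho) = P(Z1 < a, Z2 < b) for |rho| < 1, by Simpson's rule (n even)."""
    a = min(a, 10.0)  # the integrand is negligible beyond 10 standard deviations
    if a <= lo:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    h = (a - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * norm_pdf(z) * norm_cdf((b - rho * z) / s)
    return total * h / 3.0


def d_pair(x, xi, r, q, sigma, tau):
    sq = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma**2) * tau) / sq
    return d, d - sq


def asset_binary2(s1, s2, x, xi1, xi2, r, q, sigma, tau1, tau2):
    """Second-order asset binary of (22.22); s1, s2 are +1 or -1."""
    rho = math.sqrt(tau1 / tau2)
    d1, _ = d_pair(x, xi1, r, q, sigma, tau1)
    d2, _ = d_pair(x, xi2, r, q, sigma, tau2)
    return x * math.exp(-q * tau2) * bvn_cdf(s1 * d1, s2 * d2, s1 * s2 * rho)


def bond_binary2(s1, s2, x, xi1, xi2, r, q, sigma, tau1, tau2):
    """Second-order bond binary of (22.22); s1, s2 are +1 or -1."""
    rho = math.sqrt(tau1 / tau2)
    _, d1p = d_pair(x, xi1, r, q, sigma, tau1)
    _, d2p = d_pair(x, xi2, r, q, sigma, tau2)
    return math.exp(-r * tau2) * bvn_cdf(s1 * d1p, s2 * d2p, s1 * s2 * rho)


# Illustrative parameters (assumed).
p = dict(x=100.0, xi1=95.0, xi2=105.0, r=0.05, q=0.02, sigma=0.2, tau1=0.5, tau2=1.0)
d2, d2p = d_pair(p["x"], p["xi2"], p["r"], p["q"], p["sigma"], p["tau2"])
asset_sum = asset_binary2(+1, +1, **p) + asset_binary2(-1, +1, **p)
asset_first_order = p["x"] * math.exp(-p["q"] * p["tau2"]) * norm_cdf(d2)
bond_sum = bond_binary2(+1, -1, **p) + bond_binary2(-1, -1, **p)
bond_first_order = math.exp(-p["r"] * p["tau2"]) * norm_cdf(-d2p)
print(f"asset parity gap {asset_sum - asset_first_order:.2e}")
print(f"bond parity gap  {bond_sum - bond_first_order:.2e}")
```

Summing over the sign at $T_1$ removes the first exercise condition, so each sum reduces to a first-order binary expiring at $T_2$.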
22.4.2 Binary Calls and Puts
It is time to extend our list of elementary contracts. A second-order binary call is a contract which gives its holder at time $T_1$ a strike $k$, expiry $T_2$ European call option, but only if the asset price at $T_1$ is above (for an up-type) or below (for a down-type) a given exercise price $h$. The payoff at time $T_1$ is therefore

$$C^s_{h,k}(x_1, T_1) = C_k(x_1, T_1; T_2)\,I(sx_1 > sh) \tag{22.25}$$

and $s = \pm$ as usual. However, since we know the payoff of the European call option at time $T_2$ is $(x_2 - k)^+$, we get the alternative binary call payoff at $T_2$ as

$$\begin{aligned}
C^s_{h,k}(x_2, T_2) &= (x_2 - k)^+\,I(sx_1 > sh) = (x_2 - k)\,I(x_2 > k)\,I(sx_1 > sh)\\
&= A^{s+}_{hk}(x_2, T_2) - kB^{s+}_{hk}(x_2, T_2)
\end{aligned}$$
That is, a portfolio of second-order asset and bond binaries. It follows from the Principle of Static Replication that for any time $t < T_1 < T_2$,

$$C^s_{h,k}(x,t) = A^{s+}_{hk}(x,t) - kB^{s+}_{hk}(x,t) \tag{22.26}$$

A second-order binary put can be obtained in the same way and is given by

$$P^s_{h,k}(x,t) = kB^{s-}_{hk}(x,t) - A^{s-}_{hk}(x,t) \tag{22.27}$$
22.4.2.1 Parity Relations

For all $t < T_1 < T_2$,

$$\begin{aligned}
C^+_{h,k}(x,t) + C^-_{h,k}(x,t) &= C_k(x,t;T_2)\\
P^+_{h,k}(x,t) + P^-_{h,k}(x,t) &= P_k(x,t;T_2)
\end{aligned} \tag{22.28}$$

22.4.3 Compound Options
There are four basic compound options, generally referred to as a call-on-call, call-on-put, put-on-call and put-on-put. We show how to price the first of these by representing its equivalent time $T_1$ payoff as a portfolio of elementary contracts. To fix the parameters, assume the holder has the right to buy for $a$ dollars at time $T_1$ a call option of strike price $k$ and expiry $T_2$. Then the payoff at time $T_1$ is given by

$$V^{cc}(x, T_1) = [C_k(x, T_1; T_2) - a]^+ \tag{22.29}$$
To proceed further we need to 'remove' the plus function. This is done as follows. Let $x = h$ solve the equation $C_k(x, T_1; T_2) = a$. Since $C_k(x)$ is a positive monotonic increasing function of $x$, the equation has a unique solution. This value of $h$ depends only on fixed parameters $(a, k, T_1, T_2)$ and not on the current time. Hence the calculation, which must be carried out numerically in practice, need only be done once. We can then rewrite the time $T_1$ payoff as

$$V^{cc}(x, T_1) = [C_k(x, T_1; T_2) - a]\,I(x > h)$$

since for $x > h$ the holder would exercise the compound option, while for $x < h$ the compound option would simply expire worthless. The payoff is now seen from (22.25) to be a portfolio of two elementary contracts: an up-type second-order binary call and a short position in $a$ up-type first-order bond binaries. The Principle of Static Replication allows us to write down the present value immediately:

$$V^{cc}(x,t) = C^+_{h,k}(x,t) - aB^+_h(x,t;T_1) \tag{22.30}$$
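The two numerical steps described here, solving $C_k(h, T_1; T_2) = a$ for $h$ and then assembling (22.30) from binaries, can be sketched as follows. Bisection is used for the root because $C_k$ is monotonic in $x$; the bi-variate normal is evaluated by a simple Simpson quadrature, which is an illustrative substitute for a library routine. The second-order binaries are taken from (22.22) with $\xi_1 = h$, $\xi_2 = k$ and $\rho = \sqrt{\tau_1/\tau_2}$. All names, brackets and parameter values are assumptions.

```python
import math


def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def bvn_cdf(a, b, rho, n=2000, lo=-10.0):
    """P(Z1 < a, Z2 < b) for correlation |rho| < 1, by Simpson's rule (n even)."""
    a = min(a, 10.0)
    if a <= lo:
        return 0.0
    s, h, total = math.sqrt(1.0 - rho * rho), (a - lo) / n, 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += w * pdf * norm_cdf((b - rho * z) / s)
    return total * h / 3.0


def d_pair(x, xi, r, q, sigma, tau):
    sq = sigma * math.sqrt(tau)
    d = (math.log(x / xi) + (r - q + 0.5 * sigma**2) * tau) / sq
    return d, d - sq


def bs_call(x, k, r, q, sigma, tau):
    d, dp = d_pair(x, k, r, q, sigma, tau)
    return x * math.exp(-q * tau) * norm_cdf(d) - k * math.exp(-r * tau) * norm_cdf(dp)


def critical_price(a, k, r, q, sigma, tau12, lo=1e-6, hi=1e6, tol=1e-10):
    """Bisection for h with C_k(h, T1; T2) = a, where tau12 = T2 - T1."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(mid, k, r, q, sigma, tau12) < a:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break
    return 0.5 * (lo + hi)


def call_on_call(x, a, k, r, q, sigma, tau1, tau2):
    """V^cc = A^{++}_{hk} - k B^{++}_{hk} - a B^+_h(T1), following (22.30) and (22.26)."""
    h = critical_price(a, k, r, q, sigma, tau2 - tau1)
    rho = math.sqrt(tau1 / tau2)
    dh, dhp = d_pair(x, h, r, q, sigma, tau1)  # exercise at T1 against h
    dk, dkp = d_pair(x, k, r, q, sigma, tau2)  # exercise at T2 against k
    return (x * math.exp(-q * tau2) * bvn_cdf(dh, dk, rho)
            - k * math.exp(-r * tau2) * bvn_cdf(dhp, dkp, rho)
            - a * math.exp(-r * tau1) * norm_cdf(dhp))


# Illustrative parameters (assumed): pay a = 5 at T1 for a strike-100 call expiring at T2.
x, a, k, r, q, sigma, tau1, tau2 = 100.0, 5.0, 100.0, 0.05, 0.0, 0.2, 0.5, 1.0
h = critical_price(a, k, r, q, sigma, tau2 - tau1)
v_cc = call_on_call(x, a, k, r, q, sigma, tau1, tau2)
c_T2 = bs_call(x, k, r, q, sigma, tau2)
print(f"critical price h = {h:.4f}, call-on-call = {v_cc:.4f}, underlying call = {c_T2:.4f}")
```

Since the $T_1$ payoff satisfies $C_k - a \le [C_k - a]^+ \le C_k$, the computed price should lie between $C_k(x,t;T_2) - ae^{-r\tau_1}$ and $C_k(x,t;T_2)$, which gives a quick sanity check on the implementation.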
When we write this price out in full, using Gaussian distribution functions, we get

$$V^{cc}(x,t) = xe^{-q\tau_2}N(d_h, d_k; \rho) - ke^{-r\tau_2}N(d'_h, d'_k; \rho) - ae^{-r\tau_1}N(d'_h) \tag{22.31}$$

a formula in complete agreement with that first published in (Geske 1979). Here $d_h, d'_h$ are evaluated with exercise price $h$ and $\tau_1$, and $d_k, d'_k$ with exercise price $k$ and $\tau_2$, as in Theorem 22.5. Formulae for the other compound options follow in similar fashion. We also note the following easily derived parity relations

$$\begin{aligned}
V^{cc}(x,t;T_1,T_2) - V^{pc}(x,t;T_1,T_2) &= C_k(x,t;T_2) - ae^{-r(T_1-t)}\\
V^{cp}(x,t;T_1,T_2) - V^{pp}(x,t;T_1,T_2) &= P_k(x,t;T_2) - ae^{-r(T_1-t)}
\end{aligned} \tag{22.32}$$

22.4.4 Chooser Options
Also known as 'as-you-like-it' options and first analysed in (Rubinstein 1991), these exotics give the holder at time $T_1$ the choice of either a European call option (of strike $k$, expiry $T_2$) or a European put option (of strike $h$, expiry $T_2$). The payoff at time $T_1$ is therefore

$$V(x, T_1) = \max\{C_k(x, T_1; T_2),\, P_h(x, T_1; T_2)\}$$

since the holder will choose the option with the greater value. Let $x = a$ be the unique solution of the equation $C_k(x, T_1; T_2) = P_h(x, T_1; T_2)$. The solution is unique since $C_k$ is a monotonic increasing function of $x$, while $P_h$ is a monotonic decreasing function of $x$. Clearly the holder will choose the call option if $x > a$, and the put option if $x < a$. We therefore re-write the payoff in the equivalent form

$$V(x, T_1) = C_k(x, T_1; T_2)\,I(x > a) + P_h(x, T_1; T_2)\,I(x < a) \tag{22.33}$$
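The crossover price $a$ has to be found numerically in general. Because $C_k - P_h$ is monotonic increasing in $x$, bisection is a natural choice; the sketch below is one such implementation, with function names, bracket and parameter values chosen for illustration. When the two strikes coincide, put-call parity gives the root in closed form, $a = ke^{-(r-q)(T_2-T_1)}$, which provides a check.

```python
import math


def norm_cdf(d):
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def d_pair(x, k, r, q, sigma, tau):
    sq = sigma * math.sqrt(tau)
    d = (math.log(x / k) + (r - q + 0.5 * sigma**2) * tau) / sq
    return d, d - sq


def bs_call(x, k, r, q, sigma, tau):
    d, dp = d_pair(x, k, r, q, sigma, tau)
    return x * math.exp(-q * tau) * norm_cdf(d) - k * math.exp(-r * tau) * norm_cdf(dp)


def bs_put(x, k, r, q, sigma, tau):
    d, dp = d_pair(x, k, r, q, sigma, tau)
    return k * math.exp(-r * tau) * norm_cdf(-dp) - x * math.exp(-q * tau) * norm_cdf(-d)


def chooser_crossover(k, h, r, q, sigma, tau12, lo=1e-6, hi=1e6, tol=1e-10):
    """Bisection for a with C_k(a, T1; T2) = P_h(a, T1; T2), where tau12 = T2 - T1."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        gap = bs_call(mid, k, r, q, sigma, tau12) - bs_put(mid, h, r, q, sigma, tau12)
        if gap < 0.0:
            lo = mid  # put still worth more: the root lies above mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break
    return 0.5 * (lo + hi)


# Illustrative parameters (assumed); equal strikes allow a closed-form check.
k, h, r, q, sigma, tau12 = 100.0, 100.0, 0.05, 0.02, 0.2, 0.5
a = chooser_crossover(k, h, r, q, sigma, tau12)
a_exact = k * math.exp(-(r - q) * tau12)
print(f"crossover a = {a:.6f}, closed form for equal strikes = {a_exact:.6f}")
```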
30.2.3 Terms and Conditions

The ZKA (Zentraler Kreditausschuss), as the central organisation of the five main banking associations ([87]), has developed standard contract terms and conditions ([15] Rn. 109) which differ only in the payment verification instrument ([17] § 8 Rn. 1; [24] § 55 Rn. 27), regulating the usage of PIN/TAN or HBCI verification. In these terms and conditions questions of liability are not specifically mentioned, but several obligations and duties concerning the secrecy of the payment verification instrument are imposed on the customer. The inclusion of general contract terms and conditions via internet has been discussed at length; according to the unanimous opinion there are no obstacles to their inclusion as long as the legal requirements of §§ 305 ff. BGB are observed. According to § 305 sec. 2 BGB the bank has to point out to the customer that the inclusion of terms and conditions is intended. Moreover, the bank has to enable the customer to take notice of the terms and conditions with reasonable effort (for all see [74] pp. 174 ff.). If the customer applies for a business connection via internet for the first time, the terms and conditions are sent to him on paper along with the contract.⁴
30.2.4 Consumer Protection – EU-Legislation and Directives Like in all legal areas European legislation has gained important influence on the legislation concerning internet banking. Especially, the directive concerning the distance marketing of consumer financial services5 and the directive on payment 4 5
See above under section 30.2.1. Directive 2002/65/EC of the European Parliament and of the Council of 23 September 2002 concerning the distance marketing of consumer financial services and amending Council Directive 90/619/EEC and Directives 97/7/EC and 98/27/EC, [90].
734
Gerald Spindler, Einar Recknagel
services in the internal market6 strongly affect the provision of banking services via the internet.
30.2.4.1 The Directive Concerning the Distance Marketing of Consumer Financial Services
The directive concerning the distance marketing of consumer financial services was adopted on 23 September 2002 in order to harmonise the laws, regulations and administrative provisions of the Member States concerning the distance marketing of consumer financial services, granting EU-wide consumer protection. The directive was implemented in the German Civil Code in §§ 312 b ff. BGB in 2004.7
In general the directive (and its national implementations) grants the consumer certain rights to information prior to the conclusion of the distance contract, the communication of the contractual terms and conditions, and the right of withdrawal. The duties of information and the communication of the contractual terms and conditions are specified in Art. 3–5 of the Directive. Among other things, information concerning the supplier, the financial service and the distance contract shall be provided to the consumer prior to the conclusion of the contract. The information on the supplier shall contain its identity and its main business.8 Information about the financial service shall provide a description of the main characteristics of the service, the price and the relevant risks.9 The information concerning the distance contract shall inform the consumer of the existence or absence of a right of withdrawal and how to exercise it, the minimum duration of the contract, and any rights the parties may have to terminate the contract in due course.
The supplier shall provide the consumer with all contractual terms and conditions and the information mentioned above on paper or on another durable medium10 available and accessible to the consumer in due course, before the consumer is bound by any distance contract or offer.
The right of withdrawal grants the consumer a period of fourteen days to withdraw from the contract without penalty and without giving any reason. The withdrawal period begins either on the day of the conclusion of the distance contract or, if that is later, on the day on which the consumer receives the contractual terms and conditions and the information.
6 Proposal for a Directive of the European Parliament and of the Council on payment services in the internal market and amending Directives 97/7/EC, 2000/12/EC and 2002/65/EC (presented by the Commission) [72].
7 Gesetz zur Änderung der Vorschriften über Fernabsatzverträge bei Finanzdienstleistungen vom 02.12.2004, [23]; for a detailed summary of the implementation in the BGB see [74], pp. 106 ff.
8 Art. 3 Sec. 1 of the directive.
9 Art. 3 Sec. 2 of the directive.
10 See above Section 30.2.3.
30 Legal Aspects of Internet Banking in Germany
735
The right of withdrawal does not apply to financial services whose price depends on fluctuations in the financial market which may occur during the withdrawal period, such as services related to foreign exchange, money market instruments, transferable securities or comparable instruments, nor to contracts whose performance has been fully completed by both parties at the consumer’s request before the consumer exercises his right of withdrawal.
30.2.4.2 The Directive on Payment Services in the Internal Market
After issuing a recommendation concerning transactions by electronic payment instruments11, which set out guidelines on rights and obligations in electronic transactions, the Commission presented a proposal for a directive on payment services in the internal market,12 which has recently been adopted. With this proposal the Commission aims to create a single payment market in which improved economies of scale and competition reduce the cost of the payment system. A modern, technology-based economy needs an efficient and modern payment system. The Commission’s initiative focuses on electronic payments as an alternative to cost-intensive cash payments. The directive is intended to provide legal certainty on core rights and obligations for users and providers in the interests of a high level of consumer protection, improved efficiency, a higher level of automation and pan-European straight-through processing. Though the directive mainly aims at regulating trans-European payments, it will affect domestic payments as well.
11 Commission Recommendation of 30 July 1997 concerning transactions by electronic payment instruments and in particular the relationship between issuer and holder [89]; the recommendation is not legally binding.
12 Proposal for a Directive of the European Parliament and of the Council on payment services in the internal market and amending Directives 97/7/EC, 2000/12/EC and 2002/65/EC; COM(2005) 603 (final) [72].
The directive specifies the obligations of the payment service user (Art. 46) and the obligations of the payment service provider (Art. 47) in relation to payment verification instruments. The payment service user is obliged to use a payment verification instrument in accordance with the terms governing the issuing and use of the instrument (Art. 46 lit. a)) and to notify the payment service provider, or the entity specified by the latter, without undue delay on becoming aware of loss, theft or misappropriation of the payment verification instrument or of its unauthorised use (Art. 46 lit. b)). In addition the user shall, as soon as he receives a payment verification instrument, take all reasonable steps to keep its security features safe.
Correspondingly the payment service provider is obliged to make sure that the personalised security features of a payment verification instrument are not accessible to parties other than the holder of the payment verification instrument (Art. 47 lit. a)); to refrain from sending an unsolicited payment verification instrument, except where a payment verification instrument already held by the payment service user is to be replaced because it has expired or because of the need to add or change security features (Art. 47 lit. b)); and to ensure that appropriate means are available at all times to enable the payment service user to make a notification pursuant to Art. 46 lit. b) (Art. 47 lit. c)).
Of particular interest is the implementation of prima facie evidence in Art. 48, which requires Member States to oblige the payment service provider, where a payment service user denies having authorised a completed payment transaction, to provide at least evidence that the payment transaction was authenticated, accurately recorded, entered in the accounts and not affected by a technical breakdown or some other deficiency. If, after this evidence has been given, the payment service user continues to deny having authorised the payment transaction, he shall provide information or elements to allow the presumption that he could not have authorised the payment transaction and that he did not act fraudulently or with gross negligence with regard to his abovementioned obligations. For the purposes of rebutting this presumption, the use of a payment verification instrument recorded by the payment service provider shall not, of itself, be sufficient to establish either that the payment was authorised by the payment service user or that the payment service user acted fraudulently or with gross negligence.
Art. 50 stipulates the user’s liability for losses in respect of unauthorised payment transactions. The payment service user shall bear the loss resulting from the use of a lost or stolen payment verification instrument and occurring before he has fulfilled his obligation to notify his payment service provider (Art. 46 lit. b)), but only up to a maximum of € 150. The payment service user shall bear all losses from unauthorised transactions if he incurred them by acting fraudulently or with gross negligence with regard to his obligations.
In such cases, the maximum amount of € 150 shall not apply.
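The allocation of loss under Art. 50 as summarised above can be expressed as a short worked example. This is only a sketch of the rule as described in the text, not a statement of the directive's final wording; the function name user_share_of_loss and its parameters are hypothetical, and it considers only losses that occur before the user has notified the provider.

```python
LIABILITY_CAP_EUR = 150.0  # cap named in the text for Art. 50 of the proposal


def user_share_of_loss(loss_before_notification, fraud_or_gross_negligence):
    """Return the part of an unauthorised-transaction loss borne by the user.

    loss_before_notification: loss in euros that occurred before the user
        notified the provider (assumption: later losses are out of scope here).
    fraud_or_gross_negligence: True if the user acted fraudulently or with
        gross negligence with regard to his obligations.
    """
    if loss_before_notification < 0:
        raise ValueError("loss must be non-negative")
    if fraud_or_gross_negligence:
        # The cap does not apply: the user bears the full loss.
        return loss_before_notification
    # Otherwise the user's share is limited to the cap.
    return min(loss_before_notification, LIABILITY_CAP_EUR)
```

For a loss of € 1,000 before notification, the user would bear € 150 in the ordinary case and the full € 1,000 in a case of fraud or gross negligence.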
30.3 Liability – Duties and Obligations of Bank and Customer
The legal discussion currently focuses on the attribution of liability where the customer denies having placed an order or initiated a payment transaction, claiming that a third party has used the means of identification without the user’s consent or knowledge. While the substantive legal situation is quite clear, the procedural problems concerning the burden of proof are immense.
30.3.1 Phishing and Its Variations – A Case of Identity
Phishing attacks use both social engineering and technical subterfuge to steal consumers’ personal identity data and financial account credentials. Social-engineering schemes use “spoofed” e-mails to lead consumers to counterfeit websites designed to trick recipients into divulging financial data such as credit card numbers, account usernames and passwords. By hijacking the brand names of banks, e-retailers and credit card companies, phishers often convince recipients to respond. Technical subterfuge schemes plant crimeware onto PCs to steal credentials directly, often using Trojan keylogger spyware. Pharming crimeware misdirects users to fraudulent sites or proxy servers, typically through DNS hijacking or poisoning, “harvesting” the consumer’s information on the fraudulent website. In a “man-in-the-middle attack” the attacker places himself between the sender and the recipient, accesses the traffic, modifies it (e.g., by changing the beneficiary of the transaction) and forwards it to the recipient, the bank.
Banks have reacted to these threats by modifying the verification instruments in an attempt to render phishing and pharming attacks ineffective. The basis of the verification instruments is the original and still widespread PIN/TAN system, in which the customer enters his personal identification number (PIN) and a freely chosen transaction number (TAN) into the browser window or the banking program. This identification technique operates without media conversion, as all communication channels run via the customer’s computer. This specifically increases the danger of trojan attacks, pharming and phishing, as the risk of being spied on increases. To counter these dangers, verification instruments with media conversion have been developed. The use of additional hardware (TAN generators) or the delivery of a TAN to a mobile phone is intended to prevent manipulated transactions. The safest internet banking system is the HBCI standard ([25]), in which the transaction is encrypted by means of a chip card and an electronic signature and is therefore considered safe against manipulation.
30.3.2 Legal Situation in Case of Phishing or Pharming
When a customer’s account is debited with a transaction amount resulting from phishing or pharming, the question arises whether and on which legal grounds the bank can claim the transferred amount from its customer. The possible basis for such a claim by the bank is a reimbursement of expenses (cf. 30.3.2.1) or a claim for damages (cf. 30.3.2.2).
30.3.2.1 Reimbursement of Expenses
A claim for reimbursement of expenses in accordance with §§ 670, 675, 676a BGB rests on a contractual basis and therefore requires a “transaction contract” between the bank and the customer concerning the relevant transaction. If a third party, the attacker, initiates a transaction by a phishing or pharming attack, no transaction contract between customer and bank is established, as the customer has not offered to conclude one. The bank cannot distinguish between a genuine and a falsified transaction order and will therefore argue that the transaction order, being verified by the payment verification instrument (PIN/TAN or HBCI), was initiated by the customer. If the customer denies having given the order, the bank bears the burden of proof for the “transaction contract”. If this proof cannot be supplied, even with the aid of prima facie evidence13, the bank cannot claim reimbursement of expenses.
Some scholars advocate a contractual obligation by apparent declaration of intent, as the incoming falsified order appears to have been given with the genuine means of identification by its rightful holder. This opinion is based on the concept of apparent authority ([24] § 55 Rn. 26; [31] p. 1684, F/35; [42] p. 408; [19] p. 1206; on apparent authority in general [71] § 173 Rn. 14 ff.), arguing that high technical security standards are applied in internet banking. “Apparent authority” is assumed when the repeated action of the agent is unknown to the representee due to a lack of diligence on the representee’s part and the contracting third party is not aware of this fact. The application of “apparent authority” in this area, however, has to be rejected, in accordance with the prevailing opinion ([16] p. 3315; [35] p. 353; [41] p. 24; [86] p. 1047; [44] pp. 145 f.). For the assumption of apparent authority it is necessary that the agent acts repeatedly over a certain period of time. In internet banking, on the other hand, it is assumed that the customer himself and not an agent is using the verification instrument ([14] Rn. 82; [77] Rn. 21 f.). Moreover, the notion of “apparent authority” requires that the acting third party usually acts within the sphere of the representee, so that it can be assumed that the representee has the power to control his would-be agent ([63] § 167, Rn. 31 ff.; [77] Rn. 25). None of these premises of apparent authority, however, is met in internet banking.
In the first place, the agent does not act several times, as the fraudulent transaction generally occurs only once ([77] Rn. 29); the traditional notion of “apparent authority”, however, requires such repeated action. Further, neither the “phisher” nor the “pharmer” usually acts within the sphere of the customer. These arguments demonstrate that the assumption of a transaction contract concluded by an agent with “apparent authority” does not match the circumstances under which the fraudulent order is given and executed. A transaction contract between customer and bank is not concluded when an unauthorised third party uses the customer’s means of verification. The question whether the “theft” of the verification instrument was due to the fault of the customer or to an insecure technical environment provided by the bank has to be evaluated in the context of claims for damages. Contributory negligence by the customer may also play an essential role, which allows for flexible handling.
30.3.2.2 Claims for Damages
A breach of duties by either the bank or the customer gives rise to a claim for damages by the other party. If the customer neglects the due diligence laid down in the online banking agreement and the general contract terms, he is liable to the bank for the resulting damages, §§ 280 Sec. 1, 241 Sec. 2 BGB. The terms and conditions concerning online banking do not contain a stipulation on liability;14 hence, the general rules of the Civil Code apply, § 276 BGB. Correspondingly, the bank is held liable if it neglects its duties and obligations. The exact assignment of obligations and duties to the customer and to the bank has not yet been developed by the courts.
13 See also Section 30.3.4 for burden of proof.
aa) Assignment of Duties and Obligations to Bank and to Customer
By offering internet banking services, the bank has to provide secure technical procedures for the relevant transactions, all the more so as the bank is the cheapest cost avoider concerning technical safeguards. These duties to maintain safety are imposed on the bank because of the risk it creates by opening up the internet banking service to its customers (for a comparable argument concerning the use of dialers see [10; 55]). The bank, owing to its personnel, financial and technical resources, can control the resulting risks better than its customers, all the more so as the bank is free to define the extent of the services available and the relevant technical precautions. The bank also has a broader view of the technical threats from the internet than the average user ([35] p. 356; [77] Rn. 32). Finally, the bank derives many financial advantages from internet banking. These services considerably reduce the cost of personnel, branches and back-office operations. Although internet banking provides advantages to the customer as well, such as independence from banking hours and reduced costs, the main benefits accrue to the bank.
This, however, cannot result in imposing all risks on the bank, as the bank cannot control the sphere of the customer. In particular, the customer’s computer may be the main source of danger, being exposed to virus infections or serving as a prime target for phishing and pharming attacks. Therefore, the customer also has to take security measures with due diligence.
A breach of these duties results in liability of the customer.
14 See above Section 30.3.
bb) Duties of the Bank
The bank is subject to several duties concerning the provision of online banking. It has to provide a secure technical environment, has duties of surveillance regarding technical and security-relevant developments, has to instruct and warn the customer, and has to prevent misuse of the system when detected. The failure of accessibility is discussed in a short excursus.
(a) Providing Secure Technical Environment by the Bank
A principal duty concerning liability is the provision of a secure technical environment, namely the bank server ([35] pp. 357 f.; [33] p. 217; [36] Rn. 812; [74] pp. 151, 206; [24] § 55 Rn. 19), which is derived from § 25a KWG (Kreditwesengesetz). Due to the continuing development of technology and newly evolving hazards, this duty has a dynamic character, as the bank is obliged to continuously improve the implemented software according to the acknowledged standards. The bank server has to be kept safe and secure, preventing intrusion by unauthorised third parties, virus attacks, etc. An insufficiently secured system breaches the duty of maintaining safety, resulting in liability for the ensuing damages ([37] p. 161, [32] p. 130; [74] p. 206). As technology advances, the bank has to implement the latest security measures on a regular basis. While the customer is responsible for securing his home PC15, the bank’s duty to maintain safety commences the moment the customer contacts the bank server. The bank is therefore responsible, on a contractual basis to its customers and on a tortious basis to every third party, for providing a secure server environment. Additionally, the bank is liable for the incorrect configuration of the server, the use of faulty software and the unauthorised changing or deleting of data by its employees.
Though complete security cannot be provided ([43] p. 163; [66]) due to the ever-changing risk potential on the internet, the bank should, to minimise the liability risk, always employ state-of-the-art technology ([35] p. 359; [74] pp. 107, 225). This adaptation to state-of-the-art technology has to be weighed against economic feasibility and the resulting security gain. Not everything that is technically possible can be demanded of the bank ([57] pp. 162, 437 f.; [62] § 823 Rn. 29). In this field parallels to product liability can be drawn ([77] Rn. 40). The manufacturer of a product is not liable for faults that were unknown to the state of the art of science and technology at the time the product was put into circulation, but he is obliged to take newly discovered problems into account in future production ([5; 7; 9; 62] § 823 Rn. 508, 602).
If a new threat to IT security becomes known to the bank, it is obliged to adjust its system to counter this danger ([35] p. 359; [74] pp. 47 f.). This may make it necessary to provide new or modified verification instruments in due time ([35] p. 357). The discussion concerning the security of the plain PIN/TAN procedure has already started, with the argument that this method is no longer state of the art ([35] p. 357; [77] Rn. 40 f.). This method is exposed to severe attacks by malware secretly installed on the customer’s PC. As a reaction to these threats banks have already developed variations of the PIN/TAN procedure involving additional hardware (TAN generator), the communication of a TAN to the mobile phone (mTAN) or other means of communicating with the customer without using the “PC banking channel” ([77] Rn. 42). At present, the HBCI procedure16 is considered the most secure ([77] Rn. 43). This, however, does not oblige the bank to change its complete verification system from modified PIN/TAN procedures to HBCI. Even if HBCI provides the highest level of security, the new PIN/TAN methods have proven to be secure against the presently known threats ([35] p. 359; [79] pp. 212 f.). Furthermore, HBCI is not widely accepted by customers, because it requires buying additional technical equipment (a smartcard reader) and ties the customer to the home PC, so that internet banking cannot be used from the place of work.
15 At least to a certain degree, see under Section 30.3.2.2 cc) (a).
16 Home-Banking-Computer-Interface, also known as FinTS.
(b) Excursus: Providing Access to the System of the Bank
Besides the necessity of providing a secure infrastructure, the question of accessibility presents a serious problem within a business connection based on communication via the internet. If the customer is unable to place orders or transfers due to malfunctions and server breakdowns, this may result in liability of the bank, for instance if a stock order is not executed in due time ([26] p. 307; [74] p. 207). Therefore, stable and reliable access to the bank server is of crucial importance ([13] p. 141; [38] p. 43; [18] p. 1444; [76] p. 127), although, owing to the complexity of a server system, a system failure is always possible and the bank cannot guarantee 100 percent access at all times. Obviously not every interruption of accessibility can result in liability, e.g. when the server is down for maintenance17 or a sudden failure interrupts the services until a backup server is started. For such short-term interruptions a breach of duty cannot be assumed unless the banking services remain inaccessible for more than one hour ([2] p. 1541; [3] p. 264; [39] p. 224; [74] pp. 211 ff.). Concerning contributory negligence, courts have to take into account other means available to the customer for contacting the bank ([56] p. 1591; [51]; for the case of non-accessibility and the resulting liability see also [54]). The claim for breach of duty depends on the question of fault, § 276 BGB. Strict liability can only be assumed if specifically accepted by the bank, for example in its terms and conditions ([74] p. 212). Mere advertisements, on the other hand, are not sufficient to create strict liability.
(c) Duties of Surveillance and Monitoring
Besides the duty of surveillance concerning its own system, the bank also has to monitor, as far as possible, security leaks in the customer’s sphere. Although this does not amount to controlling the customer’s PC, the bank nonetheless has to react to threats targeted specifically at the customer. As the bank has superior knowledge18 concerning the threats arising from pharming or phishing, for example, it has to react in order to neutralise or minimise the relevant dangers ([77] Rn. 45). By analogy with the manufacturer’s product monitoring obligation ([78] § 823 Rn. 511), one can distinguish between an active and a passive surveillance obligation. If misuse and events of damage become known to the bank, it is obliged to investigate. In addition, the bank is required, to a certain extent, to keep itself informed through the media and the internet about newly evolving threats, e.g. Windows security exploits, active viruses etc. The information gained has to be passed on to the customer, depending on the seriousness of the threat, via the website, by flyer or even by mail. Warnings by e-mail, however, are not advisable, as they may be suspected of being a phishing attack.
17 Generally this maintenance work occurs at night or on weekends when demand is low.
18 See above Section 30.3.2.2 aa).
(d) Duties of Instruction and Warning
The bank is obliged to provide the customer with proper instructions on the internet banking system before initial use ([35] p. 356; [33] p. 218; [14] Rn. 65 ff; [15] Rn. 167; [20] Rn. 527 ff.; [24] § 55 Rn. 19 ff.; [74] pp. 220 ff.). Especially for the technically inexperienced user, the need for information and instruction should not be underestimated. On conclusion of the internet banking contract the customer has to be provided with information on the use ([14] Rn. 66) and the dangers of abuse ([14] Rn. 67; [74] p. 220; [24] § 55 Rn. 20). If the bank offers different systems of verification, e.g. PIN/TAN or HBCI, it has to give its customer the choice and provide information on the advantages and disadvantages of the different systems, for example by informing the customer about the higher hardware-related costs of HBCI. If the bank gains knowledge of new risks that endanger the security of the system, it is obliged to warn its customers about the threat. This warning duty correlates with the abovementioned duty of surveillance.
The information can be provided in print, in the terms and conditions and in the online user guidance on the bank’s website19 ([35] p. 357; [14] Rn. 67; [74] p. 220; [24] § 55 Rn. 21; [77] Rn. 48). As pop-up windows are often blocked by the browser, instructions should not be provided this way ([35] p. 357; [77] Rn. 48). Some doubts remain as to whether presenting the customer’s duties of care in the terms and conditions is sufficient, as “fine print” is not always read by the customer ([20] Rn. 527 q; [77] Rn. 48). Such negligence on the part of the customer, who is also aware of the need to act with due care concerning his bank or credit cards, cannot, however, exculpate him ([14] Rn. 67; [74] p. 220; [28] pp. 251, 260 f.). To guarantee sufficient instruction, website-based instruction is advisable.
Instructions sent via e-mail, on the other hand, should be dispensed with in order to avoid e-mail contact with the customer. This principle should be adopted to prevent phishing attacks. A bank should therefore regularly point out (on its website, for example) that it does not contact its customers via e-mail ([35] p. 357; [77] Rn. 48).
(e) Duty to Prevent Misuse
If the bank gains knowledge of abusive transactions, it has to ensure that no further unauthorised transactions can be executed ([33] p. 218). Not every suspected misuse entitles the bank to bar the internet account,20 but the customer should at least be informed by the bank. In addition, the bank has to provide a system which enables the customer to bar access to his account on his own initiative.21
19 The web presentation of several banks has been tested for security and ease of use by the SIT Fraunhofer institute, see [70].
20 Such a right also cannot be stipulated in the terms and conditions; see [67].
cc) Duties of the Customer
The extent to which duties of care concerning IT-specific security (phishing mails, installation of virus protection and firewalls etc.) can be imposed on the customer is still being discussed intensively. Obligations arising from the general duties that prevent tortious liability have to be observed within a contractual relationship as well, § 241 sec. 2 BGB ([71] § 241, Rn. 28; [61] § 280 Rn. 104; [4] § 251 Rn. 92). Within the contractual relationship between a customer and the bank, observing these duties can be regarded as a minimum standard ([77] Rn. 56). The question remains whether the customer’s duties of care are heightened within the internet banking relationship ([33] p. 217). Heightened duties of care could be supported by the fact that, given the intense discussion of the phishing problem in the media and the related information distributed by the bank, a certain awareness of these dangers on the part of the customer can be assumed ([33] p. 217). It is, however, questionable whether mere knowledge of existing dangers, without the technical knowledge or ability to counter them, can result in enhanced duties. Unlike the duties concerning secrecy and safe custody of the verification instruments, countermeasures against “internet-related threats” are unknown to the average customer. The technical abilities of the average customer therefore define the range within which security-related care can be expected ([79] pp. 178 ff.; [74] p. 225). If a threat which can only be controlled by an “above-average user” leads to damage, it has to be considered a risk to be borne by the bank. This is justified because the internet banking offer is not limited to the experienced user but is directed at all customers regardless of their internet experience ([79] pp. 178 ff.; [74] p. 225).
To determine the customer’s duties of care concerning a threat, the customer’s awareness and the reasonable technical and financial countermeasures have to be taken into account. Commonly known precautions, such as deleting phishing mails, not opening links contained in these e-mails and not opening attachments from unknown senders, can be expected of the customer once the bank has given instructions ([77] Rn. 59). A certain browser or firewall configuration of the customer’s PC, however, cannot be expected.
(a) Duty to Protect the Personal Computer
Courts acknowledge a general duty to install security updates and to employ a virus scanner, as these precautions are known to the common user as well. Measures going beyond these usual countermeasures, such as installing a firewall or defining a certain browser configuration, cannot be expected of the common internet user (different opinion: [33] p. 217) and therefore cannot define obligations leading to a breach of duty. Defining obligations concerning the protection of the private computer becomes pointless if the customer does not use internet banking from his home PC but from work. Requirements regarding browser configuration usually cannot be applied there, as the customer generally does not possess the administrative rights needed to configure the computer or to install a firewall or security programs in the office. It should also be taken into account that an additional threat may arise from the working environment and the easier access to the computer by others. A skilled user has no difficulty installing a keylogger or other malware on the customer’s work PC to keep track of visited websites and used passwords. A breach of obligation by the customer could only be assumed if it was clear to him that the work PC was accessible to third parties and no additional security measures had been installed.
21 See also Art. 47 of the proposal for a directive on payment services in the internal market.
(b) Duties Concerning the Storage of Verification Instruments and Emails
When processing banking transactions online, certain security measures can be expected of the customer. This means that PIN and TAN codes should not be stored on the hard drive of the computer and that the signature card (HBCI) should be removed from the card reader after the transaction has been processed ([74] p. 221). In addition, the usual duties concerning the handling of unknown e-mail attachments apply. After sufficient warning notices by the bank, the customer can be expected not to respond to phishing mails even if they appear to have been sent by the customer’s bank. Moreover, the customer should not enter PIN and/or TAN codes on websites opened via links contained in a received e-mail ([33] p. 217; [74] p. 222).
Given that phishing has been widely discussed in the media and that all banking institutions regularly issue warnings, these simple rules of conduct should be known to every internet user, regardless of any specialised computer knowledge. Whether a breach of duty by the customer can be assumed when the bank itself contacts the customer in other business matters is at least questionable, because under these circumstances the customer may not be able to determine whether a certain mail originates from the bank or not ([16] p. 3314). It is therefore advisable for the bank to point out that it will never ask for verification instruments via email (or in any other way) or, even safer, to abstain entirely from contacting its customers via email.
(c) Duties While Processing Transactions
Whether the average customer is able to verify the security characteristics of the website (correct URL, SSL lock symbol, information on the certificate, digital fingerprint) seems questionable (in favour: [33] p. 216; doubting: [35] p. 356). At least at present these security checks, although easy to apply, are not common knowledge ([77] Rn. 64).
30 Legal Aspects of Internet Banking in Germany
(d) Duties in Case of Misuse
In case of certain or suspected misuse, the customer is obliged to inform the bank and to have access to his accounts blocked ([16] p. 3314; [83] p. 339; [85] Rn. 19/86; [74] p. 223). If the customer fails to inform the bank and the bank could have prevented the unauthorised transaction had the information reached it in due time, the customer is liable for the resulting damage. In order to prevent further misuse, the customer must change his PIN after recognising its unauthorised use ([85] Rn. 16/68; [74] p. 224). A duty to change the PIN regularly, on the other hand, does not exist: given the growing number of PIN- and password-based applications in everyday life, the common user would be overburdened ([79] p. 218; [85] Rn. 19/70; [74] p. 224).
dd) Default of the Customer, § 280 Abs. 1 BGB
The terms and conditions for internet banking do not contain a specific provision on liability. Therefore, the statutory liability applies, so that the customer is liable not only for gross negligence but for any negligence ([35] p. 354). § 280 sec. 1 sentence 2 BGB, however, presumes negligence if the contracting party has breached its duties; this presumption can be rebutted, which in practice is rarely the case. For this reason, the exact definition of the duties to be fulfilled by the customer is mandatory ([77] Rn. 71).
ee) Contributory Negligence by the Bank, § 254 BGB
The customer may defend himself against claims of the bank by giving evidence of contributory negligence on the part of the bank, in particular if insufficient security measures were the cause of the damage ([33] p. 219; [16] p. 3315). A concurrent cause could be assumed when the bank hinders the recognition of manipulated websites by frequently changing the design of its website or by employing URLs that are difficult to recognise and not self-explanatory. The burden of proof for the contributory negligence of the bank lies with the customer ([8; 35] p. 360; [82] § 254 Rn. 68).
30.3.2.3 Conclusion
In case of misuse of the verification instruments, the bank may base its claims against the customer on § 670 BGB (reimbursement of expenses) or on claims for damages under §§ 280, 241 sec. 2 BGB. If the bank cannot prove the conclusion of a (transaction) contract (on the burden of proof see the following section), it may only claim damages for a breach of duty by the customer based on insecure handling of the verification instruments. The exact definition of the relevant duties of the customer is essential. Duties of the bank (duties of information, instruction and warning) correlate with duties of the customer. The requirements of due care placed on the customer when using internet banking are no higher than those of ordinary internet usage, measured by the average internet user.
Gerald Spindler, Einar Recknagel
If a breach of duty by the customer can be assumed, possible contributory negligence by the bank has to be taken into account.
30.3.3 Procedural Law and the Burden of Proof
The bank has to state and prove its claim for either reimbursement of expenses for the executed transaction or damages if the customer denies having ordered a payment transaction and claims that a third party used the means of identification without the user’s consent or knowledge. If the transaction was initiated using the correct verification instrument, the bank will have severe problems in proving the facts. Therefore, alleviations of the burden of proof are discussed: either a complete shift in the burden of proof or the use of prima facie evidence.
30.3.3.1 Shift in the Burden of Proof Towards the Customer
A shift in the burden of proof that would oblige the customer to prove that he did not initiate the disputed transaction has no legal basis in German law and would also not be appropriate ([77] Rn. 78). Although the bank has no insight whatsoever into the customer’s sphere and the customer is able to ensure safe handling of the verification instruments, additional risks remain in the internet communication between the customer’s PC and the bank server which neither party controls. These uncontrollable risks are the main reason why a shift in the burden of proof has to be rejected ([74] p. 145; for internet auctions the courts refuse a shift in the burden of proof as well, [69; 68; 46; 45; 53]). Moreover, a complete shift in the burden of proof to the customer would not allow for an assessment of the actual technical state of the art and its development (i. e. new threats) ([75] p. 208). Therefore, a shift in the burden of proof in internet banking would not be appropriate ([59] p. 111; [74] p. 144; [73] p. 459).
30.3.3.2 Prima Facie Evidence
Most legal scholars favour leaving the burden of proof with the bank; however, the application of prima facie evidence is preferred, with reference to similar problems in the usage of bank cards (former ec-cards).22 Applied to internet banking, the prima facie evidence states that a transaction initiated using the correct verification instrument originates from the customer or an authorised third person or, if this can be rebutted by the customer, that the customer did not handle the verification instrument with due care.
22 With regard to bank cards, the courts have applied prima facie evidence for quite a long time, stating that use of the card with the correct PIN indicates prima facie that a) the cardholder initiated the transaction or b) the cardholder did not act with due care, making him liable for the resulting damage. Prevailing case law: [12; 34; 47; 48; 49; 50; 52; 1; 65]; see also [74] pp. 151 ff.
No court decisions on the application of prima facie evidence to internet banking have been issued yet.23 The prima facie rule applies when evidence is produced which allows for a presumption, based on facts, that a certain event very probably causes a specific result. Hence, applying this rule requires the existence of such a presumption in internet banking. If such a presumption can be established, it has to be examined whether it can be rebutted by giving evidence of a phishing or pharming attack, i. e. by stating an atypical cause.
aa) Presumption of Technical Security
The ground for either presumption (the customer has initiated the transaction or has acted negligently) is that, according to common experience, internet banking applications are generally technically secure. The prevailing opinion amongst legal scholars affirms this security with reference to the PIN/TAN or HBCI system employed. These verification instruments can only be “cracked” with a considerable amount of time and effort, which makes such an attack improbable from a technical and economic point of view ([83] p. 339; [86] pp. 1047, 1050; [41] p. 24; [84] Rn. 19/17; [74] p. 150; [33] p. 218; [77] Rn. 90). The presumption of technical security is therefore the basis for the high probability that the customer or an authorised third party has initiated the transaction or, alternatively, that the customer has handled his verification instrument negligently, resulting in his liability. A prima facie rule is therefore justified as long as a high technical security standard can be affirmed ([77] Rn. 91). The HBCI system (being signature-based) is presently considered secure, in contrast to the PIN/TAN system. While it is agreed that the danger of PINs and/or TANs being calculated or guessed is negligible ([35] p. 369; [86] p. 1047), security hazards emerge from the danger of their being spied out by phishing, pharming or Trojans.
Although the courts have recognised the possibility of “being spied out” as a means of rebutting the presumption of security in the use of bank cards, they did not rule out the use of prima facie evidence in general ([11]). A direct transfer to internet banking is, however, not feasible. Although “physical” spying, i. e. looking over the shoulder of the customer and noting PIN and TANs, cannot be totally excluded, it has to be considered rather improbable. The real threat lies in the virtual attack: phishing, pharming or the use of Trojans. Taking these threats into account, the “simple” PIN/TAN procedure does not protect the user at a level which permits the assumption of adequate security. Considering the current state of the art, a distinction has to be made between verification systems with and without media conversion.
23 Although the use of PIN/TAN instruments is mentioned as “secure” by [46].
(a) Verification Instruments Without Media Conversion
With regard to the new threats of phishing and pharming, the actual security level of the PIN/TAN system does not match the danger of being spied out, for example by means of phishing mails. Under these circumstances there are considerable doubts whether the prima facie rule can still be applied to the detriment of the customer ([77] Rn. 95 ff.). Moreover, customers cannot simply be held liable for acting without due care: protection against phishing24 and pharming is scarcely available to customers. Even if the customer applies all security measures, Trojans can remain undetected and spy out the verification data ([35] p. 356). Due to these dangers, the application of prima facie evidence to PIN/TAN systems without media conversion has to be questioned ([77] Rn. 95 ff.).
(b) Verification Instruments with Media Conversion
The high standard required for the presumption of security has to be assumed when the verification instrument is based on media conversion. If, for example, a TAN generator is used, a spied-out TAN is digitally linked to a certain transaction and cannot be used for another (fraudulent) transaction. The same applies when the TAN is transmitted by SMS to the customer’s mobile phone. Illegal use of the verification instruments is then only possible if the attacker not only spies out the PIN but also gains access to the additional hardware (TAN generator, mobile phone), which indicates insecure storage by the customer and gives way to the presumption of negligence.
(c) Conclusion
The prevailing opinion assumes technical security, including for the PIN/TAN system without media conversion, and burdens the customer with the full proof of having been spied out ([33] p. 219; [16] p. 3316). As the customer is obliged to keep the verification instruments secret, the illegal use of PIN and TAN indicates at least negligence on the part of the customer. However, it is still far from clear which role media conversion could play. Due to the uncontrollable dangers of being spied out, the presumption of security can only be maintained for systems with media conversion, while for systems without media conversion the full burden of proof is (re-)located to the bank ([77] Rn. 95, 101, 114).
24 Save “obvious” emails, for example those using incorrect grammar.
bb) Rebutting Prima Facie Evidence
When prima facie evidence is applied, the customer has to rebut it by presenting evidence which shows a substantiated possibility of an atypical event ([6; 16] p. 3317; [33] p. 218; [64] § 286 Rn. 23; [88], vor § 284, Rn. 29). Applied to internet banking, the customer has to state (and prove) a substantiated possibility that an unauthorised third party initiated the transaction. Even if the customer is able to prove an atypical event, he has to show in a second step that he did not handle his verification instruments negligently. Which requirements have to be met to rebut the prima facie rule has to be assessed by the court. However, courts have not yet had to decide upon these issues ([35] p. 359).25 If the customer rebuts the evidence by stating that he fell victim to a phishing attack, he should at least be able to produce the phishing mail in question. Whether the mail has to be considered as appearing genuine or as obviously faked would have to be decided in court. A pharming attack, however, raises more difficulties. Some legal scholars contend that, due to the difficulty of proving a pharming attack, it should be sufficient to rely on abstracts or media reports ([35] p. 360). However, the purely technical possibility without any connection to the case in question does not constitute counter-evidence at all.26
cc) Final Conclusion
The application of the prima facie rule to internet banking (still) has to be approved by the courts. Whereas systems such as HBCI or PIN/TAN systems combined with media conversion are undisputed, discussion has recently begun on the safety of PIN/TAN systems without media conversion. When the customer claims a phishing or pharming attack, it is absolutely necessary to save the corresponding mail and the (unchanged) PC system. The use of an antivirus program after a possible attack may lower the chances of rebutting the prima facie evidence.
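The argument made for verification instruments with media conversion rests on the TAN being bound to one specific transaction. The following is a minimal sketch of that idea only, not the scheme of any particular bank or TAN generator: it assumes a secret shared between the bank and the customer's device and derives the TAN from the transaction data with an HMAC. The function names, the choice of HMAC-SHA-256, the six-digit length and the sample account labels are all illustrative assumptions.

```python
import hashlib
import hmac


def transaction_tan(secret: bytes, recipient: str, amount_cents: int, digits: int = 6) -> str:
    """Derive a TAN from the transaction data (hypothetical scheme for illustration)."""
    message = f"{recipient}|{amount_cents}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    # Reduce the keyed hash to a short decimal code the customer can type in.
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


def bank_accepts(secret: bytes, recipient: str, amount_cents: int, tan: str) -> bool:
    """The bank recomputes the TAN for the transaction it actually received."""
    expected = transaction_tan(secret, recipient, amount_cents)
    return hmac.compare_digest(expected, tan)


# Assumed sample data: the secret is known only to the bank server and the TAN device.
secret = b"device-secret-known-to-bank-and-generator"
tan = transaction_tan(secret, "DE00-CUSTOMER-PAYEE", 25_000)

# The customer's own transfer, confirmed with the TAN generated for it.
genuine = bank_accepts(secret, "DE00-CUSTOMER-PAYEE", 25_000, tan)
# An attacker who spied out the TAN tries to confirm a different transfer with it.
replayed = bank_accepts(secret, "DE00-ATTACKER", 990_000, tan)
print(genuine, replayed)
```

In this sketch a spied-out TAN passes the check only for the transaction it was generated for. For any other recipient or amount the bank computes a different expected value, apart from the small collision chance inherent in a short code, so the attacker would also need the device holding the secret.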
25 The LG Stralsund assumed the prima facie evidence of a telephone bill to be rebutted by proof of the existence of a “Backdoor-explorer 32-Trojan” virus on the hard drive of the customer, [55].
26 Although the proof is hard to produce, it has to be demanded, as the similar situation in bank card cases shows, where the abstract possibility of downloading undefined programs “from the internet” is regularly stated in order to “prove” the hack of the PIN.
References
1. AG Frankfurt, NJW (Neue Juristische Wochenschrift) 1998, 687 2. Balzer, Peter, Rechtsfragen des Effektengeschäfts der Direktbanken, in: WM (Wertpapiermitteilungen Nr. IV) 2001, 1533 3. Balzer, Peter, Haftung von Direktbanken bei Nichterreichbarkeit, in: ZBB (Zeitschrift für Bankrecht und Bankwirtschaft) 2000, 258 4. Bamberger, Heinz Georg/Roth, Herbert, BGB, 2. ed. 5. BGH, NJW 1990, 906 (907 f.) 6. BGH, NJW 1991, 230 (231) 7. BGH, NJW 1994, 517 (519 f.) 8. BGH, NJW 1994, 3102 (3105) 9. BGH, NJW 1994, 3349 (3350) 10. BGH, MMR (MultiMedia und Recht) 2004, 308 (311) 11. BGH, NJW 2004, 3623 (3624) 12. BGH, WM 2004, 2309 13. BGHZ (Entscheidungssammlung des Bundesgerichtshofs in Zivilsachen) 146, 138 14. Bräutigam, Peter/Leupold, Andreas, Online-Handel, Chapter VII 15. Neumann, Dania/Bock, Christian, Zahlungsverkehr im Internet, 2004 16. Borges, Georg, Rechtsfragen des Phishing, in: NJW 2005, 3313 17. Derleder, Peter/Knops, Kai-Oliver/Bamberger, Heinz G., Handbuch zum deutschen und europäischen Bankrecht, 2003 18. Borsum, Wolfgang/Hoffmeister, Uwe, Bildschirmtext und Bankgeschäfte, in: BB (Betriebsberater) 1983, 1441 (1444) 19. Borsum, Wolfgang/Hoffmeister, Uwe, Rechtsgeschäftliches Handeln unberechtigter Personen mittels Bildschirmtext, in: NJW 1985, 1205 20. Canaris, Wilhelm, Bankvertragsrecht, 1988 21. Dörner, Heinrich, Rechtsgeschäfte im Internet, in: AcP (Archiv für die civilistische Praxis) 202, 363 22. Geis, Ivo, Die digitale Signatur, in: NJW 1997, 3000 23. Gesetz zur Änderung der Vorschriften über Fernabsatzverträge bei Finanzdienstleistungen vom 02.12.2004, BGBl. I 2004, p. 3102 24. Schimansky, Herbert/Bunte, Hermann-Josef/Lwowski, Hans-Jürgen, Bankrechtshandbuch, 2. ed., 2001 25. http://www.hbci-zka.de 26. Hartmann, in Hadding/Hopt/Schimansky, Entgeltklauseln in der Kreditwirtschaft und E-Commerce von Kreditinstituten – Bankrechtstag 2001 27. Heermann, Peter W., K&R (Kommunikation und Recht) 1999, 6 28. Hellner, Thorwald, Rechtsfragen des Zahlungsverkehrs unter besonderer Berücksichtigung des Bildschirmtextverfahrens, in: Festschrift für Winfried Werner, 1984 29. Hoeren, Thomas/Oberscheidt, Jörn, VuR (Verbraucher und Recht) 1999, 371 30. Hoffmann, NJW-Beil. zu 14/2001 31. Baumbach, Adolf/Hopt, Klaus, Handelsgesetzbuch, 32. ed., 2006 32. Janisch, Sonja, Online-Banking, 2001 33. Karper, Irene, DuD (Datenschutz und Datensicherheit) 2006, 215 34. KG, NJW 1992, 1051 (1052) 35. Kind/Werner, Rechte und Pflichten im Umgang mit PIN und TAN, in: CR (Computerrecht) 2006, 353 36. Koch, Robert, Versicherbarkeit von IT-Risiken, 2005 37. Köhler, Helmut, Die Problematik automatisierter Rechtsvorgänge, insbesondere von Willenserklärungen, in: AcP 182, 126 38. Köndgen, Johannes, Neue Entwicklungen im Bankhaftungsrecht, 1987 39. Krüger, Thomas/Bütter, Michael, Elektronische Willenserklärungen im Bankgeschäftsverkehr: Risiken des Online Banking zugl. Besprechung des Urteils des LG Nürnberg-Fürth vom 19.05.1999 in: WM 2001, 221
40. Kümpel, Siegfried, Bank- und Kapitalmarktrecht, 3. ed. 41. Kunst, Diana, Rechtliche Risiken des Internet-Banking, in: MMR-Beil. (MultiMedia und Recht-Beilage) 09/2001, 23 42. Lachmann, Jens-Peter, Ausgewählte Probleme aus dem Recht des Bildschirmtextes, in: NJW 1984, 405 43. Lange, Thomas, Internet-Banking, 1998 44. Langenbucher, Katja, Die Risikozuordnung im bargeldlosen Zahlungsverkehr, 2001 45. LG Bonn, MMR 2002, 225 (256) 46. LG Bonn, MMR 2004, 179 (180, 181) 47. LG Bonn, NJW-RR (Neue Juristische Wochenschrift-Rechtsprechungs-Report), 1995, 815 48. LG Darmstadt, WM 2000, 9121 (913) 49. LG Frankfurt, WM 1999, 1930 (1932 f.) 50. LG Hannover, WM 1998, 1123 51. LG Itzehoe, MMR 2001, 833 (833) 52. LG Köln, WM 1995, 976 (977) 53. LG Konstanz, MMR 2002, 835 (836) 54. LG Nürnberg-Fürth, VuR 2000, 173 55. LG Stralsund, MMR 2006, 487 (488) 56. Mackenthun, Thomas/Sonnenhol, Jürgen, Entgeltklauseln in der Kreditwirtschaft – e-Commerce von Kreditinstituten, Bericht über den Bankrechtstag am 29.06.2001 in Kiel, in: WM 2001, 1585 57. Marburger, Peter, Die Regeln der Technik im Recht, 1979 58. Mehrings, Josef, Vertragsschluss im Internet: Eine neue Herausforderung für das „alte“ BGB, in: MMR 1998, 30 59. Mellulis, Klaus-J., Zum Regelungsbedarf bei der elektronischen Willenserklärung, in: MDR (Monatsschrift für deutsches Recht) 1994, 109 60. Möller, Mirko, Rechtsfragen im Zusammenhang mit dem PostIdent-Verfahren, in: NJW 2005, 1605 61. Münchener Kommentar zum Bürgerlichen Gesetzbuch, 4. ed. 62. Münchener Kommentar zum Bürgerlichen Gesetzbuch, 4. ed. 63. Münchener Kommentar zum Bürgerlichen Gesetzbuch, 4. ed. 64. Musielak/Foerste, ZPO, 9. ed. 65. LG Osnabrück, NJW 1998, 688 66. OLG Hamburg, MMR 2003, 340 (340) 67. OLG Köln, CR 2000, 537 68. OLG Köln, MMR 2002, 813 69. OLG Naumburg, NJOZ (Neue Juristische Online-Zeitschrift) 2005, 2222 (2224) 70. Phishing: http://www.sit.fraunhofer.de/cms/media/pdfs/phishing.pdf 71. Palandt, BGB, 67. ed., 2008 72. Proposal for a Directive of the European Parliament and of the Council on payment services in the internal market and amending Directives 97/7/EC, 2000/12/EC and 2002/65/EC (presented by the Commission)
73. Roßnagel, Alexander, Auf dem Weg zu neuen Signaturregelungen – Die Novellierungsentwürfe für SigG, BGB und ZPO, in: MMR 2000, 451 74. Recknagel, Einar, Vertrag und Haftung beim Internet-Banking, 2005 75. Sieber, Stefanie/Nöding, Toralf, ZUM 2001, 199 76. Siebert, Lars Michael, Das Direktbankgeschäft, 1998 77. Spindler, Gerald, Online-Banking, Haftungsprobleme, 2006 – abstract presented at the “3. Tag des Bank- und Kapitalmarktrechts 2006” in Karlsruhe 78. Bamberger, Heinz Georg/Roth, Herbert, BGB, 2. ed. 79. Spindler in Hadding/Hopt/Schimansky, Entgeltklauseln in der Kreditwirtschaft und E-Commerce von Kreditinstituten – Bankrechtstag 2001 80. Trapp, Andreas, Zivilrechtliche Sicherheitsanforderungen an eCommerce, in: WM 2001, 1192 (1199) 81. Ultsch, Michael, Zugangsprobleme bei elektronischen Willenserklärungen – Dargestellt am Beispiel der Electronic Mail, in: NJW 1997, 3007 82. Bamberger, Heinz Georg/Roth, Herbert, BGB, 2. ed. 83. Werner, Stefan, Das Lastschriftenverfahren im Internet, in: MMR 1998, 338 84. Hellner, Thorwald/Steuer, Stephan, BuB (Bankrecht und Bankpraxis) 85. Hellner, Thorwald/Steuer, Stephan, BuB (Bankrecht und Bankpraxis) 86. Wiesgickl, Margareta, Rechtliche Aspekte des Online-Banking, in: WM 2000, 1039 87. http://www.zentraler-kreditausschuss.de/ 88. Zöller/Greger, ZPO, 26. ed. 89. (97/489/EC), Official Journal 1997, L 208, pp. 52 ff. 90. (2002/65/EC), Official Journal 2002, L 271, p. 16
CHAPTER 31 Challenges and Trends for Insurance Companies Markus Frosch, Joachim Lauterbach, Markus Warg
31.1 Starting Position
There are some 670 insurance companies1 operating in the German insurance market, which grew in 2004 by 1.6 percent in direct non-life/accident insurance, by 2.6 percent in life insurance and by 7.8 percent in health insurance. In such mature and highly competitive markets, policies, prices and commissions converge, so that in many segments insurance services can almost be regarded as standardised products from a customer perspective. Nevertheless, some insurance companies have been successful in recording growth far above the market average, and have done so organically. If one excludes the non-actuarial result (investment profits), the factors underlying this success must be sought in the actuarial result. The actuarial result is determined by premium income, benefits and costs. This article explains in detail the reasons why there are significant differences between business results and derives from these the challenges to be faced in the future.
The combined ratio is an important key figure used in reviewing the profitability of a single insurance contract, of all contracts with a customer, of a part of the business in force or of the total insurance business in force. This combined loss-expense ratio for a financial year is the sum of the gross loss ratio (losses paid out or reserved in the reporting year plus processing expenses and internal claim settlement costs), the administration expense ratio and the cost ratio for other expenses, each measured against the gross premiums earned. As soon as this ratio exceeds 100 percent, the transaction or business segment under review has suffered an actuarial loss. By comparing, over a period of several years, the trend of the combined ratio of a company’s business segments with the combined ratios of other insurance companies, important conclusions can be drawn with regard to the susceptibility to losses and consequently the quality of the business in force and the efficiency of management.2
1 cf. GDV, 2005 Statistical pocket book of the insurance industry, 2 number of insurance companies.
The reasons underlying positive actuarial results, i. e. combined ratios clearly below 100 percent, are that some insurance companies have been successful in achieving premium growth in high yield business sectors, the objective of almost all insurance companies, and at the same time in reducing costs. If the products and prices are standardised to a large extent, the right service strategy, competence and services offered become the decisive factors for the success of the sales partner with the customer. Multi-faceted interaction between strategy, sales and marketing, and management is required for this. The following points must be mastered in order to cover this spectrum optimally:
• First of all, it must be ensured that all resources are deployed in line with the company objectives. The business, IT and HR strategies must be balanced. One possible method is the Balanced Scorecard.
• If the core processes are in line with the company objectives, it is sensible to introduce self-controlling systems. An example is sales and marketing management based on contribution margins, which ensures that the incentive structures for the sales partners are in line with the company objectives.
• In a highly standardised market it is effective, with regard to customer perception, to distinguish the company from the competition. An example of a successful unique selling proposition is the introduction of “on demand insurance” by an Austrian insurance company.
• Competent and service-oriented advice is based on many processes that support sales and marketing. Internet-based access to primary data is becoming more important for the integration of the customer, sales and marketing, and management.
• Excellent services provided in such a complex environment result from the employment of excellent staff. The talent required in the view of the company must be identified, recruited and developed. Acquiring this management talent is probably the most formidable of these challenges.
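The combined ratio defined earlier in this section can be written out as a short calculation. The sketch below is one reading of that definition, with all cost components measured against gross premiums earned; the function name, parameter names and the sample figures (in EUR million) are assumptions made for illustration and are not taken from the source.

```python
def combined_ratio(losses, claim_settlement_costs, admin_expenses,
                   other_expenses, gross_premiums_earned):
    """Combined loss-expense ratio as a fraction of gross premiums earned."""
    if gross_premiums_earned <= 0:
        raise ValueError("gross premiums earned must be positive")
    # Gross loss ratio: losses paid or reserved plus claim settlement costs.
    loss_ratio = (losses + claim_settlement_costs) / gross_premiums_earned
    admin_ratio = admin_expenses / gross_premiums_earned
    other_ratio = other_expenses / gross_premiums_earned
    return loss_ratio + admin_ratio + other_ratio


def has_actuarial_loss(ratio):
    """A combined ratio above 100 percent means an actuarial loss."""
    return ratio > 1.0


# Two hypothetical business segments, each with 100 of gross premiums earned.
segment_a = combined_ratio(62, 6, 22, 5, 100)
segment_b = combined_ratio(80, 7, 20, 4, 100)

for name, ratio in (("A", segment_a), ("B", segment_b)):
    print(f"segment {name}: {ratio:.1%}, actuarial loss: {has_actuarial_loss(ratio)}")
```

With these assumed figures segment A comes to 95 percent and is actuarially profitable, while segment B comes to 111 percent and has suffered an actuarial loss. Tracking such values per segment over several years is the comparison the text describes.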
31.2 Challenges and Trends
31.2.1 Business and IT Strategy in the Balanced Scorecard
In keeping with the thinking and conduct of a prudent businessman, the development of the market, staff, processes and finances should be structured in a balanced manner and agreed upon; the Balanced Scorecard provides a systematic process geared to this.
2 cf. F. von Fürstenwerth, A. Weiss: Insurance Alphabet, 2001, page 143.
Figure 31.1. Balanced Scorecard
Naturally it is the duty of a company to generate a net income that is customary for the market; in the international arena a 15 percent return on equity capital is considered a normal return on business activities. Ultimate success cannot be achieved if financial targets are continually missed. The challenge is rather to set out the financial targets and the method by which they can be achieved in a transparent manner and to communicate them to the staff. The strategy regarding the positioning of the company and its business alignment, which ensures that the financial targets are achieved on a medium- and long-term basis, should form part of this Scorecard. The product portfolio and the business segments, but also market presence and the interaction with the customer, are reflected in the customer Scorecard, so that the costs and revenues contributed by the products or business segments to the bottom line are made transparent. Only on this basis can growth targets be properly specified, i. e. allocated to individual business segments, and financial targets consistently met. The targets for the processing area are then to be derived from the understanding of the finance and customer areas. Here the balancing act to be achieved in most cases becomes apparent: in concrete terms, reducing the administration expense ratio while simultaneously improving sales and marketing support or other services. The staff is responsible for the total process. Only if the staff responsible understand and support the company objectives and the actions taken to achieve them can success be achieved on a sustainable basis. Modern personnel policies, such as employment based on ability and incentive systems that reward the achievement of both company objectives and individual targets, are essential for this. In addition to its importance as a communication and management tool, the value of the Scorecard lies in its portrayal of dependencies in visual form. Growth in high yield sectors can only be achieved if these segments are known, i. e. detailed contribution margin accounting has been performed.
[Figure 31.2 shows a strategy map with four perspectives and their targets. Strategy + finance: U1 guarantee independence; U2 expand market position; U3 ROE 15 %. Customer + markets: K1 transparency in costs + income; K2 unique selling propositions; K3 sales managed based on contribution margin; K4 service offensive. Process + structures: P1 sales support; P2 reduction of admin costs; P3 improvement in degree of automation. Staff + partners: M1 HR process integration; M2 target-based salary; M3 talent management. Each target is marked 0, 1 or 2 in the figure.]
Figure 31.2. Dependencies in the Balanced Scorecard
The interdependencies are clarified in Figure 31.2. If a target of a return on equity of 15 percent is set, sales must then be managed on a contribution margin basis so that the overall target can be “broken down” into individual targets. Cost and revenue transparency is in turn a central requirement for managing sales on a contribution margin basis. Only on the basis of this transparency can a sales business plan be prepared whose implementation results in the desired profit targets being met. Reliable information on costs and revenues, in particular from a product perspective, is essential for cost and revenue transparency as well as for managing sales on a contribution margin basis. All the milestones needed to meet the targets can only be allocated if there is transparency, clear communication and agreed targets that relate to the superordinate target. The “delivery themes” that underpin this process chain are marked with no. 2.
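The dependency chain described above (return on equity target, sales managed by contribution margin, cost and revenue transparency) can be held as a small graph and put into a delivery order. This is only a sketch: it encodes just the three dependencies stated in the text, uses the target labels from Figure 31.2, and relies on `graphlib` from the Python standard library (Python 3.9 or later). The dictionary layout and the function name are assumptions.

```python
from graphlib import TopologicalSorter

# Each target maps to the targets it directly depends on (labels from Figure 31.2).
DEPENDS_ON = {
    "U3: ROE 15 %": {"K3: sales managed based on contribution margin"},
    "K3: sales managed based on contribution margin": {"K1: transparency in costs + income"},
    "K1: transparency in costs + income": set(),
}


def delivery_order(depends_on):
    """Return the targets in an order in which every prerequisite comes first."""
    return list(TopologicalSorter(depends_on).static_order())


order = delivery_order(DEPENDS_ON)
for step, target in enumerate(order, start=1):
    print(step, target)
```

The resulting order starts with cost and revenue transparency and ends with the return on equity target, which mirrors the argument that the overall target can only be broken down once the underlying transparency exists.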
31.2.2 Self-controlling Systems: Management Based on the Contribution Margin
A service, irrespective of whether it is provided by the sales partner or by management, is not an end in itself and can adversely affect profits if incorrectly allocated. The costs of providing services must therefore be allocated to the revenues generated by the transaction. This also applies to costs for advertising and marketing events and in particular to the remuneration of the sales team, as the yield on the transaction is more crucial than the revenue generated by it, i. e. the premiums written by the sales partner. The contribution margin calculation is used as the central tool for providing relevant information for the decision-making process. The contribution margin is a form of profit calculation that differentiates between fixed and variable costs: it is the difference between the revenue generated and the variable costs incurred in producing that revenue. It specifies the amount available to cover fixed costs and is often broken down further into contribution margin 1, 2 etc. in order to show the effect of different types of costs. For example, premium income and other income less losses paid out is referred to as contribution margin 1; contribution margin 2 additionally deducts commissions; and contribution margin 3 additionally deducts directly allocable administration expenses.3
But which contribution margin is the correct one for the decisions at hand? The selection is large: product, class of business, company department, brand, customer, customer group based on age, sex, location, sales, exclusive sales, agency, representative, sales by broker etc. The list can be greatly extended. Finally, the importance of the period used for calculating the contribution margin should be mentioned: this period (several years, a year, a month, a week) can also affect the contribution margin amount and consequently the information that is used as a basis for decision-making. This illustrates the scope of the data needed.
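The three contribution margin levels described above can be computed step by step. The following sketch assumes a single record per transaction; the class name, field names and sample amounts are illustrative and not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical figures for one piece of business, all in the same currency."""
    premium_income: float
    other_income: float
    losses_paid: float
    commissions: float
    direct_admin_expenses: float


def contribution_margins(t: Transaction):
    """Return contribution margins 1, 2 and 3 as described in the text."""
    cm1 = t.premium_income + t.other_income - t.losses_paid  # income less losses paid out
    cm2 = cm1 - t.commissions                                # additionally less commissions
    cm3 = cm2 - t.direct_admin_expenses                      # additionally less direct admin expenses
    return cm1, cm2, cm3


sample = Transaction(
    premium_income=1000.0,
    other_income=50.0,
    losses_paid=600.0,
    commissions=150.0,
    direct_admin_expenses=100.0,
)
cm1, cm2, cm3 = contribution_margins(sample)
print(cm1, cm2, cm3)
```

For the assumed figures the margins are 450, 300 and 200. The same calculation can be grouped by product, sales partner or period, which is where the choice of dimension discussed in the text comes in.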
Insurance companies require much more than customer data: product information, commission payments, claim payments, periods, directly allocable costs etc. All relevant data and their interrelationships must be brought together so that evaluations can be performed on consistent data. As illustrated in the SAS Non Life Subject Model, the set of relevant data and their interrelationships must first be defined. This data must be allocated to specific objects that are uniquely identifiable in the insurance world. In information technology these objects are also called entities; customer, policy, sales partner, product and costs are examples. In the design of a self-controlling system, however, the entity type is more important than the individual entity, i. e. the general characteristics of the customer are the central point, as profiles for target customers or “non-targets” can be determined from these. Using this logical data model, the primary data must then be extracted from the source systems, transformed into the target schema and loaded into a separate data store (Detail Data Store). This data store, combined with the analysis software, provides the basis from which reports on a contribution margin basis can be produced.
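The extract, transform and load sequence described above can be sketched in a few lines. This is not the SAS architecture itself: the two source systems, their field names, the `policy_fact` table and the use of an in-memory SQLite database as a stand-in for the Detail Data Store are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical extracts from two source systems with differing field names.
POLICY_SYSTEM = [
    {"policy_no": "P-1", "product": "motor", "partner": "agency-A", "premium": 1000.0},
    {"policy_no": "P-2", "product": "household", "partner": "broker-B", "premium": 400.0},
]
CLAIMS_SYSTEM = [
    {"policy": "P-1", "paid": 650.0},
    {"policy": "P-1", "paid": 100.0},
    {"policy": "P-2", "paid": 50.0},
]


def transform(policies, claims):
    """Map both sources onto one target schema: one row per policy."""
    losses = {}
    for claim in claims:
        losses[claim["policy"]] = losses.get(claim["policy"], 0.0) + claim["paid"]
    return [
        (p["policy_no"], p["product"], p["partner"], p["premium"],
         losses.get(p["policy_no"], 0.0))
        for p in policies
    ]


def load(rows):
    """Load the transformed rows into a separate store (in-memory for this sketch)."""
    store = sqlite3.connect(":memory:")
    store.execute(
        "CREATE TABLE policy_fact (policy_no TEXT PRIMARY KEY, product TEXT, "
        "partner TEXT, premium REAL, losses REAL)"
    )
    store.executemany("INSERT INTO policy_fact VALUES (?, ?, ?, ?, ?)", rows)
    store.commit()
    return store


def margin_by_product(store):
    """Report premium less losses paid (contribution margin 1) per product."""
    query = ("SELECT product, SUM(premium - losses) FROM policy_fact "
             "GROUP BY product ORDER BY product")
    return dict(store.execute(query).fetchall())


detail_data_store = load(transform(POLICY_SYSTEM, CLAIMS_SYSTEM))
report = margin_by_product(detail_data_store)
print(report)
```

Keeping the transformed rows in a store of their own, separate from the source systems, is what allows the same consistent data to be grouped by product here and by sales partner or customer group elsewhere.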
3 cf. F. von Fürstenwerth, A. Weiss: Insurance Alphabet, 2001, page 155.
Markus Frosch et al.
Figure 31.3. Non life subject model4 (Source: SAS – Non Life Subject Model)
[Figure: technical architecture. Recoverable labels: Source Data; ETL Process and Metadata; Integration, Transformation and History Management Layer with Logical Data Model and Detail Data Store; Data Marts (Dimensional Physical Models, HOLAP Data Models, Analytical Data Models, Campaign Management Data Mart); Applications and Exploitation (Reporting, Analytics, Campaign Management, Performance Management, BSC Knowledge Base, Current Data); Online HTML WA documentation.]
Source: SAS – Insurance Intelligence Solutions Technical Architecture
The extent of the dovetailing of the business strategy – growth in high yield business segments – with IT is clearly illustrated in this process diagram. The dependencies can also be shown in visual terms in the Balanced Scorecard by means of so-called strategy maps, as is illustrated in the following graph for the strategy and customer areas.
4 Source: Non Life Subject Model.
31 Challenges and Trends for Insurance Companies
Figure 31.5. Strategy maps (Source: SAS – Strategy Maps)
31.2.3 Unique Selling Propositions: On Demand Insurance
Insurance taken out by private households comprises, as a unit, three different service levels. The first level is the core product, e.g. the specific insurance cover for a motorboat. At the second level, services that are directly connected to the core product are added, such as advice, explanations, customer service and the processing of claims relating to the motorboat policy. The third level includes additional services that take into account the customer's specific personal circumstances with regard to the core product and thereby round off the total benefits offered.5 In the case of motorboat insurance this can be, for example, a smartphone that has worldwide reception and is programmed with location-specific emergency telephone numbers, giving the customer the reassuring feeling of being prepared for emergencies. The more similar insurance products become, or the less difference in quality the customer perceives at the core product level, the more important the provision of services becomes. The services are then no longer regarded as a simple addition to the core product but as primary services provided by the supplier.6 Opportunities thus arise for innovations and special products that trigger a feeling of uniqueness in the mind of the customer. This allows the insurance company –
5 cf. M. Haller, P. Maas, in P. Albrecht, E. Lorenz, B. Rudolph: Risk Research and Insurance, 2004, page 179–214.
6 cf. M. Kremer, 1994.
Figure 31.6. Service levels (Source: Haller M., Maas P., in Albrecht P., Lorenz E., Rudolph B.: Risikoforschung und Versicherung, Karlsruhe 2004, pp. 179–214)
as a result of the low price elasticity of demand – to institute a company-friendly price policy, at least for a certain period of time. Naturally there is the risk that the additions made at the second and third levels are imitated by the competition. As these services are additions to the core product, they must, together with the personal circumstances of the customer, also be included in the core product and its premium calculation. An example of this is the “flexible car insurance with a base fee and a price per kilometre”,7 which is being tested by UNIQA, the Austrian insurance group, in conjunction with IBM. The objective is to make the premium dependent on the actual use of the car: the less and the more safely the customer drives, the lower the premium. Car usage is determined through a combination of satellite navigation, mobile telephony and IT. The car is tracked by GPS, and the position data is fed to a central computer via a mobile telephone network. A Navi box installed in the car serves as both GPS receiver and GSM sender. The computer compares the navigation data to a map and calculates the kilometres driven and the routes taken, which are then used to calculate the premium. If safe driving is rewarded by lower premiums in this manner, the new model could also be a motivating factor for increased road traffic safety. This form of needs-based insurance can also have a beneficial effect on the environment and
7 cf. UNIQA press release “Flexible Car Insurance with Basic Fee and Price per Kilometre”, UNIQA press department, 5 October 2005.
Figure 31.7. Flexible car insurance [Figure labels: Navi Box records the tours with the aid of the GPS coordinates; secured connection through encryption; the combination of GPS and map data is used for the calculation of the insurance premium.]
society. If the car is used less and in a more targeted way, and is therefore driven more on safer routes, this will reduce traffic congestion and accidents and protect the environment. Furthermore, this technology could also serve as a car finder for locating a stolen car. An emergency call function is also provided through the connection to the mobile telephone network: in an emergency the driver can transmit an emergency call via GSM, and the system locates the car and sends help. The system could be developed further so that an accident is detected automatically, for example when the car stops suddenly. Use of the tracking system for managing car fleets is also conceivable. The target groups for this needs-based insurance approach are drivers who do not drive much, owners of second cars, and younger drivers who are interested in new technology and are looking for favourable rates.8
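The usage-based premium described above (a base fee plus a price per kilometre, with the distance derived from GPS positions) can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not in the source: the distance is approximated by summing great-circle distances between consecutive GPS fixes rather than by matching the positions to a road map, route safety is ignored, and the tariff figures are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for the great-circle approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def trip_km(fixes):
    """Approximate distance driven as the sum of distances between consecutive GPS fixes."""
    return sum(haversine_km(p, q) for p, q in zip(fixes, fixes[1:]))


def period_premium(km, base_fee, rate_per_km):
    """Premium for a period: a fixed base fee plus a price per kilometre driven."""
    return base_fee + km * rate_per_km


# Hypothetical GPS fixes recorded by the box in the car, and a hypothetical tariff.
fixes = [(48.2082, 16.3738), (48.2100, 16.3900), (48.2200, 16.4100)]
km = trip_km(fixes)
print(round(km, 2), round(period_premium(km, base_fee=10.0, rate_per_km=0.05), 2))
```

A production system would match the fixes to map data, as the text describes, since straight-line segments underestimate the distance driven on winding roads.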
31.2.4 Sales and Marketing Support: Access to Primary Data
The scope of activities that support sales and marketing is very extensive. The integrated process support system is the central element, over and above the information and knowledge required to support the competence of the salesman on
8 cf. Dr. Hajek in UNIQA press release “Flexible Car Insurance with Basic Fee and Price per Kilometre”, UNIQA press department, 5 October 2005.
site. The process structure is key to the quality and processing speed as perceived by the customer. Outlined below are some process milestones that are of pivotal importance for a high-quality advisory and closing process:
• Range of products
• Automatic data transfer (for existing customers: name, address, …) when preparing the insurance proposal
• Plausibility checks on the range of products to avoid subsequent queries
• Transfer of proposal data into back office processing systems (portfolio, documentation systems)
• Authority levels
• Contract information with regard to extent of cover
• Information systems that provide extensive and timely information on existing contracts
• On-site changes of a technical and non-technical nature
• Transparency over the status of the processing of the transaction
• Cooperation agreed between sales and the call centre
The quality of these process components, in terms of the quality of the information, the potential for error, the throughput speed and the homogeneity of the information between sales, processing, the service centre and the portfolio team, plays a decisive role in how the customer perceives quality. From an IT perspective, the standardisation of the support processes and the elimination of redundant data housekeeping and functions is a central aspect. One method of achieving this is access to primary data, i.e. sales as well as processing staff have direct access, subject to authority levels, to contract, customer, collection and claims data, among others. This strategy for process support has several advantages:
1. The data/information is complete and current, as the “leading system” has been accessed.
2. The data/information is consistent, as it is not merely partial data sets that are replicated and maintained redundantly.
3. Data housekeeping and the number of applications can be reduced, resulting in lower licence fees and maintenance costs as well as reduced complexity.
A technology that allows the primary data held in the portfolio systems to be accessed is IMS Connect, supplied by IBM. IMS Connect is the interface through which IMS transactions are called up from Java applications and via TCP/IP. Together with the IBM development tools, IMS Connect also provides a programming environment with code generators, e.g. for Web services. For example, this enables the SOAP Gateway to address IMS transactions directly as a Web service, without the need to write or generate code or to use an application server.9
9 cf. IBM Corporation 2005: more details can be found on the IMS Homepage: http://www.ibm.com/ims.
There are basically two options for accessing the primary data in this way: via protocols such as APPC or VTAM (3270 terminals) in an SNA network in older architectures, or via TCP/IP in newer architectures. In IMS an interface called OTMA is used for this purpose. Both IMS Connect and MQ Series (now called WebSphere MQ) are OTMA clients and can call up transactions and receive responses. The communication options are as follows:
• From 3270 terminals using the Telnet3270 protocol
• From new applications via MQ Series (WebSphere MQ), via socket connections to IMS Connect, via the IMS connectors for Java, or via the SOAP gateway, which makes the transactions available as a Web service
The SOAP gateway is a Web service solution for IMS that requires the installation of IMS Connect. It enables IMS applications or modules to be integrated into Service Oriented Architectures (SOA). Direct communication between the Web service initiator and IMS is supported, so that no application server needs to be interposed. Open standards such as SOAP and XML are also supported. The Web service is immediately available once the configuration files have been generated with the WebSphere development tools (WDz and RAD) and the SOAP gateway has been notified. The runtime components required to execute the Java applications form part of IMS Connect. The development components or development tools through which, for example, a Java application or a Web service can be generated from an IMS transaction written in COBOL are an integral part of the IBM development tools: WebSphere Studio Application Developer Integration Edition (WSAD-IE), Rational Application Developer (RAD) or WebSphere Developer for z/OS (WDz).
The diagram illustrates that Web applications (browser, Web server, IMS Connect; three-tier model) and so-called fat clients (client, IMS Connect; two-tier model) can be operated in parallel. The applications open a TCP/IP socket and send a message (the call of a transaction) to IMS Connect. IMS Connect formats this message and places it in the IMS queue via the OTMA interface. The processing then carried out by IMS is exactly the same as that performed via a 3270 terminal. The transaction response is sent back to the initiator by the same route. Interestingly, because IMS Connect is very streamlined, application processing is several times faster than via MQ Series.10
In such service oriented architectures it is important that the services are not too small. The smaller the services, the higher the communication expense. Services should be defined as larger functional units, for example an entire transaction. Equating services with functions, of which parts of the application contain hundreds, will degrade performance, in particular if these services are distributed across several systems or platforms.
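The remark on service granularity can be made concrete with a small back-of-the-envelope model. It assumes, purely for illustration, that every service call incurs a fixed communication overhead on top of the actual processing work; the millisecond figures are invented and not measured values.

```python
def total_time_ms(calls, overhead_ms_per_call, work_ms):
    """Elapsed time when the same work is split across a number of remote service calls.

    Assumes a fixed communication overhead per call and sequential execution.
    """
    return calls * overhead_ms_per_call + work_ms


# One coarse-grained service (an entire transaction) versus 100 fine-grained function calls
# that together perform the same 40 ms of processing. All figures are hypothetical.
coarse = total_time_ms(calls=1, overhead_ms_per_call=5.0, work_ms=40.0)
fine = total_time_ms(calls=100, overhead_ms_per_call=5.0, work_ms=40.0)
print(coarse, fine)
```

Under these assumptions the fine-grained design is dominated by communication overhead, which is the effect the text warns about; the gap widens further when the calls cross system or platform boundaries.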
10 cf. IBM Corporation 2005.
Figure 31.10. IMS Connector for Java (IC4J) (Source: © IBM Corporation 2005)
31.2.5 Talent Management: Identify, Hire and Develop Talented Staff
Successful positioning, the coordination of processes and incentive structures with the company objectives, the development of appropriate self-controlling systems, innovative ideas and optimal support for sales all result from teamwork that is moulded by the company culture and the existing talent. Sentences such as “our success is due to our staff” and “we are developing our talented staff” have been heard more and more often in the last few years. This is probably also due to the recognition that technologically standardised processes and simple activities are increasingly being transferred abroad as part of the ever more noticeable globalisation process. It is only human nature that the population of the Western world is rising up in revolt against this development. However, this will not change the fact that jobs do not go to locations where salaries are comparatively high and capital does not flow to locations where the rate of interest earned is comparatively low.11 If we want to secure the future, we must change the way we operate. We must be open to new ideas, change our perceptions and develop forward-looking concepts with regard to the economy, society and politics.12
11 cf. M. Miegel: Change of era, 2005, page 232.
12 cf. M. Miegel: Change of era, 2005, page 282 et seqq.
Consequently, aspirations at both the society and the company level are being concentrated on intellect and its innovative spirit, and thus on staff. However, except in some cases, this recognition, which is becoming increasingly prevalent in society and companies, is not reflected in HR management in the form of a consistent policy for identifying, acquiring and developing talented staff. This already poses a few questions:
• What does a talented member of staff mean to you?
• Which talents are strategically relevant to your company?
• Is the HR organisation consistent with the company strategy?
• Is there a standard process in the company that compares the actual performance of HR to the HR plan?
Talent is an inborn asset that allows one to perform excellently in a field. Talent finds its own way of evolving. Pleasure in performing a task can indicate that more opportunities should be provided to evolve and develop the talent. “The most important thing for developing a talent is the opportunity to do so”13. However, the creation of these development opportunities is not an end in itself for a company but a central factor in its success. Each clearly defined positioning of the company requires specific investment in staff and a HR department that identifies, positions and develops this investment and is geared to achieving the company objectives.14 Figure 31.11 illustrates by way of an example the steps needed to implement a HR scorecard. The upper half of the diagram shows the interrelationships between the positioning of the company, the value drivers (talented staff) and the targets; the bottom half shows the operational action plan, i.e. the control loop between incentive plans and the achievement of targets. Both halves must be brought together in order for the company to achieve success. Kaplan provides another view of this interrelationship between the positioning of the company and the HR organisation by means of the “Strategic Human Capital Planning” and “Strategic HR Management” aspects.15 The HR requirements are first derived from the values on the Balanced Scorecard: strategic skills and talents, management attributes, culture, alignment with the company objectives and the learning environment. The “Human Capital Development Program” is then defined so as to cover the planned profile. As part of the process, a development plan is defined for the areas relating to the development of expertise and management skills, the company culture, incentive structures, the agreement of objectives etc.
13 cf. Promerit Talent Management position statement, 06.07.2004 and 10.03.2005.
14 cf. Promerit HR Scorecard: Transparency and Management for the Personnel department, 06.2004.
15 cf. M. Bower, R.S. Kaplan: Balanced Strategy Focused Organizations with the Balanced Scorecard, 2001.
[Figure 31.11. Roadmap for implementing a HR scorecard. The HR scorecard can be implemented and established in the organisation within 3–6 months with the help of the roadmap and experienced Promerit consultants. Steps shown in the strategic half: Preparation; Positioning & strategy; Value drivers & control levels; Initiatives & targets. Steps shown in the operational half: Implementation; Operational action plan; Measurement & evaluation; Agreement of targets (Organisation; Contents (Measurements)). Descriptions shown: project targets and expectations (HR team and HR customers), project planning and setup; SWOT analysis, determination and review of action areas and strategic HR targets; agreement of HR value drivers and HR parameters for review (per department); agreement of strategic HR initiatives, action plans and targets for each area; communication and, if applicable, company-specific BSC, implementation of the Balanced Scorecard using an appropriate tool, responsibilities and roles in the BSC process; operational action plan and integration in the targets and, if necessary, the incentive systems for management and staff; regular agreement of status and adjustment of the action plan, review of strategic objectives. Implementation of a Balanced Scorecard, Frankfurt, April 2004.]
The so-called health checks provide concrete assistance in analysing the HR organisation with regard to staff requirements and the recruitment and development of such staff. As part of such reviews six aspects are evaluated:
Aspect 1: Positioning of HR
The positioning of the Personnel Department, and consequently the status of talent management, can be determined by the answers to a few questions.
• Are clear targets based on company strategy set for the Personnel Department? Is HR regarded as a core function with its own responsibilities and “appreciated”?
• Is there a HR strategy? If yes, is it based on the company objectives and in line with them?
• How does the Personnel Department view itself? More as an administration unit or rather as a driver of business?
• How is talented staff treated? As a schoolchild or as a customer?
• Is there a talent management process?
16 Source: Promerit AG: HR Scorecard, April 2004.
Aspect 2: HR core processes
The core processes should be transparent and known to both management and staff.
• Are there procedures in place for managing staff and setting objectives?
• Have job specifications been written?
• Is there a systematic talent review and talent development based on appraisals? Is there proper administration of so-called personal master data, or are there browser-based talent pools showing the talent available in the company, which can consequently be tracked and employed?
• Is there a succession plan?
• Is there a correlation between training and the long-term development of talent?
• Is remuneration based on performance, and are there variable components based on the company's success, i.e. self-controlling remuneration models?
Aspect 3: HR support processes
The classical personnel administration function and the support processes, up to the administration of access rights to IT applications, should be in line with the positioning of HR.
• Are the personnel administration functions covered to a large extent by systems? Is there a system-supported interface with other areas such as accounting?
• Is this function perceived as a competent service or just as a pure service?
• Are the authority levels with regard to employment contracts and their negotiation centralised?
• Are benchmarks used for administration expense? If yes, are all functions taken into account, or does the business process outsourcing of, for example, the payroll provide only illusory advantages?
Aspect 4: Staff and competencies
The central aspects to be taken into account with regard to the development of talent and the alignment of the HR organisation to the company objectives are the following.
• Are strategic functions communicated, raised in a systematic manner and documented?
• Are there skill and competence profiles in place that are consistent with the positioning of the company and the relevant strategic functions?
• Is an evaluation of human capital performed?
• How motivated and committed is the staff? Are there benchmarks for sick days taken and absences from work?
• How are the competence of management and cooperation perceived by the staff?
• Is there sufficient talent available? Do the business plan, the HR plan and recruiting dovetail?
• Are the company objectives made known to the staff and is there a performance culture?
• How can the willingness to adjust and the flexibility of staff be characterised? Do job rotations and temporary transfers to projects increase self-esteem?
Aspect 5: Acceptance of HR
A survey carried out in a few interviews allows an assessment of how the HR unit is regarded and to what extent the personnel department is involved in the implementation of company strategy.
• How is the value added by HR perceived by top management?
• How satisfied are middle management and staff as customers?
• How does the staff sense the appreciation shown to them by HR?
• Does HR have the ability to manage and coach operating departments with regard to personnel matters, or is HR only a supplier and a processor?
• Do talented staff feel like customers who are systematically looked after?
• Is HR proactive in changing processes? Or are “only” decisions from top management implemented?
• How attractive is the company to external talent?
Aspect 6: HR controlling
The ability of the HR unit to provide information is not only used to ensure that Annual General Meetings run smoothly but also indicates its competence and its willingness to be compared to internal and external benchmarks on an ongoing basis.
• Are key personnel figures available for the entire company? Can developments, and not only those relating to staff numbers, be highlighted?
• Is the contribution made by HR measured?
• Is the implementation of HR strategy communicated, for example on the Intranet?
• Are parameters set to measure the efficiency of HR processes?
• Are employee satisfaction surveys carried out?
• What is the state of the system support?
The evaluation of this health check indicates how well the HR organisation is aligned to the company objectives. Any frictions and risks to the achievement of the company objectives are highlighted and development steps recommended. Only when such checks become as routine for a company as the review of potential areas of savings will the right talent be available at the right time in the company. Only then will staff perform excellently, will the innovative ideas that the interaction of the aspects highlighted in this article requires be utilised to the full, and will our opportunities in the age of globalisation become apparent.
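How the answers from such a health check might be condensed into one score per aspect can be sketched as follows. The source does not describe a scoring method, so everything here is an assumption made for illustration: answers are rated on a scale from 1 to 5, an aspect's score is the mean of its ratings, and aspects below a chosen threshold are flagged for development steps. The ratings themselves are invented.

```python
from statistics import mean


def summarise(ratings, threshold=3.0):
    """Return the mean rating per aspect and the aspects scoring below the threshold."""
    scores = {aspect: mean(values) for aspect, values in ratings.items()}
    weak = [aspect for aspect, score in scores.items() if score < threshold]
    return scores, weak


# Hypothetical interview ratings (1 = poor, 5 = excellent) for the six aspects.
ratings = {
    "Positioning of HR": [4, 4, 5],
    "HR core processes": [3, 4, 3],
    "HR support processes": [4, 3, 4],
    "Staff and competencies": [2, 3, 2],
    "Acceptance of HR": [3, 3, 3],
    "HR controlling": [2, 2, 3],
}
scores, weak = summarise(ratings)
for aspect, score in scores.items():
    print(f"{aspect}: {score:.2f}")
print("Below threshold:", weak)
```

A simple mean treats all questions as equally important; weighting questions by their strategic relevance would be an obvious refinement.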
[Figure: HR Health Check, evaluation of the six dimensions, with axis ticks from 1 to 5. Legible dimension labels include HR positioning, HR core and People & competencies. Talent Management Consulting Health Check, Frankfurt, March 2004.]
Figure 31.12. Summary of results: HR Health Check17
31.3 Summary and Outlook
This chapter highlights the importance of IT from an insurance company perspective. It is evident that the role played by IT today extends far beyond the automation of previously manual processes. The preparation of a strategy is hardly conceivable without IT and the resultant income and cost transparency. In mature markets, such as the German insurance market, undifferentiated strategies no longer have a future. The focal points for income-based growth, sales targets, marketing actions and administration changes can only be properly planned and forecast if the relevant information is known. The importance of IT for self-controlling systems, unique selling propositions and sales support is surely not surprising; what may be surprising, however, is the speed with which the link between sales and IT is developing. Direct access to contribution margin and commission information, as well as to primary data that affects the proposal and the contract, seems to be replacing the classical administration function in part, with administration staff instead being trained as carriers of specialist knowledge and in support of sales.
17 Source: Promerit AG: HR Scorecard, March 2004.
In addition to these developments, we would venture to forecast that the importance of IT as part of the HR organisation will be revolutionised in the next five years. A pivotal factor for success is identifying, recruiting and developing the talent required by each company. Achieving this on a professional basis requires more than administering personnel master data in a standard software package. It requires openness and transparency on the part of the HR organisation in terms of the health check described above. Furthermore, talent pools must be built up that allow the available talent to be utilised more effectively for achieving the company objectives. We would like to conclude with the observation that these developments will also result in changes to the academic profiles offered at universities. Company requirements will shift from the classical computer scientist towards information managers who control the interaction of the aspects highlighted in the Scorecard example.
CHAPTER 32
Added Value and Challenges of Industry-Academic Research Partnerships – The Example of the E-Finance Lab
Wolfgang Koenig, Stefan Blumenberg, Sebastian Martin
32.1 Introduction
The financial service industry is believed to be on the verge of a dramatic [r]evolution. A substantial redesign of its value chains, aimed at reducing costs, providing more efficient and flexible services and enabling new products and revenue streams, is imminent. But there seems to be no clear migration path or goal that sheds light on where the finance industry and its various players will be, and should be, a decade from now. The pressing need for action has been recognized by practitioners as well as the academic world. This is why the E-Finance Lab (EFL) was founded in 2003 as a cooperation between the universities of Frankfurt and Darmstadt and the industry partners Accenture, BearingPoint, Deutsche Bank, Deutsche Börse, Deutsche Postbank, FinanzIT, IBM, Microsoft, Siemens, T-Systems, as well as DAB bank and IS.Teledata. The mission of the EFL is the development of industrial methods for the impending change process of the financial service industry. Important challenges include the design of smart production infrastructures, the development and evaluation of advantageous sourcing strategies, and smart selling concepts to enable new revenue streams for financial service providers in the future. The overall goal is to contribute methods and views to the realignment of the financial supply chain.
32.2 Structure of the EFL
The EFL is organized into five clusters, as depicted in Figure 32.1. In all clusters, theoretical model-related research, technical prototyping as well as case studies and interviews with industry experts are employed to devise value-creating strategies. Cluster 1 “Sourcing and IT Management in Financial Processes” focuses on sourcing and aligning the IT resource with financial business processes as a means
Wolfgang Koenig et al.
Figure 32.1. The EFL “Temple”
of internally and externally rearranging the financial value chain. Related research topics are operational risk mitigation and sourcing contracts. The results so far include a real options based model for sourcing decisions under uncertainty and empirical evidence for substantial efficiency potentials in financial processes.
Cluster 2 “Emerging IT Architectures to support Business Processes within the E-Finance Industry” focuses on the evaluation and development of innovative technologies and technical concepts in the context of the E-Finance industry. Of special interest is the relationship between business processes and technologies. A further research area is IT risk with respect to the Basel II recommendations on operational risk.
Cluster 3 “Customer Management in a Multi-Channel Environment” develops, evaluates, and implements a set of Customer Management solutions in order to generate substantial efficiency improvements for financial institutions. The charter of cluster 3 is thus to look at customers as assets and to optimize customer equity.
Cluster 4 “Reshaping the Banking Business” develops scenarios of how the German and European banking business will evolve. Based on the hypothesis that industry value chains will be rearranged in the near future, cluster 4 focuses on the core research question: “Who shall perform what function in an efficient banking value chain?”. Specific business processes, outsourcing initiatives for non-core activities and the sales and distribution function in retail banking are central topics within the cluster.
Cluster 5 analyzes the “Securities Trading Value Chain” and its dynamics. The cluster has a specific focus on the impact of market regulation, the intermediation
32 Added Value and Challenges of Industry-Academic Research Partnerships
relationships between the industry players and recent technological innovations in this field. The roof cluster pulls together the research efforts and results of the five EFL research clusters in order to boost synergies and to allow a unified view of current technology-enabled trends in the financial services industry. Joint research, conferences and discussions are targeted at answering the question: “What will the finance industry look like in 2012?”. Moreover, the roof cluster strives to coordinate research methods in order to contribute to a cohesive E-Finance framework.
32.3 Generation and Sharing of Knowledge Between Science and Practice: Research Cycle
The work within the EFL is performed in close cooperation with the industry partners that support the research. Thus, we focus not only on a scientifically sound research process but also on a mutual exchange of knowledge.
Consolidation has been a widespread phenomenon in the financial services industry for many years. Many of today’s most successful firms have grown rapidly in both scale and scope. Given the strategic uncertainty about future core competencies and profitable business areas, becoming bigger and broader and maintaining “deep pockets” seemed to be one of the safest options. We believe that the financial services industry is currently on the verge of a new competitive landscape in which much more focus is indispensable. Technological advances, the ever increasing importance of brands and reputation and the imperative to exploit cross-selling opportunities imply that horizontal scale and scope economies will remain of utmost importance. However, because the same technological advances boost the standardization and industrialization of business processes, an influx of specialized and highly efficient service providers will strongly erode the extent of vertical integration that we can still observe. We envision a decomposition and reconfiguration of existing value chains into flexible value networks, in which players concentrate exclusively on their core competencies and IT plays a vital role in value creation. We also believe that the industrialization of processes will be complemented by relentless attempts by leading financial institutions to offer customized, relationship-oriented financial services to their customers through multiple channels. Relationship banking will hence prosper even in the light of intensifying competition.
Research on the industrialization of the financial services industry in Germany has become a matter of interest only in recent years. Besides theoretical model-related research and technical prototyping, the EFL conducts both quantitative and qualitative empirical studies to support findings in this rapidly emerging area of interest. Empirical research is usually conducted according to the research cycle described in Figure 32.2 (based on Atteslander 2003). We introduce these phases in the following sections, focusing on phase four (analysis) and phase five (utilization) in the context of the research work within the EFL, in order to illustrate particularities of the generation and sharing of knowledge.
Wolfgang Koenig et al.
Figure 32.2. Empirical research cycle (based on Atteslander 2003)
32.3.1 Problem Definition
The theoretical problem definition phase refers to the formulation of a research question that restricts the problem to a verifiable circumstance and emphasizes the importance of the chosen research objective. The resulting question may change during the following phases but must be formulated first, in order to narrow the focus of the subsequent research. In this step, hypotheses are formulated (Bortz and Döring 2002). These hypotheses must meet strict scientific requirements such as empirical controllability, universal validity and falsifiability (Opp 1999). For a detailed description of the requirements for hypotheses see Bortz and Döring (2002) and Opp (1999). The logical combination of a set of hypotheses may lead to a theory that tries to explain and predict circumstances in the research area. If only vague hypotheses about the research object are available, an exploratory study should come first in order to obtain a clear view of the research area and to derive sound hypotheses.
32.3.2 Operationalization
After the definition of hypotheses and theories, the research object is narrowed down to a practical level in order to choose the right time frame, target group, and field access and to systematize observations in categories (Atteslander 2003).
• Time frame: Which period will be addressed by the empirical research, and how much time and funding are needed and available for a study? Empirical research can, for example, focus on a point in time or conduct a time series analysis.
32 Added Value and Challenges of Industry-Academic Research Partnerships
• Target group: Which group of social events or people will the research focus on? The selection of the people to be questioned within an organization can be especially important for obtaining appropriate results.
• Field access: Can the chosen interviewees be approached, or are there any security issues that may hinder an empirical study? For example, empirical studies in financial service organizations are often challenging because it is difficult to get access to management-level decision makers.
The problem definition and operationalization phases are connected with each other. Hypotheses might, for example, change during the process of operationalization.
32.3.3 Application
After hypothesis generation and operationalization, the previously specified data can be gathered by means of the chosen research methods. Depending on the kind of research, the gathering will be more structured in the case of a quantitative study than in the case of a qualitative approach. Data can be collected through experiments within an environment designed by the researcher, through interviews, or by observing the natural environment of the research object (Atteslander 2003).
32.3.4 Analysis
The gathered data has to be prepared for (statistical) tests and analyses, which depend on the chosen methods and the gathered data. As an example of data collection and analysis within the EFL, we introduce the EFL Tracker, a collection of software tools developed by the EFL. The EFL Tracker supports data collection and analysis in quantitative (e.g. surveys of Germany’s 1,000 largest banks) as well as qualitative (e.g. highly detailed case study analysis) empirical research. Hence, the EFL Tracker can be classified as a tool that supports phases three (application) and four (analysis) of the research cycle. The E-Finance Universal Benchmarker, a primary tool in the portfolio of the EFL Tracker, is designed to support quantitative research approaches and has been successfully deployed in several cross-sectional and industry surveys (http://www.efinancelab.de/benchmarker). The EFL Benchmarker is web-based software that supports the collection and on-the-fly analysis of data, providing an automated, instant benchmark for the survey participants. In particular, this benchmarking feature proved to be a major incentive for firms to participate in surveys of the EFL and turned out to be a useful way to collect data for the Lab even after a particular survey has been finished. Participating firms are able to instantly compare their own data with data of the entire available sample of the
Figure 32.3. Administration interface of the EFL Benchmarker
industry peer group. From a researcher’s perspective, another especially useful feature of the EFL Benchmarker, besides a substantially shorter set-up time for web-based surveys, is a drastic reduction of data errors. To date, the EFL Benchmarker has been successfully deployed in eight empirical field surveys of the EFL. In total, the databases of the system contain detailed data on 1,005 corporations, comprising 123,524 data points. The user-friendly administration interface (see Figure 32.3) allows for a fast and simple configuration of the EFL Benchmarker as well as the implementation of nearly arbitrary questionnaire designs. An SPSS output interface has been implemented for seamless integration with statistics software for advanced data analysis. Even after completion of the data collection and initial analysis process, each participant has full personalized access to her data. Furthermore, participants not included in the initial sample may enter their data and receive an instant benchmark; the data collection process never stops. As an example, for the very first
empirical survey on Financial Chain Management, in addition to the original data set of 103 firms, 69 further firms have registered online and provided their data to enhance the sample.
For the support of qualitative empirical research, a second software tool under the umbrella of the EFL Tracker has been developed, called the EFL Case Study Database. Just as the EFL Benchmarker supports the collection and structuring of quantitative empirical data, the EFL Case Study Database provides support for structuring and analyzing qualitative data. Interviews and documents collected in case studies can be deposited in the database according to the methodological requirements associated with qualitative research. For data analysis, an instant comparison between different case studies or analytical units can be conducted. Currently the EFL Case Study Database contains detailed data of six in-depth case studies.
Raw empirical data from the EFL Benchmarker and the EFL Case Study Database have led to a number of top-class research results, published in national and international journals and conferences. In total, the data contained in the various databases of the EFL Tracker has been used in more than 35 mostly international publications. Also, taking only Cluster 1 as an example, the data is used in more than nine Ph.D. theses. Of course, the EFL Tracker cannot itself generate empirical data, but it greatly supports the gathering of raw, unstructured empirical data and its transformation into theoretically sound research findings and analytical models. By providing such support, the EFL Tracker has established itself as a valuable tool for conducting empirical research, keeping pace with the fast-changing environment of the financial services industry and at the same time giving the EFL a boost in the internationally highly competitive scientific field of empirical research.
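The instant benchmark described in this section, in which a participating firm compares its own answer with the answers of the available peer sample, can be illustrated with a minimal sketch. The chapter does not describe how the EFL Benchmarker computes its benchmarks, so the function name `benchmark`, the choice of a mid-rank percentile, and the sample figures below are illustrative assumptions only, not the tool's actual implementation.

```python
from bisect import bisect_left, bisect_right
from statistics import median


def benchmark(own_value, peer_values):
    """Compare one participant's answer with the peer sample.

    Hypothetical helper: returns the sample size, the peer median and the
    participant's percentile rank within the peer sample.
    """
    if not peer_values:
        raise ValueError("peer sample is empty")

    ordered = sorted(peer_values)
    below = bisect_left(ordered, own_value)         # peers strictly below
    at_or_below = bisect_right(ordered, own_value)  # peers below or equal

    # Mid-rank percentile: peers with an identical value count half.
    percentile = 100.0 * (below + at_or_below) / (2 * len(ordered))

    return {
        "peer_count": len(ordered),
        "peer_median": median(ordered),
        "percentile_rank": percentile,
    }


if __name__ == "__main__":
    # Invented example: processing cost per transaction reported by five peers.
    peers = [10.0, 20.0, 30.0, 40.0, 50.0]
    print(benchmark(30.0, peers))
```

Because the comparison is recomputed from whatever sample is available at the time of the request, firms that register after a survey has closed would receive a benchmark against the enlarged sample, which matches the continuous data collection described above.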
32.3.5 Utilization
The utilization of research findings is as important as their generation. From a scientific point of view, these results only represent the position of the researchers who are responsible for them. Publication and discussion in scientific journals and at conferences help the researchers to discuss their findings with other experts and, where appropriate, to improve and extend the methods and the results. Additionally, problems not addressed so far may arise, and the cycle will start over again. Rynes et al. (2001) point out the importance of cooperation between science and practice. Research results in the EFL are not only discussed with the scientific community. Knowledge transfer to the partners of the EFL has become a crucial factor of success during the last years of cooperative work. Based on the scientific publications of the EFL, ten channels of knowledge transfer between academia and practice and vice versa have been established so far (Figure 32.4).
[Figure 32.4: diagram of the knowledge transfer channels. Labels recoverable from the figure:
• Cooperative Research Projects with Industry Partners (25)
• Participants of the Cooperative Ph.D. Program (9)
• External Events: conferences hosted by the EFL (2); conferences with contributions of EFL members (manifold); monthly jours fixes (12); presentations of EFL staff (manifold)
• Internal Events: in-house workshops at the industry partners (3); internal jours fixes (manifold)
• Print
• Marketing and PR
• Universal Benchmarker (124,000 data points); literature database; case study database
• Publications (147)]
Figure 32.4. Knowledge transfer channels (the numbers in brackets show the number of items realized in 2005)
32.3.5.1 Cooperative Research Projects with Industry Partners
Each cluster comprises 5–7 projects, each of which runs for no more than 12 months and is re-evaluated in the course of the yearly budget planning process. Each project is described by:
• Who is responsible for the delivery?
• What has to be delivered, and when (e.g. paper, software, data, workshop)?
• What is the basic research approach?
• How is practice incorporated in the research process?
32.3.5.2 Cooperative Ph.D. Program
The participants in the cooperative Ph.D. program represent the main knowledge transfer channel towards the industry partners. Each of the industry partners can propose one person from its organization per year to become a member of the cooperative Ph.D. program, i.e. to participate in cooperative research projects together with researchers from the EFL. This cooperation covers all phases of the research cycle. The participants of the cooperative Ph.D. program are important disseminators of information as they are familiar with the research areas, methods and results
and can communicate them within their organizations in an appropriate form. In return, industry partners process the results and encourage the Ph.D. students to discuss their organization’s viewpoints and areas of interest, allowing a mutual exchange of knowledge and interests. In summary, participants of the cooperative Ph.D. program fruitfully bridge the gap between science and practice at the individual level.
32.3.5.3 External and Internal Events
The EFL hosts various events to share the results with the industry partners and a wider public. During the big biannual conferences in spring and autumn, with about 300–400 participants, EFL researchers and industry partners present new research results. In addition to the presentations given by EFL members, experts from practice, internationally renowned researchers and the public are invited to discuss their work and the EFL results. These big conferences are, for example, especially interesting for members of the EFL industry partners who are not closely involved in the daily research work but are interested in getting insights into the research results. The autumn conference also offers workshops for in-depth discussion of selected topics. In addition to the big biannual conferences, the EFL researchers and industry partners meet once a month in so-called jours fixes, where research in progress is presented and discussed with internal and external participants. These public jours fixes are preceded by regular internal meetings of the research groups, where topics and cooperation between researchers are planned and discussed. The EFL also offers its industry partners the possibility to organize in-house workshops where the latest research results, customized to the individual needs of the organizations, are discussed. These in-house workshops are often the beginning of new research projects, as the combination of scientific work and practitioners’ questions leads to new research questions.
32.3.5.4 Print, Marketing and Public Relations
The EFL issues printed information ranging from a quite generic level down to a highly specific level with detailed information about the status of the research. First of all, a booklet that introduces the EFL, its members, the research work and the results is issued and continuously updated. Regular press reports are released to reach the public. In 2005, 18 press reports generated 360 press releases in print, TV and online media. To complement this generic information, a printed newsletter is published on a quarterly basis. Each issue presents the two latest research results as well as an editorial, an interview and summarized information about publications, thus offering a concise overview of selected EFL activities. Finally, two progress reports are published biannually. One informs the EFL members about all activities within the Lab: published articles and books, conferences attended, presentations given, talks by guest scientists, and much more.
The second printed progress report contains a collection of the top three publications of each cluster.
32.3.5.5 EFL Online
The EFL issues a quarterly electronic newsletter to complement the printed newsletter with information about recent developments and short abstracts of current research projects, with links to the EFL website for further information. All information described above is available on the EFL website, which also contains a glossary explaining some important research terms. Interested people can request guidance, e.g. definitions of technical terms, via the Frequently Asked Questions System. Selected research results are also presented in short film format. For example, the 2005 short film “Credit Process Modernization” briefly explains the findings of an empirical study with 500 German banks.
32.3.5.6 Software and Databases
Software that applies scientific methods, like the above-described Universal Benchmarker, is also one of the channels that allow research results to be communicated to the industry partners and to the public in an automated manner. Furthermore, based on the results of two empirical studies of “Kreditanstalt fuer Wiederaufbau” and “Bundesbank”, the EFL developed the Internet portal “German-Z-Score”, which enables medium-sized enterprises to determine an approximate rating based on financial ratios. Potential effects of different strategies can be determined by varying the input variables. The EFL also developed a tool called “Integration of Estimation Risk in the Context of Total Return Strategies in Modern Investment Management”, which covers a comprehensive library of optimization and performance evaluation tools for asset management. Databases with the raw data of qualitative and quantitative empirical studies and a literature database with more than 1,000 scientific papers dealing with e-finance are available exclusively to industry partners and EFL members.
So far, the EFL databases contain detailed data on 1,005 corporations, comprising more than 123,000 data points.
32.3.5.7 Publications
Last but not least, for a research institute like the EFL, publications represent the single most important knowledge transfer channel. Of all 147 contributions of the EFL in 2005, 67% are written in English and 61% have been peer reviewed. 22% of the research papers of the EFL have been published in (international) journals.
32.4 Added Value and Challenges of the Industry-Academic Research Partnership
An industry-academic research partnership comprises people and organizations in the university as well as in business practice, and in the case of the EFL the sponsors provide not only expertise and personal connections but also funds to employ researchers. In general, the basic prerequisites for, and the advantages of, close research cooperation between industry and university can be boiled down to two questions. First: What is the value of research results in practice? To put it plainly: if industries, or the important decision makers within them, do not appreciate the value of research for the development of their enterprises, then there is almost no chance of convincing anyone of the value of a research cooperation. The second question is likewise linked to the requirement that both sides must enjoy cooperating with each other over the long term. In this sense: Does academia “please” the practitioners? In particular, application-oriented research approaches facilitate the transfer of knowledge into practice. And given that research results are very often complex and hard to understand, some so-called secondary virtues come into play: Do researchers seriously accept their practitioner counterparts as “buddies”? People sense this, e.g., in the atmosphere of a meeting. Or: Do researchers seriously work as much as their cooperation partners in industry? Often, the latter work hard and long (exceptions prove the rule) and do not easily understand if, for instance, the researchers are not at work in the late afternoon or do not check their emails on weekends.
In detail, we distinguish the following added value of an industry-academic research partnership:
• Scientists need not reinvent the wheel over and over. Experience shows that many concepts for solving an interesting problem have already been evaluated in the industry.
A research partnership helps to unearth former evaluations of concepts and ideas, provided academia allows situations in which the experience advantage of practitioners over scientists becomes evident.
• In particular, practitioners help to shape the structure of empirical investigations, specifying questions and the way in which answers are obtained. Because of their industry knowledge, they also help to increase the response rate. In the case of the EFL, the quality of the empirical data is often outstanding. In international comparison, our response rate is regularly more than twice as high, which nicely supports our requirement (see the respective point under “challenges”) to publish in top-tier journals. Or, to put it in other words: one long-term asset of the EFL is its unparalleled data collection, which we can use in subsequent research steps for numerous analytical research processes.
• The close cooperation between practice and academia also provides real-life problems (the data may of course be anonymized, but they are real-world data as opposed to mere “visions” of researchers) and, in the opposite direction, supplies the practitioners with profound methodological knowledge.
• The technology transfer from academia into practice is smoothed because industry people are involved in the specification of the research projects from the very beginning.
• We have top practitioners in the cooperative Ph.D. program, and it is simply a joy to work with them scientifically. They provide high-quality research results and publish these internationally.
• In this cooperative work environment, we create a stream of top-quality research results which differentiates us from many other research institutions and programs.
• The networking between the actors is of high importance and very valuable. The best brains on both sides connect with each other and easily exchange ideas as well as business. On this fertile ground, a lot of very good ideas and approaches are developed.
And what are the challenges of an industry-academic research partnership?
• The basic prerequisite is a congruence of interests on both sides. On the academic side this means: you need to have solid research questions and you need to perform solid research procedures. On the industry side you need to explain to your cooperation partners the advantages of particular research results. Often, this requires a dual presentation of results. One version is geared towards top scientific media, where, for example, sometimes more than 50% of an article is devoted to thoroughly describing the derivation of a research process; in the version for the industry this is replaced by the description of new insights. Moreover, in the latter case one often has to translate the results into the language of the “customers”.
• Aside from all short-term necessities to put new research results into everyday business work: solid practitioners look for sustainable results, not results of the type “valid today and probably tomorrow, but after that we have to perform a new research project”. Again, this requires solid research questions and solid research procedures. And in the long run, only results that pass the peer examination of the scientific community count, even in practice (see, for example, the current university excellence initiative in Germany). In other words: top-quality publications are required (plus a second type of presentation of results, one that is geared towards business).
• So, the “architecture” of the research design is of paramount importance. Take exciting questions from practice, come up with a solid research process, properly incorporate the experience of practitioners, and put all these things together properly, not only for a moment but for at least 2 years!
• In the EFL we execute only one-year research projects, which gives the members of the board time to review, at least once a year, the whole research
design, and high-quality research is then continued in the second or even the third term. The board of the EFL is made up of the 5 cluster heads, who supervise the research in their clusters, plus 1 senior representative of each of the main sponsors (currently 10). The board meets in person at least twice a year but may in addition decide via e-mail.
• In order to support the complex process of aligning the interests of practice and academia, the EFL has agreed on and maintains over the years its “research priorities”, a document that tries to provide long-term guidance on exciting topics that deserve more than one year of attention. We have also identified, for each cluster, the 15 most important scientific journals worldwide and the five most important conferences where the researchers want to present and discuss their results; the goal is “top notch publications”.
• Certainly, one needs a substantial share of “team players” in such a research group, on the academic as well as on the industry side. The group may “digest” the occasional excellent soloist, but the team must work closely together.
• You need excellent people who are willing to work more than usual in order to also meet the technology transfer requirements without inhibiting time delays.
• And, last but not least, you need resources so that all ambitious ideas really come true in the end. The EFL is very thankful for a budget of almost EUR 1.5 million in 2005 that funds more than 30 researchers. This group, for example, produced 147 publications in that year, half of them peer-reviewed contributions to highly ranked media and more than half of them in English.
In addition, all the other knowledge transfer channels have to be served (see Figure 32.4).
32.5 Conclusion
The EFL was founded as a means for practice and academia to cooperatively support the imminent changes in the German financial services industry. The mission of the EFL is to develop scientifically sound and relevant contributions to an industrialization of the financial processes. Research contributing to this goal is done within five research clusters in cooperation with the industry partners. Many of the publications rely on empirical studies conducted according to high scientific standards within five phases (problem definition, operationalization, application, analysis and utilization). The EFL uses a specialized software toolkit to support and structure empirical analysis and data gathering in the context of quantitative and qualitative analyses. Other important facets of its success are the manifold channels of knowledge transfer between academia and industry. Aside from a large number of publications, cooperative research projects and a cooperative
Ph.D. program are at the core of the knowledge transfer, complemented by a variety of printed materials such as newsletters and brochures and by a website with all information tailored to the needs of the industry partners. As a result of the close cooperation and the mutual knowledge transfer, the research partnership offers added value to both practitioners and scientists. Practitioners provide real-life problems and help to shape, for example, the structure of empirical investigations with their knowledge of the industry, while researchers bring in sound scientific methods to investigate how concepts that have been developed and successfully applied in other industries may be adapted to the finance industry. Alongside this added value, the partnership has to meet challenges such as the required congruence of interests on both sides, the long-term validity of research results, and the dual presentation of results, depending on the audience.
References
Atteslander P (2003) Methoden der empirischen Sozialforschung. Walter de Gruyter, Berlin, New York
Bortz J, Döring N (2002) Forschungsmethoden und Evaluation für Human- und Sozialwissenschaftler. Springer, Berlin, Heidelberg, New York
Opp KD (1999) Methodologie der Sozialwissenschaften. Westdeutscher Verlag, Opladen
Rynes SL, Bartunek JM, Daft RL (2001) Across the great divide: knowledge creation between practitioners and academics. Academy of Management Journal 44(2): 340–355