Advances in COMPUTERS
VOLUME 54
Advances in COMPUTERS
Trends in Software Engineering

EDITED BY
MARVIN V. ZELKOWITZ
Department of Computer Science and Institute for Advanced Computer Studies
University of Maryland
College Park, Maryland

VOLUME 54

ACADEMIC PRESS
A Harcourt Science and Technology Company
San Diego  San Francisco  New York  Boston  London  Sydney  Tokyo
This book is printed on acid-free paper.

Copyright © 2001 by ACADEMIC PRESS

All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher.

The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2000 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. /00 $35.00

Explicit permission from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press chapter in another scientific or research publication provided that the material has not been credited to another source and that full credit to the Academic Press chapter is given.

Academic Press
A Harcourt Science and Technology Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.academicpress.com

Academic Press
A Harcourt Science and Technology Company
Harcourt Place, 32 Jamestown Road, London NW1 7BY, UK
http://www.academicpress.com

International Standard Book Number 0-12-012154-9

Typeset by Mathematical Composition Setters Ltd, Salisbury, UK
Printed in Great Britain by MPG Books Ltd, Bodmin, UK
01 02 03 04 05 06 MP 9 8 7 6 5 4 3 2 1
Contents

CONTRIBUTORS ........ ix
PREFACE ........ xiii

An Overview of Components and Component-Based Development
Alan W. Brown
  1. Introduction ........ 2
  2. The Goals of Component Approaches ........ 3
  3. Why Component-Based Development? ........ 3
  4. What is a Component? ........ 5
  5. What is the Execution Environment for Components? ........ 13
  6. How are Applications Assembled using CBD? ........ 20
  7. What is the Current Practice in CBD Today? ........ 24
  8. Summary ........ 32
     References ........ 33

Working with UML: A Software Design Process Based on Inspections for the Unified Modeling Language
Guilherme H. Travassos, Forrest Shull, and Jeffrey Carver
  1. Introduction ........ 36
  2. The Unified Modeling Language (UML) ........ 40
  3. Software Process Activities ........ 43
  4. The Example ........ 64
  5. Maintenance or Evolution ........ 86
  6. The Road Ahead ........ 94
     References ........ 95

Enterprise JavaBeans and Microsoft Transaction Server: Frameworks for Distributed Enterprise Components
Avraham Leff, John Prokopek, James T. Rayfield, and Ignacio Silva-Lepe
  1. Introduction ........ 100
  2. Component Evolution ........ 101
  3. Object Transaction Monitors ........ 114
  4. Enterprise JavaBeans and Microsoft Transaction Server ........ 120
  5. Parallel Evolution ........ 128
  6. Sample Application ........ 136
  7. Continued Evolution ........ 142
  8. Conclusion ........ 147
     References ........ 151

Maintenance Process and Product Evaluation Using Reliability, Risk, and Test Metrics
Norman F. Schneidewind
  1. Introduction ........ 154
  2. Related Research and Projects ........ 155
  3. Concept of Stability ........ 157
  4. Metrics for Long-Term Analysis ........ 160
  5. Metrics for Long-Term and Short-Term Analysis ........ 160
  6. Data and Example Application ........ 160
  7. Relationships among Maintenance, Reliability, and Test Effort ........ 163
  8. Shuttle Operational Increment Functionality and Process Improvement ........ 175
  9. United States Air Force Global Awareness (GA) Program Application ........ 177
  10. Conclusions ........ 180
      Acknowledgments ........ 180
      References ........ 180

Computer Technology Changes and Purchasing Strategies
Gerald V. Post
  1. Introduction ........ 184
  2. Moore's Law: The Beginning ........ 185
  3. Mainframes to Personal Computers: Price/Performance ........ 187
  4. Personal Computers ........ 191
  5. Laptops ........ 202
  6. Centralization, Decentralization, and TCO ........ 205
  7. Demand ........ 206
  8. The Future ........ 208
  9. Conclusions ........ 210
     References ........ 211

Secure Outsourcing of Scientific Computations
Mikhail J. Atallah, K.N. Pantazopoulos, John R. Rice, and Eugene Spafford
  1. Introduction ........ 216
  2. General Framework ........ 223
  3. Applications ........ 236
  4. Security Analysis ........ 247
  5. Cost Analysis ........ 265
  6. Conclusions ........ 268
     References ........ 270

AUTHOR INDEX ........ 273
SUBJECT INDEX ........ 279
CONTENTS OF VOLUMES IN THIS SERIES ........ 287
Contributors

Mikhail J. Atallah received a BE degree in electrical engineering from the American University, Beirut, Lebanon, in 1975, and MS and PhD degrees in electrical engineering and computer science from Johns Hopkins University, Baltimore, Maryland, in 1980 and 1982, respectively. In 1982, Dr. Atallah joined the Purdue University faculty in West Lafayette, Indiana, where he is currently a professor in the computer science department. In 1985, he received an NSF Presidential Young Investigator Award from the US National Science Foundation. His research interests are information security and algorithms (in particular for geometric and parallel computation). Dr. Atallah is a fellow of the IEEE, and serves or has served on the editorial boards of SIAM Journal on Computing, Journal of Parallel and Distributed Computing, Information Processing Letters, Computational Geometry: Theory & Applications, International Journal of Computational Geometry & Applications, Parallel Processing Letters, and Methods of Logic in Computer Science. He was Guest Editor for a Special Issue of Algorithmica on Computational Geometry, has served as Editor of the Handbook of Parallel and Distributed Computing (McGraw-Hill), as Editorial Advisor for the Handbook of Computer Science and Engineering (CRC Press), and as Editor for the Handbook of Algorithms and Theory of Computation (CRC Press).

Alan W. Brown, PhD, is a Senior Technical Evangelist at Catapulse, Inc., a Silicon Valley start-up company leveraging the Internet to create a new generation of software development services and tools. Previously Alan was vice president of R&D for Computer Associates' Application Development products, where he was responsible for advanced technology activities across the organization. Alan joined Computer Associates on their acquisition of Sterling Software in April 2000. Prior to joining Sterling Software, he was director of software research for Texas Instruments Software (TIS), which was acquired by Sterling Software in 1997. Alan also led the Object Technology Branch for Texas Instruments, Inc.'s corporate software research labs in its investigation of advanced research and development products. Previously Alan spent five years at the Software Engineering Institute (SEI) at Carnegie Mellon University in Pittsburgh, Pennsylvania. There he led the CASE environments project, advising a variety of US government agencies and contractors on the application and integration of CASE technologies.
Jeff Carver is a graduate research assistant in the Experimental Software Engineering Group at the University of Maryland. He received his BS in Computer Science from Louisiana State University. He is currently pursuing a PhD degree. His research interests include software inspections, reading techniques, and process improvement.

Avraham Leff has been a Research Staff Member in the Intelligent Object Technology group at IBM since 1991. His research interests include distributed components and distributed application development. He received a BA in Computer Science and Mathematical Statistics from Columbia University in 1984, and an MS and PhD in Computer Science from Columbia University in 1985 and 1992, respectively. Dr. Leff has been issued two patents, and has five patents pending.

Konstantinos N. Pantazopoulos received a BE degree in computer and informatics engineering from The University of Patras, School of Engineering, Patras, Greece, in 1991 and MS and PhD degrees in Computational Science from Purdue University, West Lafayette, Indiana, in 1995 and 1998, respectively. In 1998 Dr. Pantazopoulos joined Goldman, Sachs Inc. in New York and later moved to London, where he is currently in the Fixed Income Markets division. His research interests include numerical algorithms with applications in finance and information security.

Gerald Post is a professor of Management Information Systems (MIS) at the University of the Pacific. He has a PhD from Iowa State University in international economics and statistics, and postdoctoral work in MIS at the University of Indiana. He has written textbooks on MIS and database management systems. He has published articles on decision-making, information technology, computer security, and statistical analysis. His research has appeared in leading journals such as MIS Quarterly, Communications of the ACM, Journal of Marketing Research, Decision Sciences, Journal of MIS, and Information & Management. He buys his own computers, so he has a vested interest in purchasing strategies.

John Prokopek is a Senior Development Engineer at Instinet, a Reuters Company, providing Institutional and Retail Investment services. His areas of specialization include object-oriented and component technologies and Microsoft NT Technologies. He received a BS in Chemistry from Siena College, Loudonville, NY in 1980.
James T. Rayfield is a Research Staff Member and manager in the Intelligent Object Technology group. He joined IBM in 1989. His research interests include object-oriented transaction-processing systems and database systems. He received an ScB in 1983, an ScM in 1985, and a PhD in
1988, all in Electrical Engineering from Brown University. Dr. Rayfield has two patents issued and five patents pending. John R. Rice studied mathematics at the California Institute of Technology,
receiving his PhD in 1959. He came to Purdue University in 1964 as Professor of Mathematics and Computer Science. He was Head of Computer Sciences from 1983 through 1996, and in 1989 he was appointed W. Brooks Fortune Distinguished Professor of Computer Sciences. His early research work was mostly in mathematics (approximation theory), but over the years his research shifted to computer science, and he now works in the areas of parallel computation, scientific computing, problem solving environments, solving partial differential equations, and computer security. His professional career includes terms as Chair of the ACM Special Interest Group on Numerical Mathematics (1970-73) and Chair of the Computing Research Association (1991-93). He was founder and Editor-in-Chief of the ACM Transactions on Mathematical Software (1975-1993). Professional honors include the 1975 Forsythe Distinguished Lectureship, Fellow of the AAAS, Fellow of the ACM, and election to the National Academy of Engineering.

Norman F. Schneidewind is Professor of Information Sciences and Director
of the Software Metrics Research Center in the Division of Computer and Information Sciences and Operations at the Naval Postgraduate School, where he teaches and performs research in software engineering and computer networks. Dr. Schneidewind is a Fellow of the IEEE, elected in 1992 for "contributions to software measurement models in reliability and metrics, and for leadership in advancing the field of software maintenance." He is the developer of the Schneidewind software reliability model that is used by NASA to assist in the prediction of software reliability of the Space Shuttle, by the Naval Surface Warfare Center for Trident software reliability prediction, and by the Marine Corps Tactical Systems Support Activity for distributed system software reliability assessment and prediction. This model is one of the models recommended by the American National Standards Institute and the American Institute of Aeronautics and Astronautics Recommended Practice for Software Reliability. He has published widely in the fields of software reliability and metrics. Forrest Shull is a scientist at the Fraunhofer Center for Experimental
Software Engineering, Maryland. He received his doctorate degree from the University of Maryland, College Park, in 1998. His current research interests include empirical software engineering, software reading techniques, software inspections, and process improvement. Contact him at [email protected].
Ignacio Silva-Lepe is a Research Staff Member in the Enterprise Middleware Group. He joined IBM in 1997. His research interests include distributed component middleware, message-oriented middleware, and networked virtual environment servers. He received a BS in Computer Systems Engineering from Universidad ITESO, Guadalajara, Mexico in 1985, and an MS and PhD in Computer Science from Northeastern University, Boston, Massachusetts in 1989 and 1994, respectively. Dr. Silva-Lepe has one patent pending.
Gene Spafford's current research interests are focused on issues of computer and network security, computer crime and ethics, and the social impact of computing. Spaf's involvement in information security led, in May of 1998, to Purdue University establishing the Center for Education and Research in Information Assurance and Security, with Dr. Spafford as its first Director. This university-wide center addresses the broader issues of information security and information assurance, and draws on expertise and research across all of the academic disciplines at Purdue. Among many professional activities, Dr. Spafford is a member of the Computing Research Association's Board of Directors and the US Air Force's Science Advisory Board, and he is chair of ACM's US Public Policy Committee. In 1996, he was named a charter recipient of the Computer Society's Golden Core for his past service to the Society, in 1997 he was named as a Fellow of the ACM, and in 1999 he was named as a Fellow of the AAAS. Dr. Spafford is the Academic Editor of the journal Computers & Security, and on the editorial and advisory boards of the Journal of Artificial Life, ACM's Transactions on Information and System Security, and Network Security.

Guilherme H. Travassos is an Associate Professor of Computer Science at the COPPE-Federal University of Rio de Janeiro, Brazil. He received his doctorate degree from the COPPE/UFRJ in 1994. His current research interests include empirical software engineering, software quality, software engineering environments, and process improvement. He was a faculty researcher in the Experimental Software Engineering Group at the Department of Computer Science, University of Maryland from 1998 until 2000. Contact him at [email protected].
Preface
As we enter the 21st century, the Advances in Computers remains at the forefront in presenting the new developments in the ever-changing field of information technology. Since 1960, the Advances has chronicled the constantly shifting theories and methods of this technology that greatly shapes our lives today. In this 54th volume in this series, we present six chapters on the changing face of software engineering--the process by which we build reliable software systems. We are constantly building faster and less expensive processors, which allow us to use different processes to try and conquer the "bug" problem facing all developments--how to build reliable systems with few errors at low or at least manageable cost. The first three chapters emphasize components and the impact that object oriented design is having on the program development process. The final three chapters present other aspects of the software development process.

In the first chapter, "An overview of components and component-based development," Alan W. Brown describes the virtues of building systems using a component-based development strategy. The goal is to reuse existing software artifacts without the need to rebuild them, by building them initially within a common infrastructure framework. The chapter discusses various approaches toward these framework infrastructures.

In Chapter 2, "Working with UML: A software design process based on inspections for the Unified Modeling Language," the authors Guilherme Travassos, Forrest Shull, and Jeffrey Carver build upon the ideas in the first chapter. The Unified Modeling Language (UML) has been proposed as a notation for building object oriented (OO) systems. OO design is one of the approaches that allow one to effectively build the reusable components referred to in the first chapter. In this chapter, the authors discuss ways to ensure that the UML description of a new product is both correct and easily implementable.

"Enterprise JavaBeans and Microsoft Transaction Server: Frameworks for distributed enterprise components" by Avraham Leff, John Prokopek, James T. Rayfield, and Ignacio Silva-Lepe, the title of Chapter 3, is a specialization of the ideas in the first two chapters. This chapter examines a specific type of component, the distributed enterprise component, that provides business function across an enterprise. Distributed enterprise
components require special functions such as distribution, persistence, and transactions, which are achieved by deploying the components in an object transaction monitor. Recently, distributed enterprise components and object transaction monitor technology have standardized into two competing frameworks: Sun's Enterprise JavaBeans and Microsoft's Transaction Server.

Maintaining a system is often half or more of the life-cycle costs of a large software system. In Chapter 4, "Maintenance process and product evaluation using reliability, risk, and test metrics" by Norman F. Schneidewind, the author discusses maintenance issues on the NASA Space Shuttle Program, a software system first used in 1981 and expected to evolve until at least 2020. Dr. Schneidewind has been investigating an important facet of process capability--stability--as defined and evaluated by trend, change, and shape metrics, across releases and within a release of a software product. Integration of product and process measurement serves the dual purpose of using metrics to assess and predict reliability and risk and to evaluate process stability.

Although computers have gotten faster, and paradoxically much cheaper, over the past 30 years, they still represent a significant investment for most people. In Chapter 5, Gerald Post in "Computer technology changes and purchasing strategies" discusses various strategies for when and how to purchase a computer. Since machines are constantly getting both faster and cheaper, waiting for the next faster machine to replace an older machine will always be cost effective. However, since this improvement cycle does not seem to be changing, the result is to always wait and never replace an old slow machine. If in need of a new machine, then, when do you do the "non-cost effective" action of actually buying a machine? Dr. Post discusses various strategies depending upon the use of the new machine.

In the final chapter, Mikhail J. Atallah, K.N. Pantazopoulos, John R. Rice, and Eugene E. Spafford in "Secure outsourcing of scientific computations," present a different approach to the computer replacement issue of the previous chapter. With machines being relatively inexpensive today, many of them are both connected to the Internet and also remain idle much of the time. That provides a rich source of computing power not being used. A customer needing computing services could "outsource" the problem to one of these idle machines. The problem that the authors address in this chapter is security. For security or proprietary business reasons, how can the customer use another machine to perform needed calculations without the owner of the donor machine knowing the true nature of the data being processed? (For a discussion of basic outsourcing concepts, see "Resource-aware meta-computing" by
J. Hollingsworth, P. Keleher, and K. Ryu in Advances in Computers, volume 53, 2000.) I hope that you find these chapters of interest. If you would like to see a specific topic covered by these Advances, let me know at
[email protected].

MARVIN V. ZELKOWITZ
College Park, Maryland
An Overview of Components and Component-Based Development

ALAN W. BROWN
Rational Software
5 Results Way
Cupertino, CA 95014
USA
[email protected]
Abstract

Components and component-based development are important technology advances in use by many organizations around the world. This chapter examines the main concepts and current practices involving these technologies. In particular, the chapter offers an analysis of the current state of component-based development as practiced and supported in the software industry today.
1. Introduction ........ 2
2. The Goals of Component Approaches ........ 3
3. Why Component-Based Development? ........ 3
4. What is a Component? ........ 5
   4.1 Components and Objects ........ 6
   4.2 Components and Distributed Systems ........ 8
   4.3 Elements of a Component ........ 11
5. What is the Execution Environment for Components? ........ 13
   5.1 Component Infrastructure Services ........ 13
   5.2 Component Infrastructure Implementations ........ 14
6. How are Applications Assembled using CBD? ........ 20
   6.1 Sources of Components ........ 21
   6.2 Interface-Focused Design ........ 23
   6.3 Applications and Component Architecture ........ 23
7. What is the Current Practice in CBD Today? ........ 24
   7.1 Component Software Vendors ........ 25
   7.2 Component Software Portals ........ 26
   7.3 Component Infrastructure Vendors ........ 28
8. Summary ........ 32
   References ........ 33
¹ This paper is a substantially extended version of Chapter 4 in A. W. Brown, Large-Scale Component-Based Development, published by Prentice-Hall in May 2000. It is provided here with permission of Prentice-Hall.
1. Introduction
Complex situations in any walk of life are typically addressed by applying a number of key concepts. These are represented in approaches such as abstraction, decomposition, iteration, and refinement. They are the intellectual tools on which we rely to deal with difficult problems in a controlled, manageable way. Critical among them is the technique of decomposition--dividing a larger problem into smaller, manageable units, each of which can then be tackled separately. This technique is at the heart of a number of approaches to software engineering. The approaches may be called structured design, modular programming, or object orientation, and the units they produce are called modules, packages, or components. However, in every case the main principle involved is to consider a larger system to be composed from well-defined, reusable units of functionality, introduce ways of managing this decomposition of a system into pieces, and enable its subsequent reconstruction into a cohesive system.

While the concepts are well understood, most organizations struggle to apply these concepts to the provisioning of enterprise-scale solutions in the Internet age. However, driven by the challenges faced by software engineers today, many organizations are beginning to reassess their approach to the design, implementation, and evolution of software. This stems from growing pressure to assemble solutions quickly from existing systems, make greater use of third-party solutions, and develop reusable services for greater flexibility. Consequently, renewed attention is being given to reuse-oriented, component-based approaches. This in turn is leading to new component-based strategies being defined, supported by appropriate tools and techniques. The collective name for these new approaches is component-based development (CBD) or component-based software engineering (CBSE). For developers and users of software-intensive systems, CBD is viewed as a way to reduce development costs, improve productivity, and provide controlled system upgrade in the face of rapid technology evolution [1,2].

In this chapter we examine the renewed interest in reuse-oriented approaches, and define the goals and objectives for any application development approach able to meet the needs of today's software development organizations. Armed with this knowledge, we examine the major principles of CBD, and the characteristics that make CBD so essential to software developers and tool producers alike. We then consider the major elements of any CBD approach, and highlight the main concepts behind them. Finally, we assess the current state of CBD as practiced today.
2. The Goals of Component Approaches

Recently there has been renewed interest in the notion of software development through the planned integration of pre-existing pieces of software. This is most often called component-based development (CBD), component-based software engineering (CBSE), or simply componentware, and the pieces are called components. There are many on-going debates and disagreements concerning what exactly are and are not components [3]. Some authors like to emphasize components as conceptually coherent packages of useful behavior. Others concentrate on components as physical, deployable units of software that execute within some well-defined environment [4]. Regardless of these differences, the basic approach of CBD is to build systems from well-defined, independently produced pieces. However, the interesting aspects of CBD concern how this approach is realized to allow components to be developed as appropriate cohesive units of functionality, and to facilitate the design and assembly of systems from a mix of newly and previously developed components.

The goal of building systems from well-defined pieces is nothing new. The interest in CBD is based on a long history of work in modular systems, structured design, and most recently in object oriented systems [5-9]. These were aimed at encouraging large systems to be developed and maintained more easily using a "divide and conquer" approach. CBD extends these ideas, emphasizing the design of solutions in terms of pieces of functionality provisioned as components, accessible to others only through well-defined interfaces, outsourcing of the implementation of many pieces of the application solution, and focusing on controlled assembly of components using interface-based design techniques. Most importantly, these concepts have been supported by a range of products implementing open standards that offer an infrastructure of services for the creation, assembly, and execution of components. Consequently, the application development process has been re-engineered such that software construction is achieved largely through a component selection, evaluation, and assembly process. The components are acquired from a diverse set of sources, and used together with locally developed software to construct a complete application [10].
3. Why Component-Based Development?

Many challenges face software developers today to provision enterprise-scale solutions. By reviewing and distilling the challenges being faced, we
obtain the following goals and objectives for enterprise-scale solutions in the Internet age:

• Contain complexity. In any complex situation there are a few basic techniques that can be used to understand and manage that complexity. These are the techniques of abstraction, decomposition, and incremental development. Any solution to application development must provide ways to support these techniques.

• Reduce delivery time. The ability to deliver solutions in a timely manner is an essential aspect of any software development project. With the increased rate of change of technology, this aspect is even more critical. This need for reduced delivery time for software-intensive systems is often referred to as working at "Internet speed."

• Improve consistency. Most software-intensive systems share significant characteristics with others previously developed, in production, or yet to be produced. It must be possible to take advantage of this commonality to improve consistency and reduce development expense.

• Make use of best-in-class. In a number of areas there are well-developed solutions offering robust, best-in-class functionality and performance. Taking advantage of these solutions as part of a larger development effort is essential.

• Increase productivity. The shortage of software development skills is causing a major backlog for systems users. Any new approaches must improve the productivity of skilled employees to allow them to produce quality results at a faster rate.

• Improve quality. As the economic and human impact of failure of software-intensive systems increases, greater attention must be turned to the quality of the deployed systems. A goal must be to support the building of systems correctly the first time, without extensive (and expensive) testing and rewriting.

• Increase visibility into project progress. Managing large software projects is a high-risk undertaking. To help this, greater visibility must be possible throughout the software life-cycle. This requires an incremental approach to development, delivery, and testing of software artifacts.

• Support parallel and distributed development. Distributed development teams require approaches that encourage and enable parallel development of systems. This requires specific attention to be given to manage complexity due to the need to partition and resynchronize results.

• Reduce maintenance costs. The majority of software costs occur after initial deployment. To reduce maintenance costs it must be possible to
more easily identify the need for change, scope the impact of any proposed change, and implement that change with predictable impact on the rest of the system. This list represents a daunting set of challenges for any approach. Yet it is components and component-based approaches that offer the most promising attempt to meet the challenges head-on, and provide the basis of a new set of techniques supporting the next generation of software-intensive solutions.
4. What is a Component?
The key to understanding CBD is to gain a deeper appreciation of what is meant by a component, and how components form the basic building blocks of a solution. The definition of what it means to be a component is the basis for much of what can be achieved using CBD, and in particular provides the distinguishing characteristics between CBD and other reuse-oriented efforts of the past. For CBD a component is much more than a subroutine in a modular programming approach, an object or class in an object oriented system, or a package in a system model. In CBD the notion of a component both subsumes and expands on those ideas. A component is used as the basis for design, implementation, and maintenance of component-based systems. For now we will assume a rather broad, general notion of a component, and define a component as:
An independently deliverable piece of functionality providing access to its services through interfaces.

This definition, while informal, stresses a number of important aspects of a component. First, it defines a component as a deliverable unit. Hence, it has characteristics of an executable package of software. Second, it says a component provides some useful functionality that has been collected together to satisfy some need. It has been designed to offer that functionality based on some design criteria. Third, a component offers services through interfaces. To use the component requires making requests through those interfaces, not by accessing the internal implementation details of the component. Of course, this definition is rather informal, and provides little more than an intuitive understanding of components and their characteristics. It is in line with other definitions of a component [11, 12], and is sufficient to allow us to begin a more detailed investigation into components and their use.
However, it is not sufficient for in-depth analysis of component approaches when comparing different design and implementation approaches. Consequently, more formal definitions of a component are provided elsewhere [10, 13, 14]. To gain greater insight into components and component-based approaches, it is necessary to explore a number of topics in some detail. In particular, it is necessary to look at components and their use of object oriented concepts, and view components from the perspective of distributed systems design. Based on such an understanding, the main component elements can then be highlighted.
4.1 Components and Objects²

² This discussion is based on John Daniels' excellent short paper on objects and components [13].
In discussions on components and component-based approaches there is much debate about components in relation to the concept of objects and object oriented approaches. Examining the relationship between objects and components provides an excellent starting point for understanding component approaches [13, 14]. For over 30 years there have been attempts to improve the design of programming languages to create a closer, more natural connection between the business-oriented concepts in which problems are expressed, and the technology-oriented concepts in which solutions are described as a set of programming language statements. In the past decade these attempts have led to a set of principles for software structure and behavior that have come to be called object oriented programming languages (OOPLs). There are many variations in OOPLs. However, as illustrated in Fig. 1, there are a number of concepts that have come to characterize OOPLs [15].

FIG. 1. Basic concepts of object-oriented approaches.

The commonly identified principles of object orientation are:

• Objects: A software object is a way of representing in software an idea, a thing, or an event according to a chosen set of principles, these principles being the following five.

• Encapsulation: A software object provides a set of services and manipulates data within the object. The details of the internal operation and data structures are not revealed to clients of the object.

• Identity: Every object has a fixed, unique "tag" by which it can be accessed by other parts of the software. This tag, often called an object identifier, provides a way to uniquely distinguish that object from others with the same behavior.
• Implementation: An implementation defines how an object works. It defines the structure of data held by the object and holds the code of the operations. In object oriented languages the most common form of implementation is the class. It is possible for an implementation to be shared by many objects.

• Interface: An interface is a declaration of the services made available by the object. It represents a contract between an object and any potential clients. The client code can rely only on what is defined in the interface. Many objects may provide the same interface, and each object can provide many interfaces.

• Substitutability: Because a client of the object relies on the interface and not the implementation, it is often possible to substitute other object implementations at run-time. This concept allows late, or dynamic, binding between objects, a powerful feature for many interactive systems.

For the last decade, object oriented principles have been applied to other fields, notably databases and design methods. More recently they have been used as the basis for a number of advances in distributed computing as an approach to support the integration of the various pieces of a distributed application. Based on this analysis, a component can be seen as a convenient way to package object implementations, and to make them available for assembly into a larger software system. As illustrated in Fig. 2, from this perspective a component is a collection of one or more object implementations within the context of a component model.

FIG. 2. The relationship between components and objects.

This component model defines a set of
rules that must be followed by the component to make those object implementations accessible to others. Furthermore, it describes a set of standard services that can be assumed by components and assemblers of component-based systems (e.g., for naming of components and their operations, security of access to those operations, transaction management, and so on). Organizations can define their own component models based on the needs of their customers and the tools they use to create and manipulate components (e.g., Computer Associates' proprietary CS/3.0 standard for components developed using the COOL:Gen product). Alternatively, a number of more widely used component model standards are now available, notably the Enterprise JavaBeans (EJB) standards from Sun Microsystems, and the COM+ standard from Microsoft.

In summary, we see that in comparing components to objects, a component is distinguished by three main characteristics:

• A component acts as a unit of deployment based on a component model defining the rules for components conforming to that model.

• A component provides a packaging of one or more object implementations.

• A component is a unit of assembly for designing and constructing a system from a number of independently created pieces of functionality, each potentially created using an OOPL or some other technology.
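As a concrete illustration of these principles, the following minimal Java sketch separates an interface from two encapsulated implementations and binds the client only to the interface, so either implementation can be substituted at run time. The spell-checker domain and every name in it are invented for this illustration; they are not drawn from any product or standard discussed in this chapter.

```java
// Interface: the services an object promises to provide.
interface SpellChecker {
    boolean isCorrect(String word);
}

// Two encapsulated implementations; their internal data structures are
// hidden behind the interface.
class DictionarySpellChecker implements SpellChecker {
    private final java.util.Set<String> words =
        new java.util.HashSet<>(java.util.Arrays.asList("component", "interface", "object"));

    @Override
    public boolean isCorrect(String word) {
        return words.contains(word.toLowerCase());
    }
}

class LenientSpellChecker implements SpellChecker {
    @Override
    public boolean isCorrect(String word) {
        return word.chars().allMatch(Character::isLetter);  // accepts any alphabetic word
    }
}

public class Editor {
    // The client depends only on the interface, so implementations are
    // substitutable without changing this code.
    static void check(SpellChecker checker, String word) {
        System.out.println(word + " -> " + checker.isCorrect(word));
    }

    public static void main(String[] args) {
        check(new DictionarySpellChecker(), "component");
        check(new LenientSpellChecker(), "component");
    }
}
```

Packaging such implementations behind a single published interface, together with the rules of a component model, is in miniature what a component does on a larger scale.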
4.2 Components and Distributed Systems

The 1980s saw the arrival of cheap, powerful hardware together with increasingly sophisticated and pervasive network technologies. The result was a move toward greater decentralization of computer infrastructures in most organizations. As those organizations built complex networks of
distributed systems, there was a growing need for development languages and approaches that supported these kinds of infrastructures. A wide range of approaches to application systems development emerged. One way to classify these approaches is based on the programming level that must be used to create distributed systems, and the abstract level of transparency that this supported. This is illustrated in Fig. 3.

FIG. 3. Different levels of transparency for distributed systems.

Initial approaches to building distributed systems were aimed at providing some level of hardware transparency for developers of distributed systems. This was based on technologies such as remote procedure calls (RPC) and use of message oriented middleware (MOM). These allow interprocess communication across machines independent of the programming languages used at each end. However, using RPCs still required developers to implement many services uniquely for each application. Other than some high-performance real-time systems, this expense is unnecessary. As a result, to improve the usability of these approaches a technique was required to define the services made available at each end of the RPC. To allow this, the concepts of object orientation were introduced as described above. By applying these concepts, application developers were able to treat remote processes as a set of software objects. The independent service providers can then be implemented as components. They are structured to offer those services through interfaces, encapsulating the implementation details that lie behind them. These service providers have unique identifiers, and can be substituted for others supporting the same interfaces. Such distributed object technologies are in widespread use today. They include Microsoft's Distributed
Component Object Model (DCOM) and the Object Management Group's Common Object Request Broker Architecture (CORBA).

This approach resulted in the concept of an Interface Definition Language (IDL). The IDL provides a programming language neutral way to describe the services at each end of a distributed interaction. It provides a large measure of platform independence to distributed systems developers. The services are typically constructed as components defined by their interfaces. The interfaces provide remote access to these capabilities without the need to understand many of the details of how those services are implemented. Many distributed computing technologies use this approach. However, many implementations supporting an IDL were not transportable across multiple infrastructures from different vendors. Each middleware technology supported the IDL in its own way, often with unique value-added services. To provide a middleware transparency for applications, bridging technologies and protocols have been devised (e.g., the Internet Inter-ORB Protocol (IIOP)). These allow components greater independence of middleware on which they are implemented. (A small Java sketch at the end of this subsection illustrates this interface-based, location-transparent style of access.)

Of course, what most people want is to assemble solutions composed of multiple components--independently deliverable pieces of system functionality. These components interact to implement some set of business transactions. Ideally, developers would like to describe the services offered by components, and implement them independently of how those services will be combined later on. Then, when assembled into applications for deployment to a particular infrastructure, the services can be assigned appropriate characteristics in terms of transactional behavior, persistent data management, security, and so on. This level of service transparency requires the services to be implemented independently of the characteristics of the execution environment. Application servers based on the COM+ and EJB specifications support this approach. These standards define the behavior a component implementer can rely upon from the "container" in which it executes. When being deployed to a container, the application assembler is able to describe the particular execution semantics required. These can change from one container to another without impact on the component's internal logic, providing a great deal of flexibility for upgrade of component-based solutions.

Finally, in many domains it is possible to imagine common sets of services that would frequently be found in many applications within that domain. For example, in areas such as banking and financial management it is possible to construct a common list of services for managing accounts, transfer of funds between accounts, and so on. This is the goal of application transparency. Groups of experts in an industry domain have
attempted to define common services in the form of a library, or interacting framework of components and services. As a result, application designers and implementers in that domain can use a pre-populated set of components when designing new application behavior. This increases the productivity of developers, and improves the consistency of the applications being produced. In summary, from the perspective of distributed systems, components provide a key abstraction for the design and development of flexible solutions. They enable designers to focus on business level concerns independently of the lower-level component implementation details. They represent the latest ideas of over two decades of distributed systems thinking.
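The chapter's own examples of this style are DCOM and CORBA; as a hedged illustration in Java, the sketch below instead uses Java RMI, an analogous Java-specific mechanism not discussed in the text, to show a service contract declared separately from its implementation and a client that locates the service by a name rather than by its location. The AccountService interface, the registry name, the port, and the returned balance are all assumptions made up for the example.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// The service contract, declared separately from any implementation.
interface AccountService extends Remote {
    double balanceOf(String accountId) throws RemoteException;
}

// One implementation; clients never see this class, only the interface.
class InMemoryAccountService implements AccountService {
    @Override
    public double balanceOf(String accountId) {
        return 42.0;   // illustrative value only
    }
}

public class LocationTransparencyDemo {
    public static void main(String[] args) throws Exception {
        // "Server" side: export the implementation and register it by name.
        InMemoryAccountService impl = new InMemoryAccountService();
        Registry registry = LocateRegistry.createRegistry(1099);
        AccountService stub = (AccountService) UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind("AccountService", stub);

        // "Client" side: look the service up by name and call it through the
        // interface, without knowing where the implementation actually runs.
        AccountService service = (AccountService) registry.lookup("AccountService");
        System.out.println("Balance: " + service.balanceOf("ACC-1"));

        UnicastRemoteObject.unexportObject(impl, true);   // allow the JVM to exit
    }
}
```

Here both sides run in one process for the sake of a self-contained example; the point is that the client code would be unchanged if the implementation moved to another machine.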
4.3 Elements of a Component
As a result of these analyses, the elements of a component can now be discussed. Building on object oriented concepts and distributed systems thinking, components and component-based approaches share a number of characteristics familiar to today's systems designers and engineers. However, a number of additional characteristics must also be highlighted. These are shown in Fig. 4.

FIG. 4. What is a component?

As illustrated in Fig. 4, there are five major elements of a component:

• A specification. Building on the interface concept, a component requires an abstract description of the services it offers to act as the
contract between clients and suppliers of the services. The component specification typically defines more than simply a list of available operations. It describes the expected behavior of the component for specific situations, constrains the allowable states of the component, and guides the clients in appropriate interactions with the component. In some cases these descriptions may be in some formal notation. Most often they are informally defined.

• One or more implementations. The component must be supported by one or more implementations. These must conform to the specification. However, the specification will allow a number of degrees of freedom on the internal operation of the component. In these cases the implementer may choose any implementation approach deemed to be suitable. The only constraint is on meeting the behavior defined in the specification. In many cases this flexibility includes the choice of programming language used to develop the component implementation. In fact, this behavior may simply be some existing system or packaged application wrapped in such a way that its behavior conforms to the specification defined within the context of the constraining component standard.

• A constraining component standard. Software components exist within a defined environment, or component model. A component model is a set of services that support the software, plus a set of rules that must be obeyed by the component in order for it to take advantage of the services. Established component models include Microsoft's COM+, Sun's JavaBeans and Enterprise JavaBeans (EJB), and the OMG's emerging CORBA Component Standard. Each of these component models addresses issues such as how a component makes its services available to others, how components are named, and how new components and their services are discovered at run-time. Additionally, those component models concerned with enterprise-scale systems provide additional capabilities such as standard approaches to transaction management, persistence, and security.

• A packaging approach. Components can be grouped in different ways
to provide a replaceable set of services. This grouping is called a package. Typically, it is these packages that are bought and sold when acquiring components from third-party sources. They represent units of functionality that must be installed on a system. To make this package usable, some sort of registration of the package within the component model is expected. In a Microsoft environment, for example, this is through a special catalog of installed components called the registry.
• A deployment approach. Once the packaged components are installed in an operational environment, they will be deployed. This occurs by creating an executable instance of a component and allowing interactions with it to occur. Note that many instances of the component can be deployed. Each one is unique and executes within its own process. For example, it is possible to have two unique instances of an executing component on the same machine handling different kinds of user requests.
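To connect these elements to code, the following Java sketch is purely illustrative: the currency-converter domain, the names, and the in-memory catalog are assumptions, not part of COM+, EJB, or CORBA. It treats an interface as the specification, a class as one conforming implementation, a small map as a crude stand-in for the packaging and registration step, and object creation as the deployment of independent instances.

```java
import java.util.HashMap;
import java.util.Map;

// Specification: an abstract description of the services offered.
interface CurrencyConverter {
    double convert(double amount, String from, String to);
}

// One possible implementation; others could be substituted as long as
// they honor the same specification.
class FixedRateConverter implements CurrencyConverter {
    private final Map<String, Double> ratesToUsd = new HashMap<>();

    FixedRateConverter() {
        ratesToUsd.put("USD", 1.0);
        ratesToUsd.put("EUR", 1.1);   // illustrative rates only
        ratesToUsd.put("GBP", 1.3);
    }

    @Override
    public double convert(double amount, String from, String to) {
        return amount * ratesToUsd.get(from) / ratesToUsd.get(to);
    }
}

// A toy stand-in for packaging and registration: real component models use
// catalogs such as the Windows registry or a deployment descriptor.
public class ComponentCatalog {
    private static final Map<String, Class<? extends CurrencyConverter>> catalog = new HashMap<>();

    public static void main(String[] args) throws Exception {
        catalog.put("converter", FixedRateConverter.class);

        // Deployment: create independent executing instances of the component.
        CurrencyConverter a = catalog.get("converter").getDeclaredConstructor().newInstance();
        CurrencyConverter b = catalog.get("converter").getDeclaredConstructor().newInstance();
        System.out.println(a.convert(100, "EUR", "GBP"));
        System.out.println(b.convert(50, "USD", "EUR"));
    }
}
```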
5. What is the Execution Environment for Components?
To support a component-based approach, it is common to use some form of component infrastructure (sometimes also called "component oriented middleware") to handle all of the complex details of component coordination [16]. Essentially, the component infrastructure provides a common set of component management services made available to all components interested in using that infrastructure. The component infrastructure imposes constraints on the design and implementation of the components. However, in return for abiding by these constraints, the component developer and application assembler are relieved from the burden of developing many complex services within their application. To understand component infrastructures, it is necessary to understand the kinds of services the infrastructure can make available, and the different competing infrastructure implementations currently available.
5.1 Component Infrastructure Services
The use of a component infrastructure arises as a result of a simple premise: the services common to many components should be extracted and provided once in a consistent way to all components. This provides greater control and flexibility over those common services, and allows component developers to concentrate on the specification and implementation of business aspects of the component. There are many kinds of services that the component infrastructure may offer. However, the component infrastructure is typically responsible for at least the following categories of services:

• Packaging. When developing a component it is necessary to provide some description of that component in a form that is understandable to the component infrastructure. At a minimum the component
infrastructure needs to know what services the component makes available, and the signatures of the methods which invoke those services. Requests for external component services must also be made in some standard form so that they can be recognized by the component infrastructure.

• Distribution. The distribution services are responsible for activating and deactivating component instances, and for managing the allocation of component instances to remote host processes on which they execute. Once a client of a component service makes a request, the component infrastructure is responsible for routing that request to the appropriate component instance, which may involve activating a component instance to service the request. This provides location transparency between clients and servers of requests--the client does not need to know where the component instance servicing the request resides, and the component instance does not need to have knowledge of the origin of possible requests.

• Security. In distributed systems there must be services for authenticating the source of requests and responses to requests, and for privacy of connections when information is transmitted. The component infrastructure may provide various levels of privacy to ensure secure, trusted connections between components can take place.

• Transaction management. As each component may manage its own persistent data, a single high-level function may require many interactions among components affecting many individually managed pieces of data. As a result, partially completed functions have the potential for leaving this collection of data in an inconsistent state. Distributed transaction management is provided by the component infrastructure to manage and coordinate these complex component interactions.

• Asynchronous communication. It is not always necessary or possible to communicate synchronously between components. Because components may be distributed across multiple host processes, there is always the possibility that some components will be unavailable to respond to requests. The component infrastructure supports asynchronous communication among components, typically through some form of queuing system for requests.
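As a toy illustration of the last of these services (a sketch only; real infrastructures provide this through message-queuing products such as those described below, and the queue, thread, and message strings here are invented for the example), the following Java program lets a client hand requests to a queue and continue immediately, while a separate receiver drains the queue whenever it is able to:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AsyncRequestDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> requestQueue = new ArrayBlockingQueue<>(16);

        // A "component" that services requests whenever it is running.
        Thread receiver = new Thread(() -> {
            try {
                while (true) {
                    String request = requestQueue.take();   // blocks until a request arrives
                    System.out.println("Handling request: " + request);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.setDaemon(true);
        receiver.start();

        // The client enqueues requests and continues immediately, without
        // waiting for the receiver to be available.
        requestQueue.put("createOrder(42)");
        requestQueue.put("cancelOrder(17)");
        Thread.sleep(200);   // give the receiver time to drain the queue before exit
    }
}
```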
5.2 Component Infrastructure Implementations
In the world of component infrastructure technologies, a number of solutions are now in use. Currently, there are three dominant component
infrastructure choices possible: The Object Management Group's (OMG's) Object Management Architecture, Microsoft's distributed computing architecture, and Sun's Java-based distributed component technology. In each of these there is a vision for building enterprise-scale component-based applications supported by a set of standards and products. Here, we briefly review the main elements of these approaches.
5.2.1 OMG's Object Management Architecture

The need for a widely agreed component infrastructure led to the formation of the Object Management Group (OMG), a large consortium of over 900 companies attempting to come to agreement on an appropriate component model and services for building component-based distributed systems. The OMG is a large and complex organization, with many special interest groups, focus areas, and task forces. It attempts to provide standards for building component oriented applications, and encourages those standards to be followed by vendors of component infrastructure products and developers of component oriented applications. There are many OMG standards under development, a number of them currently being supported by products.

OMG's vision for component-oriented applications is defined in its Object Management Architecture (OMA) [17]. This consists of a specification of the underlying distributed architecture for component communication providing the packaging services and some of the distribution services. The remaining component infrastructure services are developed to make use of those services. The main component infrastructure standard provided by OMG is the Common Object Request Broker Architecture (CORBA) [18]. This defines the basic distribution architecture for component oriented applications. There are three major aspects of CORBA:

• The OMG's Interface Definition Language (IDL), which describes how business functionality is packaged for external access through interfaces.

• The CORBA component model describing how components can make requests of each other's services.

• The Internet Inter-ORB Protocol (IIOP), which allows different CORBA implementations to interoperate.

Together with the CORBA standard, a set of additional capabilities is defined in the CORBA Services standards [19]. A wide range of services has been defined, or is currently under investigation. However, the following
services are those that are most often found in currently available implementations:
• Life-cycle services, which control the creation and release of component instances.
• Naming services, which allow identification and sharing of component instances.
• Security services, which provide privacy of connection between a client and provider of services.
• Transaction services, which allow a user to control the start and completion of distributed transactions across components.
Recently the OMG also released a request for proposals for a component model for CORBA [20]. This is intended to extend the CORBA model to allow CORBA components to be defined through extensions to the IDL. This would allow server-side components to be created and their relationships to be described. The proposal under consideration draws many of its design ideas from Sun's Enterprise JavaBeans (EJB) specification (discussed later). It is expected that this proposal will be accepted as an OMG standard early in 2000.
A number of implementations conforming to the various OMG standards are now available on a variety of platforms. For distributed applications executing across heterogeneous platforms, the OMG approach to component infrastructure has been shown to be a viable way to build component-based applications. There are a number of examples of successful component-based implementations in application domains such as banking, telecommunications, and retail.
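To make these ideas more concrete, the following minimal Java sketch (not part of the original chapter) shows how a client might use the CORBA Naming Service to obtain a reference to a component registered by a server. The name "AccountManager" and the suggestion of an IDL-generated AccountManagerHelper stub are hypothetical, and the code assumes an ORB and naming service such as those provided by the Java IDL support in the JDK.

```java
import org.omg.CORBA.ORB;
import org.omg.CosNaming.NameComponent;
import org.omg.CosNaming.NamingContext;
import org.omg.CosNaming.NamingContextHelper;

public class CorbaClientSketch {
    public static void main(String[] args) throws Exception {
        // Initialize the ORB, the packaging and distribution layer of the OMA.
        ORB orb = ORB.init(args, null);

        // Locate the CORBA Naming Service (one of the CORBA Services).
        org.omg.CORBA.Object nsRef = orb.resolve_initial_references("NameService");
        NamingContext naming = NamingContextHelper.narrow(nsRef);

        // "AccountManager" is a hypothetical name under which a server has
        // registered an object implementing an IDL-defined interface.
        NameComponent[] path = { new NameComponent("AccountManager", "") };
        org.omg.CORBA.Object component = naming.resolve(path);

        // A real client would now narrow the reference to the IDL-generated
        // stub type (e.g., AccountManagerHelper.narrow(component)) and invoke
        // the operations declared in the IDL interface.
        System.out.println("Obtained reference: " + component);
    }
}
```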
5.2.2 Microsoft's COM+
As can be expected, Microsoft, the largest software company in the world, has had a major influence on how people think about components and component oriented approaches. As Microsoft shifts its focus from desktop applications to enterprise-scale commercial solutions, it has described its vision for the future of application development as a component oriented approach building on Microsoft's existing dominant desktop technologies [21]. To enable sharing of functionality across desktop applications, Microsoft developed the Component Object Model (COM) as the basis for interapplication communication [22]. Realizing the value of COM as a generic approach to component interoperation, Microsoft defined its strategy of component-based applications to consist of two parts. The first
is its packaging and distribution services, Distributed COM (DCOM), providing intercomponent communication. The second is currently referred to as Microsoft's Distributed interNet Applications (DNA) architecture, providing the additional categories of component infrastructure services making use of DCOM. Collectively these ideas are referred to as Microsoft COM+ [23].
The packaging and distribution services implemented in DCOM consist of three major aspects:
• The Microsoft Interface Definition Language (MIDL), which describes how business functionality is packaged for external access through interfaces.
• The COM component model, describing how components can make requests of each other's services.
• The DCOM additions to COM, providing support for location transparency of component access across a network.
Additional component infrastructure services are provided by Microsoft via two products, both making extensive use of the underlying packaging and distribution services:
• The Microsoft Transaction Server (MTS), which provides security and transaction management services.
• The Microsoft Message Queue (MSMQ), which provides support for asynchronous communication between components via message queues.
The Microsoft component infrastructure services offer significant functionality to builders of component-based applications for Windows platforms. For anyone building a distributed Windows NT or Windows 2000 solution, these services provide essential capabilities to greatly reduce the cost of assembling and maintaining component-based applications. Many Windows-based applications make significant use of the COM+ technologies, including many of Microsoft's own desktop applications such as the Microsoft Office Suite. Furthermore, a wide range of Microsoft-focused components is available from third-party software vendors (for example, there are hundreds of COM components listed at Microsoft's site; see http://www.microsoft.com/componentresources).
5.2.3 Sun's Java-based Distributed Component Environment
One of the most astonishing successes of the past few years has been the rapid adoption of Java as the language for developing client-side
applications for the web [24, 25]. However, the impact of Java is likely to be much more than a programming language for animating web pages. Java is in a very advantageous position to become the backbone of a set of technologies for developing component-based, distributed systems. Part of this is a result of a number of properties of Java as a language for writing programs:
• Java was designed specifically to build network-based applications. The language includes support for distributed, multithreaded control of applications.
• Java's run-time environment allows pieces of Java applications to be changed while a Java-based application is executing. This supports various kinds of incremental evolution of applications.
• Java is an easier language to learn and use for component-based applications than its predecessors such as C++. Many of the more complex aspects of memory management have been simplified in Java.
• Java includes constructs within the language supporting key component-based concepts, such as separating component specification and implementation via the interface and class constructs.
However, Java is much more than a programming language. There are a number of Java technologies supporting the development of component-based, distributed systems. This is what allows us to consider Java as a component infrastructure technology [26]. More specifically, there are a number of Java technologies providing packaging and distribution services. These include [27]:
• JavaBeans, which is the client-side component model for Java. It is a set of standards for packaging Java-implemented services as components. By following this standard, tools can be built to inspect and control various properties of the component.
• Remote Method Invocation (RMI), which allows Java classes on one machine to access the services of classes on another machine.
• Java Naming and Directory Interface (JNDI), which manages the unique identification of Java classes in a distributed environment.
An additional set of technologies supports the remaining component infrastructure services. These are necessary to allow Java to be used for the development of enterprise-scale distributed systems. These technologies are defined within the Enterprise JavaBeans standard. Enterprise JavaBeans (EJB) is a standard for server-side portability of Java applications [28]. It provides the definition of a minimum set of services that must be available on any server conforming to the specification. The services include: process
and thread dispatching, and scheduling; resource management; naming and directory services; network transport services; security services; and transaction management services.
The goal of the Enterprise JavaBeans specification is to define a standard model for a Java application server that supports complete portability. Any vendor can use the model to implement support for Enterprise JavaBeans components. Systems such as transaction monitors, CORBA run-time systems, COM run-time systems, database systems, Web server systems, or other server-based run-time systems can be adapted to support portable Enterprise JavaBeans components. Early in 2000 all leading platform middleware vendors (except Microsoft) will have delivered (or claimed) support for the EJB standard. It is considered the primary alternative to Microsoft's COM+ model.
As illustrated in Fig. 5, the EJB specification describes a number of key aspects of a system. In particular, a user develops an Enterprise JavaBean implementing the business logic required. The user also defines a Home interface for the bean, defining how the EJB objects (i.e., instances of the EJB) are created and destroyed, and a remote interface for clients to access the bean's behavior. An EJB executes within an EJB container. Many EJB containers can operate within a given EJB server. The EJB server provides many of the basic services such as naming, transaction management, and security.
The EJB specification was made available early in 1999. However, by the end of 1999 over 25 vendors already offered EJB-compliant containers. These implement the component infrastructure services that any application developer can rely on when designing a component-based application in Java.
FIG. 5. Elements of an EJB server.
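As a concrete illustration of these elements (this example is not drawn from the original text), the sketch below shows the remote and home interfaces a hypothetical Account bean might declare under the EJB 1.1 specification. The business methods are invented for the example, and the bean class itself (implementing javax.ejb.SessionBean or javax.ejb.EntityBean) and the deployment descriptor are omitted. The two interfaces are shown together for brevity; each would normally be a public interface in its own source file.

```java
import java.rmi.RemoteException;
import javax.ejb.CreateException;
import javax.ejb.EJBHome;
import javax.ejb.EJBObject;

// Remote interface: how clients access the bean's business behavior.
public interface Account extends EJBObject {
    void deposit(double amount) throws RemoteException;
    double getBalance() throws RemoteException;
}

// Home interface: how EJB objects (instances of the EJB) are created.
interface AccountHome extends EJBHome {
    Account create(String owner) throws CreateException, RemoteException;
}
```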
The strong support for EJB from both Sun and IBM provides significant impetus to the case for EJB as an important player in the future of component infrastructure services.
Toward the end of 1999, Sun Microsystems announced a new initiative aimed at bringing together a number of existing Java initiatives to provide a standard platform on which to build distributed applications in Java. This initiative, known as the Java 2 Enterprise Edition (J2EE) standard, builds on the Java 2 standard and adds most of the important programming interfaces in an application server [29]. For example, J2EE includes the EJB specification for server-side components, together with the application programming interfaces (APIs) necessary for building clients and connecting them to these server-side components (e.g., the Java Server Pages (JSP) API for dynamically generating web pages, and the Java Naming and Directory Interface (JNDI) for locating and accessing distributed Java objects).
To fulfill the promise of making the development of component-based systems in Java easier, J2EE augments this collection of standards and interfaces with [30]:
• A programming model for developing applications targeting the J2EE platform. This provides an outline approach to distributed application development, and highlights a number of key design heuristics for creating efficient, scalable solutions in Java.
• A compatibility test suite verifying that J2EE platform implementations conform to the J2EE platform as defined by the collection of standards and APIs. This encourages portability of solutions across different vendors' implementations of the J2EE platform.
• A reference implementation that offers an operational definition of the J2EE platform. This demonstrates the capabilities of the J2EE platform, and supplies a base implementation for rapid prototyping of applications.
By collecting these elements together under a single umbrella, Sun aims to simplify the development of distributed systems in Java.
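To show how a client might locate such a server-side component, the following sketch (again not from the original text) uses JNDI, one of the J2EE APIs mentioned above, to look up the hypothetical Account bean from the earlier sketch. The JNDI name is an assumption, since actual names are assigned when the bean is deployed to a particular application server.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.rmi.PortableRemoteObject;

public class AccountClientSketch {
    public static void main(String[] args) throws Exception {
        // JNDI provides the naming and directory services used to locate
        // distributed components in a J2EE environment.
        Context ctx = new InitialContext();

        // "java:comp/env/ejb/Account" is a hypothetical name bound by the
        // deployer; real names depend on the application server configuration.
        Object ref = ctx.lookup("java:comp/env/ejb/Account");

        // Narrow the reference to the home interface (required because the
        // underlying transport may be RMI-IIOP rather than plain RMI).
        AccountHome home =
            (AccountHome) PortableRemoteObject.narrow(ref, AccountHome.class);

        Account account = home.create("example-owner");
        account.deposit(100.0);
        System.out.println("Balance: " + account.getBalance());
    }
}
```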
6. How are Applications Assembled using CBD?
Having described the basic component concepts and the main aspects of component execution environments, we now consider how these concepts are embodied in supporting a component oriented approach to application assembly. In particular, we identify the key elements on which any CBD approach is based.
While much of the technology infrastructure for component oriented approaches is in place, of equal importance to its success are the methods for developing component oriented applications. Faced with the wealth of technology, software developers must be able to answer the key question of how to effectively design solutions targeting that technology. The software industry is only just beginning to offer guidance in this regard.
Fortunately, the latest wave of software development methods is beginning to rise to the challenge of supporting the development of distributed, Web-based systems involving reuse of legacy systems and packaged applications. A number of organizations have begun to publicize their methods and best practices for developing enterprise-scale systems from components. The three most prominent of these are:
• Rational's unified process. This is a broad process framework for software development covering the complete software life-cycle. In architecting the solution a component-based approach is encouraged, heavily influenced by the constraints of the Unified Modeling Language (UML) notation [31].
• The Select Perspective method. A general component design approach is supported, targeted at the Select Component Manager. General component design principles target UML as the component design notation [32].
• Computer Associates' Enterprise-CBD approach. Influenced by the Catalysis approach to component development, Computer Associates' approach encourages a strong separation of component specification from implementation using an extended form of UML [33]. This allows technology-neutral specifications to be developed and then refined into implementations in a number of different implementation technologies.
These approaches differ in many details. However, the abstract method that they encourage for component-based design is fundamentally the same. As illustrated in Fig. 6, three key elements provide the focus of these methods: a diverse set of components stored in a component library, an interface-focused design approach, and application assembly based on a component architecture.
6.1 Sources of Components
A component has been informally defined as an independently deliverable piece of functionality providing access to its services through interfaces. This definition is important as much for what it doesn't say as for what it does say: it does not place any requirements on how the components are implemented.
FIG. 6. Elements of a component-oriented software process.
Hence, valid components could include packaged applications, wrapped legacy code and data, previously developed components from an earlier project, and components developed specifically to meet current business needs. The diverse origin of components has led to a wealth of techniques for obtaining components from third parties, developing them in a variety of technologies, and extracting them from existing assets. In particular, many CBD approaches focus on wrapping techniques to allow access to purchased packages (e.g., enterprise resource planning (ERP) systems) and to mine existing legacy systems to extract useful functionality that can be wrapped as a component for future use. This approach leads to a view of CBD as the basis for enterprise application integration (EAI). Many vendors providing technologies for combining existing systems take a component view of the applications being assembled, and introduce specific integration technologies to tie the systems together. Often this is focused on the data flows in and out of the packages, and consists of transformers between the packages offering a broker-based approach.
The sources of the components may be diverse and widespread, yet assembling solutions from these components is possible due to:
• use of a component model as a standard to which all components can conform regardless of their heritage;
• a component management approach, supported by appropriate tools, for the storage, indexing, searching, and retrieval of components as required;
• a design approach that allows solutions to be architected from components by considering their abstract functionality, and deferring the peculiarities of their implementation until a later stage.
This final aspect, a design approach targeting appropriate component architectures, is a key element in the success of component approaches in general. In particular, it is the motivation for interface-based design approaches.
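As a small illustration of the wrapping techniques described above (this sketch is not part of the original text), the following Java fragment hides a hypothetical legacy routine behind a component-style interface so that the rest of an application can design against the interface alone. The legacy class, its method, and the returned value are invented for the example.

```java
// A hypothetical legacy routine, perhaps mined from an existing system.
class LegacyInventorySystem {
    int qryStk(String prodCd) {   // terse legacy naming conventions
        return 42;                // placeholder for a call into the old code
    }
}

// The component interface the rest of the application designs against.
public interface InventoryService {
    int unitsInStock(String productCode);
}

// The wrapper presents the legacy functionality as an ordinary component.
class LegacyInventoryWrapper implements InventoryService {
    private final LegacyInventorySystem legacy = new LegacyInventorySystem();

    public int unitsInStock(String productCode) {
        return legacy.qryStk(productCode);
    }
}
```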
6.2 Interface-Focused Design
Interfaces are the mechanism by which components describe what they do, and provide access to their services. The interface description captures everything on which the potential client of that component can rely since the implementation is completely hidden. As a consequence, the expressiveness and completeness with which the interfaces are described is a primary consideration in any component-based approach to software. Furthermore, during the early stages of analysis and design, the focus is on understanding the roles and responsibilities within the domain of interest as a basis for the interfaces of the implemented system. This gives rise to a new form of design method: interface-based design. Focusing on interfaces as the key design abstraction leads to much more flexible designs [34]. Designers are encouraged to consider system behavior more abstractly, define independent suppliers of services, describe collaborations among services to enact scenarios, and reuse common design patterns in addressing familiar situations. This results in more natural designs for systems with a greater independence from implementation choices. Such thinking is an essential part of new application provisioning approaches in the Internet age. To target rapidly evolving distributed computing technologies requires an approach to design that can evolve as the technology changes. Familiar, rigid approaches to design are unsuitable. Interface-based design approaches represent a significant step forward in reducing the cost of maintenance of future systems.
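To illustrate the point (this example is not taken from the chapter), the following Java sketch shows a small interface-based design: the client is written against an interface only, so alternative suppliers of the service can be substituted, or collaborations recomposed, without touching client code. All names and pricing rules are hypothetical.

```java
// The interface captures everything a client may rely on; implementations stay hidden.
public interface PriceCalculator {
    double priceFor(String productCode, int quantity);
}

// One supplier of the service ...
class StandardPriceCalculator implements PriceCalculator {
    public double priceFor(String productCode, int quantity) {
        return 9.99 * quantity;                 // placeholder pricing rule
    }
}

// ... and an alternative supplier that can replace it without changing clients.
class DiscountPriceCalculator implements PriceCalculator {
    public double priceFor(String productCode, int quantity) {
        double base = 9.99 * quantity;
        return quantity >= 10 ? base * 0.9 : base;
    }
}

// Clients depend only on the interface, not on any particular implementation.
class OrderService {
    private final PriceCalculator calculator;

    OrderService(PriceCalculator calculator) {
        this.calculator = calculator;
    }

    double total(String productCode, int quantity) {
        return calculator.priceFor(productCode, quantity);
    }
}
```

In this sketch, swapping one calculator for the other is a configuration decision rather than a design change, which is precisely the independence from implementation choices that interface-based design aims to provide.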
6.3 Applications and Component Architecture

Interfaces and interface-based design provide the techniques necessary for a component assembly view of software-intensive solutions. Component assembly concerns how an application is designed and built from components. In a component-based world, an application consists of a set of components working together to meet the broader business needs. How this is achieved is often referred to as the component architecture: the
components and their interactions. These interactions result in a set of dependencies among components that form an essential part of a software solution. Describing, analyzing, and visualizing these dependencies become critical tasks in the development of a component-based application. Typically, a candidate component architecture is proposed relatively early in a project. This consists of a number of known pieces of existing systems to be reused, familiar patterns of system interactions, and constraints based on imposed project and organizational standards. Consequently, at least two levels of component architecture become important. The first is a logical component architecture. This describes the abstract design of the system in terms of the major packages of functionality it will offer, descriptions of each collection of services in terms of its interfaces, and how those packages interact to meet common user scenarios. This is often called the component specification architecture. It represents a blueprint of the design of the system. Analysis of this architecture is essential to ensure that the system offers appropriate functionality, and can easily be modified as the functional requirements for the system evolve. The second is a physical component architecture. This describes the physical design of the system in terms of selected technical infrastructure products, distributed hardware and its topology, and the network and communication protocols that tie them together. This forms part of the component implementation architecture. This architecture is used to understand many of the system's nonfunctional attributes such as performance, throughput, and availability of services.
7. What is the Current Practice in CBD Today?
A number of organizations are practicing component-based approaches today using a variety of technologies. While many component approaches are limited to client desktop applications (via ActiveX controls, visual Java Beans, and so on), there are others that are beginning to address larger scale applications with significant business functionality. There are a number of very valuable lessons being learned by these pioneers of component approaches. Originally, many of these lessons were related to the vagaries and incompatibilities of specific component technologies. More recently, a number of published accounts discuss a much broader range of critical success factors for component approaches in areas such as organizational readiness, costs of component development, and management of deployed component-based applications. Much can be learned from these existing users of components, and by examining the strategies and directions of key organizations in the software
industry. To establish the current practice in CBD today we look at three key communities: component vendors creating a marketplace of third-party components, component portal providers offering information and discussion forums on component-related topics, and component infrastructure providers establishing the strategic direction for many of the key component technologies.

7.1 Component Software Vendors
A number of companies now offer brokering services for components. Through their on-line presence they allow component suppliers to make their components available to a broader market, and provide a focal point for those organizations looking to acquire those components. Component software brokers offer services for the buying and selling of components, additional services for cataloguing and certifying components, and some form of auction or matching services to connect potential suppliers of components with those looking for solutions. To support and promote their brokering activities they additionally provide a source of information, guidance, and advice on components and component technologies. There are four major component software brokers:
• ComponentSource (www.componentsource.com). This is the most widely used of the component software brokers, offering the largest selection of components. It has an extensive catalog of components for many technologies. The majority of the components are focused on user interface controls for Microsoft desktops (ActiveX, COM). There are some business-focused components, with some information to allow a detailed evaluation. Some attempt has been made to provide consistency across the component descriptions via white papers offering guidance for component providers.
• FlashLine (www.flashline.com). Many components are provided for both Java (some EJB) and COM. However, perhaps most interesting is the free component manager that is offered. It is a front-end navigation and search engine for components to be shared among teams. This can be used standalone, or as a plug-in to some of the popular IDEs. Other services offered include a component auction site, and a QA and certification lab where organizations can pay for a component to be certified against a set of basic criteria.
• Objectools (www.objectools.com). This offers a relatively small number of components, with rather terse descriptions of each component's attributes. Having started in early 2000, this is a growing company catching up with the other two component software brokers in terms of quantity of components, and services being offered.
• IntellectMarket (www.intellectmarket.com). This is a fledgling component software exchange launched towards the end of 2000. It is supported by Tim O'Reilly, so may have significance in its association with the very successful O'Reilly education services, and the larger open-source community promoted by Tim O'Reilly.
The component software brokers appear to be interesting first generation solutions to the component software market. Perhaps this is inevitable given the relative newness of the component technologies they are intended to support. However, they have a number of limitations. To have greater impact they are now beginning to turn their attention toward four key areas:
1. Increasing the depth of the information they provide on the components they offer. At present, this often consists of a high-level functional description provided by the vendor, with a summary of technical requirements. Real application of the components requires much more knowledge about the nonfunctional aspects (quality, robustness, performance, etc.).
2. Offering consistent information about the components. As an information service on components these companies can set standards for the way in which a component is specified and delivered. They offer the potential to have a major impact on component development and should drive the component description into more useful forms.
3. Becoming integral parts of the component technology vendors. Both infrastructure and tool vendors must recognize the importance of the services they offer, and form a partnership that provides benefit to component consumers.
4. Broadening the quality and range of the services offered. There is a great deal of variability in the quality of the on-line material offered. The balance between quantity and quality of information is always a difficult one, but it must be constantly in the forefront of any decision made.
7.2 Component Software Portals
Another aspect of the component software market is the availability of information services that provide suppliers and consumers of components with the information they need to be successful. This begins with in-depth
discussions of the technology, but concerns many other aspects required to make the component marketplace a success: economic, cultural, educational, political, and so on. Many information services are available for software organizations, addressing many levels of detail required across the organization. The software component market is certainly an integral part of the information provided by industry analyst organizations such as IDC, Forrester Research, Gartner Group, Hurwitz Group, Patricia Seybold Group, Giga Information Group, and many others. While many of these provide valuable coverage of the component software market, we are now seeing services dedicated to components, component technologies, and component-based development approaches. There are many interesting component software portals. Some of the more influential include:
• CBDi Forum (www.cbdiforum.com). This independent forum for information on CBD offers a range of free and subscription-based services via the Internet. It has the support of a number of major software tool vendors. In particular, it seems to be creating a close relationship with Microsoft, and offers a number of articles on the component market in relation to Microsoft's DNA and .NET technologies. The CBDi Forum is a comprehensive information forum, and also runs conferences and workshops in addition to providing the on-line information. Much of the information is freely available, while more in-depth information requires a paid subscription.
• Component Software (www.ComponentSoftware.net). This is a more general news and information portal spawned from the CBDi Forum. It is intended to be a general component software information launching pad for product news and announcements.
• The ServerSide (www.theserverside.com). This site focuses on the J2EE technologies. It is intended to provide a community service to EJB/J2EE developers offering discussion forums, documents, component directories, and other relevant information. It is maintained by Ed Roman and his company, The Middleware Company. 3 It is a comprehensive, interactive forum for EJB developers.
• CBD HQ (www.cbd-hq.com). This site provides a broad range of white papers and articles on components and component-based development. The maintainers of this site gather articles and commission papers on business-focused software components. Much of it is based around the component consulting practices offered by Castek, but the papers are much broader in scope. Rather than a news service, this provides more in-depth insight into component technology and practices.
• The Open Component Foundation (www.open-components.com). This is a general information portal for component software run by the Centre for Innovation and Technology.

3 Ed Roman has a very popular book on EJBs called Mastering EJB, published by John Wiley.

As can be seen, a broad range of component software portals already exists, offering a great deal of information and advice. Much of this information is interesting and useful, offering a broad mixture of product release information, reprints of vendor white papers, and expert commentary. To further improve the information they provide, these component software portals are attempting to create greater value and impact by moving in three key directions:
1. Focusing their content on a well-scoped audience. In many cases it is better to do one thing really well than do many things in a mediocre way. This appears to be particularly true for component software portals. As a result, these portals are continually refining their focus to meet the needs of a well-defined audience.
2. Providing greater interactivity. Many of the portal sites currently provide little opportunity for true community activities. They are now in the process of adding the necessary infrastructure for sharing ideas and information.
3. Increasing the practical utility of the information. In this stage of the maturity of the component software market the most pressing need is for simple, practical guidance and advice. These sites are looking to increase their value by categorizing the information into major themes for different audiences so that detailed technical help is available for developers, and strategic analyses are provided for managers and decision makers.
7.3 Component Infrastructure Vendors
To understand the current state of CBD it is important to understand and assess the approaches to component software being taken by the major technology infrastructure vendors. The four major players in this regard are Microsoft, IBM, Sun, and BEA. 4

4 A lot more information on these vendors' initiatives can be found at their respective Web sites.
7.3.1 Microsoft
The dominance of Microsoft on the desktop has much to do with two things:
• The way in which Microsoft has gradually improved the integration of the various tools and services to allow data to be moved across tools, application services to be started from within a variety of tools, and creation of a common desktop visual appearance across many tools and services.
• The ease with which third-party tools and service providers can make use of Microsoft tools and services within third-party products, or use third-party products as plug-ins to Microsoft products.
The underpinning for all of this is the Component Object Model (COM). As Microsoft's strategy has evolved (from OLE, COM, COM+, and now into DNA and .NET) this core of component concepts and services has been at the heart of their approach.
Moving forward, Microsoft's current strategy with respect to components and component technologies appears to focus on a number of key elements. While the COM specification remains central to their approach, Microsoft is now creating a broader environment in which COM can flourish. This includes:
• XML, SOAP, and UDDI. Microsoft's response to the threat of Java and the EJB technology is to de-emphasize the languages used for creating component implementations, and instead focus on how components are assembled and connected using the Internet as the distribution medium. Their approach is a complete service-based initiative using the Extensible Markup Language (XML), the Simple Object Access Protocol (SOAP), and the Universal Description, Discovery and Integration protocol (UDDI) 5 as the connection technologies. This is an attempt to defuse the Java concerns and say that the components can be implemented in any language or technology. What matters is how the pieces communicate. Microsoft believes that XML, SOAP, and UDDI will provide the universal "glue" to make this happen. This philosophy forms the basis of Microsoft's .NET strategy. The Visual Studio development environment has changed from a Visual Basic environment to a COM environment, and now will be oriented toward developing component services integrated via XML and SOAP, for exchange via UDDI.
• Redefining MTS as an application server. The Microsoft Transaction Server (MTS) provides an essential element of large-scale distributed systems, forming the core set of capabilities of the middle tiers of an n-tier solution. As the notion of an application server has matured in the market, Microsoft has started to describe MTS as "the application server for COM-based distributed systems for the enterprise."
• Expanding the MSDN community. The Microsoft Developer Network (MSDN) is a large group of developers creating solutions for Microsoft infrastructure. Recognizing the important role they play, Microsoft is increasing its support for these developers and expanding MSDN towards a more powerful information and software distribution channel. This will involve greater use of Web technology for more dynamic, interactive content.
• Creating a catalog of reusable COM components. The success of a component approach requires a market of available components. Microsoft has set up its component catalog to use the MSDN as a channel for providers and resellers of COM components (http://msdn.microsoft.com/componentresources).

5 XML provides a simple tag-based language for encoding many kinds of information; SOAP allows distributed component interactions to be encoded within XML. UDDI provides registry and service exchange facilities for accessing SOAP-based services across the Internet.

7.3.2 IBM
As a large, diverse organization, there are many different threads to IBM's approach to the component software market. The past few years have seen IBM make a number of changes in its approach and philosophy in this area. These changes reflect the fast pace of change in this market, the challenge posed by Microsoft's desktop dominance, and the difficulties IBM has had in changing its approach from a mainframe/server organization to an e-commerce vendor. In the past few months IBM has presented a much more Web-centric image, and is amalgamating its middle-tier e-commerce strategy under the WebSphere branding. This encompasses new software products (such as the application server WebSphere Enterprise Edition) and a repositioning of existing middleware products such as MQSeries and MQIntegrator.
From a strategic standpoint, IBM has announced strong support for the Java technologies as the basis for competing against Microsoft. IBM believes that organizations gain more flexibility and openness through the Java language, virtual machine, and interface standards. As a result, IBM's component software approach consists of:
• A component assembly environment for Java. VisualAge for Java (VA) has become a very popular tool for developing Java applications, JavaBeans, and now Enterprise JavaBeans (EJBs). It is a typical
interactive development environment (IDE) for Java, but also offers some elements of visual assembly of beans, and automatic deployment of EJBs to IBM's application servers.
• The WebSphere application server. IBM has developed a number of distributed object technologies over the past decade (e.g., SOM/DSOM). None of these has been particularly successful. These have now evolved into an application server product supporting the EJB standard as WebSphere. Although rather large and complex, it appears that WebSphere is gaining significant momentum in the market.
• A component framework. The IBM San Francisco Project began some years ago as an ambitious attempt to create a complex reusable set of objects as the basis for families of solutions in a wide number of domains. As this work was gathering pace, the Java explosion gained momentum and a different group in IBM was helping to create the EJB specification. The result is that the San Francisco Project has reoriented its work to be a set of reusable frameworks for creating EJB-based applications. The outcome of this work appears to be a set of reusable components for deployment to WebSphere, plus some guidance on assembling those components into enterprise-scale solutions focused on a number of important business domains (e.g., banking, insurance, retailing).
7.3.3 Sun
Although primarily a hardware company, Sun has been attempting to improve the software aspects of its business for some years. The latest attempts are based on Sun's prominent identification with the Java technologies. To capitalize on this, Sun is trying to build a substantial software-focused business around Java-based solutions for every kind of platform, from Internet appliances and handheld devices to high-end enterprise solutions. A number of acquisitions have taken place to bring in the required technology. Sun's current strategy involves a number of elements:
• An EJB application server. A number of acquisitions have resulted in application servers, including the iPlanet, NetBeans, and SynerJ servers. The result has been to consolidate on the iPlanet server. This is a J2EE-compliant container for EJBs.
• A set of development tools for Java. The development tools from NetBeans and Forte have been renamed to provide three different kinds of tools: Forte for Java Developer Edition is the NetBeans developer product, Forte for Java Internet Edition is the NetBeans Developer Pro product, and Forte for Java Enterprise is the Forte SynerJ product.
Around these products Sun is creating a community of enterprise Java developers through its work on the Java standards. This is based on its standards activities such as the Java 2 Enterprise Edition (J2EE). J2EE provides an architecture for enterprise Java solutions based on Java Server Pages (JSPs), Java Servlets, and EJB. It also includes a reference EJB container, and guidelines for building distributed systems in Java. In addition, Sun manages a number of community activities, most notably the JavaOne conferences.
7.3.4 BEA
The past year or two has seen BEA rise quickly to become a key player in the component software market. Beginning as an infrastructure and middleware company, BEA has successfully extended its market reach to become the major provider of EJB application server solutions. This approach is based on the following elements:
• A range of application servers. BEA offers three different application servers: WebLogic Express for simpler database-to-webpage systems, WebLogic Server for Java-based n-tiered solutions, and WebLogic Enterprise for high-transaction distributed solutions (based on the Tuxedo middleware).
• A set of development and deployment tools. Through various acquisitions, BEA has major ownership of WebGain, offering WebGain Studio as the main environment for developing WebLogic applications. It also provides Visual Café as a Java-based IDE, and the TopLink tool for persistence management (including object-to-relational mappings).
• A set of business components. The acquisition of The Theory Center provided BEA with a useful set of about 80 business components aimed at building e-commerce applications for deployment to WebLogic Server. These components are marketed as the BEA WebLogic Commerce Server.
Additionally, BEA offers some application-focused solutions built on the WebLogic platform. This includes an e-commerce server, and a personalization server.
8. Summary
Component-based development of software is an important development approach for software solutions which must be rapidly assembled, take
advantage of the latest Web-based technologies, and be amenable to change as both the technology and users' needs evolve. One of the key challenges facing software engineers is to make CBD an efficient and effective practice that does not succumb to the shortcomings of previous reuse-based efforts of the 1970s and 1980s.
Fortunately, the past few years have seen a number of major advances in both our understanding of the issues to be addressed to make reuse successful and the technology support available to realize those advances. These are embodied in the rapidly growing field of component-based software engineering. This approach encourages a new way of application development that focuses on the following:
• separation of component specification from component implementation to enable technology-independent application design;
• use of more rigorous descriptions of component behaviors via methods that encourage interface-level design;
• flexible component technologies leveraging existing tools and standards.
This chapter has identified the key concepts underlying a component-based approach to software engineering. These have been explored in the context of improving our understanding of CBD, assessing the current state of CBD technology, and improving an organization's software practices to enhance the effectiveness and viability of large-scale software development through the reuse of components. This will help to lead organizations toward an interface-based approach to application development and design that encourages the creation of systems that are more easily distributed, repartitioned, and reused. These attributes are essential to improve the future effectiveness of organizations in their development and use of large-scale software systems in the Internet age.

REFERENCES
[1] Brown, A. W. (2000). Large-scale Component-Based Development. Prentice Hall.
[2] Cheesman, J. and Daniels, J. (2000). UML Components. Addison-Wesley.
[3] Brown, A. W. and Wallnau, K. C. (1998). "The current state of CBSE". IEEE Software.
[4] Allen, P. (2000). Realizing e-business with Components. Addison-Wesley.
[5] Garlan, D., Allen, R. and Ockerbloom, J. (1995). "Architectural mismatch: why it's hard to build systems out of existing parts". Proceedings of the International Conference on Software Engineering.
[6] Parnas, D. (1972). "On the criteria for decomposing systems into modules". CACM, 15, 12, 1053-1058.
[7] Parnas, D. (1976). "On the design and development of program families". IEEE Transactions on Software Engineering, 7, 1, 1-8.
[8] Prieto-Diaz, R. and Freeman, P. (1987). "Classifying software for reusability". IEEE Software.
[9] Cox, B. (1986). Object Oriented Programming: An Evolutionary Approach. Addison-Wesley.
[10] Brown, A. W. and Wallnau, K. C. (1996). "Engineering of component-based systems". Proceedings of the 2nd IEEE International Conference on Complex Computer Systems, October.
[11] Kara, D. (1996). "Components defined". Application Development Trends.
[12] Allen, P. (1999). "Using components to improve your business". Component Strategies.
[13] Daniels, J. (1999). Objects and Components. An internal whitepaper of Computer Associates.
[14] Szyperski, C. (1998). Component Software: Beyond Object-Oriented Programming. Addison-Wesley.
[15] Meyer, B. (1999). Object-Oriented Software Construction. Second Edition. Prentice Hall.
[16] Sessions, R. (1998). "Component-oriented middleware". Component Strategies.
[17] OMG (1998). "The object management architecture (OMA) guide". Available from http://www.omg.org.
[18] Siegel, J. (1998). CORBA Fundamentals and Programming. Wiley.
[19] OMG (1999). "CORBA services". Available from http://www.omg.org.
[20] Siegel, J. (2000). "What's coming in CORBA 3?". Java Developers Journal.
[21] Sessions, R. (1997). COM and DCOM: Microsoft's Vision for Distributed Objects. Wiley.
[22] Box, D. (1997). Essential COM. Addison-Wesley.
[23] Platt, D. S. (1999). Understanding COM+. Microsoft Press.
[24] Flanagan, D. (1999). Java in a Nutshell. O'Reilly Press.
[25] Horstmann, C. (1999). Core Java 2, Volume 1: Fundamentals. Prentice Hall.
[26] Asbury, S. and Weiner, S. R. (1999). Developing Java Enterprise Applications. Wiley.
[27] Austin, C. and Powlan, M. (1999). "Writing advanced applications for the Java platform". Sun Microsystems. Available at http://java.sun.com, December.
[28] Sun Microsystems (1999). "Enterprise JavaBeans Standard", Version 1.1. Available at http://java.sun.com.
[29] Roman, E. (1999). Mastering Enterprise JavaBeans and the Java 2 Platform, Enterprise Edition. Wiley.
[30] Flurry, G. (1999). "The Java 2 Enterprise Edition". Java Developers Journal.
[31] Booch, G. et al. (1999). The Unified Modeling Language User Guide. Addison-Wesley.
[32] Allen, P. and Frost, S. (1998). Component-Based Development for the Enterprise: Applying the Select Perspective. Cambridge University Press.
[33] D'Souza, D. and Wills, A. C. (1998). Objects, Components, and Frameworks with UML: The Catalysis Approach. Addison-Wesley.
[34] Coad, P. and Mayfield, M. (1999). Java Design: Building Better Apps and Applets. Second Edition. Yourdon Computing Press.
Working with UML: A Software Design Process Based on
Inspections for the Unified Modeling Language

GUILHERME H. TRAVASSOS
COPPE/PESC, Federal University of Rio de Janeiro
P.O. Box 68511, Rio de Janeiro, RJ 21945-970, Brazil
55 21 562 8712
[email protected]
FORREST SHULL
Fraunhofer Center - Maryland, University of Maryland
4321 Hartwick Road, Suite 500, College Park, MD 20742 USA
301-403-8970
[email protected]
JEFFREY CARVER
Experimental Software Engineering Group, Department of Computer Science, University of Maryland
College Park, MD 20742 USA
301-405-2721
[email protected]
Abstract
This text describes a simple and effective object oriented software design process template having UML as the modeling language and extensively using inspections to support the construction and maintenance of software products. This software design process uses a sequential organization, based on the waterfall approach, for two reasons: to simplify the explanation of design activities in the context of this text and to make available a standard process that can be continuously improved by developers. The component phases of this design process are described in a way that gives developers the freedom to reorganize the overall process based on their own needs and environment. In addition, a survey is provided of specific literature regarding important aspects of UML and the object oriented paradigm.
1. Introduction .................................... 36
2. The Unified Modeling Language (UML) .................................... 40
   2.1 Different Perspectives to Improve Design Modeling .................................... 41
3. Software Process Activities .................................... 43
   3.1 Development Activities .................................... 44
   3.2 Quality Assurance and Verification and Validation Activities .................................... 47
4. The Example .................................... 64
   4.1 Requirements Activities .................................... 67
   4.2 High-Level Design Activities .................................... 70
   4.3 Low-Level Design Activities .................................... 84
5. Maintenance or Evolution .................................... 86
   5.1 Process .................................... 87
   5.2 Understanding .................................... 88
   5.3 Example .................................... 91
6. The Road Ahead .................................... 94
References .................................... 95
1. Introduction
This chapter describes a simple and effective object oriented (OO) software design process template having UML as the modeling language and extensively using inspections to support the construction and maintenance of software products. In addition, a survey of specific literature regarding UML and the object oriented paradigm is presented. This software design process uses a sequential organization based on the waterfall approach for two reasons: to help with the explanation of design activities in the context of this chapter and to make available a standard process that can be continuously improved by developers. This process does not result in a loss of generality in this discussion because after developers understand the process they have freedom to reorganize all of the component phases. The activities that were considered for such a software process template are requirements specification, high- and low-level design, coding, and testing. However, we have inserted inspection activities at various points in the life-cycle to help address quality issues, as explained in Section 3.2.2.
We have concentrated on describing only activities that specifically use the OO paradigm and make use of UML. So, elicitation of requirements [1], an important task in the context of software development, is not considered here in detail. Different techniques and methods can be used for requirements elicitation and description [2]. Requirements descriptions are paradigm-independent and must be chosen based on customer needs [3]. System scenario descriptions (use cases) are part of the requirements specification and are produced after the problem is described. We consider this as part of the OO design, because the way that scenarios are described can impact future design solutions. Use cases are a good mechanism to help identify basic concepts and give an indication of the functionality of the system. The functional requirements and problem domain concepts described by the requirements and use cases arrange the information used to produce the high-level design, a set of UML artifacts. Then, these design artifacts are evolved to include nonfunctional requirements and the features that deal with the computational side of the problem, or the solution domain. These evolved artifacts are the low-level design. The result of this process will be a set of models ready for coding and testing.
Generally, a software development process defines the way real-world concepts should be represented, interpreted, and transformed into a working software system. Typically, the software development process is guided by a strategy or paradigm. There are some paradigms, such as structured (or functional) and data oriented [4], that are well established. Although these paradigms can be used to specify and design systems for different types of problems, their use impacts the quality and productivity of software development. In these paradigms, developers do not use a consistent notation throughout the software life-cycle. This reduces their freedom to reorganize these activities to fit the software life-cycle models of their organization.
The object oriented (OO) paradigm has emerged to address such issues. Although no "perfect" development paradigm exists (since effective software development also addresses issues beyond tailoring the software process), the OO paradigm has demonstrated its capability for supporting software process improvements in situations where other models were not suitable [5]. The OO paradigm's use of the logical constructs [6] of class, object, inheritance, polymorphism, aggregation, composition, and message to describe concepts and build artifacts across the software life-cycle improves the software process, because it:
1. allows developers to use a consistent notation with a common set of constructs across phases, increasing their communication throughout different development phases;
2. deals with well-defined components, protected by encapsulation (data and functionality) and displaying their corresponding interfaces, which allows the organization of the development activities using different software life-cycle models [7]; and
3. can be used as a criterion to identify possible parallel development tasks, speeding up the full development process.
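As a minimal illustration of these constructs (the example is not part of the original text), the following Java sketch shows encapsulation, inheritance, and polymorphic message dispatch using hypothetical account classes.

```java
// Encapsulation: state is hidden behind methods (messages).
abstract class BankAccount {
    private double balance;                      // data is not exposed directly

    void deposit(double amount) { balance += amount; }
    double balance() { return balance; }

    abstract double monthlyFee();                // behavior specialized by subclasses
}

// Inheritance: subclasses share structure and vary behavior.
class CheckingAccount extends BankAccount {
    double monthlyFee() { return 5.0; }
}

class SavingsAccount extends BankAccount {
    double monthlyFee() { return 0.0; }
}

public class StatementSketch {
    public static void main(String[] args) {
        BankAccount[] accounts = { new CheckingAccount(), new SavingsAccount() };
        for (BankAccount account : accounts) {
            account.deposit(100.0);
            // Polymorphism: the same message yields type-specific behavior.
            System.out.println(account.balance() - account.monthlyFee());
        }
    }
}
```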
OO design is a set of activities (scenario description, high- and low-level design) concerned with the representation of real-world concepts (as described by the requirements descriptions) as a collection of discrete objects that incorporate both data structure and behavior. These concepts must be somehow extracted from the requirements 1 to guide the construction of the artifacts that represent system scenarios, and high- and low-level design [8].

1 Requirements can be specified using a number of notations, from natural language to formal specifications. In the context of this work we consider that requirements are organized and described using natural language. This does not result in any loss of generality of our discussion, since OO design is concerned with translating the meaning of system requirements regardless of their notation.

The technical literature describes several OO notations. A notation, when associated with a software development process, is called a methodology. For instance, methodologies such as OMT [9], BOOCH [10], OOSE [11], and FUSION [12] have been suggested. All of these methodologies are based on a specific software development process and use their own syntax and notation in trying to define a broad-spectrum software development methodology. However, it is not easy to define a software development methodology which is general enough to fit all software development contexts and domains. Each of these methodologies is suitable for specific systems (or problem domains), but not for all systems. Because different methodologies use different design representations it is difficult to compare information among projects that used different methodologies, even if those projects are from the same problem domain. These kinds of difficulties can be avoided by using a homogeneous notation (modeling language) and a standard software design process.
One of the perceived benefits of the object oriented paradigm is that developers can use it for different software processes and life-cycles. Regardless of the paradigm and the software life-cycle used to plan the software process, a common set of activities is present, namely requirements specification, design (including high- and low-level issues), coding, and testing. Using this basic set of activities a template for OO design can be created, showing how the activities can be represented using a homogeneous notation. Additionally, Kitchenham et al. [13] argued that software maintenance processes are similar to software development processes, which
makes this template, with slight modification, also suitable for software evolution and maintenance. Developers can tailor this software process to their specific development context while continuing to use the same modeling language.
These modeling languages represent an interesting way of using OO constructs to describe problem domain concepts. By providing graphical representation for these constructs, these languages simplify the representation and allow developers to highlight important information about the problem domain. Moreover, they provide a well-specified and homogeneous set of constructs to capture the different object perspectives (static and dynamic), making the representation of the information consistent and reusable across different projects. The Unified Modeling Language (UML) is an example of such a language. Several companies and organizations around the world have used it and it has been adopted as an Object Management Group (OMG) standard [14].
Developers are using UML for more than just representing design diagrams. Several tasks concerned with software architecture modeling [15], pattern descriptions [16], design formalization [17], measurement support [18], and OO software inspection [19] have been accomplished using UML artifacts or extended UML artifacts. UML can also be used to represent high-level abstraction concepts, such as software process models [20], metamodels [21], and domain analysis models [22], or physical concepts, such as resources [23] and correlated engineering fields [24].
UML does not have a software process associated with it. Several software life-cycles and processes using UML exist for different contexts and domain applications. Some software process models can be used as frameworks to organize and configure design activities. A few examples are Catalysis [25], RUP [26], the Unified Software Process [27], and Open [28]. Although strong, most of these models impose a high cost, as they are based on automated tools and require some training and detailed planning. Moreover, their weakness in providing techniques or guidance for defect detection in artifacts, and the difficulty of adapting them to specific classes of problems, such as e-commerce or real-time systems, make the decision of whether to adopt them a complicated and risky task.
This chapter has six sections, including this introduction. Section 2 gives a short description of UML and how the artifacts are classified. Section 3 describes the design activities and a process for applying them. In Section 4 a small example is described and used to illustrate the use of UML, including requirements, high-level, and some of the low-level design issues. In Section 5, maintenance is considered along with proposals on how the basic development process discussed in Section 3 can be modified to support software evolution and maintenance. Section 6 concludes this text.
2. The Unified Modeling Language (UML)
In recent years, the use of the OO paradigm to support systems development and maintenance has grown. Unlike other paradigms, such as structured or data-oriented development, where developers are able to use several different methodologies and notations [7], the unification of different techniques provided a standard way to represent the software artifacts. One of these standards, representing a notation and modeling language, used by several companies and developers around the world, is UML, the Unified Modeling Language. As stated by Booch [29], "the UML has found widespread use: it has been applied successfully to build systems for tasks as diverse as e-commerce, command and control, computer games, medical electronics, banking, insurance, telephony, robotics, and avionics."

In 1995, the first UML proposal was produced by combining work by Grady Booch [10] and James Rumbaugh [9] and released as version 0.8. Subsequently, Ivar Jacobson's contributions [11] were integrated into releases 0.9 and 0.91 in 1996. Since then, developers and companies around the world have been working together on its improvement. By integrating different techniques and mechanisms proven effective on industrial projects, the draft evolved through multiple versions. These efforts resulted in the development of UML 1.1, which was added to the list of technologies adopted by the Object Management Group (OMG) in November of 1997. OMG has assumed the responsibility of organizing the continued evolution of the standard [30]. In this text the OMG UML standard version 1.3, released in 1999, has been used [14].

Four objectives guided the development of UML [14] and are reflected in version 1.3:

1. Enable the modeling of systems (and not just software) using object oriented concepts. The UML artifacts explore basic software development concepts, such as abstraction, information hiding, and hierarchy, as well as object oriented paradigm constructs, such as class, inheritance, composition, and polymorphism [6]. Additionally, they provide techniques to support development, organization, and packaging, and mechanisms to represent the software architecture and deployable components. These features cover different aspects of software development and enable UML to represent not only systems but also software models.

2. Establish an explicit coupling to conceptual as well as executable artifacts. By describing the problem using different perspectives (e.g., static and dynamic views), UML allows developers to capture all the relationships and information structures as well as all the behaviors and object
state modifications. Also, specific object constraints and features can be formalized and explicitly connected to the concepts, making the models reliable and able to be verified and validated.

3. Address the issues of scale inherent in complex, mission-critical systems. Because UML does not have standard techniques and processes and can be used in different approaches (top-down and bottom-up), engineers are able to deal with different levels of abstraction and formalism, which is required when modeling and building software for different application domains. Moreover, the syntax of the modeling language makes available a homogeneous set of constructs, supported by a well-defined set of techniques, that can be organized throughout the software development process to break down and reduce problem representation complexity.

4. Create a modeling language usable by both humans and machines. Although the intention is not to provide a standard framework to implement and integrate CASE tools, UML guarantees the same semantics and understanding for its constructs. This normalization of the representation plays an important role when developers are recognizing and describing problem domains, allowing the same information to be interpreted by different developers. It also stimulates different vendors to provide CASE tools for supporting the language, by defining a consistent and standard set of models to specify and build integration mechanisms for sharing information among different tools.

This section will not discuss all the definitions of UML and its possible uses. Instead, it gives an overview of the different artifacts that can be created and the relationships among them. In Section 4, concepts and artifacts will be used to build an example application. The reader who desires a more complete view of UML may find some of the following works useful: an introductory text about UML can be found in Fowler and Scott [31], which describes basic concepts and gives small examples of how the artifacts can be used to represent different project situations. Rumbaugh [32] prepared a UML reference manual and Booch [10b] developed a tutorial describing how developers can use UML while performing different design activities. However, if more information is still necessary, the complete set of definitions and formalized concepts can be found in [14]. Object oriented concepts and definitions can be found in [6] and [10].
2.1 Different Perspectives to Improve Design Modeling
Although UML artifacts can be classified in different ways (e.g., the UML draft 1.3 describes the different types of documents, including use case
diagrams, class diagrams, behavior diagrams, and implementation diagrams), this text classifies UML software artifacts into three general categories: static, dynamic, and descriptive. There is a strong connection among these three categories. Each of them captures or represents a different system perspective. In this way they are complementary: each describes a specific aspect of a system, and together they describe all relevant points of view. The classification is summarized in Table I.

TABLE I
UML ARTIFACT CATEGORIES

Static: Class, package, component, and deployment diagrams
Dynamic: Use cases, interaction (sequence, collaboration), statechart, and activity diagrams
Descriptive: Class descriptions and OCL

Static artifacts capture the information that is constant for one version of the software or system. This information is always true regardless of the functionality and state of the system. It includes classes, their structural organization (attributes and behaviors) and interface (visibility of behaviors), relationships with other classes (inheritance, aggregations, generalizations, and acquaintances), and interdependent packages. Moreover, the physical structure and organization of the software or system, including the components and processors that developers identified during implementation, can be classified as static. Artifacts that represent the static information are the class, package, component, and deployment diagrams. By representing all the relationships among classes and pieces of the system, developers are able to visualize important features of the model, such as interdependence, that will cause the coupling and structural complexity of the model to increase, impacting the cost and quality of the whole design.

Dynamic artifacts describe information about communication (message exchanging) among objects and how these objects can be put together to accomplish some service or functionality. Important events in the problem can lead objects to change their states and need to be represented. These state changes can determine the correct behavior, services, and even functionalities that must be performed based on the different scenarios that have been modeled. The dynamic artifacts are use cases, interaction (sequence and collaboration) diagrams, statecharts, and activity diagrams. These artifacts enhance
the problem description represented in the static view. While static information represents "who" will take part in the solution, dynamic information describes "when". However, the symbiosis between static and dynamic views is still not enough to describe all the details developers need.

Descriptive artifacts describe some of the concepts that cannot be represented by static and dynamic artifacts or that need a better formalization. They are classified as descriptive because, regardless of the formalism, they basically use a textual format to describe the concepts. The class descriptions, which are similar to the data dictionaries of conventional methods, are an example of this type of artifact. They contain complementary information about artifacts and the formalization of some of the objects' features (e.g., constraints, conditions, assertions) using the Object Constraint Language (OCL) [33].

Although all these artifacts can be produced during design, UML does not impose any order or precedence among them. Rather, developers have the freedom to decide on the configuration that best suits them. Descriptive artifacts are normally used with the other diagrams to support the description of information represented in the different perspectives. The following section shows a design process that can be used to build UML artifacts. Readers can find further information and examples for all of these artifacts in Section 4.
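As a minimal, hypothetical sketch of the descriptive perspective, consider how a constraint recorded in a class description (and optionally formalized in OCL) might eventually surface as a guard in code. The BillingAccount class, its attributes, and the constraint "balance must never exceed creditLimit" are assumptions made only for illustration; they are not taken from the chapter's example or from any standard UML artifact.

```java
// Hypothetical class sketched from a class-description entry; names and
// constraint are illustrative only.
public class BillingAccount {
    private final double creditLimit;   // class description: creditLimit >= 0
    private double balance;             // class description: balance <= creditLimit

    public BillingAccount(double creditLimit) {
        if (creditLimit < 0) {
            throw new IllegalArgumentException("creditLimit must be non-negative");
        }
        this.creditLimit = creditLimit;
        this.balance = 0.0;
    }

    public void charge(double amount) {
        // The descriptive constraint becomes an explicit guard at implementation time.
        if (amount < 0 || balance + amount > creditLimit) {
            throw new IllegalArgumentException("charge would violate the account constraint");
        }
        balance += amount;
    }

    public double getBalance() {
        return balance;
    }

    public static void main(String[] args) {
        BillingAccount account = new BillingAccount(100.00);
        account.charge(40.00);
        System.out.println("Balance: " + account.getBalance());   // prints 40.0
    }
}
```

During design, of course, the constraint would only be stated in the class description; enforcing it in code is a later, implementation-level decision.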
3. Software Process Activities
A framework for the basic software life-cycle using UML is shown in Fig. 1. The process begins with a set of requirements for a new system and ends when the executable files exist. In between, the process proceeds through a number of activities, represented here by rectangles. The horizontal arrows show the sequence of development activities, and curved arrows represent inspections, in which the software artifact being produced during a specific activity is reviewed and potentially improved before the next activity can begin.2 Throughout the life-cycle, process tracking, management, and quality assurance activities proceed in parallel with development activities, as determined by the project plan. This figure also shows the information produced throughout the software life-cycle being stored in a repository, which we recommend so that this information can be reused.

2 Although inspections are applicable at many stages of the life-cycle, in this chapter we will concentrate on two phases in particular: requirements and high-level design.

In this section we look in more detail at the process activities involved in high- and low-level design, drawing connections where relevant
to the other activities in the life-cycle that influence, or are influenced by, the design.

FIG. 1. The basic software life-cycle.
3.1 Development Activities
Entire books have been written describing development processes that use UML. Some recommended examples are by D'Souza and Wills [25], Eriksson and Penker [34], Douglass [35], and Jacobson et al. [27]. In this chapter, we describe a high-level outline of the process, identifying important design steps and dependencies between artifacts. Our goal is not to provide a complete and definitive life-cycle process, but to enable the reader to understand the various process dependencies in sufficient detail that they can reorganize the activities, if desired, into a development process that suits their environment.

Fig. 2 represents the activities within and leading up to high-level design (HLD). Before HLD can begin, a set of requirements, at some level of detail, using some notation, is produced during the requirements phase. We recommend the use of a requirements inspection, before beginning HLD, to ensure that the requirements specified are as correct and complete as possible, and adequately represent the needs of the customer. Although a choice of inspection methods exists, Fig. 2 illustrates the use of PBR, a particular approach to performing requirements inspections in which reviewers produce initial artifacts as they look for requirements defects from different points of view. (PBR is described in more detail in Section 3.2.2.) Using PBR has the advantage of producing an initial set of test cases, which can be used later in the life-cycle, and some information about potential classes, which leads into HLD activities. PBR also produces a representation of system functionalities, which contains important information for HLD. In this example, we have chosen use cases as a functional representation, although other choices are possible. In other environments,
use cases can be produced without PBR, and indeed may be the only representation used for specifying requirements.

FIG. 2. High-level design activities, using artifacts created during the requirements phase and during PBR requirements inspections.

HLD activities themselves typically begin with the creation of a first draft of the class diagram. Based on an understanding of the requirements, the designers first identify candidate classes in the design space, i.e., the important real-world entities and their associated behaviors that should be modeled by the system in order to solve the problem specified in the requirements. Use cases also contain important information for identifying potential classes, since they help identify the set of functionality of the system and the actors and other systems that participate in it. A process such as CRC cards [36] may be used (either as part of the PBR inspections or as a separate design activity) to start the process. It is important not to engage in over-analysis at this point, as classes will almost surely be discarded, added, or modified over time as the domain is better understood. At this step of the design, as in others, the class description is continually changed to reflect changes in the way the classes themselves are defined.

The next set of activities (step 2 in Fig. 2) is to construct the interaction and state diagrams, describing in more detail the behavior of the classes (as identified in the class diagram) to achieve the system functionality (as described in the requirements specification and use cases). While accomplishing this, the designer typically gains new insights about the set of necessary classes and their internal structures. Thus a new version of the
class diagram is produced (step 3) as the classes are updated with information about their dynamic behavior. As a last step, the class diagram is scrutinized with an eye to implementation. If the diagram is large enough that describing an overall implementation approach is unwieldy, it may be divided into packages that reflect the logical groupings of the system classes, allowing it to be broken into chunks for easier communication and planning. Similarly, activity diagrams may be created, using the use cases to suggest the high-level business processes in the system, to provide a more concise overview of system functionality.

At the end of HLD, a set of artifacts has been produced (consisting of the class description and class, interaction, and state diagrams, and possibly including package and activity diagrams) that describes the real-world entities from the problem domain. An inspection of the HLD is recommended at this point in the process to ensure that developers have adequately understood the problem before defining the solution. (That is, the emphasis of the inspection should be on high-level comprehension rather than low-level details of the architecture.) Since low-level designs use the same basic diagram set as the high-level design, but with more detail added, reviews of this kind can help ensure that low-level design starts from a high-quality base. To provide a concrete discussion of the inspection process, we introduce OORT inspections, described further in Section 3.2.2. Unlike PBR, OORT inspections do not produce new artifacts, but result solely in updated versions of the existing HLD artifacts.

These HLD artifacts are further refined in low-level design (LLD), in which details regarding system implementation are added. The first step of LLD is to adjust the set of design artifacts to the implementation domain. It is at this point that classes are introduced into the model that represent entities in the software-based solution but not the real world, such as abstract data types or screen displays. New methods and attributes may be introduced that reflect how objects will communicate with one another in the programming language chosen, not how high-level messages will be exchanged to achieve the solution. A second step is to design the system's specific interfaces, such as user interfaces and database solutions (if necessary). Also at this stage, an approach to task management is defined. Any changes necessary to the system classes in order to use these interfaces are reflected in the class description, which is used as input to the next step, in which all of the design artifacts are updated to be consistent with the new implementation details and interfaces specified. Based on these updated artifacts, a number of final activities are undertaken before coding starts, such as developing a test plan and undertaking coding preparation.
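To suggest the flavor of the HLD-to-LLD refinement described above, the hedged sketch below shows a problem-domain class gaining implementation-domain collaborators (a persistence interface and a screen class). All names here are assumptions made for this sketch; they are not part of the chapter's example.

```java
// Hypothetical illustration of low-level design refinement: the problem-domain
// class Customer (HLD) keeps its behavior, while implementation-domain
// elements (a persistence interface and a screen class) are introduced at LLD.
import java.util.ArrayList;
import java.util.List;

public class LldRefinementSketch {

    interface CustomerStore {               // implementation-domain abstraction (LLD)
        void save(Customer customer);
    }

    static class CustomerScreen {           // user-interface class (LLD)
        void showName(String name) {
            System.out.println("Customer: " + name);
        }
    }

    static class Customer {                 // problem-domain class (HLD)
        private final String name;

        Customer(String name) { this.name = name; }

        String getName() { return name; }

        // Method added during LLD so the object can be stored and displayed.
        void register(CustomerStore store, CustomerScreen screen) {
            store.save(this);
            screen.showName(name);
        }
    }

    public static void main(String[] args) {
        List<Customer> database = new ArrayList<>();   // stand-in for a real store
        CustomerStore store = database::add;
        new Customer("A. Developer").register(store, new CustomerScreen());
        System.out.println("Stored customers: " + database.size());
    }
}
```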
3.2 Quality Assurance and Verification and Validation Activities
Verification and validation (V&V) activities check whether the system being developed can meet the requirements of the customer. To do this, a typical V&V activity focuses on some artifacts produced during the software life-cycle to ascertain if they are correct in themselves (verification) and accurately describe the system that should be produced (validation). V&V activities are often known as "quality assurance" activities, since they are concerned with ensuring the quality of the system being developed, either directly by evaluating the system itself, or indirectly by evaluating the quality of the intermediate artifacts used to produce the system [7].

At a high level, there are three types of V&V activities. Measurement activities attempt to assess the quality of a design by assessing certain structural characteristics. The challenge lies in finding appropriate and feasible metrics for the qualities of interest. For example, designers might be interested in evaluating the modifiability of their design, perhaps because several later versions of the system are expected and it would be worthwhile to minimize the effort required for maintenance in each case. A quality attribute such as this one cannot be measured directly, so the designers instead might choose to measure some other attribute that is measurable and yet provides some insight into the ease of making modifications. Using product metrics in this way requires some kind of baseline data or heuristic information, so that the resulting values can be evaluated. Measurement in the UML/OO paradigm is discussed in Section 3.2.1.

In contrast, inspection and testing, two other V&V activities, attempt to ensure software quality by finding defects in the various artifacts produced during the life-cycle. Inspection activities require humans to review an artifact and think about whether it is of sufficient quality to support the development of a quality system. There are different types of inspection techniques that represent different strategies for organizing people's roles during the inspections and for keeping reviewers focused on the important aspects of the artifact they are inspecting. Inspections are discussed in more detail in Section 3.2.2, in which the discussion is illustrated by two specific inspection approaches: perspective-based reading, which is tailored to inspections of requirements documents, and object oriented reading techniques, which are tailored for OO design inspections.

Testing is a V&V activity that is appropriate for evaluating software artifacts for which some dynamic behavior or structural feature can be studied. Testing attempts to understand the quality of the artifact, for instance, by comparing the observed behavior to that which is expected. Typically, testing is applied to code, which can of course be compiled or
interpreted and run directly. However, testing can also be applied to other artifacts; for example, requirements and design represented in a formal language can be "run" using simulation and the results studied [37]. In this discussion, we will confine ourselves to discussing code testing and how it can be affected by design decisions. In Section 3.2.3, we will also look briefly at some of the different types of testing techniques that face particular challenges in the OO/UML paradigm.

These different types of V&V activities are not in competition. Rather, some combination is necessary for the production of quality software. Unfortunately, too often development efforts rely entirely on code testing and do not invest in other V&V activities, notably inspections, on artifacts earlier in the life-cycle. Relying exclusively on testing in this way means that defects are not found until the end of the life-cycle, when they are most expensive to fix. Additionally, over-reliance on testing often feeds a tendency of developers to pay less attention to careful design and code production (on the assumption that testing will catch any of the resulting problems). Such an approach can lead to difficulties, since it is rarely possible to "test in" quality; low-quality code with multiple patches does not often end up being high-quality code in the end [38].

In contrast, augmenting testing with other V&V activities earlier in the life-cycle means that misconceptions about the system can be caught early. For example, requirements inspections can help identify problems with the way a planned system would address customer needs before an inappropriate design has been created. Design inspections can identify problems with the system architecture or design before significant effort has been invested in coding, which may have to be redone if problems are detected later. Inspections cannot replace testing but are an investment that helps "build in" quality from the beginning and avoid later rework.
Defects in Software Artifacts

Both inspection and testing have the same goal: to find defects in the particular software artifact under review. To get an operational definition of what exactly a "defect" is, we introduce some terms based on the IEEE standard terminology [39]:

• An error is a defect in the human thought process made while trying to understand given information, to solve problems, or to use methods and tools. In the context of software design, an error is a basic misconception concerning how the system should be designed to meet the needs of a user or customer.

• A fault is a concrete manifestation of an error within a software artifact. One error may cause several faults and various errors may cause identical faults.
• A failure is a departure of the operational software system behavior from the user's expected requirements. A particular failure may be caused by several faults and some faults may never cause a failure.

For the sake of convenience, we will use defect as a generic term to refer to a fault or failure. However, it should be clear that when we discuss the defects found by software inspections, we are really referring to faults. A fault in some static artifact, such as a system design, is important insofar as it can lead to a system implementation in which failures occur. Defects found by testing, on the other hand, are always failures of the software that can then be traced back to faults in a software artifact during debugging.

When we look for a precise definition of a defect, capable of guiding a V&V activity, we face the problem that what constitutes a defect is largely situation-dependent. For example, if there are strong performance requirements on a system, then any description of the system that might lead to those performance requirements being unfulfilled contains a defect; however, for other systems with fewer performance constraints the same artifacts could be considered perfectly correct. Similarly, the types of defects we are interested in for a textual requirements document could be very different from what we would look for in a graphical design representation.

We can avoid this difficulty by identifying broad classes of defects and then instantiating those classes for specific circumstances. For our own work on developing inspection processes, we have found a useful classification that is based on the idea of the software development life-cycle as a series of transformations of a system description into increasingly formal notations. For example, we can think of a set of natural-language requirements as a loose description of a system that is transformed into high- and low-level designs, more formal descriptions of the same basic set of functionality. Eventually these designs are translated into code, which is more formal still, but still describes the same set of functionality (hopefully) as was set forth in the original requirements.

So what can go wrong during such transformations? Fig. 3 presents a simplified view of the problem, in which all of the relevant information has to be carried forward from the previous phase into a new form, and has to be specified in such a way that it can be further refined in the next phase. The ideal case is shown by arrow 1, in which a piece of information from the artifact created in the previous phase of development is correctly translated into its new form in the artifact in the current phase. There is, however, the possibility that necessary information is somehow left out of the new artifact (arrow 2) or translated into the new artifact but in an incorrect form (arrow 3). In the current phase artifact, there is always the possibility that
extraneous information has been entered (arrow 4), which could lead to confusion in the further development of the system, or that information has been specified in such a way as to make the document inconsistent with itself (arrow 5). A related possibility is that information has been specified ambiguously, leading to multiple interpretations in the next phase (arrows 6), not all of which may be correct or appropriate.3 These generic defect classes can be made more specific, to guide V&V activities for various circumstances. Examples of this are given for specific inspections in Section 3.2.2. Table II summarizes these defect classes. It is important to note that the classes are not orthogonal (i.e., a particular defect could possibly fit into more than one category) but are intended to give an idea of the set of possible defects that can occur.

FIG. 3. Representation of various defect types that can occur during software development.

3 Of course, Fig. 3 is a simplified view. In reality, many of the implied one-to-one mappings do not hold. There may be multiple artifacts created in each stage of the life-cycle, and the information in a particular phase can influence many aspects of the artifact created in the next phase. For example, one requirement from a requirements specification can impact many components of the system design. When notational differences are taken into account (e.g., textual requirements are translated into a graphical design description) it becomes apparent why performing effective inspections can be such a challenging task.

TABLE II
TYPES OF SOFTWARE DEFECTS, WITH GENERIC DEFINITIONS

Omission: Necessary information about the system has been omitted from the software artifact.
Incorrect fact: Some information in the software artifact contradicts information in the requirements document or the general domain knowledge.
Inconsistency: Information within one part of the software artifact is inconsistent with other information in the software artifact.
Ambiguity: Information within the software artifact is ambiguous, i.e., any of a number of interpretations may be derived that should not be the prerogative of the developer doing the implementation.
Extraneous information: Information is provided that is not needed or used.

3.2.1 Measurement

Software development projects desiring some insight into the product being produced and the process being applied use metrics to measure important information about the project. Many companies have full-scale
measurement programs that operate alongside software development activities, collecting a standard set of metrics across multiple projects to facilitate the tracking of development progress. The most sophisticated of these use some sort of measurement framework to ensure that the metrics being collected are tied directly to the business goals of the company. For example, the GQM paradigm [40] makes explicit the connection between overall goals, specific questions that must be answered to achieve the goals, and the metrics that collect information capable of answering the questions. Developers and project managers have found metrics useful for:

• evaluating software quality (e.g., by measuring system reliability);
• understanding the design process (e.g., by measuring how much effort is being spent, and on what activities);
• identifying product problems (e.g., by identifying overly complex modules in the system);
• improving solutions (e.g., by understanding the effectiveness of design techniques and how they can be better tailored to the users); and
• acquiring design knowledge (e.g., by measuring the size of the design being produced).

Two of the most often measured attributes of an OO design are coupling and cohesion. Coupling refers to the degree of interdependence between the parts of a design. One class is coupled to another class when methods declared in one class use methods or attributes of the other class. High
coupling in a design can indicate an overly complex or poorly constructed design that will likely be hard to understand. This measure can also indicate potential problems with maintenance, since changes to a highly coupled class are likely to impact many other classes in the design. Cohesion refers to the internal consistency within parts of the design. Semantically, cohesion is measured by whether there is a logical consistency among the names, methods, and attributes of classes in the design. Syntactically, cohesion can be measured by whether a class has different methods performing different operations on the same set of attributes, which may indicate a certain logical consistency among the methods. A lack of cohesion can also indicate a poorly constructed design, since it implies that classes have methods that would not logically be expected to belong to them, indicating that the domain may not have been modeled correctly and pointing to potential maintenance problems [41].

Size metrics are also often used. Projects undertake size measures for a variety of reasons, such as to produce an estimate of the implementation effort that will be necessary [42]. However, no one definitive size metric is possible, since each involves some level of abstraction and so may not completely describe all attributes of interest; for example, measuring the number of classes is at best a rough estimate of system size, since not all classes are at the same level of complexity. Lorenz and Kidd [43] identified a number of size metrics for requirements, high- and low-level design:

• Number of scenario scripts (NSS): counts the number of use cases that are necessary to describe the system. Since this is a measure of functionality, it is correlated to the size of the application and, more directly, to the number of test cases that will be necessary.
• Number of key classes (NKC): counts the number of domain classes in the HLD, giving a rough estimate of the amount of effort necessary to implement the system and the amount of reuse that will be possible.
• Number of support classes (NSC): counts the number of classes in the LLD, giving rough predictions of implementation effort and reuse.
• Average number of support classes per key class (ANSC): measures the degree of expansion of the system from HLD to LLD, giving an estimate of how much of the system is necessary for implementation-level details.
• Number of subsystems (NSUB): a rough size measure based on larger aggregates of system functionality.
• Class size (CS): for an individual class, this metric is defined as the total number of operations plus the number of attributes (both including inherited features).
• Number of operations overridden by a subclass (NOO).
• Number of operations added by a subclass (NOA).
• Specialization index (SI): defined as (NOO x level in the hierarchy) / (total methods in the class).

Metrics exist to measure other attributes of designs besides size. Often, these metrics attempt to somehow measure design complexity, on the assumption that more complex designs are harder for human developers to understand and consequently harder to develop and maintain. Perhaps the most well known of these metrics sets was proposed by Chidamber and Kemerer [44]:

• Weighted methods per class (WMC): measures a class by summing the complexity measures assigned to each of the class's methods, motivated by the idea that the number and complexity of a class's methods are correlated with the effort required to implement that class. Another use of this metric is suggested by the heuristic that classes with large numbers of methods are likely to be more application specific, and hence less likely candidates for reuse.
• Depth of inheritance (DIT): measures the depth at which a class appears in an inheritance hierarchy. Classes deeper in the hierarchy are likely to inherit a larger number of methods and attributes, making their behavior more difficult to predict.
• Number of children (NOC): measures the number of subclasses that directly inherit from a given class. A high NOC value typically indicates that a class should be tested more extensively, since such classes may represent a misuse of subclassing, but definitely have an extensive influence on the design.
• Coupling between objects (CBO): measures the number of other classes to which a class is coupled. Extensive coupling indicates higher complexity (and hence suggests more testing is necessary) but also signals a likely difficulty in reusing the class.
• Response for a class (RFC): measures the number of methods belonging to the class that can be executed in response to a message. The higher the value, the more complex testing and debugging of that class are likely to be.
• Lack of cohesion in methods (LCOM): measures the degree to which the methods of a class make use of the same attributes. More overlap among the attributes used is assumed to signal more cohesiveness among the methods. A lack of cohesion increases complexity, increasing the likelihood of development errors, and typically indicates that the class should be split into two or more subclasses.
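As a rough, hedged illustration of how a few of these metrics can be computed mechanically, the sketch below uses Java reflection over a tiny, invented class hierarchy. WMC is simplified to a plain count of declared methods (each method weighted 1), DIT is counted as the number of ancestors below java.lang.Object, and NOC is obtained by scanning a known set of classes. Real metric tools work on design or source models rather than loaded classes, so this is only a sketch of the definitions, not a measurement tool.

```java
// Minimal sketch computing simplified WMC, DIT, and NOC with reflection.
import java.util.Arrays;
import java.util.List;

public class CkMetricsSketch {
    static class Vehicle { void start() {} void stop() {} }
    static class Car extends Vehicle { void openTrunk() {} }
    static class Truck extends Vehicle { void lowerRamp() {} }

    static int wmc(Class<?> c) {                       // count of declared methods, each weighted 1
        return c.getDeclaredMethods().length;
    }

    static int dit(Class<?> c) {                       // number of ancestors below java.lang.Object
        int depth = 0;
        for (Class<?> s = c.getSuperclass(); s != null && s != Object.class; s = s.getSuperclass()) {
            depth++;
        }
        return depth;
    }

    static int noc(Class<?> c, List<Class<?>> all) {   // number of direct children within the model
        int children = 0;
        for (Class<?> other : all) {
            if (other.getSuperclass() == c) children++;
        }
        return children;
    }

    public static void main(String[] args) {
        List<Class<?>> model = Arrays.asList(Vehicle.class, Car.class, Truck.class);
        for (Class<?> c : model) {
            System.out.printf("%s: WMC=%d DIT=%d NOC=%d%n",
                    c.getSimpleName(), wmc(c), dit(c), noc(c, model));
        }
    }
}
```

In practice, values such as these would be compared against baseline data or heuristic thresholds for the organization before any conclusion about design quality is drawn.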
The popularity of the Chidamber and Kemerer metrics for describing designs has led to a few extensions being proposed, so that the range of measures could be tailored to particular needs. For example, Li and Henry [45] introduced two new metrics that were useful for the commercial systems (implemented in an OO dialect of Ada) they were studying:

• Message passing coupling (MPC): calculated as the number of "send" statements defined in a class.
• Data abstraction coupling (DAC): calculated as the number of abstract data types used in the measured class but defined in another class of the system.

And Basili, Briand, and Melo [46] introduced a version of the Chidamber and Kemerer metrics tailored to C++:

• WMC: redefined so that all methods have complexity 1 (i.e., the metric is a count of the number of methods in a class) and "friend" operators do not count.
• DIT: measures the number of ancestors of a class.
• NOC: measures the number of direct descendants for each class.
• CBO: redefined so that a class is coupled to another if it uses its member functions and/or attributes.
• RFC: measures the number of functions directly invoked by member functions or operators of a class.
• LCOM: defined as the number of pairs of member functions without shared instance variables, minus the number of pairs of member functions with shared instance variables.

Table III summarizes the metrics discussed in this section and connects them with the life-cycle stages for which they are appropriate.
TABLE III
METRICS DISCUSSED IN THIS CHAPTER FOR EACH PHASE OF THE LIFE-CYCLE
(Rows: the Lorenz and Kidd metrics NSS, NKC, NSC, ANSC, NSUB, CS, NOO, NOA, and SI, and the Chidamber and Kemerer metrics WMC, DIT, NOC, CBO, RFC, and LCOM. Columns: requirements description, high-level design, low-level design, coding, and testing. Each metric is marked in the columns for the phases in which it applies.)

3.2.2 Inspections

Software inspections are a type of V&V activity that can be performed throughout the software life-cycle. Because they rely on human understanding to detect defects, they have the advantage that they can be done as soon as a software work artifact is written and can be used on a variety of different artifacts and notations. Because they are typically done by a team, they are a useful way of passing technical expertise about good and bad aspects of software artifacts among the participants. And, because they get developers familiar with the idea of reading each other's artifacts, they can lead to more readable artifacts being produced over time. On the other hand, because they rely on human effort, they are affected by nontechnical
issues: reviewers can have different levels of relevant expertise, can get bored if asked to review large artifacts, can have their own feelings about what is or is not important, or can be affected by political or personal issues. For this reason, there has been an emphasis on defining processes that people can use for performing effective inspections.

Most of the current work on inspections owes a large debt to the very influential works of Fagan [47] and Gilb and Graham [48]. In both, the emphasis is on the inspection method,4 in which the following phases are identified:

• Planning: In this phase, the scope, artifact, and participants for the inspection are decided. The relevant information and materials are distributed to each inspector, and their responsibilities are explained to them, if necessary.
• Detection: The inspectors review the artifact on their own, identifying defects or other quality issues they encounter.
• Collection: The inspectors meet as a team to discuss the artifact, and any associated problems they feel may exist. A definitive list of the issues raised by the team is collected and turned over to the author of the artifact.
• Correction: The author updates the artifact to reflect the concerns raised by the inspection team.

4 In this text we distinguish a "technique" from a "method" as follows: A technique is a series of steps, at some level of detail, that can be followed in sequence to complete a particular task. We use the term "method" as defined in [49], "a management-level description of when and how to apply techniques, which explains not only how to apply a technique, but also under what conditions the technique's application is appropriate."

The methods do not, however, give any guidelines to the reviewer as to how defects should be found in the detection phase; both assume that the individual review of these documents can already be done effectively. Having been the basis for many of the review processes now in place (e.g., at NASA [50]), Fagan [47a] and Gilb and Graham [48] have inspired the direction of much of the research in this area, which has tended to concentrate on improving the review method. Proposed improvements to Fagan's method often center on the importance and cost of the meeting. For example, researchers have proposed:

• introducing additional meetings, such as the root cause analysis meeting of Gilb and Graham [48];
• eliminating meetings in favor of straightforward data collection [51].

More recent research has tried to understand better the benefits of inspection meetings. Surprisingly, such studies have reported that, while they may have other benefits, inspection meetings do not contribute significantly to the number of defects found [51,52]. That is, team meetings do not appear to provide a significantly more complete list of defects than if the actual meeting had been dispensed with and the union of the individual reviewers' defect lists taken. This line of research suggests that efforts to improve the review technique, that is, the process used by each reviewer to find defects in the first place, could be of benefit.

One approach to doing this is provided by software reading techniques. A reading technique is a series of steps for the individual analysis of a software product to achieve the understanding needed for a particular task [49]. Reading techniques increase the effectiveness of individual reviewers by providing guidelines that they can use, during the detection phase of a software inspection, to examine (or "read") a given software document and identify defects. Rather than leave reviewers to their own devices, reading techniques attempt to capture knowledge about best practices for defect detection into a procedure that can be followed.
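The comparison behind the meeting studies cited above is essentially a set computation: the union of the individual reviewers' defect lists versus the list produced by the team meeting. The sketch below, using made-up defect identifiers, shows that computation; the data are purely illustrative and do not come from the studies themselves.

```java
// Hypothetical data: which defects do reviewers find individually versus
// which appear on the team meeting's definitive list?
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MeetingGainSketch {
    public static void main(String[] args) {
        List<Set<String>> individualLists = List.of(
                Set.of("D1", "D2", "D5"),      // reviewer A
                Set.of("D2", "D3"),            // reviewer B
                Set.of("D1", "D4"));           // reviewer C
        Set<String> meetingList = Set.of("D1", "D2", "D3", "D4");

        Set<String> union = new HashSet<>();
        for (Set<String> list : individualLists) union.addAll(list);

        Set<String> meetingGain = new HashSet<>(meetingList);
        meetingGain.removeAll(union);          // defects found only at the meeting

        System.out.println("Union of individual lists: " + union);
        System.out.println("Found only in the meeting: " + meetingGain);
    }
}
```

In the studies cited, the set of defects found only at the meeting tended to be small, which is what motivated the focus on improving the individual reading step.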
In our work we have defined the following goals for inspection techniques:

• Systematic: Specific steps of the individual review process should be defined.
• Focused: Different reviewers should be asked to focus on different aspects of the document, thus having unique (not redundant) responsibilities.
• Allowing controlled improvement: Based on feedback from reviewers, specific aspects of the technique should be able to be identified and improved.
• Tailorable: The technique should be customizable to a specific project and/or organization.
• Allows training: It should be possible to train the reviewers to apply the technique.

In this section, we look at reading techniques that directly support the production of quality software designs: PBR, which ensures that the artifacts input to HLD are of high quality, and OORTs, which evaluate the quality of the HLD itself.
A Requirements Inspection Technique: Perspective-Based Reading (PBR)

A set of inspection techniques known as perspective-based reading (PBR) was created for the domain of requirements inspections. PBR is designed to help reviewers answer two important questions about the requirements they are inspecting:

• How do I know what information in these requirements is important to check?
• Once I have found the important information, how do I identify defects in that information?

PBR exploits the observation that different information in the requirements is more or less important for the different uses of the document. That is, the ultimate purpose of a requirements document is to be used by a number of different people to support tasks throughout the development life-cycle. Conceivably, each of those persons finds different aspects of the requirements important for accomplishing his or her task. If we could ask all of the different people who use the requirements to review it from their own point of view, then we would expect that all together they would have reviewed the whole document (since any information in the document is presumably there to help somebody do his or her job).
Thus, in a PBR inspection each reviewer on a team is asked to take the perspective of a specific user of the requirements being reviewed. His or her responsibility is to create a high-level version of the work products that a user of the requirements would have to create as part of his or her normal work activities. For example, in a simple model of the software life-cycle we could expect the requirements document to have three main uses:

• As a description of the needs of the customer: The requirements describe the set of functionality and performance constraints that must be met by the final system.
• As a basis for the design of the system: The system designer has to create a design that can achieve the functionality described by the requirements, within the allowed constraints.
• As a point of comparison for system test: The system's test plan has to ensure that the functionality and performance requirements have been correctly implemented.

In such an environment, a PBR inspection of the requirements would ensure that each reviewer evaluated the document from one of those perspectives, creating some model of the requirements to help focus their inspection: an enumeration of the functionality described by the requirements, a high-level design of the system, and a test plan for the system, respectively. The objective is not to duplicate work done at other points of the software development process, but to create representations that can be used as a basis for the later creation of more specific work products and that can reveal how well the requirements can support the necessary tasks.

Once reviewers have created relevant representations of the requirements, they still need to determine what defects may exist. To facilitate that task, the PBR techniques provide a set of questions tailored to each step of the procedure for creating the representation. As the reviewer goes through the steps of constructing the representation, he or she is asked to answer a series of questions about the work being done. There is one question for every applicable type of defect. (The defect types, tailored specifically to the requirements phase, are given in Table IV.) When the requirements do not provide enough information to answer the questions, this is usually a good indication that they do not provide enough information to support the user of the requirements, either. This situation should lead to one or more defects being reported so that they can be fixed before the requirements need to be used to support that task later in the product life-cycle.
TABLE IV
TYPES OF SOFTWARE DEFECTS, WITH SPECIFIC DEFINITIONS FOR THE REQUIREMENTS AND DESIGN

Omission
Applied to requirements: (1) some significant requirement related to functionality, performance, design constraints, attributes, or external interface is not included; (2) responses of the software to all realizable classes of input data in all realizable classes of situations are not defined; (3) missing sections of the requirements document; (4) missing labeling and referencing of figures, tables, and diagrams; (5) missing definition of terms and units of measure [53].
Applied to design: One or more design diagrams that should contain some concept from the general requirements or from the requirements document do not contain a representation for that concept.

Incorrect fact
Applied to requirements: A requirement asserts a fact that cannot be true under the conditions specified for the system.
Applied to design: A design diagram contains a misrepresentation of a concept described in the general requirements or requirements document.

Inconsistency
Applied to requirements: Two or more requirements are in conflict with one another.
Applied to design: A representation of a concept in one design diagram disagrees with a representation of the same concept in either the same or another design diagram.

Ambiguity
Applied to requirements: A requirement has multiple interpretations due to multiple terms for the same characteristic, or multiple meanings of a term in a particular context.
Applied to design: A representation of a concept in the design is unclear, and could cause a user of the document (developer, low-level designer, etc.) to misinterpret or misunderstand the meaning of the concept.

Extraneous information
Applied to requirements: Information is provided that is not needed or used.
Applied to design: The design includes information that, while perhaps true, does not apply to this domain and should not be included in the design.
More information, including example techniques for the three perspectives identified above, is available at http://fc-md.umd.edu/reading/reading.html.
Design Inspection Techniques: Object Oriented Reading Techniques (OORTs)

In PBR, reviewers are asked to develop abstractions, from different points of view, of the system described by the requirements, because requirements notations do not always facilitate the identification of important information and the location of defects by an inspector. For an OO design, in contrast, the abstractions of important information already exist: the information has already been described in a number of separate models or diagrams (e.g., state machines, class diagrams), as discussed at the end of the previous section. However, the information in the abstractions has to be checked for defects, and reading techniques can still supply a benefit by providing a procedure for individual inspection of the different diagrams, although unique properties of the OO paradigm must be addressed. In an object oriented design we have graphical representations of the domain concepts instead of the natural language representation found in the requirements document. Another feature of object oriented designs that has to be accounted for is the fact that, while the different documents within the design all represent the system, they present different views of the information.

A set of object oriented reading techniques (OORTs) has been developed for this purpose, focused on a particular set of defects that was defined by tailoring the generic defect definitions from Table II to the domain of OO designs (Table IV). For example, the information in the artifact must be compared to the general requirements in order to ensure that the system described by the artifact matches the system that is supposed to be built. Similarly, a reviewer of the artifact must also use general domain knowledge to make sure that the artifact describes a system that is meaningful and can be built. At the same time, irrelevant information from other domains should typically be prevented from appearing in the artifact, since it can only hurt clarity. Any artifact should also be analyzed to make sure that it is self-consistent and clear enough to support only one interpretation of the final system.

The PBR techniques for requirements are concerned mainly with checking the correctness of the document itself (making sure the document was internally consistent and clearly expressed, and that the contents did not contradict any domain knowledge). A major difference in the OORTs is that, for checking the correctness of a design, the reading process must be twofold. As in requirements inspection, the correctness and consistency of the design diagrams themselves must of course be verified (through
"horizontal reading ''5) to ensure a consistent document. But a frame of reference is necessary in order to assess design correctness. Thus it is also necessary to verify the consistency between design artifacts and the system requirements (through "vertical reading"6), to ensure that the system design is correct with respect to the functional requirements. The OORTs consist of a family of techniques in which a separate technique has been defined for each set of diagrams that could usefully be compared against each other. For example, sequence diagrams need to be compared to state machines to detect whether, for a specific object, there are events, constraints, or data (described in the state machine) that could change the way that messages are sent to it (as specified in the sequence diagram). The advantage of this approach is that a project engaged in design inspections can select from this family only the subset of techniques that correspond to the subset of artifacts they are using, or that are particularly important for a given project. The full set of horizontal and vertical reading techniques is defined as illustrated in Fig. 4. Each line between the software artifacts represents a reading technique that has been defined to read one against the other. These techniques have been demonstrated to be feasible and helpful in finding defects [19,54]. More information about the OORTs, including a
FIG. 4. The set of OORTs (each line represents one technique) that has been defined for various design artifacts.
5 Horizontal reading refers to reading techniques that are used to read documents built in the same software life-cycle phase (see Fig. 4). Consistency among documents is the most important feature here. 6 Vertical reading refers to reading techniques that are used to read documents built in different software life-cycle phases (see Fig. 4). Traceability between the phases is the most important feature here.
More information about the OORTs, including a technical report describing how these techniques were defined, is available at http://fc-md.umd.edu/reading/reading.html.
3.2.3 Testing

Testing is a V&V activity that is performed toward the end of the life-cycle (or, more precisely, toward the end of some iteration through the development process), once executable code has been produced. It is a relatively expensive activity; it typically takes great ingenuity to exercise all of the important features of a system under realistic conditions and notice discrepancies from the expected results, not to mention tracing the system failures observed during execution back to the relevant faults in the code. Nevertheless, it remains an essential way of ensuring the quality of the system. Due to this expense, however, exhaustive testing is not feasible in most situations, and so it is important for development projects to have a well thought-out test plan that allocates time and resources in such a way as to provide effective coverage of the system. A number of general testing approaches have been developed, and in this section we describe briefly how they map to systems developed using OO/UML and what types of difficulties are introduced by the new paradigm.

A test plan is a general document for the entire project that identifies the scope of the testing effort, the general approach to be taken, and a schedule of testing activities. Regardless of the paradigm by which a system has been developed, a good test plan will also include:

• a test unit specification, listing the modules to be tested and the data used during testing;
• a listing of the features to be tested (possibly including functionality, performance, and design constraints);
• the deliverables resulting from the test activities (which might be a list of the test cases used, detailed testing results, a summary report, and/or data about code coverage);
• personnel allocation, identifying the persons responsible for performing the different activities.

Also independent of the development paradigm are the criteria for a good test plan: effectiveness (ideally, it should result in all of the defects in the product being fixed), a lack of redundancy (effort is spent efficiently), completeness (important features of the system are not missed during testing), and the right level of complexity. Planning a system-wide test plan should make use of three different test techniques. These techniques are complementary, meaning that no one
technique covers all the important aspects of a system; rather, effort should be allocated to testing of each type to a greater or lesser degree, depending on how much it meets the needs of the particular system being developed. These three techniques are:

• Functional (black-box) testing [55]: Test cases are built based on the functionality that has been specified for the system, with the objective of ensuring that the system behaves correctly under all conditions. Because this technique is concerned with the system's behavior, not implementation, approaches developed for other programming paradigms can be used on OO systems without change, such as equivalence partitioning, boundary value analysis, and cause-effect graphing.
• Structural (white-box) testing [38, 56]: Test cases are built based on the internal structure of the software, with the objective of ensuring that all possible execution paths through the code will produce correct results. The nature of OO programming, in which data flow can be distributed among objects, each of which maintains its own separate state, introduces some complexities, which are described below under "unit testing." Structural testing approaches can be based on control flow, data flow, or program complexity. There is ongoing research as to how these approaches can be adapted to OO/UML [57], but there are no practical techniques as yet.
• Defect-based testing: Test cases are built to target likely classes of defects. Approaches include defect seeding (in which a set of defects is seeded into the code before testing begins, to get some idea of the effectiveness of the testing activities), mutation testing for unit testing [58], and interface mutation for integration testing [59]. Although work has been done on identifying useful classes of defects for use in the testing of structured programs, little has been published as to which types of defects are most useful to focus on in OO development.

Testing usually proceeds through multiple levels of granularity, and techniques from each of the above types may be applied at each level. Testing may proceed from unit testing, in which individual components of the system are tested in isolation, to integration testing, in which components are tested while working together, to system and acceptance testing, in which the system is tested as an operational whole. Development with OO/UML does not change testing at the system level, because the goal there is to test functionality independent of the underlying implementation, but does affect testing at other levels:

• Unit testing: Unit testing is complicated in OO/UML, first and foremost because it is not generally agreed upon what the "unit"
64
GUILHERME H. TRAVASSOS ETAL.
should represent. In structured programming, the unit is generally taken to be a code module, but in O O / U M L it can be a method, a class, or some larger aggregation of classes such as a package. Aside from the matter of notation there are still significant technical difficulties. Inheritance and polymorphism introduce challenges because they allow many execution paths to exist that are not easy to identify from inspecting the class. Additionally, object states cause complications because the different states an object can be in, and how these states affect the responses to messages, must be taken into account during testing. Integration testing: No matter how the "unit" is defined, OO systems will typically have a higher number of components, which need to be integrated earlier in the development cycle, than a non-OO system. On top of that, managing integration testing in U M L / O O can be quite complex since typically a simple calling tree does not exist; objects do not exist statically throughout the life-time of the system but are created and deleted while responding to triggering events. Thus the number of interactions between components will be much higher than it would be for a non-OO system. It is important to remember that, for these reasons, some interaction problems will not be apparent until many objects have been implemented and activated, relatively late in the testing process.
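To make the state issue concrete, the sketch below shows what a state-based unit test might look like for a small class loosely inspired by the gas station example developed later in this chapter. Java and JUnit are used purely for illustration; the ParkingSpot class, its lease() operation, and the test names are hypothetical and are not part of the chapter's artifacts.

import org.junit.Test;
import static org.junit.Assert.*;

// Hypothetical class under test: its response to lease() depends on its current state.
class ParkingSpot {
    enum State { FREE, LEASED }
    private State state = State.FREE;

    boolean lease() {                        // succeeds only when the spot is free
        if (state != State.FREE) return false;
        state = State.LEASED;
        return true;
    }

    State state() { return state; }
}

// State-based unit tests: the same message is sent with the object in each relevant state.
public class ParkingSpotTest {
    @Test
    public void leaseSucceedsWhenFree() {
        ParkingSpot spot = new ParkingSpot();
        assertTrue(spot.lease());
        assertEquals(ParkingSpot.State.LEASED, spot.state());
    }

    @Test
    public void leaseIsRejectedWhenAlreadyLeased() {
        ParkingSpot spot = new ParkingSpot();
        spot.lease();                        // drive the object into the LEASED state first
        assertFalse(spot.lease());           // the same message now produces a different response
        assertEquals(ParkingSpot.State.LEASED, spot.state());
    }
}

The point is simply that each state in which the object can legitimately receive a message needs its own test case, which is part of what makes class-level unit testing more expensive than testing a stateless module.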
4. The Example
This section illustrates how developers can use the previously defined design process to build UML artifacts, by describing the design of an example system. The example explains the development of a hypothetical system for a self-service gas station, from the completion of requirements to the beginning of coding. As we proceed through the example, we define and explain the various types of design diagrams created. The gas station in this example allows customers to purchase gas (self-service), to pay for maintenance work done on their cars, and to lease parking spots. Some local businesses have billing accounts set up to receive a monthly bill, instead of paying at the time of purchase. There is always a cashier on duty at the gas station to accept cash payments or perform system maintenance, as necessary. The requirements we are concerned with for the purposes of the example are excerpts from the requirements document describing the gas station control system (GSCS), and describe how the system receives payment from the customer. A customer has the option to be billed at the time of purchase,
or to be sent a monthly bill and pay at that time. Customers can always pay via cash or credit card. Table V describes some concepts about this part of the problem. The functional requirements that were defined for the billing part of the gas station system are as follows. As in any other system development, the initial set of requirements obtained from discussion with the customer may contain some errors that could potentially impact system quality.

1. After the purchase of gasoline, the gas pump reports the number of gallons purchased to the GSCS. The GSCS updates the remaining inventory.
2. After the purchase of gasoline, the gas pump reports the dollar amount of the purchase to the GSCS. The maximum value of a purchase is $999.99. The GSCS then causes the gas pump interface to query the customer as to payment type.
2.1. The customer may choose to be billed at the time of purchase, or to be sent a monthly bill. If billing is to be done at time of purchase, the gas pump interface queries the customer as to whether payment will be made by cash or credit card. If the purchase is to be placed on a monthly bill, the gas pump interface instructs the customer to see the cashier. If an invalid or no response is received, the GSCS bills at the time of purchase.
3. If the customer has selected to pay at the time of purchase, he or she can choose to pay by cash or credit card. If the customer selects cash, the gas pump interface instructs the customer to see the cashier. If the customer selects credit card, the gas pump interface instructs the customer to swipe his or her credit card through the credit card reader. If an invalid or no selection is made, the GSCS will default to credit card payment.
4. If payment is to be made by credit card, then the card reader sends the credit card number to the GSCS. If the GSCS receives an invalid card number, then a message is sent to the gas pump interface asking the customer to swipe the card through the card reader again. After the account number is obtained, the account number and purchase price are sent to the credit card system, and the GSCS and gas pump interface are reset to their initial state. The purchase price sent can be up to $10 000.
5. The cashier is responsible for accepting the customer's payment and making change, if necessary. When payment is complete, the cashier indicates this on the cashier's interface. The GSCS and the gas pump interface then return to the initial state.
TABLE V GLOSSARY
Credit card reader
The credit card reader is a separate piece of hardware mounted at each gas pump. The internal operations of the credit card reader, and the communications between the GSCS and the card reader, are outside the scope of this document. When the customer swipes his or her credit card through the card reader, the card reader reads the credit card number and sends it to the GSCS. If the credit card number cannot be read correctly, an invalid token is sent to the GSCS instead.
Credit card system
The credit card system is a separate system, maintained by a credit card company. The internal operations of the credit card system, and the communications between the GSCS and the credit card system, are outside the scope of this document. The GSCS sends a credit card number and purchase amount to the credit card system in order to charge a customer's account; the credit card company later reimburses the gas station for the purchase amount.
Gas pump
The customer uses the gas pump to purchase gas from the gas station. The internal operations of the gas pump, and the communications between the gas pump and the GSCS, are outside the scope of this document. The gas pump is responsible for recognizing when the customer has finished dispensing gas, and for communicating the amount of gas and dollar amount of the purchase to the GSCS at this time.
Gas pump interface
The gas pump interface is a separate piece of hardware mounted at each gas pump. The internal operations of the gas pump interface, and the communications between the gas pump interface and the GSCS, are outside the scope of this document. The gas pump interface receives a message from the GSCS and displays it for use by the customer. The gas pump interface also allows the customer to choose from a number of options, and communicates the option chosen to the GSCS.
Cashier's interface
The cashier's interface is a separate piece of hardware mounted at the cashier's station. The internal operations of the cashier's interface, and the communications between the cashier's interface and the GSCS, are outside the scope of this document. The cashier's interface is capable of displaying information received from the GSCS. The cashier's interface is also able to accept input from the cashier, including numeric data, and communicate it to the GSCS.
Customer
The customer is the client of the gas station. Only registered customers can pay bills monthly. Name, address, telephone number, and account number are the features that will describe a registered customer.
6. If payment is to be made by monthly bill, the purchase price is displayed on the cashier's interface. The cashier selects an option from the cashier's interface, alerting the GSCS that the payment will be placed on a monthly bill. The GSCS then prompts the cashier to enter the billing account number.
6.1. The customer must give the billing account number to the cashier, who then enters it at the cashier's interface. If a valid billing account number is entered, then the billing account number, purchase price, and a brief description of the type of transaction is logged. If an invalid billing account number is entered, an error message is displayed and the cashier is prompted to enter it again. The cashier must also have the option to cancel the operation, in which case the cashier's interface reverts to showing the purchase price and the cashier can again either receive cash or indicate that monthly billing should be used.
7. To pay a monthly bill, the customer must send the payment along with the billing account number. The cashier enters monthly payments by first selecting the appropriate option from the cashier's interface. The GSCS then sends a message to the cashier's interface prompting the cashier to enter the billing account number, the amount remitted, and the type of payment. If any of these pieces of information are not entered or are invalid, payment cannot be processed; an error message will be displayed, and the cashier's interface will be returned to the previous screen. If the type of payment is credit card, the credit card account number must also be entered, and then the paper credit card receipt will be photocopied and stored with the rest of the year's receipts.
8. Unless otherwise specified, if the GSCS receives invalid input it will send an error message to the cashier's interface. The cashier will be expected to take appropriate action, which may involve shutting the system down for maintenance.

Performance and extensibility, two nonfunctional requirements [60] that can influence decisions regarding low-level design, were also used:
1. The system must always respond to customer input within five minutes.
2. The system should be easy to extend, so that if necessary another payment option (e.g., bank card) can be added with minimal effort.
4.1 Requirements Activities
The problem description and requirements were generated first. Then, the requirements were inspected to ensure they were of high enough quality to
support high-level design and fully represented the functionality needed to build the system. This requirements inspection was done using PBR. For this system we used the customer and tester perspectives because during the inspection we were able to identify defects while producing artifacts capturing the system's functionalities (use cases) and information (test cases) that might be used later for application testing. The use cases in particular were felt to be a useful representation of the system functionality and worth the time required for their creation, since they provide a very understandable format for communication between system designers and the customers. Moreover, use cases feed into later stages of the life-cycle by helping identify objects, develop testing plans, and develop documentation. So, this representation was expected to provide additional benefits for developers as well as to be useful for validating whether the system met the customer's expectations. Applying PBR to the gas station requirements resulted in the identification of some defects. The two perspectives helped the readers find different kinds of defects. First, by using the tester perspective, an omission was found in requirement 5. Domain knowledge indicated that the cashier needs to know the purchase price if he/she is to handle the cash transaction. The tester perspective allowed the reader to see that because all inputs have not been specified, the requirement cannot be tested. Also using the tester perspective, a defect of extraneous information was found in requirement 7. The requirement states that receipts are copied and stored. However, such activity is clearly outside the scope of the system. It cannot be tested for during system test. Using the customer perspective, an incorrect fact was identified in requirement 3. By using domain knowledge the customer recognized that defaulting to credit card payment is an incorrect response. Because this functionality should not have been implemented the way it was described, the defect was categorized as an incorrect fact. Also, the customer perspective helped uncover that requirement 6.1 has an ambiguous description that could result in a number of different implementations. "A brief description of the type of transaction" seems like a reasonable requirement, but exactly what information is stored? What does "transaction type" mean? Purchase of gas/maintenance? Paid in full/partial payment? Paid by credit card/cash/monthly bill? The use cases produced by PBR were the first UML design artifact produced for this project. Figure 5 shows a use case diagram highlighting its components and corresponding description. (There is a wide variation among organizations and even projects as to how formally use cases are specified. In the example system, an English description was used for the system functionality.) The use case diagrams model the system's functionalities by showing descriptive scenarios of how external participants interact
[Figure: use case diagram showing the actor Customer and use cases including Parking, Maintenance, Billing services, Credit card administration, Paying monthly bills, Paying by cash, and Paying by credit card, connected by association, extend, and generalization relationships.]

Specific use case for "paying monthly bills": Customer sends payment and account number to the cashier, who selects the payment option and must inform the system of the account number, amount remitted, and type of payment. If any of this information is not entered, payment cannot be completed (the cashier interface will display a message) and the operation will be cancelled. Types of payment:
1. By cash: the cashier informs the system of the account number and the amount paid.
2. By credit card: the cashier informs the system of the credit card, amount, and account number. The gas station asks the credit card system to authorize payment. If authorization is OK, the payment is made. If the payment is not authorized or fails, the cashier receives a message stating that the payment could not be processed. The cashier may repeat the operation once more before canceling it.

FIG. 5. A use case diagram and the specific description for one of the use cases identified.
with the system, both identifying the events that occur and describing the system responses for these events. Use case diagrams can also represent relationships (association, extend, generalization, and include) among use cases or between actors and use cases. An association is the relationship between an actor and a use case. In the figure an example is the line between "customer" and "billing services". An extend relationship shows that one use case is being augmented by the behaviors of another use case. In the
figure an example of this is the relationship between "parking" and "billing services". The generalization shows that one use case is a specialization of another; in the figure, "paying by cash" is a specialization of "paying monthly bills". The include relationship shows that an instance of one specific use case will also include the sequence of events specified by another use case.

This instantiation of PBR uses the equivalence partitioning technique [8] to read and produce test cases. Test cases help identify defects and provide a starting point for real test cases of the system (so developers don't have to start from scratch at that phase). For example, the test cases in Table VI were produced for requirement 2. The same approach was used to build the test cases for each requirement and identify defects. When readers finished the requirements inspections, the potential defects were evaluated and fixed in the requirements. At this point, there is a choice as to whether the requirements should be re-inspected, depending on the number of defects and problems that were found. For this system, one inspection was deemed to be sufficient. At the end of this phase, a set of fixed requirements, corresponding test cases, and use cases was available to continue the design process.
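Once code exists, such test cases carry over almost directly into executable tests. The following JUnit-style sketch encodes the partitions recorded in Table VI below; the Gscs class, its reportPurchaseAmount() method, and the returned strings are hypothetical placeholders introduced only for this illustration.

import org.junit.Test;
import static org.junit.Assert.*;

// Hypothetical stand-in for the GSCS handling of the purchase amount in requirement 2.
class Gscs {
    static final double MAX_PURCHASE = 999.99;

    // Returns the action taken for a reported purchase amount.
    String reportPurchaseAmount(double dollars) {
        if (dollars < 0 || dollars > MAX_PURCHASE) {
            return "display error at cashier interface";
        }
        return "query customer for payment type";
    }
}

// One test per equivalence class: two valid representatives, two invalid ones.
public class Requirement2Test {
    private final Gscs gscs = new Gscs();

    @Test
    public void validAmountsTriggerPaymentTypeQuery() {
        assertEquals("query customer for payment type", gscs.reportPurchaseAmount(100.00));
        assertEquals("query customer for payment type", gscs.reportPurchaseAmount(13.95));
    }

    @Test
    public void negativeAmountIsRejected() {
        assertEquals("display error at cashier interface", gscs.reportPurchaseAmount(-5.00));
    }

    @Test
    public void amountAboveTheMaximumIsRejected() {
        assertEquals("display error at cashier interface", gscs.reportPurchaseAmount(1000.00));
    }
}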
TABLE VI
Requirement number: 2
Description of input: Dollar amount of purchase
Valid equivalent sets: Valid dollar amounts
  Test cases: $100; $13.95
  Test result: Instruct gas pump interface to query for payment type
Invalid equivalent sets: Negative dollar amounts; dollar amounts > $999.99
  Test cases: -$5.00; $1000.00
  Test result: Display error at cashier interface

4.2 High-Level Design Activities

At this point, all the concepts specified by the requirements had been reviewed and a high-level representation of the functionalities had been identified and modeled in the use case diagrams. The next step was to organize those concepts and functionalities using a different paradigm. From this point until the end of integration testing, all the activities were driven by the object oriented paradigm. The main issue was: how problem
features (concepts and functionalities) could be classified and organized to allow developers to design an object oriented solution for the problem. The use cases and requirements descriptions are the basis for the high-level design activities (see Fig. 2). They represent the domain knowledge necessary to describe the features and make them understandable to the development team. The static and dynamic properties of the system need to be modeled. Moreover, use cases and requirements descriptions are used in design inspection because they represent the truth about the problem, and allow the solution space to be limited for the developers.

As we stated in Section 3, UML artifacts can be built in any order. Developers can tailor a development process to fit their needs. In our example development process, a draft class diagram was produced first in order to explore the organization of the domain concepts. Class descriptions were produced and enhanced through all the design activities, regardless of the type of artifact being built. Therefore, a first draft of the class description was also produced. Subsequent activities refined the class diagram and described the functionalities that had to be part of the solution. In doing so, developers explored a design perspective focused on the domain concepts rather than just the functionalities of the system. We have observed that the use of such an approach drives designers to model essential domain elements, independent of the solution, making them more suitable for future reuse. This initial picture of the basic elements of the problem gave designers the ability to model the functionalities and understand how external events impact the life-cycle of the objects and system. It was also possible to identify the different chunks of the problem, allowing the identification of different class packages. Classes were grouped into packages by their structure or the type of services they offered. By doing so, developers obtained information that was used to improve the definition of the application architecture, for instance, grouping all classes that contain interfaces for external systems into one package. Next, the dynamic behavior of the system was modeled by creating the sequence and state diagrams. Using this information, the class diagram was then refined. All along the way, the class descriptions were updated to be consistent with the information in the other diagrams. Finally, the package and activity diagrams were created. Each of these will be explained in more detail below.

Once UML design artifacts are built, developers can apply inspections to verify their consistency and then validate them against the requirements and use cases used to define the problem. Object oriented reading techniques (OORTs) were used to support inspections of UML high-level design artifacts in the context of this design process.
After finding and fixing high-level design defects, which sometimes can imply some modifications in the requirements and use cases, developers had a complete set of quality design artifacts representing the framework to be used for continuing design. The next steps included dealing with low-level issues, such as decisions regarding support classes, persistence and user interface, and also restrictions imposed by the implementation environment. The models produced guided coding and testing activities. The following sections describe this process step by step, showing the artifacts that were produced and how inspections were applied to ensure quality and reduce the number of design defects.
4.2.1 Step 1: Class Diagrams and Class Descriptions
Using the repaired requirements descriptions and use cases, the designers needed to identify and extract the different domain features that would compose the design model. At this point, there is always a choice of approaches for how to proceed. One option for developers is to apply an organized and systematic technique to capture such features, in which a first draft of the models is produced using linguistic instruments to identify the semantics and syntax involved in such documents [61]. Another option is to use a more relaxed approach, where some linguistic issues are considered but without the same level of detail or formalization [9]. In this example, designers used the Rumbaugh approach to look for nouns, verbs, and adjectives used in the requirements and use cases to get a first idea about the classes, attributes, and specific actions. Identifying the nouns gave designers initial candidates for the system classes.

The class diagram was then created to model this initial class structure. This diagram captures the vocabulary of the problem. It defines the concepts from the domain that must be part of the solution and shows how they are related. The expected result is a model of the system information with all the features (attributes and behaviors) anchored to the right classes. These features delimit the objects' interface, acting like a contract for the class [6]. By identifying the actions for each object type and the visibility of those objects, the class diagram clearly defines how objects can communicate. Figure 6 shows an example of the initial class diagram for the gas station control system; identifications of the basic diagram elements were inserted to show some of the different types of information that can be modeled, such as classes, aggregations, associations, and simple inheritance. (The complete set of artifacts for this system, including requirements descriptions, can be found at http://www.cs.umd.edu/projects/SoftEng/ESEG.)
[Figure: initial UML class diagram for the gas station domain, showing classes such as Services, Parking, Car Maintenance, Parking Spot, Purchase, Bill, Registered Customer, Product, Gas, Part, Inventory, Message, Periodic Messages, and Warning Letters, plus interfaces to the Gas Ordering System and Parts Ordering System, with their attributes, associations, multiplicities, aggregations, and inheritance relationships.]

FIG. 6. An example of a UML class diagram.
Classes (representing the object types) and their static relationships are the basic information represented by class diagrams. A class representation has a name, encapsulated attributes, and behaviors, together with their restrictions (constraints). The main static relationships for classes are subtype (e.g., Gas is a Product) and associations, which can be expanded to acquaintances (e.g., a Registered Customer makes a Purchase), aggregation (an Inventory consists of Products), and composition (e.g., a Purchase is part of a Bill).

Subclasses (subtypes) are modeled by specialization. An inheritance mechanism asserts that a subclass inherits all the features from the superclass. This means that all the attributes (data structures), behaviors (services), and relationships described in one class (the superclass) are immediately available to its subclasses. Single (one superclass) and multiple (more than one superclass) inheritance are both possible. In the example, because the class "Parking" is a subclass of "Services", it contains the attribute "Discount_Rate" and it also receives the ability to communicate with the "Purchase" class. Abstract classes are used to improve the model by clarifying the classification and organization of the classes. However, abstract classes cannot be instantiated.
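As a rough code-level reading of this part of the diagram, the Java sketch below shows how the subtype relationship could eventually be realized. The attribute and class names follow Fig. 6, but the method bodies and the 5.00 base price are illustrative assumptions rather than the chapter's low-level design.

// Parking is a Services subclass, so it inherits Discount_Rate and the price() contract.
abstract class Services {
    protected float discountRate;            // Discount_Rate attribute from the class diagram

    abstract float price();                  // each concrete service prices itself
}

class Parking extends Services {
    private final float basePrice = 5.00f;   // Price attribute shown for Parking in Fig. 6

    @Override
    float price() {
        return basePrice * (1 - discountRate);   // the inherited attribute is used directly
    }
}

Anything a client can do with a Services reference (for example, ask for its price()) it can also do with a Parking instance, which is exactly what the subtype relationship in the diagram promises.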
Associations represent the relationships among instances of a class (e.g., a Registered Customer makes a Purchase), but conceptually they are shown as relationships among classes. Roles and the number of objects that take part in the relationship (association multiplicity) can be assigned to each side of the relationship. A Registered Customer can make 0 or more purchases, but each purchase can be made by only one customer. Relationships can also be used to represent navigability perspectives (e.g., to give an idea about which object is the client) and to specify the visibility of the objects.

At this point in the design process, designers realized that the reasons behind their decision to create a Registered Customer class might be lost if more information about this class and its attributes were not captured. As a result, the class description was created and more information was inserted. This included the meaning of the class as well as behavior specifications and details about relationships and multiplicity. It could also include the formalization of constraints and restrictions using the object constraint language (OCL), the specification of interfaces and communications such as those defined by the CORBA standard, and the mapping of UML models to XML provided by the XMI specification [14]. Figure 7 shows the class description example generated for the Registered Customer class. Words in bold represent the template fields used on this project. Words in italic are types for the attributes. There is one description like this for each class in the class diagram.
Class name: Registered Customer
Documentation: The customer is the client of the gas station. Only registered customers can pay bills monthly.
External documents defining the class: Glossary
Cardinality: n
Superclasses: none
Interface:
Attributes:
  name: text. It represents the name of a customer. First + last name is the normal way that a name is represented.
  address: text. This is the mail address for the customer. An address should be described using the following format: 1999 The Road Ave. apt. 101, Gas Station City -- State -- Zip Code.
  account number: long. The customer account number is the numeric identification of the customer in the context of the gas station control system.
  phone number: long. A phone number for the customer: (area code)-prefix-number.
Operations: none
FIG. 7. An example of a class description.
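Because the class description already pins down names, types, and meanings, it maps almost mechanically onto a class skeleton. The Java rendering below is a hypothetical illustration of that mapping (String and long stand in for the description's text and long types); it is not one of the chapter's artifacts.

// Skeleton derived from the Registered Customer class description in Fig. 7.
public class RegisteredCustomer {
    private String name;           // first + last name
    private String address;        // mail address, e.g., "1999 The Road Ave. apt. 101 ..."
    private long accountNumber;    // numeric identification within the GSCS
    private long phoneNumber;      // (area code)-prefix-number

    // Operations: none, mirroring the current state of the class description.
}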
The class description artifact complements the information described by the other diagrams. It holds detailed descriptions of classes, attributes, behaviors, associations, and consequently all the other important characteristics that need to be described but cannot be captured by the other diagrams. It acts as a data dictionary; it is not formally defined by the UML standard, because it is an implicit document produced during design. Because there is no standard template for this artifact, one can be derived directly from the diagrams to ensure consistency. The class description evolves throughout the design process and at the end must hold all the information necessary to begin low-level design.
4.2.2 Step 2: Interaction and State Diagrams

At this point, developers already had a better view of the problem. They modeled the use cases and produced a first class diagram. The class diagram does not show how the behaviors can be put together to accomplish the functionality. Most behaviors are the basic functions for the classes. So, designers needed to understand the functionalities that are required and how to combine the behaviors to accomplish them. The relationships on the class diagram specify only which objects are related to one another, not the specific behavior or message exchanged between them. To continue development, it was necessary to show how information should flow among the objects, represented by the sequence of messages and the data they carry.

The use cases contained the information the developers needed to describe system functionality. However, they described functionalities and events using an informal notation. They do not clearly represent how objects interact and which messages are exchanged, but merely describe the different scenarios. Aside from functionality, conditions and constraints are also described by use cases, and more explicitly, by the requirements descriptions. A condition describes what must be true for the functionality to be executed. A constraint must always be true regardless of the system functionality.

Interaction diagrams met the needs of developers at this point because they represent system functionalities more formally. These diagrams model the use case scenarios including their conditions and constraints. Basically, interaction diagrams provide a way of showing which messages are exchanged between objects to accomplish the services and functionalities of the system. Typically, each use case has one associated interaction diagram. However, for complex use cases more than one interaction diagram can exist, breaking down the use case complexity while capturing the messages and services necessary to represent the whole functionality.
There are two forms of interaction diagrams: sequence and collaboration. Sequence diagrams show the flow of messages between objects arranged in chronological order. Normally, an actor or an event initiates a message sequence. Some designers suggest that sequence diagrams are the best form of interaction diagram to use when modeling real-time specifications or complex scenarios [14]. Figure 8 shows an example of a sequence diagram for the scenario (pay by cash) represented in Fig. 5. Object life-lines, each a box at the top of a dashed vertical line, represent objects. An object life-line holds all interactions over time in which the object participates to achieve a particular functionality. An object that shows up in different sequence diagrams has different life-lines, one for each sequence diagram. For example, "Bill" could show up in this sequence diagram as well as in "Paying by Credit Card". An arrow between object life-lines is a message, or stimulus (labeled with at least a name). An example is "get_bill()", where the origin is "Registered Customer" and the receiver is "Gas Station". Each message is usually associated with one behavior encapsulated by the receiving object's class. Additional modeling situations that can be captured include self-delegation (an object sends a message to itself, such as IsClientRegistered(account number)), conditions (to indicate conditions that must be observed to send or receive a specific message, such as [information is OK]), and iteration markers (indicating that a message can be sent several times to multiple receiver objects).
" II Cashier-terminal
Gas 9 Station I
I
BillBill
]
pay_molethlybycash(accountn~unber, amount) / monthlybill_cash(a~~nt number, amount)
IsClientRegistered (account number)
I get_bill( )
update_bill
mount, payment date
1display_message(text) -
Object's life-line
I I;~176
I[inlormationis OKI Time
FIG. 8. An example of a sequence diagram.
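Read as code, the interaction in Fig. 8 corresponds roughly to the nested method calls sketched below. Class and method names mirror the diagram labels, but the bodies and types are invented for illustration; the call back to the terminal (display_message) is collapsed into a return value to keep the sketch short, and the get_bill() message is omitted because, as discussed later, the inspections end up flagging it as a defect.

import java.time.LocalDate;

class Bill {
    void updateBill(double amount, LocalDate paymentDate) {
        // record the payment against this bill (update_bill in the diagram)
    }
}

class GasStation {
    private final Bill bill = new Bill();

    // monthlybill_cash(account number, amount) in the diagram
    String monthlyBillCash(long accountNumber, double amount) {
        if (!isClientRegistered(accountNumber)) {        // self-delegation: IsClientRegistered(...)
            return "unknown billing account";
        }
        // guard [information is OK] holds
        bill.updateBill(amount, LocalDate.now());        // update_bill(amount, payment date)
        return "payment recorded";                       // becomes display_message(text)
    }

    private boolean isClientRegistered(long accountNumber) {
        return accountNumber > 0;                        // placeholder check
    }
}

class CashierTerminal {
    private final GasStation station = new GasStation();

    // pay_monthlybycash(account number, amount) arrives from the customer
    void payMonthlyByCash(long accountNumber, double amount) {
        System.out.println(station.monthlyBillCash(accountNumber, amount));
    }
}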
The order of messages in the sequence is represented by their position on the object life-line, beginning at the top.

Collaboration diagrams are a combination of sequence diagrams with object diagrams and display the flow of events (usually associated with messages) between objects. They show the relationships among the objects and the different roles that objects play in the relationships. Figure 9 displays the collaboration diagram for the sequence diagram of Fig. 8. Objects, in this case, are represented as boxes, for example "Gas Station". The arrows indicate the messages (events) sent within the given use case, for example "display_message(text)". Message order is shown by numbers instead of by position as in sequence diagrams. So, "1:pay_monthlybycash" is followed by "2:monthlybill_cash", and so on.

Interaction diagrams are adequate for capturing the collaboration among objects. Regardless of the form used, designers can see which objects and messages are used to capture the functionalities of the system. However, functionalities are represented in an isolated manner; each scenario is described without considering the others. But some systems are heavily affected by external events. For these systems, some classes have very dynamic instances. This dynamism occurs because of the changing states of the objects due to external events. The designers decided that this system fell into this category, and because of this they found it difficult to understand each specific class by itself. Because the object states can impose constraints on the use of the object, they can affect the functionality of the system. In these cases, modeling of the different situations that change the state of the object is worthwhile. This model helps identify the events and the actions that cause state changes.
[Figure: collaboration diagram for the same scenario, showing the objects Registered Customer, Cashier terminal, Gas Station, and Bill with numbered messages such as 1:pay_monthlybycash, 2:monthlybill_cash, 3:IsClientRegistered(account number), 5:update_bill(amount, payment date), and 6:display_message(text).]

FIG. 9. A collaboration diagram.
It supports the comprehension of an object's life-cycle and the specification of the constraints that will drive objects to be in a consistent state. The designers decided that this was necessary information for this system, so they created the UML diagram that captures the states of an object and its events, called a statechart [62]. Figure 10 shows a statechart for the inventory class. This diagram describes the basic state information: the states themselves (the cumulative results of the object behavior, or the collection of attribute values for a specific instance, such as "ordering gas" and "ordering parts") and transitions (a system event, or external occurrence, that induces a system or object state change, such as "gas delivery" or "parts delivery"). Actions can also be associated with states. In this way, designers can specify which action is triggered when the object reaches a specific state, keeps the current state, or leaves the current state. For example, when the state "low gas" is reached the action "order gas" should be triggered. Constraints (guards) are normally associated with transitions, showing when objects can change from one state to another. In this case, when the quantity is less than some minimum value, the "low gas" state is entered. Additionally, nesting can break down inherently complex states when necessary. It is common to find projects that do not make use of state diagrams. However, when these diagrams are used, their combination with interaction diagrams represents a way to identify testing scenarios and prepare complementary test cases for the system [63, 64].

[Figure: statechart for the Inventory class, with states such as "low gas", "low parts", "ordering gas", "ordering parts", and a good-stock state; transitions triggered by the events "gas delivery" and "parts delivery"; and guarded actions such as [quantity < minimum] / order gas.]

FIG. 10. A statechart.
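A statechart such as the one in Fig. 10 can be approximated quite directly in code. The sketch below covers only the gas-related part of the diagram; the state names and the [quantity < minimum] / order gas rule come from the statechart, while the fields, method names, and the ordering stub are hypothetical.

// Minimal state machine for the gas side of the Inventory statechart.
class GasInventory {
    enum State { GOOD_STOCK, LOW_GAS, ORDERING_GAS }

    private State state = State.GOOD_STOCK;
    private float quantity;
    private final float minimum;

    GasInventory(float initialQuantity, float minimum) {
        this.quantity = initialQuantity;
        this.minimum = minimum;
    }

    // Event: gas is dispensed at a pump.
    void gasDispensed(float gallons) {
        quantity -= gallons;
        if (state == State.GOOD_STOCK && quantity < minimum) {   // guard [quantity < minimum]
            state = State.LOW_GAS;
            orderGas();                                          // action triggered on entering "low gas"
            state = State.ORDERING_GAS;
        }
    }

    // Event: the ordered gas arrives ("gas delivery" transition).
    void gasDelivered(float gallons) {
        quantity += gallons;
        state = State.GOOD_STOCK;
    }

    private void orderGas() {
        // notify the gas ordering system (left as a stub here)
    }

    State state() { return state; }
}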
4.2.3 Step 3: Refining Class Diagrams

At this point, designers have a first draft for most of the diagrams and specifications for the functionalities. However, the models are not completely defined and specified, since typically additional services or
messages will need to be used to model the dynamic view. This is normal and represents how designers are organizing the design, finding new features, grouping behaviors, or even identifying new services that are necessary or make sense when the objects are combined to extract the required functionalities. New classes can show up during dynamic modeling to hold new services. The messages that are used by the objects and the information they carry can be identified. The object's interface to the external world is defined and must be represented in the class diagram and fully specified in the class description. For the gas station example, some results from this step can be seen in Fig. 11.
[Figure: refined class diagram for the gas station control system, now including a Gas Station class and interface classes for the external Gas Pump and Credit Card System, with attribute types and method signatures (e.g., for Services, Purchase, Bill, Registered Customer, Product, Inventory, and Message) specified.]

FIG. 11. A refined class diagram for the gas station control system.
New classes were inserted based on services used to represent required functionalities (e.g., Gas Station). Attribute types and method signatures were identified (e.g., Bill features). Moreover, some actors needed to be represented as classes (e.g., interfaces to the external actors) to allow for communication between the system's objects and external participants (e.g., Gas Pump and Credit Card interfaces). As expected, new information was described in the class description, to better specify the new behaviors, attributes, and other features.

After dynamic modeling of the GSCS, designers decided to prepare a first set of product measures. With the class diagram updated, the classes were expected to be more stable at this point, with most of the functionality described. These measures were expected to be useful during low-level design as a way to identify complex classes or class hierarchies, and thus to provide some guidance about which part of the design structure must be modified to reduce structural design complexity [65, 66] or to identify the classes that are more fault prone [49]. Section 3.2.1 described some metrics that can be used to measure the product. The values determined for the metrics WMC, DIT, NOC, and CBO for some classes of the GSCS (recall Fig. 11) are shown in Table VII.
TABLE VII GSCS METRICS AND VALUES

Class name             WMC  DIT  NOC  CBO
Services                 1    0    3    1
Refuel                   1    1    0    2
Car maintenance          1    1    0    2
Product                  1    0    2    1
Gas                      1    1    0    2
Part                     1    1    0    2
Registered customer      2    0    0    4
Gas station              9    0    0    5
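Two of these metrics can even be computed mechanically from compiled classes. The sketch below is a rough illustration only: it treats every method as having unit complexity (so WMC reduces to a method count) and measures DIT as the number of superclasses below java.lang.Object; NOC and CBO need information about the whole set of classes and are therefore omitted.

// Rough reflection-based calculators for two of the Table VII metrics.
final class DesignMetrics {

    // WMC with every method weighted 1: just count the declared methods.
    static int wmc(Class<?> c) {
        return c.getDeclaredMethods().length;
    }

    // DIT: number of superclass links below java.lang.Object.
    static int dit(Class<?> c) {
        int depth = 0;
        for (Class<?> s = c.getSuperclass(); s != null && s != Object.class; s = s.getSuperclass()) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        // Any compiled class can be measured; java.lang.String is used here only as a handy example.
        System.out.println("WMC(String) = " + wmc(String.class) + ", DIT(String) = " + dit(String.class));
    }
}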
Having all these artifacts, developers had now completed a broad view of the problem. Although it was not possible to guarantee that the design was complete, decisions about packaging and internal structuring could take place. But before that, object oriented reading techniques were applied to identify possible defects in the design and ensure that such decisions were made on a sound basis. As discussed in Section 3.2.2, object oriented reading techniques (OORTs) are a set of reading techniques that have been tailored for defect detection in high-level object oriented design documents. Horizontal reading ensures that all the design artifacts represent the same system and tends to find more defects of inconsistency and ambiguity. For instance, when the diagrams from Figs. 8 and 11 were inspected, at least three possible defects (discrepancies) were found using a horizontal reading technique (class diagram against sequence diagrams):
1. the gas station class has no representation for the message get_bill();
2. there is no Cashier terminal class described in the class diagram;
3. the gas station class sends a message to an object that does not seem to exist.
The first discrepancy is an inconsistency. If the gas station object receives the get_bill() message, it means that its class must have a description for such a message. Developers could not be sure whether or not the message get_bill() was appropriate and necessary in the context of the system, but the OORT did raise the question by highlighting the discrepancy between the diagrams. Discrepancies 2 and 3 are potential defects, but it was not possible to know if they were real defects just by using the information in the design artifacts. Designers reviewed the discrepancy list as a team to discuss which ones were real defects and needed to be fixed prior to vertical reading. Discrepancies that were not seen as real defects were good candidates for further evaluation by vertical reading after the high-level design was completed.
4.2.4 Step 4: Package and Activity Diagrams

The development of complex systems demands significant management effort. The identification of the parts of the system that will be implemented and their distribution among the development team is one of the important management tasks. UML provides a diagram, the package diagram, that can be used to represent the high-level grouping of classes. Designers can group classes by different characteristics such as structure, hierarchy, or even functionality. These diagrams are used to represent how designers broke down a large and complex system into smaller units, clustering the classes in well-defined parts, called packages. To guarantee that the information about the dependencies between classes is not lost, package diagrams allow for the representation of such dependencies between the packages.

For the GSCS system, after dynamic modeling and the associated changes to the class diagram, the system was felt to be of sufficient complexity for package diagrams to be useful. Figure 12 shows the package diagram created for the system. Usually, a dependency between two classes exists if changes to the definition of one class may cause changes to the other one. In this case, dependencies can be listed as: (1) a class (object) sends a message to another class; (2) a class has another class as part of its data; and (3) a class mentions another one as a parameter to one of its behaviors. When any two classes from different packages have dependencies between them, the packages
that hold them also have a dependency. For large projects this is vital information because developers can use it to identify which packages must be integrated or will be impacted when modifications are introduced. Developers can control the level of granularity at which they are grouping and defining packages, for instance, by nesting packages within other packages. Dependencies between packages support design decisions. For example, based on dependency information developers can identify which packages and parts of the static design must be rearranged to reduce system coupling. Also, by identifying the classes that can be grouped, developers can produce useful information for testing activities, for example to bind a set of system concepts that may be tested together. Information regarding package contents and visibility may be shown by the diagram.

[Figure: package diagram grouping the GSCS classes into packages such as Services (Refuel, Parking, Car_Maintenance), Customers (Registered Customer, Purchase, Bill, Message, Periodic Messages, Warning Letters), Products (Gas, Part, Product, Inventory), and a package of external-system interfaces (Gas Pump, Credit Card System, Gas_Ordering System, Parts_Ordering System), with dependencies drawn between the packages.]

FIG. 12. An example of a package diagram.

Activity diagrams can be more useful for certain types of systems, mainly those that involve multiple threads, and need to detail which internal events will happen after the system receives an external event from an actor. Different from state diagrams, activity diagrams represent the services that must be accomplished when internal events happen. They can be used to model the activity flow of services (functionality). Despite this difference, activity diagrams have some similarities to statecharts. Activities are similar to states, while events are similar to transitions. They are useful to represent functionalities that involve more than one class and set of services. Activity diagrams are useful when modeling business processes. An example of this diagram with some of its elements can be found in Fig. 13.
[Figure: activity diagram for inventory replenishment, with a fork into parallel gas and parts ordering activities ("order gas", "wait for gas", and the corresponding parts activities), the events "gas delivered" and "parts delivered", a join, and an "adjust stock" activity before the end state.]

FIG. 13. An activity diagram.
At this point, all the UML high-level design artifacts had been produced. Horizontal reading was applied and the discrepancies identified as defects were solved and fixed. However, some of the reported discrepancies were still unresolved. Vertical reading was then applied to ensure that design artifacts represented the same system described by the requirements and use cases. Because documents from different life-cycle phases, using different levels of abstraction and detail, were compared, vertical reading found more defects of omission and incorrect fact. For example, horizontal reading had determined that the sequence diagram from Fig. 8 was consistent with all the other high-level design artifacts. Applying a vertical reading technique (sequence diagrams against use cases) it was read against the use case represented in Fig. 5. This sequence diagram intends to capture the use case "pay by cash", which is a specialization of the "pay monthly bills" use case. Although representing the same functionality, the reading technique was needed to compare the diagrams due to the different levels of abstraction used. Use cases normally describe the scenarios using high abstraction. Readers need to explore these differences when inspecting the documents. Two different scenarios can be seen in the context of the use case. For instance, the use case scenario "Customer sends payment" was captured by the message pay_monthlybycash(account number, amount). The sequence
of messages monthlybill_cash, IsClientRegistered, update_bill, and display_message captured the scenario "Cashier verifies information and receives payment". However, the message get_bill() did not seem to be part of the use case (a customer already has the bill before paying it) and should not be part of the sequence diagram. This message had been marked as a discrepancy when horizontal reading was applied. The use of the vertical reading techniques identified that it was a real defect that must be fixed before low-level design. After all the high-level UML design artifacts were inspected and the defects fixed, a stable description for the problem was ready. These models were important during low-level design, the next design activity.
4.3 Low-Level Design Activities
As stated earlier, one of the benefits of UML is that designers use the same set of constructs to represent the different aspects of the system, making the transition from requirements to high-level design to low-level design a smooth process. The results from the high-level design compose the problem domain design. They must be modified and extended to include all technical restrictions imposed by the different resources (e.g., development environments, programming languages, computer architectures, and so on) that will be used to build the software and by the nonfunctional requirements, such as performance, usability, and maintainability. One necessary modification may be caused by the reuse of earlier projects and coded classes, which imposes a reorganization of the current classes and relationships. Another could be the level of inheritance that must be observed due to characteristics of the programming language. Low-level issues, such as the definition of low-abstraction classes or even the detailed description of an algorithm, are important for this design phase.

The types of diagrams produced in low-level design are basically the same ones that were produced for high-level design. The main difference is the level of detail. Moreover, new classes will show up to deal with persistence, management, and user interface issues. Methods will have their signatures fully described, and the models will be prepared in such a way that programmers can use them as the basis for coding and testing activities, including the specification of the components and the different devices that will compose the final solution.

Different software components support the description of the solution for a problem. Each component normally represents a physical module of code. Although a package can hold more than one component, components can be viewed as a low-level package representation. Sometimes a class that
belongs to a specific package can be present or used in different components depending on the type of functionality the developer is representing. A component diagram shows the components implemented in the system, together with their communication lines (dependencies). Dependencies highlight the coupling among components and specify which interface of a component is being used. This diagram is normally produced when designers have a clear definition of the solution and have already defined the architecture for the software. The information captured by component diagrams supports the identification of system integration interdependencies. This specific type of information will be useful when planning testing and defining delivery and maintenance priorities. Figure 14 shows an example of a component diagram.

Component diagrams represent system organization from the software perspective. For some system types (e.g., Web-based applications, distributed applications) the physical relationship between software and hardware plays an important role in describing how the system will be distributed and organized for deployment. UML provides deployment diagrams to describe this type of information. By representing the different pieces of hardware (nodes, such as devices, sensors, computers, and so on) and the communication paths (connections) between nodes, deployment diagrams help to clarify the existing integration features, allowing for the identification of possible communication bottlenecks or of demands for specific infrastructure for testing and evaluation. Some designers suggest combining component and deployment diagrams, showing the components inside the corresponding nodes.
[Figure: component diagram showing the system's software components and the dependencies between them.]

FIG. 14. An example of a component diagram.
Doing so, they get the benefits of both representations using just one diagram. Figure 15 shows a deployment diagram in its standard form.

[Figure: deployment diagram with nodes for the gas station processor, the gas ordering system, and the parts ordering system, and the connections between them.]

FIG. 15. A deployment diagram.

There is much written describing low-level design activities; to produce a complete list here is not feasible. However, the reader who is motivated by these discussions can find some of the following works useful. A classical text about OO design and the use of design patterns to describe software solutions can be found in Gamma et al. [67]. It describes useful design concepts and suggests patterns for the different design situations. The discussions are focused on the structural organization of the models. Although it does not use UML as the modeling language (the authors used OMT, one of the UML roots), their descriptions are clear enough to allow an immediate mapping to UML. A complementary text can be found in Buschmann et al. [68]. Meyer [6] has an in-depth discussion about basic types and objects, exploring different issues such as management and persistence. Henderson-Sellers [65] raises some useful discussions about design complexity and suggests some ways to use metrics to identify high structural complexity design parts.
5. Maintenance or Evolution
The process model shown in Fig. 1 has a definite end point for the development process (delivery to the customer). This is a useful abstraction, but currently in industry very few systems are built in their entirety and then
shipped to the customer. Most software development involves some type of incremental development, or enhancement-type maintenance. Incremental development can be defined as a process whereby the system is broken into smaller pieces and the goal of each release is to add a new piece to the software. For more information on incremental development see Pressman [7] and Pfleeger [4]. In these development or maintenance cases, software development takes place in the presence of some set of reusable assets. We use the term reusable assets to refer to the set of artifacts in existence from any operational system that needs to be expanded, improved, updated or otherwise modified. The system could be some sort of a legacy system [69] where the system is many years old, a current system of which the next release is being produced, or anything in between. The reusable assets will consist of the artifacts from all of the life-cycle phases mentioned previously. This includes, for example, a requirements document, design documents, and code. The goal in incremental development or maintenance is to perform the given task with the least amount of effort while reusing as much as possible from the set of reusable assets. In this section, we discuss a modified form of the software process that takes advantage of UML and meets those needs.
5.1 Process
The software development process from Section 1 must be augmented to allow developers to decide what pieces from the set of the reusable assets are candidates to be reused in the new or evolved system. These pieces could include requirements, design, code, or test plans. Specifically, activities must be added to the process that allow for the understanding of the reusable assets. Figure 16 shows the new development process. The activities above the dotted line are the same ones that appeared in Fig. 1. The activities that appear below the dotted line have been added for the new process.
[Figure: the development process of Fig. 1 (new requirements, requirements specification, high-level design, low-level design, coding and testing, source code) augmented with maintenance activities: PBR with the customer perspective for evolving the requirements descriptions, OO vertical reading for evolving the high-level design artifacts, and identification of integration mismatches against the low-level design artifacts, object files, and classes of the reusable assets.]

FIG. 16. Inserting maintenance activities in the software process.
The new activities have been added to take into account the reusable assets when developing the new or evolved system. Before a developer can decide whether or not to use any of the reusable assets, they must first understand those assets. We have found that the UML diagrams can be very useful in this process of understanding.
5.2 Understanding
As stated earlier, the goal of the understanding activities is to support the comprehension of the reusable assets. In order for a developer to effectively use the existing requirements or design, s/he must first understand the artifacts themselves, as well as what needs to be done. These understanding activities are important to developers because they allow for the identification of the important pieces from the set of reusable assets that can be (re)used for the new system. Understanding activities should be used for three tasks within the software life-cycle: (1) to understand the old requirements, (2) to understand the old design; and (3) to determine what the integration mismatches are. When a developer is evolving or maintaining a system, in general there will be more requirements retained from the old system than new ones added. Because of this, it makes sense to reuse as much of the existing requirements as possible. But before a developer can effectively reuse those requirements they must be understood. When examining the old requirements, developers strive to understand which domain level concepts are present in or absent from those requirements in the context of the new requirements that have been received. The main goal here is to determine, on a high level, if the old set of requirements is in conflict with the new requirements, or if the old requirements can simply be modified to include the new requirements. If the developer can efficiently reuse the existing requirements, it could have a positive effect on the cost and effort of the new system. Likewise, once the developers have created the new set of requirements, they must examine the existing design with the goal in mind of understanding that design. When developers examine the old design, they need to determine if the way that the domain was represented will allow the new requirements to be added. The old design must be understood from the point of view of whether the earlier design decisions conflict with a newly added requirement, or if the old design allows itself to be modified to include the new requirements. By reading the old design for understanding, the developers acquire enough knowledge to create a design for the new system, reusing as much of the old design as possible. Again, if this reuse is efficient, it could have a positive effect on the cost, effort, and complexity of the new system.
Integration mismatches are a way to describe problems that may be encountered when using commercial off-the-shelf software (COTS). In this case, the reusable assets can be viewed as COTS products. Therefore, the developers must determine if the new design and the reusable assets are compatible. For a more complete discussion of this topic see Yakimovitch et al. [70].

Although these types of understanding activities exist in most software life-cycles, developers traditionally have accomplished them with little or no guidance. This means that developers normally examine the artifacts in an ad hoc fashion and try to identify the features which they consider to be important. From earlier research, we have seen that, in general, developers tend to be more effective at performing an inspection or reading task when using a technique rather than doing it in an unguided fashion. The reason for this is that the techniques help the reader to better focus on the task at hand, and to accomplish the important steps necessary for its successful completion.

Earlier in this text we mentioned two techniques, PBR and OORTs, that were designed for use in the process of defect detection and defined to be useful with the UML artifacts. As we looked more closely at this issue and re-examined the techniques, we realized that, with slight modifications, those techniques could also allow developers to understand the document(s) being inspected, rather than to find defects in them. In the following sections we will discuss how PBR and OORTs can be used in the process of understanding for evolving a system.
5.2.1 PBR
The necessary starting point in this process is the requirements. This is because the requirements determine, at the highest level, what pieces of the reusable assets can be reused in the new system. When developing the next version of a system, two important artifacts are present: (1) the set of requirements describing what is to be added to the system, and (2) the requirements and use cases from the existing reusable assets. To begin determining how much of the reusable assets can be reused, the developer must examine these two sets of requirements and use cases to find out where potential discrepancies lie. This examination can be done by reading the documents. This reading is usually performed in some ad hoc manner. The perspective-based reading techniques that were discussed in Section 4.1 for examining a new set of requirements can also be used in this understanding process. A developer can take the new set of requirements and read these against the requirements
and use cases from the set of reusable assets to determine where the discrepancies lie. The goal is to come up with a new set of requirements that contains both the old and the new requirements. Here only the customer perspective of PBR is used, because the type of information gathered by this perspective most closely resembles the type of information that is necessary to evolve the requirements. The main task of the reader is to determine what has to be "fixed" or changed so that the new requirement can be added to the existing system.

There are several kinds of situations the reader looks for. Using PBR's customer perspective, the reader first examines the participants. It must be determined whether the participants in the new system are the same as those in the old system. For example, new participants may need to be added, names of participants may need to be changed for consistency, or the way that a participant interacts with the system may change. The next thing that readers look at is product functionality. The reader examines the existing use cases to determine if they cover the new functionality. If they do not, then either new use cases must be created, or the old use cases must be modified to handle the new functionality. The reader also needs to determine whether the relationships between the participants, discussed above, and these product functions remain the same in the new system, or whether changes must be made. The result of this process is a list of discrepancies that aids the developers in integrating the new requirements with the set of requirements taken from the reusable assets.
5.2.2 OORTs

Once a developer has come up with a new set of requirements describing the new system, the next step is to create a design for this system. Again, this is typically done in some sort of ad hoc fashion. Here developers are interested in doing similar tasks to what was accomplished in the requirements phase. Now that a set of requirements has been created that includes the new requirements, the UML design documents from the set of reusable assets must be read against this set of requirements to determine where the discrepancies lie. When we speak of discrepancies here, we are referring, at an abstract level, to the information gap between the old and the new system. This may include the fact that new classes must be added to the design to support the new requirements; or, new attributes and behaviors may need to be added to existing classes to support the new functionality. The way that classes interact and pass messages may also need to be changed to support the structure of the new system.
The object oriented reading techniques, and more specifically, the vertical reading techniques discussed in Section 3.2.2, are well suited for this task. The reason that vertical reading can be used here is that when performing vertical reading, a reader examines a set of requirements against the UML diagrams of a design. The main difference that arises when these techniques are used to understand the reusable assets is that the reader is looking for discrepancies between the new requirements and the reusable assets. At this point, the reader is not looking for defects, because the requirements and the design are describing two different systems. Rather, the reader is trying to determine what parts of the old design are inconsistent with the new requirements. Once this is determined, the designer can create the design for the new system by using as much of this old design as possible and augmenting or changing where necessary as dictated by the discrepancies.

5.3 Example
In this section, we will show how the process described above can be used to extend the example problem presented in Section 4. For the purposes of this example, assume that the gas station owner has decided that he now wishes to allow his customers to pay their monthly bills by credit card via the Internet as well as via mail. The new requirement can be seen in Fig. 17. The first step in creating the evolved system is to apply PBR in order to determine how the set of reusable assets (the old set of gas station requirements) can be used in conjunction with this new requirement. When PBR was applied, we determined that the new requirement requires only a minor change to the reusable assets: there is already a requirement that allows for the payment of monthly bills using a credit card. The old system only allowed this payment to be made by mail. So, the new requirement does not change the system; it only extends some functionality that is already present. To include the new requirement in the system we can
Monthly bill payments will now be accepted using a credit card over the Internet. The customer must access the gas station homepage. The customer will log in using his account number and password. The password is assigned by the gas station at the time of account creation. The customer is presented with the amount of his current bill. The customer then enters the amount they wish to pay and their credit card information. The credit card information is verified by the system. If verification succeeds, the bill is updated in the system. If it fails, the customer is presented with the reason for failure, and given a chance to try again.
FIG. 17. New gas station requirement.
[Use case diagram omitted: the evolved "Paying monthly bills" use case, in which "Paying by credit card" is specialized into "Paying by mail" and "Paying over the Internet"; other use cases (e.g., Parking, Paying by cash) and the Cashier actor are unchanged.]
Specific use case for "Paying by Credit Card". Types of credit card payments:

1. Paying by mail. Customer sends payment and account number to the cashier. Cashier selects the payment option and enters account number, amount remitted, and type of payment. If any of this information is not entered, payment cannot be completed (the cashier interface will display a message) and the operation will be cancelled. Gas station asks the credit card system to authorize payment. If authorization is OK, payment is made. If payment is not authorized or failed, the cashier receives a message describing that the payment could not be processed. The cashier must repeat the operation once more before canceling the operation.

2. Paying over the Internet. Customer logs on to the homepage and enters account number and password. System prompts with bill amount. Customer enters amount to pay and credit card information. Gas station asks the credit card system to authorize payment. If authorization is OK, payment is made. If authorization fails, the customer receives a message describing that the payment could not be made. Customer may try one more time or cancel the operation.

FIG. 18. Evolved use cases.
leave a majority of the requirements unchanged and just add the new functionality. To do this, requirement 7 is the only one that has to change. In Fig. 18, it has been split into two parts, one dealing with payments by mail (1) and the other dealing with payments over the Internet (2). The next step is to examine the use cases to determine if modifications are necessary. Because a use case for paying monthly bills by credit card already exists, we can split that use case into two subcases: one for payment by mail, and one for payment on the Internet. The new use cases can be seen in Fig. 18. Now that we have the evolved requirements and use cases, the next step is to modify the design so that it includes the new requirements. When applying the OORTs we determined that the only change needed to the class diagram was to add a new class to model the homepage. We also determined that a new sequence diagram was needed to model the new use case of "Payment over Internet". These new diagrams can be seen in Figs 19 and 20, respectively.
[Class diagram omitted: the evolved gas station class diagram; the only change from Section 4 is a new class modeling the gas station homepage (with operations such as login and pay by credit card) linked to the existing Gas Station and Customer classes.]

FIG. 19. Evolved class diagram.
[Sequence diagram omitted: for payment over the Internet, the Customer logs in to the homepage (account, password) and requests payment by credit card; the Gas Station checks that the customer is registered, gets the bill, asks the Credit_Card System to authorize the payment (response within 30 seconds), updates the bill, and displays a message.]

FIG. 20. Sequence diagram.
6. The Road Ahead
Software design remains a great challenge. Increasing product commercialization, rapidly changing technologies, shorter deadlines, the Internet, and other factors are changing the software industry daily, and narrowly trained software engineers make the design of software an even harder problem [71]. Well-engineered software needs good software engineering: deadlines are short, quality and productivity must be high, costs must be low, and flexibility is needed to keep up with technology and user requests. The software design process described in this text tries to address some of these issues without adding complication for designers. Using a standard notation (UML), a combination of simple techniques (PBR and OORTs), and a conventional model (the waterfall software life-cycle), we described how developers can prepare the design of an application with a small set of activities while reducing the number of defects. The same ideas were also shown to be useful for evolving or maintaining the UML artifacts of a software product.
Although not complete as a software development process, it can be tailored to fit different software process frameworks or can be used as the basis for defining more complete software processes.

REFERENCES

[1] Leite, J. C. S. P. and Freeman, P. A. (1991). "Requirements validation through viewpoint resolution". IEEE Transactions on Software Engineering, 17, 1253-1269.
[2] Finkelstein, A., Kramer, J. and Nuseibeh, B. (eds) (1994). Software Process Modeling and Technology. Wiley.
[3] IEEE (1993). Standard 830-1993, Recommended Practice for Software Requirements Specifications. Software Engineering Standards Committee of the IEEE Computer Society, New York.
[4] Pfleeger, S. (1998). Software Engineering: Theory and Practice. Prentice Hall, New Jersey, USA.
[5] Lockman, A. and Salasin, J. (1990). "A procedure and tools for transition engineering". Proceedings of the 4th ACM SIGSOFT Symposium on Software Development Environments, 157-172, Irvine, CA.
[6] Meyer, B. (1997). Object-Oriented Software Construction, second edition. Prentice Hall, New Jersey, USA.
[7] Pressman, R. (1997). Software Engineering: A Practitioner's Approach, fourth edition. McGraw-Hill, USA.
[8] Jalote, P. (1997). An Integrated Approach to Software Engineering, second edition. Springer-Verlag, New York Inc., USA.
[9] Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F. and Lorenson, W. (1992). Object-Oriented Modeling and Design. Prentice Hall, USA.
[10] Booch, G. (1994). Object-Oriented Analysis and Design. Addison-Wesley, USA.
[10b] Booch, G., Rumbaugh, J. and Jacobson, I. (1999). The Unified Modeling Language User Guide. Addison-Wesley Object Technology Series. Addison-Wesley, USA.
[11] Jacobson, I., Christerson, M., Jonsson, P. and Overgaard, G. (1992). Object-Oriented Software Engineering: A Use Case Driven Approach (revised printing). Addison-Wesley, USA.
[12] Coleman, D., Bodoff, S. and Arnold, P. (1993). Object-Oriented Development: The Fusion Method. Prentice Hall, New Jersey, USA.
[13] Kitchenham, B. A., Travassos, G. H., von Mayrhauser, A., Niessink, F., Shneidewind, N. F., Singer, J., Takada, S., Vehvilainen, R. and Yang, H. (1999). "Towards an ontology of software maintenance". Journal of Software Maintenance: Research and Practice, 11, 365-389.
[14] OMG--Object Management Group, Inc. (1999). Unified Modeling Language Specification, Version 1.3. (http://www.omg.org).
[15] Conallen, J. (1999). "Modeling web application architectures with UML". Communications of the ACM, 42, 63-70.
[16] Larsen, G. (1999). "Designing component-based frameworks using patterns in the UML". Communications of the ACM, 42, 38-45.
[17] Shroff, M. and France, R. B. (1997). "Towards a formalization of UML class structures in Z". Proceedings of COMPSAC'97--21st International Computer Software and Applications Conference.
[18] Uemura, T., Kusumoto, S. and Inoue, K. (1998). "Function point measurement tool for UML design specification". Proceedings of the 6th International Symposium on Software Metrics, 62-69.
[19] Travassos, G., Shull, F., Fredericks, M. and Basili, V. (1999). "Detecting defects in object-oriented designs: using reading techniques to improve software quality". Proceedings of the Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), Denver, Colorado, 47-56.
[20] Jäger, D., Schleicher, A. and Westfechtel, B. (1999). "Using UML for software process modeling". Proceedings of the 7th European Software Engineering Conference held jointly with the 7th ACM SIGSOFT Symposium on Foundations of Software Engineering, 91-108, Toulouse, France.
[21] Evans, A. and Kent, S. (1999). "Core meta-modeling semantics of UML: the pUML approach". Proceedings of the 2nd International Conference on the Unified Modeling Language--UML'99, Fort Collins, Colorado.
[22] Morisio, M., Travassos, G. H. and Stark, M. (2000). "Extending UML to support domain analysis". Proceedings of the 15th IEEE International Conference on Automated Software Engineering, Grenoble, France.
[23] Selic, B. (2000). "A generic framework for modeling resources with UML". IEEE Computer, 64-69.
[24] Epstein, P. and Sandhu, R. (1999). "Towards a UML based approach to role engineering". Proceedings of the 4th ACM Workshop on Role-Based Access Control, 135-143, Fairfax, VA.
[25] D'Souza, D. F. and Wills, A. C. (1998). Objects, Components, and Frameworks with UML: The Catalysis Approach. Addison-Wesley.
[26] Kruchten, P. (1999). The Rational Unified Process: An Introduction. Addison-Wesley, Reading, Mass.
[27] Jacobson, I., Booch, G. and Rumbaugh, J. (1999). The Unified Software Development Process. Addison-Wesley.
[28] Graham, I. M., Henderson-Sellers, B. and Younessi, H. (1997). The OPEN Process Specification. Addison-Wesley, Harlow.
[29] Booch, G. (1999). "UML in action". Communications of the ACM, 42, 26-28.
[30] Kobryn, C. (1999). "UML 2001: a standardization odyssey". Communications of the ACM, 42, 29-37.
[31] Fowler, M. and Scott, K. (2000). UML Distilled: A Brief Guide to the Standard Object Modeling Language, second edition. Addison-Wesley.
[32] Rumbaugh, J., Jacobson, I. and Booch, G. (1999). The Unified Modeling Language Reference Manual. Addison-Wesley Object Technology Series. Addison-Wesley.
[33] Warmer, J. B. and Kleppe, A. G. (1999). Object Constraint Language: Precise Modeling with UML. Addison-Wesley.
[34] Eriksson, H. and Penker, M. (1997). UML Toolkit. Wiley.
[35] Douglass, B. P. (1999). Doing Hard Time: Developing Real-Time Systems with UML, Objects, Frameworks and Patterns. Addison-Wesley.
[36] Wirfs-Brock, R., Wilkerson, B. and Wiener, L. (1990). Designing Object-Oriented Software. Prentice Hall.
[37] Cangussu, J. W. L., Penteado, R. A. D., Masiero, P. C. and Maldonado, J. C. (1995). "Validation of statecharts based on programmed execution". Journal of Computing and Information, 1 (CD-ROM Issue), Special Issue of the Proceedings of the 7th International Conference on Computer and Information ICCI'95, Peterborough, Ontario, CA.
[38] Perry, W. (2000). Effective Methods for Software Testing, second edition. Wiley.
[39] IEEE (1987). Software Engineering Standards. IEEE Computer Society.
[40] van Solingen, R. and Berghout, E. (1999). The Goal/Question/Metric Method: A Practical Guide for Quality Improvement of Software Development. McGraw-Hill.
[41] Coad, P. and Yourdon, E. (1991). Object-Oriented Design. Yourdon Press Computing Series.
[42] Clunie, C., Werner, C. and Rocha, A. (1996). "How to evaluate the quality of object-oriented specifications". Proceedings of the 6th International Conference on Software Quality, Ottawa, Canada, 283-293.
[43] Lorenz, M. and Kidd, J. (1994). Object-Oriented Software Metrics. Prentice Hall.
[44] Chidamber, S. R. and Kemerer, C. F. (1994). "A metrics suite for object oriented design". IEEE Transactions on Software Engineering, 20.
[45] Li, W. and Henry, S. (1993). "Object-oriented metrics that predict maintainability". Journal of Systems and Software, 23, 111-122.
[46] Basili, V. R., Briand, L. C. and Melo, W. L. (1996). "A validation of object-oriented design metrics as quality indicators". IEEE Transactions on Software Engineering, 22, 751-761.
[47] Fagan, M. (1976). "Design and code inspections to reduce errors in program development". IBM Systems Journal, 15, 182-211.
[47a] Fagan, M. (1986). "Advances in software inspections". IEEE Transactions on Software Engineering, 12, 744-751.
[48] Gilb, T. and Graham, D. (1993). Software Inspection. Addison-Wesley, Reading, MA.
[49] Basili, V., Caldiera, G., Lanubile, F. and Shull, F. (1996). "Studies on reading techniques". Proceedings of the 21st Annual Software Engineering Workshop, SEL-96-002, 59-65, Greenbelt, MD.
[50] NASA (1993). "Software Formal Inspections Guidebook". Report NASA-GB-A302, National Aeronautics and Space Administration, Office of Safety and Mission Assurance.
[51] Votta Jr., L. G. (1993). "Does every inspection need a meeting?". ACM SIGSOFT Software Engineering Notes, 18, 107-114.
[52] Porter, A., Votta Jr., L. and Basili, V. (1995). "Comparing detection methods for software requirements inspections: a replicated experiment". IEEE Transactions on Software Engineering, 21, 563-575.
[53] ANSI (1984). "IEEE guide to software requirements specifications". Standard Std 830-1984.
[54] Shull, F., Travassos, G. H., Carver, J. and Basili, V. R. (1999). Evolving a Set of Techniques for OO Inspections. Technical Report CS-TR-4070, UMIACS-TR-99-63, University of Maryland. Online at http://www.cs.umd.edu/Dienst/UI/2.0/Describe/ncstrl.umcp/CS-TR-4070.
[55] Beizer, B. (1995). Black-Box Testing: Techniques for Functional Testing of Software and Systems. Wiley.
[56] Binder, R. V. (1999). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley.
[57] Kung, C., Hsia, P., Gao, J. and Kung, D. C. (1998). "Testing object-oriented software". IEEE Computer Society.
[58] Offutt, J. (1995). "Practical mutation testing". Proceedings of the 12th International Conference on Testing Computer Software, 99-109, Washington, DC.
[59] Delamaro, M. E. and Maldonado, J. C. (1996). "Integration testing using interface mutation". Proceedings of the 7th International Symposium on Software Reliability Engineering (ISSRE), White Plains, NY, 112-121.
[60] Chung, L., Nixon, B. A., Yu, E. and Mylopoulos, J. (1999). Non-Functional Requirements in Software Engineering. Kluwer Academic Publishers.
[61] Juristo, N., Moreno, A. M. and López, M. (2000). "How to use linguistic instruments for object-oriented analysis". IEEE Software, 80-89.
[62] Harel, D. and Naamad, A. (1996). "The STATEMATE semantics of statecharts". ACM Transactions on Software Engineering and Methodology, 5, 293-333.
[63] Vieira, M. E. R. and Travassos, G. H. (1998). "An approach to perform behavior testing in object-oriented systems". Proceedings of the Technology of Object-Oriented Languages--TOOLS 27, Beijing, China. IEEE Computer Society.
[64] Offutt, J. and Abdurazik, A. (1999). "Generating tests from UML specifications". Proceedings of the 2nd International Conference on the Unified Modeling Language (UML99), Fort Collins, CO.
[65] Henderson-Sellers, B. (1996). Object-Oriented Metrics: Measures of Complexity. Prentice Hall.
[66] Travassos, G. H. and Andrade, R. S. (1999). "Combining metrics, principles and guidelines for object oriented complexity reduction". Workshop on Quantitative Approaches in Object Oriented Software Engineering, ECOOP'99, Lisbon, Portugal.
[67] Gamma, E., Helm, R., Johnson, R. and Vlissides, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley.
[68] Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P. and Stahl, M. (1996). Pattern-Oriented Software Architecture: A System of Patterns. Wiley, New York.
[69] Markosian, L., Newcomb, P., Brand, R., Burson, S. and Kitzmiller, T. (1994). "Using an enabling technology to reengineer legacy systems". Communications of the ACM, 37, 58-70.
[70] Yakimovitch, D., Travassos, G. H. and Basili, V. R. (1999). "A classification of components incompatibilities for COTS integration". Proceedings of the 24th Annual Software Engineering Workshop, NASA/SEL, Greenbelt, USA.
[71] Clark, D. (2000). "Are too many programmers too narrowly trained?". IEEE Computer, 33, 12-15.
Enterprise JavaBeans and Microsoft Transaction Server: Frameworks for Distributed Enterprise Components

AVRAHAM LEFF, JOHN PROKOPEK, JAMES T. RAYFIELD, AND IGNACIO SILVA-LEPE
IBM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA
[email protected]
Abstract

Software components were introduced to fulfill the promise of code reuse that "pure" objects were unable to deliver. This chapter examines a specific type of component, namely, distributed enterprise components, that provides "business" functions across an enterprise. Distributed enterprise components require special functions such as distribution, persistence, and transactions; these functions are typically achieved by deploying the components in an object transaction monitor. Recently, distributed enterprise components and object transaction monitor technology have standardized into two competing frameworks: Sun's Enterprise JavaBeans and Microsoft's Microsoft Transaction Server. The first half of this chapter discusses the concept of distributed enterprise components in some detail and shows how they have evolved in response to the need for code reuse in business environments. This evolution is closely related to developments in other areas of software technology such as databases and transaction monitors. The second half of the chapter focuses specifically on the Enterprise JavaBeans (EJB) and Microsoft Transaction Server (MTS) technologies and explains how they relate to earlier component technologies. We show that EJBs and MTS are remarkably similar and yet differ in some important ways. We illustrate this discussion through an example developed on both frameworks.
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
2. Component Evolution . . . . . . . . . . . . . . . . . . . . . . . 101
   2.1 Objects versus Components . . . . . . . . . . . . . . . . . . 101
   2.2 Local Components . . . . . . . . . . . . . . . . . . . . . . 102
   2.3 Distributed Components . . . . . . . . . . . . . . . . . . . 105
   2.4 Enterprise Components and Infrastructure . . . . . . . . . . 111
3. Object Transaction Monitors . . . . . . . . . . . . . . . . . . . 114
   3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 114
   3.2 An Example: Component Broker . . . . . . . . . . . . . . . . 115
4. Enterprise JavaBeans and Microsoft Transaction Server . . . . . . 120
   4.1 Microsoft Transaction Server . . . . . . . . . . . . . . . . 121
   4.2 Enterprise JavaBeans . . . . . . . . . . . . . . . . . . . . 123
   4.3 Comparison of the Component Models . . . . . . . . . . . . . 126
5. Parallel Evolution . . . . . . . . . . . . . . . . . . . . . . . 128
   5.1 Evolution of Data Access Abstraction . . . . . . . . . . . . 129
   5.2 Evolution of Packaged Business Logic . . . . . . . . . . . . 133
6. Sample Application . . . . . . . . . . . . . . . . . . . . . . . 136
   6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 136
   6.2 Implementing Business Logic . . . . . . . . . . . . . . . . . 137
   6.3 Creating a Persistent Component . . . . . . . . . . . . . . . 140
7. Continued Evolution . . . . . . . . . . . . . . . . . . . . . . . 142
   7.1 MTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
   7.2 EJB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
   7.3 CORBA Components . . . . . . . . . . . . . . . . . . . . . . 145
   7.4 Relationship with SOAP . . . . . . . . . . . . . . . . . . . 146
8. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
   8.1 Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
   8.2 Evolution of Microsoft's COM Technology . . . . . . . . . . . 149
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
1. Introduction
Sun's Enterprise JavaBeans (EJBs) [1, 2] and Microsoft Transaction Server (MTS)[3,4] are emerging standards for distributed enterprise components. In order to better appreciate the benefits of these standards, it is helpful to be aware of the evolution of component function and infrastructure. This evolution has been driven by the need of software developers to create truly reusable business components. Separately, database systems and transaction processing monitors have evolved into object transaction monitors, providing increasingly powerful platforms for object deployment. EJBs and MTS can be viewed as a standardization of distributed enterprise components that can be portably deployed across multiple object transaction monitors. In the first part of this chapter, we discuss why components are important for software developers and how one type of component, namely, distributed enterprise components, has evolved to meet these needs in the context of deploying business-related software across an enterprise. We also discuss how component infrastructure evolved along with component function. With this context as a background, in the second part of this chapter we take a closer look at the EJB and MTS frameworks themselves, and show how they improve on earlier component technologies. Although EJBs and MTS are competing technologies, we show that the fundamentals are remarkably similar, although their details can differ considerably. The
discussion contains an example that gives a better sense of how an application is actually developed on these technologies. Finally, we take a brief look ahead at some ongoing developments in this field.
2. Component Evolution
Component evolution is driven mainly by a simple fact: software developers prefer to write code only once and do not enjoy rewriting the code for different environments; similarly, organizations prefer that developers produce new function rather than rework code for new environments. "Different environments," in this sense, involves many dimensions including multiple languages, persistence mechanisms, and networked systems. In this section we show how components have evolved to facilitate reuse in various environments, and discuss key issues that have been addressed in the course of this evolution.
2.1 Objects versus Components
The terms objects and components are often used interchangeably; for the purposes of this chapter we draw the following distinction between them. Software objects are programming constructs that encapsulate both data and code logic. Object programming offers certain advantages compared to procedural programming, such as the ability to produce more flexible, maintainable, and reusable software (see Appendix 1). Because they separate interface from implementation, objects facilitate software reuse since an object's clients need only learn about the object's interface rather than the considerably more complicated implementation. Components share many of the characteristics of objects such as dynamic binding and polymorphism.

The reuse promise of the object oriented approach cannot be fully met, however, so long as an object's binary packaging prevents a client from using the object in an implementation-independent way. Consider, for example, a C++ car object that is compiled into a shared library (dll) file. Clients wishing to reuse the car object must compile against the header file which contains information about the size of the car implementation, its private and public data members, and the layout of its virtual function table. Whenever the car implementation changes, if this is reflected as a change in the header file (even if the public interface is unchanged), clients must recompile their code against the new version of the header file. This requirement limits the reuse possibilities of the car object in a nontrivial way.

Other issues play a role in determining the reuse possibilities of a given technology. For example, because C++ objects are compiled, differences
among compilers (e.g., "name mangling") imply that client and server code must be compiled with the same compiler. Interpretive technologies, such as Java and Smalltalk, facilitate reuse to a greater extent.

We consider that the key difference between objects and components is that components decouple interface from implementation to a much greater extent than objects. This distinction is important because reuse is a major motivation for using object oriented technology in the first place. Components achieve this decoupling either by deferring the need for certain information to run-time (as opposed to compile time) or by allowing clients to delegate the function to other parts of the run-time system. Thus, for example, when a client creates a C++ car, the new operation requires that the client code allocate the exact number of bytes for the specific implementation in exactly the same layout as the specific implementation. In contrast, the new operation in Java is delegated by the client to the JVM which gets the necessary information at run-time.

Since components are characterized by a "reuse" orientation, there can be no hard distinction between objects and components. Rather, a multidimensional "reuse" spectrum exists in which some technologies may facilitate reuse to a greater extent than others with respect to certain issues and are less helpful with respect to other issues. The following subsections discuss these issues in greater detail.
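As a concrete (and purely hypothetical) illustration of this run-time decoupling, the following Java sketch shows a client that is compiled only against a Car interface and obtains an implementation by name at run-time; the class names are ours and do not come from the chapter.

    // Car.java -- the only type the client is compiled against.
    public interface Car {
        void drive(int km);
    }

    // BasicCar.java -- the implementation can change, or be replaced entirely,
    // without recompiling clients, as long as the Car interface is unchanged.
    public class BasicCar implements Car {
        private int odometer;                 // implementation detail, invisible to clients
        public void drive(int km) { odometer += km; }
    }

    // CarClient.java -- the concrete class is selected at run-time, so the client
    // carries no knowledge of BasicCar's size or memory layout.
    public class CarClient {
        public static void main(String[] args) throws Exception {
            Car car = (Car) Class.forName("BasicCar")
                                 .getDeclaredConstructor()
                                 .newInstance();
            car.drive(10);
        }
    }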
2.2 Local Components
Component technology was first created for local environments, i.e., environments in which all server components run in a single process on a single machine. Examples include Microsoft's COM (Component Object Model, circa 1993)[5] and IBM's SOM (System Object Model, circa 1993) [6]. SOM was a local component extension of CORBA (Section 2.3). A brief overview of COM's history is provided in Appendix 2.
2.2.1 Description

One technique to separate the component interface from implementation is to use a "contractual", language-neutral description to define the interface and then allow specific languages to provide the implementation. Such interface description languages (IDL) facilitate software reuse in two ways. First, they ensure that implementation is not intermingled with the interface so that (if the interface is not changed) changes to the implementation do not affect clients. By contrast, the header files of such languages as C++ contain both interface (method signatures) and implementation (e.g. private data members). Second, the language-neutral
interface can serve as a "canonical form" of the component for which multiple language bindings can be generated. This allows client code written in one language to access a component implementation written in another language. (Note that Java--via the interface construct--facilitates the first, but not the second, type of reuse.) The notion of IDL originated with DCE [7]. Both COM and SOM use (different) IDLs to describe component interfaces. Originally, SOM used its own version of an IDL, but later adopted the much better known CORBA IDL [8]. COM IDL allows component implementation to be written in Microsoft Visual C++, Visual Basic, Java, and other languages. SOM allowed component implementation to be written in C, C++, Cobol, and Smalltalk.
2.2.2 Persistence and Identity
Standard objects often need not (and do not) persist across multiple executions: the objects are instantiated in a given program, the program executes, and the objects vanish after the program terminates. In contrast, business enterprise components must persist across multiple executions because they model on-going activities and business entities. When calls to a component are made from within a single address space (local components), and the components are transient (the component state is stored only in main memory), the notion of a component's identity is fairly straightforward. Clients access a component through pointers or references to a main memory location; identity relates simply to the question of whether two such references resolve to the same component. Java, for example, provides the identity operator ("==") to test whether two transient objects are identical. Because of the very close association between a component and its clients, there is no need for an independent concept of component identity. In contrast, when a component's clients live in a separate address space, there is a greater need for an object id; this is discussed in Section 2.3.

When components are persistent, component identity can be an issue even in the environment of local components. Persistent components are components whose state is maintained in media that are less volatile than main memory, e.g., files, relational databases, or some other persistence mechanism. Because the association between a component and its clients must persist across time, the association is therefore more loosely coupled. In such cases, the notion of component identity is more important. There are two basic models for component persistence: single-level and two-level store. In the two-level store approach, an explicit distinction is made between the instantiated component in main memory (the first level)
whose state is a copy of the persistent state, and the persistent data itself (the second level). Load methods move the state from persistent storage to the component, and store methods move the state from the component to persistent storage. There is thus no mechanism through which a client can persistently keep a "handle" to a specific component: a component loses its identity as soon as it is deactivated from main memory, and will only resume its identity after an explicit load command moves a copy of the persistent state to an uninitialized component.

In the single-level store approach, a life-time association exists between a component and its persistent state such that one can speak of a component's identity persisting across multiple activations and deactivations. A component's client can view the component as persistently pointing to a specific database record. (In contrast, nothing prevents a two-level store system from pointing a component to one database record in the morning and another database record in the evening.) With single-level store, the component infrastructure is responsible for maintaining the mapping between the component and its persistent data, obviating the need for explicit load and store methods. The mapping consists of two main parts: the schema mapping and the object identity mapping. Schema mapping describes the mapping between portions of the database record and the object's instance variables or fields. Object identity mapping defines the subset of the database record used to uniquely identify the object. The object identity portion of the database record is typically read-only, as changing this portion of the record would change the object's identity. The component's object identity becomes visible when component references are converted into (opaque) object identity strings, e.g., for the purpose of making them persistent or for passing between address spaces. COM uses a two-level store persistence model, and does not expose the notion of an object id. ObjectStore [9] uses a single-level store model, and uses object ids.
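As a rough sketch of the two-level-store style (our own illustration, assuming a JDBC-backed component; the table and class names are hypothetical), the developer writes explicit load and store methods that copy state between the in-memory component and its database row, treating the identity column as read-only:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Hypothetical two-level-store component: the in-memory state is a copy of a
    // database row, and explicit load/store calls move state between the levels.
    public class AccountComponent {
        private String accountId;   // object-identity portion of the row (read-only)
        private double balance;     // schema-mapped field

        // Load: copy persistent state from the row into an uninitialized component.
        public void load(Connection con, String id) throws SQLException {
            try (PreparedStatement ps =
                     con.prepareStatement("SELECT balance FROM accounts WHERE id = ?")) {
                ps.setString(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("no such account: " + id);
                    }
                    this.accountId = id;
                    this.balance = rs.getDouble("balance");
                }
            }
        }

        // Store: copy the component's state back to persistent storage.
        public void store(Connection con) throws SQLException {
            try (PreparedStatement ps =
                     con.prepareStatement("UPDATE accounts SET balance = ? WHERE id = ?")) {
                ps.setDouble(1, balance);
                ps.setString(2, accountId);
                ps.executeUpdate();
            }
        }

        public void credit(double amount) { balance += amount; }
    }

Under a single-level store, the infrastructure would maintain this schema and object-identity mapping itself, and the load/store methods would disappear from the component code.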
2.2.3 Inheritance
Both SOM and COM separate the component interface from component implementation (Section 2.2.1). Implementation inheritance--as in C++--allows data members and methods of a base class (e.g., Person) to be automatically available to derived classes (e.g., Employee). Interface inheritance, in contrast, is a modeling construct: it allows developers to specify "kind of" relationships between components while not implying anything about implementation relationships. Inheritance plays a very important role in classic object models (see Appendix 1).
SOM supports interface inheritance, and even multiple interface inheritance. Thus, for example, an Amphibian interface can derive from canWalk and canSwim interfaces, with its set of supported methods (including attribute access methods) defined as the union of its various base and derived interfaces. Using a derived class, SOM clients can directly invoke any of the inherited methods, and treat the derived instances as simply a base component. Similarly, if a client knows that a base class reference actually refers to an instance of a derived class, the client can narrow the reference to that of the derived class and then treat the reference as a derived class reference. Interface inheritance is specified in the IDL of the derived class.

In contrast, although COM supports single interface inheritance, it does not support multiple interface inheritance. A programmer cannot state, for example, that an interface Amphibian derives from both canWalk and canSwim interfaces. In practice, however, this does not present a problem because the programmer states that an Amphibian class supports as many interfaces as desired. Interface inheritance--even single inheritance--is therefore not needed. The key difference between the COM and SOM approaches is that COM does not allow a single reference to the Amphibian component to be used for both canWalk and canSwim interfaces. Instead, the client must explicitly ask for separate references to both interfaces, and use the appropriate reference to invoke the appropriate methods. The interface implementations may or may not be provided by the same class. By supporting the QueryInterface method, all COM components allow clients to determine at run-time whether a component supports a specific interface. Implementation reuse is achieved through the techniques of containment and delegation: the "derived" interface contains a set of "base" components and delegates client invocations to the appropriate inner component.

The distinction between SOM and COM with respect to interface inheritance also applies to their distributed component models (Section 2.3), i.e., DSOM (IBM's CORBA-compliant, distributed version of SOM) and DCOM (Distributed COM). Like C++ and SOM, Java allows multiple interface inheritance; unlike C++ it does not allow multiple implementation inheritance.
2.3 Distributed Components
With the growth of client/server application architectures, it became necessary for an application's components to be deployed in more than one process or address space. Component technologies therefore evolved to include distributed components. Distributed components must solve the
problem of how components, running on machines with different hardware and operating systems, can invoke methods on one another and return method results. Communication itself introduces considerable heterogeneity in the form of different communication protocols and network hardware. The key point is that component technologies factor this difficult task out of the application so that developers can use a generic, all-purpose, solution. Otherwise, component reuse is greatly limited since intercomponent communication must be interleaved with the component's business logic and redone when deployment platforms change.

Three major distributed component technologies exist: Microsoft's DCOM (Distributed Component Object Model, circa 1996) [5], CORBA (Common Object Request Broker Architecture, V1.1 circa 1991, V2.0 circa 1994) [8, 10], and Java RMI [11]. Certain aspects of these technologies were previously discussed in Section 2.2. In this section we address only distribution-specific issues. In our view, despite the superficial appearance of significant differences between these distributed technologies, not much actually separates them at the functional level. Perhaps the most important difference is the fact that DCOM, like COM, does not have the notion of object identity while CORBA does.
2.3.1 Transport

Remote procedure calls (RPC) addressed the problem of distributed "peer to peer" interactions in the non-OO domain in the late 1980s. In this approach, clients invoke remote procedures in exactly the same way that they invoke local procedures. The client supplies parameters, suspends itself, and resumes execution after the method's results are returned. Distributed systems using RPCs are structured so that application code resides in a layer above the RPC layer which, in turn, resides above the communication layers. Services provided by the RPC layer include:

• packaging transmitted values (client parameters and server results) into an RPC packet (marshalling);
• locating the target code on the server;
• unpacking transmitted values (client parameters and server results) when they arrive at the other end of the link (demarshalling);
• performing data format translation (e.g. "big-endian" versus "little-endian") using formats such as Sun's XDR or DCE's NDR;
• handling failures.
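To give a feel for what the RPC layer factors out of application code, here is a small hand-rolled Java sketch of the marshalling and demarshalling work for a single hypothetical remote call; the wire format, operation name, and server are all invented for illustration.

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.net.Socket;

    // Hand-rolled marshalling for one hypothetical remote call:
    //   double getBalance(String accountId)
    // An RPC layer generates and hides exactly this kind of code.
    public class ManualRpcClient {
        public static double getBalance(String host, int port, String accountId)
                throws IOException {
            try (Socket socket = new Socket(host, port);
                 DataOutputStream out = new DataOutputStream(socket.getOutputStream());
                 DataInputStream in = new DataInputStream(socket.getInputStream())) {

                // Marshal: pack an operation identifier and the parameters into the
                // request packet. DataOutputStream writes big-endian values, which
                // doubles as a crude data-format translation step.
                out.writeUTF("getBalance");   // invented operation identifier
                out.writeUTF(accountId);
                out.flush();

                // Demarshal: unpack the result (or a failure indication) from the reply.
                boolean ok = in.readBoolean();
                if (!ok) {
                    throw new IOException("remote call failed: " + in.readUTF());
                }
                return in.readDouble();
            }
        }
    }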
DCE [7] added IDL (see Section 2.2.1) and an IDL compiler to the RPC concept: the compiler creates portable stubs for the client and server sides of an application. The compiled stubs are linked to the RPC run-time library which performs the functions described above. With respect to components, RPCs are limited because they are not object oriented. Thus RPC calls are directed to a specific function (rather than a component method), focus on passing data structures from client to server, and the server cannot polymorphically implement a given method signature in different ways.

DCOM and DSOM therefore introduce the notions of a proxy object (an object on the client that appears to be the remote object) and a server-side stub (used by the server-side component to communicate with the proxy object). (In CORBA, these constructs are termed, respectively, the client-side stub and the server-side skeleton objects.) DCOM supports these constructs through libraries that sit between the application layer and the RPC layer. Programmers interact with remote components by creating an interface pointer to a proxy object and then treating it as a local component. DCOM's support for distributed components can thus be viewed as an OO generalization of RPC.

The CORBA analog to the DCOM library layer is the ORB (object request broker) and the object adapter. Relative to DCOM, the ORB plays a more prominent role in mediating between calls from a client to a server component. For example, object creation is performed in CORBA via the ORB, whereas DCOM uses class factory methods. ORBs use the general inter-ORB protocol (GIOP) to specify message formats and common data representations when communicating with one another. Typically, ORBs use IIOP (Internet inter-ORB protocol) to exchange GIOP messages over a TCP/IP network. In addition to static method invocation, CORBA allows clients to invoke methods dynamically, i.e. to invoke methods that are discovered only at run-time. Dynamic method invocation does not use code compiled from stub and skeleton code. The dynamic API allows programmers to write very generic code. The interface repository is a run-time database containing the interface specifications of the objects known to the ORB. The type library is DCOM's version of the interface repository, and it too provides support for dynamic method invocation. In practice, the biggest differences between DCOM and CORBA involve the issues of object identity (Section 2.3.3) and component infrastructure (Section 2.4).

The history of Java with respect to distributed components provides an interesting perspective on the evolution of local to distributed components. Initially, Java supported only local components despite the fact that
support for networking is built into the language, so that programmers can use the Socket, Stream, URL, and ClassLoader constructs to build their own distributed systems. However, precisely because of the component "reuse" concerns described earlier, Java realized that support for distributed components had to be factored out of component code and placed in a generic, system-provided, layer. CORBA specifies Java bindings, so programmers can use CORBA to develop distributed Java components. As an alternative to CORBA, Java also specifies the RMI (remote method invocation) API [11] for building distributed components. In this approach, a Java class definition serves as the IDL for Java-only applications, and the rmic compiler generates client stubs and server skeletons from the class definition. Clients use the RMI registry to get references to server-side components, dynamic class loading transports classes between client and server, and object serialization is used to marshal/demarshal local objects that are passed as method parameters. Native RMI uses JRMP as its transport protocol, but can use CORBA's IIOP as well.
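A minimal RMI sketch of this approach (the Account names are ours, and it assumes an rmiregistry running locally on the default port 1099) shows the remote interface doubling as the IDL and the registry supplying the client's stub:

    import java.rmi.Naming;
    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;

    // The Java interface plays the role of the IDL for Java-only applications.
    interface Account extends Remote {
        double getBalance() throws RemoteException;
    }

    // Server-side implementation; exporting it makes it reachable via a stub.
    class AccountImpl extends UnicastRemoteObject implements Account {
        private final double balance;
        AccountImpl(double balance) throws RemoteException { this.balance = balance; }
        public double getBalance() { return balance; }
    }

    public class RmiSketch {
        public static void main(String[] args) throws Exception {
            // Server side: register a component under a name in the RMI registry.
            Naming.rebind("rmi://localhost/Account", new AccountImpl(42.0));

            // Client side: look up the stub and invoke it like a local object.
            Account account = (Account) Naming.lookup("rmi://localhost/Account");
            System.out.println(account.getBalance());
        }
    }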
2.3.2 Client View
In this section we discuss how clients access components in DCOM and CORBA. DCOM is defined as a binary internetworking standard in which a component exposes a low-level binary API based on arrays of function pointers. DCOM clients interact with a component by indexing off pointers to these arrays (or virtual function tables) to select the desired function and then invoke it. In contrast, CORBA components expose a higher-level API based entirely on the IDL and the IDL bindings defined for the language used by the client. Each language binding completely defines what the client can call, and low-level implementation details such as function pointers are simply not defined. Interestingly, and perhaps not surprisingly, Microsoft has recently provided high-level bindings for languages such as Visual J++ such that DCOM components look like, and are invoked as, vanilla Java classes.
2.3.3 Identity
We have previously noted (Section 2.2.2) that COM follows a two-level store model, in which instances of a component have no individual identity. Instead, component interfaces have identity via a globally unique identifier (GUID), a 128-bit integer that includes a hardware address, a timestamp, and other information. Registries store and manage GUIDs. Components support one or more interfaces, and interfaces are the basis of client/server
interaction. This model is used by DCOM as well. Clients interact with DCOM components by:

1. Getting access to the DCOM interface. DCOM clients never directly access a DCOM object: rather, the client is given an interface pointer.
2. Since all DCOM interfaces support the IUnknown member functions, the client can use QueryInterface to do "interface negotiation" with the DCOM object by asking whether the object supports the interface specified by a given GUID. An error code is returned if the object does not support a given interface or if the client cannot be served for some other reason.

Consider the example of a Car component which implements interfaces for steering, acceleration, and braking. Specifying that each interface derives from IUnknown--interface ISteering:public IUnknown {...}--results in each interface having the ability to support QueryInterface for other interfaces within the same component. The component developer then implements a Car class containing classes that implement the required interfaces, e.g., class Steering:public ISteering {...} and class Braking:public IBraking {...}. By calling QueryInterface(IID_Braking) on the ISteering interface pointer returned by Car, the client can get a pointer to the IBraking interface. (A Java-flavored sketch of this interface negotiation appears at the end of this subsection.) A key aspect of this approach is that clients cannot ask to be connected to a specific DCOM object instance; this concept simply does not exist in COM or DCOM. Instead, clients have transient pointers to a component's interfaces. Because component instances have no object identity, if the connection between a client and server is broken and later restored, the client cannot--nor should it need to--reconnect to the previous component instance and state.

CORBA uses a different model, namely, the single-level store model. Earlier (Section 2.2.2) we focused on the implications of this model with respect to persistence. Here, we focus on the implications of this model with respect to distribution. Despite the fact that CORBA components may reside on another machine or in another process, and regardless of whether the component is currently activated or passivated, identity is a fundamental property of the component through which permanent associations can be maintained. CORBA object references are handles to objects that can be passed to a component's clients as either a parameter or result. Although they are "opaque," object references contain sufficient information for the ORB to route a method request to the correct object. An object reference is converted to a string through the object_to_string method, and can thus be preserved by a client. The client can subsequently use that string to access
the object through the string_to_object method. An object reference's reference data is guaranteed to be unique within the server that implements the object; it can thus be used as a key into a datastore that maintains an object's persistent state.
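COM interface negotiation is defined at the binary level; purely as an illustration of the idea behind the Car example above, the following Java-flavored sketch (ours; queryInterface here is a hypothetical method, not the real COM API) shows a client that only ever holds interface references and asks at run-time whether another interface is supported.

    import java.util.Optional;

    // Hypothetical Java analogue of COM interface negotiation (not real COM APIs).
    // Negotiable stands in for IUnknown: every interface supports queryInterface,
    // so a client can ask for another interface from any reference it holds.
    interface Negotiable {
        <T extends Negotiable> Optional<T> queryInterface(Class<T> iface);
    }

    interface ISteering extends Negotiable { void turn(int degrees); }
    interface IBraking  extends Negotiable { void brake(); }

    // The component; clients never hold a reference "to the Car" as such,
    // only references to the interfaces it supports.
    class Car implements ISteering, IBraking {
        public void turn(int degrees) { /* steer */ }
        public void brake()           { /* brake */ }

        public <T extends Negotiable> Optional<T> queryInterface(Class<T> iface) {
            return iface.isInstance(this)
                    ? Optional.of(iface.cast(this))   // interface is supported
                    : Optional.empty();               // analogue of COM's E_NOINTERFACE
        }
    }

    class Client {
        static void demo(ISteering steering) {
            steering.turn(10);
            // Negotiate, via the steering reference, for the braking interface.
            steering.queryInterface(IBraking.class).ifPresent(IBraking::brake);
        }
    }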
2.3.4 Distribution Transparency

One criterion for the success of a component model is the extent to which distributed components must be treated differently (either by developers or by clients) from local components. As we have seen, both CORBA and DCOM do a good job of masking the complicated tasks required to implement a distributed client/server component invocation. Through use of the stub/skeleton approach, remote references appear to the client as local references.

From the client perspective, DCOM components have complete location transparency, i.e., client code does not change regardless of whether the invoked component resides in the same process, same machine, or a different machine from the client. Only the server must be aware of location issues, and these are handled through the way that the server is packaged. In-process servers execute in the same process as their clients, and are packaged as a dynamic link library (or dll). Local servers execute on the same machine as their clients but in a separate process; remote servers execute on a machine remote from their clients. Components deployed either in local or remote DCOM servers "stand on their own" so that they can be embedded in client applications. They are therefore packaged as executable files. Communication with remote servers is based on RPC (Section 2.3.1); communication with local servers uses DCOM's "lightweight" RPC.

CORBA clients similarly do not distinguish between local and distributed components. This model of passing component references implies that changes made to a component are visible to everyone with a reference to that component. However, this pass-by-reference semantic is not always appropriate: what if clients and servers wish to exchange component copies? Until recently, CORBA supported only pass-by-reference semantics for objects, and applied this semantic consistently for both local and distributed components. Similarly, DCOM supports (and continues to support) only pass-by-reference. Under this semantic, in order to pass a component around the system at all, it must first be specified via an IDL, compiled, and then deployed on the server. "Vanilla" or "non-IDLized" C++ or Java objects cannot be used in a method invocation--a restriction that can be painful to developers accustomed to programming in a given language. Through tooling, e.g., Visual C++, it can be quite easy to add methods to a
C++ object, propagate the changes to the IDL, and thus to the component form of the object. Circumventing the restriction to pass-by-reference requires the use of "mapping code" to flatten a component into an IDL struct, passing the struct on the method call (using pass-by-value semantics), and reconstructing the component from the struct on the other side. Recently (circa 1997) CORBA added support for "Objects By Value" through a construct called a ValueType, which is something like a hybrid between an IDL interface and a struct. A ValueType can contain methods but cannot be accessed remotely; instead, it is downloaded to the client using it. The Java binding of ValueTypes is a straightforward mapping to Java objects passed by value. Things get more interesting for other language bindings: without the support of a ClassLoader and a virtual machine, for example, how is a ValueType downloaded for C++?

RMI uses a more flexible approach in which developers can distinguish between "local" objects (that use pass-by-value semantics) and "remote" objects (that use pass-by-reference semantics). The price for this flexibility is an inconsistency between the local and distributed component models. Specifically, a successfully deployed "vanilla" Java object cannot be deployed "as is" into a distributed environment. Instead, the RMI programmer must modify the existing interface, and explicitly state that a component is distributed by extending the Remote interface and stating that all methods may throw a RemoteException. Having done so, the distributed component will use pass-by-reference semantics. Local objects, in contrast, do not inherit from Remote. To be passed between client and server, the component developer must explicitly declare pass-by-value semantics by implementing Serializable. While this task is often trivial (with Serializable simply used as a "marker" interface), it can easily get complicated, as in situations where a component contains references to other components. Note, however, that serializable components have the same behavior whether they are deployed locally or remotely.
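To make the RMI distinction concrete, the following minimal Java sketch (the Account and Money types are invented for illustration) shows a remote interface that extends Remote, and therefore uses pass-by-reference semantics, alongside a plain class that declares pass-by-value semantics by implementing the Serializable marker interface.

    import java.io.Serializable;
    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // A distributed component: clients hold a stub and calls use
    // pass-by-reference semantics; every method may throw RemoteException.
    interface Account extends Remote {
        void credit(Money amount) throws RemoteException;
        Money balance() throws RemoteException;
    }

    // A local object exchanged between client and server: it declares
    // pass-by-value semantics by implementing the Serializable marker
    // interface, so RMI can copy it across the wire.
    class Money implements Serializable {
        private final long cents;
        Money(long cents) { this.cents = cents; }
        long cents() { return cents; }
    }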
2.4 Enterprise Components and Infrastructure
Previous sections focused on local versus distributed component evolution. However, component technology cannot be separated from the infrastructure in which the components are deployed. In fact, component function and component infrastructure are two halves of the component story. Typically, as the infrastructure provides more services to the component developer, the component can be reused to a greater extent since the component provides less system-specific code and contains relatively more business logic. However, these benefits do not come for free: the component infrastructure
imposes greater constraints on the component to the extent that the component relies on the infrastructure's services. Historically, component infrastructures have evolved to provide more services to components while reducing component flexibility. We use the term enterprise components to denote components that rely on the infrastructure's services to a greater extent than distributed or local components.

CORBA, as early as 1992, was aware of the need to provide services to component developers beyond the core ORB and object adapter concepts. Multiple volumes of common object service specifications have been released (COSS1-COSS5) [8]. Not all services have proven to be equally popular, and this is reflected in the extent to which CORBA servers support these services. For example, the naming and transaction services are far more prevalent than the relationship, licensing, or persistence services. However, the concept of a component infrastructure providing augmented facilities to components is important, and is basic to the evolution of object transaction monitors. Here we shall briefly discuss some of the more important CORBA services.
1. Life-cycle: defines a framework for creating, moving, and copying objects.
2. Naming: allows names to be associated with an object; the name can later be used to access the object.
3. Events: allows decoupled communication between objects. Instead of direct intercomponent interaction, a component can send an event to an event channel through which multiple components can access the event.
4. Transactions: allows an application to perform multiple changes to a datastore such that they are either committed or rolled back atomically.
5. Concurrency: allows concurrent access to a component to be controlled in a way that prevents its state from being corrupted.

CORBA uses an elegant technique to package many of these services. Developers can focus on component business logic and subsequently mix in combinations of CORBA services to add the desired function. For example, a developer may first write business logic in a businessComponent interface and implementation. Then, through multiple inheritance, the developer can inherit from businessComponent and the CORBA TransactionalObject and LockSet to create a transactional, concurrent businessComponent--without additional effort.

Relative to CORBA, COM/DCOM provide fewer services, and therefore place a greater burden on the component developer. Put another way, COM
is mostly a specification or a set of interfaces which, because COM often doesn't provide even a default implementation, must be filled in by the programmer. In addition to these interfaces, COM also provides a small set of services. Below is a description of the COM analogs to the CORBA services previously discussed.
1. Life-cycle: As discussed in Section 2.3.3, COM clients never directly access a COM object, but instead work with interface pointers. COM clients acquire interface pointers by instantiating a class factory, an object that implements the IClassFactory interface. A client may therefore first call CoCreateInstance() to get an interface pointer to a class factory; it next calls CreateInstance() on the class factory to get an interface pointer to an object of that class. Component developers are responsible for implementing the appropriate class factory and can use the factory to control instantiation issues such as singleton creation or resource pooling.
2. Naming: The IMoniker interface provides a means to name data associated with a COM object. Monikers are the primary method used to name Storage and Stream entities. The Windows service control manager, commonly referred to as the "SCM", enables clients to locate a COM component by name. It is also responsible for loading the binary module containing the COM component into memory and returning an interface reference to the calling client through CoCreateInstance().
3. Events: Event notification is provided by connectable objects, which allow clients to connect a "sink" to the connectable object, to disconnect the sink, and to enumerate the existing connections.
4. Transactions: COM and DCOM do not provide an API to manage transactions. MTS addresses this limitation.
5. Concurrency: COM provides support for two threading models. In the case of single-threaded components, all method calls on an object are made via the Windows message handler. This results in serialized method calls. In the case of free-threaded components, all calls are made via the RPC service. This requires the developer to implement synchronization to prevent corrupt data due to concurrent data access. Both threading models can be used through the Apartments construct.
Other important COM interfaces include:
6. Persistence/storage: Components implementing IStream are analogous to files, and contain data; those implementing IStorage are analogous to directories, and contain storages (subdirectories) and streams (files).
COM provides the implementation for basic types such as file streams and memory streams. In addition, COM specifies an API for creating, reading, and writing persistent objects.
7. Uniform data transfer: This interface facilitates data transfer between applications, such as the common "drag and drop" operation from one window to another.
3. Object Transaction Monitors

3.1 Introduction
Section 2 described how distributed component technologies, such as CORBA and DCOM:
• provide a programming model to developers that reduces the need to consider issues arising from heterogeneous languages and platforms and from distributed communication;
• provide a set of infrastructure services, such as naming, transactions, and concurrency, that free component developers from the need to implement their own (presumably less robust and less portable) services.

These technologies, however, do not provide a sufficient level of abstraction for easy development of enterprise components. For instance, a CORBA developer writing a component to transfer funds between two accounts must understand the object transaction service in enough detail to ensure that the transactional semantics implied by a funds transfer are implemented correctly using transactional objects and the transaction coordinator. A component developer typically prefers that the system infrastructure itself coordinate the interaction needed to provide services such as transactions and concurrency. Doing so allows the developer to focus on the details of the business logic involved in the funds transfer and on the appropriate transactional semantics. It also enables the component to be more portable since the component contains a greater percentage of business logic as compared to calls to system-specific services.

Object transaction monitors (OTMs) provide such coordination of infrastructure services, allowing that coordination to be removed from component logic and placed in the infrastructure middleware itself. OTMs do not necessarily replace CORBA and DCOM; in fact, OTMs are often developed on top of these technologies. Examples of CORBA-based OTMs include IBM Component Broker, Iona OrbixOTM, and Oracle Application Server. Although not always referred to
as such, Microsoft Transaction Server (MTS) can also be considered an OTM, albeit a DCOM-based one. See [12] for a comparison of various OTMs.

It is crucial to realize that the use of OTMs involves the following tradeoff. The OTM provides an abstraction layer around enterprise component services but, in exchange, the OTM imposes a specific programming model that must be followed in order to take advantage of its abstracted services. An OTM essentially offers a contract to the enterprise component developer: "I will play the following roles with respect to your component if your component behaves in prescribed ways throughout its life-cycle."

Enterprise JavaBeans is a further refinement of the OTM concept in that it imposes a standardized programming model on component developers, in contrast to a proprietary, OTM-specific programming model. Similarly, MTS, by virtue of Microsoft's dominant position on the Windows platform, imposes a de facto standard for that platform. These two standards are therefore especially important and are discussed in Section 4.

This section takes a closer look at one OTM, IBM's Component Broker, for two reasons. First, Component Broker was an important influence on the design of Enterprise JavaBeans. Second, as an alternative "data point" to Enterprise JavaBeans, it provides some insight about what makes OTMs different from, and more useful than, "raw" component infrastructures such as CORBA and DCOM.
3.2 An Example: Component Broker

Like many OTMs, Component Broker is designed such that the server-side infrastructure specifies interfaces for important component life-cycle events. These include component activation and deactivation, distributed transaction coordination, and management of persistent state. In this approach, components "plug in" to these interfaces by providing application-specific code that is invoked by the server when these life-cycle events occur. A developer can then rely on the OTM to provide the component with important services such as transactions and persistence. The tradeoff is that the component must completely cede control to the OTM in code related to the OTM's services. For example, the component can no longer initiate its own transactions, since doing so might negatively interact with a transaction that the OTM has begun on behalf of the component.

Component Broker builds on a CORBA ORB, borrowing elements from TP monitors such as Encina and CICS, so that managed components can "mix in" most of CORBA's object services. In addition, Component Broker relieves components of the need to manage their persistent state through
instance managers that map component state to a persistent datastore such as a relational database.
3.2.1 Component Broker Architecture

Figure 1 illustrates the main elements in the architecture of Component Broker. Component Broker consists of two object-oriented frameworks: the managed object framework (MOFW) and the instance manager framework (IMF), also known as the application adapter framework. A component in Component Broker is referred to as a managed object (MO), and it hooks into Component Broker by completing the MOFW. From a developer's viewpoint, an MO consists of a business object (BO) which contains the application's business logic (e.g., a "funds transfer" method). MOs implement the (IDL-specified) component interface, termed the BOI, that clients will use to interact with the component. A BO's persistent state is managed through the MOFW's data object (DO), an abstraction that encapsulates the mapping between the component's state and a specific datastore. The DO interface is specified in CORBA's IDL, and includes an attribute for each persistent data item used by the BO. Importantly, in the case of "well-structured" datastores, such as relational databases, tools can automate much of the task required to implement a DO simply by examining the DO interface.

FIG. 1. Main elements of Component Broker's architecture: an MO plugs into the MOFW, which is layered over the CORBA ORB and common object services (COS).

An MO's relationship to an instance manager (IM) is similar to an object's relationship to an object-oriented database. That is, an instance manager provides capabilities such as identity, caching, persistence, recoverability, concurrency, and security for its MOs. The IMF consists of interfaces that apply to "generic", datastore-independent, instance managers. For example, an IM tracks the total number of active objects, and can passivate objects so that the objects are removed from main memory and their state saved appropriately. The business object instance manager (BOIM) framework specializes the IMF by providing interfaces to specific datastores such as DB2 and CICS. Specific IM instances provide an MO with:
• Homes, which on a per-MO basis provide a way to create, locate, and reactivate MOs. A home corresponds to a set of instances of a particular class, all of which use the same persistent datastore.
• Containers, which interact with the ORB to resolve object references, interact with homes to reactivate managed objects, and define MO policies such as caching, locking, transaction modes, and concurrency control.
• Mixins, which provide an MO with hooks into services such as transactions and concurrency. Mixins cooperate with an MO to implement an interceptor pattern that is activated before and after a client invocation is actually delegated to a BO method. For example, if a BO's transaction policy requires that a method execute inside a transaction and the MO detects that the client has not started a transaction, the MO and the mixin have the opportunity to initiate the transaction on behalf of the BO.
• Configuration objects, which are provided to an MO's home by a container. Configuration objects implement a container's policies by determining the type of mixin that should be associated with an MO.

The Component Broker architecture thus explicitly assigns "roles" to an enterprise component. Business developers are responsible for a component's business logic (the BO portion of the MO). Instance managers are responsible for transparently providing persistence, transactions, security, and other services to the BO, and do so via the mixin object portion of the MO. Furthermore, the behavior of mixin objects is tailorable based on various policies, thus enhancing component reuse and portability.

As noted above, in exchange for the OTM's services, deployed components must acknowledge that the OTM is "in charge" of component life-cycle and system resources. This, in turn, allows the OTM to manage the system efficiently, and thus to provide better performance, for all its components. Specifically, Component Broker provides high availability and scalability through the notion of a server group: a cluster of servers that can run on one or more hosts. Scalability is provided by balancing the workload across servers in the group, and by adding more servers if needed. High availability is achieved since other servers in a group can assume the responsibilities of a server that has become unavailable.
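The interceptor role played by mixins can be sketched generically in Java. This is not Component Broker's actual MOFW API; the TransactionService and BusinessObject types below are hypothetical stand-ins, and the sketch only illustrates how a mixin can wrap a delegated business method in a transaction when the client has not started one.

    // Hypothetical stand-ins for the OTM's transaction service and for the
    // business object (BO) to which the managed object delegates.
    interface TransactionService {
        boolean inTransaction();
        void begin();
        void commit();
        void rollback();
    }

    interface BusinessObject {
        Object invoke(String method, Object[] args) throws Exception;
    }

    // A "mixin" that intercepts client invocations before delegating them
    // to the business object, starting a transaction on the BO's behalf
    // if the client has not already started one.
    class TransactionMixin {
        private final TransactionService tx;
        private final BusinessObject bo;

        TransactionMixin(TransactionService tx, BusinessObject bo) {
            this.tx = tx;
            this.bo = bo;
        }

        Object invoke(String method, Object[] args) throws Exception {
            boolean started = false;
            if (!tx.inTransaction()) {   // client did not begin a transaction
                tx.begin();
                started = true;
            }
            try {
                Object result = bo.invoke(method, args);
                if (started) tx.commit();
                return result;
            } catch (Exception e) {
                if (started) tx.rollback();
                throw e;
            }
        }
    }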
3.2.2 Programming Model: Developer's View

Components deployed to the Component Broker architecture benefit from the set of services that are transparently provided to the component, i.e., the component does not explicitly invoke or coordinate the services. As noted above, OTMs provide these benefits to component developers at the cost of requiring the component to strictly follow the OTM's programming model. Here we examine Component Broker's programming model.

An MO can be in one of three states. These states and the transitions that can occur among them are illustrated in Fig. 2. The Component Broker run-time drives the transitions of an MO among these states and provides the MO with the opportunity to execute application-specific logic by invoking specific methods on it. These methods--which are part of the MOFW architecture--are denoted by the labels on the state transitions in the diagram.
• initForCreation. This method provides a one-time opportunity for the MO to perform initialization after it has been created. In particular, a reference to the DO is passed to it as a parameter.
• uninitForDestruction. This method provides a one-time opportunity for the MO to perform uninitialization before it is destroyed.
• initForReactivation. This method provides an opportunity for the MO to perform initialization after it has been reactivated. In particular, a reference to the DO is passed to it as a parameter.
• uninitForPassivation. This method provides an opportunity for the MO to perform uninitialization before it is passivated.
• synchFromDataObject. This method optionally allows an MO to synchronize from its DO any data items it may be caching.
• synchToDataObject. This method optionally allows an MO to synchronize to its DO any data items it may be caching.
• business method. This is the implementation of any method that has been exposed to clients in the business object interface.

FIG. 2. Life-cycle of a Component Broker component (the state transitions are labeled with initForCreation, uninitForDestruction, initForReactivation, uninitForPassivation, synchFromDataObject, synchToDataObject, and the business methods).
3.2.3 Programming Model: Client's View

In addition to imposing requirements on the developer, Component Broker requires a component's clients to follow a client-side programming model. The client-side programming model documents how MO clients must interact with an MO via the MO's business object interface (BOI). As with CORBA, clients do not directly access an MO; instead they get a reference to an MO proxy and manipulate the MO via the proxy.
• Obviously, an MO must first exist in order for a client to invoke methods on the MO. Typically, a client creates an MO by first locating its home using the naming service and then invoking a create method on the home.
• If an MO already exists, a client can "find" it by first locating its home using the naming service and then invoking a find method on the home.
• Clients can invoke an MO's business methods after they access the MO (either through a create or a find operation).
• Clients delete a reference to an MO (i.e., the MO proxy) by invoking a "release" method. The MO is not itself deleted, and other clients may still hold valid references to the MO.
• Clients delete an MO along with its persistent data by invoking a "remove" method.

A client application can also:
• Use sets of objects. A home in Component Broker represents a set of MOs, all of the same type. A client can then manipulate collections of MOs by creating an iterator on the objects' home and then navigating over the objects in the home.
• Remember interesting and important objects. Component Broker supports CORBA's standard interoperable object references (IORs), which allow a client to refer to an MO regardless of where the MO is located. A client can convert an IOR into a string and store this converted object reference for future use.
• Insert an MO's persistent data directly into a persistent store. Although this is not the recommended way to create an MO, this is possible given that an MO can provide a wrapper for existing legacy data.
• Delete an MO's persistent data directly from a persistent store. Although this is not the recommended way to delete an MO, this is possible given that an MO can provide a wrapper for existing legacy data.
3.2.4 Developing a Component in Component Broker

Developing an enterprise component in the Component Broker architecture includes the following steps:
1. Define the business object interface. This is where a developer exposes business methods to clients.
2. Select a data object pattern. A developer can choose whether to cache the values of persistent data items from a DO in an MO's state. If this is the case then the MO must implement an interface that includes the optional MOFW data synchronization methods.
3. Implement the required MOFW methods and business methods. Whether or not an MO caches DO data items, it must implement an interface that includes the four initialization and uninitialization MOFW methods.
4. Optionally implement a specialized home. Component Broker provides a generic home implementation with methods to create and find an MO via its primary key. If a developer requires more specific create and find methods, or other methods such as destroy, then a specialization of the generic home can be defined and implemented.
5. Specify policies for the MO's container. To configure the behavior of the MO's container, a developer declaratively specifies policies that denote its application-specific needs. For instance, if the MO's business methods must always be invoked in the context of a transaction (regardless of whether a client has initiated one), then a policy of "begin transaction" can be specified.
4. Enterprise JavaBeans and Microsoft Transaction Server
The contract between an OTM and its components specifies the set of services that the OTM provides to its components via a given API. In
exchange, the components (1) must follow a given programming model (thus allowing the OTM to assume control in a well-architected manner) and (2) agree not to "meddle" in the OTM's domain. As discussed in Section 3, the notion of an OTM-component contract increases component portability and reuse. From this perspective OTMs are an improvement compared to CORBA and DCOM. However, because individual OTMs were not designed to a common architecture and programming model, the portability goal for components is not yet met. In contrast to CORBA (in which developers can develop a component on one ORB and deploy it on another vendor's ORB), developers cannot take a component developed with one OTM's programming model and deploy it on some other OTM.

Therefore, one natural progression for OTM technology is to define a standard contract and programming model between components and containers such that developers can deploy their components on any OTM that abides by that standard. OTM standards have been defined in the Java and CORBA domains and, albeit debatably, in the DCOM domain as well. These standards are the Enterprise JavaBeans (EJB) specification, the CORBA Component Model (CCM) specification, and the Microsoft Transaction Server (MTS), respectively. As a CORBA-based OTM standard, the CCM (Section 7.3) enables the development of components that are language independent, platform independent, and ORB independent, as well as OTM independent. This chapter focuses on EJB and MTS because the CCM has been adopted only very recently and commercial implementations do not yet exist.

4.1 Microsoft Transaction Server
Microsoft Transaction Server [3, 4, 13] addresses several deficiencies of the COM/DCOM technologies. These include the installation and administration of components and the ability of the infrastructure to scale with greater numbers of clients and components. For example, the COM developer has to ensure that dll and uuid information is correctly inserted into the registry; the DCOM developer has to ensure that consistent versions of a component's interface are installed on the client and server. However, from the perspective of the value provided by an OTM, the greatest weakness of COM and DCOM is that transactional and security services are not integrated into the programming model. The component developer can directly issue calls to ODBC (for transactions) and system APIs (for security), but no infrastructure exists that provides these services to the component developer.

MTS provides COM and DCOM components with precisely this sort of OTM function that had previously been missing. MTS uses the distributed
transaction coordinator service as the coordinator for distributed transactions, and automatically flows the transaction context "on the wire" when invoking a component's method. It also adds the notion of role-based access control in the area of security. Moreover, the component developer can specify the desired transaction type and security roles declaratively, rather than specifying them programmatically. Support for transactional and security services is provided by having the MTS infrastructure interpose an "interceptor" object between a client and a component. The interceptor object serves as the hook for MTS to apply system function before and after invoking the client's method on the actual component. The MTS catalog supplements the DCOM registry, and is used by MTS to ease the administration tasks required to develop and deploy components. Multiple components are bundled as a package, a unit of code that runs in a single address space. A library package runs in the same address space as the client; a server package runs in the server's address space.

Perhaps because Microsoft completely "owns" the COM/DCOM standard, it could preserve the existing component model while adding features to MTS in an incremental manner. This may explain why the MTS features listed in Table I are almost unchanged from a COM/DCOM
feature list. MTS, for example, did not add support for OTM management of a component's persistent state (e.g., EJB's entity bean) or for components to maintain a client's session state (e.g., EJB's stateful session bean). In contrast, the EJB OTM is a standard developed by a committee that had to bridge considerable differences between existing OTMs (e.g., IBM and Oracle) and different component models (e.g., Java RMI and CORBA). As a result, EJB introduced both a new component model and an OTM.

TABLE I
DISTRIBUTED ENTERPRISE COMPONENTS AND THE MICROSOFT TRANSACTION SERVER
(Each row lists an architecture issue, the background section, and how the MTS architecture addresses it.)
• Interface definition (2.2.1): COM (DCE) interface definition language.
• Identity (2.2.2 & 2.3.3): COM components do not have identity.
• Persistence model (2.2.2 & 2.3.3): Two-level store; the component developer must explicitly invoke "load" and "store" methods.
• Inheritance (2.2.3): Single interface inheritance.
• Transport (2.3.1 & 2.3.4): DCOM RPC in both the "heavy-weight" (remote server) and "light-weight" (local server) varieties.
• Distribution transparency (2.3.4): A client sees no difference between a local and a remote component.
• Services provided to component (2.4): MTS and the DCOM infrastructure provide services that include naming, transactions, concurrency, life-cycle, and persistence.
4.2 Enterprise JavaBeans

4.2.1 Overview
EJB specifies a contract and programming model for developing components that are written in Java and that are platform independent, ORB independent, as well as being independent of OTM implementation. EJB technology wraps many existing technologies through other Java APIs, e.g., the transaction service (JTS) wraps CORBA's OTS; persistence (JDBC) wraps relational databases such as DB2, Oracle, and SQL Server; and naming (JNDI) wraps technologies such as CosNaming, LDAP, and DNS. In addition to the Enterprise JavaBeans specification itself [1], reference [2] provides a good presentation of EJB. Here, we examine how the EJB concepts relate to the broader perspective of component technology described in this chapter.

One way that EJB enables standardization is through emphasizing the importance, and the number, of roles for the process of building distributed applications. EJB defines six architectural roles: EJB provider, application assembler, deployer, container provider, server provider, and system administrator. An EJB provider creates an EJB which contains the business logic of an enterprise component, and application assemblers compose EJBs into applications or larger deployable units. Container providers build EJB containers out of the basic infrastructure services supplied by server providers. A deployer is responsible for installing a packaged EJB unit into a specific container; system administrators oversee the configuration and run-time well-being of an EJB container and its deployed EJBs. The EJB and container providers are thus the most relevant roles with respect to the issues discussed in this chapter.

EJB separates component life-cycle (controlled mostly by a home that specializes EJBHome) from business logic (specified in the remote interface, which specializes EJBObject). It also separates the interface seen by the component's clients (the remote interface) from the component's implementation (a specialization of EnterpriseBean). In addition, an EJB unit contains a deployment descriptor. These elements are analogous to Component
Broker's business object interface, business object implementation, specialized home interface, and container policy specification (see Section 3.2).

EJB further categorizes an enterprise component as either an EntityBean or a SessionBean. An EntityBean represents a persistent object that is shareable, transactionally recoverable, and stored as a distinguishable entity in a persistent store (e.g., an account in a banking system). In this sense, an EntityBean is most similar to a Component Broker managed object. A SessionBean represents a componentized business process that is nonpersistent and transaction-aware but not transactionally recoverable (e.g., a funds transfer process). SessionBeans can maintain client state across client method invocations (stateful SessionBeans) or maintain no client state (stateless SessionBeans). Importantly, although an EJB home interface specifies application-specific methods to create and find EJBs, the bean provider is not responsible for implementing the home interface. Rather, deployment tools, guided by a deployer, are responsible for generating the home implementations.

A deployment descriptor specifies structural and behavioral information about an EJB. Structural information includes the class names of its remote interface, implementation class, and home interface, as well as the name of its home in the name space available for clients to locate homes. Behavioral information includes declarations of required transactional policies, choice of persistence management if the EJB is an EntityBean, and whether a SessionBean is stateful or stateless.

EJB containers provide EJB components with an object transaction monitor (OTM) (Section 3). As such, containers provide services such as persistence, transactions, memory management, and security. An EntityBean component can entirely delegate management of its persistent state to the container (container-managed persistence): in this case, the component need only specify its set of persistent attributes, and tools such as object/relational mappers generate the appropriate code. Alternatively, the component can manage its own persistence by interacting with a persistent store to save and retrieve its persistent attributes. This is typically done through a standard application programming interface such as JDBC, with the container providing services such as a database connection pool.

An EJB deployment descriptor includes a declarative specification of the transactional policy under which the component should run. For instance, an EJB developer can require the container to begin a transaction on a per-method basis if the client's request is not already executing inside a transaction: this is done by specifying the TX_REQUIRED policy. The EJB container implements transactional behavior through coordination of an underlying transactional service.

As with other OTMs, an EJB container can improve component performance through the use of connection, thread, and other resource
pooling. The container can thus provide clients with the illusion that a large number of EJB instances are available for execution, even though the container is actually managing a smaller working set of active instances. To conserve server resources, the container can passivate an EJB instance that has been idle for some time; when a client issues a request to the instance, the container transparently reactivates the instance. The ability to manage components in this way is achieved by requiring (as in the Component Broker MOFW) components to implement an interface that includes methods to be called by the container during passivation and activation. These methods give the component the opportunity to perform application-specific behavior at these points in its life-cycle.

The security service provided by an EJB container to its components is specified through access control roles and policies that are defined at application assembly and deployment time. A component developer is thus insulated from security-related issues, and focuses on the component's business logic. If components require a more dynamic access control environment, they can obtain principal-related information (such as identity or role conformance) through an architected interface into the container.
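As a sketch of these pieces for a hypothetical Account entity bean (the names are invented, and the fragment follows the EJB 1.x style of APIs discussed here), the remote interface extends EJBObject and the home interface extends EJBHome; the deployment descriptor, not shown, would declare policies such as TX_REQUIRED.

    import java.rmi.RemoteException;
    import javax.ejb.CreateException;
    import javax.ejb.EJBHome;
    import javax.ejb.EJBObject;
    import javax.ejb.FinderException;

    // Remote interface: the client's view of the component's business logic.
    interface Account extends EJBObject {
        void credit(long cents) throws RemoteException;
        void debit(long cents) throws RemoteException;
        long getBalance() throws RemoteException;
    }

    // Home interface: life-cycle operations. The container, not the bean
    // provider, supplies the implementation at deployment time.
    interface AccountHome extends EJBHome {
        Account create(String accountId) throws CreateException, RemoteException;
        Account findByPrimaryKey(String accountId) throws FinderException, RemoteException;
    }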
4.2.2 Programming Model

The client-side EJB programming model closely resembles that of Component Broker (Section 3.2.3). Most life-cycle operations on an EJB are performed via the bean's home; clients invoke methods on an EJB once the EJB exists and the client has obtained a reference to it. Even session beans, which have no persistence-related life-cycle and have no counterpart in Component Broker, use this programming model; persistence-related tasks such as finding a component, and directly inserting and deleting the component, are simply omitted from its life-cycle.

The developer's view of the EJB programming model is (as it should be) more complicated than the client-side view. The three types of components are quite different; therefore, stateless session beans, stateful session beans, and entity beans have different life-cycle specifications. EJB containers are responsible for driving transitions from one EJB state to another. Table II shows that the EJB component-container contract is typical of other OTMs such as Component Broker (Section 3.2); the basic design pattern is for the component to provide callback methods that allow the OTM (aka container) to drive the component from one well-defined state to another at important life-cycle junctures. These methods are specified in the SessionBean and EntityBean interfaces, one of which must be implemented by an EJB.

TABLE II
METHODS IN THE EJB COMPONENT-CONTAINER CONTRACT AND THEIR MOFW ANALOGS
(Each row lists an EntityBean or SessionBean method, the corresponding MOFW method, and comments.)
• ejbCreate / initForCreation: ejbCreate is invoked when the EJB instance does not yet have an identity; the ejbPostCreate method more directly parallels initForCreation.
• ejbRemove / uninitForDestruction: a direct counterpart of the MOFW method.
• ejbActivate / initForReactivation: a direct counterpart of the MOFW method.
• ejbPassivate / uninitForPassivation: a direct counterpart of the MOFW method.
• ejbLoad / synchFromDataObject and ejbStore / synchToDataObject: these EJB methods are always required of EntityBeans, but the implementation depends on whether or not the EJB uses container-managed persistence. SessionBeans do not specify these methods.

The EJB component-container contract includes two other interfaces: EJBContext and SessionSynchronization. EJBContext provides an EJB component with an interface into its container so that, for example, the component can get security-related information from its container. An EJBContext is specialized for use with EntityBeans and SessionBeans, and EJBs provide a set method that enables the container to provide the bean with its EJBContext. The SessionSynchronization interface is optionally implemented by a stateful SessionBean; it enables the container to notify the bean after it starts a transaction, before the transaction completes, and after the transaction completes. Again, this design results in a well-structured, application-independent set of interactions between a container and its components.
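Continuing the hypothetical Account example, a bean class implementing the EntityBean callbacks might look roughly as follows; the persistence logic is elided and would normally use JDBC under bean-managed persistence, or be omitted entirely under container-managed persistence.

    import javax.ejb.EntityBean;
    import javax.ejb.EntityContext;

    // Bean class for the hypothetical Account component; the container
    // invokes the EntityBean callbacks at the life-cycle points in Table II.
    public class AccountBean implements EntityBean {
        private EntityContext ctx;
        private String accountId;   // primary key
        private long balanceCents;  // persistent state

        // Business methods exposed through the remote interface.
        public void credit(long cents) { balanceCents += cents; }
        public void debit(long cents)  { balanceCents -= cents; }
        public long getBalance()       { return balanceCents; }

        // Life-cycle callbacks (analogous to the MOFW methods).
        public String ejbCreate(String accountId) {
            this.accountId = accountId;
            this.balanceCents = 0;
            // with bean-managed persistence, insert a database row here (elided)
            return accountId;        // the primary key
        }
        public void ejbPostCreate(String accountId) { }
        public void ejbRemove()    { /* delete the underlying row (elided) */ }
        public void ejbActivate()  { accountId = (String) ctx.getPrimaryKey(); }
        public void ejbPassivate() { }
        public void ejbLoad()      { /* read balanceCents for accountId (elided) */ }
        public void ejbStore()     { /* write balanceCents for accountId (elided) */ }

        public void setEntityContext(EntityContext ctx) { this.ctx = ctx; }
        public void unsetEntityContext()                { this.ctx = null; }
    }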
4.2.3 Perspective

Table III examines the Enterprise JavaBeans architecture from the perspective of the issues that we have previously (Section 2) identified as being important for distributed, enterprise component architectures.

TABLE III
DISTRIBUTED ENTERPRISE COMPONENTS AND THE ENTERPRISE JAVABEANS ARCHITECTURE
(Each row lists an architecture issue, the background section, and how the Enterprise JavaBeans architecture addresses it.)
• Interface definition (2.2.1): An EJB's home and remote interfaces are defined as Java RMI interfaces. In addition, a standard mapping to CORBA is defined that specifies how these interfaces are mapped to the CORBA IDL.
• Identity (2.2.2 & 2.3.3): Within a home's scope, defined by the bean's primary key.
• Persistence model (2.2.2 & 2.3.3): An EJB's client is offered a single-level store view: she does not invoke "load" or "store" methods on the component. EJBs are thus an improvement on client access via JDBC. An EJB developer, because of the life-cycle ejbLoad and ejbStore methods, is offered a two-level store view. However, in contrast to MTS, these methods are cleanly factored out from the rest of the component. Container-managed persistence implementations can often be unaware of "load" and "store" methods as the container assumes responsibility for persistence management.
• Inheritance (2.2.3): The remote interface can inherit from multiple interfaces (ultimately, from Remote). The bean class, like any Java class, can inherit from multiple interfaces and extend a single implementation class. The problem with EJB inheritance is that the base home scopes only base-class EJBs rather than scoping subclassed EJBs as well.
• Transport (2.3.1): Java RMI remote semantics over JRMP or IIOP. Given the prevalence of the IIOP protocol as a form of interoperation across ORB implementations, EJB containers can use IIOP to interoperate with non-Java clients or even with other EJB containers.
• Distribution transparency (2.3.4): An EJB's home and remote interfaces are defined as Java RMI interfaces. A method's parameters and return values must implement either java.io.Serializable or java.rmi.Remote.
• Services provided to component (2.4): An EJB container provides the following services to its components: naming, transactions, concurrency, life-cycle, and persistence.
4.3 Comparison of the Component Models
As discussed above, EJB provides three types of components: entity, stateful session, and stateless session beans. MTS provides one type of component which is similar to a stateless session bean.
Entity beans are useful for representing persistent identifiable data, such as bank accounts and employee records, and for manipulating that data transactionally using a single-level store model. The EJB run-time provides for (mostly) automatic data movement between the in-memory component and the database at transaction boundaries. MTS attempts to provide similar function with stateless components. This puts more of a burden on the component developer and the client programmer. The component developer needs to code to a two-level store model, with explicit data movement (load and store) code embedded in the business logic. The component developer and client must also be cognizant of the fact that the component's state (including any identity maintained by the business logic) will disappear after each method invocation. Thus the component developer must ensure that modified data is flushed to the database before calling SetComplete at the end of each method, and the client programmer must be aware that, even though he retains a pointer to the component interface, the component has no memory of previous method calls or internal identity. (We further contrast the EJB and MTS approaches in a sample application presented in Section 6.) One approach for dealing with these issues is for the client to pass the component's identity on each method call [13]. However, this seems a lot like RPC rather than components! Other MTS programming approaches exist, such as calling EnableCommit rather than SetAbort.

Stateful session beans are useful for representing transient nontransactional state in a business process, e.g., the contents of a shopping cart in an e-commerce application. The EJB run-time handles resource management for stateful session beans by transparently swapping components to and from persistent storage as needed. Although MTS components cannot directly provide this function, MTS provides the shared property manager (SPM) resource dispenser, which components can use to store transient state. As in the case of entity-bean-like function, the MTS component remains stateless and must be passed the SPM data identity (e.g., Program Group name).

Finally, stateless session beans are useful for packaging business processes which do not need to maintain state across method invocations. Both MTS and EJBs provide very similar function.
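The practical difference in how identity is handled can be seen from the shape of the interfaces involved. The following Java sketch is purely illustrative (neither interface is actual MTS or EJB code): with an entity-bean-style component the reference itself denotes a particular account, whereas with a stateless, MTS-style component the client must supply the account's identity on every call.

    // Entity-bean style: the component reference denotes one particular
    // account; identity and state survive across method calls.
    interface AccountComponent {
        void credit(long cents);
        long getBalance();
    }

    // Stateless, MTS-style component: no identity is retained between
    // calls, so the client passes the account's key on every invocation,
    // and each method loads and flushes state within its own transaction.
    interface AccountManager {
        void credit(String accountId, long cents);
        long getBalance(String accountId);
    }

Under the first interface the run-time maps the reference to persistent state (the single-level store view); under the second, each method implementation contains the explicit load and store logic described above.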
5. Parallel Evolution
This chapter's focus is on the evolution of component technology, and the discussion in Sections 2-4 may therefore have given the impression that components and object transaction monitors have evolved in a software technology "vacuum." This impression is especially misleading in the areas
of persistence, transactions, and concurrency. We will show in this section that, in these areas, component technology in fact applied developments produced by other software technologies. The reader who is primarily interested in Enterprise JavaBeans and Microsoft Transaction Server component technology may wish to skip to Section 6.

In this section we take a step backward from the main discussion in order to get a broader perspective on some of the key technologies used in enterprise components. One observation that may be drawn from this broader perspective is that, like many developments in software, "nothing is new under the sun" and that the important contribution of enterprise components relates to the packaging of disparate technologies.

5.1 Evolution of Data Access Abstraction
Data access abstraction technology such as database servers originally focused on providing abstract access to data, but later proceeded to add services involving:
1. distribution (access to distributed data)
2. heterogeneity (access to data distributed on heterogeneous systems)
3. procedural logic (allowing code to be interleaved with data access)
4. object encapsulation (packaging the data/logic chunks as objects).

At the end of this process (see Fig. 3), the services provided by these systems greatly overlap the services provided by enterprise components deployed in an object transaction monitor.
5.1.1 Data Access Abstraction

Before Codd proposed the notion of data independence in the 1970s, relationships between data elements were represented by pointers such that users had to explicitly navigate through data networks or hierarchies. Data independence, in contrast, states that data relationships should be represented explicitly (through data values) so that database queries can be expressed in a manner independent of the programs and data structures used by a specific database. Codd showed that data independence can be achieved if data is stored in the form of relations that are queried either through predicate calculus or through a set of relational operators [14]. IBM's System R [15] and the Ingres [16] prototypes showed that large-scale databases, with good performance, could be built to support the relational model, queries, and transaction processing. System R chose to develop a new relational database language called SQL [17] rather than use Codd's
relational algebra or calculus. SQL has since been implemented by almost all relational database vendors, adopted as ANSI and ISO standards, and continues to incorporate new features [18]. Note that at its inception SQL was designed as an end-user query language with the property that users were shielded from hardware and software issues through a data abstraction layer.
5.1.2 Adding Procedural Logic

The limitations of SQL as only a query language became apparent when users tried to combine procedural logic with abstracted data access, e.g., to generate a report displaying average department salaries. Data (the stored department salaries) now had to be combined with logic that calculated the average salary and generated a report. Embedded SQL (first standardized in ISO SQL-89) and the SQL call level interface (first introduced as SAG CLI in 1988) are two approaches that enable applications to integrate SQL calls with procedural capabilities. Embedded SQL simply embeds SQL statements in a traditional programming language such as COBOL, FORTRAN, or C. These statements are run through a precompiler that generates database-specific code on the client side, and binds SQL statements to the database. The database-specific code is invoked at run-time. In contrast, CLI does not require a precompiler to convert SQL statements into code, nor does it require database-specific recompilation. Instead, through the techniques of dynamic SQL and by defining a common SQL API, SQL calls are created and executed at run-time. Two of the most famous CLIs are extensions of this concept: ODBC (released by Microsoft in the early 1990s) and X/Open CLI (released in 1992) [19].

While embedded SQL and CLI addressed the problem of integrating procedural logic with database access, both approaches required a client/server round trip for every database call. Thus, for example, if a report generation program is coded in a loop that first accesses an employee record and then prints it, a report about N employees requires N round trips between the client and the database server. Stored procedures were introduced to solve this performance problem. A stored procedure is a bundle of SQL statements and procedural logic that is stored on a database server. The bundle specifies an input interface, and is named so that clients can conveniently access the stored procedure and pass it parameters. The database server executes the stored procedure (e.g., accesses all N employee records), and passes the result back to the client. Because the client and server interact only once, performance is usually better than either embedded SQL or CLI. Although database vendors offered proprietary approaches to stored procedures, these were later standardized [18].
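The CLI approach survives in Java as JDBC, which the chapter returns to below. As a hedged illustration (the connection URL, table, column, and procedure names are all invented), the fragment below issues a dynamic SQL query and then calls a stored procedure through the standard java.sql API.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class CliStyleAccess {
        public static void main(String[] args) throws Exception {
            // The JDBC URL and credentials are placeholders.
            Connection con = DriverManager.getConnection(
                    "jdbc:some_driver:exampleDB", "user", "password");

            // Dynamic SQL, CLI style: the statement is created and executed
            // at run-time, one client/server round trip per call.
            PreparedStatement ps = con.prepareStatement(
                    "SELECT AVG(SALARY) FROM EMPLOYEE WHERE DEPT = ?");
            ps.setString(1, "K55");
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                System.out.println("average salary: " + rs.getDouble(1));
            }
            rs.close();
            ps.close();

            // Stored-procedure style: the SQL and logic are bundled on the
            // server, so the client makes a single round trip.
            CallableStatement cs = con.prepareCall("{call DEPT_SALARY_REPORT(?)}");
            cs.setString(1, "K55");
            cs.execute();
            cs.close();

            con.close();
        }
    }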
We make the following observations at this point:
1. Components borrowed persistence technology from the database area. COM and DCOM use ODBC to access relational databases; JDBC (Java database connectivity) similarly provides a database-independent way for Enterprise JavaBeans to access relational data and can, in fact, be used on top of ODBC.
2. The difference between clients accessing persistent components and clients accessing stored procedures lies mostly in the object-oriented flavor of components. Object-oriented databases [9, 20, 21] are database systems that allow clients to store, retrieve, and manipulate objects. Such database systems blur the difference even more.
3. The key difference between objects stored in a database and persistent, transactional components has to do with the degree to which the object or component packaging facilitates reuse (Section 2.1).
5.1.3 Distribution and Heterogeneity

It is worth noting that data access abstraction technologies must also deal with distribution and heterogeneity issues. Although these issues obviously pose difficulties for transactions (Section 5.1.4), distributed and heterogeneous data access is not trivial. For CLI to work, all database servers must have a driver that accepts a CLI call and translates the call to the appropriate set of network/server communication calls and server-specific access methods. Since such database drivers are often not available, CLI also uses "pass through" techniques that "standardize" escape clauses from the common API. Either way, portability and reuse are reduced. ISO remote data access (RDA) was therefore introduced to standardize a common format and protocols for client/database server communication. In this approach client requests are first converted to a canonical format,
transmitted to the database server, and finally converted from the canonical format to the database-specific format and protocol. IBM's DRDA (Distributed Relational Database Architecture) takes a somewhat different approach: rather than relying on canonical formats, the server is responsible for adapting to the client. DRDA goes further than RDA in enabling the construction of "federated databases"--i.e., clients can issue a single command that initiates activity on multiple, heterogeneous sites. For example, updates within a single transaction can occur on multiple sites (Section 5.1.4) and multisite "joins" allow a query to return a result constructed from tables stored on multiple sites.

FIG. 3. Evolution of data access abstraction function: network and hierarchical data access (1960s); relational SQL queries (1970s); embedded SQL, the call level interface (CLI), and stored procedures (1980s); object-oriented databases (1990s).
5.1.4 Transactions

From the perspective of databases (as opposed, for example, to business processes), a transaction can be viewed as a set of data access operations that are guaranteed not to corrupt a database, despite concurrent user access and the occurrence of system failures. Providing the well-known ACID transaction properties [22] has therefore long been an important part of data access abstraction. In the same way that users require an abstraction layer to mask the lower-level systems implementing persistent data access, users require an abstraction layer to mask the complexities of transaction technology. In this subsection we briefly review the evolution of transaction technology from local to distributed to heterogeneous systems.

Transaction isolation is implemented through a concurrency control mechanism, of which two-phase locking [22] is the most widely used. Lock-based protocols use shared locks to permit multiple read (non-state-changing) operations to execute concurrently; exclusive locks ensure that write operations execute in isolation. By using shared and exclusive locks, the system can easily detect whether operations from multiple transactions do or do not conflict, and schedule transaction execution accordingly. Although first applied to a single database resource, this transaction model was extended to multiple, distributed database resources. A major complication faced by distributed transactions is the issue of distributed commitment: how does a transaction know that all participating resources will successfully commit? The two-phase commit [22] protocol is the most widely used solution for distributed commitment. Combining protocols such as two-phase locking and two-phase commit enables transactions to span both local and distributed database resources. Database system heterogeneity (chiefly because of different database vendors, but also due to different data models) greatly complicates the task of enabling a single transaction to access multiple database resources.
The X/Open Distributed Transaction Processing (DTP) model (Version 2) was introduced in 1993 to provide the "standardization" glue needed to enable transactions across heterogeneous systems. Relying on the OSI-TP specification, which provides a standard definition of the two-phase commit protocol, DTP defines a set of XA APIs to which participating systems map their proprietary two-phase commit protocols. The model defines four components: resource managers that manage shared resources; transaction managers that coordinate and control the resource managers; communication managers that control communication between distributed components; and application programs, clients that use the APIs to begin, commit, and abort transactions. Transaction technology and data access abstraction technology are thus complementary. Application programs use SQL (evolved and standardized as described in Section 5.1.1) to communicate with resource managers; X/Open TX defines how application programs talk to transaction managers, and X/Open XA defines how transaction managers talk to resource managers.

Enterprise components borrowed transaction services from database technology in the same way that they borrowed persistence services. Thus, CORBA's object transaction service is an "object rendering" of the X/Open DTP model. OTS contains many one-to-one mappings between X/Open XA and TX interfaces and CORBA interfaces such as Current and Resource. This is no accident, of course: OTS was designed to be implemented on top of existing X/Open systems.
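The two-phase commit protocol at the heart of these standards can be sketched generically in Java. This is an illustration of the protocol itself, not of the X/Open XA or CORBA OTS interfaces; the Participant type is a hypothetical stand-in for an enlisted resource manager.

    import java.util.List;

    // Hypothetical participant: a resource manager enlisted in the transaction.
    interface Participant {
        boolean prepare();   // phase 1: vote yes (able to commit) or no
        void commit();       // phase 2: make the changes durable
        void rollback();     // phase 2: undo the changes
    }

    class TwoPhaseCommitCoordinator {
        // Returns true if the transaction committed, false if it rolled back.
        boolean complete(List<Participant> participants) {
            // Phase 1: collect votes from every participant.
            for (Participant p : participants) {
                if (!p.prepare()) {
                    // Any "no" vote aborts the transaction (simplified: a real
                    // coordinator tracks which participants actually prepared).
                    for (Participant q : participants) {
                        q.rollback();
                    }
                    return false;
                }
            }
            // Phase 2: all voted yes, so tell every participant to commit.
            for (Participant p : participants) {
                p.commit();
            }
            return true;
        }
    }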
5.2 Evolution of Packaged Business Logic
People working in the areas of enterprise components and data access technologies are typically aware of the close relationship between these two areas. We next discuss the evolution of function in an area that is somewhat less known than these other areas, and that we term packaged business logic. Some people are surprised to learn that many of the issues related to software reuse--including packaging, distribution, persistence, and transactions--were addressed by the area of packaged business logic as early as the 1960s and 1970s. In contrast to "data access" technology, which was originally motivated by a focus on abstract, program-independent data access, "packaged business logic" was originally motivated by the need for code to be invoked and executed as a single unit. Although such function originated as simple stand-alone programs, packaged business logic evolved to include function that greatly overlaps with data access and enterprise components.
Software developers realized fairly early that, in order to effectively reuse their programs, they needed to conceive code (i.e., business logic) not as stand-alone programs but rather as logical units of work. The difference between these viewpoints is, to a degree, only a question of how the code is packaged. Increasingly, however, as code was packaged as units of work, companies required new administrative function so as to provide management and security for entire suites of such units of work. From the development side as well, in order to easily (and portably) deploy units of work, programmers required the underlying system to provide transaction and data access services.

The evolution of IBM's IMS (Information Management System) [23] and CICS (Customer Information Control System) [24] shows how the motivation to package units of work quickly led to systems that provided sophisticated transaction and data access services. Because we intend only to show the early development of services for enterprise software, our discussion of these systems relates to their status in the 1970s and 1980s. Obviously, the versions of IMS and CICS that are available in 2000 are far more powerful than the systems discussed here. With respect to the point of this section--that the requirements of logical units of work led to early development of data access, communication, and transaction services--IMS and CICS are equally valid examples. IMS was introduced in the mid-1960s, slightly before CICS. From the perspective of this chapter, IMS differed from CICS in that applications accessed IMS's hierarchical DL/I data structures and services. Many transaction concepts such as program isolation and cursor stability, algorithms such as two-phase commit, and techniques such as write-ahead and compensation logging, were introduced by IMS by the mid-1970s. To keep the discussion short, we shall refer only to CICS, although the development of IMS serves as another, more-or-less concurrent, example.

Before the development of CICS, batch processing represented the "state of the art" in program execution. The notions of client interaction with the host via terminals or of transactional execution of a program did not exist, and program execution involved little data communication. CICS was introduced in 1968 as a general-purpose transaction management system in which clients could interact with the system from terminals. CICS was explicitly conceived as providing both database and data communication middleware for transactions. This consisted of a proprietary but standardized API, with data access through ISAM and BDAM files. Transaction programs were written in assembler language, and CICS ran only on OS/360. However foreign or primitive such function may appear with respect to current technologies, it is crucial to note that all the elements of "object
transaction monitors" were already present in 1968--except for the notion of an "object". CICS provided:
• a multitasking program so that multiple transactions could execute concurrently;
• storage (main memory within its partition or region) services;
• an automatic program loader;
• file control function so that the transaction could execute regardless of specific media or environment.
With these elements in place, CICS evolved by supporting greater numbers of concurrent clients and larger sets of operating systems, datastores, and communication protocols. Throughout this evolution programmers were provided with a consistent API to a consistent set of services regardless of the application language or host platform. Support for transactions written in COBOL and PL/I was added in 1970 by using a preprocessor to convert CICS API calls to high-level language calls. The notion of function shipping was added in 1978 to enable transparent read and write requests, regardless of the data's physical location and data format. Transaction routing was added in 1979: this allowed transactions submitted at one CICS system to be executed at another system.

The evolution of CICS's packaged business logic function intersected the evolution of data access function when, in the 1980s, CICS enabled transactions to access relational data. The advanced program to program communication (APPC) function enabled heterogeneous operating systems to communicate with CICS. APPC was one of the principal forerunners of the OSI-TP transaction commit protocol (Section 5.1.4). We see then that, by the 1980s, CICS was dealing with the distribution and heterogeneity issues that relational databases encountered as they evolved from providing persistence and data access services to also providing distribution and transaction services. Because CICS provided these services inside a single operating system process, it increasingly resembled an operating system itself. CICS therefore provided TP monitor services such as scheduling, load balancing, transaction routing, and restart on failure. In order to deal with heterogeneous datastores in a consistent manner, CICS introduced the resource manager concept, and this structuring concept led directly to X/Open DTP (Section 5.1.4). (In contrast, IMS uses the services provided by the host operating system, and is therefore less portable--but more flexible--than CICS.)

The stored procedures discussed in Section 5.1.2 can thus be viewed as packaged units of work meets distribution meets database access. The boundaries between the different technologies, in other words, had become
so blurred that it would be almost impossible to properly "assign credit" to a single technology. All of the services provided by enterprise components (persistence, distribution, heterogeneity, and transactions) existed in one form or another prior to, and independently of, components per se. Thus, as objects and components became an increasingly popular way to develop software, it was almost inevitable that components would incorporate such enterprise services: first as individual components, and then in the context of an object transaction monitor (Section 3).
6. Sample Application
In contrast to the top-down and somewhat abstract discussion of the previous sections, this section examines Enterprise JavaBeans and Microsoft Transaction Server from the perspective of a developer building an application. Although the sample is considerably simpler than an actual application, we believe that it illustrates issues that apply to larger and more sophisticated applications.
6.1 Introduction
Our sample is a banking application in which customers interact with a bank to either make payments to other bank customers or to transfer funds between their own accounts. From the customer's or the bank's perspective the chief requirement is that these interactions be transactional: the payments or transfers should occur exactly once despite system or network failures. Both Enterprise JavaBeans and Microsoft Transaction Server promise the application developer that she can (mostly) ignore the transactional aspect of the application and focus on developing the required components.

The developer must analyze the bank's requirements to determine the application's components. In general, this analysis phase is nontrivial; our application is simple enough that we arbitrarily, and without loss of generality, identify the following three persistent components:
• an Account component corresponding to a customer's bank account;
• a Category component corresponding to a customer's budget category for a given payment;
• a Transaction component corresponding to the history of a single interaction between a customer and the bank.
Each of these components is backed by persistent, relational, storage. Thus, for example, the persistent state of an Account component will reside in a row in an AccountTable.
As part of the analysis phase, the application developer next determines the detailed interface to each of these components: for example, Account.getAccountBalance() and Transaction.setAccountNumber(). It is important to note that at this point there is little, if any, real difference between developing the application with EJBs or with MTS. Of course, the tools and wizards will differ: after all, they differ between one EJB vendor and another! But, at the modeling and conceptual level, the application developer must similarly identify the persistent components and their interfaces whether she's using EJB or MTS. With the component interfaces specified, the developer must begin to implement the application. We show in the following discussion that nontrivial differences exist between an EJB-based implementation and an MTS-based implementation. The code samples shown below are extracted from working applications. The EJB version was done using IBM's WebSphere Advanced Edition (Version 3) running on Windows NT, using DB2 as the relational datastore. The MTS version was done using MTS Version 1 running on Windows NT using SQLServer Version 7 as the relational datastore. Naturally, the EJB version uses Java as the implementation language; the MTS implementation uses C++.
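For concreteness, the sketch below shows what such an interface might look like on the EJB side. Only getAccountBalance(), credit(), and debit() are named in the chapter; the parameter and return types, and the use of double for money, are assumptions made for illustration. The MTS counterpart would be an analogous COM interface defined in IDL.

    import java.rmi.RemoteException;
    import javax.ejb.EJBObject;

    // Illustrative remote interface for the Account component.
    public interface Account extends EJBObject {
        double getAccountBalance() throws RemoteException;
        void credit(double amount) throws RemoteException;
        void debit(double amount) throws RemoteException;
    }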
6.2 Implementing Business Logic
Although the application developer has specified the persistent components, she has yet to specify how these components are "glued together" to provide the transfer and payment function actually used by the customer. In the case of the transfer function, for example, the application must somehow package the business logic that states that a transfer consists of the following steps:
1. locate the source account;
2. locate the destination account;
3. credit the destination account with the specified amount;
4. debit the source account with the specified amount.
The EJB specification uses the session bean construct (in contradistinction to the entity bean construct) to model this behavior. MTS components do not make this modeling distinction. To keep the implementations similar, our developer attaches the transfer and payment business logic to a Banking component. Compare the MTS version (Code Fragment 1) to the EJB version (Code Fragment 2).
FRAGMENT 1. Microsoft Transaction Server: Banking Component

STDMETHODIMP CBanking::Transfer(BSTR fromAccountNumber,
                                BSTR toAccountNumber,
                                CURRENCY amount)
{
    AFX_MANAGE_STATE(AfxGetStaticModuleState())
    USES_CONVERSION;
    HRESULT hr;
    CComPtr<IAccounts> pFromAccount;
    CComPtr<IAccounts> pToAccount;

    m_spObjectContext->CreateInstance(__uuidof(Accounts),
        __uuidof(IAccounts), (void**)&pFromAccount);
    m_spObjectContext->CreateInstance(__uuidof(Accounts),
        __uuidof(IAccounts), (void**)&pToAccount);

    hr = pFromAccount->FindByAccountNumber(fromAccountNumber);
    if (FAILED(hr)) return hr;
    hr = pToAccount->FindByAccountNumber(toAccountNumber);
    if (FAILED(hr)) return hr;
    hr = pToAccount->Credit(amount);
    if (FAILED(hr)) return hr;
    hr = pFromAccount->Debit(amount);
    if (FAILED(hr)) return hr;

    if (hr == S_OK)
        m_spObjectContext->SetComplete();
    else
        m_spObjectContext->SetAbort();
    return S_OK;
}
FRAGMENT 2. Enterprise JavaBeans Implementation: Banking Session Bean

public void transfer(String fromAccountNumber,
                     String toAccountNumber,
                     double amount)
    throws javax.naming.NamingException,
           java.rmi.RemoteException,
           javax.ejb.FinderException
{
    Account fromAccount = null;
    Account toAccount = null;
    AccountHome aHome;

    aHome = getAccountHome();
    fromAccount = aHome.findByAccountNumber(fromAccountNumber);
    toAccount = aHome.findByAccountNumber(toAccountNumber);
    fromAccount.debit(amount);
    toAccount.credit(amount);
}
As discussed above (Sections 2.2.2 and 2.3.3), the COM programming model is a two-level store model. This model forces the application developer to consider the Account component and the associated row in AccountTable as independent things. The MTS Banking component therefore first creates an "empty" Account component and then loads the relevant state into that empty component via a call to its FindByAccountNumber method. In contrast, the EJB programming model for container-managed entity beans is a single-level store model. The application programmer is therefore able to completely ignore the AccountTable and simply asks the AccountHome for the required Account components.

Notice the different approaches that are used to deal with error conditions. As sketched above, the transfer algorithm is very simple. However, processing can be derailed at many points: the source or destination account numbers may be invalid, or the funds transfer amount may cause problems in the credit or debit methods. COM does not support exceptions, so error conditions are reported through constructs such as HRESULT (essentially an integer). The programmer must therefore test for error codes at every step where an error may occur; if one is detected, the transfer method is aborted, and the error code returned to the user. In contrast, EJB uses Java exceptions so that, for example, problems in finding an Account are declared as FinderExceptions and automatically propagated to the client that invoked transfer. As a result, the EJB transfer code is "cleaner" than the MTS code because the programmer focuses only on the transfer algorithm.

Finally, note the call to SetComplete at the close of the MTS transfer implementation. MTS provides SetComplete and SetAbort, and recommends that one of these be called at the end of each method. Both methods cause the component to be deactivated by the MTS run-time when the method returns. SetComplete indicates that (so far) there is no reason to abort the current transaction; SetAbort indicates that the transaction must be aborted. Because the programmer does not need to maintain the Account state after transfer completes, she invokes SetComplete--as opposed to EnableCommit, for example--to allow MTS to reclaim system resources used in this method. Application programmers prefer not to have to worry about system issues: MTS, however, must supply this method because its two-level store model implies that it is only the programmer who can decide whether to maintain the association between an interface pointer and a set of state. In contrast, EJB, with its single-level store model, always maintains the association between a component and its persistent state, so the programmer doesn't have to worry about the association disappearing. As a result, the EJB run-time is free to reclaim system resources without programmer intervention since it can always reconstruct the association.
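The exception-propagation point can be made concrete from the client's side. The sketch below is illustrative only: the JNDI name and the Banking home and remote interfaces are assumptions modeled on Code Fragment 2, not code from the chapter.

    import javax.naming.InitialContext;
    import javax.rmi.PortableRemoteObject;

    // Hypothetical remote and home interfaces for the Banking session bean.
    interface Banking extends javax.ejb.EJBObject {
        void transfer(String fromAccountNumber, String toAccountNumber, double amount)
            throws java.rmi.RemoteException, javax.naming.NamingException,
                   javax.ejb.FinderException;
    }

    interface BankingHome extends javax.ejb.EJBHome {
        Banking create() throws javax.ejb.CreateException, java.rmi.RemoteException;
    }

    public class BankingClient {
        public static void main(String[] args) throws Exception {
            InitialContext ctx = new InitialContext();
            Object ref = ctx.lookup("Banking");   // JNDI name is an assumption
            BankingHome home =
                (BankingHome) PortableRemoteObject.narrow(ref, BankingHome.class);
            Banking banking = home.create();
            try {
                banking.transfer("1001", "2002", 50.00);
            } catch (javax.ejb.FinderException e) {
                // An invalid account number arrives as an exception; there is
                // no HRESULT-style return code to test after every call.
                System.err.println("No such account: " + e.getMessage());
            }
        }
    }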
6.3 Creating a Persistent Component
We now examine what EJB and MTS require application programmers to do in order to create a persistent component. Code Fragment 3 and Code Fragment 4 show the process for the EJB Account component, implemented as a container-managed entity bean. Observe first that in the EJB programming model the create method for Account is specified on the AccountHome--not the Account component. In contrast, in the MTS programming model the create method is specified on the Account component itself (Code Fragment 5). Next observe that the AccountHome (Code Fragment 3) is an interface--that is, the programmer specifies only the method's signature, not the implementation. Having specified that an Account component is created using a long parameter, the application programmer must supply a corresponding ejbCreate method in the implementation (Code Fragment 4) which sets the bean's state to the supplied parameter. And, for container-managed beans, that's it!
FRAGMENT 3. Creating an Account Component: Enterprise JavaBean AccountHome.java

public interface AccountHome extends javax.ejb.EJBHome
{
    com.ibm.oats.adc.Account create(long argInstance)
        throws javax.ejb.CreateException, java.rmi.RemoteException;
}
How does the "magic" happen? During the deployment process, the EJB container (in this case IBM's WebSphere Advanced) creates a deployed EJSAccountHomeBean which implements the create method by creating an instance of a deployed EJSJDBCPersisterAccountBean, driving the EJB lifecycle methods on the bean, associating the Account component to the bean, and finally returning the component to the client. The deployed bean contains the SQL code that inserts the corresponding row into AccountTable. It is the combination of EJB's precise role differentiation, Java introspection, and a database mapping tool that allows the application programmer to specify only the minimal "component identity" information with the other players doing the rest of the work.
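To make the division of labor concrete, the sketch below shows roughly the kind of JDBC call the generated persister issues on the programmer's behalf. It is not the actual WebSphere-generated code; the table and column names are assumptions.

    import java.sql.Connection;
    import java.sql.PreparedStatement;

    // Illustrative only: in the deployed application this SQL is produced by
    // the deployment tools, not written by the bean developer.
    public class AccountPersisterSketch {
        void create(Connection con, long instance) throws java.sql.SQLException {
            PreparedStatement stmt =
                con.prepareStatement("INSERT INTO AccountTable (instance) VALUES (?)");
            try {
                stmt.setLong(1, instance);   // the "component identity" supplied to create()
                stmt.executeUpdate();
            } finally {
                stmt.close();
            }
        }
    }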
FRAGMENT 4. Creating an Account Component: Enterprise JavaBean AccountBean.java

public void ejbCreate(long argInstance)
    throws javax.ejb.CreateException, java.rmi.RemoteException
{
    instance = argInstance;
}
Examining the create implementation as done in MTS (Code Fragment 5), we again observe the issues related to two-level store, exception handling, and passivation that we noted in the context of the transfer method (Section 6.2). In addition, we see that--in contrast to the EJB implementation--the programmer must supply all the database-specific code herself. Thus, what is conceptually a trivial task at the component level becomes quite tricky at the implementation level because MTS does not allow the separation of component interface issues from component state issues.

FRAGMENT 5. Creating an Account Component: Microsoft Transaction Server Account.cpp

STDMETHODIMP CAccounts::Create(BSTR AccountName,
                               BSTR AccountNumber,
                               CURRENCY OpeningBalance)
{
    AFX_MANAGE_STATE(AfxGetStaticModuleState())
    USES_CONVERSION;
    HRESULT hr;

    m_accounts.ClearRecord();
    if (FAILED(hr = m_accounts.OpenRowset("Select * from Accounts"))) {
        return hr;
    }
    _tcscpy(m_accounts.m_AccountNumber, OLE2T(AccountNumber));
    _tcscpy(m_accounts.m_AccountName, OLE2T(AccountName));
    m_accounts.m_CurrentBalance = OpeningBalance;
    m_accounts.m_status = DBSTATUS_S_IGNORE;
    hr = m_accounts.Insert();
    m_accounts.Close();

    if (SUCCEEDED(hr))
    {
        _tcscpy(m_accounts.m_field, OLE2T(AccountName));
        hr = m_accounts.OpenRowset(
            _T("Select * from accounts where AccountName = ?"));
        m_accounts.MoveNext();
        m_spObjectContext->SetComplete();
    }
    else
    {
        m_spObjectContext->SetAbort();
    }
    return hr;
}
Of course, some programmers prefer, and some applications (e.g., access to legacy datastores or complex SQL join operations) require, explicit control of the details of datastore access and the relationship between component and
datastore state. To meet such requirements, EJB has the concept of bean-managed persistence. When implementing bean-managed beans, the programmer must supply datastore-specific code in life-cycle methods such as
ejbCreate, ejbStore, ejbRemove, and FindBy. Importantly for component clients, client code is completely unaffected by a bean developer's decision to use bean-managed versus container-managed persistence.
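A sketch of what a bean-managed ejbCreate might look like follows. The DataSource JNDI name and the AccountTable schema are assumptions, and the surrounding entity-bean plumbing (the other life-cycle methods) is omitted for brevity.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    public class BmpAccountBeanSketch {
        private long instance;

        public Long ejbCreate(long argInstance) throws javax.ejb.CreateException {
            instance = argInstance;
            try {
                InitialContext ctx = new InitialContext();
                DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/AccountDB");
                Connection con = ds.getConnection();
                try {
                    // The bean developer, not the container, writes the SQL.
                    PreparedStatement stmt = con.prepareStatement(
                        "INSERT INTO AccountTable (instance) VALUES (?)");
                    stmt.setLong(1, instance);
                    stmt.executeUpdate();
                    stmt.close();
                } finally {
                    con.close();
                }
            } catch (Exception e) {
                throw new javax.ejb.CreateException(e.getMessage());
            }
            return new Long(instance);   // bean-managed ejbCreate returns the primary key
        }
    }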
7. Continued Evolution
The component landscape changes rapidly, and while we expect the basic concepts discussed in this chapter to remain valid (at least for the medium-term), we know that details of the MTS, EJB, and CORBA technologies will change even over the next year or so. In this section we describe some "breaking news" related to these component technologies, realizing that some of this news will become stale in short order.
7.1 MTS
In the past, Microsoft has combined genuinely new technology with marketing hype in a manner that made it hard to determine whether a new name for a technology actually contained new technology (see Appendix 2). At present (mid-year 2000), this appears to be happening with MTS as well. In Windows 2000, Microsoft has deemphasized talk of MTS, and talks instead about COM+ Services 1.0. Four of the basic services provided by COM+ (component servers, transactions, security, and administration) were provided by MTS under Windows NT 4.0. These services have been "rebranded" as COM+ under Windows 2000, and new services such as load balancing, events, and queuing have been added. To date, the load-balancing function is rudimentary since COM+ does not allow any control over the load-balancing algorithm, and a simple "least loaded server" algorithm is hard-coded into the system. The event service is a "publish and subscribe" service for events propagated among COM components. Queued components are touted as a way to reduce the load on servers that handle large volumes of transactions since the system can now queue the transaction.

One difference between MTS and COM+ has to do with the OTM implementation. MTS built directly on COM/DCOM, whereas COM+ has reworked COM/DCOM to provide a tighter integration between the component and OTM infrastructure. Also, COM+ presents component developers with a more consistent component view than MTS. For example, MTS exposes the existence of an "interceptor" object by
requiring a component to call special "safe" export methods that ensure that only the interceptor is passed to a client as opposed to the component itself. COM+ transparently ensures that the interface pointer is never exported to a client.

In our opinion, the strong continuity in the evolution from COM to DCOM to MTS continues with the evolution from MTS to COM+. In this evolution, the basic component model remains (more or less) the same, and changes occur mostly in the areas of easing component development and deployment and in specific function enhancements such as distribution and transactions. The issues discussed earlier (Section 6) with respect to the COM component model are unaffected by the COM+ rebranding. Thus, while Microsoft may acknowledge the need for an architecture that manages component persistence (e.g., such as the Enterprise JavaBeans architecture), COM+ does not yet provide this.
7.2 EJB
The Enterprise JavaBeans architecture described in this chapter corresponds to the V1.0 and V1.1 specifications. The V2.0 draft specification [25], released in June 2000, introduces two important features: a new container-component persistence contract, and the notion of a MessageDrivenBean for handling JMS messages. We shall attempt to briefly explain the significance of these developments.
7.2.1 Container-Managed Persistence

The EJB V1 container-component persistence contract requires the component developer to explicitly denote the set of the component's fields that are persistent. Typically, the container provides a tool that maps the component's persistent fields to a column in a relational database table. The EJB V2 contract is an additional contract that addresses limitations with the V1 contract. In V2:
• bean providers can declaratively specify persistent relationships between beans;
• bean providers can declaratively specify relationships between a bean and so-called dependent objects: objects that are visible to the bean class but cannot be directly accessed by an EJB client. A Person EJB might thus have a PhoneNumber dependent object;
• the EJB query language can be used to specify the implementation of find methods on a home interface (a sketch of this style follows the list).
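The sketch below suggests how the V2 style surfaces to the bean developer: persistent state is expressed as abstract accessor methods, and finder implementations are declared in EJB QL in the deployment descriptor rather than coded by hand. The field names, the abstract schema name, and the query are assumptions for illustration and may differ from any particular V2.0 draft.

    // Illustrative V2-style container-managed entity bean.
    public abstract class Account2Bean implements javax.ejb.EntityBean {
        // Container-managed persistent fields, declared as abstract accessors;
        // the container generates the implementation and the SQL mapping.
        public abstract long getInstance();
        public abstract void setInstance(long instance);
        public abstract double getBalance();
        public abstract void setBalance(double balance);

        // A finder such as findLargeAccounts(double min) would be backed by an
        // EJB QL query declared in the deployment descriptor, for example:
        //   SELECT OBJECT(a) FROM Account a WHERE a.balance > ?1
    }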
7.2.2 Messaging and Asynchronous Processing

In EJB V1, EJBs faced certain difficulties in using JMS (Java Messaging Service), a common API to messaging systems. While the immediate benefit of using JMS from an EJB is to allow asynchronous processing of messages, a broader implication of using JMS is the integration of the paradigm that EJB represents (also known as object oriented middleware) with a separate class of middleware and applications, namely, message oriented middleware and applications.

Message-oriented applications communicate with each other by exchanging data-bearing messages that are typically conveyed with the help of an intermediary. Two distinguishing features of this kind of messaging are that it is typically performed asynchronously and anonymously. That is, applications do not have to wait for the arrival of a reply to continue processing, and they do not need to know the identity of the application that receives a message. There are two flavors of this kind of messaging: one so-called point-to-point and the other so-called publish-subscribe. In point-to-point messaging, applications communicate in a one-to-one fashion (although, conceivably, they could also do it in a many-to-one fashion), where the intermediary is typically a first-in-first-out queue. Publish-subscribe messaging allows applications to communicate in a many-to-many fashion by setting up topics or subjects to which interested applications can subscribe. When an application publishes a message on one of these topics or subjects, every subscriber receives the message. Examples of middleware that supports message-oriented applications include IBM MQSeries, Microsoft MSMQ, BEA MessageQ, and TIBCO TIB/Rendezvous [26].

JMS defines an implementation-neutral API to messaging for Java applications. As such, JMS allows Java applications to be portable across messaging systems. As long as a specific system, such as MQSeries, MSMQ, or TIB/Rendezvous, implements the JMS specification, Java applications can perform messaging using those systems.

Although EJBs and JMS appear to be complementary, the V1 specification did not define the coordination between the arrival of JMS messages and the arrival of client EJB method invocations. The V2 specification outlines how JMS can be used as a service in the same way that JDBC can be used as a (persistence) service. Containers provide persistence services by coordinating interactions with a persistent datastore (typically, through the JDBC API); similarly, containers can now provide messaging services by coordinating interactions with a messaging infrastructure (typically, through the JMS API). Further integration with messaging services is provided through a new bean type, called a message-driven bean. This kind of bean provides
developers with an enterprise component that can consume JMS messages from a single message queue. Message-driven beans do not define home or remote interfaces, which effectively gives them the anonymous nature of a message-oriented application component, and they can use only a subset of the transaction policies that a regular EJB can use. This is because it is the message destinations that are considered the transactional resources, rather than the ultimate senders or receivers of messages. In effect, this means that a distributed transaction in which a message-driven bean participates does not include such senders or receivers. Message-driven beans do not completely hide the interaction with JMS; they merely make it possible. A higher level of abstraction could allow message sends and receives to be viewed as invocations on objects. This would go further towards a more uniform integration of the two paradigms.
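The skeleton below sketches what a message-driven bean looks like to its developer under the V2.0 proposal. The bean name and message handling are illustrative; the queue it listens to and its transaction policy are configured in the deployment descriptor, not in code.

    import javax.ejb.MessageDrivenBean;
    import javax.ejb.MessageDrivenContext;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    public class PaymentRequestBean implements MessageDrivenBean, MessageListener {
        private MessageDrivenContext ctx;

        public void setMessageDrivenContext(MessageDrivenContext ctx) { this.ctx = ctx; }
        public void ejbCreate() { }
        public void ejbRemove() { }

        // No home or remote interface: the only way in is an asynchronous message.
        public void onMessage(Message msg) {
            try {
                if (msg instanceof TextMessage) {
                    String body = ((TextMessage) msg).getText();
                    // ... invoke session or entity beans to process the request
                }
            } catch (javax.jms.JMSException e) {
                ctx.setRollbackOnly();   // let the container redeliver the message
            }
        }
    }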
7.3 CORBA Components
In Section 4 we focused on EJB and MTS as existing OTM standard architectures. EJB defines a standard architecture for object oriented components written in Java, and MTS defines an architecture for object based components developed on the Windows/DCOM platform. The CORBA component model (CCM) [27] is a CORBA-based OTM standard that enables the development of components that are language independent, platform independent, and ORB independent, as well as OTM independent.

The CCM defines two levels of conformance: a basic level and an extended level. The CCM basic level defines, in most respects, an architecture and programming model equivalent to those in the EJB specification. In fact, a CCM-conformant implementation in the Java language is required to implement the EJB specification, the Java-to-IDL mapping (also known as RMI-IIOP), and an EJB to IDL mapping that allows requests made by CORBA clients to be translated into EJB requests.

The CCM extended level defines a number of features that are not contemplated in the EJB specification. Extended CORBA components support a variety of surface features through which clients and other elements of an application environment may interact with a component. These surface features are called ports. The component model supports five basic kinds of ports:
• facets, which are distinct named interfaces provided by the component for client interaction;
• receptacles, which are named connection points that describe the component's ability to use a reference supplied by some external agent;
• event sources, which are named connection points that emit events of a specified type to one or more interested event consumers, or to an event channel;
• event sinks, which are named connection points into which events of a specified type may be pushed;
• attributes, which are named values exposed through accessor and mutator operations. Attributes are primarily intended to be used for component configuration, although they may be used in a variety of other ways.
Basic components are not allowed to offer facets, receptacles, event sources, or event sinks. They may only offer attributes. Extended components may offer any type of port.

As evidenced by the definition of an EJB-compatible basic level, the intention of the CCM is to provide a complementary, rather than an alternative, component model to EJB. In addition, the CCM specification introduces an interworking model to allow CCM-based applications and EJB-based applications to be integrated. This model specifies the architecture for a bi-directional bridge to allow clients in one component model to interact with components written in the other component model. Notice that this interworking approach does not provide for the deployment of one type of component into a container for the other type. In other words, there is no architected way to deploy a CORBA component written in C++ into an EJB container.
7.4 Relationship with SOAP
SOAP (Simple Object Access Protocol) [28] defines a generalized protocol for sending XML-based messages. These messages can convey data, remote procedure invocations, or even remote object oriented invocations. A SOAP message consists of:
• an envelope that expresses what is in a message, who should deal with it, and whether it is optional or mandatory;
• an optional header that provides a mechanism for extending a message without prior knowledge between the communicating parties; and
• a body that carries mandatory information intended for the ultimate recipient.
ENTERPRISE JAVABEANS AND MICROSOFT TRANSACTION SERVER
147
Because they are XML-based, SOAP messages can be delivered via a number of communications protocols, including HTTP and MQSeries. A data encoding format is suggested, but not required, by SOAP, leaving open the option to use any encoding as long as it is identified in the SOAP envelope. One use of SOAP is to convey service requests to enterprise applications. However, given that SOAP messages can be delivered via HTTP and that CORBA and IIOP already provide a good solution for intra-enterprise application communication, a more useful application of SOAP is to convey these requests across the Internet.

Because of its simplicity and generality, SOAP allows clients of enterprise applications a degree of heterogeneity greater than that allowed by EJB and CORBA. However, we consider it important that SOAP's advantages be applied without requiring developers to re-create infrastructure already provided by middleware such as CORBA and EJB. One approach is to define an integration between SOAP and CORBA and EJB. In particular, it should be possible for heterogeneous clients to speak to CORBA objects and EJBs via SOAP messages. In this context, SOAP defines a transport format, with more than one possible encoding of data items, for delivering heterogeneous client requests on CORBA objects and EJBs. To be useful, SOAP must be extended or integrated with technologies that provide request specification, object registration and discovery, and a request dispatching mechanism (ORB). Some of this work is already being addressed, for example by WSDL (the Web Services Description Language).
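To suggest how little machinery a heterogeneous client needs, the sketch below delivers a SOAP request over HTTP with nothing but the JDK. The endpoint URL and the (empty) body are placeholders, and the envelope follows the SOAP 1.1 conventions; a real client would of course build a service-specific body and parse the response.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class SoapPostSketch {
        public static void main(String[] args) throws Exception {
            String envelope =
                "<SOAP-ENV:Envelope xmlns:SOAP-ENV="
              + "\"http://schemas.xmlsoap.org/soap/envelope/\">"
              + "<SOAP-ENV:Body><!-- service-specific request goes here --></SOAP-ENV:Body>"
              + "</SOAP-ENV:Envelope>";

            URL endpoint = new URL("http://example.com/banking");  // hypothetical service
            HttpURLConnection con = (HttpURLConnection) endpoint.openConnection();
            con.setDoOutput(true);
            con.setRequestMethod("POST");
            con.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
            con.setRequestProperty("SOAPAction", "\"\"");   // required by SOAP 1.1 over HTTP

            OutputStream out = con.getOutputStream();
            out.write(envelope.getBytes("UTF-8"));
            out.close();

            System.out.println("HTTP status: " + con.getResponseCode());
        }
    }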
8. Conclusion
In this chapter, we examined the concept of distributed enterprise components--components that provide business function across an enterprise. Distributed enterprise components require special functions, such as distribution, persistence, security, and transactions, from an infrastructure that is often termed an object transaction monitor. Rather than analyze component technologies in terms of platform availability, language availability, client support, or development wizards, the chapter examined component technologies in terms of how they deliver on the promise of software reuse and allow the developer to focus on business logic rather than infrastructure. By examining developments in the fields of data access abstraction and packaged business logic, we showed that component technologies have not emerged from a vacuum.

Two competing component frameworks were examined in detail: namely, Sun's Enterprise JavaBeans and Microsoft's Transaction Server. We showed that EJBs and MTS are remarkably similar and yet differ in some important
ways. These differences are illustrated through the development of a sample application.
8.1 Appendix 1
Wegner [29] considers a number of orthogonal dimensions in the design of object oriented languages. These dimensions can also be useful in understanding the architecture and evolution of distributed component infrastructures that are based on object oriented principles. Each dimension refers to the existence, or lack thereof, of a particular feature in the support of object orientation. These dimensions are orthogonal in the sense that none of them can be derived from the others. The dimensions are:
• Objects. An object has a set of operations and a state that remembers the effect of operations. As opposed to functions, which are completely determined by their arguments, an object's operations are also determined by the object's state and thus by the object's invocation history (a small illustration follows this list of dimensions). Notice that classes are not a notion independent of objects, as they define templates from which objects can then be created. Classes are also derived from the notions of types and abstraction. Objects have identity, which is established by the definition of a unique value used to refer to the object in some context. This unique value can, but does not have to, be related to the internal state of the object. Thus, in general, equality of objects can only be ultimately resolved by comparing their respective identifiers, and not necessarily by comparing their internal states.
• Typing. A type defines the legal values that can be denoted by a language's variables and expressions. A language is strongly typed if type compatibility of all expressions representing values can be determined from the static program representation at compile time.
• Abstraction. Data abstraction refers to the representation of a piece of state by a collection of accessor operations. The state of an object with data abstraction is said to be encapsulated, meaning that it is not directly accessible. In addition, it is possible to implement data abstraction without the use of objects--as the concept of a COM interface suggests--although the resulting approach can only questionably be referred to as object based, much less object oriented.
• Delegation. This is a resource-sharing mechanism in a general sense. It includes concepts such as inheritance, in which an object of a subclass shares the behavior of its superclass(es), dynamic binding, in which the actual behavior of an object is not known until run-time, and
delegation in classless languages, in which objects rely on other objects for (part of) their behavior.
• Concurrency. Concurrent languages allow the simultaneous execution of more than one process. A process denotes a computational unit that has certain resources, such as address space and CPU time, associated with it. Concurrent object based languages also allow the simultaneous execution of lighter-weight units referred to as threads. Threads can share the resources of a single process. A thread consists of a thread control block containing a locus of control and a state of execution.
• Persistence. A language is referred to as being persistent if it supports data that transcends the lifetime of a particular program. Persistent object oriented languages provide an alternative to using relational mechanisms to support the persistence of a program's data. To support persistent objects, languages must rely on a number of mechanisms that include, at least: (1) using the identity of an object to locate an object's persistent state in a given execution context; (2) using a language to express queries and data manipulation requests on (collections of) persistent objects; (3) transaction processing and concurrency control to provide safe and efficient processing of user requests, as well as to enforce consistency constraints.
In addition, Waldo et al. [30] argue that "... objects that interact in a distributed system need to be dealt with in ways that are intrinsically different from objects that interact in a single address space. These differences are required because distributed systems require that the programmer be aware of latency, have a different model of memory access, and take into account issues of concurrency and partial failure." Furthermore, they single out partial failure as an issue that would be very hard, and not at all advisable, to mask in terms of local programming issues. They state that "being robust in the face of partial failure requires some expression at the [application] interface level ... The interfaces that connect the components must be able to state whenever possible the cause of failure, and there must be interfaces that allow reconstruction of a reasonable state when failure occurs and the cause cannot be determined." For other perspectives on "object-oriented", see [31] and [34].
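The toy contrast below, added here only as an illustration of the Objects dimension, shows a function whose result is fixed by its argument next to an object whose result also depends on its accumulated state.

    public class HistoryDemo {
        // Function-like: same argument, same answer.
        static int square(int x) { return x * x; }

        // Object-like: identical calls can return different results.
        static class Counter {
            private int count = 0;           // encapsulated state
            int next() { return ++count; }   // result depends on invocation history
        }

        public static void main(String[] args) {
            System.out.println(square(3));   // 9
            System.out.println(square(3));   // 9
            Counter c = new Counter();
            System.out.println(c.next());    // 1
            System.out.println(c.next());    // 2
        }
    }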
8.2 Appendix 2
Compared to Enterprise JavaBeans, the evolution of the Microsoft Transaction Server is both more and less complicated. Less complicated, because the evolution has been controlled by a single company: Microsoft. More complicated, because Microsoft has renamed technology for reasons
that sometimes appear to have more to do with marketing strategy than reflecting basic changes in the technology. We therefore provide this Appendix as a guide to what may appear to be a complicated set of technologies. Also, the evolution of COM technology gives insight about components in general: namely, that components are largely "about" software reuse (Section 2.1). MTS's origins had little to do with general transaction or persistence frameworks. Instead, applications had a need to share data, and Microsoft developed the required technology piece by piece.

COM's origins can be found in the Windows clipboard, which was the first means to allow Windows applications to share data. This data sharing was static, and quite limited since it required users to redo the "cut and paste" operation each time that the shared ("inserted") data was updated. In addition to the data itself, the application in which the data was inserted required a data format descriptor. Data could be inserted if and only if the receiving application recognized the data format.

Dynamic data exchange, or DDE, provided more function than the clipboard because it enabled applications to dynamically share data. The DDE API relied on Windows messaging in order to pass data. In a typical DDE scenario, tabular data created by a spreadsheet application was later embedded into a word processing document. When a user brought up the document inside the word processing application, DDE could launch the spreadsheet application, send it the tabular data, and thus provide a graphical display of the data. Importantly, changes made in the original spreadsheet application were automatically reflected in the word processing document. However, the DDE API had a reputation for being difficult to understand and was tedious to code due to the fact that it relied totally on Windows messaging.

Object linking and embedding 1.0, or OLE, was Microsoft's next approach to application data sharing. Built upon a DDE foundation, OLE enabled static or dynamic merging of data from one application into another. Information about the data's application source and how the data should be displayed was associated with the inserted data. For example, consider the merging of an image from a paint application into a text document. In the case of static embedding, "raw" data was inserted into the application so that further editing of the source image would not change the image contained in the text document. In the case of dynamic linking, OLE inserted a reference to the data source so that changes to the source image would be reflected in the text document. In-place editing, or activation, was a key feature of OLE. This allowed users to edit the inserted data from within the destination document without having to load the original application. Thus, if a spreadsheet was inserted in a word processing application, selecting the spreadsheet caused the spreadsheet's editing environment (e.g. toolbars and menus) to replace the word processing application's environment. OLE 1.0 did not spread widely.
OLE 2.0 (circa 1995) brought about a big change, as it introduced COM--the Component Object Model. OLE stopped referring to "object linking and embedding", and became a more inclusive term referring to COM technology in general, such as COM's interfaces and services. In 1996, Microsoft introduced ActiveX as a way to associate the notion of Internet "Active Content" (such as client-side GUI controls) with enhanced function for COM (such as URL-based monikers) and Win32 functionality (such as IWinnet, which added an Internet API to the Win32 API). The introduction of ActiveX resulted in the term OLE reverting to its original meaning of object linking and embedding.

The Windows API for database access has evolved in tandem with the component technology. ODBC was introduced in the early 1990s to provide a common interface to relational databases. (We discussed ODBC in the context of call level interfaces to relational databases in Section 5.1.2.) ODBC is oriented towards procedural code rather than being "object" or "component" oriented. OLE DB was therefore introduced as a component oriented, COM-based, database API. OLE DB also introduced the notion of common API access to relational and nonrelational datastores such as text files, VSAM, and AS/400. From the perspective of a Windows programmer, OLE DB was limited because it was accessible only from Visual C++. ActiveX Data Objects, or ADO, are an API for wrapping OLE DB objects. Because they are "scriptable" or "automation-enabled", ADOs have an advantage compared with OLE DB objects in that they can be called from non-Visual-C++ clients such as Visual Basic, VBScript, and JavaScript. Because data access through ADOs involves an additional layer of code, such access adds performance overhead relative to OLE DB which, in turn, adds performance overhead compared to ODBC.
REFERENCES
[1] Enterprise JavaBeans Technology (2000). http://java.sun.com/products/ejb/.
[2] Monson-Haefel, R. (2000). Enterprise JavaBeans, second edition. O'Reilly, Sebastopol, CA, USA.
[3] COM Technologies (2000). Microsoft Transaction Server (MTS). http://www.microsoft.com/com/tech/MTS.asp.
[4] Gray, S. D., Lievano, R. A. and Jennings, R. (editor) (1997). Microsoft Transaction Server 2.0 (Roger Jennings' Database Workshop). Sams Publishing, Indianapolis, IN, USA.
[5] Sessions, R. (1997). COM and DCOM: Microsoft's Vision for Distributed Objects. Wiley, New York, USA.
[6] Lau, C. (1995). Object-Oriented Programming Using SOM and DSOM. Wiley, New York, USA.
[7] Shirley, J., Hu, W., Magid, D. and Oram, A. (editor) (1994). Guide to Writing DCE Applications (OSF Distributed Computing Environment), second edition. O'Reilly, Sebastopol, CA, USA.
[8] Object Management Group (2000). http://www.omg.org.
[9] ObjectStore (2000). http://www.odi.com/products/objectstore.html.
[10] Orfali, R., Harkey, D. (contributor), Edwards, J. and Harkey, D. (1995). The Essential Distributed Objects Survival Guide. Wiley, New York, USA.
[11] RMI--Remote Method Invocation (2000). http://www.javasoft.com/products/jdk/1.1/docs/guide/rmi/index.html.
[12] Boucher, K. and Katz, F. (1999). Essential Guide to Object Monitors. Wiley, New York, USA.
[13] Grimes, R. (1999). Professional Visual C++ MTS Programming. Wrox Press, Birmingham, UK.
[14] Codd, E. F. (1970). "A relational model of data for large shared data banks". Communications of the ACM, 13, 377-387.
[15] Astrahan, M. M. et al. (1976). "System R: a relational approach to database management". ACM Transactions on Database Systems, 1, 97-137.
[16] Stonebraker, M. et al. (1976). "The design and implementation of INGRES". ACM Transactions on Database Systems, 1, 189-222.
[17] Chamberlin, D. D. et al. (1974). "SEQUEL: a structured English query language". Proceedings of the ACM SIGFIDET Workshop on Data Description, Access, and Control, 249-264.
[18] International Organization for Standardization (ISO) (1996). Database Language SQL--Part 4: Persistent Stored Modules. Standard No. ISO/IEC 9075-4.
[19] X/Open CPI-C Specification, Version 2 (second edition). Prentice Hall, 1996, Upper Saddle River, NJ, USA.
[20] Rao, B. R. (1994). Object-Oriented Databases: Technology, Applications and Products. McGraw-Hill, New York, USA.
[21] Object Data Management Group (2000). The Standard for Storing Objects. http://www.odmg.org.
[22] Gray, J. and Reuter, A. (1993). Transaction Processing: Concepts and Techniques. Morgan Kaufmann, San Francisco.
[23] McGee, W. C. (1977). "The information management system IMS/VS". IBM Systems Journal, 16, 84-168.
[24] Yelavich, B. M. (1985). "Customer information control system--an evolving system facility". IBM Systems Journal, 24, 264-278.
[25] Enterprise JavaBeans 2.0 Specification (2000). http://java.sun.com/products/ejb/2.0.html.
[26] Lewis, R. (2000). Advanced Messaging Applications with MSMQ and MQSeries. QUE Professional, Indianapolis, IN, USA.
[27] CORBA Components (1999). Revised submission, OMG TC document orbos/99-07-01; FTF drafts of updated chapters, document ptc/99-10-04. http://www.omg.org/cgi-bin/doc?orbos/99-07-01 and http://www.omg.org/cgi-bin/doc?ptc/99-10-04.
[28] Simple Object Access Protocol 1.1 (2000). http://www.w3.org/TR/SOAP.
[29] Wegner, P. "Dimensions of object-based language design". Proceedings of OOPSLA '87.
[30] Waldo, J., Wyant, G., Wollrath, A. and Kendall, S. (1994). "A note on distributed computing". SMLI TR-94-29, Sun Microsystems Laboratories.
[31] Budd, T. (1997). An Introduction to Object-Oriented Programming, second edition. Addison-Wesley.
[32] Gampel et al. (1998). "IBM component broker connector overview". IBM International Technical Support Organization, SG24-2022-02.
[33] Gray, J. and Reuter, A. (1992). Transaction Processing: Concepts and Techniques. Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann.
[34] Korson, T. and McGregor, J. D. (1990). "Understanding object-oriented: a unifying paradigm". Communications of the ACM, 33.
Maintenance Process and Product Evaluation Using Reliability, Risk, and Test Metrics

NORMAN F. SCHNEIDEWIND
Computer and Information Sciences and Operations Division
Naval Postgraduate School
Monterey, CA 93943
USA
Email: [email protected]
Abstract

In analyzing the stability of a maintenance process, it is important that it not be treated in isolation from the reliability and risk of deploying the software that result from applying the process. Furthermore, we need to consider the efficiency of the test effort that is a part of the process and a determinant of the reliability and risk of deployment. The relationship between product quality and process capability and maturity has been recognized as a major issue in software engineering, based on the premise that improvements in process will lead to higher-quality products. To this end, we have been investigating an important facet of process capability--stability--as defined and evaluated by trend, change, and shape metrics, across releases and within a release. Our integration of product and process measurement serves the dual purpose of using metrics to assess and predict reliability and risk and to evaluate process stability. We use the NASA Space Shuttle flight software and the United States Air Force Global Awareness program to illustrate our approach.
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
2. Related Research and Projects . . . . . . . . . . . . . . . . . . . . . . . 155
3. Concept of Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
   3.1 Trend Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
   3.2 Change Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
   3.3 Shape Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4. Metrics for Long-Term Analysis . . . . . . . . . . . . . . . . . . . . . . . 160
5. Metrics for Long-Term and Short-Term Analysis . . . . . . . . . . . . . . . 160
6. Data and Example Application . . . . . . . . . . . . . . . . . . . . . . . . 160
7. Relationships among Maintenance, Reliability, and Test Effort . . . . . . . 163
   7.1 Metrics for Long-Term Analysis . . . . . . . . . . . . . . . . . . . . . 163
   7.2 Reliability Predictions . . . . . . . . . . . . . . . . . . . . . . . . 166
   7.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
   7.4 Metrics for Long-Term and Short-Term Analysis . . . . . . . . . . . . . 170
   7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8. Shuttle Operational Increment Functionality and Process Improvement . . . . 175
9. United States Air Force Global Awareness (GA) Program Application . . . . . 177
10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
    Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
    References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

1. Introduction
Measuring and evaluating the stability of the software maintenance process is important because of the recognized relationship between process quality and product quality as shown by Hollenbach et al. [1]. We focus on the important quality factor reliability. A process can quickly become unstable because the very act of installing software changes the environment: pressures operate to modify the environment, the problem, and the technological solutions. Changes generated by users and the environment, and the consequent need for adapting the software to the changes, are unpredictable and cannot be accommodated without iteration. Programs must be adaptable to change and the resultant change process must be planned and controlled. According to Lehman [2], large programs are never completed; they just continue to evolve. In other words, with software, we are dealing with a moving target.

Maintenance is performed continuously and the stability of the process has an effect on product reliability. Therefore, when we analyzed the stability of the NASA Shuttle process, it was important to consider the reliability of the software that the process produces. Furthermore, we needed to consider the efficiency of the test effort that is a part of the process and a determinant of reliability. Therefore, we integrated these factors into a unified model, which allowed us to measure the influence of maintenance actions and test effort on the reliability of the software. Our hypothesis was that these metrics would exhibit trends and other characteristics over time that would be indicative of the stability of the process. Our results indicate that this is the case.

We conducted research on the NASA Space Shuttle flight software to investigate a hypothesis of measuring and evaluating maintenance stability. We used several metrics and applied them across releases of the software and within releases. The trends and shapes of metric functions over time provide evidence of whether the software maintenance process is stable. We view stability as the condition of a process that results in increasing reliability, decreasing risk of deployment, and increasing test effectiveness. In addition, our focus is on process stability, not code stability. We explain
our criteria for stability; describe metrics, trends, and shapes for judging stability; document the data that was collected; and show how to apply our approach. Building on our previous work of defining maintenance stability criteria and developing and applying trend metrics for stability evaluation as described by Schneidewind [3-5], in this chapter we review related research projects, introduce shape metrics for stability evaluation, apply our change metric for multiple release stability evaluation, consider the functionality of the software product in stability evaluation, and interpret the metric results in terms of process improvements. Our emphasis in this chapter is to explain and demonstrate a unified product and process measurement model for product evaluation and process stability analysis. The reader should focus on the model principles and not on the results obtained for the two applications. These are used only to illustrate the model concepts. In general, different numerical results would be obtained for other applications that use this model. First, we review related research. Next, the concept of stability is explained and trend and shape metrics are defined. Then, we define the data and the Shuttle application environment. This is followed by an analysis of relationships among maintenance, reliability, test effort, and risk, both long term (i.e., across releases) and short term (i.e., within a release), as applied to the Shuttle. We finish the Shuttle example with a discussion of our attempts to relate product metrics to process improvements and to the functionality and complexity of the software. Last, we show how the concept of stability and its metrics are used in the US Air Force Global Awareness (GA) program.
2. Related Research and Projects
A number of useful related maintenance measurement and process projects have been reported in the literature. Briand et al. [6] developed a process to characterize software maintenance projects. They present a qualitative and inductive methodology for performing objective project characterizations to identify maintenance problems and needs. This methodology aids in determining causal links between maintenance problems and flaws in the maintenance organization and process. Although the authors have related ineffective maintenance practices to organizational and process problems, they have not made a linkage to product reliability and process stability.

Gefen and Schneberger [7] developed the hypothesis that maintenance proceeds in three distinct serial phases: corrective modification, similar to testing; improvement in function within the original specifications; and the
addition of new applications that go beyond the original specifications. Their results from a single large information system, which they studied in great depth, suggested that software maintenance is a multiperiod process. In the Shuttle maintenance process, in contrast, all three types of maintenance activities are performed concurrently and are accompanied by continuous testing.

Henry et al. [8] found a strong correlation between errors corrected per module and the impact of the software upgrade. This information can be used to rank modules by their upgrade impact during code inspection in order to find and correct these errors before the software enters the expensive test phase. The authors treat the impact of change but do not relate this impact to process stability.

Khoshgoftarr et al. [9] used discriminant analysis in each iteration of their project to predict fault prone modules in the next iteration. This approach provided an advance indication of reliability and the risk of implementing the next iteration. This study deals with product reliability but does not address the issue of process stability.

Pearse and Oman [10] applied a maintenance metrics index to measure the maintainability of C source code before and after maintenance activities. This technique allowed the project engineers to track the "health" of the code as it was being maintained. Maintainability is assessed but not in terms of process stability.

Pigoski and Nelson [11] collected and analyzed metrics on size, trouble reports, change proposals, staffing, and trouble report and change proposal completion times. A major benefit of this project was the use of trends to identify the relationship between the productivity of the maintenance organization and staffing levels. Although productivity was addressed, product reliability and process stability were not considered.

Sneed [12] re-engineered a client maintenance process to conform to the ANSI/IEEE Standard 1291, Standard for Software Maintenance. This project is a good example of how a standard can provide a basic framework for a process and can be tailored to the characteristics of the project environment. Although applying a standard is an appropriate element of a good process, product reliability and process stability were not addressed.

Stark [13] collected and analyzed metrics in the categories of customer satisfaction, cost, and schedule with the objective of focusing management's attention on improvement areas and tracking improvements over time. This approach aided management in deciding whether to include changes in the current release, with possible schedule slippage, or include the changes in the next release. However, the authors did not relate these metrics to process stability.
Although there are similarities between these projects and our research, our work differs in that we integrate: (1) maintenance actions, (2) reliability, (3) test effort, and (4) risk to the safety of mission and crew of deploying the software after maintenance actions, for the purpose of analyzing and evaluating the stability of the maintenance process.
3. Concept of Stability

3.1 Trend Metrics
To gain insight about the interaction of the process with product metrics like reliability, two types of metrics are analyzed: trend and shape. Both types are used to assess and predict maintenance process stability across releases (long term) and within releases (short term) after the software is released and maintained. Shape metrics are described in the next section.

By chronologically ordering metric values by release date, we obtain discrete functions in time that can be analyzed for trends across releases. Similarly, by observing the sequence of metric values as continuous functions of increasing test time, we can analyze trends within releases. These metrics are defined as empirical and predicted functions that are assigned values based on release date (long term) or test time (short term).

When analyzing trends, we note whether an increasing or decreasing trend is favorable as described by Schneidewind [3-5]. For example, an increasing trend in time to next failure and a decreasing trend in failures per KLOC (thousands of lines of code) would be favorable. Conversely, a decreasing trend in time to next failure and an increasing trend in failures per KLOC would be unfavorable. A favorable trend is indicative of maintenance stability only if the functionality of the software has increased with time across releases and within releases. Increasing functionality is the norm in software projects due to the enhancement that users demand over time. We impose this condition because, without it, favorable trends could be the result of decreasing functionality rather than of having achieved maintenance stability. When trends in these metrics over time are favorable (e.g., increasing reliability), we conclude that the maintenance process is stable with respect to the software metric (reliability). Conversely, when the trends are unfavorable (e.g., decreasing reliability), we conclude that the process is unstable.

Our research investigated whether there were relationships among the following factors: (1) maintenance actions, (2) reliability, and (3) test effort. We use the following types of trend metrics:

1. Maintenance actions: KLOC change to the code (i.e., the amount of code changed necessary to add given functionality);
2. Reliability: various reliability metrics (e.g., MTTF, total failures, remaining failures, and time to next failure); and
3. Test effort: total test time.
3.2 Change Metric

Although looking for a trend on a graph is useful, it is not a precise way of measuring stability, particularly if the graph has peaks and valleys and the measurements are made at discrete points in time. Therefore, we developed a change metric (CM), which is computed as follows (a computational sketch of the procedure appears at the end of this subsection):

1. Note the change in a metric from one release to the next (i.e., release j to release j + 1).
2. (a) If the change is in the desirable direction (e.g., failures/KLOC decrease), treat the change in 1 as positive. (b) If the change is in the undesirable direction (e.g., failures/KLOC increase), treat the change in 1 as negative.
3. (a) If the change in 1 is an increase, divide it by the value of the metric in release j + 1. (b) If the change in 1 is a decrease, divide it by the value of the metric in release j. These signed quantities are called relative changes (RC).
4. Compute the average of the values obtained in 3, taking into account sign. This is the change metric (CM).

The CM is a quantity in the range [-1, 1]. A positive value indicates stability, where 1 represents 100% stability; a negative value indicates instability, where -1 represents 100% instability. A value of 0, within the numeric precision of the calculation, represents an indeterminate state between stability and instability. The numeric value of CM indicates the degree of stability or instability. For example, 0.1 would indicate 10% stability and 0.9 would indicate 90% stability. Similarly, -0.1 would indicate 10% instability and -0.9 would indicate 90% instability. The standard deviation (SD) of these values can also be computed.

Note that CM only pertains to stability or instability with respect to the particular metric that has been evaluated (e.g., failures/KLOC). The evaluation of stability should be made with respect to a set of metrics and not a single metric. The average of the CM for a set of metrics can be computed to obtain an overall metric of stability.

In addition to the overall change metric as represented by CM, we are interested in flagging unusual RCs, both highly stable and highly unstable,
because these values could be indicative of significant changes in the product and process. Thus we define the following RCs:

Highly stable (HS): RC > CM + 3 × SD, and CM ≥ 0
Highly unstable (HU): RC < CM - 3 × SD, and CM ≤ 0

Where HS or HU holds, we call these RCs critical transitions (e.g., relative change from one release to the next).
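To make the procedure concrete, the following sketch (ours, for illustration; the function and variable names are not from the author's tooling) computes the relative changes, CM, its standard deviation, and the HS/HU flags for a single metric measured across a sequence of releases. It assumes the reported standard deviation is the sample standard deviation.

from statistics import mean, stdev

def relative_changes(values, increase_is_desirable):
    """Relative change (RC) for each consecutive release pair j -> j+1."""
    rcs = []
    for prev, curr in zip(values, values[1:]):
        delta = curr - prev
        # An increase is divided by the value in release j+1, a decrease by the value in release j.
        denom = curr if delta > 0 else prev
        magnitude = abs(delta) / denom
        # Positive if the change moved in the desirable direction, negative otherwise.
        desirable = (delta > 0) == increase_is_desirable
        rcs.append(magnitude if desirable else -magnitude)
    return rcs

def change_metric(values, increase_is_desirable):
    """CM is the mean RC; also return the sample SD and any HS/HU critical transitions."""
    rcs = relative_changes(values, increase_is_desirable)
    cm, sd = mean(rcs), stdev(rcs)
    hs = [rc for rc in rcs if cm >= 0 and rc > cm + 3 * sd]  # highly stable
    hu = [rc for rc in rcs if cm <= 0 and rc < cm - 3 * sd]  # highly unstable
    return cm, sd, hs, hu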
3.3 Shape Metrics
In addition to trends in metrics, the shapes of metric functions provide indicators of maintenance stability. We use shape metrics to analyze the stability of an individual release and the trend of these metrics across releases to analyze long-term stability. The rationale of these metrics is that it is better to reach important points in the growth of product reliability sooner rather than later. If we reach these points late in testing, it is indicative of a process that is late in achieving stability. We use the following types of shape metrics (a computational sketch of metrics 2-4 follows this list):

1. Direction and magnitude of the slope of a metric function (e.g., failure rate decreases asymptotically with total test time). Using failure rate as an example within a release, it is desirable that it rapidly decrease towards zero with increasing total test time and that it have small values.
2. Percent of total test time at which a metric function changes from unstable (e.g., increasing failure rate) to stable (e.g., decreasing failure rate) and remains stable. Across releases, it is desirable that the total test time at which a metric function becomes stable gets progressively smaller.
3. Percent of total test time at which a metric function changes at a maximum rate in a favorable direction (e.g., failure rate has maximum negative rate of change). Using failure rate as an example, it is desirable for it to achieve its maximum rate of decrease as soon as possible, as a function of total test time.
4. Test time at which a metric function reaches its maximum value (e.g., test time at which failure rate reaches its maximum value). Using failure rate as an example, it is desirable for it to reach its maximum value (i.e., transition from unstable to stable) as soon as possible, as a function of total test time.
5. Risk: probability of not meeting reliability and safety goals (e.g., time to next failure should exceed mission duration), using various shape metrics as indicators of risk. Risk would be low if the conditions in 1-4 above hold.
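As an illustration only (ours, not the author's implementation), the sketch below computes shape metrics 2-4 from a failure-rate series sampled at increasing percentages of total test time; "becomes stable and remains stable" is read here as the first sample after which the rate never increases again.

def shape_metrics(pct_test_time, failure_rate):
    """pct_test_time and failure_rate are parallel lists, sampled at increasing
    percentages of total test time."""
    # Metric 4: percent of test time at which the failure rate reaches its maximum value.
    i_max = max(range(len(failure_rate)), key=lambda i: failure_rate[i])
    peak_pct = pct_test_time[i_max]

    # Metric 2: first sample after which the failure rate never increases again.
    stable_pct = None
    for i in range(len(failure_rate)):
        tail = failure_rate[i:]
        if all(later <= earlier for earlier, later in zip(tail, tail[1:])):
            stable_pct = pct_test_time[i]
            break

    # Metric 3: percent of test time with the maximum negative rate of change (steepest drop).
    slopes = [(failure_rate[i + 1] - failure_rate[i]) / (pct_test_time[i + 1] - pct_test_time[i])
              for i in range(len(failure_rate) - 1)]
    steepest_idx = min(range(len(slopes)), key=lambda i: slopes[i])
    steepest_pct = pct_test_time[steepest_idx + 1]

    return peak_pct, stable_pct, steepest_pct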
4. Metrics for Long-Term Analysis
We use certain metrics only for long-term analysis. As an example, we compute the following trend metrics over a sequence of releases:

1. Mean time to failure (MTTF)
2. Total failures normalized by KLOC change to the code
3. Total test time normalized by KLOC change to the code
4. Remaining failures normalized by KLOC change to the code
5. Time to next failure.
5. Metrics for Long-Term and Short-Term Analysis
We use other metrics for both long-term and short-term analysis. As an example, we compute the following trend (1) and shape (2-5) metrics over a sequence of releases and within a given release:

1. Percent of total test time required for remaining failures to reach a specified value.
2. Degree to which failure rate asymptotically approaches zero with increasing total test time.
3. Percent of total test time required for failure rate to become stable and remain stable.
4. Percent of total test time required for failure rate to reach maximum decreasing rate of change (i.e., slope of the failure rate curve).
5. Maximum failure rate and total test time where failure rate is maximum.
6. Data and Example Application
We use the Shuttle application to illustrate the concepts. This large maintenance project has been evolving with increasing functionality since 1983 as reported by Billings et al. [14]. We use data collected from the developer of the flight software of the NASA Space Shuttle, as shown in Table I, which has two parts: 1 and 2. This table shows operational increments (OIs) of the Shuttle: OIA-OIQ, covering the period 1983-1997. We define an OI as follows: a software system comprised of modules and configured from a series of builds to meet Shuttle mission functional requirements as stated in Schneidewind [15]. In Part 1, for each of the OIs, we show the release date (the date of release by the contractor to NASA),
total post-delivery failures, and failure severity (decreasing in severity from "1" to "4"). In Part 2, we show the maintenance change to the code in KLOC (source language changes and additions) and the total test time of the OI. In addition, for those OIs with at least two failures, we show the computation of MTTF, failures/KLOC, and total test time/KLOC. KLOC is an indicator of maintenance actions, not functionality as stated by Keller [16]. Increased functionality, as measured by the increase in the size of principal functions loaded into mass memory, has averaged about 2%
TABLE I--PART 1 CHARACTERISTICSOF MAINTAINED SOFTWAREACROSS SHUTTLE RELEASES
Operational increment
Release date
Launch date
Mission duration (days)
Reliability prediction date
Total post delivery failures
12/9/85
6
9/1/83
No flights
B
12/12/83
8/30/84
6
8/14/84
10
C
6/8/84
4/12/85
7
1/17/85
10
D
10/5/84
11/26/85
7
10/22/85
12
E
2/15/85
1/12/86
6
5/11/89
5
F G
12/17/85 6/5/87
2 3
H
10/13/88
3
I J
K L
6/29/89 6/18/90 5/2/91 6/15/92
M N O
7/15/93 7/13/94 10/18/95
8/2/91
11/19/96
9
18
7/19/91
9/26/96
3 7 1 3
1 1 5
7/16/96
3
3/5/97
1
Failure severity One 2 Five 3 Two 2 Eight 3 Two 2 Seven 3 One 4 Five 2 Seven 3 One 2 Four 3 Two 3 One 1 Two 3 Two 1 One 3 Three 3 Seven 3 One 1 One 1 One 2 One 3 One 3 One 3 One 2 Four 3 One 2 Two 3 One 3
TABLE I--PART 2
CHARACTERISTICS OF MAINTAINED SOFTWARE ACROSS SHUTTLE RELEASES

Operational   KLOC     Total test    MTTF     Total failures/   Total test time/
increment     change   time (days)   (days)   KLOC change       KLOC change (days)
A             8.0      1078          179.7    0.750             134.8
B             11.4     4096          409.6    0.877             359.3
C             5.9      4060          406.0    1.695             688.1
D             12.2     2307          192.3    0.984             189.1
E             8.8      1873          374.6    0.568             212.8
F             6.6      412           206.0    0.303             62.4
G             6.3      3077          1025.7   0.476             488.4
H             7.0      540           180.0    0.429             77.1
I             12.1     2632          877.3    0.248             217.5
J             29.4     515           73.6     0.238             17.5
K             21.3     182           --       --                8.5
L             34.4     1337          445.7    0.087             38.9
M             24.0     386           --       --                16.1
N             10.4     121           --       --                11.6
O             15.3     344           68.8     0.327             22.5
P             7.3      272           90.7     0.411             37.3
Q             11.0     75            --       --                6.8
over the last 10 OIs. Therefore, if a stable process were observed, it could not be attributed to decreasing functionality. Also to be noted is that the software developer is a Capability Maturity Model level 5 organization that has continually improved its process. Because the flight software is run continuously, around the clock, in simulation, test, or flight, total test time refers to continuous execution time from the time of release. For OIs where there was a sufficient sample size (i.e., total post-delivery failures)--OIA, OIB, OIC, OID, OIE, OIJ, and OIO--we predicted software reliability. For these OIs, we show launch date, mission duration, and reliability prediction date (i.e., the date when we made a prediction). Fortunately, for the safety of the crew and mission, there have been few post-delivery failures. Unfortunately, from the standpoint of prediction, there is a sparse set of observed failures from which to estimate reliability model parameters, particularly for recent OIs. Nevertheless, we predict reliability prior to launch date for OIs with as few as five failures spanning many months of maintenance and testing. In the case of OIE, we predict reliability after launch because no failures had occurred prior to launch to use in the prediction model. Because of the scarcity of failure data,
we made predictions using all severity levels of failure data. This turns out to be beneficial when making reliability risk assessments using number of remaining failures. For example, rather than specifying that the number of predicted remaining failures must not exceed one severity "1" failure, the criterion could specify that the prediction not exceed one failure of any type--a more conservative criterion as described by Schneidewind [15]. As would be expected, the number of pre-delivery failures is much greater than the number of post-delivery failures because the software is not as mature from a reliability standpoint. Thus, a way around the insufficient sample size of recent OIs for reliability prediction is to use pre-delivery failures for model fit and then use the fitted model to predict post-delivery failures. However, we are not sure that this approach is appropriate because the multiple builds in which failures can occur and the test strategies used to attempt to crash various pieces of code during the pre-delivery process contrast sharply with the post-delivery environment of testing an integrated OI with operational scenarios. Nevertheless, we are experimenting with this approach in order to evaluate the prediction accuracy. The results will be reported in a future paper.
7. Relationships among Maintenance, Reliability, and Test Effort

7.1 Metrics for Long-Term Analysis
We want our maintenance effort to result in increasing reliability of software over a sequence of releases. A graph of this relationship over calendar time and the accompanying CM calculations indicate whether the long-term maintenance effort has been successful as it relates to reliability. In order to measure whether this is the case, we use both predicted and actual values of metrics. We predict reliability in advance of deploying the software. If the predictions are favorable, we have confidence that the risk is acceptable to deploy the software. If the predictions are unfavorable, we may decide to delay deployment and perform additional inspection and testing. Another reason for making predictions is to assess whether the maintenance process is effective in improving reliability and to do it sufficiently early during maintenance to improve the maintenance process. In addition to making predictions, we collected and analyzed historical reliability data. This data shows in retrospect whether maintenance actions were successful in increasing reliability. In addition, the test effort should not be disproportionate to the amount of code that is changed and to the reliability that is achieved as a result of maintenance actions.
7.1.1 Mean Time to Failure
We want mean time to failure (MTTF), as computed by equation (1), to show an increasing trend across releases, indicating increasing reliability.

Mean time to failure = Total test time / Total number of failures during test   (1)
7.1.2 Total Failures
Similarly, we want total failures (and faults), normalized by KLOC change in code, as computed by equation (2), to show a decreasing trend across releases, indicating that reliability is increasing with respect to code changes.

Total failures / KLOC = Total number of failures during test / KLOC change in code on the OI   (2)
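For example (our arithmetic, using only values quoted in Table I--Part 2), the OIA figures of 1078 days of total test time, six post-delivery failures, and an 8.0 KLOC change reproduce the MTTF and failures/KLOC values listed for OIA:

total_test_time_days = 1078   # OIA
total_failures = 6            # OIA post-delivery failures
kloc_change = 8.0             # OIA KLOC change

mttf = total_test_time_days / total_failures        # equation (1): ~179.7 days
failures_per_kloc = total_failures / kloc_change    # equation (2): 0.75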
We plot equations (1) and (2) in Fig. 1 and Fig. 2, respectively, against release time of OI. This is the number of months since the release of the OI, using "0" as the release time of OIA. We identify the OIs at the bottom of the plots. Both of these plots use actual values (i.e., historical data). The CM value for equation (1) is -0.060, indicating 6.0% instability with respect to MTTF, and 0.087 for equation (2), indicating 8.7% stability with respect to normalized total failures. The corresponding standard deviations are 0.541 and 0.442.

FIG. 1. Mean time to failure across releases.
FIG. 2. Total failures per KLOC across releases.
As an example of applying the highly stable (HS) criterion for MTTF, there is no RC > CM + 3 × SD = 1.563. Applying the highly unstable (HU) criterion, there is no RC < CM - 3 × SD = -1.683. Thus there are no critical transitions for MTTF. Similarly, for total failures/KLOC, HS = 1.413 and HU = -1.239 and there are no critical transitions. The large variability in CM in this application is due to the large variability in functionality across releases. Furthermore, it is not our objective to judge the process that is used in this example. Rather, our purpose in showing these and subsequent values of CM is to illustrate our model. We use these plots and the CM to assess the long-term stability of the maintenance process. We show example computations of CM for equations (1) and (2) in Table II.
7.1.3 Total Test Time
We want total test time, normalized by KLOC change in code, as computed by equation (3), to show a decreasing trend across releases, indicating that test effort is decreasing with respect to code changes.

Total test time / KLOC = Total test time / KLOC change in code on the OI   (3)
TABLE II
EXAMPLE COMPUTATIONS OF CHANGE METRIC (CM)

Operational   MTTF     Relative   Total failures/   Relative
increment     (days)   change     KLOC              change
A             179.7    --         0.750             --
B             409.6    0.562      0.877             -0.145
C             406.0    -0.007     1.695             -0.483
D             192.3    -0.527     0.984             0.419
E             374.6    0.487      0.568             0.423
J             73.6     -0.805     0.238             0.581
O             68.8     -0.068     0.330             -0.272
CM                     -0.060                       0.087
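For instance, applying the change_metric sketch from Section 3.2 to the MTTF column of Table II reproduces the tabulated CM of about -0.06 (individual RCs differ from the published values only by rounding):

# MTTF (days) for OIs A, B, C, D, E, J, O from Table II; a higher MTTF is desirable.
mttf = [179.7, 409.6, 406.0, 192.3, 374.6, 73.6, 68.8]
cm, sd, hs, hu = change_metric(mttf, increase_is_desirable=True)
print(round(cm, 3), round(sd, 3), hs, hu)   # approx. -0.059, 0.540, [], []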
We plot equation (3) in Fig. 3 against release time of OI, using actual values. The CM value for this plot is 0.116, with a standard deviation of 0.626, indicating 11.6% stability with respect to efficiency of test effort. HS = 1.994 and HU = -1.762 and there are no critical transitions. We use this plot and the CM to assess whether testing is efficient with respect to the amount of code that has been changed.

FIG. 3. Total test time per KLOC across releases.

7.2 Reliability Predictions

7.2.1 Total Failures

Up to this point, we have used only actual data in the analysis. Now we expand the analysis to use both predictions and actual data but only for the
seven OIs where we could make predictions. Using the Schneidewind model described in American Institute of Aeronautics and Astronautics [15, 17-20] and the SMERFS software reliability tool developed by Farr and Smith [21], we show prediction equations, using 30 day time intervals, and make predictions for OIA, OIB, OIC, OID, OIE, OIJ, and OIO. This model or any other applicable model may be used as described in American Institute of Aeronautics and Astronautics [17] and Farr and Smith [21]. To predict total failures in the range [1, ∞] (i.e., failures over the life of the software), we use equation (4):

F(∞) = α/β + X_{s-1}   (4)

where the terms are defined as follows:

s: starting time interval for using failure counts for computing parameters α and β;
α: initial failure rate;
β: rate of change of failure rate; and
X_{s-1}: observed failure count in the range [1, s - 1].

Now, we predict total failures normalized by KLOC change in code. We want predicted normalized total failures to show a decreasing trend across releases. We computed a CM value for this data of 0.115, with a standard deviation of 0.271, indicating 11.5% stability with respect to predicted normalized total failures. HS = 0.928 and HU = -0.698 and there are no critical transitions.
7.2.2 Remaining Failures

To predict remaining failures r(t) at time t, we use equation (5) as described in American Institute of Aeronautics and Astronautics [17], Keller [18], and Schneidewind [20]:

r(t) = F(∞) - X_t   (5)

This is the predicted total failures over the life of the software minus the observed failure count at time t. We predict remaining failures, normalize them by KLOC change in code, and compare them with normalized actual remaining failures for seven OIs in Fig. 4. We approximate actual remaining failures at time t by subtracting the observed failure count at time t from the observed total failure count at time T, where T >> t. The reason for this approach is that we are approximating the failure count over the life of the software by using the
failure count at time T. We want equation (5) and actual remaining failures, normalized by KLOC change in code, to show a decreasing trend over a sequence of releases. The CM values for these plots are 0.107 and 0.277, respectively, indicating 10.7% and 27.7% stability, respectively, with respect to remaining failures. The corresponding standard deviations are 0.617 and 0.715. HS = 1.958 and HU = -1.744 for predicted remaining failures and there are no critical transitions. In addition, HS = 2.422 and HU = -1.868 for actual remaining failures and there are no critical transitions.
7.2.3 Time to Next Failure
To predict the time for the next F_t failures to occur, when the current time is t, we use equation (6) (from American Institute of Aeronautics and Astronautics [17, 18]):

T_F(t) = (1/β) log[α/(α - β(X_{s,t} + F_t))] - (t - s + 1)   (6)

The terms in T_F(t) have the following definitions:

t: current time interval;
X_{s,t}: observed failure count in the range [s, t]; and
F_t: given number of failures to occur after interval t (e.g., one failure).
FIG. 4. Reliability of maintained software--remaining failures normalized by change to code.
We want equation (6) to show an increasing trend over a sequence of releases. Predicted and actual values are plotted for six OIs (OIO has no failures) in Fig. 5. The CM values for these plots are -0.152 and -0.065, respectively, indicating 15.2% and 6.5% instability, respectively, with respect to time to next failure. The corresponding standard deviations are 0.693 and 0.630. HS = 1.927 and HU = -2.231 for predicted time to next failure and there are no critical transitions. In addition, HS = 1.825 and HU = -1.955 for actual time to next failure and there are no critical transitions. We predicted values of total failures, remaining failures, and time to next failure as indicators of the risk of operating software in the future: is the predicted future reliability of software an acceptable risk? The risk to the mission may or may not be acceptable. If the latter, we take action to improve the maintained product or the maintenance process. We use actual values to measure the reliability of software and the risk of deploying it resulting from maintenance actions.
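The three predictions of this section can be written down directly from equations (4)-(6). The sketch below is ours, for illustration only, and assumes that the parameters α and β have already been estimated (for example with SMERFS [21]); it is not the SMERFS implementation.

import math

def predicted_total_failures(alpha, beta, x_before_s):
    """Equation (4): F(infinity) = alpha/beta + X_{s-1}."""
    return alpha / beta + x_before_s

def predicted_remaining_failures(alpha, beta, x_before_s, x_t):
    """Equation (5): r(t) = F(infinity) - X_t, with X_t the failure count observed through interval t."""
    return predicted_total_failures(alpha, beta, x_before_s) - x_t

def time_to_next_failures(alpha, beta, x_s_t, f_t, t, s):
    """Equation (6): intervals beyond t until the next f_t failures; requires alpha > beta * (x_s_t + f_t)."""
    return (1.0 / beta) * math.log(alpha / (alpha - beta * (x_s_t + f_t))) - (t - s + 1)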
7.3 Summary
We summarize change metric values in Table III. The overall (i.e., average CM) value indicates 7.1% stability. If the majority of the results and the average CM were negative, this would be an alert to investigate the cause. The results could be caused by: (1) greater functionality and complexity in
FIG. 5. Reliability of maintained software--time to next failure.
TABLE III
CHANGE METRIC SUMMARY

Metric                        Actual    Predicted
Mean time to failure          -0.060    --
Total test time per KLOC       0.116    --
Total failures per KLOC        0.087    0.115
Remaining failures per KLOC    0.277    0.107
Time to next failure          -0.065    -0.152
Average                        0.071    --
the software over a sequence of releases, (2) a maintenance process that needs to be improved, or (3) a combination of these causes.
7.4 Metrics for Long-Term and Short-Term Analysis
In addition to the long-term maintenance criteria, it is desirable that the maintenance effort results in increasing reliability within each release or OI. One way to evaluate how well we achieve this goal is to predict and observe the amount of test time that is required to reach a specified number of remaining failures. In addition, we want the test effort to be efficient in finding residual faults for a given OI. Furthermore, number of remaining failures serves as an indicator of the risk involved in using the maintained software (i.e., a high value of remaining failures portends a significant number of residual faults in the code). In the analysis that follows we use predictions and actual data for a selected OI to illustrate the process: OID.
7.4.1 Total Test Time Required for Specified Remaining Failures
We predict the total test time that is required to achieve a specified number of remaining failures, r(t_t), at time t_t, by equation (7) as described in American Institute of Aeronautics and Astronautics [17] and Schneidewind [20]:

t_t = (1/β) log[α/(β r(t_t))] + (s - 1)   (7)
FIG. 6. Total test time to achieve remaining failures.
We plot predicted and actual total test time for OID in Fig. 6 against given number of remaining failures. The two plots have similar shapes and show the typical asymptotic characteristic of reliability (e.g., remaining failures) versus total test time. These plots indicate the possibility of big gains in reliability in the early part of testing; eventually the gains become marginal as testing continues. The figure also shows how risk is reduced with a decrease in remaining failures that is accomplished with increased testing. Predicted values are used to gauge how much maintenance test effort would be required to achieve desired reliability goals and whether the predicted amount of total test time is technically and economically feasible. We use actual values to judge whether the maintenance test effort has been efficient in relation to the achieved reliability.
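Equation (7) can be sketched in the same style as the earlier prediction functions (our illustration, with α and β assumed already estimated); it returns the total test time, in intervals, at which the predicted number of remaining failures falls to the specified value:

import math

def test_time_for_remaining_failures(alpha, beta, remaining_target, s):
    """Equation (7): t_t = (1/beta) * log(alpha / (beta * r(t_t))) + (s - 1)."""
    return (1.0 / beta) * math.log(alpha / (beta * remaining_target)) + (s - 1)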
7.4.2 Failure Rate
In the short-term (i.e., within a release), we want the failure rate (1/MTTF) of an OI to decrease over an OI's total test time, indicating increasing reliability. Practically, we would look for a decreasing trend, after an initial period of instability (i.e., increasing rate as personnel learn how to maintain new software). In addition, we use various shape metrics, as defined previously, to see how quickly we can achieve reliability growth with respect to test time expended. Furthermore, failure rate is an indicator of the risk involved in using the maintained software (i.e., an increasing failure
rate indicates an increasing probability of failure with increasing use of the software).

Failure rate = Total number of failures during test / Total test time   (8)

We plot equation (8) for OID in Fig. 7 against total test time since the release of OID. Figure 7 does show that short-term stability is achieved (i.e., failure rate asymptotically approaches zero with increasing total test time). In addition, this curve shows when the failure rate transitions from unstable (increasing failure rate) to stable (decreasing failure rate). The figure also shows how risk is reduced with decreasing failure rate as the maintenance process stabilizes. Furthermore, in Fig. 8 we plot the rate of change (i.e., slope) of the failure rate of Fig. 7. This curve shows the percent of total test time when the rate of change of failure rate reaches its maximum negative value. We use these plots to assess whether we have achieved short-term stability in the maintenance process (i.e., whether failure rate decreases asymptotically with increasing total test time). If we obtain contrary results,
FIG. 7. OID failure rate.
FIG. 8. OID rate of change of failure rate.
this would be an alert to investigate whether this is caused by: (1) greater functionality and complexity of the OI as it is being maintained, (2) a maintenance process that needs to be improved, or (3) a combination of these causes. Another way of looking at failure rate with respect to stability and risk is the annotated failure rate of OID shown in Fig. 9, where we show both the actual and predicted failure rates. We use equations (8) and (9) as described in American Institute of Aeronautics and Astronautics [17] to compute the actual and predicted failure rates, respectively, where i is a vector of time intervals for i ≥ s in equation (9):
f(i) = α exp(-β(i - s + 1))   (9)
A 30-day interval has been found to be convenient as a unit of Shuttle test time because testing can last for many months or even years. Thus this is the unit used in Fig. 9, where we show the following events in intervals (the predictions were made at 12.73 intervals):

Release time: 0 intervals,
Launch time: 13.90 intervals,
Predicted time of maximum failure rate: 6.0 intervals,
Actual time of maximum failure rate: 7.43 intervals,
FIG. 9. OID failure rate: predicted versus actual.
Predicted maximum failure rate: 0.5735 failures per interval, and
Actual maximum failure rate: 0.5381 failures per interval.

In Fig. 9, stability is achieved after the maximum failure rate occurs. This is at i = s (i.e., i = 6 intervals) for predictions because equation (9) assumes a monotonically decreasing failure rate, whereas the actual failure rate increases, reaches a maximum at 7.43 intervals, and then decreases. Once stability is achieved, risk decreases.
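Equation (9), the predicted failure-rate curve plotted in Fig. 9, can be sketched as follows (ours, for illustration, with α and β assumed already estimated). Because the curve decreases monotonically, its maximum always falls at i = s, which is why the predicted maximum failure rate in Fig. 9 occurs at 6 intervals:

import math

def predicted_failure_rate(alpha, beta, i, s):
    """Equation (9): f(i) = alpha * exp(-beta * (i - s + 1)), defined for i >= s."""
    return alpha * math.exp(-beta * (i - s + 1))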
7.5 Summary
In addition to analyzing short-term stability with these metrics, we use them to analyze long-term stability across releases. We show the results in Table IV where the percent of total test time to achieve reliability growth goals is tabulated for a set of OIs, using actual failure data, and the change metrics are computed. Reading from left to right, the values of CM indicate 8.9% stability, 7.0% instability, and 10.1% instability; there are no critical transitions. Interestingly, except for OID, the maximum negative rate of
TABLE IV
PERCENT OF TOTAL TEST TIME REQUIRED TO ACHIEVE RELIABILITY GOALS AND CHANGE METRICS (CM)

Operational   One remaining failure   Relative   Stable failure rate   Relative   Maximum failure rate change   Relative
increment     (% test time)           change     (% test time)         change     (% test time)                 change
A             77.01                   --         76.99                 --         76.99                         --
B             64.11                   0.168      64.11                 0.167      64.11                         0.167
C             32.36                   0.495      10.07                 0.843      10.07                         0.843
D             84.56                   -0.617     12.70                 -0.207     22.76                         -0.558
E             83.29                   0.015      61.45                 -0.793     61.45                         -0.630
J             76.88                   0.077      76.89                 -0.201     76.89                         -0.201
O             46.49                   0.395      100.00                -0.231     100.00                        -0.231
CM                                    0.089                            -0.070                                   -0.101
STD DEV                               0.392                            0.543                                    0.544
change of failure rate occurs when failure rate becomes stable, suggesting that maximum reliability growth occurs when the maintenance process stabilizes.
8. Shuttle Operational Increment Functionality and Process Improvement
Table V shows the major functions of each OI as reported in Lockheed-Martin [22] along with the release date and KLOC change repeated from Table I. There is not a one-for-one relationship between KLOC change and the functionality of the change because, as stated earlier, KLOC is an indicator of maintenance actions, not functionality. However, the software developer states that there has been increasing software functionality and complexity with each OI, in some cases with less rather than more KLOC as stated by Keller [16]. The focus of the early OIs was on launch, orbit, and landing. Later OIs, as indicated in Table V, built upon this baseline functionality to add greater functionality in the form of MIR docking and the Global Positioning System, for example. Table VI shows the process improvements that have been made over time on this project, indicating continuous process improvement across releases. The stability analysis that was performed yielded mixed results: about half are stable and half are unstable. Some variability in the results may
TABLE V
SHUTTLE OPERATIONAL INCREMENT FUNCTIONALITY

Operational increment   Release date   KLOC change
A                       9/1/83         8.0
B                       12/12/83       11.4
C                       6/8/84         5.9
D                       10/5/84        12.2
E                       2/15/85        8.8
F                       12/17/85       6.6
G                       6/5/87         6.3
H                       10/13/88       7.0
I                       6/29/89        12.1
J                       6/18/90        29.4
K                       5/21/91        21.3
L                       6/15/92        34.4
M                       7/15/93        24.0
N                       7/13/94        10.4
O                       10/18/95       15.3
P                       7/16/96        7.3
Q                       3/5/97         11.0

Operational increment function (A-Q, in order): Redesign of main engine controller Payload re-manifest capabilities Crew enhancements Experimental orbit autopilot. Enhanced ground checkout Western test range. Enhance propellant dumps Centaur Post 51-L (Challenger) safety changes System improvements Abort enhancements Extended landing sites. Trans-Atlantic abort code co-residency Redesigned abort sequencer One engine auto contingency aborts Hardware changes for new orbiter Abort enhancements On-orbit changes MIR docking. On-orbit digital autopilot changes Three engine out auto contingency Performance enhancements Single global positioning system
be due to gaps in the data caused by OIs that have experienced insufficient failures to permit statistical analysis. Also, we note that the values of CM are small for both the stable and unstable cases. Although there is not pronounced stability, neither is there pronounced instability. If there were consistent and large negative values of CM, it would be cause for alarm and would suggest the need to perform a thorough review of the process. This is not the case for the Shuttle. We suspect but cannot prove that in the absence of the process improvements of Table VI, the CM values would look much worse. It is very difficult to associate a specific product improvement with a specific process improvement. A controlled experiment would be necessary to hold all process factors constant and observe the one factor of interest and its influence on product quality. This is infeasible to do in industrial organizations. However, we suggest that in the aggregate a series of process improvements is beneficial for product quality and that a set of CM values can serve to highlight possible process problems.
TABLE VI
CHRONOLOGY OF PROCESS IMPROVEMENTS

Year in which process improvement introduced: 1976, 1977, 1978, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1992

Process improvement (in order): Structured flows; Formal software inspections; Formal inspection moderators; Formalized configuration control; Inspection improvements; Configuration management database; Oversight analyses; Build automation; Formalized requirements analysis; Quarterly quality reviews; Prototyping; Inspection improvements; Formal requirements inspections; Process applied to support software; Reconfiguration certification; Reliability modeling and prediction; Process maturity measurements; Formalized training; Software metrics

9. United States Air Force Global Awareness (GA) Program Application
The change metric has also been incorporated into the decision support system (DSS) of the US Air Force Global Awareness (GA) program of the Strategic and Nuclear Deterrence Command and Control System Program Office for the purpose of tracking and evaluating various computer and communications resource and performance metrics as described by Schneidewind [23]. GA systems are ground-based radar and optical sensors, which provide warning to North America of attack from air, space, or missiles. Most of these sensors operate 24 hours per day, seven days per week. The system provides a variety of search and track operations to collect metric data and optical signature data on satellites (primarily deep space satellites) for the United States Air Force. These electro-optical sensors operate between sunset and sunrise. Sensor sites are in continual need of upgrade, modification, and changes to mission-critical hardware and software components. The development of a set of metrics, whereby managers can make informed decisions concerning the future of this system,
is being accomplished to establish a set of statistically relevant data as described by Schneidewind [23]. The effort began in 1996 by identifying candidate forecasting metrics and instituting processes to collect and analyze forecasting data. The Metrics Engineering Team established a data repository to store metrics, including system status metrics and candidate forecasting metrics. The most recent team efforts include development of a process and an automated tool to integrate metrics collection and analysis--the DSS. The metric definitions are given below as reported in OAO Corporation [24]. These metrics are considered crucial to assessing operational performance. All metrics are reported monthly.

Mean time between critical failures (MTBCF): Average time in hours between failures of critical components. The purpose of this metric is to identify the frequency of outages. The timing of replacement actions can also be determined through use of this metric.

MTBCF = Time the system is available / Number of critical failures

Mean time to repair (MTTR): Average time in minutes to restore the system to an operational state. The purpose of this metric is to identify the average time required to repair and return a sensor site from a critical failure to an operational status.

MTTR = Total repair time for all outages / Number of outages

System availability: Percentage of time that the system is available to meet mission needs. The purpose of this metric is to determine system health, determine how well the system is meeting mission requirements, and determine the probability that the system will be available at any given time for mission accomplishment.

System availability = [(Programmed operating time - System downtime) / Programmed operating time] × 100

Software failure time loss: Total time lost in minutes from system operation due to software failures. The purpose of this metric is to assess operational
time lost due to software failures. Software failure time loss = Sum of all down time due to software failures. This application of CM involves using the same metrics on the same release but across different sites. An example is given in Table VII. Here CM is computed for each metric and each site over a number of time periods. Reading from left to right in Table VII, the best stability values are 4.2%, 0.7%, 7.0%, and 0.4%; the worst instability values are 0.1%, 6.4%, 0.6%, and 4.8%. The CM is used to identify sites that have low performance relative to other sites. Table VIII shows the CM ranking of the four sites, where "1" is the highest rank (most stable) and "4" is the lowest rank (most unstable). In addition, domain experts assess AF Base 2 to be the worst of the lot in overall performance; this is borne out by the rankings in Table VIII and Table IX, which show the average ranks considering all four metrics. For AF Base 1, there were two consecutive critical transitions of mean time between critical failures from 672 to 186 to 720 hours in February, March, and April 1999, respectively. In addition, for AF Base 3, there were two consecutive critical transitions of system availability from 81.01% to 50.28% to 83.88% in July, August, and September, 1997, respectively. These transitions were flagged by the HU and HS criteria, respectively. However, there is no information in the available data as to why these transitions occurred.
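A small sketch (ours; the function names are illustrative and are not taken from the DSS tool) of the four metric definitions above:

def mtbcf(available_hours, critical_failures):
    """Mean time between critical failures, in hours."""
    return available_hours / critical_failures

def mttr(total_repair_minutes, outages):
    """Mean time to repair, in minutes."""
    return total_repair_minutes / outages

def system_availability(programmed_hours, downtime_hours):
    """Percentage of programmed operating time the system was available."""
    return (programmed_hours - downtime_hours) / programmed_hours * 100.0

def software_failure_time_loss(software_downtimes_minutes):
    """Total minutes lost to software failures."""
    return sum(software_downtimes_minutes)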
TABLE VII
CHANGE METRIC VALUES

Metric                                AF Base 1   AF Base 2   AF Base 3   AF Base 4
Mean time between critical failures   -0.001      -0.011      -0.006       0.004
Mean time to repair                    0.018      -0.051       0.002      -0.048
System availability                    0.000       0.007       0.000      -0.001
Software failure time loss             0.042      -0.064       0.070      -0.018
TABLE VIII
CHANGE METRIC RANKS

Metric                                AF Base 1   AF Base 2   AF Base 3   AF Base 4
Mean time between critical failures   2           4           3           1
Mean time to repair                   1           4           2           3
System availability rank              2.5         1           2.5         4
Software failure time loss rank       2           4           1           3
TABLE IX
AVERAGE RANK OF FOUR METRICS

AF Base 1   AF Base 2   AF Base 3   AF Base 4
1.875       3.25        2.125       2.75
10. Conclusions
As stated in Section 1, our emphasis in this chapter was to propose a unified product and process measurement model for both product evaluation and process stability analysis. We were less interested in the results of the Shuttle and US Air Force stability analysis, which were used to illustrate the model concepts. We have presented the concept of process stability related to product quality and we have shown how to compute the change metric (CM) that assesses process stability as a function of product metrics. One example involved applying CM across several releases of the Shuttle flight software. A second example involved applying CM across multiple sites of the US Air Force Global Awareness Program. The highly stable and highly unstable criteria are useful for flagging anomalous values of relative change. These values may signify significant changes in product or process. We conclude, based on both predictive and retrospective use of reliability, risk, and test metrics, that it is feasible to measure and assess both product quality and the stability of a maintenance process. The model is not domain specific. Different organizations may obtain different numerical results and trends than the ones we obtained for our example applications.

ACKNOWLEDGMENTS

We acknowledge the support provided for this project by Dr. William Farr, Naval Surface Warfare Center; Mr. Ted Keller of IBM; and Ms. Patti Thornton and Ms. Julie Barnard of United Space Alliance.

REFERENCES

[1] Hollenbach, C., Young, R., Pflugrad, A. and Smith, D. (1997). "Combining quality and software improvement". Communications of the ACM, 40, 41-45.
[2] Lehman, M. M. (1980). "Programs, life cycles, and laws of software evolution". Proceedings of the IEEE, 68, 1060-1076.
[3] Schneidewind, N. F. (1997). "Measuring and evaluating maintenance process using reliability, risk, and test metrics". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 232-239.
[4] Schneidewind, N. F. (1998). "How to evaluate legacy system maintenance". IEEE Software, 15, 34-42. Also translated into Japanese and reprinted in: Nikkei Computer Books (1998). Nikkei Business Publications, Tokyo, Japan, 232-240.
[5] Schneidewind, N. F. (1999). "Measuring and evaluating maintenance process using reliability, risk, and test metrics". IEEE Transactions on Software Engineering, 25, 768-781.
[6] Briand, L. C., Basili, V. R. and Kim, Y.-M. (1994). "Change analysis process to characterize software maintenance projects". Proceedings of the International Conference on Software Maintenance, Victoria, IEEE Computer Society, 38-49.
[7] Gefen, D. and Schneberger, S. L. (1996). "The non-homogeneous maintenance periods: a case study of software modifications". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 134-141.
[8] Henry, J., Henry, S., Kafura, D. and Matheson, L. (1994). "Improving software maintenance at Martin Marietta". IEEE Software, IEEE Computer Society, 11, 67-75.
[9] Khoshgoftaar, T. M., Allen, E. B., Halstead, R. and Trio, G. P. (1996). "Detection of fault-prone software modules during a spiral life cycle". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 69-76.
[10] Pearse, T. and Oman, P. (1995). "Maintainability measurements on industrial source code maintenance activities". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 295-303.
[11] Pigoski, T. M. and Nelson, L. E. (1994). "Software maintenance metrics: a case study". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 392-401.
[12] Sneed, H. (1996). "Modelling the maintenance process at Zurich Life Insurance". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 217-226.
[13] Stark, G. E. (1996). "Measurements for managing software maintenance". Proceedings of the International Conference on Software Maintenance, IEEE Computer Society, 152-161.
[14] Billings, C., Clifton, J., Kolkhorst, B., Lee, E., and Wingert, W. B. (1994). "Journey to a mature software process". IBM Systems Journal, 33, 46-61.
[15] Schneidewind, N. F. (1997). "Reliability modeling for safety critical software". IEEE Transactions on Reliability, 46, 88-98.
[16] Keller, T. (1998). Private communication, IBM.
[17] American Institute of Aeronautics and Astronautics (1993). Recommended Practice for Software Reliability, ANSI/AIAA R-013-1992, Washington, DC.
[18] Keller, T., Schneidewind, N. F. and Thornton, P. A. (1995). "Predictions for increasing confidence in the reliability of the space shuttle flight software". Proceedings of the AIAA Computing in Aerospace 10, San Antonio, Texas, 1-8.
[19] Schneidewind, N. F. and Keller, T. W. (1992). "Application of reliability models to the space shuttle". IEEE Software, 9, 28-33.
[20] Schneidewind, N. F. (1993). "Software reliability model with optimal selection of failure data". IEEE Transactions on Software Engineering, 19, 1095-1104.
[21] Farr, W. H. and Smith, O. D. (1993). "Statistical modeling and estimation of reliability functions for software (SMERFS) users guide". NAVSWC TR-84-373, Revision 3, Naval Surface Weapons Center, Dahlgren, Virginia.
[22] Lockheed-Martin (1998). "Software release schedules", Houston, Texas.
[23] Schneidewind, N. F. (2000). "United States Air Force decision support system (DSS) panel discussion". International Conference on Software Maintenance 2000 and International Symposium on Software Reliability Engineering 2000, Industry Day Proceedings, Reliable Software Technologies, Dulles, Virginia.
[24] OAO Corporation Report (1997). "Identifying and proposing metrics for missile warning space surveillance sensors (MWSSS)", Colorado Springs, Colorado.
Computer Technology Changes and Purchasing Strategies

GERALD V. POST
Eberhardt School of Business
University of the Pacific
3601 Pacific Ave.
Stockton, CA 95211
USA
Abstract

Driven by design and manufacturing improvements as well as competition, computer chips have doubled in capacity about every year and a half for over 35 years. This exponential rate of change requires careful analysis of purchasing strategies. Regardless of "need," some buying strategies are more valuable than others. For example, desktop PC replacement cycles of three years are preferred, but firms choose the level of technology that matches their needs. On the other hand, for laptops, it has been better to choose high-end computers, and then match the replacement cycle to the specific uses. Advances in information technology also affect the use of computers and the structure of the organization--driving firms towards decentralization.
1. Introduction
2. Moore's Law: The Beginning
3. Mainframes to Personal Computers: Price/Performance
   3.1 Grosch's Law: Economies of Scale
   3.2 Declining Prices
4. Personal Computers
   4.1 Buying Strategies
   4.2 Performance
   4.3 Comparing Strategies
   4.4 Results
   4.5 Interpretation
   4.6 Upgrade Strategy
5. Laptops
   5.1 Buying Strategy Results
   5.2 Absolute and Relative Dominance
   5.3 Upgrades
6. Centralization, Decentralization, and TCO
7. Demand
8. The Future
   8.1 Limits to Lithography
   8.2 Alternative Methodologies
9. Conclusions
References

1. Introduction
Anyone who purchases a computer quickly learns two basic rules: (1) the price of that specific computer will drop the next month and (2) a faster, more powerful computer will rapidly become available at the same or lower price. While most people observe this pattern in the purchase of personal computers (PCs), it also exists with larger machines. For example, IBM mainframe prices dropped from $100 000/MIPS in the early 1990s to $2000/MIPS in 2000 [1]. In fact, these price declines exist because of the rapid introduction of new technology at the fundamental chip level. Advances in design and manufacturing techniques produce increasingly powerful chips for processors and random access memory (RAM). These trends have profoundly affected computers and their application in business.

One of the most direct questions involves purchasing strategies. When performance continues to improve, coupled with declining prices, it can be frustrating and difficult to choose a purchasing strategy. Continued hardware improvements ultimately drive new software applications. As technology improves, software vendors add new features that take advantage of this additional capacity. For example, improvements in the user interface such as speech recognition require high-speed processing with substantial memory capacity. Ultimately, the computers need to be replaced, so the question becomes one of the length of the replacement cycle, and the level of computer to buy.

Another important consequence of advancing technology arises because the implementation of that technology is not uniform. For example, economics and competition have generally driven advances of technology in personal computers faster than its application in larger servers. These advances have resulted in PCs today that are faster than supercomputers of a decade ago. The availability of increasingly powerful machines at ever-lower prices has resulted in a substantial decentralization of computer hardware. This decentralization has generated conflicts in information systems management. For example, decentralization appears to increase management costs and reduce control. Some of these concepts became embodied in the term "total cost of ownership" (TCO). Proponents of the
TCO methodology attempted to measure management costs of decentralization, and used these costs to encourage a move to recentralize their operations. Many of these challenging management decisions and purchase issues arise because of the exponential improvements in manufacturing at the chip level. Exponential change is difficult to handle, but when it exists for over 35 years, change itself becomes entrenched. Ultimately, the next question is: how long can technology improve exponentially? Over the next 10 years, this question will become more important and more difficult to answer. Eventually, the industry will reach physical limits in terms of current technologies and manufacturing processes. Moving beyond that point will require new fundamental technologies with new manufacturing techniques.
2. Moore's Law: The Beginning
In 1965, based on only three data points, Gordon Moore [2] predicted that integrated circuit density could double with each device generation (about 18 months). For over 35 years, this prediction has been fulfilled. Driven by competition, economics, and demand for new applications, chip manufacturers have created new technologies to stay on this exponential curve. The computer industry and society have come to rely on this continually increasing performance for DRAM and processors. Figure 1 uses a log scale to show this pattern has fit very closely at the chip level. Intel processor circuitry has lagged DRAM development slightly, but the overall growth of both has been very close to the predicted levels. Achieving this exponential growth carries a cost, in terms of design and manufacturing. With each generation, chip manufacturers are squeezed by these increasing costs. Manufacturers are particularly concerned about the increasing fabrication costs. In fact, while DRAM capacities have been expanding at the rate of about 1.8 times every generation (a year and a half), factory costs have increased by 1.17 times for each generation. These increasing costs of fabrication are known as Moore's second law: fabrication costs increase at a semi-log rate. Essentially, the economics force manufacturers to strive for a doubling of chip capacity with each generation. Smaller increases would not cover the design and capital costs of the new generation [3-6]. Chandra et al. [7] provide several other charts showing improvements in other components, such as disk drive capacities, CD-ROM speeds, and network transmission speed. Many of these other components and services have gained from the same technologies used to improve the chip capacities,
FIG. 1. Moore's law at the chip level. The Moore line represents the pattern if chip capacities had completely doubled every 1.5 years. DRAM capacity increases averaged 1.8 times every year and a half. The corresponding rate for processors was 1.6 times. Source data from Intel.
densities, and performance. Some, like recent disk drives, have improved even more rapidly than the doubling defined by Moore's law [8]. Table I shows how Intel has been able to achieve the greater chip complexity through reduced transistor size and increasing wafer diameter. It also shows that through increased plant automation, Intel has been able to reduce costs by reducing defects. It is tempting to say that as long as the cost increase is at a lower rate than the capacity increase, then DRAM and processors can continue to improve at the same rate. But there is a catch to this statement: each generation requires substantially greater access to financial capital. For instance, by 1999, capital cost of a fabrication plant had exceeded $5 billion [10]. Manufacturers must be large and stable for investors to contribute the money needed to build each new generation. Consequently, the industry has continued to consolidate, and to explore joint manufacturing agreements.
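To illustrate the compounding behind these figures (our arithmetic, using only the per-generation rates quoted above), ten generations at one every year and a half span roughly 15 years:

# 1.8x capacity gain and 1.17x fabrication-cost increase per generation (about 1.5 years).
generations = 10
capacity_factor = 1.8 ** generations    # roughly 357x DRAM capacity growth
fab_cost_factor = 1.17 ** generations   # roughly 4.8x fabrication-cost growth
print(round(capacity_factor), round(fab_cost_factor, 1))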
TABLE I
INTEL'S CHIP-MANUFACTURING PROGRESS. DATA SOURCE [9] (INTEL WEB SITE)

                               1975     1997     2003*
Chip complexity (index)           1       10      100
Feature size (microns)            2     0.25     0.08
Chip size (mm^2)                 30      150      600
Facility automation (%)           5       60       80
Line yield (%)                   40       90       95

* Forecasted values.
3. Mainframes to Personal Computers: Price/Performance
The result of increased chip capacity has been significant price reductions for DRAM and processors, as well as increased performance. Since DRAM and processors are used in many devices, including components like disk drives, virtually all aspects of computers have improved substantially in terms of price and performance over the past 35 years. The increased performance arises from the greater capacity of the chips. Capacity is improved through greater density, which also increases performance (and heat). Low prices derive partly from the improved capacity. However, increased demand and economies of scale through standardization have been important factors as well. Measuring performance is a long-standing challenge in any analysis of computer prices. Particularly from the 1950s through the 1980s, with disparate computer processors, diverse peripherals, and multiple configurations, it was difficult to compare the performance of machines [11-15]. With the standardization of microprocessors in the 1990s, the comparison issues were reduced, but never truly erased. Within a manufacturer's line, the differences are controllable. Across manufacturers, people have tried to compare MIPS (millions of instructions per second) or MFLOPS (millions of floating point operations per second), and occasionally benchmark times on standardized software tests [12, 14, 16-18]. While engineers could argue about these comparisons for years, they have all led to the same basic results: dramatic improvements in performance over time in all computers. The improved performance has also led to ever-smaller computers. As chips gain capacity, more functions are embedded, reducing the ultimate size of the computer. The pattern is clear from room-size mainframes to minicomputers to personal computers, laptops, and personal digital assistants. The unrelenting pressure of competition in the PC market has created problems for several manufacturers. Briody [19] reports that the average
corporate PC purchase price dropped from $2453 in 1996 to $1321 in 1999. He also reported that IBM lost $1 billion in their PC division in 1998, which undoubtedly explains why IBM announced in 1999 that they would drop traditional retail sales of desktop PCs.
3.1 Grosch's Law: Economies of Scale
In the 1960s and 1970s, as detailed by Ein-Dor [20], several researchers observed a basic relationship in computer prices at a given point in time. Within a specific manufacturer's line (e.g., IBM computers), the price/performance ratio tended to decline while progressing from smaller to larger computers. Researchers concluded these differences represented economies of scale, and often suggested that companies would be better off buying larger computers (instead of several smaller ones) because of this pattern. Herbert Grosch also stated that the economy of computing increases with the square root of speed. That is, to do a calculation 10 times as cheaply, it must be performed 100 times faster. Grosch explained the differential by observing that increasing processor performance was absorbed by increased software overhead. Several researchers examined data from the 1950s to the 1970s to prove Grosch's law [21-25]. Ein-Dor and Kang [26] eventually showed that this pattern did not hold when minicomputers and microcomputers were introduced. The new, small machines had considerably lower price/performance ratios than the larger mainframes. On the other hand, true economies of scale have evolved in the small-computer market: as demand and sales increase, per-unit costs decline dramatically. Today, price/performance ratios, along with absolute cost, are substantially lower for mass-produced personal computers than for other computers. Recent data provide the opportunity to once again test Grosch's law across a more homogeneous set of personal computers. Table II shows prices and performance for three systems introduced over four years. The performance change column is the ratio of the performance measure (P_t/P_{t-1}). The economy of computing is measured as the price/performance ratio. The change column is similarly defined as the ratio of the economy from one system to the next. Grosch's law would imply that if performance improves by a factor of x (P_t = x P_{t-1}), then the economy ratio (cost/performance) should change by 1/√x, that is, E_t = E_{t-1}/√x. The next-to-last column holds the ratio E_t/E_{t-1}. The last column displays the computation of 1/√(P_t/P_{t-1}). Notice the substantial improvement in the economy ratio in the next-to-last column. Price/performance has improved considerably beyond the level postulated by Grosch's law.
TABLE II
SAMPLE TEST OF GROSCH'S LAW FOR PERSONAL COMPUTERS. COMPARE THE LAST TWO COLUMNS. IN TERMS OF HARDWARE ECONOMIES, PRICE/PERFORMANCE RATIOS HAVE DROPPED CONSIDERABLY FASTER THAN POSTULATED BY GROSCH'S LAW

System     Date     Intro price   CPUMark 99   Price/performance   Performance change   Price/performance change   Grosch square root
P/200      Jan 96   5664          11.6         488.28              --                   --                         --
PII/400    Jan 98   3800          31.6         120.25              2.72                 0.2463                     0.6059
PIII/800   Jan 00   2900          72.1         40.22               6.22                 0.0824                     0.4011
"economy" due to software [27], claiming that software changes will eat up the price/performance improvements. However, this claim is dubious in the current PC environment. For a given set of tasks (e.g., a standard benchmark), the performance levels show the improvements listed in the next-to-last column. On the other hand, it is true that software vendors continually add new features, and many of these new features require additional performance. For example, speech recognition demands considerably more advanced computers than basic word processing software. This observation holds the answer to the question: why not just buy one computer and use it forever? The basic answer is that computers have not yet reached the stage where they meet the demands being placed on them. Performance improvements open up new applications and improvements in the user interface.
3.2 Declining Prices
Figure 2 shows this effect of improved performance and declining price. Price/performance ratios of all computers declined substantially from the mid-1980s through the mid-1990s. While the MIPS values may not be strictly comparable across computers, the pattern is so strong that minor discrepancies from measurement error do not influence the result. Not only has relative price/performance improved, but absolute price and performance changes have been dramatic as well. For example, it is now possible to buy personal computers for a few thousand dollars with performance equal to or greater than the fastest supercomputers of 15 years ago--which cost several million dollars.

FIG. 2. Declining prices. Lower chip prices and competition from personal computers drove down prices of all computers. Source data from manufacturers and industry publications. Reprinted with permission of McGraw-Hill from Management Information Systems (2000) by Post and Anderson.

The consequences of this change are shown in Fig. 3, which hints at the effect on both the computer market and on society. In the 1960s and 1970s, mainframe computers dominated the market and the industry. Improving technology and declining costs fostered the growth of minicomputers (particularly Digital Equipment Corporation) in the 1970s and early 1980s. The even more dramatic changes in the 1990s spread the use of personal computers throughout the world.

FIG. 3. The changing use of computers. Although sales of all computers increased through 1995, the total shipment value of mainframes declined from 1995 to 1999. Reprinted with permission of McGraw-Hill from Management Information Systems (2000) by Post and Anderson.

The declining price of individual machines has contributed to the reduction in total revenue from these higher-end machines. Until the mid-1990s, sales revenue was generally increasing in all three categories of computers. So, although PC revenues were gaining a larger share, the entire industry was expanding. Since 1990, even that aspect changed--largely because manufacturers (notably IBM) were forced to reduce per-unit prices. While unit sales of large mainframes were increasing, the total revenue declined [28]. Yet, the worldwide PC market was not even close to saturation, so PC sales units and revenue continued to increase. Actually, the line between the machine classifications is increasingly blurred. Because of the increased capacity and performance of the chips, manufacturers altered the way that most computers are built. While their architecture differs from personal computers, many of today's servers and "mainframes" utilize relatively standard CMOS technology. From an economic, or purchasing, perspective, computers have always been an interesting product [13]. They are relatively expensive, and as soon as you buy one, a newer model comes out that is more powerful and cheaper than the one you just bought. Over time, any computer you have must be replaced, partly from pressure by the manufacturer (in terms of maintenance contract costs) and partly because of technological obsolescence. In the 1970s, it was generally recognized that IBM strove for five-year product cycles, encouraging leading users to replace main computers every five years. However, the 1970s and 1980s were essentially on the relatively low-growth linear portion of the exponential performance curve. In contrast, the 1990s have exhibited more radical improvements. For example, doubling from 64 to 128 is substantially more noticeable than doubling from 2 to 4. Consequently, the impact on buying strategies has been more pronounced in the last decade. Today, the issue of economies of scale (particularly Grosch's law) is more complex. From a hardware perspective, standardization and competition continue to drive personal computer price/performance trends--providing strong incentives for decentralization.
4. Personal Computers
This section is an updated analysis based on a version by the author, printed and copyright 1999 by ACM: Post, G. V. (1999). "How Often Should a Firm Buy New PCs?" Communications of the ACM, 42(5), 17-21.
Several writers (e.g., Schaller [29], Mendelson [30], and Kang [16]) have examined and attempted to measure computer price trends. Today, the issue of server purchases and prices has changed only slightly. With only a limited number of manufacturers, and a long investment in software and development, companies generally have few choices in terms of switching, upgrading, or pricing of servers. The rapid rise of freeware-based operating systems (initiated by Linux) offers some future price competition, but those effects cannot yet be evaluated. Hence, several of the more interesting price questions arise in the personal computer arena. Faced with a vast array of choices, managers must analyze difficult decisions regarding the purchase of personal computers. As managers recognize (e.g., [31, 32]), we all face the consequence of these trends when we purchase a computer. Should we wait? Should we buy a faster computer? Should we buy the cheapest computer available? The problem with answering these questions is that it is exceedingly difficult to estimate the "need" or demand for computers. If you ask a typical cost-cutting manager for a new computer, the first question you hear is: why do you need a new computer? The level of "need" is often a contentious issue within organizations when deciding to purchase new computers. Even on a broader scale, researchers have found it challenging to identify the business impact of computers and IT spending. The productivity debate (e.g., [33-35]) illustrates these problems. A fundamental result of the performance and price changes is that organizations have to buy new computers every few years. A consequence of these trends is that organizations must adopt a purchasing strategy that specifies the level of machine to buy and the length of time it will be held. Hence, the primary question is to identify which strategies are superior to others. It is important to reduce the role of "need" in purchasing personal computers. Of course, it is not possible to eliminate it completely. But it is possible to defer the issue and focus on narrowing the choices--which makes the decision process much simpler.
4.1 Buying Strategies
Even a casual knowledge of the computer industry reveals the basic patterns shown in Fig. 4. The data represent prices from one company (Gateway 2000) from December 1990 through October 1998. Notice that at any point in time you have a choice of levels of machines. At the low end of the price scale, you can find a base machine with an older processor and base amounts of RAM and disk capacity. High-end machines generally boast the latest processors, substantial memory, and larger monitors. Most vendors also offer an intermediate-level machine historically priced around
$2500. By tracing the life-cycle of a particular computer, Fig. 4 shows how a computer begins life at the high end and moves down the ladder as new technologies are introduced.

FIG. 4. Note the stability of the pricing on the high-end and base systems. Over time, the configuration of the machines changes. For example, examine the falling prices portrayed by the three specific types of computers. Source data from manufacturer ads.

This pattern of changing configurations and declining prices leads to an important characteristic of the personal computer market: the computers must be replaced at regular intervals, which leads to two ultimate questions: (1) how often should computers be replaced, and (2) what level of computers should be purchased? The answers to these questions create a purchasing strategy. The key is to evaluate the cost of each strategy and the average performance of the machine. For example, a strategy of buying high-end machines every year will be expensive. However, a strategy of buying low-end machines every four years will result in workers having machines with substantially lower average performance. Twelve basic strategies are created from hypothetical purchases of high, intermediate, and base machines held for 12, 24, 36, or 48 months. When the ownership period is complete, a new machine is purchased at the same level (but with a different configuration, as technology changes). The objective is to evaluate these twelve strategies to see if the options can be reduced. The technique is to simulate purchases under the twelve strategies using the historical data. Comparing strategies over time requires that costs be discounted to a common point in time. Hence, the purchase prices were expressed as
amortized monthly costs at a 5% discount rate. For each strategy, a resale value was estimated based on the life of the processor and advertised prices from computer brokerage sites. In general, the resale values were unimportant--particularly when expressed at an amortized monthly rate. They make the high-end purchases slightly more attractive but do not really alter the decision points. To compensate for possible seasonal changes and random fluctuations, the strategies were started at each of the 12 months for the first year of data. The results were then averaged using the median. The average approach also mimics firms that purchase several machines over the course of a year.
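The chapter does not spell out its exact discounting formula, so the following sketch assumes a standard annuity at a 5% annual rate compounded monthly, with the discounted resale value netted out of the purchase price; the prices in the usage line are hypothetical.

```python
# A minimal sketch of the amortization step described above. The exact discounting
# convention used in the study is not specified; this version assumes a standard
# annuity at a 5% annual (monthly compounded) rate with resale value netted out.

def amortized_monthly_cost(price: float, months: int,
                           resale: float = 0.0, annual_rate: float = 0.05) -> float:
    """Equivalent monthly cost of buying at `price`, holding for `months`,
    and reselling for `resale` at the end of the holding period."""
    r = annual_rate / 12.0
    net_present_cost = price - resale / (1 + r) ** months
    return net_present_cost * r / (1 - (1 + r) ** -months)

# Illustrative only: a $2500 intermediate machine held 36 months with $200 resale.
print(round(amortized_monthly_cost(2500, 36, resale=200), 2))
```

Expressing each of the twelve strategies this way puts a 12-month high-end cycle and a 48-month low-end cycle on the same monthly-cost footing, which is what allows them to be compared directly later in the analysis.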
4.2 Performance
It is well known in computer science that it is difficult to compare performance characteristics of various computers. It is slightly easier in this study because all of the systems use Intel processors. Consequently, the Intel iComp-2 rating is used to measure the performance of the processors. It also turns out that the iComp-2 rating is a good proxy measure for the other attributes (RAM, drive capacity, video system, and so on). A simple regression test shows that the iComp-2 rating over this data has a high correlation (R² of 0.94) with these other features:

iComp = 11.54 + 0.48 RAM + 17.25 Drive + 4.02 CD + 12.52 VideoRAM

RAM is measured in megabytes, drive capacity in gigabytes, CD speed in the traditional multiplier factor, and video RAM in megabytes. All of the coefficients are significant at greater than a 99% level. The equation is highly accurate, so iComp is a good measure of performance for desktop computers. The equation also shows the relative importance of each component in determining the overall performance. Consider the contribution of each component evaluated at the average, as shown in Table III. Interestingly, the capacity of the disk drive has the strongest correlation with the overall performance measure. This result means that the capacity of disk drives increased at about the same rate as the overall system performance. It does not mean that drive capacity has the strongest impact on overall performance; only that it had the same growth rate over time.
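The original configuration data are not reproduced in the chapter, so the sketch below shows only the form of the fit on a handful of made-up rows; the numbers, and the use of NumPy's least-squares routine, are illustrative assumptions rather than the study's method.

```python
# A minimal sketch of the kind of fit reported above: ordinary least squares of the
# iComp-2 rating on RAM (MB), drive (GB), CD speed, and video RAM (MB).
# The rows below are hypothetical placeholders, not the study's data.
import numpy as np

# Columns: RAM, Drive, CD, VideoRAM, iComp-2 (all rows are made-up examples)
data = np.array([
    [16,   1.2,  4, 1, 120],
    [32,   2.5,  8, 2, 190],
    [48,   3.2,  8, 2, 240],
    [64,   4.3, 12, 4, 290],
    [96,   6.8, 24, 8, 410],
    [128, 10.0, 32, 8, 520],
])
X = np.column_stack([np.ones(len(data)), data[:, :4]])  # add an intercept column
y = data[:, 4]

coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
r_squared = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

print("intercept, RAM, Drive, CD, VideoRAM coefficients:", np.round(coef, 2))
print("R^2:", round(r_squared, 3))
```

With the real advertisement data, this kind of fit is what produces the coefficients quoted in the equation above and the contributions-at-the-average reported in Table III.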
TABLE III
PC PERFORMANCE BREAKDOWN. CONTRIBUTION TO OVERALL PERFORMANCE FROM EACH COMPONENT AT THE AVERAGE. THESE VALUES ARE REGRESSION COEFFICIENTS THAT DESCRIBE THE CONTRIBUTIONS OF EACH COMPONENT TO THE OVERALL PERFORMANCE MEASURE. HIGHER VALUES SIGNIFY A HIGHER WEIGHT FOR A COMPONENT

Intercept    RAM      Drive    CD       Video RAM
11.54        10.99    32.34    23.97    23.45

4.3 Comparing Strategies

Consider strategy A with a cost of $100 and a performance level of 50, versus strategy B that costs $120 and has a performance level of 40. No one would choose strategy B, since it costs more and offers less average performance. Strategy A dominates strategy B. Now consider strategy C that costs $101 and has a performance level of 75. Compared to strategy A, C is substantially faster but costs more money. Would people choose strategy C over A? This answer is somewhat subjective since it depends on comparing performance to price. However, it is possible to make some general comparisons along these lines by examining the cost/performance ratio. Ultimately, when comparing the cost/performance ratio, some subjective decisions must be reached in terms of the tradeoff. While it is true that the decision is subjective, it is likely that there are some levels at which almost everyone would agree that one strategy is preferable to another. In the A-C example, strategy C results in a machine that is 50% faster than A yet costs only 1% ($1 per month) more. Most people would prefer strategy C. This relationship could be called relative dominance to indicate that there is some subjectivity involved in the decision. The question that remains is what cost/performance tradeoff levels would be acceptable to most people? One way to approach this issue is to identify the ratio involved with purchasing any type of system. Three levels of systems are analyzed: high-end computers, mid-range, and low-price; for each of four time periods: 12, 24, 36, and 48 months. The 12 combinations are labeled with a letter indicating the level and the number of months the machine is held before being replaced. For example, H12 is a high-end computer replaced every 12 months, while L48 is a low-price computer replaced every 48 months. Each strategy is compared to the choice of no computer (P_B = C_B = 0) and the resulting cost/performance ratio is computed. The results range from 1.15 (strategy I36) to 2.53 (strategy H12). It is also instructive to note that the least-cost approach--buying a low-end computer every 48 months--yields a ratio of 2.20.
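The two comparisons in this example can be written down directly; the helper functions below are a minimal sketch (not the chapter's own code) using the strategy A, B, and C figures from the text.

```python
# A minimal sketch of the two comparisons described above: absolute dominance,
# and the incremental cost per unit of performance gained.

def dominates(cost_a: float, perf_a: float, cost_b: float, perf_b: float) -> bool:
    """True if strategy A absolutely dominates B: no more expensive, no slower."""
    return cost_a <= cost_b and perf_a >= perf_b and (cost_a, perf_a) != (cost_b, perf_b)

def cost_per_unit_gain(cost_a: float, perf_a: float, cost_b: float, perf_b: float) -> float:
    """Extra dollars per month paid for each additional unit of performance."""
    return (cost_b - cost_a) / (perf_b - perf_a)

A = (100, 50)   # (monthly cost, performance)
B = (120, 40)
C = (101, 75)

print(dominates(*A, *B))          # True: A is both cheaper and faster than B
print(cost_per_unit_gain(*A, *C)) # 0.04: C costs only $0.04 per extra unit over A
```

The ratio in the second comparison is the quantity whose acceptable limits (roughly $1 to $3 per unit of performance) are derived in the next paragraph.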
By using data expressed in original (not percentage) terms, these ratios have a direct interpretation: to obtain a one-unit level of performance, buyers had to spend from $1.15 to $2.53 per month. And the purchase of the barest minimum computer required an average of $2.20 per month for each unit of processing power. This range of values identifies limits for the cost/performance tradeoff. Specifically, if strategy A results in a decrease in performance (P_A < P_B), then most people should agree that A is preferred to B if the ratio is greater than 3 (rounding up the 2.53). That is, many buyers would accept a decline in performance if it saved them at least $3 for each one-unit drop. Conversely, if performance and price are higher, the more expensive strategy should be acceptable to most people if the cost ratio is less than 1 (rounding down 1.15). Since any computer costs at least $1 per month for each unit of performance, and since people choose to buy computers, they should be willing to accept an increased cost as long as the performance gain is higher. There are limits to extending these statements, but they will hold for comparisons of relatively similar choices (e.g., small changes). One important caveat exists: the relative comparisons do not directly include a budget constraint or a performance constraint. Some organizations may only be able to afford the absolute lowest price machine--regardless of the performance. Similarly, some organizations could make the argument that they need the absolute highest performance regardless of price. However, these two extremes are probably rare in practice. If an organization has trouble paying $30-$50 per month for a low-end computer, it has more serious issues to resolve. On the other end, if a company needs massive processing power, users might be better off with workstations or parallel processing machines.

4.4 Results
The basic results of the 12 strategies are shown in Fig. 5, which compares the cost and performance results. Note that the median performance for High-24 is relatively high (equal to that for High-12). The mean values (not shown) are slightly lower. In general, the High-24 values should be lower than those of the High-12 strategy. The current results arise because of slightly different timing in the starting purchase for the two strategies.
4.4.1 Absolute Dominance

Both the chart and the numeric results indicate that three strategies are dominated by other options. For example, the L12 option (buy base machines every 12 months) would clearly never be chosen because it is
dominated by the I36 strategy. Comparing the L12 to the I36 option generates a cost/performance ratio of -3.30. That is, a $36 decrease in costs for I36 provides an increase in performance of 11 points. In percentages, strategy I36 represents a 6% drop in costs and a 75% increase in performance. Similarly, the I12 option is dominated by the H24 strategy, with a ratio of -3.14. The essence of dominance is important, because it means that regardless of the user "demands," a rational person would never choose the dominated strategy. There is a better strategy that provides higher average performance at a lower cost.

FIG. 5. Each of the values (cost, RAM, drive, processor) is expressed as a percentage of the corresponding value in the most expensive strategy (High-12). Be cautious of adopting strategies like I12 that have relatively high costs, but tend to have lower average performance (not shown).
4.4.2 Relative Dominance
Absolute dominance is relatively clear, but it only rules out three options. Can we reduce the list further? The answer lies in looking at relative dominance. For example, most people would choose a strategy that provides a machine four times faster for a one percent increase in price. These analyses are presented in Fig. 6. The horizontal axis shows the strategy being examined. Strategies that dominate by providing substantially faster machines at slightly higher costs are shown in the top of the chart. For ease of comparison, these ratio values have been inverted so that a value greater than 1.0 signifies relative dominance. Strategies that are dominated absolutely are indicated with the letter "D", indicating that the strategy on the x-axis would not be chosen because another strategy offers faster performance at a lower price.
FIG. 6. An x-axis strategy is dominated (less desirable) if (1) there is a D in the column (absolute dominance), (2) the bar in the top of the chart exceeds 1 (faster machine at slightly higher price), or (3) the bar in the bottom of the chart exceeds 3 (substantially cheaper option with slight decline in performance). Options above 1 are preferred because they provide substantially higher performance at a minor increase in price. Options below 3 are preferred because they are significantly lower in price with only a slight drop in performance. The four bars exceeding the bounds are truncated and have substantially larger values than indicated by the graph.
Strategies that dominate by being substantially cheaper with a small decline in performance are shown in the bottom of the chart. These values are positive, and magnitudes greater than 3.0 indicate that the x-axis strategy is less preferred. In all cases of dominance, the dominating strategy is listed at the top or bottom of the chart. The numerical results are also summarized in Table IV. Note that some of the bars were truncated in Fig. 6 to highlight the important details around the critical values. The actual values are listed in Table IV. Table IV shows the strategies that would typically be preferred to the base option. For example, an organization thinking about buying high-end machines every 48 months would be better off buying intermediate-level machines every 36 months, which is substantially cheaper (22%) with only a slight decline in performance (1%). Another choice is to purchase high-end machines every 36 months, which is slightly more expensive (13%) but results in a substantial increase in performance (41%).
TABLE IV. Numerical summary of the absolute and relative dominance comparisons among the 12 desktop purchasing strategies (values discussed in the text).
Notice that all of the 12-month options are dominated by better strategies--in general they are absolutely dominated. The low end also presents an interesting situation. Clearly, there are no cheaper solutions. However, a relatively small increase in costs results in substantial performance gains. For example, it would not make sense to pursue the Low-48 strategy because a 12% ($4) increase in monthly costs provides an 84% increase in median performance. The next step up is not as dramatic, but it still provides substantial gains. In general, the I36 option is better than any of the low-end machine strategies. For example, although it costs twice as much as the Low-48 strategy ($67 versus $34 per month), it provides almost four times the level of performance. The I36 strategy is a substantially better alternative--unless the organization faces severe budget constraints.
4.5 Interpretation
In total, two strategies tend to dominate all the others: High-36 and Intermediate-36. Additionally, the Low-36 option is viable for organizations with extreme budget pressures, but it is not generally recommended. Likewise, the High-12 option may be necessary for organizations requiring the absolute highest performance levels, but it would normally be better to accept a slight performance drop in exchange for substantial cost savings and replace the high-end machines every 24-36 months. The most important practical conclusion is that if companies are going to buy personal computers on a regular basis, then two strategies dominate the rest: buy high-end machines every 36 months, or buy intermediate-level computers every 36 months. For organizations with severe budget constraints, buying low-end computers every 36 months is also a viable alternative. For individuals who require extreme high-end machines, purchasing the high-end machines more frequently can increase performance; however, the costs are substantially higher. Note that the practical decision holds almost independently of the "needs" of the organization. That is, the dominance holds because of pricing trends, and because the two alternatives represent considerably better price/performance relationships. The issue of demand does play a final role in the decision--determining which of the two strategies should be adopted. Only the needs of the organization and individual can make the final determination. The high-end (H36) option costs 45% more ($97 versus $67 per month) and yields a 42% improvement in performance and feature attributes compared to the intermediate-level (I36) strategy.
4.6 Upgrade Strategy
Another possible option is to pursue a strategy with a lower-level machine, and then upgrade the machine later. This strategy can only succeed if (1) component prices drop fast enough and (2) the underlying architecture does not change very rapidly. Figure 4 indicates that condition (1) does not generally hold. Examine the price lines for the three specific processors. The decline in price within each machine is due to declining prices of components. Notice that the price decline is approximately linear (not exponential). In other words, you could save some money by delaying the purchase of some components, but you lose the use of the more advanced components over that time period. Additionally, the components are generally replacements (e.g., a new drive or new processor), which means you end up paying twice: (1) for the original component and (2) for the replacement component. Since the rate of decline in price is linear, it is difficult to gain much with this strategy. For example, consider an upgrade strategy of buying low-end machines every three years. From Fig. 4, the average relative performance is 0.30 with a 0.17 average relative cost. Assume that halfway through the three years you decide to upgrade. Using an average high-end performance rating of 0.88, the combined average performance rating over both halves would be 0.59. The actual cost is difficult to estimate, but consider the alternative. The three-year intermediate strategy yields a slightly higher average performance (0.62) with a relative cost of 0.28. The upgrade strategy would be beneficial only if the cost of the upgrades were less than 64% of the low-end machine cost (about $1000 in this scenario). The upgrade would consist of a high-end processor, an additional disk drive, and at least double the amount of RAM. Can these items be purchased for less than 64% of the cost? In 2000, the computations are close to equal, since a high-end processor alone would cost around $700-$800, and RAM and a drive are not likely to be less than $200. Finally, changes in architecture often prevent complete upgrades. For example, the PC bus has changed several times, both in structure and in speed--which limits upgrading to newer components. Similarly, changes in video and disk interface subsystems result in more costly replacements. Even more difficult form-factor changes can completely prevent upgrades. The past 10 years have demonstrated that within three to four years of initial release, it becomes infeasible to upgrade a computer. Choosing a strategy of purchasing a low-end machine and trying to upgrade it means initially buying a machine that is already two years out from its release date, so you would have one or perhaps two years in which to make upgrades. Hence, the upgrade strategy can generally be approximated within the strategies already defined--e.g., choosing the intermediate strategy.
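The breakeven arithmetic above can be checked in a few lines; the sketch below uses the relative performance and cost figures quoted in the text, and the tie to the "about $1000" dollar figure is left aside because it depends on purchase prices not reproduced here.

```python
# A minimal sketch of the upgrade breakeven arithmetic, using the relative figures
# quoted in the text (performance and cost as fractions of the High-12 strategy).

low_perf, low_cost = 0.30, 0.17        # low-end machine held three years
high_perf = 0.88                       # high-end performance after a mid-life upgrade
inter_perf, inter_cost = 0.62, 0.28    # intermediate machine held three years

# Average performance if the low-end machine is upgraded halfway through.
upgrade_avg_perf = (low_perf + high_perf) / 2
print(upgrade_avg_perf)  # 0.59, slightly below the intermediate strategy's 0.62

# Budget left for the upgrade before the path costs more than simply buying
# intermediate machines, expressed as a fraction of the low-end machine cost.
breakeven = (inter_cost - low_cost) / low_cost
print(round(breakeven, 2))  # about 0.65, i.e., roughly the 64% quoted in the text
```

The sketch makes the conclusion visible: unless the upgrade bundle costs well under two-thirds of the original low-end machine, the intermediate strategy wins outright.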
At best, the upgrade strategy would enable you to spread the costs over time. The effective gain in computer technology is minimal, compared to just buying a more powerful machine and keeping it longer. With perfect foresight, there have been some time periods in which it might have been beneficial to purchase a lower-end machine, and then upgrade a few components when prices dropped radically. Then again, with perfect foresight, you could have accomplished the same goal by scheduling new computer purchases just after major price drops.
5. Laptops
Laptop purchase strategies can be analyzed in approximately the same manner as the desktop machines. However, there is a significant difference in measuring performance. Specifically, manufacturers tended to reduce prices on the lower-end machines by reducing the RAM and drive capacities. Figure 7 highlights the difference between the relative processor and RAM/drive capacities. Harris et al. [36] examined component pricing in an earlier study. Mathematically, the difference can be seen by regressing the performance measure against the other component measures. Unlike the desktop situation, the laptop regression yields an R² of only 0.80. Although this value is relatively high, it is not as good as the 0.94 value in the desktop
case. Even more importantly, relying solely on the processor to measure laptop performance distorts the value of the low-end machines. Instead, an average of the relative performance of the processor, RAM, and disk drive was used. Figure 7 shows how this value is a more realistic measure of overall performance than just the processor rating.

FIG. 7. Laptop strategy results. Because of the wide disparity between processor performance and the memory and drive values, the average of these three values is a more representative measure of system performance.

In examining Fig. 7, remember that the values are expressed as percentages of the most expensive option (H12: buy high-end laptops every 12 months). These values cannot be compared directly to the desktop values in Fig. 5, because the base is different. Also, note that a slightly different timeframe was used to evaluate the laptops because the technology does not extend as far back as the desktops. Laptop data run from January 1993 through December 1999, but the three levels (high, medium, and low) do not appear until January 1995.
5.1 Buying Strategy Results
As in the desktop situation, we need to examine the various strategies to see if some might be preferred to others. In particular, we want to rule out inferior strategies. Recall that a strategy is dominated if another option provides a better average computer at a lower price. Even a casual glance at Fig. 7 shows that some strategies are not very useful because they are expensive, particularly buying intermediate or low-level laptops every year.
5.2 Absolute and Relative Dominance
A strategy absolutely dominates another option when it offers greater performance at a lower price. In other words, no one would choose to pursue the dominated option because a clearly better choice exists. Absolute dominance can rule out some of the strategies. But other strategies may also be unappealing. Relative dominance covers two cases: (1) the ability to gain a substantial increase in performance for a slight increase in cost, and (2) the ability to save a substantial amount of money for only a minor performance penalty. Figure 8 presents the laptop results for both the absolute and relative dominance cases. The main eye-catching result is the large number of strategies (five) that are ruled out through absolute dominance. One additional strategy (I36) is clearly dominated by relative performance gains. The results for the laptop are fairly strong: almost all of the intermediate and low options are weak. A slight case could be made for buying intermediate or low-level laptops every four years. But in most situations the relative dominance indicates that you would be better off buying a high-end laptop and keeping it for four years. Since the dominance is not strong, there is still room to argue that a firm may prefer to save money with one of these
204
GERALD V. POST
FIG. 8. Note the many options that are ruled out through absolute dominance. The strategy listed on the x-axis is the one being tested. If a second strategy dominates it, the preferred strategy is listed above (or below) the position. For example, H48 dominates the I36 strategy by providing substantially better performance at a slight increase in price.
lower choices and simply tolerate the lower performance. The results show that most people would be better off buying a higher-end laptop and use their personal situation to determine how often it should be replaced.
5.3 Upgrades

Just looking at the results in Fig. 8, it is tempting to argue that the three long-term (four-year) strategies could benefit from upgrades. Since laptops carry a relatively high initial cost, and since each of the four-year replacement strategies is relatively close in price/performance, it should be possible to gain by upgrading the laptop after two years. While this upgrade option might work, complicating factors make it somewhat difficult and expensive. Specifically, laptops are much harder to upgrade than desktops. The proprietary nature of the cases and motherboards makes it harder to find upgrade parts. The same proprietary aspect also tends to increase prices of upgrade components because of reduced competition. For example, RAM formats tend to change between manufacturers and even across a specific manufacturer's product lines. The strategy of buying a low-end laptop and expecting to upgrade it two years later is a particularly risky proposition. The low-end laptop is generally already a year or two out from its introduction. The laptop and its
components will not likely remain on the market two years later when you wish to upgrade. The high and intermediate options offer better potential to pursue this strategy.
6. Centralization, Decentralization, and TCO
One of the important impacts of declining hardware prices has been the ongoing debate over centralization versus decentralization of computing resources, e.g., Refs [37, 38]. Initially, all computing resources (hardware, software, data, and personnel) were centralized because of the environmental needs and the enormous costs. With declining prices and increased performance of smaller machines, hardware moved out to individual departments and users within organizations. As the hardware moved outwards, the pressure increased to decentralize the software, data, and occasionally the personnel. While decentralization has its advantages, some researchers and computer companies (notably Sun Microsystems) have been encouraging managers to focus on the management costs of decentralized computers, and the benefits of centralized controls. Emigh [39] summarizes these issues. King [40] observes that pressures for centralization and decentralization continually evolve over time because many of the underlying issues (e.g., political power) are irresolvable. One of the rallying cries was developed in 1996 and pushed by Forrester Research and the Gartner Group, two large IT consulting firms. The concept is known as total cost of ownership or TCO [41]. TCO was designed to measure management costs of decentralized computers. As summarized in Table V, in addition to purchase price, the studies typically include costs of installing software, management, maintenance, and training. The "results" vary widely, from $5000 to $7000 to $12 000 per computer [42]. The ultimate question the analysis poses is: do you need a fully configured PC for every task, or can workers perform the same job with a simple terminal or Web browser connected to a central server? Not surprisingly, most of the TCO proponents decide that centralized solutions could generate a lower TCO than decentralized systems. While the PCs generally have a lower capital cost, standardization of software, data, and management practices are hypothesized to produce lower management costs using centralized machines [43]. There is some debate over the validity of these figures (e.g., by Forrester Research) [44]. For example, all of the TCO calculations assume complete reliability of central servers and networks. But the ultimate question comes down to management issues of productivity: are users more productive
under a highly centralized, standardized system, or with individually tailored solutions [45, 46]?

TABLE V
COMMON MANAGEMENT COSTS UTILIZED IN TCO CALCULATIONS

Common elements of TCO
  Requirements analysis and order determination
  Hardware costs
  Software license costs
  Configuration and installation cost
  Repair and help desk support
  Move/add/change
  Refresh/replace
  Cost of user time

The second important aspect of the TCO issue is that the benefits of recentralization (reduced management costs) arise from tighter management controls, increased standardization, and less end-user flexibility. Consequently, firms interested in these "benefits" need to consider two important questions. First, could the same benefits be achieved with the lower-cost PC hardware using tighter management practices? Second, will the restrictions and control affect workers' productivity? In response to the first question, PC manufacturers, led by Intel and Microsoft, have been developing tools to reduce TCO by improving centralized management features. In particular, newer PCs can be monitored and configured remotely, even to the point of loading operating systems and software across a network (Windows 2000). The second (managerial) question is considerably more difficult to answer, and the answer will depend heavily on the specific work environment. From a hardware perspective, the costs of the processor, RAM, and drive storage are virtually zero. The management challenge is to create applications that can use this power to reduce the management costs, while still supporting user control over applications to improve worker productivity.
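As a rough illustration of how the Table V categories combine into a per-seat figure, the sketch below sums a set of purely hypothetical annual costs; none of the dollar amounts come from the studies cited above.

```python
# A minimal sketch of a per-seat TCO tally over the Table V categories.
# Every dollar amount below is a hypothetical placeholder; published estimates
# for a fully configured PC range from roughly $5000 to $12,000 per year.

annual_costs_per_seat = {
    "requirements analysis and ordering": 150,
    "hardware (amortized)": 900,
    "software licenses": 400,
    "configuration and installation": 250,
    "repair and help desk support": 1200,
    "move/add/change": 300,
    "refresh/replace": 500,
    "user time (downtime, self-support)": 1800,
}

total = sum(annual_costs_per_seat.values())
print(f"Illustrative TCO: ${total:,} per seat per year")
```

Whatever the exact figures, the structure makes the managerial point clear: the purchase price is only one of several categories, and the labor-related items usually dominate the total.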
7. Demand
Most of the comments and analysis to this point have focused on the supply side of the computer industry. The reasons are that (1) the production aspects are unique, and (2) the demand side is subjective and difficult to evaluate. Even in a specific situation--say one user--it is hard
to determine the true "needs" for computer processing capabilities (e.g., [47, 48]). Everyone knows arguments that have arisen because some person or group felt they needed faster or newer computers, while an administrator with a close eye on costs disagreed. Because of these difficulties, this analysis has turned the question around to identify which purchasing strategies might be ruled out--thus simplifying the demand discussion by reducing the number of options. While this approach is effective, it does not eliminate the role of demand. For example, some administrators are going to argue that they do not need to replace computers at all. Since their staff simply use them for word processing, a base machine is fine; and as long as it works, it never needs to be replaced. Note that the dominance results for desktops accept a version of this conclusion as a possibility. While a three-year cycle would produce substantially higher performance at slightly higher costs, a four-year cycle of low-end machines provides an extremely low-cost solution, which could be needed in some situations. On the other hand, it also provides the absolute lowest levels of performance. Often, the people who prefer the low-level option state that they only use the machines for word processing. Do people need increasingly powerful personal computers? The answer can be found in the answers to a slightly different question: what tools will users demand in the next few years? Even traditional tools like word processors continually gain new features that require more disk capacity, more RAM, and faster processors. Do users "need" these additional features? Do these features improve productivity? Maybe or maybe not; it is difficult to measure the impact. However, in terms of the near future, to many users, one of the most important answers to these questions is speech recognition software. Speech recognition software is one of the most performance-demanding applications run on PCs. Each new version demands increasingly powerful processors, more RAM, and faster drives. Yet, each new software version improves in performance and capabilities. Also, unlike some software, speech recognition is a tool that can significantly benefit computer novices. Will the demand for speech recognition software (and other high-end software applications) drive firms to replace their existing stock of PCs en masse? A more likely scenario is that they will defer implementing speech recognition until the more powerful machines drop in price. Once firms switch to speech recognition, the PC replacement cycle will be driven more tightly by improvements in the software. The trends highlighted in this analysis also provide a powerful commentary on the issue of decentralization. The chip-level trends clearly indicate that increasingly powerful computers will be available at lower prices. This trend will encourage the expanding use of powerful devices into
decentralized applications. Ultimately, the discussion is not an issue of PC versus server, but a question of what processing is best performed at the user interface, and what parts of the application need to be handled centrally. Increasing processing capabilities create new applications, which in turn demand more capabilities, creating a cycle that leads to the necessity of replacing computers on a regular basis. From a managerial perspective, you have only limited choice over whether to follow this cycle. The short-term option of ignoring new applications could lead to long-term management issues and employee training problems. Once an organization commits to updating hardware on a regular basis, the strategies examined here provide a reliable means of evaluating and simplifying the choices. As long as the basic hardware trends continue, these techniques should be valid. Will these trends continue?
8. The Future
Can Moore's law continue? Can chip-level technologies continue to double capacities every 18 months? This question is beginning to receive more attention. In particular, power consumption and heat generation, along with quantum level effects, are predicted to cause problems [3, 4, 29, 49-55]. However, for good reasons, after 35 years, businesses have come to rely on improved performance and declining costs. If these trends continue, they will support more powerful uses of computers and more ubiquitous applications. If they do not continue, alternative solutions must be found to handle the ever-increasing demands.
8.1 Limits to Lithography
A special issue of Scientific American examined the underlying physics and engineering questions related to perpetuating Moore's law. In particular, Stix [56] discussed the complications facing chip manufacturers as they try to reduce feature sizes to 0.1 micron. In 2000, common chips (Pentium III/800) were being produced at 0.18 microns using excimer lasers at a 193-nanometer wavelength. The generation before that utilized 0.25 micron features, so the move to 0.1 micron is only a few generations (or years) away. Given the current chip manufacturing lithography methodologies, the most immediate problems involve the need for precision lenses to focus the etching lasers, and ultimately the need to reduce the wavelength. Reducing the wavelength raises a challenging set of manufacturing issues as it approaches the X-ray level.
Ultimately, current methodologies can only be pushed to a certain limit. Even if smaller features can be created, heat becomes an increasingly difficult problem, and linewidths become so small that essential electrical properties begin to fail. Pushing even further, quantum effects will interfere with the desired schematics. Most researchers agree that current methodologies are ultimately limited. Moore discussed many of these issues in 1997, and reached similar conclusions [10]. Between the physical limits and the increasing costs, we eventually run out of expansion potential. Some pundits have suggested we may reach that point in 10-15 years. That would constitute about six more generations of chips. If Moore's law holds over that time, capacities would increase by 2^6, or 64, times current levels. Certainly an important feat, but not necessarily enough to satisfy the needs of a large population demanding personalized real-time video editing and transmission. Yet, earlier forecasts of the demise of Moore's law have proven wrong. So far, engineers and scientists have found new methods to overcome the limitations.

8.2 Alternative Methodologies
Ultimately, the only way to surmount the limits of lithography is to find another technology [57]. Along with improvements in lithographic and deposition techniques, more radical research is progressing in optics, quantum physics, and biology. The optical research is probably the farthest along of the three. For example, Jordan and Heuring [58] described some basic progress and capabilities a decade ago. A computer operating on light offers several advantages over electrically based computers. In particular, light travels faster than electricity, there is minimal attenuation over reasonable distances, there is no problem with heat, and there is no interference generated from neighboring lines. While these properties have been beneficial for communication, there is considerable work to be done in creating optic-based memory or processor devices. In terms of chip size, quantum research is intriguing. Several researchers [59-62] are working on stable bi-state, transistor-replacement devices that utilize quantum properties. While single-unit quantum devices have been demonstrated, physicists still need to resolve the connection issues to be able to build devices that can handle computations. A few people have experimented with biological components, but with only limited success at this point [63, 64]. Theorists keep pointing to human neurons arguing that it should be possible to integrate biological and electrical components to reduce the size and increase the complexity of computational devices.
Less esoteric options are more likely--particularly for the immediate future. Moore outlined some of these technologies in an interview in 1997 [10]. By stretching current manufacturing technologies, it should be possible to produce chips with feature sizes down to 0.13 microns. Even if physical or implementation limits are reached with current technology, designers can improve performance with more sophisticated instruction sets and structure. Similarly, even if individual chips reach limits, overall computer performance could be improved through parallel processing--increasing the number of processors on a machine to spread the work. Simple tasks do not benefit much from massively parallel computing, but more complex jobs of the future such as speech recognition and video processing can still benefit from multiple processors.
9. Conclusions
To date, computer performance and prices have been driven by Moore's law, with capacity doubling about every 18 months. In this growth environment, prices have fallen while performance has increased. For over 35 years, this trend has altered the industry, providing faster, smaller, cheaper machines every year. Sales have moved from mainframes to minicomputers to personal computers, laptops, and handheld devices. While this trend has benefited both the computer industry and its customers, it leads to some difficult purchasing decisions. The centralized economies of the 1970s and 1980s gave way to increasingly rapid introductions of faster personal computers in the 1990s. While it seems useful to be able to buy ever faster machines, the continual introduction of new machines, along with applications that require the new performance, forces managers to adopt a strategy for replacing computers on a regular basis. Conflicts often arise between users who believe they can improve their productivity through newer machines, and cost-conscious managers who want to avoid unnecessary expenditures. Comparing various purchasing strategies demonstrates that some options are preferred to others. That is, strategies that cost more but yield lower performance are generally unacceptable. Analyzing the 12 major strategies reveals that for desktop computer purchases through the 1990s, it was generally best to replace personal computers every three years. Depending on application needs, businesses will choose high, intermediate, or low-end desktops. The process of analyzing laptop purchase decisions is similar, but the results are different. In this case, the best strategies are to purchase high-end laptops and replace them when you need to improve performance.
Since most of the performance gains have been driven by improvements in chip manufacturing, physical limits will eventually restrict the gains available through current production techniques. Researchers and manufacturers are searching for new technologies that will provide the next breakthrough in performance. Advances in optical computing are probably the leading contenders, but results in quantum and biological computing demonstrate some potential for new technologies.
Secure Outsourcing of Scientific Computations

MIKHAIL J. ATALLAH, K. N. PANTAZOPOULOS, JOHN R. RICE, AND EUGENE E. SPAFFORD
CERIAS: Center for Education and Research in Information Assurance and Security
Purdue University
West Lafayette, IN 47907, USA
Abstract
We investigate the outsourcing of numerical and scientific computations using the following framework: A customer who needs computations done but lacks the computational resources (computing power, appropriate software, or programming expertise) to do these locally would like to use an external agent to perform these computations. This currently arises in many practical situations, including the financial services and petroleum services industries. The outsourcing is secure if it is done without revealing to the external agent either the actual data or the actual answer to the computations. The general idea is for the customer to do some carefully designed local preprocessing (disguising) of the problem and/or data before sending it to the agent, and also some local postprocessing of the answer returned to extract the true answer. The disguise process should be as lightweight as possible, e.g., take time proportional to the size of the input and answer. The disguise preprocessing that the customer performs locally to "hide" the real computation can change the numerical properties of the computation, so that numerical stability must be considered as well as security and computational performance. We present a framework for disguising scientific computations and discuss their costs, numerical properties, and levels of security. We show that no single disguise technique is suitable for a broad range of scientific computations, but there is an array of disguise techniques available so that almost any scientific computation could be disguised at a reasonable cost and with very high levels of security. These disguise techniques can be embedded in a very high level, easy-to-use system (problem solving environment) that hides their complexity.
1. Introduction ........................................ 216
   1.1 Outsourcing and Disguise ........................ 216
   1.2 Related Work in Cryptography .................... 217
   1.3 Other Differences Between Disguise and Encryption 219
   1.4 Four Simple Examples ............................ 220
2. General Framework ................................... 223
   2.1 Need for Multiple Disguises ..................... 223
   2.2 Atomic Disguises ................................ 224
   2.3 Key Processing .................................. 232
   2.4 Data-Dependent Disguises ........................ 232
   2.5 Disguise Programs ............................... 233
3. Applications ........................................ 236
   3.1 Linear Algebra .................................. 236
   3.2 Sorting ......................................... 242
   3.3 Template Matching in Image Analysis ............. 243
   3.4 String Pattern Matching ......................... 246
4. Security Analysis ................................... 247
   4.1 Breaking Disguises .............................. 247
   4.2 Attack Strategies and Defenses .................. 248
   4.3 Disguise Strength Analysis ...................... 252
5. Cost Analysis ....................................... 265
   5.1 Computational Cost for the Customer ............. 265
   5.2 Computational Cost for the Agent ................ 265
   5.3 Network Costs ................................... 268
   5.4 Cost Summary .................................... 268
6. Conclusions ......................................... 268
   References .......................................... 270

Portions of this work were supported by Grants DCR-9202807 and EIA-9903545 from the National Science Foundation, and by sponsors of the Center for Education and Research in Information Assurance and Security.
1. Introduction

1.1 Outsourcing and Disguise
Outsourcing is a general procedure employed in the business world when one entity, the customer, chooses to farm out (outsource) a certain task to an external entity, the agent. We believe that outsourcing will become the common way to do scientific computation [1-4]. The reasons for the customer to outsource the task to the agent can be many, ranging from a lack of resources to perform the task locally to a deliberate choice made for financial or response time reasons. Here we consider the outsourcing of numerical and scientific computations, with the added twist that the problem data and the answers are to be hidden from the agent who is performing the computations on the customer's behalf. That is, it is either the customer who does not wish to trust the agent with preserving the secrecy of that information, or it is the agent who insists on the secrecy so as to protect itself from liability because of accidental or malicious (e.g., by a bad employee) disclosure of the confidential information.
The current outsourcing practice is to operate "in the clear", that is, by revealing both data and results to the agent performing the computation. One industry where this happens is the financial services industry, where the proprietary data includes the customer's projections of the likely future evolution of certain commodity prices, interest and inflation rates, economic statistics, portfolio holdings, etc. Another industry is the energy services industry, where the proprietary data is mostly seismic, and can be used to estimate the likelihood of finding oil or gas if one were to drill in a particular geographic area. The seismic data is so massive that doing matrix computations on such large data arrays is beyond the computational resources of even the major oil service companies, which routinely outsource these computations to supercomputing centers.

We consider many science and engineering computational problems and investigate various schemes for outsourcing to an outside agent a suitably disguised version of the computation in such a way that the customer's information is hidden from the agent, and yet the answers returned by the agent can be used to obtain the true answer easily. The local computations should be as minimal as possible and the disguise should not degrade the numerical stability of the computation. We note that there might be a class of computations where disguise is not possible, those that depend on the exact relationships among the data items. Examples that come to mind include (a) ordering a list of numbers, (b) typesetting a technical manuscript, (c) visualizing a complex data set, or (d) looking for a template in an image. However, example (a) can be disguised [5], and disguising (d) is actually a (quite nontrivial) contribution of this chapter, so perhaps the others might be disguised also. To appreciate the difficulty of (d), consider the obvious choice for hiding the image, i.e., adding a random matrix to it: What does one do to the template so that the disguised version of the template occurs in the disguised version of the image? It looks like a chicken-and-egg problem: if we knew where the template occurs then we could add to it the corresponding portion of the image's random matrix (so that the occurrence is preserved by the disguise), but of course we do not know where it occurs; this is why we are outsourcing it in the first place.

1.2 Related Work in Cryptography
The techniques presented here differ from what is found in the cryptography literature concerning this kind of problem. Secure outsourcing in the sense of [6] follows an information-theoretic approach, leading to elegant negative results about the impossibility of securely outsourcing computationally intractable problems. In contrast, our methods are geared towards scientific computations that may be solvable in polynomial time (e.g.,
solution of a linear system of equations) or where time complexity is undefined (e.g., the work to solve a partial differential equation is not related to the size of the text strings that define the problem). In addition, the cryptographic protocols literature contains much that is reminiscent of the outsourcing framework, with many elegant protocols for cooperatively computing functions without revealing information about the functions' arguments to the other party (cf. the many references in, for example, Schneider [7] and Simmons [8]). The framework of the privacy homomorphism approach that has been proposed in the past [9] assumes that the outsourcing agent is used as a permanent repository of the data, performing certain operations on it and maintaining certain predicates, whereas the customer needs only to decrypt the data from the external agent's repository to obtain from it the real data. Our framework is different in the following ways:
• The customer is not interested in keeping data permanently with the outsourcing agent; instead, the customer only wants to use temporarily its superior computational resources.
• The customer has some local computing power that is not limited to encryption and decryption. However, the customer does not wish to do the computation locally, perhaps because of the lack of computing power or appropriate software, or perhaps because of economics.
Our problem is also reminiscent of the server-aided computation work in cryptography, but there most papers deal with modular exponentiations and not with numerical computing [4, 7, 10-16]. Our problems and techniques afford us (as will be apparent below) the flexibility of using one-time-pad kinds of schemes for disguise. For example, when we disguise a number x by adding to it a random value r, then we do not re-use that same r to disguise another number y (we generate another random number for that purpose). If we hide a vector of such x's by adding to each a randomly generated r, then we have to be careful to use a suitable distribution for the r's (more on this later). The random numbers used for disguises are not shared with anyone: they are merely stored locally and used locally to "undo" the effect of the disguise on the disguised answer received from the external agent. Randomness is not used only to hide a particular numerical value, but also to modify the nature of the disguise algorithm itself, in the following way. For any part of a numerical computation, we will typically have more than one alternative for performing a disguise (e.g., disguising problem size by shrinking it, or by expanding it, in either case by a random amount). Which method is used is also selected randomly.
Note that the above implies that, if our outsourcing schemes are viewed as protocols, then they have the feature that one of the two parties in the protocol (the external agent) is ignorant of which protocol the other party (the customer) is actually performing. This is the case even if the external agent has the source code, so long as the customer's seeds (of the generators for the randomness) are not known. Throughout this chapter when we use random numbers, random matrices, random permutations, random functions (e.g., polynomials, splines, etc., with random coefficients), etc., it is assumed that each is generated independently of the others, and that quality random number generation is used (cf. [17, Chap. 23]; [18, Chap. 12]; [19,20]). The parameters, types, and seeds of these generators provide the keys to the disguises. We show how to use a single key to generate multiple keys which are "independent" and which simplify the mechanics of the disguise techniques. This key is analogous to the key in encryption but the techniques are different.
1.3 Other Differences Between Disguise and Encryption
The following simple example further illustrates the difference between encryption and disguise. Consider a string F of text characters that are each represented by an integer from 1 to 256 (i.e., these are a byte string). Suppose that F1 is an encryption of F with one of the usual encryption algorithms. Suppose that F2 is a disguise of F that is created as follows: (1) choose a seed (the disguise key) for a uniform random number generator and create a sequence G of random integers between 0 and 128; (2) set F2 = F + G. Assume now that F is a constant (the single value 69) string of length N and the agent wishes to discover the value of this constant. It is not possible to discover from F1 the value 69 no matter how large N is. However, it is possible to discover 69 from F2 if N is large enough. Since G is uniform, the mean of the values of G converges to 64 as N increases and thus, as N increases, the mean of F2 converges to 133 = 64 + 69, and the rate of convergence is of order 1/√N. Thus, when 128/√N is somewhat less than 1/2, we know that the mean of F2 is 133 and hence that the character is 69. An estimate of N is obtained by requiring that 128/√N be less than 1/2, i.e., that N be more than about 60 000-70 000. The point of this example is that the encryption cannot be broken in this case without knowing the encryption key, even if one knows the encryption method. However, the disguise can be broken without knowing the key, provided the disguise method is known. Of course, it follows that one should not use a simplistic disguise, and we provide disguise techniques for scientific computations with security that is, we believe, comparable to that of the most secure encryptions.
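As a concrete illustration of the averaging attack just described, the following short sketch (ours, not from the chapter; it assumes NumPy) disguises a constant byte string with uniform noise and then recovers the constant from the sample mean.

import numpy as np

rng = np.random.default_rng(12345)             # the disguise key (seed)
N = 100_000
F = np.full(N, 69, dtype=np.int64)             # constant "byte string" F
G = rng.integers(0, 129, size=N)               # uniform integers between 0 and 128
F2 = F + G                                     # the disguised string seen by the agent

# Because G has mean about 64, averaging F2 exposes the hidden constant once N
# is large enough (the error decays like 1/sqrt(N)).
print(round(F2.mean() - 64.0))                 # prints 69 (with high probability)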
1.4 Four Simple Examples

The nature and breadth of the disguises possible are illustrated by the following.
1.4.1 Matrix Multiplication

Consider the computation of the product of two n x n matrices M1 and M2. We use δ_{x,y} to denote the Kronecker delta function that equals 1 if x = y and 0 if x ≠ y. The disguise requires six steps:
1. Create (i) three random permutations π1, π2, and π3 of the integers {1, 2, ..., n}, and (ii) three sets of nonzero random numbers {α1, α2, ..., αn}, {β1, β2, ..., βn}, and {γ1, γ2, ..., γn}.
2. Create matrices P1, P2, and P3 where P1(i, j) = α_i δ_{π1(i), j}, P2(i, j) = β_i δ_{π2(i), j}, and P3(i, j) = γ_i δ_{π3(i), j}. These matrices are readily invertible, e.g., P1^-1(i, j) = (α_j)^-1 δ_{π1^-1(i), j}.
3. Compute the matrix X = P1 M1 P2^-1. We have X(i, j) = (α_i/β_j) M1(π1(i), π2(j)).
4. Compute Y = P2 M2 P3^-1.
5. Send X and Y to the agent, which computes the product Z = XY = (P1 M1 P2^-1)(P2 M2 P3^-1) = P1 M1 M2 P3^-1 and sends Z back.
6. Compute locally, in O(n²) time, the matrix P1^-1 Z P3, which equals M1 M2.

This disguise may be secure enough for many applications, as the agent would have to guess two permutations (from the (n!)² possible such choices) and 3n numbers (the α_i, β_i, γ_i) before it can determine M1 or M2. This example is taken from Atallah et al. [5], where the much more secure disguise of Section 3.1.1 is presented. Both of these disguises require O(n²) local computation, which is the minimum possible since the problem involves O(n²) data. The outsourced computations require O(n³) operations.
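The six steps above can be exercised in a few lines of NumPy. The sketch below is our illustration of this disguise, not code from the chapter; for brevity it forms the inverses of the P_i with a general matrix inverse, although in practice P_i^-1 is written down directly from the stored permutation and scalings in O(n²) work.

import numpy as np

def random_p(n, rng):
    # P[i, pi[i]] = alpha[i], all other entries zero: a scaled permutation matrix
    pi = rng.permutation(n)
    alpha = rng.uniform(1.0, 2.0, size=n)      # nonzero random scalings
    P = np.zeros((n, n))
    P[np.arange(n), pi] = alpha
    return P

rng = np.random.default_rng(7)
n = 4
M1, M2 = rng.normal(size=(n, n)), rng.normal(size=(n, n))
P1, P2, P3 = random_p(n, rng), random_p(n, rng), random_p(n, rng)

X = P1 @ M1 @ np.linalg.inv(P2)                # disguised operands sent to the agent
Y = P2 @ M2 @ np.linalg.inv(P3)
Z = X @ Y                                      # computed by the agent
answer = np.linalg.inv(P1) @ Z @ P3            # local recovery of M1 M2
assert np.allclose(answer, M1 @ M2)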
1.4.2 Quadrature
The objective is to estimate the integral ∫_a^b f(x) dx
with accuracy eps. The disguise is as follows:
1. Choose x1 = a, x7 = b, five ordered random numbers x_i in [a, b], and seven values v_i with M1 ≤ v_i ≤ M2, where min |f(x)| ≈ M1 ≤ M2 ≈ max |f(x)|. (M1 and M2 are only estimated roughly.)
2. Create the cubic spline g(x) with knots x_i so that g(x_i) = v_i.
3. Integrate g(x) exactly from a to b to obtain I1.
4. Send g(x) + f(x) and eps to the agent for numerical quadrature and receive the value I2 back.
5. Compute I2 − I1, which is the answer.
All the computations made locally are simple, of fixed work, independent of eps, and depend only weakly on f(x). The random vectors and matrices of the previous example are replaced by a "random" smooth function. One has to determine 12 random numbers in order to break the disguise.
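A minimal sketch of this quadrature disguise, assuming SciPy's CubicSpline and quad as stand-ins for the spline construction and for the agent's numerical quadrature (the integrand, knot choices, and tolerances below are our illustrative choices, not the chapter's):

import numpy as np
from scipy.interpolate import CubicSpline
from scipy.integrate import quad

rng = np.random.default_rng(11)
a, b, eps = 0.0, 2.0, 1e-8
f = lambda x: np.sqrt(np.abs(x)) * np.cos(x + 3.0)            # the customer's integrand

x = np.sort(np.concatenate(([a, b], rng.uniform(a, b, 5))))   # knots x1 = a, ..., x7 = b
scale = np.max(np.abs(f(np.linspace(a, b, 200))))
v = rng.uniform(0.5 * scale, 2.0 * scale, size=7)             # values on the rough scale of |f|
g = CubicSpline(x, v)

I1 = g.integrate(a, b)                                        # exact local integral of g
I2, _ = quad(lambda t: f(t) + g(t), a, b, epsabs=eps)         # the agent's quadrature of f + g
print(I2 - I1, quad(f, a, b)[0])                              # the recovered answer matches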
1.4.3 Edge Detection

The objective is to determine the edges in a picture represented by an n x n array of pixel values p(x, y) between 0 and 100 000 on the square 0 ≤ x, y ≤ 1. The disguise is as follows:
1. Set x1, y1 = 0 and x10, y10 = 1, choose two sets of eight ordered random numbers with 0 < x_i, y_i < 1, and choose 100 random values 0 ≤ v_ij ≤ 100 000 at the grid points (x_i, y_j).
1.4.4 Solution of a Differential Equation
The objective is to solve the two-point boundary value problem

y'' + a1(x) y' + a2(x) y = f(x, y),   y(a) = y0,   y(b) = y1.

The disguise is as follows:
1. Choose a spline g(x) as in Section 1.4.2 above.
2. Create the function u(x) = g'' + a1(x) g' + a2(x) g.
3. Send the problem

y'' + a1(x) y' + a2(x) y = f(x, y) + u(x),   y(a) = y0 + g(a),   y(b) = y1 + g(b)

to the agent for solution and receive z(x) back.
4. Compute z(x) − g(x), which is the solution.

This disguise applies the problem's mathematical operator to a known function and then combines the real and the artificial problems. Here one must determine 12 random numbers to break the disguise.

This chapter is organized as follows. Section 2 presents the general framework for disguises and most of the innovative approaches. These include:
• One time random sequences that are highly resistant to statistical attack.
• Random mathematical objects, including the matrices and vectors previously introduced by Atallah et al. [5] plus the random functions, operators, and mappings introduced here.
• Linear operator modification to disguise computations involving operators.
• Mathematical object modification using random objects to create disguises that are highly resistant to approximation-theoretic attacks.
• Domain/dimension/coordinate system modifications to disguise computations via transformations.
• Identities and partitions of unity to create disguises highly resistant to symbolic code analysis attacks.
• Data-dependent disguises to make disguises depend specifically on the problem data.
• Master key/subkey methodology that allows a large number of independent subkeys to be derived from a single key.
• Disguise programs which specify disguises concisely and retain the information exactly for disguise inversion.

Section 3 discusses the application of these approaches to the following:
• linear algebra (matrix multiplication, matrix inversion, systems of equations, convolutions);
• sorting;
• string pattern matching;
• template matching for images with least squares or ℓ1 norms in minimal order time.

Section 4 presents an analysis of the strengths of the disguises and identifies the three principal avenues of attack (statistical, approximation theoretic, and symbolic code analysis). Section 5 presents a discussion of the cost of the disguise process.
2. General Framework
In this section we show why multiple disguise techniques are necessary and then identify five broad classes of disguises. Within each class there may be several or many atomic disguises, the techniques used to create complete disguises. The necessity for multiple disguises illustrates again the different nature of disguises and encryption. Multiple disguises require multiple keys and thus we present a technique to use one master key from which many subkeys may be generated automatically with the property that the discovery of one subkey does not compromise the master key or any other subkey. Finally, we present a notation for disguise programs which use the atomic disguises to create complete disguises.
2.1 Need for Multiple Disguises
We believe that no single disguise technique is sufficient for the broad range of scientific computations. The analogy with ordinary disguises is appropriate: one does completely different things to disguise an airplane hangar than to disguise a person. In a personal disguise one changes the hair, the face, the clothes, etc., using several different techniques. The same is true for scientific computation. If we consider just five standard scientific computations (quadrature, ordinary differential equations, optimization, edge detection, and matrix multiplication), we see that none of the atomic disguises applies to all five.
We are unable to find a single mathematical disguise technique that is applicable to all five.

2.2 Atomic Disguises
A disguise has three important properties:
• Invertibility: After the disguise is applied and the disguised computation made, one must be able to recover the result of the original computation.
• Security: Once the disguise is applied, someone (the agent) without the key of the disguise should not be able to discover either the original computation or its result. It must be assumed that the agent has all the information about the disguised computation. One can use multiple agents to strengthen the security of a disguise but, ultimately, one must be concerned that these agents might collaborate in an attempt to break the disguise.
• Cost: There is a cost to apply the disguise and a cost to invert it. The costs to outsource the computation and to carry it out might be increased by the disguise. Thus, there are four potential sources of cost in using disguises.

The ideal disguise is, of course, invertible, highly secure, and cheap. We present disguises for scientific computations that are invertible, quite secure, and of reasonable cost. The cost of a disguise is often related to the size of the computation in some direct way. Not unexpectedly, we see that increasing security involves increasing the cost.
2.2.1 Random Objects

The first class of atomic disguise techniques is to create random objects: numbers, vectors, matrices, functions, parameters, etc. These objects are "mixed into" the computation in some way to disguise it. These objects are created in some way from random numbers which, in turn, use random number generators. If the numbers are truly random, then they must be saved for use in the disguise and inversion process. If they come from a pseudorandom number generator, then it is sufficient to save the seed and parameters of the generator. We strongly advocate that the "identity" of the generator be hidden, not just the seed, if substantial sequences from the generator are used. This can be accomplished by taking a few standard generators (uniform, normal, etc.) and creating one time random sequences. To illustrate, assume we have
G1 = a uniform generator (with two parameters = upper/lower range), G2 = a normal generator (with two parameters = mean and standard deviation), G3 = an exponential generator (with two parameters = mean and exponent), and G4 = a gamma generator (with two parameters = mean and shape). Choose 12 random numbers; the first eight are the parameters of the four generators and the other four, α1, α2, α3, and α4, are used to create the one time random sequence

α1 G1 + α2 G2 + α3 G3 + α4 G4.

These random numbers must be scaled appropriately for the computation to be disguised. This is rather straightforward and is discussed in a more general context later. Note that in creating this one set of random numbers, we use 16 subkeys: the eight parameters, the four coefficients, and the four seeds of the generators. Once one has random numbers, it is straightforward to create random vectors, matrices, and arrays for use in disguises. Objects with integer (or discrete) values, such as permutations, also can be created from random numbers in a straightforward manner.

Functions play a central role in scientific computations and we need to be able to choose random sets of them also. The technique to do this is as follows. Choose a basis of 10 or 30 functions for a high-dimensional space F of functions. Then choose a random point in F to obtain a random function. The basis must be chosen with some care for this process to be useful. The functions must have high linear independence (otherwise the inversion process might be unstable) and their domains and ranges must be scaled compatibly with the computation to be disguised. The scaling is straightforward (but tedious). We propose to make F a one time random space, as illustrated by the following example. Enclose the domain of the computation in a box (interval, rectangle, box, ..., depending on the dimension). Choose a random rectangular grid in the box with 10 lines in each dimension, assuring a minimum separation (say 3%). Create K sets of random function values at all the grid points (including the boundaries), one set for each basis function desired. These values are to be in the desired range. Interpolate these values by cubic splines to create K basis functions. These functions are smooth (they have two continuous derivatives). Add to this set of K basis functions a basis for the quadratic polynomials.

The approach illustrated above can be modified to make many kinds of one time random spaces of functions. If functions with local support are needed, just replace the cubic splines by Hermite quintics. If the functions are to vary more in one part of the domain than another, then refine the grid in that part. If the functions are to be more or less smooth, then adjust the degree and smoothness of the splines appropriately. The computational
techniques for creating all these splines (or piecewise polynomials) are given in the book by deBoor [18].
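The following sketch shows one way such a one time random sequence might be assembled with NumPy's standard generators; the parameter ranges and mixing coefficients are arbitrary choices for illustration and would be rescaled to the computation being disguised.

import numpy as np

def one_time_sequence(rng, size):
    # Mix four standard generators with random parameters and coefficients so
    # the "identity" of the generator is itself part of the (sub)key material.
    prm = rng.uniform(0.5, 2.0, size=8)           # eight generator parameters
    coef = rng.uniform(-1.0, 1.0, size=4)         # alpha_1 ... alpha_4
    g1 = rng.uniform(-prm[0], prm[1], size)       # uniform (lower/upper range)
    g2 = rng.normal(prm[2], prm[3], size)         # normal (mean, std)
    g3 = prm[4] + rng.exponential(prm[5], size)   # shifted exponential
    g4 = rng.gamma(prm[6], prm[7], size)          # gamma (shape, scale)
    return coef[0]*g1 + coef[1]*g2 + coef[2]*g3 + coef[3]*g4

rng = np.random.default_rng(2024)                 # the seed acts as a subkey
r = one_time_sequence(rng, 1000)                  # rescale as needed for the problem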
2.2.2 Linear Operator Modification
Many scientific computations involve linear operators, e.g.,

linear equations: solve Ax = b;
differential equations: solve y'' + cos(x) y' + x² y = 1 − x e^{-x}.

These operator equations are of the form Lu = b and can be disguised by changing the objects in them (this is discussed later) or by exploiting their linearity. Linearity is exploited by randomly choosing v like u, i.e., v is the same type of object, and then solving L(u + v) = b + Lv, where Lv is evaluated to be the same type of object as b. This disguises the solution but still reveals the operator. It will be seen later that, for equations involving mathematical functions, it is very advantageous to choose v to be a combination of a random function and functions that appear in the equation. Thus one could choose v(x) in the above differential equation to be v_Ran(x) + 4.2 cos(x) − 2.034 x e^{-x}. This helps with disguises of the operator.
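For the linear-equation case, the linearity trick amounts to a few lines; the sketch below (our illustration, using NumPy) adds a random vector v to the unknown, outsources the modified right-hand side, and removes v locally.

import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.normal(size=(n, n)) + n * np.eye(n)   # a well-conditioned test operator L
b = rng.normal(size=n)

v = rng.normal(size=n)                        # random object of the same type as u
b_disguised = b + A @ v                       # right-hand side of L(u + v) = b + Lv

w = np.linalg.solve(A, b_disguised)           # outsourced solve of the disguised system
x = w - v                                     # local inversion of the disguise
assert np.allclose(A @ x, b)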
2.2.3 Object Modification
Scientific computations involve the objects discussed above, which can often be manipulated by addition or multiplication to disguise the computation. The related technique of substituting equivalent objects is discussed later. We illustrate these by examples.
1. Addition to integrals. To disguise the evaluation of

∫_0^1 √x cos(x + 3) dx,

add a random function to √x cos(x + 3). Note that the basis of the space F chosen in Section 2.2.1 makes it trivial to evaluate the integral of one of these functions.
2. Multiply by matrices. To disguise the solution of Ax = b, choose two random diagonal matrices D1 and D2, compute B = D1 A D2, and outsource the problem By = D1 b. The solution x is then obtained from x = D2 y.
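A sketch of the second example, again in NumPy; the random diagonal scalings stand in for D1 and D2, and the diagonal ranges are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(5)
n = 5
A = rng.normal(size=(n, n)) + n * np.eye(n)
b = rng.normal(size=n)

d1 = rng.uniform(0.5, 2.0, size=n)            # diagonals of D1 and D2 (nonzero)
d2 = rng.uniform(0.5, 2.0, size=n)
B = (d1[:, None] * A) * d2[None, :]           # B = D1 A D2 without forming the matrices

y = np.linalg.solve(B, d1 * b)                # outsourced: solve B y = D1 b
x = d2 * y                                    # local recovery: x = D2 y
assert np.allclose(A @ x, b)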
2.2.4 Domain and Dimension Modification
Many problems allow one to modify the domain or dimensions (matrix/vector problems) by expansion, restriction, splitting, or rearrangement. Each of these is illustrated.
1. Expansions. The evaluation of

∫_0^1 √x cos(x + 3) dx

or the solution of an initial value problem on [3, 5],

y'(x) = (x + y) e^{-xy},   y(3) = 1,

can be modified by expanding [0, 1] to [0, 2] or [3, 5] to [2, 5]. In the first case one selects a random function u(x) from F with u(1) = cos(4), integrates it on [1, 2], and extends √x cos(x + 3) to [0, 2] using u(x). In the second case one chooses a random function u(x) from F with u(3) = 1, u'(3) = 4e^{-3}. One computes its derivative u'(x), its value u(2), and solves the problem

y'(x) = (x + y) e^{-xy} on [3, 5],   y'(x) = u'(x) on [2, 3],

with the initial condition y(2) = u(2).
2. Restriction. One can decrease the dimension of a linear algebra computation or problem by having the customer perform a tiny piece of it and sending the rest to the agent. For example, in solving Ax = b, one chooses an unknown at random, eliminates it by Gauss elimination, and then sends the remaining computation to the agent. This changes the order of the matrix by 1 and, further, it modifies all the remaining elements of A and b. This does not disguise the solution except by hiding one unknown.
3. Splitting. Many scientific problems can be partitioned into equivalent subproblems simply by splitting the domain. This is trivial in the case of quadrature, and the linear algebra problem Ax = b can be split by partitioning
A = ( A11  A12 )
    ( A21  A22 )

and creating two linear equations

A11 x1 = b1 − A12 x2,
(A22 − A21 A11^-1 A12) x2 = b2 − A21 A11^-1 b1.
(See Section 3.1 for more on disguising the Ax = b problem.) The differential equation problem

y'(x) = (x + y) e^{-xy},   y(3) = 1, on [3, 5]

can be split into

y'(x) = (x + y) e^{-xy},   y(3) = 1, on [3, 4],
y'(x) = (x + y) e^{-xy},   y(4) = as computed, on [4, 5].
An important use of splitting problems is to be able to disguise the different parts in different ways so as to strengthen the overall security of the disguise.
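For the Ax = b splitting above, the two subproblems can be formed explicitly as in the following NumPy sketch (ours, with an arbitrary block size); each piece could then be disguised and outsourced separately.

import numpy as np

rng = np.random.default_rng(9)
n, k = 6, 3                                    # block sizes for the partition
A = rng.normal(size=(n, n)) + n * np.eye(n)
b = rng.normal(size=n)

A11, A12 = A[:k, :k], A[:k, k:]
A21, A22 = A[k:, :k], A[k:, k:]
b1, b2 = b[:k], b[k:]

# The Schur-complement system for x2, then back-substitution for x1; each of
# these smaller problems could be disguised in a different way.
S = A22 - A21 @ np.linalg.solve(A11, A12)
x2 = np.linalg.solve(S, b2 - A21 @ np.linalg.solve(A11, b1))
x1 = np.linalg.solve(A11, b1 - A12 @ x2)
assert np.allclose(A @ np.concatenate([x1, x2]), b)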
2.2.5 Coordinate System Changes

Changes of coordinate systems are effective disguises for many scientific computations. They must be chosen carefully, however; if they are too simple they do not hide much, and if they are too complex they are hard to invert. For discrete problems (e.g., linear algebra), permutations of the indices play a similar role. Random permutation matrices are easy to generate and simple to apply/invert. Disguises based on coordinate system changes are one of the few effective disguises for optimization and for solutions of nonlinear systems. We illustrate the range of possibilities by considering a particular problem, disguising the two-dimensional partial differential equation (PDE) problem
∇²f(x, y) + (6.2 + 12 sin(x + y)) f = g1(x, y),   (x, y) in R,
f(x, y) = b1(x, y),   (x, y) in R1,
f(x, y) = b2(x, y),   (x, y) in R2,
∂f(x, y)/∂x + g2(x, y) f(x, y) = b3(x, y),   (x, y) in R3.
Here R1, R2, and R3 comprise the boundary of R. To implement the change of coordinates u = u(x, y), v = v(x, y) one must be able to:
1. invert the change, i.e., find the functions x = x(u, v), y = y(u, v);
2. compute the derivatives needed in the PDE, e.g.,

∂²f/∂x² = ∂²f/∂u² (∂u/∂x)² + 2 ∂²f/∂u∂v (∂u/∂x)(∂v/∂x) + ∂²f/∂v² (∂v/∂x)² + ∂f/∂u ∂²u/∂x² + ∂f/∂v ∂²v/∂x².
The change of coordinates produces an equivalent PDE problem on some domain S in (u, v) space of the form

Σ_{i,j=0}^{2} a_ij(u, v) ∂^{i+j} f(u, v)/∂u^i ∂v^j = b1(u, v),   (u, v) in S,
f(u, v) = c1(u, v),   (u, v) in S1,
f(u, v) = c2(u, v),   (u, v) in S2,
d1(u, v) ∂f(u, v)/∂u + d2(u, v) ∂f(u, v)/∂v + d3(u, v) f(u, v) = c3(u, v),   (u, v) in S3.
Here S1, S2, and S3 are the images of R1, R2, R3, and the functions a_ij(u, v), b1(u, v), c_i(u, v), d_i(u, v) are obtained by substituting in the changes of variables and collecting terms. A completely general coordinate change is excellent at disguising the PDE problem. It changes the domain and all the problem's coefficient functions (usually in complex ways). There are a number of coordinate changes whose inverses are known explicitly, but these are few enough that restricting oneself to them might weaken the security of the disguise. For other coordinate changes the inverse must be determined numerically. This process has been studied in some depth by Ribbens [22, 23], and reliable, efficient procedures have been developed which can be used to create one time coordinate changes using parameterized mappings with randomly chosen parameters. An intermediate variation here is to make the coordinate changes in the variables independently, that is, u = u(x) and v = v(y). This approach substantially reduces the cost and complexity of the change and yet allows for randomly parameterized one time coordinate changes.
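The sketch below illustrates, under our own simplifying assumptions, a randomly parameterized one-dimensional coordinate change u = u(x) of this "intermediate variation" kind, with its inverse recovered numerically by root finding; the particular mapping family and parameter range are invented for illustration and chosen small enough to keep u monotone.

import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(17)
a, b = 0.0, 1.0
c = rng.uniform(0.05, 0.25, size=3)            # the randomly chosen mapping parameters

def u(x):
    # A one time monotone coordinate change u = u(x) on [a, b] (u(a) = a, u(b) = b).
    return x + c[0] * c[2] * np.sin(c[1] * np.pi * x) * x * (1.0 - x)

def x_of_u(uval):
    # Numerical inverse x = x(u) by root finding; u is monotone for small c.
    return brentq(lambda t: u(t) - uval, a, b)

xs = np.linspace(0.1, 0.9, 5)
assert np.allclose([x_of_u(u(x)) for x in xs], xs, atol=1e-10)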
2.2.6 Identities and Partitions of Unity
One class of attacks on disguises is to make a symbolic analysis of the computational codes in order to separate the "random objects" from the original objects (see Section 4.3.3). Considerable protection against such attacks can be achieved by using mathematical identities to disguise the mathematical models involved in the computations. Examples of such identities include

a² − ax + x² = (a³ + x³)/(a + x)
log(xy) = log x + log y
1 + x = (1 − x²)/(1 − x)
sin(x + y) = sin x cos y + cos x sin y
cos²x = sin²y + cos(x + y) cos(x − y)
p cos x + q sin x = √(p² + q²) cos(x − cos^-1(p/√(p² + q²)))
sin(3(x + y)) = 3 sin(x + y) − 4 sin³(x + y).

Thus, if any component of these identities appears symbolically in a computation, the equivalent expression can be substituted to disguise the problem. A general source of useful identities for disguise comes from the basic tools for manipulating mathematics, e.g., changes of representation of polynomials (power form, factored form, Newton form, Lagrange form, orthogonal polynomial basis, etc.), partial fraction expansions, or series expansions. The above classical identities are purposely simple; disguises need complex identities, and these may be created by involving lesser known functions and compositions. For example, the simple relations involving the Gamma, Psi, and Struve functions [24],

Γ(x + 1) = xΓ(x),   ψ(x + 1) = ψ(x) + 1/x,   H_{1/2}(x) = (2/πx)^{1/2}(1 − cos x),

can be combined with the above to produce

sin(x) = [sin(ψ(1 + 1/x)) − sin(ψ(1/x)) cos x]/cos(ψ(1/x))
log(x) = −log Γ(x) + log(Γ(x + 1) H_{1/2}(x)) − log(1 − cos x) + (1/2) log(πx/2).

Identities which equal 1 are called partitions of unity, and they can be used anywhere in a computation. Examples include

sin²x + cos²x = 1
sec²(x + y) − tan²(x + y) = 1
(tan x + tan y)/tan(x + y) + tan x tan y = 1
b1(r, x) + b2(r, s, x) + b3(r, s, x) + b4(s, x) = 1
231
where the bi are "hat" functions defined by bl (r, x) = max(1 - x / r , O) bz(r, s, x ) = max(0, min(x/r, (s - x ) / ( s - r))) b3(r, s, x) = max(0, min((x - r)/(s - r), (1 - x)/(1 - s))) b4(x, s ) = max(0, (x - s)/(1 - s))
each of which is a piecewise linear function with breakpoints at 0, r, s, and/or 1. Generalizations of this partition of unity are known for an arbitrary number of functions, arbitrary polynomial degree, arbitrary breakpoints, and arbitrary smoothness (less than the polynomial degree). Partitions of unity allow one to introduce unfamiliar and unrelated functions into symbolic expression. Thus the second partition above becomes
secZ(yy(x) + u(1.07296, x)) - tan 20.'7(x) + u(1.07296, x)) - 1 where yy(x) is the Bessel function of fractional order and u(1.07296, x) is the parabolic cylinder function. The Guide to Available Mathematical Software [21] lists 48 classes of functions for which library software is available. Finally, we note that using functions and constants that appear in the original expression strengthens disguises significantly. Thus, if 2.70532, x 2, cos(x), and log(x) initially appear in an ordinary differential equation, one should use identities that involve these objects or closely related ones, e.g., 1.70532, 2x 2 - 1, cos(2x) or log(x + 1). Recall in elementary trigonometry courses that the establishment of identities is one of the most difficult topics (even given the knowledge that the two expressions are identical), so it is plausible that using several identities in a mathematical model provides very high security. One can also create one time identities as follows. There exist well-known, reliable library programs that compute the best piecewise polynomial approximation to a given function f ( x ) with either specified or variable breakpoints [18,25]. The number of breakpoints and/or polynomial degrees can be increased to provide arbitrary precision in these approximations. Thus given that f ( x ) - sin(2.715x + 0.12346)/(1.2097 + x 107654)
or t h a t f ( x ) is computed by a 1000 line code, one can use these library routines to replace f ( x ) by a code that merely evaluates a piecewise polynomial with "appropriate" coefficients and breakpoints. One time identities may also use the classical mathematical special functions that have parameters, e.g., incomplete gamma and beta functions, Bessel functions, Mathieu functions, spheroidal wave functions, parabolic cylinder functions, etc. [24].
232
MIKHAIL J. ATALLAH ET AL.
2.3
Key Processing
The atomic disguises usually involve a substantial number of keys (parameters of r a n d o m numbers generators, r a n d o m numbers, etc.) that partly define the disguise process. It is thus desirable to avoid keeping and labeling these keys individually and we present a technique that uses one master key to create an arbitrary number of derived subkeys. Let K be the master key and k;, i = 1, 2, .... N, be the subkeys. The subkeys k; are derived from K by a procedure P such as the following. We assume K is represented as a long bit string (a 16 character key K generates 128 bits) and we have a r a n d o m number generator G. For each i - 1 , 2 , ..., N we generate randomly a bit string of length 128 and use it as a mask on the representation of K (i.e., we select those bits of K where the r a n d o m bit string is 1). For each i this produces a bit string of about 64 (in our example) bits or a full word for a 64 bit computer. Thus, with a single key K and a fixed r a n d o m number generator G, we can create, say, many thousands of subkeys so that 9 Each k; is easily derived from K. 9 Knowing even a substantial set of the ki gives no information about K even if the generation procedure P is known. Recall that many of the subkeys are, in turn, seeds or parameters for a variety of random number generators so that massive sets of random numbers can be used without fear of revealing the master key or other subkeys even if a statistical attack on this aspect of the disguise is successful (which is very unlikely--see Section 4.2.1). Implicit in this discussion is that we envisage a substantial problem solving environment, call it A tri, for outstanding disguises which manages the disguise process and its inversion for the user. Of course, one can do all this by hand and, if super-security is essential, this might be advisable as one must assume a resourceful and determined attacker would have complete knowledge of Atri. 2.4
Data-Dependent
Disguises
Unlike encryption methods, disguises can depend on the problem data since no "second party" has to use the disguise procedures. This dependence is necessary in some simple cases, e.g., the random vector added must be the same length as the vector in the problem or the random function must have the same domain (and approximate size) as the function in the problem. In addition, this dependence can be introduced deliberately to further obscure the problem. Thus, once in a while, one can use the actual value of a
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
233
problem function in the creation of a random function. For example, in Section 1.4.2, Quadrature, instead of using seven random values v; to create the spline g(x), one can use five random values vl, v:, v3, vs, ~7 and set g(x4)--f(x4) and g(x6)=f(x6). This breaks the direct connection between the function g(x) and the random number generator used. A particularly attractive way to introduce data dependence is to use a problem data item in the choice of the seed for a random number generation. Thus, the disguise program statement Vector
(I ,7), V using
GLXlF (k(2), minf, 2*maxf)
could be replaced by xseed = GUN/F (k(2), a,b) Vector (2,6), V using GUN/F (f(xseed),
minf,
2*maxf)
This has been done in the second example disguise program in Section 2.5. The main value of data dependence is to introduce yet another dimension to the disguise procedure, thereby making it stronger. A second, perhaps important, value is that even knowing the master key and the disguise program is not enough to break the disguise; one must also have the problem data itself.
2.5 Disguise Programs It should now be clear that outsourcing disguises involve several steps and that one must carefully record the steps of the disguise process, as well as record the master key. At this point we ignore cost issues, e.g., is it better to save random numbers used or to recreate them when needed? We focus entirely on the questions of 9 specifying the disguise in reasonably efficient ways; 9 retaining the information so the disguise can be inverted. The second question is a subset of the first, in that it is often possible to invert a disguise without knowing everything about a disguise. This is illustrated in the earlier simple examples (Section 1.3) where, 9 (1.3.1) one does not need to know
P2,
9 (1.3.2) one does not need to know g(x); only I1 is needed. It is clear that Atri should be built using, at least internally, a formal language for specifying disguises and the process for inverting them. We do not attempt to define such a language here, but rather illustrate how such a
234
MIKHAIL J. ATALLAH ET AL.
language might appear to a user. First we observe that Atri must 9 Provide direct access to the objects (variables) in the computations. These might be single elements (numbers, arrays, functions,...) or compositions (expressions, equations, problems, coordinate systems, ...). 9 Provide a set of procedures to generate random objects as in Section 2.2.1 with specified attributes, e.g., size, dimension, smoothness, domain, random number generator type (seed/parameters, ...), etc. 9 Allow old/new objects to be combined with appropriate (type dependent) operators (e.g., add, multiply, transform, substitute, insert, apply .... ). 9 Provide simple sequencing control of the disguise, outsourcing, retrieval, and disguise inversion actions. To initially illustrate the nature of these programs, we express the simple examples of Section 1.3 as disguise programs. The informal language used is somewhat verbose in order to avoid defining many details explicitly; it is intended to be self-explanatory. Matrix Matrix
Key
K
(M1)
Permutations
Matrix
(l:m),
(1:m),
P3(i,j) Matrix =
~,
PI
Y = P2
~I,
~,
(1:m,1:m),
Pl(i,j) P2(i,j)
X
PI,
=
o(i)
= =
= =
maxf
Key
j
~(i)
if ~ 2 ( i )
=
j
~(i)
if
=
j
P(1)
X,
= a;
~3(i)
otherwise ANS
Z,
Y]
K = Quadrature
(2:6),
P(7)
Integrate of
Interval
minf
P using
(k(1))
[1,2])
otherwise
(1.4.2)-
= max(f,a,b);
Vector
Y,
h(x);
k(i)
otherwise
0
* P3 -I
f(x),
P3
(k(2),
* M2
Key
GUNZF
using
GuxlF
=
0
Quadrature
Function
P2,
~3
MI
Sub
if ~ I ( i )
Outsource [Product, Return Z ANS = PI -I * Z * P3
Master
~2
~ using
= 0 (1:m,1:m), X, * MI * P2 -I * M2
(1.4.1):
= AmY+JennA-Tygar-BeaU,
m = Dimension Vector
Multiplication
1.3.2
[a,b]
= min(f,a,b)
GuxiF
= b
f(x)
f(x)
(k(1),
a,
b)
on
a,b
[a,b]
235
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS Vector
(1:7),
V
2 * maxf)
Cubic II
=
h(x)
spline
g(x)
f(x)
+ g(x)
Integrate
=
Outsource Return
ANS
12
(1.4.3):
Key
Photo
(n,m)
=
K =
Array
i,j
=
Array
X,Y
=
=
GUNW OUNW
(1:10,1:10),
Cubic =
in
YCOR(I:4))
= YCOR(4)
Y(2:9)
spline =
1:10
(l:n,l:m)
i =
(1:4)
X(1),
Y(1),
with
SV
j =
+
1:m
(i-1)*H1,
= NUYCOR(4);
= A(1)-S1
= A(3)-$I
Photo
Amalfi2
Photo n,
Photo
Photo
II
Sphoto
m)
Out
*
= YCOR(1);
O,
r)
= V(i,j)
for
YCOR(1) 2,
+
(j-1)*H2)
for
4)
+ A(2) +
I/A(4)
= NUXCOR(1)
= NUYCOR(1)
(YCOR(1)-YCOR(4))
* XCOR(1)
* YCOR(1) + SI*X; =
=
(SV,
[edges,
V =
I2
NUXCOR,
+ S2*Y
NUYCOR,
n,
(array(Amalfi.10.27.97),
= Sphoto
Outedges
Outsource
Y(j))
= A(3)
NUYCOR(3)
II
U =
Y(1)
(k(3),
= A(1)
NUXCOR(3)
= A(2)/(XCOR(1)-XCOR(4)) I/(A(4)
(k(1),
NUYCOR(4)
$I
I2
(Amalfi.10.29.97)
Y(IO))
S(X(i),
NUXCOR(4)
= A(3);
MAPS:
Amalfi.10.29.97
X(IO))
GUNIF
GUNW
A using
= NUXCOR(4);
=
photo
- YCOR(1))/(m-I)
NUXCOR(2)
$2
in
Corners
V using
= A(1);
NUYCOR(2)
eps]
= XCOR(4);
(k(2),
NUXCOR(1)
NUYCOR(1)
=
(k(1),
S(XCOR(1)
1:n,
Vector
1:7
-XCOR(1))/(n-I);
(YCOR(4) =
i =
Amalfi.10.29.97+ElmaGarMid
X(IO)
S(x,y)
(XCOR(4)
SV(i,j)
b,
for
minf,
(Amalfi.10.29.97)
= XCOR(1);
X(2:9)
a,
edges
(Amalfi.10.29.97)
Y(IO)
H2
Edges
(1:10),
X(1)
HI
Find
Dimension
(XCOR(I:4),
Vector
= V(i)
exact)
h,
Amalfi.10.29.97
r = Max
g(P(i))
a,b)),
11
-
Edges Master
with
(g,a,b,
[Integrate,
I2
=
GUNW (f(GuNIF ( k ( 2 ) ,
using
+ Amalfi2 Out]
m)
NUXCOR,
NUYCOR,
236
MIKHAIL J. ATALLAH ET AL.
Return Photo
Outedges
Amalfi.edges
Differential
=
Y(x),
al(x),
Boundary
Conditions:
(L):
= f,
L = y"
ODE:
Ly
BC
maxf
= max(f,a,b);
Master
Key
Vector
(2:6),
P using
Vector
(1:7),
V using
P(1)
V(3)
Cubic
ODE2:
= a;
spline Ly
a2(x),
f(x),
+ al(x)
BC
=
minf
(a,
* y' b,
Solve u(x)
+ a2(x)
Y1,
Y2)
= min(f,a,b)
GUXlF
(k(1),
a,
(k(2),
minf,
P(7)
= b
V(6)
= a2(P(6))
g(x)
= f + u,
GUNIF
with
Conditions:
BC2
Outsource [Solve-ODE, R e t u r n z(x) ANS
(1.4.4):
XCOR,
YCOR,
2 point
n,
m)
BVP
* y
K = Addis+aBABA+Ethiopia.1948
= f(P(3)),
Boundary
(outedges),
Equations
Function
Operator
(Array
g(P(i))
BC2
(a,
ODE2,
b,
b)
= V(i) Y1
2*maxf)
for
+ u(a),
solution
i = 1:7 Y2
+ u(b))
= z]
= z(x)-g(x)
We observe the following from these four simple disguises.
• Even simple disguises are rather complex.
• The disguise procedure must be recorded carefully.
• Minor changes in the disguise programs (even if syntactically and semantically correct) change the outsourced problem significantly or, equivalently for a fixed outsourced problem, change the original problem significantly.
• The disguise acquires security both from the random numbers used and from the complexity of the disguise process.
• The Atri environment should provide much higher-level ways to specify disguises than these examples use. It should, however, allow one to tailor a disguise in detail as these examples illustrate.
3. Applications

3.1 Linear Algebra
These disguise algorithms are taken directly from Atallah et al. [5] except that the disguise of solving a linear system has been modified to enhance the numerical stability of the outsourcing process.
3.1.1 Matrix Multiplications
A fairly satisfactory disguise is given in Section 1.4.1. The following method provides even greater security by adding in a dense random matrix. The following scheme hides a matrix by the sparse random matrices Pi or their inverse; the resulting matrix is further hidden by adding a dense random matrix to it. The details follow. 1. Compute matrices X = P1M1P] 1 and Y = P2M2P31 as in Section 1.4.1. 2. Select two random n x n matrices S1 and $2 and generate four random numbers/3, 7,/3', 7' such that
(~ + -y)(~' + -y')(-y'b - -#~') r 0. 3. Compute the six matrices X + S1, Y+ $2, /3X-',/S1,
/3Y-'TS2,
/ 3 ' X - 7'$1, /3' Y - 7'$2. Outsource to the agent the three matrix multiplications W = (X + S1)( Y + $2)
(1)
U = ( f i X - 7 S1)(/3Y- 7S2)
(2)
U= (~'X-
(3)
7'S1)(/3' Y - 7'$2)
which are returned. 4. Compute the matrices
v = (~ + -y)-I ( u + ;~-yw) V' = (/~' -+- "/') - I ( u , + flt.)/t W).
(4)
(5)
Observe that V =/3XY + 7S1 $2, and V' = / 3 ' X Y + 7'$1 $2. 5. Outsource the computation ( .~ ' ~ -
.~ ~ ' ) - ~ ( .~ ' v -
.~ v ' )
which equals X Y (as can be easily verified--we leave the details to the reader). 6. Compute M1M2 from X Y by
PllXYP3 = Pll(p1M1Pzl)(PzMzP31)p3 = M1M2.
3. 1.2 Matrix Inversion The scheme we describe to invert the n x n matrix M uses secure matrix multiplication as a subroutine. 1. Select a random n x n matrix S. The probability that S is noninvertible is small, but if that is the case then Step 4 below sends us back to Step 1.
238
MIKHAIL J. ATALLAH ETAL.
2. Outsource the computation to the agent (6)
f/l- MS
using secure matrix multiplication. Of course, after this step the agent knows neither M, nor S, nor M. 3. Generate matrices P1, P2, P3, P4, P5 using the same method as for the P1 matrix in Steps 1 and 2 in Section 1.4.1. That is, P l ( i , j ) - ai~,(i).j, ^
P2(i,j)--bi~Tr2(i).j,
P3(i,j)-ci~Tr3(i).j,
P4(i,j)-di~Tr4(i).j,
and
Ps(i,j)-
ei67rs(i).j, where 7rl, 7r2, 7r3, 7r4, 7r5 are random permutations, and where the ai, bi, r di, ei are random numbers. Then compute the matrices -1
Q-
Pl~lP~ 1 - P1MSP z
R - P3SP4-~ 9
(7) (8)
4. Outsource the computation of Q-1 and, if it succeeds, return Q-I. If Q is not invertible, then the agent returns this information. We know that at least one of S or M (possibly both) is noninvertible, and we do the following: (a) Test whether S is invertible by first computing S - $1 SS2 where $1 and $2 are matrices known to be invertible and outsource S to the agent for inverting. Note: The only interest is whether S is invertible or not, not in its actual inverse. The fact that we discard S makes the choice of $1 and $2 less crucial than otherwise. Hence $1 and $2 can be generated so they belong to a class of matrices known to be invertible (there are such classes). It is unwise to let $1 and $2 be the identity matrices, because by knowing S the agent might learn how we generate these random matrices. (b) If the agent can invert S then we know S is invertible, and hence that M is not invertible. If the agent says that S is not invertible, then we know that S is not invertible. In that case we return to Step 1, i.e., choose another S, etc. 5. Observe that Q-1 _ P2S-1 M - 1p ~-1 and compute the matrix T-
P4PzlQ-1P1P5
-1
.
It is easily verified that T is equal to P 4 S - 1 M - 1 p 5-1 . 6. Outsource the computation of Z-RT
using secure matrix multiplication. Of course the random permutations and numbers used within this secure matrix multiplication subroutine
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
239
are independently generated from those of the above Step 3 (using those of Step 3 could compromise security). Observe that Z - PsSP41P4S -1M -1P~l _ P3M-1 p~l.
7. Compute P 3 1 Z P 5 which equals M -1 The security of the above follows from: 1. The calculations of M and Z are done using secure matrix multiplication, which reveals neither the operands nor the results to agent A, and 2. The judicious use of the matrices P1, ..., P5 "isolates" from each other the three separate computations that we outsource to A. Such isolation is a good design principle whenever repeated usage is made of the same agent, so as to make it difficult for that agent to correlate the various subproblems it is solving (in this case three). Of course less care needs to be taken if one is using more than one external agent.
3. 1.3 Linear System of Equations Consider the system of linear equations M x = b where M is a square n x n matrix, b is an n-vector, and x the vector of n unknowns. The scheme we describe uses local processing which takes time 0(n 2) proportional to the size of the input. 1. Select a random n x n matrix B and a random number j E { 1,2, ..., n}. Replace the jth row of B by b, i.e., B = [B1, ..., Bj_ 1, b, Bj + l, ..., B,,]. 2. Generate matrices P1, P2, P3 using the same method as for the P1 matrix in Steps 1 and 2 in Section 1.4.1. That is, Pl(i,j)=ai~5~,(z)j, P2(i,j)=bir57r2(i),j, P3(i,j)=ci67r3(i)4, where 7rl, 71"2, 7I-3 are random permutations, and where the az, bi, ci are random numbers. 3. Compute the matrices f / I -- P 1 M P ~ _
1
-- PI B P 3-~ 9
(9) (lo)
4. Outsource to the agent the solution of the linear system f / x - / ~ . If 3)/ is singular then the agent returns a message saying so; then we know M is singular. Otherwise the agent returns
2--~t
lB.
240
MIKHAIL J. ATALLAH ET AL.
5. Compute X - p ] l x P 3
which equals M -IB, since
P2' f(P3 - P 2 1 f I - ' f~P3 - P z ' P e M - ' P , BP31p3 - M - ' B . 6. The answer x is the jth column of X, i.e., x - ~.. The security of this process follows from the fact that b is hidden through the expansion to a matrix B, and then M and B are hidden through random scalings and permutations. 3. 1.4
Convolution
Consider the convolution of two vectors M1 and M2 of size n, indexed from 0 to n - 1. Recall that the convolution M of M1 and M2 is a new vector of size 2 n - 1, denoted M - M1 | M2, such that min(i. ,1 - 1)
M(i) -
Z
k=O
M1 (klM2(i - k).
The scheme described below satisfies the requirement that all local computations take G(n) time. 1. Choose vectors $1, S: of size n randomly. Also choose five positive numbers cx,/3, 7, 13', 7' such that
(~ + c~-r)(~' + c~7')(~'~ - -r~3') r o. 2. Compute locally the six vectors c~M1 + S1, aM2 + $2, / 3 M 1 - 7S1, /3M2 - 7 S 2 , /3'M1 - 7'S1, f3'M2 - 7'$2. 3. Outsource to the agent the three convolutions: W -- (c~M1 + S1 ) | (o~M2 + $2)
(11)
U - - ( / ~ M 1 - 7S1) @ ( ~ M 2 - 7 8 2 )
(12)
U'-
(~'M1 - 7'S1) @ (~'M2 - 7'S2)
(13)
which are returned. 4. Compute locally the vectors
v - (~ + ~ 7 ) - l ( ~ g + ~Tw) v ' - (/3' + ~-r')-~(~u ' + ~'-r' w).
(14) (15)
Observe that V - c~/3M1 @ M2 + 7S1 | $2, and V ' - c~/3'M1 | M2 + 7'$1 @ $2.
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
241
5. Compute o~-~ (7'/~ - ~/~')
~(,~' v - 7 v'),
which equals M1 | M2.
3.1.5 Hiding Dimensions in Linear Algebra Problems The report by Atallah et al. [5] systematically describes in detail how to hide the dimensions of linear algebra problems by either increasing or decreasing them. Here we describe their techniques in a less detailed form. The basic ideas are seen from the multiplication of M1M2 where M1 and M2 are of dimension a • b and b • c, respectively. First note that one can add k random rows to M1 a n d / o r k random columns to M2 and then just ignore the extra rows a n d / o r columns generated in the product. One may enlarge b by adding k columns to Ml and k rows to M2 which are related. The simple relationship proposed by Atallah et al. [5] is to take the oddnumbered extra columns of M1 and the even-numbered rows of M2 to be identically zero. The other additional elements in M1 and M2 could be chosen randomly. The result is that the product of the augmented matrices is the same as M1M2. To reduce the dimensions a or c, one can merely partition M1 by rows or M2 by columns and perform two smaller matrix multiplications. To reduce b one can partition both M1 and M2 of compatible dimensions and then compute M1M2 by eight smaller matrix multiplications. The total arithmetic work is unchanged. To enlarge the dimension n in inverting the n • n matrix M one uses the same scheme as for matrix multiplication and at Step 4 the matrix Q is augmented by a k • k random matrix S so that the matrix
(Q 0) 0
S
is reducible. The inversion of this larger matrix is done using the matrix inversion algorithm which hides the special structure of the enlarged matrix. To reduce the dimension n partition Q into X
Y).
V
W
242
MIKHAIL J. ATALLAH ETAL.
If X and Y - W X -1 V are invertible (and such a partition can be made) then the inverse of Q is the partitioned matrix X
1
_.1_X -1 VD-I WX-1
_X-1 VD-1) D_ 1
_D-1 WX-1
To decrease the dimension n in the linear system M x -- b one can use the scheme used for matrix inversion. To enlarge n we create the system
0)(x) (b)
o
s
sy
),
where S is a k • k invertible matrix (say, random) and y is random vector. One then applies the previous algorithm which hides the special structure of the enlarged matrix. The zero block matrices above can be replaced by random matrices with a minor change in the right side. To increase the dimension n of the problem M1 @ M2 one can merely pad the vectors M1 and M2 by adding k zeros. To decrease the dimension we first note that m l @ M2 can be replaced by three convolutions of size n/2
(M(1even) -F M(1~ even _
@ (M~even) + M~ ~ O
M(1~
even _ A/f(~ )
@ M (odd) 2
where M (~ M (even) mean the vector of odd and even indexed elements of M, respectively. From these convolutions one can easily find the products on the right side of the relationships
(M1 @ M2) (odd) _ M (even)l @M~~
+ M (1~
@'"2All(even)
(M1 @ M2) (even) - M(1even) @ M 2(even)+ Shift[M(lOdd)@ M~odd)] where Shift(x) shifts the vector x by one position.
3.2
Sorting
Consider the problem of sorting a sequence of numbers E = {el, ..., en}. This can be outsourced securely by selecting a strictly increasing function f : E--~ E, such as f ( x ) = c t + /3 (x +~/) 3 where/3 > 0. The scheme we describe below assumes this particular f ( x ) . 1. Choose a,/3, and -y so that/3 > 0.
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
243
2. Choose a random sorted sequence A - { A l , .... hi} of l numbers by randomly "walking" on the real line from MIN to M A X where MIN is smaller than the smallest number in E and M A X is larger than the largest number in E. Let A - ( M A X - M I N ) / n and the random "walking" is implemented as follows. (a) Randomly generate A1 from a uniform distribution in [MIN, MIN + 2A]. (b) Randomly generate A2 from a uniform distribution in [~1, ~l -{-- 2A]. (c) Continue in the same way until you go past MAX. The total number of elements generated is I. Observe that A is sorted by the construction such that the expected value for the increment is A, therefore the expected value for l is ( M A X - M I N ) / A - n. 3. Compute the sequences
E' - . f (E)
(16)
A'-f(A).
(17)
w h e r e f ( E ) is the sequence obtained from E by replacing every element
ei by f (ei). 4. Concatenate the sequence A' to E', obtaining W = E ' U A'. Randomly permute W before outsourcing it to the agent, who returns the sorted result W'. 5. Remove A' from W' to produce the sorted sequence E'. This can be done in linear time since both W' and A' are sorted. 6. Compute E = f - I(E'). The above scheme reveals n since the number of items sent to the agent has expected value 2n. To modify n, we can let A = ( M A X - M I N ) / m in Step 2, where m is a number independent of 17. The argument at the end of Step 2 shows that the expected value for the size of A is m, therefore the size of the sequence the agent received is m + n, and we can hide the size of problem by expanding the size this way.
3.3 Template Matching in Image Analysis Given an Nx N image I and a smaller n x n image P, consider the computation of an ( N - n + 1) x ( N - n + 1) score matrix Ct. p of the form I7-
C,.p(i,j) - ~ k=0
1
n-
~ k'=0
1
f (I(i + k,./ + k'), P(k, k')),
0 <<,i,j <~N - n,
244
MIKHAIL J. ATALLAH ETAL.
for some function f. Score matrices are often used in image analysis, specifically in template matching, when one is trying to determine whether (and where) an object occurs in an image. A small Ci.p(i,j) indicates an approximate occurrence of the object P in the image I (a zero indicates an exact occurrence). Frequent choices for the function f are f (x, y) - (x - y) 2 and f ( x , y ) = I x - y ] [26, 27]. We consider how to securely outsource the computation of C for these two functions.
3.3. 1 The Case f (x, y ) - ( x - y)2 The score matrix for
f ( x , y) - (x - y)2 _ x 2 + y2 _ 2xy can be viewed as the sum of three matrices, each corresponding to one of the above three terms on the right-hand side. The matrix that corresponds to the x 2 (respectively, y2) term depends on I (respectively, P) only and is easily computed in O(N 2) (respectively, O(n2)) time local operations, i.e., without need for outsourcing. The matrix that corresponds to the term containing xy is essentially a two-dimensional convolution that can be outsourced using the same method that we described earlier for one-dimensional convolutions.
3.3.2
The Case f (x, y l - J x - Y l
The
problem for f ( x , y ) - I x - y l f (x, y) - max(x, y) because
reduces
to
the
problem
for
] x - Y I - max(x, y) + m a x ( - x , -y). So we focus in what follows on the case f ( x , y ) = m a x ( x , y ) . The algorithm for this uses as a subroutine a two-dimensional version of the securely outsourced convolution technique in Section 3.1.4. Let A be the alphabet, i.e., the set of symbols that appear in I or P. For every symbol x r A we do the following: 1. We replace, i n / , every symbol other than x by 0 (every x in I stays the same). Let/,, be the resulting image. 2. We replace every symbol that is ~<x by 1 in P, and replace every other symbol by 0. Let Px be the resulting image. Augment Px into an N • N matrix IIx by padding it with zeros. 3. Outsource the computation of the score matrix n-1
Dx(i,j) - ~ k=O
n-1
Z k'
!,.(i + k,j + k')IIx(k, k'),
O<~i,j<~N-n.
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
245
This is essentially a two-dimensional convolution, and it can be securely outsourced using a method similar to the one given in Section 3.1.4. 4. We replace, in the original P (i.e., not the P modified by Step 2), every symbol other than x by 0 (every x in P stays the same). Let PI,- be the resulting image. Augment p.t into an N x N matrix H'~ by padding it with zeros. 5. We replace every symbol that is < x by 1 in (the original) L and every other symbol by 0. Let I'~ be the resulting image. 6. Outsource the computation of the score matrix 11-
Dlr(i'J') - Z k=0
1
n-
1
Z
Ii,(i + k,j + k')l]x(k, k'),
O<<,i,j<~N-n.
k'
This is done securely as in Step 3. 7. Compute locally C,. p -
Z
(Dx + D.',:).
.x'C A
Thus the computation of CI.p for the case f(x,y)=max(x,y) (and therefore for f ( x , y ) = ] x - y ]) can be done by means of O(] A ]) twodimensional convolutions (which can be securely outsourced). This is reasonable for small-size alphabets (binary, etc.). However, for large alphabets, the above solution has a considerable extra cost compared to computing CI, p directly. The number of convolutions can be reduced by using convolutions only for symbols with many occurrences in I and P, and "brute force" (locally) for the other symbols. However, this still has the disadvantage of having a considerable local computational burden. Thus the solution for the case f ( x , y) = Ix - y I is less satisfactory, for large I A ], than the solution for the case f(x, y)= ( x - y)2. This algorithm illustrates that "programming scientific disguises" requires a relatively complete programming language which must be supported by the Atri system. Of course, as a problem solving environment, Atri would have high-level natural constructs to invoke such a common procedure as this one. But, for novel situations, Atri also needs to provide the capability to write "ordinary" procedural programs. The disguised program for this algorithm uses the external procedure called 2D-Convolution and the outsourced computations are in it. The disguise program is as follows: Master Matrix Matrix
Key = S i x t e e n - T w e l v e - E i g h t 9 4 0 3 , (I:N,I:N), I (l:n,1:n), P
Sub Key
k(i)
246
MIKHAIL J. ATALLAH ET AL.
For g = 0 to L do Matrix (I:N,I:N),
It (i,j) = 0 !/ I ( i , j ) ~ g else It(i,j) = g~ Matrix (1:n,1:n), Pt(i,j) = I tf P(i,j)~
Of course this disguise constitutes only a first step in the general secure outsourcing of image analysis; the literature contains many other measures for image comparison and interesting new ones continue to be proposed (for example, see Boninsegna and Rossi [28] and the papers it references).
3.4
String Pattern Matching
Let T be a text string of length N. P be a pattern of length n(n <<,N), both over an alphabet A. We seek a score vector Cr. p such that Cv.p(i) is the number of positions at which the pattern symbols equal their corresponding text symbols when the pattern is positioned under the substring of T that begins at position i of T, i.e. it is ~-~,,-I ~57/k+ ;). P(i) where ~5,. equals one if x - y and zero otherwise. For every symbol x E A we do the following: '
k = 0
. ,v
1. Replace, in both T and P, every symbol other than x by 0, and every x by 1. Let Tx and P.,- be the resulting text and pattern, respectively. Augment Px into a length N string IIx by padding it with zeros.
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
247
2. Outsource the computation of /7-
Dx(i)- Z
1
/x(i + kllI,-(k),
O<~i<~N-n.
k=0
This is essentially a convolution, and it can be securely outsourced using the method given in Section 3.1.4. It is easily seen that CT.p equals Y~'~.,-cA Dx.
4. Security Analysis 4.1
Breaking Disguises
The nature of disguises is that they may be broken completely (i.e., the disguise program is discovered) or, more likely, they are broken approximately. That is, one has ascertained with some level of uncertainty some or all of the objects in the original computation. For the example program Quadrature (1.4.2) of Section 2.4, one might have ascertained. Object C o m p u t a t i o n = Integration Interval = [a, b] A l(x) with ] f - A1 [~< 0.25 A2(x) with I f - A2]~< 0.05 . 13 with [ 13 - ANSI <<0.8 14 with ] 14 - ANSI <~0.1 . 15 with ] 15 - ANSI <<0.02
Certainly 100% 100% 60% 8% 47% 4% 0.03%
Here A 1(x) and A2(x) are functions that have been determined somehow. Thus we see that there is a continuum of certainty in breaking a disguise which varies from 100% to none (no information at all). Indeed, an attacker might not even be able to identify all the objects in the original computation. On the other hand, an attacker might obtain all the essential information about the original computation without learning any part of the disguise program exactly. The probabilistic nature of breaking disguises comes both from the use of random numbers and the uncertainty in the behavior of the disguise program itself (as mentioned earlier). Of course, an attacker could only guess at the levels of certainty about the object information obtained. This situation again illustrates the difference between disguise and encryption; in the latter one usually goes quickly (even instantly) from no
248
MIKHAIL J. ATALLAHET AL.
information to a compete break. Thus one cannot expect to have precise measurements of the degree to which a disguise is broken or of the strength of a disguise. Indeed, the strength is very problem dependent and even attacker dependent. For one computation an attacker may be completely satisfied with +15% accuracy in knowledge about the original problem while in another computation even +0.01% knowledge is useless.
4.2
Attack Strategies and Defenses
Three rather unrelated attack strategies are discussed here. We exclude nonanalytical strategies which, for example, incorporate knowledge about the customer (e.g., the company does oil/gas exploration and the customer's net address is department 13 in Mobile, Alabama), or which attempt to penetrate (physically or electronically) the customer's premises. History suggests that the nonanalytical strategies are the most likely to succeed when strong analytical security techniques are used.
4.2. 1 Statistical Attacks Knowledge of the Atri environment or casual examination immediately leads the attacker to attempt to derive information about the random number generators used. A determined attacker can check all the numbers in the outsourced computation against all the numbers particular random generators produce. This is an exhaustive match attack. While the cycle lengths of the generators are very long, one cannot be complacent about the risk of this approach as we see teraflops or petaflops computers coming into use. There are four defenses against this attack: 1. Use random number generators with (real and random) parameters. With 32 bit reals for two parameters, this increases the cost of an exhaustive match of numbers by a factor of about 1014. This should be ample to defeat this attack. Note also that an exact match no longer breaks the random number generator as subsequences from random sequences with different parameters will "cross" from time to time. On the other hand, sometimes one is constrained in the choice of parameters, e.g., it might be necessary that the numbers generated "fill" the interval [0, 1] and not go outside it. 2. Restart the random number generators from time to time with new subkeys. Similarly, one can change the random number generator used from time to time. Thus using a new seed for every 1000 or 10 000 numbers in a sequence of 10 000 000 numbers greatly reduces the value of having identified one seed or parameter set.
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
249
3. Use combinations of random number sequences as mentioned in Section 2.2.1. If certain simple constraints are required, e.g., all values lie in [0, 1], one can use a rejection technique to impose such constraints. In summary, we conclude that even modest care will prevent an exhaustive match attack from succeeding. An alternate attack is to attempt to determine the parameters of the probability distribution used to generate a sequence of random numbers. We call this a parameter statistics attack and it is illustrated by the simple example in Section 1.2. In general, one can estimate the moments of the probability distribution by computing the moments of the sample sequence of generated random numbers. The mean of the sample of size N converges to the mean of the distribution with an error that behaves like O(1 x/--N). This same rate of convergence holds for almost any moment and any distribution likely to be used in a disguise. This rate of convergence is slow but not impossibly so a sample of 10 000 000 should provide estimates of moments with accuracy of about 0.03%. There are four defenses against this attack: 1. Use random number generators with complex probability distribution functions. This forces an attacker to estimate many different moments; and also to estimate which moments (and how many) are used. If problem constraints limit the parameters of a distribution, one can often apply the constraints after the numbers are generated via rejection techniques, etc. 2. Restart the random number generator from time to time with new subkeys. Restricting sequences to only 10 000 per subkey limits parameter estimation to about 1% accuracy; shorter sequences limit parameter estimation accuracy even more. 3. Use random number generators whose probability distribution function contains multiple random parameters, e.g., a cubic spline with five random breakpoints. Then, even the accurate knowledge of several (many?) moments of the distribution function provides low accuracy knowledge about the distribution function itself. This aspect of defense is related to approximation theoretic attacks discussed next. 4. Use data values to generate seeds for random number generators and, sometimes, replace randomly generated values by actual data values or other data-dependent values.
4.2.2 Approximation Theoretic Attacks The disguise functions are chosen from spaces described in Section 2.2.1. Let F be the space of these functions, u(x) be an original function, and f(x) be a disguise function so that g(x)= u(x)+f(x) is observable by the agent. The
250
MIKHAIL J. ATALLAH
ETAL.
agent may evaluate g(x) arbitrarily and, in particular, the agent might (if F were known) determine the best approximation g*(x) to g(x) from F. Then the difference g*(x)-g(x) equals u*(x)-u(x) where u*(x) is the best approximation to u(x) from F. Thus g*(x)- g(x) is entirely due to u(x) and gives some information about u(x). There are three defenses against this attack: 1. Choose F to have very good approximating power so that the size of g*(x)- g(x) is small. For example, if u(x) is an "ordinary" function, then including in F the cubic polynomials and the cubic splines with five or 10 breakpoints (in each variable) gives quite good approximation power. One would not expect I]g*(x)-g(x)]1/]1 u(x)1] to be more than a few percent. If u(x) is not "ordinary" (e.g., is highly oscillatory, has boundary layers, has jumps or peaks) then care must be taken to include functions in F with similar features. Otherwise the agent could discover a great deal of information about u(x) from g(x). 2. Choose F to be a one time random space as described in Section 2.2.1. Since F itself is then unknown, the approximation g*(x) cannot be computed accurately and any estimates of it must have considerable uncertainty. 3. Approximate the function object u(x) by a high accuracy, variable breakpoint piecewise polynomial. It is known [25] that this can be done efficiently (using a moderate number of breakpoints) and software exists to do this in low dimensions [18]. Then, one adds disguise functions with the same breakpoints and different values to the outsourced computation. Underlying all these defenses is the fact that if F has good approximation power and moderate dimension, then it is very hard to obtain any accurate information from the disguised functions. This same effect is present in the defense against statistical attacks where the function to be hidden is the probability density function of the random number generator.
4.2.3
Symbolic Code Analysis
Many scientific computations involve a substantial amount of symbolic input, either mathematical expressions or high-level programming language (Fortran, C, etc.) code. It is natural to pass this code along to the agent in the outsourcing and this can compromise the security. An expression COS(ANGLE2*x-SHIFT)+BSPLINE(Atx) is very likely to be the original function ( C0 S 99.) plus the disguise ( BS P L I NE. 9.) and they can be distinguished no matter how much the BS P L I NE function values behave like r 0 S as a function. The symbolic information may be pure mathematics
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
251
or machine language or anything in between. Outsourcing machine language is usually impractical and, in any case, provides minimal security. Fortran decompilers are able to reconstruct well over 90% of the original Fortran from machine language. Thus we must squarely address how to protect against symbolic code analysis attacks on high-level language or mathematical expression of the computation. There are four general defenses against symbolic code analysis attacks: 1. Neuter the name information in the code. This means deleting all comments and removing all information from variable names (e.g., name them A followed by their order (a number) of appearance in the code). This is an obvious, easy but important part of the defense. 2. Approximate the basic mathematical functions. The elementary builtin functions (sine, cosine, logarithm, absolute value, exponentiation, ...) of a language are implemented by library routines supplied by the compiler. There are many alternatives for these routines which can be used (with neutered names or in-line code) in place of the standard names. One can also generate one t i m e e l e m e n t a o ' f u n c t i o n a p p r o x i m a tions for these functions using a combination of a few random parameters along with best piecewise polynomial, variable breakpoint approximations. In this way all the familiar elementary mathematical operators besides arithmetic can be eliminated from the code. 3. Apply symbolic transformations. Examples of such transformations are changes of coordinates, changes of basis functions or representations, and use of identities and expansions of unity. Changes of coordinates are very effective at disguise but can be expensive to implement. Consider the potential complications in changing coordinates in code with hundreds or thousands of lines. The other transformations are individually of moderate security value, but they can be used in almost unlimited combinations so that the combinatorial effects provide high security. For example, we transform the simple differential equation ~ + (x'~ + l o g ( x ) ) y - 1 + x 2 y " + x * cos(x)v' using only a few simple symbolic transformations such as cosZx - s i n Z y - cos(x + y)cos(x - y) secZ(x + y) - tan 2(x + y) - 1 (tan x + tan y ) / t a n ( x + y) + tan x tan y = 1 1 -+- x - - ( 1
-- x 2 ) / ( 1
-- x )
sin(3(x + y)) -- 3 sin(x + y) -- 4 sin3(x + y) a 2 - a x + x 2 -- (a 3 + x 3 ) / ( a + x )
252
MIKHAIL J. ATALLAHET AL. plus straightforward rearranging and renaming. The result is quite complicated (Greek letters are various constants that have been generated)" (/3 cos 2 x - 6)y" + x[cos x / ( 7 cos(x + 1)) - cos x sin(x + 1)tan(x + 1)] * [c - sin2x + c sin(x + l) - sin2x sin(x + 1)]y' + [/3(x cos x) 2 - q(x + log x) + 0 cos x log
X 2]
* [r/sin x + ~5 tan x + [X sin x + # cos x + u)/tan(x + 2)]y = (1 + x2)[sin x + 7/cos x]. If we further rename a n d / o r reimplement some of the elementary functions and replace the variable names by the order in which the variable appears, this equation becomes y"[x01 9 x 0 2 ( x ) - x03] + y'[x04 9 x/(x05 cos(x + 1) + cos x 9xO6(x)tan(x + 1)] , [x07 - sin 2 x - x08(x)sin2 x + x07 sin2(x + 1)] + y [ x 0 1 , (x 9 x09(x))2
_ X10(X
nL- log
x) + x l 1 cos x log x 2]
* [x12 * x l 3(x) + x14 tan x + (xl 5 sin x + x l 6 cos x + x l 7)] = sin x + x18 9 (1 + x 2) 9 x09(x) + x l 9 ( x ) + x l 0 * x 2 cos x. It is hard to say at this point how difficult it would be for a person to recover the original differential equation. It certainly would take a considerable effort, and this disguise uses rather elementary techniques. 4. Use reverse communication. This is a standard technique [29] to avoid passing source code into numerical c o m p u t a t i o n s and can be used to hide parts of the original computation. Instead of passing the code for u(x) to the agent, one insists that the agent send x to the customer who evaluates u(x) and returns the value to the agent; the agent never has access to the source code. If u(x) is very simple, c o m m u n i c a t i o n adds greatly to the cost of evaluating it. But it is more likely that u(x) is complex (otherwise another disguise would be used) and that the extra c o m m u n i c a t i o n cost is not important.
4.3
Disguise Strength Analysis
A systematic or general analysis of the security of disguises is not practical at this time for two reasons. First, the disguises are very problem dependent
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
253
so each type of computation, perhaps even each individual computation, must be analyzed using different techniques. Second, disguises can be of essentially unlimited complexity; and this complexity is not of a linear nature (e.g., a function of the number of bits in the keys), but is a disorganized conglomeration of unrelated techniques. The strength of disguises is derived from this complexity but it also tends to defeat the analysis of the resulting strength. Before one becomes uncomfortable with the strength of disguises, one must realize that a determined agent will be able to apply enormous resources in attacks. It may be feasible soon to analyze every code fragment to see if it computes a standard mathematical function and, if so, use this to simplify the disguise. It may be feasible to generate systematically disguise programs such as those in Section 2.3 (including choices for the numerical values of parameters) and see if they generate the outsourced computation. Further, it is possible that some disguise techniques that look strong now will become easy to defeat once they are studied in depth. Nevertheless, we are optimistic that the complexity possible for disguises will allow this approach to withstand attacks for the foreseeable future. This optimism comes from the analysis of just three simple numerical computations given below. We note that it is harder to disguise simple computations than complicated ones. We also note that historically one of the biggest sources of weak security is that people become bored and lazy in applying security properly. Thus for the disguise of complex computations, it is imperative that tools such as the Atri system be available to minimize the effort of disguise and the risk of accidental holes in them. To illustrate the strength possible for disguises and the analysis techniques one can use, we consider disguises of three simple, common scientific computations.
4.3. 1 Matrix Multiplication We use the disguise program of Section 2.4 modified to (a) have controlled lengths of sequences from random number generators, and (b) to use better disguised random number generators. Let L be the maximum length for a sequence from a random number generator so that M = [m/L] is the number of distinct generators needed. Let GONE(A(i)), i = 1,2, ..., M, be one time random number generators as described in Section 2.2.1. Each has a vector A(i) of 12 random parameters/seeds. The fourth line of the Section 2.2.1 program can be replaced by V e c t o r (1:m) M = [MIL 1
~,
/3,
254 For
MIKHAIL J. ATALLAH ET AL. j=1
to M do
Vector
(1:12)
A(j)
using
V e c t o r (I:L) TI, T2, T3 o((j-1)L+i) = T1(i) for /3((j-1)L+i) = T2(i) for
End
~((j-1)L+i) do
= T3(i)
for
GL,,~TF ( k ( j + l ) ,
u s i n g GONE A(j) i = I:L i = I:L
[0,13)
i = I:L
This change in the program creates the nonzero vectors for the three matrices P1, P2, P3 used to disguise M1 and M2. An agent attacking this disguise receives X = P 1 , M 1 , P2 -1 and Y = P2 9M2 9P3 -1 The only attack strategy possible is statistical. If the agent has no information about M1 and M2, there is no possibility of determining M1 and M2. Not even the average size of their entries can be estimated with any accuracy. Thus this disguise is completely secure. If it is possible that the agent has external information about M1 and M2, then one should take steps to disguise that type of information. For example, the class of problems one normally deals with might have some "typical" sparsity, sign, or size patterns. These patterns can also be disguised, but we do not discuss this topic here.
4.3.2
Numerical Quadrature
We use the disguise program of Section 2.4 modified to have a second disguise function added. One potential weakness of the earlier disguise given is to an approximation theoretic attack. The g(x) created is quite smooth and f ( x ) might not be. So we replace the 7th line of the program as follows: Cubic spline g1(x)=approximation v a r i a b l e b r e a k s , I%)
V e c t o r (I:L) V e c t o r (I:L) Cubic s p l i n e Cubic spline Cubic spline
(f,a,b,
cubic
spline,
x = breakpoints of g1(x) R u s i n g GUNIF ( k ( 3 ) , 0.8, 1.31) g2(x) with g 2 ( X ( i ) ) = R ( i ) * g ( X ( i ) ) for i = I:L g3(x) w i t h g 3 ( P ( i ) ) = V(i) for i = 1:7 g(x) = g 3 ( x ) + g2(x)
This change effectively modifies f (x) randomly by about 25% and then adds another, smooth random function to it. An agent attacking this disguise can try both approximation theoretic and code analysis strategies. The disguise so far only protects against approximation theoretic attacks. We see no way that the values of h(x) sent to the agent can lead to accurate estimation of the result, say better than about 100% error. If we choose to protect against a code analysis attack using reverse communication, then the security is complete. If we choose to
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
255
protect against a code analysis attack by replacing the f ( x ) by a highaccuracy approximation, then the security is again complete. To do this one just replaces 1% in the above code, by, say, e p s and then defines
h(x) = g(x) + g2(x) + g3(x).
4.3.3 Differential Equations We assume that the differential equation is of the form
al (x)y" + az(x)y' + a3(x)y = a4(x)
y(a) = yl, y(b) = y2.
If reverse communication is used for the evaluation of the functions ai(x) then no symbolic information about them is available to the agent. There is no opportunity for statistical attack so the only feasible attack is approximation theoretic. Program (1.4.4) of Section 2.4 can be used to disguise the solution y(x). If similar disguises are applied to the four functions ai(x), then neither the solution nor problem could be discovered by the agent and the disguise would be completely secure. Since reverse communication can be expensive, we consider the alternative of using symbolic disguise such as in item 3 of Section 4.2.3. A symbolic attack is made "bottom up", that is, one attempts first to identify the elements (variables, constants, mathematical functions) of the symbolic expressions. There are two approaches here:
9 Program analysis. For example, deep in a subroutine one might set X137 = X09 and later, in another subroutine set X 1 1 - X137. This can be done either with constants or actual variables. Or, one can replace sin(X11) by X12(X11) and implement X12 with machine code for sine; there are several standard alternatives for this implementation. 9 Value analysis. For example, one can check to see if the value of X9, l / X 9 or (X9) 2 is the same as any of the other constants in the program. Similarly, one can evaluate X12(x) for a wide variety of x values and check to see if it is equal (or close) to sin(x), log(x), tan(x), X3, etc. The use of program substitution or modification can greatly increase the strength of a disguise. Indeed, it is clear that if the symbolic code is lengthy, then a modest amount of program disguise scattered about provides enormously strong security. However, we focus our analysis on shorter symbolic forms such as the differential equation program which are completely defined by a single symbolic equation and "simple" functions. That is, how hard is it to break
256
MIKHAIL J. ATALLAH ET AL.
disguises based on introducing mathematical identities? The symbolic code attacker would first check: 1. Which constants are related? 2. Which functions are standard mathematical functions?
For question 1, suppose that there are N constants c; in the equation and we ask if any c; is r(ci) for a collection of relationship functions r. If the number of relationships r used is M, then this checking requires evaluating N M constants and then testing if any pairs are equal. The latter effort is work of order N M log M (using sorting) and values of N = 20, M = 500 are reasonable. Thus, the relationship of pairs of constants could be determined by sorting them or using about 105 operations. Such a computation is well within the capability of the agent, so we assume there is no security in binary constant disguises, i.e., if X1 = 1.108, X2 = sec(1.108) and X3 = F(1.108) are present, then the agent will discover this fact. Consider the next three-way relationship such as X03 = oL2/cos(2),
X15 = 1 + c~,
X12 =c~ tan(2)
which appears in the example at hand. In the notation used above, one considers if any ci is ri(r2(ci)r3(ck)). The number of constants computed is M 3 N 2 (perhaps 5. 101~ The pairwise comparison requires work of order M 3 N 2 log(MN) and the total comparison work is of the order of M 3 N 2 log(MN). With N = 20 and M = 500, this is a computation of order about 1012 (including a factor of 10 or so for the function evaluations). This is a substantial, but not outrageous, computation today and will be less formidable with each passing year. We believe that disguises using four-way relationships, e.g., X03 = c~2fl/cos(2), X17 = c~ tan(2)H1/2(/3),
X15 = 1 + 6~, X23 = log 2
will not be broken in the foreseeable future. We estimate the computational work to be the order of 1016 operations assuming M = 500, N = 20. There is another consideration in using values to discover relationships among constants: there are only about 10 l~ different 32-bit numbers and only about l0 s of them in any one broad range of sizes. Thus, the pairwise comparison approach must lead to many "accidental" matches, perhaps 100-1000 for three-way relationships. Since each match is a starting point for a complex disguise attack, each of these accidental matches initiates a
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
257
lengthy additional analysis for the attacker. This means that even three-way relationships among constants provide a very high level of security. Finally, another level of complexity can be obtained by splitting constants. Thus the 1.108 used above can be split three times by replacing X1 = 1.108, X2 = sec(1.108), and X3 = F(1.108) with X1 - 0 . 0 6 3 4 , X2 - 1.0446, X3 - X1 + X2 X4 - -0.6925, X5-
1.8005,
116 - sec(X4 + X5) - 1/(cos X4 cos X5 - sin X4 sin X5) X7-
1.000,
X8 - 0.108, X 9 - F(X7 + X8) which makes all the constants unique. The task of checking all pairs of sums is not insurmountable but applying simple identities as follows makes the task of identifying the number 1.108 extremely complex" X6-
1/(cos X4 cos X5 - sin X4 sin X5)
X9 - X8 9 1~(X8). Note that this is another way to use data dependence in the disguise program. For question 2, consider a function F(x) and we can ask: Is F one of the standard mathematical functions? Is F(x) related to another function G(x)? First, we note that if F(x) is parameterized by a random real number c~, e.g., sin(c~x) or u(c~, x) (the parabolic cylinder function), then it is hopeless to identify F by examining its values. It requires at least 100 times as much effort to compare two functions as it does two constants, and the parameter c~ increases the number of potential equalities by a factor of about 10 s Thus, checking two-way relationships between standard functions and parameterized functions is a computation of order 105 (what we had for constants) times 100 times 10 s or about order 1015. Consider then the problem of deciding if F(x) is one of, or related to one of, the standard mathematical functions G~(x). There are perhaps 2 5 - 5 0 standard functions (x/, sin, log, cosh .... ) and perhaps five times as many in simple relationships (1/log x, 1 + log x, 2 , log x, log(1 + x), l o g ( 2 , x), ...).
258
MIKHAIL J. ATALLAH ETAL.
Then there is the uncertainty in the range of arguments, so one might try five ranges (e.g., [0.1,0.2], [1.5, 1.8], [12, 13], [210,215], [ - 1, 1]). Thus, F(x) should be compared to a known Gi(x) about 1000 times. If one comparison requires 1000 operations and there are 20 functions F(x), then the total computing effort is the order of 20 9 1000 * 1000 = 2 9 107 operations. This is easily done and there is no security in simple, single function disguises; a diligent agent will identify every one of them. Consider the next two-way function disguises, e.g., X1 (x) = cos(x)sin(x + 1), X3(x) = (1 + x2)sin(x),
X2(x) = cos(x)/tan(x + 2), X4(x) - cos(sin(x + 1)).
There are perhaps 100 simple functions (as above), so the number of pairs is 104 , and the number of simple combinations is five times this. Thus the work of 2 9 107 operations seen above increases to the order of 1011 or 1012 Thus the use of two-way function disguise provides strong security and three-way function disguises, e.g., X1 (x) = cos(x)sin2(x + 1)e x, X2(x) = cos(x)/tan(x + 2), X3(x) = e-X(1 + x2)sin x, X4(x) = x 2 tan(x + 2)/sin(x) provides complete security. This not only "hides" the original symbolic information but it introduces data dependence into the disguise in a more complex way. Consider now the situation where no disguises of functions or constants are made or, equivalently, when the agent is successful in breaking all these disguises. Question 3 is then:
3. How well can one identify the identities and/or partitions of unity used and remove them from the problem? The process needed here is essentially the same as simplifying mathematical expressions by using manipulations and identities. This task is addressed in some symbolic computer systems and is known to be a formidable challenge even for polynomials and the elementary mathematical functions. We are aware of no thorough complexity analysis of mathematical simplification, but results relating the process to pattern matching suggest that the complexity is very high. The original differential
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
259
equation can be represented as a tree and the process of using identities and simple manipulations is an expansion and rearrangement of this tree. It is plausible that the expansion process involves (a) some functions with random parameters, (b) some functions that already appear in the expression, and (c) some completely unrelated common functions. Let us assume that for each identity used there are three simple manipulations, e.g., combining terms, splitting terms, rearrangements. The number of choices for (a) is enormous, but these functions might be clues to simplification by using the assumption that they are unlikely to be part of the original problem. The number of choices of identities for (b) is moderate but still substantial, perhaps 6-10 identities for each original function. The number of choices for (c) is substantial, perhaps 100 distinct partitions of unity exist involving the common functions. Many more can be obtained by simple manipulation. Assume now that the differential equation is simple, involving, say four common functions. Assume its tree representation has 10 nodes (four of which cannot be modified, being the = and differential operators). Further assume the disguise process uses N1 identities involving random functions, N2 identities for each original function, and N3 identities involving unrelated functions. Finally, assume that each identity is represented by a tree with six nodes. We obtain a rough estimate of the number of possible trees for the disguise as follows. The tree has 10 + N1 + 4 N2 + N3 "fat nodes" (a fat node represents an identity used). The number of choices for the N1 identities is, say, four per function. The number of choices for the N2 identities is, say, eight. The number of choices for the N3 identities is four per new function introduced. The final tree has 10+6(N1 + 4 N2 + N3) nodes. The possible number of disguise trees is enormous, perhaps the order of 4 N'. 8 N2 .4 u3. If N1 = N 2 - N3 = 3, there are about 106 possible trees; increasing 3 to 4 in these values gives more than 108 trees, and this estimate does not include the effect of the simple manipulations. The fact that the number of possibilities is so huge does not automatically mean the simplification is impossible. It does, however, suggest that the simplification is a very formidable computation, one that provides very strong security.
4.3.4
Domains of Functions
Another view of symbolic code disguises and analysis comes from identifying domains of symbolic functions. A domain is a collection of symbolic elements that are related naturally by having some simple operations which transform one element of a domain into another. Examples include: 9 Numbers: integer, real, complex. The operations are arithmetic.
260
MIKHAIL J. ATALLAH ET AL.
9 Polynomials: In one variable, examples are x, x 3, ..., 17 + 3x + 9x 4, .... 9 + 3xy + In several variables examples are x, xy, x y 2,..., 6.042x3ySz 4, .... The operations are again arithmetic, both numerical and polynomial (e.g., (x + 2) 2 - x 2 + 4xy + 4y2). 9 Algebraic" Examples 4 + x + 1.03x 3/2
in
one
variable
include
X 0"073 X 0146 X 1"073 2 + 7X 0"073 § 5 4 3 X ~
x 1/2 x 7/2
+ l S x 2"146
These domains are closed under multiplication by the "base element", i.e., by constants, by x, by x or y, by x or particular fractional powers of x, respectively. 9 Trigonometric" Examples in one variable include sin x, cos 2 x, 17 + 3.1 sin x + 14.2 cos2(3x). This domain is closed under multiplication by "base elements" sin x and cos x, by constants, and by identities involving sines, cosines, tangents, cotangents, and secants.
There are many other domains involving higher transcendental functions which have natural operations (particular to each domain) under which they are closed. Examples include hyperbolic functions--exponentials, logarithms, G a m m a , Bessel, etc. The span of an expression is the union of the domains of the various elements (terms) in the expression. The span of a problem, equation, operator .... is the union of the spans of the expressions composing the problem, equation, operator, .... The dimension of the span is the number of domains in the span. The principal techniques for disguising an expression within a domain are: 1. Use standard identities to eliminate all the original elements, e.g., 4 + 3.79x 2 + 8x 4 ~ 1.729 + 2.271 + 3.74x 2 + 0.05x 2 + 6.204x 4 + 1.796x4 1.729 + 2.271 + 0.05(x + 1)(x - 1) + 0.05 + 6.204x 4 + 1.796(x 2 + 1)(x 2 - 1 ) + 1.796 5.846 + O.05(x + 1)(x - 1) + 3.74x 2 + 6.204x 4 + 1.796(x2 + 1)(x + 1 ) ( x - 1) 5.846 + 1.801 (x + 1)(x - 1) + 3.74x 2 + 6.204x 4 + 1.796(x2 + 1)(x + 1 ) ( x - 1).
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
261
2. Use partitions of unity (e.g., cos2x + sin2x - 1) to introduce new terms into the expression. Thus 4 + 3.79 cos x + 8 sin2x ~ 1.729 + 2.271 + 3.79/sec x + 4(1 - cos 2x) 1.729 + 2.271(sin2x + cos2x) + 3.79/sec2x + 4 + 4 sin 2x - 4/sec2x. 3. Use partitions of unity to introduce new domains into the expression. Thus 4.0 + 3.79x 2 ~ 1.729 + 2.271(sin2x + cos2x) + 3.79(x - 1)(x + 1) + 3.79 5.519 + 2.271 sin2x + 2.271(cos2x- 1/sec2x) + 3.79(x - 1)(x + 1). We note that: (i) For constants there are no partitions of unity or other useful disguise identities within the domain. (ii) For polynomials there are no partitions of unity as 1 is a polynomial. However, factoring provides disguise opportunities. (iii) Partial fractions provide a way to easily transform to rational functions. From 1/(x- 2 ) + 1/(x- 4) we have 2x 2 - 10x+ 12 1-
x- 2 +
x2-6x+8
. 4-x
Using such a partition of unity we can expand the domain of polynomials to the domain of rational functions. It is clear that one can expand the complexity of an expression without limit. It is well known that simplifying complex expressions is very difficult. It is well known that one can replace common terms (e.g., sin x, x/~, log x) by polynomial or piecewise polynomial approximations that are highly accurate and that there are multiple choices for doing this. The use of data dependence in selecting, removing, and replacing symbolic and numerical values further adds to the complexity of simplifying expressions. The basic scientific questions to be answered are: QI-
How does one measure the strength of such disguises, e.g., what is the complexity of the symbolic expression simplification problem? We believe that the answer to this question is not known but that experience shows that this complexity grows very rapidly with the number of terms and with the dimension of the expression. Simplification is an intrinsic problem for symbolic algebra systems
262
MIKHAIL J. ATALLAH ET AL.
and they provide a number of tools to aid people in the process. Clever people using these tools often have difficulty finding suitably "simple" forms of expressions even when no direct effort has been made to disguise the symbolic structure.
Q2:
Given that one starts with a particular expression (k terms, dimension d) and wants a disguise with K terms, how strong can the disguise be and how is the strongest disguise determined?
It is clear that Q2 is much harder than Q1, but it focuses the process of disguise clearly: One wants to limit the size of the disguised expression while making the disguise as strong as possible. Chieh-Hsien Tiao has developed the following scheme for automatically generating disguises within a domain by combining random selections of partitions of unity (and other identities if available), including partitioning constants. This creates an algorithm for making symbolic disguises.
Disguise Algorithm. Assume we are given a function f ( x ) and a collection C of m different sets of partitions of unity. All functions involved lie in a particular domain and each set involves at most n functions, i.e., the collection is C - {(f(i,j)(x)}'/= 1}'i'~ I. Let q be the disguise depth and d(x) the current disguised function. Then set d ( x ) - f ( x ) and Fork=l toqdo (a) Randomly permute the positions of the elements of d(x). (b) For each term in d(x) apply a partition of unity chosen randomly from C. Make a final random permutation of the elements of d(x). This algorithm expands the number of terms in d(x) exponentially in q, i.e., the final number of terms in d(x) is of the order of t * n q where t is the number of terms in the originalf(x). The collection C and the random objects in steps (a) and (b) are the keys to this disguise. We illustrate this algorithm with a simple example. Assume we want to disguise f ( x ) = x, and we have three partitions of unity involving two functions each. {{cos2(x), sinZ(x)}, {sec2(x),-tan2(x)}, {x, 1 - x}}. Choose q = 3 and set d ( x ) = x. 1. (a) With only one term d ( x ) = x, no permutation is needed here. (b) Choose a random number between 1 and 3, say 2.
SECURE OUTSOURCING OF SCIENTIFIC COMPUTATIONS
263
(c) A p p l y the s e c o n d p a r t i t i o n of unity on x to o b t a i n d(x)
-
x
9
sec2(x) - x 9 tan 2(x).
2. (a) C h o o s e a r a n d o m p e r m u t a t i o n of { 1,2}, say {2, 1}. (b) C h o o s e a r a n d o m n u m b e r for each term, say {2, 3}. (c) A p p l y 2a a n d 2b to o b t a i n d(x)
-
-
x *
tan2(x) + x * sec2(x)
= - x 9 tan 2(x) 9 sec2(x) + x 9 tan 4(x) + x 2 * sec2(x) + x * (1 - x) * sec2(x). 3. (a) C h o o s e a p e r m u t a t i o n o f { 1,2, 3, 4}, say {3, 4, 1,2}. (b) C h o o s e a r a n d o m n u m b e r for each term, say {3, 2, 3, 1}. (c) A p p l y 3a a n d 3b to o b t a i n d(x)
-
x 2 *
sec2(x) -k- x * (1 -- x) * secZ(x) -- x 9 tan2(x) * sec2(x)
+ x * tan4(x) = x 3 * sec2(x) + x 2 * (1 -- x) * sec2(x) + x * (1 -- x) 9 sec4(x) -- x * (1 -- x) 9 sec2(x) 9 tan2(x) -- x 3 , tan2(x) * sec2(x) -- x 2 9 (1 -- x) * tan2(x) 9 sec2(x) + x * tan 4(x) 9 cos2(x) + x 9 tan 4(x) * sin2(x). 4. C h o o s e a final r a n d o m p e r m u t a t i o n , say { 1,3, 7, 6, 2, 5, 4, 8} to o b t a i n f(x)
-
x -
x 3 9
sec2(x) + x , (1 - x) 9 sec4(x)
+ x , tan4(x) 9 cos2(x) - x 2 9 (1 - x) 9 tan2(x) * sec2(x) + x 2 9 (1 - x) 9 sec2(x) - x 3 , tan2(x) * sec2(x) - x , (1 - x) 9 sec2(x) 9 tan2(x) + x , tan4(x) 9 sin2(x) = d(x).
5. T h e key to retrieve
f(x)
from
d(x)
is
{3, {1}, {2}, {2, 1}, {2, 3}, {3,4, 1,2}, { 3 , 2 , 3 , 1}, {1,3, 7, 6 , 2 , 5 , 4 , 8 } } . Q u e s t i o n s that arise n a t u r a l l y include: Q3:
U n d e r w h a t c i r c u m s t a n c e s is it a d v a n t a g e o u s to increase the s p a n o f the disguised expression?
264
MIKHAIL J. ATALLAH
ETAL.
Q4: Is the best choice (i.e., providing strongest disguise with a given
Q5:
4.3.5
number of terms) to reduce everything to piecewise polynomials (i.e., include logic and constants)? Note that one can disguise the common routines for x, x " , a x, log x, etc., so their well-known constants are not used directly. Can one analyze the simplest disguise problem, namely that for constants? Given C1, C2, C3 ..... Ck and C*, is there a way to write C* = c~1 C1 + c~2C2 + . . . + c~kCk where the c~k are "small" integers, i.e., - M ~
Code Disguises
There is another class of disguises which can further add to the security of symbolic disguises. These are not based on mathematical methods but rather on programming language (code) transformations. These techniques are called code obfuscations and are primarily intended to provide "watermarks" for codes, i.e., to change the code in such a way that it computes the same thing but is different from the original code. The changes made must be both unique to each version and difficult to reverse-engineer. In other words, the original code must be difficult to obtain from the obfuscated code. A review and taxonomy of these techniques is given in by Collberg et al. [30]. These obfuscations can be applied to the actual computer programs for the symbolic portions of an outsourced computation and further increase the security of the symbolic content. In summary, our analysis of disguise strength shows that: 9 9 9 9
Reverse communication provides complete security. Disguises of constants provide strong-to-complete security. Disguises of functions with random parameters provide complete security. The difficulty of mathematical simplification (removal of identities) provides very strong, probably complete, security. 9 The use of code obfuscation techniques provides another level of security for symbolic disguises.
These conclusions depend, of course, on the effort put into the disguise. Only moderate effort is needed to achieve these conclusions; substantial effort will provide complete security. The combined effect of these analyses suggests that even modest effort may provide complete security.
SECUREOUTSOURCINGOF SCIENTIFICCOMPUTATIONS
5.
265
Cost Analysis
We have identified three components of the cost of disguise and these are analyzed here.
5.1
Computational Cost for the Customer
A review of the disguise techniques proposed shows that all of them are affordable in the sense that the computation required of the customer is proportional to the size of the problem data. For some computations, e.g., solving partial differential equations or optimization, the cost can be several, even five or 10, times the size of the problem data. However, these are computations where the solution cost is not at all or very weakly related to the problem data, i.e., very large computations are defined by small problem statements. The principal cost of the customer is, in fact, not computational, but in dealing with the complex technology of making good disguises. The envisaged Atri problem-solving environment is intended to minimize this cost and it is plausible that an average scientist or engineer can quickly learn to create disguises that provide complete security.
5.2
Computational Cost for the Agent
A review of the disguise techniques shows that some disguises might dramatically increase the computational cost for the agent. Since the customer is probably paying this cost, this is our concern. The effects are problem dependent and in many (most?) cases the disguise has a small effect on the agent's computational cost. None of the disguise techniques proposed change the basic type of the computation. Thus, numerical quadrature is not re-posed as an ordinary differential equations problem. Nor do we propose to disguise a linear system of equations as a nonlinear system. Such disguises might provide high security, but they are not necessary. But the nature (and cost) of a computation is sometimes affected greatly by fairly small perturbations in the problem.
5.2. 1 Preservation of Problem Structure We define problem structure to include all those aspects of a problem which affect the applicability of software or problem solving techniques or which strongly affect the cost of using a technique. Examples of structure
266
MIKHAIL J. ATALLAH ET AL.
that are important to preserve in disguises are:
1. Sparsity. In optimization and partial differential equation problems the sparsity pattern of an array can reflect important information about the problem. One can usually change the sparsity by applying some "solution preprocessing" such as augmenting or reducing the number of variables. However, these changes, especially augmentation, might have a substantial negative impact on the efficiency of solution algorithms. How much impact depends on the specific problem and its disguise.

2. Function smoothness and qualitative behavior. Function smoothness and other qualitative behaviors (e.g., rapid oscillations or boundary layers) could provide important information about a problem. These behaviors also have an important impact on the efficiency of many algorithms, even rendering some of them ineffective. Thus the disguise should mimic these behaviors whenever possible (the third item in Section 4.2.2 presents a way to do this). Further, the disguise should not introduce any such behavior if this could have a substantial negative impact on algorithm efficiency or applicability.

3. Geometric simplicity. Geometry is sometimes a very important part of the critical information about a computation, so it is important to disguise it using domain augmentation, splitting, or transformations. All of these can introduce problem features that increase computational cost. For example, introducing a reentrant corner into the domain of a partial differential equation problem can make the computation more expensive by orders of magnitude. Disguise techniques must be careful to avoid such changes in the geometry.

4. Problem simplicity. Some simple problems allow exceptionally efficient solution techniques. For example,

   u_xx + u_yy + 2u = f(x, y),   (x, y) in a rectangle,

can be solved by FFT methods. One can disguise the solution or map the rectangle into a similar one and still use the FFT method; other disguise techniques are likely to make the FFT method unusable and to require a substantially more expensive computation for the solution.

Since problem structure can be a very important attribute of a computation from the security standpoint, care must be taken to disguise this structure as well. We see from the examples above that the disguise of structure can have a large negative impact on the cost of the computation. In most cases the structure can be disguised without this impact if care is
taken. The case of extremely simple computations seems to present the hardest challenge to disguise well while maintaining efficient computation. In some of these cases it may be necessary to make the computation more expensive. Most of these computations are "cheap" due to their simplicity, so perhaps the extra cost for security will be tolerable.
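As an illustration of the sparsity point in item 1 above, the sketch below (NumPy; the disguises are generic randomizations, not the chapter's disguise program) compares a permutation-plus-scaling disguise, which leaves the nonzero count of a sparse matrix unchanged, with multiplication by a dense random matrix, which hides more but destroys the sparsity a solver would exploit.

```python
# Effect of two candidate disguises on sparsity, assuming NumPy. Both the
# permutation/scaling disguise and the dense-multiply disguise are
# illustrative assumptions, not transformations prescribed by this chapter.
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Sparse test matrix (stored densely for simplicity): ~2% nonzeros plus the diagonal.
A = np.where(rng.random((n, n)) < 0.02, rng.standard_normal((n, n)), 0.0) + np.eye(n)

# Disguise (a): random row/column permutations plus positive diagonal scalings.
# Ordering and magnitudes are hidden, but the nonzero count is unchanged.
p, q = rng.permutation(n), rng.permutation(n)
d1, d2 = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)
A_perm = d1[:, None] * A[p][:, q] * d2[None, :]

# Disguise (b): multiplication by a dense random matrix. The result is
# essentially dense, so the structure a sparse solver exploits is lost.
A_dense = rng.standard_normal((n, n)) @ A

print("original nonzeros:      ", np.count_nonzero(A))
print("permute+scale nonzeros: ", np.count_nonzero(A_perm))
print("dense-multiply nonzeros:", np.count_nonzero(A_dense))
```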
5.2.2 Control of Accuracy and Stability
A computation is unstable if a small change in it makes a large change in its result. Care must be taken so that disguises do not introduce instability. There are two typical cases to consider:

1. The original computation is stable. This means that moderate perturbations can be made safely, but how does one measure "moderate"? There is no general technique to estimate quickly and cheaply the stability of computations. Since the disguises involve "random" or "unpredictable" changes, it is highly unlikely that a disguise introduces instability; but highly unlikely does not mean never. For some computations, stability can be estimated well, and at low cost, during the computation itself, and agents providing outsourcing services should supply stability estimates for such computations. For other computations the customer may need to estimate and control the stability of the computation.

2. The original computation is not very stable. This means that moderate, even minor, perturbations are unsafe if made in the "wrong direction." Even if the disguise perturbs the computation in a "safe direction" (making it more stable), the consequence is that the inversion of the disguise becomes unstable. The agent would correctly report that the computation is stable, yet if the customer is not aware of the potential instability in the disguise inversion, a complete loss of accuracy could occur with no warning.

Of course, the customer should always be held responsible for understanding the stability properties of the original problem. It is the "duty" (or at least the goal) of the numerical algorithms to preserve whatever stability exists in the computation presented to them, and they should report any large instability they sense. Since the disguise and its inversion are in the hands of the customer, the agent has no additional responsibility. One can envisage that the Atri system could provide the user with tools and aids for evaluating the effects of disguises on stability.
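A small numerical experiment makes the concern concrete. In the sketch below (NumPy; the disguise factors are illustrative assumptions, not prescriptions from this chapter), an orthogonal random factor leaves the conditioning of a linear system unchanged, while a badly scaled random diagonal factor inflates it by many orders of magnitude, degrading both the disguised computation and the inversion of the disguise.

```python
# Conditioning before and after two random disguises, assuming NumPy.
import numpy as np

rng = np.random.default_rng(1)
n = 100
A = rng.standard_normal((n, n))

# Orthogonal disguise factor: Q @ A has the same 2-norm condition number as A.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Badly scaled diagonal disguise factor: entries spanning many orders of
# magnitude typically inflate the condition number dramatically.
D = np.diag(10.0 ** rng.uniform(-6, 6, n))

print("cond(A)     :", np.linalg.cond(A))
print("cond(Q @ A) :", np.linalg.cond(Q @ A))   # equal to cond(A) up to roundoff
print("cond(D @ A) :", np.linalg.cond(D @ A))   # usually many orders larger
```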
5.3 Network Costs
The disguises can increase the network traffic costs by increasing (1) the number of data objects to be transmitted, and (2) the bulk of the data objects. A review of the disguises proposed in this paper shows that the number of objects is rarely changed much from the original computation. However, the bulk of the individual objects might change significantly. Coordinate changes and the use of identities can change functions from expressions with five to 10 characters to ones with many dozens of characters. This increase is unlikely to be important: in most computations with symbolic function data, the size of the data is very small compared to the size of the computation, so the network cost of the input data is negligible. The disguise techniques used for integers at the end of Section 1.2 increased their length from 8 bits to 9 bits; similar schemes could increase their length from 1 byte to 2 bytes. This might double the network costs for some computations. The network cost in returning the result is less likely to be increased, but it can be for some computations.
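The arithmetic behind the byte-count claim is easy to check. The sketch below does not reproduce the Section 1.2 disguise itself; it uses a hypothetical additive random offset simply to show how one extra bit of disguised range pushes a value across a byte boundary and can double its transmission cost.

```python
# Byte-aligned growth of a disguised integer, using a hypothetical
# additive-offset disguise (not the scheme of Section 1.2).
import math

def disguised_width_bits(value_bits: int, offset_bits: int) -> int:
    # Largest possible disguised value: max value plus max random offset.
    max_disguised = (2 ** value_bits - 1) + (2 ** offset_bits - 1)
    return max_disguised.bit_length()

bits = disguised_width_bits(8, 8)
print("disguised width:", bits, "bits")             # 9 bits
print("bytes on the wire:", math.ceil(bits / 8))    # 2 bytes instead of 1
```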
5.4 Cost Summary
In summary, we conclude that:

• Customer costs for disguise are reasonable and linear in the size of the computation data. The principal cost is the "intellectual" effort needed to make good disguises.

• Agent costs increase minimally unless the customer changes the problem structure. Controlling problem structure increases the intellectual effort required of the customer, and there might be especially simple computations where a good disguise requires changing the problem structure.

• Network cost increases vary from none to modest; rarely will this cost be doubled.
6. Conclusions
This chapter shows that one can securely outsource a wide variety of scientific problem solving. A number of basic disguise techniques are presented, but it is observed that none of these apply to all problems. The disguise must be tailored to the class of problems being solved. Success in disguise has been achieved so far for every problem attempted, but this first analysis is biased toward considering problems where disguise appeared to
be possible. In some cases in this chapter, it took considerable study and time to find a disguise procedure. It is plausible that, as problems become more complicated, finding effective disguises will become more difficult.

This chapter focused on problems where the solving effort is large compared to the problem size. This approach assumes a context where the critical resource is computing power. There are also many applications where the critical resource is software availability. For example, many organizations outsource their payroll and tax computations, not to save on computing time, but to gain access to complex software systems. The techniques presented here can be used for many such applications, but the range of applicability has not been explored.

This chapter focused on mathematical problem solving, and various mathematical structures and tools are used in the disguises. Many applications have a lot of structure, but it is not of a classical mathematical nature. Examples include payroll computations, disease diagnosis from medical records, document formatting and layout, preparing tax returns, and processing credit applications. All these examples require complex software that uses the very specific and complicated structure of the applications; computing time is secondary for most of them. We believe that disguise techniques can be found for such applications, but we have not explored this possibility.

This chapter sketches a problem-solving environment with an associated "natural" language for disguise programs to manage the outsourcing. For very high-value applications (e.g., outsourcing seismic data analysis) one expects the disguise would be done on an ad hoc basis just for that application. However, widespread use of secure outsourcing requires that an easy-to-use system be created to support the process. It is hard to predict the effort needed to create such a problem-solving environment, but it clearly requires a major investment, and it is not clear that there is sufficient demand for such a system to justify this investment.

As the computing world moves to a netcentric structure, the use of outsourcing will become routine. What is unclear is whether there will be sufficient concern about the security of outsourced computations. Security depends on the strength of the various disguise techniques used. We are confident that randomization can provide as much security as desired against statistical and approximation-theoretic attacks. We are less confident about the strength of symbolic disguises; not because we believe they are less secure, but because the mathematical framework for defining, analyzing, and testing their security is much less developed. In Sections 4.3.3 and 4.3.4 we posed a number of open questions about the strength of symbolic disguises. Such questions need careful study and are likely to be very hard to answer. Consider the
following analogy. Suppose a thousand years ago one decided to disguise two numbers, 1789 and 4969, by computing their product 8889541 and sending it out. One could hope the two numbers are secure inside 8889541, but how could one prove it? After a thousand years of study of primes, we now have confidence about the amount of work needed to factor 8889541. But even today we do not have any proof that factoring integers (which is just simplifying an expression) is truly hard; someone could find a magical new factoring algorithm at any time.

Another general area of open questions is how well disguises preserve various important properties of the problem. For example: Is the disguise of a self-adjoint PDE still self-adjoint? Is the disguise of a positive definite matrix still positive definite? Is sparsity preserved by matrix disguises? Is the convergence rate of a solution algorithm affected by the disguise? Is the numerical stability of the problem lost by applying a disguise? The preservation of such properties must be verified on a case-by-case basis, and it is important to do so. All the linear algebra disguises in this chapter have been tested extensively to verify that they preserve numerical stability. During this testing we discovered that an earlier disguise for solving a linear system of equations was somewhat less stable than desired; one would commonly lose two or three more digits of accuracy than expected. A modification was found that eliminated the loss of stability, but the underlying "cause" of the loss of accuracy was not discovered.

In summary, the secure outsourcing of scientific computation is practical in many areas. Yet there are many open questions about how to extend the applicability of disguise and about how to assess the strength of the security provided.
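As an indication of how such case-by-case checks can be organized, the sketch below (NumPy) tests numerically whether a candidate disguise preserves symmetry and positive definiteness and how it changes the condition number. The congruence-type disguise B = M A M^T is a generic illustration chosen because it provably preserves these properties; it is not the chapter's disguise procedure.

```python
# Case-by-case property checks for a hypothetical congruence disguise,
# assuming NumPy. Symmetry and positive definiteness are preserved by
# B = M @ A @ M.T for nonsingular M; the checks confirm this numerically.
import numpy as np

rng = np.random.default_rng(2)
n = 50

G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                       # symmetric positive definite test matrix

M = rng.standard_normal((n, n)) + n * np.eye(n)   # random, safely nonsingular
B = M @ A @ M.T                                   # the disguised matrix

def is_symmetric(X, tol=1e-10):
    return np.allclose(X, X.T, atol=tol * np.linalg.norm(X))

def is_positive_definite(X):
    try:
        np.linalg.cholesky((X + X.T) / 2)         # symmetrize to absorb roundoff
        return True
    except np.linalg.LinAlgError:
        return False

print("symmetry preserved:             ", is_symmetric(B))
print("positive definiteness preserved:", is_positive_definite(B))
print("condition number before/after:  ", np.linalg.cond(A), np.linalg.cond(B))
```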