Building Application Servers
Advances in Object Technology Series

Dr. Richard S. Wiener, Series Editor and Editor-in-Chief of Journal of Object-Oriented Programming, SIGS Publications, Inc., New York, New York, and Department of Computer Science, University of Colorado, Colorado Springs, Colorado

1. Object Lessons: Lessons Learned in Object-Oriented Development Projects • Tom Love
2. Objectifying Real-Time Systems • John R. Ellis
3. Object Development Methods • edited by Andy Carmichael
4. Inside the Object Model: The Sensible Use of C++ • David M. Papurt
5. Using Motif with C++ • Daniel J. Bernstein
6. Using CRC Cards: An Informal Approach to Object-Oriented Development • Nancy M. Wilkinson
7. Rapid Software Development with Smalltalk • Mark Lorenz
8. Applying OMT: A Practical Step-by-Step Guide to Using the Object Modeling Technique • Kurt W. Derr
9. The Smalltalk Developer's Guide to VisualWorks • Tim Howard
10. Objectifying Motif • Charles F. Bowman
11. Reliable Object-Oriented Software: Applying Analysis & Design • Ed Seidewitz & Mike Stark
12. Developing Visual Programming Applications Using Smalltalk • Michael Linderman
13. Object-Oriented COBOL • Edmund C. Arranga & Frank P. Coyle
14. Visual Object-Oriented Programming Using Delphi • Richard Wiener & Claude Wiatrowski
15. Object Modeling and Design Strategies: Tips and Techniques • Sanjiv Gossain
16. The VisualAge for Smalltalk Primer • Liwu Li
17. Java Programming by Example • Rajiv Sharma & Vivek Sharma
18. Rethinking Smart Objects: Building Artificial Intelligence with Objects • Daniel W. Rasmus
19. The Distributed Smalltalk Survival Guide • Terry Montlick
20. Java for the COBOL Programmer • E. Reed Doke, Ph.D. and Bill C. Hardgrave, Ph.D.
21. Building Application Servers • Rick Leander
Additional Volumes in Preparation
Building Application Servers

Rick Leander
CAMBRIDGE UNIVERSITY PRESS
SIGS BOOKS
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge, United Kingdom

CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011-4211, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain

www.cup.cam.ac.uk
www.cup.org

Published in association with SIGS Books

© 2000 Cambridge University Press

All rights reserved. This book is in copyright. Subject to statutory exception and to the provisions of the relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. Any product mentioned in this book may be a trademark of its company.

First published in 2000

Design and composition by Andrea Cammarata
Cover design by Andrea Cammarata

Printed in the United States of America

A catalog record for this book is available from the British Library.
Library of Congress Cataloging-in-Publication Data is on record with the publisher.
ISBN 0 521 77849 2 paperback
To Barb
Contents

Acknowledgments
Introduction
    Who Should Read This Book
    Organization (Part 1—Architecture; Part 2—Design; Part 3—Programming)
    How to Get the Program Code

PART 1 • ARCHITECTURE

Chapter 1: What Is an Application Server and Why Do I Need One?
    Two-Tiered vs. Multi-Tiered Computing
    Why I Chose Multi-Tiered Client/Server
    What Can an Application Server Do? (Scalability; Distributed processing; Reusable business objects; Business rule processing; Cross-platform integration)
    Costs and Disadvantages of Application Servers (Long-term commitment; Middleware acquisition; New ways to twist the brain; The end of the coding cowboy; Software reuse)
    Moving from Traditional Client/Server to N-Tier Computing
    Summary
    References

Chapter 2: Anatomy of an Application Server
    Overview of the Application Server Architecture
    Middleware: The Glue That Holds It Together
    Middleware Categories (Remote database protocols; Remote procedure calls; Distributed objects; Transaction processing monitors; Message brokers; Commercial application servers)
    Applying Middleware to the Application Server Architecture (Your best face forward: presenting a clean application interface; Business objects: modeling your business in software; Persistence: talking to the database)
    Alternative Application Server Architectures (The fourth layer; Data-centric application servers; Web server-based approaches)
    Putting It All Together
    Summary
    References

PART 2 • DESIGN

Chapter 3: Designing Application Servers
    Joint Application Design
    Business Object Design (Modeling business processes; Reuse; Design standards)
    Iterative Development (Why combine design and programming?; Self-directed technical review)
    Design Constraints (Layered design; Middleware matters; Integrating existing applications)
    A Brief Introduction to UML Notation (Diagrams and symbols; Use case diagrams; Class diagrams; Sequence diagrams)
    Meeting the End User's Needs
    Summary
    References

Chapter 4: Service Interface Design
    What Is a Service Interface?
    Design by Interface
    More on JAD: Developing Use Cases (Describe the context; Describe the actors; Describe the procedure; Describe exceptions; Use common language; Iterate and refine; A brief example; Making use cases work)
    Turning Use Cases into Services (The service is application-specific; The service is self-contained; The service handles all exceptions; The service hides the business object layer; The service conforms to standards; Bundling services into interfaces)
    Handling Errors and Exceptions (User interface errors; Application errors; System and network errors; Exceptions and interface design)
    Summary
    References
    Further Reading

Chapter 5: Designing Business Objects
    Moving from Interfaces to Objects (From data models to business objects; Choosing a design approach)
    What Exactly Is a Business Object?
    Finding the Objects in Your Business
    Defining the Objects
    Designing the Objects (Attributes; Methods; States; Events; Business object specifications)
    Object Interaction (Aggregation; Generalization and specialization; Association; Collections; Creating the class diagram)
    Application Server Issues and Constraints (Short business cycles; Reuse; Concurrency and synchronization; Repositories; Persistence)
    Linking Business Objects to the Service Interface (Developing sequence diagrams; Creating new business objects; Implementing services)
    Business Object Architecture
    Summary
    References
    Further Reading

Chapter 6: Designing the Persistent Object Layer
    The Role of the Persistence Layer
    Relational Database Principles (Database history; The relational data model; Structured query language (SQL); Database middleware)
    Designing a Persistent Object Layer (Persistence layer example; Generalized object servers; Tracking the objects; Objects and relational databases; Scalability)
    Using Object-Oriented Databases
    Using Objects to Represent External Applications
    Summary
    References
    Further Reading

Chapter 7: Integrating Existing Systems and Legacy Software
    Design Issues for Application Integration
    What Do We Have—Application Mining
    Turning Subroutines into Services
    Proxy Objects (How to access remote software)
    Input and Output Streams (Message-oriented middleware; Advanced sneaker net)
    Accessing Application Databases (Direct database access; Replication)
    Synchronizing Transactions
    Fun with Punch Cards: What to Do with Legacy Software
    Summary
    References
    Further Reading

PART 3 • PROGRAMMING

Chapter 8: Implementing an Application Server Framework
    The Application Server Framework (Initializing the framework; Processing service requests; Commercial frameworks; Choosing a framework strategy)
    Additional Framework Requirements (Scalability; Concurrency; Security; Fault tolerance)
    Development Strategies (Communications support; Development environment; Tools; Training; Metrics)
    Summary
    References

Chapter 9: Using Java to Build Business Objects
    Using Java to Illustrate Programming Principles
    Overview of the Distributed Java Architecture
    Object-Oriented Programming in Java (Java class definitions; Class composition in Java; Class association in Java; Class generalization in Java)
    Coding Guidelines in Java
    Using Interfaces to Package Objects
    Distributing Java Objects with RMI (Creating the remote interface; Creating a remote object; Creating the stub and skeleton; Registering the remote object; Accessing the remote object)
    Comparing Distributed Java with Other Middleware Architectures (Distributed objects in CORBA; Microsoft's DCOM)
    Summary
    References
    Further Reading

Chapter 10: Persistent Objects: Communicating with Databases
    An Overview of JDBC (JDBC architecture; SQL basics; Basic JDBC programming; Other database middleware)
    Creating a Persistent Object Framework
    A Simple Persistent Object Server (Joining business objects with relational databases; Tracking business objects; Serving up customer objects)
    Extending the Simple Object Server (Adding more objects; Serving up multiple objects from the same query; Serving up complex objects)
    Optimizing the Persistence Layer (Capacity planning; Minimizing database connections; Distributing business objects; Concurrency and synchronization; Optimizing throughput)
    Summary
    References
    Further Reading

Chapter 11: Interfaces and Client-Side Communication
    Client/Server Communication (Establishing remote communication; Processing remote communication; Server-side communication; Requesting services)
    Creating a Service Interface (Defining the service interface; Implementing the interface; Registering the service interface)
    Using the Service Interface (Accessing the services; Locating the data; Storing the data; Releasing the remote object)
    Passing Data, Objects and Properties (Primitives; Objects; Properties; Returning errors; Messages, events, and asynchronous communication)
    Summary
    References
    Further Reading

Chapter 12: Enforcing Business Rules
    What Is a Business Rule?
    Turning Business Rules into Code (Structure-based rules; Rules in code; Rules in data; Classification; Maintaining rule and classification tables)
    Where to Put the Code (User interface; Service interface; Business objects; Persistence; Database server)
    Standardized Error Handling (Standardized messages; Exception objects; Message handling; Error logs)
    Commercial Business Rule Engines
    Security and Authorization Strategies (Organizing security rules; Where to implement security)
    Summary
    References
    Further Reading

Chapter 13: Multiprocessing, Concurrency, and Transactions
    The Trouble with Multiprocessing (Multi-tasking; Multi-threading; Multiple objects; Multiple, synchronized data; Multiple data sources)
    Multiprocessing Within the Application Server (User interfaces; Service interfaces; Business objects; Persistent objects; Database servers)
    The Class Factory Model (Applying the class factory model; Creating the class factory object; Using the class factory; When to use the class factory)
    Multi-Threading (Implementing multi-threading; Synchronizing execution)
    Synchronizing Objects and Data (Locking at the database level; Locking at the object level; Locking at the persistence level; Resolving deadlocks)
    Transactions (Transaction basics; Implementing transaction objects; Commit or rollback; Two-phase commit; Commercial transaction monitors)
    Summary
    References
    Further Reading

Chapter 14: The Next Generation of Business Applications
    Clues from the Past (Increased automation; Ease of use; Business intelligence; Communications; How much farther can we go?)
    Emerging Component Standards (Microsoft's Distributed Internet Architecture; Enterprise JavaBeans; CORBA object monitors; Other contenders)
    The Application Software Marketplace (Off the shelf applications; The component marketplace; The open source bazaar)
    The Emerging Business Platform (Cheap computers; Palm-tops and cell phones; Pervasive computing; Where is it all going?)
    Final Thoughts
    References

Appendix: Setting up a Development Environment
    Development Using a Single Computer (Hardware requirements; Software requirements)
    Development on the Network (Network hardware; Software)
    Compiling and Testing Java and RMI (Step 1: set up a project directory; Step 2: compile the server and applet; Step 3: use rmic to create the stub and skeleton classes; Step 4: start the Web server and RMI registry; Step 5: start the application server; Step 6: run the applet; Running on a network; Where to get help; Setting up JDBC)
    Summary
    Sources for Software

Index
Acknowledgments
Many thanks to Lothlorien Hornet and the people at SIGS and Cambridge University Press for all of their guidance and help in making this book possible. Thanks also to technical editor Lisa McCumber for her many insights and to copy editor Matt Lusher, who transformed my ramblings into readable prose. Thanks also to Dr. James Gerlach at the University of Colorado at Denver for his excellent course, Distributed Object Computing, that sparked my interest in middleware and distributed processing. Thanks also to RFB&D for quickly providing reference materials. Finally, special thanks to Barb, my wonderful wife, for her encouragement, support, and assistance.
Introduction
You've read everything you can find about middleware, CORBA, transaction monitors, message brokers, Enterprise JavaBeans, and other distributed technologies. Now it's time to put them to work. Time to build your company's first multi-tiered application. But where do you start? How do you structure the programs? How do you distribute the code? What about integrating existing applications and databases?

This was the problem that I faced as I began working with multi-tiered development. There was plenty of information on the tools and technologies, but little on how to make them work in a business setting. Application servers and related technologies offer great promise and potential for solving the issues that trouble corporate computing: problems like scalability, application integration, and code reuse. But before we can solve these grand problems, we have to figure out how to use the technology. How do we process orders, ship products, bill customers, approve loan applications, and pay insurance claims?

My hope is that this book will offer some guidelines to start you on your way. Instead of focusing on middleware, the emphasis is on the design issues and programming techniques necessary to create an overall business application framework. The approach is user-centric, relying on joint development between developers and business people, using short, iterative design-program-review cycles. Object-oriented development is also stressed, with designs illustrated in UML and programming examples written for the Java platform. Although Java and RMI are used, the framework will work with almost any language or distributed object platform.
Who Should Read This Book

This book is primarily intended for software developers, the designers and programmers who have to take these new technologies and turn them into business solutions. It is written at a moderate technical level and assumes that you, the reader, are familiar with client/server or mainframe development in a business environment. You do not need to understand middleware or object-oriented programming, or be a Java programming wizard, but you should be familiar with relational databases and user interface design, and be able to read and understand program code. For those not familiar with some of the more technical topics, such as UML and distributed processing, the book provides enough background to get you started, then suggests additional references to fill in the details that are beyond the scope of this book.

Although the book is intended for software developers, the first two sections will be useful to business people working in a joint development team environment. These sections offer background on the development process and introduce the tools needed to create an effective design. Joint application design, use cases, and iterative development are concepts that must be understood by all team members. Managers can also read through these chapters to gain a better understanding of the benefits of the technology and the overall design process. Other information technology workers, such as network and database administrators, can also benefit from this book by gaining an understanding of these new technologies and processes.
Organization

To fully understand application server technology, it must be examined from several different perspectives: first from a high-level architectural view, then from the user's perspective, and finally from the programmer's vantage point. Not only does each perspective show different aspects of the technology, but the three together allow you, the reader, to ease into the many details that must be considered before you can understand how to make the technology work.
Part 1—Architecture

The book begins by examining what an application server is and how it can benefit the business. Benefits and drawbacks are listed, followed by a general overview of the technology. Once these are understood, the three layers of the application framework (the service interface, business objects, and persistence layer) are discussed in general terms.
Part 2—Design

The design section looks at the application framework from a user-centric view, examining how the layers perform business functions. The emphasis in this section is on specifying the business requirements through use cases and then creating a software design that will meet these needs.
Part 3—Programming

Once the layers have been viewed from a user-centric business perspective, the programming section examines each layer in even greater detail, offering techniques that can be used to create the program code that will perform the tasks specified during software design.
How to Get the Program Code

The source code for the program examples, as well as the full implementation of each program, can be downloaded from the Cambridge University Press site:

http://www.cup.org/Titles/77/0521778492.html

In addition to this site, the files can also be obtained from my personal Website at:

http://pages.prodigy.net/rleander

Once expanded, the files are distributed into directories by chapter, with program listings in the main directory and additional program code included in subdirectories underneath each chapter directory. Check the readme.txt file included in the primary directory for additional information.
Part 1

Architecture

Part 1 offers an overview of application server architecture, describing its benefits in the business environment and introducing its fundamental technologies, including multi-tiered client/server computing, distributed applications, and middleware.
Chapter 1
What Is an Application Server and Why Do I Need One?

Over the past year or so, quite a few software vendors have released packages they call application servers. Inprise, Oracle, BEA, and a number of others all have jumped onto the application server bandwagon, extending their product lines with products that target enterprise computing. So, what exactly is an application server? This chapter will explore the reasons why application server technology will play an important role in the next generation of enterprise computing. Topics include:

• Two-tiered vs. multi-tiered computing
• Why I chose multi-tiered client/server
• What can an application server do?
• Costs and disadvantages of application servers
• Moving from traditional client/server to n-tier computing
Two-Tiered vs. Multi-Tiered Computing

There are quite a few advantages to traditional two-tiered client/server. The database products are very mature, with heavy competition to constantly improve performance and features. Client-side development tools like Microsoft Access, Borland Delphi, and C++ Builder have become so easy to use that much of the code writes itself. Even the networks are easier to install and maintain.

But as most client/server developers soon discover, it is almost too easy. New applications multiply on the server and, with the constantly plunging price of computers, more clients keep coming on board. In no time at all, the server is overloaded. Even after all of the memory slots have been filled, more CPUs have been added, and thousands of dollars have been spent to upgrade the network, the users still complain that response time is too slow.

To solve this problem, the industry is now touting three-tier and n-tier (multi-tiered) client/server. The server itself becomes a network of computers that can grow to meet the increasing client demands. Instead of the client software communicating directly with the database server, a middle layer of software, called an application server, provides services to the client (see Figure 1-1). This minimizes the number of connections to the database server and spreads the processing over several computers. It also allows the client software to shrink, because much of the processing is passed to the application server. The client software can now become as simple as a form running on a Web page.

Since so much of the processing moves to the middle layer, building an application server can be a difficult, complex process requiring a whole new set of tools and skills. Every software vendor in the business is rushing to sell middleware tools, but the techniques to use these tools are still in the early stages of development. Many sources of information are available on how the tools have been implemented, their architecture, services, protocols, and how to get them running. But even now, little practical knowledge exists on how to build the software. This book will look at some of the principles and practices that can help software developers make middleware tools solve these business problems.
Figure 1-1. Two-tier vs. three-tier client/server

Why I Chose Multi-Tiered Client/Server

Most of my consulting work revolves around managed healthcare, an industry where large volumes of information are scrutinized against a constantly changing set of business rules. The software must handle large amounts of data efficiently, yet be flexible, because the industry tries to balance ever-increasing costs with demands for quality care while fielding an ever-increasing barrage of government regulations.

My work began with mainframe-based systems, but several years ago, I began creating two-tiered client/server systems for small start-ups and medical specialty groups. As the client/server software began to grow, I found several problems. The first was that response time started to bog down with the number of workstations in use. Upgrading the database server helped, but only until more workstations were added. Next, I found that the development tools worked well for interactive, screen-oriented software development, but lacked the ability to do mainframe-style batch processing, working with large amounts of data quickly. Tools like Crystal Reports helped, but these did not provide the flexibility in calculations that was necessary for the application.
The problem that finally led me to investigate multi-tiered computing was business rule processing. As a claim is submitted into the database, a few critical pieces of information (for example, patient, doctor, diagnosis, date of service, and medical procedure code) must be checked against a large set of business rules such as:

• Is the member eligible for services?
• Has the service been authorized?
• Is this an appropriate service for the diagnosis?
• Is the doctor allowed to perform these services?
• Is this an appropriate service for the age and gender of the patient?

There are often two or three hundred rules that must be verified before a claim can be paid. Once these rules are checked, the appropriate fee and benefit schedules must be matched to determine how much the doctor should receive, and what portion must be paid by the patient.

Try coding this in Visual BASIC or Delphi. It can be done (we've done it), but without some form of back-end processing, there is no way it can be done within the limits of reasonable response time. Our solution was to process the rules off-hours in batch mode, but this limited the ability of the claims processors to get their work done.

With multi-tiered computing, I can begin to offload business rule processing to a separate server and test these business rules as claim information is entered. Many of the data tables can be loaded into streamlined business objects in memory, which will speed up processing. The more complex checks can be run in the background at a lower priority. Batch processing for decision support and reporting can be moved to separate machines where the data can be replicated into a data warehouse application.
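To make that kind of rule checking concrete, here is a minimal sketch in Java (the Claim, ClaimRule, and ClaimValidator names are hypothetical and are not taken from the book's program code) of how a claim entered on a client screen might be handed to a rule-checking object that stays resident on the application server:

// Hypothetical sketch: run an incoming claim through a list of rule objects.
// Each rule encapsulates one check, such as member eligibility or authorization.
import java.util.Enumeration;
import java.util.Vector;

class Claim {
    boolean memberEligible;       // stand-ins for lookups against member,
    boolean serviceAuthorized;    // authorization, and procedure data
    Claim(boolean eligible, boolean authorized) {
        memberEligible = eligible;
        serviceAuthorized = authorized;
    }
}

interface ClaimRule {
    // Returns null if the claim passes, or an error message if it fails.
    String check(Claim claim);
}

class ClaimValidator {
    private Vector rules = new Vector();   // loaded once and kept in memory

    void addRule(ClaimRule rule) { rules.addElement(rule); }

    // Collect every failure so the claims processor sees them all at once.
    Vector validate(Claim claim) {
        Vector errors = new Vector();
        for (Enumeration e = rules.elements(); e.hasMoreElements(); ) {
            String message = ((ClaimRule) e.nextElement()).check(claim);
            if (message != null) errors.addElement(message);
        }
        return errors;
    }

    public static void main(String[] args) {
        ClaimValidator validator = new ClaimValidator();
        validator.addRule(new ClaimRule() {
            public String check(Claim c) {
                return c.memberEligible ? null : "Member is not eligible for services";
            }
        });
        validator.addRule(new ClaimRule() {
            public String check(Claim c) {
                return c.serviceAuthorized ? null : "Service has not been authorized";
            }
        });
        System.out.println(validator.validate(new Claim(true, false)));
    }
}

Because the validator and its rule tables live on the middle tier, each claim entry screen only sends the claim data across the network and gets back the list of failures.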
What Can an Application Server Do?

In addition to the issues described above, an application server can also solve many other weaknesses of traditional two-tiered client/server computing and provide many new benefits as well. An application server helps the system administrator by providing scalable software that can be spread over multiple machines for better system performance. It helps the software designer by providing clearly defined logical boundaries that enable the designers to create business objects that model the business closely. In some ways, software development is also easier, because the code is broken up into smaller, more granular modules and services that are much easier to reuse. Application integration is also much easier, using middleware services to translate data formats and simplify communication between different vendors' machines.
Scalability

The most apparent benefit of multi-tiered client/server is scalability, because the workload is spread among several computers. No matter how much is spent on the latest leading-edge mega-server, there is a finite amount of processing power that any one computer can produce. Spending the same amount of money on several medium-grade servers will generate more computing horsepower and may even cost less.

Where the savings really show is in the incremental costs of upgrades. The mega-server may have some limited upgrade capabilities, but when it maxes out, it has to be replaced with an even more expensive super-mega-server. Not only does the company have to absorb the cost of a new server, it has to write off the old one. With distributed servers, the only cost is the incremental cost of an additional medium-grade server.
Distributed processing

Another advantage is that the databases and application servers can be distributed closest to where the work needs to be done. If order entry is done in San Francisco and production and inventory are done in Atlanta, it makes sense to keep the databases where the majority of the work is done instead of keeping all the data at the corporate office in Chicago. Network traffic will be minimized, because order entry will be done locally in San Francisco, with a much smaller amount of traffic routed between San Francisco and Atlanta to check inventory levels. If Chicago wants management reporting, data can be summarized in San Francisco and Atlanta, then sent in summary form to Chicago.
Distributed processing can also be used to hold local instances of remote data. This will minimize network traffic even more and allow processing even when the remote connection goes down. An inventory item object residing in San Francisco could hold the current number of items in stock and the number reserved by recent orders. Periodically, it would send a message to its counterpart in Atlanta to reserve the items and get a new update of the number actually in stock. If the network goes down, order entry does not have to stop, because the local computer has a close approximation of how many items are available. Once the network comes back up, the update message can correct any discrepancies.
Reusable business objects

An application server is a repository of services and objects that reflect business processes. Since these processes can be described in business language, rather than computer language, it is much easier for the developer to translate business requirements into effective software design. With clearer communication between software developers and business people, the design will come closer to reflecting the real business needs. This results in software that is delivered sooner and costs less to produce.

Once the application server is in place, the objects and services already developed are available for reuse in other applications. Instead of a tightly integrated, closed application, much of the functionality is exposed to the development team. Just as Visual Basic provides a set of GUI components that are used over and over again, most middleware implementations require a common, standardized interface and component model that makes reuse much easier and more cost effective.
Business rule processing

Most two-tiered client/server tools emphasized a data-centric view of software design in which the client software provided a user interface to manipulate information stored on a database server. Almost all processing had to be done on the client side away from the database; this arrangement required additional network traffic. Some processing could be moved to the database server, but this required proprietary stored-procedure languages that were limited to each particular database server vendor.
Application server development stresses business object construction rather than data storage. Each component contains services as well as data, which allows a much wider range of data integrity checks as well as the ability to derive additional information from the data contained in the object. The services can also encapsulate business rules and processes that model the actual business. When an invoice is entered, the invoice object has the intelligence to check the customer object to ensure that credit can be authorized. These rules and processes are performed automatically when a service is requested to store the data.
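As a small illustration of the invoice example above (a hedged sketch only; the Invoice and Customer classes here are hypothetical, not the book's code), the credit rule lives inside the business objects rather than in the user interface or the database:

// Hypothetical sketch: the business objects, not the screen or the database,
// enforce the credit check when an invoice is posted.
class CreditRefusedException extends Exception {
    CreditRefusedException(String message) { super(message); }
}

class Customer {
    private double creditLimit;
    private double balance;
    Customer(double creditLimit, double balance) {
        this.creditLimit = creditLimit;
        this.balance = balance;
    }
    // The rule sits next to the data it needs.
    boolean canAuthorize(double amount) {
        return balance + amount <= creditLimit;
    }
    void addCharge(double amount) { balance += amount; }
}

class Invoice {
    private Customer customer;
    private double total;
    Invoice(Customer customer, double total) {
        this.customer = customer;
        this.total = total;
    }
    // Called by the service that stores the invoice; the rule runs automatically.
    void post() throws CreditRefusedException {
        if (!customer.canAuthorize(total)) {
            throw new CreditRefusedException("Credit limit exceeded for this customer");
        }
        customer.addCharge(total);
        // ...hand the invoice to the persistence layer here...
    }
}

Any service that stores an invoice goes through post(), so the rule cannot be bypassed by a forgetful client program.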
Cross-platform integration

Since most organizations already have a large inventory of applications in place, the middleware vendors have invested much of their effort into cross-platform integration. The developer does not have to be concerned with translating low-level data formats, byte-order representations, or other vendor-specific data. The middleware can also bridge multiple programming languages by using a high-level interface definition language (IDL). Once the functions are declared in this language, the IDL compiler will generate translation code in a variety of programming languages. This allows programs in one language to call functions or access objects written in another language, even when they are located on a different computer.
Costs and Disadvantages of Application Servers

Although there are many advantages to implementing an application server, the technology is not appropriate for every application. Multi-tiered development requires a substantial up-front commitment that may not immediately show results. The application server is a complex piece of software that requires a whole new set of skills and tools. Most middleware packages are based on object-oriented design and programming concepts that require a higher level of abstraction and have higher learning curves. Many also rely on component architectures that must conform to rigid new programming standards. Components and modules must also be general enough to allow later reuse. The technology solves many problems but also brings its own set of difficulties.
Long-term commitment

Implementing an application server architecture is a long-term, enterprise-scale commitment. This is not the appropriate choice for a project that must be developed in "Internet time" or for a single, limited-use application. This is an enterprise architecture that requires new hardware configurations, middleware, programming models, administrative tools, and, most of all, a new way of looking at software development.

The first application will not be easy. Much time will be spent in trial and error, evaluating tools, learning the idiosyncrasies of middleware, and creating infrastructure instead of applications. Viewed as a single application, it will definitely not be cost-effective. This technology can only be justified when seen as the first step in building the foundation of a new enterprise architecture.
Middleware acquisition

One or more middleware packages are probably already sitting on your hard disk. Microsoft bundles its Component Object Model architecture (COM and DCOM) with its most recent versions of Windows. Microsoft Transaction Server (MTS) is also making its way onto the Windows NT platform with the release of SQL Server 7. Java development packages provide a simple object request broker (ORB) called RMI (Remote Method Invocation) that is included with the Java Software Developers Kit (SDK) version 1.1 or higher. So why pay for another middleware solution?

Most of these middleware packages are bound to one proprietary platform, but a comprehensive middleware solution must span a variety of computer platforms, programming languages, and databases. The choice of middleware depends on current hardware and programming languages that are already in place, as well as future expansion requirements. COM, MTS, and RMI are each vendor-, language-, or platform-specific. This may not be a problem if the organization has already standardized on Microsoft or Java platforms, but each may limit future scalability and growth.

The initial purchase price is also just the beginning of the middleware cost. Any choice must also take into account staff training, hardware and network acquisition, programming, and administration costs. Training and start-up costs can often eclipse the purchase price of even the most expensive middleware package.
New ways to twist the brain

Multi-tiered client/server also requires new ways of thinking about software. Although programming is already a fairly abstract activity, object-oriented software design and programming require an even higher level of abstraction. Instead of a single sequential flow of execution, object orientation requires visualizing the interaction of multiple processes running in parallel on several computers at the same time.

Consultants are available to act as guides through the project and to provide training, but at a very high cost. Many tools are also available to help manage the transition, but each of these adds acquisition and training costs to the project. Money spent wisely in this area can greatly increase the chances of success, but costs can quickly mount with little benefit if spent in the wrong direction.
The end of the coding cowboy

In "Coding Cowboys and Interdependent Systems," Warren Keuffel and Bryce Carey (Keuffel 1998) make an analogy to the Old West. The cowboys out on the range worked on their own, independent, untroubled by the rest of the world. When the day came that the railroads ran tracks across the range, that independent spirit suddenly changed. If the cowboys could not coordinate their track-crossing schedule with the railroad's standard of time, the cows would be caught on the cowcatcher. Until recently, programmers could make up their own rules, too; but with the advent of the Internet and electronic commerce, software developers must now adhere to common standards or the bits they herd will be roadkill as well.

Components and middleware architectures require much more discipline and standardization. Objects and components must conform to rigid standards and implement tightly defined interface methods. Much of this is provided by the programming tools, but design and structure must conform to these standards or the application will not run. Developers must also work closely together to ensure that interfaces and communication paths match and that objects and modules are coordinated with each other.
Software reuse

Software reuse can be as much a problem as it can be a benefit. Not only do the components have to meet the current objectives, they also have to anticipate future needs. Implementing and testing reusable software will take longer, and development costs will increase significantly; however, in most cases the benefits will outweigh the additional costs. The additional costs and difficulties will be offset by the flexibility and scalability of the software. Also, future costs will be reduced when application services are exposed to the development team and components are available for reuse.
Moving from Traditional Client/Server to N-Tier Computing

The move from traditional client/server to a multi-tiered architecture takes time and planning. Management must support the transition and be willing to absorb the additional front-end costs. Architecture, middleware, and development tools must be evaluated and selected, and then servers and networks must be purchased and installed to support development. A comprehensive training strategy must be structured to get the development staff productive with these new tools quickly. And, of course, all of this must be done while supporting the current computing environment (Mowbray 1997).

Often the best approach is to choose a small, highly visible trial project. Since much of the time will be spent determining architectural issues and learning new software tools, a small project will minimize the development time. At the same time, the project must also produce tangible, visible benefits to prove the technology and justify spending the resources and time required to make the remaining transition.

A workflow tracking application will often provide a good initial project: something like customer inquiries, call tracking, work scheduling, software problem tracking, or another similar application. These applications have a limited amount of data entry, require business logic to move objects from one state to another (route a question to the appropriate person, change the status from open to completed, etc.), and do not require large or complex data storage. At the same time, the application logic includes a few wrinkles that will force the designers to consider more than just data storage and retrieval.

A trial project like this also minimizes the initial cost of entry. Development machines can be reconfigured to accommodate new architectures. Evaluation versions of software tools and middleware can often be obtained for free from the vendor's Internet site for 30 to 90 days. Developers usually jump at the chance to play with new technology. Just remember to emphasize that the development effort must produce tangible results in a short period of time, or the developers will be going back to their old jobs.

Once the test project is implemented and the technology is sold to management, it is time to firm up architectural and middleware decisions. Document the overall client/server architecture strategy and begin to develop interface and development standards. Train the development team and determine server and network needs. Also, remember to purchase the licenses for the middleware and tools that were downloaded from the Internet; once these tools expire, development comes to a screeching halt.

Finally, develop a plan to periodically review the architecture and development standards. Determine relevant measurements and metrics to support process improvement. Be receptive to new development tools and methodologies and evaluate products and processes that look promising. Provide training and make sure that the initial architecture and development documents stay up to date (McConnell 1997). Everything should now be in place to begin serious multi-tiered client/server software development.
Summary

Application servers and multi-tiered computing are emerging technologies that can provide many benefits to business. This is a technology that should not be approached lightly; but with the strong commitment of the organization, it can solve many problems inherent in traditional mainframe and two-tiered client/server. As the technology matures, it will play a key role in the evolution of software development.

• Application servers are built by networking a number of computers together to provide an expandable, scalable application platform.
• Two-tiered client/server is a mature technology that can address basic business problems but has difficulty addressing complex business logic and high user volume.
• Application server technology enhances the two-tiered model by adding a middle application layer that isolates the business processing.
• The distributed platform can provide better throughput by delegating tasks among many computers and can easily expand to accommodate growth.
• Software development is enhanced by separating development into smaller tasks and by providing a framework for code reuse.
• Application servers do require additional tools and costs that must be absorbed before these benefits are attained.
• Approach application server development using a trial project to make sure that the platform fits the business environment.
References

Keuffel, Warren, and Bryce Carey. "Coding Cowboys and Interdependent Systems." Software Development Magazine, April 1998: 31-32.

McConnell, Stephen. Software Project Survival Guide. Redmond, Washington: Microsoft Press, 1997.

Mowbray, Thomas J., and William A. Ruh. Inside CORBA: Distributed Object Standards and Applications. Reading, Massachusetts: Addison Wesley Longman, 1997.
Chapter 2
Anatomy of an Application Server

According to the vendor literature, it appears that moving to multi-tiered client/server computing is as simple as buying a few products, creating a few Web pages, and writing a little bit of application code. In a recent Microsoft presentation, a company representative created a simple three-tiered application in less than ten minutes (Microsoft, Inc. 1998). He built a Web page with a couple extra lines of VBScript code, displayed about twenty lines of Visual Basic code that was already installed on the transaction server, and then, with a few mouse clicks, showed how easy it was to validate a customer number. Too bad the transaction server code only returned "credit OK" if the customer number was 123456789.

Microsoft is not the only vendor using this approach to sell middleware products. Although the vendors make the development process look easy, these products are complex pieces of software. Just learning the programming conventions and protocols can take weeks, while producing industrial-strength code could take months. Tools like the Microsoft Transaction Server can make programming somewhat easier for developers by providing communications protocols and development frameworks, but even the simplest service will take far more than twenty lines of code.

This chapter will examine the application server from an architectural viewpoint and will also examine the major categories of middleware software. Topics will include:

• Overview of the application server architecture
• Middleware: the glue that holds it together
• Middleware categories
• Applying middleware to the application server architecture
• Alternative application server architectures
• Putting it all together
Overview of the Application Server Architecture

An application server contains the middle layers of the client/server software solution. The user interface programs request services from the application server, and these services then store and retrieve data from databases or other application servers. In between lies a collection of business objects that perform the services and enforce business rules. This is illustrated in Figure 2-1.

Figure 2-1. Application server layers (interface layer, business object layer, and persistence layer, with the database servers below)

The service interface layer is the "front door" to the application server. Each user interface program is granted a set of services, or remote procedures, that hide all of the details of the business objects and persistent data that reside on the application server. A service may be a request to get customer data or to post a bank deposit to a customer's account. These services are then packaged into a service interface object that gives the user interface programmer one simple object that handles all interaction with the application server.

The business object layer is a collection of many software objects that encapsulate the processes and rules of the business. A well-designed business object should be described in business terms, not programming terms, and should be generalized so it can be reused by several different applications. The service interface layer is application-dependent, servicing specific applications, whereas the business object layer is approached functionally, modeling business processes.

The persistence layer acts as an object broker that creates and stores business objects from permanent storage or interfaces with other legacy systems. When business objects are needed, the persistence layer is called upon to retrieve the data from a relational database or other permanent storage and then use this data to create a new business object. Once the object is no longer needed, the persistence layer is responsible for storing the data before removing the object from memory. The design of the persistence layer is heavily dependent on existing data structures and the needs of the business object layer.

Binding all of these layers together are one or more middleware products that allow programs on one computer to call functions or pass data to programs running on another computer. Middleware comes in many different flavors with a variety of programming models, communication architectures, built-in services, and administration tools. Choosing the best middleware architecture will greatly increase the chance of a successful implementation.
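One way to picture the three layers in code is sketched below. This is a minimal, hypothetical illustration (the AccountServices, Account, and AccountBroker names are invented here, not the book's actual framework): a service interface that the user interface calls, a business object that carries the rules, and a persistence broker that loads and stores that object.

// Hypothetical sketch of the three layers as Java types.

// Service interface layer: the "front door" handed to the user interface program.
interface AccountServices {
    double getBalance(String accountId) throws ServiceException;
    void postDeposit(String accountId, double amount) throws ServiceException;
}

// Business object layer: objects that enforce the rules as well as hold the data.
class Account {
    private String accountId;
    private double balance;
    Account(String accountId, double balance) {
        this.accountId = accountId;
        this.balance = balance;
    }
    void deposit(double amount) throws ServiceException {
        if (amount <= 0) throw new ServiceException("Deposit must be positive");
        balance += amount;
    }
    double getBalance() { return balance; }
    String getAccountId() { return accountId; }
}

// Persistence layer: an object broker that creates and stores business objects.
interface AccountBroker {
    Account find(String accountId) throws ServiceException;
    void save(Account account) throws ServiceException;
}

class ServiceException extends Exception {
    ServiceException(String message) { super(message); }
}

// A service implementation ties the layers together: look up the business
// object through the broker, let it apply its rules, then save it again.
class AccountServicesImpl implements AccountServices {
    private AccountBroker broker;
    AccountServicesImpl(AccountBroker broker) { this.broker = broker; }
    public double getBalance(String accountId) throws ServiceException {
        return broker.find(accountId).getBalance();
    }
    public void postDeposit(String accountId, double amount) throws ServiceException {
        Account account = broker.find(accountId);
        account.deposit(amount);
        broker.save(account);
    }
}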
Middleware: The Glue That Holds It Together

Before getting into the details of each application server layer, you need to understand the concept of middleware and its role in application server architecture. Briefly stated, middleware is a category of software that provides program-to-program communication across multiple computers. An application server uses middleware to communicate between the client software and the service interface, between the persistence layer and the databases, and often between objects to provide scalability across multiple server computers.

Middleware provides communication across multiple computers, programming languages, and data representations. To the programmer, there are few differences between calling a function within the same program and calling a function on a remote computer. Middleware can enable a Java program running on a Web browser to access functions written in COBOL residing on an IBM mainframe just as if it were calling another Java object's method. Some additional setup is usually needed to access the remote objects, but once the setup is complete, the location of the objects becomes relatively transparent to the programmer.

Though programming becomes transparent, the support that middleware provides for distributed processing can be complex and difficult. There may be differences in data representations, byte orders, floating-point standards, and parameter-passing conventions. Most middleware solutions provide services to solve many of these problems. These services include marshaling, which resolves differences between data formats and provides consistent protocols for parameter passing between different programming languages. Directory and naming services are provided to locate functions on computers connected to the network. Life cycle management and load balancing functions are often available to ensure that the remote programs are in memory when needed and that the programs are distributed efficiently between computers.

Because of the need for transparency, many of the middleware architectures are joint development efforts that have become industry standards. Computer manufacturers have formed cooperative groups such as the Open Software Foundation (OSF) and the Object Management Group (OMG) to create standards and specifications that ensure interoperability between hardware platforms and software implementations. Standards such as OSF's Distributed Computing Environment (DCE) and OMG's Common Object Request Broker Architecture (CORBA) have become the foundation for many middleware implementations. With the broad base of industry support, these architectures can solve application server requirements and provide tools for legacy system integration.

In its attempt to overcome industry standards, Microsoft has released the Distributed Component Object Model, or DCOM, an extension of Microsoft's original COM (this will be superseded in Windows 2000 by COM+). DCOM is a component-based distributed object standard that is primarily limited to Microsoft Windows platforms, with some limited third-party support on other platforms. Related technologies include ActiveX, DNA, and OLE. A major advantage of the architecture is that the support software is integrated into the Windows operating system, so the middleware software is already resident on most desktop machines. Microsoft also provides a variety of architectural choices that include distributed objects, transaction servers, and message queues. Drawbacks include platform limitations and the complexity of the component model. Although not as broad-based as DCE or CORBA, the Microsoft architectures may be a good choice for organizations that have standardized on Microsoft products.
Middleware Categories

The middleware market is still evolving, but several standard architectures have begun to gain dominance. Each addresses specific architectural problems and conforms to different programming models. Although there are overlaps between categories and some vendors provide a combination of approaches, most middleware architectures fall under one or more of the following categories:

• Remote database protocols
• Remote procedure calls
• Distributed objects
• Transaction processing monitors
• Message brokers
• Commercial application servers

As each category is described below, specific vendors and products are mentioned only as examples. All were chosen because they come from established companies and are products that have name recognition.

Middleware Services

Just as an operating system provides functions to support file systems, serial communications, printer spooling, dates, and windowing, middleware products provide a range of services that support the needs of middleware programmers. These services do not directly contribute to interprocess communication, but they offer support to make middleware programming and administration much easier. Some of the most common services are listed below.

• Naming: The most common service provides an association between a text string (the name) and a remote object or process. Remote IDs are usually illegible bit strings (something like 67474-932-8209943-21002) that are difficult to work with and almost impossible to type. By providing a network-wide handle for each process, naming services give programmers much easier access to remote procedures or objects.
• Directory: Linked closely to naming (many standards combine the two services), directory services provide a centralized list of all remote processes or objects currently active on the network. A call to the directory service will provide a programmer with the location of any remote process.
• Life Cycle: This service provides tools for creating, activating, stopping, and deleting processes from memory. Services are also available to copy or move processes from one machine to another.
• Persistence: In addition to handling the remote object's life cycle, persistence enables objects to be stored on disk when they are not needed, yet still maintain their attributes when they are loaded back into memory.
• Concurrency: Since many programs often use the same remote procedures or objects, services must be provided that control concurrent access. Critical code segments may not be able to run concurrently without corrupting local variables or losing state information. Concurrency services let the programmer specify how concurrent access is managed within each remote process.
• Security: Remote processes are often subject to the same type of access restrictions as programs or databases. Incorporating security within the middleware product allows a consistent, standard manner of restricting access.
• Time: Maintaining a consistent time reference among all machines may be necessary to ensure that objects interact correctly. This can be difficult when remote machines are located in other time zones. Time services provide a standardized time reference as well as conversion to local time zones.
Remote database protocols

Remote database protocols are probably the most familiar of all middleware implementations. These have been around since the inception of client/server databases and are familiar to most client/server developers. The protocols allow a client computer on a network to communicate with the server database without having to be concerned about low-level programming details. These packages include Microsoft's ODBC (Open Database Connectivity) and Data Access Objects (DAO), as well as database gateways and the call-level interfaces provided by database vendors.

Their primary purpose is to hide the details of network calls and communication protocols behind a set of simple objects or function calls. They provide naming and location services to easily locate remote databases and provide marshaling services to translate data into multiple programming language and machine formats. Most also enable invocation of remote procedure calls for database-oriented distributed processing using SQL, Java, or other proprietary languages.
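The Java counterpart of these protocols is JDBC, which the book's later chapters use. The fragment below is a minimal sketch under stated assumptions: the jdbc:odbc URL presumes the JDBC-ODBC bridge and an ODBC data source named orders, and the table and column names are invented for illustration.

// Minimal JDBC sketch; the driver hides the network protocol, so the code
// reads like ordinary local SQL calls.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CustomerLookup {
    public static void main(String[] args) throws Exception {
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");   // load the bridge driver
        Connection con = DriverManager.getConnection(
                "jdbc:odbc:orders", "user", "password");
        PreparedStatement stmt = con.prepareStatement(
                "SELECT name, credit_limit FROM customer WHERE customer_id = ?");
        stmt.setString(1, "C1001");
        ResultSet rs = stmt.executeQuery();
        while (rs.next()) {
            System.out.println(rs.getString("name") + "  "
                    + rs.getDouble("credit_limit"));
        }
        rs.close();
        stmt.close();
        con.close();
    }
}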
Remote procedure calls The OSF Distributed Computing Environment (DCE) was one of the first industry-wide initiatives to develop standards for distributed computing. Companies like IBM, Hewlett/Packard, Digital Equipment, and many others met and agreed on standards that enabled programs running on their computers to call software functions residing on other remote machines. The DCE specification includes standards for remote procedure calls, security, directory services, time services, threads and distributed file services (The Open Group n.d.). In DCE or other remote procedure call architectures, a remote procedure call between two computers requires the calling program to have
21
Building Application Servers
22
some identifier that can be used to locate the procedure on the remote computer. These names or identifiers are then stored in a directory service or naming service. When a remote procedure is created, it is registered in some type of directory or repository on a specific computer; then, when the calling program wants to access the procedure, a text string is passed to the directory or naming service to retrieve this information. The calling program then receives a pointer that identifies the network location of the remote computer and the address of the remote procedure. Once this address is known, the calling program then calls a function located on the local machine that translates the remote procedure information into a common format, then sends this information over the network to the remote computer. The remote computer marshals the data into the remote machine's format and runs the procedure. The results of the procedure call are then passed back to the local computer (where the calling program resides), and the results are marshaled back into the local computer format before being passed back to the calling program (see Figure 2-2). Most remote procedure architectures require that procedures remain stateless. That is, the remote procedure cannot be relied upon to remember data between procedure calls. Memory variables must be reinitialized
Figure 2-2. Remote procedure call
before each remote procedure call to ensure that the procedures run correctly. This is both an advantage and a disadvantage for the programmer. The programmer does not have to be concerned with the interaction of multiple users, but at the same time, any state variables must be held by the calling program and managed by the programmer. Although most of the current interest is in distributed objects, remote procedure call architectures do have their place. DCE has been an industry standard for over ten years; it is stable, and DCE experience is easy to find. It is available on most mainframe and minicomputer platforms, and it is an excellent choice when integrating legacy software into an application server.
Distributed objects
As programmers moved towards object-oriented technology, the distributed programming initiatives moved towards distributing objects instead of distributing function libraries. Like remote procedure calls, distributed objects have been in development for quite a while and several standards are in place to provide interoperability between computer platforms.
In a manner similar to the OSF, the Object Management Group (OMG) was formed to create a set of standards and specifications for distributed objects called CORBA (Common Object Request Broker Architecture). The CORBA standard has the broadest industry support and is the most extensive of the distributed object standards. The standard provides cross-platform support from PCs to mainframes, and CORBA-compliant software can be written in almost any language. Sun Microsystems provides extensive support for CORBA within the Java platform and recently released new class libraries that simplify CORBA access in release 1.2 of the Java SDK.
The Java platform also supports a simpler distributed object model called RMI (Remote Method Invocation). RMI can only be used within the Java runtime environment but provides a simple method of distributing objects over a LAN or internetwork.
Microsoft has been slow to adopt CORBA technology, since they have invested heavily in their own distributed object model, the Distributed Component Object Model (DCOM). DCOM is based on the Component
Object Model (COM), a programming model that arose out of the need to create compound documents and provide a consistent programming model for the Windows operating system. DCOM is limited primarily to the Windows operating systems, but it does have some third-party support on other platforms.
Each standard has its advantages and drawbacks, but a detailed discussion is beyond the scope of this book. The references listed at the end of this chapter supply detailed analysis, and one or more of these references will be extremely helpful in selecting a distributed object architecture.
Although each distributed object standard approaches the task differently, there are many similarities between them. Like remote procedure calls, each provides naming services to locate the objects and marshaling to convert data representations across multiple programming languages or machines. Each also provides life cycle and load balancing services, but these tasks are complicated by the need to maintain attributes or state within the objects. Also, since these standards work with objects rather than procedure calls, additional services are required to externalize or transform the object representation into a stream of bits that can be sent over the network.
All three distributed object standards use some form of interface definition language (IDL) to specify an object's definition to the distributed object system. Using a standard ASCII text file or existing program code, a compiler (IDL for CORBA, MIDL for DCOM, or RMIC for RMI) converts this definition into program modules that provide communication between the calling program and the remote object (see Figure 2-3). The IDL compiler can also generate language header files (such as a C++ header file) that are used by the programming language (C++, or Java for RMI) compiler to resolve object names within the local program.
The IDL compiler also generates stub and skeleton program files, which provide communication between the two computers. The stub program contains an object definition that looks, to the local program, just like the object being accessed on the remote computer. It contains method names and attributes that are identical to the remote object, but the stub's methods only contain code that passes arguments to the remote object. The skeleton program on the remote computer acts as the receiver of these network messages, calling the methods of the remote object with the parameters sent over the network.
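For RMI, the role of the IDL is played by an ordinary Java interface that extends java.rmi.Remote; running the rmic compiler against the implementation class then produces the matching stub and skeleton classes. The interface below is a hypothetical example, not one from any particular product.

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // This interface is the contract shared by the stub (on the client)
    // and the skeleton (on the server).  The methods are illustrative.
    public interface CustomerAccount extends Remote {
        String getName() throws RemoteException;
        boolean checkCredit(float amount) throws RemoteException;
    }

Compiling an implementation class with rmic (for example, rmic CustomerAccountImpl) would generate the stub and skeleton classes described above.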
The IDL compiler generates static object references, where object definitions are compiled directly into the program. In addition to accessing static objects, the distributed object architectures also provide access to dynamic objects. These are objects that are not known at compile time but can be discovered and instantiated as the program runs. Once the object is selected, it can be queried to obtain its method names and arguments; then methods can be called using a generalized protocol. Dynamic objects enable tools like Visual Basic or JavaBeans to discover new components as they are added into the distributed architecture.
Component models are another technology that has risen from distributed objects. To make dynamic objects easy to use, each object must have a standardized interface, which is then used to discover its methods and attributes. Microsoft's DCOM uses COM, a complex component model that requires each component to implement certain standardized interfaces and methods. Sun has created the JavaBean standard, which requires strict naming conventions within each component. The
Figure 2-3. Defining a distributed object
JavaBean framework then uses a language feature called introspection to retrieve these standardized names and interpret them within the component framework. Component architectures make programming somewhat more difficult with their rigid standards and additional interfaces, but as the technology continues to grow, the tools to create these components will become more sophisticated and component-based development will become the standard way of building software.
Transaction processing monitors
Transaction monitors, like IBM's CICS or BEA's Tuxedo products, extend the remote procedure call architecture with a transaction processing layer. This layer takes the database concept of a transaction and applies it to distributed processing. A series of operations can be bound together into a transaction; then, if an operation fails anywhere during the process, all operations that have occurred since the beginning of the transaction are rolled back. This ensures consistent, reliable data no matter what kind of error occurs.
The BEA sales literature (BEA Systems, Inc. n.d.) uses the example of an automated teller deposit. When a customer submits a deposit for $500, a message is sent from the ATM to the bank's computer. The computer processes the deposit and sends back a message saying that the operation completed successfully. If the status message gets lost somewhere between the bank and the ATM, the ATM has no idea whether the message was processed by the bank's computer or not. The message may have been received by the computer, posting the $500 deposit, and the response lost on the way back to the ATM; or the computer may never have received the posting message. If the ATM tries to resend the message, the $500 may be posted twice, or not at all. The transaction monitor ensures that a failure on either side of the message will undo whatever work was done by the bank's computer.
With the growing interest in distributed objects, both CICS and Tuxedo are now implementing distributed objects within their transaction monitors. Microsoft, Inprise, and several other client/server tool vendors are also getting into the transaction monitor business, and the OMG has implemented a transaction specification for the CORBA architecture. With all of the possible errors that can occur between client
computers, networks, middleware, and application servers, transaction processing is an important consideration when selecting a middleware architecture.
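The sketch below shows the transaction boundary idea in Java, using the JTA UserTransaction interface as a stand-in for whatever API a given monitor actually exposes. The DepositService and AccountStore names are assumptions, and real products differ in how the transaction object is obtained.

    import javax.transaction.UserTransaction;

    public class DepositService {

        /** Hypothetical data-access helper standing in for the database calls. */
        interface AccountStore {
            void post(String accountId, double amount) throws Exception;
            void writeAuditRecord(String accountId, double amount) throws Exception;
        }

        private final UserTransaction tx;    // supplied by the transaction monitor
        private final AccountStore accounts;

        public DepositService(UserTransaction tx, AccountStore accounts) {
            this.tx = tx;
            this.accounts = accounts;
        }

        public void deposit(String accountId, double amount) throws Exception {
            tx.begin();                       // mark the start of the transaction
            try {
                accounts.post(accountId, amount);
                accounts.writeAuditRecord(accountId, amount);
                tx.commit();                  // both updates take effect together
            } catch (Exception e) {
                tx.rollback();                // or neither is applied
                throw e;
            }
        }
    }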
Message brokers
While other middleware architectures rely on transfer of program execution from one computer to another, a message broker uses data (messages) to communicate service requests. The primary advantage of this technology is that once a message is sent, the calling program can continue execution without having to wait for a response. If the network or remote computer is down, the message is retained in a message queue, waiting for the network or computer to come back on line. This is an ideal solution for legacy integration or intermittent communication processes like dial-up modems. The process is often referred to as "fire and forget," since the calling program can continue, knowing that the service will be performed at some time in the future. It is not a good approach for interactive processing, wherein a dialog must be established between the computer and the user, but it is gaining popularity for system integration to share information between a variety of applications and platforms.
Message broker products are available from a variety of vendors, including IBM (MQSeries) and Microsoft (MSMQ). As with the other middleware architectures, message brokers provide marshaling services to convert data representations across a variety of platforms and often include transaction services to provide data integrity (messages are not lost or processed more than once). The products also provide multiple message queues distributed on several computers, with routing to ensure message delivery when parts of the network are down.
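A fire-and-forget send might look something like the following JMS sketch. JMS is a Java API that several message broker products support; the queue, connection factory, and message content here are assumptions, and in practice both the factory and the queue would usually be obtained through a directory lookup.

    import javax.jms.Queue;
    import javax.jms.QueueConnection;
    import javax.jms.QueueConnectionFactory;
    import javax.jms.QueueSender;
    import javax.jms.QueueSession;
    import javax.jms.Session;
    import javax.jms.TextMessage;

    public class OrderSender {
        public void sendOrder(QueueConnectionFactory factory, Queue queue,
                              String orderText) throws Exception {
            QueueConnection connection = factory.createQueueConnection();
            QueueSession session =
                    connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            QueueSender sender = session.createSender(queue);
            TextMessage message = session.createTextMessage(orderText);
            sender.send(message);   // fire and forget: the caller continues at once
            connection.close();
        }
    }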
Commercial application servers
With the push for enterprise-wide distributed processing, many of the software vendors are now marketing shrink-wrapped application servers. These vary in content, but each is an attempt to provide one-stop shopping for multi-tiered client/server. These packages include middleware, Web servers, programming tools, administration utilities, and, in some
cases, database products. In the best of these packages, companies have either acquired or partnered with other vendors to provide a comprehensive set of tools. In other cases, vendors try to breathe new life into dying products by repackaging them into client/server bundles.
Examine these products carefully before selecting a shrink-wrapped application server, since content and price vary greatly between products. Make sure that the tools will fit both the current project and future development plans. Much of the marketing material for these products emphasizes business benefits with many broad promises and few specific details.
A major advantage of these products is that they are single-vendor solutions. All services are provided through one software vendor, so much of the finger-pointing is eliminated. Integrating software from a variety of vendors can be difficult, so a single-vendor solution does have its advantages. Just remember that a commitment to a commercial application server package is also a commitment to the vendor.
Applying Middleware to the Application Server Architecture
Choosing the best middleware architecture is a difficult task, and more than one product may be required. The application server foundation usually relies on a distributed object model, although remote procedure calls can be used if much of the code resides on legacy systems. Service interfaces also require distributed objects or remote procedure calls, while integration with legacy software can be made through any number of middleware options. The interface between the persistence layer and the database is usually provided by the database vendor, but if data integrity is a concern, a transaction monitor may be useful.
Within the category of distributed objects, the choice depends on the platforms and services needed. Most CORBA implementations provide a wide range of services that simplify the programmer's job. Look closely at the specifications, though, since many services may either not be available or may be sold separately. DCOM is also a viable alternative if the organization uses only Microsoft-based systems.
Although this book uses RMI for its examples, RMI is usually not a good choice since it is limited to the Java programming language and
has very few prebuilt services. Version 1.2 of the Java SDK provides similar APIs for CORBA, which offers a much more scalable architecture at about the same level of programming difficulty. Nevertheless, RMI is a good platform for studying application server development. All of the tools are included within the Java SDK, and they force the programmer to learn the intricacies of distributed objects without relying on prebuilt services. Once a programmer understands RMI, moving to CORBA is not difficult, and the programmer will have a better understanding of what is going on under the hood.
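To show the kind of work RMI leaves to the programmer, here is a minimal server class that implements the hypothetical CustomerAccount interface sketched earlier, exports itself, and registers a readable name with the RMI registry. The binding URL and the placeholder business rule are assumptions.

    import java.rmi.Naming;
    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;

    public class CustomerAccountImpl extends UnicastRemoteObject
            implements CustomerAccount {

        public CustomerAccountImpl() throws RemoteException {
            super();                      // exports the object for remote calls
        }

        public String getName() throws RemoteException {
            return "Sample customer";
        }

        public boolean checkCredit(float amount) throws RemoteException {
            return amount <= 1000.0f;     // placeholder business rule
        }

        public static void main(String[] args) throws Exception {
            // Register the object under a network-wide name; a client calls
            // Naming.lookup() with the same name to obtain a stub.
            Naming.rebind("rmi://localhost:1099/CustomerAccount",
                    new CustomerAccountImpl());
            System.out.println("CustomerAccount bound and waiting for calls");
        }
    }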
Your best face forward: presenting a clean application interface
The application or service interface layer encapsulates all of the services required to implement a user interface for one specific application. Although the user interface program could access the business objects directly, this would add complexity to both the user interface and the business objects. The user interface would have to track each separate business object connection and would need additional logic to integrate the objects. Each business object would also have to implement methods to support external user interface requirements, which would bloat the objects and limit reusability. A separate application interface layer enables the user interface to concentrate on presentation logic and enables the business objects to concentrate on business requirements.
Figure 2-4 shows a block diagram of the service interface. The user interface program makes a single connection to the application interface object, where it can request services to perform each task described in the use cases. Each service then performs the application logic that coordinates the activities of one or more business objects to perform the requested service. Application logic should be kept to a minimum, instantiating new business objects, calling these objects' methods to perform the work, and coordinating the exception handling when error conditions occur.
The communication between the user interface and the service interface is usually provided by either distributed object or remote procedure call middleware (message brokers may also be used, but are more likely used for integrating external applications). The services are described in an IDL; then the IDL compiler creates stubs and skeleton code to perform
the communication. The user interface programmer can then use a set of language-specific files (either header or class files) that act as proxies for each of the services provided by the service interface. The application server programmer must then take the skeleton files and implement the domain-specific code to perform each service.
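A service interface defined this way might look like the following RMI sketch, with roughly one method per use case. The use cases, method names, and argument types are assumptions for illustration.

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // The user interface sees only this interface; each method hides the
    // application logic that coordinates the business objects behind it.
    public interface OrderEntryServices extends Remote {
        String placeOrder(String customerId, String[] productIds)
                throws RemoteException;
        boolean checkCredit(String customerId, float amount)
                throws RemoteException;
        boolean checkInventory(String productId) throws RemoteException;
        void cancelOrder(String orderId) throws RemoteException;
    }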
Business objects: modeling your business in software
The business object layer is a repository for all of the business objects used by any application. Each business object is a package of properties and methods that perform a specific business function. These should be described in business language. Examples of business objects are customers, orders, inventory items, and so on. Business objects are often
Figure 2-4. The service interface layer
built by first creating fine-grained objects, such as customers and product items, that are then combined to create larger domain objects, such as orders and invoices (see Figure 2-5). In addition to the business design issues, each business object should conform to a standard object model. Naming conventions, standard methods, error handling procedures, documentation and other standards should be agreed upon before the first object is created. This will make programming easier over time and aid in object reuse. Component models such as Enterprise JavaBeans or ActiveX can help enforce these standards and make interfacing to middleware much easier. The choice of middleware across the business object layer is usually limited to distributed objects or a transaction monitor that has been augmented with object technology. Although business objects can be created
Figure 2-5. The business object layer
from procedural code, an object-oriented programming language will make life easier for both the designers and programmers. The idea is to encapsulate all of the functionality of a business entity into a software entity. This is difficult using non-object-oriented software tools.
Another consideration is how to distribute business objects across the network. The tradeoff is between efficiency and scalability. Keeping related objects on the same computer will keep network communication down, but limit scalability. A recent article in Component Strategies Magazine described how a distributed object application quickly grew to almost two billion objects (Shelton 1998). Although this sounds incredible, consider that a single invoice object may be composed of a customer object, two or more address objects, many product objects, and so on. Several hundred invoices can easily contain several thousand objects. Scalability must be considered when deciding both the middleware architecture and distribution strategy for the business object layer.
Persistence: talking to the database
At the other end of the application server is the persistence layer that interfaces to databases and external applications (see Figure 2-6). Since most business applications rely on database management systems to store data, the attributes of business objects must be loaded and stored in this format. Mapping objects to their data can be done directly in each business object, but this would bloat these objects and add quite a bit of code, as well as additional attributes, making business object construction much more difficult. A better solution is to create a separate persistence layer that is specifically built to load and store business objects.
When the service interface needs to locate a business object, it sends a request to the persistence layer to locate and return a reference to the object. If the object is not in memory, the persistence layer locates the data, creates a new instance of the object, loads the data into the object instance, then returns the reference back to the service interface. Once the object is no longer needed, the data can be stored back into the database by sending the object back to the persistence layer.
Once a business object is placed in memory, it can be used by any number of different service interfaces or other business objects. When the persistence layer receives a request for a business object, it will know
Figure 2-6. The persistence layer
if the object is already in memory and will not have to load another instance of the same object. Instead, it can simply return a reference to the existing object. As you can see, the job of the persistence layer is quite a bit like that of an object broker, creating, removing, locating, and tracking objects across the entire application server. Communication between the persistent objects and the database is handled using traditional database middleware such as ODBC, JDBC, or some other protocol provided by the database vendor. If data integrity is a requirement, a transaction monitor can be introduced to provide commit and rollback within the persistence layer. Object databases can also be used to handle all of the persistence chores, but since most companies rely on relational databases for existing applications, selling an object database solution can be difficult. Since the persistence layer also acts as the object broker for the application server, building this layer will be much easier using distributed object middleware to handle the naming and life cycle chores.
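The object-broker role of the persistence layer can be sketched with a simple in-memory cache, as below. The Loader interface stands in for whatever database code actually builds the objects; the class and method names are assumptions, and a real persistence layer would also write changed objects back to the database and manage concurrency.

    import java.util.HashMap;
    import java.util.Map;

    public class PersistenceBroker {

        /** Hypothetical stand-in for the database access code. */
        interface Loader {
            Object loadFromDatabase(String key) throws Exception;
        }

        private final Map cache = new HashMap();   // objects already in memory
        private final Loader loader;

        public PersistenceBroker(Loader loader) {
            this.loader = loader;
        }

        /** Return the one shared instance for this key, loading it if necessary. */
        public synchronized Object find(String key) throws Exception {
            Object obj = cache.get(key);
            if (obj == null) {
                obj = loader.loadFromDatabase(key);  // instantiate and populate
                cache.put(key, obj);                 // track it for later requests
            }
            return obj;
        }

        /** Release an object once no service needs it any longer. */
        public synchronized void release(String key) {
            cache.remove(key);   // a real broker would also store changes back
        }
    }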
Alternative Application Server Architectures
The three-layer approach described above, with service interface, business object, and persistence layers, is only one approach to application server architectures. Some authors suggest a fourth layer, inserting a transaction layer between the persistence layer and the database. Others use the traditional two-tiered client/server approach but move the data access objects from the user interface onto a middle-tier server. Internet tool vendors are also joining the multi-tiered market with Web server-based application servers. All of these approaches are worth examining and have merits and drawbacks compared with the approach described above.
The fourth layer
One alternative to the three-layer architecture is to insert a transaction processing layer underneath the persistence layer (see Figure 2-7). This will ensure the integrity of the database when errors occur and prevent the posting of partial transactions. Before the databases are updated, a boundary is set that marks the beginning of the transaction. Each database or remote application is updated, and when all updates have completed successfully, the transactions are committed to the databases and the transaction is closed. If an error occurs, all transactions are rolled back.
This may be a good approach in the few cases where applications are highly sensitive and data integrity is extremely critical. Otherwise there is little reason to create a separate layer of code when there are a variety of middleware products that easily manage these functions automatically. If transaction processing is critical, it makes more sense to implement it as a separate service of the persistence layer, offering transaction control as part of the service interface logic.
Data-centric application servers
Another approach to application server architecture is to eliminate the business object layer and just create a pool of persistent objects that are directly available to the user interface (see Figure 2-8). This architecture
Figure 2-7. The four-layer architecture
extends the data-centric approach of the classic two-tiered architecture onto additional servers to provide connection pooling and data caching. The user interface program is then still responsible for the business logic, but performance is enhanced by adding the scalability of the distributed architecture. This is the approach taken by many of the RAD (Rapid Application Development) tool vendors to move their products into the application server market. It works well for this form of software development, creating highly efficient multi-tiered software. The best of these tools automatically create the remote data objects through programming wizards that generate highly efficient objects and all of the CORBA code required
Figure 2-8. A data-centric architecture
to interface them with the user interface. For applications that have little business logic but high transaction loads and tight development timeframes, these are excellent tools. Unfortunately, this approach has many of the same drawbacks as the two-tiered client/server applications. It produces highly data-centric applications with little ability to handle complex business logic, and the software is fairly inflexible and sometimes difficult to maintain. Other than the data objects, which are programmatically generated, it is also difficult to reuse program code. Finally, since there is no common service interface layer, each user interface program must be built independently.
Web server-based approaches
Not to be outdone by the database and RAD tool vendors, Web server-based tools are also appearing that provide multi-tiered applications. The Web server becomes the service interface, serving up Web pages and forms that interface directly with a relational database (see Figure 2-9). Additional business logic can be added through plug-ins or servlets, exposing functions available to the HTML scripts that define the Web pages.
This approach works well for Internet- or intranet-based applications but is difficult to expand. Forms must be defined in some version of HTML and must be based on simple table views or SQL queries. Business logic is limited to simple function calls, while integration with existing applications can be difficult. This architecture works well for high-volume Internet applications but cannot support the breadth of application requirements needed to support an enterprise architecture.
Figure 2-9. A Web server-based approach
Putting It All Together
An application server is a combination of client computers, servers, networks, middleware, databases, legacy applications, and application code. Without a well-planned, organized architecture, this collection will quickly become a disorganized mess. The application may work, but enhancements and maintenance will be almost impossible. When problems arise, there will be any number of vendors and consultants each pointing their fingers at each other, and the project will become a black hole sucking up the organization's resources and your career.
Middleware and program tool selection should be done carefully. Small trial projects can often show weaknesses quickly without large expense. Most software vendors are willing to provide trial software at little or no cost and even provide some sales support and training to ensure that your trial goes smoothly. Bundled commercial application server toolkits and single-vendor solutions can also keep the vendor list down. As software is evaluated, include training and administration costs as part of the total cost of ownership; these are often complex, difficult tools to learn and manage.
A note on trial projects: be willing to throw away things that do not work. It is easy to look at the cost invested and want to hold on to work already completed. This is always a mistake. Application server technology is still new, and there are many products and technologies that are either underdeveloped or just do not work. Find the right products that solve the relevant problems in a way that works for the business, programmers, and end users. Do not waste time and resources trying to make bad products fit where they do not belong.
Summary
The application server architecture can best be viewed as a number of layers connected by a variety of middleware tools. This book uses a three-layered approach with service interface, business objects, and persistence services, but other architectures can be used.
• The user interfaces running on the client computers use distributed object or remote procedure middleware to communicate with the service interface, the front door of the application server.
• The service interface calls on a host of business objects to perform the business logic.
• The persistence layer acts as an object broker for the business objects, creating and storing the objects, retrieving their attributes from the database servers through database middleware.
• Middleware is a class of software that simplifies communication from a program running on one computer to a program running on another computer.
• Remote database middleware enables programs to easily access data residing on separate database servers.
• Remote procedure call middleware allows a program on one computer to call a function on another computer.
• Distributed object middleware allows programs to access objects located on other computers.
• Transaction monitors ensure fail-safe execution of a set of procedures or database operations by providing roll-back capabilities when any step of the transaction fails.
• Message-oriented middleware routes data in the form of messages from one computer to another, storing data in message queues when the other computer is inaccessible.
References
BEA Systems, Inc. "Programming a Distributed Application: The BEA Tuxedo Approach." n.d. Available from http://www.beasys.com/products/tuxedo/tuxwp_pda/tuxwp_pda.htm
Microsoft, Inc. "Developer Briefing." Denver, Colorado: Presented at Denver Southeast Holiday Inn, July 21, 1998.
The Open Group. "DCE Distributed Computing Environment Overview." n.d. Available from http://www.opengroup.org/dce/info/papers/tog-dce-pd-1296.htm
Shelton, John H. III, and Scott E. Nelson. "Managing a Billion Object System." Component Strategies, September 1998: 44-53.
Part 2
Design
Part 2 examines the issues involved in designing an open, scalable application server architecture. These include requirements analysis, user interface and business object design, persistent storage, and application integration. Emphasis is on user involvement through joint application design teams, use case analysis, and incremental, iterative development.
Chapter 3
Designing Application Servers
The goal of business software design is to create information tools that support the organization's business activities. The software developer has to work closely with the end users and be able to communicate in business language as well as computer language. At the same time, the end users have to become educated to understand the computer's capabilities and limitations. Business needs change quickly, so an effective design methodology must be flexible and adapt easily to changing business requirements.
N-tier computing complicates the design process even more. While traditional two-tiered client/server emphasized applications and user interfaces, n-tier design requires both application-oriented software design as well as process-oriented business object design. The pressure is on to produce applications in shorter timeframes while creating more robust, reusable business objects. The days of the coding cowboy are over.
Fortunately, there are methodologies and tools to support these requirements. The joint application development (JAD) team approach brings software developers and end users together to design software jointly. Iterative, incremental development provides shorter design and programming cycles to ensure that the project stays on track and meets the end user's needs. The Unified Modeling Language (UML) can be used by both software developers and end users to communicate design ideas. Computer-aided software engineering (CASE) tools such as Rational Rose can streamline this process by creating UML diagrams, generating skeleton code, and then updating the UML models as the design progresses.
Application server design is still in its infancy. Methodologies are evolving and vendors are constantly introducing new tools. This chapter will provide an overview of the application server design process and will include the following topics:
• Joint application design
• Business object design
• Iterative development
• Design constraints
• A brief introduction to UML notation
• Meeting the user's needs
Joint Application Design
For many years, the accepted method for software design was the waterfall method. This was a long, sequential process that started with business analysis, followed by design, programming, testing, and documentation. Each step was followed meticulously. Once one phase of the process was completed, changes were not allowed. This led to long, rigid development projects that were measured in man-years, and the software was often obsolete before it was implemented. The process may have worked well for NASA or compiler projects, but quickly broke down when applied to constantly changing business applications.
Although the waterfall method still has its adherents, iterative or cyclical approaches have become more prominent in the last few years. The software developers and end users form a joint development team that works together through the entire process. The team drafts rough narratives called use cases that describe how the software will be used to solve a variety of business problems. Once the use cases are agreed upon, the software developers quickly (within a few days) create design specifications and a software prototype to address each use case. This prototype is then brought back to the team for refinement. As the team reviews the prototype and suggests changes, they discover additional requirements that either extend the existing use cases or trigger
additional requirements that become new use cases. This process continues as long as necessary. Since the software is constantly refined, it comes much closer to meeting the actual needs of the business. Projects show tangible results almost immediately, not months or years later. Fewer projects get canceled along the way, since visible results can be demonstrated. Projects also cost less because the true requirements are isolated sooner and less time is spent on nonproductive bells and whistles.
The process also has advantages for organizational development. End users see that their input makes a difference and feel that others are supporting their work. Management gets quick, tangible results from their investment and, over time, will be willing to commit more resources to information technology. Even the software developers benefit, receiving more recognition for their efforts. The projects often move beyond software development and become a chance to improve business processes.
At the same time, some drawbacks exist which you must monitor and manage. You must set boundaries at the beginning of the project to limit both scope and timeframes. An open-ended project can develop a life of its own and never be completed. Projects can easily become sidetracked or steered into wrong directions, resulting in a failure to solve the original problem. Putting software developers and end users together can also cause communication difficulties and personality conflicts, but effective team leadership and management oversight should prevent these problems.
Business Object Design
While the joint design team spends its time developing use cases and user interfaces, the application server designers must also focus on the business objects that will support the application. Business object design is a bottom-up process starting with low-level objects that describe business entities. These low-level objects are then aggregated together into objects wherein the entities can work together to perform business functions. Finally, these higher-level objects are packaged into one or more interface objects that will perform all of the functionality of the application.
Modeling business processes
Early in the process, while the use cases are being developed, the application server designers must begin to isolate the business objects that will support the use cases. Low-level objects will appear as nouns in the use cases. When a use case includes "a new customer is added," the designer knows that a customer object will be required. "The invoice will be checked to see that each item is in stock" indicates that invoice and item objects must be built. In most cases, the objects will be easy to identify, and many may already exist in the object repository.
Defining the methods and properties of each new business object will also become clearer as the use cases are refined. You can often derive properties from user interface screens. For instance, you can assume the "add new customer" screen will show many of the properties of a customer object. A sample invoice will contain the properties of invoice and item objects. You can also extract methods from the use cases by looking at the verbs acting on the objects. "The new customer is added" can indicate the customer object's constructor, or it could indicate an add or insert method within a customer collection. "The invoice will be checked to see that an item is in stock" can imply that an isInStock method is needed in the item object that returns true if the item is in stock.
Once each business object is defined, it should be documented so it can be presented to the joint design team. Higher-level objects should be translated back into business language so they can be understood by the nontechnical members of the team. The narrative should include the name and the purpose of the object, followed by its properties and methods. Objects should also be laid out on a class diagram using UML or some other object notation. This diagram will show the relationships among the objects and keep them organized in a logical manner.
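As a hypothetical illustration of that mapping, the nouns and verbs of the stock-checking use case might turn into a class like the one below; the property names and the simple stock rule are assumptions, not part of any particular design.

    public class Item {
        private String productId;
        private String description;
        private float unitPrice;
        private int unitsOnHand;    // drives the "is it in stock?" question

        public Item(String productId, String description,
                    float unitPrice, int unitsOnHand) {
            this.productId = productId;
            this.description = description;
            this.unitPrice = unitPrice;
            this.unitsOnHand = unitsOnHand;
        }

        /** Returns true if the item is in stock, as the use case requires. */
        public boolean isInStock() {
            return unitsOnHand > 0;
        }
    }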
Reuse
Reuse is the key to effective business object design. Low-level objects like customers or product items often act in many different roles, while larger aggregate objects such as orders or bills flow through several applications. Even the interface object may be referenced by several different user interface applications. But reuse is not something that can be enforced by the software police. Incentives and repositories may help,
but real reuse will only occur as the core objects become familiar to the developers and make sense within the business framework. In addition to reusability, the business objects must perform the services required by the application. This is often where reusability breaks down. An invoice item must know how to create itself, sending errors back to the application when an item is not in stock or a customer does not have sufficient credit. The error mechanism must be flexible enough to communicate consistently with other objects, enabling errors to be passed back to any application that uses the object.
Design standards
To make all of these objects work consistently, you need a set of comprehensive yet flexible standards. In many cases the middleware will require adherence to component architecture standards or will generate skeleton code to force compliance. In other cases, the development team must agree to a set of standards that can evolve as the application server grows. Standards must include naming conventions, consistent data formats, class hierarchies, exception objects, documentation formats, and a host of other considerations.
Iterative Development
Just as you need a constant flow of communication between the software designers and the end users, you also need similar communication between the designers and programmers (if they are not the same people). A large, complex class diagram can be a pretty sight. It can be aesthetically balanced and logically oriented, with lines flowing between big boxes with little diamonds and triangles. It may be a work of art—but can someone translate this artistic masterpiece into down-and-dirty code? The only way to find out is to let the programmers start programming as soon as possible.
The key to success is short, small design (code) review cycles. A programmer can often discover design flaws, inconsistencies, and awkward programming problems that are easily overlooked by the designer. A difficult programming construct can be cleaned up easily if caught early, but after other objects have begun to depend on this construct, the problems will be far more difficult to repair.
Why combine design and programming?
Design and programming are really two different views of the same process. Design is a high-level view using symbolic abstractions such as charts, diagrams, and textual narratives. Programming simply translates these notations into a form that the computer understands. In the early days of programming, the process of translation was a highly specialized, technical skill requiring thousands of lines of detailed assembly or compiler code. The task was time-consuming and it made logical sense to divide the labor. The most experienced developers worked out the software design while coding was delegated to a large number of programmers. Today, most programming is done at a much higher level of abstraction using GUI builders, code wizards, and CASE tools that integrate design and programming. These tasks are so tightly integrated that it makes sense to let the same people perform both tasks.
In spite of all the coding wizards and GUI builders, programming is still difficult work. Just because the appropriate charts are drawn in the CASE tool and the screens are laid out with the GUI builder, that does not mean programming is complete. Object-oriented programming still requires technical skills, careful attention to detail, and long hours of testing. But programming is easier today than it was when writing assembly language was necessary, and as a result, the distribution of time for each task has shifted dramatically. Iterative design also changes the sequence of these tasks, interweaving short repetitive cycles of design and programming. These tasks each have their own purpose but are integrated so tightly that it makes sense to assign both to the same person.
Self-directed technical review
The emphasis in iterative development is on shortening the cycles between design and coding. Instead of trying to work out the entire system design, start with a single use case; then throw together a rough design. Specify a few critical classes and determine their relationships, then start programming. If something does not work or is difficult to program, go back and revise the design to solve the problem. Continue to bounce back and forth between design and programming until the code meets the requirements of the use case.
By using shorter iterative cycles, programming will quickly determine if the design will work. The process acts as a self-directed technical review. Programming will quickly isolate design flaws and often reveal alternative design constructs that may have been overlooked when laying out the class diagrams. Programming can also reveal constraints of language or middleware that couldn't be detected while working out the design. Conversely, the experience gained in programming will speed up design on subsequent use cases. The designer will remember constructs that did not work, and presumably he will not make the same mistake a second time. The capabilities and limitations of the individual objects will be known, helping to isolate changes and enhancements that address the new requirements. Reuse will also be enhanced by knowledge of the objects already available.
Design Constraints
Another reason for iterative development is that the application server model requires several skills that will be new to traditional client/server and mainframe programmers. As seen in the last chapter, the software is designed using a layered approach. Each layer has its own tasks, and much of the work involves communication and interface between layers. New tools and middleware products also impose constraints and new programming techniques to be mastered, and these will influence the way the application is designed. Finally, since most programs must communicate with other systems, application integration also places restrictions on the design.
Layered design
Multi-tiered, layered software design is both a blessing and a curse. Since each layer has its own responsibility, the developer can concentrate efforts on that one task and disregard the others. When developers consider user interfaces, they do not have to be concerned with how to access the data from the relational database. For new developers, this concept can be a difficult one to master. Two-tiered client/server development was based on mapping the user interface to the database, and all functionality resided in the same place. This new development approach splits out these
functions into separate layers, and it can be difficult for someone not familiar with this approach to envision these pieces separately.
Another difficulty lies in developing the interfaces between the layers. A large portion of the design effort now involves isolating the interfaces and determining how they should be structured. Instead of thinking about how to access data from a relational database table, the developers must now consider what services will be required from the application server to retrieve the data needed on the user interface screen. Once the service is designed, the developers must also consider how the middleware can be used to communicate this service request.
Middleware matters
The middleware architecture also complements and constrains the design process. Middleware choices are often restricted to the package that management already purchased or the few packages that run on the existing mix of hardware and network. In most cases, a good object-oriented design will work no matter which middleware architecture is chosen; it just has to be translated to an alternate physical design.
While working on the design, use the middleware services to their full advantage. Naming, life cycle, persistence, and transactions can save quite a bit of programming time and create a far more robust application. Also try to find out as much as possible about the architecture and internals of the middleware to get the best performance possible. Middleware relies on network communication, which is much slower than local processing. Keep communications between machines to a minimum and use concurrent processing as much as possible.
One major constraint in most middleware architectures is the set of object or component model requirements. Restrictions may include lack of state variables, a remote procedure model instead of object orientation, limited support for certain data formats, or other similar restrictions. You should catalog these and add them to the design standards prior to starting the object design. Disregarding these requirements in the design phase will quickly lead to programmer revolt.
Integrating existing applications
Interfacing to existing applications can be difficult. Documentation and source code may be difficult to locate or may not even exist. The guy who wrote the original software may have moved to a better climate without leaving a forwarding address, or the code may have been written by an outside vendor who is no longer in business. Even if the problem was caused by none of the above, the code will seldom work within the middleware architecture.
Before spending a lot of time on integration, determine if the interface is really needed. If the users request changes ("it would be nice to..."), determine the costs and benefits of doing the integration. Look at the amount of data transferred; it may be easier to let someone occasionally key the data into the other system. Find out how much time is currently spent on this particular operation and how much time and money would be saved by automating the process.
If the interface is necessary, determine the easiest way to access the data. Try to keep the interface as small as possible. If the data is coming from a legacy application, set up a persistent object to retrieve the data directly from the data source, or set up some form of replication. If data must be entered, try to find some existing input process that can be redirected to accept data.
A Brief Introduction to UML Notation
Effective software design requires both conceptualization and communication. Conceptualization is the ability to visualize part or all of the design, from the big picture down to the minute details. Communication, in this context, is the ability to replicate at least part of this conceptualization into someone else's head. Neither is an easy task, because object-oriented technology adds several additional levels of abstraction as well as much finer granularity. N-tiered client/server makes this even more difficult, because the processes are synchronized across many different computers.
Since the mid-1980s, gurus of object-oriented software design have been developing graphic notation methods to make it easier to conceptualize object-oriented design. For many, it is much easier to visualize a picture or diagram; so graphic notations were chosen over text or program
code. Unfortunately, each used different symbols for the same concepts, and, unless everyone understood the specific notation, the methods helped individual conceptualization but got in the way of effective communication. Finally, over the past few years, three industry leaders in object-oriented software design, Grady Booch, James Rumbaugh, and Ivar Jacobson, joined forces to standardize the notation into the Unified Modeling Language (UML). This has quickly become a standard notation for object-oriented modeling and has been adopted by industry groups such as the Object Management Group (OMG), the same group responsible for the CORBA specification.
In addition to the standards organizations, the UML has also been incorporated into several CASE tools. With these tools, the software designer creates the diagrams, then the tool automatically generates skeleton program files in C++, Java, IDL, or another language. Many also provide "round-trip" engineering features that read changes made to the program files, then update the graphical diagrams. This ensures the model stays up-to-date with the software without tedious revisions to the diagrams.
Throughout this book, UML will be used to communicate software design concepts. The rest of this section is a quick overview of the UML diagrams and notation. UML is a powerful notation language, and a complete discussion is far beyond the scope of this book. For a detailed yet readable discussion, see UML Distilled (Fowler and Scott 1997).
Diagrams and symbols
UML is a notation, not a design methodology. It is a language of diagrams and symbols that describe a detailed software design. Similar to construction blueprints, it can be used by both technical and nontechnical people to communicate design ideas in a graphical manner. A building tenant may not understand all the symbols and underlying technical information represented by a blueprint, but he can still visualize the building layout. In the same way, an effective set of UML diagrams will convey the general complexity, structure, and usage of a software application.
A UML model begins with use case diagrams that describe how the
software interacts with the outside world. This is followed by one or more class diagrams that show the class definitions and their relationships and associations. Sequence diagrams describe the flow of information between objects and ensure the appropriate methods have been assigned to the correct objects. UML also includes many other useful tools including collaboration, state transitions, activity, and deployment diagrams, but these will not be included in this discussion.
Use case diagrams
Use case diagrams graphically illustrate the interaction between the actors and use cases. An actor, represented by a stick figure, is any person or external software system that receives value from the use case. Each use case, represented by an oval, briefly describes a task that is performed by or interacts with the actor.
Figure 3-1 shows a typical use case diagram. The actor called customer receives value by placing an order. The «uses» arrow indicates that the "place order" use case will rely on the "check credit" use case to check the customer's credit. The «extends» arrow shows that the "credit denied" use case will extend the "place order" use case when the customer's credit is insufficient to place the order. The "place order" use case also relies on the "check inventory" use case.
The use case diagram helps visualize and organize the use cases but does not provide any information other than the name of the use cases. Each use case on the diagram should be followed up with a narrative description listing procedures, exceptions, and results. Use cases will be covered in much more detail in Chapter 5.
Class diagrams
The class diagram shows each object or class definition in visual form with a variety of different arrows indicating relations and associations between objects. Each class is represented by a box divided into three sections listing the class name, its properties, and its methods (see Figure 3-2). The class Customer has properties name, address, city, state, zip, phone, and creditLimit. Notice that the constructor and destructor methods are not included in the class diagram, because these are implied with each
Figure 3-1. Use case diagram
class Customer {
    private String name;
    private String address;
    private String city;
    private String state;
    private String zip;
    private String phone;
    private float creditLimit;

    public boolean CheckCredit(float n) {
        return n <= creditLimit;   // placeholder: amount is within the credit limit
    }

    public void Display() {
        // placeholder output of the customer's attributes
        System.out.println(name + ", " + address + ", " + city + ", "
                + state + " " + zip + ", " + phone);
    }
}
Figure 3-2. Class diagram and Java representations of a single object
class. The CheckCredit and Display methods are listed below the attributes. Type information is optional and can be omitted unless there is a logical reason for clarifying the details. Parameter lists are often omitted for methods unless there is a critical parameter that needs to be emphasized. Standard notations are also available to indicate the access level of properties and methods (public, private, or protected), but these will not be used in this book.
An association is a logical link between two classes, usually through a key value that points to another object, or through an address pointer in languages like C++. Association provides navigation from one object to another. The association may be either one-to-one or, using a table or list, one-to-many. Figure 3-3 shows a Customer object associated with an Order object; a single Customer may be associated with many Order objects. The 1 next to the Customer indicates that there is only one Customer object per Order, and the * next to the Order indicates that each Customer can be associated with many Orders. The arrow pointing toward the Customer indicates that the Order object has a logical pointer that can locate the Customer, but the Customer object has no knowledge of the Order objects.
Figure 3-3. Class association
Composition, often called a whole-part relationship, is a way to show that one class is an attribute of another class. Composition is indicated using a line with a solid diamond next to the composite object. In Figure 3-4, an Order object contains a Shipping Address object. A class definition for the Order object would include a shipAddress property of type Shipping Address.
Aggregation is somewhere between association and composition. It is stronger than association, but can be implemented programmatically the same way. Figure 3-5 shows the aggregation of contacts for a customer. There may be many contact events for a customer, but the information within the contact would not be relevant to any other customer.
Generalization, or inheritance (in C++ and Java terms), is the process of extending one class to make a new, more specific class. All of the public or protected properties and methods of the superclass are available to the subclass, but the subclass either adds new properties or methods and/or redefines one or more methods to change the superclass's behavior. In Figure 3-6, the Corporate Customer class is derived from the Customer class, with the addition of a contactName property and a revised authorizeCredit method (this method may allow a 10% overrun on the
The diagram shows an Order (orderID, date; ShipOrder, CheckStatus) containing, by composition, a Shipping Address (name, address, city, state, zip, phone; Display).

Figure 3-4. Class composition
The Corporate Customer class now has all the functionality of the Customer class plus the contactName and a more lenient credit authorization method.

The class diagram can illustrate design concepts in a simple, readable manner. Figure 3-7 shows a class diagram that illustrates an order. The Order object shows a composition relation with the Customer object and an association with a collection of Item objects. In simpler terms, the Order object contains a Customer object and a list of Item objects. The Order object provides methods to add (addItem), drop (dropItem), and navigate (findFirst, getNext) among the list of Item objects. There are also methods to print or display the entire order. Note that although the diagram appears relatively simple, this is a moderately difficult programming task. The diagram indicates that the Order object will contain a list of Item objects that can be accessed using sequential navigation. Logical order is not specified in the class diagram, but some ordering key is very likely, and the add and drop methods will have to accommodate this logical order. Figure 3-7 is a simple diagram with only three objects, but most class diagrams will have more objects than can fit on a page.
The diagram shows a Customer (name, address, city, state, zip, phone, creditLimit; AuthorizeCredit, Display) aggregating many Contact objects (date, notes; Display).

Figure 3-5. Class aggregation
The diagram shows a Corporate Customer (contactName; AuthorizeCredit, Display) derived from a Customer superclass (name, address, city, state, zip, phone, creditLimit; AuthorizeCredit, Display).

Figure 3-6. Class generalization
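As a rough illustration of this generalization in Java, the sketch below uses the class members shown in Figure 3-6; the method bodies and the exact 10 percent overrun rule are assumptions, since the book gives only the diagram.

    class Customer {
        protected String name;
        protected float creditLimit;

        public boolean authorizeCredit(float amount) {
            return amount <= creditLimit;              // regular customers must stay within the limit
        }

        public void display() {
            System.out.println(name + "  credit limit: " + creditLimit);
        }
    }

    class CorporateCustomer extends Customer {
        private String contactName;                    // new property added by the subclass

        @Override
        public boolean authorizeCredit(float amount) {
            return amount <= creditLimit * 1.10f;      // assumed rule: a 10% overrun for corporate customers
        }
    }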
A diagram can be subdivided into several sub-diagrams, each showing one portion of the system that represents one or more logical subsystems. When this occurs, the same class may be displayed on several diagrams, showing only the class name with no properties or methods on subsequent pages.
Sequence diagrams The sequence diagram shows how the methods of several objects interact to perform one or more use cases. This diagram is an excellent tool for checking the completeness of an object design and can help discover missing methods and incomplete or poorly designed objects. Although any number of drawing programs can be used to create sequence diagrams, a UML-based CASE tool such as Rational Rose will make the process much easier. These tools do not allow inconsistencies between diagrams, such as method calls in a sequence diagram that are not listed in the class diagram.
The diagram shows an Order (orderID, orderDate, shipDate, totalPrice; addItem, dropItem, findFirst, getNext, printInvoice, display) containing one Customer (name, address, city, state, zip, phone; Print, Display) and a collection of Item objects (productID, name, description, unitPrice, units; getPrice, print, display).

Figure 3-7. Composite class diagram
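A minimal Java sketch of what Figure 3-7 implies is shown below. The method names come from the figure, but the internal list, the navigation cursor, and the ordering behavior are assumptions; a real implementation would also maintain the logical ordering key mentioned above.

    import java.util.ArrayList;
    import java.util.List;

    class Customer { String name; /* address, phone, and other properties omitted */ }

    class Item {
        String productID;
        double unitPrice;
        int units;

        double getPrice() { return unitPrice * units; }
    }

    class Order {
        private Customer customer;                          // the Order contains its Customer
        private List<Item> items = new ArrayList<Item>();   // the list navigated by findFirst/getNext
        private int cursor;                                  // current position for sequential navigation

        public void addItem(Item item)  { items.add(item); }    // an ordering key could be applied here
        public void dropItem(Item item) { items.remove(item); }

        public Item findFirst() { cursor = 0; return getNext(); }
        public Item getNext()   { return cursor < items.size() ? items.get(cursor++) : null; }
    }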
A sequence diagram is constructed by listing the objects as blocks across the top of the page with broken lines descending from each object. Bars are then drawn between the broken lines to represent method calls from one object to another. The tail of the line represents the object that is calling the method, while the arrow end of the line points to the object that implements the method. Parameters may be included to help convey object flow, but are usually left off to simplify the diagram. Iteration is represented by an asterisk (*) preceding the method call.

Figure 3-8 is a simple sequence diagram that implements the "Place Order" use case. See Figure 3-1 for the use case diagram and Figure 3-7 for the corresponding class diagram. The user interface begins by creating a new order, which causes the Order object to create a new Customer object. As each item is ordered, the User Interface object first creates the Item object, then uses the addItem method to insert it into the list inside the Order object. A comment is placed above the new method to indicate that this process will be repeated for each new item, and the asterisk (*) also indicates which methods are repeated.
The diagram shows the Customer actor (Places Order) and the User Interface, Order, Customer, and Item objects. The User Interface creates the Order with new(), and the Order creates its Customer object. For each item entered, the User Interface calls *new() on Item and *addItem(item) on the Order. The User Interface then calls display() on the Order; the Order calls display() on the Customer, findFirst() on itself, and, for each item in the list, *item = getNext() on itself and *display(item) on the Item.

Figure 3-8. Sequence diagram
Once all new items are added, the user interface calls the Order object's display method, which displays the entire order. When the display method is called, the Order object first calls the Customer object's display method, then calls its own findFirst method to locate the first item in the list. Note that the arrows for the findFirst and getNext methods turn back towards the Order object's line. This indicates that the Order object is calling its own methods. Once findFirst points to the first item, getNext can retrieve each Item object in sequence and then call the Item's display method. This is repeated for each item. Once all items are displayed, the Order object's display method may display totals or other information,
but since this is performed within the display method and does not require a separate method call, it is not shown on the sequence diagram. Use sequence diagrams to test how well an object design can perform a use case. Building a sequence diagram will often locate missing methods and clear up relationships between objects. As the diagram is built, make sure that there is a navigation path between the objects. In the above example, there is an aggregation relationship between the Order object and the Customer object. If this relationship did not exist, the Order would not be allowed to call the Customer's display method since the Order object could not reference the Customer object. As with all UML diagrams, the sequence diagram is time-consuming and can easily bog down in too much detail. The diagrams are tools to visualize and communicate, so do not expect to diagram the entire system. A lot of pretty pictures may look nice and put a smile on the auditor's face, but users want working software, not pretty pictures. Use the diagrams to rough out the design, and then let the programmers start writing code. As problems arise, use the diagrams to focus the discussion and revise and redraw them as needed. The emphasis should be on getting the code right, rather than the pictures. If pictures are needed, use the round-trip feature of the CASE tool to generate pictures once the code is finalized.
Meeting the End User's Needs It may be easy to get caught up in the technology, but the goal of application server design is to create information tools that meet the end user's needs. These people have a job to get done and they often do not share your enthusiasm for distributed objects, transaction servers, or message queues. They will, however, show their hostility if the transactions fail or the messages stop queuing. Make sure the technology selected supports both the software developer's and the end user's needs transparently. The JAD process will go a long way towards meeting this goal. Make sure the people who really know the application are involved with this team. These are the people who actually do the work; the supervisors are sometimes not familiar with the application. Get users on the team whenever possible. If this is not possible, make sure to get their input and let them review the use cases in which they are the actors. Also
remember that these users have their own jobs to do; work with the company's supervisors to balance the users' time so they do not get behind in their own work. Remember that this is a business process that goes far beyond the Information Technology department. Also, make sure from the beginning to structure the project in such a way that changing requirements do not slow down or stop development. Just as new features are added incrementally, allow time for requirement changes to be implemented incrementally. In a recent interview, Grady Booch, one of the designers of UML, said: "Our view of the world is, we guarantee you won't get your analysis right. This is a given. Plan on it. You need a process that allows you to manage the risk of failure and incrementally improve your understanding of the world over time" (Zamir 1998).
Finally, throughout the project, try to keep things in perspective. There is more to life than software design. Using the JAD team approach will produce better software, but as with any team approach, time will be spent in compromising and resolving conflicts. Be willing to fight for and defend your ideas, but also be ready to compromise and listen to others. Keep the focus on building good software.
Summary

Application server design is best approached as an iterative process, beginning with a simple concept, then adding refinements and extensions to meet all of the business needs. The following are some guidelines that can be used to approach application server design:

• Form a joint application design (JAD) team, consisting of both software developers and end users, that can work together through the life of the project. This approach keeps the project focused on the needs of the users.
• Develop use cases that define interactions between the users and the application.
• Design business objects that model the business entities and processes.
• Use iterative development. Quickly build prototypes based on each use case, then review them with the JAD team, refining them until they meet the needs of the users.
• Make sure that all constraints are known before beginning design. Middleware and application integration issues can have serious implications on the way software is built.
• UML, a graphical modeling notation for object-oriented design, can greatly enhance conceptualization and communication.
• Focus on meeting the needs of the business and the end users.
References

Fowler, Martin, and Kendall Scott. UML Distilled. Reading, Massachusetts: Addison Wesley Longman, 1997.

Zamir, Saba. "Interview with Grady Booch—Taking UML from Innovation to Usage." Component Strategies, August 1998: 15-20.
Chapter 4
Service Interface Design

To those outside the application server team, an application server is just a set of services that support the user interface programs. The user interface collects data and then sends it to the application server, where the data is processed. Depending on the result, the application server returns either the requested data or an error message. The user interface programmers do not need to know how the service interface does its job, only that it works according to the specifications. This is the goal of a good service interface design. The implementation details should be irrelevant to those working with the services. The services are well defined and documented and the results are understood, but only the application server programmers need to know how the results are obtained.

This chapter will examine how the service interfaces are designed, from use case analysis through design specifications. The topics covered will include:

• What is a service interface?
• Design by interface
• More on JAD: developing use cases
• Turning use cases into services
• Building services out of business objects
What Is a Service Interface?

A service interface is more than just a list of function calls specified by the user interface programmers. Each interface should contain a set of standardized services that not only make sense within the context of a single application, but conform to an organization's standard application architecture. This requires each service to conform to standard naming conventions and use consistent parameter-passing and exception-handling protocols. The user interface programmer should be able to take a new service interface and quickly and easily integrate it into an application with a minimum of research and testing.

The advantage of utilizing interface design is that it has little to do with program code. An interface object does not contain any program code; it only specifies services and protocols. The interface object describes what can be done, not how it is implemented. As such, an interface is defined in documentation, not in program code. The documentation should specify each service, the task that the service will perform, the parameters passed to the service, what data or object is returned, and any errors that may be passed back as exceptions.

Figure 4-1 shows a sample format to document an interface definition. This form includes the name of the interface, a short one- or two-sentence summary describing the purpose of the interface, and a summary of each of the services. In addition to the overview, each service should be listed in detail, describing the function, the calling protocol, the return value, the parameters, and any exception handling. Optimally, this information will be stored in some form of interface repository accessible online with multiple search capabilities.
Design by Interface Designing a good service interface is a fairly straightforward process. Once use cases are developed, you can specify the forms, reports, and processes to support these requirements. The forms and reports become the basis for designing user interface programs, and the processes dictate the services that must be called by the user interface. Once you design the user interface, the service requirements will become readily apparent.
Loan Calculator Interface Definition

Interface: LoanCalc

Description: Loan calculation routines to determine monthly payment amounts and maximum loan amounts. Uses simple calculations for customer inquiries.

Used by: Customer inquiry Web page

Services:
1. getPayment—Calculate loan payment from principal, interest and years.
2. getPrincipal—Calculate maximum principal from monthly payment, interest and years.

getPayment
Protocol: pmt = getPayment (prin, intr, years)
Returns: Payment amount in dollars (double)
Parameters:
prin—Principal amount in dollars (double)
intr—Interest rate in percent (double, 6.5 = 6.5%)
years—Time in years (double)
Exceptions:
Throws remote exceptions
Returns 0 if intr or years = 0
Use cases: Customer Loan Inquiry ...
Also in interfaces: ARMLoanCalc

... (repeats for getPrincipal service) ...

Figure 4-1. Sample interface definition
One of the primary advantages of design by services and interfaces is that the implementation can remain fairly abstract. At this point, the task is to determine the services needed, not how the services will be performed. Sketches of the user interface form will indicate the data items available; then the use cases will describe what actions must be performed using this data. When a user interface program receives a command request, a service request is passed on to the application server. These command requests make up the list of services that must be specified in the service interface. This list of services is now the starting point for the design of an integrated service interface. Because the same services are often required in more than one application, you can combine similar services into common groups shared by several applications. Once you define the services, you can then aggregate them into application-specific service interfaces.
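To make the idea concrete, the LoanCalc interface documented in Figure 4-1 might eventually be rendered as a plain Java interface (or as middleware IDL). The sketch below is only an illustration of an interface that names services and protocols without containing any implementation code; the use of Java RMI types is an assumption, suggested by the "throws remote exceptions" note in the figure.

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // An assumed Java rendering of the LoanCalc interface documented in Figure 4-1.
    // The interface names the services and their protocols; it contains no implementation.
    public interface LoanCalc extends Remote {

        // Monthly payment from principal, annual interest rate in percent, and term in years.
        double getPayment(double prin, double intr, double years) throws RemoteException;

        // Maximum principal from monthly payment, annual interest rate in percent, and term in years.
        double getPrincipal(double pmt, double intr, double years) throws RemoteException;
    }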
More on JAD: Developing Use Cases A use case, in its simplest form, is a step-by-step description of how a person interacts with a computer. It explains the context of the interaction, i.e., why the person is performing this task and the steps involved from start to finish. It will also describe exception conditions and what happens when the exception occurs. It is written in business language, understandable by both the business users and the software developers. Ivar Jacobson, one of the codevelopers of UML, has advanced the concept of use cases as a foundation for software development (along with iterative development and tight version control). In an article in Component Strategies (Jacobson 1998) he shows how use cases can replace most requirements specifications, drive the design process, and be used to create test cases once programming is completed. By approaching requirements analysis within the context of job tasks, you isolate the most important features while minimizing all of the blue sky requirements that are not actually needed. You can then develop incrementally, building and refining a few use cases at a time. Once you've collected the complete set of use cases, they provide an easy-to-use, yet comprehensive, set of software specifications. The techniques involved in deriving use cases and gathering requirements are far beyond the scope of this book, and many good references are available (see Further Reading at the end of the chapter). Here are
some of the characteristics of good use cases that will aid in the development of application servers: • Describe the context • Describe the actors • Describe the procedure • Describe exceptions • Use common language • Iterate and refine
Describe the context As work becomes fragmented across an organization, employees sometimes lose focus and the reasons for doing a task sometimes become obscured. Managers look at the overall process without worrying about the details, while the employees perform detailed tasks without quite knowing how their actions fit into the process. This same problem can occur when approaching use cases. Without the context or big picture, important details may be omitted or ignored. Back in the days of mainframes, an employee at a managed care organization used to get a report once a month listing people who were within three months of reaching age 65. The employee would get out colored highlighters and color each line either pink, yellow or blue based on the birth date. She would spend three to four hours each month coloring the ten- to fifteen-page report. She even told her friends that this was one of her favorite jobs. As we were beginning to review Medicare processing for this organization, we found this job task and were somewhat surprised. The reason for all the coloring was that she was responsible for sending out three separate mailings to remind people to enroll in Medicare before their 65th birthday. This involved an initial mailing and two follow-up letters. By changing the sort order of this report by birth date (a change to a couple lines of a MARK IV program, an early database language), we eliminated a half day of coloring and manual sorting. When the report was originally created, an enhancement form was passed to a programmer asking for a report listing people approaching
age 65 who were not enrolled in Medicare. Since the programmers were always backlogged, the report was quickly thrown together and passed back to the Medicare department. Had someone spent the time to investigate why the report was needed (the business context), countless hours of coloring and sorting could have been saved. In this example, an understanding of the underlying process would have revealed the need to order the report by letter type.

In addition to process context, the business context and workflow context can also be helpful. Use cases examine individual job processes, but to understand these processes fully, developers need to know how the use cases relate to each other and how they relate to the way the company does business. Before they examine any use cases, the JAD team should discuss how the application fits into the overall structure of the business. In many cases this may appear obvious, but by spending a few minutes focusing on the "big picture," you can eliminate false assumptions before they cause problems. Discuss the role of the department, how it supports the overall business process, whether it directly supports the customer or whether it supports other business units, what products or services are delivered, and so on. Depending on the formality of the project, these functions may need to be documented.

Once you've examined the business context, move on to how the application fits into this context. How does the application support the responsibilities of the department? If this is a workflow process, itemize the steps that are performed. Define the participants. Who initiates the task? Who performs the work? Are there additional people that must be contacted? Does the work pass from one person to another? Who receives the finished product? Examine what flow of information exists between steps, what questions are asked of the customer, and what documents are sent or received. Often, workflow analysis tools or flowcharts can be used to help document the process. Once this is drawn out, isolate the steps that will be performed by the new application; this will point you towards the use cases that need to be developed.

Once the JAD team understands the context, it is much easier to begin to develop the individual use cases. Reference the workflow steps in the use case narratives and add a brief description at the beginning of each use case to describe how it fits into this overall flow.
Describe the actors Like the business context, knowing the people or actors who perform these activities will make the process much more understandable. Employees, customers, even external computer systems all have roles and responsibilities that either empower or limit the actions to be performed. Each employee has certain job domains, responsibilities, and authorities and can be expected to act within these constraints. Moving outside of these boundaries can cause serious problems. In addition to roles and responsibilities, each actor has to have some motivation for performing these tasks. In use case terminology, the actor must gain value from the use case. When the Medicare clerk colored the report, she gained value from the color coding because it helped her collate the letters. When a customer places an order on the Internet, she gains value from exchanging her money for merchandise and also gets the added premium of convenience. Whatever the motivation or value, this should be documented in the use case. Begin by isolating the actors and give them names to describe each one's role within the process. Mary may be the person who sends out the Medicare letters, so list Mary as the actor; but also describe her role as the letter collator. Once the actors are isolated, describe each actor's role in the use case and describe why each performs her job.
Describe the procedure

Every use case needs to present a logical, sequential description of how the task is performed from start to finish. As the use case is first developed, this description may have some ambiguities and may miss some details, but as it is refined, these items can be clarified. As already stated, it is impossible to get it right the first time, but since everyone on the development team knows that this is an iterative process (keep reminding them), the initial use cases are still useful tools for software design, and the programmers can begin to develop the initial prototypes. As the prototypes begin to take shape, the JAD team must review them to ensure that they match the use case requirements. Step through the procedure with the prototype to test both the software and the procedures and revise both in parallel. It is important that the use cases reflect the procedures.
The software developers will use the use cases as the basis for software requirements, program specifications and test plans, so if there are inaccuracies in the procedures, the software will not meet the business needs. When describing the procedure, itemize the steps in a sequential, logical manner. State who the actor is, each step performed, any decisions that have to be made, the source of each one's information, and so on. In the case of the Medicare letters, the steps may include how Mary requests the report, what information is included on the list, the criteria for each letter, who receives the letters and what messages are communicated, how Mary addresses the letters, and what the desired result of each letter will be.
Describe exceptions

While developing the procedures, you may discover a variety of exception conditions. A customer may have insufficient credit to complete a purchase, or a network connection from San Francisco to Atlanta may not always be available. All common exceptions must be listed and procedures must specify how to handle these problems. The procedures and exceptions are the foundation for the business rules that will be coded into the software, so exceptions relating to each use case should be documented. Once the basic procedure is itemized, each step should be examined to determine where errors and exceptions may occur. Some exceptions may be trivial and may be annotated directly in the procedure. Others may be serious enough to warrant an extension that can be examined as a separate use case (the «extends» notation in the UML use case diagram). Examine each exception to determine how it will impact the procedure narrative. Remember that it is impossible to anticipate all of the exceptions. Spending too much time in exception analysis will be counterproductive. It is important to specify only the common exceptions, since it is easy to overload a use case with rare and exotic problems that may never occur. Also, as new use cases are developed, the JAD team will discover exceptions that were not addressed in previous use cases. As these are found, determine if these errors could occur in other use cases, then revise those use cases to include the exceptions. Document exceptions as separate steps in the use case procedure. In the above example, Mary requests the report. Insert an additional step after this to have Mary check that the report printed. If it did not print,
have her consult the "Printer errors" use case that describes how to resolve a printer error. This would be a new use case that would extend any use case that creates a printed report.
Use common language The goal of the JAD process is to create software that solves business problems. There is always a tendency to get wrapped up in the technology, and soon even the business people involved in the JAD team will be speaking in acronyms and buzzwords. This is fine as long as it does not overrun the use case descriptions. A use case must define a business process, so use business terminology. The technical language belongs in the program code, not in the use cases. At the same time, make sure that the technical people understand the business language. Just as the business people will adopt technical terms without really understanding their meaning, the technical people will begin to use the business terminology without full comprehension of what these words mean in the business context. Often, a project glossary can help clarify both business and technical terms. Just be careful that developing the glossary does not overshadow the development of use cases.
Iterate and refine Again, there is no way that anyone can get the business requirements right the first time. Create the use cases based on what is currently known, then quickly create software that reflects the use cases. Review the software and use cases in parallel and continue to refine both. How many times have you heard a user say "I'll know what I want when I see it"? This may be frustrating to a developer, but without experience and training in software design, the end users do not have the knowledge base to envision their needs. A simple prototype will often solve this problem. With constant refinement comes the need for revision tracking and change management. Just making sure that everyone has the most current version of every use case may be a project in itself. You must address revision tracking early in the JAD process and establish procedures to ensure that everyone stays current. Revision tracking can also provide quick recovery when a revision has moved in the wrong direction.
A brief example To illustrate these principles, Figure 4-2 documents a sample use case for the Medicare report described throughout this section. The use case begins by describing the context, stating that Medicare requires each member who is approaching age 65 to apply for Medicare benefits. It then explains why the letters are needed, describes the value gained by both the company (and indirectly the person collating the letters) as well as the members, and specifies the procedure used to generate the report and the contents of each mailing. Next, the use case lists the actors as well as the value gained by each. The company receives a monthly payment from Medicare while the member has to pay a much lower premium. Next, the use case describes the procedure step by step, starting with how to request the report, followed by the steps used to select letters for each member. The use case also briefly mentions what happens to the letter when it is returned by the member, referencing the use case that describes this process in detail (this is part context, part procedure). Since the process is relatively straightforward, the only exception described occurs when a member has already passed their 65th birthday (code 0). Additional exceptions could be included to describe printer errors, but these can be handled through an extended use case that could be shared by any use case that performs printing functions. Notice, too, that the use case is written primarily in business language, understandable by the person performing the work. A use case can always be refined, and a couple of iterations would ensure that the use case follows the actual procedure. The first iteration would be a review by Mary and other people in the Medicare department. Since they perform the task, they will best know the procedures. Once this is done, it would be a good idea to spend some time discussing process improvement. If Mary is hand-addressing the envelopes, it may make sense to also print mailing labels or, depending on the volume, print the letters themselves in a format that can be easily stuffed into a window envelope. A quick check of the number of members who receive follow-up letters and calls could also determine how effective the letters are and if there is a need to revise them.
USE CASE: Medicare Reminder Letters

Within three months prior to each member's 65th birthday, the federal government requires that each member submit an application for Medicare coverage. Since our organization receives a substantial monthly payment to cover the member's benefits, it is to our advantage to do what we can to remind the members to fill out these forms. Since the member's premiums are also greatly reduced, it is to their advantage as well.

At the beginning of each month, someone from the Medicare Services Department (usually Mary) requests the Members Approaching Age 65 report from the reports menu of the Medicare menu of the membership program. The report will list all members who are within three months of turning age 65 and who have not yet applied for Medicare. Each line is annotated with numbers 1, 2, or 3, indicating how many months until their 65th birthday. Members who have already turned 65 are annotated with 0.

Depending on the number 1, 2, or 3, the member will receive a mailing that includes one of the following three letters as well as a membership application form. Members who are three months prior to age 65 will receive an initial letter stating the advantages of signing up for Medicare. Members annotated with a 2 (2 months before their 65th birthday) will receive a second notice reminding them to fill out the form, emphasizing the reasons for submitting it. Members annotated with the number 1 will receive a more harshly worded letter reminding them to submit the form. The few members annotated with 0 are contacted by phone to remind them that they must complete their application.

Once the form is filled out by the member, it is mailed back to the Medicare Services department where it is entered into the computer, then forwarded to HCFA (Medicare). These steps are covered under the Receive Medicare Application use case.
Figure 4-2. Medicare Reminder Letters use case
Making use cases work

A strong foundation of use cases will go a long way towards good software development and make the JAD process effective and efficient. Start with the context, addressing both the business and the actors. Make sure everyone is speaking the same language, then define the procedures and exceptions. Constantly revise the use cases to reflect changes in business requirements and to stay in parallel with the software prototypes. Continue to develop new use cases until they completely solve the business problem.
Turning Use Cases into Services Since each use case will describe an interaction between actors and the computer, the JAD team's next task is to design user interface screens. Interactions usually follow a pattern in which the user enters some information, requests a service, then receives a result or an exception. The JAD team must determine what data items are needed to perform each service request and how to prompt for these items. The team must also determine the actions that will trigger each request and specify the information displayed after the request occurs. Services can often be categorized as data retrieval, locating and displaying information; or as transactions, performing a series of data transformations. In both cases, the request is augmented with data that specifies or refines the scope of the request. A customer enters both a vendor and product name to request pricing and availability, or a bank teller enters a customer's account number and the amount before posting a deposit. The screen layout design specifies the service request options and the data used to define the request. The user interface designers must also specify the results and how they will be presented to the user. For transaction requests, a simple message box summarizing the actions performed lets the user know that the request completed successfully. For data retrieval operations, the specifications should list each data item, as well as the order of presentation when multiple items are selected. While considering the results of a retrieval, keep in mind that one retrieval request will often produce data that must be acted on, triggering a subsequent request. A request that looks up a customer by last
name will, in most cases, return more than one customer matching this last name, so the next step in the procedure will be to have the user select a specific customer. Once the user selects the correct customer, an additional retrieval request may occur, or a transaction may be posted against this customer's account. Each of these operations will invoke additional service requests.

Once the user interface is specified, make a list of all of the required services, specifying the input parameters from the user interface, the operations required, the results that must be returned, and a list of possible exception conditions. Although the descriptions should be written in business language, these are program specifications and some computer terminology will be required. While the use case is aimed at the business people, the interface design should be targeted towards the software developers. Also remember that interface design is not the place to worry about implementation details. The process should be specified in functional terms, describing the outcome of the service request, not the procedural details needed to produce the outcome.

The service request introduced at the beginning of the chapter and repeated in Figure 4-3, used to calculate the loan payments, illustrates these concepts. The input parameters are the principal amount, the interest rate, and the length of the loan. The process is to calculate the monthly payment amount. The returned result is the monthly payment amount. Exceptions will occur when either the interest rate or the time period is zero. Notice that there is no description of how the calculation is made. The specific formula does not matter at this point, since the user interface's responsibility is only to provide the payment amount, given the information about the loan. Later, when it is time to specify the business objects, a loan calculator object will be specified that will include specific calculation formulas.

In addition to specifying the input parameters, the process, the results, and the exceptions, services should follow a number of standards. These guidelines make services more accessible to user interface programmers and make design and coding easier for application server developers. By providing standardized names and parameter lists, user interface programmers do not need to spend as much time looking up and researching the services. At the same time, standardizing services will allow more consistency for the server-side developers.
getPayment
Protocol: pmt = getPayment (prin, intr, years)
Returns: Payment amount in dollars (double)
Parameters:
prin—Principal amount in dollars (double)
intr—Interest rate in percent (double, 6.5 = 6.5%)
years—Time in years (double)
Exceptions:
Throws remote exceptions
Returns 0 if intr or years = 0
Use cases: Customer Loan Inquiry ...
Also in interfaces: ARMLoanCalc

Figure 4-3. getPayment Interface Specification
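When the loan calculator business object is eventually built, getPayment might be implemented roughly as shown below. The standard amortization formula used here is an assumption, since the specification deliberately says nothing about how the calculation is performed.

    public class LoanCalculator {

        // Monthly payment from principal, annual interest rate in percent, and term in years.
        // Returns 0 if the interest rate or the term is zero, as the interface specification requires.
        public double getPayment(double prin, double intr, double years) {
            if (intr == 0 || years == 0) {
                return 0;
            }
            double monthlyRate = intr / 100.0 / 12.0;     // 6.5 means 6.5% per year
            double months = years * 12.0;
            return prin * monthlyRate / (1.0 - Math.pow(1.0 + monthlyRate, -months));
        }
    }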
The following are a few guidelines when designing services:

• The service is application-specific
• The service is self-contained
• The service handles all exceptions
• The service hides the business object layer
• The service conforms to standards
The service is application-specific While business objects are built for reuse, the services within an application's interface are intended to provide services for specific tasks within the context of one user interface screen or application. Although
reuse may sound appealing at this point, subtle differences may exist between the requirements of this service and a similar service in another module. Trying to force reuse during this phase of design may cause details to be ignored and result in complex, difficult-to-use interfaces. As you develop an order processing system, several use cases may indicate that a selection screen is required to locate an order. In the first use case, the customer calls and requests the status of his order. The customer service representative enters the customer's name or phone number and receives a list of orders that match this criteria. The resulting list includes the order number, customer name, phone number, sales content, and a brief indication of the order status, such as "received 9/15" or "shipped 9/25." A second use case may also require an order lookup screen, used by the sales representative to review a customer's purchases prior to a sales call. The sales representative will enter the customer's name or phone number along with the number of months of history desired. A summary list of orders will then be displayed showing the order date, the content and amount of the sale, and the sales representative's commission. It would be tempting to set up a generic getSalesOrders service that would return orders by customer, phone, date range, and order status and then provide all of the information to satisfy both requests. This service could also be extended to handle many other sales order requests by customer and eliminate a lot of server-side programming. The problem is that the requirements for the service become far too complex and changes made to accommodate sales representatives' inquiries may cause problems in order status lookup. Also, the generic service will produce at least twice as much network traffic as application-specific requests would generate. Although the same data tables may be retrieved using similar access paths, the data requirements and results have little in common.
The service is self-contained A service should be a single procedure call giving results that are generated from the data supplied in the parameter list. Services will be called by many different users concurrently, so the service cannot rely on data from prior service calls. By providing self-contained, often called stateless services, you eliminate the need to propagate separate instances of the
object on other machines, tying up system resources or adding additional overhead to manage object life cycles. There will be times when this rule must be broken and a state-based service object will be needed, but these should be kept to a minimum. Life cycle and concurrency management can take far more resources than the application itself. Any additional overhead will result in slower response times and larger server requirements.

In the order status inquiry example described above, a customer may ask follow-up questions after learning more about his order status. When a customer learns that his order was shipped two weeks ago, the customer may want to know which carrier was used, in order to determine why he has not received it. To follow up on this inquiry, the customer service representative will need to know the carrier and the tracking number. In designing this sequence of services, a stateless getShippingInfo service must receive the customer and order number again before supplying the information. A state-based service can simply return the information based on the customer and order information retained from the getSalesOrders service request.

In determining the best design for this service, the stateless version is almost always the better choice. A stateless service can be handled by a single service object, processing each request independently without having to retain data between calls. A state-based service will require a separate service object for each sales inquiry to retain information between service calls, creating the object at the beginning of the inquiry stream and then deleting it when the inquiry process is complete. This requires a much higher level of application server complexity to save a small bit of network bandwidth. Since the customer and order number are still on the user interface screen, the software will be much simpler if this data is simply sent back to the next service request.
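The difference shows up directly in the service signature. A minimal sketch of the stateless style is shown below; the interface and container names are assumptions used for illustration, and the point is simply that every request carries the identifying data it needs.

    // Stateless style: every request carries the data needed to satisfy it,
    // so a single service object can handle all callers without retaining state.
    public interface OrderInquiry {
        ShippingInfo getShippingInfo(String customerNumber, String orderNumber);
    }

    // Simple container returned to the user interface; data only, no business logic.
    class ShippingInfo {
        String carrier;
        String trackingNumber;
    }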
The service handles all exceptions

As I described above in "More on JAD: Developing Use Cases," all reasonable error conditions should be anticipated and procedures for exception handling should be specified. In addition to those described in the use case specification, exception handling should also be extended to software-specific problems. These will include testing input parameters for
bad data, checking for referential integrity and other similar issues. Each error unique to this service should be listed as possible exception results. The service interface specification is not the place to list every possible error condition. Many of the exceptions are common to all services, and a general purpose error handler can be specified to handle these errors. This is within the domain of programming standards, not service design. General error handling should include program faults, middleware exceptions, and network errors. Another issue that must be addressed in exception handling is the location of the error checking. Many simple checks can be made by the user interface programs. Empty fields, correct data formats and other local data checks can easily be performed by the user interface. Other checks that are within the domain of business rules, or to ensure referential integrity, such as verifying that a customer number exists in the customer database, must be done on the server side. Two exceptions to this rule are when a new object is created or an object's attributes are modified. In these cases, the service should always check the validity of the data being accepted because this data will be retained and bad data can cause subsequent errors. Once an error is detected by the service interface, a standardized process should be used to communicate the error back to the user interface program. Most programming languages, as well as middleware interface definition languages, provide standardized exception handling that can be used or extended. Once the error is thrown back to the user interface program, it must then be reported back to the user with sufficient prompting to let the user know what has gone wrong and how to fix it. Nothing is more aggravating to the users than getting an incomprehensible error message without any clues describing how to resolve it.
The service hides the business object layer Just as a well-designed object hides its attributes and protects them with get and set methods, a service should not allow business objects to be exposed outside the application server. It is tempting to pass objects directly to the user interface program, but by doing so, you permit the objects to be corrupted either by network errors or by malicious programmers. They may discover and call additional methods that may
breach security, extend access authority, or violate business rules. Instead, you should use service-based container objects to pass parameters and results to application domain objects, which can then validate the data prior to altering business objects.

In the inquiry screens above, a subsequent action may be to revise the customer information. The order status may indicate that the order could not be delivered because the address was incorrect. The customer service rep will request a delivery address correction. The data to be corrected comes directly from the customer object, so at first glance it makes sense to have the service pass a copy of the customer object to the update screen. Unfortunately, the customer object also has methods that set credit limits and discount rates. If the customer object is exposed directly, an unethical employee or an Internet hacker may be able to use features such as Java's object introspection to discover these methods and invoke them.

A better alternative is to create a separate customer data object that only holds the data attributes. Instead of passing the entire customer object with its additional attributes and methods, a customer data object will only carry the data that is used within the customer update screen. An initial service called getCustomerData can retrieve the necessary data, load it into the customer data object, and send it to the user interface program. When all of the changes are made, the data can then be loaded back into the customer data object and the user interface program can call updateCustomerData to request that the changes be made.

By using separate container objects, you hide the business objects inside the application server from the user interface program. You can validate the data items to prevent missing data or incorrect values from being placed inside the object, and isolate methods so they can only be accessed through service interface requests. This may add more work for you, the service interface programmer, but it also protects and secures the integrity of both the business objects and the company's data.
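A minimal sketch of the container approach is shown below; the field names follow the earlier figures, but the class itself is an assumption used for illustration. The container carries only the attributes the update screen needs, so the real Customer business object, with its credit limit and discount methods, never leaves the application server.

    // Container object sent to the user interface; data fields only, no business methods.
    public class CustomerData implements java.io.Serializable {
        public String customerNumber;
        public String name;
        public String address;
        public String city;
        public String state;
        public String zip;
        public String phone;
        // No creditLimit or discount fields, and no methods that could change them.
        // getCustomerData fills this object on the server; updateCustomerData validates it
        // before any change is applied to the real Customer business object.
    }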
The service conforms to standards

Early on, the design team should standardize naming conventions, parameter sequences, and exception handling, then make sure that these standards are enforced for all service definitions. Although the standards
can be difficult to adopt at first, once the developers adapt to them, programming becomes much easier because all calls conform to these standards. Services can be composed of a common set of verb-noun combinations, such as addCustomer or getPrincipal. Parameter lists can be ordered in a similar manner, using container classes to encapsulate multiple data items entered on a screen. Results can also be encapsulated in container objects that conform to specific standards. Standards are often set by the middleware vendor, with service interfaces defined according to the vendor's interface definition language. In addition to these standards, most development groups have naming standards in place that can be adapted to address the needs of the application server environment. As an example, Microsoft shops often use some form of Hungarian notation for naming conventions. Do not put a lot of effort into inventing standards; adapt those already in place or find standards that others have used successfully.
Bundling services into interfaces

By packaging services into service interfaces, you enable a user interface programmer to access all necessary services by obtaining only one handle. This simplifies programming and lowers the overhead cost of accessing the application server. The service interface can then be viewed by the user interface programmer as a single object, providing a logical collection of application services.

Once most or all of the services are defined for a use case, the services should be refined and documented in fairly detailed form. Before aggregating services into an interface, you should review all the services, along with those services defined in earlier use cases, to locate duplicate services that perform the same basic functions. These can be examined to determine if they are candidates for reuse. Service reuse should not be forced when it is not appropriate, but many occasions for reuse will appear throughout the project. If standardized names and parameter passing are used, duplicates can be merged and shared. Once reuse analysis is complete, services are packaged into interfaces, aggregating services by user interface program, user capabilities, or applications. The scope of a service interface depends on the size of the project and can vary depending on the number of different user interface
modules and the level of integration across different applications. A simple application server that has one or two user interface modules and has no interaction with other systems may only require a single service interface. A large, mission-critical application server that supports several departments or several locations may require separate service interfaces for each user interface application as well as additional interfaces to support system integration and external access.

When determining the aggregation of service interfaces, remember that each service can be accessed from several interfaces using interface inheritance. Many user interface programs will need many of the same common services. Distributing these services across multiple interfaces will provide common functions to the programmers and promote software reuse across the project. At the same time, a service interface does not need to expose all of the services within a particular implementation. Use service interfaces to expose only the services needed. This feature can be used to limit access and enhance application security.

Service interface packaging is a fairly simple, intuitive process. Since each user interface requires a certain set of services, start by specifying a separate service interface for each user interface. Once this is done, it will become apparent that some service interfaces share common services or a subset of another service interface. The services required by the customer inquiry program will be a subset of the customer maintenance service interface. When this is the case, the two service interfaces can be merged together or interface inheritance can be used to simplify the interface design. Do not spend a lot of time agonizing about how to distribute services within an interface; the proper distribution will occur naturally and intuitively.
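In Java terms, interface inheritance makes this packaging straightforward. The sketch below assumes the customer inquiry and maintenance examples from the text; the service and container names are invented for illustration.

    class CustomerData { String customerNumber; String name; /* remaining fields omitted */ }

    // Services needed by the customer inquiry program.
    interface CustomerInquiry {
        CustomerData getCustomerData(String customerNumber);
    }

    // The maintenance interface inherits the inquiry services and adds its own,
    // so the inquiry program sees only the subset it needs.
    interface CustomerMaintenance extends CustomerInquiry {
        void addCustomer(CustomerData data);
        void updateCustomerData(CustomerData data);
    }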
Handling Errors and Exceptions

One of the most critical pieces of the service interface design is how to handle errors and exceptions. C++, Java, Visual Basic, and IDL all provide standard mechanisms for exception handling and recovering from errors, but using these features requires some forethought and planning. Standardize message formats to make errors readable and understandable. Standardize program recovery so each service or object will react to errors in the same way. Establish standard error handlers incorporating
logging and alarms to notify the software developers when application errors occur.

The primary goal of error handling is to isolate problems and provide suggestions to fix the errors. When an error condition occurs, the user must first be informed, then prompted to correct the problem. The message should be in language that makes sense to the user and suggests an action that will correct the problem. Most user frustration occurs when error messages do not make sense or are phrased in indecipherable technical terms or accusatory language. These types of messages do not provide the information required to understand the problem or correct the error, and often intimidate the user.

When developing error handling strategies, consider different approaches based on the source of the errors. The most common are user interface errors; these arise from incorrect keystrokes or undefined data. More difficult to handle are the application errors—those caused by software bugs, bad design assumptions, or corrupted data. Finally, those errors caused by system or network problems cannot be easily recovered from by the user or programmer, but error handlers must be in place to prevent data corruption.
User interface errors The easiest and most common errors will occur when data is miskeyed or when items are entered that are unknown to the application. These errors will usually stop the application's processing and the user will be prompted to correct the problem. For unknown data, such as a new customer or a discontinued product, you should provide a cancel option to void the operation and start over. Standardize across an application the process of distributing error checking between the user interface program and the application services. In most cases, the user interface program can quickly check to ensure that required data is supplied and that entries conform to required patterns (for example, a Social Security number entered must contain 9 digits). You can also delegate other coding checks to the user interface program by populating pull-down or list boxes. Pass any other data validation to the application server. Each service should begin by checking for parameter errors before
processing begins. Required parameters must not be empty, and all parameters must conform to proper data types within specified ranges and conform to business rules. Referential integrity should also be checked at this time. If any errors occur, processing should stop, throwing an exception specifying the data item in error and the type of error that has occurred. It is then the responsibility of the user interface program to inform the user of the error and prompt for corrections. In those cases when a data exception occurs during processing, the transaction must be aborted and the data transformations must be rolled back to the state that existed prior to the request. Usually, you can use the transaction processing capabilities of the database to roll back changes. For more complex systems, you can add transaction middleware to ensure data integrity.
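On the server side, each service can begin with a standard validation step before any processing starts. The sketch below is illustrative only; the exception class, the field names, and the deposit example are assumptions.

    // Hypothetical application exception that identifies the field in error.
    class ValidationException extends Exception {
        ValidationException(String field, String reason) {
            super(field + ": " + reason);
        }
    }

    class DepositService {

        public void postDeposit(String accountNumber, double amount) throws ValidationException {
            // Check required parameters and simple business rules before any processing.
            if (accountNumber == null || accountNumber.length() == 0) {
                throw new ValidationException("accountNumber", "required field is empty");
            }
            if (amount <= 0) {
                throw new ValidationException("amount", "must be greater than zero");
            }
            // A referential integrity check (does the account exist?) and the transaction
            // itself would follow, wrapped so that any failure rolls the data back.
        }
    }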
Application errors

While most user errors are caught and intercepted prior to processing, application errors can occur any time during the service request. These errors must be intercepted, logged, recovered when possible, and then the user must be informed that an error occurred. Much of this can be accomplished through exception handling processes provided by development tools, extended with application-specific, standardized error handling methods. When an application error occurs, the error handler will catch the error and divert program flow to an error-handling process. This can be a catch block in C++ or Java, or an On-Error label in Visual Basic. Once the exception occurs, a separate error handler should be called that logs the errors into an application error database, then locates and formats standardized error messages. Depending on the severity of the error, the application can either continue processing, display a message to the user, or raise an alarm that the software must be fixed immediately. Hopefully, few errors will cause alarms, but these should be built into the program to ensure that critical errors are corrected quickly before data can be corrupted or operations are stopped. For those errors that stop the application's processing but are not critical, the application should send an error message back to the user, informing him that processing did not complete, and then describe
what needs to be done to correct the error. As with any error that stops the application's processing, corrective measures must be in place to roll the data back to its state before the service request began.
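A hedged sketch of such a standardized handler in Java (the class, error codes, and severity levels are hypothetical stand-ins, not an API from the book):

    class ErrorHandler {
        enum Severity { INFORM_USER, CONTINUE, ALARM }

        // Logs the error to the application error database, then returns a
        // standardized, user-readable message looked up by error code.
        static String handle(String errorCode, Exception e, Severity severity) {
            logToErrorDatabase(errorCode, e);   // record the details for the developers
            if (severity == Severity.ALARM) {
                raiseAlarm(errorCode);          // notify support staff immediately
            }
            return lookupMessage(errorCode);    // message phrased for the user
        }

        private static void logToErrorDatabase(String code, Exception e) { /* ... */ }
        private static void raiseAlarm(String code) { /* ... */ }
        private static String lookupMessage(String code) {
            return "The request could not be completed. Please correct the highlighted item and try again.";
        }
    }

    // Typical use inside a service method:
    //   try {
    //       processRequest();
    //   } catch (Exception e) {
    //       String message = ErrorHandler.handle("LOAN-014", e, ErrorHandler.Severity.INFORM_USER);
    //       // roll back the transaction, then return the message to the user interface
    //   }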
System and network errors
The most frustrating and difficult errors to recover from are those that are outside of programmer control. Network failures, middleware problems, and computer crashes all have the capability to corrupt data. The error-handling routines will catch the errors, but often, it is impossible to request rollbacks or do anything to recover the error. Much of the work in standardizing error handling revolves around these types of errors. When a network connection goes down, the user interface program will continue to run, but service requests will either fail or time out. When this occurs, the application must inform the user of the problem but must also assure the user that no data was corrupted. At the same time, the application must raise alarms to let the network administrator know that a problem has occurred so that network communication can be restored. Fortunately, the middleware products provide retry and recovery capabilities, and the programmer only has to handle the final failure message. Network administrators already have tools to monitor and manage the networks, and database vendors have built in rollback and recovery when system failures occur. Still, the application software must catch the errors and inform the users.
Exceptions and interface design
Although the primary rule of interface design is to focus on services, not processes, determining interface exceptions does require some thought about the processes involved in performing the service. Procedures such as data validation and referential integrity must be considered while listing the possible exceptions. Even so, most of these can be considered in general terms and specified without a detailed knowledge of the implementation. Many of the exceptions will also be program-specific and cannot be known while specifying the service interface. By standardizing the error handling processes and providing a general-purpose error handler, you can incorporate these errors without changes to the interface design.
Summary
This chapter examined how to design use cases as well as how to use them to determine the services and interfaces provided by the application server. Use cases describe in procedural fashion how the users interact with the computer. Once these use cases are specified, user interfaces can be designed to support this interaction and service interfaces can be designed that describe how the user interfaces access services from the application server.
• A service interface is a group of services that provide the functionality needed by one or more user interface programs.
• Each service describes an action that can be requested by a user interface program, listing the data items that restrict the action as well as the results that are provided.
• Service interface design focuses on what the services do, not how they do it.
• When designing use cases, use the following rules:
  • Describe the context
  • Describe the actors
  • Describe the procedure
  • Describe exceptions
  • Use common language
  • Iterate and refine
• When designing service interfaces, use the following guidelines:
  • The service is application-specific
  • The service is self-contained
  • The service handles all exceptions
  • The service hides the business object layer
  • The service conforms to standards
• Develop standardized error and exception handling procedures, passing messages back to the user that describe the problem as well as explain how to resolve it.
• When application and system errors occur, make sure that data is not corrupted.
References
Jacobson, Ivar. "Use Cases and Architecture in Objectory." Component Strategies, August 1998: 70-72.
Further Reading
Interface Design
Coad, Peter, and Mark Mayfield. Java Design: Building Better Apps and Applets. Upper Saddle River, New Jersey: Yourdon Press, 1997.
Use Cases
Ambler, Scott W. The Object Primer: The Application Developer's Guide to Object-Orientation. Managing Object Technology Series, no. 3. New York: SIGS/Cambridge University Press, 1998.
Jacobson, Ivar, Grady Booch, and James Rumbaugh. Unified Software Development Process. Object Technology Series. Reading, Massachusetts: Addison Wesley Longman, 1999.
Schneider, Geri, Jason P. Winters, and Ivar Jacobson. Applying Use Cases: A Practical Guide. Object Technology Series. Reading, Massachusetts: Addison Wesley Longman, 1998.
Chapter 5
Designing Business Objects
The goal of business object design is to create a collection of reusable software objects that model your business. While interface design is a bottom-up approach, used to determine application requirements, business object design is a top-down analysis of the entire business, identifying roles and functions. Business objects are formed by specifying properties and services that reflect the real-world objects they model. As with real-world objects, software objects often combine and collaborate to perform tasks that they cannot perform individually. Object design requires a global view of the organization, determining not only the needs of the current task, but the functions required for the entire business. Objects must be designed for reuse in the current application and be ready for use in the next project, even if that project is for another department or a different line of business. Although not a simple task, it is not as difficult as it seems. Business objects mirror people, forms, and other objects that have already been integrated into the business. As such, when the objects simulate these functions, they also fit inside the same business context. The object designer cannot possibly know all of the business requirements, so designing all functionality from the beginning is an impossible task. Business requirements are constantly changing and today's needs may not be relevant tomorrow. Business objects must be designed as open, dynamic components that can easily be changed without impact on other functions.
With all of these requirements, business object design is still a difficult task, but no more difficult than most other business functions. Business is a competitive, dynamic process that must be flexible and adapt to constant market changes. Managing change is an important part of any successful organization, and business objects that mirror these functions must also be flexible and able to manage changing requirements. This chapter gives an overview of business object design as it relates to application server development. Topics include:
• Moving from interfaces to objects
• What exactly is a business object?
• Finding the objects in your business
• Designing the objects
• Linking business objects to the service interface
• Business object architecture
Moving from Interfaces to Objects
Chapter 4 looked at how to design the service interfaces that provide functionality to user interface programs. One of the most important pieces of service interface design was to get a written description of the software requirements in a sequential, narrative form. These narratives (use cases) were developed jointly using a team of both business users and software developers to ensure that each case met the business needs. This chapter looks at how to deliver this functionality and, at the same time, design software objects that can be reused throughout the organization. Again, the information needed to design the objects is located in the use case documents. By defining the actors, objects and tasks, a comprehensive set of business objects can be defined.
From data models to business objects
In traditional two-tiered client/server development, software design started with the data model. User interfaces and paper forms were examined to determine what data had to be stored; then a data model was
designed that met the requirements. Once the data model was complete, the developers created the physical database and user interface programs that accessed and modified the database. All processes were embedded within the user interface programs. In object-oriented design, the emphasis shifts from a data-centric view to a business object view, creating a model in software that mirrors the activities of the organization. Data is encapsulated in objects that also contain the processes to manipulate the data. These objects then interact in the same way that business people interact, processing and transferring information between themselves. Object modeling looks similar to data modeling, but there are many subtle differences. Those who have spent years doing data modeling will find the transition difficult and confusing. Data structures are physical, persistent organizations of bits that sit on disk drives. Once the columns and tables are defined, they stay there as long as the database is in use. Objects are transient, dynamic, memory-resident entities with short life cycles. They are created, transformed and deleted in finite periods of time, sometimes within milliseconds. It takes experience and practice to really see the differences and adapt to the realities of object design.
Choosing a design approach
Object design is an art, not a science. Every designer will look at the same problem and come up with a different object design. According to one source, there are at least 30 different object design methodologies and notations (Carmichael 1998), so, depending on training, background and personal preferences, each designer will approach the task in a different manner. Most likely, the reader has already been exposed to one or more of these design methodologies and has developed a personal approach that produces good software design. The techniques described here reflect my own personal approach and should be used to augment your own experience. Determining the business object layer of the application server is not much different than any other object-oriented (OO) design approach, since good software design spans all languages and architectures. As with other design topics in this book, the intent is to point towards practices that will produce efficient, flexible application server designs that meet
the needs of the organization. For those not familiar with object-oriented design, there are many excellent references explaining and comparing the leading methodologies. See the list of Further Reading at the end of this chapter. In addition to design methodologies, there has been quite a bit of work done in the past few years on design patterns (Gamma et al. 1998). This is the concept that most software design elements can be classified into a collection of common patterns, tailored to meet specific business problems. By categorizing and documenting these patterns, people new to software design can gain the experience of previous designers and not have to reinvent these processes themselves. Experienced designers can also use these patterns to share design ideas and learn from each other. This text will present a variety of approaches, but will emphasize object design using the UML notation and an amalgam of several design methodologies. These combine use cases as defined by Ivar Jacobson, object-oriented design according to Grady Booch, and the traditional structured analysis and design of Coad and Yourdon. The goal is not to produce a unified design methodology (we can leave that to the UML guys), but to find tools and techniques that provide cost-effective solutions to meet the needs of the business quickly. Remember that the goal is to produce effective business software implemented as program code, not stacks of binders full of charts and pretty pictures.
What Exactly Is a Business Object?
A business object, within the context of application server design, is a computer representation of a physical business entity. These entities can be physical objects such as business forms, inventory items, shipped products, or the trucks that carry them. They can be classes of people like customers, employees, loan processors, or even a single person like George Smith, the only guy in the company who can approve loan amounts over $10,000,000. A business object should almost always represent a physical object (or person) that can be seen or touched. A business object should also be defined in business terms. If computer terminology is needed to describe it, then the object needs further refinement or it may not be a proper candidate for inclusion as a business object. The design should also be understandable by everyone on
the JAD team. This means that, in addition to avoiding computer terminology, the business language should be simple enough to be understood by the computer people. Every group of people, whether technical- or business-oriented, uses terms and expressions that are only fully understood by those in the group. Since the JAD team is made up of both business and technical people, the language used in the specifications should be understandable by everyone on the team.
Finding the Objects in Your Business
The first step in locating the objects from a use case is to determine who the actors are. Actors are usually people, business entities or other computer systems. The actors work with the objects to perform business tasks, adding value either for themselves or the business. Once the actors are located, it is often easy to determine their associated objects. Since this is a business simulation process, the actors also become objects, passing messages between themselves and manipulating the objects around them. A list of the actors and the objects they use becomes the starting point for the object model.
Objects vs. Classes
Object-oriented technology distinguishes between objects and classes. An object is a specific instance, containing a set of data and the methods to process the data. A class is a generalized description of a group of objects that have the same data item(s) and method organization. In object-oriented programming, class definitions are created using the programming language, and then any number of specific objects can be created using the class definition. In this discussion of object design, the term object is used to denote both classes and objects. Although not technically correct, the word object is more effective in communicating the concept to those not familiar with object-oriented development. Also, the distinction between classes and specific objects is often not clear during the design phase, and keeping the terminology correct will get in the way of doing the design work.
In addition to the actors, many business objects will model physical objects and forms (including computer screens and computer-generated reports) currently used in the business. Billing forms, inventory items, raw materials, even the warehouse itself can exist in the computer as a business object. In addition to listing the actors, include any physical objects that may be relevant to the project. Throughout this chapter, a set of loan application use cases will be used to illustrate object design. The first use case is the loan application process summarized below. Loan Application: A customer wants to purchase a new home and has contacted our company to request a loan application. The customer service representative asks the customer to fill out a loan application. This application form requests information about the customer and the property that the customer wants to purchase, including the location and purchase price. The form also requests the customer's employer, monthly salary, other monthly bills, credit card information, bank accounts, and other financial information. After the customer fills out this form, the customer service representative enters the form into her computer. The computer assigns a loan number and makes a quick check to ensure that the information is complete, notifying the customer service representative if any problems are found. After the information is in the computer, the loan number is written on the form; then the form is forwarded to a loan processor.
In this first use case, the actors are the customer, customer service representative and loan processor. Possible objects include the property to be purchased, the financial information, bills, credit cards, bank accounts, and the loan application form. The employer may either be an actor or may just be an attribute of the customer or financial information, but, since little is known at this point, it will be added to the list of actors. Figure 5-1 summarizes the actors and objects derived from the first use case.
Actors: Customer, Customer Representative, Loan Processor, Employer
Objects: Property, Financial Information, Bills, Credit Cards, Bank Accounts, Loan Application Form
Figure 5-1. Actors and objects found in the first use case
Looking at the second use case helps to refine this list: Initial Loan Approval: For each loan application form, the loan processor first pulls up the loan application on the computer and selects the "create application documents" button. The computer generates a cover page that lists the loan number, customer name, address and phone, loan-to-debt ratios, and checkboxes for each approval requirement (credit report, appraisal, etc.). The computer also generates several verification letters for employers, banks, credit cards, and bills, each verifying the information submitted on the application. After these documents are printed, the loan processor checks the loan-to-debt ratios; then, if these ratios do not meet current requirements, the loan is rejected. Otherwise, a credit report is initiated by phone, the verification letters are put in the mail, and then all of the paper documents including cover page, loan application, and copies of the verification requests are placed in a paper file. As each of these verification forms is returned, it is entered into the computer, the cover page is annotated, and the form is placed in the file. When all verification forms have returned, the file is passed on to a loan officer who then approves or rejects the loan.
This use case is highly simplified, but still enables us to refine our list of actors and objects. The loan officer is added to the list of actors and the object list now expands to include a loan file, cover page, income verification, bank verification, and credit report. Adding these new items produces the list shown in Figure 5-2. This list will grow quickly, and many of the actors and objects will not be needed in the final model. Even so, starting with an exhaustive list will ensure that all of the objects are at least considered. Listing these items may also remind the designer of additional actors and objects that may have been specified in earlier use cases. Remember that this is an iterative process and that the design does not have to be complete or correct on the first pass. The list of objects, as well as the use cases and every other part of the design, will be corrected and refined as new information is discovered.
Actors: Customer, Customer Representative, Loan Processor, Employer
Objects: Property, Financial Information, Bills, Credit Cards, Bank Accounts, Loan Application Form, Loan File, Cover Page, Income Verification, Bank Verification, Credit Report
Figure 5-2. Actors and objects found in both use cases
Defining the Objects
Each business object begins with a name, such as those given in the first two use cases we've examined. The name should be something fairly
short, but still readable and comprehensible. A name such as Loan Application works well since it describes the object, yet is still fairly brief. The name LoanAppContainer may describe the same object, but it does not make sense to the business people. George Smith, as it was described above, would also be a poor choice for an object name since the real-world George Smith may later move to a different job function. The business people would understand who George is and what his position is, but the contract guy brought in to write the code will have no clue. Senior Loan Manager may be a better name. In addition to its name, each object should be defined in a short, one-paragraph narrative description, listing what the object represents along with its purpose and function. This must be written in business terms understandable by everyone on the JAD team and, just like the use cases, everyone should agree to the basic concepts laid out in this description. These object definitions will be changed and revised during development, but understanding the basic assumptions is critical to meeting the business requirements successfully. In the loan application example, the software designer now begins to look for potential business objects from the list of actors and objects. The easiest object to identify is the loan application object. This is a physical document that contains the information entered into the user interface and initiates the loan application process. The software developer begins by creating a simple narrative describing the Loan Application object: The Loan Application object contains all of the information received from the loan application form filled out by the customer. The object holds information about the customer, the property, and the customer's financial information: employer, bank accounts, credit cards, bills, and other sources of income. Once stored, the financial information can be summarized by monthly income, monthly payments, current assets, and current total debt.
This description is short and concise, describing the purpose of the object and its function within the software system. The description is most likely not complete, but gives enough information to communicate the basic idea to the rest of the JAD team. The object holds data and the data is grouped into several categories and subcategories: customer, property, and financial information. Once stored, the object has the capability to summarize information, giving aggregate totals.
Although each item of the loan application form (his name, her name, address, employer name, monthly salary, etc.) could be listed individually, this would make for a large, unmanageable object with hundreds of attributes. Since the information is already described as a set of categories, it makes sense to turn each of these categories into a set of separate lower-level objects, then aggregate these objects together to form a higher-level business object: the Loan Application object. The following narratives describe each of these lower-level objects. Customer The customer object represents the customer who applies for the loan and lists his or her name, mailing address, phone numbers, Social Security number, and other relevant information.
Property The property object represents the residence or land that secures the loan and includes a short description of the property, a street address including city, state and zip, the legal property description, and the purchase price.
Employer An employer object describes the customer's place of work and level of income. It includes the name and address of the employer, the job position, the length of employment, and the monthly income.
Bank Account A bank account object lists the customer's bank account number, the name and address of the bank, the date the account was opened, and the current balance.
Additional object specifications are needed to describe the credit cards, bills, assets, and other sources of income that are listed on the application form. Each of these would be similar to the objects already specified, describing the other items listed on the loan application form. Note that at this point, the design looks a lot like a data model. There are few methods defined and each of the objects will be stored in persistent storage. As the design progresses, the objects will be enhanced with additional attributes and methods that act on these attributes.
Designing the Objects
Once the object descriptions are all written and revised, the software developers can begin to formalize the object definitions. This includes determining the following items for each object:
• Attributes—what the object knows
• Methods—what the object does
• States—the changes that occur due to process flow
• Events—responses to the outside world
The object definitions should be annotated with these items. This will provide the beginnings of a detailed specification document. In addition, each object should be put into a UML diagram, since this helps summarize the information into small, tight representations of each object. Later, these diagrams will be transformed into a class diagram that represents the relationship between the objects—kind of like a blueprint for the structure of the business objects.
Attributes
Attributes, often called properties or instance variables, are data items stored inside the object. Specifying the attributes may look similar to database design, but remember, these are transient items, not persistent data. The data items specified within the object provide storage between method calls. Much of the data may be passed on to the persistent object layer where it is stored in a relational database, but until then, each attribute is only a memory variable, placed there to provide functionality for the object. Each attribute should be described in both business and computer terms. In business terms, describe what each attribute represents, what its purpose is, where the data comes from and where it will be used. In computer terms, describe the data type and size. For lists and arrays, also indicate the minimum and maximum number of items. Specify enough detail to communicate the reason for the attribute, but do not get too detailed. There can be hundreds or thousands of attributes in the business
object layer, and a detailed specification of all attributes could take thousands of pages. The purpose of the specification is to communicate design ideas, not to fill notebooks and kill trees.
Methods
Methods are the functions and processes that give life to an object. Each method acts on the attributes stored within the object to either manipulate the attributes or communicate results to other objects. Methods can be thought of as messages sent to an object, either sending information to be saved for later use, requesting information that the object knows, or asking an object to perform a specific action. Methods are written in program code, but they represent messages and activities that the object has the ability to understand. Often it is difficult to determine the methods for each object early in the design process. Some methods will be apparent right away, but many can only be determined after the relationships are specified. Testing the use cases against the design will also reveal additional methods, so do not waste a lot of time trying to come up with an exhaustive list at this phase in the design. Methods also must be documented in the object specification, describing their function in business terms. Methods can be described either in terms of messages or actions. The verification received method is used to let the Loan Application object know that a verification document has been returned. When the message is received, the object locates the appropriate bank account, credit card, or other item and forwards the message on to this object. Other methods are requests to perform an action, such as getName that returns the name from a Customer object. Inputs and outputs should also be documented for each method. When the verification received message is sent to the Loan Application, the message must include which document was received, the VISA credit card or the employment verification. When the getName method is called, the method must return the customer's name. These must be described in business as well as computer terms.
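In Java terms, these two styles of method might be sketched as follows (the names are illustrative only):

    class Customer {
        private String name;
        // A request for information the object knows.
        String getName() {
            return name;
        }
    }

    class LoanApplication {
        // A message telling the object that something happened in the business;
        // the input identifies which verification document was returned.
        void verificationReceived(String documentId) {
            // locate the matching bank account, credit card, or employer
            // and forward the message to that object
        }
    }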
States
As objects move through a series of processes, they may take on several different forms or states. Each state affects the behavior of the methods, and exceptions may occur when certain methods are requested when an object is not in the correct state. Each loan application object described above will move through the following states:
• New—the loan application has been received but no action has been performed
• In review—data is complete but not yet processed
• In verification—waiting for verifications to be returned from banks and creditors
• In approval—verification received, waiting for approval or rejection
• Approved—loan application was approved
• Rejected—loan application was rejected
When the object is in different states, it will perform its actions in different manners. A loan application just submitted cannot be instantly approved, but it may be immediately rejected. A loan application that is awaiting verification likewise cannot be approved until the verification forms have returned. Once all verifications are back, the loan application moves to the awaiting approval state and can then be approved. Few objects move through state transitions, and those that do will be recognized early in the design. When an object does have several states, it is helpful to specify one or more methods that return the current state of the object. A state attribute within the object also helps to quickly identify the state of the object. In the loan application example, methods like isNew, isInReview, isInVerify, isInApproval, isApproved, and isRejected will simplify state checking. An alternative would be a getState method that returns an integer that represents a specific state value. In either case, a state attribute, set by the methods, can identify the current state of the object. When describing the object, include a detailed description of the possible object states. Describe what triggers a state change, and how the object's attributes change as the state changes occur. Document the
methods that can be used to determine the present state of the object and describe any state attributes that are used by the object to track the current state. Also, document the methods that behave differently depending on the state of the object, noting how state affects these methods, and list state-related exceptions that may occur.
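As an illustration, state tracking might look like the following Java sketch; the book describes integer state codes or a set of is... methods, and an enum is used here only as an idiomatic equivalent:

    class LoanApplication {
        enum State { NEW, IN_REVIEW, IN_VERIFICATION, IN_APPROVAL, APPROVED, REJECTED }

        private State state = State.NEW;   // state attribute set by the methods

        boolean isNew()        { return state == State.NEW; }
        boolean isInApproval() { return state == State.IN_APPROVAL; }
        boolean isApproved()   { return state == State.APPROVED; }

        // A method whose behavior depends on the current state.
        void approve() {
            if (state != State.IN_APPROVAL) {
                throw new IllegalStateException("Loan cannot be approved until all verifications are back.");
            }
            state = State.APPROVED;
        }
    }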
Events
Events are external occurrences that cause objects to react, either by a state change, by triggering a method, or by sending messages to other objects. In the application server environment, events are most often related to state transitions. In our loan example, receiving a credit verification may cause the Loan Application object to change states from "in verification" to "in approval" if all other verifications have been received. A Loan Approval event will move the object from "in approval" to "loan approved" state. Events are represented in a variety of ways. Often, the event will be represented as a message sent from the service interface to an object using an "event occurred" message. In the loan application example, the loan officer will click a button indicating that the loan was approved. This will cause the user interface to call the Loan Approved service from the service interface, which then calls the Loan Approved method of the Loan Application object. Another approach is to encapsulate the event itself into an object. This event object is passed to the business objects that need to know that the event occurred. An example of this would be a "credit verification received" message. The information from the verification form would be entered and stored in a Verification object. This object would be passed to the Loan Application object by calling a Verification Received method. The Loan Application object will respond to this message by locating the corresponding Bill or Credit Card object, then send a "verification received" message along with the Verification object (see Figure 5-3).
Figure 5-3. Passing an event between objects (the Loan Application forwards the Verification Received message to the appropriate lower-level object)
Event handling can be generalized even more by creating a "handle event" method, then passing all of the event objects to every business object within the hierarchy. The "credit verification received" message would be passed to the Loan Application object, where it would be checked to determine if any response was needed. Once this response is complete, the event would be passed to every Employer, Bank Account, and Credit Card object in turn. Each would check the event to determine if it needed to respond. Since this event was intended for a specific credit card, the Credit Card object would be the only object to respond to the event.
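A minimal Java sketch of this generalized "handle event" idea (the class and field names are illustrative placeholders, not the book's design):

    // An event encapsulated as an object and passed down the object hierarchy.
    class VerificationReceivedEvent {
        final String accountNumber;   // identifies which account the verification is for
        VerificationReceivedEvent(String accountNumber) { this.accountNumber = accountNumber; }
    }

    class CreditCard {
        private final String accountNumber;
        private boolean verified;
        CreditCard(String accountNumber) { this.accountNumber = accountNumber; }

        // Each object checks whether the event applies to it; only the matching
        // credit card responds, all others simply ignore the event.
        void handleEvent(VerificationReceivedEvent event) {
            if (accountNumber.equals(event.accountNumber)) {
                verified = true;
            }
        }
    }

    class LoanApplication {
        private final java.util.List<CreditCard> creditCards = new java.util.ArrayList<>();

        // The event is passed to every object in the hierarchy in turn.
        void handleEvent(VerificationReceivedEvent event) {
            for (CreditCard card : creditCards) {
                card.handleEvent(event);
            }
            // ...and likewise to each Employer and Bank Account object
        }
    }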
Business object specifications
As the object analysis continues, the object descriptions should be updated to include descriptions of attributes, methods, states, and events. This does not have to be comprehensive, but should contain enough information to communicate the design concepts and goals to both the business and technical people. Figure 5-4 shows an example of the Bank Account object specification. Do not spend too much time creating these specifications. Simply list the items that seem appropriate and keep the process moving. These are still rough drafts and will be updated as the project moves along. Consider one of the object-oriented CASE tools if the project is large. These tools make finding and managing large collections of objects much easier.
Object Specifications
Object Name: Bank Account
Description: A bank account object lists the customer's bank account number, the name and address of the bank, the date the account was opened, and the current balance.
Attributes:
Bank Name—the name of the bank
Bank Address—the mailing address for the bank
Bank Account—the account number for this account
Current Balance—the account balance on or near the application date
Verification Sent—date that the verification form was sent
Verification Received—date that the verification form was returned
Verified—yes/no, indication that the account exists
Methods:
Send Verification—sets the Verification Sent date and returns the bank name, address, and account number
Receive Verification—sets the Verification Received date and the account verified indicator
is New—returns yes if no verification has been sent
is Sent—returns yes if verification has been sent but not yet returned
is Verified—returns yes if verification has been received and the account exists
is Rejected—returns yes if verification has been received but the account does not exist
States:
New—verification not sent
Sent—verification sent but not yet returned
Verified—verification has returned and the account exists
Rejected—verification has returned but the account does not exist
Events:
Verification Sent—the form is sent to the bank
Verification Received—the form is returned by the bank and indicates whether the account exists
Figure 5-4. Bank Account object specification
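As a rough illustration only, the specification in Figure 5-4 might map to a Java class along these lines (field and method names are assumptions, not taken from the book):

    import java.time.LocalDate;

    class BankAccount {
        // Attributes from the specification
        private String bankName;
        private String bankAddress;
        private String accountNumber;
        private double currentBalance;
        private LocalDate verificationSent;      // null until the form is sent
        private LocalDate verificationReceived;  // null until the form returns
        private boolean verified;

        // Sets the verification-sent date and returns the data needed for the letter.
        String sendVerification() {
            verificationSent = LocalDate.now();
            return bankName + ", " + bankAddress + ", account " + accountNumber;
        }

        // Records the returned form and whether the account exists.
        void receiveVerification(boolean accountExists) {
            verificationReceived = LocalDate.now();
            verified = accountExists;
        }

        // State checks described in the specification
        boolean isNew()      { return verificationSent == null; }
        boolean isSent()     { return verificationSent != null && verificationReceived == null; }
        boolean isVerified() { return verificationReceived != null && verified; }
        boolean isRejected() { return verificationReceived != null && !verified; }
    }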
Object Interaction
For an object to do useful work, it must communicate and interact with other objects. The Bank Account object specified above has a method that sends a verification letter. As it gathers the data to send the letter, it knows the bank name and address and it knows the account number. A verification letter could be sent out at this point, but there is no way to verify that the account is owned by the customer who has applied for the loan. The customer could have written down someone else's account number; then, if an account exists, the verification would pass. What the bank really wants to know is if this customer has an account at this bank. To make this work correctly, the Bank Account object must have some form of interaction or communication with the Customer object that was submitted on the loan application. Once this connection is established, the letter can include the name and address of the customer. Without these connections and communication paths, the objects will not have the information necessary to perform useful work. Objects communicate through a variety of methods and relationships. Some objects are aggregated to form larger objects, and this new higher-level business object provides the communication paths between the objects. Other objects have a looser relation, knowing each other through association, but are not as tightly coupled as when they are aggregated. Quite often, objects will be brought together into collections, pulling together similar objects into a common group where they can be sorted and compared. Object relations are most easily documented using UML diagrams. A few class diagrams can quickly summarize the relations and communicate the design concepts to both the programmers and business people. Once the relations are determined, sequence diagrams can be created that describe each use case using these objects and relations. Creating the sequence diagrams will quickly locate missing methods and test object relationships.
Aggregation
The simplest and easiest object relation is aggregation, when several low-level objects are combined to create a new higher-level business object.
This new object uses each of the lower-level objects as instance variables (attributes) in the same manner as it would if these objects were dates or strings. These new attributes are then used within the higher-level object's methods to perform business functions that simulate real-world processes. Since the lower-level objects are now attributes of the higher-level business object, the low-level object is encapsulated, or hidden, within the new business object. This makes the low-level objects accessible only to the business object, hidden from access except through the aggregated object. This protects the low-level objects from corruption, but it also restricts the design when other objects need to access the object independently. The specification for the Loan Application object states that it contains information about the customer, property, and financial information. This could be represented as an aggregation. A Customer object, a Property object, and a Financial Information object could all be aggregated into the Loan Application object (see Figure 5-5).
Loan Application (attributes: Loan Amount; methods: Approve, Reject) aggregates:
Customer (Name, Address, Work Phone, Home Phone)
Property (Short Description, Legal Description, Address, Purchase Price)
Financial Information (Name, Address, Phone, Contact)
Figure 5-5. Aggregated Loan Application object
This new Loan Application object now has the responsibility to manage the Customer, Property, and Financial Information objects. The only access to each of the lower-level objects is through Loan Application methods. As long as all the work involving the customer or property resides inside the Loan Application object, this is acceptable, but if the customer can have multiple loan applications or the property needs to be tracked separately, this may be a problem. In the loan application example, this is not a problem since the customer and property are part of the loan application form. Later on, after the loan has been approved, the Loan Application object can give up its customer and property information to create true Customer and Property objects that stand separate from the loan. Until then, the customer embedded in the Loan Application object is not a true customer, since a rejection of the application will terminate business with the customer. It would probably make more sense to call the object an applicant instead of a customer, but for this example, the term customer is easier to understand.
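A bare-bones Java sketch of the aggregation shown in Figure 5-5 (class names follow the figure; everything else is illustrative):

    class Customer {
        String name, address, workPhone, homePhone;
    }

    class Property {
        String shortDescription, legalDescription, address;
        double purchasePrice;
    }

    class LoanApplication {
        private double loanAmount;
        // Aggregated lower-level objects, reachable only through LoanApplication methods.
        private final Customer customer = new Customer();
        private final Property property = new Property();

        void approve() { /* ... */ }
        void reject()  { /* ... */ }

        // Outside objects reach the customer data only through the aggregate.
        String customerName() { return customer.name; }
    }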
Generalization and specialization
The Customer and Property objects fit well within an aggregation relationship, but there is no Financial Information object included in the specification. The financial information referenced in the loan application specification is a set of objects that include bank accounts, credit cards, bills, and employers. Each has some similar information that includes a name, address, phone number, and a monthly amount that either adds to or subtracts from the customer's cash flow. All but the Employer object have an account number and an amount that contributes to the customer's net worth. One of the foundations of object-oriented technology is the concept of inheritance or specialization. A generalized, or parent, object is defined that specifies the attributes and methods that all objects have in common. Specialized objects can then be created that inherit these attributes and methods. Within these new objects, methods are either replaced or added to give new functionality, and additional attributes are added to support these new methods. Although generalization and inheritance look like a great opportunity to share code, this should be avoided. Inheritance should be restricted to like-minded objects, each representing a more specialized subset of its parent.
Otherwise, a change to a parent object not well related to a derived object may cause failures when the other object expects the original functionality. Inheritance should be used sparingly in business object construction, since derived objects are tightly bound to their parents. Often, aggregation or association is a much better choice, putting the needed functionality inside the object instead of inheriting it (Coad and Mayfield 1997). The financial information example does show a situation where inheritance is appropriate. Figure 5-6 shows the hierarchy for the financial information objects. The parent object, called Verification, holds the common attributes and methods for all of the financial information objects. The name Verification was selected to indicate that the primary function of these objects is to send and receive verifications for all of the financial information.
Verification (Name, Address, Phone, Verification Sent, Verification Received, Verified; Send Verification, Receive Verification, is New, is Sent, is Verified, is Rejected)
Employer (Hire Date, Monthly Salary; Send Verification)
Bank Account (Date Opened, Current Balance; Send Verification)
Credit Card (Current Balance, Monthly Payment; Send Verification)
Installment Loan (Loan Amount; Send Verification)
Figure 5-6. Financial Information object hierarchy
Attributes for the Verification object include a name, address, phone, the date the verification was sent, the date when it returns, and an indicator of a positive or negative verification. Methods include Send Verification, Receive Verification, and a set of state
indicators representing each state the object may be in. Note that the Verification object is never used by itself; it is just there to allow other objects to derive their functionality from this object or class. The Employer, Bank Account, and Credit Card objects are all derived
from the Verification object. Each of these objects inherits the methods and attributes from the Verification object, so each object also has a name and address attribute and the Send Verification and Receive Verification
methods. Each derived object also adds attributes and methods unique to its own requirements. The Employer needs to know the date the employee was hired and the monthly salary. The Bank Account object needs to know when the account was opened and the current balance. Since each object now has its own attributes in addition to the ones derived from the Verification object, and each verification letter will be worded in a different manner, each object will also have to replace the Send Verification method with a new method that performs the appropriate action. When a verification is sent to the employer, the letter must request employment information including the hire date and monthly salary reported on the loan application. When sent to a credit card company, the letter must request credit information including the total balance and monthly payments. The same is true for every derived object, so each must implement its own Send Verification method to handle these differences. Since each derived object has to implement its own Send Verification method, the Verification object would not even have to implement the method. This is an example of a virtual method. By specifying the method in the parent object, you force each derived object to implement the method using the same name and force consistency between the methods. Most object-oriented languages allow a virtual method to be declared, but not implemented, in the parent object. While the Send Verification method is declared virtual and implemented in every object, the Receive Verification method is only implemented for the parent object. It performs the same operations for all of the objects. When a verification letter is returned, the method updates the verification received and the verified attributes. The Receive
Verification method does not have to act on additional information like hire date or credit card balance, and does not have to perform different functions based on the type of financial information; so it can be implemented in the parent object, inherited by each derived object. Finally, the Installment Loan object is derived from the Credit Card object. This allows the Installment Loan object to inherit all of the functionality of the Credit Card object, but also carry an initial loan amount. Again, the Installment Loan object must implement its own Send Verification method, since it carries additional information, but it can inherit the Receive Verification and state methods. Inheritance is a powerful feature of object-oriented design, but it should be used only when objects have a relationship that can be summarized as a "this is like a...but" relationship. In this example, an Installment Loan is like a Credit Card, but it has an initial loan amount. If this "kind of" relationship can be stated, inheritance is a good choice for the object relationship.
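A condensed Java sketch of this hierarchy, with Send Verification left abstract (virtual) in the parent and Receive Verification implemented once (field types and letter wording are assumptions):

    abstract class Verification {
        protected String name;
        protected String address;
        protected String phone;
        protected boolean received;
        protected boolean verified;

        // Declared but not implemented: every derived object words its own letter.
        abstract String sendVerification();

        // Implemented once in the parent and inherited by every derived object.
        void receiveVerification(boolean ok) {
            received = true;
            verified = ok;
        }
    }

    class Employer extends Verification {
        private String hireDate;
        private double monthlySalary;

        @Override
        String sendVerification() {
            return "Please verify employment since " + hireDate
                 + " at a monthly salary of " + monthlySalary + " for " + name + ".";
        }
    }

    class CreditCard extends Verification {
        protected double currentBalance;
        protected double monthlyPayment;

        @Override
        String sendVerification() {
            return "Please verify a balance of " + currentBalance
                 + " and a monthly payment of " + monthlyPayment + " for " + name + ".";
        }
    }

    // "An Installment Loan is like a Credit Card, but it has an initial loan amount."
    class InstallmentLoan extends CreditCard {
        private double initialLoanAmount;

        @Override
        String sendVerification() {
            return "Please verify an installment loan of " + initialLoanAmount
                 + " with a balance of " + currentBalance + " for " + name + ".";
        }
    }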
Association
In the aggregation example, the Customer object was aggregated into the Loan Application object. Suppose a customer could have more than one loan application in process at the same time (not likely, but it makes a good example). If this were the case, a loan officer would have a difficult time locating all of the applications for one customer, since every loan application would have to be checked. In this case, the Customer object should be independent of the Loan Application objects, but a relationship must still exist to know which loan applications are associated with each customer. Where aggregation put one object inside another object, association lets each object stand on its own, but it also allows each object to know about the other object. Instead of embedding one object in the other, each object has a pointer or reference to the other object. This can be done using a unique identifier, such as a customer number or a loan application number, or by using an object pointer or reference such as a memory address or C++ pointers. No matter how the reference is implemented, the object has a communication path to the other object. This association can be one-way or two-way, and can be one-to-one or one-to-many (see Figure 5-7). A one-way association is a relation where object A knows about B, but B does not know about A. In a two-way relation, both objects know about each other.
Figure 5-7. Types of association: one-way, two-way, one-to-one, and one-to-many
Likewise, in a one-to-one association, each object E is associated with only one object F. In a one-to-many association, one object G is associated with many object Hs. In the loan application example, each Loan Application object is associated with at least one Employer object. This can be specified as a one-way relationship, since each Loan Application must communicate with the Employer objects, but access will never be required independently from the Employer back to the Loan Application. In the same way, the association is one-to-many, since each Loan Application may have one or more Employers. Associations are the most common object relationships since each object stands on its own but has communication paths with other objects. As more objects and relationships are specified, each object will have a number of relationships. The primary goal in determining relationships is to make sure that each object can communicate with every object with which it will have to interact. An object cannot call another object's methods unless that object has access to the other object through some form of relation. At the same time, associations should not be specified unless there is a need for communication, since each association does create additional overhead and programming effort.
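As a rough Java sketch, a one-way, one-to-many association might be held as a list of references (the book notes that a unique identifier such as a loan or customer number works equally well; the names here are illustrative):

    class Employer {
        String name;
        String address;
    }

    class LoanApplication {
        // One-way, one-to-many association: the application holds references to its
        // employers, but an Employer object knows nothing about the application.
        private final java.util.List<Employer> employers = new java.util.ArrayList<>();

        void addEmployer(Employer employer) {
            employers.add(employer);
        }

        // The references give the application a communication path to each employer.
        java.util.List<Employer> getEmployers() {
            return java.util.Collections.unmodifiableList(employers);
        }
    }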
Collections
Where the other relations connect objects of different types, a collection is a grouping of objects either from the same class, or from objects derived through inheritance from the same class. Collections often form one side of the one-to-many associations or are aggregated inside another object. Collections can also be formed from primitive types, such as strings or integers; but for this discussion on class relationships, only collections of classes will be considered. Most object-oriented programming languages provide a variety of collections including arrays, vectors, lists, maps, and other, more complex data structures. Each allows a set of objects with a common base class to be inserted or deleted from the list, then, depending on the type of list, allows access sequentially from top to bottom or by an identifying value. In the loan application example, each Loan Application will have a collection of Verification objects (see Figure 5-8). These objects can include Employers, Bank Accounts, Credit Cards, or Loans.
Figure 5-8. The loan verification list
Since each loan application will have a different number of each type of verification object, a collection is an excellent method for handling this relation. Once the Verification objects are stored in the collection, the Loan Application object can determine the total monthly income and total monthly payments by accessing each object, checking whether it is an employer or bill, then accumulating the totals for each. Likewise, total assets and total debt can be accumulated in a similar manner.
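A minimal sketch of that accumulation in Java, assuming the Verification hierarchy sketched earlier plus hypothetical getMonthlySalary and getMonthlyPayment accessors on Employer and Credit Card:

    class LoanApplication {
        private final java.util.List<Verification> verifications = new java.util.ArrayList<>();

        // Walk the collection, check the type of each item, and accumulate the totals.
        double totalMonthlyIncome() {
            double total = 0;
            for (Verification v : verifications) {
                if (v instanceof Employer) {
                    total += ((Employer) v).getMonthlySalary();
                }
            }
            return total;
        }

        double totalMonthlyPayments() {
            double total = 0;
            for (Verification v : verifications) {
                if (v instanceof CreditCard) {
                    total += ((CreditCard) v).getMonthlyPayment();
                }
            }
            return total;
        }
    }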
Creating the class diagram
The final class diagram shows the relations between all of the objects needed to meet the requirements of the use cases. Figure 5-9 shows all of the relations described throughout this section. This diagram forms a blueprint of the business object layer. Later, the persistence and interface objects will be added. Once completed, this diagram becomes the roadmap for the software designers and a communication tool to use with the business people.
Figure 5-9. Loan application business object layer: the Loan Application object holds a collection (1..*) of Verification objects, with Employer, Bank Account, Credit Card, and Installment Loan derived from Verification
Application Server Issues and Constraints
When designing business objects for an application server, you must consider additional issues and constraints. Some are business constraints, such as short development cycles and the ability to reuse objects from one development cycle to the next. Others are technical issues, such as handling concurrent processing and setting up object repositories. Each of these factors constrains the choices available as the business object layer is designed.
Short business cycles
Years ago, the standard development cycle was measured in man-years. Business processes were relatively stable, requirements were set at the beginning of the project, and development moved at a leisurely pace (at least that's the myth that everyone seems to remember). Today, business requirements change quickly and software must be flexible, open, and configurable. At the same time, development cycles must be shortened to meet these rapidly changing business demands. Business objects must also be able to change and reshape themselves quickly and efficiently. When determining functionality, the object must implement the current requirements and also be ready to meet future needs. This is not as difficult as it seems. Most business processes evolve and, as such, the methods to implement these processes are extensions of the current functionality. The business members of the design team will usually have some idea of where their processes are moving, and upper management should have long-term strategies in place and understand the market needs (if they don't, start looking for another job). Once these trends are determined, make sure that the object design will accommodate these changes easily, but also make sure that the system meets the current needs first. The short business cycles complicate the software requirements, but also tighten the development schedule. Balancing the need for flexibility and quick delivery is a difficult task.
Reuse
Reuse has long been one of the holy grails of software development. The model of the integrated circuit is often cited to illustrate how prepackaged designs can be reused over and over again. Unfortunately, reuse has seldom been effective in the business programming environment and, when it finally is achieved, the cost savings will never be as large as everyone assumes. Often we forget that software reuse is already happening on a large scale within every organization. The Windows or Mac operating systems are huge repositories of reusable software. Every program written relies on these reusable functions for a large portion of the work being done. Programming languages also rely on large sets of software libraries and application frameworks like MFC to provide faster software development. Finally, component frameworks like Visual Basic and JavaBeans also have a growing base of reusable components. Imagine the costs of having to reinvent these basic software building blocks every time a new application had to be built. The reason that these forms of reuse are so effective is that each has a wide range of uses. A Visual Basic list box can be used in any program that needs to present a set of options; the printer services in Windows are used constantly to send data to the printers. In each case, the function is something that is widely needed, so it is cost-effective to write a very customizable, general-purpose function. In the case of Windows, millions of dollars can be spent to develop these reusable functions because they can be sold to every PC user in the world. In the case of business software reuse, these economies of scale do not exist. Within the business programming environment, object reuse has just as many costs as it does benefits. Objects must be designed to have a much wider breadth than when they are written for one specific use. Repositories and documentation must be kept up to date and be quickly accessible to the programmers. Often it takes more time to locate and research how to use a reusable object than it does to rewrite it from scratch. Change management is also far more difficult when a single object is being used in a variety of different applications. To make software reuse effective, Paul Bassett suggests that a development group must have its process, infrastructure, and culture oriented
towards reuse (Bassett 1999). An application server architecture provides some of the process and infrastructure, but culture is something that must be built over a long period of time.
Process
The application server design process is one of creating business objects that are models of business entities and processes. These are created within the context of an application, but are not intended to be application-specific. If these objects are designed correctly, they will be just as effective for the next application as they were for the current development effort. Additional functionality may be required, but the core processes will not change.
Infrastructure
In addition to the application server environment, additional pieces must be in place to support reuse. These include object repositories, up-to-date documentation, coding standards, and development tools that support reuse. During both design and programming phases, the developers must be able to quickly retrieve information about the business objects already in place and be able to incorporate them into their design. Reusing an object has to be easier than redesigning the same object. This can only happen when the information and objects are standardized, easy to access, and easy to use.
Culture
Reorienting culture towards reuse is difficult and can only occur over a much longer period of time. Most discussions of reuse include incentives and rewards to move the culture towards reuse; but rewards must be based on metrics, and metrics are difficult to determine when trying to encourage reuse. When reuse is first addressed, there is no code base or prior experience in place, so it is difficult to determine how reusable a particular object or piece of code will be. Also, the culture is already oriented towards other values such as meeting the user's needs within tight schedules. The pressure to get the project out the door will override the need to spend time making the objects reusable.
Reorienting the culture towards reuse will be a multi-phased process. In the first stage, enforcing naming conventions, standardizing documentation, and building an object repository will begin to lay a foundation for reuse. In the next phase, the foundation for reuse will begin to appear, but current cultural pressures like technical elegance and customer service will cause conflicts and "culture clash" that will be difficult to resolve. Finally, as the process and infrastructure mature, reuse will become easier and will enhance instead of conflict with these other cultural pressures. Only then will reuse succeed.
Concurrency and synchronization

One of the difficulties of the application server architecture is how to manage concurrency: many processes accessing the same objects at the same time. In some cases, the same object may be accessed concurrently by hundreds of other processes. In other cases, each of these same processes may spawn a host of unique objects, resulting in thousands of objects active at the same time. Remember, too, that this is happening over a network of computers, not just one computer. Tracking and synchronizing all of these objects can become a nightmare.

Fortunately, this is the job middleware is designed to perform. It can keep track of all of these objects and route messages between them, transparently to all other programs. Or can it? The middleware may do the job, but it is much easier to design an efficient business object layer than it is to wait and hope that the middleware can handle the load. Efficiency and throughput come from good design, not software tuning. Minimizing the number of objects will help minimize the load placed on the hardware, and minimizing network traffic will increase throughput. Locating and eliminating bottlenecks during design is far easier than waiting until the users complain about slow response time.
Repositories

In addition to concurrency and synchronization, many middleware products also provide repositories that allow objects to be stored and retrieved in a structured, organized manner. There are also a variety of
other commercial repositories that work alongside the middleware architecture to perform this same task. No matter which repository is chosen, it will place restrictions on the form objects take and the way objects are built. Although much of this is technical in nature, it does affect and restrict object design choices. Many repositories and middleware packages require objects to conform to a specific component form. Some, like the JavaBean specification, have little impact on object design. Others, like ActiveX, impose very tight naming requirements and a host of additional interfaces and functions that restrict the implementation of the objects. If you don't know these restrictions at design time, programming can become very difficult.
Persistence Although persistence is the topic of the next chapter, it also impacts the design of the business object layer. Data is almost always stored in relational databases, and, just like middleware and repositories, database management software will work better if the data is accessed in a manner that is consistent with the rules of the relational databases. As objects are aggregated and relationships are formed between objects, these relationships will determine how data is retrieved. Very few organizations do not use relational databases. Many of these databases have existed for a long period of time and may not have the most efficient, logical designs. These structures may have migrated from legacy systems, or tradeoffs were made to gain efficiency from older, more primitive database systems. The data may exist in denormalized forms or may make no logical sense whatsoever. Just as there is bad code and bad software, there are also a lot of bad databases out there. Knowing how this data is organized leads to more compatible object design and will help in the long-term success of the project.
Linking Business Objects to the Service Interface

The business object layer is a collection of business objects, each modeling a part of the business in software. It is the responsibility of the service
interface to link these objects together to perform useful work. Chapter 4 examined how to determine the functions and services provided by the service interface. Now that the business objects are available, they can be attached to the service interface to perform the services. The UML sequence diagram is one of the best tools for determining how the business objects will perform the services specified by the service interface. Each service is diagrammed, showing the connection between the objects and what method calls are needed to perform the service. This exercise will often reveal problems with the object relationships and quickly locate methods that have not been specified. Again, do not sequence every service; just do enough to prove the object design.
Developing sequence diagrams When developing sequence diagrams, begin with the application service interface object. This will be placed at the top left of the diagram, followed by the business objects needed to perform the services. Next, list the services down the left side of the page in the approximate order the user interface will call them. For each service, start with the service interface object and decide how it will communicate with the first business object. In most cases, it must either create a new instance of the object or use the persistence layer to create the object from the relational database. Once this link is established, each additional object must also have a similar communication path. Once the communication paths have been established, the service interface object will send a message to the business object to perform one of the object's methods. This object will then do the steps listed in the method, calling methods from other objects. Each of these method calls is indicated by an arrow from the calling object to the called object. To make the diagram readable, the objects should be ordered in approximately the same order as the sequence of method calls. The service interface requests a method from object A, which then requests methods from objects B and C, and so on. The easiest way to see how a sequence diagram is constructed is to go back to the loan application example and begin to lay out the services that will be required to accomplish the use cases. Figure 5-10 is a sequence diagram that illustrates how to enter a new loan application. The service interface object is where each process begins. The user interface program will request a method from the service interface, then
the service interface establishes the links between objects before calling methods that implement the service.

Figure 5-10. Sequence diagram for loan application entry
Creating new business objects The first task is to initiate a new loan application (create loan application). The service interface uses the new operator to create a new Loan Application object, then the information from the user interface screen is passed through the service interface into this new object. Once the new
Loan Application object is created, the Loan Application constructor creates Customer and Property objects and passes the relevant information to each of these new objects. There is a large amount of data required to create these objects, but the data can be encapsulated into a data structure to limit the number of parameters that must be passed between objects. Once the loan application is entered, the loan processor will also have to enter the employer, bank accounts, credit cards, and bills. Since there may be multiple instances of each, the user interface program is set up to add these separately. Each of these functions is listed on the sequence diagram, using the service interface to manage the work. For each of these functions, the service interface first creates a new instance of the Employer, Bank Account, or Credit Card object. Once the object is created, the add verification method is called to insert the new object into the Loan Application object's collection.
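To make this flow concrete, here is a minimal Java sketch of the create loan application and add employer services. The class names, the data-structure parameters, and the method signatures are illustrative assumptions rather than the book's actual program code; they simply show the service interface creating the aggregate object and then inserting a new verification object into its collection.

import java.util.ArrayList;
import java.util.List;

// Marker interface for anything the loan application must verify.
interface Verification { }

class LoanApplicationData { /* screen fields bundled into one structure */ }
class EmployerData { /* employer name, address, contact person, ... */ }

class Employer implements Verification {
    Employer(EmployerData data) { /* copy the relevant attributes */ }
}

class LoanApplication {
    private final List<Verification> verifications = new ArrayList<>();

    // The constructor also creates the aggregated Customer and Property
    // objects from the same data structure.
    LoanApplication(LoanApplicationData data) { /* create Customer, Property */ }

    void addVerification(Verification verification) {
        verifications.add(verification);
    }
}

class LoanApplicationService {
    LoanApplication createLoanApplication(LoanApplicationData data) {
        return new LoanApplication(data);          // new loan application object
    }

    void addEmployer(LoanApplication application, EmployerData data) {
        Employer employer = new Employer(data);    // create the new instance
        application.addVerification(employer);     // insert it into the collection
    }
}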
Implementing services

After all of the information is entered into the computer, the operator can send a request to print the verification letters. The user interface will request the send verification letters service from the service interface. This request will be passed on to the Loan Application object, which will find each Verification object and request that it print its letter. This will continue for each Employer, Bank Account, or Credit Card. Notice the asterisk before each send method; this is the notation for a repeated operation.

When the verification letters return, each must be logged into the computer to indicate whether the data was verified. These services can be generalized into a common receive verification report method (see Figure 5-11). The loan processor first locates the loan application and the specific employer, bank, or credit card, then clicks a button indicating that the verification has been returned. This calls a service that sends the same message to the Loan Application object. The Loan Application object locates the corresponding Verification object and calls its "verification received" message.

While creating this sequence diagram, it became apparent that there were several problems with the object design. There is a broad assumption that the user interface and service interface can locate loan applications. This may be true, but there is no collection object specified to track the Loan Application objects and no services specified to perform this task (this was omitted to simplify the illustration).
Figure 5-11. Receive verification and loan approval
Another problem found by this service was that there are methods to add and delete Verification objects, but there were no navigation methods specified. Methods like find first, find next, find by key, and others will be needed to provide access to the Verification objects. Laying out the association between the Loan Application object and the Verification objects is easy, but remembering to add all of the methods to support the relationship is often more difficult. The sequence diagram makes these omissions easy to find. Figure 5-11 also shows the sequence required to check a loan application for approval. It checks that all verifications have been received and approved, then returns the result back to the user interface program. At that time, the loan officer can approve or deny the loan.
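The fragment below is a rough Java sketch of how the Loan Application object might carry out the send verification letters, receive verification report, and check approval services, including the navigation method that the sequence diagram showed to be missing. The names and method bodies are assumptions made for illustration, not code taken from the example application.

import java.util.ArrayList;
import java.util.List;

interface Verification {
    String getKey();                 // identifies the employer, bank, or credit card
    void sendLetter();               // print the verification letter
    void verificationReceived();     // log the returned verification
    boolean isVerified();
}

class LoanApplication {
    private final List<Verification> verifications = new ArrayList<>();

    void addVerification(Verification verification) {
        verifications.add(verification);
    }

    // Navigation method flagged as missing during diagramming: find by key.
    Verification findVerification(String key) {
        for (Verification v : verifications) {
            if (v.getKey().equals(key)) {
                return v;
            }
        }
        throw new IllegalArgumentException("No verification for " + key);
    }

    // The repeated "*send" operation from Figure 5-10.
    void sendVerificationLetters() {
        for (Verification v : verifications) {
            v.sendLetter();
        }
    }

    // The common receive verification report service from Figure 5-11.
    void receiveVerificationReport(String key) {
        findVerification(key).verificationReceived();
    }

    // Approval check: every verification must have been received and verified.
    boolean checkApprovalStatus() {
        for (Verification v : verifications) {
            if (!v.isVerified()) {
                return false;
            }
        }
        return true;
    }
}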
Business Object Architecture

Depending on the number of objects and the volume of activity, you have a number of choices for how the business object layer can be distributed and accessed. This is not really a design issue, but the consideration does affect how the business object layer is designed. The goal is to maximize throughput by keeping related objects close together with a minimum of network and system overhead. At the same time, the architecture must be flexible and allow for growth and redistribution of objects over multiple servers as the system demands increase.

Often, the best choice is to use middleware to bind the service interface to the high-level business objects, then place all objects with relations onto the same machine and link them together into tightly integrated program modules. These high-level business objects then become the partitions where object groups can be distributed as resources begin to fill up. Other partitions may be by geographical location or by business function. As the design progresses, these logical boundaries will become apparent and the objects can be distributed accordingly.
Summary

Business object design is a difficult, complex topic that cannot possibly be covered in sufficient depth in one chapter. Use the guidelines listed here as a framework for additional study using the Further Reading list at the end of this chapter. Some guidelines to follow include:

• A business object is a computer representation of a physical business entity.
• Approach business object design from the bottom up, selecting relevant actors and objects that participate in the use cases.
• Begin with a written narrative of the objects in business terms, describing each object's role and activities.
• As the objects begin to take shape, augment the description with the following characteristics:
  • Attributes—what the object knows
  • Methods—what the object does
  • States—the changes that occur due to process flow
  • Events—responses to the outside world
• Use class diagrams to define relations between the objects. These relations include aggregation, generalization, and association. Collections can also be useful when aggregating or associating many similar objects.
• When designing business objects in an application server environment, remember that reuse, concurrency, synchronization, repositories, and persistence all add further restrictions and constraints.
• Use sequence diagrams to outline how the service interface will use the business objects to perform its services.
References

Bassett, Paul. "Is Reuse a Transient Issue?" Component Strategies, January 1999: 64.

Carmichael, Andy, et al. Developing Business Objects. New York: Cambridge University Press, 1998.

Coad, Peter, and Mark Mayfield. Java Design: Building Better Apps & Applets. Upper Saddle River, New Jersey: Prentice Hall, 1997.

Gamma, Erich, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Reading, Massachusetts: Addison Wesley Longman, 1998.
Further Reading

Booch, Grady. Object-Oriented Analysis and Design with Applications. Reading, Massachusetts: Addison Wesley Longman, 1994.

Fowler, Martin, and Kendall Scott. UML Distilled. Reading, Massachusetts: Addison Wesley Longman, 1997.

Jacobson, Ivar, Grady Booch, and James Rumbaugh. The Unified Software Development Process. Object Technology Series. Reading, Massachusetts: Addison Wesley Longman, 1999.

Liberty, Jesse. Beginning Object-Oriented Analysis and Design. Chicago, Illinois: WROX Press Ltd., 1998.
Chapter 6
Designing the Persistent Object Layer

Just as the service interface layer connects the application server to user interface programs, the persistent object layer connects the application server to databases, object stores, and other external applications. Once the business objects perform the processes requested by the service interface, the resulting data must be stored for later use. The persistent object layer routes this data to relational databases or other forms of long-term storage.

In the business environment, data is most often stored in relational databases. Although there are other persistence alternatives, such as object database management systems (ODBMS) that directly store and retrieve objects, these products are just now starting to move into the mainstream. The relational database is a mature technology that has been refined over 25 years, and business data processing relies heavily on it. For the application server to fit into the business environment effectively, the persistence layer must bridge application server objects with the relational data model.

A number of methods can be used to implement persistent objects, and many of these will be explored in this chapter. Depending on the size and scope of the application server project, these methods can range from a simple set of collections, each representing similar business objects, all the way to a comprehensive persistence service that acts as an object broker, handling persistence, life cycle, and directory services for the entire application server. When data integrity is an important requirement, transaction objects can also be created that sit between the
business objects and the persistent layer. The design choices are almost endless.

This chapter will explore the considerations and constraints necessary to design the persistence layer of the application server. We will examine the following topics:

• The role of the persistence layer
• Relational database principles
• Designing a persistent object layer
• Using object-oriented databases
• Using objects to represent external systems
The Role of the Persistence Layer

The persistent object layer is a group of low-level objects and collections that retrieve and store business objects from relational databases, data warehouses, bulk storage devices, or external applications. When a business object is needed, the persistence layer must first locate the data or attributes of the object, then create a new instance using the data retrieved. For this to occur, the persistent objects must know both the structure of the particular business object and the structure and location of the data. This requires a large number of special-purpose objects that bridge the business object layer and the storage devices.

A good way to understand the role of the persistent object layer is to step through the process of retrieving and updating an invoice. A customer service representative enters invoice number 1234 into a user interface program, which then sends a "get invoice" request to the service interface. The service interface must locate the invoice object for invoice 1234 and send all relevant information back to the user interface program. Suppose, however, that the invoice object for this number cannot be located among the business objects currently in memory. To get the invoice object back into memory on the application server, the service interface must first send a request to a persistent object to create an invoice object for invoice number 1234. The persistence service requests the information for the specific invoice from the database server.
It then creates a new invoice object that encapsulates this information. Once this is complete, a reference to the invoice object is sent back to the service interface, which can request the methods from the invoice object. Invoice object 1234 is now back in memory. After the service representative enters the changes to the invoice, the user interface program sends the invoice data back to the service interface in the form of an "update invoice" request. The service interface then calls invoice object 1234's methods to update the data. Once the invoice object is updated, it must be sent back to the persistence layer, where its data can be stored into the database. If the invoice object is no longer needed, it is deleted. Now suppose, in the midst of this process, someone else wants to look at the same invoice. It would be convenient if the persistence layer could simply create a second object representing the same invoice, then send this information to the second computer screen. This will not work, however, since the second object would not know about changes that have been made by the customer service representative. Instead, the persistence layer must return a reference to the same invoice object so the data displayed remains consistent for both users. Since there are now two separate processes using the same invoice object, life cycle management becomes an important issue. When the customer service representative saves the changes to the invoice, the invoice must be sent back to the persistent layer to store the changes; but it must also remain in memory until the second user no longer needs the object. Once the second user is finished, the object representing invoice 1234 can be removed from memory. As the example illustrates, the persistence layer has many complex tasks, including object creation and tracking, database communication, concurrency, life cycle management, and garbage collection. Notice that many of these functions are the same as those provided by most distributed object middleware. Life cycle management can track object creation, concurrent usage, and garbage collection. Locating specific object instances is the function of naming and directory services, and synchronization can be handled by transaction services. The only piece missing is access to the database.
Relational Database Principles The relational database is a mature technology that has become the foundation of most business data processing. Because of its maturity (note that The Relational Model of Data, by E. F. Codd, was first drafted in 1969), there are a wide variety of relational database products that all conform to a set of unified industry standards (Date 1998). Database vendors compete in a mature market with products that are highly optimized, secure, and reliable, and sold at competitive prices. For large volumes of data, it would be difficult to build a business case for using anything other than a relational database. Since relational databases are common and most developers are familiar with the technology, this brief overview will concentrate on the basics of the data model and how it relates to object-oriented software design. For those not familiar with relational databases, see the references at the end of the chapter.
Database history The relational data model was originally developed in the early 1970s as an alternative to traditional file-oriented data processing. At that time, most processing was done in batch mode, merging changes punched on paper cards with large "master files" often stored on magnetic tape, since disk space was prohibitively expensive. Once the merges were completed, the information was distributed throughout the company using paper reports that often consumed several cases of paper. As online systems began to appear, it was apparent that the information had to be organized in a form that was easier to access. A variety of database models began to appear that organized information into more efficient, logical structures. The advantage of the database was that all of the company's data was now stored in a few logical structures. A customer's name was now stored in one location that kept data consistent between applications. Data was also accessible randomly, by multiple key values, without having to read the entire file. Instead of running long batch updates, data could now be updated online so changes could appear immediately throughout the company.
The relational data model The foundation of the relational data model is the concept that data can be broken into sets of small, independent tables (much like a large spreadsheet) each representing a set of related information. Each instance of data in the table is represented as a row of data items, while the columns separate all rows consistently by attribute type. In object terms, each row is an instance of an object of type table, with each column storing a specific attribute. Figure 6-1 illustrates a simple customer table listing a customer number, first name, last name, address, city, state, zip, and phone number. Any set of information can be organized into similar tables. As additional tables are created, the number of attributes can be minimized by replacing redundant information with a reference to another table using a unique identifier common to both tables (such as a customer ID). This reduces the amount of redundant data stored in the database and provides a navigable network of relations between the tables. Figure 6-2 illustrates a simple relationship between an invoice table and the customer table. The invoice table carries a unique identifier (the invoice number) followed by attributes listing the customer, the order date, and the total amount. Each customer can have multiple invoices, but there is no need to store the customer name, address, or phone number in each invoice, since they can be quickly retrieved from the customer table. In relational terminology, this is called a join operation.
Customer   First Name   Last Name   Address        City       State   Zip     Phone
1001       John         Smith       1234 S. Main   Denver     CO      80101   123-1234
1002       Fred         Jones       1500 Lincoln   Denver     CO      80101   123-1111
1003       Mary         Lamb        110 Main       New York   NY      10001   111-1111

Figure 6-1. Simple customer table
Customers

Customer   First Name   Last Name
1001       John         Smith
1002       Fred         Jones
1003       Mary         Lamb

Invoices

Invoice   Order Date   Customer   Amount
90001     06/15/1999   1001       $5,325.47
90002     06/17/1999   1001       $5,100.00
90003     06/17/1999   1002       $742.69
90004     06/18/1999   1003       $3,750.00
90005     06/20/1999   1003       $2,949.95
90006     06/22/1999   1003       $1,000.00

Figure 6-2. Relation between customer and invoice tables
Structured query language (SQL)

Over time the Structured Query Language (SQL) has emerged as the standard tool to access and manipulate the information in relational tables. This language provides a standard set of commands to store, update, delete, retrieve, and aggregate data. Information from one or more tables can be retrieved using the select command, which then creates a temporary table based on the criteria in the command. The SQL command:

SELECT customer, first_name, last_name, address, city, state, zip, phone
FROM Customers
WHERE state = 'CO'

gives a sales representative a list of customers to contact when he makes his monthly sales trip to Colorado. The resulting table is shown in Figure 6-3.
Customer   First Name   Last Name   Address        City     State   Zip     Phone
1001       John         Smith       1234 S. Main   Denver   CO      80101   123-1234
1002       Fred         Jones       1500 Lincoln   Denver   CO      80101   123-1111

Figure 6-3. Customers who live in Colorado
Data can also be joined using a similar SQL command:

SELECT invoice, order_date, customer, first_name, last_name
FROM invoices, customers
WHERE invoices.customer = customers.customer
AND order_date >= '06/17/1999'
AND order_date <= '06/20/1999'

which will show the names of customers who placed orders between June 17th and June 20th of 1999. This produces the table shown in Figure 6-4.

Invoice   Order Date   Customer   First Name   Last Name
90002     06/17/1999   1001       John         Smith
90003     06/17/1999   1002       Fred         Jones
90004     06/18/1999   1003       Mary         Lamb
90005     06/20/1999   1003       Mary         Lamb

Figure 6-4. Results of table join

In addition to selects and joins, the SQL language provides commands to add, modify, and delete table entries; create, modify, and delete table structures; and optimize data access by specifying index and sort sequences. Depending on the implementation, some relational database packages also provide functions to check references between tables, raising errors when, for example, a customer is added to an invoice that cannot be found in the customer table. Other extensions include stored procedures that can precompile frequently used processes and triggers that automatically call stored procedures when data is added to or deleted from a table. Relational databases are powerful tools optimized to manage large quantities of data.
Database middleware

One of the earliest applications of middleware was to connect diverse platforms and programming languages to database servers. In the early days before middleware, databases were usually accessed through a set of simple API calls from COBOL or other languages. Some vendors included preprocessors that translated language extensions into API calls to make programming easier and the code more readable. As time went on, SQL became the standard language of data access, replacing the proprietary command languages of the APIs and preprocessors. Finally, as client/server became more common, database middleware had to take over network chores as well as provide access to the database server. In addition to accessing the databases, this middleware had to marshal data into different data representations for a variety of development platforms and provide network services to transfer data between machines.

Today many database middleware choices are available. Although there are still some vendor-specific middleware implementations, most vendors have moved to a number of common industry standards. On the desktop platforms, ODBC (open database connectivity) is the most common, although Microsoft is moving towards ActiveX Data Objects (ADO, the successor to DAO), using its COM component model to encapsulate database functionality. In the Java world, JDBC (Java database connectivity) is now the most common database middleware choice.

All of these standards are based on SQL, passing commands as text strings and then receiving the resulting data set as an array of data. These middleware APIs are not difficult to use, but there is little reason to work
even at this level, since most programming tools provide their own support for database access. These tools provide frameworks that encapsulate database access and wizards that automatically generate the objects that represent queries, tables and result sets. Most C++ development tools provide wizards to build database objects while visual development environments like Microsoft Visual Basic and Inprise's C++Builder and JBuilder all come with drag-and-drop components that encapsulate database objects. If these are not sufficient there are also third party tools that provide similar enhanced functionality.
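As a point of reference, the following minimal JDBC sketch shows the pattern all of these middleware standards share: the SQL command travels as a text string and the matching rows come back as a result set. The connection URL, user name, and password are placeholders assumed for the example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class CustomerQuery {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:odbc:orders";   // placeholder data source name
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement stmt = con.prepareStatement(
                 "SELECT customer, first_name, last_name FROM Customers WHERE state = ?")) {
            stmt.setString(1, "CO");                       // bind the search value
            try (ResultSet rs = stmt.executeQuery()) {     // rows come back as a result set
                while (rs.next()) {
                    System.out.println(rs.getInt("customer") + " "
                            + rs.getString("first_name") + " "
                            + rs.getString("last_name"));
                }
            }
        }
    }
}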
Designing a Persistent Object Layer The persistence layer must recreate objects from relational data whenever an object is needed. It must also keep track of the objects once they are created so that these objects are not duplicated and data remains consistent. When attributes change, the relational data must be updated to reflect the changes; then, when the objects are no longer needed, they must be removed to optimize memory. If this were all that the persistence layer had to perform, it would be easy to create a simple general-purpose tool that loads and stores objects. Much of this functionality does exist in most middleware frameworks. Unfortunately, the persistence layer must also understand the structure of the business objects and how each object relates to the database structure. It is this requirement that complicates the task of building the persistence layer and requires a customized solution for each application server. A simple solution would be to implement persistence within each object. This is an acceptable option for small applications with a limited number of classes and object instances. The problem is that each object now has the added overhead of persistence, which, as the number of objects grows, quickly eats up system resources. Also, the application server must still implement some type of application-wide directory service to locate specific instances of each object. This is why a separate layer with a separate set of objects and services must be implemented to serve up and track all of the business objects. Business objects are then not cluttered with extraneous persistence overhead. They can be located and retrieved by a single service request, even when they are stored offline in the database. This layer can also synchronize
the objects with their database representations and distribute objects across multiple machines as system resources begin to diminish.
Persistence layer example To get a feeling for the requirements and design tradeoffs that must be addressed, this section will describe a simple persistence service that serves up customer objects. The customers can be from almost any business application, each having a customer identifier, name, address, city, state, zip code, and phone number. In addition to the customer objects, the corporate database also contains a customer table with all of this information plus other information not relevant to this application. Figure 6-5 shows a class diagram showing the classes used to implement the customer object server. The Customer Server object interfaces with the service interface or other business objects when any persistence service is needed. To load a Customer object, the find method is called to locate and return a reference to a specific Customer object. The service interface can make any changes needed to the Customer object by using the Customer object's methods. When the service no longer needs the object, the release method is called to store the data back into the database and release memory if there are no other references in use. Methods are also available to create new instances of the Customer object and to delete both the objects and their database entries. There are two other objects that support the Customer Server object, both representing collections of customers. The Customer Collection object holds references to the Customer objects in memory while the Customer Table object provides access to the customer entries in the database. Both have similar methods (find, add, and delete) to manage the collections. The Customer Table, which relies on the database server to manage the collection of objects, also has an update method to post changes to the database. The find method of the Customer Server object (the server) is called by passing the customer ID number to specify which customer is needed. The first step is to call the find method of the Customer Collection object (the collection) to see if this Customer object is already in memory. If it is, the reference counter is incremented and the Customer object reference is returned. If the customer is not in memory, the find method of
the Customer Table object (the table) is called. If the entry is found in the database, a new Customer object is created and initialized with the information from the table. This object is then added to the collection, the reference counter is incremented, and the reference is returned. If the customer is not in the database, an exception is returned.

Figure 6-5. Customer object server

When the service no longer needs the Customer object, it must call the release method to indicate that it has finished using the object. The release method will decrement the reference counter, then call the update method of the table object to post the changes back to the database. If the reference counter is zero, the Customer object is deleted from the collection and removed from memory.

New Customer objects can be built using the server's create method. The service interface creates the Customer object, then the create method
adds the new customer to the database. If the customer is already on file, an exception is raised. Once the customer information is stored, there is no reason to place the Customer object in the collection, since processing of the customer record is already complete.

A more difficult task is deleting a Customer object. This can only be done if the reference is held exclusively by one user, since deletion by one process will cause program errors when the second process tries to access a now-nonexistent object. Once the reference count is verified, the object is removed from the collection and the entry is deleted from the table.

The design shown above is simplified, but still illustrates many of the issues that must be considered when designing the persistence layer. There is always more than one class of objects used throughout the application, and many of these objects are complex objects formed by aggregation and association. Creating and locating objects will never be as easy as what was described above. At the same time, database structures seldom match the object structures, so loading and storing objects can be a difficult task.
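A bare-bones Java sketch of the Customer Server follows. It mirrors the find, release, and create methods just described and uses a simple reference counter; the method bodies and the stubbed-out SQL are assumptions for illustration, and a production version would add exception handling and real database access.

import java.util.HashMap;
import java.util.Map;

class Customer {
    final int customerId;
    String name, address, city, state, zip, phone;
    int refCount;                          // how many services hold a reference
    Customer(int customerId) { this.customerId = customerId; }
}

// Access to the customer table; a real version would issue SQL through JDBC.
class CustomerTable {
    Customer find(int customerId) { /* SELECT ... WHERE customer = ? */ return new Customer(customerId); }
    void add(Customer c)    { /* INSERT INTO Customers ... */ }
    void update(Customer c) { /* UPDATE Customers SET ... */ }
    void delete(Customer c) { /* DELETE FROM Customers ... */ }
}

class CustomerServer {
    private final Map<Integer, Customer> collection = new HashMap<>();   // objects in memory
    private final CustomerTable table = new CustomerTable();

    // Return the in-memory object if present; otherwise load it from the table.
    synchronized Customer find(int customerId) {
        Customer customer = collection.get(customerId);
        if (customer == null) {
            customer = table.find(customerId);     // exception if not in the database
            collection.put(customerId, customer);
        }
        customer.refCount++;
        return customer;
    }

    // Post changes to the database and drop the object once no one holds it.
    synchronized void release(Customer customer) {
        table.update(customer);
        if (--customer.refCount == 0) {
            collection.remove(customer.customerId);
        }
    }

    // New customers go straight to the database and are not kept in the collection.
    synchronized Customer create(int customerId) {
        Customer customer = new Customer(customerId);
        table.add(customer);                       // raises an error if already on file
        return customer;
    }
}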
Generalized object servers One major design problem is managing the creation of new business objects. Few objects stand on their own and most rely on aggregation, association, and inheritance to perform useful work. Depending on the business object design, some objects aggregated into one object structure may stand alone in another set of relations. While a property object may be aggregated into a loan application, it may also be part of a collection in the property management portion of the application. These relations must be considered when creating and managing objects in the persistence layer. The first step is to generalize the object server to handle a variety of classes. Instead of having a large number of different object servers managing each individual object class, one object server can serve up a variety of classes. This can be done by tying together many object servers or by generalizing a single object server to handle a variety of object classes. Figure 6-6 shows how several servers can be combined using a frontend object server to route the requests to the appropriate object server. This is effective when there are a limited number of independent object classes with only one or two complex objects.
Figure 6-6. Horizontally structured object server
Figure 6-7 shows another alternative using inheritance to consolidate the object servers, collections, and table classes together. This approach requires that all of the collections and tables implement the same methods and that all business objects are derived from a common parent that can be referenced in the collection. Most object frameworks, including Java's standard class library and Microsoft's Foundation Classes (MFC), provide a base class that can be used as a parent to all of the business objects. The more difficult issue occurs when complex business objects are created from lower-level objects. Although the high-level objects are handled as independent objects, the low-level objects encapsulated within these business objects must also be handled independently, since each can be used concurrently in any number of different objects. When the service interface requests an invoice, the invoice may aggregate a Customer object
and a Sales Rep object, while associations exist with several order items that associate with their corresponding product items. Most likely, a second invoice accessed by another user will also associate the same sales representative and several of the same product items. These associations are not a problem, since they are loosely held relations that are not directly integrated into the Invoice object. Each can be created independently, with references used to represent the associations.

Figure 6-7. Inheritance-based object server

The difficulty lies in the aggregations, since these are encapsulated in the Invoice object and not exposed outside of a specific Invoice object. There are a couple of options for solving this problem. The first is to re-evaluate the business object design and change aggregations to associations. If this is not possible, a second option is to handle them as independent attributes of the larger object, not as shared objects. Changes made would then not be recognized by the other processes until the higher-level business object is released and the attributes are
stored back into the database. This simplifies object management, but slows down communication of changes, because the database is now responsible for tracking them. The first option should be used when data has a large number of changes that must be propagated throughout the system; the second option should be used for more static, informational data like names and descriptions. When the object server receives a request to find or create a complex object, the server must break down the request and call additional find requests to locate each lower-level object. Once each object is found, its references can be assembled into the complex object and, as changes occur to any lower-level object, they will be instantly reflected to other objects that reference the same instances.
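The sketch below illustrates the idea of breaking a complex find request down into lower-level find requests, using an invoice as the example. The server classes, the way the customer and sales representative identifiers are obtained, and the collection types are all assumptions for illustration; the point is simply that lower-level objects are served up independently and only their references are assembled into the complex object.

import java.util.ArrayList;
import java.util.List;

class Customer { }
class SalesRep { }
class OrderItem { }

class Invoice {
    final int invoiceNumber;
    Customer customer;                 // shared reference, not a private copy
    SalesRep salesRep;
    List<OrderItem> items = new ArrayList<>();
    Invoice(int invoiceNumber) { this.invoiceNumber = invoiceNumber; }
}

class CustomerServer  { Customer find(int customerId)   { /* load or reuse */ return new Customer(); } }
class SalesRepServer  { SalesRep find(int salesRepId)   { /* load or reuse */ return new SalesRep(); } }
class OrderItemServer { List<OrderItem> findByInvoice(int invoiceNumber) { return new ArrayList<>(); } }

class InvoiceServer {
    private final CustomerServer customers = new CustomerServer();
    private final SalesRepServer salesReps = new SalesRepServer();
    private final OrderItemServer orderItems = new OrderItemServer();

    // In practice the customer and sales rep identifiers would be read from
    // the invoice row itself; they are passed in here to keep the sketch short.
    Invoice find(int invoiceNumber, int customerId, int salesRepId) {
        Invoice invoice = new Invoice(invoiceNumber);
        invoice.customer = customers.find(customerId);     // lower-level find requests
        invoice.salesRep = salesReps.find(salesRepId);
        invoice.items = orderItems.findByInvoice(invoiceNumber);
        return invoice;
    }
}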
Tracking the objects

Once all of these objects are created, they must be tracked in some form of collection so they can be located and referenced quickly when other services need them. Several design decisions and tradeoffs must be considered when structuring these object collections, including the number and organization of each collection, the structure of the collection, and how objects can be organized and indexed to locate them quickly.

Most programming environments have several collection classes that can be used to store references to objects. As each new object is created, it will have a reference pointer that can be shared between the collection, the service interfaces, and any number of business objects floating around the application server. Objects can be structured as lists, tables, maps, or other data structures. Selecting the best data structure will be determined by the number of objects stored, the need to access objects either sequentially or randomly, and the number of different access paths needed to locate the objects quickly.

In the case of most business objects, the object server will have to locate each object by both object class and a unique identifier, such as customer number 1234 (class Customer, unique identifier 1234). Within the MFC architecture, there is a collection class called CMapStringToOb that maps string keys to objects. A data structure similar to this often provides the fastest access, because the class name and the identifier can be aggregated to form a character string that then acts as the access key to
each specific object. This string is then hashed to provide quick access to the associated object.

As the object collection grows, another option is to distribute the objects using distributed object middleware. The naming service can then be used to locate each object by class name and identifier. The middleware can take over much of the object server role. There are some tradeoffs with this approach, since the middleware will require more system overhead and it will increase network traffic. At the same time, the application server gains scalability and additional servers can be added as the application server continues to grow. Using this approach, the only programming task remaining is mapping the objects to the relational database.
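A small Java sketch of this keying scheme is shown below; it is the same idea as the string-keyed map described above, with the class name and identifier concatenated into a single hashed key. The registry name and method signatures are illustrative assumptions.

import java.util.HashMap;
import java.util.Map;

class ObjectRegistry {
    private final Map<String, Object> objects = new HashMap<>();

    // Build the access key, e.g. "Customer:1234".
    private static String key(Class<?> type, Object id) {
        return type.getSimpleName() + ":" + id;
    }

    void add(Class<?> type, Object id, Object instance) {
        objects.put(key(type, id), instance);
    }

    Object find(Class<?> type, Object id) {
        return objects.get(key(type, id));     // null when the object is not in memory
    }

    void remove(Class<?> type, Object id) {
        objects.remove(key(type, id));
    }
}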
Objects and relational databases In the quest to move to object-oriented software development, there has always been the nagging problem of how to match object technology to relational data. The trade literature talks of the incompatibilities, or, in object-speak, the impedance mismatches between the two technologies. To some extent, this is a valid issue. Relational databases rely on small, independent, flat, two-dimensional tables, while object technology provides a wealth of complex data structures and object relations. Flattening these structures into two-dimensional data representations, then later reassembling the flat data back into these multi-dimensional objects can be a difficult task. When creating a purely object-oriented application, persistence is a nasty side issue you must consider, since the objects must be reloaded when the application stops and starts. Mapping this type of an application to relational data can be a very difficult task. In the business-oriented application server environment, this mismatch is usually not as difficult, since the application is oriented more towards data management and the relational data probably existed long before the application server was designed. This design may not have been based on the relational model, but the forms and business processes that drive the application were originally based on some previous form of the current data model. This limits the mismatch and will make interfacing between objects and relational data much easier to handle.
Objects from databases In most cases, the relational data to create the objects will be well established. This is both a blessing and a curse. Since the data is already there, there is no need to design table structures to support the objects. Unfortunately, the data is often structured in a way that is not compatible with the object design and, in the case of older, legacy systems may be in very strange haphazard forms, requiring access to a number of different data sources to retrieve the information needed to create one specific object. As shown in our simple object server above, each business object has a corresponding table object that represents the data as it is stored in relational form. Often the source of each object's data does not reside in a single table, but is accessible by one or more SQL commands that can locate, join, and retrieve the data. Once the data is located, it can be combined to form a new instance of the requested business object. After the object is no longer needed, the object is released. This operation calls the update method of the table object, which unloads the data from the object and updates the appropriate information using one or more SQL commands. The delete command also acts in a similar manner, deleting the data from the database that was represented by the object. Examining the process used to retrieve an invoice will illustrate the complexities involved in retrieving data and creating complex objects. Figure 6-8 shows a simple data model of the tables that hold the invoice information. The figure uses a variant of the entity-relation notation (as created by Sybase's Star Designer product), with each table represented as a box listing the table name at the top, followed by the data items contained in each table. Primary keys are underlined and relationships are indicated as lines drawn from one table to another, annotated by the item relations. See C. J. Date's Introduction to Database Systems for more information on database design and modeling notation. The primary table is the Invoice table. The primary key is the invoice number, followed by a customer number, shipping information, order date, and other related items. The relation between the Invoice table and the Customer table can be used to retrieve the customer's name, address, and other demographics, using the customer number to join the two tables. For each order, there are a number of order items, indexed by the invoice number combined with a sequence number (line 1, 2, 3, etc.), that include the
product number and the quantity ordered. To obtain information about the product by product number, the relationship between the order items and the Products table can be used to access the description, shipping weight, and other information. Note that this database design is simplified, limiting the number of items and tables within the database.

Figure 6-8. Database design for invoice example. Tables: Invoices (invoice, customer, ship_contact, ship_address, ship_city, ship_state, ship_zip_code, order_date, ship_date, shipping_weight, shipping_charge, sales_tax, discount, total_billed), Order Items (invoice, sequence, product, quantity), Products (product, description, shipping_weight, unit_price), Customers (customer, name, address, city, state, zip_code, phone, class), and Customer Classes (class, description, standard credit limit).

Given this existing database structure, it makes sense to design the objects to reflect their corresponding database structures. An Invoice object can encapsulate most of the same information as the Invoice table. The same will be true of customers, order items, and products. Figure 6-9 shows a compatible object design that reflects the relational database structure.

Figure 6-9. Object design from Invoice database (Invoice, Order Item, Product, Customer, and Customer Class objects).

In the Invoice object there are some slight differences from the Invoice table. The Invoice table has a couple of poor design choices that must be addressed in the object design. The shipping weight and the total billed items can both be derived from the order items and products table, so
these are redundant fields and there is a possibility of incorrect data if they are not computed correctly. These have been replaced by methods that calculate the correct amounts. A good case could also be made that the shipping charge, sales tax, and discount should be placed in the Order Items table instead, since each of these is really a line item on the invoice. Depending on the specific application, treating them as line items could create a cleaner design and simplify programming. Nevertheless, the Invoice object retains the original design choice, keeping the shipping charge, sales tax, and discount in the Invoice object. Also, since the customer is now aggregated into the Invoice object, there is no reason to carry the customer number inside the Invoice object.
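A short sketch of that replacement is shown below, under the assumption that each order item carries a quantity, unit price, and per-unit shipping weight: the redundant columns become methods that derive the totals on demand. The class layout is illustrative only.

import java.util.ArrayList;
import java.util.List;

class OrderItem {
    int quantity;
    double unitPrice;
    double shippingWeight;               // per-unit weight from the Products table
    OrderItem(int quantity, double unitPrice, double shippingWeight) {
        this.quantity = quantity;
        this.unitPrice = unitPrice;
        this.shippingWeight = shippingWeight;
    }
}

class Invoice {
    double shippingCharge, salesTax, discount;
    private final List<OrderItem> items = new ArrayList<>();

    void addItem(OrderItem item) { items.add(item); }

    // Derived rather than stored, so it can never disagree with the line items.
    double getShippingWeight() {
        double total = 0;
        for (OrderItem item : items) {
            total += item.quantity * item.shippingWeight;
        }
        return total;
    }

    double getTotalBilled() {
        double lines = 0;
        for (OrderItem item : items) {
            lines += item.quantity * item.unitPrice;
        }
        return lines + shippingCharge + salesTax - discount;
    }
}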
There are many reasons for poor database design. Sometimes they are just that, poor design choices, in which case they should be ignored and the object should be designed independent of the database design. In other cases, there are deficiencies in the database software that have to be accommodated. In earlier database management systems, tables were often denormalized to speed data access. Totals, like those illustrated in the above example, were added to provide quick access without waiting for calculations. Again, if this is the case, correct it in the object design, then remember to update these data items as they are stored back into the database.

The remaining cases of what may appear to be poor database design could reflect business processes or requirements that are not yet known and may impact object design. In the example above, the total shipping weight may differ from the sum of the individual shipping weights because of repackaging or special handling requirements. If the vendor is selling computers, the invoice may include a case, motherboard, hard disk, CD-ROM, video and network cards, and so on. Each of these products shipped separately would include additional packaging, but when assembled, the packaging is thrown away. Once the computer is built and packaged, it is then weighed and a new shipping weight and charge are entered. The individual product shipping weights are only on file for use when shipping the product independently. If this is the case, both shipping weights are needed, but the total weight may be overridden by the computer assembly staff.
Databases from objects In other cases, the database tables do not exist and new tables must be designed. Since there are no legacy data structures, the tables can closely reflect the object design, but must also fit the rules and requirements of the relational model. If possible, one-to-one mappings between the objects and the tables can often eliminate quite a bit of work when creating the persistent object layer. At the same time, good database design should not be ignored just to simplify object mapping. Report writers and other external systems will also need to access this data. Object design is never completely compatible with data modeling, so follow best practices for each, justify the reasons for breaking the rules, and make sure that they do not cause difficulties later.
Designing the Persistent Object Layer
Scalability Scalability is the ability of a software product to easily adapt and grow as the volume of transactions, objects, and data increases. What works well in a small system will often overwhelm system resources as the workload increases. It seems that no matter how much growth is expected, the ultimate needs of the software will always grow to exceed its capability; so the more scalability the better. Scalability can be achieved in a number of different ways, meeting different growth requirements. Since one of the main advantages of the application server environment is the ability to use multiple computers, distributed object support at the business object level is an excellent approach. In parallel with distributed processing is multi-threading, allowing processes to run simultaneously on one or more computers. Finally, transaction support protects the integrity of the data even when objects are distributed across multiple computers. Each of these will allow the application server to grow as business needs increase.
Distributed object support The level and partition of distributing objects across multiple machines is a very tough call. Too little distribution and the system will max out server capacity and the distribution strategy will have to be reworked. Too much distribution and the system will bog down from excessive network traffic. This decision will be one of the toughest choices and should be approached carefully. Ideally, all objects can be distributed across all machines. This quickly solves all of the problems and allows total flexibility in load balancing and scalability. If only distributed systems were that simple! Distributed objects communicate by way of network traffic, even when both objects are on the same machine. This quickly overloads the network and slows object communication to unusable levels. The best solution is to partition objects into groups, then create distributed interfaces to communicate between the partitions. There are several ways to partition the application server without adding too much complexity. The first is to partition the application server vertically, breaking up the server into several smaller functionally based
servers. An accounting application server may be broken into general ledger, payables, receivables, inventory, and so on. Although some objects may be duplicated and exist concurrently on several servers, the database server will synchronize the data while still ensuring data integrity.

A second approach is to use the vertical partitions, but send service requests between servers to eliminate duplication. Using the same example, the payables, receivables, inventory, and other application servers send service requests to the general ledger application server when it comes time to post journal entries. This way, fewer objects will have to be duplicated between servers, and the servers communicate through the same service interfaces that are already available to the user interfaces.

Within the application server itself, there may also be some need to distribute objects. When this need occurs, the less distribution, the better. The best option is to distribute high-level business objects, allowing lower-level objects to be encapsulated within the higher-level objects. This is very application-dependent since, as we saw earlier, lower-level objects often are encapsulated in many different higher-level business objects.

Object distribution is still a difficult task that must be approached with caution. Thorough understanding of the inner workings of the middleware implementation will help you to create an efficient distributed object design.
Multi-threading

Another consideration that can help improve the efficiency of the persistence layer is multi-threading. Most operating systems and programming languages now provide the ability to have multiple processes running concurrently within the same program. This allows objects to act in the background on their own without requiring an external method call.

Within the persistence layer, the task of synchronizing data between the objects and the database can be done during otherwise idle time. A thread can sit in the background and periodically check each object stored in the persistent collection to see if any changes have been made to the attributes. If a change has occurred, the thread can send an update to the database so the change can be reflected on the server. This thread can be tuned to check objects at a set interval of milliseconds, balancing computer time between services and data operations. There are many ways to implement this process. One of the simplest is through change flags that are set when an attribute changes, then cleared when the database is updated. As long as the method is standardized across all objects, the thread can detect changes and synchronize the database in a timely manner.

In addition to persistence, multi-threading can be used by business objects to perform process-intensive activities in the background, outside of the control of the service interface. Allowing multiple processes to execute code simultaneously will greatly speed the throughput of the application and improve response time. Learning how to harness this power takes time, but provides great benefits.
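Here is a minimal sketch of such a synchronization thread in Java. The PersistentObject and Database interfaces, and the isDirty/clearDirty change flags, are hypothetical stand-ins for whatever the persistence layer actually provides.

```java
import java.util.Collection;

// Hypothetical minimal interfaces for the sketch.
interface PersistentObject {
    boolean isDirty();    // change flag: set when an attribute changes
    void clearDirty();    // cleared once the database has been updated
}

interface Database {
    void update(PersistentObject object);   // write changed attributes to the DBMS
}

// Background thread that sweeps the persistent collection and writes any
// changed objects back to the database during otherwise idle time.
public class PersistenceSyncThread extends Thread {
    private final Collection<PersistentObject> cache;   // objects held in memory
    private final Database database;
    private final long sweepIntervalMillis;             // tune to balance services vs. data work

    public PersistenceSyncThread(Collection<PersistentObject> cache,
                                 Database database, long sweepIntervalMillis) {
        this.cache = cache;
        this.database = database;
        this.sweepIntervalMillis = sweepIntervalMillis;
        setDaemon(true);   // do not keep the server alive on shutdown
    }

    public void run() {
        while (true) {
            synchronized (cache) {
                for (PersistentObject object : cache) {
                    if (object.isDirty()) {
                        database.update(object);   // push the change to the server
                        object.clearDirty();
                    }
                }
            }
            try {
                Thread.sleep(sweepIntervalMillis);
            } catch (InterruptedException e) {
                return;   // stop when the server shuts down
            }
        }
    }
}
```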
Transactions

Finally, the persistence layer must also protect the integrity of the data. Good business object design will go a long way towards ensuring data integrity, but often, a disruption in the program flow will allow incomplete data to be stored back into the database. Another step in protecting the data is to set transaction boundaries, requiring all related changes to be grouped into a single update that is rolled back if any errors occur. Most relational databases provide transaction facilities, and these are usually all that are needed to ensure consistent data. A single "begin transaction" operation will mark the beginning of the sequence, then either a "commit" or "roll back" operation can be requested to store or cancel the set of updates consistently. When more than one database is involved, additional transaction capabilities can be added by either including transaction server middleware or programmatically controlling several different database transactions.

Within the persistence layer, there are several ways to implement transactions. The approach selected will depend on other design factors. The simplest approach is to add transaction boundaries within the persistent object server. When a complex object is released, a "begin transaction" operation is posted, then each lower-level object is released and written to the database. After all lower-level objects are released and stored, the transaction is completed by issuing the "commit" operation.
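As an illustration, here is a minimal sketch of this simplest approach using JDBC. The table and column names are hypothetical; the JDBC calls themselves (setAutoCommit, prepareStatement, commit, rollback) are standard java.sql.Connection methods.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class OrderReleaseExample {
    // Store an order header and its lower-level order items as one unit of work.
    public void storeOrder(Connection connection, int orderNumber,
                           int[] itemNumbers, int[] quantities) throws SQLException {
        connection.setAutoCommit(false);                       // "begin transaction"
        try (PreparedStatement header = connection.prepareStatement(
                 "INSERT INTO orders (order_no) VALUES (?)");
             PreparedStatement line = connection.prepareStatement(
                 "INSERT INTO order_items (order_no, item_no, qty) VALUES (?, ?, ?)")) {
            header.setInt(1, orderNumber);
            header.executeUpdate();                            // high-level object
            for (int i = 0; i < itemNumbers.length; i++) {     // lower-level objects
                line.setInt(1, orderNumber);
                line.setInt(2, itemNumbers[i]);
                line.setInt(3, quantities[i]);
                line.executeUpdate();
            }
            connection.commit();                               // store the whole set
        } catch (SQLException e) {
            connection.rollback();                             // any error cancels the whole set
            throw e;
        } finally {
            connection.setAutoCommit(true);
        }
    }
}
```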
For more complex objects or objects that span multiple databases, the object server can serve up transaction objects that can be used by either high-level business objects or service interface code. When the transaction object is created, it sends the "begin transaction" command to the databases; then, when the object is released, it commits the transaction. If an error occurs, the object is either discarded or lost and the transaction does not commit.

Finally, for the most complex transaction requirements, the application server can be built using transaction middleware such as Tuxedo or MTS. These middleware products provide transaction-based distributed objects so transaction capabilities are built directly into the middleware. Complex transaction boundaries then become part of the business object implementation.

Transactions, threads and distributed objects all can add scalability to the persistence layer. Chapter 13 will examine many of these techniques in far greater detail.
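Before moving on, here is a minimal sketch of the transaction-object approach described above, assuming a single JDBC connection. The class and method names are hypothetical; a multi-database version would hold one connection per database and coordinate them.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Creating the object begins the transaction; commit() ends it; if the object
// is released without commit(), the updates are rolled back.
public class TransactionObject {
    private final Connection connection;
    private boolean committed = false;

    public TransactionObject(Connection connection) throws SQLException {
        this.connection = connection;
        connection.setAutoCommit(false);   // "begin transaction"
    }

    public void commit() throws SQLException {
        connection.commit();
        committed = true;
    }

    // Called by the object server when the transaction object is released.
    public void release() throws SQLException {
        if (!committed) {
            connection.rollback();         // discarded or lost: cancel the updates
        }
        connection.setAutoCommit(true);
    }
}
```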
Using Object-Oriented Databases Since most business data already resides in relational databases, it makes good business sense to continue using the relational data model. But this technology does have some drawbacks. All data must conform to a set of predefined rows and columns. Data must also fit within the limited set of predefined data types defined by the DBMS vendor. Often, object models have difficulty fitting within these bounds. To solve this problem, a new data model has emerged based on object technology. This model, the object-oriented database management system (ODBMS), stores objects instead of data. Each object is derived from a persistent base class that can automatically load and store itself. The ODBMS provides its own naming, persistence, and life cycle services so that when an object is needed, it is automatically swapped into memory. The ODBMS monitors object usage, and when memory is needed or the object has not been used for a predetermined amount of time, the object is written back into the ODBMS and is removed from memory. In addition to naming, persistence, and life cycle services, most ODBMS packages can also distribute objects across multiple machines, and many provide transaction services to roll back changes when exceptions occur. Some even provide "pipelines" that act as gateways between
the ODBMS and relational databases so that existing relational data can be accessed and updated automatically (Saljoughy 1997). As these products mature, they may eventually become the logical choice for handling persistence and object distribution for application servers. Until then, there are issues that should be considered before adopting ODBMS technology. The first issue, already addressed, is product maturity. The relational model has become almost a commodity with standard interfaces and functionality. If there are problems with a relational implementation or the database must be moved from one platform to another, there is little difficulty making the transition between vendors. This is not the case with the ODBMS market. Many products are still language-specific with marked differences in implementations. These standards for implementations and terminology are still maturing. Another issue is scalability. As the application server gains new functionality, as the number of classes and instances of classes grow, and as the transaction volumes increase, will the ODBMS have the horsepower to maintain response time and throughput? Remember that in an ODBMS, each customer, invoice, and order item will become a distinct object that will have to be managed by the ODBMS. Just as relational databases quickly grow from thousands to millions of entries, the ODBMS must have the capacity to manage this same, growing volume of distinct objects. Finally, an issue that has already been addressed several times is that of application integration and compatibility. Unless the ODBMS can also present its data as a relational model, there is no backwards compatibility for existing applications. All applications will have to be replaced, or redundant data will have to be managed in a separate relational database. Report writers and other tools already in use throughout the organization will have to be replaced, causing new learning cycles, confusion, and difficulty. All of these issues have to be addressed before ODBMSs will be widely adopted in the business environment. Many vendors are currently working on ODBMS implementations that will solve these problems, and over the next few years, this technology should mature. If they succeed, there will be little need to design a separate persistence layer, since all of the functionality will be provided by the ODBMS.
Using Objects to Represent External Applications In addition to relational or object-oriented databases, data will often come from other external systems. These sources can still be represented in the persistent object layer using the same methods as table objects, but these translation objects will retrieve and store information either through external services or through other network requests. Chapter 7 looks at how to interface the application server to external systems and the issues of accessing legacy data.
Summary The persistence layer is responsible for storing and retrieving data between business objects and databases. It also acts as object broker, creating and removing business objects and managing their life cycles. When designing the persistent layer, use the following guidelines: • Review database design documents as well as the database structure to understand how the existing data is stored. • Create mappings between the business objects and the existing databases. • Design new tables to accommodate new data that will be stored and handled by the business objects. • Design persistence server objects that can be used to request and store business objects. • Design mapping objects, used by the persistence server objects, that can create and store each business object or object structure from the database objects. • Consider using middleware to manage the life cycle and directory services required by the persistence objects. • Consider multi-threading to optimize persistent services. • Consider implementing transactions if data integrity is critical.
References Date, C. J. "The Birth of the Relational Model." Intelligent Enterprise, October 1998: 61-63. Saljoughy, Arsalan. "Object Persistence and Java." Java World, May 1997. Available from http://www.javaworld.com/javaworld/jw-05-1997/jw05-persistence.html
Further Reading Date, C. J. Introduction to Database Systems. Reading, Massachusetts: Addison Wesley Longman, 1990.
Chapter 7
Integrating Existing Systems and Legacy Software

Legacy software—the words conjure up images of glass-walled rooms, guys with crew cuts, lab coats, and horn-rimmed glasses, COBOL, FORTRAN, RPG, ISAM, line printers, tape drives, and maybe even punch cards: everything that's ancient and evil, old and obsolete. This is the image that vendors want their customers to see as they show off their latest client/server tools.

But in reality, legacy software is a core resource. Those ancient COBOL programs are the tools that keep the business running smoothly. If this were not true, the Y2K problem would never have been an issue and all of this ancient software would have been replaced long ago. It is a tribute to those old COBOL programmers that the code still plays an important part in their organizations 15 or 25 years later.

This chapter will examine how to integrate application server technology into the existing information system environment. Topics will include:

• Design issues for application integration
• Application mining
• Turning subroutines into services
• Input and output streams
• Accessing application databases
• Synchronizing transactions
Design Issues for Application Integration When approaching application integration, remember that there is no one best solution. Integration tasks will vary depending on the hardware and software platforms, system architecture, communication links, data models, and a host of other factors. The level of integration may also vary based on the amount of information needed and the directions of data flow. A task may require a simple data transfer, or it may need to share procedural code. What worked when linking to the general ledger system may not work when accessing inventory. The first question to ask is, how necessary is this link? Any integration effort will take time and resources away from other development tasks. If the only need is to populate a list box or to look up static data, it makes more sense to have someone key the information or routinely copy the table onto the local database server. Another option is to periodically replicate data between the two databases. These simple, less sophisticated processes will often solve the problem without impacting tight development schedules. If a simple replication process is not sufficient, then you must perform analysis to determine how to link the external applications. Here are some of the design issues you must address when considering application integration: • Take inventory of the existing applications and determine the functionality that is already available. This is often referred to as system mining or application mining. • Determine the level of integration required. Data transfer is often easier than remote procedure calls, but there are tradeoffs and application needs that will determine which approach is better. • Consider architecture and networking. The difficulty in accessing older, legacy systems often makes data transfer a better option, but if high-speed network connections are available, access to external services may make more sense.
• Examine the design of the external programs. This will often limit the amount of code sharing you can accomplish. Many systems tightly integrate the business functions with the presentation code. If integration is too tight, it will be very difficult to extract business logic. All of these issues must be considered when choosing an integration strategy.
What Do We Have—Application Mining Before you can design an integration strategy, you must study the external systems to determine what functionality can be exposed to the application server. You can do this by examining design documentation, program code and database models. In addition to the actual code, there will also be business policies and security issues that you must investigate. Each employee's monthly salary amount could be used to assign priorities within an email system, but it is highly unlikely that the human resource manager would release these numbers. When approaching application mining, start by determining the best, yet simplest level of integration desired. A remote procedure call is often advantageous when posting transactions into an external system because the procedure will encapsulate logic to validate the data and protect the integrity of the database. When you need lookups, direct database access will often be a better choice. Make sure that the access is as simple as possible, but choose an access strategy that matches the needs of the application. See the guidelines listed in the Summary for help with assessing your needs. Once you've chosen an ideal level of access, examine the application to determine the least invasive approach that produces this level of access. Minimize changes to the external application and use existing code when possible. Program changes incur a high level of risk and add the possibility of introducing errors in the external system. This is even more risky when attempting to modify older, legacy code, since software development models have changed and it may be difficult to understand what the original code was supposed to accomplish. Often, middleware tools can be used to isolate an interface that will expose existing code. Remote procedure calls or distributed objects can
isolate the code and enforce access security. Message queues and message brokers can provide pipelines that will route data and store requests even when network connections are intermittent. You can also use transaction monitors to coordinate activity between a variety of external applications.
Turning Subroutines into Services

The safest, but usually the most difficult and expensive, approach to application integration is to execute program code from the external application directly. This ensures that the proper sequence of events occurs, and error checking and exception handling can protect the integrity of the data. You can perform execution through remote procedure calls, distributed object architectures, or other more direct approaches. Each service is exposed to the application server using IDL or some other custom interface. You can then call this interface through a proxy object, located in the business object layer of the application server, that implements methods representing these external services. The proxy object can also route exceptions and errors back from the external services if a problem occurs.

One difficult restriction of direct program execution is that the external system must be available at the same time that the application server performs its activities. If the external system is down or communication cannot be established, the application server will either wait indefinitely or crash. Relying on two separate systems doubles the probability that the system will not be available, and this may not be an acceptable option. If the application server performs mission-critical services, the design must accommodate any possible loss of communication and continue to work in spite of these problems. An alternative design using message passing or replicated data may be a better solution.
Proxy Objects

To isolate the external legacy or existing system effectively, you should represent each within the application server by one or more proxy objects. Although these objects expose the external services to the application server, they should still conform to the same rules and guidelines as any other business or persistent object. Each proxy object should represent
a business entity that models business activities, not just external software functions. Once this is accomplished, these new business or persistent objects can seamlessly interact with the service interface and other business objects to perform the services needed by the user interface programs. The only difference is that they call external software to perform their methods.

In an Internet electronic commerce application, the order entry services will need to check inventory availability and reserve the number of items ordered. If the inventory system resides on an IBM mainframe, the reserve inventory function can be exposed as a remote procedure call. Once you've exposed the function, there are many different ways to call it from the application server.

The first option is to execute the check inventory and reserve inventory methods inline within the order item objects. The remote functions to check and reserve inventory are called each time a new order item is created. The main problem with this approach is that these functions are integrated too tightly into the application server. The initialization procedures required to set up the remote procedure calls will either require complex linkages when creating new order items or will add a high level of overhead into each order item object. Since the remote procedures aren't isolated, locating the links between systems will be difficult when changes are made to the inventory system. This is not a good design choice.

A second option is to create either an object or an interface called Inventory System that exposes the two methods and enables them to be called. This solves the initialization problems because the linkage can be set up during the constructor. Methods that mirror the remote functions can send the item numbers and quantities on to the remote procedure calls. This is better than the first option, and may be adequate for seldom-used functions, but is still not ideal. It does not conform to the same level of abstraction as the other business objects, being described in software terms, not business terms. This approach can be used, but a separate business object would fit better in the application server framework.

The best option is to create a separate persistent inventory object that represents the legacy inventory system. When a customer chooses an item on the Web page, a message is sent to the service interface to add the item to the order. The service interface first requests a new order item from the inventory using a getInventory method, then adds this item to
the order. The getInventory method first calls the remote lookup function to retrieve the item number, description, and quantity on hand. If the quantity on hand is sufficient to fill the order, a new order item is created and a remote reserve inventory call is sent to the inventory system. Once the inventory is reserved, the inventory object returns a new order item object which can be added to the order. If there is insufficient inventory, an exception can be sent back and the customer can choose to change the order or select another item. By conforming to the same requirements as other business or persistent objects, the programmer does not have to take time to research how the object will be used.

A final consideration in proxy object design is to determine its life cycle and how that matches the availability of the external system. Since the remote system may not have the same availability as the application server, you must put processes into place to determine when the proxy object should be created, how long it should remain in memory, and when it should be removed. In the case of the inventory object described above, it must have the same availability as the application server; otherwise, the order entry operation will fail. If the inventory system is not available at the same time, the remote procedure call design will not work and you may need to explore other design alternatives. Message passing or other less timely methods may be required to verify inventory availability.
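Here is a minimal sketch of such an inventory proxy object in Java. All of the names are hypothetical, and the InventorySystemStub interface stands in for whatever RPC or distributed-object stub the legacy inventory system actually exposes.

```java
// Hypothetical stand-ins for the generated remote stub and its records.
interface InventorySystemStub {
    ItemRecord lookupItem(String itemNumber);
    void reserveInventory(String itemNumber, int quantity);
}

class ItemRecord {
    String description;
    int quantityOnHand;
}

class OrderItem {
    final String itemNumber;
    final String description;
    final int quantity;
    OrderItem(String itemNumber, String description, int quantity) {
        this.itemNumber = itemNumber;
        this.description = description;
        this.quantity = quantity;
    }
}

class InsufficientInventoryException extends Exception {
    InsufficientInventoryException(String itemNumber) {
        super("Insufficient inventory for item " + itemNumber);
    }
}

// The proxy object itself: a persistent business object whose methods are
// backed by remote calls into the legacy inventory system.
public class Inventory {
    private final InventorySystemStub remote;

    public Inventory(InventorySystemStub remote) {
        this.remote = remote;   // linkage to the remote system set up once, in the constructor
    }

    // Called by the service interface when an item is added to an order.
    public OrderItem getInventory(String itemNumber, int quantity)
            throws InsufficientInventoryException {
        ItemRecord record = remote.lookupItem(itemNumber);        // remote lookup call
        if (record.quantityOnHand < quantity) {
            throw new InsufficientInventoryException(itemNumber); // let the customer adjust the order
        }
        remote.reserveInventory(itemNumber, quantity);             // remote reserve call
        return new OrderItem(itemNumber, record.description, quantity);
    }
}
```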
How to access remote software There are any number of ways to transfer execution from one machine to another. The most common, remote procedure calls and distributed objects, were described in Chapter 2, but will be briefly recapped here. In addition to middleware solutions, there are other more complex solutions, such as screen scraping and socket monitors that can be implemented to integrate different software platforms. There are also a number of proprietary options that work well for access to mainframe and legacy systems but, because these options address only a single platform, it would be impossible to cover them in any detail in this section. Remote procedure calls A common solution available on almost any mainframe platform is the
remote procedure call. The Distributed Computing Environment (DCE) described in Chapter 2 is a stable standard that has gained widespread acceptance throughout the industry. Any subroutine written in almost any programming language can be defined through IDL and then exposed to the network. In addition to DCE, some mainframe vendors also provide proprietary remote procedure architectures.
Distributed objects

Most mainframe vendors also support the OMG's CORBA standard for access to distributed objects. CORBA, also described in Chapter 2, is an industry standard that defines protocols and services that enable objects to be accessed over a variety of programming and computer platforms. Microsoft's DCOM standard for distributed objects is also gaining some support on mainframe platforms and can be used as a common distributed object architecture.

CORBA also has facilities that allow legacy code modules to be packaged within an interface so they appear as distributed objects. Often a package of related procedures can be linked together in this manner to transform a legacy system into something that resembles a distributed object. This new legacy object now acts as if it were any other distributed object. Since distributed objects are the foundation of most application server architectures, a distributed object standard will greatly simplify application integration. Depending on the design approach used in the external system, accessing remote objects may be no different than using the application server's business objects. This is the ideal option.
Screen scraping, sockets, and other custom solutions When remote procedure or distributed object support is not available, you must investigate other alternatives. At this point, reevaluate the integration requirements. Database access or replication will most often be a better choice, since custom remote access strategies will involve complex, difficult technical skills, which translates into a much higher development cost and takes time away from more productive activities. If remote execution is still an absolute necessity, there are several alternative "hacks" that may solve these difficult problems.
One alternative is to use approaches such as "screen scraping" (Pageno and Komides 1998). This method uses terminal emulation software (such as that provided by IBM 3270 emulation packages) to replace human input from a computer terminal with a communication stream from another software program. Coding can be difficult, requiring knowledge of terminal emulation and an intimate knowledge of how the user interface program will react to every different input possibility.

Alternatively, the same results can be obtained by replacing the mainframe terminal handling logic with a new interface that accepts input as parameters. Many mainframe user interface programs rely on input streams from products like IBM's CICS or HP's VIEW communication utilities. These load a COBOL record structure with fields entered from the computer terminal. Of course, there are now two separate sets of code to maintain, the original user interface and one modified for remote procedure calls, but since most of these COBOL programs have matured over many years, maintenance should be fairly limited (if not, try something else).

Another solution is to program at the network level, using socket APIs that provide direct communication between the application server and the mainframe (see The Legacy Continues—Using the HP 3000 with HP-UX and Windows NT by Yawn, Stachnik, and Sellars for an interesting approach to client/server implementation). The software can be structured so the mainframe acts as a remote server monitoring the socket and responding to requests sent from the application server. Each request, submitted through a programmer-defined protocol, calls a subroutine on the mainframe. Remember that the programmer is responsible for marshaling data formats and interfacing between the multiple programming languages.

Each of these solutions will incur complex technical programming tasks that take resources away from application server development or other projects. These solutions also involve a much higher level of technical risk and future maintenance support. Make sure that the application requirements are absolutely essential before implementing these integration solutions. There usually is a much simpler design alternative available.
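As an illustration of the socket approach, here is a minimal Java sketch that assumes a simple, programmer-defined, line-oriented protocol (one text line per request and reply). The host, port, and record layout are hypothetical; the matching server program on the mainframe would parse each request and call the appropriate subroutine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class MainframeSocketClient {
    private final String host;
    private final int port;

    public MainframeSocketClient(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Send one request and wait for the reply; the caller is responsible for
    // formatting fields into the agreed-upon record layout.
    public String call(String request) throws IOException {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println(request);          // e.g. "RESERVE|A1001|5"
            return in.readLine();          // e.g. "OK" or an error code
        }
    }
}
```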
Input and Output Streams Not all application integration tasks require the complexities of remote procedure calls. In fact, remote access has many pitfalls and problems.
As mentioned above, both applications and the network must all be running before remote procedure calls can be processed. This can be a problem if the two computers are located at opposite ends of the country or if one of the systems has limited capacity for additional work. An alternative integration approach is to route data streams between the two applications. These can be done continuously using message-oriented middleware or by periodic replication or file transfers. No matter how the data is transmitted, the purpose of a data stream is to send information from one application to another. This stream can contain almost anything. It can be a list of orders sent from an order entry system to an external billing system, or it can be a mailing list sent from one company to another. Data is most often transmitted in only one direction, but, as with credit card verification, it can also be bidirectional. Transmission can be over modem, message broker, email, or even sneaker net, carried on floppy from one workstation to another. Handling a data stream inside the application server is no different than any other persistence process. To create an output stream, business objects are sent to a persistent object which extracts the necessary information then adds it to the output stream. Input streams are handled in a similar manner, creating business objects derived from the data then generating events to trigger processing of the data.
Message-oriented middleware Message-Oriented middleware (MOM), including message queues and message brokers, is a technology that allows asynchronous, one-way message passing (Nance 1998). These messages can be either events, triggering remote program execution, or data messages, sharing information between two programs. Messages can be persistent, held on disk to prevent data loss in case of system crashes, and can have transaction boundaries, allowing messages to be rolled back when errors occur. This technology is an ideal solution for application integration in environments where computers and networks are not always connected. Each message has an origin and a destination, using network addresses and naming conventions set up by the message administrator. When a program sends a message, it is sent to a central message server where it is held until the destination machine is available to receive messages.
Within the application server, message middleware can be used either to route data streams or to trigger events. Data streams can be handled within the persistence layer, acting as if it were loading or storing local data. Events can be handled from either the service interface or the persistence layer, depending on the type of event. When an event is sent from another application, the service interface can often handle it in the same manner as a button-pressed event would be sent from a user interface program. It does not matter whether a user presses a "shipment received" button or whether the receiving system sends a "shipment received" event; both are handled by the application server using the same service interface process.

To see how message brokers can be used to link applications, consider a large, national organization with customer service centers taking orders by phone in Atlanta and Seattle (see Figure 7-1). Their shipping facility is in Memphis and their corporate offices are in Chicago. When an order is taken, it is entered into the local customer service system in either Atlanta or Seattle. Each order is then sent to Memphis for fulfillment. Once the order arrives in Memphis, the local inventory system is updated and the product is shipped. Messages are sent back to the customer service center to indicate the order was shipped, and to the Chicago office to trigger a bill and to update inventory levels within the corporate general ledger system. Each message is queued into a local message server; then, when the network connection is available, it is routed to the message server at the remote site. The remote message servers forward the message to the appropriate application when the application signals that it is ready to receive messages.

For orders taken in Atlanta and Seattle, these messages will be very data-intensive, carrying all of the information needed to fulfill the order. Other messages, such as the "order filled" message, will represent an event with little data other than an invoice number. To minimize wide-area networking costs, the customer service centers may want to use dial-up connections to transmit orders each night. The message servers will hold the orders throughout the day; then, in the evening when phone costs are lower, the customer service center message servers will call the Memphis message server and send the orders queued up during the day. At the same time, the "order filled" messages can be sent back to the customer service centers.

Message-oriented middleware is still a complex technology, but it does provide a more reliable approach to application integration than
remote procedure calls.

[Figure 7-1. Electronic commerce example — Atlanta: Order Entry; Memphis: Shipping, Inventory; Chicago: Corporate Reporting, General Ledger]

For applications that must communicate across wide-area networks that require fail-safe communication, this may be the most appropriate technology. Before committing to the expense, though, see if other, simpler data transmission options are available.
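As an illustration, here is a minimal sketch of queuing an order message through the JMS API (javax.jms), which many message-queue products support. The JNDI names and the queue name are hypothetical and depend on how the message administrator configures the message servers.

```java
import javax.jms.Queue;
import javax.jms.QueueConnection;
import javax.jms.QueueConnectionFactory;
import javax.jms.QueueSender;
import javax.jms.QueueSession;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class OrderMessageSender {
    public void sendOrder(String orderRecord) throws Exception {
        InitialContext jndi = new InitialContext();
        QueueConnectionFactory factory =
                (QueueConnectionFactory) jndi.lookup("QueueConnectionFactory");
        Queue queue = (Queue) jndi.lookup("queue/MemphisOrders");   // hypothetical name

        QueueConnection connection = factory.createQueueConnection();
        try {
            QueueSession session =
                    connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            QueueSender sender = session.createSender(queue);
            TextMessage message = session.createTextMessage(orderRecord);
            sender.send(message);   // the local message server holds it until Memphis connects
        } finally {
            connection.close();
        }
    }
}
```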
Advanced sneaker net In the early days of PCs when networks were still beyond the reach of most organizations, much of the data sent between PCs was done by carrying floppy disks from one desk to another. This "sneaker net" technology is still a viable option for data transmission. Someone once suggested that the cheapest bandwidth is a tractor-trailer filled to capacity with 4mm DAT tapes: it may not be fast, but it sure has a large data capacity. For periodic, high-volume data transmissions, a tape cartridge sent by overnight mail or an email attachment may be an effective solution.
For programming simplicity and interoperability, there is no communication protocol simpler than a flat ASCII data file. It can be sent by diskette or tape cartridge, sent across the Internet using email or FTP, sent across networks, or sent by modem using a variety of communication protocols. It can be read by any programming language and can be directly imported into databases. It serves the same functions as a message queue, but with far less software overhead and expense. For any process that is not time-sensitive, ASCII files are probably the simplest and most effective application integration tool.
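A minimal sketch of writing such a flat ASCII output stream in Java; the pipe-delimited record layout shown here is hypothetical.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

public class OrderExport {
    // Each element of `orders` is already formatted as "orderNo|itemNo|qty|amount".
    public void export(List<String> orders, String fileName) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileWriter(fileName))) {
            for (String record : orders) {
                out.println(record);   // plain text: readable by any language or database loader
            }
        }
    }
}
```

The resulting file can then be moved by whatever carrier is cheapest and timely enough: FTP, email attachment, tape cartridge, or the nightly dial-up connection.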
Accessing Application Databases A convenient, yet somewhat risky approach to application integration is to access the external system's database tables directly. This is an effective approach for lookups and to read external data, but should be approached cautiously when it is necessary to write data into the external database. Writing or updating data will most likely breach database security and may cause problems with the integrity of the external data. There are times when remote database access can be an effective means of communication. It is an effective approach for validating identifiers such as customer codes or product items to maintain consistency between systems. It can also be used as a message repository without adding the complexities of message middleware. Database replication is also a valuable technique that can be used to synchronize and distribute databases located in different geographic areas.
Direct database access

Direct external database access is usually safe as long as it is limited to reading, but not modifying, external databases. Most applications use pull-down boxes that must be populated with standard codes, or use name and address lookups that validate customer or inventory identifiers. These external reads should not affect database integrity and will ensure that the application server data remains consistent throughout the enterprise.

Take care when writing data into an external system. In most cases, you should pass remote procedure calls or messages instead, allowing the other system to validate the data before placing it in the database. In
rare cases when data must be written directly to application tables, you must build the same validation rules into the local application server's persistence layer to enforce data integrity. One place where remote data access does work effectively is when a table is used as a message repository. Instead of setting up message servers or calling remote procedures, it is often simpler and more effective to write the message data directly into a remote database. These messages can then be processed when convenient using the existing software to validate the integrity of the data.
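Here is a minimal sketch of writing into such a message repository table through JDBC; the table and column names are hypothetical, and the receiving application would read and validate these rows on its own schedule.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ShipmentMessageWriter {
    // Write one "shipment received" message into the remote application's table.
    public void write(Connection remoteDb, String orderNumber, int quantity)
            throws SQLException {
        String sql = "INSERT INTO inbound_messages (msg_type, order_no, qty, status) "
                   + "VALUES ('SHIPMENT_RECEIVED', ?, ?, 'NEW')";
        try (PreparedStatement stmt = remoteDb.prepareStatement(sql)) {
            stmt.setString(1, orderNumber);
            stmt.setInt(2, quantity);
            stmt.executeUpdate();
        }
    }
}
```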
Replication

Most database vendors provide replication facilities that allow changes written to a central database to be replicated or synchronized into mirrored databases on other servers. This is an effective tool for providing database access across a wide geographic area without the expense of maintaining continuous network connections. Changes are logged throughout the day and periodically a replication process is run to synchronize the data between the two databases.

An example of this would be in the customer service sites described above. Each maintains its own order entry database with customer service representatives entering orders throughout the day. If there is a necessity to have all orders accessible in both offices, one of the database servers could connect and replicate changes each night, synchronizing the data between the two databases. In the morning, the databases at each customer service site would contain exactly the same data.

As with direct database access, use caution to ensure that the same verification logic is used in both locations, or the data may become corrupted. Also, you will need additional logic to assign unique identifiers when new entries are added at each site to prevent data collisions. If the Seattle customer service site adds invoice number 1005 and the Atlanta site also adds an invoice 1005, one of the two invoices will be lost. The application logic must know that the application is running in Seattle and that an invoice in Seattle is assigned a number with a pattern different from those created in Atlanta.

Another use for replication is for lookup tables. Most organizations set up standard codes that are used across the enterprise. These codes seldom
change. These may be geographic region codes, general ledger account codes, or order status codes. They are used by many different applications and only change when a new region is entered or the accounting system is replaced. A similar, yet more volatile, set of information consists of the customer, vendor and product codes. These change more often, but still have relatively little maintenance. None of these tables change often enough to justify paying for a continuous network connection across the country just to validate the codes. These tables are excellent candidates for replication. You can store a copy of the tables on the database server, then periodically run a replication process to synchronize the data. This ensures that the data is available locally, yet still stays relatively timely using daily or weekly replication cycles.

Replication does have its own set of pitfalls. It is not an effective solution when high volumes of changes are made, causing large periodic transfers that can take many hours to accomplish. Also, most replication processes have a hard time synchronizing multiple changes, especially when changes are made to the same table entries at different sites. Resolving these replication conflicts often requires manual intervention that can make this process very difficult. Replication works best when there are relatively few changes and ownership can be established between sites so that replication conflicts do not occur. Carefully examine how and where changes occur and determine the volatility before adopting replication. It can be a powerful application integration tool, but it has its place.
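One simple way to prevent the identifier collisions described above (the Seattle and Atlanta invoice 1005 problem) is to build the site into the identifier itself. A minimal sketch, with hypothetical site codes:

```java
// Site-aware identifier assignment to avoid replication collisions.
// The site code and numbering pattern are hypothetical; the point is that
// Seattle and Atlanta can never generate the same invoice number.
public class InvoiceNumberGenerator {
    private final String siteCode;   // e.g. "SEA" or "ATL", set at each site
    private long nextSequence;

    public InvoiceNumberGenerator(String siteCode, long startingSequence) {
        this.siteCode = siteCode;
        this.nextSequence = startingSequence;
    }

    public synchronized String nextInvoiceNumber() {
        return siteCode + "-" + (nextSequence++);   // "SEA-1005" vs. "ATL-1005"
    }
}
```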
Synchronizing Transactions Finally, no matter what technology or middleware architecture is selected for integrating external applications, you must put mechanisms in place to ensure the integrity of the data across all applications. For some simple application integration tasks this may not be a problem, but as the number of external applications grows or the complexity of the interfaces increase, there will come a time when transaction management will become an issue. Many of the middleware tools described in this chapter have built-in transaction capabilities. Even many of the legacy systems such as CICS provide transaction monitor capabilities. If this is not sufficient, the application server can implement its own transaction capabilities, or a transaction monitor or server can be used to manage these tasks.
Fun with Punch Cards: What to Do with Legacy Software When I began my career in the early 1970s, both my undergraduate computer science program and my first programming job used punch cards. All program code and data was submitted to the computer on paper cards that contained up to 80 columns of data per card. I'm still amazed that we got any work done, considering the deck of cards was submitted to the computer center. Several hours later, the cards and a stack of paper would be returned listing the results of the computer run. Any errors in the cards would cause the entire run to be rejected. The errors had to be corrected and the process would be repeated again. A couple years later at another company, we used a minicomputer and remote job entry (RJE) software connected to an IBM mainframe. Even though we now worked with online terminals and had the ability to store data in primitive databases, the data still had to be structured in the same 80-column record formats so the RJE software could treat it as if it were a punched card. In the early 1970s, we were making the shift from punch card-based software development to online software. Today we are making another architectural shift from mainframe and client/server-based development into object-oriented, distributed enterprise computing. As the punch cards went the way of the dinosaur, we slowly moved away from the 80-column record structures. As we move closer towards distributed enterprise computing, integration will become easier and new standards will emerge to allow tighter interoperability. Until then, every application integration task will have to be approached as a different puzzle, requiring different design approaches, programming models, and integration tools. Determining the integration approach requires careful thought and evaluation of design requirements. A project can quickly bog down when the wrong integration choices or tools are chosen. Each choice has a certain amount of technical risk that must be traded against the tightness of the integration. Table 7-1 gives a summary of these choices and gives some general guidelines for choosing an appropriate integration strategy. Balancing these risks and benefits is the key to successful application integration.
Table 7-1. Selecting an integration strategy

                         Application       Network               Administration    Support    Cost       Risk
                         integration       connection
RPCs:
  DCE                    High              Concurrent            Moderate          Moderate   Moderate   Moderate
  Custom                 High              Concurrent            High              High       High       High
Distributed Objects:
  CORBA                  High              Concurrent            Moderate          Moderate   Moderate   Moderate
  DCOM                   High              Concurrent            Moderate          Moderate   Moderate   Moderate
  Custom                 High              Concurrent            High              High       High       High
Messaging                Moderate to High  Intermittent          Moderate to High  Moderate   Moderate   Moderate to High
Database Access          Low to Moderate   Intermittent          Low               Low        Low        Moderate to High
File Sharing             Low               Intermittent or None  Moderate          Low        Low        Low
Summary Application integration has always been a difficult part of any software design and still is difficult in the application server environment. Distributed processing makes integration a little easier, but choosing the best approach is often difficult. Here are some guidelines to follow when considering external application integration: • Begin by taking inventory of the existing applications. Determine what functions are available and how they can be accessed. • Choose the lowest level of integration that will solve the problem. • Use remote procedures or objects only when direct online access is needed. • Use messaging when system access is intermittent and when online access is not needed.
• Consider file sharing for simple data transfers. • Limit remote database access to reading existing data or as a repository for message passing. • Limit database replication to cases where few changes are made or where ownership of the data can be established. • Stay away from custom solutions unless there is absolutely no other way to integrate the systems.
References Pageno, Dennis, and George Komides. "Integrating Internet Applications with Legacy Systems." Component Strategies, September 1998: 32. Nance, Barry. "MOM Implementation Issues." Network Computing, July 15, 1998. Available from http://www.techweb.com/se/directlink.cgi?NWC19980715S0025
Further Reading Yawn, Mike, George Stachnik, and Perry Sellars. The Legacy Continues— Using the HP 3000 with HP-UX and Windows NT. Upper Saddle River, New Jersey: Prentice Hall, 1997.
Part 3
Programming Part 3 describes tools and processes that can be used to transform the user requirements into working program code. These chapters examine how to implement the business objects and place them into a framework that services the user interface programs.
Chapter 8
Implementing an Application Server Framework

Up to now, this book has looked at application servers as abstract concepts—first from an architectural view, then from the designer's vantage point. It's almost time to roll up our sleeves and start the real work: translating the abstractions into program code. But before we can start programming, we need to establish a general framework that can hold the service interfaces, business objects, and persistent objects that implement the server. This chapter bridges the discussion between design and programming, describing how to establish this framework.

In addition to the program framework, we must also create an organizational framework that manages and structures the development process. This includes communication channels, programming tools, testing strategies and other administrative details. Although the emphasis of this book is on the technical side of application server development, these topics will also be discussed in this chapter.

This chapter will examine the following topics:

• The application server framework
• Additional application server requirements
• Development strategies
The Application Server Framework In the previous chapters, we examined each of the different application server layers as separate entities, each having different responsibilities and requirements. This is the advantage of using a layered architecture: each layer can be examined on its own, viewed independently. Business object design can focus on the needs of the application. Service interface design can implement services for the user interface. The persistence layer design can focus on mapping business objects to their representation inside databases or persistent storage. By focusing on a single layer at a time, it is easier to manage the complex requirements of the application server design. But before the design can be completed, you need an integration phase, drawing the layers back together. You must design a framework that enables the service interface to locate the business objects and determine when to load and store objects from databases to memory and back. Although this is still design work, it is detailed, low-level design that is highly dependent on architecture, middleware, and programming language choices. Each has a strong impact on how the framework is created. In some cases (like Microsoft's Transaction Server), the framework is provided as part of the middleware so there is little framework design. In other cases, the complexity of the operating environment and the limitations of the programming language dictate a unique, custom inhouse solution.
Initializing the framework

To start the application server, there must be one executable program that can be launched from the command line or program manager. This program is simply called the application server, since it contains links to every object specified in the design. It has the responsibility of setting up the framework in memory. This includes loading the service interfaces and registering them with the middleware naming service so that the user interfaces can begin to request services.

Figure 8-1 shows a block diagram of the basic application server program. When the program begins, it first creates an instance of one or more persistence objects. Next, one or more service interfaces are started, each receiving a reference to the appropriate persistence objects. Each service interface object is then registered with the middleware, making it available to any authorized user interface attached to the network. Once the service interfaces are registered, they are ready to begin processing service requests.

[Figure 8-1. The basic application server program — the application server executable, its service interfaces, business objects, and persistence layer, connected to the middleware]

Examining this in more detail, the application server program must perform the following steps:

1. The executable program is launched from a command line or operating system interface.
2. Persistent objects are loaded into memory and references to these objects are stored in the executable program. Once the persistence layer is started, it can launch background threads that preload frequently used business objects, such as lookup tables and business rule objects.
3. The objects that implement the service interfaces are loaded into memory, each receiving references to the persistent objects.
4. Each service interface object is registered with the middleware naming service. This way, the user interface programs can gain access through the middleware to these services.
5. The executable program now waits for service requests from the user interface programs.

Note that this sequence assumes that the middleware naming service or object monitor is already running and accessible by the application server.
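The following sketch shows what such a startup program might look like in Java. Every class name here is hypothetical, and the Middleware calls stand in for whatever registration API the chosen naming service actually provides.

```java
public class ApplicationServer {
    public static void main(String[] args) throws Exception {
        // Step 2: create the persistence layer and preload common objects.
        PersistenceLayer persistence = new PersistenceLayer();
        persistence.startBackgroundPreload();     // lookup tables, business rule objects

        // Step 3: create each service interface, handing it the persistence layer.
        OrderEntryService orderEntry = new OrderEntryService(persistence);
        CustomerService customers = new CustomerService(persistence);

        // Step 4: register the service interfaces with the middleware naming
        // service so user interface programs can find them.
        Middleware.register("OrderEntryService", orderEntry);
        Middleware.register("CustomerService", customers);

        // Step 5: wait for service requests (the middleware dispatches them).
        Middleware.waitForRequests();
    }
}
```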
Processing service requests

Once the application server is running, it can begin to process services for the user interfaces. Figure 8-2 illustrates the sequence of events that occur when a service request is processed. When a request is received, the service interface first requests business objects from the persistence layer. The persistence layer either locates the objects in memory or loads them from persistent storage. Once all of the objects are available, the service interface calls the appropriate business object methods, then returns the results requested by the user interface.

[Figure 8-2. Processing a service request — the user interface, middleware, service interface, business objects, persistence layer, and database]

For each service request, the following steps are performed:

1. At startup, the user interface uses a middleware API to obtain a reference to the service interface.
2. Using this reference, the user interface can request services from the service interface.
3. The service interface sends a request to the persistence layer to obtain references to the business objects that will be needed to process the request.
4. The persistence layer first tries to locate the business objects in memory. If they already exist, the persistence layer returns a reference to the existing object.
5. If the object does not reside in memory, the persistence layer retrieves the object from persistent storage, loading the object's attributes from the database, then it creates the object in memory. Once created, it returns the reference to the new object back to the service interface.
6. The service interface calls business object methods to process the request.
7. Once processing completes, the service interface informs the persistence layer that it is done using the business objects.
8. Changes made to the business objects are stored back to the database.
9. If the business objects are not in use by other processes, the persistence layer removes the business objects from memory.
10. The service interface returns the results of the request back to the user interface program.

Although most of these tasks are performed by the service interface, business object, or persistence layers, it is the responsibility of the framework to make sure that the layers are in memory and can communicate with each other. This is why the application server must pass references for the persistence layer to the service interfaces. It is also why the persistence layer passes references to the business objects when an object request is made. Without these communication paths, the service interface would not be able to call the necessary methods.
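A minimal sketch of one service interface method following these steps; the PersistenceLayer and Order interfaces are hypothetical stand-ins for the classes an actual design would define.

```java
// Hypothetical minimal interfaces for the sketch.
interface PersistenceLayer {
    Object request(String className, String key);   // steps 3-5: locate or load the object
    void release(Object businessObject);            // steps 7-9: store changes, unload if unused
}

interface Order {
    void addItem(String itemNumber, int quantity);
    double total();
}

public class OrderEntryService {
    private final PersistenceLayer persistence;   // reference passed in at startup

    public OrderEntryService(PersistenceLayer persistence) {
        this.persistence = persistence;
    }

    // Step 2: called by the user interface through the middleware.
    public double addItemToOrder(String orderNumber, String itemNumber, int qty) {
        Order order = (Order) persistence.request("Order", orderNumber);
        try {
            order.addItem(itemNumber, qty);   // step 6: business object does the real work
            return order.total();             // step 10: return only the result the UI needs
        } finally {
            persistence.release(order);       // steps 7-9
        }
    }
}
```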
Commercial frameworks

Although the following chapters will present detailed explanations of how to programmatically implement the application server framework, it usually makes more sense to buy part or all of it. Products like the Microsoft Transaction Server, BEA's M3, or the Inprise Application Server implement the basic framework, allowing the programmers to concentrate on the application logic, not the infrastructure. These products also provide middleware and other application development tools, offering a single-vendor solution.

The main advantage of the commercial frameworks is that the application objects are simply placed into the framework, usually registered using a GUI-based administration tool. Once registered, objects can be called by user interface programs to request services (just like service interface objects). These objects can call any other objects (business objects) to perform the business logic needed to process the request. Since the framework is responsible for object storage and life cycles, the programmer no longer needs to worry about how the objects are loaded and whether they are in memory. The persistent objects now only have to be concerned with the interface to the corporate database.
Choosing a framework strategy

In most cases, the middleware and programming languages will dictate the type of framework necessary and how much will be built or bought. Most commercial application servers provide a robust application framework, but also require that objects conform to fairly rigid component standards. Other middleware products, such as CORBA, require that the development team create its own framework. This allows more latitude and flexibility, but also requires more work on the part of the developers.
Implementing an Application Server Framework
The choice depends on the size and scope of the project and the amount of flexibility needed. Simple projects can get by with a simple, home-grown framework. Larger projects will probably need the scalability and administration capabilities of a commercial framework. Enterprise-scale projects will need a hybrid, using commercial frameworks tied together with custom integration logic. The choice will depend on the factors described in the next section.
Additional Framework Requirements

Developing the framework isn't difficult, but it does require a number of design choices and trade-offs. When evaluating commercial application server frameworks, or when designing one to support the application server project, the following requirements should be considered. This framework will be the foundation for the application server that, once implemented, will be difficult to change. These requirements include:

• Scalability
• Concurrency
• Security
• Fault tolerance
Scalability One of the main reasons for choosing a multi-tiered client/server framework is scalability. As more users and connections are added to the application, the server will run out of resources. With load balancing, another server can be rolled in and attached to the network and the objects will begin to be migrated over to the new server, freeing up resources on the other servers. Although load balancing is built into many middleware products, you need to partition the interfaces and objects ahead of time to provide efficient load balancing once migration occurs. Related objects should be deployed within the same server programs, and migration should be controlled in an orderly fashion. Future expansion and growth should also be considered when determining object distribution across servers.
In addition to load balancing, metrics play a large part in providing scalability. Measurements must include the number of objects on each server, the amount of memory in use, how often each object is accessed, the number of objects created and destroyed, and many others. These numbers can be used to tune the application server during development and to monitor system resources after deployment.
Concurrency

In a multi-user, object-intensive system, objects can often be accessed by different users at the same time. Most languages provide tools to manage concurrency and synchronize execution, but an overall strategy (similar to database locking strategies) is needed to ensure efficient throughput. Concurrency and synchronization are important considerations in the application server environment and an entire chapter will be devoted to these topics.
Security Although it seems easier to wait and implement security as a final addon after the rest of the software is complete, you should define a comprehensive security strategy from the very beginning. Security should encompass access control, network security, data integrity and external security. Access control is usually the central issue and should extend from user interface access through service interface, business objects, and data access. Passwords, location, or even biological markers (voice, fingerprints, etc.) can be used to validate a user's identity. Network security can include secure sockets or other encryption schemes when sensitive data is passed over the network. Data integrity is another security issue that can encompass a variety of approaches. You should consider transaction processing strategies to provide rollback of data when exceptions occur and, depending on the complexity of data manipulation, transaction processing middleware may be necessary. You also need to implement database techniques such as referential integrity, and have audit methods available to periodically check the data for inconsistencies. Although it's not a programming issue, you should address physical security to ensure that the computer hardware is protected from theft or vandalism.
It is easy to forget that the development system sitting in a programmer's cubicle may contain sensitive data, and that the software residing on these machines is a valuable corporate resource. Make sure all developer machines are backed up often and that sensitive data is not left out in the open.
Fault tolerance
A final consideration is a comprehensive strategy for error handling. As each new server is added, the number of communication paths grows rapidly. Both communication and software errors will occur, including power failures, system crashes, lost network connections, and many other unexpected problems. Gracefully recovering from every possible error can be difficult, and communicating errors back to the user interface requires a consistent error-handling protocol.
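One common way to get that consistency is to wrap low-level failures in a single application-level exception that carries an error code the client tier understands. The class below is a hypothetical sketch of such a protocol, not a prescribed design.

// Hypothetical application-level exception: server code catches low-level
// failures (SQL errors, lost connections, etc.) and rethrows them in this
// one form so every client handles errors the same way.
public class ServerException extends Exception {
    public static final int DATABASE_ERROR = 1;
    public static final int NETWORK_ERROR  = 2;
    public static final int BUSINESS_RULE  = 3;

    private int errorCode;

    public ServerException(int errorCode, String message) {
        super(message);
        this.errorCode = errorCode;
    }

    public int getErrorCode() {
        return errorCode;
    }
}

Server code would then catch, say, a java.sql.SQLException and rethrow new ServerException(ServerException.DATABASE_ERROR, "order could not be saved"), leaving the user interface with one error format to present.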
Development Strategies
Another framework, perhaps even more important than the program framework, is the organizational structure that will support application server development. Most organizations already have solid software development strategies in place that include organization structures, methodologies, and a comprehensive suite of tools for traditional software development. These strategies have evolved over time and are tailored to the needs and culture of the organization. When considering strategies for multi-tiered software development, you have to examine the current development strategies and revise them to meet these new needs, but radical changes are usually not necessary. You may adopt new methodologies to address object-oriented development and purchase new tools to support middleware or component architectures, but the development process must still match the needs and culture of the organization. At the same time, moving to a new architecture is a good time to reevaluate the development strategy. The rapid rate of business and market change does alter the needs and culture of every organization, and it may be time to examine the software development process. Take time to develop a comprehensive strategy, not just a few quick fixes. In response to the pressure to deliver software quickly, the trade press has offered articles like "Web Time Software Development" (Thomas and Constantine 1998).
The authors do offer a few new techniques, but then suggest changes that either enforce longer hours or set up separate evening or night shifts. The referenced article even suggests allowing Mom to bring the kids and dog into the office so they can visit Daddy while he's chained to his desk. This is no way to run a railroad (or a software project). Do you really want your business to depend on software written by a bunch of social degenerates who last saw the sun shine six months ago? Instead of the short-sighted approach given by the trade journals, it makes sense to take the time to set up a development framework that can be sustained over time. Iterative, incremental development strategies and joint design teams can produce tangible results in short timeframes, yet generate solid software products that will have the flexibility to serve the company for many years. The following issues and standards should be considered when organizing the application server development process:
• communications support
• development environment
• tools
• training
• metrics
Communications support
To have any chance for success, the developers must communicate with the project managers, other developers, JAD team members, and the user community. This communication will not just happen on its own, since software developers are not always known for their social skills. Instead, the exchange of ideas and information must be facilitated through both structured and unstructured channels set in place before the project begins. Structured channels such as formal meetings, interviews, memos, and published documentation disseminate information consistently to all team members and move the project in the right direction. Unstructured channels, including informal conversations, phone calls, and email, allow questions and problems to be resolved quickly.
Many of these communication channels can be supported or facilitated with technology tools such as groupware, intranets, teleconferencing and email. Most groupware products, such as Lotus Notes, include tools that support meeting agendas, remote conferencing, document archives, message boards, and group scheduling as well as email, workflow, and other communication tools (Bock and Applegate 1995). With the growth of intranet technology, many of these same tools are now available in Web browser-based packages. Using these tools, team members can participate in discussions and collaborate without having to be at the same location or in the same time zone. Time and distance are no longer limitations, since each person can contribute their ideas whenever or wherever they are. Communication is more than just discussing issues and solving problems. It is about getting to know the other person, his experiences, his interests, the way he thinks. By cultivating acquaintances and friendships with people throughout the organization, you make it easier for the team to work together when the need arises. Spend a few minutes talking to the people in the break room. Bring a sack lunch and eat in the lunch room; get to know people throughout the business. Then, when the time comes to chase down some piece of business knowledge, it will be much easier to find it.
Development environment
A separate development environment (servers, computers, networks, etc.) has long been an accepted practice in corporate software development. It allows programmers to test and experiment without the worry of corrupting data or stealing resources from the business users. Application server development must have the same isolation, but it may run into problems when integrating legacy software, since it is difficult to duplicate legacy hardware in a cost-effective manner; plan for these issues. Fortunately, most of the requirements for a development network can be met with low-cost hardware. With the scalability inherent in application server design, two or three Pentium 100s can easily support the server requirements for 5 to 10 developers. Add some spare Ethernet cards and a cheap hub, and the development environment is up and running. Usually, the highest cost will be the operating systems, middleware, and database software.
Tools
Approach tool selection with care, taking into consideration the preferences of the people who will be using the tools. Most software development tools are complex pieces of software, requiring time and effort before they can be used effectively. That investment breeds an emotional attachment that develops over time. Taking away a programmer's favorite IDE or language will probably be almost as traumatic as taking away your child's favorite teddy bear. Approach with caution. Nevertheless, a project must standardize on a common set of tools. Reaching this consensus may be difficult, but it is necessary. In many cases, the company will have already standardized on certain tools, and these products will limit the selection of other tools. Standardizing on a particular CORBA middleware may restrict the languages and development tools available. Selecting DCOM will probably limit the tools to Microsoft's Visual Studio IDE and a limited set of programming languages. Programming tools are often expensive, but investing in the right tools and providing the proper training will lower development costs in the long run. Leverage the development team's existing knowledge and use their experience to select tools that will do the job.

Project management
Depending on the complexity of the development effort, project management software can either speed up the process or get in the way. For complex projects, this software can coordinate resources, such as people and development machines, and locate critical tasks that can delay other parts of the project. In the hands of an effective project manager, these tools are a great resource.

CASE and modeling tools
CASE (Computer Aided Software Engineering) tools have often promised much more than they deliver but, just like project management tools, do have their place. Both data and object modeling can be well supported with the proper design tools. Each provides graphic representations of complex ideas that can be understood both by the software developers and the end users. Just as a blueprint is used to represent an
office floorplan, a data model or class diagram can go a long way toward representing the final product. The typical business person may not understand all of the symbols on a floorplan, but he can still visualize the final layout of the new building. In the same way, an entity-relationship model may not be completely understood by the end user, but it does offer a tool for communicating the overall design of the database. Most CASE tools will generate at least part of the source code from the graphical model. This eliminates much of the tedious detailed coding and can save several days of programming time. Most tools also offer some form of round-trip engineering to update the model as changes are made to the code, keeping the two synchronized automatically. CASE tools are some of the most expensive software development products, most costing several thousand dollars. Evaluate each tool carefully to make sure that it meets both current and future project needs. Make sure it is compatible with the programming tools already in use. Also, evaluate the learning curve and how the code generated by the tool fits with the current development methodologies and strategies. A CASE tool should do more than just generate pretty pictures; it must integrate directly into the project to be effective. For a lower-cost alternative, look at some of the graphics products that are emerging to support software design. These are not as tightly integrated with software development as CASE tools, but they do provide many of the same facilities for data and object modeling. Graphics programs like Visio can be tailored to support software modeling, and many of the older traditional flowcharting tools have been extended to perform data and object modeling tasks. These are far less expensive and will communicate the same ideas.

Version control
A necessary support tool for iterative development is version control. As each iteration of the prototype is developed, changes must be tracked and isolated so that poor design choices can be quickly retracted. These tools also coordinate program code among a number of people so that changes are not lost as the code moves between developers. In addition to software changes, version control can also be used to track all of the other documents that are generated as part of the development process. These include design documents, test plans, and user manuals.
Each of these documents also goes through a constant flow of changes, and it is much easier to isolate those changes when version control is in use. Another benefit is that the tool itself provides a trail of change documentation, tracking the progress of the development process.

Programming languages and tools
At the heart of the development process are the programming languages and tools that turn the design concepts into program code. Almost any modern compiler will generate clean, fast, optimized code, so benchmarking these factors can be useful but should not be a primary concern. The main criteria for selecting a language should be the "teddy bear" factor described above, along with suitability to the task. If at all possible, let the programmers keep their teddy bears: the languages and IDEs that they are comfortable with. At the same time, select the level of support that will meet the needs of the project. Most language vendors now provide "enterprise" editions of their language products. In addition to the compiler and IDE (integrated development environment), they provide a wide range of development tools including CASE, version control, testing, and many of the other tools listed in this section. Most also provide limited editions of their middleware products and database servers, along with the additional development aids and debugging utilities needed to support those products. These enterprise editions are not cheap, but they do provide one-stop shopping and integration between all of the tools.

RAD tools
Complementing traditional programming languages, rapid application development (RAD) tools such as Microsoft's Visual Basic, Borland's C++ Builder and JBuilder, and Powersoft's PowerBuilder provide excellent tools for creating both user interfaces and back-end components. Even Microsoft's Visual C++ now provides wizards to create COM objects and ActiveX controls. These tools speed program development by automating many of the programming tasks that used to take up so much of the developer's time. When these products first arrived, many generated bloated, inefficient code, but as they have matured they have become excellent alternatives to traditional third-generation languages.
When using these tools, make sure that they integrate with the version control software, since changes can often be difficult to back out or redo. Programming wizards perform many useful tasks, but many wizards and agents have no way to go back and change selections made up front. Make sure to checkpoint the changes before invoking these one-way wizards.
Debuggers
Debugging distributed objects is still a difficult chore. Many debugging products are just beginning to address this problem, so make sure that any aftermarket debugger is compatible with your distributed computing platform. The debuggers included in the enterprise editions of the languages do support these requirements, but they are tied very closely to the specific language and IDE, so they may not fulfill every need. Download trial versions from several vendors and see which will support the development needs of the project.
Testing tools
Methodologies and tools for testing software are now available for almost every phase of software development (Kit 1998). Test plan software can manage and monitor every aspect of the test plan. Code inspectors and analyzers can move beyond the errors caught by compilers and point out possible problems and deviations from accepted programming practices. Data generators and scripting tools can take much of the tedium out of testing by automating user interface testing, rerunning tests, and comparing results between runs. Debuggers, bounds checkers, and profilers can track code execution and isolate problems. There is a wealth of testing tools that can engineer quality into all phases of software development. Testing and debugging tools are often tied closely to the development environment, so make sure that the tools chosen are compatible with the platforms and languages selected. In an application server environment, testing is complicated by distributed processing, and many tools are just beginning to support this development model. Nevertheless, traditional testing and debugging tools can still cover much of the work, and careful planning makes the task much easier.
Bug tracking and reporting
You need to set up bug tracking and defect management tools to monitor and resolve problems from the time the code is written until long after it has gone into production. As problems occur, the tracking software should provide ample space to document the problem, categorize it by function or module, then track it as it moves through multiple stages of resolution. Problem tracking software can either be purchased or developed in-house. The tracking program's requirements are relatively simple and often need to be tailored to the organization. Testers and end users should be able to enter problems directly from their workstations, monitor their progress, and be notified when the problem is resolved. Those responsible for fixing the problems should also be able to view related problem reports, annotate them with status information, and either indicate that the problems are resolved or hand them off to other developers.
Other support tools
In addition to programming tools, most developers also need standard office applications, groupware, email, and Internet access. Developers have to write reports and documentation, make presentations, use spreadsheets to verify calculations, and perform other routine office tasks. An office suite such as Microsoft Office or Corel's WordPerfect Suite should be included on every workstation. If the company is using groupware or an intranet to facilitate communication, these tools must also be available from the programmer's workstation, along with Internet email. Internet access is also a must, since there is a wealth of knowledge and tools available online, and these should be immediately accessible.
Training
Many of the tools described in the previous section are large, complex, and difficult to learn. After investing thousands of dollars to put together the right set of tools, it is easy to throw away thousands more trying to train the staff to use them effectively. People learn in many different ways, so training that is effective for some may not be effective for others. Finding the best techniques for all members of the team is just as difficult as finding the tools to support the development effort.
Classroom training, although effective, is the most costly solution. With costs often ranging from $1,200 to $1,500 for a three- or four-day course, classroom training can be prohibitively expensive. Vendor-supplied courses are usually the best, but they are higher-priced and the subject matter is usually limited to the vendor's own products. If classroom training is necessary, consider aftermarket training instead of the classes offered directly by the vendor; these classes are priced somewhat lower and cover much of the same information. On-site training is also an option. If the team is large enough, the fixed price of bringing the trainer on-site will sometimes bring the per-person cost down. Video training programs are available at a much more reasonable price, but since there is no interaction with the instructor, the training is one-way.

Consultants brought in to help set up and oversee a project are also a valuable resource for training. As part of their standard fee, they can tailor courses to the tools and methodologies used in the project and can focus the training specifically on the project's needs. Costs will be less than those of classroom training, and time will not be spent on intricacies of a language or tool that do not apply to the developers' needs.

For some, it is easier to sit, experiment, and learn by doing. There is a wealth of aftermarket books on every language, tool, and methodology, with prices ranging from $30 to $70. Coupled with the tools themselves, or with downloaded demos, these can provide cost-effective training if the learner has the motivation and ability to learn in this manner. Often, one or two people can spend a few days and learn enough to either prepare a course for the rest of the team or pass on the information in joint study groups.

Other training resources include trade magazines, academic journals, user groups, and vendor publications. These provide a wide range of information on current development issues that may not be available from other sources. Also consider Internet searches for products, white papers, and other information. Web searches can turn up a wealth of information and lead to technologies and products that are not well known and do not get the trade press coverage that the larger vendors receive.

Finally, as the project gets under way, set up procedures to disseminate tips and tricks throughout the group. This can be a discussion group in Lotus Notes, a document repository on an intranet or file server, or a distribution list within the email system. As people discover techniques or find solutions to difficult problems, encourage them
to write a brief summary and post it to the repository. Then, as others run into similar problems, they can use the experience gained by other team members to solve them.
Metrics
Throughout the development process, you should include measurements and metrics to ensure quality and to manage both project and system resources. Metrics can be used to measure the breadth of the project, assess the quality of the design or program code, monitor defects, and measure system performance. Within the application server environment, the metrics to focus on are the number of objects and the resources they require. A major reason for moving to application server technology is to allow scalability and better application performance. Since these are the primary goals, they must also be the primary measurements. Object counts and memory utilization should be built into the application framework so they can be monitored by the system administrators. These measurements can then be used to balance the application load among the servers. Network throughput and data access times are also important measurements. Keeping a close eye on these numbers will ensure continued application server performance and a more responsive application.
Summary
The application server framework is the foundation where all application objects are placed. When developing this framework, use the following guidelines:
• The framework is highly dependent on the middleware and programming languages used.
• Carefully examine the trade-offs before deciding to build or buy the framework.
• Make sure that the framework will be able to grow with the needs of the organization. It should provide scalability, concurrency, security, and fault tolerance.
• The organizational framework is just as important as the software framework. Develop strategies for the group's communications support, development environment, tools, training, and metrics.
References
Bock, Geoff, and Lynda M. Applegate. Technology for Teams. Cambridge, Massachusetts: Harvard Business School Publishing, 1995.
Kit, Edward. "Passing the Test." Software Development, March 1998: 34-42.
Thomas, Dave, and Larry Constantine. "Web Time Software Development." Software Development, October 1998: 78-80.
Chapter 9
Using Java to Build Business Objects
Creating business objects is a straightforward process in any object-oriented language. Just declare a class, list the attributes and methods, then fill in the business logic. This is basic object-oriented programming. The difficulty lies in linking these objects across a distributed network and getting them to perform useful work. This chapter will use the Java programming language to illustrate how business objects can be created and distributed across a network. The Java programming language, released by Sun Microsystems in the mid-1990s, was originally intended for consumer electronic devices such as personal digital assistants, cable access boxes, television sets, and other devices that required simple user interfaces. The language was designed to be hardware-independent by compiling to an artificial byte-code instruction set that could be easily implemented in an interpreter on almost any microprocessor device. As the Internet and World Wide Web gained momentum, Web page authors wanted to add features like animation and interaction to their Websites. Since Internet browsers had to run on a large variety of computer platforms, the browser manufacturers found that the Java programming language, with its platform-neutral instruction set, was a good fit. The software could be compiled once, placed on a Web server, then run on any Java-enabled browser on a PC, Mac, UNIX, or other machine. Besides being platform-independent, the compiled files were small and moved quickly over the Internet. The Java language is loosely based on C++ but eliminates many of the
most difficult aspects of C++ programming. It is purely object-oriented, with all data types other than primitives (integers, floats, characters, etc.) implemented as classes. It also has the advantage of an extensive integrated class library. There are no pointer types in Java; all memory variables and objects are handled as references. Objects are created with the new operator, but Java provides a built-in garbage collector that automatically deletes objects when they are no longer referenced. This eliminates most of the memory-leak problems associated with C++.
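Returning to the chapter's opening point (declare a class, list its attributes and methods, fill in the business logic), a minimal, hypothetical business object might look like this; the class and its rule are invented for illustration, and distribution concerns come later.

// A minimal, hypothetical business object: attributes, methods, and a
// small piece of business logic.
public class Invoice {
    private String invoiceNumber;
    private double amount;
    private boolean paid;

    public Invoice(String invoiceNumber, double amount) {
        this.invoiceNumber = invoiceNumber;
        this.amount = amount;
        this.paid = false;
    }

    // Business rule: an invoice can only be paid once.
    public void markPaid() {
        if (paid) {
            throw new IllegalStateException("invoice " + invoiceNumber + " is already paid");
        }
        paid = true;
    }

    public double getAmount() { return amount; }
    public boolean isPaid()   { return paid; }
}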
Using Java to Illustrate Programming Principles
This chapter will illustrate how to create and distribute business objects using the Java programming language and its simple object request broker, RMI (Remote Method Invocation). Any number of programming languages and middleware products could have been chosen. C++ and CORBA or Visual Basic and DCOM are also excellent choices, but Java and RMI were selected because they have been ported to a variety of platforms and are readily available, either free from the Internet (at Sun Microsystems's Website; see Further Reading) or from a variety of vendors at reasonable cost. The code is also a bit simpler and more readable than C++, with no pointers, simpler memory management, and integrated garbage collection. RMI is included as part of the Java Software Developer's Kit (Java SDK), so there is no additional cost to obtain the middleware. The code to implement RMI also has less overhead than CORBA or DCOM and can be run on a single PC with little additional hardware or software. The Java programming language is well documented and the development kit is easy to obtain. For those not familiar with the language, check Further Reading at the end of this chapter. Also check the Appendix for an explanation of how to set up the Java and RMI programming environment on one or more Windows machines.
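As a preview of the approach, and assuming only the standard java.rmi API, a remote business object starts with an interface that extends java.rmi.Remote. The interface and method names below are illustrative, not the examples developed later in the chapter.

// Hypothetical remote interface for a business object exposed via RMI.
// Every remote method must declare java.rmi.RemoteException.
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface CustomerManager extends Remote {
    String getCustomerName(String customerId) throws RemoteException;
    void updateCreditLimit(String customerId, double limit) throws RemoteException;
}

The implementing class would typically extend java.rmi.server.UnicastRemoteObject and be registered with the RMI registry (for example, with java.rmi.Naming.rebind) so that clients can obtain a reference to it over the network.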
Overview of the Distributed Java Architecture
Although Java is often thought of as just a programming language, the Java development and runtime environment, the Java Virtual Machine
(JVM), can be considered a separate computing platform. Being device-independent, it sits on top of the host operating system and provides a separate virtual computer platform that responds according to the rules of the Java Virtual Machine, not the host machine language or operating system. At the same time, it does rely on the host operating system for hardware-dependent functions like file processing, screen displays, and network communication. A Java applet will look different running on a PC than it would on an X Windows display, but the underlying program logic will work the same.

A Java program begins with a source file created using a program editor or IDE (integrated development environment) like JBuilder or SuperCede. Each Java source file is compiled by the javac compiler into an artificial machine language called byte-code, stored in a class file. This code was designed to be very compact to keep memory requirements low. The byte-code relies on class libraries that already reside on the target machine along with the Java runtime and browser, so that common operations like network communication and GUI objects do not have to be included when downloading code.

The class files may be run either as standalone programs called applications or as applets embedded in a Web page. Applications are run inside the Java Virtual Machine (invoked by the java command), which executes the byte-code as if it were a separate computer machine language. This process, called interpretive execution, does affect execution speed: each instruction must first be interpreted by the Java runtime, then executed on the host computer. For computationally intensive work where performance is a concern, a byte-code compiler can be used to convert the class files to native machine language.

In addition to applications, Java allows class files to be inserted into Web pages and run within a Java-enabled browser such as Netscape or MS Internet Explorer. These classes, called applets, are also compiled into byte-code form and are then inserted into Web pages using the