MATHEMATICAL FRAMEWORKS FOR COMPONENT SOFTWARE Models for Analysis and Synthesis
Series on Component-Based Software Development - Vol. 2
SERIES ON COMPONENT-BASED SOFTWARE DEVELOPMENT
Vol. 1 Component-Based Software Development: Case Studies (Ed. Kung-Kiu Lau)
Vol. 2 Mathematical Frameworks for Component Software: Models for Analysis and Synthesis (Eds. Zhiming Liu and He Jifeng)
Aims and Scope
Component-based software development (CBD) is concerned with software development by using independently produced components, e.g. commercial off-the-shelf (COTS) software. It is an emerging discipline that promises to take software engineering into a new era. Building on the achievements of object-oriented software construction, CBD aims to deliver software engineering from a "cottage industry" into an "industrial age for Information Technology," whereby software can be assembled in the manner that hardware systems are currently constructed from kits of parts. Although the idea of using components to build complex systems is as old as the software engineering discipline itself, it is only recently that CBD has become the focus of research for a sizeable international community. This series aims to serve this community by publishing all kinds of material on CBD: state-of-the-art reports and surveys, conference proceedings, collected works, research monographs, as well as textbooks. It should therefore provide a key forum for CBD researchers, practitioners and students worldwide.
Call for Contributions
If you have suitable ideas or material for the series, please contact the Editor-in-Chief:
Kung-Kiu Lau
Department of Computer Science
The University of Manchester
Manchester M13 9PL
United Kingdom
[email protected]
Zhiming Liu
United Nations University, Macao, China
He Jifeng
East China Normal University, Shanghai, China
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
Series on Component-Based Software Development — Vol. 2 MATHEMATICAL FRAMEWORKS FOR COMPONENT SOFTWARE Models for Analysis and Synthesis Copyright © 2006 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-270-017-X
Printed by Mainland Press Pte Ltd
Preface
The idea of exploiting and reusing components to build and maintain software systems goes back to "structured programming" in the 1970s. It was a strong argument for the development of object-oriented methods and languages in the 1980s. However, it is today's growing complexity of systems that forces us to turn this idea into practice. So far, there is no agreement on standard technologies for designing and creating components, nor on methods of composing them. Finding appropriate formal approaches for describing components, the architectures for composing them, and the methods for component-based software construction, is correspondingly challenging.
The Theme of the Volume
The range of component technology is both wide and diverse, but some common understanding is emerging through ideas of model-based development. These include the notions of interfaces, contracts, services, connectors and architecture. Key issues in the application of the technology are also becoming clearer. These include: consistent integration of different views of a component, component composition, component coordination, component customization, component system reconfiguration, and component reuse. There are solutions to some of these problems, such as composition, refinement and transformation for platform. However, we still know little about theories that support analysis and synthesis of component-based systems, including adapting components for specific non-functional requirements. This volume focuses on mathematical models that identify the "core" concepts as first-class modelling elements and provide techniques for integrating and relating them. The volume contains eleven chapters by well-established researchers. The chapters are written from different perspectives. However, each chapter gives an explicit definition of components in terms of a set of key aspects and addresses some of the problems of integration and analysis of different views: component specification, component composition, component coordination, refinement and substitution, and techniques for solving these problems. The concepts and techniques are motivated and explained with the help of examples or case studies.
A Summary of the Chapters
The chapters are organised according to the alphabetical order of the authors. The chapters were all reviewed by experts in the community of formal methods of software system development, and sample chapters were reviewed by four referees appointed by the publisher, World Scientific Publishing. We give a brief summary of each chapter here.
Chapter 1. Temporal Specifications of Component Based Systems with Polymorphic Dynamic Reconfiguration, by N. Aguirre and T. Maibaum
This chapter presents a formal characterisation of component based systems with support for polymorphic dynamic reconfiguration. Dynamic reconfiguration is about changes in the system architecture at run time. Polymorphic reconfiguration means that reconfiguration operations may concern different types of components or connections, exploiting an inheritance relationship over components, as in object orientation. On top of a first-order temporal logic, and in the form of a (rather low level) specification language, the necessary machinery is built for specifying components, connectors and amalgamations, together with inheritance and polymorphism.
Chapter 2. Coordinated Composition of Software Components, by F. Arbab
This chapter leaves the realm of object oriented programming, and thus Abstract Data Types (ADT), and develops a simpler model of components and their composition based on the notion of Abstract Behavior Types (ABT). Whereas the ADT model emphasizes abstract operations on data types and hides data structures, the ABT model emphasizes abstract observable behavior and hides operators and data types altogether. Consequently, the ABT model supports a much looser coupling than is possible with the ADT's operational interface, and is inherently amenable to exogenous coordination. Component composition in the ABT model requires only simple operators under which the model is closed: the composition of two ABTs always yields another ABT.
To demonstrate the utility of the ABT model, an exogenous coordination language, called Reo, is described for compositional construction of component connectors based on a calculus of channels.
Chapter 3. On the Semantics of Componentware: A Coalgebraic Perspective, by L. Barbosa, M. Sun, B.K. Aichernig and N. Rodrigues
In this chapter software components are regarded as specifications of state-based modules, encapsulating a number of services through a public interface and providing limited access to an internal state space. Components persist and evolve in time, being able to interact with their environment during their overall computation. We adopt the standpoint of coalgebra theory to address a number of issues in the semantics, calculi and methodologies of componentware, presenting an integrated view of our current research concerns. At the specification level, the duality between algebraic and coalgebraic structures provides a bridge between models of static and dynamic systems. At the programming level such a duality, in a canonical initial-final specialisation, captures the intuitive symmetry between data and behaviour, providing the basis for more uniform and generic approaches to systems' construction.
Chapter 4. A Theory of Requirements Specification and Architecture Design of Multi-Functional Software Systems, by M. Broy
This chapter extends the FOCUS model and theory of distributed concurrent interactive systems to two essentially dual concepts of systems engineering. One addresses the comprehensive functionality of multi-functional systems in terms of services and the other that of architectures formed by networks of components that are described by their interfaces. We show how these notions interact and work together in requirements engineering and systems design.
Chapter 5. Components: From Objects to Mobile Channels, by F.S. de Boer, M.M. Bonsangue and J.V. Guillen-Scholten
This chapter introduces a formal model of components which extends object-orientation with additional structuring and abstraction mechanisms to support a modelling discipline based on interfaces. The component model formalizes the concepts of interfaces, roles, connectors, and ports. Components encapsulate their internal class structure and interact only through a certain kind of objects which are called ports. Ports are instances of classes which are represented by roles. Roles export information about the required and provided operations of these classes by means of interfaces. By means of connectors which wire roles of different components together, ports of one component can dynamically create ports of another component. As an example, it shows how to model mobile channels for the dynamic reconfiguration and exogenous coordination of components.
Chapter 6. Formalizing the Transition from Requirements to Design, by R.G. Dromey
This chapter addresses the problem of how to construct a design out of its requirements. The author shows how a formal representation for individual functional requirements, called behavior trees, makes this possible. Behavior trees of individual functional requirements may be composed, one at a time, to create an integrated design behavior tree. From this problem domain representation it is then possible to transition directly, systematically, and repeatably to a solution domain representation of the component architecture of the system and the behavior designs of the individual components that make up the system; both are emergent properties of the integrated design behavior tree.
Chapter 7. rCOS: a Relational Calculus of Components, by Z. Liu, J. He and X. Li
This chapter defines a model for components, their composition and refinement. A component is specified by its syntactic view at the interface level, its functional view at the requirement level, its internal view at the design level, and the way it is composed with other components. In component-based system development, a component consists of a set of interfaces, provided to or required from the software being developed. In component development, the component is executable code that can be coupled with other components via its interfaces. The developer has to ensure that the specification of a component is satisfied by its design and the design is met by its implementation.
Chapter 8. Characterising Frameworks in First-Order Predicate Logic, by S-M. Ho and K-K. Lau
This chapter provides a formal foundation to the component-based approach Catalysis. In Catalysis, a framework is a reusable artefact that can be adapted and composed into larger systems. The signed contract between components specifies how the required properties of one component are satisfied by the provided properties of another. The authors examine this concept in the context of framework-based development. They consider a simplified view of frameworks and their transformation into first-order logic. Theorem proving may be used to check the consistency of framework specifications. The chapter also identifies ways in which these specifications may be simplified beforehand to reduce the burden of proof.
Chapter 9. Formalization in Component Based Development, by J.P. Holmegaard, J. Knudsen, P. Makowski, A.P. Ravn
This chapter presents a unifying conceptual framework for components, component interfaces, contracts and composition of components by focusing on the collection of properties or qualities that they must share. A specific property, such as signature, functionality, behaviour or timing, is an aspect. Each aspect may be specified in a formal language convenient for its purpose and, in principle, unrelated to languages for other aspects. Each aspect forms its own semantic domain, although a semantic domain may be parameterized by values derived from other aspects.
The proposed conceptual framework is introduced by small examples, using UML as concrete syntax for various aspects, and is illustrated by one larger case study based on an industrial prototype of a complex component based system.
Chapter 10. A Model Driven Approach for Building Business Components, by V. Kulkarni and S. Reddy
This chapter presents a methodology emerging from, and aimed at guiding, the engineering practice in modern business system development. The method uses aspect-orientation and model-driven development techniques for specifying different views of interest of a system as models and transforming them in successive stages of refinement, with specific aspects of interest being imparted at each stage. The chapter discusses how this approach was used to restructure a model driven development environment, resulting in greater reuse and ease of its evolution.
Chapter 11. A Formal Approach to Constructing Well-Behaved Systems using Components, by S. Moschoyiannis, J.K. Filipe and M.W. Shields
This chapter is motivated by the fact that present-day software systems are in increasing need of modification and evolution due to changing requirements. It argues that component-based development constitutes a key methodology for creating large-scale, evolvable systems in a timely fashion as it advocates the (re)use of prefabricated replaceable software components. However, it is often the case that undesirable or unpredictable behaviour emerges when components are combined. This is partly due to lack of behavioural information about the individual components. To deal with this problem, the authors describe a formal model for component specification which can be used to support the analysis and predictability of component composition and to identify undesirable behaviour. In their approach, component behaviour is modelled by so-called behavioural presentations, a powerful model of true-concurrency. Moreover, the framework is compositional and supports the assembly of the final system from arbitrary components. Practical benefits of the framework are discussed.
Acknowledgements
We would like to thank all the authors for their hard work and high-quality contributions to the volume. We would also like to thank the chapter reviewers for their comments and suggestions for improving each chapter, and the volume reviewers for their support. We appreciate the constant support from and the collaboration with Ian Seldrup, Senior Editor of World Scientific Publishing.
Zhiming Liu, International Institute for Software Technology, United Nations University, Macao SAR, China
He Jifeng, Software Engineering Institute, East China Normal University, Shanghai, China
July 2006
Contents
Preface
1. Temporal Specification of Component Based Systems with Polymorphic Dynamic Reconfiguration
N. Aguirre and T. Maibaum
2. Coordinated Composition of Software Components
F. Arbab
3. On the Semantics of Componentware: A Coalgebraic Perspective
L.S. Barbosa, M. Sun, B.K. Aichernig and N. Rodrigues
4. A Theory for Requirements Specification and Architecture Design
M. Broy
5. Components: From Objects to Mobile Channels
F.S. de Boer, M.M. Bonsangue and J.V. Guillen-Scholten
6. Formalizing the Transition from Requirements to Design
R.G. Dromey
7. rCOS: A Relational Calculus of Components
Z. Liu, J. He and X. Li
8. Characterising Object-Based Frameworks in First-Order Predicate Logic
S.-M. Ho and K.-K. Lau
9. Formalization in Component Based Development
J.P. Holmegaard, J. Knudsen, P. Makowski and A.P. Ravn
10. A Model-Driven Approach for Building Business Components
V. Kulkarni and S. Reddy
11. A Formal Approach to Constructing Well-Behaved Systems Using Components
S. Moschoyiannis, J. Küster-Filipe and M.W. Shields
Subject Index
Chapter 1
Temporal Specifications of Component Based Systems with Polymorphic Dynamic Reconfiguration
Nazareno Aguirre† and Tom Maibaum‡
† Departamento de Computación, FCEFQyN, Universidad Nacional de Río Cuarto, Enlace Rutas 8 y 36 Km. 601, Río Cuarto (5800), Córdoba, Argentina
naguirre@dc.exa.unrc.edu.ar
‡ Department of Computing & Software, McMaster University, 1280 Main St. West, Hamilton L8S 4K1, Ontario, Canada
[email protected]
In this chapter, we present a formal characterisation of component based systems with support for polymorphic dynamic reconfiguration. By dynamic reconfiguration we mean, as usual, changes in the system architecture at run time. By polymorphic reconfiguration we mean that reconfiguration operations may concern different types of components or connections, exploiting an inheritance relationship over components, as in object orientation. The formal characterisation of component based systems is based on a first-order temporal logic. The logic is a variant of the Manna-Pnueli logic, expressive enough for straightforward specification of component types, connector types and dynamic amalgamations of components. On top of this logic, and in the form of a (rather low level) specification language, we build the necessary machinery for specifying components, connectors and amalgamations, together with inheritance and polymorphism.
1.1. Introduction
When the complexity of software systems started to increase some decades ago, in part due to more complex or bigger application domains, the need for techniques that would allow developers to modularise or divide systems and the problems they solve into manageable parts became crucial. Various heuristic techniques regarding modularisation were conceived. Some of these then evolved to become constructs of what were, at that time, modern programming languages, and were eventually integrated into programming methodologies [24, 25]. The advantages that structuring software systems into modules have in all phases of software development, from analysis to maintenance, were instantly recognised and have strengthened over the intervening decades.
In the past decade or so, a new branch of software engineering emerged with the name software architectures (SAs) [4, 15]. This branch reemphasises the notion of module, or component, at a perhaps higher level of abstraction than that normally used in other modern modelling (and programming) methodologies, such as object orientation. Software architectures suggest the modelling of system structure in terms of components related by means of connectors. Software architectures thus introduce a second modularisation concept to accompany that of components, the connector. The motivating principles are the very same ones originally motivating modularisation techniques. Software architectures notably differ from object orientation in the way interaction is represented. In software architectures, interaction is typically defined externally to components, which has the advantage of explicitly showing the structural appearance of systems. In object orientation, on the other hand, interaction can be obscured because it can be implemented via "feature calling" (of other parties in the interaction) within the interacting classes [6]. This is a problem of object oriented programming and modelling languages that several researchers have acknowledged. Rumbaugh's relational object oriented language [26], Andrade and Fiadeiro's coordination contracts [5], and various design patterns [14] are examples addressing this deficiency of object orientation. The increasing focus on higher level structural descriptions of systems led to the development of a special type of specification language, called architecture description language (ADL) [23]. The purpose of ADLs is to describe the architectural aspects of software systems, so that properties of the specified systems, especially those involving architectural information, can be analysed. Modern applications typically require a feature that some ADLs are able to deal with, namely dynamic reconfiguration.
Dynamic reconfiguration refers to the run time modification of the system's structure [22]. Although this is not an inherent feature of software architectures, it appears frequently and naturally in the design of systems, perhaps due to the success of object oriented methodologies and programming languages, where it certainly is intrinsic. While ADLs provide constructs for modelling the architecture of a system, and some of them also allow for the description of possible changes to it (usually via operations that may modify the system's structure at run time), they often do not directly support (within the language) reasoning about possible system evolution. More precisely, some ADLs support the definition of components and interconnections to build architectures, and transformation rules or operations for making architectures change dynamically, but any kind of reasoning about behaviours is often performed in some "meta-language", often informally. Moreover, the description of architectural elements in ADLs, particularly those related to dynamic reconfiguration, is usually done in an operational way, as opposed to declaratively [16, 18, 27]. Being able to specify and reason about the consequences of using certain reconfiguration operations in a declarative manner would add abstraction to what, to our understanding, can be operationally specified by using ADLs. In addition, the abstraction gained by using a declarative framework might allow us to study possibly more sophisticated, abstract ways of describing software architectures. We therefore proposed a temporal logic as a formal basis for the specification of component based
systems with support for run time reconfiguration [2]. We aim at facilitating the reasoning that is usually necessary regarding this kind of system. Besides providing direct support for reasoning, temporal logic gives us a declarative and well known language in which behavioural properties can be expressed, and which is currently used in several branches of software engineering. We employ for this purpose a variant of a logic that Manna and Pnueli [21] proposed for specifying reactive systems. This variant, proposed by M. Abadi [1], allows for more general flexible, i.e., state dependent, symbols, and therefore is better suited for the specification of dynamically reconfigurable systems. In particular, the logic admits a derived proof rule, which enables us, if specifications are organised appropriately, to import properties from components when building (dynamic) amalgamations. This allows us to exploit the modular structure of specifications to localise reasoning to the relevant parts of these specifications, when proving certain properties. A prototypical language based on the proposed logic is defined, where systems specifications are hierarchically organised around the following notions:
• the notion of datatypes, which constitute the lowest level in specifications;
• the notion of components, which are represented by classes that define templates for these components;
• the notion of connector types, which we call associations, which are then used to define the potential ways in which components may be organised in a system;
• the notion of subsystem, the new notion that defines the (coarse grained) unit of modularity from which reconfigurable systems are built, and which conveys the information about which components, which associations and which reconfiguration operations are used to define the module.
It is not our aim to propose (yet) another architecture description language, but to study an alternative declarative and formal semantics for software architectures, with direct support for reasoning. We prefer to illustrate the capabilities and expressive power of the formalism by defining a simple front-end to the logic, our prototypical language. This is simpler than trying to relate, at this stage of our work, our logic to existing high level ADLs. In addition, this language allows us to study the enhancements that our alternative semantics might provide to the specification of software architectures. In particular, we show that more powerful modelling constructs, such as class inheritance, as in object orientation, can be provided using our alternative semantics. This not only provides the well known advantages in terms of the reuse of component definitions, but also enables us to define polymorphic reconfiguration operations.
1.2. A Model of Reconfigurable Component Based Systems
We now define in some detail the model of reconfigurable component based systems that we assume. We try to preserve some of the good features of some ADLs,
such as declarativeness (as in Acme [17]), hierarchical composition of systems (as in Darwin [19]) and designs at a high level of abstraction (as in CommUnity [28]).
1.2.1. Basic Components
As described in [17], components are often meant to represent the primary computational and data storage units of systems in software architectures. This view of components is shared by Acme [17], CommUnity [28], Darwin [19], and Wright [3], for instance. In our view of reconfigurable component based systems, the smallest computational and data storage units are recognised as special, and are called basic components. We prefer to use the term 'basic component' because we also want to be able to build complex components out of simpler ones, as in Darwin or Acme. These complex components could then form part of even more complex components, or systems. As justified in [17, 19], being able to hierarchically structure components is an important feature, which can help in dealing with large systems through several layers of decomposition. We consider that a component is basic if it is not composed of simpler components. For basic components, we take the (more abstract) model of components from CommUnity [28], and define basic components as consisting of variables, as in imperative programming languages, which define the components' internal state. We also assume that there is a set of datatypes provided, which are used as types for the variables within basic components. It is important to note that, since basic components are not defined in terms of other components, their variables must be typed with basic types, not types which represent components. (This underpins our view that, in order to achieve the desired low coupling between components in a system, any interaction between components must be defined completely externally to the components involved.) As is the case with CommUnity's channels [28], a component can have input, output or local variables. These are represented as follows: components have attributes, which correspond to local or output variables, and read variables, which correspond to input variables.
Local variables are differentiated from nonlocal ones by means of a component interface definition. Basic components also encapsulate actions, which represent their associated computational behaviour. These actions allow a basic component to change its state, by modifying the values stored in its variables, and provide, in this way, some functionality to the system. Actions of components are assumed to be instantaneous, and they might have parameters. Actions which take some period of time to be completed can be modelled with 'start' and 'end' instantaneous actions, as in [12] and the specification stage of [9]. As an example, suppose we want to model a network of units which can interchange messages. These units might represent basic components, whose state is composed of a private address, attributes for storing outgoing messages, and read variables for obtaining incoming messages. Their associated behaviour might be represented by actions for producing and sending outgoing messages, and obtaining and consuming incoming ones. The visibility of read variables, attributes and actions should be reflected when
components are connected. For instance, a public attribute of a component A could be connected to a private read variable of a component B, meaning that B can read A's attribute, but B's clients cannot read A's attribute through B. Clearly, it should not be allowed to connect private features (read variables, attributes or actions) of a component to private features of other components.
1.2.2. Connectors
One of the central characteristics of software architectures is that the interaction between components is characterised by means of connectors [17], externally to the definition of components. The purpose of connectors is to allow for the communication between different components in the system, so they can perform activities in combination. The externalisation of the interaction between components in connectors makes the structural appearance of systems (in terms of interrelated components) immediately apparent; this makes it easier to describe operations, constraints, properties, etc, concerning the structure of systems. The externalisation of component interactions and its benefits are successfully exploited in object orientation by techniques such as those based on certain design patterns [14], coordination contracts [5] and middleware technology [8]. In order to make components (and, as we will see, this will include complex components) interact, we use the notion of coordination, as in CommUnity. Basically, we use connectors to specify how the states and behaviours of interacting components are combined. Special cases of this kind of communication are sharing of variables and action synchronisation, but more sophisticated ways of communication are also possible. When two or more components are connected, their behaviours become related, in the way specified by the corresponding connector. For instance, the connector might specify that the occurrence of an action in one of the participating components enforces the occurrence of an action in another participant, but not vice versa (i.e., as in a remote procedure call). An example of a connector might be a link between the previously described units. A link might enforce the occurrence of the retrieve operation on one of the connected units whenever the send operation occurs on the other one. 
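The unit and link example can be illustrated with a small Python sketch. This is only an informal illustration, not the specification language of this chapter: the names `Unit`, `Link`, `produce`, `send`, `retrieve` and `step` are hypothetical, and the representation of messages as strings is an assumption. What the sketch tries to show is that a unit never refers to another unit, and that the one-directional coupling (send on one side enforces retrieve on the other) lives entirely in the connector.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Unit:
    """Basic component: its state consists of variables typed with plain datatypes."""
    address: int                                       # private attribute
    outbox: List[str] = field(default_factory=list)    # attribute holding outgoing messages
    inbox: List[str] = field(default_factory=list)     # attribute holding consumed messages
    incoming: Optional[str] = None                     # read variable (input)

    def produce(self, msg: str) -> None:
        """Action: queue an outgoing message."""
        self.outbox.append(msg)

    def send(self) -> Optional[str]:
        """Action: emit the oldest outgoing message, if there is one."""
        return self.outbox.pop(0) if self.outbox else None

    def retrieve(self) -> None:
        """Action: consume the value currently visible on the read variable."""
        if self.incoming is not None:
            self.inbox.append(self.incoming)
            self.incoming = None


@dataclass
class Link:
    """Connector, defined outside both units (hypothetical illustration)."""
    source: Unit
    target: Unit

    def step(self) -> bool:
        """A send on the source enforces a retrieve on the target, not vice versa."""
        msg = self.source.send()
        if msg is None:
            return False
        self.target.incoming = msg   # bind the source's output to the target's read variable
        self.target.retrieve()       # enforced occurrence of the target's action
        return True


a, b = Unit(address=1), Unit(address=2)
link = Link(source=a, target=b)
a.produce("hello")
link.step()
print(b.inbox)  # ['hello']
```

Because `Unit` holds no reference to other units, replacing `Link` by a different connector changes how units interact without touching the component definition.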
We are particularly inspired by the style of specification and component coordination used in [13] and related work, which has also been inherited by modern versions of the CommUnity design language [27].
1.2.3. Complex Components
In SAs, systems are typically seen as configurations of components related by means of connectors. We use the term subsystem to refer to such configurations. We do so because (dynamic) configurations of interrelated components might be themselves the component parts of bigger systems. A system can then be seen simply as a top level subsystem. Subsystems represent complex types of components. They are complex in the sense that their internal definitions may involve simpler components (other subsystems or basic components). Subsystems can encapsulate data in the form of
variables, but as opposed to the case of basic components, subsystems can also build their internal state by using interacting instances of simpler components. Subsystems also encapsulate behaviour. Subsystem actions represent the computational behaviour of a subsystem and, besides modifying the values stored in the subsystem variables, they can modify the internal structure of the subsystem, by creating or deleting instances of simpler components, or modifying the way in which these simpler components interact (e.g., creating or deleting instances of connectors). As an example of a subsystem, we can consider a subnet, i.e., a collection of units connected in a star topology to a gateway (a different kind of basic component, which forwards messages to other subnets). Some basic attributes of a subnet could be the maximum number of units allowed in the subnet, or the number of messages sent since the subnet was started. A subnet might have (reconfiguration) operations for adding new units, or deleting existing ones. As we said, complex components might also be built out of other complex components. As opposed to the case of Darwin [19], we do not allow for recursion in the definition of components, and allow only for hierarchical organisations of (sub)systems in terms of other subsystems or basic components. Connectors could then be defined to relate subsystems in complex configurations.
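The subnet example can be sketched as a data structure whose operations change its own internal structure. This is a minimal sketch under assumptions: the class and method names (`SubNet`, `add_unit`, `rem_unit`), the use of a dictionary for the live units, and the choice to count `max_units` without the gateway are all illustrative and not taken from the chapter's specification.

```python
class Unit:
    def __init__(self, address):
        self.address = address


class Gateway(Unit):
    """A different kind of basic component at the centre of the star."""


class SubNet:
    """Subsystem: basic attributes plus a dynamic set of linked instances."""

    def __init__(self, gateway, max_units):
        self.gateway = gateway
        self.max_units = max_units   # basic attribute (gateway not counted)
        self.messages_sent = 0       # basic attribute
        self.units = {}              # instance name -> Unit
        self.links = set()           # names of units linked to the gateway

    def add_unit(self, name, address):
        # Reconfiguration: create a component instance and a connector instance.
        if name in self.units:
            raise ValueError("name already in use")
        if len(self.units) >= self.max_units:
            raise ValueError("subnet is full")
        self.units[name] = Unit(address)
        self.links.add(name)

    def rem_unit(self, name):
        # Reconfiguration: delete the instance together with its link.
        del self.units[name]
        self.links.discard(name)


net = SubNet(Gateway("g1-addr"), max_units=2)
net.add_unit("u1", "a1")
net.add_unit("u2", "a2")
net.rem_unit("u1")
```

The star topology is kept by construction here: every live unit has exactly one link, and it always leads to the gateway.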
1.3. A Temporal Specification Language

Before describing the constituents of our prototypical specification language, let us describe in more detail the logic our work is based on.

1.3.1. The Logic

The main characteristics of the logic that underlies our prototypical specification language are:
• the logic is first-order, with sorts and equality,
• time is considered to be linear, with a discrete set of instants and an initial instant (i.e., the model of time is $\mathbb{N}$),
• besides the usual connectives and quantifiers, the logic also features the temporal operators $\bigcirc$ (next), $\Box$ (always), $\Diamond$ (eventually) and $\mathcal{U}$ (until),
• some function and predicate symbols (called flexible) are interpreted in a state dependent way, although functions and predicates with state independent interpretations (called rigid) are also available.

This logic is a many-sorted version of a predecessor of the Manna-Pnueli logic [21], presented by M. Abadi [1]. With respect to the Manna-Pnueli logic, it is more general since it allows for flexible predicates (not available in Manna-Pnueli's) and flexible function symbols of arbitrary arity (in the Manna-Pnueli logic, only function symbols of arity zero are allowed to be flexible). These more general flexible symbols allow us, as we will show, to represent operations and attributes of components in a straightforward way.
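To make the role of flexible symbols and temporal operators concrete, the following sketch evaluates standard linear-time operators over a finite list of states. It rests on assumptions that should be kept in mind: the logic in the text is interpreted over infinite sequences of instants, whereas this sketch only inspects a finite prefix; the function names (`atom`, `next_`, `always`, `eventually`, `until`, `weak_until`) and the state fields `active` and `count` are hypothetical.

```python
# Formulas are functions (trace, i) -> bool; a trace is a list of states (dicts).
# Finite-prefix evaluation is an assumption of this sketch, not the semantics
# of the logic, which is defined over infinite sequences of instants.

def atom(pred):
    """Lift a state predicate (a flexible symbol read in one state) to a formula."""
    return lambda trace, i: pred(trace[i])

def next_(f):
    """f holds at instant i + 1; taken as false at the last instant of the prefix."""
    return lambda trace, i: i + 1 < len(trace) and f(trace, i + 1)

def always(f):
    return lambda trace, i: all(f(trace, j) for j in range(i, len(trace)))

def eventually(f):
    return lambda trace, i: any(f(trace, j) for j in range(i, len(trace)))

def until(f, g):
    """g holds at some instant j >= i, and f holds at every instant before j."""
    def holds(trace, i):
        for j in range(i, len(trace)):
            if g(trace, j):
                return True
            if not f(trace, j):
                return False
        return False
    return holds

def weak_until(f, g):
    """Derived operator: either 'f until g', or f holds from now on."""
    return lambda trace, i: until(f, g)(trace, i) or always(f)(trace, i)

trace = [
    {"active": True, "count": 0},
    {"active": True, "count": 1},
    {"active": False, "count": 1},
]
active = atom(lambda s: s["active"])
count_is_1 = atom(lambda s: s["count"] == 1)
count_is_5 = atom(lambda s: s["count"] == 5)
non_negative = atom(lambda s: s["count"] >= 0)

results = {
    "next": next_(count_is_1)(trace, 0),
    "until": until(active, count_is_1)(trace, 0),
    "always_active": always(active)(trace, 0),
    "weak_until": weak_until(non_negative, count_is_5)(trace, 0),
}
```

Here `active` and `count` play the part of flexible symbols, since their values depend on the state, while the comparison with the constant 1 is rigid.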
As shown in [1], the logic admits a sound (but not strongly sound) proof system. However, no complete proof calculus can be constructed for the logic [1]. Nevertheless, completeness can be achieved if one considers restricted notions of completeness. An example of this fact is the completeness of the proof calculus of [20] with respect to time sequences corresponding to executions of a concurrent program.

We represent basic components and subsystems as theory presentations. The proof calculus of the logic can then be used in order to reason about the consequences of the axioms of a component or subsystem specification.

1.3.2. Organisation of the Language
The prototypical language we present allows us to specify reconfigurable component based systems, by defining:
• basic datatypes, which serve as the types for variables of basic and complex components,
• basic component types, which, in contrast with complex component types, represent instances which are not composed of other (simpler) components,
• connector types, which define the possible ways in which components might interact,
• complex component types, which are called subsystems; these represent the instances of components whose internal state is defined in terms of a dynamic set of interacting simpler components.

Specifications are modularly organised in layers, from datatype specifications to the specification of architectural subsystems. Subsystems might be composed of other simpler subsystems, in a non-recursive way (in contrast with practice in Darwin, for instance). Thus, we permit a hierarchical approach in the organisation of reconfigurable component based systems. As we will show, we can use a derived inference rule of the logic in order to relate deduction in different layers of a specification.

1.3.3. A Specification Problem
In order to introduce the language and its semantics, we will use the previously described example of communicating units. Let us describe it in more detail. Suppose we want to model and analyse a network of units, which can interchange messages. Each unit has an associated address, which is supposed to be unique to a unit in the system. Messages are sent by units to other units. The address of the destination unit is included within each message. Also, there is a special message, called null, whose associated destination address is undefined. Units are organised into subnets. Subnets contain a non-empty set of units, organised in a star topology. The unit at the centre of the star is of a special kind, and is called a gateway. Gateways receive and send messages from and to other units outside the subnet. All other units in a subnet communicate through the gateway. Gateways are units as well, so they can receive and send messages of their
own. New units can be added to a subnet, and existing units can be deleted, except for the gateway. A gateway recognises whether a message is addressed to a unit within its subnet by checking its netmask. The netmask allows a gateway to decide whether a given address belongs to the set of valid addresses for its subnet or not. Note that the whole netmask "address space" does not have to be covered; in other words, there might be addresses that are valid for the netmask of a subnet even though no live unit with that address resides in the subnet. A further requirement is that no address may correspond to more than one netmask in the subnets of the system. The whole system is basically a collection of interconnected subnets. Subnets are also connected in a star topology, with a gateway in the centre (that we call a router). It would be important to allow the units in the system to send messages to other units possibly outside the system, although we will not use such a feature here. An interesting capability one might want to provide for the system is the possibility of dynamically adding new subnets, or possibly detaching existing subnets. An informal diagrammatic view of a system is shown in Fig. 1.1. Normal units are labelled with indexed u's, gateways with indexed g's and the only router with r1. The region within the dotted circle corresponds to a subnet; so, all units within the subnet (including the gateway) belong to the address space determined by the gateway's netmask.
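The two netmask requirements, membership of an address in a subnet's address space and the absence of overlap between subnets, can be sketched as follows. The encoding is an assumption made purely for illustration: addresses are strings, a netmask is modelled as a predicate accepting a string prefix, and the helper names `make_netmask`, `overlapping` and `route` are hypothetical. The overlap check only inspects a finite list of candidate addresses, so it is a test rather than a proof of the requirement.

```python
def make_netmask(prefix):
    """Hypothetical netmask: a predicate over addresses (here, a string prefix)."""
    return lambda address: address.startswith(prefix)

def overlapping(netmasks, addresses):
    """Return the candidate addresses accepted by more than one netmask."""
    return [a for a in addresses if sum(1 for nm in netmasks if nm(a)) > 1]

def route(address, local_netmask):
    """Decision a gateway makes for a message destination."""
    return "internal" if local_netmask(address) else "external"

nm1 = make_netmask("10.1.")
nm2 = make_netmask("10.2.")
nm_wide = make_netmask("10.")

candidates = ["10.1.5", "10.2.7", "10.3.1"]
disjoint = overlapping([nm1, nm2], candidates)
clash = overlapping([nm1, nm_wide], candidates)
```

Note that `nm1` accepts `"10.1.5"` whether or not a live unit with that address exists, which matches the remark that the address space need not be fully covered.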
[Figure: units u1, u2, ..., uk and gateway g1 inside a dotted circle, all belonging to g1's netmask]

Fig. 1.1. A graphical representation of a network of units

1.3.4. Specification of Datatypes
Basic components build their state by means of variables, as in imperative programming languages, whose types are defined in the abstract datatypes specification. The
logic can be used to build first-order characterisations of abstract datatypes, in a way similar to that used in algebraic specifications [10]. Flexible, i.e., state dependent, symbols are not necessary for the specification of abstract datatypes (flexible symbols are reserved for the specification of the state of components and configurations). Therefore, the specification of abstract datatypes consists of a static temporal theory presentation, i.e., a temporal theory presentation over a language without flexible symbols. Let us assume that we have a datatype specification AVT, containing descriptions for the standard basic datatypes, such as integers, natural numbers, booleans, etc., with their usual operations. For the sake of modelling the above described system, we assume that datatypes message and address are provided within AVT. We also assume that the operations $\mathit{null} : \rightarrow \mathit{message}$ (which represents the 'empty' message) and $\mathit{dest} : \mathit{message} \rightarrow \mathit{address}$ (which 'extracts' the address contained in a message) are available. No special axioms are required for these datatypes.

1.3.5. Specification of Basic Components
Basic components are one of the basic building blocks of software architectures. We intend to use temporal logic theories to describe components, as in [11]. In dynamically reconfigurable systems, a varying number of "instances" of the same type of component can be involved. We therefore want a way of describing templates of these components, rather than the components themselves (this is, in fact, the approach in most ADLs). We call these descriptions class definitions. Class definitions are modularly built on top of an underlying datatype specification, AVT in our case. Possible counterparts of class definitions in some architecture description languages are (basic) component types in Darwin [19], component definitions in Acme [16] and in Wright [3] (within styles), and component designs in CommUnity [27].

A class specification (type for basic components) is composed of:
• a set of typed read variables (entry points for the components, used for communication),
• a set of typed attributes, which represent the variables that constitute the state of the instances of the class,
• a set of parameterised actions, which represent the behaviour associated with the instances of the class.

The intended behaviour of the instances of a class specification is defined by means of temporal formulae. These formulae will, in particular, indicate the effect of actions on the state of the instance, i.e., on the values of the attributes. Given a class C, a temporal axiom for it is a formula over the alphabet obtained by combining the alphabet of the datatype specification AVT with:
• the read variables and attributes of the signature treated as flexible 0-ary function symbols of the corresponding sort; the flexibility of these symbols represents the possible change of the values stored in read variables and attributes.
• The actions of the signature treated as flexible predicate symbols; the truth of flexible predicate a(x) at a given instant i represents the occurrence of action a at i, with the arguments x.
• The class name C treated as a 0-ary flexible predicate. The truth of C at a given instant i represents the "activeness" of the corresponding component at that instant.

The flexible predicate C, named after the name of the class, is used to represent the activeness of the corresponding "object" or instance. Note that C represents a kind of structural information about the system. However, this is all the knowledge that a component can have regarding the structure of the system. It is useful to have this kind of information, since one usually requires an instance to have a property only while it is active. Moreover, the use of the activeness predicate is central to our approach to the specification of reconfigurable systems. In order to understand why, consider a simple property P of a component type C. This property should be characterised within C's theory by an axiom of the form:

$C \rightarrow P$

As will be shown later on, in order to build theories representing aggregations of instances of C, this formula is relativised, into the form:

$\forall x : C(x) \rightarrow P(x)$

which adequately indicates that, for every x, if x denotes the name of a "live" instance of class C, then x has the property P.

A class definition might also include an interface, which is simply a list of those read variables, attributes and actions that are visible from the outside of the class.

The most trivial basic component that we can recognise in our problem is unit. As we described units, they have the capability of sending and receiving messages to and from other units. However, we do not allow a class or any component type (e.g., subsystems) to directly reference other components not contained in it. So, the communication part of a unit has to be characterised via read variables, attributes and connector definitions.
A possible specification of units is composed of the following:
• Read variables:
— a boolean-typed read variable, called in, which indicates to a unit whether there is an incoming message ready to be obtained (from the "environment"),
— a boolean-typed read variable, called out, which tells a unit whether the "environment" is ready to receive an outgoing message (produced by the unit).
• Attributes:
— an address-typed attribute, called addr, meant to contain the address of the unit (which, let us recall, is supposed to be unique to the unit),
— a message-typed attribute, called curr-out, where outgoing messages are stored before sending them,
— a message-typed attribute, called curr-in, where the incoming messages addressed to the unit are stored, before consuming them.
• Actions:
— action u-init(address), which initialises a unit by setting its address to the one passed as a parameter, and setting attributes curr-in and curr-out to the null message,
— action prod(message), which produces its argument message, storing it in the curr-out attribute, ready to be sent,
— action send(message), which sends a previously produced message to the environment,
— action get(message), which obtains, from the environment, an incoming message,
— action cons(message), which consumes a previously obtained message, provided that it is addressed to the unit,
— action rem(), which removes a previously obtained message, provided that it is not addressed to the unit.

The class specification Unit is shown in Fig. 1.2. Note that neither its read variables nor the addr attribute belong to the class' interface. The class' interface is defined by the Exports clause and, as we said, indicates the visibility of the class' attributes, read variables and actions. If the Exports clause is missing in a class definition, it is assumed that all the class' features are exported. Axiom 1 specifies the intended behaviour of action u-init(address) (note how we use the 'next' temporal operator to specify the state after the occurrence of an action). Axiom 2 indicates that this action can be called at most once per lifetime of a unit (a constraint not originally stated in the problem description), using the derived 'weak until' temporal operator. Axiom 3 is a locality condition, indicating that only u-init(address) can update the addr attribute. The general flexible predicate symbols of the logic allow us to represent parameterised actions of components in a straightforward way. Axioms 4-5, 6-7 and 8-9 specify the get(message), cons(message) and rem() operations, respectively. Axiom 10 is, again, a locality condition, indicating that get(message), cons(message) and rem() are the only operations that can update the curr-in attribute. Axioms 11-12 and 13-14 specify the send(message) and prod(message) operations, respectively. Finally, Axiom 15 is another locality condition, restricting attribute curr-out to be modified only by actions send(message) and prod(message). Note that all these axioms are subject to the activeness of the unit.

The careful reader might also notice that we have no axiom specifying the condition that an address must be unique to a single unit. The reason for this is that this condition corresponds to a structural constraint, which is beyond the language of the Unit component. Therefore, it will have to be specified within the containing subsystem, SubNet.
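The informal description of units can be read operationally, as in the sketch below. It is not a rendering of the axioms of Fig. 1.2, which are not reproduced in the text; it only follows the prose. Several points are assumptions: the Python names (`Unit`, `Message`, `NULL`), the omission of the read variables in and out, the decision that sending and consuming reset the corresponding attribute to null, and the use of exceptions where the logic would simply make an action occurrence impossible.

```python
NULL = None  # stands in for the special null message


class Message:
    def __init__(self, dest, body=""):
        self.dest = dest   # destination address carried by the message
        self.body = body


class Unit:
    def __init__(self):
        self.initialised = False
        self.addr = None
        self.curr_in = NULL
        self.curr_out = NULL

    def u_init(self, address):
        # At most one occurrence per lifetime; only u-init updates addr.
        if self.initialised:
            raise RuntimeError("u-init may occur at most once")
        self.addr = address
        self.curr_in = NULL
        self.curr_out = NULL
        self.initialised = True

    def prod(self, message):
        self.curr_out = message          # stored, ready to be sent

    def send(self):
        if self.curr_out is NULL:
            raise ValueError("nothing was produced")
        message, self.curr_out = self.curr_out, NULL
        return message

    def get(self, message):
        self.curr_in = message           # obtained from the environment

    def cons(self):
        # Consume only a message addressed to this unit.
        if self.curr_in is NULL or self.curr_in.dest != self.addr:
            raise ValueError("no message addressed to this unit")
        message, self.curr_in = self.curr_in, NULL
        return message

    def rem(self):
        # Remove only a message that is not addressed to this unit.
        if self.curr_in is NULL or self.curr_in.dest == self.addr:
            raise ValueError("no foreign message to remove")
        self.curr_in = NULL


u = Unit()
u.u_init("a1")
u.prod(Message("a2", "ping"))
sent = u.send()
u.get(Message("a1", "pong"))
consumed = u.cons()
u.get(Message("a9", "stray"))
u.rem()
```

As in the specification, nothing in this class can state that `addr` is unique across units; that constraint needs a view of all instances and so belongs to the enclosing subsystem.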
1.3.5.1. Semantics of Classes

A class specification C is interpreted as a theory presentation, over the alphabet composed of the AVT specification extended with C's read variables, attributes, actions and activeness predicate. The axioms of the presentation are obtained by putting together:
• the axioms explicitly provided for the class definition,
• the axioms $Ax_{AVT}$ of the datatype specification AVT,
• a special implicit axiom, called the locality axiom for the specification, whose general form is:

$$\Box\Big( C \rightarrow \Big( \bigvee_{g \in Act} \exists x_g : g(x_g) \Big) \vee \Big( \bigwedge_{a \in Att} \bigcirc(a) = a \Big) \Big)$$

where C is the component's name (i.e., its activeness predicate), Act is the set of exported actions, and Att is the set of attributes of the component. Intuitively, the locality axiom for a class specification expresses the fact that in every state in which the component is actively involved, either one of the actions is triggered or all the attributes remain unchanged. This axiom was originally proposed in [11], and enforces a kind of encapsulation. Locality axioms are sometimes referred to as "frame axioms" or "frame properties". We have imposed stronger locality conditions as part of the specification of units. Note that read variables are not considered in the locality axiom; this is because read variables are special attributes, meant to be "entry points" used by a component to query the state of the environment. Therefore, they are not controlled by the component, which implies they could change, from the point of view of the component, arbitrarily.

Another important basic component is the gateway. Gateways have the purpose of "forwarding" messages that arrive at a subnet to its constituent units, and of sending to other subnets the "internal messages" addressed to "external units", i.e., to units which reside outside the gateway's subsystem. As stated in the problem description, gateways also behave as units, i.e., they can send messages of their own, and receive messages addressed to them. In order to specify gateways, we can take the specification of units, and extend it with further read variables, attributes and actions:
• Read variables:
— a boolean-typed one, named int-in, which indicates to a gateway whether there is a message from one of the "internal units" (the units within the subnet) ready to be obtained,
— a boolean-typed one, named int-out, which tells a gateway if the "internal units" are ready to receive a message from the gateway.
• Attributes:
— an attribute of type $\mathit{address} \rightarrow \mathit{boolean}$, named netmask, which characterises the set of "valid" addresses for units internal to the corresponding subnet,
— a message-typed one, named int-curr-in, which holds a just received message from the internal units of the corresponding subnet,
— a message-typed one, named int-curr-out, which holds a message ready to be sent to units internal to the corresponding subnet.
• Actions:
— action int-prod(message), which produces its argument message, storing it in the int-curr-out attribute, ready to be sent "internally",
— action int-send(message), which sends a previously "internally produced" message to the environment (actually, to other units in the subnet),
— action int-get(message), which obtains from the environment (actually, from some unit internal to the corresponding subnet) an incoming message,
— action int-cons(message), which consumes a message previously obtained from other units internal to the subnet, provided that it is addressed to the gateway,
— action int-rem(), which removes a message previously obtained from other units within the subnet, provided that it is not addressed to the gateway.

The behaviour of some of the actions originating in Unit has to be enhanced. Operation rem() has a new associated behaviour: if the message in curr-in, let us say m, is not addressed to the gateway but it does correspond to the netmask (i.e., it is a valid address of the subnet), then int-prod(m) is "called", so that the message is forwarded to the units of the subnet. Note that this does not violate any of the previous axioms regarding action rem(). A similar behaviour is the one associated with action int-rem(), this time referring to the "internal interface" of the gateway.
If the message in int-curr-in, let us say m, which was previously obtained from other units in the subnet, is not addressed to the gateway, then there are two alternatives:
• if m corresponds to the netmask of the subnet, then int-rem() removes the message from int-curr-in and "calls" int-prod(m), in order to forward it to the units internal to the subnet,
• if m does not correspond to the netmask of the subnet, then int-rem() removes the message from int-curr-in and "calls" prod(m), in order to forward it to the units outside the subnet.

The rest of the operations behave in the same way as the operations originating in Unit, but managing the internal interface of the gateway.

1.3.6. Extending Class Specifications
A first choice for specifying gateways would be to give a separate theory presentation, independent of the specification of units. However, it is clear that gateways share many properties with units. In fact, as was stated in the problem description, gateways are special units, which provide an extra "interface". We could, therefore, try to represent gateways by making use of an inheritance mechanism, as in object orientation.
In order to define an inheritance relationship between components (in this case, basic components), we have to agree on what is meant by correct extension of a component definition. In object orientation, the intended meaning associated with the extension of a class A by a class B is often associated with the following facts:
(1) instances of B provide at least all the "services" that instances of A provide, but instances of B might provide more "services" (i.e., extra behaviour),
(2) if instances of A are replaced by instances of B (in contexts in which instances of A are employed), then the observable behaviour should not be altered (the substitution principle).

In order to characterise these points, we say that a class $C_{sub}$ extends a class $C_{super}$ if and only if the following conditions are satisfied:
• the mapping $\tau$, that maps
— predicate symbol $C_{super}$ to $C_{sub}$,
— any other element of the alphabet corresponding to $C_{super}$ to itself, as an element in the alphabet corresponding to $C_{sub}$
(i.e., $\tau$ works as the identity injection for all symbols of class $C_{super}$, except for the predicate $C_{super}$ itself), is an alphabet morphism between the alphabet corresponding to $C_{super}$ and the alphabet of $C_{sub}$,
• the interface of $C_{super}$ is respected by $C_{sub}$, i.e., all symbols exported by $C_{super}$ are also exported by $C_{sub}$,
• $\tau$ is a theorem preserving language translation, between the theory of the superclass $C_{super}$ restricted to the sublanguage of its interface plus predicate $C_{super}$, and the theory of the subclass $C_{sub}$ restricted to the sublanguage of the interface of $C_{super}$ plus its activeness predicate $C_{sub}$.

Note that the first of the above conditions requires that the subclass extends the language of the superclass, except for the predicate symbol denoting activeness of the superclass instances, which is appropriately translated into the corresponding predicate in the subclass. In other words, the signature of the superclass is embedded in the subclass.
It is also worth noting that the conditions for valid inheritance do not forbid either interface or signature expansion of the superclass by the subclass. Our intention with the last of the above conditions is to logically represent what preservation of the "abstract meaning" of actions is. The restriction of meaning preservation to the interface of the superclass has to do with the fact that what is "internal" to a class (i.e., not exported) might be modified by the subclass, as long as the observable behaviour of the exported operations of the superclass is maintained. But how do we define observable behaviour? Interfaces help with this. The behaviour of a component is characterised by the theorems of the component; the observable behaviour is then the set of theorems restricted to the sublanguage of the interface. We require the subclass to respect the observable behaviour of the superclass, which means that the theory of the subclass should be an extension
of the observable behaviour of the superclass. Hence the third condition for valid inheritance.

Using this definition of class extension, we can define the Gateway class as an extension of Unit. The corresponding class specification is shown in Fig. 1.3. We have chosen to make a simplification to gateways: their internal netmask cannot change. This restriction (characterised by Axiom 2) was made simply to keep Gateway's specification simple; it is not difficult to extend gateways with further operations in order to manage their corresponding netmasks. The "forwarding" capability of gateways is characterised by Axioms 15-17. Forwarding is associated with actions rem() and int-rem(), i.e., the actions responsible for "removing" incoming messages of the two "interfaces" of the gateway, when they are not addressed to the gateway. Note how Axioms 15-17 look at the netmask in order to decide what to do with an incoming message. We might also want to add two further axioms to the specification of gateways:
• if a message is produced in order to be sent "outside" the corresponding subnet, then its destination address does not correspond to the netmask of the gateway,
• if a message is produced in order to be sent "inside" the corresponding subnet, then its destination address must correspond to the netmask of the gateway.

These can be specified by the following formulae:

$\Box[\forall x \in \mathit{message} : \mathit{Gateway} \wedge \mathit{prod}(x) \rightarrow \neg \mathit{netmask}(\mathit{dest}(x))]$

$\Box[\forall x \in \mathit{message} : \mathit{Gateway} \wedge \textit{int-prod}(x) \rightarrow \mathit{netmask}(\mathit{dest}(x))]$

We assume that all axioms (explicit and implicit) corresponding to Unit are inherited by Gateway (although we do not write them explicitly). Then, we are trivially in the presence of valid class extension in this case, because all axioms of Unit are preserved (the proof calculus for the logic is monotonic).

1.3.7. Specification of Connectors
We have just described the way component types can be defined, by means of class definitions. We choose to define these class definitions as independent units. With the exception of inheritance, classes cannot refer to other classes within their definitions. Even in the case of inheritance, we can still see classes as completely independent units, by incorporating the implicitly inherited behaviour from the superclasses in the subclasses. This is crucial, since from a logical point of view, it allows us to reason about component properties independently of the rest of the system. Now we want to start defining dynamic aggregations of components. But of course we need ways of making components interact. In order to allow gateways to forward messages to other units in the subnet, we need to define a kind of communication link between them. A communication link within a subnet will have: • at one end, a gateway instance, the centre of the star, • at the other end, a unit instance, which, in principle, could be another gateway.
We represent communication links (i.e., connectors) by means of association definitions. Associations are composed of a set of participants (typed with class names) and connections, which are special formulae that relate the participants. For our specification problem, the specification of association Link is shown in Fig. 1.4. Note that the second participant, t, has been defined to be of "type" Unit. This type corresponds to the coverage of Unit, i.e., it characterises instances of Unit or subclasses of Unit (Gateway, for instance). We show how this predicate is characterised later on. In this association, it allows the second participant to be a unit or gateway. Note also that, probably against our first intuition, we have used implication instead of bi-implication in some of the connections. An example of this is the following:

$(t.\textit{curr-out} \neq \mathit{null}) \rightarrow (s.\textit{int-in} = T)$

which basically indicates that if the curr-out attribute of the target unit is not null, then the int-in read variable in the source gateway has to be T, but not vice versa, since int-in might be T due to some other unit in the subnet being ready to send a message. This is an example of an association relating the states of the participants in a way which is more sophisticated than shared memory. Of course, an eventual realisation of links would need to include elements for implementing this kind of more complex communication (a much more difficult task than implementing shared memory communication). Let us postpone the definition of the semantics of associations to the next section.

1.3.8. Specification of Complex Components
Basic components are those whose internal state is not composed of other (simpler) components. Subsystems correspond to complex components, i.e., components whose internal structure is built out of the dynamic aggregation of interacting simpler components. With subsystem definitions we reach the upper layer of the language. Subsystems correspond to the concept of dynamic aggregation of components in architecture description languages. Subsystem definitions correspond to the definition of composite components in Darwin (although we do not allow for recursion in the definition of complex components), systems in Acme, dynamic configurations in CommUnity and dynamic configurations in Wright. Basically, a subsystem is composed, at a given instant in time, of a set of interacting instances of simpler components, which are related by instances of associations. Besides this "structural composition", subsystems might also contain basic attributes and read variables, which will allow them to communicate with other subsystems in the context of a bigger system. One can also restrict the visibility of a subsystem's attributes and operations, by means of an Exports clause; however, we will not make use of this facility in our examples. Again, the absence of an Exports clause in a subsystem definition is interpreted as all the subsystem's attributes and operations being exported. The simplest kind of subsystem, that we start describing now, is the one defined directly in terms of basic components. Subsystem SubNet is of this kind, and represents a dynamic collection of units connected to a gateway. A number of
requirements have already been made clear from the specification of the problem. Some of these are the following:
• there is always a non-empty set of units within the subnet,
• the units in the subnet are organised in a star topology, with a unit of a special kind, a gateway, in the centre of the star.

Note that, since the gateway must always be present in the subnet, the first of the above conditions is trivially met. We can make the "design decision" of representing the gateway as an attribute; in fact, this is somehow necessary, since we will need to relate some attributes, read variables and actions of the gateway with others outside the subnet, in order to enable communication between the subnet and other subsystems. An extra datatype is necessary in order to start describing subsystems. This datatype is NAME, and represents the names of instances of components. According to the description of the problem, new units can be added to the subnet, and existing ones can be deleted (except for the gateway). So, we will need subsystem operations add-unit(NAME, address) (the second argument of the add-unit(NAME, address) operation will correspond to the address to be assigned to the new unit) and rem-unit(NAME) in order to characterise these tasks. Besides these two operations, we include an initialisation operation for the subsystem, called s-init(address) (the argument of the s-init(address) operation will correspond to the address to be assigned to the gateway of the subnet).

A subsystem specification corresponding to the above description is shown in Fig. 1.5. Axioms 1-3 specify that the units in the subnet are arranged in a star topology, with gw in the centre. Axiom 4 indicates that attribute gw is, throughout the lifetime of the subnet, an instance of Gateway, and Axiom 5 says that gw does not change (while the subnet is active). Axiom 6 corresponds to the informal requirement that an address must be unique to a unit (see footnote a).
It is worth noting that, thanks to inheritance, this requirement also involves the gateways in the subnet. Axiom 7 indicates that the addresses of the live units in the subnet respect the netmask specified within gw. This shows how the languages at different levels in the specification are related. Axioms 8-9, 10-12 and 13-15 specify the behaviour associated with operations s-init(address), add-unit(NAME, address) and rem-unit(NAME), respectively. Axiom 16 complements the specification of association Link, by requiring every occurrence of the operation gw.int-get(x) to be associated with an occurrence of the operation n.send(x), for some unit n linked to the gateway of the subnet.
The SubNet subsystem specification makes use of the language defined in the simpler components Unit and Gateway, with a slight modification: an extra NAME-typed argument has been added to the flexible symbols originating in these components. Basically, if $\mathit{at}$ is an attribute defined in class definition $C$, then symbol $\mathit{at}(n)$ represents attribute $\mathit{at}$ for the instance named $n$. Similarly, if action $a(t_1, \ldots, t_i)$ is declared in a class definition $C$, then $a(t_1, \ldots, t_i, n)$ represents the occurrence of
Footnote a: Let us recall that this requirement was not specifiable within Unit or Gateway, since it is a structural constraint.
action $a$ with parameters $t_1, \ldots, t_i$ in the instance referred to as $n$. We prefer to use the "dot notation" for writing the extra parameter of attributes, read variables and actions introduced by the relativisation, to have more readable expressions.

1.3.9. Semantics of Subsystems
As for class specifications, a subsystem Sub describes a theory presentation; its axioms are:
• The formulae explicitly provided in the subsystem specification,
• The (explicit and implicit) formulae corresponding to every class definition A aggregated by Sub, appropriately translated into the language of Sub by a relativisation,
• Implicit formulae characterising association definitions,
• Implicit formulae characterising general properties of subsystems.

1.3.9.1. Relativisation of Classes

The relativisation of class definitions is a simple procedure that transforms each axiom $\alpha$, given in a component definition $C$, into the property "all live instances of $C$ have property $\alpha$", for inclusion in a subsystem aggregating $C$. The activeness predicate of the corresponding class $C$ is very important in this translation. Basically, a statement of the form "while the component is active, the property $P$ holds" within a class $C$ becomes the subsystem statement "for every component $x$, if $x$ is a live instance of $C$, then $x$ has property $P$". The relativisation consists of the universal quantification of the extra argument of type NAME, introduced when mapping the language of a component into the language of an aggregating subsystem. As an example, let us consider the following axiom of class Gateway:

$\Box[\mathit{Gateway} \rightarrow \mathit{netmask}(\mathit{addr})]$

The relativisation of this axiom is the following:

$\forall n \in \mathit{NAME} : \Box[\mathit{Gateway}(n) \rightarrow n.\mathit{netmask}(n.\mathit{addr})]$

Promoting Properties of Classes

One of the advantages of our hierarchical organisation of reconfigurable systems in terms of simpler subsystems and classes is that it allows us to localise proofs to the relevant parts of a specification. For example, we can reason about some property concerning a class $C$ within the theory of $C$, and then promote that property into an including subsystem. This is possible thanks to the fact that the logic admits a (derived) inference rule, which we call the rule of structurality.
This rule asserts the following: if $\varphi$ is a consequence of a set $\Phi$ of premises, then the relativisation $\varphi'$ of $\varphi$ is a consequence of the relativisation $\Phi'$ of $\Phi$. This rule, and the fact that the relativisations of the axioms of classes are included in the theory of an aggregating subsystem, enable us to promote properties proved at the level of classes as properties of an including subsystem.
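The relativisation step can be illustrated with a small sketch. It assumes a hypothetical miniature syntax tree (the names `App`, `Implies`, `Always`, `ForAllName`, `add_name` and `relativise` are illustrative and not part of the formalism), and it writes the extra NAME-typed argument in last position, so that `n.netmask(n.addr)` appears as `netmask(addr(n), n)`.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class App:
    """A symbol applied to arguments; flexible symbols receive the extra NAME argument."""
    symbol: str
    args: Tuple["Term", ...] = ()
    flexible: bool = True


Term = Union[App, str]  # a plain string stands for a variable


@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Always:
    body: "Formula"


@dataclass(frozen=True)
class ForAllName:
    var: str
    body: "Formula"


Formula = Union[App, Implies, Always, ForAllName]


def add_name(f, n):
    """Append the instance name n to every flexible symbol occurring in f."""
    if isinstance(f, str):
        return f
    if isinstance(f, App):
        args = tuple(add_name(a, n) for a in f.args)
        return App(f.symbol, args + (n,) if f.flexible else args, f.flexible)
    if isinstance(f, Implies):
        return Implies(add_name(f.left, n), add_name(f.right, n))
    if isinstance(f, Always):
        return Always(add_name(f.body, n))
    raise TypeError(f"unsupported formula: {f!r}")


def relativise(axiom, n="n"):
    """Universally quantify the extra NAME-typed argument introduced by add_name."""
    return ForAllName(n, add_name(axiom, n))


def show(f):
    """Render a formula as text, for inspection only."""
    if isinstance(f, str):
        return f
    if isinstance(f, App):
        if not f.args:
            return f.symbol
        return f"{f.symbol}({', '.join(show(a) for a in f.args)})"
    if isinstance(f, Implies):
        return f"({show(f.left)} → {show(f.right)})"
    if isinstance(f, Always):
        return "□" + show(f.body)
    return f"∀{f.var} ∈ NAME : {show(f.body)}"


# Axiom of class Gateway used as the example in the text.
gateway_axiom = Always(Implies(App("Gateway"), App("netmask", (App("addr"),))))
print(show(relativise(gateway_axiom)))
```

The sketch only covers the connectives needed for the Gateway example; a fuller treatment would add the remaining temporal and first-order operators in the same style.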
We also need to provide a meaning to the flexible predicates representing the coverage of classes, related to class extension. Given a class definition $C$, the coverage predicate $\overline{C}$ (written here with an overline to distinguish it from the class predicate $C$), used to denote the live instances of $C$ or any subclass of $C$ in a subsystem Sub, is simply defined by the following formula of the language of Sub:

$\forall x : \overline{C}(x) \leftrightarrow (C(x) \lor C_1(x) \lor \ldots \lor C_k(x))$

where $C_1, \ldots, C_k$ are the subclasses of $C$, i.e., all the classes which inherit, directly or indirectly, from $C$. Polymorphic dynamic reconfiguration operations are easily defined, simply by using predicate $\overline{C}$ as an ordinary class predicate.

1.3.9.2. Semantics of Associations

From the connection definitions within associations, we generate some formulae, to be included in the theory of subsystems using the associations. In order to incorporate the formulae related to an association $R$ in an including subsystem Sub, we simply quantify the free variables of the formulae universally, and force them to be related via the flexible predicate $R$. Thus, if $\alpha(x, y)$ is a formula characterising interactions of a binary association $R$ (the generalisation of this for n-ary associations is trivial), where $x$ and $y$ are the free variables of $\alpha(x, y)$ representing the participants, then the formula:

$\Box[\mathit{Sub} \rightarrow [\forall x, y : R(x, y) \rightarrow \alpha(x, y)]]$

is included in Sub, clearly characterising the fact that components referenced by $x$ and $y$ must collaborate according to the connections associated to $R$ while they are "connected" by $R$ (and while Sub is actively involved in the system). We also indicate the "type" of each of the participants in an association definition. If $R$ is an association definition with $k$ participants, then for all $i$, with $1 \le i \le k$, if the $i$-th participant is defined to be of class $C_i$, then the following formula is incorporated in the theory of a subsystem Sub using $R$:

$\Box[\mathit{Sub} \rightarrow [\forall x_1, \ldots, x_k : R(x_1, \ldots, x_k) \rightarrow C_i(x_i)]]$
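So, for our sample association Link, some of the formulae that will be implicitly included in the theory of SubNet are the following:

$\Box[\mathit{SubNet} \rightarrow [\forall s, t : \mathit{Link}(s, t) \rightarrow \mathit{Gateway}(s)]]$,
$\Box[\mathit{SubNet} \rightarrow [\forall s, t : \mathit{Link}(s, t) \rightarrow \mathit{Unit}(t)]]$,
$\Box[\mathit{SubNet} \rightarrow [\forall s, t : \mathit{Link}(s, t) \rightarrow (t.\textit{curr-out} \neq \mathit{null}) \rightarrow (s.\textit{int-in} = \mathrm{T})]]$,
$\Box[\mathit{SubNet} \rightarrow [\forall s, t : \mathit{Link}(s, t) \rightarrow (s.\textit{int-out} = \mathrm{T}) \rightarrow (t.\textit{curr-in} = \mathit{null})]]$.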
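The generation of these implicit formulae is mechanical, and can be sketched as plain string manipulation. The function name `association_axioms` and the textual rendering of the formulae are assumptions made for illustration; the sketch merely wraps each participant type and each connection in the guard described above.

```python
from typing import List, Sequence, Tuple


def association_axioms(subsystem: str,
                       assoc: str,
                       participants: Sequence[Tuple[str, str]],
                       connections: Sequence[str]) -> List[str]:
    """Return the typing formulae followed by the interaction formulae.

    participants: (variable, class) pairs, in declaration order.
    connections:  connection formulae over the participant variables.
    """
    variables = ",".join(var for var, _ in participants)
    guard = f"{assoc}({variables})"

    def wrap(body: str) -> str:
        # □[Sub → [∀x1,...,xk : R(x1,...,xk) → body]]
        return f"□[{subsystem} → [∀{variables} : {guard} → {body}]]"

    typing_axioms = [wrap(f"{cls}({var})") for var, cls in participants]
    interaction_axioms = [wrap(c) for c in connections]
    return typing_axioms + interaction_axioms


link_axioms = association_axioms(
    "SubNet",
    "Link",
    [("s", "Gateway"), ("t", "Unit")],
    ["(t.curr-out ≠ null) → (s.int-in = T)",
     "(s.int-out = T) → (t.curr-in = null)"],
)
for axiom in link_axioms:
    print(axiom)
```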
1.3.9.3. Other Properties of Subsystems

There might be a number of other implicit axioms to include in a subsystem. These might be related to general assumptions or properties of the application domain. For instance, the locality of the subsystem [2] is generally assumed to be a property of subsystems, and therefore might be implicitly included in the theory describing
any subsystem. Axioms typing the associations can also be considered as general assumptions. We consider for our sample specification three general assumptions easily expressible in the logic, namely: (i) that nothing can be at the same time an instance of two different classes, (ii) that operations of "dead" instances cannot take place, and (iii) that a subsystem may evolve only by means of its own operations (locality of a subsystem).

1.3.10. Higher Level Subsystems
Since the semantics of subsystems (i.e., component aggregations) is defined in terms of conventional temporal theories, as for classes, there is no technical restriction on iterating the process of defining aggregations. Therefore, we can permit subsystems not only to be composed of instances of classes, but also to subsume instances of simpler subsystems, thus allowing for hierarchical organisation of systems.
We have already defined several specification components, such as datatypes (particularly message and address), classes (Unit and Gateway), associations (Link) and a first subsystem (SubNet). We have made use of inheritance, in order to relate the definition of Gateway to Unit, and to be able to define Link polymorphically. All these combined constitute the specification of a large component of the overall system, encapsulated in subsystem SubNet. We now want to build a bigger system combining different instances of interrelated subnets, as specified in the problem description.
Clearly, the problem description calls for the specification of a high level subsystem. Let us refer to it as Net. A net is composed of a dynamic set of subnets, all of which are connected to a gateway of the net, which, in order not to confuse it with the gateways of the subnets, we are going to call the router. Certainly, we will need reconfiguration operations in Net in order to manage the population of subnets within it. Again, as we did with the SubNet subsystem, we make the design decision of representing the router as an attribute. This will allow us to make nets communicate with other nets in the environment (although we will not be concerned with the specification of this). The operations of a net are the following:
• An initialisation operation, n-init(address), whose argument will be used to set the address of the router in the net.
• An operation for adding new subnets, add-subnet(NAME, address), whose first argument represents the identifier of the subnet, and whose second argument represents the address that will be used to set the gateway of the new subnet.
• An operation for deleting existing subnets, rem-subnet(NAME), whose argument is the identifier of the live subnet to be removed from the net.
Relating the router to the live subnets within the net requires the definition of another association, this time a high level one. A problem arises when defining this association. Since Gateway is used within SubNet, the part of the language of SubNet which originated in Gateway (and has been relativised once because of this) will be relativised again when using it in a higher level subsystem,
say Net. Therefore, predicates such as Gateway(NAME) will be transformed by adding to them an extra NAME-typed argument; for instance, Gateway(NAME) will become Gateway(NAME, NAME), where the last argument refers to the subnet to which the gateway belongs. But for the router, we simply want a gateway, not a gateway within the scope of any subnet. Note that we cannot directly use Gateway(NAME), since including the theory of Gateway relativised only once directly into the theory of Net will cause a clash of names between the doubly relativised Gateway (coming from SubNet) and the relativised Gateway (that we want to include directly into Net). In order to represent the router, we will use a renamed copy of Gateway. Let Gateway' be a theory presentation isomorphic to Gateway, obtained by "priming" the signature of Gateway, and appropriately translating the axioms.
The definition of association S-Link, whose purpose is to relate the router of a net with the gateways of the subnets within it, is shown in Fig. 1.6. Note how "multiple dots" in the dot notation we borrowed from object orientation are used. The intuitive reading is the top-down navigation from subsystem attributes towards their constituents. Recall that the multiple dots actually mean multiple NAME-typed arguments in the corresponding expression. For example, the term t.gw.curr-out is a more readable version of curr-out(gw, t). This new NAME-typed parameter in the attributes, read variables and actions originating in SubNet corresponds to a new relativisation, this time of the SubNet subsystem (which, we already know, contains the relativisation of the class definitions in the lower layer). Note how similar the S-Link and Link associations are. Basically, the connections we employ in S-Link are the same as the ones in Link, except that this time we need to navigate into the subnet t to reach its gateway.
Perhaps now it is clearer why we represented the gateway as a special attribute of SubNet: to facilitate the communication. An informal graphical description of a particular state of a Net is shown in Fig. 1.7. We have depicted the router (labelled rt) and five connected subnets, one of which shows its internal structure in terms of units.
The Net specification is shown in Fig. 1.8. Subsystem Net has only one attribute, of type NAME, the router (rt). The router is meant to be a gateway (an instance of Gateway') through which the subnets within the net communicate with the outside world. The netmask of rt represents the netmask of the net, and therefore must subsume the netmasks of its live subnets. Axioms 1-3 and 8 specify how the subnets are connected to the router, again in a star topology. Also, they indicate that rt is the only instance of type Gateway'. Note that, since the types of the participants of association S-Link are different, we do not need an anti-reflexivity axiom (as we did in SubNet). Axiom 4 indicates that the router does not change throughout the lifetime of the net. Axiom 5 says that the netmasks of different subnets within the net are "disjoint". This is a nice example of a formula relating instances at different layers of the language. Also, note the simplicity with which we have captured a rather complex constraint. Axiom 6 is another example of this kind. It specifies that the netmask of the router subsumes the netmasks of the live subnets within the net. Axioms 7 and 9 specify the intended behaviour of the n-init(address) operation. Axioms 10-12 and 13-15 specify the intended behaviour of the operations that
manage the population of subnets in the net, i.e., add-subnet(NAME, address) and rem-subnet(NAME). Note that Axioms 12 and 15 are stronger locality constraints. Finally, Axiom 16 complements the definition of association S-Link, by imposing that rt.int-get'(x) cannot occur spontaneously, but only due to the occurrence of n.gw.send(x), for some live subnet n linked to the router rt.
Let us summarise the theories involved in this specification, and how they are related. Besides the class definitions Unit and Gateway, a new actor is involved, namely Gateway'. The different relationships between theories in the specification of Net are shown in Fig. 1.9. Theory $\mathit{ADT}_{\mathit{NAME}}$ represents the conservative extension of $\mathit{ADT}$ with sort NAME and a sufficiently large set of constants of this sort. Arrows labelled with id indicate that the inclusion of the source theory into the target does not require any changes in the language of the source. Arrows labelled with rel indicate that the inclusion of the source theory into the target requires a relativisation of the language of the source. We can also see the double headed arrow relating Gateway and Gateway', indicating the existence of an isomorphism between them (the translation involved is the "priming"). We can take advantage of the structurality rule, which allows us to translate proofs of theorems within a theory into proofs of their corresponding relativisations, in order to reuse reasoning, or to reduce reasoning in some complex theories to reasoning within the smaller theories from which they are composed.

1.3.11. Some Properties of Net
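We can employ the proof calculus for the logic in order to prove properties of the Net subsystem. Some of the properties that we have attempted to demonstrate are:

(1) "Different units do not share addresses":
$\Box[\forall s_1, s_2, n_1, n_2 : \mathit{Net} \land \mathit{Unit}(n_1, s_1) \land \mathit{Unit}(n_2, s_2) \land (s_1 \neq s_2 \lor n_1 \neq n_2) \rightarrow s_1.n_1.\mathit{addr} \neq s_2.n_2.\mathit{addr}]$

(2) "Gateways are initialised with valid addresses":
$\Box[\forall s : \forall x \in \mathit{address} : \mathit{Net} \land \mathit{SubNet}(s) \land s.\mathit{gw}.\textit{u-init}(x) \rightarrow \mathit{rt}.\mathit{netmask}'(x)]$

(3) "Messages are not lost":
$\Box[\forall s_1, s_2, n_1, n_2 : \forall x \in \mathit{message} : \mathit{Net} \land \mathit{SubNet}(s_1) \land \mathit{SubNet}(s_2) \land s_1.\mathit{Unit}(n_1) \land s_2.\mathit{Unit}(n_2) \land s_1.n_1.\mathit{send}(x) \land \mathit{dest}(x) = n_2 \rightarrow$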
This property is characterised, within the language of subnets, as follows:

$\Box[\forall n_1, n_2 : \forall x \in \mathit{message} : \mathit{SubNet} \land \mathit{Unit}(n_1) \land \mathit{Unit}(n_2) \land n_1.\mathit{send}(x) \land (\mathit{dest}(x) = n_2) \rightarrow \Diamond\, n_2.\mathit{get}(x)]$

and can be proved straightforwardly by assuming the following extra constraints:
• $n_2$ is not gw; if $n_2$ is gw, then it will consume the message from its internal interface, i.e., via int-get rather than get,
• $n_2$ should not be deleted; this can be characterised easily by $\Box\,\mathit{Unit}(n_2)$. (There exist weaker conditions that ensure that $n_2$ is not deleted before the message is received; we have chosen this one because it simplifies our proof.)
• the subnet should not be deleted; this can be characterised easily by $\Box\,\mathit{SubNet}$. (Again, there exist weaker conditions that ensure that the subnet is not deleted before the sent message is received; we have chosen this one because it simplifies our proof.)
• if the gateway gw holds a message addressed to an internal unit of the subnet, then that message is eventually dispatched; this can be characterised as follows:
$\Box[\forall x \in \mathit{message} : \mathit{SubNet} \land (\mathit{gw}.\textit{int-curr-out} = x) \land (x \neq \mathit{null}) \rightarrow \Diamond\, \mathit{gw}.\textit{int-send}(x)]$
• if the gateway gw gets a message from its internal interface, addressed to its internal interface, then it eventually forwards it; this can be characterised as follows:
$\Box[\forall x \in \mathit{message} : \mathit{SubNet} \land (\mathit{gw}.\textit{int-curr-in} = x) \land (x \neq \mathit{null}) \land \mathit{gw}.\mathit{netmask}(\mathit{dest}(x)) \land (\mathit{dest}(x) \neq \mathit{gw}.\mathit{addr}) \rightarrow \Diamond\, \mathit{gw}.\textit{int-prod}(x)]$
Note that we have as part of the antecedent in this requirement the conjunct $(\mathit{dest}(x) \neq \mathit{gw}.\mathit{addr})$, since if the destination of $x$ is gw, the message should be consumed by gw instead of being forwarded; also, we have characterised the "forwarding" of the message as its production towards the internal interface.

1.4. Conclusions
We have presented a prototypical language for the specification of component-based systems, with special support for reconfiguration. The way in which communication between components is achieved in the language (by means of dynamic connectors), standard in ADLs, allows it to express properties concerning the architecture of the system in a declarative way. Hence, operations that may change the topology of the system can be easily specified in the formalism. The presented prototypical specification language allows for the specification of reconfigurable component-based systems by defining:
• basic datatypes, used then as types of variables of components,
• templates of basic components (components whose internal state is not composed of other components), called class definitions,
• templates of connectors, called associations,
• templates of complex components, whose internal state might be defined by a dynamic set of simpler components, interacting by means of connectors; both the population of components and the population of connectors can be dynamically changed in a subsystem, by means of reconfiguration operations.
The language also allows for the definition of inheritance relationships between component definitions. This is done for the purpose of defining polymorphic reconfiguration operations at the level of subsystems. Both the definitions of subsystems and associations are general enough to cope with complex components; subsystems can then be defined out of instances of simpler subsystems, similar to the situation in Darwin, although recursion is not allowed (due to the way a subsystem's semantics is defined); associations can also be defined to relate instances of subsystems.
The semantics of the language is defined in terms of a logic based on a suitable variant of an existing and widely known first-order linear temporal logic, the Manna-Pnueli logic. This variant generalises the symbols whose interpretation is state dependent (in the modal sense) to general predicate and function symbols. A sound proof calculus is available for this logic, and admits a derived proof rule that helps us in our proofs regarding reconfigurable component-based systems.
We have been greatly motivated by the possibility of describing dynamic software architectures in a declarative way. We have attempted to maintain some interesting features of certain ADLs, particularly declarativeness (as in Acme), hierarchical composition of systems (as in Darwin) and designs at a high level of abstraction (as in CommUnity). The main features the presented specification language exhibits are:
• A new modularity mechanism, the subsystem, which emerged as a consequence of our representation of component interaction.
• The use of inheritance and the associated polymorphism, attempting to mimic the situation of object oriented languages, in the context of software architectures. We have been careful not to bring into the language the complexities of fully fledged object oriented languages, and have kept the organisation of specifications hierarchical.
• The use of classes and subsystems to build incremental specifications hierarchically. This was possible thanks to the representation of connections by means of coordination mechanisms, as in CommUnity.
Although we have not presented it in this article, we have defined a representation of our specifications as STeP [7] specifications. STeP is a tool for assistance in the specification and verification of concurrent programs based on the Manna-Pnueli logic. Although we have encoded the more general flexible symbols of the logic we use into those of Manna-Pnueli, the use of the STeP tool for assistance in reasoning about our specifications (which involves these encodings) becomes unreasonably complicated for medium size specifications (as, for instance, our specification of nets). We are currently trying to overcome this difficulty by incorporating support for general flexible symbols and the rule of structurality into the STeP tool.
We are planning to work on the development of an ADL using the ideas presented, and to learn from the positive and negative aspects of other existing ADLs and environments for software architectures.
Bibliography
1. M. Abadi, The Power of Temporal Proofs, Theoretical Computer Science 65:35-84.
2. N. Aguirre and T. Maibaum, A Logical Basis for the Specification of Reconfigurable Component-Based Systems, in Proceedings of Fundamental Approaches to Software Engineering FASE 2003, Warsaw, Poland, LNCS 2621, Springer, 2003.
3. R. Allen, R. Douence and D. Garlan, Specifying and Analyzing Dynamic Software Architectures, in Proceedings of Fundamental Approaches to Software Engineering FASE 98, Lisbon, Portugal, Lecture Notes in Computer Science, Springer-Verlag, 1998.
4. R. Allen and D. Garlan, Formalizing Architectural Connection, in Proceedings ICSE '94, Sorrento, Italy, 1994.
5. L. Andrade and J. Fiadeiro, Interconnecting Objects via Contracts, in UML'99 - Beyond the Standard, R. France and B. Rumpe (eds), LNCS 1723, Springer-Verlag, 1999.
6. L. Andrade and J. Fiadeiro, Service-oriented Business and System Specification: Beyond Object-orientation, in Practical Foundations of Business System Specifications, H. Kilov (ed.), Kluwer Academic Publishers, 2003.
7. N. Bjorner, A. Browne, M. Colon, B. Finkbeiner, Z. Manna, M. Pichora, H. Sipma and T. Uribe, STeP, The Stanford Temporal Prover, User's Manual, Computer Science Department, Stanford University, 1998.
8. C. Britton, IT Architectures and Middleware: Strategies for Building Large, Integrated Systems, Addison-Wesley, 2000.
9. S. Cook and J. Daniels, Designing Object Systems: Object-Oriented Modelling with Syntropy, The Object-Oriented Series, Prentice-Hall, 1994.
10. H. Ehrig and B. Mahr, Fundamentals of Algebraic Specification 1: Equations and Initial Semantics, volume 6 of EATCS Monographs on Theoretical Computer Science, Springer, 1985.
11. J. Fiadeiro and T. Maibaum, Temporal Theories as Modularisation Units for Concurrent System Specification, Formal Aspects of Computing, vol. 4, No. 3, Springer, 1992.
12. J. Fiadeiro and T. Maibaum, Sometimes "Tomorrow" is "Sometime", Action Refinement in a Temporal Logic of Objects, in D. Gabbay and H. Ohlbach (eds), Temporal Logic, LNAI 827, Springer-Verlag, 1994.
13. J. Fiadeiro and T. Maibaum, Categorical Semantics of Parallel Program Design, Science of Computer Programming 28(2-3), 1997.
14. E. Gamma, R. Helm, R. Johnson and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.
15. D. Garlan, Software Architecture: A Roadmap, in The Future of Software Engineering, A. Finkelstein (ed), ACM Press, 2000.
16. D. Garlan, R. Monroe and D. Wile, ACME: An Architecture Description Interchange Language, in Proc. of CASCON'97, 1997.
17. D. Garlan, R. Monroe and D. Wile, ACME: Architectural Description of Component-Based Systems, in Foundations of Component-Based Systems, Gary T. Leavens and Murali Sitaraman (eds), Cambridge University Press, 2000.
18. P. Inverardi and A. Wolf, Formal Specification and Analysis of Software Architectures using the Chemical Abstract Machine, IEEE Transactions on Software Engineering, 1995.
19. J. Magee and J. Kramer, Dynamic Structure in Software Architectures, in Proceedings of the 4th ACM SIGSOFT Symposium on Foundations of Software Engineering, San Francisco, California, USA, 1996.
20. Z. Manna and A. Pnueli, Verification of Concurrent Programs: A Temporal Proof System, Technical Report No. STAN-CS-83-967, Department of Computer Science, Stanford University, 1983.
21. Z. Manna and A. Pnueli, The Temporal Logic of Reactive and Concurrent Systems, Springer-Verlag, 1991.
22. N. Medvidovic, ADLs and Dynamic Architecture Changes, in Proceedings of the Second Int. Software Architecture Workshop (ISAW-2), 1996.
23. N. Medvidovic and R. Taylor, A Framework for Classifying and Comparing Architecture Description Languages, in ESEC-FSE'97, 1997.
24. D. Parnas, A Technique for Software Module Specification with Examples, in Communications of the ACM, 15(5), 1972.
25. D. Parnas, On the Criteria to be Used in Decomposing Systems into Modules, in Communications of the ACM, 15(12), 1972.
26. J. Rumbaugh, Relations as Semantic Constructs in an Object-Oriented Language, in Proceedings of OOPSLA '87, Orlando, USA, 1987.
27. M. Wermelinger, A. Lopes and J. Fiadeiro, A Graph Based Architectural (Re)configuration Language, in ESEC/FSE'01, V. Gruhn (ed), ACM Press, 2001.
28. M. Wermelinger and J. Fiadeiro, A Graph Transformation Approach to Software Architecture Reconfiguration, in Science of Computer Programming 44, Elsevier, 2002.
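Class Unit
Exports curr-in, curr-out, u-init(address), prod(message), send(message), get(message), cons(message), rem()
Read Variables in : boolean, out : boolean
Attributes addr : address, curr-in : message, curr-out : message
Actions u-init(address), prod(message), send(message), get(message), cons(message), rem()
Axioms
(1) $\Box[\forall x \in \mathit{address} : \mathit{Unit} \land \textit{u-init}(x) \rightarrow \bigcirc((\mathit{addr} = x) \land (\textit{curr-in} = \mathit{null}) \land (\textit{curr-out} = \mathit{null}))]$
(2) $\Box[\forall x \in \mathit{address} : \mathit{Unit} \land \textit{u-init}(x) \rightarrow \bigcirc(\lnot(\exists y \in \mathit{address} : \textit{u-init}(y)) \;\mathcal{W}\; \lnot\mathit{Unit})]$
(3) $\Box[\mathit{Unit} \land (\mathit{addr} \neq \bigcirc\mathit{addr}) \rightarrow \exists x \in \mathit{address} : \textit{u-init}(x)]$
(4) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{get}(x) \rightarrow ((\mathit{in} = \mathrm{T}) \land (\textit{curr-in} = \mathit{null}))]$
(5) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{get}(x) \rightarrow \bigcirc(\textit{curr-in} = x)]$
(6) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{cons}(x) \rightarrow ((\textit{curr-in} \neq \mathit{null}) \land (\textit{curr-in} = x) \land (\mathit{dest}(x) = \mathit{addr}))]$
(7) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{cons}(x) \rightarrow \bigcirc(\textit{curr-in} = \mathit{null})]$
(8) $\Box[\mathit{Unit} \land \mathit{rem}() \rightarrow ((\textit{curr-in} \neq \mathit{null}) \land (\mathit{dest}(\textit{curr-in}) \neq \mathit{addr}))]$
(9) $\Box[\mathit{Unit} \land \mathit{rem}() \rightarrow \bigcirc(\textit{curr-in} = \mathit{null})]$
(10) $\Box[\mathit{Unit} \land (\textit{curr-in} \neq \bigcirc\textit{curr-in}) \rightarrow (\exists x \in \mathit{message} : \mathit{get}(x) \lor \mathit{cons}(x)) \lor \mathit{rem}()]$
(11) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{send}(x) \rightarrow ((\mathit{out} = \mathrm{T}) \land (\textit{curr-out} = x))]$
(12) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{send}(x) \rightarrow \bigcirc(\textit{curr-out} = \mathit{null})]$
(13) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{prod}(x) \rightarrow (\textit{curr-out} = \mathit{null})]$
(14) $\Box[\forall x \in \mathit{message} : \mathit{Unit} \land \mathit{prod}(x) \rightarrow \bigcirc(\textit{curr-out} = x)]$
(15) $\Box[\mathit{Unit} \land (\textit{curr-out} \neq \bigcirc\textit{curr-out}) \rightarrow \exists x \in \mathit{message} : \mathit{send}(x) \lor \mathit{prod}(x)]$
EndofClass

Fig. 1.2. Class Unit: A specification of a component that interchanges messages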
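One informal way to read the axioms of Fig. 1.2 is as pre- and postconditions of the actions. The following sketch is a hypothetical executable reading of Axioms 1, 2, 4-9 and 11-14, not the semantics of the logic itself: the class `UnitModel`, the record `Message` and the parameters `in_ready` and `out_ready` (standing for the read variables in and out) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    dest: int      # plays the role of dest(x)
    payload: str


class UnitModel:
    """State of one unit; None plays the role of null."""

    def __init__(self) -> None:
        self.addr: Optional[int] = None
        self.curr_in: Optional[Message] = None
        self.curr_out: Optional[Message] = None
        self.initialised = False

    def u_init(self, address: int) -> None:
        if self.initialised:                      # Axiom 2: no second u-init
            raise ValueError("u-init already occurred")
        self.addr = address                       # Axiom 1
        self.curr_in = None
        self.curr_out = None
        self.initialised = True

    def get(self, x: Message, in_ready: bool = True) -> None:
        if not in_ready or self.curr_in is not None:    # Axiom 4
            raise ValueError("get not enabled")
        self.curr_in = x                                # Axiom 5

    def cons(self, x: Message) -> None:
        if self.curr_in is None or self.curr_in != x or x.dest != self.addr:  # Axiom 6
            raise ValueError("cons not enabled")
        self.curr_in = None                             # Axiom 7

    def rem(self) -> None:
        if self.curr_in is None or self.curr_in.dest == self.addr:  # Axiom 8
            raise ValueError("rem not enabled")
        self.curr_in = None                             # Axiom 9

    def send(self, x: Message, out_ready: bool = True) -> None:
        if not out_ready or self.curr_out != x:         # Axiom 11
            raise ValueError("send not enabled")
        self.curr_out = None                            # Axiom 12

    def prod(self, x: Message) -> None:
        if self.curr_out is not None:                   # Axiom 13
            raise ValueError("prod not enabled")
        self.curr_out = x                               # Axiom 14


unit = UnitModel()
unit.u_init(7)
msg = Message(dest=7, payload="hello")
unit.get(msg)     # the message arrives at the unit
unit.cons(msg)    # it is addressed to this unit, so it can be consumed
```

The locality axioms (3, 10 and 15) are respected by construction here, since the attributes are only modified inside the corresponding actions.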
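Class Gateway Extends Unit
Exports curr-in, curr-out, int-curr-in, int-curr-out, u-init(address), prod(message), send(message), get(message), cons(message), rem(), int-prod(message), int-send(message), int-get(message), int-cons(message), int-rem()
Read Variables int-in : boolean, int-out : boolean
Attributes netmask : address → boolean, int-curr-in : message, int-curr-out : message
Actions int-prod(message), int-send(message), int-get(message), int-cons(message), int-rem()
Axioms
(1) $\Box[\mathit{Gateway} \rightarrow \mathit{netmask}(\mathit{addr})]$
(2) $\Box[\forall x \in \mathit{address} : \mathit{netmask}(x) \leftrightarrow \bigcirc\mathit{netmask}(x)]$
(3) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-get}(x) \rightarrow ((\textit{int-in} = \mathrm{T}) \land (\textit{int-curr-in} = \mathit{null}))]$
(4) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-get}(x) \rightarrow \bigcirc(\textit{int-curr-in} = x)]$
(5) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-cons}(x) \rightarrow ((\textit{int-curr-in} \neq \mathit{null}) \land (\textit{int-curr-in} = x) \land (\mathit{dest}(x) = \mathit{address}))]$
(6) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-cons}(x) \rightarrow \bigcirc(\textit{int-curr-in} = \mathit{null})]$
(7) $\Box[\mathit{Gateway} \land \textit{int-rem}() \rightarrow ((\textit{int-curr-in} \neq \mathit{null}) \land (\mathit{dest}(\textit{int-curr-in}) \neq \mathit{address}))]$
(8) $\Box[\mathit{Gateway} \land \textit{int-rem}() \rightarrow \bigcirc(\textit{int-curr-in} = \mathit{null})]$
(9) $\Box[\mathit{Gateway} \land (\textit{int-curr-in} \neq \bigcirc\textit{int-curr-in}) \rightarrow (\exists x \in \mathit{message} : \textit{int-get}(x) \lor \textit{int-cons}(x)) \lor \textit{int-rem}()]$
(10) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-send}(x) \rightarrow ((\textit{int-out} = \mathrm{T}) \land (\textit{int-curr-out} = x))]$
(11) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-send}(x) \rightarrow \bigcirc(\textit{int-curr-out} = \mathit{null})]$
(12) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-prod}(x) \rightarrow (\textit{int-curr-out} = \mathit{null})]$
(13) $\Box[\forall x \in \mathit{message} : \mathit{Gateway} \land \textit{int-prod}(x) \rightarrow \bigcirc(\textit{int-curr-out} = x)]$
(14) $\Box[\mathit{Gateway} \land (\textit{int-curr-out} \neq \bigcirc\textit{int-curr-out}) \rightarrow \exists x \in \mathit{message} : \textit{int-send}(x) \lor \textit{int-prod}(x)]$
(15) $\Box[\mathit{Gateway} \land \mathit{rem}() \land \mathit{netmask}(\mathit{dest}(\textit{curr-in})) \rightarrow \textit{int-prod}(\textit{curr-in})]$
(16) $\Box[\mathit{Gateway} \land \textit{int-rem}() \land \mathit{netmask}(\mathit{dest}(\textit{int-curr-in})) \rightarrow \textit{int-prod}(\textit{int-curr-in})]$
(17) $\Box[\mathit{Gateway} \land \textit{int-rem}() \land \lnot\mathit{netmask}(\mathit{dest}(\textit{int-curr-in})) \rightarrow \textit{prod}(\textit{int-curr-in})]$
EndofClass

Fig. 1.3. Class Gateway: A specification of a component that forwards messages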
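Axioms 15-17 of Fig. 1.3 describe where a gateway forwards a message that it removes from one of its two input slots. The decision can be sketched as two small functions; the function names and the representation of the netmask as a Python predicate over integer addresses are assumptions made for illustration.

```python
from typing import Callable, Optional

Netmask = Callable[[int], bool]


def action_with_rem(netmask: Netmask, dest: int) -> Optional[str]:
    """Action that must accompany rem() on the external slot (Axiom 15).

    The axioms impose no forwarding obligation when the destination lies
    outside the netmask, which is represented here by None.
    """
    return "int-prod" if netmask(dest) else None


def action_with_int_rem(netmask: Netmask, dest: int) -> str:
    """Action that must accompany int-rem() on the internal slot (Axioms 16-17)."""
    return "int-prod" if netmask(dest) else "prod"


# Hypothetical subnet whose addresses are 10..19.
in_subnet = lambda address: 10 <= address < 20
print(action_with_int_rem(in_subnet, 12))   # destination inside the subnet
print(action_with_int_rem(in_subnet, 42))   # destination outside the subnet
```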
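Association Link
Participants s : Gateway, t : Unit
Connections
$(t.\textit{curr-out} \neq \mathit{null}) \rightarrow (s.\textit{int-in} = \mathrm{T})$
$(s.\textit{int-out} = \mathrm{T}) \rightarrow (t.\textit{curr-in} = \mathit{null})$
$(t.\mathit{in} = \mathrm{T}) \leftrightarrow (s.\textit{int-curr-out} \neq \mathit{null})$
$(t.\mathit{out} = \mathrm{T}) \leftrightarrow (s.\textit{int-curr-in} = \mathit{null})$
$\forall m \in \mathit{message} : (s.\textit{int-send}(m) \rightarrow t.\mathit{get}(m))$
$\forall m \in \mathit{message} : (t.\mathit{send}(m) \rightarrow s.\textit{int-get}(m))$
EndofAssociation

Fig. 1.4. An association defined to enable communication between gateways and units within subnets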
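Subsystem SubNet
Attributes gw : NAME
Operations s-init(address), add-unit(NAME, address), rem-unit(NAME)
Axioms
(1) $\Box[\forall n, m : \mathit{SubNet} \land \mathit{Link}(n, m) \rightarrow (n = \mathit{gw})]$
(2) $\Box[\forall n : \mathit{SubNet} \land \mathit{Unit}(n) \land (n \neq \mathit{gw}) \rightarrow \mathit{Link}(\mathit{gw}, n)]$
(3) $\Box[\mathit{SubNet} \rightarrow \forall n : \lnot\mathit{Link}(n, n)]$
(4) $\Box[\mathit{SubNet} \rightarrow \mathit{Gateway}(\mathit{gw})]$
(5) $\Box[\forall n : \mathit{SubNet} \land (\mathit{gw} = n) \rightarrow ((\mathit{gw} = n) \;\mathcal{W}\; \lnot\mathit{SubNet})]$
(6) $\Box[\forall n, m : \mathit{SubNet} \land \mathit{Unit}(n) \land \mathit{Unit}(m) \land (n \neq m) \rightarrow (n.\mathit{addr} \neq m.\mathit{addr})]$
(7) $\Box[\forall n : \mathit{SubNet} \land \mathit{Unit}(n) \rightarrow \mathit{gw}.\mathit{netmask}(n.\mathit{addr})]$
(8) $\Box[\forall x \in \mathit{address} : \mathit{SubNet} \land \textit{s-init}(x) \rightarrow \mathit{gw}.\textit{u-init}(x)]$
(9) $\Box[\forall x \in \mathit{address} : \mathit{SubNet} \land \textit{s-init}(x) \rightarrow \bigcirc(\forall n : \mathit{Unit}(n) \rightarrow (n = \mathit{gw}))]$
(10) $\Box[\forall n : \forall x \in \mathit{address} : \mathit{SubNet} \land \textit{add-unit}(n, x) \rightarrow \lnot\mathit{Unit}(n)]$
(11) $\Box[\forall n : \forall x \in \mathit{address} : \mathit{SubNet} \land \textit{add-unit}(n, x) \rightarrow \bigcirc(\mathit{Unit}(n) \land n.\textit{u-init}(x))]$
(12) $\Box[\forall n : \mathit{SubNet} \land (n \neq \mathit{gw}) \land \lnot\mathit{Unit}(n) \land \bigcirc(\mathit{Unit}(n)) \rightarrow \exists x \in \mathit{address} : \textit{add-unit}(n, x)]$
(13) $\Box[\forall n : \mathit{SubNet} \land \textit{rem-unit}(n) \rightarrow \mathit{Unit}(n)]$
(14) $\Box[\forall n : \mathit{SubNet} \land \textit{rem-unit}(n) \rightarrow \bigcirc(\lnot\mathit{Unit}(n))]$
(15) $\Box[\forall n : \mathit{SubNet} \land \mathit{Unit}(n) \land \bigcirc(\lnot\mathit{Unit}(n)) \rightarrow \textit{rem-unit}(n)]$
(16) $\Box[\forall x \in \mathit{message} : \mathit{SubNet} \land \mathit{gw}.\textit{int-get}(x) \rightarrow \exists n : \mathit{Link}(\mathit{gw}, n) \land n.\mathit{send}(x)]$
EndofSubsystem

Fig. 1.5. SubNet: A subsystem specification aggregating units and gateways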
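The state constraints of Fig. 1.5 (Axioms 1-4, 6 and 7) can be checked on a single configuration of a subnet. The following sketch assumes a hypothetical snapshot record `SubNetState` with integer addresses and a netmask given as a predicate; it checks one state only and says nothing about the temporal axioms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class SubNetState:
    gw: str                              # name held by attribute gw
    units: Dict[str, int]                # live units (gateway included) and their addresses
    gateways: Set[str]                   # names of live instances of Gateway
    links: Set[Tuple[str, str]]          # pairs related by Link
    netmask: Callable[[int], bool]       # netmask of gw


def violated_axioms(s: SubNetState) -> List[int]:
    """Return the numbers of the state axioms of SubNet violated in s."""
    bad = []
    if any(n != s.gw for n, _ in s.links):                               # Axiom 1
        bad.append(1)
    if any(n != s.gw and (s.gw, n) not in s.links for n in s.units):     # Axiom 2
        bad.append(2)
    if any(n == m for n, m in s.links):                                  # Axiom 3
        bad.append(3)
    if s.gw not in s.gateways:                                           # Axiom 4
        bad.append(4)
    addresses = list(s.units.values())
    if len(addresses) != len(set(addresses)):                            # Axiom 6
        bad.append(6)
    if not all(s.netmask(a) for a in addresses):                         # Axiom 7
        bad.append(7)
    return bad


star = SubNetState(
    gw="gw",
    units={"gw": 10, "u1": 11, "u2": 12},
    gateways={"gw"},
    links={("gw", "u1"), ("gw", "u2")},
    netmask=lambda a: 10 <= a < 20,
)
print(violated_axioms(star))
```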
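Association S-Link
Participants s : Gateway', t : SubNet
Connections
$(t.\mathit{gw}.\textit{curr-out} \neq \mathit{null}) \rightarrow (s.\textit{int-in}' = \mathrm{T})$
$(s.\textit{int-out}' = \mathrm{T}) \rightarrow (t.\mathit{gw}.\textit{curr-in} = \mathit{null})$
$(t.\mathit{gw}.\mathit{in} = \mathrm{T}) \leftrightarrow (s.\textit{int-curr-out}' \neq \mathit{null})$
$(t.\mathit{gw}.\mathit{out} = \mathrm{T}) \leftrightarrow (s.\textit{int-curr-in}' = \mathit{null})$
$\forall m \in \mathit{message} : (s.\textit{int-send}'(m) \rightarrow t.\mathit{gw}.\mathit{get}(m))$
$\forall m \in \mathit{message} : (t.\mathit{gw}.\mathit{send}(m) \rightarrow s.\textit{int-get}'(m))$
EndofAssociation

Fig. 1.6. An association defined for communication between routers with subnets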
Fig. 1.7.
Graphical representation of a configuration of subsystem Net
Subsystem Net
Attributes
  rt: NAME
Operations
  n-init(address)
  add-subnet(NAME, address)
  rem-subnet(NAME)
Axioms
  (1) □[∀n, m : Net ∧ S-Link(n, m) → (n = rt)]
  (2) □[∀n : Net ∧ SubNet(n) → S-Link(rt, n)]
  (3) □[Net → Gateway'(rt)]
  (4) □[∀n : Net ∧ (rt = n) → ((rt = n) W ¬Net)]
  (5) □[∀n, m : Net ∧ SubNet(n) ∧ SubNet(m) ∧ (n ≠ m) → (∀x ∈ address : ¬(n.gw.netmask(x) ∧ m.gw.netmask(x)))]
  (6) □[∀n : Net ∧ SubNet(n) → (∀x ∈ address : n.gw.netmask(x) → rt.netmask'(x))]
  (7) □[∀x ∈ address : Net ∧ n-init(x) → rt.u-init'(x)]
  (8) □[Net → (∀n : Gateway'(n) ↔ (n = rt))]
  (9) □[∀x ∈ address : Net ∧ n-init(x) → ○(∀n : ¬SubNet(n))]
  (10) □[∀n : ∀x ∈ address : Net ∧ add-subnet(n, x) → ¬SubNet(n)]
  (11) □[∀n : ∀x ∈ address : Net ∧ add-subnet(n, x) → ○(SubNet(n) ∧ n.s-init(x))]
  (12) □[∀n : Net ∧ ¬SubNet(n) ∧ ○(SubNet(n)) → ∃x ∈ address : add-subnet(n, x)]
  (13) □[∀n : Net ∧ rem-subnet(n) → SubNet(n)]
  (14) □[∀n : Net ∧ rem-subnet(n) → ○(¬SubNet(n))]
  (15) □[∀n : Net ∧ SubNet(n) ∧ ○(¬SubNet(n)) → rem-subnet(n)]
  (16) □[∀x ∈ message : Net ∧ rt.int-get'(x) → ∃n : S-Link(rt, n) ∧ n.gw.send(x)]
EndofSubsystem
Fig. 1.8. Net: A high level subsystem specification aggregating instances of subsystem SubNet
[Figure: diagram relating the theories Net, SubNet, Gateway', Gateway, Unit, and NAME]
Fig. 1.9. Relationships among theories of the specification of Net
Chapter 2 Coordinated Composition of Software Components
Farhad Arbab

Center for Mathematics and Computer Science (CWI)
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
[email protected]
www.cwi.nl/~farhad

Leiden Institute for Advanced Computer Science (LIACS)
Leiden University
Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
[email protected]

School of Computer Science, University of Waterloo
200 University Ave. West, Waterloo, Ontario N2L 3G1, Canada

Virtually all contemporary component models use object oriented method invocation as the foundation of their component composition semantics. However, two objects composed by method invocation do not yield a new object. This produces component models that are not closed under component composition; require complex composition operators; or, often, suffer from both of these deficiencies. Such object oriented component models are based on Abstract Data Types (ADT). In contrast, a simpler model of components and their composition exists that uses the notion of Abstract Behavior Types (ABT) as its foundation [3]. The ABT model supports a much looser coupling than is possible with the ADT model, and is inherently amenable to exogenous coordination. We have argued that both of these are highly desirable, if not essential, properties for models of components and their composition. Furthermore, the composition of two ABTs always yields another ABT, which in this model makes components closed under composition. In this chapter, we describe two concrete formalizations of the notion of Abstract Behavior Types as (1) constraint automata, and (2) relations on timed-data streams. To demonstrate the utility of the ABT model, we describe Reo: an exogenous coordination language for compositional construction of component connectors based on a calculus of channels. We show the expressive power of Reo, and the applicability of the ABT model, through a number of examples.
2.1. Introduction

The object oriented programming paradigm offers powerful abstraction and encapsulation mechanisms whose utilization has dramatically reduced the complexity of
software development. Without this reduction in complexity, the abundance of the sophisticated applications and systems of today would not have been practically viable. Although they vary in sometimes significant ways, all object oriented programming paradigms share the notion of Abstract Data Types (ADTs) as their common foundation. An ADT encapsulates some data structures and the procedures that manipulate them into a logically coherent abstraction that offers only a data type and a set of operations for its manipulation. For instance, a stack ADT offers only a set of operations (e.g., top, pop, push, and empty) for manipulating stacks, without giving any hint of what data structures are used to actually represent a stack, or what specific algorithms are used to manipulate such data structures in the implementation of those operations. The semantics of the operations in an ADT is specified in a set of axioms. The axioms define the semantics of the operations merely in terms of their mutual effects on each other. In the case of the stack, for instance, these axioms state that (1) empty performed on the empty stack yields true; and (2) empty applied to a stack obtained from a push operation on any stack yields false. They also state that (3) top applied to the empty stack yields an error; (4) top applied to a stack obtained from pushing a data item d onto some other stack, yields d; (5) popping a stack obtained from pushing a data item onto some other stack, s, yields s; (6) popping an empty stack yields an error; etc. The operational nature of the ADT interfaces manifests itself in the method invocation semantics of message passing in the object oriented paradigm. A message sent by one object to another is not merely a passive piece of structured data passed from one to the other. 
A message primarily invokes the operation encoded in a specific method of its recipient, and this semantics has certain implications for the flow of control within and through the two objects. The precise details of this (method invocation) semantics (e.g., synchronous vs. asynchronous, active vs. passive objects, etc.) still vary significantly, to the extent of incompatibility, across different languages.

Method invocation is the only way in the object oriented paradigm in which the behavior of objects can be combined into more complex software. Generally, when an object c sends a message m(p) to another object e, this implies that c invokes the method m of e with the actual parameters p. For this to happen:
• c must know (how to find) e;
• c must know the syntax and the semantics of the method m of e;
• e must (pretend to) perform the activated method m on parameters p, and return its result to c upon its completion (the "pretense" refers to when e delegates the actual execution of m to a third object); and
• c typically suspends between its sending of m and the receiving of its (perhaps null) result.

This implies a rather tight semantic coupling between the message sender and receiver objects, involving an asymmetric, unidirectional dependency. On the one hand, the methods provided by an object can be used by any other entity (that has access to it). On the other hand, an object internally decides what operation of what other objects it invokes. This puts users and providers in asymmetric roles.
Users internally make the decisions on what operations are to be performed, and generally rely on some specific semantics that they expect of these operations, while it is left to be the responsibility of the providers to carry out the decisions made by the users to satisfy their expectations.

Because of the intricate assumptions involved and the differences among various object oriented programming languages, it is generally not possible for an object in one language to directly invoke the methods of an object in another language. The intimate coupling inherent in method invocation becomes problematic when combining the behavior of software entities larger than single objects, even in the same language. Furthermore, a substantial body of useful software is written in non-object-oriented languages. On the other hand, regardless of the specific language(s) that various software entities or sub-systems are written in, there is a universal way in which they (potentially can) communicate with one another, as well as with other non-software entities in their environment: exchanges of neutral, pure data through simple input/output operations. These observations behoove us to adopt a definition for the term component that is less restrictive and more flexible than the presently popular notions of this term, which predominantly reflect the bias of an object-oriented heritage.

We have introduced the notion of Abstract Behavior Type (ABT) as a higher-level alternative to ADT to model components and their composition [3]. An ABT defines an abstract behavior as a constraining specification of the behavior of an entity by relating the contents and the relative timing of its observable input/output exchanges with its environment, without specifying any detail about the operations that may be used to implement such behavior or the data types it may manipulate for its realization.
The notion of ABT parallels the notion of ADT: an ADT abstracts away the data structures and the algorithmic instructions that manipulate them to offer a set of operations on an abstract data type. An ABT abstracts away the data types and the operations that manipulate them to offer an observable input/output behavior. Constraint automata [4] and a coalgebraic model of relations on timed-data-streams [8, 21] are two concrete formalizations of the notion of ABT. The ABT model supports a much looser coupling than is possible with the ADT's operational interface, and is inherently amenable to exogenous coordination. We have proposed that both of these are highly desirable, if not essential, properties for models of components and their composition. In this chapter, we define our notion of components and their composition in Section 2.2. In Section 2.3, we motivate these definitions in the context of a simple example. Sections 2.4 and 2.5 contain overviews of constraint automata and the coalgebraic model of ABT, respectively, each representing a formal model for defining the abstract behavior of individual components as well as their composition. While the ABT model provides a simple formal foundation for definition and composition of components, it does not provide primitives to directly express any form of non-trivial coordination. The latter requires an effective exogenous coordination model. Reo is a channel-based exogenous coordination model wherein complex coordinators, called connectors are compositionally built out of simpler ones [2, 7, 8]. A summary of Reo is presented in Section 2.6, with examples that show its flexibility and expressive power. In Section 2.7, we return to our example of Section 2.3
and show the construction of a Reo connector circuit for the coordinator glue code of this system. Finally, Section 2.8 contains the concluding remarks of this chapter.
2.2. Components and Their Composition

We define a component as a template, e.g., any executable code, whose concrete incarnations qualify as component instances. A component instance is a unique, identifiable execution-time collection that (1) includes at least one active entity; and (2) allows untargeted input/output of passive data as the only means of communication between the entities inside the component instance with any entity not in the same component instance. A component instance may include fragments or modules of sequential code, objects, or classes. An active entity is one that has its own independent thread of control. Examples of active entities include active objects, threads, processes, agents, or (other) component instances. No assumption is made about how the entities inside a component instance communicate with each other.

Each component instance has a number of "contact points" that are recognized by its environment for the purpose of information exchange. We refer to these contact points as the ports of a component instance. The I/O operations, e.g., read and write, used by (the entities inside) a component instance to communicate with its outside world, are performed on its ports. Without loss of generality, we assume ports are unidirectional, i.e., the information flows through a port in only one direction: either from the environment into its component instance (through read) or from its component instance to the environment (through write). Each I/O operation inherently synchronizes the entity that performs it with its environment: a write operation suspends until the environment accepts the data it has to offer through its respective port; likewise, a read operation suspends until the environment offers the suitable data it expects through its respective port.[a]
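The suspending behavior of port operations described above can be illustrated with a minimal Python sketch. The names `OutputPort`, `take`, `clock_component`, and `run_demo` are hypothetical and not part of the chapter; a thread plays the role of the active entity, and the main thread plays the role of the environment. The sketch assumes a single writer per port.

```python
import queue
import threading


class OutputPort:
    """Hypothetical output port: `write` suspends until the environment takes the item."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)

    def write(self, item):
        # Called by an entity inside the component instance.
        self._slot.put(item)   # suspends while an earlier item is still pending
        self._slot.join()      # suspends until the environment has accepted this item

    def take(self):
        # Called by the environment; suspends until the component offers data.
        item = self._slot.get()
        self._slot.task_done()  # releases the suspended writer
        return item


def clock_component(port, ticks):
    """Active entity: writes passive data to its port without naming any consumer."""
    for tick in ticks:
        port.write(f"time={tick}")


def run_demo():
    port = OutputPort()
    worker = threading.Thread(target=clock_component, args=(port, [0, 15, 30]))
    worker.start()
    received = [port.take() for _ in range(3)]  # the environment accepts three items
    worker.join()
    return received
```

The component never learns who consumed its data: it only performs `write` on its own port, which is what makes the communication untargeted.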
[a] Of course, a component may retract any one of its suspended operations, e.g., due to a time-out.

By this definition, a Unix process, for example, qualifies as a component instance: it contains one or more threads of control which may even run in parallel on different physical processors, and its file descriptors qualify as ports. A component instance may itself consist of a collection of other component instances, perhaps running in a distributed environment. Thus, by identifying their relevant ports through which they exchange data with their environment, entire systems can be viewed and used as component instances, abstracting away their internal details of operation, structure, geography, and implementation. Clearly, components and component instances are different things: in our Unix process example, a component instance is a running process, whereas the binary file that this process is an instance of, is a component. However, for brevity, we often (mis)use the term "component" instead of "component instance" when the context makes the intention clear.

Restricting the content of inter-component communication to only passive data disallows transfer of control or sharing of pointers to their internal address spaces. Inter-component method invocations and (remote) procedure calls are thus forbidden. It also makes all inter-component communication uniformly undirected and
anonymous. Unlike the case of a message sent to a specific target or one that invokes a method, when a component writes a (passive) message through one of its (output) ports, it neither identifies or restricts the consumer of this message (untargeted) nor knows the identity of its consumer (anonymous). Similarly, a component's reading a (passive) message from one of its (input) ports also constitutes untargeted, anonymous communication. This simpler and more abstract mechanism for inter-component communication makes the coupling of components looser and more flexible than the coupling of ADTs or the dependencies of objects and classes. The fact that passive data can be interpreted as messages that invoke methods or operations means that the object oriented message passing and the ADT's notion of behavior as operation sequences can be modeled as well, if and when it is necessary. Therefore we do not lose expressiveness for the flexibility and simplicity of our model. A number of interesting questions immediately arise, for instance: What can one gain by using such a simpler model? How can one characterize the behavior of such component instances? Can such characterizations be formalized? How can one compose such component instances? How flexible and expressive can such a composition paradigm be? Can one use formal models to compositionally reason about the properties of the resulting systems? Intercepting and manipulating messages before they perform the methods that their sender objects intend to invoke is at the core of the contemporary approaches to Aspect Oriented Programming, as exemplified, e.g., by the so-called Composition Filters [9]. 
This clearly shows the advantage of a paradigm based on a more abstract notion of messages as passive data (that can be freely manipulated and changed, before they are interpreted as triggers for actions) over the active messages of object oriented programming whose immediate consequences are strictly to invoke the designated methods of their target objects.

Unix pipes constitute an example of a mechanism for composition of our component instances. Useful as they are, their flexibility is limited and they are not expressive enough to allow anything but the simplest forms of (pipeline) composition. Classical dataflow models and Kahn networks [16] offer approaches to characterize the behavior of our component instances. Broy's dataflow-like characterization of components and incorporation of time tags in data streams comprise a functional paradigm wherein some compositional construction and reasoning about systems of component instances is possible [11, 12].

In the context of component composition, these models have shortcomings in at least two significant areas. First, they do not allow mixing synchrony and asynchrony in behavioral definitions. Second, they support, at best, only very rudimentary forms of exogenous coordination. Exogenous coordination means coordination from outside and refers to the ability, in a model or language, to coordinate the behavior of a black-box component, without its knowledge, from outside of the component itself. This is an essential property for a component composition model to have because it allows building systems with very different emergent behavior out of the exact same components, simply by composing them differently. In such a model, different compositions of the same components create different contexts for the components, each exogenously
imposing a different coordination protocol on those components, yielding a different emergent system behavior.

2.3. System Composition Example

Suppose we have three components, C, D, and T, as in Figure 2.1.a. They are all black-box components: we know nothing about what they are made of or how they work internally. They may be made out of hardware, software, or some combination of the two. We can make no assumptions about the language or model used to construct these components. Specifically, they neither provide an interface of methods to call, nor make any method calls to interact with their environment.

[Figure: (a) the three components C, D, and T; (b), (c), (d) compositions of these components]
Fig. 2.1. Three components and their various compositions
The only thing we know about C is what we can externally observe of its behavior. It has a single port of interaction with its environment, through which it periodically outputs some string of characters. Of course, for the output to take place, (an entity in) the environment of C must be prepared to accept its output. Assuming an ideally cooperative environment (i.e., always ready to take it whenever C attempts to output its string), C produces a string approximately every 15 seconds, with the tolerance margin of ε. The actual content of the strings produced by C is the current time; so C is a clock. The only thing we know about D is that it has a single input port, through which it consumes strings and displays them on its accompanying monitor for approximately 30 seconds. The "processing time" of D is negligible for our purposes. We observe that T behaves very much the same as C, except that its tolerance margin is δ and the content of its output strings conveys the current temperature.

We can construct a few systems out of these components, the simplest ones involving a direct connection, e.g., between C and D. Because we cannot alter any of these components, we must make the connection from outside. The simplest connector we can use to compose C and D is what we call a synchronous channel, as in Figure 2.1.b. A synchronous channel is a medium of communication with two ends. Through one of its ends, it accepts input, and through the other, it dispenses it. We call it "synchronous" because it synchronizes the pair of input and output operations at its opposite ends: the two operations are suspended as necessary to ensure that they succeed together atomically. If we connect C to D using a synchronous channel whose transfer and synchronization time is negligibly small (compared to the period of C), we obtain a composed system that displays the current time, updated approximately every 30 seconds. Similarly, we can construct another system out of T and D connected
by a synchronous channel, as in Figure 2.1.c, to display the current temperature, updated approximately every 30 seconds.

In order to build a system, similar to what one finds on the top of some bank buildings, that alternately displays the current time and temperature, we have all the functional elements that we need in C, D, and T. What we need is a connector to compose them together as in Figure 2.1.d. This connector must have a more complex behavior than that of a synchronous channel used in the previous compositions: not only must it facilitate the data exchanges among these three components, but it also needs to enforce the coordination protocol that implements the desired alternating behavior. Because the internals of the components cannot be changed, such a connector would have to impose its coordination protocol "from the outside" of the components, which illustrates what we mean by exogenous coordination.

Obviously, such a connector, as well as other even more sophisticated ones, can be developed as programs in any modern programming language; their Turing completeness ensures that. However, it is interesting to ponder if there is a better, higher-level alternative to programming such connectors from scratch. Synchronization and coordination protocols are notoriously complex concurrent programs, and adding provisions to enable them to cope with mobility in distributed environments makes conventional programming models and languages grossly inadequate for their development. There is enough commonality of purpose (facilitating data exchange and exogenous coordination) among such connectors to warrant considering a special connector specification model and a special language for their development. To the extent that they merely connect and coordinate and lack application-specific functionality, each such connector can be generically designed and reused to compose widely different sets of components into entirely different systems.
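As a rough illustration of exogenous coordination, the following Python sketch composes the same three black boxes with two different connectors. It is a sequential simulation only: generators stand in for the active components C and T, real time and tolerance margins are ignored, and all names (`clock`, `thermometer`, `Display`, `sync_channel`, `alternator`) are hypothetical.

```python
from itertools import count


def clock():
    """Stand-in for C: each output is a string representing the current time."""
    for tick in count():
        yield f"time#{tick}"


def thermometer():
    """Stand-in for T: each output is a string representing the current temperature."""
    for reading in count():
        yield f"temp#{reading}"


class Display:
    """Stand-in for D: consumes strings through its single input port."""

    def __init__(self):
        self.shown = []

    def read(self, text):
        self.shown.append(text)


def sync_channel(source, sink, transfers):
    """Connector of Figures 2.1.b and 2.1.c: each write is paired with one read."""
    for _ in range(transfers):
        sink.read(next(source))


def alternator(first, second, sink, rounds):
    """Connector of Figure 2.1.d: imposes strict alternation from outside."""
    for _ in range(rounds):
        sink.read(next(first))
        sink.read(next(second))


def time_only(transfers):
    display = Display()
    sync_channel(clock(), display, transfers)
    return display.shown


def time_and_temperature(rounds):
    display = Display()
    alternator(clock(), thermometer(), display, rounds)
    return display.shown
```

Neither `clock`, `thermometer`, nor `Display` changes between the two systems; only the connector that is wrapped around them differs.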
What would a special purpose connector specification model be like? Can connectors be reused not just to compose components into (sub)systems, but also to compose more complex connectors? What composition operators are necessary and sufficient to allow connector composition? Is there a set of primitive connectors out of which "all interesting or useful" connectors can be constructed by those connector composition operators? How can one characterize interesting and useful in this context? Before we can address any of the above questions, we need concrete models to specify the behavior of components and that of the connectors. We briefly describe two such models in the next two sections. We return to these questions in Section 2.6, where we describe Reo and show how it can serve as a language for compositional construction of reusable coordinating component connectors.
2.4. Constraint Automata

Constraint automata are variants of labelled transition systems that operationally describe the maximally parallel data-flow activity through data-exchange observation points, called nodes, in a concurrent system. They were introduced in [4] to provide an operational semantics for coordination mechanisms formalized by composition of Reo connector graphs. In a constraint automaton of a Reo connector
graph, the states of the automaton represent the possible configurations (e.g., the contents of the FIFO-channels); transitions going out of a state represent data-flow at that state and its effect on the configuration.

More generally, constraint automata can serve as operational models for Abstract Behavior Types. The states of an automaton represent the externally distinguishable configurations of an actor, while the transitions encode its maximally parallel stepwise behavior. The transitions are labeled with the maximal sets of nodes on which data-flow occurs simultaneously (i.e., atomically), and a data constraint (i.e., boolean condition for the observed data values).

Fig. 2.2. Constraint automata for (a) a synchronous channel and (b) a FIFO1 channel
We first give an informal, intuitive introduction to constraint automata using two simple examples. Consider a synchronous channel as an independent actor. Figure 2.2.a shows the constraint automaton model of our synchronous channel, with A and B as its two (acquiring and dispensing) ends. This automaton consists of a single state, which is also its initial state (marked by an incoming arrow emanating from the environment). It has a single transition from this state back to the same state. The label of this transition consists of two parts: a set of node names, {A, B}, and a list of data constraints, d(A) = d(B). The node names indicate that the transition is possible only if all the named nodes are active simultaneously. In this case, no transition is possible unless there is simultaneous (read and write) activity at the opposite ends of this channel. The data constraints further restrict the conditions under which the transition can take place. In this case, the data exchanged through the activity on A, i.e., d(A), must be identical to that on B, i.e., d(B).

Figure 2.2.b shows the constraint automaton model of a 1-bounded FIFO channel with the ends A and B, where we assume the data domain Data = {0, 1}. The initial state stands for the configuration where the buffer is empty, while each of the other two states represents the configuration where the buffer is filled with one of the data items 0 or 1. The outgoing transitions from the initial state are labeled with the singleton set {A} which reflects the fact that in the initial configuration
data-flow is possible only at the (acquiring) A-end of the channel. If the buffer is filled then data-flow at A is impossible and only B can dispense the value out of the buffer.

In the sequel, we specify constraint automata using a nonempty and finite set $Data$ consisting of data items that can be sent (and received) via channels, and a nonempty and finite set $\mathcal{N} = \{A_1, \ldots, A_n\}$ of names. Intuitively, we may think of the $A_i$'s as the nodes representing the input and output ports of a connector or a component instance. We refer to the subsets of $\mathcal{N}$ as node-sets.

Data assignments, data constraints. A data assignment for $\emptyset \neq N \subseteq \mathcal{N}$ is a function $\delta : N \to Data$. $DA(N)$ denotes the set of all data assignments for $N$, and $DA$ the set of all data assignments (on any $N$). Data constraints, which can be viewed as a symbolic representation of sets of data assignments, are formally defined as propositional formulas built from the atoms "$d_A \in P$" and "$d_A = d_B$", where $A, B \in \mathcal{N}$, $d_A, d_B \in Data$, and $P \subseteq Data$. $DC(N)$ denotes the set of data constraints using only names from $N$, and $DC$ is a shorthand for $DC(\mathcal{N})$.

Definition 2.1. Constraint automata [4]. A constraint automaton (over $Data$) is a tuple $\mathcal{A} = (Q, \mathcal{N}, \rightarrow, Q_0)$ where $Q$ is a finite set of states, $\mathcal{N}$ a finite set of nodes, $\rightarrow$ is a finite subset of $Q \times (2^{\mathcal{N}} \times DC) \times Q$, called the transition relation, and $Q_0 \subseteq Q$ a nonempty set of initial states. We write $q \xrightarrow{N,g} p$ instead of $(q, N, g, p) \in \rightarrow$ and require that (1) $N \neq \emptyset$ and (2) $g \in DC(N)$ is satisfiable. We call $N$ the node-set and $g$ the guard of the transition. States without any outgoing transition are called terminal.

The intuitive meaning of a constraint automaton as an operational model for an ABT is similar to the interpretation of labeled transition systems as formal models for reactive systems. The input/output points through which (the entity enacting) the ABT exchanges data with its environment play the role of the nodes in its corresponding constraint automaton. The states represent the externally discernible configurations that can be ascribed to (the entity enacting) the ABT. The meaning of a transition $q \xrightarrow{N,g} p$ is that in configuration $q$ all the nodes $A_i \in N$ atomically perform I/O-operations that meet the guard $g$, resulting in a new configuration $p$, while at the same moment there is no data-flow at the other nodes $A_i \in \mathcal{N} \setminus N$.

[Figure: two-state automaton with one transition labeled {C, D}, d(C) = d(D) and one transition labeled {T, D}, d(T) = d(D)]
Fig. 2.3. Constraint automaton for the connector in Figure 2.1.d
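Definition 2.1 can be made concrete with a small Python sketch. This is an illustration under stated assumptions rather than the formalization of [4]: guards are represented semantically as predicates over data assignments instead of as propositional formulas, and the class names, the `step` method, and the state labels of the FIFO1 instance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

Assignment = Dict[str, int]            # data assignment: node name -> data item
Guard = Callable[[Assignment], bool]   # data constraint, given as a predicate


@dataclass(frozen=True)
class Transition:
    source: str
    nodes: FrozenSet[str]   # node-set N, required to be nonempty
    guard: Guard            # guard g, using only names from N
    target: str


@dataclass
class ConstraintAutomaton:
    states: FrozenSet[str]
    nodes: FrozenSet[str]
    transitions: List[Transition]
    initial: FrozenSet[str]

    def step(self, state: str, assignment: Assignment) -> Set[str]:
        """Successor states when exactly the nodes in `assignment` fire atomically."""
        fired = frozenset(assignment)
        return {
            t.target
            for t in self.transitions
            if t.source == state and t.nodes == fired and t.guard(assignment)
        }


def fifo1() -> ConstraintAutomaton:
    """The 1-bounded FIFO channel of Figure 2.2.b over Data = {0, 1}."""
    transitions = []
    for d in (0, 1):
        full = f"full({d})"
        # Accept d at the acquiring end A, then dispense the same d at B.
        transitions.append(
            Transition("empty", frozenset({"A"}), lambda a, d=d: a["A"] == d, full))
        transitions.append(
            Transition(full, frozenset({"B"}), lambda a, d=d: a["B"] == d, "empty"))
    return ConstraintAutomaton(
        states=frozenset({"empty", "full(0)", "full(1)"}),
        nodes=frozenset({"A", "B"}),
        transitions=transitions,
        initial=frozenset({"empty"}),
    )
```

Requiring `t.nodes == fired` reflects the reading of a transition given above: the named nodes fire together, and there is no data-flow at any other node in the same step.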
We can specify the desired behavior of the connector in Figure 2.1.d as the constraint automaton in Figure 2.3. In its initial state (left), this automaton waits
for a pair of I/O actions on the nodes corresponding to the ports of components C and D. The transition to the other state (right) is possible only if both of these actions are possible ({C, D}), and furthermore the data transferred through the port of C is the same as that of the port of D (d(C) = d(D)). Similarly, a transition out of its other state (right) is possible only if a pair of I/O actions on the ports of T and D are possible, and the data transferred through the port of T is the same as that of the port of D.

Fig. 2.4. Constraint automata for (a) a synchronous drain channel and (b) a merger
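The two-state automaton of Figure 2.3 can be written down as a transition table and used to check observed traces. The following Python sketch does so; the state names `expect_C` and `expect_T` and the function `accepts` are hypothetical labels introduced only for this illustration.

```python
# Transition table of the automaton in Figure 2.3:
# state -> (node-set that must fire together, successor state).
ALTERNATOR = {
    "expect_C": (frozenset({"C", "D"}), "expect_T"),
    "expect_T": (frozenset({"T", "D"}), "expect_C"),
}


def accepts(trace, state="expect_C"):
    """Return True if every step of `trace` is allowed by the automaton.

    Each step is a dict mapping the nodes active in that step to the data
    observed at them.
    """
    for step in trace:
        nodes, target = ALTERNATOR[state]
        if frozenset(step) != nodes:
            return False                       # wrong set of simultaneously active nodes
        producer = next(iter(nodes - {"D"}))   # "C" or "T", depending on the state
        if step[producer] != step["D"]:
            return False                       # guard d(C) = d(D) or d(T) = d(D) violated
        state = target
    return True
```

For example, a trace in which the display receives two clock values in a row is rejected, because the second step finds the automaton in the state that only permits activity on T and D.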
In [4] we define how constraint automata can be composed to yield the specification of more complex systems. We also give the constraint automata for the various basic channel types in Reo, and show that the constraint automata for some interesting, more complex behavior can be composed out of them. For instance, the constraint automaton in Figure 2.3 can be constructed as a composition of 4 synchronous channels (Figure 2.2.a), a FIFO1 channel (Figure 2.2.b), a synchronous drain channel, and 3 mergers. The constraint automata for the merger and the synchronous drain channel are shown in Figure 2.4 and described below.

The synchronous drain channel is almost identical to a synchronous channel (compare Figures 2.2.a and 2.4.a). The difference between the two is that a synchronous drain channel has two acquiring ends (and no dispensing end). It synchronizes the two operations on its two ends (in this case two write operations) just like a synchronous channel does. Because it has no dispensing end and thus no data can ever be obtained out of this channel, all data written to (either end of) this channel is lost. This is why the automaton transition has no data constraint.

A merger is a primitive connector with two input ports (denoted as A and B in Figure 2.4.b) and one output port (denoted as C in Figure 2.4.b). As the automaton in Figure 2.4.b specifies, if a pair of write and take operations are pending on ports A and C of this merger, it transfers the data from A to C. Likewise, if a pair of write and take operations are pending on ports B and C, the merger transfers the data from B to C. Both transitions are enabled when two write operations and a take operation are pending on all three ports of this merger. In this case, the automaton nondeterministically chooses and performs one of the two transitions.
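The enabling conditions of the merger and the synchronous drain described above can be sketched in Python as follows. The guard texts d(A) = d(C) and d(B) = d(C) are inferred from the prose ("transfers the data from A to C"), not read off Figure 2.4; the names `MERGER`, `SYNC_DRAIN`, `enabled`, and `choose` are hypothetical, and `random.Random` merely stands in for the nondeterministic choice.

```python
import random

# Single-state automata: each transition is (node-set, guard description).
MERGER = [
    (frozenset({"A", "C"}), "d(A) = d(C)"),
    (frozenset({"B", "C"}), "d(B) = d(C)"),
]
SYNC_DRAIN = [
    (frozenset({"A", "B"}), "true"),  # no data constraint: both data items are lost
]


def enabled(transitions, pending):
    """Transitions whose nodes all have a pending I/O operation."""
    return [(nodes, guard) for nodes, guard in transitions if nodes <= pending]


def choose(transitions, pending, rng):
    """Pick one enabled transition, or None if nothing can fire."""
    options = enabled(transitions, pending)
    return rng.choice(options) if options else None
```

With operations pending on all three merger ports, `enabled` returns both transitions and `choose` selects one of them, mirroring the nondeterministic choice in the text.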
Alas, knowing that the constraint automaton in Figure 2.3 can be constructed as a composition of the above mentioned constraint automata does not actually tell us what this composition ought to be. A behavior modeled as a constraint automaton gives us no clue about how it can be constructed out of the constraint automata representing the simpler behavior of a set of constituents. For that, we need a
constructive compositional paradigm such as Reo, which we describe in Section 2.6.

Constraint automata are intuitively appealing as behavior specification tools, especially for engineers who are accustomed to using automata, e.g., to describe logic circuits. They are also invaluable representations for model checking of behavior specifications, the primary purpose for which they were originally devised. Because of their finiteness, constraint automata cannot represent behavior that involves infinite or unbounded states, e.g., that of an unbounded FIFO channel. A closer look at the constraint automaton for a FIFO1 channel in Figure 2.2.b shows that a behavior that involves a finite number of states over an infinite or unbounded data domain (e.g., that of a FIFO1 channel over integers) cannot be represented by a constraint automaton either. Even when both states and data domains are finite, constraint automata specifications can become too big and cumbersome to be of direct utility for human users.

A different formalism for behavior specification uses the coalgebraic notion of streams as its basic notion. Because streams are infinite, unboundedness and infinity of states and data domains pose no difficulty in this formalism. We describe this formalism in the next section.
2.5. ABT as Relations on Timed Data Streams
An ABT can be defined as a (maximal) relation among a set of timed-data-streams. This particular formalization emphasizes the relational aspect of the ABT model explicitly, and abstracts away any hint of an underlying operational semantics of its implementation. This helps to focus on behavior specifications and their composition, rather than on operations that may be used to implement entities that exhibit such behavior and their interactions. The notion of timed-data-streams as well as most of the technical content in this section come from the work of J. Rutten on coalgebras [15, 19], stream calculus [20, 22], and a coalgebraic semantics for Reo [8, 21]. Analogous to the way in which algebraic methods constitute suitable models for the syntactic structure of systems, the coalgebraic approach is a promising mathematical foundation for modeling the dynamic behavior of (concurrent) systems. Defining observable behavior in terms of input/output implants a dataflow essence within ABTs akin to such dataflow-like networks and calculi as [13], [17], and especially [12]. The coalgebraic model of ABT presented here differs from all of the above-mentioned work in a number of respects. Most importantly, the ABT model is compositional. Its explicit modeling of ordering/timing of events in terms of separate time streams provides a simple foundation for defining complex synchronization and coordination protocols using a surprisingly expressive small set of primitives. The use of coinduction as the main definition and proof principle to reason about both data and time streams allows simple compositional construction of ABTs representing many different generic coordination schemes involving combinations of various synchronous and asynchronous primitives that are not present (and not even expressible) in any of the aforementioned models.
2.5.1. Streams and Coinduction
A stream (over $A$) is an infinite sequence of elements of some set $A$. The set of all streams over $A$ is denoted as $A^\omega$. Streams in $DS = D^\omega$ over a set of (uninterpreted) data items $D$ are called data streams and are typically denoted as $\alpha$, $\beta$, $\gamma$, etc. Zero-based indices are used to denote the individual elements of a stream, e.g., $\alpha(0), \alpha(1), \alpha(2), \ldots$ denote the first, second, third, etc., elements of the stream $\alpha$. We use the infix "dot" as the stream constructor: $x.\alpha$ denotes a stream whose first element is $x$ and whose second, third, etc. elements are, respectively, the first and its successive elements of the stream $\alpha$. Following the conventions of stream calculus [20, 22], the well-known operations of head and tail on streams are called initial value and derivative: the initial value of a stream $\alpha$ (i.e., its head) is $\alpha(0)$, and its (first) derivative (i.e., its tail) is denoted as $\alpha'$. The $k$th derivative of $\alpha$ is denoted as $\alpha^{(k)}$ and is the stream that results from taking the first derivative of $\alpha$ and repeating this operation on the resulting stream for a total of $k$ times. Relational operators on streams apply pairwise to their respective elements, e.g., $\alpha > \beta$ means $\alpha(0) > \beta(0), \alpha(1) > \beta(1), \alpha(2) > \beta(2), \ldots$.
Constrained streams in $TS = \mathbb{R}_+^\omega$ over positive real numbers representing moments in time are called time streams and are typically denoted as $a$, $b$, $c$, etc. To qualify as a time stream, a stream of real numbers $a$ must be (1) strictly increasing, i.e., the constraint $a < a'$ must hold; and (2) progressive, i.e., for every $N \ge 0$ there must exist an index $n \ge 0$ such that $a(n) > N$. We use positive real numbers instead of natural numbers to represent time because, as observed in the world of temporal logic [14], real numbers induce the more abstract sense of dense time instead of the notion of discrete time imposed by natural numbers. Specifically, we sometimes need finitely many steps within any bounded time interval for certain ABT equivalence proofs (see, e.g., [8]).
This is clearly not possible with a discrete model of time. Recall that the actual values of "time moments" are irrelevant in our ABT model; only their relative order is significant and must be preserved. Using dense time allows us to locally break strict numerical equality (i.e., simultaneity) arbitrarily while preserving the atomicity of events.
A Timed Data Stream is a twin pair of streams $\langle\alpha, a\rangle$ in $TDS = DS \times TS$ consisting of a data stream $\alpha \in DS$ and a time stream $a \in TS$, with the interpretation that for all $i \ge 0$, the input/output of data item $\alpha(i)$ occurs at "time moment" $a(i)$. Two timed data streams $\langle\alpha, a\rangle$ and $\langle\beta, b\rangle$ are equal if their respective elements are equal, i.e., $\langle\alpha, a\rangle = \langle\beta, b\rangle \equiv \alpha = \beta \wedge a = b$.
2.5.2. Abstract Behavior Types
An Abstract Behavior Type (ABT) is a (maximal) relation over timed data streams. Every timed data stream involved in an ABT is tagged either as its input or its output. The input/output tags of the timed data streams involved in an ABT are meaningless in the relation that defines the ABT. However, these tags are crucial in ABT composition described in Section 2.5.4.
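The stream vocabulary of Section 2.5.1, on which every ABT relation is built, can be made concrete with a small sketch. It models an infinite stream as a function from a zero-based index to an element. This representation, and the helper names `cons`, `derivative`, `prefix`, `increasing_on_prefix`, and `tds_equal_on_prefix`, are illustrative assumptions rather than part of the ABT model. Properties that quantify over the whole stream (strict increase, equality) can only be checked on a finite prefix here, and progressiveness cannot be checked finitely at all.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
Stream = Callable[[int], T]  # a stream maps a zero-based index to an element


def cons(x, s: Stream) -> Stream:
    """The stream constructor x.s: x followed by the elements of s."""
    return lambda i: x if i == 0 else s(i - 1)


def derivative(s: Stream, k: int = 1) -> Stream:
    """The k-th derivative (k-th tail) of s."""
    return lambda i: s(i + k)


def prefix(s: Stream, n: int) -> List:
    """The first n elements of s."""
    return [s(i) for i in range(n)]


def increasing_on_prefix(a: Stream, n: int) -> bool:
    """Check the time-stream constraint a < a' on the first n indices."""
    a_prime = derivative(a)
    return all(a(i) < a_prime(i) for i in range(n))


def tds_equal_on_prefix(x, y, n: int) -> bool:
    """Equality of two timed data streams (data, time), up to index n."""
    (alpha, a), (beta, b) = x, y
    return prefix(alpha, n) == prefix(beta, n) and prefix(a, n) == prefix(b, n)
```

For instance, with `nat = lambda i: i`, the call `prefix(derivative(nat, 2), 3)` yields the first three elements of the second derivative of the stream 0, 1, 2, ....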
Generally, we use the prefix notation $R(I_1, I_2, \ldots, I_m; O_1, O_2, \ldots, O_n)$ and the separator ";" to designate the ABT defined by the $(m+n)$-ary relation $R$ over the $m \ge 0$ sets of input timed data streams $I_i$, $0 < i \le m$, and the $n \ge 0$ sets of output timed data streams $O_j$, $0 < j \le n$. As usual, $m + n$ is called the arity of $R$, and we refer to $m$ and $n$ individually as the input arity and the output arity of $R$. In the special case where $m = n = 1$ it is sometimes convenient to use the infix notation $I\ R\ O$ instead of the standard $R(I; O)$. To distinguish the set of timed data streams that appears in a position in the relation that defines an ABT (i.e., a column in the relation) from a specific timed data stream in that set (i.e., which may appear in a row of the relation in that position) we refer to $I_i$ and $O_j$ as, respectively, the $i$th input and the $j$th output portals of the ABT.
Formally, a component, as defined in Section 2.2, with $m \ge 0$ input and $n \ge 0$ output ports is an ABT with $m$ input and $n$ output portals. The set of all possible streams of data items that can pass through each port of the component, together with their respective timing, comprises the set of timed data streams of the ABT's portal that corresponds to that port.
2.5.3. ABT Examples
In this section we show the utility of the coalgebraic formalization of the ABT model through a number of examples.
2.5.3.1. Basic Channels
Following is a list of some useful simple binary abstract behavior types. Each has a single input and a single output portal.
(1) The behavior of a synchronous channel is captured by the Sync ABT, defined as $\langle\alpha, a\rangle\ \mathrm{Sync}\ \langle\beta, b\rangle \equiv \langle\alpha, a\rangle = \langle\beta, b\rangle$. Because $\langle\alpha, a\rangle = \langle\beta, b\rangle \equiv \alpha = \beta \wedge a = b$, the Sync ABT represents the behavior of any entity that (1) produces an output data stream identical to its input data stream ($\alpha = \beta$), and (2) produces every element in its output at the same time as its respective input element is consumed ($a = b$). Recall that "at the same time" means only that the two events of consumption and production of each data item by a Sync channel occur atomically.
(2) The behavior of an asynchronous unbounded FIFO channel is captured by the FIFO ABT, defined as $\langle\alpha, a\rangle\ \mathrm{FIFO}\ \langle\beta, b\rangle \equiv \alpha = \beta \wedge a < b$. The FIFO ABT represents the behavior of any entity that (1) produces an output data stream identical to its input data stream ($\alpha = \beta$), and (2) produces every element in its output some time after its respective input element is observed ($a < b$).
(3) The behavior of an asynchronous channel with the bounded capacity of 1 is captured by the FIFO1 ABT, defined as $\langle\alpha, a\rangle\ \mathrm{FIFO}_1\ \langle\beta, b\rangle \equiv \alpha = \beta \wedge a < b < a'$. The FIFO1 ABT represents the behavior of any entity that (1) produces an output data stream identical to its input data stream ($\alpha = \beta$), and (2) produces every element in its output some time after its respective input element is observed ($a < b$) but before its next input element is observed ($b < a'$, which means $b(i) < a(i + 1)$ for all $i \ge 0$).
(4) The behavior of an asynchronous channel with the bounded capacity of 1 filled to contain the data item $D$ as its initial value is captured by the FIFO1(D) ABT, defined as $\langle\alpha, a\rangle\ \mathrm{FIFO}_1(D)\ \langle\beta, b\rangle \equiv \beta = D.\alpha \wedge b < a < b'$. The FIFO1(D) ABT represents the behavior of any entity that (1) produces an output data stream $\beta = D.\alpha$ consisting of the initial data item $D$ followed by the input data stream $\alpha$ of the ABT, and (2) for $i \ge 0$ performs its $i$th input operation some time between its $i$th and $(i+1)$st output operations ($b < a < b'$).
(5) The behavior of an asynchronous channel with the bounded capacity of $k > 0$ is captured by the FIFOk ABT, defined as $\langle\alpha, a\rangle\ \mathrm{FIFO}_k\ \langle\beta, b\rangle \equiv \alpha = \beta \wedge a < b < a^{(k)}$. Recall that $a^{(k)}$ is the $k$th derivative (i.e., the $k$th tail) of the stream $a$. The FIFOk ABT represents the behavior of any entity that (1) produces an output data stream identical to its input data stream ($\alpha = \beta$), and (2) produces every element in its output some time after its respective input element is observed ($a < b$) but before its $k$th-next input element is observed ($b < a^{(k)}$, which means $b(i) < a(i + k)$ for all $i \ge 0$). Observe that FIFO1 is indeed a special case of FIFOk with $k = 1$.
2.5.3.2. Merge and Replicate
We now define two other ABTs that, as we see in Section 2.6, form a foundation for an interesting and expressive calculus: merger and replicator. The merger ABT is defined as:
$$\mathrm{Mrg}(\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle) \equiv a(0) \ne b(0) \wedge \mathrm{DMrg}(\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle)$$
where
$$\mathrm{DMrg}(\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle) \equiv \begin{cases} \alpha(0) = \gamma(0) \wedge a(0) = c(0) \wedge \mathrm{DMrg}(\langle\alpha', a'\rangle, \langle\beta, b\rangle; \langle\gamma', c'\rangle) & \text{if } a(0) < b(0) \\ \beta(0) = \gamma(0) \wedge b(0) = c(0) \wedge \mathrm{DMrg}(\langle\alpha, a\rangle, \langle\beta', b'\rangle; \langle\gamma', c'\rangle) & \text{if } a(0) > b(0) \end{cases}$$
Intuitively, the Mrg ABT produces an output that is a merge of its two input streams, by requiring that (1) the values in its input pairs do not arrive at the same time ($a(0) \ne b(0)$), and (2) the relation DMrg holds. The DMrg ABT is a deterministic merger. If $\alpha(0)$ arrives before $\beta(0)$, i.e., $a(0) < b(0)$, then the ABT produces $\gamma(0) = \alpha(0)$ as its output at $c(0) = a(0)$ and proceeds with the tails of
the streams in its first input timed data stream. If $\alpha(0)$ arrives after $\beta(0)$, i.e., $a(0) > b(0)$, then the ABT produces $\gamma(0) = \beta(0)$ as its output at $c(0) = b(0)$ and proceeds with the tails of the streams in its second input timed data stream. Observe that $\mathrm{Mrg}(\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle) = \mathrm{Mrg}(\langle\beta, b\rangle, \langle\alpha, a\rangle; \langle\gamma, c\rangle)$.
The replicator ABT is defined as:
$$\mathrm{Rpl}(\langle\alpha, a\rangle; \langle\beta, b\rangle, \langle\gamma, c\rangle) \equiv \beta = \alpha \wedge \gamma = \alpha \wedge b = a \wedge c = a$$
It is easy to see that this ABT captures the behavior of any entity that synchronously replicates its input stream into its two identical output streams. Observe that $\mathrm{Rpl}(\langle\alpha, a\rangle; \langle\beta, b\rangle, \langle\gamma, c\rangle) = \mathrm{Rpl}(\langle\alpha, a\rangle; \langle\gamma, c\rangle, \langle\beta, b\rangle)$.
2.5.3.3. Sum
As an example of an ABT that performs some computation, consider a simple dataflow adder. The behavior of such a component is captured by the Sum ABT defined as
$$\mathrm{Sum}(\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle) \equiv \gamma(0) = \alpha(0) + \beta(0) \wedge \exists t : \max(a(0), b(0)) < t < \min(a(1), b(1)) \wedge c(0) = t \wedge \mathrm{Sum}(\langle\alpha', a'\rangle, \langle\beta', b'\rangle; \langle\gamma', c'\rangle).$$
Sum defines the behavior of a component that repeatedly reads a pair of input values from its two input ports, adds them up, and writes the result out on its output port. As such, its output data stream is the pairwise sum of its two input data streams. This component behaves asynchronously in the sense that it can produce each of its output data items with some arbitrary delay after it has read both of its corresponding input data items ($c(0) = t \wedge t > \max(a(0), b(0))$). However, it is obligated to produce each of its output data items before it reads in its next input data item ($t < \min(a(1), b(1))$).
2.5.3.4. Philosophers and Chopsticks
The classical dining philosophers problem can be described in terms of $n > 1$ pairs of instances of two components: philosopher instances of Phil and chopstick instances of Chop. We define the externally observable behavior of each of these components as an ABT. We show in Section 2.6 how instances of these components can be composed into different component-based systems both to exhibit and to solve the famous deadlock problem. We assume that a chopstick component has two input ports, $t$ (for take) and $f$ (for free), through which it reads in the timed data streams $\langle\alpha_t, a_t\rangle$ and $\langle\alpha_f, a_f\rangle$, respectively. The data items in $\alpha_t$ and $\alpha_f$ are tokens whose actual values are not of interest to us.
In practice, it is a good idea for these tokens to contain the identifier of the entity (e.g., philosopher) who uses the chopstick, but as long as such informative requirements do not affect behavior, they are irrelevant for our ABT definition.
When a chopstick is free (its initial state) it is ready to accept a take request and thus reads from its $t$ port the next take request token out of $\langle\alpha_t, a_t\rangle$. Once taken, a chopstick is ready to accept a free request and thus reads from its $f$ port the free request token out of $\langle\alpha_f, a_f\rangle$. For the user of the chopstick, the success of its I/O operation on port $t$ or $f$ means the chopstick has accepted its (take or free) request. This simple behavior is captured by the Chop ABT defined as
$$\mathrm{Chop}(\langle\alpha_t, a_t\rangle, \langle\alpha_f, a_f\rangle;) \equiv a_t < a_f < a_t'.$$
Because we are not interested in the actual value of the take/free tokens, the Chop ABT has nothing to say about the data streams $\alpha_t$ and $\alpha_f$; it is only the timing that is relevant here. The timing equation simply states that initially, there must be a take, followed by a free, and this sequence repeats.
We assume that a philosopher component has four output ports, $lt$ (for left-take), $lf$ (for left-free), $rt$ (for right-take), and $rf$ (for right-free), through which it writes the timed data streams $\langle\alpha_{lt}, a_{lt}\rangle$, $\langle\alpha_{lf}, a_{lf}\rangle$, $\langle\alpha_{rt}, a_{rt}\rangle$, and $\langle\alpha_{rf}, a_{rf}\rangle$, respectively. The two ports $lt$ and $lf$ are "on the left" and the two ports $rt$ and $rf$ are "on the right" of the philosopher component, so to speak. The philosopher's requests to take and free the chopsticks on its left and right are issued through their respective ports. The externally observable behavior of a philosopher component is as follows. After some period of "thinking" it decides to eat, at which point it attempts to obtain its two chopsticks by issuing take requests on its $lt$ and $rt$ ports. We assume it always issues a request for its left chopstick before requesting the one on its right. The philosopher component interprets the success of its write operation as the acceptance of its request (e.g., for exclusive access to the chopstick). Once, and if, both of its take requests are granted, it proceeds to "eat" for some time, at the end of which it issues requests to free its left and right chopsticks by writing tokens to its $lf$ and $rf$ ports. The philosopher component then repeats the cycle by entering its thinking period again. This behavior is captured by the Phil ABT defined as
$$\mathrm{Phil}(; \langle\alpha_{lt}, a_{lt}\rangle, \langle\alpha_{lf}, a_{lf}\rangle, \langle\alpha_{rt}, a_{rt}\rangle, \langle\alpha_{rf}, a_{rf}\rangle) \equiv a_{lt} < a_{rt} < a_{lf} < a_{rf} < a_{lt}'$$
Again, because we are not interested in the actual values of the take/free tokens that this component produces, the Phil ABT says nothing about the data streams.
All we are interested in is the timing constraints: an arbitrary "thinking" delay; followed by a request to take the left chopstick; once granted, followed by a request to take the right chopstick; once granted, followed by an arbitrary "eating" delay; followed by the requests to free the left and the right chopsticks; and the cycle repeats.
2.5.4. ABT Composition
Abstract behavior types can be composed to yield other abstract behavior types through a composition similar to the relational join operation in relational databases. Two ABTs can be composed over a common timed data stream if one
is the producer and the other the consumer of that timed data stream. The same two ABTs can be composed over zero or more common timed data streams, each ABT playing the role of the producer or the consumer of one of the timed data streams, independent of its role regarding the others. Observe that the producer and the consumer of a timed data stream, $\langle\alpha, a\rangle$, necessarily synchronize their I/O operations on their respective portals for the mutual exchange of the data items in its data stream $\alpha$, according to the schedule in its twin time stream $a$. This is accomplished simply by "fusing" their respective portals together such that the timed data stream observed on one is identical to the one observed on the other.
Consider two ABTs $B_1$ with arity $p = p_i + p_o$ and $B_2$ with arity $q = q_i + q_o$, where $p_i$ and $p_o$ are, respectively, the input arity and the output arity of $B_1$, and $q_i$ and $q_o$, those for $B_2$. $B_1$ and $B_2$ can be composed with $0 \le k \le \min(p_i, q_o) + \min(p_o, q_i)$ pairs of mutually fused portals, where the data items produced through an output portal, $O$, of one ABT are fed for consumption by the other ABT through its input portal that is fused with $O$. We define the $k$-dyad composition of the two ABTs $B_1(I1_1, I1_2, \ldots, I1_{p_i}; O1_1, O1_2, \ldots, O1_{p_o})$ and $B_2(I2_1, I2_2, \ldots, I2_{q_i}; O2_1, O2_2, \ldots, O2_{q_o})$ as a special form of the join of the two relations $B_1$ and $B_2$ where $k$ distinct portals (i.e., relational columns) of $B_1$ are paired each with a distinct portal of $B_2$ into $k$ dyads such that (1) the two portals in each dyad have opposite input/output tags, and (2) the two timed data streams of the portals in each dyad are equal. The $k$-dyad composition of $B_1$ and $B_2$ yields a new ABT, $B(I_1, I_2, \ldots, I_m; O_1, O_2, \ldots, O_n)$, with arity $m + n = p + q - 2 \times k$, defined as a relation over those portals of $B_1$ and $B_2$ that are not involved in a dyad (i.e., the fused portals disappear from the resulting relation).
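The list $I_1, I_2, \ldots, I_m$ is obtained from the list $I1_1, I1_2, \ldots, I1_{p_i}, I2_1, I2_2, \ldots, I2_{q_i}$ by eliminating every one of its elements involved in a dyad. Similarly, the list $O_1, O_2, \ldots, O_n$ is obtained from the list $O1_1, O1_2, \ldots, O1_{p_o}, O2_1, O2_2, \ldots, O2_{q_o}$ by eliminating every one of its elements involved in a dyad. We use the dyad indices $1 \le l \le k$ as superscripts to mark the corresponding portals of $B_1$ and $B_2$ in their $k$-dyad composition. For example, $B \equiv B_1(\langle\alpha, a\rangle, \langle\beta, b\rangle^1; \langle\gamma, c\rangle) \circ B_2(\langle\delta, d\rangle; \langle\mu, m\rangle^1)$ denotes the 1-dyad composition of the two abstract behavior types $B_1$ and $B_2$ where the output (portal) of $B_2$ is identical to the second input (portal) of $B_1$. The resulting ABT is defined through the relation
$$B \equiv \{(\langle\alpha, a\rangle, \langle\delta, d\rangle; \langle\gamma, c\rangle) \mid (\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle) \in B_1 \wedge (\langle\delta, d\rangle; \langle\mu, m\rangle) \in B_2 \wedge \langle\beta, b\rangle = \langle\mu, m\rangle\}.$$
Another example is the ABT $B \equiv B_1(\langle\alpha, a\rangle, \langle\beta, b\rangle^1; \langle\gamma, c\rangle^2) \circ B_2(\langle\delta, d\rangle^2; \langle\mu, m\rangle^1, \langle\nu, n\rangle)$, which denotes the 2-dyad composition of the two abstract behavior types $B_1$ and $B_2$ where the first output of $B_2$ is identical to the second input of $B_1$ and the output of $B_1$ is identical to the input of $B_2$. The resulting ABT is defined as the relation
$$B \equiv \{(\langle\alpha, a\rangle; \langle\nu, n\rangle) \mid (\langle\alpha, a\rangle, \langle\beta, b\rangle; \langle\gamma, c\rangle) \in B_1 \wedge (\langle\delta, d\rangle; \langle\mu, m\rangle, \langle\nu, n\rangle) \in B_2 \wedge \langle\beta, b\rangle = \langle\mu, m\rangle \wedge \langle\gamma, c\rangle = \langle\delta, d\rangle\}.$$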
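The k-dyad composition can be sketched as a relational join over finite relations. This is only an illustration under simplifying assumptions: real ABTs relate infinite timed data streams, whereas here each timed data stream is replaced by an opaque hashable token, each relation is a finite set of tuples, and the function name `compose` and the `"in"`/`"out"` tag strings are hypothetical.

```python
def compose(b1, tags1, b2, tags2, dyads):
    """k-dyad composition of two finite relations.

    b1, b2   : sets of tuples, one column per portal
    tags1/2  : per-column tags, "in" or "out"
    dyads    : list of (column of b1, column of b2) pairs to fuse
    Returns the composed relation and the tags of its columns.
    """
    for i, j in dyads:
        if tags1[i] == tags2[j]:
            raise ValueError("a dyad must pair an input with an output")
    used1 = {i for i, _ in dyads}
    used2 = {j for _, j in dyads}
    # Surviving columns: inputs first, then outputs, B1 before B2 in each group.
    cols = [(0, i, t) for i, t in enumerate(tags1) if i not in used1]
    cols += [(1, j, t) for j, t in enumerate(tags2) if j not in used2]
    cols.sort(key=lambda c: c[2] != "in")  # stable sort keeps B1-before-B2 order
    rel = {
        tuple((r1, r2)[side][idx] for side, idx, _ in cols)
        for r1 in b1
        for r2 in b2
        if all(r1[i] == r2[j] for i, j in dyads)  # fused portals carry equal streams
    }
    return rel, [t for _, _, t in cols]
```

Calling `compose` with a ternary relation tagged `["in", "in", "out"]`, a binary relation tagged `["in", "out"]`, and the single dyad `(1, 1)` mirrors the 1-dyad example above: the fused columns disappear and the result has arity 3 + 2 - 2 = 3.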
The common case of the 1-dyad composition of $B_1$ and $B_2$ where the single output of $B_1$ is identical to the single input of $B_2$ is abbreviated as $B_1(\ldots; \langle\alpha, a\rangle) \circ B_2(\langle\beta, b\rangle; \ldots)$ instead of $B_1(\ldots; \langle\alpha, a\rangle^1) \circ B_2(\langle\beta, b\rangle^1; \ldots)$. This abbreviation is particularly convenient together with the infix notation for binary abstract behavior types. For instance, $B \equiv \langle\alpha, a\rangle\ B_1\ \langle\beta, b\rangle \circ \langle\gamma, c\rangle\ B_2\ \langle\delta, d\rangle$ denotes the 1-dyad composition of the two abstract behavior types $B_1$ and $B_2$ where the output of $B_1$ is identical to the
input of $B_2$. Of course, the resulting ABT is defined as the relation $\langle\alpha, a\rangle\ B\ \langle\delta, d\rangle \equiv \{(\langle\alpha, a\rangle; \langle\delta, d\rangle) \mid (\langle\alpha, a\rangle; \langle\beta, b\rangle) \in B_1 \wedge (\langle\gamma, c\rangle; \langle\delta, d\rangle) \in B_2 \wedge \langle\beta, b\rangle = \langle\gamma, c\rangle\}$.
For example, consider the binary ABTs defining the basic channels presented in Section 2.5.3. It is not difficult to see that the (1-dyad) composition of these ABTs produces results that correspond to our intuition. For instance, the composition of two Sync ABTs produces a Sync ABT. Indeed, composition of a Sync ABT with any other ABT (on its left or right) yields the same ABT. More interestingly, the composition of two FIFO ABTs produces a FIFO ABT. Composing two FIFO1 ABTs produces a FIFO2 ABT. The formal proof of this latter equivalence relies on our notion of dense time (as opposed to discrete time) and is given in [8], together with the formal treatment of many other interesting examples.
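The role of dense time in the FIFO1 composition can be illustrated on a finite prefix. The sketch below is not the proof given in [8]. It checks one direction on one example: a pair of time streams satisfying the FIFO2 timing is split into two FIFO1 steps by picking an intermediate time stream. The helper names `lt`, `fifo_k`, and `midpoint_witness`, the midpoint choice, and the sample values are assumptions made for illustration.

```python
def lt(x, y):
    """Pairwise x < y on the common finite prefix of two time sequences."""
    return all(p < q for p, q in zip(x, y))


def fifo_k(alpha, a, beta, b, k):
    """FIFO_k on finite prefixes: alpha = beta and a < b < a^(k)."""
    return list(alpha) == list(beta) and lt(a, b) and lt(b, a[k:])


def midpoint_witness(a, b):
    """Pick an intermediate time stream m with a < m < a' and m < b < m'.

    Assumes (a, b) already satisfies the FIFO_2 timing a < b < a''.
    """
    m = []
    for i in range(len(a)):
        lo = a[i] if i == 0 else max(a[i], b[i - 1])
        hi = b[i] if i + 1 == len(a) else min(a[i + 1], b[i])
        m.append((lo + hi) / 2)  # dense time: the open gap may contain no integer
    return m


data = ["w", "x", "y", "z"]
a = [1, 2, 5, 6]  # times of the input operations
b = [3, 4, 7, 8]  # times of the output operations
m = midpoint_witness(a, b)
```

Here `a` and `b` satisfy the FIFO2 timing but not the FIFO1 timing, and the first intermediate moment has to fall strictly between 1 and 2, which a discrete model of time could not provide.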
2.6. Reo
The ABT model provides a simple formal foundation for definition and composition of components. The composition of ABTs (formalized as constraint automata or relations on timed data streams) supports a very flexible mechanism for software composition in component-based systems. This furnishes the desired level of composition flexibility we expect in a component model. However, composing components directly with one another in this way does not immediately confer the power of exogenous coordination to the glue code. The ABT model itself does not provide primitives to directly express any form of non-trivial coordination; for that, we need an effective exogenous coordination model. Reo is a channel-based exogenous coordination model wherein complex coordinators, called connectors, are compositionally built out of simpler ones [2, 7, 8]. The simplest connectors in Reo are a set of channels with well-defined behavior supplied by users. Reo can be used as a language for coordination of concurrent processes, or as a "glue language" for compositional construction of connectors that orchestrate component instances in a component-based system. The emphasis in Reo is on connectors and their composition only, not on the entities that connect to, communicate, and cooperate through these connectors. Each connector in Reo imposes a specific coordination pattern on the entities (e.g., component instances) that perform I/O operations through that connector, without the knowledge of those entities. We propose Reo as a model and language for compositional specification and construction of reusable coordinating glue-code. Pieces of glue-code in Reo can connect and exogenously coordinate the behavior of components. They can also just as easily connect and exogenously coordinate the behavior of other pieces of Reo glue-code, thus enabling compositional construction of the glue-code itself. All composition in Reo consists of channel composition.
Channel composition in Reo is a very powerful mechanism for construction of connectors. The expressive power of connector composition in Reo has been demonstrated through many examples in [2, 8]. For instance, exogenous coordination patterns that can be expressed as (meta-level) regular expressions over I/O operations performed by component instances can be composed in Reo out of a small set of only five primitive channel
types [2]. A Turing machine consists of a finite state automaton for its control, and an unbounded tape. Since an unbounded tape can be simulated by two unbounded FIFO channels, adding FIFO to the above set of channel types makes channel composition in Reo Turing complete. A mobile channel allows (physical or logical) relocation of one of its ends without the knowledge or the involvement of the entity at its other end. Logical mobility changes the topology of the interconnections of communicating entities, while physical mobility can have other implications, e.g., on an entity's (efficiency of) access to various resources. An efficient distributed implementation of channels supporting this notion of mobility is described in [6]. Both component instances and channels are mobile in Reo. Logical mobility of channel ends in Reo allows dynamic reconfiguration of connectors, even while they are being used by component instances. In this respect, Reo resembles dynamically reconfigurable generalized Kahn networks, as in IWIM [1] and Manifold [10], and its dataflow nature is also related to Broy's timed dataflow model, although Reo is more general and more expressive than these and similar models. Much as Reo supports physical mobility through its move operation to allow more efficient flow of data, it ascribes no semantic significance to it. The move operation does not semantically affect connector topologies, flow of data, or connectivity of components to connectors. An important aspect of Reo which is not covered in this chapter is that the topology of connectors in Reo is inherently dynamic. This means that the configuration of a component-based system can dynamically change not only due to dynamic construction and connection/disconnection of connectors and component instances, but also - and more interestingly - due to dynamic reconfiguration of instantiated connectors even as they are actively in use. Moreover, Reo supports a very liberal notion of channels.
As such, Reo is more general than dataflow models, Kahn-networks, and Petri-nets, all of which can be viewed as specialized channel-based models that incorporate certain specific primitive coordination constructs. Broy's work on timed dataflow channels [11, 12] is perhaps closest to Reo. Nevertheless, Reo's more general notion of channels, its inherent dynamic topology, its powerful exogenous coordination that uses a clear separation of flows of data and time, and the fundamental notion of channel/connector composition that allows, among other things, compositions involving an expressive mix of synchrony and asynchrony, distinguish it from this model as well. It turns out that the ABT model is quite adequate for defining the channel and connector composition operation which is the crux of exogenous coordination in Reo. In the rest of this section we show how connector construction in Reo can be seen as an application of the ABT model.
2.6.1. Channels and Connectors
Channels are the only primitive medium of communication between two components in Reo. The notion of channel in Reo is far more general than its common interpretation. A channel in Reo has its own unique identity and always has exactly two directed ends, each with its own unique identity. Based on their direction, there are two types of channel ends: source and sink ends. Data enters through a source
channel end into its respective channel, and it leaves through a sink channel end from its respective channel. (Channels themselves have no direction in Reo, only their ends do.) Beyond a small set of mild obvious requirements, such as enabling I/O operations to read/write data items from/to their ends, Reo places no restrictions on the behavior of channels. This allows an open-ended set of different channel types to be used simultaneously together in Reo, each with its own policy for synchronization, buffering, ordering, computation, data retention/loss, etc. Some typical examples of conventional channels are the ones defined in Section 2.5.3. These channels happen to each have a source end and a sink end. More unconventional channels are also possible in Reo, especially because a channel can also have only two source ends or only two sink ends. A few examples of some such exotic channels appear in Section 2.6.2; even more examples are presented in [2, 7]. Strictly speaking, Reo itself neither provides nor assumes the availability of any specific set of channel types; it simply assumes that an appropriate assortment of channel types, each with its properly well-defined semantics, is provided by users for it to operate on. Nevertheless, it is reasonable to expect that in practice certain most primitive channel types, e.g., synchronous channels, will always be made available in all cases.
Reo defines a connector as a set of channel ends and their connecting channels organized in a graph of nodes and edges such that:
• Zero or more channel ends coincide on every node.
• Every channel end coincides on exactly one node.
• There is an edge between two (not necessarily distinct) nodes if and only if there is a channel one end of which coincides on each of those nodes.
We use $x \mapsto N$ to denote that the channel end $x$ coincides on the node $N$, and $\hat{x}$ to denote the unique node on which the channel end $x$ coincides.
For a node $N$, we define the set of all channel ends coincident on $N$ as $[N] = \{x \mid x \mapsto N\}$, and disjointly partition it into the sets $Src(N)$ and $Snk(N)$, denoting the sets of source and sink channel ends that coincide on $N$, respectively.
Observe that nodes are neither components nor locations. Although some nodes are attached to component instances to allow their exchange of information, nodes and components are different notions and not every node can be associated with or attached to a component instance. A node is a fundamental concept in Reo representing an important topological property: all channel ends $x \in [N]$ coincide on the same node $N$. This property entails specific implications in Reo regarding the flow of data among the channel ends $x \in [N]$, irrespective of concern for the location of those channel ends or $N$, or the possible attachment of $N$ to a component instance.
A node $N$ is called a source node if $Src(N) \ne \emptyset \wedge Snk(N) = \emptyset$. Analogously, $N$ is called a sink node if $Src(N) = \emptyset \wedge Snk(N) \ne \emptyset$. A node $N$ is called a mixed node if $Src(N) \ne \emptyset \wedge Snk(N) \ne \emptyset$. By the above definition, every channel represents a (simple) connector with two nodes.
From the point of view of Reo a port of a component instance is just a node that (initially) contains a single channel end. An input port is (initially a singleton)
source node, and an output port is (initially a singleton) sink node. From the point of view of a component instance, each of its ports is merely a simple connector corresponding to a synchronous channel (the node of) one end of which is made publicly accessible for I/O by its environment, while (the node of) its other end is hidden for exclusive use by the component instance itself. An output port of a component instance has the sink node of its synchronous channel public while its source node is available only for I/O operations performed by that component instance. Likewise, an input port has the source node of its synchronous channel public while its sink node is hidden for exclusive use by its component instance. Reo provides I/O operations on source and sink nodes only; components cannot read from or write to mixed nodes. A component instance can write to a source node or read from a sink node using node I/O operations of Reo only if it is connected to that node. Connection of a node to a component instance gives the latter the exclusive right to perform I/O operations on that node. Reo provides operations to change the connection of nodes to component instances dynamically, but a node can be connected to at most a single component instance at any given time. This is a prerequisite for the formal notion of compositionality presented in [5].
The graph representing a connector is not directed. However, for each channel end $x_c$ of a channel $c$, we use the directionality of $x_c$ to assign a local direction in the neighborhood of $x_c$ to the edge that represents $c$. The local direction of the edge representing a channel $c$ in the neighborhood of the node of its source $x_c$ is presented as an arrow emanating from $x_c$. Likewise, the local direction of the edge representing a channel $c$ in the neighborhood of the node of its sink $x_c$ is presented as an arrow pointing to $x_c$. See Figure 2.5 for examples.
Complex connectors are constructed in Reo out of simpler ones using its join operation. The join operation in Reo is defined only on nodes. Joining two nodes $N_1$ and $N_2$ destroys both nodes and produces a new node $N$ with the property that $[N] = [N_1] \cup [N_2]$. This single operation allows construction of arbitrarily complex connector graphs involving any combination of channels picked from an open-ended set of channel types. The semantics of a connector is defined as a composition of the semantics of its (1) constituent channels, and (2) nodes. Because Reo does not provide any channels, it does not define their semantics either. What Reo defines is the composition of channels into connectors and the semantics of this composition through the semantics of its (three types of) nodes. Intuitively, a source node replicates every data item written to it as soon as all of its coincident source channel ends can consume that data item. Reading from a sink node nondeterministically selects one of the data items available through its coincident sink channel ends. A mixed node is a self-contained "pumping station" that combines the behavior of a sink node and a source node in an atomic iteration of an infinite loop: in each atomic iteration it nondeterministically selects an appropriate data item available through its coincident sink channel ends and replicates that data item into all of its coincident source channel ends. A data item is appropriate for selection in an iteration only if it can be consumed by all source channel ends that coincide on that node. The behavior of every Reo node can be defined as a composition of two primitive ABTs [2]: a non-deterministic merger, and a replicator, as defined in Section 2.5.3.2.
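The topological side of the join operation and the three node types can be sketched as a small data structure. This is an illustration only and not Reo's actual API: the class `Node`, the functions `join` and `channel`, and the string names given to channel ends are hypothetical, and the sketch captures which channel ends coincide on a node, not the replicate/merge semantics of data flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    src: frozenset = frozenset()  # source channel ends coincident on this node
    snk: frozenset = frozenset()  # sink channel ends coincident on this node

    def ends(self):
        """All channel ends coincident on the node, i.e. the set [N]."""
        return self.src | self.snk

    def kind(self):
        if self.src and self.snk:
            return "mixed"
        if self.src:
            return "source"
        if self.snk:
            return "sink"
        return "empty"  # a node on which zero channel ends coincide


def join(n1, n2):
    """Replace n1 and n2 by a node N whose ends are the union of theirs."""
    return Node(n1.src | n2.src, n1.snk | n2.snk)


def channel(name):
    """A channel with one source and one sink end, as a two-node connector."""
    return (Node(src=frozenset({name + ".src"})),
            Node(snk=frozenset({name + ".snk"})))
```

Joining the sink node of one such channel with the source node of another yields a mixed node, which is how two channels are chained into a larger connector.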
Every edge of a connector corresponds to a channel whose semantics is defined as an ABT. Since a connector consists of (three types of) nodes and edges, all of whose semantics are defined as ABTs, the semantics of every connector in Reo can be derived as a composition of the ABTs of its constituent nodes and edges.
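The effect of join on the sets of coincident channel ends can be illustrated with a minimal sketch. It assumes that a node is represented simply by the set of channel ends that coincide on it; the names ChannelEnd, join, and node_type are hypothetical and are not part of Reo itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelEnd:
    """Hypothetical representation of one end of a channel."""
    channel: str  # name of the channel this end belongs to
    kind: str     # "source" (data enters the channel) or "sink" (data leaves it)


def join(n1: frozenset, n2: frozenset) -> frozenset:
    """Join two nodes: the new node carries the union of their channel ends."""
    return n1 | n2


def node_type(node: frozenset) -> str:
    """Classify a node by the kinds of channel ends that coincide on it."""
    if not node:
        raise ValueError("a node must have at least one coincident channel end")
    kinds = {end.kind for end in node}
    if kinds == {"source"}:
        return "source"
    if kinds == {"sink"}:
        return "sink"
    return "mixed"


# Joining the sink end of channel ab with the source end of channel cd
# yields a mixed node on which both ends coincide.
sink_node = frozenset({ChannelEnd("ab", "sink")})
source_node = frozenset({ChannelEnd("cd", "source")})
joined = join(sink_node, source_node)
```

In this representation the property [N] = [N1] ∪ [N2] is just set union, and the three node types follow from inspecting the kinds of the coincident ends.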
2.6.2. A Cogent Set of Primitive Channels
To demonstrate the utility of Reo we must supply it with a set of primitive channels. The fact that Reo accepts, and the ABT model allows, definition of an open-ended set of arbitrarily complex channels is interesting. What is more interesting, however, is that connector composition in Reo is itself powerful enough to yield surprisingly expressive complex connectors out of a very small set of trivially simple channels. A useful set of primitive channels for Reo consists of 7 channel types: Sync, FIFO, FIFO1, FIFO1(D), Filter(P), LossySync, and SyncDrain. This is not a minimal set, in the sense that some of the channel types in this set can themselves be composed in Reo out of others; however, minimality is not our concern here and these channel types turn out to be both simple and frequently useful enough to deserve their own explicit mention. The first four channel types were defined as ABTs in Section 2.5.3. We define the ABTs for the rest below. The common characteristics of the last three channels, above, are that they are all (1) synchronous, and (2) lossy. None of these channels has a buffer to store data and each, if necessary, delays the I/O operation on either one of its ends until it is matched with an I/O operation on its other end. A channel is lossy if it does not deliver through its sink end every data item it consumes through its source end. The difference between these three channels is in their loss policy.
(1) A Filter(P) channel is a synchronous channel with a source and a sink end that takes a pattern P parameter upon its creation. It behaves like a Sync channel, except that only those data items that match the pattern P can actually pass through it; others are always accepted by its source, but are immediately lost. The behavior of such a channel is captured by the Filter(P) ABT defined as

\[
\langle\alpha, a\rangle\ \mathit{Filter}(P)\ \langle\beta, b\rangle \equiv
\begin{cases}
\beta(0) = \alpha(0) \wedge b(0) = a(0) \wedge \langle\alpha', a'\rangle\ \mathit{Filter}(P)\ \langle\beta', b'\rangle & \text{if } \alpha(0) \ni P \\
\langle\alpha', a'\rangle\ \mathit{Filter}(P)\ \langle\beta, b\rangle & \text{otherwise}
\end{cases}
\]

The infix operator $\alpha(0) \ni P$ denotes whether or not the data item $\alpha(0)$ matches with the pattern P. If so, $\alpha(0)$ passes through, otherwise it is lost, and the ABT proceeds with the rest of its timed data streams.

(2) A LossySync channel is also like a Sync channel except that it is always ready to consume every data item written to its source end. If a matching read operation is pending at its sink, the data item written to its source is actually transferred; otherwise, the written data item is lost. The behavior of this channel is captured
by the LossySync ABT defined as

\[
\langle\alpha, a\rangle\ \mathit{LossySync}\ \langle\beta, b\rangle \equiv
\begin{cases}
\langle\alpha, a\rangle\ \mathit{LossySync}\ \langle\beta, a(0).b'\rangle & \text{if } a(0) > b(0) \\
\beta(0) = \alpha(0) \wedge \langle\alpha', a'\rangle\ \mathit{LossySync}\ \langle\beta', b'\rangle & \text{if } a(0) = b(0) \\
\langle\alpha', a'\rangle\ \mathit{LossySync}\ \langle\beta, b\rangle & \text{otherwise}
\end{cases}
\]

(3) A SyncDrain is a channel with two source ends. Because it has no sink end, it has no way to ever produce any data items. Consequently, every data item written to its source ends is simply lost. SyncDrain is synchronous because a write operation on one of its ends remains pending until a write is performed on its other end as well; only then both write operations succeed together. The behavior of this channel is captured by the SyncDrain ABT defined as

\[
\mathit{SyncDrain}(\langle\alpha, a\rangle, \langle\beta, b\rangle;) \equiv a = b
\]

2.6.3. Coordinating Glue Code
To demonstrate the expressive power of connector composition, in this section we describe a number of examples in Reo. More examples are presented elsewhere [2, 7, 8].
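Since the examples that follow rely on the three lossy synchronous channels of Section 2.6.2, it may help to see their loss policies operationally. The following is only a sketch under simplifying assumptions: a timed data stream is approximated by a finite list of (time, value) pairs with strictly increasing times, and the pending reads at the sink of a LossySync are given as a list of instants. The function names are hypothetical.

```python
from typing import Any, Callable, List, Tuple

# A finite prefix of a timed data stream: (time, value) pairs, times increasing.
TimedStream = List[Tuple[int, Any]]


def filter_channel(inp: TimedStream, matches: Callable[[Any], bool]) -> TimedStream:
    """Filter(P): matching items pass at the same instant, the others are lost."""
    return [(t, v) for (t, v) in inp if matches(v)]


def lossy_sync(inp: TimedStream, read_times: List[int]) -> TimedStream:
    """LossySync: an item passes only if a read is pending at that instant."""
    ready = set(read_times)
    return [(t, v) for (t, v) in inp if t in ready]


def sync_drain_ok(a: TimedStream, b: TimedStream) -> bool:
    """SyncDrain: writes on both ends must happen at exactly the same instants."""
    return [t for (t, _) in a] == [t for (t, _) in b]


writes = [(1, 10), (2, 11), (3, 12), (4, 13)]
evens = filter_channel(writes, lambda v: v % 2 == 0)
delivered = lossy_sync(writes, read_times=[2, 4])
```

The sketch keeps only the input/output relation on finite prefixes; it does not model pending operations or infinite streams, which the ABT definitions capture coinductively.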
Fig. 2.5. Examples of connectors in Reo
2.6.3.1. Write-Cue Regulator

Consider the connector in Figure 2.5.a, composed out of the three channels ab, cd, and ef. Channels ab and cd are of type Sync and ef is of type SyncDrain. This connector shows one of the most basic forms of exogenous coordination: the number of data items that flow from a to d is the same as the number of write operations that succeed on f. (Recall that a designates the unique node on which the channel end a coincides.) The analogy between the behavior of this connector and a transistor in the world of electronic circuits is conspicuous. A component instance with a port connected to f can count and regulate the flow of data between the two nodes a and d by the timing and the number of write operations it performs on f. The entity that regulates and/or counts the number
of data items through f need not know anything about the entities that write to a and/or take from d, nor that its write actions actually regulate this flow. The two entities that communicate through a and d need not know anything about the fact that they are communicating with each other, nor that the volume of their communication is regulated and/or measured by a third entity at f.

2.6.3.2. Barrier Synchronizers
We can build on our write-cue regulator to construct a barrier synchronization connector, as in Figure 2.5.b. The four channels ab, cd, gh, and ij are all of type Sync. The SyncDrain channel ef ensures that a data item passes from a to d only simultaneously with the passing of a data item from g to j (and vice versa). This simple barrier synchronization connector can be trivially extended to any number of pairs, as shown in Figure 2.5.c.

2.6.3.3. Ordering

The connector in Figure 2.5.d consists of three channels: ab, ac, and bc. The channels ab and ac are SyncDrain and Sync, respectively. The channel bc is of type FIFO1. The behavior of this connector can be seen as imposing an order on the flow of the data items written to a and b, through to c: the data items obtained by successive read operations on c consist of the first data item written to a, followed by the first data item written to b, followed by the second data item written to a, followed by the second data item written to b, etc. See [2] for more detail and [8] for a formal treatment of this connector. The coordination pattern imposed by our connector can be summarized as c = (ab)*, meaning the sequence of values that appear through c consists of zero or more repetitions of the pairs of values written to a and b, in that order.

2.6.3.4. Sequencer

Consider the connector in Figure 2.5.e. The enclosing box represents the fact that the details of this connector are abstracted away and it provides only the four nodes a, b, c, and d for other entities (connectors and/or component instances) to (in this case) read from. Inside this connector, we have four Sync, a FIFO1(o), and three FIFO1 channels connected together. The FIFO1(o) channel is the leftmost one and is initialized to have a data item in its buffer, as indicated by the presence of the symbol "o" in the box representing its buffer. The actual value of this data item is irrelevant.
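One way to picture the ring of buffers just described is as a single token that circulates through the FIFO1 channels, enabling one exposed node at a time. The sketch below is an assumption-laden simplification: it tracks only which buffer currently holds the token and treats a read that cannot proceed as returning False instead of remaining pending. The class name Sequencer and its methods are hypothetical.

```python
class Sequencer:
    """Token-ring sketch of an n-node sequencer (not Reo's own API)."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Index of the buffer that currently holds the token; the leftmost
        # buffer (the initially filled FIFO1) starts full.
        self.token_at = 0

    def try_read(self, name: str) -> bool:
        """Attempt a read on the named node; succeed only if it is enabled."""
        index = self.node_names.index(name)
        if index != self.token_at:
            return False  # in Reo this read would stay pending
        # The token moves atomically into the next buffer of the ring.
        self.token_at = (index + 1) % len(self.node_names)
        return True


seq = Sequencer(["a", "b", "c", "d"])
attempts = [seq.try_read(n) for n in ["b", "a", "a", "b", "c", "d", "a"]]
```

Adding or removing names in the constructor corresponds to inserting more or fewer buffer and Sync pairs into the ring.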
The read operations on the nodes a, b, c, and d can succeed only in the strict left to right order. This connector implements a generic sequencing protocol: we can parameterize this connector to have as many nodes as we want, simply by inserting more (or fewer) Sync and FIFO1 channel pairs, as required. Figure 2.5.f shows a simple example of the utility of our sequencer. The connector in this figure consists of a two-node sequencer, plus a pair of Sync channels and a SyncDrain channel connecting each of the nodes of the sequencer to the nodes a and c, and b and c, respectively. The connector in Figure 2.5.f is another connector
for the coordination pattern c = (ab)*, although there is a subtle difference between the behavior of this connector and the one in Figure 2.5.d. See [2] for more detail. It takes little effort to see that the connector in Figure 2.5.g corresponds to the meta-regular expression c = (aab)*. Figures 2.5.f and g show how easily we can construct connectors that exogenously impose coordination patterns corresponding to the Kleene-closure of any "meta-word" made up of atoms that stand for I/O operations, using a sequencer of the appropriate size.

2.6.4. Constant Replacer
Figure 2.6 shows a Reo connector (encapsulated in the outermost thick box, hiding mixed nodes N1 and N2) with one exposed input node (i.e., source node A) and one exposed output node (i.e., sink node B). This connector is composed out of four channels: a SyncDrain (A-N1), a Sync (N1-B), a FIFO1 (N1-N2), and a filled FIFO1(T) (N2-N1) that contains an initial value T. Of course, the constructor of this connector can be parameterized to initialize this FIFO1 channel with any supplied value, instead of T, every time it creates a new instance of this circuit.
Fig. 2.6. Constant replacer
Using the ABT definitions of these channels and those of its nodes, we can find the relationship between the two timed data streams $\langle\alpha, a\rangle$ and $\langle\beta, b\rangle$ that pass through the nodes A and B, respectively [3]. We can formally show that there is no relationship between the stream of input values, $\alpha$, and the stream of output values, $\beta$, of this connector: whatever value comes through the node A, its corresponding output value through the node B is the constant value T. On the other hand, the input/output "timings" of this connector are formally related: passage of each pair of values through the nodes A and B is atomic. As a side note, it is interesting to observe that this relationship and other insights we gain, below, through a formal treatment of the behavioral equations of this connector, all correspond to and confirm the intuitive impression that we get through
an informal reasoning using the schematic of this connector in Figure 2.6. This observation underscores the usefulness and the significance of visual representation of Reo connectors.
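The input/output relation stated above for the constant replacer can be written down as a short sketch, assuming finite prefixes of timed data streams represented as lists of (time, value) pairs. The function name constant_replacer is hypothetical, and the sketch encodes only the end-to-end relation, not the internal channels of Figure 2.6.

```python
from typing import Any, List, Tuple

TimedStream = List[Tuple[int, Any]]


def constant_replacer(inp: TimedStream, constant: Any) -> TimedStream:
    """Each input passing through A yields the constant at B at the same instant.

    The input values are discarded (no relation between value streams);
    the time stamps are preserved (the passage through A and B is atomic).
    """
    return [(t, constant) for (t, _) in inp]


out = constant_replacer([(1, "x"), (4, "y"), (9, "z")], constant="T")
```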
2.6.5. Fibonacci Series
A simple example of how a composition of a set of components yields a system that delivers more than the sum of its parts is the computation of the classical Fibonacci series. To assemble a component based application to deliver this series we actually need only one (instance of one) component plus a number of channels. The component we need is a realization of the Sum ABT that we already saw in Section 2.5.3.
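As a preview of the circuit discussed with Figure 2.7, the following sketch simulates the movement of values between its three buffers and the Sum component, one full cycle at a time. It is a sequential approximation under stated assumptions: the three variables stand for the FIFO1(1), FIFO1(0), and initially empty FIFO1 channels, None marks an empty buffer, and the environment is assumed to accept each result immediately. The generator name fibonacci_circuit is hypothetical.

```python
from itertools import islice


def fibonacci_circuit():
    """Yield the values that leave the circuit through its output port."""
    fifo_one = 1       # the FIFO1(1) channel
    fifo_zero = 0      # the FIFO1(0) channel
    fifo_plain = None  # the initially empty FIFO1 channel
    while True:
        # The value in FIFO1(0) moves into the empty FIFO1 channel.
        fifo_plain, fifo_zero = fifo_zero, None
        # Sum consumes the values of FIFO1(1) and FIFO1 ...
        left, right = fifo_one, fifo_plain
        fifo_one, fifo_plain = None, None
        # ... and the intake from FIFO1(1) also copies that value into FIFO1(0).
        fifo_zero = left
        total = left + right
        # The environment accepts the result ...
        yield total
        # ... and a copy of it refills the now empty FIFO1(1).
        fifo_one = total


first_values = list(islice(fibonacci_circuit(), 6))
```

Nothing in the sketch mentions Fibonacci numbers except its name: the series emerges only from how the buffers are wired to the adder, which is the point the section makes about glue code.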
Fig. 2.7. Computing the Fibonacci series
Figure 2.7 shows a component (the outermost thick enclosing box) with only one output port (the only exposed node on the right border of the box). This is our component-based application for computing the Fibonacci series. Peeking inside this component, we see how it is made out of an instance of Sum, a FIFO1(1), a FIFO1(0), a FIFO1, and five Sync channels. As long as the FIFO1(0) channel is full, nothing can happen: there is no way for the value in FIFO1(1) to move out. At some point in time, the value in FIFO1(0) moves into the FIFO1 channel. Thereafter, the FIFO1(0) channel becomes empty and the two values in the FIFO1(1) and the FIFO1 channels become available for Sum to consume. The intake of the value in FIFO1(1) by Sum inserts a copy of the same value into the FIFO1(0) channel. When Sum is ready to write its computed value out, it suspends waiting for some entity in the environment to accept this value. Transfer of this value to the entity in the environment also inserts a copy of the same value into the now empty FIFO1(1) channel. At this point we are back to the initial state, but with different values in the buffers of the FIFO1(1) and the FIFO1(0) channels. The ABT models of the component Sum, channels, and Reo nodes that we presented earlier suffice for a formal analysis of the behavior of their composition in this example. Observe that all entities involved in this composed application are completely generic and, of course, none of them knows anything about the Fibonacci series, nor about the fact that it is "cooperating" with other entities to compute it. It
is the specific glue code of this application, made by composing 8 simple generic channels in a specific topology in Reo, that coordinates the communication of the components (in this case, only one) with one another (in this case, with itself) and the environment to compute this series.

2.6.6. Dining Philosophers
We can vividly demonstrate the significance of exogenous coordination in component-based system composition through the classical dining philosophers problem. In this section we use instances of two components, each of which is a realization of one of the two ABTs Phil and Chop defined in Section 2.5.3.4, to (1) compose a dining philosophers application that exhibits the famous deadlock problem; and (2) compose another dining philosophers application that prevents the deadlock.
Fig. 2.8. Dining philosophers in Reo (panels a and b)
Figure 2.8.a shows 4 philosophers and 4 chopsticks around a virtual round table. Each philosopher has 4 output ports, corresponding to the lt, lf, rt, and rf portals of the Phil ABT in Section 2.5.3.4. In this figure, philosophers face the table, thus their sense of left and right is obvious. Each chopstick has two input ports, corresponding to the t and f input portals of the Chop ABT in Section 2.5.3.4. In Figure 2.8.a, chopstick ports on the outer edge of the table are their t ports and the ones closer to the center of the table are their f ports. The t (take) port of each chopstick is connected to the take ports of its adjacent philosophers, and its f port to their respective free ports. All channels are of type Sync. Consider what happens in the node at the three-way junction connected to the t port of Chop1. If Chop1 is free and is ready to accept a token through its t port, as it initially is, whichever one of the two philosophers Phil1 and Phil4 happens to write its take request token first will succeed in taking Chop1. Of course, it is possible for Phil1 and Phil4 to attempt to take Chop1 at the same time. In this case, the
semantics of this mixed node (by the definition of the ABT Mrg) guarantees that only one of them succeeds, nondeterministically; the write operation of the other remains pending until Chop1 is free again. Because the definition of the ABT Phil states that a philosopher frees a chopstick only after it has taken it, there is never any contention at the three-way junction connected to the f port of a chopstick. The composition of channels in this Reo application enables philosophers to repeatedly go through their "eat" and "think" cycles at their leisure, resolving their contentions for taking the same chopsticks nondeterministically. The possibility of starvation is ruled out because the nondeterminism in Mrg is assumed to be fair. This simple glue code composed of nothing but common generic Sync channels directly renders a faithful implementation of the dining philosophers problem, all the way down to its possibility of deadlock. Because all philosophers are instances of the same component, they all attempt to fetch their chopsticks in the same order. The Phil ABT defines this to be left-first. If all chopsticks are free and all philosophers attempt to take their left chopsticks at the same time, of course, they will all succeed. However, this leaves no free chopstick for any philosopher to take before it can eat. No philosopher will relinquish its chopstick before it finishes its eating cycle. Therefore, this application deadlocks, as expected.
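The deadlock argument above can be checked mechanically on a small abstraction of the system. The sketch below assumes a simplified model: each philosopher takes a first and then a second chopstick and afterwards frees both in one step, and chopsticks are numbered so that philosopher i has chopstick i on its left and chopstick (i + 1) mod n on its right. This numbering and the name has_reachable_deadlock are hypothetical. The second topology swaps the two connections of one philosopher, as in the composition discussed in Section 2.6.6.1.

```python
from collections import deque


def has_reachable_deadlock(order) -> bool:
    """order[i] = (first, second) chopstick that philosopher i tries to take.

    A state records, per philosopher, how many chopsticks it holds (0, 1, 2).
    Breadth-first search looks for a reachable state with no enabled move.
    """
    n = len(order)
    start = (0,) * n
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        held = set()
        for i, stage in enumerate(state):
            held.update(order[i][:stage])
        successors = []
        for i, stage in enumerate(state):
            if stage == 2:
                # Finished eating: free both chopsticks.
                successors.append(state[:i] + (0,) + state[i + 1:])
            elif order[i][stage] not in held:
                # The next chopstick is free: take it.
                successors.append(state[:i] + (stage + 1,) + state[i + 1:])
        if not successors:
            return True
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


n = 4
left_first = [(i, (i + 1) % n) for i in range(n)]
flipped = list(left_first)
flipped[1] = (left_first[1][1], left_first[1][0])  # swap one philosopher's wiring

deadlock_in_original = has_reachable_deadlock(left_first)
deadlock_in_flipped = has_reachable_deadlock(flipped)
```

The philosophers themselves are identical in both runs; only the table that maps their ports to chopsticks differs, which mirrors the purely topological nature of the change.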
2.6.6.1. Avoiding the Deadlock

Interestingly, with Reo, solving the deadlock problem requires no extra code, central authority, or modification to any of the components. In order to prevent the possibility of a deadlock, all we need to do is to change the way in which we compose our application out of the very same components. Figure 2.8.b shows a slightly different composition topology of the same set of Sync channels comprising the glue code that connects the exact same instances of Phil and Chop as before. We have flipped one philosopher's left and right connections to its adjacent chopsticks (in this particular case, those of Phil2) without its knowledge. None of the components in the system is aware of this change, nor is any of them modified in any way to accommodate it. Our flipping of these connections is purely external to all components. It is not difficult to see why this new topology prevents deadlock. If all philosophers attempt to take their left chopsticks now at the same time, one of them, namely Phil2, will actually reach for the one on its right-hand side. Of course, Phil2 is unaware of the fact that as it reaches out through its left port to take its first chopstick, it is actually the one on its right-hand side it competes to take. In this case it competes with Phil3, which is also attempting to take its first chopstick. It makes no difference which one of the two wins this competition; one will be denied access to its first chopstick. This ensures that at least one chopstick will remain free (no philosopher attempts to take the chopstick on the left-hand side of Phil2 as its first chopstick) to enable at least one philosopher to obtain its second chopstick as well and complete its eating cycle. Comparing the composition topologies in Figures 2.8.a and b, we see that in Reo (1) different glue code connecting the same components produces different system behavior; and (2) coordination protocols are imposed by glue code on components
that cooperate with one another through the glue code, without being aware of each other or their cooperation. The two fundamental notions that underpin this pair of highly desirable provisions are:

• The underlying notion of component (Section 2.2) in the ABT model prevents a component from distinguishing individual entities within its environment directly. Components can exchange only passive data with their environment through communication primitives that (1) do not allow them to discern specific targets as communication partners, and (2) do not entail any further obligation on behalf of the environment. The ABT model of components, thus, grants the environment great flexibility in making late, even dynamic, decisions about how components are composed. This makes ABT components highly susceptible to exogenous coordination, although the ABT model itself offers no non-trivial coordination primitives.

• Reo is a coordination model that takes full advantage of the composition flexibility inherent in the ABT model and offers a calculus of connector composition based on a user-defined set of primitive channels, all defined as ABTs. The crux of this calculus is the join operator in Reo for composing channel ends into composite nodes, and the specific semantics it defines for these nodes as ABTs. Connector composition in Reo offers a simple yet surprisingly expressive exogenous coordination model that effectively exploits the flexibility of behavior specification in the ABT model.

The two systems in Figures 2.8.a and b are made of the same number of constituent parts of the same types: the same number of component instances of the same kinds, and the same number of primitive connectors (Sync channels). The only difference between the two is in the topology of their inter-connections. This topological difference is the only cause of the difference between the "more than sum of the parts" in these two systems.

2.6.6.2. Making of a Chopstick

A moment of reflection reveals that, especially since there is no computation involved in the behavior of a chopstick, it should be easy to realize the behavior defined by the ABT Chop through channel composition. The behavior defined as Chop is indeed all coordination: it must alternate enabling the write operations on one (t) then on the other (f) of its two input ports. Indeed, we can easily use a two-port sequencer (Figure 2.5.e) plus two SyncDrain channels to realize this behavior. But a much simpler construction is possible as well.
Fig. 2.9. Inside of a chopstick
The connector hidden inside the enclosing box in Figure 2.9 is a simplified two-port sequencer which exactly implements the behavior of the ABT Chop. This connector consists of two channels: a FIFO1 and a SyncDrain. Initially, the FIFO1 is empty, therefore enabling the first write to its port t to succeed immediately. While this channel is empty, a write to its f port suspends because there is no data item to be "simultaneously" consumed by the opposite (source) end of the SyncDrain. Once a write to t succeeds, the FIFO1 channel becomes full and the next write operation on port t will suspend until this channel becomes empty again. When the FIFO1 channel is full, a write to f succeeds, causing the SyncDrain channel to consume the contents of the FIFO1 channel as well. This returns the connector to its original state allowing it to cyclically repeat the same behavior.
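The alternation just described amounts to a two-state machine driven by the contents of the FIFO1 buffer. A minimal sketch, assuming that a write which would suspend is reported as False instead of remaining pending, and using the hypothetical class name Chopstick:

```python
class Chopstick:
    """Two-state sketch of the FIFO1 plus SyncDrain connector of Figure 2.9."""

    def __init__(self):
        self.buffer_full = False  # the FIFO1 channel starts empty

    def write_t(self) -> bool:
        """Take: succeeds only while the FIFO1 buffer is empty."""
        if self.buffer_full:
            return False  # in Reo this write would suspend
        self.buffer_full = True
        return True

    def write_f(self) -> bool:
        """Free: succeeds only while the buffer is full; the SyncDrain empties it."""
        if not self.buffer_full:
            return False  # in Reo this write would suspend
        self.buffer_full = False
        return True


chop = Chopstick()
trace = [chop.write_f(), chop.write_t(), chop.write_t(), chop.write_f(), chop.write_t()]
```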
Fig. 2.10. Using a sequencer to make a chopstick
Alternatively, one can observe that a chopstick is essentially a two-port sequencer that allows write operations to succeed sequentially on its ports. Thus, we can use a two-port version of our sequencer in Figure 2.5.e to construct a chopstick, as in Figure 2.10. The essential addition to the sequencer in this connector consists of the two SyncDrain channels. The two Sync channels exist primarily for aesthetic reasons. The intuitive equivalence of the behavior of the two connectors in Figures 2.9 and 2.10 shows the necessity for formal models, e.g., based on constraint automata or timed-data-streams, to allow investigating the equivalence or subsuming relationships among the behavior of seemingly different connectors.
2.7. Time/Temperature Display Coordinator

The "chopstick" of Section 2.6.6.2 constitutes the cornerstone of the coordination behavior we need for the component connector in Figure 2.1.d. We can use either one of the constructions in Figures 2.9 and 2.10, or any other with equivalent behavior, as an off-the-shelf component/connector to construct our time/temperature display coordinator, as in Figure 2.11. The box labeled "Write-Sequencer" in Figure 2.11 represents a connector with the behavior of a chopstick. The C, D, and T ports of the connector in Figure 2.11
Fig. 2.11. Coordinating connector for the Time/Temperature Display system
represent the connections with the C, D, and T components in Figure 2.1.d, respectively. Assuming the "Write-Sequencer" is constructed as in Figure 2.9, we can verify the claim in Section 2.4 that our time/temperature display coordinator can be constructed out of 4 Sync channels, 1 FIFO1 channel, 1 SyncDrain channel, and 3 mergers (inherent in the Reo nodes C, T, and D). Figure 2.11, thus, represents a two-dimensional recipe or program describing precisely how the behavior of each constituent primitive (channel or node) must be composed with that of its topologically connected neighbors, to yield the desired behavior of the connector circuit.
2.8. Conclusions

Components are expected to be independent commodities, viable in their binary forms in the (not necessarily commercial) marketplace, developed, offered, exploited, deployed, integrated, maintained, and evolved by separate autonomous organizations in mutually unknown and unknowable contexts, over very long spans of time. The level of intimacy that is implicitly required of objects that compose by invoking each other's methods is simply too unrealistic in the world of such components. Heterogeneity of the semantics and the technical idiosyncrasies of various (versions of) object-oriented models make object-oriented composition of third-party black-box software inefficient and impractical, if not impossible. Furthermore, a substantial body of useful software is written in non-object-oriented languages. Nevertheless, this software, and many other not-purely-software systems, communicate with their environment through simple exchanges of passive data, and can sensibly be reused as building block components in the construction of more complex systems. Component models that rely on (variations of) object-oriented programming (e.g., components as fortified collections of objects) and its composition mechanism of method invocation do not offer the looser coupling and the flexible exogenous coordination framework that are necessary to support the composition of the more general non-object-oriented components.
Abstract Behavior Types offer a simpler and far more flexible model of components, and of their composition. An ABT is a mathematical construct that defines and/or constrains the behavior of an entity without any mention of the operations or the data types that may be used to realize that behavior. This puts the ABT model at a higher level of abstraction than ADTs and makes it more suitable for components. The endogenous nature of their composition means that it is not possible for a third party, e.g., an entity in the environment, to compose two objects (or two ADTs) "against their own will" so to speak. In contrast, the composition of any two ABTs is always well-defined and yields another ABT. Constraint automata and relations on timed data streams, as described in this chapter, are two concrete formalizations of the notion of ABT. The ABT model provides a simple formal foundation for definition and composition of components. However, direct composition of component ABTs does not generally provide much of an opportunity to systematically wield exogenous coordination. Reo is a channel-based exogenous coordination model that can be used as a glue language for dynamic compositional construction of component connectors in (non-)distributed and/or mobile systems. Connector construction in Reo can be seen as an application of the ABT model. A channel in Reo is just a special kind of an atomic connector (i.e., component): whereas components and connectors offer one or more ports to exchange information with their environment, a channel is an ABT that offers exactly two ports (i.e., its channel-ends) for interaction with its environment. Because all Reo connectors are ABTs, the semantics of channel composition in Reo can be defined in terms of ABT composition.
Bibliography

1. F. Arbab. The IWIM model for coordination of concurrent activities. In P. Ciancarini and C. Hankin, editors, Coordination Languages and Models, volume 1061 of Lecture Notes in Computer Science, pages 34-56. Springer-Verlag, April 1996.
2. F. Arbab. Reo: A channel-based coordination model for component composition. Mathematical Structures in Computer Science, 14(3):329-366, June 2004.
3. F. Arbab. Abstract Behavior Types: A foundation model for components and their composition. Science of Computer Programming, 55:3-52, March 2005. Extended version.
4. F. Arbab, C. Baier, J.J.M.M. Rutten, and M. Sirjani. Modeling component connectors in Reo by Constraint Automata. In Proc. International Workshop on Foundations of Coordination Languages and Software Architectures (FOCLASA 2003), volume 97 of Electronic Notes in Theoretical Computer Science (ENTCS), pages 25-46. Elsevier, July 2004.
5. F. Arbab, F.S. de Boer, and M.M. Bonsangue. A logical interface description language for components. In Antonio Porto and Gruia-Catalin Roman, editors, Coordination Languages and Models: Proc. Coordination 2000, volume 1906 of Lecture Notes in Computer Science, pages 249-266. Springer-Verlag, September 2000.
6. F. Arbab, F.S. de Boer, M.M. Bonsangue, and J.V. Guillen Scholten. MoCha: A framework for coordination using mobile channels. Technical Report SEN-R0128, Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands, December 2001.
7. F. Arbab and F. Mavaddat. Coordination through channel composition. In F. Arbab and C. Talcott, editors, Coordination Languages and Models: Proc. Coordination 2002, volume 2315 of Lecture Notes in Computer Science, pages 21-38. Springer-Verlag, April 2002.
8. F. Arbab and J.J.M.M. Rutten. A coinductive calculus of component connectors. In M. Wirsing, D. Pattinson, and R. Hennicker, editors, Recent Trends in Algebraic Development Techniques, Proceedings of 16th International Workshop on Algebraic Development Techniques (WADT 2002), volume 2755 of Lecture Notes in Computer Science, pages 35-56. Springer-Verlag, 2003.
9. L. Bergmans and M. Aksit. Composing crosscutting concerns using Composition Filters. Communications of the ACM, 17(10):51-57, October 2001.
10. M.M. Bonsangue, F. Arbab, J.W. de Bakker, J.J.M.M. Rutten, A. Scutella, and G. Zavattaro. A transition system semantics for the control-driven coordination language Manifold. Theoretical Computer Science, 240:3-47, 2000.
11. M. Broy and G. Stefanescu. The algebra of stream processing functions. Theoretical Computer Science, 258, 2001.
12. M. Broy and K. Stolen. Specification and development of interactive systems, volume 62 of Monographs in Computer Science. Springer, 2001.
13. J.W. de Bakker and J.N. Kok. Towards a Uniform Topological Treatment of Streams and Functions on Streams. In W. Brauer, editor, Proceedings of the 12th International Colloquium on Automata, Languages and Programming, volume 194 of Lecture Notes in Computer Science, pages 140-148, Nafplion, July 1985. Springer-Verlag.
14. H. Barringer, R. Kuiper, and A. Pnueli. A really abstract concurrent model and its temporal logic. In Proceedings of Thirteenth Annual ACM Symposium on Principles of Programming Languages, pages 173-183. ACM, 1986.
15. B. Jacobs and J.J.M.M. Rutten. A tutorial on (co)algebras and (co)induction. Bulletin of EATCS, 62:222-259, 1997.
16. G. Kahn. The semantics of a simple language for parallel programming. In J.L. Rosenfeld, editor, Information Processing '74: Proceedings of the IFIP Congress, pages 471-475. North-Holland, New York, NY, 1974.
17. J.N. Kok. Semantic Models for Parallel Computation in Data Flow, Logic- and Object-Oriented Programming. PhD thesis, Vrije Universiteit, Amsterdam, May 1989.
18. P. Panangaden and F. van Breugel, editors. Mathematical Techniques for Analyzing Concurrent and Probabilistic Systems. CRM Monograph Series. American Mathematical Society, 2004. ISSN: 1065-8599.
19. J.J.M.M. Rutten. Universal coalgebra: a theory of systems. Theoretical Computer Science, 249(1):3-80, 2000.
20. J.J.M.M. Rutten. Elements of stream calculus (an extensive exercise in coinduction). In S. Brookes and M. Mislove, editors, Proc. of 17th Conf. on Mathematical Foundations of Programming Semantics, Aarhus, Denmark, 23-26 May 2001, volume 45 of Electronic Notes in Theoretical Computer Science. Elsevier, Amsterdam, 2001.
21. J.J.M.M. Rutten. Component connectors. In [18], chapter 5, pages 73-87. 2004.
22. J.J.M.M. Rutten. A coinductive calculus of streams. Mathematical Structures in Computer Science, 15(1):93-147, February 2005.
Chapter 3
On the Semantics of Componentware: A Coalgebraic Perspective
Luis S. Barbosa†, Sun Meng‡, Bernhard K. Aichernig§ and Nuno Rodrigues‖

†Department of Informatics, Minho University, [email protected]
‡LMAM, School of Mathematical Science, Peking University, [email protected]. edu. en
§International Institute for Software Technology, United Nations University (UNU-IIST), [email protected]
‖Department of Informatics, Minho University, [email protected]

In this chapter we present a coalgebraic semantics for components. Our semantics forms the basis for a family of operators for combining components. These operators together with their algebraic laws establish a calculus for software components. We present two applications of our semantics: a coalgebraic interpretation of UML diagrams and the design of a component repository.
3.1. Introduction
By the end of the last century component-based software development [86, 89] emerged as a promising paradigm to deal with the ever increasing need for mastering complexity in software design, evolution and reuse. From object-orientation it retains the basic principle of encapsulation of data and code, but shifts the emphasis from (class) inheritance to (object) composition to avoid interference between the former and encapsulation and, thus, paves the way to a development methodology based on third-party assembly of components. The paradigm is often illustrated by the visual metaphor of a palette of computational units, treated as black boxes, and a canvas into which they can be dropped. Connections are established by drawing wires, corresponding to some sort of interfacing code. The expression software component, however, is so semantically overloaded that its use is often a risk. As put by P. Wadler in a 1999 Seminar suggestively entitled 'Component-based Programming under different paradigms', just as Eskimos need fifty words for ice, perhaps we need many words for components. Moreover, as happened before with object-orientation, and software engineering in the broad sense, component-orientation has grown into a collection of popular technologies,
methods and tools, before consensual definitions and principles (let alone formal foundations) have been put forward.

This chapter is concerned with a formalization of component-based development. Its overall theme is the quest for suitable, mathematically sound, semantic models upon which practical component calculi can be established. The emphasis on formalization entails the need to restrict the broad scope of componentware to a specific, although popular, family of components. Therefore, in the sequel, software components will be represented by their specifications. A state-based style will be used, as it arises, for example, within the so-called model oriented approach to formal systems design, a widespread paradigm of which VDM [36], Z [79], B [1] and RAISE [87] are well-known representatives.

A typical example of such a state-based component is the ubiquitous stack. Denoting by $U$ its internal state, a stack of values of type $P$ is handled through the usual operations

$$top : U \to P \qquad pop : U \to P \times U \qquad push : U \times P \to U$$

An alternative, 'black box' view hides $U$ from the stack environment and regards each operation as a pair of input/output ports. Such a 'port' signature of, e.g., the top operation is then given by

$$top : 1 \to P$$

where $1$ stands for the nullary (or unit) datatype. The intuition is that top is activated with the simple pushing of a 'button' (its argument being the stack private state space) whose effect is the production of a $P$ value in the corresponding output port. Similarly, typing push as

$$push : P \to 1$$

means that an external argument is required on activation but no visible output is produced, except for a trivial indication of successful termination. Such 'port' signatures are grouped together in the diagram below. Note how input (respectively, output) 'ports' are represented by the sum of their parameters. Such sums label the stack input (respectively, output) point represented by an empty (respectively, full) circle in the diagram. The combined input type $1 + 1 + P$ models the choice of three functionalities (top, pop and push, in this order), of which only one takes input of type $P$.
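$$top : 1 \to P \qquad pop : 1 \to P \qquad push : P \to 1$$

[Diagram: component Stack with input point labelled $1 + 1 + P$ (empty circle) and output point labelled $P + P + 1$ (full circle).]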
Component Stack encapsulates a number of services through a public interface providing limited access to its internal state space. Furthermore, it persists and evolves in time, in a way which can only be traced through observations at the interface level. One might capture these intuitions by providing an explicit semantic definition in terms of a function

$$[\![Stack]\!] : U \times I \to (U \times O + 1)$$

where $I$, $O$ abbreviate $1 + 1 + P$ and $P + P + 1$, respectively. The presence of $1$ in its result type indicates that the overall behaviour of this component is partial: in a number of state configurations the execution of some operations may fail. This function, which should describe how Stack reacts to input stimuli, produces output data (if any) and changes state, can also be written in a curried form^a as

$$[\![Stack]\!] : U \to (U \times O + 1)^I \tag{3.1}$$

that is, as a coalgebra $U \to T\,U$ for datatype transformer $T\,X = ((X \times O) + 1)^I$, as explained in the next section.

But what are coalgebras? Why resort to them to explain componentware? Our starting point is the conjunction of two key ideas. First, the 'black-box' characterisation of software components favours an observational semantics: the essence of the stack specification above lies in the collection of possible observations and any two internal configurations should be identified wherever indistinguishable by observation. This is nicely captured by coalgebra theory [75]. Secondly, we aim at generic constructions, i.e., independent of any particular notion of component behaviour. Therefore, the other key idea is the application of the so-called functorial approach to datatypes, originated in the work of the ADJ group in the early seventies [22], to the area of state-based systems modelling. This approach provides a basis for generic programming [5] which raises the level of abstraction of the programming discourse in a way such that seemingly disparate techniques and algorithms are unified into idealised, kernel programming schemata. Moreover, we would like to formalize component calculi in an essentially equational, pointfree way, as one gets used to in functional programming.

Plan. This chapter adopts the standpoint of coalgebra theory to address a number of issues in the semantics, calculi and methodologies of componentware, presenting an integrated view of our current research concerns (as documented in [6, 9, 11, 69, 80-85]). After a brief introduction to coalgebras, the chapter presents a coalgebraic semantics for software components on top of which a calculus, parametric on the envisaged behaviour model and intended for reasoning about and transforming component based designs, is developed. Two applications of this semantic framework are discussed in some detail. At the methodological level, the coalgebraic standpoint is shown to be applicable to formalizing and reasoning about UML descriptions. At the engineering level, on the other hand, a discussion on the design of a component repository illustrates the development of suitable tools for componentware.

^a In order to emphasize the dependency of the possible observations $X$ on the input, we resort to the standard mathematical notation $X^I$ for functional dependency, instead of the equivalent $I \to X$ more familiar in computing.
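The Stack coalgebra of (3.1) can be sketched in Python as a one-step function. This is only an illustration under stated assumptions: the tagged-tuple encoding of $I = 1 + 1 + P$ and $O = P + P + 1$, the use of `None` for the failure summand $1$, and the names `stack`, `curried`, `TOP`, `POP` and `PUSH` are all choices made here, not part of the chapter.

```python
from typing import Optional, Tuple

# Inputs I = 1 + 1 + P as tagged tuples (an encoding chosen for this sketch).
TOP = ("top",)
POP = ("pop",)


def PUSH(x):
    return ("push", x)


State = Tuple[object, ...]  # U: the hidden state, modelled as a tuple of P values


def stack(u: State, i) -> Optional[Tuple[State, tuple]]:
    """Uncurried Stack coalgebra U x I -> (U x O) + 1.

    Outputs in O = P + P + 1 are tagged tuples; None stands for the extra
    summand 1, i.e. failure of top/pop on an empty stack.
    """
    tag = i[0]
    if tag == "top":
        return (u, ("top", u[-1])) if u else None
    if tag == "pop":
        return (u[:-1], ("pop", u[-1])) if u else None
    if tag == "push":
        return (u + (i[1],), ("push",))
    raise ValueError(f"unknown input {i!r}")


def curried(u: State):
    """Curried form U -> ((U x O) + 1)^I, as in (3.1)."""
    return lambda i: stack(u, i)
```

Observers only ever see the returned outputs; the tuple representing $U$ could be replaced by any other representation without changing what the interface reveals.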
3.2. Why Coalgebra Matters

3.2.1. State-based Components
The Stack example mentioned above illustrates the basic elements of a semantic model for state-based components:
• the presence of an internal state space which evolves and persists in time, and
• the possibility of interaction with other components through well-defined interfaces and during the overall computation.

This favours adoption of a behavioural semantics: components are inherently dynamic, possess an observable behaviour, but their internal configurations remain hidden and should be identified if not distinguishable by observation. The qualifier 'state-based' is used in the sense the word 'state' has in automata theory: the internal memory of the automaton which both constrains and is constrained by the execution of component operations. The input/output types of such operations are part of the datatype transformer ($T\,X = ((X \times O) + 1)^I$ in the example above) through which a component's behaviour can be traced. The basic insight in coalgebraic modelling is that, for an arbitrary datatype transformer $T$, a state-based system can be represented by a function

$$p : U \to T\,U \tag{3.2}$$

For every state $u \in U$, function $p$ describes the observable effects of an elementary step in the evolution of the system (i.e., a state transition). The possible outcomes of such a step are captured by notation $T\,U$. Technically, $T$ is a functor^b; intuitively, it is a shape for the component interface.

Let us consider a few possible alternatives for $T$. An extreme case is the 'opaque' shape $T = 1$: no matter what one tries to observe through it, the outcome is always the same. A slightly more interesting case is $T = 2$, which has the ability to classify states into two different classes (say, 'black' or 'white') and, therefore, to identify subsets of $U$. Should an arbitrary set $O$ be chosen, the possible observations become more discriminating. Naturally, the same 'universe' can be observed through different attributes and, furthermore, such observations can be carried out in parallel, as in, for example, $T = O \times O'$. The case of a 'transparent' $T$, i.e., $T\,U = U$, is not particularly useful: any function $p : U \to U$ is a coalgebra for $T$. This means that, by using $p$, the values in the state space $U$ can indeed be modified. But, on the other hand, the absence of attributes makes any meaningful observation impossible. More interesting, however, are interfaces able to model, e.g., computational partiality ($T\,U = U + 1$), non determinism ($T\,U = \mathcal{P}U$, for $\mathcal{P}U$ the finite powerset of $U$) or input triggering ($T\,U = U^I$), among many others. Technically,

^b Note that our semantic constructions 'live' in a space of typed functions, something one could model as a graph with sets as nodes and set-theoretic functions as arrows. As functions (with the right types) can be composed and, for each set $S$, one may single out a function $id_S$ (the identity on $S$) acting as the neutral element for composition, this working universe has the structure of a (partial) monoid, i.e., a category. In this setting, a functor is simply a function $T$ over this universe which preserves the graph and monoidal structure, i.e., for each function $f : A \to B$, $T f$ is typed as $T A \to T B$ and verifies $T\, id_X = id_{T X}$ and $T (f \cdot g) = T f \cdot T g$. Like most conceptual structures used in mathematics and computer science, this notion is borrowed from category theory [51], where it can be appreciated in its full genericity.
Definition 3.1. The pair $\langle U, p : U \to T\,U \rangle$ constitutes a coalgebra for functor $T$ over carrier $U$. A morphism connecting two such coalgebras is a function $h$ between their carriers making the following diagram commute:

$$\begin{array}{ccc} U & \xrightarrow{\;p\;} & T\,U \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle T h} \\ U' & \xrightarrow{\;p'\;} & T\,U' \end{array} \tag{3.3}$$

$T$-coalgebras and the corresponding morphisms form a category in which both composition and identities are inherited from Set, the usual category of sets and functions.

3.2.2. Behaviour and Bisimulation
By now one may ask what a convenient functor for coalgebraic models of software components would be and what notion of component behaviour such a choice would enforce. These questions will be discussed in detail in Section 3.3. For the moment, however, let us stick to a few variants of an elementary, deterministic, model in order to introduce the basic ideas of coalgebraic modelling applied to software components. The simplest model one could think of is that of components inspected via an attribute $at : U \to O$ and reacting (deterministically) to a method (or action) $m : U \to U$ with no external influence (except for, say, pushing a button). Those two functions can be 'glued' together leading to coalgebra

$$p = \langle at, m \rangle : U \to O \times U \tag{3.4}$$

Successive observations of (or experiments with) component $p$ reveal its behavioural patterns. For each state value $u \in U$, the behaviour of $p$ at $u$ (more precisely, from $u$ onwards), represented by $[(p)]\, u$, is an infinite sequence of values of type $O$ computed by observing the successive state configurations, i.e.,

$$[(p)]\, u = \langle at\, u,\ at\,(m\, u),\ at\,(m\,(m\, u)),\ \ldots \rangle \tag{3.5}$$

Thus, the space of all behaviours, for this sort of system, is the set of streams (infinite sequences) of $O$, i.e., $O^\omega$. Bringing input information into the scene leads to a mild sophistication of this model. The result is a Moore or a Mealy machine depending on the way input affects the computation of attributes^c. Represented as coalgebras they are described as

$$\langle at, \overline{m} \rangle : U \to O \times U^I \qquad \text{and} \qquad \overline{\langle at, m \rangle} : U \to (O \times U)^I \tag{3.6}$$

^c In classical automata theory a distinction is drawn between Moore machines [57], where each state is associated with an output symbol, and Mealy machines [53], where such symbols are associated with transitions rather than states.
respectively, where notation $\overline{f}$ stands for the currying of function $f$. Let us concentrate on Mealy machines for a while. Their behaviours organise themselves into tree-like structures, because they depend on the sequences of input processed. Such trees, whose arcs are labelled with $I$ values and nodes with $O$ values, can be represented by functions from non-empty sequences of $I$ (as no input means no observation!) to the attribute type $O$. In other words, the space of behaviours of Mealy machines (on $I$ and $O$) is the set $O^{I^+}$. The behaviour of $p$ at a particular state $u$ is computed by^d

$$[(p)]\, u = \lambda \langle s : i \rangle .\ at\,((next\ u)\ s,\ i) \tag{3.7}$$

where

$$(next\ u)\ \langle\rangle = u \qquad \text{and} \qquad (next\ u)\ \langle s : i \rangle = m\,((next\ u)\ s,\ i) \tag{3.8}$$

Observe now that set $O^{I^+}$ itself can be equipped with the structure of a Mealy machine as well. Actually, define

$$\omega = \overline{\langle root, branches \rangle} : O^{I^+} \to (O \times O^{I^+})^I \tag{3.9}$$

where

$$root\,(\phi, i) = \phi\,\langle i \rangle \qquad \text{and} \qquad branches\,(\phi, i) = \lambda s .\ \phi\,\langle i : s \rangle$$

Note that a state in $\omega$ is a function $\phi$. Therefore, the attribute (root) is computed by function application, whereas the method (branches) gives a new function which reacts to a sequence $s$ of inputs exactly as $\phi$ reacts to $\langle i : s \rangle$. The behaviour map $[(p)]$ is a morphism from $p$ to $\omega$.
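Proof. For $h = [(p)] : p \to \omega$, we check conditions

$$at = at' \cdot (h \times id) \qquad \text{and} \qquad h \cdot m = m' \cdot (h \times id) \tag{3.10}$$

which correspond to the homomorphism condition from Diagram (3.3) instantiated for the Mealy machines functor. Thus,

$$\begin{array}{cll}
 & (root \cdot ([(p)] \times id))\ (u, i) & \\
= & [(p)]\, u\ \langle i \rangle & \{\ root \text{ definition}\ \} \\
= & at\,((next\ u)\ \langle\rangle,\ i) & \{\ [(p)] \text{ definition}\ \} \\
= & at\,(u, i) & \{\ next \text{ definition}\ \}
\end{array}$$

And, similarly,

$$\begin{array}{cll}
 & (branches \cdot ([(p)] \times id))\ (u, i) & \\
= & \lambda \langle s : j \rangle .\ ([(p)]\, u)\ \langle i : \langle s : j \rangle \rangle & \{\ branches \text{ definition}\ \} \\
= & \lambda \langle s : j \rangle .\ at\,((next\ u)\ \langle i : s \rangle,\ j) & \{\ [(p)] \text{ definition}\ \} \\
= & \lambda \langle s : j \rangle .\ at\,((next\ (m\,(u, i)))\ s,\ j) & \{\ next \text{ definition}\ \} \\
= & [(p)]\,(m\,(u, i)) & \{\ [(p)] \text{ definition}\ \}
\end{array}$$
□

^d In the sequel $\langle\rangle$ stands for the empty sequence. We also adopt a rather liberal notation to access elements of a sequence (indicated by letters $i, j, k, \ldots$) and sub-sequences ($s, t, \ldots$).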
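The two conditions checked in the proof can be tried out on a concrete machine. The following Python sketch transcribes (3.7), (3.8) and the definitions of root and branches; the accumulator machine `at`/`m` and the function names `next_state` and `behaviour` are hypothetical, introduced only for this illustration, and input sequences are represented as tuples.

```python
def next_state(m, u, s):
    """(next u) s of (3.8): state reached from u after consuming sequence s."""
    for i in s:
        u = m(u, i)
    return u


def behaviour(at, m, u):
    """[(p)] u of (3.7), as a function on non-empty input sequences."""
    def phi(seq):
        if not seq:
            raise ValueError("no input means no observation")
        *s, i = seq                      # split <s : i> into prefix and last input
        return at(next_state(m, u, s), i)
    return phi


def root(phi, i):
    """Attribute of the final machine: apply phi to the one-element sequence."""
    return phi((i,))


def branches(phi, i):
    """Method of the final machine: the behaviour remaining after input i."""
    return lambda s: phi((i,) + tuple(s))


# Hypothetical example machine: a running-sum accumulator.
at = lambda u, i: u + i   # observed output on input i in state u
m = lambda u, i: u + i    # next state
```

For instance, `root(behaviour(at, m, 2), 5)` coincides with `at(2, 5)`, and `branches(behaviour(at, m, 2), 5)` agrees with `behaviour(at, m, m(2, 5))` on every non-empty sequence tried, which is what (3.10) asserts.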
Another fundamental result on coalgebra morphisms is behaviour preservation. Formally, given two coalgebras $p$ and $q$ and a morphism $h : p \to q$ between them,

$$[(p)]\, u = [(q)]\,(h\, u)$$

This leads to a precise and generic notion of behaviour: any two states generate the same behaviour if they can be identified by a coalgebra morphism. Back to the Mealy machines example, note that, since there is always a morphism $[(p)]$ from any $p$ to $\omega$ and morphisms preserve behaviour, such a morphism is unique. This makes $\omega$ a very special Mealy coalgebra: it is the only such coalgebra to which, from any other one, there is one and only one morphism. We say $\omega$ is the final Mealy machine. Finality is an example of a universal property^e which, up to isomorphism, provides a complete characterisation of $\omega$. In fact, suppose finality is shared by two Mealy machines $\omega$ and $\omega'$. The existence component of the property gives rise to two morphisms $h$ and $h'$ connecting both machines in reverse directions. On the other hand, uniqueness implies $h \cdot h' = id$ and $h' \cdot h = id$, thus establishing $h$ and $h'$ as isomorphisms. These two aspects of finality provide both a definition scheme and a proof principle upon which coalgebraic reasoning is based. In general,

^e Because, roughly speaking, it singles out an entity ($\omega$) among a family of 'similar' entities to which every other member of the family can be reduced or traced back. The study of universal properties is the 'essence' of category theory.

Definition 3.2. Whenever the space of behaviours of a class of $T$-coalgebras can be turned into a $T$-coalgebra itself (written as $\omega_T : \nu_T \to T\,\nu_T$), this is the final coalgebra: from any other $T$-coalgebra $p$ there is a unique morphism $[(p)]$ making the following diagram commute:

$$\begin{array}{ccc} \nu_T & \xrightarrow{\;\omega_T\;} & T\,\nu_T \\ {\scriptstyle [(p)]}\uparrow & & \uparrow{\scriptstyle T [(p)]} \\ U & \xrightarrow{\;p\;} & T\,U \end{array}$$

The universal property is, equivalently, captured by the following law:

$$k = [(p)] \;\Leftrightarrow\; \omega_T \cdot k = T\,k \cdot p \tag{3.11}$$

Morphism $[(p)]$ applied to a state value $u$ gives, of course, the (observable) behaviour of a sequence of $p$ transitions starting at $u$. It is called the coinductive extension of $p$ [88] or the anamorphism generated by $p$ [54]. Coalgebra $p$ is referred to as its gene. In this context, equation (3.11) is the basic tool for calculating with behaviours. Being a universal property, it asserts, for each gene coalgebra $p$, the existence and uniqueness of anamorphism $[(p)]$. As we have already remarked, the existence part of this universal property provides a definition principle for functions to spaces of behaviours (technically, carriers of final coalgebras). This is called definition by co-recursion and boils down to equipping the source of the function to be defined with a coalgebra to capture the 'one-step' dynamics in the behaviour generation process. The corresponding anamorphism then gives the rest. This is exactly the way component combinators will be defined in Section 3.3. The uniqueness part, on the other hand, entails a powerful proof principle: coinduction, as illustrated below. The following results, usually referred to as the cancellation, reflection and fusion laws, respectively, are a direct consequence of (3.11).

$$\omega_T \cdot [(p)] = T\,[(p)] \cdot p \tag{3.12}$$

$$[(\omega_T)] = id_{\nu_T} \tag{3.13}$$

$$[(p)] \cdot h = [(q)] \quad \text{if} \quad p \cdot h = T\,h \cdot q \tag{3.14}$$
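As a worked instance of the uniqueness argument, here is one way the fusion law follows from the universal property. The derivation is a sketch that assumes only the side condition of the law, $p \cdot h = T\,h \cdot q$, the cancellation law, and functoriality of $T$.

```latex
\begin{align*}
\omega_T \cdot ([(p)] \cdot h)
  &= T\,[(p)] \cdot p \cdot h     && \text{cancellation, } \omega_T \cdot [(p)] = T\,[(p)] \cdot p \\
  &= T\,[(p)] \cdot T\,h \cdot q  && \text{assumption } p \cdot h = T\,h \cdot q \\
  &= T\,([(p)] \cdot h) \cdot q   && T \text{ preserves composition}
\end{align*}
```

Hence $k = [(p)] \cdot h$ satisfies $\omega_T \cdot k = T\,k \cdot q$, and the universal property, read from right to left with gene $q$, yields $[(p)] \cdot h = [(q)]$.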
Behavioural equivalence can also be defined in terms of anamorphisms:

Definition 3.3. Two states $u$ and $v$ in the carriers of coalgebras $\langle U, p \rangle$ and $\langle V, q \rangle$, respectively, are behaviourally equivalent, represented by $u \sim v$, iff $[(p)]\, u = [(q)]\, v$.

Therefore, the final coalgebra can be alternatively characterized as a coalgebra whose carrier is composed of all equivalence classes of behavioural equivalence. A somewhat simpler way of establishing behavioural equivalence, which moreover has the advantage of not depending on the existence of final coalgebras, is to look for a morphism $h$ such that one of the states is the $h$-image of the other. Once conjectured, $h$ determines a relation $R \subseteq U \times V$ such that

$$(u, v) \in R \;\Rightarrow\; u \sim v \tag{3.15}$$

Such a relation is, of course, the graph of $h$, i.e., $\{(x, h\, x) \mid x \in U\}$. Can this idea be generalised? More precisely, what properties must a relation $R$ have so that one can conclude $u \sim v$ simply by checking whether $(u, v)$ is in $R$? Such a relation can be characterised and is called a $T$-bisimulation. Formally,

Definition 3.4. A ($T$)-bisimulation relating coalgebras $p$ and $q$ is a relation $R$ over their carriers which supports a coalgebra structure $\rho$ such that projections $\pi_1$ and $\pi_2$ lift to $T$-morphisms, as expressed by the commutativity of the following diagram:

$$\begin{array}{ccccc} U & \xleftarrow{\;\pi_1\;} & R & \xrightarrow{\;\pi_2\;} & V \\ {\scriptstyle p}\downarrow & & \downarrow{\scriptstyle \rho} & & \downarrow{\scriptstyle q} \\ T\,U & \xleftarrow{\;T \pi_1\;} & T\,R & \xrightarrow{\;T \pi_2\;} & T\,V \end{array} \tag{3.16}$$

i.e., $T \pi_1 \cdot \rho = p \cdot \pi_1 \;\wedge\; T \pi_2 \cdot \rho = q \cdot \pi_2$
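For deterministic Mealy machines over a finite input set, the diagram above unfolds into a pointwise check: related states must produce equal outputs on every input, and their successor states must again be related. The Python sketch below implements that check under this reading of the definition; the two machines, the relation `R` and the name `is_bisimulation` are hypothetical examples, not taken from the chapter.

```python
def is_bisimulation(R, p, q, inputs):
    """Check that R (a set of state pairs) is a bisimulation between p and q.

    p and q are pairs (at, m) of an output function and a next-state function,
    both taking (state, input). `inputs` must enumerate the finite input set.
    """
    at_p, m_p = p
    at_q, m_q = q
    return all(
        at_p(u, i) == at_q(v, i) and (m_p(u, i), m_q(v, i)) in R
        for (u, v) in R
        for i in inputs
    )


# Hypothetical machines: a parity tracker over states {0, 1} and the same
# observable behaviour realised by a counter modulo 4 over states {0, 1, 2, 3}.
p = (lambda u, i: (u + i) % 2, lambda u, i: (u + i) % 2)
q = (lambda v, i: (v + i) % 2, lambda v, i: (v + i) % 4)

# Candidate relation: each counter value is related to its parity.
R = {(v % 2, v) for v in range(4)}
```

Here `is_bisimulation(R, p, q, [0, 1])` succeeds, so every counter state is behaviourally equivalent to its parity, whereas a relation such as `{(0, 1)}` is rejected because the outputs on input 0 already differ.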
Informally, two states of a $T$-coalgebra (or of two different $T$-coalgebras) are related by a bisimulation if their observation produces equal results and this is maintained along all possible transitions, i.e., each one can mimic the other's evolution. Originally the notion was introduced in a functional formulation by [77] and in a relational one by [14]. Park's landmark paper [65] made bisimulation a basic tool in the context of process calculi. Later [2] gave a categorical definition which applies not only to the kind of transition systems underlying the operational semantics of process calculi, but also to arbitrary coalgebras. Bisimulation acquired a shape: the shape of the chosen observation interface $T$. Bisimulation entails, thus, a local proof theory for behavioural equivalence, which is widely used in practice and in a variety of contexts^f. Usually in the coalgebra literature what is understood by a coinductive proof is the explicit construction of a bisimulation containing the pair of states one wants to prove equivalent (see, for example, [75] or [33]). In the sequel, however, we favour a calculational approach which resorts explicitly to the universal property (3.11), because it is closer to the calculation of functional programs (as in, e.g., [16]). To a large extent this choice is, however, a matter of taste: the two methods (both rooted in the uniqueness part of the universal property) are two sides of the same coin, namely the theorem which (for a large class of functors, including the ones considered in this text) establishes final coalgebras as fully abstract with respect to bisimulation (for a proof see, e.g., [...]).

^f Strictly, the Aczel-Mendler definition in [2] is not always equivalent to behavioural equivalence. Such is the case, however, for functors preserving weak pullbacks, such as the ones used in this chapter.

3.2.3. Discussion

Mathematically, coalgebras are the formal duals of algebras, exactly in the sense that makes observation and construction symmetric notions. Actually, an algebra for a functor $T$ is a function $d : T\,U \to U$ capturing the intuition of an assembly process. For example, the construction of finite sequences of a type $P$ is specified by an algebra

$$[nil, cons] : 1 + P \times P^* \to P^* \tag{3.17}$$

which singles out the empty list (nil) and appending (cons) as the primitive list constructors. In general, because a value cannot be built simultaneously in two different ways, functor $T$ is usually a coproduct (sum) and the algebra arises as an either (choice) of constructors. Again, under suitable conditions on $T$, there exists a canonical representative of the assembly process: the initial algebra, which may be regarded as the formal analog to the 'smallest' machine able to produce all possible $T$-values. For example, algebra (3.17) is initial among all algebras for functor $T\,X = 1 + P \times X$.

Unlike familiar inductive data types, which are completely defined by a set of constructors, the computational structures that coalgebras describe admit only behavioural characterisations, as we have seen before. Typical examples of such structures are processes, transition systems, objects, stream-like structures used in lazy programming languages, 'infinite' or non well-founded objects arising in semantics, and, as we want to argue throughout this chapter, software components, at least when seen from the point of view of the software architect who has to deploy and combine heterogeneous, multi-source components.

Only recently, however, has coalgebra theory emerged as a common framework to describe 'state based', dynamical systems. Its study along the lines of Universal Algebra was initiated by J. Rutten in [72] and [75]. There are a number of tutorials and lecture notes (see, e.g., [34], [25] or [44]) to which the interested reader is referred. The proceedings of the Coalgebraic Methods in Computer Science workshop series, initiated in 1998, document current research ranging from the study of concrete coalgebras over different base categories [56, 88] to the development of Set-independent, i.e., purely categorical, presentations of coalgebra theory (see, among others [24, 66, 88]), from coalgebraic logic (e.g., [43, 45, 58]) to applications.
Application examples range from automata [74] to objects [30, 67], from process semantics [8, 48, 90] to hybrid transition systems [29]. B. Jacobs and his group, following earlier work by H. Reichel [27, 67] have coined the term coalgebraic specification [32, 33, 70] to denote a style of axiomatic specification involving equations up to bisimilarity acting as constraints on the observable behaviour. Such a specification style is used in Section 3.4.
3.3. Components as Coalgebras and their Calculi

3.3.1. Introducing Generic Components
Software components were characterised in the previous section as dynamic systems with a public interface and a private, encapsulated state. The relevance of state information precludes a 'process-like' (purely behavioural) view of components as inhabitants of a final coalgebra. Components are themselves concrete coalgebras.
For a given value of the state space, referred to as a seed in the sequel, a corresponding 'process', or behaviour, arises by computing its coinductive extension. We remarked, when introducing component Stack in Section 3.1, that partiality is a characteristic of its behaviour. This was captured there by resorting to functor $Id \times O + 1$, i.e., an instance of the popular maybe monad. Other components may exhibit different behaviour models. For example, one can easily think of components behaving within a certain degree of non determinism or following a probability distribution. Genericity is achieved by replacing a given behaviour model (such as that captured by the maybe monad above or the identity $Id$ in the discussion of deterministic Mealy machines in Section 3.2) by an arbitrary strong monad^g $B$, leading to coalgebras for

$$T^B = B(Id \times O)^I \tag{3.18}$$

as a possible general model for state based software components. Therefore computation of an action will not simply produce an output and a continuation state, but a $B$-structure of such pairs. The monadic structure provides tools to handle such computations. Unit ($\eta$) and multiplication ($\mu$) provide, respectively, a value embedding and a 'flatten' operation to reduce nested behavioural effects. Strength, either in its right ($\tau_r$) or left ($\tau_l$) version, caters for context information. Finally, monad commutativity^h turns up as a welcome (although not crucial) property. Functor (3.18) may be regarded as an instance of an even more general shape

$$T^B = O' \times B(Id \times O)^I$$

which specializes to a variety of interfaces for state-based component structures, namely
• 'Functional' components, as given by (3.18), which, for $B = Id$, correspond to standard Mealy machines, as discussed in Section 3.2.
• 'Action' components, with no independent attributes: $T^B = B(Id \times O)$.
• 'Silent' components, which evolve invisibly without any sort of external control: $T^B = O' \times B$.
• 'Object' components, characterized by an attribute-method pair, $T^B = O' \times B^I$, which, for $B = Id$, corresponds to Moore machines.

^g A strong monad is a monad $\langle B, \eta, \mu \rangle$ where $B$ is a strong functor and both $\eta$ and $\mu$ are strong natural transformations [41]. $B$ being strong means there exist natural transformations $\tau_r : B \times - \Rightarrow B(Id \times -)$ and $\tau_l : - \times B \Rightarrow B(- \times Id)$, called the right and left strength, respectively, subject to certain conditions. Their effect is to distribute the free variable values in the context "$-$" along functor $B$. Strength $\tau_r$ followed by $\tau_l$ maps $B I \times B J$ to $B B (I \times J)$, which can then be flattened to $B (I \times J)$ via $\mu$. In most cases, however, the order of application is relevant for the outcome. The Kleisli composition of the right with the left strength gives rise to a natural transformation whose component on objects $I$ and $J$ is given by $\delta_r = \tau_{r\,I,J} \bullet \tau_{l\,BI,J}$. Dually, $\delta_l = \tau_{l\,I,J} \bullet \tau_{r\,I,BJ}$. Such transformations specify how the monad distributes over product and, therefore, represent a sort of sequential composition of $B$-computations.

^h A strong monad is said to be commutative whenever $\delta_r$ and $\delta_l$ coincide.
In the sequel we assume a collection of sets $I$, $O$, ..., acting as component interfaces and the following definition of a component specification:

Definition 3.5. A software component is specified by a pointed coalgebra

$$\langle u_p \in U_p,\ \overline{a}_p : U_p \to B(U_p \times O)^I \rangle \tag{3.19}$$

where $u_p$ is the initial state, often referred to as the seed of the component computation, and the coalgebra dynamics is captured by currying a state-transition function $a_p : U_p \times I \to B(U_p \times O)$.
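Definition 3.5 can be mirrored by a small record holding a seed and an uncurried transition function. The sketch below fixes $B$ to the maybe monad purely for illustration (failure is `None`, success is a plain `(state, output)` pair); the `Component` class, the `run` helper and the bounded counter are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class Component:
    seed: Any                          # u_p, the initial state
    step: Callable[[Any, Any], Any]    # a_p : U x I -> B(U x O), here B = maybe


def run(c: Component, inputs) -> Optional[List[Any]]:
    """Outputs observed when feeding `inputs` from the seed; None on failure."""
    u, outs = c.seed, []
    for i in inputs:
        result = c.step(u, i)
        if result is None:             # the component failed on this input
            return None
        u, o = result
        outs.append(o)
    return outs


# Hypothetical component: a counter that fails once its value would exceed 2.
counter = Component(
    seed=0,
    step=lambda u, i: (u + i, u + i) if u + i <= 2 else None,
)
```

Running `run(counter, [1, 1])` observes the outputs 1 and 2, while one more increment makes the run fail; in both cases the internal state is never exposed directly.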
3.3.2. Behaviour Models
Several possibilities can be considered for $B$. The simplest case is, obviously, the identity monad, $Id$, whereby components behave in a totally deterministic way. Other possibilities, capturing more complex behavioural features, include:
• Partiality, i.e., the possibility of deadlock or failure, captured by the maybe monad, $B = Id + 1$, as in the Stack example above.
• Non determinism, introduced by the (finite) powerset monad, $B = \mathcal{P}$.
• Ordered non determinism, based on the (finite) sequence monad, $B = Id^*$.
• Monoidal "labelling", with $B = Id \times M$. Note that, for $B$ to form a monad, parameter $M$ should support a monoidal structure.
• 'Metric' non determinism, capturing situations in which, among the possible future evolutions of a component, some are stipulated to be more likely (cheaper, more secure, etc.) than others.

All cases correspond to strong monads in Set, which can be composed with each other. The first two and the last one are commutative; the third is not. Commutativity of 'monoidal labelling' depends, of course, on commutativity of the underlying monoid. 'Metric' non determinism is based on a general notion of a bag monad defined over a structure $\langle M, \oplus, \otimes \rangle$, where both $\oplus$ and $\otimes$ are Abelian monoids and the latter distributes over the former. This gives rise to, e.g.,
• Cost components: based on $Bag_M$ for $M = \langle \mathbb{N}, +, \times \rangle$, which is just the usual notion of a bag or multiset. Components with such a behaviour model assign a cost to each alternative, which may be interpreted as, e.g., a performance measure. Such 'costs' are added when components get composed. This corresponds to the non deterministic generalisation of monoidal labelling above.
• Probabilistic components: based on $M = \langle [0,1], \min, \times \rangle$ with the additional requirement that, for each $m \in Bag_M$, $\sum (\mathcal{P}\pi_2)\, m = 1$. This assigns probabilities to each possible evolution of a component, introducing an (elementary) form of probabilistic non determinism.
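To see how the choice of $B$ changes what a component does without changing how it is driven, the following Python sketch runs transition functions under three of the behaviour models listed above. It is an illustration under assumptions: monads are represented by a hypothetical `Monad` record of `unit` and `bind`, the powerset monad is approximated by `frozenset`, and the three step functions are invented examples.

```python
from collections import namedtuple

# A monad given by its unit and bind (Kleisli extension); names are assumptions.
Monad = namedtuple("Monad", "unit bind")

identity = Monad(unit=lambda x: x, bind=lambda m, f: f(m))

# Maybe: None is failure, ("just", x) a successful value.
maybe = Monad(
    unit=lambda x: ("just", x),
    bind=lambda m, f: None if m is None else f(m[1]),
)

# Finite powerset, approximated by frozenset.
powerset = Monad(
    unit=lambda x: frozenset([x]),
    bind=lambda m, f: frozenset(y for x in m for y in f(x)),
)


def run(B, step, u, inputs):
    """Thread the state through `step`, returning a B-structure of
    (final state, tuple of outputs)."""
    acc = B.unit((u, ()))
    for i in inputs:
        acc = B.bind(
            acc,
            lambda so, i=i: B.bind(
                step(so[0], i),
                lambda r: B.unit((r[0], so[1] + (r[1],))),
            ),
        )
    return acc


# Hypothetical transition functions U x I -> B(U x O), one per behaviour model.
det = lambda u, i: (u + i, u + i)                                # total
part = lambda u, i: ("just", (u + i, u + i)) if u + i <= 2 else None
nd = lambda u, i: frozenset({(u + i, u + i), (u, u)})            # may ignore input
```

The driver `run` is written once against `unit` and `bind`; only the monad and the step function vary, which is the sense in which the component model is parametric on $B$.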
3.3.3. The Semantic Framework
3.3.3.1. A Universe of Generic Components. Having defined generic components as (pointed) coalgebras, one may wonder how do they get composed and what kind of calculus emerges from this framework. In our framework, interfaces are sets representing the input and output range of a component. Consequently, components are arrows between interfaces and so arrows between components are arrows between arrows. Thus, three notions have to be taken into account: interfaces, components and component morphisms. Formally, this leads to the notion of a bicategory 1 to structure our reasoning universe. In brief, we take interfaces (i.e., sets modelling components' observation universes) as objects of a bicategory Cp, whose arrows are pointed T B -coalgebras (as defined in (3.18)) and 2-cells, the arrows between arrows, the corresponding morphisms. Formally, Definition 3.6. Assume arbitrary sets as Cp objects. For each pair (/, O) of objects, define a category C p ( / , 0 ) , whose arrows Up —P-> T B Up
h : (up,ap) —• (uq,aq)
TBh
" Uq—+TBUq satisfy the following morphism and seed preservation conditions: aq-h
= T B h • ap
(3.20)
h Up — uq
(3-21)
Composition is inherited from Set and the identity l p : p —> p, on component p, is defined as the identity id[/p on the carrier of p. Next, for each triple of objects (I, K,0), a composition law is given by a functor ; 7 i * i 0 : Cp(I,K)
x Cp(K,0)
-^
Cp(/,0)
whose action on objects p and q is given by p;q=
((up,uq)
eUpX
Uq,apiq)
'Basically a bicategory [13] is a category in which a notion of arrows between arrows is additionally considered. This means that the the space of morphisms between any given pair of objects, usually referred to as a (hom-)set, acquires itself the structure of a category. Therefore the standard arrow composition and unit laws become functorial, since they transform both objects and arrows of each hom-set in a uniform way. A typical example is Cat itself: the category whose objects are small categories, arrows are functors and arrows between arrows, or 2-cells as they are often called, correspond to natural transformations.
L.S. Barbosa, M. Sun, B.K. Aichernig and N. Rodrigues
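where $a_{p;q} : U_p \times U_q \times I \to \mathsf{B}(U_p \times U_q \times O)$ is detailed as follows:
\[
\begin{array}{rcl}
U_p \times U_q \times I & \xrightarrow{\ xr\ } & U_p \times I \times U_q \\
& \xrightarrow{\ a_p \times \mathrm{id}\ } & \mathsf{B}(U_p \times K) \times U_q \\
& \xrightarrow{\ \tau_r\ } & \mathsf{B}(U_p \times K \times U_q) \\
& \xrightarrow{\ \mathsf{B}(a \cdot xr)\ } & \mathsf{B}(U_p \times (U_q \times K)) \\
& \xrightarrow{\ \mathsf{B}(\mathrm{id} \times a_q)\ } & \mathsf{B}(U_p \times \mathsf{B}(U_q \times O)) \\
& \xrightarrow{\ \mathsf{B}\tau_l\ } & \mathsf{B}\mathsf{B}(U_p \times (U_q \times O)) \\
& \xrightarrow{\ \mathsf{B}\mathsf{B}a^{\circ}\ } & \mathsf{B}\mathsf{B}(U_p \times U_q \times O) \\
& \xrightarrow{\ \mu\ } & \mathsf{B}(U_p \times U_q \times O)
\end{array}
\]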
The action of ";" on 2-cells reduces to $h \,;\, k = h \times k$. Finally, for each object $K$, an identity law is given by a functor $\mathsf{copy}_K : \mathbf{1} \to \mathsf{Cp}(K, K)$ whose action on objects is the constant component $\langle * \in \mathbf{1}, a_{\mathsf{copy}_K} \rangle$, where $a_{\mathsf{copy}_K} = \eta_{\mathbf{1} \times K}$. Slightly abusing notation, this will also be referred to as $\mathsf{copy}_K$. Similarly, the action on morphisms is the constant morphism $\mathrm{id}_{\mathbf{1}}$. The fact that, for each strong monad $\mathsf{B}$, components form a bicategory amounts not only to a standard definition of the two basic combinators, ; and $\mathsf{copy}_K$, of a component calculus, but also to setting up its basic laws. Recall (from e.g. [73]) that the graph of a morphism is a bisimulation. Therefore, the existence of a seed preserving morphism between two components makes them $\mathsf{T}^{\mathsf{B}}$-bisimilar, leading to the following laws, for appropriately typed components $p$, $q$ and $r$:
\[ \mathsf{copy}_I \,;\, p \;\sim\; p \;\sim\; p \,;\, \mathsf{copy}_O \tag{3.22} \]
\[ (p \,;\, q) \,;\, r \;\sim\; p \,;\, (q \,;\, r) \tag{3.23} \]
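As an informal illustration of pipeline composition and of law (3.22), the following minimal sketch fixes the behaviour monad to the finite powerset, modelled by Python lists. The names `seq`, `copy`, `traces` and the `counter` component are hypothetical and introduced only for this example; comparing finite output traces is a sanity check, not a proof of bisimilarity.

```python
from typing import Any, Callable, List, Tuple

# Assumption: B is the finite powerset monad, modelled by Python lists.
# A component is a pair (seed, action); action(state, inp) lists every
# possible (next_state, output) pair.
Component = Tuple[Any, Callable[[Any, Any], List[Tuple[Any, Any]]]]

def seq(p: Component, q: Component) -> Component:
    """Pipeline p ; q: each output of p is fed to q; nested results are flattened."""
    (up, ap), (uq, aq) = p, q
    def action(state, i):
        sp, sq = state
        return [((sp2, sq2), o)
                for (sp2, k) in ap(sp, i)
                for (sq2, o) in aq(sq, k)]
    return ((up, uq), action)

def copy() -> Component:
    """copy_K: a one-point state space whose action just echoes the input."""
    return ((), lambda s, k: [(s, k)])

def traces(p: Component, inputs) -> list:
    """All output sequences p may produce for a finite input sequence."""
    seed, action = p
    runs = [(seed, [])]
    for i in inputs:
        runs = [(s2, outs + [o]) for (s, outs) in runs for (s2, o) in action(s, i)]
    return sorted(outs for (_, outs) in runs)

# Hypothetical example component: add the input to the state, or reset to 0.
counter: Component = (0, lambda s, i: [(s + i, s + i), (0, 0)])

print(traces(counter, [1, 2]))
print(traces(seq(copy(), counter), [1, 2]))
print(traces(seq(counter, copy()), [1, 2]))
```

All three calls print the same set of traces, as expected from (3.22); the state spaces differ (a pair versus a single value), which is why the law holds only up to bisimulation in Cp.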
3.3.3.2. Computing Behaviour. The dynamics of a component specification is essentially 'one step': it describes immediate reactions to possible state/input configurations. Its temporal extension becomes the component's behaviour. Formally, the behaviour $[\!(p)\!]$ of a component $p$ is computed by coinductive extension, taking the seed value of $p$ as the starting state, i.e.,
\[ [\!(p)\!] = [\!(a_p)\!]\, u_p \]
Behaviours organise themselves in a category Bh, whose objects are sets and whose arrows $b : I \to O$ are elements of $\nu_{I,O}$, the carrier of the final coalgebra $\omega_{I,O}$ for functor $\mathsf{B}(\mathrm{Id} \times O)^I$. To define composition in Bh, first note that the definition of $a_{p;q}$ above actually introduces an operator $- \,;\, -$ between coalgebras: $a_{p;q}$ could actually have been written as $a_p \,;\, a_q$. Thus, composition in Bh can be defined by a family
As one would expect, reasoning about generic components entails a number of laws relating monads with common 'housekeeping' morphisms such as product and sum associativity ($a$, $a_+$), commutativity ($s$, $s_+$), left and right units ($l$, $l_+$ and $r$, $r_+$), left and right distributivity ($dl$, $dr$) and the isomorphisms $xl : A \times (B \times C) \to B \times (A \times C)$, $xr : A \times B \times C \to A \times C \times B$ and $m : (A \times B) \times (C \times D) \to (A \times C) \times (B \times D)$. Such laws are thoroughly dealt with in [7]. By convention, binary morphisms always associate to the left.
On the Semantics of Componentware: A Coalgebraic Perspective
of combinators, for each $I$, $K$ and $O$,
\[ ;^{\mathsf{Bh}}_{I,K,O} : \mathsf{Bh}(I, K) \times \mathsf{Bh}(K, O) \to \mathsf{Bh}(I, O) \]
such that
\[ ;^{\mathsf{Bh}}_{I,K,O} = [\!(\omega_{I,K} \,;\, \omega_{K,O})\!] \]
On the other hand, identities are given by $\mathsf{copy}^{\mathsf{Bh}}_K : \mathbf{1} \to \mathsf{Bh}(K, K)$ and
\[ \mathsf{copy}^{\mathsf{Bh}}_K = [\!(a_{\mathsf{copy}_K})\!]\, * \]
i.e., the behaviour of component $\mathsf{copy}_K$, for each $K$. It should be observed that the structure of Bh mirrors whatever structure Cp possesses. In fact, the former is isomorphic to a sub-(bi)category of the latter whose arrows are components defined over the corresponding final coalgebra. Alternatively, we may think of Bh as constructed by quotienting Cp by the greatest $\mathsf{T}^{\mathsf{B}}$-bisimulation. However, as final coalgebras are fully abstract with respect to bisimulation, the bicategorical structure collapses. Moreover, as discussed below, some tensors in $\mathsf{Cp}_{\mathsf{B}}$ become universal constructions in Bh, for some particular instances of $\mathsf{B}$. This also explains why properties holding in Cp up to bisimulation hold 'on the nose' in the behaviour category. For example, we may rephrase laws (3.22) and (3.23), for suitably typed behaviours $b$, $c$ and $d$, in Bh, as
\[ \mathsf{copy}_I \,;\, b = b = b \,;\, \mathsf{copy}_O \]
and
\[ (b \,;\, c) \,;\, d = b \,;\, (c \,;\, d) \]
First, however, we have to check that

Lemma 3.2. Bh is a category.
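Proof. Let $b : I \to O$ be a behaviour. Then,
\[ b \,;\, \mathsf{copy}_O = [\!(\omega_{I,O} \,;\, \mathsf{copy}_O)\!]\,\langle b, * \rangle = [\!(\omega_{I,O})\!]\, b = b \]
A similar calculation establishes $\mathsf{copy}_I \,;\, b = b$. On the other hand, for suitably typed behaviours $b$, $c$ and $d$,
\[
\begin{aligned}
(b \,;\, c) \,;\, d &= [\!((\omega_{I,K} \,;\, \omega_{K,L}) \,;\, \omega_{L,O})\!]\,\langle \langle b, c \rangle, d \rangle \\
&= [\!(\omega_{I,K} \,;\, (\omega_{K,L} \,;\, \omega_{L,O}))\!]\,\langle b, \langle c, d \rangle \rangle \\
&= b \,;\, (c \,;\, d)
\end{aligned}
\]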
□

Note the genericity and simplicity of the proofs above. For space economy, we omit the proof that construction $[\!(\ )\!]$ is a 2-functor [40] from Cp to Bh, which follows the same calculational style.
3.3.4. A Component Calculus

We shall now look at the structure of Cp by introducing an algebra of $\mathsf{T}^{\mathsf{B}}$-components parametric on a behaviour model $\mathsf{B}$. This structure lifts naturally to Bh, defining a particular (typed) 'process' algebra.
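3.3.4.1. Functions as Components. Let us start from the simple observation that functions can be regarded as particular instances of components, whose interfaces are given by their domain and codomain types. Formally,

Definition 3.7. A function $f : A \to B$ is represented in Cp by
\[ \ulcorner f \urcorner = \langle * \in \mathbf{1}, a_{\ulcorner f \urcorner} \rangle \]
i.e., a coalgebra over $\mathbf{1}$ whose action is given by the currying of
\[ a_{\ulcorner f \urcorner} = \mathbf{1} \times A \xrightarrow{\ \mathrm{id} \times f\ } \mathbf{1} \times B \xrightarrow{\ \eta_{\mathbf{1} \times B}\ } \mathsf{B}(\mathbf{1} \times B) \]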
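Note that, up to bisimulation, function lifting is functorial, that is, for functions $g : I \to K$ and $f : K \to O$, one has
\[ \ulcorner f \cdot g \urcorner \sim \ulcorner g \urcorner \,;\, \ulcorner f \urcorner \tag{3.24} \]
\[ \ulcorner \mathrm{id}_I \urcorner \sim \mathsf{copy}_I \tag{3.25} \]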
Moreover, isomorphisms, split monos and split epis lift to Cp as, respectively, isomorphisms, split monos and split epis. Actually, lifting canonical Set arrows to Cp is a simple way to explore the structure of Cp itself. For instance, consider the lifting of $?_I : \emptyset \to I$. Clearly, $?_I$ keeps its naturality as, for any $p : I \to O$, the triangle formed by $\ulcorner ?_I \urcorner$, $p$ and $\ulcorner ?_O \urcorner$ commutes up to bisimulation, because both $\ulcorner ?_I \urcorner$ and $\ulcorner ?_O \urcorner$ are the inert components: the absence of input makes reaction impossible. Formally,
\[ \ulcorner ?_I \urcorner \,;\, p \sim \ulcorner ?_O \urcorner \tag{3.26} \]
Equation (3.26) lifts to an equality in Bh, as does any other bisimulation equation in Cp. Therefore, $\emptyset$ is the initial object in Bh. Naturality is lost, however, in the lifting of $!_I : I \to \mathbf{1}$, as the triangle formed by $p$, $\ulcorner !_O \urcorner$ and $\ulcorner !_I \urcorner$ fails to commute for non-trivial $\mathsf{B}$.
To check this, take $\mathsf{B}$ as the finite powerset monad. Clearly, $p \,;\, \ulcorner !_O \urcorner$ deadlocks whenever $p$ does. By 'deadlocking' we mean that the empty set of responses is produced. On the other hand, $\ulcorner !_I \urcorner$ never deadlocks, as this is prevented by the definition of function lifting above. Therefore, the two components are not bisimilar and $\mathbf{1}$ fails to become the final object in $\mathsf{Bh}_{\mathsf{B}}$, for non-trivial monads. It is, however, the final object in the behaviours category of deterministic components (i.e., for $\mathsf{B} = \mathrm{Id}$).
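The deadlock argument can be replayed on a small executable model. The sketch below assumes the finite powerset monad is represented by Python lists, with the empty list standing for a deadlock; `lift`, `seq`, `react` and the `stuck` component are hypothetical names chosen for this illustration only.

```python
# Assumption: B is the finite powerset monad, modelled by Python lists;
# an empty list of responses is a deadlock.

def lift(f):
    """Function lifting: stateless, and always exactly one response."""
    return ((), lambda s, x: [(s, f(x))])

def seq(p, q):
    """Pipeline p ; q for list-valued actions."""
    (up, ap), (uq, aq) = p, q
    def action(state, i):
        sp, sq = state
        return [((sp2, sq2), o) for (sp2, k) in ap(sp, i) for (sq2, o) in aq(sq, k)]
    return ((up, uq), action)

def bang(_x):
    """The unique function into the one-element set, here {()}."""
    return ()

# Hypothetical component that deadlocks on every input.
stuck = (0, lambda s, i: [])

def react(p, i):
    """Responses of p to a single input, from its seed state."""
    seed, action = p
    return action(seed, i)

print(react(seq(stuck, lift(bang)), "x"))   # no responses: deadlock
print(react(lift(bang), "x"))               # exactly one response
```

The first reaction is empty while the second is a singleton, so no bisimulation can relate the two seeds.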
3.3.4.2. Wrapping. The pre- and post-composition of a component with Cp-lifted functions can be encapsulated into a single combinator, called wrapping, which is reminiscent of the renaming connective found in process calculi (e.g., [55]). Let $p : I \to O$ be a component and consider functions $f : I' \to I$ and $g : O \to O'$. Component $p$ wrapped by $f$ and $g$, denoted by $p[f, g]$ and typed as $I' \to O'$, is defined by input pre-composition with $f$ and output post-composition with $g$. Formally,

Definition 3.8. The wrapping combinator is a functor
\[ -[f, g] : \mathsf{Cp}(I, O) \to \mathsf{Cp}(I', O') \]
which is the identity on morphisms and maps component $\langle u_p, a_p \rangle$ into $\langle u_p, a_{p[f,g]} \rangle$, where
\[ a_{p[f,g]} = U_p \times I' \xrightarrow{\ \mathrm{id} \times f\ } U_p \times I \xrightarrow{\ a_p\ } \mathsf{B}(U_p \times O) \xrightarrow{\ \mathsf{B}(\mathrm{id} \times g)\ } \mathsf{B}(U_p \times O') \]
As expected, the following properties hold:
\[ p[f, g] \sim \ulcorner f \urcorner \,;\, p \,;\, \ulcorner g \urcorner \tag{3.27} \]
\[ (p[f, g])[f', g'] \sim p[f \cdot f', g' \cdot g] \tag{3.28} \]
Some simple components arise by lifting elementary functions to Cp. We have already remarked that the lifting of the canonical arrow associated to the initial Set object plays the role of an inert component, unable to react to the outside world. Let us give this component a name:
\[ \mathsf{inert}_A = \ulcorner ?_A \urcorner \tag{3.29} \]
In particular, we define the nil component, $\mathsf{nil} = \mathsf{inert}_{\emptyset} = \ulcorner ?_{\emptyset} \urcorner = \ulcorner \mathrm{id}_{\emptyset} \urcorner$, typed as $\mathsf{nil} : \emptyset \to \emptyset$. Note that any component $p : I \to O$ can be made inert by wrapping. For example, $p[?_I, !_O] \sim \mathsf{inert}_{\mathbf{1}}$. A somewhat dual role is played by component $\mathsf{idle} = \ulcorner \mathrm{id}_{\mathbf{1}} \urcorner$. Note that $\mathsf{idle} : \mathbf{1} \to \mathbf{1}$ will propagate an unstructured stimulus (e.g., pushing a button) leading to a (similarly) unstructured reaction (e.g., switching on a LED).

3.3.4.3. Choice. Components can be aggregated in a number of different ways, besides the 'pipeline' composition discussed above. Next, we introduce three other generic combinators, corresponding to choice, parallel and concurrent composition. Let $p : I \to O$ and $q : J \to R$ be two components defined by $\langle u_p, a_p \rangle$ and $\langle u_q, a_q \rangle$, respectively. The first composition pattern to be considered is external choice, as depicted below:

[Figure: the component $p \boxplus q$, with input $I + J$ routed either to $p : I \to O$ or to $q : J \to R$, and output $O + R$.]
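When interacting with $p \boxplus q$, the environment is allowed to choose either to input a value of type $I$ or one of type $J$, triggering the corresponding component ($p$ or $q$, respectively) and producing output. Formally,

Definition 3.9. The choice combinator is defined as a lax functor $\boxplus : \mathsf{Cp} \times \mathsf{Cp} \to \mathsf{Cp}$, which consists of an action on objects given by $I \boxplus J = I + J$ and a family of functors
\[ \boxplus_{I,O,J,R} : \mathsf{Cp}(I, O) \times \mathsf{Cp}(J, R) \to \mathsf{Cp}(I + J, O + R) \]
yielding
\[ p \boxplus q = \langle \langle u_p, u_q \rangle \in U_p \times U_q,\ a_{p \boxplus q} \rangle \]
where $a_{p \boxplus q}$ is given by
\[
\begin{array}{rcl}
U_p \times U_q \times (I + J) & \xrightarrow{\ (xr + a) \cdot dr\ } & U_p \times I \times U_q + U_p \times (U_q \times J) \\
& \xrightarrow{\ a_p \times \mathrm{id} + \mathrm{id} \times a_q\ } & \mathsf{B}(U_p \times O) \times U_q + U_p \times \mathsf{B}(U_q \times R) \\
& \xrightarrow{\ \tau_r + \tau_l\ } & \mathsf{B}(U_p \times O \times U_q) + \mathsf{B}(U_p \times (U_q \times R)) \\
& \xrightarrow{\ \mathsf{B}xr + \mathsf{B}a^{\circ}\ } & \mathsf{B}(U_p \times U_q \times O) + \mathsf{B}(U_p \times U_q \times R) \\
& \xrightarrow{\ [\mathsf{B}(\mathrm{id} \times \iota_1), \mathsf{B}(\mathrm{id} \times \iota_2)]\ } & \mathsf{B}(U_p \times U_q \times (O + R))
\end{array}
\]
and mapping pairs of arrows $\langle h_1, h_2 \rangle$ into $h_1 \times h_2$. The following laws arise from the fact that $\boxplus$ is a lax functor in Cp:
\[ (p \boxplus p') \,;\, (q \boxplus q') \sim (p \,;\, q) \boxplus (p' \,;\, q') \tag{3.30} \]
\[ \mathsf{copy}_{K \boxplus K'} \sim \mathsf{copy}_K \boxplus \mathsf{copy}_{K'} \tag{3.31} \]
\[ \ulcorner f \urcorner \boxplus \ulcorner g \urcorner \sim \ulcorner f + g \urcorner \tag{3.32} \]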
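Moreover, up to isomorphic wiring, $\boxplus$ is a symmetric tensor product in each hom-category, with nil as unit, i.e.,
\[ (p \boxplus q) \boxplus r \sim (p \boxplus (q \boxplus r))[a_+, a_+^{\circ}] \tag{3.33} \]
\[ \mathsf{nil} \boxplus p \sim p[r_+, r_+^{\circ}] \quad \text{and} \quad p \boxplus \mathsf{nil} \sim p[l_+, l_+^{\circ}] \tag{3.34} \]
\[ p \boxplus q \sim (q \boxplus p)[s_+, s_+] \tag{3.35} \]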
Laws (3.33) to (3.35) can be alternatively stated as providing evidence that the canonical Set isomorphisms $a_+$, $r_+$, $l_+$ and $s_+$, once lifted to Cp, keep their naturality up to bisimulation.

3.3.4.4. An Either Construction. The definition of a choice combinator raises the question whether there is a counterpart in Cp to the either construction in Set. The answer is partly positive. Let $p : I \to O$ and $q : J \to O$ be two components sharing a common output type $O$, and define
\[ [p, q] = (p \boxplus q) \,;\, \ulcorner \nabla \urcorner \]
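where $\nabla = [\mathrm{id}, \mathrm{id}]$. It can be shown that the following diagram commutes up to bisimulation,

[Diagram: the lifted injections $\ulcorner \iota_1 \urcorner : I \to I + J$ and $\ulcorner \iota_2 \urcorner : J \to I + J$, together with $p : I \to O$, $q : J \to O$ and $[p, q] : I + J \to O$.]

\[ \ulcorner \iota_1 \urcorner \,;\, [p, q] \sim p \quad \text{and} \quad \ulcorner \iota_2 \urcorner \,;\, [p, q] \sim q \tag{3.36} \]
even though $[p, q]$ is not the unique arrow making the diagram commute. This is formalized in the following lemma, whose proof is included to give a flavour of the calculation style adopted here.

Lemma 3.3. The choice combinator $\boxplus$ lifts to a weak coproduct in Bh.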
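Proof. A weak coproduct is defined like a coproduct but for the uniqueness of the mediating arrow (the either construction). Existence, i.e., the validity of (3.36), is proved considering the equivalent formulation $(p \boxplus q)[\iota_1, \nabla] \sim p$ and $(p \boxplus q)[\iota_2, \nabla] \sim q$, replacing composition with lifted functions by wrapping. We show that the first and the second projection, respectively, are morphisms from the left to the right hand side. For the first one,
\[
\begin{aligned}
& \mathsf{B}(\pi_1 \times \nabla) \cdot [\mathsf{B}(\mathrm{id} \times \iota_1), \mathsf{B}(\mathrm{id} \times \iota_2)] \cdot (\mathsf{B}xr + \mathsf{B}a^{\circ}) \cdot (\tau_r + \tau_l) \cdot (a_p \times \mathrm{id} + \mathrm{id} \times a_q) \cdot (xr + a) \cdot dr \cdot (\mathrm{id} \times \iota_1) \\
={} & \quad \{\ \text{law: } \iota_1 = dr \cdot (\mathrm{id} \times \iota_1) \text{ (cf. [7])}\ \} \\
& \mathsf{B}(\pi_1 \times \nabla) \cdot [\mathsf{B}(\mathrm{id} \times \iota_1), \mathsf{B}(\mathrm{id} \times \iota_2)] \cdot (\mathsf{B}xr + \mathsf{B}a^{\circ}) \cdot (\tau_r + \tau_l) \cdot (a_p \times \mathrm{id} + \mathrm{id} \times a_q) \cdot (xr + a) \cdot \iota_1 \\
={} & \quad \{\ + \text{ absorption and cancellation}\ \} \\
& \mathsf{B}(\pi_1 \times \nabla) \cdot \mathsf{B}(\mathrm{id} \times \iota_1) \cdot \mathsf{B}xr \cdot \tau_r \cdot (a_p \times \mathrm{id}) \cdot xr \\
={} & \quad \{\ \text{routine: } \nabla \cdot \iota_1 = \mathrm{id}\ \} \\
& \mathsf{B}(\pi_1 \times \mathrm{id}) \cdot \mathsf{B}xr \cdot \tau_r \cdot (a_p \times \mathrm{id}) \cdot xr \\
={} & \quad \{\ \text{routine: } (\pi_1 \times \mathrm{id}) \cdot xr = \pi_1\ \} \\
& \mathsf{B}\pi_1 \cdot \tau_r \cdot (a_p \times \mathrm{id}) \cdot xr \\
={} & \quad \{\ \text{law: } \mathsf{B}\pi_1 \cdot \tau_r = \pi_1 \text{ (cf. [7])}\ \} \\
& \pi_1 \cdot (a_p \times \mathrm{id}) \cdot xr \\
={} & \quad \{\ \times \text{ definition and cancellation}\ \} \\
& a_p \cdot \pi_1 \cdot xr \\
={} & \quad \{\ \text{routine: } (\pi_1 \times \mathrm{id}) \cdot xr = \pi_1 \text{ and } xr = xr^{\circ}\ \} \\
& a_p \cdot (\pi_1 \times \mathrm{id})
\end{aligned}
\]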
which establishes the first clause of (3.36). A similar calculation proves the second one. Note that in both cases seeds are trivially preserved. It is impossible to turn either into a universal construction in Bh. The basic observation is that the codiagonal $\nabla$ does not keep its naturality when lifted to Cp. In fact, a counterexample can be found even in the simple setting of deterministic components (i.e., with $\mathsf{B} = \mathrm{Id}$). Let $p = \langle 0 \in \mathbb{N}, a_p \rangle : \mathbb{N} \to \mathbb{N}$ be such that, upon receiving an input $i$, $i$ is added to the current state value and the result sent to the output. Consider the following sequence of inputs (of type $\mathbb{N} + \mathbb{N}$): $s = \langle \iota_1 5, \iota_2 3, \iota_1 4, \ldots \rangle$. The reaction to $s$ of $(p \boxplus p) \,;\, \ulcorner \nabla \urcorner$ is $\langle 5, 3, 9, \ldots \rangle$ while $\ulcorner \nabla \urcorner \,;\, p$, resorting only to one copy of $p$, produces $\langle 5, 8, 12, \ldots \rangle$.
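The counterexample is easy to replay by hand or by machine. The following sketch, for the deterministic case only, encodes an input $\iota_k\, n$ as a hypothetical pair `(k, n)`; the function names are illustrative and not part of the calculus.

```python
def accumulator(state, i):
    """Deterministic component p: add the input to the state and output the result."""
    new = state + i
    return new, new

def choice_then_codiagonal(inputs):
    """(p [+] p) ; codiagonal: two independent copies of p, selected by the input tag."""
    left, right, outs = 0, 0, []
    for tag, value in inputs:
        if tag == 1:
            left, o = accumulator(left, value)
        else:
            right, o = accumulator(right, value)
        outs.append(o)          # the codiagonal forgets which copy answered
    return outs

def codiagonal_then_p(inputs):
    """codiagonal ; p: the tag is forgotten first, so a single copy of p sees every input."""
    state, outs = 0, []
    for _tag, value in inputs:
        state, o = accumulator(state, value)
        outs.append(o)
    return outs

s = [(1, 5), (2, 3), (1, 4)]
print(choice_then_codiagonal(s))   # [5, 3, 9]
print(codiagonal_then_p(s))        # [5, 8, 12]
```

The two output sequences already differ at the second position, so the two components cannot be bisimilar.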
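□

Failing universality means there is no fusion law for $\boxplus$, even in the deterministic case. However, cancellation, reflection and absorption laws do hold strictly in Bh and, up to bisimulation, in Cp. Cancellation has just been dealt with. The other two, reflection
\[ [\ulcorner \iota_1 \urcorner, \ulcorner \iota_2 \urcorner] \sim \mathsf{copy}_{I+J} \tag{3.37} \]
and absorption
\[ (p \boxplus q) \,;\, [p', q'] \sim [p \,;\, p', q \,;\, q'] \tag{3.38} \]
are easy to prove. For example,
\[
\begin{aligned}
& (p \boxplus q) \,;\, [p', q'] \\
\sim{} & \quad \{\ \text{definition of either in Cp}\ \} \\
& (p \boxplus q) \,;\, ((p' \boxplus q') \,;\, \ulcorner \nabla \urcorner) \\
\sim{} & \quad \{\ ; \text{ associative (3.23)}\ \} \\
& ((p \boxplus q) \,;\, (p' \boxplus q')) \,;\, \ulcorner \nabla \urcorner \\
\sim{} & \quad \{\ \boxplus \text{ functor (3.30)}\ \} \\
& ((p \,;\, p') \boxplus (q \,;\, q')) \,;\, \ulcorner \nabla \urcorner \\
\sim{} & \quad \{\ \text{definition of either in Cp}\ \} \\
& [p \,;\, p', q \,;\, q']
\end{aligned}
\]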
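As expected, the $\boxplus$ combinator can be written in terms of an either construction on components. In fact, for $p : I \to O$ and $q : J \to R$, we obtain
\[ p \boxplus q \sim [p \,;\, \ulcorner \iota_1 \urcorner, q \,;\, \ulcorner \iota_2 \urcorner] \tag{3.39} \]
That is to say, Set coproduct embeddings, once lifted to Cp, keep their naturality:
\[ \ulcorner \iota_1 \urcorner \,;\, (p \boxplus q) \sim p \,;\, \ulcorner \iota_1 \urcorner \quad \text{and} \quad \ulcorner \iota_2 \urcorner \,;\, (p \boxplus q) \sim q \,;\, \ulcorner \iota_2 \urcorner \tag{3.40} \]
A direct corollary of this fact is the following 'idempotency' result:
\[ p \,;\, \ulcorner \iota_1 \urcorner \sim \ulcorner \iota_1 \urcorner \,;\, (p \boxplus p) \tag{3.41} \]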
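3.3.4.5. Parallel. Parallel composition, denoted by $p \boxtimes q$, corresponds to a synchronous product: both components are executed simultaneously when triggered by a pair of legal input values. Note, however, that the behaviour effect, captured by monad $\mathsf{B}$, propagates. For example, if $\mathsf{B}$ can express component failure and one of the arguments fails, the product fails as well. Formally,

Definition 3.10. The parallel combinator $\boxtimes$ is defined by an action $I \boxtimes J = I \times J$ on objects and a family of functors
\[ \boxtimes_{I,O,J,R} : \mathsf{Cp}(I, O) \times \mathsf{Cp}(J, R) \to \mathsf{Cp}(I \times J, O \times R) \]
which yields
\[ p \boxtimes q = \langle \langle u_p, u_q \rangle \in U_p \times U_q,\ a_{p \boxtimes q} \rangle \]
where $a_{p \boxtimes q}$ is given by
\[
\begin{array}{rcl}
U_p \times U_q \times (I \times J) & \xrightarrow{\ m\ } & U_p \times I \times (U_q \times J) \\
& \xrightarrow{\ a_p \times a_q\ } & \mathsf{B}(U_p \times O) \times \mathsf{B}(U_q \times R) \\
& \xrightarrow{\ \delta_l\ } & \mathsf{B}(U_p \times O \times (U_q \times R)) \\
& \xrightarrow{\ \mathsf{B}m\ } & \mathsf{B}(U_p \times U_q \times (O \times R))
\end{array}
\]
and maps every pair of arrows $\langle h_1, h_2 \rangle$ into $h_1 \times h_2$. The following laws hold for $\boxtimes$:
\[
\begin{array}{llr}
\text{lax} & (p \boxtimes p') \,;\, (q \boxtimes q') \sim (p \,;\, q) \boxtimes (p' \,;\, q') & (3.42) \\
 & \mathsf{copy}_{K \boxtimes K'} \sim \mathsf{copy}_K \boxtimes \mathsf{copy}_{K'} & (3.43) \\
\text{functions} & \ulcorner f \urcorner \boxtimes \ulcorner g \urcorner \sim \ulcorner f \times g \urcorner & (3.44) \\
\text{assoc} & (p \boxtimes q) \boxtimes r \sim (p \boxtimes (q \boxtimes r))[a, a^{\circ}] & (3.45) \\
\text{id} & \mathsf{idle} \boxtimes p \sim p[r, r^{\circ}] & (3.46) \\
\text{zero} & \mathsf{nil} \boxtimes p \sim \mathsf{nil}[zl, zl^{\circ}] & (3.47) \\
\text{comm} & p \boxtimes q \sim (q \boxtimes p)[s, s] \quad \text{if } \mathsf{B} \text{ is commutative} & (3.48)
\end{array}
\]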
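The remark on failure propagation can be illustrated with a minimal sketch that assumes the maybe monad as behaviour model, with Python's `None` standing for failure. The names `par`, `divider` and `doubler` are hypothetical and serve only this example.

```python
# Assumption: B is the maybe monad, with None standing for failure.
# An action maps (state, input) to None or to a (next_state, output) pair.

def par(p, q):
    """Synchronous product: both components react to a pair of inputs."""
    (up, ap), (uq, aq) = p, q
    def action(state, inputs):
        (sp, sq), (i, j) = state, inputs
        rp, rq = ap(sp, i), aq(sq, j)
        if rp is None or rq is None:
            return None                      # failure of either side propagates
        (sp2, o), (sq2, r) = rp, rq
        return ((sp2, sq2), (o, r))
    return ((up, uq), action)

# Hypothetical components: a partial divider and a total doubler.
divider = ((), lambda s, i: None if i == 0 else (s, 10 // i))
doubler = ((), lambda s, j: (s, 2 * j))

seed, action = par(divider, doubler)
print(action(seed, (5, 3)))   # both succeed
print(action(seed, (0, 3)))   # the divider fails, so the product fails
```

Even though the doubler is total, the product has no response once the divider fails, which is the propagation effect mentioned in the text.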
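Again one may ask whether $\boxtimes$ lifts to a universal product construction at the behavioural level. Dually to the either combinator, we start by defining the split of two components as
\[ \langle p, q \rangle = \ulcorner \Delta \urcorner \,;\, (p \boxtimes q) \quad \text{where } \Delta = \langle \mathrm{id}, \mathrm{id} \rangle \]
This definition, however, does not guarantee, in general, the commutativity of

[Diagram: $p : I \to O$, $q : I \to R$ and $\langle p, q \rangle : I \to O \boxtimes R$, with the lifted projections $\ulcorner \pi_1 \urcorner : O \boxtimes R \to O$ and $\ulcorner \pi_2 \urcorner : O \boxtimes R \to R$.]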
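Commutativity does hold, however, and so does the cancellation law $\langle p, q \rangle \,;\, \ulcorner \pi_1 \urcorner \sim p$, for commutative monads $\mathsf{B}$ which exclude the possibility of failure (e.g., the non-empty powerset). On the other hand, the diagonal $\Delta$ keeps its naturality when lifted to Cp, for $\mathsf{B}$ expressing deterministic behaviour (e.g., the identity or the maybe monad), entailing a fusion law: $r \,;\, \langle p, q \rangle \sim \langle r \,;\, p, r \,;\, q \rangle$. Combining these two results, one concludes that $\boxtimes$ is a product in Bh, but only for behaviour models excluding both failure and non-determinism, which narrows the applicability scope of this fact to the category of total deterministic components. However, reflection, absorption and definition laws hold for any $\mathsf{B}$:
\[
\begin{array}{llr}
\text{reflection} & \langle \ulcorner \pi_1 \urcorner, \ulcorner \pi_2 \urcorner \rangle \sim \mathsf{copy}_{O \times R} & (3.49) \\
\text{absorption} & \langle p, q \rangle \,;\, (p' \boxtimes q') \sim \langle p \,;\, p', q \,;\, q' \rangle \quad \text{for } \mathsf{B} \text{ commutative} & (3.50) \\
\text{definition} & p \boxtimes q \sim \langle \ulcorner \pi_1 \urcorner \,;\, p, \ulcorner \pi_2 \urcorner \,;\, q \rangle & (3.51)
\end{array}
\]
Product projections, on the other hand, keep naturality only when cancellation holds. Always, however, one has
\[ (\ulcorner f \urcorner \boxtimes q) \,;\, \ulcorner \pi_2 \urcorner \sim \ulcorner \pi_2 \urcorner \,;\, q \tag{3.52} \]
\[ (p \boxtimes \ulcorner f \urcorner) \,;\, \ulcorner \pi_1 \urcorner \sim \ulcorner \pi_1 \urcorner \,;\, p \tag{3.53} \]

3.3.4.6. Concurrent.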
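Finally, concurrent composition, denoted by $\circledast$, combines choice and parallel, in the sense that $p$ and $q$ can be executed independently or jointly, depending on the input supplied. Formally,

Definition 3.11. The concurrent combinator is defined by an action $I \circledast J = I + J + I \times J$ on objects and a family of functors
\[ \circledast_{I,O,J,R} : \mathsf{Cp}(I, O) \times \mathsf{Cp}(J, R) \to \mathsf{Cp}(I + J + I \times J, O + R + O \times R) \]
yielding
\[ p \circledast q = \langle \langle u_p, u_q \rangle \in U_p \times U_q,\ a_{p \circledast q} \rangle \]
where
\[ a_{p \circledast q} = U_p \times U_q \times (I \circledast J) \xrightarrow{\ [\mathsf{B}(\mathrm{id} \times \iota_1), \mathsf{B}(\mathrm{id} \times \iota_2)] \cdot (a_{p \boxplus q} + a_{p \boxtimes q}) \cdot dr\ } \mathsf{B}(U_p \times U_q \times (O \circledast R)) \]
and maps pairs of arrows $\langle h_1, h_2 \rangle$ into $h_1 \times h_2$. The laws of concurrent composition combine corresponding results about $\boxplus$ and $\boxtimes$. In particular we get again permutation with sequential composition and the structure of a tensor product, which is symmetric for commutative behaviour monads. Moreover, the following reduction laws relate $\circledast$ to the other two tensors:
\[ \ulcorner \iota_1 \urcorner \,;\, (p \circledast q) \sim (p \boxplus q) \,;\, \ulcorner \iota_1 \urcorner \tag{3.54} \]
\[ \ulcorner \iota_2 \urcorner \,;\, (p \circledast q) \sim (p \boxtimes q) \,;\, \ulcorner \iota_2 \urcorner \tag{3.55} \]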
3.3.4.7. Interaction.
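So far component interaction was centred upon sequential composition, which is the Cp counterpart to functional composition in Set. This can be generalised to a new combinator, called hook, which forces part of the output of a component to be fed back as input. Formally,

Definition 3.12. The hook combinator $-\Lsh_Z$ is defined, for each tuple of objects $(I, O, Z)$, as a functor between the (categories underlying) hom-sets $\mathsf{Cp}(I + Z, O + Z)$ and $\mathsf{Cp}(I + Z, O + Z)$ which is the identity on arrows and maps each component $p : I + Z \to O + Z$ to $p\Lsh_Z : I + Z \to O + Z$ given by $p\Lsh_Z = \langle u_p \in U_p, a_{p\Lsh_Z} \rangle$ where $a_{p\Lsh_Z}$ is
\[
\begin{array}{rcl}
U_p \times (I + Z) & \xrightarrow{\ a_p\ } & \mathsf{B}(U_p \times (O + Z)) \\
& \xrightarrow{\ \mathsf{B}((\eta \cdot (\mathrm{id} \times \iota_1) + a_p \cdot (\mathrm{id} \times \iota_2)) \cdot dr)\ } & \mathsf{B}(\mathsf{B}(U_p \times (O + Z)) + \mathsf{B}(U_p \times (O + Z))) \\
& \xrightarrow{\ \mu \cdot \mathsf{B}\nabla\ } & \mathsf{B}(U_p \times (O + Z))
\end{array}
\]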
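For components with the same input/output type, the hook combinator has a particularly simple definition as the Kleisli composition of the original dynamics. It is then called a feedback and denoted by $p\Lsh : Z \to Z = \langle u_p \in U_p, a_{p\Lsh} \rangle$, where $a_{p\Lsh} = a_p \bullet a_p$, i.e.,
\[ a_{p\Lsh} = U_p \times Z \xrightarrow{\ a_p\ } \mathsf{B}(U_p \times Z) \xrightarrow{\ \mathsf{B}a_p\ } \mathsf{B}\mathsf{B}(U_p \times Z) \xrightarrow{\ \mu\ } \mathsf{B}(U_p \times Z) \]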
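Feedback as Kleisli self-composition can be sketched as follows, again assuming the finite powerset monad modelled by Python lists. The names `feedback`, `lift` and `walker` are hypothetical; the last lines check, on one input, that feeding back a lifted function behaves like lifting the function composed with itself.

```python
# Assumption: B is the finite powerset monad, modelled by Python lists.

def feedback(p):
    """Feedback of p : Z -> Z: run the dynamics twice, feeding the first output back."""
    seed, ap = p
    def action(s, z):
        return [(s2, z2) for (s1, z1) in ap(s, z) for (s2, z2) in ap(s1, z1)]
    return (seed, action)

def lift(f):
    """Function lifting: stateless, exactly one response."""
    return ((), lambda s, x: [(s, f(x))])

# Hypothetical nondeterministic component: step the value up or down by one,
# counting the steps taken in the state.
walker = (0, lambda s, z: [(s + 1, z + 1), (s + 1, z - 1)])

seed, action = feedback(walker)
print(action(seed, 0))

inc = lambda z: z + 1
_, fb = feedback(lift(inc))
_, twice = lift(lambda z: inc(inc(z)))
print(fb((), 5) == twice((), 5))
```

The nested comprehension plays the role of $\mu \cdot \mathsf{B}a_p \cdot a_p$: the inner loop applies the dynamics to every intermediate result, and building one flat list performs the flattening.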
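Both hook and feedback specialise to components representing functions according to the following laws,
\[ \ulcorner f \urcorner \Lsh \sim \ulcorner f \cdot f \urcorner \tag{3.56} \]
\[ \ulcorner g \urcorner \Lsh_Z \sim \ulcorner [\iota_1, g \cdot \iota_2] \cdot g \urcorner \tag{3.57} \]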
for $f : Z \to Z$ and $g : I + Z \to O + Z$. Moreover, for components $p : Z \to Z$ and $q : I \to O$, one has
\[ p\Lsh \sim p[r_+, r_+^{\circ}]\,\Lsh_Z\,[r_+^{\circ}, r_+] \tag{3.58} \]
\[ q \boxplus p\Lsh \sim (q \boxplus p)\Lsh_Z \tag{3.59} \]
\[ p\Lsh \boxplus q \sim (q \boxplus p)\Lsh_Z\,[s_+, s_+] \tag{3.60} \]
All laws above, with the exception of (3.60), are actually strict Cp arrow equalities, and not just bisimulations. Also notice that equation (3.59) generalises to law (3.61) [equation illegible in the source], for $p : J + Z \to R + Z$.
The last set of laws relates hook and feedback with the other combinators in the calculus. Let $p : Z \to Z$ and $q : R \to R$ be components and $i : W \to Z$ be a Set isomorphism. Then
\[ p\Lsh\,[i, i^{\circ}] \sim p[i, i^{\circ}]\,\Lsh \tag{3.62} \]
\[ (p \mathbin{\Box} q)\Lsh \sim p\Lsh \mathbin{\Box} q\Lsh \tag{3.63} \]
for $\Box = \boxplus$, $\boxtimes$ or $\circledast$. Finally, let $p : I + K \to O + K$ and $q : J \to R$ be components, $f : I' \to I$, $g : O \to O'$ functions and $i : W \to K$ a Set isomorphism. Then
\[ p\Lsh_K\,[f + i, g + i^{\circ}] \sim p[f + i, g + i^{\circ}]\,\Lsh_W \tag{3.64} \]
\[ (p \boxplus q)[xr_+, xr_+]\,\Lsh_K \sim (p\Lsh_K \boxplus q)[xr_+, xr_+] \tag{3.65} \]
Note that all equations are strict Cp arrow equalities. However, validity of (3.63), for $\Box = \boxtimes, \circledast$, depends on the commutativity of the behaviour monad $\mathsf{B}$. The reader is referred to [7] for proofs of all laws mentioned in this section.
3.3.5. A (Generic) Folder from Two Stacks

The purpose of this section is to illustrate how new components can be built from old ones, relying solely on the functionality available. The example is the construction of a folder out of two stacks. Although these components are parametric on the type of stacked objects, we will refer to these as 'pages', by analogy with a folder in which new 'pages' are inserted on and retrieved ('read') from the right-hand side pile. A static, VDM-like specification of the component we have in mind can be found in [62]. According to this specification, the Folder component should provide operations to read, insert a new page, turn a page right and turn a page left. Reading returns the page which is immediately accessible once the folder is open at some position. Insertion takes as argument the page to be inserted. The other two operations are simply state updates. Let $P$ be the type of a page. The Folder 'port' signature may be represented as follows, where input and output types are decorated with the corresponding action names:

\[ \mathsf{Folder} : (tr : \mathbf{1}) + (rd : \mathbf{1}) + (tl : \mathbf{1}) + (in : P) \to (rd : P) + (\{tr, tl, in\} : \mathbf{1}) \]

Our exercise consists in building Folder assuming that two stacks are used to model the left and right piles of pages, respectively. The intuition is that the push action of the right stack will be used to model page insertion into the folder, i.e., action in. On the other hand, it should also be connected to the pop of the left one to model tr, the 'turn page right' action. A symmetric connection will be used to model tl.
The rd operation observes the 'front' page, i.e., the one which can be accessed by top on the right stack. According to this plan, the assembly of Folder starts by defining RightS as a Stack component suitably wrapped to meet the above mentioned constraints. At the input level we need to replicate the input to push by wrapping with the codiagonal function $\nabla_P$. On the other hand, access to the top button on the left stack is removed by $\iota_2$. At the output level, because of the additive interface structure, we cannot get rid of the top result. It is possible, however, to associate it to the push output and collapse both into $\mathbf{1}$, via $!_{P+\mathbf{1}}$. So we define:

\[ \mathsf{RightS} = \mathsf{Stack}[\mathrm{id} + \nabla, \mathrm{id}] : \mathbf{1} + \mathbf{1} + (P + P) \to P + P + \mathbf{1} \]
\[ \mathsf{LeftS} = \mathsf{Stack}[\iota_2 + \mathrm{id}, (\mathrm{id} + !_{P+\mathbf{1}}) \cdot a_+] : \mathbf{1} + P \to P + \mathbf{1} \]

Then, we form the $\boxplus$ composition of both components:

\[ \mathsf{LeftS} \boxplus \mathsf{RightS} : \mathbf{1} + P + (\mathbf{1} + \mathbf{1} + (P + P)) \to P + \mathbf{1} + (P + P + \mathbf{1}) \]

The next step builds the desired connections using hook over this composite, which requires a previous wrapping by a pair of suitable isomorphisms:

\[ \mathsf{AlmostFolder} = ((\mathsf{LeftS} \boxplus \mathsf{RightS})[wi, wo])\,\Lsh_{P+P} \]

where $wi$ and $wo$ are nested eithers of composite injections, written $\iota_{ij}$ for the composite $\iota_i \cdot \iota_j$. [The definitions of $wi$ and $wo$ and the accompanying diagram are illegible in the source; the diagram shows the input type $(\mathbf{1} + \mathbf{1} + \mathbf{1} + P) + (P + P)$ and an output type ending in $+ (P + P)$.]

Finally, to conform AlmostFolder to the Folder interface, we restrict the fed back input, by pre-composing with $fi = \iota_1$, and collapse both the trivial output and the fed back one to $\mathbf{1}$, by post-composing with $fo$ [definition illegible in the source]. Therefore, we complete the exercise by defining

\[ \mathsf{Folder} = \mathsf{AlmostFolder}[fi, fo] \]

which respects the intended interface. Note this design retains the architecture of the 'folder' component without any commitment to a particular behaviour model.
3.3.6. Composing Behavioural Models
Both the component model and the calculus introduced in the previous sections are generic, i.e., parameterised by a behavioural model, but homogeneous in so far as a single such model is assumed common to all components. In order to be useful in the design of large, complex systems, a suitable extension is needed to support interaction and coordination of software components with different behaviour models. This can be achieved in essentially two ways. The first one is behavioural casting: to compose two components originally defined under distinct behavioural assumptions, one of them (or both) is first cast as a component for a 'richer' behavioural model. The condition under which such a conversion becomes possible is the existence of a monad morphism between the two models (or between both of them and a third, more general one). Recall that a morphism between two monads $\mathsf{B}_1 = \langle B_1, \eta_1, \mu_1 \rangle$ and $\mathsf{B}_2 = \langle B_2, \eta_2, \mu_2 \rangle$ on Set is a natural transformation $\alpha : B_1 \Rightarrow B_2$ making the unit and multiplication diagrams commute, i.e.,
\[ \alpha \cdot \eta_1 = \eta_2 \qquad \text{and} \qquad \alpha \cdot \mu_1 = \mu_2 \cdot (\alpha\,\alpha) \]
where $\alpha\,\alpha : B_1 B_1 \Rightarrow B_2 B_2$ denotes the horizontal composite of $\alpha$ with itself.
Every such morphism $\alpha$ lifts to a natural transformation $\alpha^I : B_1(\mathrm{Id} \times O)^I \Rightarrow B_2(\mathrm{Id} \times O)^I$ transforming each $\mathsf{T}^{\mathsf{B}_1}$ component into a $\mathsf{T}^{\mathsf{B}_2}$ one, over the same state space. Clearly, not every behavioural casting is possible: the target monad should have 'enough' structure to do the embedding. For example, partiality may be cast as non-determinism, taking $h = [\mathit{sing}, \emptyset]$ as the underlying monad morphism from $B_1 = \mathrm{Id} + \mathbf{1}$ to $B_2 = \mathcal{P}$, but not the other way round. An alternative way is to lift behavioural casting to the structure of the semantic category for generic components, which becomes defined as a cofibred category of coalgebras Co over a family of interface functors (see [46] for a discussion on coalgebras cofibred over functors). In [81] a systematic reconstruction of the calculus is made on top of this category in order to accommodate components exhibiting different behavioural patterns. Category Co is defined as the 'total' category which encompasses the categories $\mathsf{Cp}_{\mathsf{B}}$ for all the possible monads $\mathsf{B}$, entailing a notion of morphism between coalgebras for different monads. Given two components $p$ and $q$, with possibly distinct behaviour models $\mathsf{B}_1$ and $\mathsf{B}_2$, a morphism in Co between them is a pair $\langle h, \alpha \rangle$ where $\alpha : B_1 \Rightarrow B_2$ is the underlying monad morphism and $h : U_p \to U_q$ is a function between the corresponding state spaces such that $h\, u_p = u_q$ and the following square commutes:
\[ a_q \cdot h = B_2(h \times \mathrm{id})^I \cdot \alpha^I \cdot a_p \]
with $a_p : U_p \to B_1(U_p \times O)^I$ and $a_q : U_q \to B_2(U_q \times O)^I$.
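Behavioural casting of partiality into non-determinism can be sketched in a few lines. The encoding below is an assumption made for illustration: partial results are `None` or a value, non-deterministic results are Python lists, and `alpha`, `cast` and `divider` are hypothetical names.

```python
# Assumption: partial results are modelled as None (failure) or a value,
# and nondeterministic results as Python lists.

def alpha(maybe_value):
    """[sing, empty]: failure becomes the empty list, a value becomes a singleton."""
    return [] if maybe_value is None else [maybe_value]

def cast(component):
    """Behavioural casting of a partial component, over the same state space."""
    seed, action = component
    return (seed, lambda s, i: alpha(action(s, i)))

# Hypothetical partial component: a divider failing on zero.
divider = ((), lambda s, i: None if i == 0 else (s, 10 // i))

seed, action = cast(divider)
print(action(seed, 5))   # a singleton set of responses
print(action(seed, 0))   # failure is cast as the empty set of responses
```

The state space and the seed are untouched; only the shape of the response changes. The converse casting is not available in general, since a list with several elements has no canonical image as a single optional value.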
Composition in this category is defined as before, but for the resulting signature: both monads involved are combined into a new pattern. Thus, given components $p = \langle u_p \in U_p, a_p : U_p \to B_1(U_p \times K)^I \rangle$ and $q = \langle u_q \in U_q, a_q : U_q \to B_2(U_q \times O)^K \rangle$, their sequential composition is given by $p \,;\, q = \langle \langle u_p, u_q \rangle \in U, a_{p;q} : U \to B(U \times O)^I \rangle$ where $U = U_p \times U_q$, as before, and $B = B_1 B_2$. Note, however, that the simple composition of functors $B_1$ and $B_2$ does not, in general, yield a monad: this requires a distributive law relating $B_1$ and $B_2$ and satisfying a number of (coherence) conditions (see [7] for a detailed discussion), which constrains the applicability of this approach. A similar situation arises in the definition of the (corresponding) tensor products. Reference [81] introduces such operators in detail for components modelled as pointed coalgebras for functor
\[ \mathsf{T}^{\mathsf{B}} = A \times \mathsf{B}(\mathrm{Id} \times O)^I \tag{3.66} \]
which adds an independent attribute part, represented by type $A$, to the basic shape considered in the previous sections. Therefore, a $\mathsf{T}^{\mathsf{B}}$-coalgebra aggregates in $\langle o_p, a_p \rangle$ an independent observer $o_p : U \to A$ and an input-driven action $a_p : U \times I \to \mathsf{B}(U \times O)$. Curiously, the resulting calculus is quite close to the basic, homogeneous one discussed above.
3.3.7. Discussion
This section introduced a semantic model for software components, regarded as concrete pointed coalgebras for some Set endofunctors, and a calculus to reason about (and transform) component-based designs. Both the model and the calculus are parametric on a strong monad capturing the intended behaviour model. The approach focuses on state-based components with a form of synchronous interaction. Such assumptions, which underlie popular technologies like, e.g., CORBA [78], DCOM [23] or JAVABEANS [52], reflect what could be called the object orientation legacy. A component, in this sense, is essentially a collection of objects and, therefore, component interaction is achieved by mechanisms implementing the usual method call semantics. The bicategorical setting adopted in this section seems appropriate to capture a 'two-level structure' in the component models. This is clearly indebted to previous work by R. Walters and his collaborators on models for deterministic input-driven systems [37-39]. Two other influences should be acknowledged. The first is the recent area of coalgebraic specification of object-oriented systems (see e.g., [30, 67]), which has been developed with a similar motivation, although in a property-oriented, or axiomatic, framework. The other is the 'dataflow paradigm' [61] to which some of the aggregation patterns and the general idea of structured wiring can eventually be traced back.

An alternative approach to componentware is inspired by research on coordination languages [21, 64] and favors strict component decoupling in order to support a looser inter-component dependency. Here computation and coordination are clearly separated, communication becomes anonymous and component interconnection is externally controlled. This model is (partially) implemented in JAVASPACES on top of JINI [60] and is fundamental to a number of approaches to componentware which identify communication by generic channels as the basic interaction mechanism (see, e.g., REO [3], PICCOLA [59, 76], [15, 18] or [12]).
Finally, a pointfree, essentially equational, calculational proof style has been used. In particular, equational proofs replace the more traditional use of coinduction (in terms of explicit construction of bisimulations). Generic proofs performed in this style are often long, even if easy to follow. In most cases their length results from the systematic recording of almost all elementary steps. On the other hand, this style has become familiar to the functional programming community, where it has been popularised under the 'Bird-Meertens formalism' heading (see e.g., [4, 16] or [5]). After this brief discussion, we shall now focus on specific applications of this framework. Such is the aim of the next two sections.

3.4. Application to the Semantics of UML

The Unified Modeling Language (UML) [17, 63] became a de facto standard for object-oriented design. One of the main advantages of UML is that it offers a set of different views to describe specific aspects of the system to be developed, such as its static structure, the dynamic behavior of single objects as well as their interoperation and coordination. The rich notation and the broad scope of UML, however, have made it difficult to define a consensual and unifying formal semantics. Consequently, the quest for sound foundations for UML is still an active research area. However, most ongoing formalization work focuses on specific aspects of the method and does not scale to a framework able to integrate the multiple design views the method provides. Such a formal framework becomes mandatory whenever the establishment of semantic relations and consistency checks among the different views are in order. This section discusses the application of coalgebraic techniques to the semantic formalization of UML. Although the emphasis is placed on the characterization of static models represented by class diagrams, we claim the same framework can accommodate UML dynamic models (namely, use cases and statecharts) in a smooth, incremental way, leading eventually to a unifying semantics for the whole method. The central idea of this approach is the translation of the basic UML elements to coalgebraic specifications in the style introduced in [31, 33]. Our starting point is the UML Specification [63] which describes the abstract syntax itself as an (informally annotated) UML class diagram.

3.4.1. Class Diagrams
A class diagram captures the static structure of a system, as a set of classes and relationships between them. A class is an abstract description of a set of objects with similar structure, behavior and relationships. The description of a class includes the common attributes and operations of its objects whereas relationships with other classes are represented by generalizations and associations, including aggregation and composition.
3.4.1.1. Classes. A class is a description of a set of objects sharing the same attributes, methods and relationships. Every object $o$ of a class $C$ in a system has an identifier $id_o$ which is unique in the system. We denote the set of all identifiers by $Id$. Therefore, an object $o : C$ is represented as a triple $o = (id_o, U_C, \alpha_C : U_C \to T\,U_C)$ where $U_C$ is the state space of class $C$ and $T$ is a functor encapsulating a signature of attributes and methods. During the lifetime of an object, its local state $u$ may change over the state space $U_C$, but its identifier $id_o$, state space $U_C$ and transition structure $\alpha_C$ remain the same. Therefore, we can use a pair $(id_o, u)$ to represent an object $o$ at a particular state $u$. Every class in a UML class diagram corresponds to a coalgebraic specification Spec.

Definition 3.13. A coalgebraic specification Spec is a tuple $(T, \Phi, \Psi)$ in which:
• $T$ is a functor on a local state space $X$, used to represent the signature of all the attributes and methods of the class;
• $\Phi$ is a set of axioms that gives the constraints on the functors for the attributes and methods to characterize the properties of the class;
• $\Psi$ describes the properties that hold for newly created objects.

A model of a given class specification $Spec = (T, \Phi, \Psi)$ is a triple $c = (U, \alpha : U \to T\,U, u_0)$, where $U$ is a carrier set interpreting the state space of the class, $\alpha : U \to T\,U$ is a transition structure which satisfies all the properties given by $\Phi$, and $u_0 \in U$ is an initial state satisfying $\Psi$. The semantics of a concrete class $C$ in a UML class diagram is defined as the category Coalg(Spec) of models of the corresponding coalgebraic class specification Spec together with the initial state preserving morphisms between them, i.e.,

$$\mathcal{S}[\![C]\!] = \mathrm{Coalg}(T_C, \Phi_C, \Psi_C) \quad \text{if } isAbstract(C) = False$$
where $(T_C, \Phi_C, \Psi_C)$ is the specification of class $C$ and the Boolean-valued function $isAbstract$ indicates whether or not the class $C$ can be directly instantiated. The objects of Coalg(Spec) are

$$Obj(\mathrm{Coalg}(T_C, \Phi_C, \Psi_C)) = \{c = (U_C, \alpha_C : U_C \to T_C(U_C), u_0 \in U_C) \mid (c \models \Phi_C) \wedge (c, u_0 \models \Psi_C)\}$$
where $c \models \Phi_C$ means that all the axioms in $\Phi_C$ are satisfied by coalgebra $c$, and $c, u_0 \models \Psi_C$ means that the properties in $\Psi_C$ are satisfied by the initial state $u_0$ of $c$. The arrows are initial state preserving $T_C$-morphisms.

Let us now look at the specification of attributes. The default UML syntax is

visibility name: type-expr[multiplicity ordering] = initial-value{property-string}

Therefore, the semantics of an attribute $At$ in class $C$ is defined as

$$\mathcal{S}[\![v\ At : T[m] = i\{p\}]\!] \triangleq \{At : U_C \to T_{At}(U_C) \mid \mathcal{S}[\![v]\!] \wedge \mathcal{S}[\![At[m]]\!] \wedge \mathcal{S}[\![At = i]\!] \wedge \mathcal{S}[\![At\{p\}]\!]\}$$
L.S. Barbosa, M. Sun, B.K. Aichernig
98
and N.
Rodrigues
$T_{At}$ is the functor $T_{At} : U_C \to \mathcal{P}(\mathcal{S}[\![T]\!])$, where $\mathcal{P}$ is the powerset functor used to capture the multiplicity of the attribute (it can be dropped whenever the multiplicity is exactly one). If $\mathcal{S}[\![v]\!]$ gives the semantics for visibility of attribute $At$, the semantics for multiplicity is

$$\mathcal{S}[\![At[m]]\!] \triangleq \forall (U_C, \alpha_C, u_0) \in Obj(\mathcal{S}[\![C]\!]),\ \forall u \in U_C.\ card(At(u)) = m$$

or, if multiplicity is specified as a range,

$$\mathcal{S}[\![At[l..k]]\!] \triangleq \forall (U_C, \alpha_C, u_0) \in Obj(\mathcal{S}[\![C]\!]),\ \forall u \in U_C.\ l \leq card(At(u)) \leq k$$

The additional ordering property is only relevant when the multiplicity upper bound is greater than one. The values may be unordered (the default) or ordered,

$$\mathcal{S}[\![At[l..k\ \text{ordered}]]\!] = k > 1 \wedge \forall u \in U_C.\ At(u) \text{ is an ordered set}$$

The initial value is used for initializing the attribute of a newly created object; it can be omitted. Its semantic function is:

$$\mathcal{S}[\![At = i]\!] \triangleq \forall (U_C, \alpha_C, u_0) \in Obj(\mathcal{S}[\![C]\!]).\ At(u_0) = i$$
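The multiplicity and initial-value conditions can be checked mechanically on a small finite model. The Python sketch below is only an illustration under stated assumptions: the transition structure is reduced to a single attribute observer, and the example class (whose state is a set of phone numbers) and the names `ClassModel`, `satisfies_multiplicity` and `satisfies_initial_value` are hypothetical, not part of the chapter.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable

State = Hashable


@dataclass(frozen=True)
class ClassModel:
    """A finite model (U, alpha, u0); alpha is reduced to one attribute observer."""
    states: FrozenSet[State]                 # carrier U
    attribute: Callable[[State], FrozenSet]  # At : U -> P(values)
    initial: State                           # initial state u0


def satisfies_multiplicity(model: ClassModel, low: int, high: int) -> bool:
    """Range condition: every state u has low <= card(At(u)) <= high."""
    return all(low <= len(model.attribute(u)) <= high for u in model.states)


def satisfies_initial_value(model: ClassModel, value: FrozenSet) -> bool:
    """Initial-value condition: At(u0) = value."""
    return model.attribute(model.initial) == value


# Hypothetical class: the state is simply the current set of phone numbers.
states = frozenset({frozenset(), frozenset({"p1"}), frozenset({"p1", "p2"})})
model = ClassModel(states=states, attribute=lambda u: u, initial=frozenset())

ok_range = satisfies_multiplicity(model, 0, 2)    # all states hold 0..2 values
too_tight = satisfies_multiplicity(model, 0, 1)   # the two-number state violates 0..1
ok_init = satisfies_initial_value(model, frozenset())
```

Quantifying over all states of the carrier, rather than only the reachable ones, mirrors the universal quantification over $U_C$ in the definitions.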
The optional property string indicates property values of the attribute, for example changeability. Due to space limitations, its semantics will not be detailed here; the semantic functions for visibility of attributes, operations and interfaces are also omitted (the interested reader is referred to [82]).

3.4.1.2. Associations. An association in a class diagram describes the connections among objects in a system, with two or more association ends, as shown in the example of Figure 3.1.

[Figure: class Student (name: String, studentID: Number) linked to class Course (name: String, courseID: Number) by the association takecourse, with role names student and course and a * multiplicity.]
Fig. 3.1. An example of a UML association
Let $Spec_U$ and $Spec_V$ be the specifications of two classes $U$ and $V$ in a class diagram and $A$ a binary association between them. Suppose $c = (U, \alpha)$ and $d = (V, \beta)$ are objects in $\mathrm{Coalg}(Spec_U)$ and $\mathrm{Coalg}(Spec_V)$, respectively. Then association $A$, which connects the two coalgebras (classes), can be interpreted as a state space $S_A \subseteq \mathcal{P}((Id \times U) \times (Id \times V))$. Identifiers in set $Id$ are necessary to distinguish objects of the same class that are in the same state. An element $s \in S_A$ is a state of the association, recording the set of object pairs linked by the association. Every such pair of objects is called a link between them. Any association has three basic components: a name, a role and the multiplicity at each of its ends. The semantics of an association is given by the corresponding observers in each of the related classes. Formally,
Let $A$ stand for an association between classes $U$ and $V$ in a class diagram, and $u_A$, $v_A$ the role names on the two ends with multiplicities $m_U$ and $m_V$, respectively. The semantics of an association is defined as a pair of observers:

$$(u_A : (Id \times V) \to \mathcal{P}(Id \times U),\ \ v_A : (Id \times U) \to \mathcal{P}(Id \times V)) \qquad (3.67)$$

such that

Law 1. For all objects $o_U, o_V$ of classes $U$ and $V$, $o_U \in u_A(o_V) \Leftrightarrow o_V \in v_A(o_U)$,

where $U$ and $V$ are the state spaces of coalgebras $(U, \alpha : U \to T_U\,U, u_0)$ and $(V, \beta : V \to T_V\,V, v_0)$ corresponding to the semantics of classes $U$ and $V$. Therefore, the semantic function of an association is given as the pair of observers in (3.67) which satisfies Law 1, i.e.,

$$\mathcal{S}[\![U\,{}^{u_A}\!\!-\!\!\!-\!\!{}^{v_A}\,V]\!] \triangleq \{(u_A, v_A) \mid \text{Law 1}\}$$
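Law 1 states that the two observers are two views of the same set of links. A minimal Python sketch of this reading follows, assuming objects are encoded as (identifier, state) pairs and observers as dictionaries; the function names and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Obj = Tuple[str, str]  # (identifier, state): an element of Id x U or Id x V


def observers_from_links(links: Iterable[Tuple[Obj, Obj]]):
    """Derive the observer pair (u_A, v_A) from a set of links (o_U, o_V)."""
    u_A: Dict[Obj, Set[Obj]] = {}
    v_A: Dict[Obj, Set[Obj]] = {}
    for o_u, o_v in links:
        u_A.setdefault(o_v, set()).add(o_u)  # u_A : Id x V -> P(Id x U)
        v_A.setdefault(o_u, set()).add(o_v)  # v_A : Id x U -> P(Id x V)
    return u_A, v_A


def law1_holds(u_A, v_A, objs_U, objs_V) -> bool:
    """Law 1: o_U in u_A(o_V) if and only if o_V in v_A(o_U)."""
    return all(
        (o_u in u_A.get(o_v, set())) == (o_v in v_A.get(o_u, set()))
        for o_u in objs_U
        for o_v in objs_V
    )


# Hypothetical snapshot: two students enrolled in one course.
s1, s2, c1 = ("s1", "enrolled"), ("s2", "enrolled"), ("c1", "open")
student, course = observers_from_links({(s1, c1), (s2, c1)})
consistent = law1_holds(student, course, [s1, s2], [c1])

# Dropping one direction of a link breaks Law 1.
course_broken = {o: cs for o, cs in course.items() if o != s2}
inconsistent = law1_holds(student, course_broken, [s1, s2], [c1])
```

Observers derived from a single link set satisfy Law 1 by construction; the second check shows how an independently maintained observer can violate it.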
In UML each association end has a multiplicity constraint (possibly left unspecified) which is "a subset of the open set of non-negative integers". Should multiplicities be explicitly given in the diagram, the semantics of an association becomes

$$\mathcal{S}[\![U\,{}^{u_A}_{m_U}\!\!-\!\!\!-\!\!{}^{v_A}_{m_V}\,V]\!] \triangleq \{(u_A, v_A) \mid \text{Law 1} \wedge (\forall o_U : Id \times U.\ card(v_A(o_U)) \in m_V) \wedge (\forall o_V : Id \times V.\ card(u_A(o_V)) \in m_U)\}$$

3.4.1.3. Generalization.
Generalization in a class diagram describes the inheritance relationship between a general class $D$ (the superclass) and a more specialized class $C$ (the subclass), a fact represented as $C \mathrel{\triangleleft\!\!-} D$. Such an inheritance relationship between $D$ and $C$ is witnessed by a functor $G : \mathrm{Coalg}(D) \to \mathrm{Coalg}(C)$ between the corresponding categories of models as in [28]. Let $(T_{pub}, T_{pro}, T_{pri})$ and $(T'_{pub}, T'_{pro}, T'_{pri})$ be tuples of functors representing the signature of the public, protected and private parts of a superclass $C$ and its subclass $C'$ respectively. By definition of the keywords public, protected and private in [63], it is obvious that all $T_{pub_i}$ ($T_{pro_i}$) can be found as components of $T'_{pub}$ ($T'_{pro}$) (see footnote k). Consequently, two projections $p_{pub} : T'_{pub} \Rightarrow T_{pub}$ and $p_{pro} : T'_{pro} \Rightarrow T_{pro}$ can be identified as obvious natural transformations. The following definition gives a special kind of morphism, called inheritance morphism, between a class and its superclass.

Definition 3.14. Suppose class specification Spec' inherits from Spec, and consider two coalgebras $c$ and $c'$ as models of Spec and Spec' respectively. Then an inheritance morphism from $c'$ to $c$ is a tuple $(G, p_{pub}, p_{pro})$, such that all states in $U'$ are mapped by $G$ to the states in $U$, $G\,u'_0 = u_0$ and the following diagram commutes.

[Commuting diagram relating the transition structure of $c'$, from $U'$ to $T'_{pub}(U')$ and $T'_{pro}(U')$, to that of $c$, from $U$ to $T_{pub}(U)$ and $T_{pro}(U)$, via $G$, the projections $p_{pub_{U'}}$ and $p_{pro_{U'}}$, and the maps $T_{pub}(G)$ and $T_{pro}(G)$.]

Footnote k. This means that all the public and protected attributes and methods in the superclass can be found in its subclass, with identical or overloaded definition. Moreover, subclasses may have additional public and protected methods.
where $G$ is the (forgetful) functor between the model categories of the two class specifications. We may now define the semantics of a generalization relationship in UML class diagrams as all the possible inheritance morphisms between the models of the corresponding class specifications, i.e.,

$$\mathcal{S}[\![C \mathrel{\triangleleft\!\!-} D]\!] \triangleq \{g : d \to c \mid d \in \mathrm{Coalg}(D) \wedge c \in \mathrm{Coalg}(C)\}$$

where $g$ is the inheritance morphism from $d$ to $c$.

[Figure: class Student (name: String, studentID: Number) linked to class Course (name: String, courseID: Number) by the association Takecourse (multiplicity *); class GradStudent (tutor: Professor) is a subclass of Student.]
Fig. 3.2. Association being inherited
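The conditions on an inheritance morphism (states are mapped to states, the initial state is preserved and the public observations commute) can be tested on finite models. The Python sketch below is a simplification under explicit assumptions: the public signature is taken to be $T_{pub}(X) = A \times X$ (one observer and one method), the projection is assumed to be already applied to the subclass observer, and the models of Student and GradStudent used here are hypothetical.

```python
from itertools import product


def is_inheritance_morphism(G, sub, sup) -> bool:
    """Check that G maps a subclass model to a superclass model.

    Each model is a dict with 'states', 'init', 'obs' (public observer) and
    'step' (public method). This is a simplified, finite reading of the
    commuting condition, not the general definition.
    """
    if G(sub["init"]) != sup["init"]:          # G u'0 = u0
        return False
    return all(
        G(u) in sup["states"]                          # states map to states
        and sup["obs"](G(u)) == sub["obs"](u)          # observations agree
        and sup["step"](G(u)) == G(sub["step"](u))     # transitions commute
        for u in sub["states"]
    )


# Hypothetical models: a student's state is the number of enrolled courses
# (capped at 2); a graduate student additionally records a tutor.
student = {
    "states": {0, 1, 2},
    "init": 0,
    "obs": lambda n: n,
    "step": lambda n: min(n + 1, 2),
}
grad = {
    "states": set(product({0, 1, 2}, {"t1", "t2"})),
    "init": (0, "t1"),
    "obs": lambda s: s[0],
    "step": lambda s: (min(s[0] + 1, 2), s[1]),
}

forget_tutor = lambda s: s[0]       # drops the subclass-only component
valid = is_inheritance_morphism(forget_tutor, grad, student)
invalid = is_inheritance_morphism(lambda s: 0, grad, student)
```

The constant map fails because it preserves the initial state but not the public observations.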
Associations are inherited by subclasses. In the example of Figure 3.2 an association of class Student should remain applicable when objects of class Student are substituted by objects of class GradStudent. Suppose the representation of association Takecourse is given by a pair of functions $student$ and $course$. Then, we define another pair of functions to represent the inherited association between classes GradStudent and Course:

$$student_{gs} : Course \to \mathcal{P}(GradStudent), \qquad course_{gs} : GradStudent \to \mathcal{P}(Course)$$

Its semantics is constrained by the semantics of the corresponding association on superclass Student, i.e.,

$$student_{gs}(c) = student(c)|_{GradStudent}, \qquad course_{gs}(gs) = course(g(gs))$$
where $student(c)|_{GradStudent}$ denotes the restriction of the set of students enrolled on course $c$ to the subset of graduate students in the same course, and $g$ is the inheritance morphism between classes Student and GradStudent. The semantic function is given as:

$$\mathcal{S}[\![B\,{}^{b}_{m}\!\!-\!\!\!-\!\!{}^{c}_{n}\,C \mathrel{\triangleleft\!\!-} D]\!] \triangleq \{((b, c), g, (b', d)) \mid (b, c) \in \mathcal{S}[\![B\,{}^{b}_{m}\!\!-\!\!\!-\!\!{}^{c}_{n}\,C]\!] \wedge g \in \mathcal{S}[\![C \mathrel{\triangleleft\!\!-} D]\!] \wedge (b', d) \in \mathcal{S}[\![B\,{}^{b'}_{m'}\!\!-\!\!\!-\!\!{}^{d}_{n'}\,D]\!] \wedge m' \subseteq m \wedge (\forall o_D = (id, u) : D.\ b'(o_D) = b((id, g(u)))) \wedge (\forall o_B : B,\ o_D = (id, u) : D.\ o_D \in d(o_B) \Rightarrow (id, g(u)) \in c(o_B))\}$$

where $g$ is the corresponding inheritance morphism for the generalization between $C$ and $D$. The predicate $m' \subseteq m$ specifies the restriction on the number of objects of class $B$ being linked to an object of class $D$ and class $C$ respectively. Furthermore, the substitutability property is also satisfied.

Generalization relationships organize classes into a lattice, with the most generalized class at the top of the hierarchy (possibly an abstract class). The meet and join operators are defined as the superclass and subclass (for multiple inheritance) of classes, respectively. An abstract class may not have direct instances. Therefore, it cannot be interpreted in the same way as concrete classes. However, from the generalization relationship between an abstract class and its subclasses, one may obtain its semantics as the smallest superclass of all of its subclasses (or the least upper bound in the lattice of classes). Translated to the language of category theory, this means that the semantics of an abstract class with respect to its subclasses is the colimit of the subclass coalgebras, i.e.,

$$\mathcal{S}[\![C\{Abstract\} \mathrel{\triangleleft\!\!-} \{C_1, C_2, \ldots, C_n\}]\!] = \mathrm{Colimit}_{\mathrm{Coalg}}(c_1, c_2, \ldots, c_n)$$
where $c_i$ are the coalgebras in $\mathcal{S}[\![C_i]\!]$ respectively. Two less common but useful notions are isRoot and isLeaf, which specify that a class may have no parents or no children, respectively. Especially in the presence of multiple, independent inheritance lattices, it is useful to designate the top and bottom of each hierarchy via the root and leaf classes. The semantics of root and leaf classes are defined as follows:

$$\mathcal{S}[\![C\{root\}]\!] \triangleq \mathcal{S}[\![C]\!] \wedge (\forall D.\ \mathcal{S}[\![D \mathrel{\triangleleft\!\!-} C]\!] = \emptyset)$$

$$\mathcal{S}[\![C\{leaf\}]\!] \triangleq \mathcal{S}[\![C]\!] \wedge (\forall D.\ \mathcal{S}[\![C \mathrel{\triangleleft\!\!-} D]\!] = \emptyset)$$

which are similar to the semantics of other classes, but for the condition that no superclass of a root class $C$ (and no subclass of a leaf class, respectively) appears in the class diagram. In a UML class diagram, a class may have more than one parent, a fact known as multiple inheritance. In such a situation the subclass has all the attributes, methods and associations of all the superclasses. We assume that multiple inheritance is well-formed, in the sense that an attribute or method with the same signature cannot be declared independently by several superclasses that do not inherit it from a
common ancestor. The semantics of such a class having all the properties of the superclasses is the lower bound of the corresponding sub-lattice of coalgebras. A lower bound is expressed in category theory as a universal property represented by a cone construction. Hence, if class $C$ multiply inherits from a set of classes $\{C_1, C_2, \ldots, C_n\}$, its semantics is given by

$$\mathcal{S}[\![C \mathrel{-\!\!\triangleright} \{C_1, C_2, \ldots, C_n\}]\!] \triangleq \mathrm{Cone}_{\mathrm{Coalg}}(c_1, c_2, \ldots, c_n)$$

where $c_i$ are the coalgebras in $\mathcal{S}[\![C_i]\!]$ and the arrows of the cone construction are the corresponding inheritance morphisms.

3.4.2. Object Diagrams and Class Diagrams
We are now able to characterize the semantics of UML class diagrams, building on the semantics of classes and inheritance presented before. An object diagram is a snapshot of the corresponding class diagram. It exhibits the objects and relationships populating a system at a given point in time. These objects and their relationships are defined by the semantics above. Thus, denoting the system state space by $\Sigma$, an object diagram represents a system state $\sigma \in \Sigma$ and can be seen as an instance of the corresponding class diagram. An element $\sigma \in \Sigma$ is interpreted as the product of the states of the different objects at the same point in time. Therefore, a class diagram can be specified as a coalgebra $c = (\Sigma, \alpha : \Sigma \to T\,\Sigma)$, where $\alpha$ describes all the possible transitions between system states. Formally,

Definition 3.15. The semantics of a class diagram CD is defined as a category denoted as Coalg(CD), whose objects are coalgebras $(\Sigma, \alpha : \Sigma \to T(\Sigma))$ where $\Sigma$ is the system state space which contains all the possible system states. $T$ is the tensor product composed of the signature functors of the component classes and associations in the class diagram, which describes the possible system state transitions and observations of the system states. The arrows are $T$-coalgebra morphisms between them.

The ultimate goal of a formal semantics is to allow rigorous reasoning and analysis. A typical problem in analysing UML descriptions is consistency checking of class diagrams. To end our discussion on class diagrams, let us illustrate, through a small example, how this can be achieved within the proposed coalgebraic semantics. Figure 3.3 is part of a class diagram for a library system. The diagram started with two classes Student and Book and an association which shows the relationship between students and books. Every student can borrow at most 5 books at the same time. Later another class GradStudent is added to the diagram, intended to be a subclass of class Student.
It is stipulated that every graduate student can borrow at most 10 books at the same time. From the semantic definition of association inheritance we can easily derive that this diagram is inconsistent, as

$$\mathcal{S}[\![Book\,{}^{loans}\!\!-\!\!\!-\!\!{}^{borrower}\,Student \mathrel{\triangleleft\!\!-} GradStudent]\!] \Rightarrow 0..10 \subseteq 0..5 \Leftrightarrow False$$
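The inconsistency boils down to a failed inclusion test between two multiplicity ranges. A small Python sketch of that test follows; the textual encoding of multiplicities ("0..5", "1..*", "*") and the function names are assumptions made for illustration only.

```python
from typing import Optional, Tuple

Range = Tuple[int, Optional[int]]  # (lower bound, upper bound or None for '*')


def parse_multiplicity(text: str) -> Range:
    """Parse '0..5', '1..*', '3' or '*' into a (low, high) pair."""
    if text == "*":
        return (0, None)
    if ".." in text:
        low, high = text.split("..")
        return (int(low), None if high == "*" else int(high))
    n = int(text)
    return (n, n)


def is_subrange(inner: Range, outer: Range) -> bool:
    """The condition m' subset-of m used for inherited associations."""
    (l1, h1), (l2, h2) = inner, outer
    if l1 < l2:
        return False
    if h2 is None:          # outer range is unbounded above
        return True
    return h1 is not None and h1 <= h2


# Library example: GradStudent declares loans 0..10, Student declares 0..5.
library_consistent = is_subrange(parse_multiplicity("0..10"),
                                 parse_multiplicity("0..5"))
# A tighter bound on the subclass would be acceptable.
tighter_ok = is_subrange(parse_multiplicity("0..3"), parse_multiplicity("0..5"))
unbounded_ok = is_subrange(parse_multiplicity("0..10"), parse_multiplicity("*"))
```

The first check evaluates to `False`, which corresponds to the inconsistency derived for the library diagram.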
[Figure: class Student (name: String, studentID: Number) is associated with class Book (name: String), with role borrower (multiplicity 0..1) and role loans (multiplicity 0..5); class GradStudent (tutor: Professor), a subclass of Student, is associated with Book with role borrower (0..1) and role loans (0..10).]
Fig. 3.3. An inconsistent class diagram for a library system

3.4.3. Beyond Class Diagrams
3.4.3.1. Use Cases. In UML, use cases describe functional requirements. A use case is defined as "a sequence of transactions in a system, whose task is to yield a measurable value to an individual actor of the system" [35]. Thus, we can interpret a use case coalgebraically as a sequence of actions followed by some (possibly combined) observations. Single actions represent atomic use cases which change the system from one state to another. Such actions include creating new objects and deleting old objects, forming or deleting links between objects, or modifying attribute values of objects. The signature of an atomic use case is again defined by a functor $T$. Let $\Sigma$ be the system state space, as defined in the semantics of class diagrams. Then a use case is interpreted as a function $uc : \Sigma \to T(\Sigma)$.

An example should demonstrate the coinductive formalization of use cases. Consider a use case BB describing the process of borrowing a book in a library (see [82] for the complete example) as specified in Figure 3.4. Let $\Sigma$ be the state space of the system. Then the behavior of BB can be defined by a coalgebraic function $bb : \Sigma \to \Sigma^{Student \times Book}$. If student $s_0$ borrows book $b_0$, then the observable effects can be specified coinductively:

$$pre \vdash loans(s'_0(bb(\sigma, s_0, b_0))) = loans(s'_0(\sigma)) \cup \{b_0\}$$
$$pre \vdash borrower(b'_0(bb(\sigma, s_0, b_0))) = s_0(\sigma)$$

where $s'_0, b'_0$ are observers on $\Sigma$ for obtaining the objects $s_0, b_0$, and $\sigma \in \Sigma$. Notice that the purpose of precondition $pre$ is to guarantee the multiplicity constraints in the class diagram. In addition to the system operations, an actor may also perform other actions to change the system state. So use cases are more generically specified by integrating the system actions and actor actions together. How can this be specified in our framework? The component calculus mentioned in Section 2 provides a set of combinators (notably, for sequential and parallel composition) which can be used to obtain more complex use cases by composing atomic ones.
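To make the two observable effects of $bb$ concrete, the following Python sketch models a system state as a pair of dictionaries. It is an illustration under assumptions not fixed by the text: the precondition is read as "the book is not lent and the student holds fewer than 5 books" (the 0..5 bound of the library example), and a failed precondition is assumed to leave the state unchanged.

```python
MAX_LOANS = 5  # upper bound of the 'loans' multiplicity 0..5 (library example)


def pre(state, student, book) -> bool:
    """Assumed precondition: the book is free and the loan limit is not reached."""
    not_lent = state["borrower"].get(book) is None
    below_limit = len(state["loans"].get(student, frozenset())) < MAX_LOANS
    return not_lent and below_limit


def bb(state, student, book):
    """Atomic 'borrow book' action: returns the successor system state.

    If the precondition fails, the state is returned unchanged (an assumption;
    the text only constrains the effects when 'pre' holds).
    """
    if not pre(state, student, book):
        return state
    loans = dict(state["loans"])
    loans[student] = loans.get(student, frozenset()) | {book}
    borrower = dict(state["borrower"])
    borrower[book] = student
    return {"loans": loans, "borrower": borrower}


# Hypothetical initial state: student s0 has no loans, book b0 is available.
sigma = {"loans": {"s0": frozenset()}, "borrower": {"b0": None}}
after = bb(sigma, "s0", "b0")

# The two observable effects: loans grows by {b0}, borrower of b0 is s0.
loans_effect = after["loans"]["s0"] == sigma["loans"]["s0"] | {"b0"}
borrower_effect = after["borrower"]["b0"] == "s0"

# A second attempt to borrow the same book violates the precondition.
again = bb(after, "s1", "b0")
```

States are copied rather than mutated, so the original state remains available for comparing observations before and after the action.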
For example, use case BB is defined as a sequential composition of the atomic actions corresponding to Events 1 and 2, another use case for returning a book, corresponding to Event 3, when the condition is true, and Event 4.

Use Case: Borrow Book
Actors: Reader, Librarian
Precondition: The book can be borrowed.
Flow of Events:
1. The use case begins when the reader chooses a book that is not already lent;
2. The librarian checks whether the reader is allowed to borrow any more books;
3. If the number of books borrowed by the reader has reached the upper bound, Include Return Book;
4. The librarian assigns the reader as the borrower of the book and states a deadline for returning the book.
Postcondition: The reader has successfully borrowed the book.
Fig. 3.4. Borrow Book use case

3.4.3.2. Statechart Diagrams. In UML, statechart diagrams describe the dynamic behaviour of systems. Several formal semantics for statechart diagrams have been proposed previously. For example, in [47] input-output labeled transition systems are used as the semantic domain. Our aim is to give a coalgebraic characterization of statecharts, so that consistency checks between class diagrams and statechart diagrams become proofs that the coalgebra (statechart) is a model of the corresponding coalgebraic specification (class diagram). Similarly, refinement (implementation) relations can be defined ranging over different view models. Because of the hierarchical structure of statecharts, we will endow the set of configurations with a coalgebraic structure induced by the operational semantics rather than simply construct the coalgebraic structure over the set of states. Functor $T$ captures the shape of such a coalgebra:

$$T(X) = B(X \times \mathcal{P}E)^E$$

where $E$ denotes the set of all events (notice that events, at this stage, may have parameters rather than remaining just primitive signals). $B$ is a strong monad which specifies the behaviour pattern of the statechart (typically the powerset monad). For a given statechart $SC$, let $CF$ be the set of configurations and $(CF, \alpha : CF \to T(CF))$ the corresponding $T$-coalgebra. Hence, the behaviour of $SC$ is given by

$$[\![SC]\!] = \lambda s : CF.\ \alpha(s)$$

3.4.4. Discussion
This section introduced a coalgebraic semantics for UML class diagrams. Classes, associations and generalizations in such a diagram were mapped to coalgebraic specifications in the sense of [33]. As a consequence, UML class diagrams can be used to specify the kind of components discussed in the previous section. In such a coalgebraic interpretation of UML, a class forms a category of coalgebras. The shape (functor) of this coalgebra is defined by the attribute and method declarations. Associations add pairs of observers to this shape, and generalisations define inheritance morphisms between the coalgebras of two classes.
Our overall aim, however, is to resort to coalgebraic techniques to integrate the semantics of both static and dynamic aspects of UML. Current work, which for reasons of space was only glimpsed here, includes the coalgebraic modelling of use cases and statechart diagrams (see [83, 85] for some preliminary results), putting forward suitable notions of consistency and refinement. However, having a formal semantics is not enough. The next step will involve research on applications of the semantics. In particular, we intend to develop a use-case driven method for designing components with the help of UML diagrams.
3.5. Application to the Design of Component Repositories

3.5.1. Motivation

Section 3.3 introduced a formal model for (state-based) components as pointed coalgebras for functor

$$T^B = B(Id \times O)^I \qquad (3.68)$$
and a calculus to reason about their composition. The natural, challenging question is then: how can such a calculus be used in the practice of software engineering? What would be suitable tool support for these ideas? As a first attempt to answer these questions, a prototype for the component calculus was developed in which component specifications written in the VDM meta-language [20, 36] could be registered and new components assembled using the combinators defined in Section 3.3. Although this had some potential to validate the calculus, the need soon became apparent to deal directly with software components developed on different platforms and often documented only at the type-signature level typical of most API (Application Program Interface) descriptions.

From an engineering point of view, a key issue for the success of the component paradigm is integration, i.e., how to cope in practice with heterogeneity, given the proliferation of competing 'standards', component frameworks and development languages. In fact, unless one restricts oneself to a particular framework or, even worse, to a specific tool within a framework, it is no easy task to identify, select, reuse and compose heterogeneous software components from a virtual 'global market' (as represented, for example, by the Internet).

This section reports on a concrete approach to such a component integration problem found in the context of a European-wide industrial software development project. The challenge was to build a component repository (of both source code and interface information) able to register and market all the components produced by different teams in the project. The repository was also supposed to act as an exchange market: users would be able not only to register their own components, but also to announce plans to issue specific components and even to publicise their needs and associated requirements in the form of collections of APIs. Although such component marketplaces are emerging on the Web, the underlying description techniques remain rather informal, mainly textual, with an additional, limited classification in terms of the intended business areas.
Initial prospects of implementing the repository as an API database relying on text-based retrieval methods for querying soon proved impractical, given the heterogeneity of the APIs supplied by the different project teams. As the complexity of the individual components and the size of the repository increased, higher levels of abstraction were required. Again the VDM meta-language was chosen as the common language to which different sorts of APIs were mapped. A complete, working prototype of the repository, named COMPSRC to express the idea of a 'component source indexing space', has been developed in the VDM TOOLBOX, complemented with a dedicated web interface for easier access.

This repository differs in a number of ways from the initial prototype for the component calculus. First of all, the need was to deal not with formal specifications of coalgebras for functor (3.68), but with APIs and even source code developed on different platforms. The specification was not given, but had to be derived, or at least some information on the involved types, their structure and service signatures. Moreover, some APIs referred to state-based components in which an underlying state space and a collection of methods could be identified, while others were simply given as a functional library. It was also realized that often one is more interested in composing particular services of several components, to obtain more complex functionality, than in assembling two or more components in their entirety. In the sequel the architecture and functionality of this system are detailed.

3.5.2. The COMPSRC Architecture
The COMPSRC repository is built around a broad characterisation of what an interface to component source code is, specified in the VDM meta-language in the early project stages. This is called, in the context of this system, the abstract API. The component source code database (CS), which stores source code or language dependent APIs, is indexed by such abstract APIs, which are 'extracted' from the actual component APIs. This is done by extractor functions, e^, specified for the typical target languages used in the project. The other element of COMPSRC is the component calculator, Cc, which computes new services based on the automatic composition of existing functionality, according to more or less strict interaction patterns. It then builds new APIs and new components which act as if the actual source code had been merged. The interaction patterns available are different kinds of pipelining (linear, flattened or monadic, as described below) as well as multiplicative and additive aggregation, corresponding to component parallel composition $\boxtimes$ and choice $\boxplus$, respectively. Once a new component is built by the calculator, the new abstract API is registered in the system and the component 'combined' code is supplied, in the form of a conditional compiling script for .NET.

3.5.3. Interfaces
The 'quest' for a common abstract API format started from a reverse specification of the (static information part of) different sorts of component APIs used in the
project. These ranged from the 'object-oriented style', emphasising hierarchical class structures, to 'functional modules' described by service signatures and, possibly, datatype constraints. The formal analysis of typical API examples in JAVA, CORBA, MICROSOFT VISUAL BASIC, HASKELL and WSDL, which is documented in [68], led to the description of a component as an indexed collection of services (referred to as modules in the sequel). Each module interface is, therefore, a (VDM) structure extracted from a component's source code or submitted API, collecting general context information (such as a textual description, imported services, development stage (e.g., requested, delivered, legated, generated, etc.), source platform and use examples), datatypes specified by polynomial structures, and service signatures. Each service is registered either as a function $f : I \to O$ or a method $m : I \times U \to O \times U$, for $U$ the underlying state space identified in the extraction process. It is classified as simple if $I$ and $O$ are primitive types at the programming language level. However, as discussed earlier in this chapter, the output of a service may be observed in richer contexts (e.g., $f : I \to O + 1$, to express partiality, or $f : I \to \mathcal{P}O$, to cater for non-determinism). In a sense this "pushes" the behaviour models discussed in Section 3.3 to the service interface level. Therefore, each service is represented in an interface either as a function $I \to B\,O$ or a method $U \times I \to B(O \times U)$, where $I$ ($O$) are tuples of input (output) primitive types embedded in a behavioural context $B$.

The basic constructors for such contexts have already been met in this chapter: product ($A \times B$) for aggregation in the spatial axis; sum ($A + B$) for choice (i.e., aggregation in the temporal axis); exponentiation, or function space, ($A^B$) for functional dependence; constants, like the exception type $1$ or, in general, any primitive type; and, finally, powerset ($\mathcal{P}A$) and sequence space ($A^*$) to cope with non-deterministic and deterministic collections of data, respectively. Note that these constructors can either be found almost directly in high-level languages (such as, e.g., HASKELL or the VDM meta-language) or inferred in a systematic way from other programming notations. Functors built from (and closed by) such constructors plus functor composition are called (extended) polynomial and are extensively used in the repository to record functions' signatures. Informally, $B$ can be thought of as a type transformer providing a shape for the output of $f$ or $m$. This entails the need to equip COMPSRC with the ability to compare the functors involved in the datatypes of service signatures. In practice, functor (structural) equality and functor instantiation order are achieved by specific comparison functions.
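The constructors listed above can be encoded as a small term language on which structural equality is immediate. The Python sketch below is an assumed encoding, not the VDM datatype used in COMPSRC: the class names, the helpers `partial`, `nondet` and `function_sig`, and the sample signatures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:          # a primitive type, or the exception type 1
    name: str


@dataclass(frozen=True)
class Prod:           # A x B
    left: object
    right: object


@dataclass(frozen=True)
class Sum:            # A + B
    left: object
    right: object


@dataclass(frozen=True)
class Exp:            # base^exponent, i.e. functions exponent -> base
    base: object
    exponent: object


@dataclass(frozen=True)
class Pow:            # powerset P A
    arg: object


# Behavioural contexts B seen as type transformers on the output type.
def partial(t):       # O + 1: the service may fail
    return Sum(t, Const("1"))


def nondet(t):        # P O: the service may return several results
    return Pow(t)


def function_sig(B, i, o):
    """A service I -> B O, recorded as the exponential (B O)^I."""
    return Exp(B(o), i)


# Two descriptions of the same partial service, built independently.
sig_a = function_sig(partial, Const("Int"), Const("String"))
sig_b = Exp(Sum(Const("String"), Const("1")), Const("Int"))
same_shape = sig_a == sig_b          # structural equality via dataclass __eq__

sig_c = function_sig(nondet, Const("Int"), Const("String"))
different_context = sig_a == sig_c   # partial vs non-deterministic output
```

Frozen dataclasses compare field by field, which gives structural equality for free; an instantiation order would need an additional, separately defined comparison.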
3.5.4. Component Assembly

For each API submitted to COMPSRC a corresponding abstract API (i.e., a value of the respective VDM datatype) is built by the (language-dependent) interface extractors mentioned above. Such abstract APIs provide the basis for classifying, locating and retrieving components from the repository. Furthermore they become the 'raw material' used by the component calculator to generate new components. Such generation is done in two different, but complementary, ways: aggregation and wiring.
Component aggregation is achieved by the application of a small set of operators acting on the abstract interfaces as a whole. In particular, they model:
• Interface restriction to a set of modules or even, within a particular module, to a set of services.
• Interface renaming.
• Additive aggregation, in which different modules coming from two different interfaces are selected and packaged into a new one. If both interfaces index actual code written in compatible implementation languages (i.e., related by an embedding) the code generation process is activated.
• Multiplicative aggregation, corresponding to the synchronous execution of modules in two different components.

The semantics of these operators is, in all cases, the one given in Section 3.3 if whole components are combined, or its restriction to the level of a service (regarded as a one-operation component in its own right) if only service combination is required.

The wiring process, on the other hand, is based on the search for composition possibilities among the collections of functions of a specified set of components. Such a search can be systematic, exposing all possible connections arising from a given set of components, or user-oriented, in which case each possible composition is validated or discarded by the user. In any case the problem is to identify pairs of functions, in different components, whose range and domain match, according to some matching criteria detailed below. Note that the collection of functions in a component abstract API models the available fine-grain services. Based on the comparison functions mentioned above, the repository is able to perform different types of 'functional' composition, besides the obvious one between functions sharing the domain of one with the codomain of the other. In particular the following 'extra' pipelining composition patterns are considered:
• Curry insensitive, which allows curry or uncurry operations to be performed on a pair of otherwise non-composable services.
• Monadic, whenever the context information is captured by a monad (which, as shown in [9], is often the case). Monadic composition includes both the usual Kleisli composition of monadic functions used in functional programming, and a simpler scheme based on monadic embedding of a 'plain' function.

3.5.5. A Small Example
A small example may illustrate the identification of composable functions in two rather different components. Despite its simplicity, it should be stressed that it is a real example which emerged in the context of the project within which COMPSRC has been developed. It is fully documented in [68], to which the interested reader is referred. The repository contained two components developed, with different technologies, by different teams. The first was a VDM-SL specification of a robot which manages box storing inside a generic warehouse. The component provided functionality to, e.g., find the best fit of a box inside the warehouse storing space, remove a box, rearrange the warehouse in order to get the biggest amount of free space, etc.
The second component was a web-publisher generator consuming any sort of information organised as a leaf tree (i.e., a binary tree with all information stored in the leaves), developed in HASKELL. It provided, in particular, a function y2html to generate the HTML representation of a leaf tree value. Applying some of the composition test suites defined in COMPSRC leads to the identification of several possible composition patterns between (the abstract interfaces of) these two distinct modules. In particular, the repository was able to determine, in an automatic way, that the polynomial functors underlying data type Y in the HASKELL component and data type Space in the VDM-SL component were structurally identical: both were instances of a leaf tree. Note that the type information represented as

    Y : DataType = mk-Plus([mk-("Leaf", Unit), mk-("Node", mk-Times([REC, REC]))]);

was extracted from both the HASKELL code

    data Y a b i = Leaf (Unit a b) | Node (Y a b i, Y a b i) deriving Show

and the VDM-SL API,

    Space : DataType = mk-Plus([mk-("Left", Box), mk-("Right", mk-Times([REC, REC]))]);
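The structural comparison described above can be sketched in HASKELL as follows. This is only an illustration under stated assumptions, not the COMPSRC implementation: the type Shape, the function sameShape, the type LeafTree (a simplification of Y that drops its extra type parameters), the HASKELL rendering of Space with constructors SLeft and SRight, and the placeholder type Box are all hypothetical names introduced here. The sketch abstracts from constructor names and from the type stored at the leaves, which is what makes the two signatures coincide.

```haskell
-- Signature of a polynomial data type, abstracting from constructor names
-- and from the concrete type stored at the leaves (all names hypothetical).
data Shape
  = Param          -- a constant (non-recursive) component, e.g. Unit or Box
  | Rec            -- a recursive occurrence of the type being defined
  | Plus [Shape]   -- choice between alternatives (sum)
  | Times [Shape]  -- tuple of components (product)
  deriving (Eq, Show)

-- Shape read off from:  Leaf (Unit a b) | Node (Y a b i, Y a b i)
shapeY :: Shape
shapeY = Plus [Param, Times [Rec, Rec]]

-- Shape read off from:  mk-Plus([mk-("Left", Box), mk-("Right", mk-Times([REC, REC]))])
shapeSpace :: Shape
shapeSpace = Plus [Param, Times [Rec, Rec]]

-- Assumed stand-in for the comparison step: plain structural equality.
sameShape :: Shape -> Shape -> Bool
sameShape = (==)

-- Simplified leaf tree: information lives only in the leaves.
data LeafTree a = Leaf a | Node (LeafTree a, LeafTree a)
  deriving (Eq, Show)

-- Placeholder for the warehouse box type of the VDM-SL component.
newtype Box = Box String
  deriving (Eq, Show)

-- Hypothetical HASKELL rendering of the VDM-SL type Space.
data Space = SLeft Box | SRight (Space, Space)
  deriving (Eq, Show)

-- Because the shapes coincide, values convert in both directions
-- by renaming constructors only.
toLeafTree :: Space -> LeafTree Box
toLeafTree (SLeft b)       = Leaf b
toLeafTree (SRight (l, r)) = Node (toLeafTree l, toLeafTree r)

fromLeafTree :: LeafTree Box -> Space
fromLeafTree (Leaf b)      = SLeft b
fromLeafTree (Node (l, r)) = SRight (fromLeafTree l, fromLeafTree r)
```

Since toLeafTree and fromLeafTree are mutually inverse, any function consuming leaf trees can, in this simplified setting, be applied to Space values by precomposing it with toLeafTree.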
This fact opened the possibility of composing function y2html in the HASKELL component with any function returning values of type Space in the VDM-SL component. Such was the case of, e.g., functions freeSpace, defragment and whichBoxes. The composition of y2html with any of these functions provided, for free, generators of HTML interfaces for the warehouse VDM prototype. Such wiring possibilities were detected by the application of function compareFunctor (in COMPSRC), whose result, in the VDMTOOLBOX syntax, is

    { mk_("y2html", {"freeSpace", "defragment", "whichBoxes"}) }

This identifies a functional wiring scheme between y2html, on the one hand, and functions freeSpace, defragment and whichBoxes, on the other. It also indicates the order of application for this interaction: y2html follows freeSpace, defragment or whichBoxes.

3.5.6. Discussion
The basic lesson learnt from the development of COMPSRC is the potential of formal, model-oriented methods in guiding the design of such platforms. We believe this exercise can be further extended to cope with some issues not covered in the present
version, namely dynamic instantiation of components, location and mobility. Those are fundamental issues for modelling distributed component frameworks. The COMPSRC prototype has been used not only in the context of the project in which it was originally developed, but also to organise software components arising from a massive re-engineering effort of legacy code undertaken by a software company. This architectural re-engineering effort aimed to identify service components orthogonal to the basic development layers considered in the design practice (i.e., database definition, middleware and GUI). For each identified component, an abstract interface, as described above, was written and directly submitted to COMPSRC, together with a .NET script to navigate the (monolithic) legacy code (which remains unchanged) and generate the actual executable code corresponding to the new abstract API. In a subsequent stage, new software products incorporating the recovered components were generated within COMPSRC. Future work on the COMPSRC prototype is foreseen in two main directions:
• the development of a web-based implementation of COMPSRC enabling component runtime composition, and
• its extension to cope with component heterogeneity not only at the 'linguistic' and component-style levels (e.g., the integration of object and functional models), but also at the level of the interaction style (aiming at the integration of, e.g., method invocation, dataflow stream processing and event-based interaction) (see [50] for work in a similar direction).
3.6. Conclusions and Further Work

Summary. In this chapter we presented our coalgebraic viewpoint on components. After a brief introduction to the theory of coalgebras, we presented a coalgebraic semantics for software components. The semantics uses coalgebras to describe the possible external behaviour of a component. Interface descriptions are captured by functors, giving us very general operators that form an elegant component calculus. Applications to UML and to a component repository served to put the framework into a broader context.

Discussion. The motivation for our coalgebraic semantics of components is twofold: (1) the black-box characterisation of a component via its possible observable behaviour can be nicely captured in coalgebra theory; (2) the use of functors for describing the shape of a component interface enables us to provide generic constructions that are independent of a particular type of interface. We think that this generality is worth the price of using the 'heavier' mathematical apparatus of coalgebra theory. The notion of a component discussed here, stemming from the context of model-oriented specification methods, is characterised by the presence of internal state and by an interaction model which reflects the asymmetric nature of input and output. Since the behaviour semantics of each component can be described by the corresponding (final) coalgebra, the semantics of a larger system can be
determined by the operations in the category of coalgebras which give the semantics of the interactions among its components and by the coalgebras of the components, that is, by the semantics of its syntactic constituents. This approach captures the compositionality and structural transparency of systems: a system composed of components and connections among them can again be considered as a component and connected to other components, whereby the internal state space of the system is again hidden from the outside.

Related Work. The bicategorical setting adopted is indebted to previous work by R. Walters and his collaborators on models for deterministic input-driven systems [38, 39]. However, whereas R. Walters' work deals essentially with deterministic systems, our monadic parametrisation allows us to focus on the relevant structure of components, factoring out details about the specific behavioural effects that may be produced. The hook and feedback combinators (not discussed here, but see [11, 82]) and tensors are also new. Also close to our modelling approach is [42], which proposes an axiomatization of what is called a 'notion of a process' in a monoidal category. This work, however, covers neither the definition of generic combinators nor the development of an associated calculus.

Others are working on component calculi using different semantical frameworks. Here, we only mention two promising approaches that may inspire our future work. He et al. developed a component calculus based on the Unifying Theories of Programming (UTP) [26, 49]. Their motivation is the contract-oriented development of component software. Therefore, component services are modeled as UTP designs. In contrast to our calculus, their interfaces have a fixed object-oriented shape, consisting of field and method declarations; interfaces can be composed and inherited. Each method is specified via a first-order predicate over its interface variables.
A component is basically represented as a contract consisting of the interface, the method specifications and the initial values for the fields. The contracts are composable using a merge and a nondeterministic choice operator. Refinement of components is straightforward, since it is based on UTP's refinement calculus of designs.

Another component framework worth mentioning is the Focus theory of Broy et al. [19]. Focus is based on a component model where components are interconnected via (timed) streams of data. Here an interface is a set of port declarations defining the name and the possible set of data communicated over a port. The specification style in Focus has a flavour of functional programming, since list (stream) operators are used extensively to define the relation between input and output ports. In contrast to our coalgebraic approach, Focus is algebraic. The theory is well integrated with standard graphical description techniques, like data-flow diagrams, message sequence charts and state transition diagrams. Various forms of composition, as well as refinement, are defined too. With its Autofocus CASE tool, Focus is perhaps the most advanced formal component technology with respect to automation and tool support.

Future Work. Our present work is concerned with refinement, at both the interface and the behavioural levels, and tuning of software components to particular
'use contexts'. This last aspect may become relevant in the context of the second author's work on a unified coalgebraic semantics for the UML [71]. Tuning deals with designs in which it is required that a particular component be used in a restricted way, namely as part of a broader system. This entails the need for a specification of the intended behaviour, which is not intrinsic to the component itself, but to its role (use) in a particular situation. For example, one may want to prescribe that action a is the initial action or that an action b is to follow each occurrence of a. Such a distinction is totally absent from model-oriented specification methods, often leading to undesirable overspecification. In process calculi, on the other hand, it may be traced back to Milner's distinction between static and dynamic process connectives, the latter being understood as the source of temporal extension. In CCS, 'prefixing' is the typical example of a dynamic connective. Our component algebra lacks such an operator, as we are dealing with concrete coalgebras instead of pure behaviours [10]. Notice, on the other hand, that 'choice', which is also a dynamic operator in process calculi, is treated, at component level, as an aggregation combinator.

A further thread of future work is the development of a coalgebraic testing theory. Aichernig is currently working on a unifying theory of testing that relates testing theories on process algebras to model-oriented and algebraic testing techniques. Coalgebras seem to be perfectly suitable for such an endeavour: (1) they allow us to represent and reason about the observation space; (2) the process algebraic theory would be just a special case of the functorial theory expressing the testing criteria as universal properties. The result would be a component-based testing theory, with tester components interacting with the components under test in order to check their conformance to a (coalgebraic) specification.
Acknowledgements. The contributions of Luis S. Barbosa and Nuno Rodrigues were funded by the Portuguese Foundation for Science and Technology, in the context of the PURE project, under contract POSI/ICHS/44304/2002. The work of Sun Meng was partially supported by the National Natural Science Foundation of China, under Grants No. 60273001 and 60473056. Comments by an anonymous referee on an earlier version of this chapter were greatly appreciated.
Bibliography

1. J. R. Abrial. The B Book: Assigning Programs to Meanings. Cambridge University Press, 1996.
2. P. Aczel and N. Mendler. A final coalgebra theorem. In D. Pitt, D. Rydeheard, P. Dybjer, A. Pitts, and A. Poigné, editors, Proc. Category Theory and Computer Science, pages 357-365. Springer Lect. Notes Comp. Sci. (389), 1988.
3. F. Arbab. Abstract behaviour types: a foundation model for components and their composition. In F. S. de Boer, M. Bonsangue, S. Graf, and W.-P. de Roever, editors, Proc. First International Symposium on Formal Methods for Components and Objects (FMCO'02), pages 33-70. Springer Lect. Notes Comp. Sci. (2852), 2003.
4. R. Backhouse. An exploration of the Bird-Meertens formalism. CS 8810, Groningen University, 1988.
5. R. C. Backhouse, P. Jansson, J. Jeuring, and L. Meertens. Generic programming: An introduction. In S. D. Swierstra, P. R. Henriques, and J. N. Oliveira, editors, Third International Summer School on Advanced Functional Programming, Braga, pages 28-115. Springer Lect. Notes Comp. Sci. (1608), September 1998.
6. L. S. Barbosa. Components as processes: An exercise in coalgebraic modeling. In S. F. Smith and C. L. Talcott, editors, FMOODS'2000 - Formal Methods for Open Object-Oriented Distributed Systems, pages 397-417. Kluwer Academic Publishers, September 2000.
7. L. S. Barbosa. Components as Coalgebras. PhD thesis, DI, Universidade do Minho, 2001.
8. L. S. Barbosa. Process calculi à la Bird-Meertens. In CMCS'01, Elect. Notes in Theor. Comp. Sci., volume 44.4, pages 47-66, Genova, April 2001. Elsevier.
9. L. S. Barbosa. Towards a Calculus of State-based Software Components. Journal of Universal Computer Science, 9(8):891-909, August 2003.
10. L. S. Barbosa and J. N. Oliveira. Coinductive interpreters for process calculi. In Proc. of FLOPS'02, pages 183-197, Aizu, Japan, September 2002. Springer Lect. Notes Comp. Sci. (2441).
11. L. S. Barbosa and J. N. Oliveira. State-based components made generic. In H. Peter Gumm, editor, CMCS'03, Elect. Notes in Theor. Comp. Sci., volume 82.1. Elsevier, 2003.
12. M. A. Barbosa and L. S. Barbosa. Specifying software connectors. In K. Araki and Z. Liu, editors, 1st International Colloquium on Theoretical Aspects of Computing (ICTAC'04), pages 53-68, Guiyang, China, September 2004. Springer Lect. Notes Comp. Sci. (3407).
13. J. Benabou. Introduction to bicategories. Springer Lect. Notes Maths. (47), pages 1-77, 1967.
14. J. van Benthem. Modal Correspondence Theory. PhD thesis, University of Amsterdam, 1976.
15. K. Bergner, A. Rausch, M. Sihling, A. Vilbig, and M. Broy. A Formal Model for Componentware. In Gary T. Leavens and Murali Sitaraman, editors, Foundations of Component-Based Systems, pages 189-210. Cambridge University Press, 2000.
16. R. Bird and O. de Moor. The Algebra of Programming. Series in Computer Science. Prentice-Hall International, 1997.
17. G. Booch, J. Rumbaugh, and I. Jacobson. The Unified Modeling Language User Guide. Addison-Wesley, 1999.
18. M. Broy. Semantics of finite and infinite networks of communicating agents. Distributed Computing, 2(1):13-31, 1987.
19. M. Broy and K. Stølen. Specification and Development of Interactive Systems: FOCUS on Streams, Interfaces, and Refinement. Springer, 2001.
20. J. Fitzgerald and P. G. Larsen. Modelling Systems: Practical Tools and Techniques in Software Development. Cambridge University Press, 1998.
21. D. Gelernter and N. Carriero. Coordination languages and their significance. Communications of the ACM, 35(2):97-107, February 1992.
22. J. Goguen, J. Thatcher, E. Wagner, and J. Wright. Initial algebra semantics and continuous algebras. Jour. of the ACM, 24(1):68-95, January 1977.
23. R. Grimes. Professional DCOM Programming. Wrox Press, 1997.
24. H. P. Gumm and T. Schroeder. Covarieties and complete covarieties. In B. Jacobs, L. Moss, H. Reichel, and J. Rutten, editors, CMCS'98, Elect. Notes in Theor. Comp.
Sci., volume 11. Elsevier, March 1998.
25. H. Peter Gumm. Elements of the general theory of coalgebras. Technical report, Lecture Notes for LUTACS'99, South Africa, 1999.
26. J. He, Z. Liu, and X. Li. Component-based software engineering. In Proc. ICTAC'2005, Lecture Notes in Computer Science 3722. Springer, 2005.
27. U. Hensel and H. Reichel. Defining equations in terminal coalgebras. In E. Astesiano, G. Reggio, and A. Tarlecki, editors, Recent Trends in Data Type Specification, pages 307-318. Springer Lect. Notes Comp. Sci. (906), 1995.
28. B. Jacobs. Mongruences and cofree coalgebras. In V. S. Alagar and M. Nivat, editors, Algebraic Methodology and Software Technology (AMAST), pages 245-260. Springer Lect. Notes Comp. Sci. (936), 1995.
29. B. Jacobs. Object-oriented hybrid systems of coalgebras plus monoid actions. In M. Wirsing and M. Nivat, editors, Algebraic Methodology and Software Technology (AMAST), pages 520-535. Springer Lect. Notes Comp. Sci. (1101), 1996.
30. B. Jacobs. Objects and classes, co-algebraically. In B. Freitag, C. B. Jones, C. Lengauer, and H.-J. Schek, editors, Object-Orientation with Parallelism and Persistence, pages 83-103. Kluwer Academic Publishers, 1996.
31. B. Jacobs. Objects and classes, co-algebraically. In B. Freitag, C. B. Jones, C. Lengauer, and H.-J. Schek, editors, Object-Orientation with Parallelism and Persistence, pages 83-103. Kluwer, 1996.
32. B. Jacobs. Behaviour-refinement of coalgebraic specifications with coinductive correctness proofs. In TAPSOFT'97: Theory and Practice of Software Development, pages 787-802. Springer Lect. Notes Comp. Sci. (1214), 1997.
33. B. Jacobs. Exercises in coalgebraic specification. In R. Backhouse, R. Crole, and J. Gibbons, editors, Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, pages 237-280. Springer Lect. Notes Comp. Sci. (2297), 2002.
34. B. Jacobs and J. Rutten. A tutorial on (co)algebras and (co)induction. EATCS Bulletin, 62:222-259, 1997.
35. I. Jacobson. Object oriented development in an industrial environment. In Norman K. Meyrowitz, editor, Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'87), October 4-8, 1987, Orlando, Florida, volume 22 of SIGPLAN Notices, pages 183-191, 1987.
36. Cliff B. Jones. Systematic Software Development Using VDM. Series in Computer Science. Prentice-Hall International, 1986.
37. P. Katis. Categories and Bicategories of Processes. PhD thesis, University of Sydney, 1996.
38. P. Katis, N. Sabadini, and R. F. C. Walters. Bicategories of processes. Journal of Pure and Applied Algebra, 115(2):141-178, 1997.
39. P. Katis, N. Sabadini, and R. F. C. Walters. On the algebra of systems with feedback and boundary. Rendiconti del Circolo Matematico di Palermo, II(63):123-156, 2000.
40. G. M. Kelly. Basic Concepts of Enriched Category Theory, volume 64 of London Mathematical Society Lecture Notes Series. Cambridge University Press, 1982.
41. A. Kock. Strong functors and monoidal monads. Archiv für Mathematik, 23:113-120, 1972.
42. S. Krstic, J. Launchbury, and D. Pavlovic. Categories of processes enriched in final coalgebras. In Proceedings of FOSSACS, pages 303-317. Springer Lect. Notes Comp. Sci. (2030), 2001.
43. A. Kurz. Specifying coalgebras with modal logic. In B. Jacobs, L. Moss, H. Reichel, and J. Rutten, editors, CMCS'98, Elect. Notes in Theor. Comp. Sci., volume 11. Elsevier, March 1998.
44. A. Kurz. Coalgebras and modal logic. Technical report, Lecture Notes for ESSLLI'2001, Helsinki, 2001.
45. A. Kurz. Logics for Coalgebras and Applications to Computer Science. PhD thesis, Fakultät für Mathematik, Ludwig-Maximilians Univ., Muenchen, 2001.
46. A. Kurz and D. Pattinson. Coalgebras and modal logic for parameterised endofunctors. Technical report, CWI Technical Report, SEN-R0040, 2000.
47. Diego Latella, Istvan Majzik, and Mieke Massink. Towards a formal operational semantics of UML statechart diagrams. In Proc. FMOODS'99. Kluwer, 1999.
48. M. Lenisa. Themes in Final Semantics. PhD thesis, Università di Pisa-Udine, 1998.
49. Z. Liu, J. He, and X. Li. Contract-oriented development of component software. Technical Report 285, United Nations University, International Institute for Software Technology (UNU-IIST), 2003.
50. K.-P. Löhr. Towards automatic mediation between heterogeneous software components. Electr. Notes in Theor. Comp. Sci., Elsevier, 65(4), 2002.
51. S. Mac Lane. Categories for the Working Mathematician. Springer Verlag, 1971.
52. V. Matena and B. Stearns. Applying Enterprise JavaBeans: Component-Based Development for the J2EE Platform. Addison-Wesley, 2000.
53. G. H. Mealy. A method for synthesizing sequential circuits. Bell Systems Techn. Jour., 34(5):1045-1079, 1955.
54. E. Meijer, M. Fokkinga, and R. Paterson. Functional programming with bananas, lenses, envelopes and barbed wire. In J. Hughes, editor, Proceedings of the 1991 ACM Conference on Functional Programming Languages and Computer Architecture, pages 124-144. Springer Lect. Notes Comp. Sci. (523), 1991.
55. R. Milner. Communication and Concurrency. Series in Computer Science. Prentice-Hall International, 1989.
56. L. Monteiro. Observation systems. In H. Reichel, editor, CMCS'00 - Workshop on Coalgebraic Methods in Computer Science. ENTCS, volume 33, Elsevier, 2000.
57. E. F. Moore. Gedanken experiments on sequential machines. In Automata Studies, pages 129-153. Princeton University Press, 1966.
58. L. Moss. Coalgebraic logic. Ann. Pure & Appl. Logic, 1999.
59. O. Nierstrasz and F. Achermann. A calculus for modeling software components. In F. S. de Boer, M. Bonsangue, S. Graf, and W.-P. de Roever, editors, Proc. First International Symposium on Formal Methods for Components and Objects (FMCO'02), pages 339-360. Springer Lect. Notes Comp. Sci. (2852), 2003.
60. S. Oaks and H. Wong. Jini in a Nutshell. O'Reilly and Associates, 2000.
61. J. N. Oliveira. The Formal Semantics of Deterministic Dataflow Programs. PhD thesis, Department of Computer Science, University of Manchester, February 1984.
62. J. N. Oliveira. Formal Software Development. Lecture Notes for the MSc in Computer Science, Minho University, 1992.
63. OMG. OMG Unified Modeling Language Specification, Version 1.4, 2001.
64. G. Papadopoulos and F. Arbab. Coordination models and languages. In Advances in Computers - The Engineering of Large Systems, volume 46, pages 329-400. 1998.
65. D. Park. Concurrency and automata on infinite sequences. In Proceedings of the 5th GI-Conference on Theoretical Computer Science, volume 104 of Lect. Notes Comp. Sci., pages 167-183. Springer-Verlag, 1981.
66. J. Power and H. Watanabe. An axiomatics for categories of coalgebras. In B. Jacobs, L. Moss, H. Reichel, and J. Rutten, editors, CMCS'98, Elect. Notes in Theor. Comp. Sci., volume 11. Elsevier, March 1998.
67. H. Reichel. An approach to object semantics based on terminal co-algebras. Math. Struct. in Comp. Sci., 5:129-152, 1995.
68. N. Rodrigues. Formal methods laboratory: the component repository specification. Technical report, Univ. Minho, DI, 2003.
69. N. Rodrigues and L. S. Barbosa. On the specification of a component repository. In Hung Dang Van and Zhiming Liu, editors, Proc. of FACS'03 (Formal Approaches to Component Software), pages 47-62, Pisa, September 2003.
70. J. Rothe, B. Jacobs, and H. Tews. The coalgebraic class specification language CCSL. Jour. of Universal Computer Science, 7(2), 2001.
71. J. Rumbaugh, I. Jacobson, and G. Booch. The Unified Modeling Language Reference Manual. Addison-Wesley, 1997.
72. J. Rutten. A calculus of transition systems (towards universal co-algebra). In A. Ponse, M. de Rijke, and Y. Venema, editors, Modal Logic and Process Algebra, A Bisimulation Perspective, CSLI Lecture Notes (53), pages 231-256. CSLI Publications, Stanford, 1995.
73. J. Rutten. Universal coalgebra: A theory of systems. Technical report, CWI, Amsterdam, 1996.
74. J. Rutten. Automata and coinduction (an exercise in coalgebra). In Proc. CONCUR'98, pages 194-218. Springer Lect. Notes Comp. Sci. (1466), 1998.
75. J. Rutten. Universal coalgebra: A theory of systems. Theor. Comp. Sci., 249(1):3-80, 2000. (Revised version of CWI Techn. Rep. CS-R9652, 1996).
76. J.-G. Schneider and O. Nierstrasz. Components, scripts, glue. In L. Barroca, J. Hall, and P. Hall, editors, Software Architectures - Advances and Applications, pages 13-25. Springer-Verlag, 1999.
77. K. Segerberg. An essay in classical modal logic. Technical Report 13, University of Uppsala, Filosofiska Studier, 1971.
78. R. Siegel. CORBA: Fundamentals and Programming. John Wiley & Sons Inc, 1997.
79. J. M. Spivey. The Z Notation: A Reference Manual (2nd ed). Series in Computer Science. Prentice-Hall International, 1992.
80. M. Sun and B. K. Aichernig. Component-based coalgebraic specification and verification in RSL. Technical Report 267, UNU/IIST, October 2002.
81. M. Sun and B. K. Aichernig. A coalgebraic calculus for component based systems. In Hung Dang Van and Zhiming Liu, editors, Proc. of FACS'03 (Formal Approaches to Component Software), pages 27-46, Pisa, September 2003.
82. M. Sun and B. K. Aichernig. Towards a Coalgebraic Semantics of UML: Class Diagrams and Use Cases. Technical Report 272, UNU/IIST, January 2003.
83. M. Sun, B. K. Aichernig, L. S. Barbosa, and Z. Naixiao. A coalgebraic semantic framework for component based development in UML. In L. Birkedal, editor, Proc. Int. Conf. on Category Theory and Computer Science (CTCS'04), volume 122, pages 229-245. Elect. Notes in Theor. Comp. Sci., Elsevier, 2005.
84. M. Sun and L. S. Barbosa. On refinement of generic software components. In C. Rattray, S. Maharaj, and C. Shankland, editors, 10th Int. Conf. Algebraic Methods and Software Technology (AMAST), pages 506-520, Stirling, 2004. Springer Lect. Notes Comp. Sci. (3116).
85. M. Sun, Z. Naixiao, and L. S. Barbosa. On semantics and refinement of UML statecharts: A coalgebraic view. In J. Cuellar and Z. Liu, editors, Proc. of 2nd IEEE Int. Conf. on Software Engineering and Formal Methods, pages 164-173, Beijing, China, September 2004. IEEE Computer Society Press.
86. C. Szyperski. Component Software, Beyond Object-Oriented Programming. Addison-Wesley, 1998.
87. The RAISE Language Group. The RAISE Specification Language. Prentice Hall International, 1992.
88. D. Turi and J. Rutten. On the foundations of final coalgebra semantics: non-wellfounded sets, partial orders, metric spaces. Math. Struct. in Comp. Sci., 8(5):481-540, 1998.
89. P. Wadler and K. Weihe. Component-based programming under different paradigms. Technical report, Dagstuhl Seminar 99081, February 1999.
90. U. Wolter. A coalgebraic introduction to CSP. In CMCS'99, Elect. Notes in Theor. Comp. Sci., volume 19. Elsevier, 1999.
Chapter 4

A Theory for Requirements Specification and Architecture Design of Multi-Functional Software Systems

Manfred Broy
Institut für Informatik, Technische Universität München
D-80290 München, Germany
[email protected]
http://wwwbroy.informatik.tu-muenchen.de

We extend the FOCUS model [2] and its interface theory of distributed concurrent interactive systems and their structuring into components. Our goal is the scientific foundation for two methodologically essential, complementary and orthogonal concepts for the structuring of multi-functional systems in software and systems engineering. One essentially addresses requirements engineering and the specification of the comprehensive user functionality of multi-functional systems in terms of their functions, features and services. The other mostly addresses the design phase, with its task of developing logical architectures formed by networks of interactive components that are specified by their interface behavior. The first concept is of major interest for the requirements engineer, the second for the software architect. We show that these concepts are complementary, and how they work and fit together as milestones in requirements engineering and systems architecture design.
4.1. Motivation
Software development is today one of the most complex and powerful tasks in engineering. Modern software systems typically are embedded within technical or organizational processes and support those. They are deployed and distributed over large networks of computing devices; they are dynamic, and accessed concurrently via a family of independent user interfaces. They are based on software infrastructure such as operating systems and communication services. They exploit the services of middleware such as object request brokers. They offer to their users a large variety of different functions, in our terminology called services or features. We speak of multi-functional distributed interactive systems. To master their complexity, large software systems are typically constructed in a modular fashion and structured into components. These components are grouped together in software architectures. Ideas along these lines go way back to the early concepts of "structured programming" due to Dijkstra [5] and "modularization" due to Parnas [10]. We base our approach on the FOCUS theory providing a modular approach to the logical description of distributed interactive systems (see Broy,
Stølen [2]). FOCUS models systems that are composed of interacting components exchanging messages and working concurrently. FOCUS provides a modular technique for the specification of the interfaces of systems and their components composed by a logically simple, general, nevertheless powerful form of composition. In the following we extend FOCUS. We introduce a formal model of comprehensive system functionalities structured in terms of services and architectures. In FOCUS a component is represented by a total behavior. In contrast, a service is a partial behavior. In the following we are mainly interested in fundamental issues of structuring systems. In particular, we deal with two complementary ways of structuring, namely
• structuring the functionality of a system in a family of principal user functions, called services, and
• decomposing the system into an architecture of components.
User functionality is one dimension of system structuring. Another dimension structures systems into logical architectures in terms of a decomposition into families of components that are connected by communication links and in this way composed to form architectures that implement the user functionality of the system. In practice today, requirements are often not formulated with sufficient precision. One reason is that, in practice, the documentation of requirements is not well supported by formalisms, since tractable structuring concepts are missing. Also, the modeling approaches to requirements that have been proposed so far are weak. For instance, UML offers little for modeling requirements apart from its quite informal use case diagrams. Services form a helpful structuring concept in requirements engineering. They provide a promising step towards a formal model for structuring and formalizing requirements. Even architectural design and documentation are not sufficiently well supported by formalisms in practice today.
In spite of the many existing approaches, such as architecture description languages, tractable and sufficiently expressive specification methods for architectures remain a challenge. For instance, interface specifications of components in architectures are, in general, not precisely formulated. Existing techniques are neither powerful nor expressive enough. This serious methodological weakness has negative implications for distributed and concurrent system development when it comes to system integration. From a methodological point of view, too, there are severe deficiencies in the state of the art of systems engineering when it comes to requirements and architectures. The transition from requirements to architecture is not well supported by today's formalisms and methods. The integration of quality assurance and architecture/interface specification is insufficient. This is a severe methodological drawback, since architectures have proved to be crucial for system development and maintenance. In this paper we concentrate mainly on foundations supporting the structuring of the functionality and architecture of multi-functional systems. Our goal, in essence, is a scientific foundation for the structured presentation of system functions, organized in abstraction layers, forming hierarchies of functions ("services").
Our vision is a service-oriented software engineering method where services form the basic building blocks. In particular, services address aspects of interface behaviors and specifications. In fact, they induce an aspect-oriented view onto systems and their architectures [6]. Our approach introduces two fundamental structural views for multi-functional systems. These views address (at least) two principal, independent dimensions of structured modeling in the early phases of software development:
(1) Structured model of the user functionality of systems - requirements engineering and specification:
(a) A multi-functional system offers many different functions ("use cases"); we are interested in a structured view onto the family of these functions and their logical dependencies.
(b) The main result should be a structured interface specification of the user functionality of the system.
(2) Decomposition of systems - design of the architecture:
(a) A system is decomposed into a family of components that mutually cooperate to generate the behaviour according to the specified user functionality of the system.
(b) The components form the component architecture of the system.
This leads to two complementary tasks of structuring in the early phases of software and systems development. The structuring of a system into a component architecture is better understood today, since it has been studied in a model-oriented way for quite some time (for instance under the heading of architectural description languages [4]). Nevertheless, the key issues of architecture specification, namely interface specification and the composition of interface specifications, are still an active field of research. Structuring the user functionality is indeed far less well understood. The key goal here is a structured interface specification in terms of sub-functions.
For that, we have to understand how to specify these functions independently and how to derive integrated system specifications from them. We are looking for an answer to the question of how to model the functions and what the typical relations and logical dependencies between sub-functions might be. Moreover, both views, user functionality and architecture, have to be integrated into a system development process. In particular, we need to understand the systematic transition from the requirements specification of a multi-functional software system to its logical architecture and the interface specification of its components. To reach this understanding we have to answer the following questions:
• What precisely is a system function and a subsystem sub-function?
• What is a good way to structure the functionality of a system?
• And how do we capture its architecture and relate it to its user functionality?
One problem when dealing with services and functionality is, to start with, an appropriate, unambiguous and consistent terminology. There are many related (more or
M. Broy
less synonymous) notions such as
• function
• feature
• service
• use case.
Each of these notions is used with slightly different meanings by different groups of researchers. Therefore we introduce our terminology very carefully. Our informal definition of a service is that it represents a system sub-function of a multi-functional system. A system sub-function corresponds to a set of patterns of interactive behaviours that allow us to use a system for a specific purpose in the sense of a use case (see Jacobson [8]). We call such a system sub-function a service (some practitioners rather speak of the features of the system). In the following we give a formal definition of the concept of services. This formalization is guided by our basic philosophy of services, which is as follows:
• Services are formalized and specified by patterns of interactions.
• Services are partial behaviours of interface behaviours of systems.
• A system behaviour realizes a family of services.
This philosophy of services is taken as a guideline for our way to build up a theory of services. We outline a formal approach to services. We specify services in terms of relations on interactions represented by streams. In this way we aim at answers to the following questions:
• What exactly is a service?
• What is a good formal model of a service?
• How can we specify services?
• How can we relate services?
• How can we combine services?
• How can we structure the user functionality of multi-functional systems in terms of services?
• How can we specify multi-functional systems in terms of services?
• How can we define and express the relationships between services?
• How can we exploit these ideas in architectural design?
The system model underlying our approach is based on streams and component behaviours as well as component interfaces. Our goal is to work out a theory of services and its embedding into the theory of architectures. We give a strictly formal, mathematical definition of components and services. Consequences of our formal definitions are, in particular, as follows:
• A component is described by a total behaviour.
• In contrast, a service is, in general, described by a partial behaviour.
• A component is the special case of a total service. Services can be formally related to components.
• A multi-functional system/component can be defined in terms of its family of services.
Our theory covers all these notions. The open questions that remain are mainly of a methodological and practical nature. In the following text we intend to address mainly questions of theory such as the following:
• How can we structure multi-functional systems/components into a family of sub-services?
• Is there a canonical decomposition of a component into a family of services?
• How can we define relations between services?
• How can we capture causal dependencies between services?
The overall goal of this paper is to study semantic models of functionality in terms of services and architectures. The purpose of this theory is to provide a basis for an engineering method for the design and specification of system functionality and architectures. In the following we briefly outline this theory and, in particular, formally define our notions of interface, service, component, architecture, and system refinement. Based on these notions we define a number of relations between services that we consider useful for structuring multi-functional systems. However, we do not give a comprehensive framework of such relations. Actually, we see the contribution of this paper as only a first step into the theory of multi-functional systems.
4.2. Components, Interfaces and Services
In this section we introduce the syntactic and semantic notion of a component interface and that of a service. We closely follow the FOCUS approach as explained in full detail in Broy, Stølen [2]. FOCUS provides a flexible modular notion of a component, its interface, and of a service, too. We briefly repeat the notion of a component based on the idea of a data stream. Throughout this paper we work with only a few rather simple, but powerful notations for data streams.
4.2.1. Types, Streams, Channels and Histories
We find it useful to structure the data handled by systems in terms of types. A type is a name for a set of data elements. Let TYPE be the set of all types. With each type $T \in TYPE$ we associate a set CAR(T) of data elements. CAR(T) is called the carrier set for the type T. Let ID denote a set of identifiers. A typed identifier is a pair (x, T) consisting of an identifier $x \in ID$ and a type $T \in TYPE$. We also write x : T to express that the identifier x has type T. Given a set M, by $M^*$ we denote the set of finite sequences of elements from M, and by $M^\infty$ the set of infinite sequences of elements of M, which can easily be represented by functions $\mathbb{N} \to M$. By $M^\omega$ we denote $M^* \cup M^\infty$, called the set of finite and infinite (non-timed) streams. In the following we work with streams that include timing information. Such streams are used to represent histories of communications of data messages transmitted within a time frame. To keep the time model simple we assume a model of time consisting of an infinite sequence of time intervals of equal length. Given a message set M of data elements of type T we represent$^a$ a timed stream s by a function
$$s : \mathbb{N}\setminus\{0\} \to M^*$$
where $M^*$ is the set of finite sequences over the set M. For each time interval $t \in \mathbb{N}\setminus\{0\}$ the sequence s(t) denotes the sequence of messages communicated at time interval t in the stream s. The set of all timed streams of messages of a type T forms the carrier set CAR(Stream T) of the type Stream T. A channel is an identifier for streams. A channel thus is basically a name for a stream. Formally a channel is an identifier of type Stream T with some type T. The concept of a stream is used to define the concept of a channel history. Let ID be a set of identifiers. For a set $C \subseteq ID \times TYPE$ of typed channels we denote by $SET(C) \subseteq ID$ its set of channel identifiers:
$$SET(C) = \{c : \exists\, T \in TYPE : (c, T) \in C\}$$
A set of typed channels is called a typed channel set if every channel $c \in SET(C)$ has a unique type in C:
$$(c, T_1) \in C \wedge (c, T_2) \in C \Rightarrow T_1 = T_2$$
A typed channel set $C_1$ is called a subtype of a typed channel set $C_2$ if $SET(C_1) \subseteq SET(C_2)$ and if the following formula holds:
$$(c, T_1) \in C_1 \wedge (c, T_2) \in C_2 \Rightarrow CAR(T_1) \subseteq CAR(T_2)$$
We then write $C_1$ subtype $C_2$.
The idea of subtypes is essential for relating services (see later). For a set of typed channels C (which is nothing but a set of typed identifiers of type stream) a channel history is given for each channel $c \in C$ by the stream of messages communicated over that channel.
Definition. Channel history
Let C be a set of channels; a channel history is a mapping (let M be the universe of all data elements)
$$x : C \to (\mathbb{N} \to M^*)$$
such that x.c is a stream of type Type(c) for each channel $c \in C$. We denote the set of channel histories for the channel set C both by H(C) and by $\vec{C}$. □
$^a$ Here we use the extended representation of a timed stream in comparison with the FOCUS representation (see Broy, Stølen [2]).
Throughout this paper we work with a couple of simple basic operators and notations for streams and timed streams, which are briefly summarized below:
$\langle\rangle$    empty sequence or empty stream,
$\langle m \rangle$    one-element sequence containing m as its only element,
$x.t$    t-th element of the stream x (for a timed stream, the sequence of messages communicated at time interval t),
$\#x$    length of the stream x,
$x ^\frown z$    concatenation of the sequence x to the sequence or stream z; to cover the concatenation of infinite streams we define: if x is infinite then $x ^\frown z = x$,
$x \sqsubseteq y$    the prefix ordering, which is a partial order on streams specified for streams x, y by the formula $x \sqsubseteq y \equiv \exists z : x ^\frown z = y$,
$x{\downarrow}t$    prefix of length t of the stream x (the first t sequences of the stream x),
$S\,©\,x$    stream obtained from the stream x by deleting all its messages that are not elements of the set S,
$S\#x$    number of messages in x that are elements of the set S,
$\bar{x}$    finite or infinite stream that is the result of concatenating all sequences in the timed stream x. $\bar{x}$ is called the time abstraction. Note that $\bar{x}$ is finite if x carries only a finite number of nonempty sequences.
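The operators summarized above can be tried out on finite prefixes of timed streams. The following Python lines are a minimal sketch under stated assumptions, not part of the FOCUS notation: a finite prefix of a timed stream is assumed to be a list holding one message list per time interval, messages are assumed to be strings, and all function names are hypothetical.

```python
from typing import List, Sequence, Set

# Assumption: a finite prefix of a timed stream is a list with one
# message list per time interval (index 0 holds time interval 1).
TimedStream = List[List[str]]


def time_abstraction(x: TimedStream) -> List[str]:
    """Concatenate the sequences of all intervals (the untimed stream)."""
    return [m for interval in x for m in interval]


def prefix(x: TimedStream, t: int) -> TimedStream:
    """First t sequences of the timed stream x."""
    return x[:t]


def filter_stream(s: Set[str], x: Sequence[str]) -> List[str]:
    """Delete all messages of x that are not elements of s."""
    return [m for m in x if m in s]


def count(s: Set[str], x: Sequence[str]) -> int:
    """Number of messages in x that are elements of s."""
    return len(filter_stream(s, x))


def is_prefix(x: Sequence[str], y: Sequence[str]) -> bool:
    """Prefix ordering on finite sequences: some z with x ^ z = y exists."""
    return list(y[:len(x)]) == list(x)


x = [["d1"], [], ["req", "d2"], ["req"]]
flat = time_abstraction(x)
print(flat, count({"req"}, flat), is_prefix(["d1"], flat))
```

Infinite streams are outside the scope of this sketch; only the finite case is covered.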
We will use this notation to write specifying assertions. A more complex operation on histories is the direct sum.
Definition. Direct Sum of Histories
Given two sets C and C' of channels with consistent types and histories $z \in H(C)$ and $z' \in H(C')$ we define the direct sum of the histories z and z' by
$$(z \oplus z') \subseteq H(C \cup C')$$
It is specified as follows:
$$\{y.c : y \in (z \oplus z')\} = \{z.c\} \Leftarrow c \in C \setminus C'$$
$$\{y.c : y \in (z \oplus z')\} = \{z'.c\} \Leftarrow c \in C' \setminus C$$
$$merge(z.c, z'.c) = (z \oplus z').c \Leftarrow c \in C \cap C'$$
where
$$merge(\langle\rangle, s) = merge(s, \langle\rangle) = \{s\}$$
$$merge(\langle a_1 \rangle ^\frown s_1, \langle a_2 \rangle ^\frown s_2) = \{\langle a_1 \rangle ^\frown s : s \in merge(s_1, \langle a_2 \rangle ^\frown s_2)\} \cup \{\langle a_2 \rangle ^\frown s : s \in merge(\langle a_1 \rangle ^\frown s_1, s_2)\}$$
This definition expresses that each history in the set $z \oplus z'$ carries all the messages the streams z and z' carry in the same time intervals and the same order. □
Based on the direct sum we can introduce the notion of a sub-history ordering. It expresses that a history contains only a selection of the messages of a given history.
Definition. Sub-history Ordering
Given two histories $z \in H(C)$ and $z' \in H(C')$ where C subtype C' we define the sub-history ordering $\le_{sub}$ as follows:
$$z \le_{sub} z' \text{ iff } \exists z'' : z' \in z \oplus z''$$
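The merge function of the direct sum can be read directly as a recursive program. The sketch below is an illustration only: it works on finite, untimed sequences of a single channel, represented as Python tuples. The helper is_sub_history assumes the reading that z is a sub-history of z' when z' is an interleaving of z with some further sequence, which for a single channel amounts to z being a subsequence of z'. Both function names are hypothetical.

```python
from typing import Set, Tuple

Seq = Tuple[str, ...]


def merge(s1: Seq, s2: Seq) -> Set[Seq]:
    """All interleavings of s1 and s2 that keep the order within each sequence."""
    if not s1:
        return {s2}
    if not s2:
        return {s1}
    a1, rest1 = s1[0], s1[1:]
    a2, rest2 = s2[0], s2[1:]
    return ({(a1,) + s for s in merge(rest1, s2)}
            | {(a2,) + s for s in merge(s1, rest2)})


def is_sub_history(z: Seq, z_prime: Seq) -> bool:
    """True if z_prime lies in merge(z, z2) for some z2 (subsequence test)."""
    remaining = iter(z_prime)
    # 'm in remaining' advances the iterator, so the order of z is respected.
    return all(m in remaining for m in z)


print(sorted(merge(("a",), ("b",))))
```

The number of interleavings grows quickly with the sequence lengths, so the sketch is only meant for small examples.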
In fact, the sub-history ordering relation between histories is a partial ordering on the set of channel histories. The proof is rather straightforward. The notions of a timed stream and that of a channel history are essential for defining the behavior of components and services.
4.2.2. Components and Services
Components have syntactic interfaces determined by the families of their typed input and output channels. We describe the black box behavior of components by their semantic interfaces.
[Figure: data flow node F with input channels $x_1 : S_1, \ldots, x_n : S_n$ and output channels $y_1 : T_1, \ldots, y_m : T_m$]
Fig. 4.1. Graphical representation of a component F with its syntactic interface as a data flow node
In our terminology an interface provides both a syntactic and semantic notion. We work with the concept of channels, data types, and that of timed data streams to describe interfaces.
Definition. Syntactic interface
Let I be a set of typed input channels and O be the set of typed output channels. The pair (I, O) characterizes the syntactic interface of a component. This syntactic interface is denoted by $(I \blacktriangleright O)$. □
A component is connected to its environment exclusively by its channels. The syntactic interface indicates which types of messages the component can exchange, but it specifies nothing about the specific interface behavior of the component. Next, we introduce a model of the interface behavior. For each history $y \in \vec{C}$ and each time $t \in \mathbb{N}$ the expression $y{\downarrow}t$ denotes the history (the initial communication behavior on the channels) of y till time t. $y{\downarrow}t$ yields a finite history for the channels in C represented by a mapping of the type $C \to (\{1, \ldots, t\} \to M^*)$.
Definition. Interface behavior of a component
A component interface (behavior) with the syntactic interface $(I \blacktriangleright O)$ is given by a function
$$F : \vec{I} \to \wp(\vec{O})$$
that fulfills the following timing property (for all its input histories x and z), called strong causality:
$$x{\downarrow}t = z{\downarrow}t \Rightarrow \{y{\downarrow}(t+1) : y \in F.x\} = \{y{\downarrow}(t+1) : y \in F.z\}$$
The timing property axiomatizes the time flow (for details see Broy, Stølen [2]). A function that fulfills this timing requirement is called strongly causal. By $CIF[I \blacktriangleright O]$ we denote the set of all component interfaces with the set of typed input channels I and the set of typed output channels O. By CIF we denote the set of all interfaces for arbitrary channel sets I and O. □
Syntactic interfaces define a kind of syntactic type for components. Semantic interfaces characterize the observable behavior (users' view) of components, also called interface abstraction or black box view. The definition of a component behavior as a strongly causal function has an immediate consequence. It implies that for a component interface F the set F.x is either empty for all histories x or nonempty for all x. In the first case we call the interface function F paradoxical. In the latter case we call the interface function total. A component interface behavior therefore is either total or paradoxical. A service has a syntactic interface just like a component. Its behavior, however, is "partial" in contrast to the totality of a non-paradoxical component interface. Partiality here means that a service is defined (has a nonempty set of output histories) only for a subset of its input histories. This subset is called the service domain.
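Definition. Service interface
A service interface with the syntactic interface $(I \blacktriangleright O)$ is modeled by a function
$$F : \vec{I} \to \wp(\vec{O})$$
that fulfills the timing property only for the input histories with a nonempty output set (let $x, z \in \vec{I}$, $y \in \vec{O}$, $t \in \mathbb{N}$):
$$F.x \neq \emptyset \neq F.z \wedge x{\downarrow}t = z{\downarrow}t \Rightarrow \{y{\downarrow}(t+1) : y \in F(x)\} = \{y{\downarrow}(t+1) : y \in F(z)\}$$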
Such a partial function that fulfills this property is called strongly causal, too. The set $dom.F = \{x : F.x \neq \emptyset\}$ is called the service domain of F. The set $rng.F = \{y \in F.x : x \in dom.F\}$ is called the service range of F. By $IF[I \blacktriangleright O]$ we denote the set of all service interfaces with input channels I and output channels O. By IF we denote the set of all interfaces for arbitrary channel sets I and O. □
Clearly we have $CIF \subseteq IF$ and $CIF[I \blacktriangleright O] \subseteq IF[I \blacktriangleright O]$. In contrast to a component, where the causality requirements imply that either all output sets F.x are empty or none is, a service is, in general, a partial function with a nontrivial service domain. To get access to a service, typically, certain access conventions have to be observed. We speak of a service protocol. Input histories x that are not in the service domain do not fulfill the service access assumptions. This gives a clear view: a non-paradoxical component is total, while a service may be partial. In other words, a non-paradoxical component is a total service. For a non-paradoxical component there exist nonempty sets of possible behaviors for every input history.
Fig. 4.2. Service interface
From a methodological point of view a service is closely related to the idea of a use case in object-oriented analysis. It can be seen as the formalization of this idea.
A service provides a partial view onto the interface behavior of a component. The characterization of the service domain can be used in service specifications by formulating assumptions about the input histories.
4.3. Specifying, Structuring, Relating, and Combining Services
In the following we work out an approach to specifying, structuring, relating, and combining services. We start from the modeling and specification techniques developed in Broy, Stølen [2] and extend them to services.
4.3.1. Specification Notation
In a timed stream $x \in (M^*)^\infty$ we express in which time slots which messages are transmitted. As long as the timing is not relevant for a system it does not matter if a message is delayed and thus transmitted a bit later (scheduling messages earlier may make a difference with respect to causality - see later). In that case we may work with time abstractions. An abstract specification of a component interface provides the following essential information:
• its syntactic interface, describing how the component interacts with its environment via its input and output channels,
• its behavior, given by a specifying formula relating input and output channel valuations.
This leads to a specification technique for components (see Broy, Stølen [2] for many examples). In FOCUS we specify a component by a scheme of the following form:
Component/Service: name(parameters)    frame label
in    input channels
out    output channels
specifying formula
A scheme with this shape is called a template. It is inspired by well-known specification approaches like Z (see Spivey [13]). We use these templates to specify system interfaces and services. The frame label denotes which kind of specification frame (e.g. timed or time independent) is used. The key word untimed expresses that the identifiers x denoting channels in the specifying formulas do not refer to timing properties and therefore actually stand for their time abstractions $\bar{x}$. In the case of time independent specifications we get the original specifying formula easily by replacing each channel identifier c by its time abstraction $\bar{c}$.
4.3.2. Service Specification
As we have explained, a service is a set of interaction patterns with a strong causality property. In this section we show how to specify services and demonstrate this by a set of basic examples.
Logical Specification of Services
We start with a simple example of a service and its specification. We specify the classical service of a queue. It is a simple example out of a rich class of services for storing and retrieving data.
Example. Queue service
A Queue service allows us to store elements of type Data and to request them in a first in, first out (FIFO) fashion. A typical application of a device with such a service might be a PDA that offers the option to store a queue of tasks. We specify the involved data types as follows:
type QIn = {req} ∪ Data
type QOut = Data
Based on these data types we formulate the specification template describing the service Queue (here we write $M\#x$ as an abbreviation for $\#(M\,©\,x)$):
Service: Queue    untimed
in    x : QIn
out    y : QOut
$\{req\}\#x = Data\#y \;\wedge\; y \sqsubseteq Data\,©\,x \;\wedge\; \forall x' : x' \sqsubseteq x \Rightarrow \{req\}\#x' \le Data\#x'$
The specifying assertion basically expresses that the number of requests in the input stream x determines the number of messages in the output stream y and that y returns the data in x in a FIFO discipline. This is, in fact, the specification of a partial behavior. If the input stream x has for instance the shape
$$x = \langle d_1 \rangle ^\frown \langle req \rangle ^\frown \langle req \rangle ^\frown \langle d_2 \rangle ^\frown \langle req \rangle \ldots$$
then the specifying assertion for x cannot be made valid for any output history y. In other words, for x there does not exist an output history y that fulfils the specification. The input history x is therefore not in the service domain. We may characterize the set of input histories in the service domain by a predicate on the set of input histories as follows:
$$Queue\text{-}Aspt(x) = \forall x' : x' \sqsubseteq x \Rightarrow \{req\}\#x' \le Data\#x'$$
This predicate is also called the service assumption. □
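For finite untimed streams, the specifying assertion of the Queue service and its service assumption can be checked mechanically. The following Python sketch is illustrative only: it assumes that the request message is the string "req", that every other message is a data element, and that streams are finite lists. The names queue_aspt and queue_spec are hypothetical.

```python
REQ = "req"  # assumption: every message other than REQ is a data element


def queue_aspt(x):
    """Service assumption: in every prefix of x the requests never outnumber the data."""
    balance = 0
    for m in x:
        balance += -1 if m == REQ else 1
        if balance < 0:
            return False
    return True


def queue_spec(x, y):
    """Specifying assertion of Queue for finite untimed streams x (input) and y (output)."""
    data = [m for m in x if m != REQ]
    requests = len(x) - len(data)
    return (requests == len(y)              # as many outputs as requests
            and data[:len(y)] == list(y)    # y is a prefix of the data in x (FIFO)
            and queue_aspt(x))              # requests are always covered by data


print(queue_aspt(["d1", REQ, REQ, "d2", REQ]))  # input shape from the text
```

For the input shape discussed in the text the assumption fails at the second request, so no output y can satisfy queue_spec for it.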
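This example shows how we relate an assumption predicate that characterizes the expected input histories with a service specification that characterizes the service domain.
Example. The service of a postfix calculator
A very simple example of a service is that of a calculator. Again we start with data type declarations. Let
type Op = {+, *, ...}
type CalIn = ℕ ∪ Op
type CalOut = ℕ
be data types. The specification of the calculator service reads as follows:
Service: CalService    untimed
in    x : CalIn
out    y : CalOut
$\forall x' \in CalIn^* : (x' \sqsubseteq x \Rightarrow Aspt(x')) \wedge y = App(x)$
where $\forall n, m, k \in \mathbb{N}, op \in Op$:
$Aspt(\langle\rangle) = true$
$Aspt(\langle n \rangle) = Aspt(\langle n \rangle ^\frown \langle m \rangle) = true$
$Aspt(\langle n \rangle ^\frown \langle m \rangle ^\frown \langle op \rangle ^\frown x) = Aspt(x)$
$Aspt(\langle n \rangle ^\frown \langle m \rangle ^\frown \langle k \rangle ^\frown x) = Aspt(\langle n \rangle ^\frown \langle op \rangle ^\frown x) = Aspt(\langle op \rangle ^\frown x) = false$
$App(\langle n \rangle ^\frown \langle m \rangle ^\frown \langle + \rangle ^\frown x) = \langle m + n \rangle ^\frown App(x)$
$App(\langle n \rangle ^\frown \langle m \rangle ^\frown \langle * \rangle ^\frown x) = \langle m * n \rangle ^\frown App(x)$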
This is again the specification of a partial behavior. The input is required to follow the syntactic form as specified by the predicate Aspt. If the input stream x has the pattern
$$x = \langle n_1 \rangle ^\frown \langle + \rangle ^\frown \langle n_2 \rangle ^\frown \ldots$$
then the condition for the stream x cannot be made valid, since there is no output history that fulfils the specification. We may characterize the set of input histories in the service domain as follows:
$$Cal\text{-}Aspt(x) = \forall x' \in CalIn^* : x' \sqsubseteq x \Rightarrow Aspt(x')$$
The predicate Cal-Aspt forms the service assumption. □
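The equations for Aspt and App translate almost literally into a program. The Python sketch below is one possible reading, with several assumptions: numbers are Python integers, operators are the strings "+" and "*" (the further operators indicated in the type Op are left out), inputs are finite lists, and App is assumed to yield no further output for an incomplete trailing group, a case the equations leave open. All function names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}  # only the two operators named in the text


def aspt(x):
    """Aspt: the input consists of groups n, m, op, possibly followed by n or n, m."""
    x = list(x)
    while True:
        if len(x) == 0:
            return True
        if not isinstance(x[0], int):
            return False          # input starts with an operator
        if len(x) == 1:
            return True
        if not isinstance(x[1], int):
            return False          # operator after a single number
        if len(x) == 2:
            return True
        if x[2] not in OPS:
            return False          # three numbers in a row
        x = x[3:]


def cal_aspt(x):
    """Cal-Aspt: every finite prefix of x satisfies Aspt."""
    x = list(x)
    return all(aspt(x[:i]) for i in range(len(x) + 1))


def app(x):
    """App for inputs that satisfy the assumption; incomplete groups yield no output."""
    x = list(x)
    out = []
    while len(x) >= 3:
        n, m, op = x[:3]
        out.append(OPS[op](m, n))
        x = x[3:]
    return out


print(app([1, 2, "+", 3, 4, "*"]))
```

app is only meant to be called on inputs for which cal_aspt holds; on other inputs it may raise an error.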
It is not difficult to model and to represent also services that incorporate some physical aspects, such as the services of a cash machine. Then the physical output of cash is simply represented by respective messages.
Example. A cash machine
A cash machine returns an amount of cash c to a user with account u if there is enough money on the user's account. This is indicated by the predicate ok(u, c). We deal with the following data types:
type CMIn = UserAccount ∪ Amount ∪ {Closed}
type CMOut = Amount ∪ {Rejected}
To retrieve cash a user sends his user account and then repeatedly the amount of cash he wants to receive. The behavior of the cash machine is described by the following specification:
Service: CashMachine    untimed
in    x : CMIn
out    y : CMOut
$\forall x' \in CMIn^* : (x' \sqsubseteq x \Rightarrow OKIn(x')) \wedge y = f(x)$
where $\forall u, u_1, u_2 \in UserAccount, a \in Amount$:
$f(\langle u \rangle ^\frown \langle a \rangle ^\frown x) = \langle a \rangle ^\frown f(\langle u \rangle ^\frown x) \Leftarrow ok(u, a)$
$f(\langle u \rangle ^\frown \langle a \rangle ^\frown x) = \langle Rejected \rangle ^\frown f(\langle u \rangle ^\frown x) \Leftarrow \neg ok(u, a)$
$f(\langle u \rangle ^\frown \langle Closed \rangle ^\frown x) = f(x)$
$OKIn(\langle\rangle) = OKIn(\langle u \rangle) = true$
$OKIn(\langle u \rangle ^\frown \langle a \rangle ^\frown x) = OKIn(\langle u \rangle ^\frown x)$
$OKIn(\langle u \rangle ^\frown \langle Closed \rangle ^\frown x) = OKIn(x)$
$OKIn(\langle Closed \rangle ^\frown x) = OKIn(\langle a \rangle ^\frown x) = OKIn(\langle u_1 \rangle ^\frown \langle u_2 \rangle ^\frown x) = false$
In CashMachine the function f is underspecified - any function f that fulfils the specifying equations is fine. Again CashMachine specifies a partial function. Its output is only defined if the input history starts with an account, followed by an arbitrary number of amounts, and is finally closed before it starts with another account. □
Now we look at a slightly more complex example, namely that of a password service.
Example. Password service
Let the following data type definitions be given:
type PswIn = {quit} ∪ {psw(w) : w ∈ Password}
type PswOut = {pok, qok, perror}
The specification of the password service is parameterized by the type of messages protected by the password service. Messages of type M are only transmitted if a valid password is provided beforehand. Let valid be a predicate that indicates whether a password is valid.
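Service: PswService(type M)    untimed
in    x' : M | PswIn
out    x : M | PswOut
$x = P(x')$
where $\forall z \in M^*, w \in Password, x \in (M \mid PswIn)^\omega$:
$valid(w) \Rightarrow P(\langle psw(w) \rangle ^\frown z ^\frown \langle quit \rangle ^\frown x) = \langle pok \rangle ^\frown z ^\frown \langle qok \rangle ^\frown P(x)$
$valid(w) \Rightarrow P(\langle psw(w) \rangle ^\frown z ^\frown \langle psw(w') \rangle ^\frown x) = P(\langle psw(w) \rangle ^\frown z ^\frown x)$
$\neg valid(w) \Rightarrow P(\langle psw(w) \rangle ^\frown x) = \langle perror \rangle ^\frown P(x)$
$P(z ^\frown x) = P(\langle quit \rangle ^\frown x) = P(x)$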
This actually is the specification of a total behavior. We could also specify a partial behavior instead by specifying that input histories not starting with a password are not in the service domain. □
This example shows a very simplified view onto a password service that works rather like a firewall. Next we show how to use the password service on top of other services.
Example. Queue with Password Service
The specification QueuePsw of a queue service combined with a password service reads as follows.
Service: QueuePsw    untimed
in    x' : QIn | PswIn
out    y : QOut | PswOut
$y = F(\langle\rangle, 0).(x')$
where $\forall w \in Password, x \in (QIn \mid PswIn)^\omega, d \in Data, s \in Data^*$:
$F(s, L).(\langle d \rangle ^\frown x) = F(s ^\frown \langle d \rangle, L).(x)$
$F(\langle d \rangle ^\frown s, L).(\langle req \rangle ^\frown x) = \langle d \rangle ^\frown F(s, L).(x)$
$valid(w) \Rightarrow F(s, 0).(\langle psw(w) \rangle ^\frown x) = \langle pok \rangle ^\frown F(s, L).(x)$
$F(s, L).(\langle quit \rangle ^\frown x) = \langle qok \rangle ^\frown F(s, 0).(x)$
$F(s, L).(\langle psw(w) \rangle ^\frown x) = F(s, L).(x)$
$\neg valid(w) \Rightarrow F(s, 0).(\langle psw(w) \rangle ^\frown x) = \langle perror \rangle ^\frown F(s, 0).(x)$
$F(s, 0).(\langle z \rangle ^\frown x) = F(s, 0).(\langle quit \rangle ^\frown x) = F(s, 0).(x)$
So far we have two independent specifications of the services Queue and Password. The two specifications of services can be combined into a super-service. Here 0 and L are truth values that distinguish whether the password login was successful or not. This is the specification of a partial behavior, as specifications of services usually are. □
This is a first example of combined services.
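The recursive equations for F can be executed on finite inputs by carrying the pair of stored data and login flag as program state. The following Python sketch makes several assumptions that are not fixed by the text: password messages are modelled as tuples ("psw", w), "req" and "quit" are strings, every other message counts as a data element, the queue hands out its oldest element first (FIFO), and an input for which no equation applies is reported as None. The name queue_psw is hypothetical.

```python
def queue_psw(x, valid):
    """Return the output for the finite input x, or None if x is outside the service domain."""
    s = []             # stored data elements, oldest first
    logged_in = False  # False plays the role of 0, True the role of L
    y = []
    for m in x:
        is_psw = isinstance(m, tuple) and len(m) == 2 and m[0] == "psw"
        if not logged_in:
            if is_psw:
                if valid(m[1]):
                    y.append("pok")
                    logged_in = True
                else:
                    y.append("perror")
            # every other message is ignored while logged out
        elif is_psw:
            pass                   # further passwords are ignored while logged in
        elif m == "quit":
            y.append("qok")
            logged_in = False
        elif m == "req":
            if not s:
                return None        # no equation applies: request on an empty queue
            y.append(s.pop(0))
        else:
            s.append(m)            # data element d is stored
    return y


def check(w):
    return w == "ok"               # hypothetical validity predicate


print(queue_psw([("psw", "ok"), "a", "b", "req", "quit", "req"], check))
```

The partiality of the combined service shows up in the None result for a request on an empty queue after a successful login.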
Specifying Services by State Machines
Services can nicely be specified by state machines with input and output. Since services are partial we use partial state machines. From such partial state machines we can deduce state machines which describe the service domain. We represent the state machines by state transition diagrams. The meaning of state transition diagrams is explained in detail in Broy, Stølen [2].
[Figure: state transition diagram with the states qempty and nonempty]
Fig. 4.3. Queue service as a state machine
Fig. 4.3 shows a state machine for the queue service. It is in fact a partial state machine. Each transition includes a guard (an assertion that has to hold for the transition to be ready to fire), an input message (a message that has to arrive for the transition to be ready to fire), an output message and a state change. In some states there are no transitions defined for certain input messages. q' denotes the value of the attribute q in the state reached after the state transition.
[Figure: state transition diagram of the input assumption]
Fig. 4.4. Assumption as a state machine
Fig. 4.4 shows the state machine that formalizes the input assumption. It accepts only those streams that are in the service domain. It is easily derived schematically from the state machine in Fig. 4.3: we have only omitted the output messages.
[Figure: state transition diagram with the states iszero and nonzero]
Fig. 4.5. Queue service assumption as a state machine with simplified state space
There are several options to reflect the partiality of the service Queue in a state machine. In Fig. 4.4 no transition is enabled for certain input. In Fig. 4.5 we have simplified the state space (abstracting the sequence q into a number representing its length) and totalized the behavior by introducing an error state. In a slightly different way we can totalize the service from Fig. 4.3 as shown in Fig. 4.6. Here the state machine is total. It issues an error message if unacceptable input arrives.
[Figure: state transition diagram with the states qempty and nonempty and an additional transition x : req / y : error]
Fig. 4.6. Queue service as a state machine
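The difference between the partial state machine of Fig. 4.3 and its totalized variant of Fig. 4.6 can be illustrated by a small program. The Python sketch below is an illustration under assumptions: the state is the sequence q represented as a tuple, "req" is the request message, every other message is treated as a data element, and the function names step and run are hypothetical.

```python
def step(q, msg):
    """One transition of the partial machine; None if no transition is defined."""
    if msg == "req":
        if not q:
            return None           # empty queue: no transition for a request
        return q[1:], q[0]        # guard q = <m>^q': output m, new state q'
    return q + (msg,), None       # data m: no output, new state q^<m>


def run(x, total=False):
    """Run the machine on a finite input; with total=True undefined steps yield 'error'."""
    q, y = (), []
    for msg in x:
        result = step(q, msg)
        if result is None:
            if not total:
                return None       # input is not in the service domain
            y.append("error")     # totalized machine: report the error, keep the state
            continue
        q, out = result
        if out is not None:
            y.append(out)
    return y


print(run(["a", "b", "req", "req"]), run(["req"]), run(["req", "a", "req"], total=True))
```

Dropping the outputs from step gives an acceptor for the service domain, which corresponds to the idea behind the assumption machine.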
Another interesting observation has to do with the state machine description of services. The description of a service by a state machine requires the introduction of a service state.
4.3.3. Extension and Refinement of Services
In this section we study how we can extend services and the service domain. Service extension introduces a first example of a relation between services. We introduce the general notion of service extension and illustrate it by examples. We can extend our examples of services in several ways.
Service Domain Extension
Given a service, we can extend its domain (if it is not total). This means that we allow for more input histories with specified output histories and thus enlarge its service domain.
Example. Domain Extension of the Service Queue
A simple way to extend the service of the queue is to allow for requests even if the queue is empty, by storing such requests and returning data only later, as soon as they become available. We show two versions: one where we assume that finally all requests can be served by data
Service: Queue1    untimed
in    x : QIn
out    y : QOut
$\{req\}\#x = Data\#y \;\wedge\; y \sqsubseteq Data\,©\,x$
and another one where requests are served only if data are available.
Service: Queue2    untimed
in    x : QIn
out    y : QOut
$\min(\{req\}\#x, Data\#x) = Data\#y \;\wedge\; y \sqsubseteq Data\,©\,x$
Queue2 is a total service while Queue1 is a partial service. Both services coincide with the service Queue on its service domain. □
Extending service domains is an essential step in making services total and thus error tolerant. It is also a step from services to components.
Example. Extensions to the Queue Service
The examples of services that we have introduced above can be further extended, for instance as follows:
(1) Extensions to the queue service:
• Query: How many elements are in the queue?
• Sort elements in the queue according to their priorities!
(2) Extensions to the calculator service:
• Add new operators
• Redefine operators
• Add an accumulator
(3) Extensions to the password service:
• Change password
• Make password invalid
These functions ("features") extend the functions and services specified above. It is not difficult to give precise specifications of these extended services. □
In the following we study ways to specify such extensions in a structured way on top of already specified services. We are, in particular, interested in the relationships between the different services.
Example. Extension of the Queue Service
We extend the Queue service by the possibility to ask how many elements are in the queue. We introduce a message #q that is answered by a natural number.
Service: ExtendedQueue    untimed
in    x : QIn ∪ {#q}
out    y : QOut ∪ ℕ
$Queue(QIn\,©\,x, QOut\,©\,y) \;\wedge\; \mathbb{N}\,©\,y = numbers(x, 0)$
Here we use the following auxiliary function
$$numbers : (QIn \cup \{\#q\})^* \times \mathbb{N} \to \mathbb{N}^*$$
We specify this function as follows (let $d \in Data$, $n \in \mathbb{N}$):
$numbers(\langle\rangle, n) = \langle\rangle$
$numbers(\langle \#q \rangle ^\frown x, n) = \langle n \rangle ^\frown numbers(x, n)$
$numbers(\langle req \rangle ^\frown x, n + 1) = numbers(x, n)$
$numbers(\langle d \rangle ^\frown x, n) = numbers(x, n + 1)$
We easily prove that the function numbers is uniquely defined for histories x which fulfill the assertion Queue-Aspt(x). □
In the example above we find two services - that of a Queue and that of an Extended Queue. Extension is a prominent example of a relation between services. Now we tackle the following questions:
• How to define structured relations between services formally?
• What about refinement of services?
• How to define multiplexing of services?
• How to combine services?
• How can we structure and relate families of services?
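Returning briefly to the Extended Queue example, the auxiliary function numbers can be sketched as a simple loop over a finite input. The Python lines below are an illustration under assumptions: the query message is the string "#q", the request is "req", every other message counts as a data element, and None signals that no equation applies (a request while the counter is zero). The default argument n=0 mirrors the call numbers(x, 0) in the template.

```python
def numbers(x, n=0):
    """Answers to the '#q' queries in x, starting with n stored elements."""
    out = []
    for m in x:
        if m == "#q":
            out.append(n)      # query: report the current number of elements
        elif m == "req":
            if n == 0:
                return None    # no equation applies: request on an empty queue
            n -= 1             # request removes one element
        else:
            n += 1             # data element is stored
    return out


print(numbers(["a", "#q", "b", "#q", "req", "#q"]))
```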
An essential notion to relate services and also components is that of refinement. Note that a component is a special case of a service (with an empty or a total domain) and therefore the following refinement concept works both for services and components. A simple relation between services is that of property refinement as introduced in Broy, Stølen [2].
Definition. Property Refinement
Given two service interfaces $F_1, F_2 \in IF[I \blacktriangleright O]$ the service $F_2$ is called a property refinement of the service $F_1$ if
$$\forall x \in \vec{I} : F_2.x \subseteq F_1.x$$
□
This is a very straightforward notion and corresponds to implication between the specifying assertions of the services $F_2$ and $F_1$. Property refinement is not the only relation we may introduce between services. Other notions are sub-services
A Theory for Requirements
Specification and Architecture
Design
and many more. In fact, there is a rich family of relations between the services of a system. As a relation between services, property refinement as defined above is both too restrictive and too liberal. We do not want to decrease the service domain in a refinement step. Rather, we want to allow more messages in the input and output streams, as long as they do not interfere with the required service. For instance, the extended queue in the example above is not in this property refinement relation. We therefore introduce a more general relation called the "sub-service relation". To prepare for that we specify some auxiliary notions. First we recall the idea of a subtype.
Definition. Sub-typing
Let $C$ and $C'$ be sets of typed channels (recall that SET(C) denotes the channel identifiers without their types) such that $\mathrm{SET}(C) \subseteq \mathrm{SET}(C')$ and for all channels $(c, T) \in C$, $(c, T') \in C'$ we have $\mathrm{CAR}(T) \subseteq \mathrm{CAR}(T')$. Then $C$ is called a subtype of $C'$ and we write
$C$ subtype $C'$
□
Based on the subtype relation we define the idea of projection.
Definition. History projection
Let $C$ and $C'$ be sets of typed channels with $C$ subtype $C'$. We define for histories $x \in \mathbb{H}[C']$ and channels $(c, T) \in C$
$$(x|C).c = \mathrm{CAR}(T) \,©\, (x.c)$$
$x|C$ is the restriction of the history $x \in \mathbb{H}[C']$ to the channels in the set $C$ and to the messages of the types of the channels in $C$. The mapping $\alpha : \mathbb{H}[C'] \to \mathbb{H}[C]$, where $\alpha(x) = x|C$, is called a sub-history projection.
□
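A history projection can be sketched in Python as follows, assuming that a history is a dictionary from channel names to finite tuples of messages and that a set of typed channels is a dictionary from channel names to carrier sets. Both representations, and the channel names used, are assumptions for illustration.

```python
def project(x, chans):
    """x|C: keep only the channels in chans and, on each, only messages of its carrier set."""
    return {c: tuple(m for m in x[c] if m in carrier)
            for c, carrier in chans.items()}

# a history over the larger channel set C' (channel names are hypothetical)
x = {"in": ("a", "#q", "b"), "admin": ("reset",)}
# the smaller typed channel set C: one channel, carrying only data messages
C = {"in": {"a", "b"}}

sub_history = project(x, C)
```

The channel "admin" and the query message "#q" do not belong to C, so they disappear from the sub-history, while the order of the remaining messages is preserved.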
A sub-history is the projection of a history with respect to certain channels and their types. In the sub-history x|C we keep only those channels and types of messages of the history x that belong to the typed channels in C. Based on the concept of projection we define restriction. It allows us to concentrate on a certain sub-behavior of a given, more comprehensive behavior.
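Definition. Restriction
Given a service interface $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$ and a syntactic interface $(I_1 \blacktriangleright O_1)$ where $I_1$ subtype $I_2$ and $O_1$ subtype $O_2$, we define the restriction $F_2|(I_1 \blacktriangleright O_1) \in \mathbb{F}[I_1 \blacktriangleright O_1]$ of the service $F_2$ to the syntactic interface $(I_1 \blacktriangleright O_1)$ for all input histories $x \in \mathbb{H}[I_1]$ as follows
$$F_2|(I_1 \blacktriangleright O_1).x = \{\, y|O_1 : \exists x' \in \mathbb{H}[I_2] : x = x'|I_1 \wedge y \in F_2.x' \,\}$$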
This notion of restriction applies both to services and to component specifications.
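The restriction of a service can be sketched in Python for services given as finite tables. The sketch assumes hashable histories (frozen sets of channel/stream pairs) and a table that lists only finitely many input histories; the helper names hist, project and restrict are hypothetical.

```python
def hist(**channels):
    """Build a hashable history from channel=stream keyword arguments."""
    return frozenset(channels.items())

def project(h, chans):
    """x|C: keep the channels in chans and, on each, only messages of its carrier set."""
    streams = dict(h)
    return frozenset((c, tuple(m for m in streams[c] if m in car))
                     for c, car in chans.items())

def restrict(f2, i1, o1):
    """Restriction of f2 to (i1, o1): project every input and all outputs listed for it."""
    table = {}
    for x2, outputs in f2.items():
        table.setdefault(project(x2, i1), set()).update(project(y, o1) for y in outputs)
    return table

I1 = {"x": {"a", "req"}}   # input channel of the plain queue
O1 = {"y": {"a"}}          # output channel of the plain queue
extended = {
    hist(x=("a", "req")): {hist(y=("a",))},
    hist(x=("a", "#q", "req")): {hist(y=(1, "a"))},
}
restricted = restrict(extended, I1, O1)
```

Both entries of the extended table project to the same queue input, and the numeric answer to the query is filtered out of the output.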
□
Restriction allows us to define a notion of refinement that is more general and more appropriate for relating services than property refinement.
Definition. Refinement
Given two service interfaces $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ and $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$, where $I_1$ subtype $I_2$ and $O_1$ subtype $O_2$, we call the service $F_2$ a service refinement of $F_1$ if for all input histories $x \in \mathbb{H}[I_1]$
$$F_2|(I_1 \blacktriangleright O_1).x \subseteq F_1.x$$
Then we write
$$F_1 \rightsquigarrow F_2$$
This notion of refinement applies again both to services and to component specifications.
□
Note that this refinement notion is a generalization of the notion of property refinement as introduced above (see also Broy, Stølen [2]). In contrast to the refinement notion in Broy, Stølen [2], where the syntactic interfaces of the two components are required to be identical, we permit enlarging the number of channels and their types in a refinement. The extended queue is a refinement of the queue service. The refinement relation represents a partial order on the set of services. One service may be the refinement of several quite unrelated services. Nevertheless, this notion of refinement is a bit too liberal. The paradoxical service where all sets of possible outputs are empty always defines a refinement. Methodologically we rather insist that the service domains are not decreased.
Definition. Service Refinement, sub-service relation
A service $G \in \mathbb{F}[I \blacktriangleright O]$ is refined by a service $F$ if for all input histories $x \in \mathrm{dom}.G$ we have
$$F|(I \blacktriangleright O).x \subseteq G.x$$
and in addition
$$\mathrm{dom}.G \subseteq \mathrm{dom}.(F|(I \blacktriangleright O))$$
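The two conditions of the sub-service relation can be checked mechanically on finite tables. The Python sketch below reads the first condition as "the restriction of F to the interface of G allows only outputs that G allows", which is the reading consistent with the refinement definition; the table representation, the queue data and all helper names are assumptions for illustration.

```python
def hist(**channels):
    """Build a hashable history from channel=stream keyword arguments."""
    return frozenset(channels.items())

def project(h, chans):
    streams = dict(h)
    return frozenset((c, tuple(m for m in streams[c] if m in car))
                     for c, car in chans.items())

def restrict(f, i, o):
    table = {}
    for x, outputs in f.items():
        table.setdefault(project(x, i), set()).update(project(y, o) for y in outputs)
    return table

def is_subservice(g, f, i, o):
    """g is a sub-service of f: on dom(g) the restricted f is non-empty and within g."""
    fr = restrict(f, i, o)
    dom_g = [x for x, ys in g.items() if ys]
    return all(fr.get(x) and fr[x] <= g[x] for x in dom_g)

I, O = {"x": {"a", "req"}}, {"y": {"a"}}
queue = {hist(x=()): {hist(y=())},
         hist(x=("a",)): {hist(y=())},
         hist(x=("a", "req")): {hist(y=("a",))}}
extended = dict(queue)
extended.update({hist(x=("#q",)): {hist(y=(0,))},
                 hist(x=("a", "#q", "req")): {hist(y=(1, "a"))}})
broken = dict(extended)
broken[hist(x=("a", "req"))] = {hist(y=())}   # loses the stored element
```

On this data the queue is a sub-service of the extended queue, but not of the broken variant.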
A service G is refined by a service F if all the messages relevant for the service G occur in F as required by G. The service domain must not be made smaller. Outside of the domain of G we can introduce new reactions. Then we say that F offers G or that G is a sub-service of F. We then write $G <_{sub} F$.
□
The sub-service relation, in fact, implies refinement. In other words, the sub-service relation is a specialization of refinement. To give a comprehensive set of relations between services of a component and to specify the precise semantics of these relations is a major piece of work. In this paper we give only formal definitions for some of these relations.
Service Granularity Refinement
A quite general relation between services is that of a service granularity refinement. This form of refinement allows us to replace channels by several channels and messages by several messages and vice versa.
Definition. Service Granularity Refinement
A service granularity refinement for the service $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ into a service $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$ is given by two pairs of functions
$$\mathrm{AI} \in \mathbb{F}[I_2 \blacktriangleright I_1], \quad \mathrm{AO} \in \mathbb{F}[O_2 \blacktriangleright O_1]$$
$$\mathrm{RI} \in \mathbb{F}[I_1 \blacktriangleright I_2], \quad \mathrm{RO} \in \mathbb{F}[O_1 \blacktriangleright O_2]$$
where (AI, RI) and (AO, RO) are refinement pairs such that the behaviours $\mathrm{RI} \circ \mathrm{AI}$ and $\mathrm{RO} \circ \mathrm{AO}$ are the identity (for a set of typed channels $C$, the identity is the behaviour $\mathrm{Id} : \vec{C} \to \wp(\vec{C})$ with $\mathrm{Id}.x = \{x\}$, and the function composition $\circ$ is specified for functions $F : \vec{C} \to \wp(\vec{C'})$, $F' : \vec{C'} \to \wp(\vec{C''})$ by $(F \circ F').x = \{z \in F'.y : y \in F.x\}$) and the following equation holds
$$F_1 = \mathrm{RI} \circ F_2 \circ \mathrm{AO}$$
We may even generalize service granularity refinement by replacing the equality in the final equation in the definition above by property refinement, the refinement relation $\rightsquigarrow$ or the sub-service relation $<_{sub}$.
□
Granularity refinement is a very general and powerful notion. It supports the refinement of services by replacing one message by many and one channel by many and vice versa.
Equivalence of services
There are many ways to introduce different services which in an abstract sense offer the same functionality. The relation
"service A is a granularity refinement of service B" as defined above, in fact, represents an equivalence relation. It captures "equivalence" of services in terms of different service access dialogs. The equivalence of services is of major interest not only from a theoretical point of view but also from a very practical point of view. When comparing multi-functional systems it is a practically relevant question whether two systems offer the same services.
4.3.4. Structuring Multi-Service Systems
Multi-functional systems can incorporate many different, in principle independent, functions which in our terminology offer different services. So they can be seen as providing different use cases of a system. Our basic idea is that we should be able to reduce the complexity of the user functionality of a system by describing each of its singular use cases independently by services and then later define useful relationships between individual services that influence each other. Typically some of the services are completely independent and are just grouped together to get a system which offers a collection of functionalities. In other cases services may influence each other, where some of the services have only small, often not very essential side effects on other services, while some services may heavily rely on other services and influence their behavior in some very essential way.
Structuring a System into a Hierarchy of Services
A component can actually implement many independent services. In fact, we can structure the overall functionality of a multi-functional system into a hierarchy of its sub-services. We may decompose each component into a family of sub-services and each of these services again and again into families of their sub-services. To understand the user functionality of a system requires the understanding of the single services but also understanding how they are related. Our vision here is that we can introduce a number of characteristic relations between the services of a system such that in the end we describe a system structure by just mentioning which services are available and how they are related. Each of the individual services is then described in isolation. Today's information processing systems offer many services as part of the functionality of one system. We speak of multi-functional systems. We first give an informal description of a simple example of a multi-functional system.
Example. Communication
We look at the example of a simple communication network. It has three subinterfaces and the following global services:
• User identification and access control
• Communication
• User administration
At a finer-grained service level we may have sub-services like
• User log-in and identification
• Password change
• Sending Message
• Receiving Message
• Calibration of the quality of services
• User introduction
• User deletion
• Change of user rights
All these extended services can be described by the specification techniques as introduced above.
□
To obtain a comprehensive view onto the hierarchy of services we introduce a notion of user roles such as in our example:
• SAP A and B (SAP = Service Access Point)
• Administrator
Now we can relate roles and services, as well as services and other services. Typical characteristic relations between services are
• Uses
• Enables
• Changes
• Interferes (feature interaction)
• Is combined of
• Is sequentially composed of
There are, of course, many more practically interesting relations between two services A and B besides the sub-service relation. Informal examples of such relations for services A and B are:
• A and B are mutually independent
• A affects B
• A controls B
• A includes B
• A interferes with B
It is not difficult to dream up a number of further such relations. Our approach, in fact, offers the possibility to give a formal definition to all of these. We only tackle a few fundamental examples. Based on our definition of sub-services we define the very important notion of combinability and independence of services. Since we have a formal notion of service we are actually able to give formal definitions for such relations that relate services. We do not go deeper into the question, however, which of the relations are most useful from a methodological point of view.
Relating Services
In this section we study a number of specific relations between services. The most
significant relation for requirements engineering is that of a sub-service as it has been introduced above already.
[Figure: the Transmission Network component with the interfaces SAP A, SAP B and Administrator and the channels x1, r1, x2, r2, x3, r3 of type Data. It offers the services User identification (Password change, Identification), Communication Service (Sending message, Receiving message) and User Administration (Calibration quality of services, User introduction, User deletion).]
Fig. 4.7. Service Hierarchy - the lines represent the sub-service relation
The set of services offered by a component can informally be described as shown in Fig. 4.7. In fact, it can be structured into a hierarchy of services and sub-services as shown in Fig. 4.8. By such a service hierarchy we structure a system into a hierarchy of services that are in the sub-service relation.
[Figure: tree with root Transmission Network Service and the sub-services User Identification Service (Password Login, Password Change), Communication Service (Sending, Receiving) and User Admin (Calibration QoS, User Introduction, User Deletion).]
Fig. 4.8. Service Hierarchy - the lines represent the sub-service relation
The definition of combinability of services basically uses the following idea. Two services are called combinable, if there exists a component that offers both services.
Definition. Combinability of services
The combinability of two services F1 and F2 is defined as follows. F1 and F2 are
called combinable, if there exists a service F such that
$$F_1 <_{sub} F \;\wedge\; F_2 <_{sub} F$$
Otherwise we speak of feature in-combinability between the services F1 and F2.
□
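For two services over the same syntactic interface, combinability reduces to a simple check: wherever both services are defined, they must agree on at least one output history. The Python sketch below makes this simplifying assumption (identical interfaces, services as finite tables), so no projection is needed; all names and the example data are hypothetical.

```python
def domain(f):
    """Input histories for which the service allows at least one output."""
    return {x for x, ys in f.items() if ys}

def combinable(f1, f2):
    """True if some service can offer both f1 and f2 (same interface assumed)."""
    return all(f1[x] & f2[x] for x in domain(f1) & domain(f2))

f1 = {("a",): {("ok",), ("done",)}}
f2 = {("a",): {("ok",)}, ("b",): {("no",)}}
f3 = {("a",): {("err",)}}
```

In this example f1 and f2 are combinable because both accept the output ("ok",) for the shared input, whereas f1 and f3 make contradictory requirements on the same input.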
Combinability basically means that the services do not show contradictory requirements for their output histories. Note that mutually combinable services may share messages, since the services may nevertheless have joint input and output messages. If combinable services share input messages we cannot select the input triggering the output for both services independently. The same holds if the services share output messages. Combinability does not imply the logical independence of services. Since they may share input and output messages they might not be independent. Actually there are several ways we may define the independence of services. A very general and strict notion of independent combinability is obtained as follows.
Definition. Independent combinability of services
Let $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ and $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$ be combinable services. The service $F_1$ is called independently combinable with service $F_2$ if there exists a service $F$ such that $F_1 <_{sub} F$ and $F_2 <_{sub} F$ and for all histories $x_1 \in \mathrm{dom}.F_1$ and $y_1 \in F_1(x_1)$ as well as $x_2 \in \mathrm{dom}.F_2$ and $y_2 \in F_2(x_2)$ we have
$$(y_1 \oplus y_2) = F(x_1 \oplus x_2)$$
□
Independent combinability of services F1 and F2 means that we can design a multi-functional system incorporating both the services F1 and F2 where the access to the service F1 by choosing the input messages is completely independent of the access to the service F2. By this definition sub-services that are interleaved do not share any of their input or output messages. Independent combinability of service F1 with service F2 means that whatever input we choose for the services F1 and F2 we can get any behavior for service F1. This behavior of F1 does not depend on the chosen input to trigger service F2. Independent combinability of two services $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ and $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$ implies that the two services are independent. F1 and F2 are then mutually independent. This means that we do not have to refine the services F1 or F2 to be able to combine them. This independence does not mean that their input histories are disjoint. Independent combinability of services is not always what is required when combining services into systems. We may be interested in situations where a service can only be accessed under certain conditions. Now we give a formalization of the notion of independence between services, which is not a symmetric relation.
Definition. Independence of Services
Let $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ and $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$ be given services. The service $F_1$ is called independent of service $F_2$ if there exists a service $F$ such that $F_1 <_{sub} F$
and $F_2 <_{sub} F$ and for each input history $x_1 \in \mathrm{dom}.F_1$ and each input history $x_2 \in \mathrm{dom}.F_2$ there exists an input history $x \in \mathrm{dom}.F$ where for each output history $y \in F.x$ we have
$$x_1 = x|I_1 \;\wedge\; x_2 = x|I_2 \;\wedge\; y|O_1 \in F_1(x_1)$$
□
This means that we can select the input for the services F1 and F2 independently and that the output for service F1 is not influenced by the choice of the input for service F2. Independence as defined above is not a symmetric relation. For example, the Queue service is independent of the Number_in_the_Queue service but not vice versa. Service independence is a very simple and general notion. However, often services are not mutually independent. If we change the password, this affects the password service; if we introduce a new user or delete a user, this may also affect the communication service or the cash machine services, but not vice versa. This leads to another relation between services, namely how they may depend on each other. Following this line we discuss various notions of service dependency. At first sight we get two versions of dependency:
• enabledness, where a service F2 is only enabled (i.e. working correctly) in certain situations controlled by the service F1,
• influence, where there is an information flow from the independent service F1 to the dependent service F2.
A specific relation between two services is the controlling of a service by another one. An example is a system where a password service is used to protect another service.
Definition. Service Control
Let $F \in \mathbb{F}[I \blacktriangleright O]$, $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ and $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$ be services and $I'$ be the least type with $I'$ subtype $I$ such that
$I_1$ subtype $I'$ and $I_2$ subtype $I'$
We say that a service $F_1$ controls a service $F_2$ in $F$ if for every $x_2 \in \mathrm{dom}.F_2$ there exist $z_1, z_2 \in \vec{I'}$ with $z_1|I_1, z_2|I_1 \in \mathrm{dom}.F_1$, $x_2 = z_1|I_2$ and $x_2 = z_2|I_2$ such that for all $x \in \mathrm{dom}.F$ we have
$$x|I' = z_1 \;\Rightarrow\; F(x)|O_2 \subseteq F_2(x_2)$$
$$x|I' = z_2 \;\Rightarrow\; F_2(x_2) \cap F(x)|O_2 = \emptyset$$
In other words, the input messages of service F1 in the input $x \in \mathrm{dom}.F$ determine whether we get a proper service as specified by F2. We then write
F1 controls F2 in F
□
This relationship between services expresses that one service depends on the other one in the sense that the first service can only be properly accessed if the second
one is operated accordingly. An example where a service controls another one is the password service for the queue service. Service control means that the service F2 is only offered by the system F if the service F1 is used side by side in a correct way. The definition essentially expresses that on a subset of input histories that are controlled by service F1 the service F2 is granted. The introduced relations between services now allow us to study the individual relations between our service examples.
Example. Relating Services
For our examples we obtain the following relations:
• CalService and Queue are combinable
• PswService(QIn) $<_{sub}$ QueuePsw
• QueuePsw and Queue are not combinable
• PswService(QIn) controls Queue in QueuePsw
□
For getting a structured view onto the services of a system, it is helpful to decombine the system into a set of elementary services and to make all essential relations between them explicit. We can generalize granularity refinement from components to services, too.
4.3.5. Combining Services
In principle there are several ways of combining services out of given services. First of all we can try to combine more elaborate services from given services. In the simple case we just put together services that are more or less independent into a multi-functional system. A second possibility is to compose services. That means that we build up complex services from given basic services by composition; in a certain sense these reflect the old services but packaged in a different style. A typical example would be the introduction of a password restriction on services. Using this idea of composition we can think more in an architectural style. There we use a number of sub-services, which are not available to the user at the user interface at all, as auxiliary services to combine a service. In this case we rather talk about architectures of services. This idea of service composition is studied in detail in the following chapter. We are interested not only in the specification of services in isolation but also in the specification of how the basic services of a system interact and form a complete specification of a multi-functional system. A typical way to put together services that way is to make sure that only histories within the domain of the auxiliary services are produced.
Definition. Service Combination
The combination of the two services F1 and F2 is only defined if they are combinable; then we denote by
$$F_1 \oplus F_2$$
the least service F with the property
$$F_1 <_{sub} F \;\wedge\; F_2 <_{sub} F$$
$F_1 \oplus F_2$ is called the service combination of F1 and F2.
□
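A possible finite reading of service combination, again under the simplifying assumption that both services share one syntactic interface and are given as tables: on shared inputs the combined service allows exactly the outputs both services allow, elsewhere it behaves like the one service that is defined. Interpreting "least" as this most liberal common service on the joint domain is an assumption of the sketch, as are all names.

```python
def domain(f):
    """Input histories for which the service allows at least one output."""
    return {x for x, ys in f.items() if ys}

def combine(f1, f2):
    """Sketch of the combination of f1 and f2 for services on the same interface."""
    d1, d2 = domain(f1), domain(f2)
    combined = {}
    for x in d1 | d2:
        if x in d1 and x in d2:
            outputs = f1[x] & f2[x]
            if not outputs:
                raise ValueError(f"services are not combinable on input {x!r}")
        else:
            outputs = f1[x] if x in d1 else f2[x]
        combined[x] = set(outputs)
    return combined

f1 = {("a",): {("ok",), ("done",)}}
f2 = {("a",): {("ok",)}, ("b",): {("no",)}}
both = combine(f1, f2)
```

The result offers both services: its domain covers the domains of f1 and f2, and on each of them it only produces outputs the respective service allows.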
By service combination we can build up multi-functional systems from elementary services.
4.3.6. Service Hierarchies
In this section we illustrate the ideas of service hierarchies. Service hierarchies can be seen as a formalization of feature trees complemented by more sophisticated relations.
Service Relation Diagrams
The relations between services open the door to a structuring of the user functionality of multi-functional systems. In requirements engineering it is helpful to introduce specific relational diagrams that show all the relations between the services.
[Figure: service network with the services Body control, Window control, Child lock, Locking control, Right back window and Keyless lock, connected by arrows of the kinds "Subfunction" and "Controls".]
Fig. 4.9. Service relation diagram
Fig. 4.9 shows an example of a service network (also called a function net). Each service is represented by a box. The arrows between the services show the characteristic relations.
Properties of Relations between Services
The relations between services as introduced above show specific properties. This leads to a number of theorems about relations. Basically we have introduced the following relations between services:
+ is independent
+ is subservice
+ is consistent with
Now we can analyze the logical properties of these relations. We give only a simple example. The sub-service relation is a partial order. The relations on the services allow us to reason about services at a higher level. So we can get a calculus of relations between services.
4.4. Architectures: Composing Components and Services
In this chapter we study the composition of services and components. First we introduce the composition of two components or services. Then we extend the composition to a family of components forming an architecture. Then we study a service-oriented view onto architectures. The main idea is that we get a service hierarchy for the external services of the system and we relate that to the service hierarchy of the components of the architecture.
4.4.1. Service Composition
In this section we study the composition of components and of services. Services and components are composed by parallel composition with feedback along the lines of Broy, Stølen [2].
Definition. Composition
Given two service interfaces $F_1 \in \mathbb{F}[I_1 \blacktriangleright O_1]$ and $F_2 \in \mathbb{F}[I_2 \blacktriangleright O_2]$, we define a composition for the feedback channels $C_1 \subseteq O_1 \cap I_2$ and $C_2 \subseteq O_2 \cap I_1$ by
$$F_1[C_1 \leftrightarrow C_2]F_2$$
The component $F_1[C_1 \leftrightarrow C_2]F_2$ is defined as follows (where $z \in \mathbb{H}[I_1 \cup O_1 \cup I_2 \cup O_2]$, $x \in \mathbb{H}[I]$, and $I = I_1 \setminus C_2 \cup I_2 \setminus C_1$):
$$(F_1[C_1 \leftrightarrow C_2]F_2).x = \{\, z|(O_1 \setminus C_1) \cup (O_2 \setminus C_2) : x = z|I \;\wedge\; z|O_1 \in F_1(z|I_1) \;\wedge\; z|O_2 \in F_2(z|I_2) \,\}$$
The channels in $C_1 \cup C_2$ are called internal for the composed system $F_1[C_1 \leftrightarrow C_2]F_2$.
□
The idea of the composition of components and services as defined above is shown in Fig. 4.10. In a composed component $F_1[C_1 \leftrightarrow C_2]F_2$ the channels in the channel sets $C_1$ and $C_2$ are used for internal communication.
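Composition with feedback can be sketched in Python for services given as finite tables. Instead of ranging over all histories z, the sketch pairs the table entries of the two services and keeps the pairs that agree on the feedback channels. It assumes that the external input channels of the two services are disjoint; the names hist and compose and the example pipeline are hypothetical.

```python
def hist(**channels):
    """Build a hashable history from channel=stream keyword arguments."""
    return frozenset(channels.items())

def compose(f1, f2, c1, c2):
    """f1[c1 <-> c2]f2, with c1 sent by f1 and read by f2, and c2 the other way round."""
    table = {}
    for x1, ys1 in f1.items():
        for x2, ys2 in f2.items():
            in1, in2 = dict(x1), dict(x2)
            for y1 in ys1:
                for y2 in ys2:
                    out1, out2 = dict(y1), dict(y2)
                    # the streams on the feedback channels must coincide
                    if any(out1[c] != in2[c] for c in c1):
                        continue
                    if any(out2[c] != in1[c] for c in c2):
                        continue
                    # external input and output: everything except the internal channels
                    x = {c: s for c, s in in1.items() if c not in c2}
                    x.update((c, s) for c, s in in2.items() if c not in c1)
                    y = {c: s for c, s in out1.items() if c not in c1}
                    y.update((c, s) for c, s in out2.items() if c not in c2)
                    table.setdefault(frozenset(x.items()), set()).add(frozenset(y.items()))
    return table

double = {hist(a=(1,)): {hist(m=(2,))}, hist(a=(2,)): {hist(m=(4,))}}
increment = {hist(m=(2,)): {hist(b=(3,))}, hist(m=(4,)): {hist(b=(5,))}}
pipeline = compose(double, increment, c1={"m"}, c2=set())
```

In the example the channel m becomes internal, so the composed table maps the external input on channel a directly to the external output on channel b.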
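[Figure: components F1 and F2 connected by the internal channel sets C1 and C2, with the external input channels I1\C2 and I2\C1 and the external output channels O1\C1 and O2\C2.]
Fig. 4.10. Composition $F_1[C_1 \leftrightarrow C_2]F_2$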
Parallel composition of independent sets of internal channels is associative. If the following simple syntactic condition is valid
$$(I \cup O) \cap (I' \cup O') = \emptyset$$
then
$$(F_1[I \leftrightarrow O]F_2)[I' \leftrightarrow O']F_3 = F_1[I \leftrightarrow O](F_2[I' \leftrightarrow O']F_3)$$
The proof of this equation is straightforward. The set of services and the set of components form together with the introduced composition operators an algebra. The composition of components (strictly causal stream functions) yields components and the composition of services yields services. Composition is a partial function on the set of all components and the set of all services. It is only defined if the syntactic interfaces fit together.
4.4.2. Architectures
If we compose several components by the composition operator as introduced above we get an architecture. An architecture is an interactive composed system consisting of a family of interacting components (in some approaches also called agents or objects). These components interact by exchanging messages via the channels which connect them. A network of communicating components provides a structural system view, also called system architecture. Its nodes represent components and its arcs communication lines (channels) on which streams of messages are sent. We model architectures by composed systems represented by data flow nets. Let $K$ be a set of identifiers for components and $I$ and $O$ be sets of input and output channels, respectively. A composed system $(\nu, O)$ with syntactic interface $(I \blacktriangleright O)$ is represented by the mapping
$$\nu : K \to \mathbb{F}$$
that associates with every node a component behavior in the form of a black box view, formally, an interface behavior given by an I/O-function. $O$ denotes the output channels of the system. As a well-formedness condition for forming a net from a set of component identifiers $K$, we require that for all component identifiers $k, j \in K$ (with $k \neq j$) the sets of output channels of the components $\nu(k)$ and $\nu(j)$ are disjoint. This is formally guaranteed by the condition
$$k \neq j \;\Rightarrow\; \mathrm{Out}(\nu(k)) \cap \mathrm{Out}(\nu(j)) = \emptyset$$
In other words, each channel has a uniquely specified component as its source (footnote b). We denote the set of all (internal and external) channels of the net by the equation
$$\mathrm{Chan}((\nu, O)) = O \cup \{c \in \mathrm{In}(\nu(k)) : k \in K\} \cup \{c \in \mathrm{Out}(\nu(k)) : k \in K\}$$
The set
$$I = \mathrm{Chan}((\nu, O)) \setminus (O \cup \{c \in \mathrm{Out}(\nu(k)) : k \in K\})$$
denotes the set of input channels of the net. $CS_{\mathbb{F}}$ denotes the set of all composed systems with component behaviors represented by interfaces. The composition of all the components or services forming an architecture forms a service or component again. It is obtained by the iterative application of composition.
Definition. Composition and Interface Abstraction
Let $k \in K$ and all definitions as above. For simplicity we assume that all channels have at most one source and one target, i.e. the sets of input channels are pairwise disjoint and the sets of output channels are pairwise disjoint. Let $C_k$ and $E_k$ be the sets of channels defined by $C_k = I_k \cap O$ and $E_k = O_k \cap I$ where
$$O = \bigcup_{k \in K} O_k \quad \text{and} \quad I = \bigcup_{k \in K} I_k$$
We define the composition inductively as follows:
$$\otimes\nu = \otimes\nu|(K \setminus \{k\}) \, [C_k \leftrightarrow E_k] \, \nu(k)$$
According to the commutativity and associativity of composition the choice of $k \in K$ does not affect the result. The result of composing a system in terms of the interface behavior of the composed system is defined by the composition of all the interfaces of the subsystems.
□
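The syntactic side of an architecture (well-formedness, the set of all channels, and the derived input channels) can be sketched in Python. The sketch assumes that a net is a dictionary from component identifiers to a pair of input and output channel sets; the component and channel names are hypothetical.

```python
from itertools import combinations

def well_formed(net):
    """Distinct components must have disjoint sets of output channels."""
    return all(net[k][1].isdisjoint(net[j][1]) for k, j in combinations(net, 2))

def channels(net, external_out):
    """Chan: external outputs plus all input and output channels of the components."""
    chans = set(external_out)
    for ins, outs in net.values():
        chans |= ins | outs
    return chans

def input_channels(net, external_out):
    """Channels of the net that no component produces: their source is the environment."""
    produced = set().union(*(outs for _, outs in net.values()))
    return channels(net, external_out) - (set(external_out) | produced)

net = {
    "sender": ({"x1"}, {"m"}),      # reads x1, writes the internal channel m
    "receiver": ({"m"}, {"r1"}),    # reads m, writes r1
}
O = {"r1"}
```

For this net the channel m is internal, x1 is the only input channel of the composed system, and r1 its only output channel.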
Fig. 4.11 shows an example of a simple component architecture for our running example of a communication network.
4.4.3. Services as Aspects of Architectures
Just as we decompose the user functionality of a system in terms of services we may also decompose the functionality of components of an architecture into sub-services offered by the component. This allows us to get a restricted view onto systems. In particular, an interesting way of using such a structuring approach is to decompose a multi-functional system into a component architecture, to decombine each component of the architecture into a number of services and then to indicate which services of the component are used to offer the services of the composed system.
Footnote b: Channels that occur as input channels have the environment as their source.
[Figure: component architecture of the communication network, including a Sender and a Receiver component; the remaining component labels are not recoverable.]
Fig. 4.11. System component architecture
Assume we have a component or a service $F$, composed of a set $K$ of services or components such that $F = \otimes\nu$, and $F$ is a combination of a set of services
$$F = F_1 \oplus \dots \oplus F_n$$
Then we can ask about a set of components $K$ with $\nu'$ such that
$$F_i = \otimes\nu' \quad \text{and} \quad \nu'.k <_{sub} \nu.k$$
If $\nu'$ is the least valuation of $K$ with such a property (which is unique and exists) then $\nu'$ is called the aspect view of service $F_i$ onto the architecture $\nu$ of $F$ (see Filman et al. [6]).
4.5. Summary and Outlook
The structured specification of multi-functional systems is a domain of requirements engineering that is not sufficiently understood so far. Services can help to provide a foundation for model-driven requirements engineering for multi-functional systems. What we have presented is only a first step into the theory of multi-functional systems. We mainly introduced some basic notions but did not cover many relevant issues. This includes aspects of theory as well as aspects of practice and engineering. We presented this quite theoretical setting of mathematical models of services, architectures, service combination, composition and relations between them also to
demonstrate how rich and flexible the tool kit of mathematical models is and how far we are in integrating and relating them within the context of software design questions. In our case the usage of streams and stream processing functions is the reason for the remarkable flexibility of our model tool kit and the simplicity of the integration. Second, we are interested in a simple and basic model of a service and an architecture just strong and rich enough to capture all relevant notions. It is our hope that services and components provide the appropriate foundation for early phases of software development. Software development is a difficult and complex engineering task. It would be very surprising if such a task could be carried out properly without a proper theoretical framework. It would at the same time be quite surprising if a purely scientifically theoretical framework would be the right approach for the practical engineer. The result has to be a compromise, as we have argued, between formal techniques and theory on one side and intuitive notations based on diagrams on the other. Work is needed along those lines including experiments and feedback by experience gained from practical applications.
Acknowledgements
It is a pleasure to thank Andreas Rausch and Bernhard Rumpe for stimulating discussions and helpful remarks on draft versions of the manuscript.
Bibliography
1. M. Broy, C. Hofmann, I. Krüger, M. Schmidt: Using Extended Event Traces to Describe Communication in Software Architectures. In: Joint 1997 Asia Pacific Software Engineering Conference and International Computer Science Conference (APSEC'97/ICSC'97), 203-212
2. M. Broy, K. Stølen: Specification and Development of Interactive Systems: FOCUS on Streams, Interfaces, and Refinement. Springer 2001
3. M. Broy: Modeling Services and Layered Architectures. In: H. König, M. Heiner, A. Wolisz (Eds.): Formal Techniques for Networked and Distributed Systems. Berlin 2003, Lecture Notes in Computer Science 2767, Springer 2003, 48-61
4. P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, R. Nord, J. Stafford: Documenting Software Architectures: Views and Beyond. The SEI Series in Software Engineering. Addison Wesley Professional 2002
5. E.W. Dijkstra: Notes on Structured Programming. In: O.-J. Dahl, C.A.R. Hoare, E.W. Dijkstra: Structured Programming. Academic Press, New York 1972
6. R. Filman, T. Elrad, S. Clarke, M. Aksit: Aspect-Oriented Software Development. Addison Wesley Professional 2004
7. D. Herzberg, M. Broy: Modeling Layered Distributed Communication Systems. Formal Aspects of Computing 2004
8. I. Jacobson: Use Cases and Aspects - Working Seamlessly Together. Journal for Object Technology 2:4, July/August 2003, 7-28
9. I. Krüger, R. Grosu, P. Scholz, M. Broy: From MSCs to statecharts. In: Proceedings of DIPES'98, Kluwer, 1999, 61-72
10. D. Parnas: On the criteria to be used to decompose systems into modules. Comm. ACM 15, 1972, 1053-1058
11. Ch. Prehofer: Plug-and-Play Composition of Features and Feature Interactions with Statechart Diagrams. Sosym 2004, Springer-Verlag, 212-234
12. B. Selic, G. Gullekson, P.T. Ward: Real-time Object-oriented Modeling. Wiley, New York 1994
13. M. Spivey: Understanding Z - A Specification Language and Its Formal Semantics. Cambridge Tracts in Theoretical Computer Science 3, Cambridge University Press 1988
14. G. Booch, J. Rumbaugh, I. Jacobson: The Unified Modeling Language for Object-Oriented Development, Version 1.0, Rational Software Corporation
15. P. Zave, M. Jackson: Four dark corners of requirements engineering. ACM Transactions on Software Engineering and Methodology, January 1997, 1-30
Chapter 5 Component: From Mobile to Channels
Frank S. de Boer†, Marcello M. Bonsangue‡,* and Juan V. Guillen-Scholten§
† CWI, Amsterdam, The Netherlands, Frank.de.Boer@cwi.nl
‡ LIACS, Leiden University, The Netherlands, [email protected]
§ CWI, Amsterdam, The Netherlands, [email protected]

In this chapter we introduce a formal model of components which extends object-orientation with additional structuring and abstraction mechanisms to support a modelling discipline based on interfaces. The component model formalizes the concepts of interfaces, roles, connectors, and ports. Components encapsulate their internal class structure and interact only through a certain kind of objects which are called ports. Ports are instances of classes which are represented by roles. Roles export information about the required and provided operations of these classes by means of interfaces. By means of connectors which wire roles of different components together, ports of one component can dynamically create ports of another component. As an example, we show how to model mobile channels for the dynamic reconfiguration and exogenous coordination of components.
5.1. Introduction
UML has become the de facto standard language for specifying, modelling, documenting, and visualizing software systems. The basic innovative ideas of UML, which are the main reasons for its popularity, are the unification of the concepts and notations used in the life-cycle of software development as well as the recognition of the importance of modelling and analysis as a means to improve quality. UML consists of a number of diagrams used for expressing the requirements of the system (use case diagrams), for specifying the structure of the system (class diagrams) and the behavior of the system (state diagrams, activity diagrams, sequence diagrams).

In this chapter we introduce a formal model of components which extends a class-based model of concurrent objects. The component model generalizes the basic concepts of object-orientation by an additional structuring and abstraction mechanism which allows a modelling discipline based on interfaces. Components make it possible to structure the class diagrams of a UML model, and use interfaces to abstract from the internal details of these encapsulated class diagrams. The basic concept of interfaces introduced in this chapter resolves in a natural way the fundamental asymmetry of object-oriented systems: objects declare the services they provide but not the services they require.

The component model is based on a formal semantics of a very simple but computationally powerful subset of UML which includes class and state diagrams. This semantics is extended to the component model by considering only the externally observable behavior of a component. The resulting semantics is compositional in terms of the component structuring and abstraction mechanisms. Such a semantics provides a formal justification of the modelling-to-interfaces discipline.

We also briefly sketch how components can be used to model more complex mechanisms for the coordination of objects which involve an intricate combination of the asynchronous communication supported by an event-driven computational model (along the lines of the Actor model [1]) and the synchronous communication supported by the usual rendez-vous mechanisms of operation calls in object-orientation. Finally, we show how to extend the component model with mobile channels for the exogenous inter-component coordination, which provides a clear separation of concerns between communication (or coordination) and computation.

This chapter is structured as follows: Section 2 introduces a basic subset of UML and its formal semantics. Then we extend this subset with a formal model of components. Section 4 proceeds with a discussion of how to use mobile channels for the exogenous inter-component coordination. Finally, in Section 5 we draw some conclusions.

* The research of Marcello Bonsangue has been made possible by a fellowship of the Royal Netherlands Academy of Arts and Sciences.
Acknowledgments
The work reported here has been partially funded by the European IST-2001-33522 project OMEGA.

5.2. UML
Our starting point is a simple but computationally powerful subset of UML which includes class and state diagrams. This subset features a notion of abstract state machines which provides a clear separation between communication and computation. We do not consider association and inheritance relations because they can be eliminated following some pre-processing steps as explained in [9].

There are two different kinds of inter-object communication in UML: synchronous communication via synchronous operations, which return by means of a rendez-vous mechanism, and asynchronous communication via asynchronous operations, which are stored in a message queue. In contrast to these operations used in communications, primitive operations are used to describe state transformations directly, without any inter-object synchronization. The meaning of a primitive operation is defined in terms of some associated "code", whereas the "code" of both asynchronous and synchronous operation calls is dynamically 'spread out' over the state machines.
The execution of a synchronous operation call involves a rendez-vous between the sender and the receiver of the call: first, the sender and receiver of the call have to synchronize on the execution of an operation call by the sender and a corresponding trigger by the receiver. Such a synchronization results in an assignment of the values of the actual parameters of the operation call to those attributes of the callee that are specified in the trigger as the formal parameters of the operation. During the execution of the operation by the receiver, the sender is suspended. Upon termination of the call, the return value is sent back to the sender, after which both sender and receiver resume their own execution. On the other hand, an asynchronous operation call is stored in the message queue of the receiver. The execution of a trigger involving an asynchronous operation consists of checking whether a corresponding operation call appears in the message queue (of the receiver) and storing its values in the attributes of the operation, as declared by the receiver. It suspends otherwise.

In UML, state machines are used in a class diagram to describe the behavior of the instances of a class. We observe that without loss of expressive power we can restrict state machines to operations that are either asynchronous or primitive. This is because the trigger of a transition only involves the reception of an operation and operation calls can only be used as actions. Therefore we can implement synchronous operations in state machines in a distributed manner by means of asynchronous operations using a standard acknowledgment protocol. This is analogous to dialects of CSP which only allow inputs as guards (see [6]). It is worthwhile to observe, however, that such a protocol assumes that we can search the message queue for certain operation calls. In other words, just checking the first element of the message queue does not suffice.
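The acknowledgment protocol mentioned above can be illustrated with a small sketch. It is not taken from the chapter: the class `Obj`, the operation names `double` and `double_ack`, and the helper `serve_double` are hypothetical, and the scheduling is done by hand rather than by state machines. The point it tries to show is why the caller must be able to search its queue: the acknowledgment may sit behind an unrelated message.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical object with a message queue (the attribute M in the text)."""
    name: str
    queue: list = field(default_factory=list)

    def send(self, receiver, op, *values):
        # Asynchronous call: the message is only stored in the receiver's queue.
        receiver.queue.append((op, values))

    def take(self, op):
        # Search the whole queue (not just its head) for the first `op` message.
        # Returns the message values, or None if no such message is queued.
        for i, (name, values) in enumerate(self.queue):
            if name == op:
                del self.queue[i]
                return values
        return None


def serve_double(server):
    """Receiver side: trigger on `double`, compute, acknowledge the caller."""
    got = server.take("double")
    if got is None:
        return False                      # trigger not enabled
    caller, x = got
    server.send(caller, "double_ack", 2 * x)
    return True


client, server, other = Obj("client"), Obj("server"), Obj("other")

client.send(server, "double", client, 21)   # request carries the caller identity
other.send(client, "noise")                  # an unrelated message arrives first
assert client.take("double_ack") is None     # no ack yet: the caller stays suspended
serve_double(server)                         # the receiver executes the operation
(result,) = client.take("double_ack")        # ack is found behind "noise"
print(result, client.queue)
```

With a queue that only exposes its first element, the client in this sketch would block on `noise` and never see the acknowledgment.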
The action language of abstract state machines defined in detail below provides a clear separation of concerns between coordination (involving communication and synchronization) and computation. Coordination, which involves inter-object communication and synchronization, is modelled by the synchronous and asynchronous operations, whereas computation is modelled by atomic boolean guards $g$ and state-transformations $s$ which abstract from the internal details of the implementation of the corresponding primitive operations. Abstract state machines are composed of transitions of the form

$$l \xrightarrow{[g]\,t/a} l'$$

where $l$ is the entry location and $l'$ is the exit location of the transition. Transitions are guarded by a boolean guard $g$ and labelled by a trigger $t$ and an action $a$. The evaluation of the boolean guard $g$ is assumed to be side-effect free. A trigger is of the form $op(A_1,\ldots,A_n)$, where $op$ is an asynchronous operation defined by the class itself and $A_1,\ldots,A_n$ is a parameter list of attributes of the class itself that will store the values passed by the caller of the operation. As explained above, primitive operations cannot be used as triggers.
In an action of the form $A.op(A_1,\ldots,A_n)$, $op$ is an (asynchronous) operation, $A$ is an attribute of the class itself storing the identity of the callee of the operation, and $A_1,\ldots,A_n$ is a list of attributes of the class itself that store the values passed as the actual parameters of the call. An action involving a call of a primitive operation is modelled in the action language by an atomic state-transformation $s$. As a special case, we model class instantiation as a primitive operation call $c.new(A)$, where the value-result parameter $A$ is an attribute used to store the identity of the newly created instance of a class $c$ defined in the class diagram. When introducing the notion of components, we will relax the latter constraint and allow instantiation of external classes which are not defined in a given class diagram, but which are given only by means of their interface.

5.2.1. Operational semantics
In this section we define the operational semantics of abstract state machines as defined above. In order to formally define the operational semantics we assume for each class $c$ of a given class diagram a set $O_c$ of references to objects in class $c$. (In case a class $c$ extends another class $c'$ we assume that $O_c$ is a subset of $O_{c'}$, whereas for classes which are not related by the inheritance relation, we assume these sets to be disjoint.) An object diagram of a given class diagram with classes $c_1,\ldots,c_n$ can be specified mathematically by partial functions $\sigma_c$, for $c \in \{c_1,\ldots,c_n\}$, which assign values to the attributes of each existing object of class $c$. The domain of $\sigma_c$ represents the set of existing objects in class $c$. We will denote by $\sigma_c(o.A)$ the value of the attribute $A$ (declared in class $c$) of the object $o$. When clear from the context we may omit the information about the class and write simply $\sigma(o.A)$. Control information of each object $o$ in an object diagram is given by $\sigma(o.L)$, assuming for each class an attribute $L$ which is used to refer to the current location of the state machine of $o$. Furthermore, the message queue of each object is given by the attribute $M$. It contains messages of the form $op(v_1,\ldots,v_n)$, for some operation $op$ and some corresponding sequence of values $v_1,\ldots,v_n$.

Given a class diagram consisting of a finite set of classes $c_1,\ldots,c_n$ and associated abstract state machines, we define its behavior in terms of a transition relation between object diagrams which describes all possible computation steps. This transition relation is parametric in the semantics of the primitive operations and the evaluation of the guards, and the way messages are stored and removed from the message queues (for example, in a FIFO manner or using priority information). Semantically, we interpret each boolean guard $g$ as an evaluation function such that $g(\sigma, o)$ denotes the boolean result of the evaluation of $g$ by the object $o$ in the object diagram $\sigma$.
(Note that this evaluation is assumed to be free of side effects, i.e., it does not affect the object diagram.) Similarly, we interpret each state-transformation $s$ as a function such that $s(\sigma, o)$ denotes the object diagram that results from the application of $s$ in the initial diagram $\sigma$ by the object $o$. The semantics of a primitive operation $c.new(A)$ is fixed: it describes an object diagram $\sigma'$ which results from the initial object diagram $\sigma$ by the creation of a new instance of class $c$ (the identity of which is assigned to the attribute $A$ of the executing object). Furthermore, we assume for each (asynchronous) trigger $op(A_1,\ldots,A_n)$ a semantic function

$$\textit{pop-op}(A_1,\ldots,A_n)$$

which, given an input object diagram $\sigma$ and an executing object $o$, returns the object diagram $\sigma'$ that results from removing a message $op(v_1,\ldots,v_n)$ from the message queue $\sigma(o.M)$ of $o$ in $\sigma$ and assigning the values $v_i$ to $\sigma(o.A_i)$, $i = 1,\ldots,n$, i.e., $\sigma'(o.A_i) = v_i$. In case such a message does not exist this function is undefined. Finally, for each operation $op$ the semantic function

$$\textit{push-op}(o, v_1,\ldots,v_n)$$

returns the object diagram $\sigma'$ that results from adding the message $op(v_1,\ldots,v_n)$ to the message queue $\sigma(o.M)$.

The operational semantics of a class diagram is defined in terms of a transition relation $\xrightarrow{e}$ between object diagrams labelled by observable events $e$ of the form $o!o'.op(v_1,\ldots,v_n)$, which indicate the storage of the message $op(v_1,\ldots,v_n)$ sent by object $o$ in the message queue of the callee object $o'$. By $\tau$ we denote an internal activity which includes removing a message from the queue. On the other hand, instantiation of classes generates events $c!o.new(o')$, where $o'$ is a reference to a new instance of class $c$ created by $o$.

We assume for each transition

$$l \xrightarrow{[g]\,t/a} l'$$

of a state machine a unique intermediate location $l_i$ to model the interleaving point between the guard and trigger on the one hand, and the action on the other hand. The transition relation describing computation steps is defined by distinguishing the following cases.

Evaluation of a guard and execution of a trigger. For $l \xrightarrow{[g]\,t/a} l'$, with $t = op(A_1,\ldots,A_n)$, we have:
Location: $\sigma(o.L) = l$;
Guard: $g(\sigma, o) = \mathit{true}$;
Trigger: $\sigma' = \textit{pop-op}(A_1,\ldots,A_n)(\sigma, o)$;
Transition: $\sigma \xrightarrow{\tau} \sigma'\{o.L := l_i\}$.
The evaluation of a guard and the subsequent execution of a trigger generates a 'silent' computation step. The first clause defines the entry location of the transition. The second clause states that the guard evaluates to true (without side-effects). The third clause describes the effect of the removal of a message from the message queue of $o$ in terms of the resulting object diagram $\sigma'$. Note that the evaluation of the guard and the execution of the trigger are strictly sequentialized. This implies that the guard cannot refer to the new values of the actual parameters of the trigger which are stored in the message queue. However, a slight modification would suffice to allow for this; for technical convenience only we restrict ourselves to a simpler semantic model. Finally, in the resulting computation step the location counter of $o$ is set to the intermediate location of the given transition (this is described by $\sigma'\{o.L := l_i\}$, which denotes the object diagram resulting from $\sigma'$ by assigning $l_i$ to $o.L$). The semantics of an action distinguishes between asynchronous operation calls and state-transformations.

Sending a message. For $l \xrightarrow{[g]\,t/a} l'$, with $a = A.op(A_1,\ldots,A_n)$, we have:
Location: $\sigma(o.L) = l_i$;
Signal: $\sigma' = \textit{push-op}(o', v_1,\ldots,v_n)(\sigma)$, where $\sigma(o.A) = o'$ and $\sigma(o.A_i) = v_i$, $i = 1,\ldots,n$;
Transition: $\sigma \xrightarrow{e} \sigma'\{o.L := l'\}$, where $e = o!o'.op(v_1,\ldots,v_n)$.

The execution of an operation call generates a computation step which observes the corresponding message. The first clause defines the intermediate location of the transition. The second clause describes the effect of sending a message in terms of the corresponding push-op function. Finally, in the resulting computation step the location counter of $o$ is set to the exit location of the given transition.

Execution of a state-transformation. For $l \xrightarrow{[g]\,t/a} l'$, with $a = s$, for some state-transformation $s$, we have:
Location: $\sigma(o.L) = l_i$;
State transformation: $s(\sigma, o) = \sigma'$;
Transition: $\sigma \xrightarrow{\tau} \sigma'\{o.L := l'\}$.

The execution of a state-transformation $s$ generates a 'silent' computation step, where $\sigma'$ describes its effect. In case $s$ is a primitive operation $c.new(A)$, $s(\sigma, o)$ is assumed to describe the result of adding a new instance of $c$ to the domain of $\sigma_c$ and assigning its identity to the attribute $A$ of $o$. We have the following computation step for class instantiation.

Class instantiation. For $l \xrightarrow{[g]\,t/a} l'$, with $a$ an action $c.new(A)$ and $s$ its corresponding state-transformation, we have:
Location: $\sigma(o.L) = l_i$;
State transformation: $s(\sigma, o) = \sigma'$;
Transition: $\sigma \xrightarrow{e} \sigma'\{o.L := l'\}$, where $e = c!o.new(o')$ and $o' = \sigma'(o.A)$.

5.2.2. Trace semantics
Message sequence charts in UML provide a visual representation of sequences of sent messages. In the semantics of concurrent systems such sequences are formalized as traces of observable events. Given the above transition relation we can define the trace semantics of a given class diagram as a recursive function $\mathit{Trace}$ which yields for each input diagram $\sigma$ the set of traces

$$\{\, e \cdot t \mid t \in \mathit{Trace}(\sigma'),\ \sigma \xrightarrow{e} \sigma' \,\} \;\cup\; \{\, t \mid t \in \mathit{Trace}(\sigma'),\ \sigma \xrightarrow{\tau} \sigma' \,\}$$

Restricting to finite traces we can mathematically justify this recursive definition of $\mathit{Trace}$ in a straightforward manner by a standard least fixed point construction. We denote the union of the sets $\mathit{Trace}(\sigma)$, $\sigma$ ranging over all object diagrams (of the given class diagram), also by $\mathit{Trace}$.

5.3. The component model
Following Szyperski [21], we view a software component as a structural abstraction that can be composed through well-defined interfaces only. The internal structure of a component consists of a set of classes, represented in UML by a corresponding class diagram. At run-time a component thus consists of a dynamically evolving object diagram instantiating its class diagram. Components do not share classes. Components only interact via their interfaces.

A declaration of an interface $I = [op_1(\bar{T}_1), \ldots, op_n(\bar{T}_n)]$ of a component is similar to that of a class: it consists of an identifier $I$ with an associated set of operation signatures. Such a signature specifies a name $op$ and a (possibly empty) list $\bar{T}$ of identifiers which specify the parameter types. The latter include type identifiers like INT and BOOL to denote the built-in types of the integers and booleans, and user-defined identifiers for component interfaces themselves. Differently from the interface of a class in a class diagram, a component interface has no declaration for attributes and cannot be part of associations. The interfaces of a component are used to characterize its roles.
A role R of a component consists of at most one provided interface and some required interfaces. A role is self-contained in the sense that the types of the parameters of the operations of its interfaces are built-in types or the types of its (provided and required) interfaces themselves. As such a role cannot export information about the internal classes, as explained in some more detail below. Each role of a component is realized by a unique internal class. Internally the scope of the name-space of the interfaces of a role is restricted to its realization. That is, only the class realizing a role R may declare attributes typed by the interfaces of R and only its associated state machine may instantiate the required interfaces of R and call their operations. Furthermore, externally, other components may call only the operations of the provided interface (which will occur as triggers in the associated state machine).

Note that a component itself is not a class: it has no attributes and it cannot be instantiated or be part of associations. However, it can be generalized and hence replaced by any other component that offers at least the same provided interfaces and demands at most the same required interfaces. The use of roles allows the same interface to be provided by more than one role within the same component. This way we can distinguish between different implementations of the same interface and deal with the problem of coexistence of old and new realizations of an interface (the versioning problem). Furthermore, the inclusion of required interfaces in a role resolves in a simple manner the basic asymmetry of the client-server architecture of object-orientation, which mainly focuses on the provided operations.

Pictorially we follow the standard of UML and draw a component as a rectangle with a component icon in its right-hand corner: a rectangle with two smaller rectangles protruding from its left-hand side. Roles are shown as small squares on the edge of the component rectangle, with links to interfaces, shown as a labelled ball and socket for the provided and required interface, respectively. In general, only the instantiable interfaces of a role are shown. That is, a required interface is only shown explicitly if the realization of the role requires its instantiation, and explicit representation of the provided interface allows its instantiation by external components.

Figure 5.1 shows an example of a component Client with a role consisting of a provided interface ISnd and a single required interface IRec. The provided interface ISnd contains the operation ack() and the required interface IRec contains the operation msg(ISnd). These interfaces together are intended to describe a Client that first creates a new server session by instantiating its required interface.
After having passed its own identity (represented by the type ISnd of the provided interface) by calling the operation msg of the server, it waits for an acknowledgment.

Fig. 5.1. A black-box view of a component. [Figure: component with the interfaces IRec and ISnd.]
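The notions of interface and role, and the requirement that a role be self-contained, can be made concrete with a minimal sketch. The classes `Interface` and `Role`, the method `is_self_contained`, and the counterexample `ILeaky` with its parameter type `InternalBuffer` are hypothetical names introduced only for illustration; the Client interfaces ISnd and IRec are the ones described for Figure 5.1.

```python
from dataclasses import dataclass, field
from typing import Optional

BUILT_IN_TYPES = {"INT", "BOOL"}


@dataclass
class Interface:
    name: str
    operations: dict            # operation name -> tuple of parameter type identifiers


@dataclass
class Role:
    provided: Optional[Interface] = None            # at most one provided interface
    required: list = field(default_factory=list)    # some required interfaces

    def interfaces(self):
        own = [] if self.provided is None else [self.provided]
        return own + self.required

    def is_self_contained(self):
        """True if every parameter type is built-in or one of the role's own interfaces."""
        allowed = BUILT_IN_TYPES | {i.name for i in self.interfaces()}
        return all(t in allowed
                   for i in self.interfaces()
                   for params in i.operations.values()
                   for t in params)


# The Client role of Figure 5.1: provides ISnd with ack(), requires IRec with msg(ISnd).
isnd = Interface("ISnd", {"ack": ()})
irec = Interface("IRec", {"msg": ("ISnd",)})
client_role = Role(provided=isnd, required=[irec])

# A role whose interface mentions an internal class would leak internal structure.
leaky_role = Role(provided=Interface("ILeaky", {"put": ("InternalBuffer",)}))

print(client_role.is_self_contained(), leaky_role.is_self_contained())
```

The second role is rejected because its parameter type is neither built-in nor one of its own interfaces, which is the sense in which a role cannot export information about internal classes.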
Component system diagrams are used to visualize a system of components. In such a diagram, every interface declared as required in a role of a component has to be connected with exactly one corresponding provided interface of a role of another component in the diagram. The inter-component connections are used at run-time for the instantiation of a required interface: within a component a required interface $I$ simply acts as placeholder for the external class realizing the role that provides the interface wired to $I$.

Component connections induce the following subtype relation $<$ between the provided and required interfaces (assuming a renaming in order to avoid name clashes between the interfaces of different components). It subsumes the subtype relation of the primitive built-in types and is defined as the smallest relation such that $I < I'$ ($I$ is a subtype of $I'$) for every connected provided interface $I$ and required interface $I'$, and which satisfies the following condition: if $I < I'$ then for every declaration $op(T'_1,\ldots,T'_n)$ of an operation in the interface $I'$ there exists a corresponding operation $op(T_1,\ldots,T_n)$ in the interface $I$ such that $T'_i < T_i$, $i = 1,\ldots,n$. Note that this condition ensures that any call of the operation $op$ of an implementation of the required interface $I'$ can be received by an implementation of the provided interface $I$, because the latter expects parameters typed by $T_1,\ldots,T_n$ and $T'_i$ is a subtype of $T_i$, $i = 1,\ldots,n$. The existence of such a subtype relation induced by the component connections guarantees that these connections themselves are type-safe, i.e., the type of the data communicated by a call of an asynchronous operation of an instance of an external class corresponds with the expected type.

Instances of classes realizing roles are also called ports. It follows from the above that components interact only via their ports. Classes realizing the required interfaces of a role are not part of the component itself, but belong to another component. They are known only at deployment time when a required interface of a role is connected to a corresponding provided interface of another role. This way, at run time, an action $I.new(A)$ involving the instantiation of a required interface $I$ results in a new instance of the external class realizing the role which provides $I$. The identity of this object is assigned to $A$.
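The condition on the induced subtype relation can be checked mechanically. The following sketch makes several simplifying assumptions that are not in the chapter: interfaces are renamed with a component prefix (for example `Client.ISnd`) to avoid name clashes, the relation is taken to be exactly the connected pairs plus reflexive pairs on the built-in types, and the Server's required interface for the ack operation is treated as connected to the Client's provided interface. The function name `connections_type_safe` is hypothetical.

```python
# Pairs (sub, super); built-in types are only related to themselves in this sketch.
BUILT_IN_SUBTYPES = {("INT", "INT"), ("BOOL", "BOOL")}


def connections_type_safe(interfaces, connections):
    """Check the condition on the subtype relation induced by the connections.

    interfaces:  interface name -> {operation name: tuple of parameter types}
    connections: set of (provided, required) pairs, read as provided < required
    """
    rel = set(connections) | BUILT_IN_SUBTYPES
    for provided, required in connections:
        for op, req_params in interfaces[required].items():
            prov_params = interfaces[provided].get(op)
            if prov_params is None or len(prov_params) != len(req_params):
                return False          # no corresponding operation in the provided interface
            for t_req, t_prov in zip(req_params, prov_params):
                if (t_req, t_prov) not in rel:
                    return False      # parameter type T'_i is not a subtype of T_i
    return True


interfaces = {
    "Client.ISnd": {"ack": ()},                  # provided by the Client role
    "Client.IRec": {"msg": ("Client.ISnd",)},    # required by the Client role
    "Server.IRec": {"msg": ("Server.ISnd",)},    # provided by the Server role
    "Server.ISnd": {"ack": ()},                  # required by the Server role
}
connections = {
    ("Server.IRec", "Client.IRec"),
    ("Client.ISnd", "Server.ISnd"),
}

print(connections_type_safe(interfaces, connections))
print(connections_type_safe(interfaces, {("Server.IRec", "Client.IRec")}))
```

The second call drops the connection for the acknowledgment interface, so the parameter of msg can no longer be related and the check fails.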
Fig. 5.2. A component system diagram. [Figure: components connected through the interfaces IRec and ISnd.]
Figure 5.2 shows a component system diagram. The connection between the required interface and the provided interface of the roles of the Server and Client components is given by the ball-in-socket notation. The intended behavior of a port of the Server component is to wait for a call of its provided operation msg and return an acknowledgment to the caller by calling its ack operation. The latter behavior is represented by a corresponding required interface which specifies the ack operation. Since this required interface is only used for representing the type of the parameter of the provided msg message (in other words, a Server port does not require its instantiation), it is not explicitly represented.
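To see the intended Client and Server behavior run under the operational semantics of Section 5.2.1, here is a minimal interpreter sketch. It rests on several assumptions that are not in the chapter: the required interface IRec is already resolved to a class named `server`, the attribute names `S`, `P` and `Self` are hypothetical, a transition may have an empty trigger or an empty action, intermediate locations are encoded as the index of the transition, and objects are scheduled in a fixed order.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Transition:
    entry: str
    exit: str
    trigger: Optional[tuple] = None    # (op, [attributes receiving the values])
    action: Optional[tuple] = None     # ("new", cls, attr) or ("call", attr, op, [attrs])
    guard: Callable = lambda sigma, o: True


MACHINES = {
    "client": [
        Transition("l0", "l1", action=("new", "server", "S")),
        Transition("l1", "l2", action=("call", "S", "msg", ["Self"])),
        Transition("l2", "l3", trigger=("ack", [])),
    ],
    "server": [
        Transition("l0", "l1", trigger=("msg", ["P"]), action=("call", "P", "ack", [])),
    ],
}


def new_object(sigma, cls):
    ref = f"o{len(sigma) + 1}"
    sigma[ref] = {"class": cls, "L": "l0", "M": [], "Self": ref}
    return ref


def step(sigma, o, trace):
    """Perform one computation step of object o if possible; report whether one was taken."""
    obj = sigma[o]
    machine = MACHINES[obj["class"]]
    loc = obj["L"]
    if isinstance(loc, int):                       # intermediate location: run the action
        t = machine[loc]
        if t.action is not None and t.action[0] == "new":
            _, cls, attr = t.action
            obj[attr] = new_object(sigma, cls)
            trace.append(f"{cls}!{o}.new({obj[attr]})")
        elif t.action is not None:
            _, target, op, args = t.action
            callee = obj[target]
            values = tuple(obj[a] for a in args)
            sigma[callee]["M"].append((op, values))            # push-op
            trace.append(f"{o}!{callee}.{op}({','.join(values)})")
        obj["L"] = t.exit
        return True
    for i, t in enumerate(machine):                # entry location: guard, then trigger
        if t.entry != loc or not t.guard(sigma, o):
            continue
        if t.trigger is not None:
            op, attrs = t.trigger
            msg = next((m for m in obj["M"] if m[0] == op), None)
            if msg is None:
                continue                           # pop-op undefined: not enabled
            obj["M"].remove(msg)                   # pop-op
            obj.update(zip(attrs, msg[1]))
        obj["L"] = i                               # silent step to the intermediate location
        return True
    return False


sigma, trace = {}, []
new_object(sigma, "client")
while any(step(sigma, o, trace) for o in list(sigma)):
    pass
print(trace)
```

Only sending and instantiation append to `trace`; guard evaluation and message removal are silent, in line with the distinction between observable events and internal activity.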
5.3.1. Active and passive classes
In UML [20] classes can be either active or passive. Each active object (i.e., an instance of an active class) coordinates its so-called activity group of passive objects which are instances of associated passive classes. Activity groups are run-time components that are created dynamically. The objects of an activity group share both one thread of control and one message queue. The sharing of control means that at most one object of the group is executing. Control is passed on by a synchronous operation call to another object belonging to the same group. On the other hand, a synchronous call to an operation of an object not belonging to the same activity group suspends the executing object and thereby its activity group. An asynchronous operation call to an object is stored in the message queue of its activity group.

Using standard object-oriented design patterns like delegation, it is not difficult to define components consisting of active classes only which internally model the sharing of the message queue and the control by an activity group. Consequently, we can restrict UML models to active classes only, which greatly reduces the complexity of the computational model. In other words, in our component-based view complex coordination patterns between groups of objects can be introduced by means of components instead of wiring them into the underlying computational model.

5.3.2. Component semantics
Recall that we associate to each role an abstract state machine which may use primitive operation calls of the form $I.new(A)$ for the instantiation of external classes. Here $A$ is an attribute which will store the identity of a new instance of the external class realizing the (required) interface $I$. In fact, $I$ acts as placeholder for the name of the class realizing the role that provides $I$. This class is known only after composition, when provided and required interfaces are wired together. After connection, a component system diagram can thus be reduced to a closed class diagram without roles and interfaces by substituting the class names for the corresponding required interfaces, assuming without loss of generality that different components do not use the same class names.

For example, consider the component system given in Figure 5.2. A primitive operation call IRec.new(A) executed by a port instance of the component Client is translated into the primitive operation call server.new(A), where server is the name of the class realizing the port of the component Server. This translation is generated by the connection between the required interface IRec of the role of the component Client and the provided interface IRec of the role of the component Server.

By means of the connections of a component system diagram we can thus formalize its semantics in terms of the semantics of the underlying class diagram, as described in the previous section. This class diagram is simply obtained by taking the disjoint union of the classes of its components and renaming the required interfaces in the state machines by their realizations as specified by the connections.

However, the above semantics does not capture the semantics of a component system compositionally in terms of the semantics of its components. In order to describe the interactions of a component with its environment we model a component as an I/O automaton with additional input events of the form

$$o?op(v_1,\ldots,v_n)$$

which indicate the storage of a message $op(v_1,\ldots,v_n)$ by the internal port $o$. Note that such inputs are always enabled. We have the following additional computation steps for receiving external messages.

Receiving an external message. For a message $op(v_1,\ldots,v_n)$, where $op$ is an operation provided by the internal port $o$ (that is, $o$ is an instance of a class $c$ realizing a role $R$ of the component and $op$ is an operation appearing in the provided interface of $R$), we have:
Message: $\sigma' = \textit{push-op}(o, v_1,\ldots,v_n)(\sigma)$;
Transition: $\sigma \xrightarrow{e} \sigma'$, where $e = o?op(v_1,\ldots,v_n)$.
On the other hand, the outputs of a component are observable events $o!o'.op(v_1,\ldots,v_n)$ which involve the exchange of a message $op(v_1,\ldots,v_n)$ between an internal port $o$ and an external port $o'$ (whose type is specified by a required interface of the role realized by $o$). Such events are generated by operation calls which are modelled by the following additional computation steps.

Sending an external message. For a transition $l \xrightarrow{[g]\,t/a} l'$ appearing in the state machine of a class realizing a role of the component, with $a$ an action $A.op(A_1,\ldots,A_n)$ such that the type of the attribute $A$ is specified by a required interface, we have:
Location: $\sigma(o.L) = l_i$;
Transition: $\sigma \xrightarrow{e} \sigma\{o.L := l'\}$, where $e = o!o'.op(v_1,\ldots,v_n)$, $\sigma(o.A) = o'$, and $v_i = \sigma(o.A_i)$, $i = 1,\ldots,n$.

Note that instances of required interfaces do not have a state and behavior defined by the component. Therefore, external messages sent cannot be stored and generate only a corresponding event. Similarly, the instantiation of a required interface $I$ by an action $I.new(A)$ only involves the assignment of a fresh reference of an instance of type $I$ to the attribute $A$. For each component we assume that the corresponding state-transformation $s$ generates such a new reference, i.e., $\sigma' = s(\sigma, o)$ differs from $\sigma$ only with respect to the value of the attribute $A$ of the executing object $o$. Instantiation of a required interface $I$ generates an observable event $I!o.new(o')$, where $o'$ is a reference to a new instance of type $I$ created by $o$.

Instantiating required interfaces. For a transition $l \xrightarrow{[g]\,t/a} l'$, with $a$ an action $I.new(A)$, $I$ a required interface, and $s$ the corresponding state-transformation, we have:
Location: $\sigma(o.L) = l_i$;
State transformation: $s(\sigma, o) = \sigma'$;
Transition: $\sigma \xrightarrow{e} \sigma'\{o.L := l'\}$, where $e = I!o.new(o')$ and $\sigma'(o.A) = o'$.
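The local steps above can be sketched for a single port observed in isolation. The class `LocalPort`, its method names, and the reference names `ext1`, `ext2`, ... for fresh external references are assumptions made for this illustration; state machines are left out and the steps are invoked by hand. External references are opaque here: sending to one produces an event but stores nothing.

```python
import itertools


class LocalPort:
    """Hypothetical port of a component, seen without the rest of the system."""
    _fresh = itertools.count(1)          # source of fresh external references

    def __init__(self, ref):
        self.ref = ref
        self.queue = []                  # message queue M
        self.attrs = {}                  # attribute values
        self.trace = []                  # local observable events

    def receive(self, op, *values):
        # Input o?op(v1..vn): always enabled, the message is stored in the queue.
        self.queue.append((op, values))
        self.trace.append(f"{self.ref}?{op}({','.join(values)})")

    def new_required(self, interface, attr):
        # I.new(A): only a fresh reference of type I is assigned to attribute A.
        ext = f"ext{next(self._fresh)}"
        self.attrs[attr] = ext
        self.trace.append(f"{interface}!{self.ref}.new({ext})")

    def send_external(self, attr, op, *values):
        # A.op(...) with A typed by a required interface: only an event is generated.
        callee = self.attrs[attr]
        self.trace.append(f"{self.ref}!{callee}.{op}({','.join(values)})")


o = LocalPort("o")
o.new_required("IRec", "S")          # the Client instantiates its required interface
o.send_external("S", "msg", o.ref)   # passes its own identity to the server
o.receive("ack")                     # the acknowledgment arrives as an input
print(o.trace)
```

The resulting local trace has the same shape as the Client behavior described for Figure 5.1: instantiation of IRec, an outgoing msg, and an incoming ack.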
On the other hand, the instantiation of a provided interface by external ports is modelled by sending an anonymous message $c!new(o)$, where $o$ is a reference to a new instance of the class $c$ realizing a given provided interface. We have the following computation step which is always enabled and which does not involve the execution of a state machine.

Instantiating provided interfaces. Let $s$ be a state-transformation such that $s(\sigma)$ (no executing object is required) results from adding to $\sigma$ a new instance of the internal class $c$ realizing a role that contains a provided interface.
State transformation: $s(\sigma) = \sigma'$;
Transition: $\sigma \xrightarrow{e} \sigma'$, where $e = c!new(o)$ and $o$ denotes a reference to the newly created instance of $c$ in $\sigma'$.

Given these additional computation steps which model the interaction of a component with its environment, we can now define the local trace semantics of a component in terms of its underlying class diagram in the same way as the semantics of class diagrams in the previous section. The main difference with the trace semantics given in the previous section concerns the additional events generated by sending and receiving external messages, and those events generated by the instantiation of the provided and required interfaces.

In order to express the main compositionality result, we assume given a system $S$ of components $C_1,\ldots,C_n$. We denote by $\gamma(I)$, for every required interface $I$ appearing in the component system diagram $S$, the (unique) class $c$ realizing the role that contains the provided interface to which $I$ is connected. By $\mathit{Trace}\downarrow S$ we denote the set of global traces of the entire class diagram underlying the component system $S$ restricted to those events involving ports only. That is, the interactions between internal objects of the components of $S$ are abstracted away. Note that this class diagram is closed, in the sense that it does not contain interfaces specifying external classes.
On the other hand, let $\mathit{Trace} \downarrow C_i$, $i = 1, \ldots, n$, denote the set of local traces of $C_i$ restricted to those events involving ports only. That is, the interactions between internal objects of $C_i$ are abstracted away. Furthermore, for a global trace $t$, we denote by $t \downarrow C_i$ the local trace of $C_i$ which results from restricting $t$ to the ports of $C_i$, $i = 1, \ldots, n$. More specifically, $t \downarrow C_i$ consists of those events of $t$ which either indicate the sending or receipt of an external message by a port of $C_i$ or the instantiation of the provided and required interfaces of $C_i$. Receipt of an external message additionally involves the transformation of a global event $o!o'.\mathit{op}(v)$, with the callee $o'$ an internal port of $C_i$, into a corresponding local anonymous event $o'?\mathit{op}(v)$. Instantiation of a required interface $I$ of a role of $C_i$ involves the transformation of a global event $c!o.\mathit{new}(o')$, with $o$ an internal port of $C_i$, into the local event $I!o.\mathit{new}(o')$, where $c = \gamma(I)$. On the other hand, instantiation of a provided interface involves the transformation of a global event $c!o.\mathit{new}(o')$, with $o$ an external port, into the anonymous local event $c!\mathit{new}(o')$.
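The restriction of a global trace to the ports of one component, as defined above, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `Send`, `New` and `restrict` are hypothetical, events are rendered as plain strings, and every message is assumed to cross a component boundary (so a message is never both sent and received by ports of the same component).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Send:
    """Global message event caller!callee.op(args)."""
    caller: str
    callee: str
    op: str
    args: tuple = ()


@dataclass(frozen=True)
class New:
    """Global creation event cls!creator.new(created)."""
    cls: str
    creator: str
    created: str


def fmt_args(args):
    """Render an argument tuple; an empty tuple renders as nothing."""
    return "(" + ", ".join(args) + ")" if args else ""


def restrict(trace, ports, classes, required_for):
    """Project a global port trace onto one component (hypothetical helper).

    ports        -- names of the ports of the component
    classes      -- internal classes realizing its provided interfaces
    required_for -- maps a class c to the required interface I with gamma(I) = c
    """
    local = []
    for e in trace:
        if isinstance(e, New):
            if e.creator in ports:      # a required interface is instantiated
                local.append(f"{required_for[e.cls]}!{e.creator}.new({e.created})")
            elif e.cls in classes:      # a provided interface is instantiated
                local.append(f"{e.cls}!new({e.created})")
        elif e.caller in ports:         # external message sent by a local port
            local.append(f"{e.caller}!{e.callee}.{e.op}{fmt_args(e.args)}")
        elif e.callee in ports:         # external message received: anonymous
            local.append(f"{e.callee}?{e.op}{fmt_args(e.args)}")
    return local


# A client port o creates a server port o', sends msg(o), and receives ack.
global_trace = [
    New("c", "o", "o'"),
    Send("o", "o'", "msg", ("o",)),
    Send("o'", "o", "ack"),
]
client_trace = restrict(global_trace, {"o"}, set(), {"c": "IRec"})
server_trace = restrict(global_trace, {"o'"}, {"c"}, {})
```

Under these assumptions `client_trace` is `["IRec!o.new(o')", "o!o'.msg(o)", "o?ack"]` and `server_trace` is `["c!new(o')", "o'?msg(o)", "o'!o.ack"]`: the sender of a received message is forgotten, and the class name is replaced by the required interface on the creating side.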
Given the above definitions, we have the following main compositionality result which captures the abstraction and structuring mechanism of our component model in terms of interaction and encapsulation.

Theorem 5.1.

$$\mathit{Trace} \downarrow S = \{\, t \mid t \downarrow C_i \in \mathit{Trace} \downarrow C_i,\ i = 1, \ldots, n \,\},$$
where $t$ ranges over the set of all traces involving ports (as defined by $S$) only. The proof is based on a straightforward analysis of the possible computation steps. Note that a global silent computation step $\sigma \to \sigma'$, involving a state-transformation $s$ of component $C_i$, corresponds with a local computation step $\sigma_i \to \sigma_i'$, where $\sigma_i$ and $\sigma_i'$ denote the local object diagrams of $C_i$ contained in $\sigma$ and $\sigma'$. Similarly, a global computation step $\sigma \to \sigma'$, with an event involving an internal object of component $C_i$, corresponds with a local computation step $\sigma_i \to \sigma_i'$, where, as above, $\sigma_i$ and $\sigma_i'$ denote the local object diagrams of $C_i$ contained in $\sigma$ and $\sigma'$. Furthermore, such computation steps do not affect the local object diagrams of the other components, i.e., the local object diagrams $\sigma_j$ and $\sigma_j'$ of a different component $C_j$ contained in $\sigma$ and $\sigma'$ are the same. All other global computation steps involve the interaction between ports of different components and can be decomposed into corresponding local computation steps.

As an example, we have the following global trace describing the intended behavior of the component system in Figure 5.2:

$$c!o.\mathit{new}(o'),\quad o!o'.\mathit{msg}(o),\quad o'!o.\mathit{ack}$$

where $c$ is the class realizing the role of the Server component, $o$ is a port of the component Client, and $o'$ denotes the newly created port of Server. By restricting this global trace to the port $o$ of Client we obtain the local trace

$$\mathit{IRec}!o.\mathit{new}(o'),\quad o!o'.\mathit{msg}(o),\quad o?\mathit{ack}$$

On the other hand, restricting the above global trace to the port $o'$ of Server we obtain the local trace

$$c!\mathit{new}(o'),\quad o'?\mathit{msg}(o),\quad o'!o.\mathit{ack}$$

5.4. Inter-component coordination via mobile channels

Components thus interact via ports by means of asynchronous operations. Ports implement both their local coordination protocol (as specified by their state-machines) and the communication mechanism (the message queue).
To further enhance the basic separation of concerns between computation and communication underlying component-based software engineering, we introduce in this section mobile channels as a general model for inter-component communication in dynamically reconfigurable component systems. The basic idea is to generalize the message queue associated with each port to a mobile channel as an independent exogenous communication medium.
We model a mobile channel as a component with one single role. Its provided interface contains the following signatures of read and write operations: read(Return) and write(Ack, Value), where Return and Ack are the names of the required interfaces specifying the type of ports calling the read and write operations, respectively. The type of the communicated values is specified by Value, which can be any built-in primitive type or any interface specified by the role. The required interface denoted by Return consists of an operation return(Value) for returning a value of type Value, and the required interface denoted by Ack consists of the signature of an operation ack() for acknowledging the successful completion of the write operation. The provided interface of a mobile channel may contain several read and write operations which differ in the type Value of the communicated values. In this way we model 'polymorphic' channels.

On the other hand, a component specifies its use of a mobile channel by means of a role which consists of a single required interface that contains some read and write operations, and a corresponding provided interface specifying the return and acknowledge operations. In general, a component system diagram only connects the required interface of a component role specifying its use of a mobile channel with a provided interface of a corresponding mobile channel. The provided interface of a component specifying the return and ack operations and the corresponding required interface of a mobile channel will usually not be explicitly referred to in a component system diagram, because they are not supposed to be instantiated: they only serve to characterize the types of the parameters of the read and write operations.
Since these connections are assumed to be type-safe, it follows that if the interface of a component role R requiring a read operation read(I) is connected to the provided interface of a mobile channel then the latter contains a read operation read(Return), with I a subtype of Return. So I contains a return operation return(V), with Value a subtype of V. By definition of R, it also follows that I is the provided interface of R. Similarly, if the interface of R requiring a write operation write(I, V) is connected to the provided interface of a mobile channel then the latter contains a write operation write(Ack, Value), with I a subtype of Ack and V a subtype of Value. It follows that for the operation ack() in Ack there exists a corresponding operation ack in the provided interface I of R.

By calling the read operation of a mobile channel, a port of a component can thus pass its own identity, which can then be used by the channel to return a value (by calling its return operation). On the other hand, by calling the write operation of a mobile channel it sends a value together with its own identity, which can then be used by the channel to acknowledge receipt.

Different kinds of channels can be defined by different implementations of the read and write operations and the way messages are stored. For example, a channel that requires a synchronization between read and write operations can easily be implemented by waiting first for a call of its read operation before acknowledging completion of a call of the write operation, and vice versa, by waiting first for a call of its write operation before returning a value after a call of its read operation.

We conclude this section with the observation that we can model dynamic reconfiguration of a system of components that interact via mobile channels by
communicating these channels themselves. For example, a channel component with a role that provides read and write operations read(Return) and write(Ack, Value), where Value denotes its provided interface itself, can communicate and store its own ports. On the other hand, by introducing Value as a required interface which can be connected to a provided interface of another channel, it can communicate and store ports of that other channel. In general, since Value can be any kind of required interface, we can define mobile channels for the communication and storage of any kind of port.
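The read(Return) and write(Ack, Value) operations described in this section can be illustrated by a minimal Python sketch. It is not the MoCha implementation: the class names `FifoChannel` and `SyncChannel` are hypothetical, and the calling port is reduced to a callback standing for its return or ack operation.

```python
from collections import deque


class FifoChannel:
    """Asynchronous channel: a write is acknowledged at once, values are buffered."""

    def __init__(self):
        self._values = deque()    # written values not yet read
        self._readers = deque()   # return callbacks of waiting readers

    def write(self, ack, value):
        # 'ack' stands for the ack() operation of the writing port.
        if self._readers:
            self._readers.popleft()(value)   # hand the value to a waiting reader
        else:
            self._values.append(value)
        ack()

    def read(self, return_):
        # 'return_' stands for the return(Value) operation of the reading port.
        if self._values:
            return_(self._values.popleft())
        else:
            self._readers.append(return_)


class SyncChannel:
    """Synchronous channel: a write is acknowledged only once its value is read."""

    def __init__(self):
        self._writers = deque()   # pending (ack, value) pairs
        self._readers = deque()   # return callbacks of waiting readers

    def write(self, ack, value):
        if self._readers:
            self._readers.popleft()(value)
            ack()
        else:
            self._writers.append((ack, value))

    def read(self, return_):
        if self._writers:
            ack, value = self._writers.popleft()
            return_(value)
            ack()
        else:
            self._readers.append(return_)


# Synchronous use: the acknowledgement is delayed until the matching read.
log = []
sync = SyncChannel()
sync.write(lambda: log.append("ack"), 42)
log_before_read = list(log)
sync.read(lambda v: log.append(("return", v)))

# Mobility: a channel is itself a value that can travel through another channel.
control, data = FifoChannel(), FifoChannel()
received = []
control.write(lambda: None, data)
control.read(received.append)
```

In this sketch `log_before_read` is empty and `log` ends as `[("return", 42), "ack"]`, while `received[0]` is the very `data` channel that was written to `control`, which is the sense in which channels are mobile.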
5.5. Conclusion

We have introduced components as units of abstraction that can be independently developed, like modules, using a development methodology based on interfaces. Unlike classes, components are also units of encapsulation that can be extended by subtyping of the interfaces, but not by inheritance of their implementation. Component-based systems are pictorially described by means of two new UML-like diagrams: component diagrams and component system diagrams. Component diagrams describe the structural and behavioral information of components in isolation, whereas component system diagrams describe the connections between the components.

The component model described in this chapter is similar to the approved proposal by the U2 partners for UML 2.0 [22]. However, our components have no externally observable state and are not instantiable, but, instead, allow for the dynamic creation of ports. Furthermore, components in UML 2.0 do not encapsulate their internal structure. Components described here do encapsulate their internal class structure because we have defined the relation between the internal class structure of a component and its (provided and required) interfaces at the level of the action language for state machines.

We have been largely influenced by the main concepts offered by architecture description languages (ADLs): components, ports, and configurations. A large number of ADLs have been proposed, some of them with a sound formal foundation. We only mention here Wright [2], Rapide [16] and ACME [11]. Closer to our architectural diagrams are the architectural descriptions provided by UML-RT [19].

Many models for components have been proposed in recent years, some informal and remaining within the realm of the existing UML (see for example [17] for several strategies for modelling components and other architectural concepts within UML), and others founded on a logical and mathematical basis (e.g. Broy's component model based on streams of messages [7]). Similar to Broy's component model, the semantics of our model is also based on sequences of messages (like those used for the semantics of CSP [15]). However, our components have dynamic aspects (e.g. port instances) not fully covered by Broy's model.

In [14] a formalization in a new extension of XML [23] is given of the subset of UML considered in this chapter. This new extension is called the Rule Markup Language (RML). RML is designed for the specification and execution of general transformations of XML data and is therefore very well suited for the specification
and execution of the semantics of UML models. The application of RML to our subset of UML allows for both simulation within UML as well as the coordination of external applications at run-time.

There exists an implementation in Java of mobile channels [13] using the Remote Method Invocation package (RMI), so that it can be used for both distributed and non-distributed applications. The middleware, called MoCha, supports many types of mobile channels. They can be used by MoCha components [12] for exogenous coordination in dynamically reconfigurable component systems.
Bibliography

1. G. Agha. Actors: A Model of Concurrent Computation. MIT Press, 1986.
2. R. Allen and D. Garlan. A formal basis for architectural connections. ACM Transactions of Software Engineering and Methodology 6(3):213-249, 1997.
3. F. Arbab, F.S. de Boer, and M.M. Bonsangue. A coordination language for mobile components. In Proc. of SAC 2000, pages 166-173. ACM Press, 2000.
4. F. Arbab, M.M. Bonsangue, and F.S. de Boer. A logical interface description language for components. In Proc. of the 4th Int. Conf. on Coordination Models and Languages, volume 1906 of LNCS, pages 249-266. Springer-Verlag, 2000.
5. F.S. de Boer and M.M. Bonsangue. A compositional model for confluent dynamic data-flow network. In Proc. of the 25th Int. Symp. on Mathematical Foundations of Computer Science, volume 1893 of LNCS, pages 212-221. Springer-Verlag, 2000.
6. L. Bouge. On the Existence of Symmetric Algorithms to Find Leaders in Networks of Communicating Sequential Processes. Acta Inf. 25(2):179-201, 1988.
7. M. Broy and K. Stolen. Specification and Development of Interactive Systems: FOCUS on Streams, Interfaces and Refinement. Springer-Verlag, 2001.
8. N. Carriero and D. Gelernter. How to Write Parallel Programs: a First Course. MIT Press, 1990.
9. W. Damm, B. Josko, A. Pnueli, and A. Votinseva. Understanding UML: A Formal Semantics of Concurrency and Communication in Real-Time UML. In Proceedings of FMCO 2003, volume 2582 of LNCS, pages 72-99. Springer-Verlag, 2003.
10. E. Freeman, S. Hupfer, and K. Arnold. JavaSpaces(TM) Principles, Patterns, and Practice, Chapter 1 of book. Addison-Wesley, September 1999.
11. D. Garlan, R.T. Monroe, and D. Wile. ACME: An Architecture Description Interchange Language. In Proceedings of CASCON'97, pages 169-183, 1997.
12. J.V. Guillen-Scholten, F. Arbab, F.S. de Boer, and M.M. Bonsangue. A Channel-based Coordination Model for Components. In A. Brogi and J. Jacquet, editors, Proceedings of 1st International Workshop on Foundations of Coordination Languages and Software Architectures, ENTCS 68.3. Elsevier Science, 2002.
13. J.V. Guillen-Scholten and F. Arbab. MoCha, easyMoCha and chocoMoCha Manual v1.0. CWI Technical Report, Amsterdam, 2004.
14. J. Jacob. The Rule Mark-up Language RML. URL: http://homepages.cwi.nl/jacob/rml/index.html.
15. C. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985.
16. D. Luckham, J. Kenney, L. Augustin, J. Vera, D. Bryan, and W. Mann. Specification and Analysis of System Architecture Using Rapide. IEEE Transactions on Software Engineering 21(4):336-355, 1995.
17. N. Medvidovic, D. Rosenblum, D. Redmiles, and J. Robbins. Modelling Software Architecture in the UML. ACM Transactions of Software Engineering and Methodology 11(1):2-57, 2002.
18. S. Schneider. Concurrent and Real-time Systems: the CSP Approach. J. Wiley, 1991.
19. B. Selic and J. Rumbaugh. Using UML for Modeling Complex Real-Time Systems. ObjecTime Limited, 1998.
20. P. Stevens and R. Pooley. Using UML: software engineering with objects and components. Addison-Wesley Longman (Object Technology Series), 1999.
21. C. Szyperski. Component Software: beyond object-oriented programming. Addison-Wesley, 1998.
22. Object Management Group. UML 2.0: Superstructure Specification. Final Adopted Specification ptc/03-08-03.
23. Extensible Markup Language (XML) 1.0 (Second Edition), W3C recommendation, October 2000. URL: http://www.w3c.org/XML/
Chapter 6 Formalizing the Transition from Requirements to Design
R. Geoff Dromey
Software Quality Institute, Griffith University
Nathan, Brisbane, Qld. 4111, Australia
g.dromey@griffith.edu.au

Despite the advances in software engineering since 1968, current methods for going from a set of functional requirements to a design are not as direct, formal, repeatable and constructive as we would like. Progress with this fundamental problem is possible once we recognize that individual functional requirements represent fragments of behavior, while a design that satisfies a set of functional requirements represents integrated behavior. This perspective admits the prospect of constructing a design out of its requirements. A formal representation for individual functional requirements, called behavior trees, makes this possible. Behavior trees of individual functional requirements may be composed, one at a time, to create an integrated design behavior tree. From this problem domain representation it is then possible to transition directly, systematically, and repeatably to a solution domain representation of the component architecture of the system and the behavior designs of the individual components that make up the system: both are emergent properties of the integrated design behavior tree.

"I believe that failure is less frequently attributable to either insufficiency of means or impatience of labour than to a confused understanding of the thing actually to be done". John Ruskin
6.1. Introduction
A great challenge that continues to confront software engineering is how to proceed in a systematic way from a set of functional requirements to a design that will satisfy those requirements. In practice, the task is further complicated by defects in the original requirements and subsequent changes to the requirements. A first step towards taking up this challenge is to ask what functional requirements are. Study of diverse sets of functional requirements suggests that it is safe to conclude that individual requirements express constrained behavior. By comparison, a system that satisfies a set of functional requirements exhibits integrated constrained behavior. The latter behavior of systems is not inherently different. Therefore, we may ask, can the same formal representation of behavior be used for requirements and for a
design? If it could, it may clarify the requirements-design relationship. Functional requirements contain, and systems exhibit, the behavior summarized below.

• Components realise states
• Components change states
• Components have sets of attributes/properties that are assigned and change values
• Components, by changing states, can cause other components to change their states
• Conditions/decisions, and events are associated with components and states
• Interactions between components also play a key role in describing behavior. They involve sequential, concurrent and threaded control-flow and/or data-flow between components.

Notations like sequence diagrams, class and activity diagrams from UML [1], data-flow diagrams, Petri-nets [2], Statecharts [6], and Message Sequence Charts (MSCs) [7] accommodate some or all of the behavior we find expressed in functional requirements and designs. Individually, however, none of these notations provides the level of constructive and integrated support we need in a single representation. This forces us to contemplate another representation for functional requirements and designs. As Jackson wisely remarked [8], such ventures are generally not enthusiastically received: a consensus is that new proposals just muddy the waters. Our justification for ignoring this advice is that the Behavior Tree Notation solves a fundamental problem: it provides a clear, simple, constructive and systematic path for going from a set of functional requirements to a design that will satisfy those requirements. In other words, it provides a systematic means to transition from the microscopic behavior of functional requirements to the integrated macroscopic behavior of a system that satisfies those functional requirements. In addition, the component architecture and individual component behavior designs for the system are both emergent properties of the integrated macroscopic behavior of a system.
6.2. Behavior Trees

The Behavior Tree Notation captures in a simple tree-like form of composed component-states what usually needs to be expressed in a mix of other notations. Behavior is expressed in terms of components realizing states, augmented by the logic and graphic forms of conventions found in programming languages to support composition, events, control-flow, data-flow, and threads. Behavior trees are equally suited to capturing behavior expressed in the natural language representation of functional requirements and to providing an abstract graphical representation of behavior expressed in a program. To use David Harel's metaphor, Behavior Trees represent a lifting up of behavior expressed in programming languages to a higher level of abstraction.

Definition: A Behavior Tree is a formal, tree-like graphical form that represents behavior of individual or networks of entities which realize or change states, make
decisions, respond-to/cause events, and interact by exchanging information and/or passing control.

To support the implementation of software intensive systems we must capture, first in a formal specification of the requirements, then in the design, and finally in the software, the actions, events, decisions, and/or logic obligations, and constraints expressed in the original natural language requirements for a system. Behavior trees do this. They provide a direct and clearly traceable relationship between what is expressed in the natural language representation and its formal specification. Translation is carried out on a sentence-by-sentence, word-by-word basis. Figure 6.1 shows a sample translation to a behavior tree. Components are in bold and states, conditions and events are in italics.
[Figure 6.1: a natural-language requirement shown beside its behavior tree. The requirement reads: "When a car is at the entrance, if the gate is open the car may proceed, otherwise if the gate is closed, when and if the driver presses the button the gate will open and then the car may proceed." Node labels legible in the tree include GATE ?Closed?, DRIVER ??Pressed Button??, BUTTON [Pressed], GATE [Open] and CAR.]

Fig. 6.1. Translation of natural language to a Behavior Tree
The principal conventions of the notation for component-states are the graphical forms for associating with a component [State], ??Event??, ?Decision?, [Attribute := expression | State ] or output data-flow and input data-flow >DataInput<. Exactly what can be an event, a decision, a state, etc. is built on the formal foundations of expressions, Boolean expressions and quantifier-free formulae (qff). To assist with traceability to original requirements a simple convention is followed. Tags (e.g. R1 and R2, etc., see below) are used to refer to the original requirement in the document that is being translated. Record/data definitions, other constraints, and relations are handled by state extension. System states are
Label             Component-State                        Semantics
Internal State    tag COMPONENT [State]                  Indicates that the component has realized the particular internal state and then passes control to its output.
Attribute-State   tag COMPONENT [Attribute := Value]     Indicates that the component will assign a value to one of its attributes.
IF-State          tag COMPONENT ? IF-State ?             Indicates that the component will only pass control if IF-state is TRUE.
WHEN-State        tag COMPONENT ?? WHEN-State ??         Indicates that the component will only pass control when AND if the event WHEN-state happens after reaching this component-state.
Guard-State       tag COMPONENT ??? WHEN-State ???       Indicates that the component will only pass control when the event Guard-state happens OR has happened prior to reaching this component-state.
Data-out State    tag COMPONENT < Dataflow-State >       Indicates that when the component has realized the state it will pass the data to the component that receives the flow.
Data-in State     tag COMPONENT > Dataflow-State <       Indicates that when the component has realized the state it will have received the data from the component that sends the flow.
System-State      tag System-Name [ State ]              The system component, System-Name, realizes the state "State" and then passes control to its output.

Fig. 6.2. Behavior Tree Notation, Key Elements
used to model high-level (abstract) behavior, some preconditions/postconditions and possibly other behavior that has not been associated with particular components. They are represented by rectangles with a double line (===) border. A brief summary of key elements of the notation is given in Figure 6.2 (see web-site http://www.sqi.gu.edu.au/gse/ for details).
In practice, when translating functional requirements into behavior trees we often find that there is a lot of behavior that is either missing or is only implied by a requirement. We mark implied behavior with a + in the tag (and/or the colour yellow if colour can be shown). Behavior that is missing is marked with a - in the tag (and/or the colour red). Explicit behavior in the original requirement that is translated and captured in the behavior tree has no +/- marking, and the colour green is used (see Figure 6.4). These conventions maximize traceability to original requirements. The Green-Yellow-Red traffic-light metaphor is intended to indicate the need for caution (yellow) and danger (red) and to draw attention to deficiencies in the original requirements. Subsequent change to a working system's requirements/design is marked by a ++ in the tag and/or the colour blue. These conventions are particularly useful when discussing requirements and designs with stakeholders. They provide a clear record of the evolution of, and deficiencies in, the original system.

We can now explore the relationship between a set of functional requirements and their corresponding design. From this follows a systematic method for constructing a design from its requirements.
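To make the notation concrete, the following Python sketch shows one possible in-memory form of a behavior tree with traceability tags. It is an assumption-laden illustration, not part of the Behavior Tree Notation itself: the names `Node`, `FORMS` and `status` are hypothetical, only three of the component-state forms are covered, and the example tree is one reading of the car-and-gate requirement of Figure 6.1.

```python
from dataclasses import dataclass, field

# Bracket conventions for three of the component-state forms.
FORMS = {
    "state": "[{}]",    # internal state realized
    "if": "?{}?",       # decision
    "when": "??{}??",   # event
}


@dataclass
class Node:
    component: str
    behavior: str
    kind: str = "state"
    tag: str = ""                     # traceability tag, e.g. "R1" or "R1+"
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child node and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def label(self):
        body = FORMS[self.kind].format(self.behavior)
        return f"{self.tag} {self.component} {body}".strip()

    def render(self, depth=0):
        """Return the tree as indented text lines, one node per line."""
        lines = ["  " * depth + self.label()]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


def status(tag):
    """Classify a tag by its +/- suffix, following the marking conventions."""
    if tag.endswith("++"):
        return "changed"    # later change to a working system (blue)
    if tag.endswith("+"):
        return "implied"    # behavior only implied by the requirement (yellow)
    if tag.endswith("-"):
        return "missing"    # behavior missing from the requirement (red)
    return "explicit"       # behavior stated in the requirement (green)


# One reading of the car-and-gate requirement, all nodes traced to R1.
root = Node("CAR", "At-Entrance", "when", "R1")
root.add(Node("GATE", "Open", "if", "R1")).add(Node("CAR", "Proceed", tag="R1"))
closed = root.add(Node("GATE", "Closed", "if", "R1"))
pressed = closed.add(Node("DRIVER", "Pressed Button", "when", "R1"))
opened = pressed.add(Node("BUTTON", "Pressed", tag="R1")).add(Node("GATE", "Open", tag="R1"))
opened.add(Node("CAR", "Proceed", tag="R1"))
```

Calling `"\n".join(root.render())` gives an indented text view of the tree, and `status("R3+")` returns `"implied"`, so a reviewer could list every node whose behavior was not explicit in the source requirement.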
6.3. Genetic Design

Conventional software engineering applies the underlying design strategy of constructing a design, $D$, that will satisfy its set of functional requirements $F$. $F$ may be represented by a set of natural language statements $\{R_1, R_2, \ldots, R_m\}$. Representing this symbolically we have:

$$D \;\mathbf{sat}\; F$$

In contrast to this, genetic design enables us to use the behavior tree notation to construct a design out of its set of functional requirements. We achieve this by first applying a translation relation $T$ to the natural language description of each functional requirement. A translation relation $T$ takes a natural language statement of a functional requirement as input and produces a set of requirements behavior trees as output (in most cases one natural language statement of a functional requirement translates to one requirements behavior tree). In general, the requirement $R_i$ yields one or more (the set $F_i$) requirements behavior trees (RBTs). The matter is further complicated because there may be more than one equivalent behavior tree translation (the set $E_i$) for the original natural language requirement $R_i$. So in each case, we have:

$$R_i \; T \; F_i \quad \text{for all } i \in [1..m], \qquad F_i = \{F_{i1}, F_{i2}, \ldots, F_{ik(i)}\}, \qquad F_i \in E_i$$
A complete set of requirements behavior trees F is obtained from the union of all the sets Fi. A design behavior tree D is constructed by integrating the behavior trees for all individual functional requirements (RBTs), one-at-a-time, into an evolving design behavior tree (DBT). Applying the behavior tree integration relation I to all the RBTs yields a design behavior tree D (it may be possible to construct more than one DBT for a given set of RBTs: when this happens the resulting set of DBTs yield equivalent architectures and component behavior projections because integration does not add any new direct component-state to the component-state
relationship of a set of RBTs). We have:

$$F \; I \; D \quad \text{where } F = F_1 \cup F_2 \cup \ldots \cup F_m$$

If we use $a < b$ to denote that the behavior tree $a$ is a sub-structure of the behavior tree $b$ then:

$$F_{i1} < D \wedge F_{i2} < D \wedge \ldots \wedge F_{ik(i)} < D \quad \text{for all } i \in [1..m] \text{ and all } k(i) \geq 1$$

This tells us that each RBT is a substructure of the design behavior tree. This very significantly reduces the complexity of a key part of the design process and any subsequent change process. Any design built out of its requirements will conform to the weaker criterion of satisfying its set of functional requirements. What we are suggesting is that a set of functional requirements, represented as behavior trees, in principle at least (when they form a complete and consistent set), contains enough information to allow their composition/integration. This property is the exact same property that a set of pieces for a jigsaw puzzle possesses. And, interestingly, it is the same property possessed by a set of genes that create a living entity. Witness the remark by geneticist Adrian Woolfson in his recent book, Life Without Genes [15](p.12): "we may thus imagine a gene kit as a cardboard box filled with genes. On the front and sides of the box is a brightly coloured picture of the creature that might in principle be constructed if the information in the kit is used to instruct a biological manufacturing process".

The obvious question that follows is: "what information is possessed by a set of functional requirements that might allow their composition or integration?" The answer follows from the observation that the behavior expressed in functional requirements does not "just happen". There is always a precondition that must be satisfied in order for the behavior encapsulated in a functional requirement to be accessible or applicable or executable.
In practice, this precondition may be embodied in the behavior tree representation of a functional requirement (as a component-state or as a composed set of component states) or it may be missing - the latter situation represents a defect that needs rectification. The point to be made here is that this precondition is needed, in each case, in order to integrate the requirement with at least one other member of the set of functional requirements for a system. (In practice, the root node of a behavior tree often embodies the precondition we are seeking). We call this foundational requirement of the genetic software engineering method, the precondition axiom.
Precondition Axiom
Every constructive, implementable, individual functional requirement of a system, expressed as a behavior tree, has associated with it a precondition that needs to be satisfied in order for the behavior encapsulated in the functional requirement to be applicable.

A second building block is needed to facilitate the composition of functional requirements expressed as behavior trees. Jigsaw puzzles, together with the precondition axiom, give us the clues as to what additional information is needed to achieve integration. With a jigsaw puzzle, what is pivotal is not the order in which
we put the pieces together, but rather the position where we put each piece. If we are to integrate behavior trees in any order, one at a time, an analogous requirement is needed. We have already said that a functional requirement's precondition needs to be satisfied in order for its behavior to be applicable. It follows that some other requirement, as part of its behavior tree, must establish the precondition. This rule for composing/integrating functional requirements expressed as behavior trees is more formally expressed by the following axiom.

Interaction Axiom
For each individual functional requirement of a system, expressed as a behavior tree, the precondition it needs to have satisfied in order to exhibit its encapsulated behavior, must be established by the behavior tree of at least one other functional requirement that belongs to the set of functional requirements of the system. (The functional requirement that forms the root of the design behavior tree is excluded from this requirement. The external environment makes its precondition applicable).
Fig. 6.3. Interaction Axiom-Graphic Form
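The two axioms suggest a simple integration procedure, sketched below in Python under stated assumptions. It is a simplification for illustration, not the published method: the names `Node`, `find`, `integrate`, `integrate_all` and `spine` are hypothetical, a precondition is taken to be the label of a requirement's root node, and it is considered established when a node with the same label already occurs in the evolving design tree. The example requirements are invented fragments of the car-and-gate behavior, plus one unrelated requirement whose precondition nothing establishes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                        # e.g. "GATE [Open]"
    children: list = field(default_factory=list)


def find(tree, label):
    """Return the first node carrying the label (depth-first), or None."""
    if tree.label == label:
        return tree
    for child in tree.children:
        hit = find(child, label)
        if hit is not None:
            return hit
    return None


def integrate(dbt, rbt):
    """Graft rbt into dbt where rbt's root precondition is established."""
    site = find(dbt, rbt.label)
    if site is None:
        return False                  # precondition not (yet) established
    site.children.extend(rbt.children)
    return True


def integrate_all(root_rbt, others):
    """Integrate trees one at a time, in any order, until nothing more fits."""
    dbt, pending, progress = root_rbt, list(others), True
    while pending and progress:
        progress = False
        for rbt in list(pending):
            if integrate(dbt, rbt):
                pending.remove(rbt)
                progress = True
    return dbt, pending               # leftovers signal missing preconditions


def spine(tree):
    """Labels along the first-child path, used here for a quick inspection."""
    labels = [tree.label]
    while tree.children:
        tree = tree.children[0]
        labels.append(tree.label)
    return labels


# Invented requirement fragments; the last one has no established precondition.
ra = Node("CAR ??At-Entrance??", [Node("GATE ?Closed?")])
rb = Node("GATE ?Closed?", [Node("DRIVER ??Pressed Button??", [Node("GATE [Open]")])])
rc = Node("GATE [Open]", [Node("CAR [Proceed]")])
rd = Node("ALARM [Ringing]", [Node("GUARD [Alerted]")])
design, leftover = integrate_all(ra, [rc, rb, rd])
```

Although `rc` is offered before `rb`, it is grafted only after `rb` has established its precondition, which mirrors the jigsaw observation that position matters and order does not. The tree left in `leftover` is the kind of defect the interaction axiom is meant to expose.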
The precondition axiom and the interaction axiom play a central role in defining the relationship between a set of functional requirements for a system and the corresponding design. What they tell us is that the first stage of the design process, in the problem domain, can proceed by first translating each individual natural language representation of a functional requirement into one or more behavior trees. We may then proceed to integrate those behavior trees just as we would with a set of jigsaw puzzle pieces. What we find when we pursue this whole approach to software design is that the process can be reduced to the following four overarching steps:

• Requirements translation (problem domain)
• Requirements integration (problem domain)
• Component architecture transformation
• Component behavior projection

Each overarching step needs to be augmented with a verification/defect detection and refinement step designed specifically to isolate and correct the class of defects that show up in the different work products generated by the process. Our intent now is to introduce the main ideas of genetic design. Study of a simple example has proven to be a good way to provide an initial understanding of the overall process. For our purposes, and for the purposes of comparison, we will use a design example for a Microwave Oven that has already been published in the literature [11]. The seven stated functional requirements for the Microwave Oven problem [11](p.36) are given in Table 6.1. Shlaer and Mellor have applied their state-based Object-Oriented Analysis method to this set of functional requirements.

Table 6.1. Functional Requirements for Microwave Oven
R1. There is a single control button available for the use of the oven. If the oven is idle with the door closed and you push the button, the oven will start cooking (that is, energize the power-tube for one minute).
R2. If the button is pushed while the oven is cooking, it will cause the oven to cook for an extra minute.
R3. Pushing the button when the door is open has no effect (because it is disabled).
R4. Whenever the oven is cooking or the door is open, the light in the oven will be on.
R5. Opening the door stops the cooking.
R6. Closing the door turns off the light. This is the normal idle state, prior to cooking when the user has placed food in the oven.
R7. If the oven times out, the light and the power-tube are turned off and then a beeper emits a sound to indicate that the cooking has finished.

6.3.1. Requirements Translation
Requirements translation is the first formal step in the Genetic Design process. Its purpose is to translate each natural language functional requirement, one at a time, into one or more behavior trees.
6.3.1.1. Translation
Translation identifies the components (including actors and users), the states they realise (and attribute/property assignments), the events and decisions/constraints that they are associated with, the data that components exchange, and the causal, logical and temporal dependencies associated with component interactions.
Example Translation
Translation of R7 from Table 6.1 will now be considered in slightly more detail. For this requirement we have put the states/actions in italics and made the components bold, that is: If the oven times out the light and the power-tube are turned off and a beeper emits a sound to indicate that cooking has finished. Figure 6.4 gives a translation of this requirement, R7, to a corresponding requirements behavior tree (RBT).
Fig. 6.4. Behavior Tree produced by translation of requirement R7 in Table 6.1
In this translation we have followed the convention of trying wherever possible to associate higher level system states (here OVEN states) with each functional requirement, to act as preconditions/postconditions. What we see from this translation process is that even for a very simple example, it can identify problems that, on the surface, may not otherwise be apparent (e.g. the original requirement, as stated, leaves out the precondition that the oven needs to be cooking in order to subsequently time out). In the behavior tree representation, tags (here R7) provide direct traceability back to the original statement of requirements.
Our claim is that the translation process is highly repeatable if translators forego the temptation to interpret, design, introduce new things, and leave things out, as they do an initial translation. In other words translation needs to be done meticulously, sentence-by-sentence and word-by-word. In doing translations there is no guarantee that two people will get exactly the same result because there may be more than one way of representing the same behavior. The best we can hope for is that they would get an equivalent result.
The translations of the other six functional requirements for the Microwave Oven from Table 6.1 are shown in figure 6.5, together with implied behavior that includes oven system states we have chosen to add. While these additions are "new" they are clearly distinguished from the original behavior. Later, the relevance and importance of including system-states will be made clear. For now it suffices to say that they provide a representation (high-level description) of the behavior of a system independent of the behavior of any of the components in the system.
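To make the translation of R7 concrete, the sketch below encodes it as a small tree of tagged component-state nodes. The `Node` class, its field names, the `implied` flag, and the helper functions are hypothetical choices made for illustration; the node sequence follows the wording of R7 plus the implied OVEN [Cooking] precondition discussed above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str                 # requirement tag, for traceability
    component: str
    behavior: str
    kind: str = "state"      # "state" or "event" (assumed vocabulary)
    implied: bool = False    # True for behavior added by the translator
    children: list = field(default_factory=list)

    def label(self):
        if self.kind == "event":
            return f"{self.component} ??{self.behavior}??"
        return f"{self.component} [{self.behavior}]"


def chain(*nodes):
    """Link nodes into a single sequential branch and return the root."""
    for parent, child in zip(nodes, nodes[1:]):
        parent.children.append(child)
    return nodes[0]


def walk(node):
    """Yield every node of the tree in depth-first order."""
    yield node
    for child in node.children:
        yield from walk(child)


r7 = chain(
    Node("R7", "OVEN", "Cooking", implied=True),  # precondition R7 leaves out
    Node("R7", "OVEN", "Timed-Out", kind="event"),
    Node("R7", "LIGHT", "Off"),
    Node("R7", "POWER-TUBE", "Off"),
    Node("R7", "BEEPER", "Sounded"),
    Node("R7", "OVEN", "Cooking-Finished"),
)

for node in walk(r7):
    marker = "+" if node.implied else " "
    print(f"{node.tag}{marker} {node.label()}")
```

Keeping the tag on every node is what preserves traceability: any node in a later, integrated tree can still be traced to the requirement that introduced it.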
Fig. 6.5. Behavior Trees for Microwave Oven translated from Table 6.1
6.3.1.2. Translation Defect Detection
During initial translation of functional requirements to behavior trees there are four principal types of defects that we encounter:
• Aliases
• Ambiguities, where not enough context has been provided to allow us to distinguish among more than one possible interpretation of the behavior described. Unfortunately there is no guarantee that a translator will always recognize an ambiguity when doing a translation: this obviously impacts our chances of achieving repeatability when doing translations.
• Incorrect causal, logical and temporal attributions. For example, in R4 of our Microwave Oven example, it is implied that the oven realizing the state "cooking" causes the light to go on. Here it is really the button being pushed which causes the light to go on and the system (oven) to realize the system-state "cooking". An example of an incorrect temporal attribution would be "the man got in the car and drove off". Here "and" should be replaced by "then", because getting in the car happens first.
• Missing implied and/or alternative behavior. For example, in R5 for the oven, the actor who opens the door is left out, together with the fact that the power-tube needs to be off for the oven to stop cooking.
It is necessary to maintain a vocabulary of component names and a vocabulary of states associated with each component to maximize our chances of detecting aliases. In Table 6.2 we show the states collected for the OVEN component from requirement R7. As other requirements are translated more states for the OVEN component may accumulate in this table. In practice we have a tool [12] that automatically collects and generates this information as each requirement is entered graphically into the system.
Table 6.2. States for a Component from Requirement 7
COMPONENT   STATE              FR
OVEN        Cooking            R7
OVEN        Timed-Out          R7
OVEN        Cooking-Finished   R7
A final point should be made about translation. It does not matter how good or how formal the representations are that we use for design/modeling: unless the first step that crosses the informal-formal barrier is as rigorous, intent-preserving, and as close to repeatable as possible, all subsequent steps will be undermined because they are not built on a firm foundation.
Behavior trees give us a chance to create that foundation.
6.3.2. Requirements Integration
When requirements translation has been completed, each individual functional requirement has been translated to one or more corresponding requirements behavior trees (RBTs). We can then systematically and incrementally construct a design behavior tree (DBT) that will satisfy all its requirements by integrating the requirements behavior trees. Integrating two behavior trees turns out to be a relatively simple process that is guided by the precondition and interaction axioms referred to above. In practice, it most often involves locating where (if at all) the component-state root node of one behavior tree occurs in the other tree and grafting the two trees together at that point. This process generalises when we need to integrate N behavior trees. We attempt to integrate two behavior trees at a time: either two RBTs, an RBT with a partial DBT, or two partial DBTs. In some cases, because the precondition for executing the behavior in an RBT has not been included, or important behavior has been left out of a requirement, it is not clear where a requirement integrates into the design. This immediately points to a problem with the requirements. In other cases, there may be requirements/behavior missing from the set, which prevents integration of a requirement. Attempts at integration uncover such problems with requirements at an early time when the consequences and costs are likely to be minimized.
Fig. 6.6. Result of Integrating R6 and R3C
Example Integration
To illustrate the process of requirements integration we will integrate requirement R6 with part of the constraint requirement R3C to form a partial design behavior tree (DBT). (Note that, in general, constraint requirements need to be integrated into a DBT wherever their root node appears in the DBT.) This is straightforward because the root node (and precondition) of R3C, DOOR[Closed], occurs in R6. We integrate R3C into R6 at this node. Because R3C is a constraint it should be integrated into every requirement that has a door closed state (in this case there is only one such node). The result of the integration is shown in Figure 6.6. When R6 and R3C have been integrated we have a "partial design" (the evolving design behavior tree) whose behavior will satisfy R6 and the R3C constraint. In this partial DBT it is clear and traceable where and how each of the original functional requirements contributes to the design. Using this same behavior tree grafting process, a complete design is constructed (it evolves) incrementally by integrating RBTs and/or DBTs pairwise until we are left with a single final DBT (see Figure 6.7).
Fig. 6.7. Integration of all functional requirements for Microwave Oven
This is the ideal for design construction that is realizable when all requirements are consistent, complete, composable and do not contain redundancies. When it is not possible to integrate an RBT or partial DBT with any other it points to an integration problem with the specified requirements that needs to be resolved. Being able to construct a design incrementally significantly reduces the complexity of this critical phase of the design process. And importantly, it provides direct traceability to the original natural language statement of the functional requirements.
6.3.2.1. Integration Defect Detection
During integration of functional requirements represented as behavior trees (RBTs) there are four principal types of defects that we encounter:
• The RBT that we are trying to integrate has a missing or inappropriate (it may be too weak or too strong or domain-incorrect) precondition that prevents integration by matching the root of the RBT with a node in some other RBT or in the partially constructed DBT. For example, take the case of R5 for the Microwave Oven: it can only be integrated directly with R1 by including OVEN [Cooking] as a precondition.
• The behavior in a partial DBT or RBT, where the current RBT needs to be integrated, is missing or incorrect.
• Both of the first two types of defects may occur at the same time. Resolving this type of problem may sometimes require domain knowledge. Examples of this and other integration problems can be found in reference [4].
• In some cases, when we attempt to integrate an RBT we find that more than the leaf node overlaps with the other RBT or partial DBT. In such cases this redundancy can be removed at the time of integration.
While in principle it is possible to construct an algorithm to "automate" the integration step, because of the integration problems that we frequently encounter in real systems it is better to have manual control over the integration process. Tool support can however be used to identify the nodes that satisfy the matching criterion for integration.
Our experience with using integration in large industry systems is that the method uncovers, early on, problems that have been completely overlooked using conventional formal inspections. The lesson we have learned is that requirements integration is a key integrity check that it is always wise to apply to any set of requirements that are to be used as a basis for constructing a design.
6.3.2.2. Inspection and Automated Defect Detection
Once we have a set of functional requirements represented as an integrated design behavior tree we are in a strong position to carry out a range of defect detection steps. The design behavior tree turns out to be a very effective representation for revealing a range of incompleteness and inconsistency defects that are common in original statements of requirements. The Microwave Oven System case study has its share of incompleteness and other defects. The DBT can be subject to a manual visual formal inspection and, because Behavior Trees have a formal semantics [14], we can also use tools [12] to do automated formal analyses. In combination, these tools provide a powerful armament for defect finding. With simple examples like the Microwave Oven it is very easy to do just a visual inspection and identify a number of defects. For larger systems, with large numbers of states and complex control structures, the automated tools are essential for systematic, logically based, repeatable defect finding. We will now consider a number of systematic manual and automated defect checks that can be performed on a DBT.
Missing Conditions and Events
A common problem with original statements of requirements is that, when they describe a set of conditions that may apply at some point in the behavior of the system, they often leave out the case that would make the behavior complete. The simplest such case is where a requirement says what should happen if some condition applies but the requirements are silent on what should happen if the condition does not apply. There can also be missing events at some point in the behavior of the system. For example, with the Microwave case study a very glaring missing event is in requirement R5. It says "opening the door stops the cooking" but neglects to mention that it is possible to open the Microwave door when it is idle/closed. To systematically "discover" this event-incompleteness defect we can use the following process. We make a list of all events that can happen in the system (this includes the user opening the door). We then examine those parts of the DBT where events occur and ask the question "could any of the other events that we have listed occur at this point?" In the case where OVEN [Idle] occurs, the only event in the original requirements is the user-event of pushing the button to start the cooking (see Figure 6.8).
Fig. 6.8. Missing event detected by the event completeness check rule
In this context, when we ask what other event, out of our list of events, could happen when the Oven is Idle, we discover the user could open the door. We have added this missing event in as requirement R8.
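The event completeness check described above amounts to a set difference followed by human judgment. The sketch below covers only the mechanical part, under assumed data: the event list and the per-state sets of handled events are read off requirements R1 to R7 by hand, and the name `candidate_missing_events` is hypothetical. Deciding which candidates can really occur at a given point remains a domain question.

```python
# All events that can happen in the system (assumed list).
ALL_EVENTS = {
    "USER ??Button-Push??",
    "USER ??Door-Opened??",
    "USER ??Door-Closed??",
    "OVEN ??Timed-Out??",
}

# Events the original requirements handle at each system state (assumed).
HANDLED = {
    "OVEN [Idle]": {"USER ??Button-Push??"},
    "OVEN [Cooking]": {"USER ??Button-Push??", "USER ??Door-Opened??",
                       "OVEN ??Timed-Out??"},
    "OVEN [Open]": {"USER ??Door-Closed??"},
}


def candidate_missing_events(all_events, handled):
    """For each state, list the events the requirements are silent about."""
    return {state: sorted(all_events - events)
            for state, events in handled.items()}


candidates = candidate_missing_events(ALL_EVENTS, HANDLED)
for state, events in candidates.items():
    print(state, "->", events)
```

For OVEN [Idle] the candidates include the user opening the door, which is the missing event added as R8; the other candidates for that state would be rejected by a reviewer as not possible when the oven is idle with the door closed.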
Missing Reversion Defects
Original statements of requirements frequently miss out on including details of reversions of behaviour that are needed to make the behaviour of a system complete. Systems that "behave", as opposed to programs that execute once and terminate, must never get into a state from which no other behaviour is possible: if such a situation arises the integrated requirements have a reversion defect. Take the case of the Microwave Oven DBT in figure 6.7. We find that if the Oven reaches either an OVEN [Cooking-Stopped] or an OVEN [Cooking-Finished] state then no further behaviour takes place. In contrast, when the system realizes an OVEN ^ [Cooking] leaf-node it "reverts" to a node higher up in the tree and continues behaving. To correct these two defects we need to append, respectively, to the R5 and R7 leaf nodes the two reversion nodes "^" shown in figure 6.9.
Fig. 6.9. Missing Reversion "^" nodes added to make DBT behaviour reversion-complete
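A reversion-completeness check of this kind reduces to finding leaf nodes that are not reversion nodes. The following is a minimal sketch over an assumed fragment of the DBT below OVEN [Cooking]; the nested-tuple encoding and the convention that a "^" in the label marks a reversion node are illustrative assumptions.

```python
# Assumed fragment of the DBT below OVEN [Cooking]: (label, children).
DBT_FRAGMENT = ("OVEN [Cooking]", [
    ("USER ??Button-Push??", [
        ("BUTTON [Pushed]", [
            ("OVEN [Extra-Minute]", [
                ("OVEN ^ [Cooking]", []),       # reversion leaf: fine
            ]),
        ]),
    ]),
    ("USER ??Door-Opened??", [
        ("DOOR [Open]", [
            ("POWER-TUBE [Off]", [
                ("OVEN [Cooking-Stopped]", []),  # dead end
            ]),
        ]),
    ]),
    ("OVEN ??Timed-Out??", [
        ("LIGHT [Off]", [
            ("POWER-TUBE [Off]", [
                ("BEEPER [Sounded]", [
                    ("OVEN [Cooking-Finished]", []),  # dead end
                ]),
            ]),
        ]),
    ]),
])


def dead_end_leaves(node):
    """Return labels of leaf nodes that do not revert to earlier behaviour."""
    label, children = node
    if not children:
        return [] if "^" in label else [label]
    leaves = []
    for child in children:
        leaves.extend(dead_end_leaves(child))
    return leaves


print(dead_end_leaves(DBT_FRAGMENT))
```

On this fragment the check reports the two leaf states named in the text, each of which needs a reversion node appended.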
Deadlock, Live-lock and Safety Checking
The tool we have built allows us to graphically enter behavior trees and store them using XML [12]. From the XML we generate a CSP (Communicating Sequential Processes) representation. There are several translation strategies that we can use to map behavior trees into CSP. Details of one strategy for translating behavior trees into CSP are given in [14]. A similar strategy involves defining sub-processes in which state transitions for a component are treated as events. For example, to model the DOOR [Open] to DOOR [Closed] transition the following CSP was generated by the translation system:
DoorOpen = userDoorClosed → doorClosed → DoorClosed
The CSP generated by the tool is fed directly into the FDR model-checker. This allows us to check the DBT for deadlocks and live-locks, and also to formulate and check some safety requirements [14].
Reversion Inconsistency Defect Detection
The current tool does a number of consistency checks on a DBT. One important check to do is a reversion "^" check where control reverts back to an earlier established state. For example, for the Microwave Oven example in Figure 6.7, one reversion check that needs to be done is to compare the states of all components at OVEN [Idle] with those at OVEN ^ [Idle]. What this check allows us to do is see whether all components are in the same state at the reversion point as at the original state realization point. Figure 6.10 shows the bottom part of the Oven DBT from Figure 6.7.
Fig. 6.10. Missing behaviour detected by checking OVEN [Idle] / OVEN ^ [Idle] component state consistency
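The comparison of component states at OVEN [Idle] and OVEN ^ [Idle] can be sketched as follows. This is an assumption-laden illustration, not the tool's algorithm: the DBT is reduced to one linear trace of (component, state) pairs from the root to the reversion node, events are omitted, and the names `snapshot` and `reversion_mismatches` are hypothetical.

```python
# Assumed linear trace through the DBT; the last entry is OVEN ^ [Idle].
PATH = [
    ("OVEN", "Open"), ("DOOR", "Closed"), ("LIGHT", "Off"),
    ("BUTTON", "Enabled"), ("OVEN", "Idle"),          # index 4: OVEN [Idle]
    ("BUTTON", "Pushed"), ("LIGHT", "On"), ("POWER-TUBE", "Energized"),
    ("OVEN", "Cooking"), ("LIGHT", "Off"), ("POWER-TUBE", "Off"),
    ("BEEPER", "Sounded"), ("OVEN", "Cooking-Finished"),
    ("OVEN", "Idle"),                                 # reversion node
]


def snapshot(path, upto):
    """Latest known state of each component after path[0..upto]."""
    states = {}
    for component, state in path[: upto + 1]:
        states[component] = state
    return states


def reversion_mismatches(path, origin_index):
    """Components whose state at the reversion differs from the origin."""
    before = snapshot(path, origin_index)
    after = snapshot(path, len(path) - 1)
    return {
        component: (before[component], after[component])
        for component in before
        if before[component] != after[component]
    }


print(reversion_mismatches(PATH, origin_index=4))
```

Only components whose state is known at the original realization point are compared, so components first mentioned later in the trace do not produce spurious reports.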
We see that requirement R7 (and the DBT in Figure 6.7) is silent on any change to the state of the BUTTON component. This means we have from R1 that BUTTON [Pushed] still holds when OVEN ^ [Idle] is realised. However this is inconsistent with OVEN [Idle] established by R6 and constraint R3, which has the state for BUTTON as BUTTON [Enabled]. That is, the system-state definitions which show up the inconsistency are as follows:
OVEN [Idle] = DOOR [Closed] ∧ LIGHT [Off] ∧ BUTTON [Enabled] ∧ ...
OVEN ^ [Idle] = DOOR [Closed] ∧ LIGHT [Off] ∧ BUTTON [Pushed] ∧ ...
These sorts of subtle defects are otherwise difficult to find without systematic and automated consistency checking. There are a number of other systematic checks that can be performed on a DBT, including the checking of safety conditions (e.g., in the Microwave Oven requirement R5, it indicates that the door needs to realize the state open to cause the power-tube to be turned off: this clearly could be a safety concern). We will not pursue these checks here as our goal has only been to give a flavour of the sort of systematic defect finding that is possible with this integrated requirements representation. We claim, because of its integrated view, that a DBT probably makes it easier to "see" and detect a diverse set of subtle types of defects, like the ones we have shown here, compared with other methods for representing requirements and designs. We have found many textbook examples where this is the case.
Once the design behavior tree (DBT) has been constructed, inspected and augmented/corrected where necessary, the next jobs are to transform it into its corresponding software or component architecture (or component interaction network, CIN) and to project from the design behavior tree the component behavior trees (CBTs) for each of the components mentioned in the original functional requirements.
6.3.3. Component Architecture Transformation
A design behavior tree is the problem domain view of the shell of a design that shows all the states and all the flows of control (and data), modelled as component-state interactions without any of the functionality needed to realize the various states that individual components may assume. It has the genetic property of embodying within its form two key emergent properties of a design: (1) the component architecture of a system and (2) the behaviors of each of the components in the system. In the DBT representation, a given component may appear in different parts of the tree in different states (e.g., in figure 6.7, the OVEN component may appear in the Open-state in one part of the tree and in the Cooking-state in another part of the tree). Interpreting what we said earlier in a different way, we need to convert a design behavior tree to a component-based design in which each distinct component is represented only once. This amounts to shifting from a representation where functional requirements are integrated (which may be thought of as a specification for the system) to a representation, which is part of the solution domain, where the components mentioned in the functional requirements are themselves integrated.
A simple algorithmic process may be employed to accomplish this transformation from a tree into a network. Informally, the process starts at the root of the design behavior tree and moves systematically down the tree towards the leaf nodes, including each component and each component interaction (e.g., arrow) that is not already present. When this is done systematically the tree is transformed into a component-based design in which each distinct component is represented only once and each component-component interaction is represented by a single line. We call this a Component Interaction Network (CIN) representation. Below, we show the eighth step of this transformation, involving the components on the eighth level (from the root) of the DBT. Here the POWER-TUBE gets included into the CIN and the link between the BUTTON and the LIGHT is added to the network.
Fig. 6.11. Eighth step in converting Oven DBT into a component architecture
Applying the tree-to-network conversion algorithm to level 8 of the DBT as shown in Figure 6.11, we see that the components DOOR, BUTTON and LIGHT have been encountered earlier, as have the DOOR→LIGHT and DOOR→BUTTON interactions. However the POWER-TUBE component has yet to be included in the evolving CIN. It is also necessary to include the two interactions BUTTON→LIGHT and BUTTON→POWER-TUBE, as they have yet to be included. These level 8 inclusions appear in the evolving CIN shown in Figure 6.12.
Fig. 6.12. Eighth step in converting Oven DBT into a component architecture
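The tree-to-network conversion described above can be sketched as a level-by-level traversal that records each component and each parent-to-child interaction the first time it is seen. The tree below is an assumed, simplified fragment of the Oven DBT that keeps only component names; the function `tree_to_network` is hypothetical, and skipping interactions between a component and itself is a choice made here for illustration.

```python
from collections import deque

# Assumed fragment of the Oven DBT, reduced to (component, children).
DBT = ("OVEN", [
    ("USER", [
        ("DOOR", [
            ("LIGHT", [
                ("OVEN", [
                    ("USER", [
                        ("BUTTON", [
                            ("LIGHT", []),
                            ("POWER-TUBE", []),
                        ]),
                    ]),
                ]),
            ]),
            ("BUTTON", []),
        ]),
    ]),
])


def tree_to_network(root):
    """Collapse a tree into (components, interactions), each listed once."""
    components, interactions = [], []
    queue = deque([root])
    while queue:  # breadth-first: one tree level after another
        component, children = queue.popleft()
        if component not in components:
            components.append(component)
        for child in children:
            edge = (component, child[0])
            if edge[0] != edge[1] and edge not in interactions:
                interactions.append(edge)
            queue.append(child)
    return components, interactions


components, interactions = tree_to_network(DBT)
print(components)
print(interactions)
```

In this fragment the last component added is POWER-TUBE and the last two interactions added are BUTTON→LIGHT and BUTTON→POWER-TUBE, mirroring the level 8 inclusions discussed in the text.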
The complete Component Interaction Network derived from the Microwave Oven design behavior tree is shown below in Figure 6.13. It defines the component-component interactions for each component. It also captures the "business model" or "conceptual design" for the system and represents the "first cut" at the software architecture for a system. We say it is a first cut at the architecture because it is sometimes possible to simplify the component interfaces and the number of interactions. For example, LIGHT has three "inputs" but only needs a single input to control its on/off status. In other situations, where there are a number of different interactions between two components, it may be necessary to have more than one connection between two components (e.g., the interface between OVEN and USER requires two "arrows" to distinguish four distinct control inputs in the final component implementation architecture). Studying the network in figure 6.13, we note that the USER component interacts with only the DOOR and the BUTTON, as we would expect. This outcome was not something we consciously planned, but it is something that followed naturally from accommodating the original requirements: this shows the constructive power of the method for producing a semantically based system architecture.
The CIN provides the starting point for constructing a component-based design and implementation. It identifies the component interactions, subject to simplification and rationalisation. The job that remains is to identify the integrated behaviour of each of the components in the network, which conceptually we "embed" in each of the components in the network. We then have an integrated component design that can be easily refined into a component-based implementation. We will now describe how to isolate the behaviors of the individual components present in the architecture from the DBT using projection.
Fig. 6.13. Component Interaction Network (CIN) derived from the OVEN DBT
What we have focussed on presenting thus far is largely a mechanism for building a system out of components. It yields an architecture built out of a set of connected, visible (at that level), interacting components each of which encapsulates and executes behavior. What is important about this architecture is that it includes a system-behavior-component (the Oven in our example) which encapsulates the external behavior that the system exhibits. A system with this architecture has the important property that it can easily be used either:
• as a standalone system, or
• as a component in a more complex system.
The inclusion of a system-behavior-component allows us to access the external behavior of the system without having any knowledge of its internal components and internal workings. All that is needed to use it is acquaintance with its external behavior. This facilitates reuse. Note that a system-behavior-component in this context is very different and separate from the "glue code", or integration-component, or other means needed to integrate the components that make up the system.
6.3.4. Component Behavior Projection
In the design behavior tree, the behavior of individual components tends to be dispersed throughout the tree (for example, see the OVEN component-states in the DBT in figure 6.7).
Fig. 6.14. Microwave Oven DBT with oven component behaviour highlighted
To implement components that can be embedded in, and operate within, the derived component interaction network, it is necessary to "concentrate" each component's behavior. We can achieve this by systematically projecting each component's behavior tree (CBT) from the design behavior tree. We do this essentially by ignoring the component-states of all components other than the one we are currently projecting. The resulting connected "skeleton" behavior tree for a particular component defines the behavior of the component that we will need to implement and encapsulate in the final component-based implementation. When conducting each projection we need to preserve information that allows us to identify alternative behaviors that result from sets of either events and/or conditions.
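Projection as described above can be sketched as a recursive filter that keeps only the nodes of one component and reconnects each kept node to its nearest kept ancestor. The DBT fragment, the nested-tuple encoding, and the function name `project` are assumptions made for illustration. This simplified version preserves branching but drops the events of other components, so it does not retain all the information about alternative behaviors that the text says a full projection needs.

```python
# Assumed DBT fragment: (component, behavior, children).
DBT = ("OVEN", "Open", [
    ("USER", "??Door-Closed??", [
        ("DOOR", "Closed", [
            ("LIGHT", "Off", [
                ("OVEN", "Idle", [
                    ("USER", "??Button-Push??", [
                        ("BUTTON", "Pushed", [
                            ("OVEN", "Cooking", [
                                ("USER", "??Door-Opened??", [
                                    ("DOOR", "Open", [
                                        ("OVEN", "Cooking-Stopped", []),
                                    ]),
                                ]),
                                ("OVEN", "??Timed-Out??", [
                                    ("LIGHT", "Off", [
                                        ("OVEN", "Cooking-Finished", []),
                                    ]),
                                ]),
                            ]),
                        ]),
                    ]),
                ]),
            ]),
        ]),
    ]),
])


def project(node, component):
    """Return the projected subtrees for `component` at or below `node`."""
    name, behavior, children = node
    projected_children = []
    for child in children:
        projected_children.extend(project(child, component))
    if name == component:
        return [(name, behavior, projected_children)]
    # Not our component: skip this node but keep what lies beneath it.
    return projected_children


oven_cbt = project(DBT, "OVEN")
print(oven_cbt)
```

The projected OVEN tree ends in the Cooking-Stopped and Cooking-Finished leaves, which makes it easy to see where behavior is still missing for that component.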
Example: Component Behavior Projection
To illustrate the effect and significance of component behavior projection we show the projection of the OVEN system component from the DBT for the Microwave Oven in figure 6.7.
Fig. 6.15. Projected Behavior for the OVEN component derived from the DBT
In Figure 6.14 the OVEN component is highlighted in the DBT and the result of projection is shown in Figure 6.15. Component behavior projection is a key design step in the solution domain that needs to be done for each component in the design behavior tree. When this process has been carried out for ALL the components in the DBT, that is, USER, BUTTON, etc., all the behavior in the DBT has been projected into the components that are intended to implement the system. That is, the complete set of component behavior projections conserves the behavior that was originally present in the DBT. What this set of component projections allows us to achieve is a metamorphosis from an integrated set of functional requirements to an integrated component-based design. It is worth commenting on what happens when we project out the behavior for the light component. What we see from this projection is that there is considerable "repeated" behavior that can be removed before embedding the light behaviour inside the light component.
Fig. 6.16. Light component behaviour projection and simplification
The implications of what has happened here are significant. What it means in general is that this component-based representation of the behaviour for a design factors out redundant and partially overlapping behavior, just as we have observed with the light component. This contrasts with object-oriented implementations where different scenarios that partially overlap each need to be separately implemented. Component behavior projections frequently show up incompleteness and other defects, as is the case with the OVEN component projection in figure 6.15. Missing is the behavior that should happen next when the cooking is stopped by opening the door, and what should happen after cooking has finished. We see from this that projection provides another systematic way of finding and removing subtle requirements defects that are difficult to identify by other means. Leaf nodes for the OVEN component need to revert (^) back to earlier behavior in order for the component's behavior to be complete and consistent. For example, we need to add after OVEN [Cooking-Stopped] a reverting leaf node OVEN ^ [Open] that transfers control back to OVEN [Open] at the top of the CBT. And, after OVEN [Cooking-Finished] we need to add a reverting leaf node OVEN ^ [Idle] to make the oven behavior complete. Such defects may be caught at the reversion-check stage of the DBT, discussed earlier, or later at the component behavior projection stage as we have indicated here.
Component Behavior Design
Once we have projected the component behavior tree (CBT) for each component from the DBT and corrected any defects, it is relatively straightforward to design the internal workings for a component together with its input/output interface. Below we show the corrected CBT for the button component. We proceed with the internal design of the button component by identifying each of the possible "output-states" for the button and then asking, for each output-state, from which internal button states the component can transition to the particular
Fig. 6.17. Projected and reversion-corrected Component Behavior Tree for Button
output state. The Button CBT directly provides this information. Take, for example, the output state BUTTON[Disabled]. The CBT tells us that the Button component can transition to this state when it is in either a BUTTON[Pushed] or a BUTTON[Enabled] state. Below in the button component design we show how this information is recorded together with the transition information for all the other output states. In undertaking the design of the Button component we have chosen to simplify and identify the set of components that provide input to the button. We have done a similar thing with the button's outputs. Because in this example Button only receives control from other components and only passes control to other components, we have used Booleans (T and F) as inputs and outputs. This gives the button component the same level of independence as a component in a hardware system would enjoy. It also allows us to clearly separate component implementation from the integration of the components in the system. In Figure 6.19 the simplified input and output interactions needed for the button component are shown. To complete the component-based design, we embed the behaviors of each component into the architectural design refined from the component interaction network (CIN). This involves simplifying or augmenting the component interfaces where needed and implementing the component interactions that deliver the integrated system behaviors. Finally, we must provide implementations to support the behaviors exhibited by each of the components. This is relatively straightforward to do from the component design. Component integration can be done either by using the facilities of a component framework [1] or by mapping the graphic integrated network into a component-based code implementation. The Microwave Oven problem has been previously studied in detail by Shlaer and Mellor [11]. They employ a state transition diagram and a state transition table to model the behaviors.
The state transition diagram (STD) bears some similarity to the projected behavior for the OVEN system component. However, the STD is an explicit network form rather than a tree-like form with reversions. Events involving other components cause transitions between STD states. In contrast, using genetic
Fig. 6.18. Button Component internal state transitions derived from the Button CBT (diagram relating INPUTS, including DOOR, through the internal states BUTTON[Enabled], BUTTON[Disabled], BUTTON[Pushed] and BUTTON[UnPushed], to OUTPUTS)
design, the behavior of all other components in the system is incorporated directly in the DBT. With STDs, traceability to the original requirements is not direct and transparent. In going from the original requirements to an STD, additional behavior not specified in the original requirements has been added without comment (e.g. the behavior to allow the oven to be opened when it is idle). In genetic design, direct translation, defect detection, and augmentation of requirements are clearly separated steps. The use of STDs makes no provision for the determination of a problem-dependent architecture from the requirements or for the identification of behavior for other components. Instead, the Shlaer and Mellor method proposes generic architectural classes for the finite state model, transition, timers and active instances (see [7], Chapter 9). In contrast, genetic design leads to an architecture and component behavior designs that are problem-dependent rather than generic.
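Returning to the Button component design, the step of asking, for each output state, from which internal states the component can reach it amounts to inverting the transitions of the CBT. The following Python sketch shows one way to record that table. The edge list is an assumption made for illustration: only the fact that BUTTON[Disabled] can be entered from BUTTON[Pushed] or BUTTON[Enabled] comes from the text, and the names `cbt_edges` and `predecessors` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical (from_state, to_state) transitions read off a Button CBT.
# Only the two transitions into "Disabled" are stated in the text.
cbt_edges: List[Tuple[str, str]] = [
    ("Disabled", "Enabled"),
    ("Enabled", "Pushed"),
    ("Enabled", "Disabled"),
    ("Pushed", "Disabled"),
    ("Pushed", "Enabled"),
]


def predecessors(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each output state to the sorted list of states it can be entered from."""
    table = defaultdict(set)
    for src, dst in edges:
        table[dst].add(src)
    return {state: sorted(sources) for state, sources in table.items()}


table = predecessors(cbt_edges)
print(table["Disabled"])  # ['Enabled', 'Pushed']
```

A table of this kind plays a role similar to a state transition table, but here it is derived mechanically from the projected tree rather than written by hand.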
Fig. 6.19. Input and Output Context for the Button component (components shown: USER, DOOR, BUTTON, LIGHT, POWER TUBE)
6.3.5. Systems Implemented on More than One Level
The system component architecture that we have proposed (and illustrated using the Oven system) thus far only deals with behavior on a single level. Complex systems usually need to be able to deal with behavior on more than one level. For example, we might have a two-level Kitchen system that exhibits behavior at the Kitchen level but includes a Microwave Oven which exhibits the behavior given in our treatment of the problem above. Once we open up the possibility of behavior on more than one level, the two obvious questions to ask are: "(a) what is the role of sub-systems in such an architecture, and (b) how are sub-systems related to components?" Exactly what a sub-system is often is not very clear. Consider, for example, the following statement from the literature about sub-systems. "A sub-system is not an object, nor a function, but a package of classes, associations, operations, events, and constraints that are interrelated and have a reasonably well-defined and (hopefully) small interface with other sub-systems" [10]. In the Unified System Model (USM), proposed here, things are clearer. The concept of a sub-system is unnecessary. What valid reason could there possibly be for discarding the notion of sub-system, a concept with such wide currency and acceptance? The answer lies in three things: (1) how we use standalone systems in more complex systems (e.g., how we use the Microwave Oven system in a Kitchen system), (2) how we describe behavior at all levels, and (3) the fact that, at whatever level we describe a system, it is built out of a set of connected, visible (at that level),
interacting components each of which encapsulates and executes behavior. (Note that in the Microwave system, the DOOR component is visible, but when we are describing the behavior of the Kitchen System (see below), using the Microwave System component, the DOOR component becomes invisible because it is at a lower behavior level).
Fig. 6.20. Kitchen system that includes Microwave Oven system as a component (diagram: KITCHEN system behavior component containing the MICROWAVE system-component)
In response to (1), we may use what is a standalone system at one level of description as a component (or more accurately a system-component) at the next higher level of behavioural description: that is, the Microwave Oven system may be used as a component in the Kitchen system. Therefore system-components obviate the need for sub-systems. On the second point, fundamental to genetic design and the Unified System Model is the idea that we describe behavior (and, for that matter, requirements) in exactly the same way at whatever level we are considering. For example, whether we are talking about the behavior of a Microwave Oven or the behavior of a Kitchen system that contains a Microwave Oven, the treatment is exactly the same. It follows that if we loosen our definition of a system to say it is built out of connected, visible, interacting components and/or system-components, then we have what we need. At whatever level we are describing a system, it will be built out of connected components, some of which may be systems in their own right. A consequence of this system description regime is that the need, architectural or otherwise, for sub-systems disappears. This model for a system seems preferable to some vague notion about what deserves to be a sub-system. The strategy also harmonises with the need for coarse-grained reuse of components. The word sub-system itself also gives us a semantic clue that reinforces this view: there does not appear to be anything architecturally to distinguish a sub-system from a system other than that it is part of a larger system. In a similar way, from an external view, in this model there is nothing to distinguish a system from a component: both exhibit behavior and both have an external behavior interface.
One other very important point needs to be made about the description of behavior, and therefore the requirements of a system. On whatever level we are describing the system, the scope of the language we can use to describe requirements/behavior for this level is restricted to the components that are visible at that level, and the states that each of those components can realize: no other information is relevant. We call this the Behavioral Description Invariant. Being mindful of this restriction, and employing it, simplifies the task of expressing requirements and behavior.
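The Behavioral Description Invariant lends itself to a simple mechanical check. The sketch below is an assumption-laden illustration rather than part of the method's tooling: the `System` class, the `violations` function and all state names are hypothetical. A system is modelled as a set of realizable states plus the components directly visible at its level, some of which may themselves be systems, and a behavior description is flagged wherever it mentions a component or state that is not visible at the level being described.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class System:
    """A component or system-component: its own states plus directly visible parts."""
    name: str
    states: Set[str]
    components: Dict[str, "System"] = field(default_factory=dict)

    def add(self, part: "System") -> "System":
        self.components[part.name] = part
        return self


def violations(system: System, behavior: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (component, state) pairs that are not describable at this level."""
    bad = []
    for comp, state in behavior:
        visible = system.components.get(comp)  # only direct parts are visible
        if visible is None or state not in visible.states:
            bad.append((comp, state))
    return bad


# Hypothetical two-level example: state names are invented for illustration.
microwave = System("MICROWAVE", {"Idle", "Cooking"}).add(System("DOOR", {"Open", "Closed"}))
kitchen = System("KITCHEN", {"InUse"}).add(microwave)

# At the Kitchen level the DOOR is hidden inside the MICROWAVE system-component.
print(violations(kitchen, [("MICROWAVE", "Cooking"), ("DOOR", "Open")]))  # [('DOOR', 'Open')]
# At the Microwave level the DOOR is visible.
print(violations(microwave, [("DOOR", "Open")]))  # []
```

The same `System` type serves for both the component and the system-component, which mirrors the point made above that nothing external distinguishes the two.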
6.4. Comparison with UML and Other Methods
As Jackson observed, new notations and new design methods are generally not enthusiastically received [8]. Such proposals are seen as just muddying the waters and tinkering around the edges. Our justification for ignoring this advice is that the Behavior Tree Notation and the accompanying genetic design method solve a fundamental problem: they provide a clear, simple, constructive and systematic path for going from a set of functional requirements to a design that will satisfy those requirements. Some of the major differences and advantages of the present approach are summarised below.
• The most significant advantage of genetic design over UML [1] and other methods is that it allows designers to focus on the complexity/detail of individual requirements while not having to worry about the detail in other requirements. That requirements can be dealt with one at a time (both for translation and integration) significantly reduces the complexity of creating a design. This very significantly reduces the short-term memory overload problem that has plagued software development for so long. In fact, this approach to design actually amplifies our ability to deal with complexity. UML and other methods do not do this.
• Another important advantage of genetic design over UML is that the component architecture and the component behaviour designs of all individual components in a system are both emergent properties of the design behavior tree (DBT) that is constructed by integrating all the functional requirements of the system.
• We have shown with the case study that integration of functional requirements is a powerful way to find behaviour gaps and other incompleteness and inconsistency defects in a set of functional requirements. Use-cases and scenario representations that involve abstraction and loose partial views of requirements information do not have the same focus on defects and therefore are unlikely to consistently deliver the same level of constructive defect detection.
• The focus on direct translation of individual functional requirements maximizes the chances of preserving and clarifying intent and guarantees traceability to original statements of requirements. Because the focus is on translation, the method approaches repeatability in design construction. The method also provides a single integrated view of the requirements, which we claim makes it easier to see and find defects either manually or using automated tools.
• We have not emphasised it here, but the genetic design method provides a formal, automatable method for mapping changes of requirements to changes in the architecture, the component interfaces, and the behaviors of the individual components affected by the change [13]. This follows because the architecture and individual component designs are emergent properties of the DBT that is modified by the change in functional requirements of the system.
• The main steps to get to a design are very clear: translation of requirements to behavior trees, integration of behavior trees, architecture transformation, and component behaviour projection for all components, followed by component design. In contrast, with UML there is a choice of notations to use and an accompanying set of process choices. Where to start and how to proceed is less obvious.
In scaling up genetic design to larger systems we need to introduce composition trees that provide an integrated view of data requirements (cf. functional requirements and behavior trees) and structure trees that provide a formal integrated view of structures that behaviour takes place on (e.g., a rail network). We also focus on deriving an initial, high-level, integrated system behavior tree (SBT) from the original requirements to gain cognitive control of the system's behaviour before considering the behaviour of requirements in detail. Because of space limitations, presentation of these aspects of the method will not be pursued here.
A comparison of Behavior Trees with Statecharts has been published elsewhere [5]. A separate comparison with Cleanroom Software Engineering [9] is available from the author. Considerable thought has gone into whether it is appropriate to use the term "genetic design" given the established use of the term "genetic algorithms" in a different context. The parallels of the proposed method with key genetic principles spelled out in Woolfson's recent book [15] give considerable justification to the claim that "genetic" is being accurately used here. The way behavior tree integration can result in the evolutionary growth of a design adds weight to the genetic characterization of the method. Genetic design exploits three fundamental genetic properties of a set of functional requirements that are revealed and become easily accessible when they are expressed and then integrated as behavior trees. It is these emergent properties that give the method its constructive power. Things may be summed up with the words of the eighteenth-century thinker Giambattista Vico, who said, "To understand something, and not merely be able to describe it, or analyse it into its component parts, is to understand how it came into being: its genesis, its growth ... true understanding is always genetic".
6.5. Conclusion
To advance the discipline of software engineering four major problems need to be addressed. Amplification of our ability to deal with complexity is the single most important problem to overcome in order to advance the practice of software engineering. Genetic design has the potential to make an important contribution to solving this problem because it allows us to consider, translate, and integrate only one requirement at a time. This very significantly reduces the short-term memory overload problem that has plagued software development for so long. A clarification of the steps to go from a set of requirements to a design is also central to advancing the practice of software engineering. Presently there would appear to be too much choice at every stage in terms of which process to follow, which notation(s) to use and which tools to employ. The root cause of this uncertainty seems to be a lack of a clear understanding of the relationship between a set of requirements and a design that will satisfy those requirements. The suggestion to build a design out of its requirements leads directly to a clarification and a simplification of the design process, and a reduction in the need for different notations. It also guarantees direct traceability to the original statements of requirements. That the component architecture and individual component behavior designs are both emergent properties of the integrated requirements (the design behavior tree) represents a further simplification, systematization and clarification of the design process. Early detection of requirements defects is another very significant problem that thwarts software engineering practice. Requirements translation, requirements integration and component behavior projection, coupled with both manual and automated analysis/inspection of design behavior trees, offer a powerful set of techniques for early requirements defect detection.
In particular, integration of requirements behavior trees turns out to be a very effective way of uncovering otherwise obscure defects because it forces us to consider each requirement directly in the context where it is used behaviorally. Yet another thorny challenge for software engineering is how to transition from a loose, informal natural language statement of functional requirements to a formal representation. Unless this transition approaches repeatability, all subsequent development within a formal framework is undermined because we may not be preserving the original intention. With many development approaches, when this barrier is crossed, we frequently find that some things get left out, new things get added in, and in other cases things are modified. In contrast, with behavior trees, because the focus is on translation, it follows that the emphasis is on meaning and on the preservation and clarification of intention. Although ambiguity is always a threat to repeatability, rigorous translation approaches repeatability when carried out by different translators. Genetic design has been successfully applied to a diverse range of real (often large) industrial applications. In all cases the method has proved very effective at
defect detection and in the control of complexity (in larger systems there can be layers of behavior: the method easily accommodates this). We expect the utility of the method will increase as we enhance the tool we are building to do more sophisticated graphics, multi-user editing, vocabulary control, and consistency checking. In summary, what we have presented is an intuitive, stepwise process for going from a set of functional requirements to a design. The method is attractive for its simplicity, its traceability, its ability to detect defects, its control of complexity, and its accommodation of change.
Acknowledgements
This work was supported by the Australian Research Council through a number of research grants. I would like to thank Xuelin Zheng, Lian Wen, Cameron Smith, David Billington, Zoran Milosevic and Dan Powell for many useful discussions on this work. I would also like to thank my colleagues in the SQI, in particular Bruce Hodgen, Don Abel, Angela Tuffley and Terry Rout, and more broadly in Griffith University, Rodney Topor and Chengzheng Sun, for their encouragement, and I would like to thank my students Kate McClung, Liam Casey, Chris English, David Tannock, Elenkayer Sithirasenan, Casey Ackworth, David Whyte, Chris Mclean, Saad Zafar, Michael Ransom-Smith, Brian Pack, Ashley Forsyth, John Seagrott, Henrik Hansen, Thomas Jansen and my software engineering students for their efforts in trialling the genetic design method and giving me plenty of useful feedback. Peter Lindsay, Ian Hayes, Kirsten Winter, David Carrington and Nisansala Yatapanage, my colleagues from the ARC Centre for Complex Systems at the University of Queensland, are also thanked for their support and useful discussions, as is David Abramson from Monash University. I would also like to thank David Harel and Rudy Marelly from the Weizmann Institute for their comments on an early version of this work and for sharing their pre-published work in this area.
Brian Henderson-Sellers and Cesar Perez from the University of Technology, Sydney are thanked for their valuable work on meta-modelling and Bill Waite from the University of Colorado is thanked for his input on the Behavior Tree grammar. Adrian Pitman and Shireane McKinnie from the Australian Defence Materiel Organisation (DMO) are thanked for their support for supplying several defence projects. Terry Stevenson and Nev Delap from Boeing are especially thanked for their support and encouragement as are Kim Olsen from Queensland Rail, Adrian Mortimer from Emilex, James Ross from Telelogic, Peter Thornton from Department of Foreign Affairs, Duncan Cross from Suncorp, Brendan Lovelock, Linda Mathews and Steve Plant from Signature Software, and Mark Rheinlander, Shawn Parr and Rob Whitney from Calytrix Technologies.
Bibliography
1. G. Booch, J. Rumbaugh, I. Jacobson, The Unified Modelling Language User Guide, Addison-Wesley, Reading, Mass. (1999).
2. A.M. Davis, A Comparison of Techniques for the Specification of External System Behavior, Comm. ACM, vol. 31 (9), 1098-1115 (1988).
3. R.G. Dromey, From Requirements to Design: Formalizing the Key Steps, International Conference on Software Engineering and Formal Methods (Invited Keynote Address), Brisbane, September (2003).
4. R.G. Dromey, Using Behavior Trees to Model the Autonomous Shuttle System, 3rd International Workshop on Scenarios and State Machines: Models, Algorithms and Tools (SCESM04), Edinburgh, May (2004).
5. R.G. Dromey, Behavior Trees: Amplifying Our Ability to Deal With Requirements Complexity, Proceedings Dagstuhl Seminar, September 2003, Scenarios: Models, Transformations and Tools, Lecture Notes in Computer Science, edited by Francis Bordeleau, Stefan Leue, and Tarja Systa (to appear).
6. D. Harel, Statecharts: Visual Formalism for Complex Systems, Sci. Comp. Prog., 8, 231-274 (1987).
7. D. Harel, W. Damm, LSCs: Breathing Life into Message Sequence Charts, 3rd IFIP Conf. on Formal Methods for Open Object-based Distributed Systems, New York, Kluwer (1999).
8. D. Jackson, Alloy: A Lightweight Object Modelling Notation, MIT Lab. for Comp. Sci. Report (1999).
9. S.J. Prowell, C.J. Trammell, R.C. Linger, J.H. Poore, Cleanroom Software Engineering: Technology and Process, Addison-Wesley, Reading, Mass. (1999).
10. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, W. Lorensen, Object-Oriented Modeling and Design, Prentice-Hall, Englewood Cliffs, NJ (1991).
11. S. Shlaer, S.J. Mellor, Object Lifecycles, Yourdon Press, New Jersey (1992).
12. C. Smith, K. Winter, I. Hayes, R.G. Dromey, P. Lindsay, D. Carrington, An Environment for Building a System Out of Its Requirements, 19th IEEE International Conference on Automated Software Engineering, Linz, Austria, Sept. (2004).
13. L. Wen, R.G. Dromey, From Requirements Change to Design Change: A Formal Path, SEFM04, IEEE International Conference on Software Engineering and Formal Methods, Beijing, September (2004).
14. K. Winter, Formalising Behavior Trees with CSP, International Conference on Integrated Formal Methods, IFM04, LNCS vol. 2999, 148-167 (2004).
15. A. Woolfson, Life Without Genes, Flamingo (2000).
Chapter 7 rCOS: A Relational Calculus of Components
Zhiming Liu†, He Jifeng‡ and Xiaoshan Li§
† International Institute for Software Technology, The United Nations University, Macao SAR, China
[email protected]
‡ Software Engineering Institute, East China Normal University, China
jifeng@sei.ecnu.edu.cn
§ Faculty of Science and Technology, University of Macau, Macao SAR, China
[email protected]
We present a model for components, their composition and refinement to be used in component-based software development. We describe how a component is specified in terms of its syntactic view at the interface level, its functional view at the requirement level and its internal view at the design level, and how components are composed. In component-based system development, a component consists of a set of interfaces, provided to or required from the software being developed. In component development, the component is executable code that can be coupled with other components via its interfaces. The developer has to ensure that the specification of a component is satisfied by its design and the design is met by its implementation. This work is an extended and revised version of [25].
7.1. Introduction
While component technologies such as COM [29], CORBA [31], and Enterprise JavaBeans [30] are widely used, there is so far no agreement on standard technologies for designing and creating components, nor on methods for composing them. It seems component-based programming is now in a situation similar to that of object-oriented programming in the 1980s:
My guess is that object-oriented programming will be in the 1980s what structured programming was in the 1970s. Everyone will be in favor of it. Every manufacturer will promote his products as supporting it. Every manager will pay lip service to it. Every programmer will practice it (differently). And no one will know just what it is. - T. Rentsch, September 1982
In this chapter, we consider a contract-oriented approach to the specification, design and composition of components. Component specification is essential, as it is impossible to manage change, substitution and composition of components if components have not been properly specified. We take the view of Meyer [28] and Szyperski [32] and consider a software element as a component if it is usable by other independently developed software elements. We shall not consider all aspects of a component specification, such as platform for deployment, QoS, cost and availability, but focus on a precise specification of its functionality and on the composition of components. Furthermore, we consider communication only by method invocation, and leave synchronous and asynchronous communication to future extension. These can be dealt with by introducing additional observables to cater for interfaces based on communication between concurrent processes (see Hoare and He [19]). To be able to describe a component precisely and completely and to ensure its correct integration and updating, the specification of the component must include the following elements:
• a precise description of the functionalities that the component offers, and
• a specification of all its dependencies.
The component must be usable on the sole basis of its specification, without access to non-interface information such as the source code, even if it is available. The specification of a component is useful to both component users and component developers. For users, the specification provides a definition of each of its syntactic interface operations with a design contract to capture the functionality. Because it is only the interface that is visible to users, its specification must be precise and complete. For the developer of the component, a design refers primarily to quantities of the syntactic interface, but may also introduce abstract state variables that model persistent, observable states of a component.
The specifications of components used in practical software development today are limited primarily to what we will call syntactic specifications. This form of specification includes the specifications used with technologies such as COM [29], the Object Management Group's CORBA [31], and Sun's JavaBeans [30]. The first two of these use different dialects of the IDL [12], whereas the third uses the Java programming language to specify component interfaces. The information that can be obtained from such a component specification is limited to what operations the component provides and the number and types of their parameters. In particular, there is no information about the effect of invoking an operation of a component interface, except for what might be guessed from the name of the operation and its parameters. Thus, the primary use of such specifications is for type checking of the client code and as a base for interoperability between independently developed components and applications. Several techniques for designing component-based systems include semantic specifications. In the book of Cheesman and Daniels [5], UML and the Object Constraint Language [33] (OCL) are used to write component specifications. Another well-known method that uses the same notations is Catalysis [7]. In these frameworks, an interface consists of a set of operations. In addition, a set of preconditions and postconditions is associated with each operation. Preconditions are
assumptions made by the component to be fulfilled before an operation is invoked. Postconditions are assertions that the component guarantees to hold just after the operation terminates, provided that the operation's precondition was true when it was invoked. Note that the idea of pre- and postconditions is not a novel feature of component-based software development; it is widely used in a variety of software techniques, such as the Vienna Development Method [21] (VDM) and Meyer's Design by Contract [28]. This chapter discusses three categories of concepts used in component-based systems: interface, contract and component. In a component-based system, interfaces are used to give the syntactic specification of a service by declaring a collection of operations and fields. Many programming languages support the concept of interfaces, including Java and CORBA IDL. Interfaces are not only important for separating the specification and the implementation of a component; as the software scales up to large systems, one can also use interfaces to specify the outside view of a package or subsystem. Interfaces are similar to abstract classes in Java, but there is a difference. Both allow one to define an interface and defer its implementation until later. However, an abstract class allows one to add implementations of some of the operations, whereas an interface forces one to defer even the definition of all operations. As syntactic specifications of services, however, the interfaces of a component do not provide any information about the effect, i.e. the functionality, of invoking an operation of a component. For the functional specification of the operations in an interface, it is necessary to know the conceptual state of the component that is represented by the fields of the interface.
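The pre- and postcondition reading of an operation described above can be illustrated with a small Python sketch. Everything named here (`Contract`, `invoke`, the `withdraw_10` example and its `balance` field) is hypothetical and chosen only for illustration; it is not the notation of this chapter. Note also that raising an error on a false precondition is a choice of the sketch: in a contract-based specification the component simply makes no guarantee in that case.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

State = Dict[str, Any]


@dataclass
class Contract:
    pre: Callable[[State], bool]          # assumption the client must establish
    post: Callable[[State, State], bool]  # guarantee relating before and after states


def invoke(contract: Contract, operation: Callable[[State], State], state: State) -> State:
    """Run `operation` on a copy of `state`, checking the contract on both sides."""
    if not contract.pre(state):
        raise ValueError("precondition violated: the client broke the contract")
    after = operation(dict(state))  # copy, so the before-state stays available
    if not contract.post(state, after):
        raise AssertionError("postcondition violated: the component broke the contract")
    return after


# Hypothetical operation on a conceptual state with a single field `balance`.
withdraw_10 = Contract(
    pre=lambda s: s["balance"] >= 10,
    post=lambda s, s2: s2["balance"] == s["balance"] - 10,
)


def withdraw_impl(s: State) -> State:
    s["balance"] -= 10
    return s


print(invoke(withdraw_10, withdraw_impl, {"balance": 25}))  # {'balance': 15}
```

The sketch also shows why the conceptual state matters: the postcondition can only be stated by referring to the field before and after the call, which a purely syntactic interface does not expose.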
In the context of the fields of the interface, the functionality of each operation op in the interface is given a design contract MSpec(op) that is specified in the form $p(x) \vdash Q(x, x')$ in Hoare and He's Unifying Theories of Programming [19] (UTP), where $x$ is the set/vector of variables (including fields and parameters of op), and $x'$ are the primed versions of the variables representing the values of $x$ in the post state of the execution. This specification is seen as a contract between the component and its client and the specifier and the designer [5], meaning that if the precondition $p(x)$ holds when op is invoked, the execution of op will terminate and the postcondition $Q(x, x')$ holds. This definition of a contract also agrees with that of Meyer [28]. To use the service op, a client has to ensure the precondition $p(x)$, and when this is true the component must guarantee the postcondition $Q(x, x')$. A component assumes an architectural context defined by its interfaces. It should also be replaceable, meaning that developers can replace one component with another, maybe better, as long as the new one provides and requests the compatible interfaces. Component specifications are thus most useful when a notion of compatibility is defined. We assume that a notion of syntactic compatibility is defined for the interfaces and extend it with a notion of refinement between designs.
7.1.1. Related work
There is much work on the definitions of software components. We take the informal views of Cheesman and Daniels [5], Heineman and Councill [17], D'Souza and Wills
[7], and Szyperski [32] that a component both provides to and requires services from other components. We use the notion of contract for formal specification of provided and required services. A contract in our framework is similar to that of Meyer [28]. However, we provide the notion of composition and a standard calculus to reason about and refine components at different levels of abstraction. A distinctive feature of our framework is the natural link of the component contract specification and its object-oriented implementation. The contract specification and the object-oriented notation are both defined in first-order predicate logic and based on Hoare and He's UTP [19]. A contract in the work of Helm et al [18] models the collaboration and behavioral relationships between objects. In our approach, we provide the separation between the specification of a contract for an interface from the specification of the behavior of the component that realizes the contract. A notion of contract can also be found in the work of Back et al [3] that emerged in the model of action systems. It promotes the separation between the specification of what an agent can do in a system and how they need to be coordinated to achieve the required global behavior of the system. However, the architectural components there are not explored. Andrade and Fiadeiro use contracts [1] to describe the coordinations among a number of partners (i.e. components or objects). Its main purpose is to support system architectural evolution and to deal with changes in business rules of the system application. Our contracts specify the services of components while we treat interaction and coordinations as part of the implementation of the components. Our aim is to support construction of software components and component software systems. However, it is interesting to investigate how these two notions of contracts can be combined to provide better support to both system construction and evolution. 
Recently, more delicate models have been proposed for describing the behavior of components and their coordination, such as [2, 11]. Reo [2] is a channel-based model with synchronous communication. The composition of components (and connectors) is defined in terms of a few operators. The model is defined operationally and thus algebraic reasoning and simulation are supported for analysis. Being event-based, the model in [11] considers a layered architecture for composition, provided by connectors (glueing operations). It considers real-time constraints and scheduling analysis. The behavior of a component is defined in the form of a timed automaton. This provides a good low-level model of the execution of a component. It is desirable to talk about a component at a higher level of granularity. The Stream Calculus [4, 34] is a denotational framework, but is otherwise similar to those of [2, 11] in being a channel-based model. In general, a denotational model supports the notion of stepwise development by refinement and links specifications at different levels of abstraction better. With the stream calculus, Broy also proposes multi-view modelling that includes an interface model, state machine model, process model, distributed system model, and data model [4]. In [27], we use the notion of contract to specify a use case at the requirement level of a system development in RUP. However, instead of a contract, a use-case controller class is used. The use-case controller classes are associated with conceptual classes in the application domain. In [27], we describe the use-case controller
rCOS: A Relational Calculus of Components
classes and their associated conceptual classes in a UML class diagram. The domain classes in the diagram are invisible to outsiders. The contract calculus can thus be used to specify the use-case controller classes and supports system decomposition and verification.
7.1.2. Overview
After this introduction, we start in Section 7.2 with the definition of interfaces and a discussion of interface composition, inheritance and method hiding. In Section 7.3, we introduce the notion of design from Hoare and He's UTP [19] and show how it is used to define a program as a predicate. This will be used in Section 7.4 to provide the functional specification of a component by assigning a design contract to each method of an interface. We also study operations on contracts and contract refinement in Section 7.4. Section 7.5 extends the notion of contracts, together with their composition and refinement, by adding private methods to a contract. We study the formal semantic definition of components in Section 7.6, based on the concepts of provided and required interfaces and contracts of interfaces. We define various operators on components and study their algebraic laws to show the correctness of the semantic model. Section 7.7 presents the refinement rules for component-based development, from which design patterns can be formalized. Section 7.8 briefly discusses how the component calculus can be applied effectively to system development in the client-server paradigm. Section 7.9 discusses the results and their perspective.
7.2. Interfaces
An interface of a component, in terms of both its syntax and semantics, is a description of what is needed for the component to be used in building and maintaining software without the need to know the code of the component. It is the interface that determines the external features of the component and allows the component to be used as a black box. The interfaces determine the substitutability of the components. In general, the description of an interface must contain information about all the viewpoints (for example functionality, behavior, protocols, reliability, real-time, power, bandwidth, memory consumption and communication mechanisms) that are needed for composing the component in the given architecture for the application of the system. However, this description can be incremental in the sense that newly required properties or viewpoints can be added when needed according to the application. In this chapter, we focus on the static functionality of interfaces. For recent developments in rCOS, please see [13], where we have introduced protocols, and [?], where we have studied component coordination. In this section, we introduce the syntactic description of an interface, and simply call it an interface. A semantic description of an interface is called a contract, which will be defined in Section 7.4. Furthermore, we define operations on interfaces
that allow us to compose, extend, and restrict existing interfaces to obtain new interfaces. An interface of a component can be defined as a specification of its access points [32]. The clients access the services provided by the component using these points. If a component has multiple access points, each representing a different service offered by the component, then the component is expected to have multiple interfaces. A primitive interface is a collection of features, where a feature can be either a field or an abstract method. We thus define a primitive interface as a pair of feature declaration sections: I = (FDec, MDec), where the interface I has a set FDec as its field declarations and a set MDec as its method declarations. We will follow some conventions of UML. Fields are defined in the attribute part of a class definition. For a given class I, we refer to this set as I.FDec. Methods are defined in the methods part of a class definition and we will refer to them as I.MDec. Both FDec and MDec can be empty. A member of FDec has the form x : T, where x and T represent respectively the name and type of the declared field. It is forbidden to declare two fields with the same name. A method op(in inx, out outx) in MDec declares the name op, the list of input parameters inx and the list of output parameters outx of the method. Each input or output parameter declaration is of the form u : U, giving the name and type of the parameter. The method name together with the numbers and types of its input and output parameters forms the signature of a method. In general both inx and outx can be empty. For simplicity and without losing any generality in the theory, we assume a method has one input parameter and one output parameter and thus can be represented in the form op(in : U, out : V).
Notice that the names of parameters are irrelevant; as a result, $op(in_1 : U, out_1 : V)$ and $op(in_2 : U, out_2 : V)$ are treated as the same method. UML and many object-oriented languages allow method overloading. We shall, without loss of generality, consider only unique method names in a class, and assume that none of the parameter names of a method appears in the set FDec. Otherwise we can rename the parameters. We will also use the notation $op \notin MDec$ to indicate that the name op is not declared in MDec. An interface merely names a collection of operations and describes only the signatures of these operations, but offers no implementation of any of its operations. An interface can be specified as a family of operation signatures in the following format:
Interface I {
  Field : T x
  Method : op_1(U_1 in_1, V_1 out_1); ...; op_k(U_k in_k, V_k out_k)
}
Notice that in the above format, we use the Java convention T x for representing a variable with its type, which is easier for the layout of the examples than the convention x : T in our formalism.
Fig. 7.1. Architecture of ParcelCall (figure omitted; it shows the interface CustomerService with the operations LocateParcel() and DispatchParcel()).
Example 7.1. Figure 7.1 shows the ParcelCall system in the paper of Filipe [8], which has three main components:
• a Mobile Logistic Server (MLS): an exchange point or a transport unit (container, trailer, freight wagon, etc.). It always knows its current location via the GPS satellite positioning system.
• a Goods Tracing Server (GTS): keeps track of all the parcels registered in the ParcelCall system. GTS is also the component which is integrated with the legacy systems of transport or logistic companies.
• a Goods Information Server (GIS): the component which interacts with the customers and provides the authorized customers with the current location of their parcels, keeps them informed in case of delivery delays, etc.
In Figure 7.1, a notation similar to UML for interfaces and components is used. The provided interface of the GIS component establishes communication with the customer: for instance, a customer can request the current location of a parcel via LocateParcel. The specification of this interface can be described as follows:
Interface CustomerService {
  Field : PNameSet P; // set of parcel names
          CNameSet S; // set of customer names
          CName × PName owns; // owns(s, p): customer s owns p
          (PName → Position) loc; // loc(p) returns the location of p
  Method : LocateParcel(in : PName pId, CName sId, out : Position location);
           DispatchParcel(in : PName pId, CName sId)
}
□
7.2.1. Interface inheritance and hiding operations
Inheritance is a useful means for reuse and incremental programming. In UML it is called generalization. When a component provides only part of the services that one needs, or some of the provided operations are not quite suitable for the need, we may still use this component by rewriting some of the operations or extending it with some operations and attributes. If a component realizing the inherited interface is used in the design of the extending interface, that component must not be changed, but its provided operations can be used in the design of the newly added operations or the overwritten operations. Therefore, the overall purpose of inheritance is reuse, and extension or evolution.
Definition 7.2.1. (Interface inheritance) Let $I_i$ ($i = 1, 2$) be interfaces. Assume that no field of $I_i$ is redefined in $I_j$ for $i \neq j$. The notation $I_2$ extends $I_1$ represents an interface with the following field and method sectors:
\[ FDec = FDec_1 \cup FDec_2 \]
\[ MDec = MDec_2 \cup \{ op(in : U, out : V) \mid op(in : U, out : V) \in MDec_1 \land op \notin MDec_2 \} \]
□
To enable us to provide different services to different clients of a component, we allow operations in an interface to be hidden, making them invisible when the component is composed with certain components. Hiding operations has the opposite effect to interface inheritance and is used to restrict an interface. In a graphical notation like UML, this can be achieved by the notation of generalization alone.
Definition 7.2.2. (Hiding) Let $I$ be an interface and $S$ a set of method names. The notation $I \backslash S$ denotes the interface $I$ after removal of the methods of $S$ from its method declaration sector:
\[ FDec \triangleq I.FDec, \qquad MDec \triangleq I.MDec \setminus S \]
□
Theorem 7.2.1. The hiding operator enjoys the following properties.
(1) Hiding two sets of operations separately is the same as hiding all of the operations in the two sets together:
\[ (I \backslash S_1) \backslash S_2 = I \backslash (S_1 \cup S_2) \]
Thus, the order in which two sets of operations are hidden is also inessential.
(2) Applying the hiding operator to an interface with an empty set of operations has no effect:
\[ I \backslash \emptyset = I \]
(3) Hiding distributes over the operands of interface inheritance:
\[ (I \text{ extends } J) \backslash S = (I \backslash S) \text{ extends } (J \backslash S) \]
□
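The two interface operators and the laws of Theorem 7.2.1 can be tried out on small finite interfaces. The following is a minimal sketch, not part of rCOS itself: the class `Interface` and the functions `make_interface`, `extends` and `hide` are illustrative names, a method is represented only by its name and its input and output types (parameter names are irrelevant), and "no field is redefined" is interpreted here as "a shared field name must keep the same type".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """A primitive interface I = (FDec, MDec) over plain strings."""
    fdec: frozenset  # set of (field name, type) pairs
    mdec: frozenset  # set of (method name, input type, output type) triples


def make_interface(fields, methods):
    """Build an interface from {field: type} and {op: (in_type, out_type)}."""
    return Interface(
        frozenset(fields.items()),
        frozenset((op, u, v) for op, (u, v) in methods.items()),
    )


def extends(i2, i1):
    """I2 extends I1 (Definition 7.2.1): I2's methods override those of I1."""
    types = {}
    for name, typ in i1.fdec | i2.fdec:
        # Assumption of this sketch: a shared field must keep its type.
        if types.setdefault(name, typ) != typ:
            raise ValueError(f"field {name} is redefined")
    names2 = {op for op, _, _ in i2.mdec}
    inherited = frozenset(m for m in i1.mdec if m[0] not in names2)
    return Interface(i1.fdec | i2.fdec, i2.mdec | inherited)


def hide(i, names):
    """I \\ S (Definition 7.2.2): remove the methods named in S."""
    return Interface(i.fdec, frozenset(m for m in i.mdec if m[0] not in names))
```

Because interfaces are frozen dataclasses over frozensets, the three laws of Theorem 7.2.1 can be checked with plain `==` on concrete examples; such checks illustrate the laws but do not replace their proofs.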
7.3. Specification of Methods
When a client uses a service of a component, it is not enough to ensure only that the operation that the client invokes is syntactically correct. The functionality of the operation has to be met too. This section gives a brief introduction to the notion of design, in which the functionality of an interface method is specified and refined. This notion and its theory are presented in Hoare and He's Unifying Theories of Programming [19] (UTP). In UTP, a program or a program command is identified as a design, which is represented by a pair $D = (\alpha, P)$, where
• $\alpha = in\alpha \cup out\alpha$ consists of the set of variables (with their types) of the program, called the alphabet of the program or design.
• $P$ is a predicate of the form
\[ p \vdash R \;\triangleq\; (ok \land p) \Rightarrow (ok' \land R) \]
- We call $\alpha$ the alphabet of the design and $P$ the contract of the design; $\alpha$ declares the variables (including logical ones) and consists of the two disjoint sets of input and output variables. The values of $in\alpha$ represent the initial state of the program at the moment when it is executed, the values of $out\alpha$ describe the state when the execution terminates, and the contract specifies the behavior of the program in terms of the change in state that its execution may cause.
- Predicate $p$, called the precondition of the program, characterizes the initial states in which the activation of the program will lead its execution to termination.
- Predicate $R$, called the postcondition of the program, relates the initial states of the program to its final states.
- We describe the termination behavior of a program by the Boolean variables $ok$ and $ok'$, where the former is true if the program is properly activated and the latter becomes true if the execution of the program terminates successfully.
For a method $op(in : U, out : V)$ of an interface $I = (FDec, MDec)$, a contract of op is a design $(\alpha, P)$ where $\alpha = in\alpha \cup out\alpha$ and
\[ in\alpha = \{in : U\} \cup FDec, \qquad out\alpha = \{out' : V\} \cup FDec' \]
where $FDec' = \{x' : T \mid x : T \in FDec\}$.
Definition 7.3.1. (Design refinement) $D_1 = (\alpha, P_1)$ is refined by $D_2 = (\alpha, P_2)$, denoted by $D_1 \sqsubseteq D_2$, if all the observations one can make over $D_2$ are permitted by $D_1$, i.e.,
\[ \forall ok, ok', in, v, v', out' \bullet (P_2 \Rightarrow P_1) \]
where $v$ are the variables representing the fields and $v'$ represents the values of $v$ in the post-state of an execution of op. □
We define $D_1 = D_2$ when $D_1 \sqsubseteq D_2$ and $D_2 \sqsubseteq D_1$. In what follows, we present a number of lemmas that will be used as refinement laws and show that the set of designs is closed under standard operators on programming commands. They will be needed to prove equivalence between contracts of interfaces in the next section. The proofs of the lemmas can be found in [19].
Lemma 7.3.1. $D_1$ is refined by $D_2$ if and only if $D_2$ requires no stronger precondition and ensures no weaker postcondition than $D_1$:
(1) $\forall in, v \bullet (p_1 \Rightarrow p_2)$, and
(2) $\forall in, v, v', out' \bullet ((p_1 \land R_2) \Rightarrow R_1)$
□
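Lemma 7.3.1 can be checked exhaustively on a tiny state space. The sketch below is only an illustration under stated assumptions: the state is a single variable ranging over `STATES = (0, 1)`, a design is encoded as a pair `(pre, post)` where `pre` is the set of states satisfying p and `post` is the set of pairs `(s, s2)` satisfying R, and all function names are hypothetical.

```python
from itertools import combinations, product

STATES = (0, 1)          # assumed finite state space of one variable
BOOLS = (False, True)    # values of ok and ok'


def design_predicate(design, ok, s, ok2, s2):
    """The predicate p |- R, i.e. (ok and p) implies (ok' and R)."""
    pre, post = design
    return not (ok and s in pre) or (ok2 and (s, s2) in post)


def refined_by_definition(d1, d2):
    """D1 is refined by D2 per Definition 7.3.1: P2 implies P1 everywhere."""
    return all(
        not design_predicate(d2, ok, s, ok2, s2)
        or design_predicate(d1, ok, s, ok2, s2)
        for ok, s, ok2, s2 in product(BOOLS, STATES, BOOLS, STATES)
    )


def refined_by_lemma(d1, d2):
    """The two conditions of Lemma 7.3.1."""
    (p1, r1), (p2, r2) = d1, d2
    weaker_pre = all(s in p2 for s in p1)                      # p1 => p2
    stronger_post = all(pair in r1 for pair in r2 if pair[0] in p1)  # p1 and R2 => R1
    return weaker_pre and stronger_post


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Every design over the two-element state space: 4 preconditions x 16 relations.
ALL_DESIGNS = [(p, r) for p in subsets(STATES)
               for r in subsets(product(STATES, STATES))]
```

Comparing `refined_by_definition` and `refined_by_lemma` over all pairs in `ALL_DESIGNS` confirms the lemma for this small model only; the general proof is the one cited from [19].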
In the remainder of this section, we show how to use the notion of design to define the semantics of program commands. We require that the specification language that is used to describe the implementation of a component in Section 7.6 includes UTP designs and the commands that we will discuss in this section. A full syntax and semantics of such a language is defined in rCOS [15, 16, 26]. The command skip denotes a program that terminates successfully, leaving all the variables unchanged:
\[ skip \triangleq true \vdash (x' = x \land y' = y \land \ldots \land z' = z) \]
where $\{x, y, \ldots, z\}$ is the input alphabet of the command and their primed versions form the output alphabet. Let $x$ be a variable and $e$ an expression. If both $x$ and $e$ are well-defined, denoted by $WF(x)$ and $WF(e)$, and the type $type(e)$ of $e$ matches that of $x$, then the execution of $x := e$ will assign the value of $e$ to $x$ and leave other program variables intact:
\[ x := e \triangleq (type(e) = type(x) \land WF(x) \land WF(e)) \vdash (x' = e \land y' = y \land \ldots \land z' = z) \]
where $\{x, y, \ldots, z\}$ is the input alphabet of the command and their primed versions form the output alphabet. Notice that a well-formedness condition is not always static, especially when considering an object-oriented language [16]. We use chaos to denote the worst program, whose behaviour is unpredictable:
\[ chaos \triangleq false \vdash true \]
Assume that the output variables of $D_1$ match the input variables of $D_2$, i.e., $out\alpha_1 = in\alpha_2$; then the notation $D_1; D_2$ stands for the sequential composition of $D_1$ and $D_2$, and its behaviour is described by the design $(\alpha, P)$ with
\[ \alpha = in\alpha_1 \cup out\alpha_2, \qquad P = \exists m \bullet P_1[m/v'] \land P_2[m/v] \]
where $P_i[y/x]$ represents the predicate resulting from replacing all free occurrences of $x$ in $P_i$ by $y$.
Lemma 7.3.2. The sequential composition of designs is also a design:
\[ ((p_1 \vdash R_1); (p_2 \vdash R_2)) = ((p_1 \land \neg(R_1; \neg p_2)) \vdash (R_1; R_2)) \]
□
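As a small worked instance of Lemma 7.3.2, assume (purely for illustration) a single integer program variable $x$ and the two designs $D_1 = (true \vdash x' = x + 1)$ and $D_2 = (x > 0 \vdash x' = x - 1)$. Unfolding the relational compositions gives

```latex
\begin{align*}
R_1; \neg p_2 &= \exists m \bullet (m = x + 1) \land \neg(m > 0) \;=\; (x + 1 \le 0) \\
p_1 \land \neg(R_1; \neg p_2) &= true \land \neg(x + 1 \le 0) \;=\; (x \ge 0) \\
R_1; R_2 &= \exists m \bullet (m = x + 1) \land (x' = m - 1) \;=\; (x' = x) \\
D_1; D_2 &= (x \ge 0) \vdash (x' = x)
\end{align*}
```

The composite precondition $x \ge 0$ records that the first design must leave $x$ in a state where the precondition of the second design holds.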
Let $D_i = (\alpha, P_i)$ ($i = 1, 2$) be designs with the same alphabet. The conditional choice if $b$ then $D_1$ else $D_2$ is also a design $(\alpha, P)$, where
\[ P \triangleq (b \land P_1) \lor (\neg b \land P_2) \]
Lemma 7.3.3. The conditional choice built from two designs is a design:
\[ \text{if } b \text{ then } (p_1 \vdash R_1) \text{ else } (p_2 \vdash R_2) = (p_1 \lhd b \rhd p_2) \vdash (R_1 \lhd b \rhd R_2) \]
where $P_1 \lhd b \rhd P_2 \triangleq (b \land P_1) \lor (\neg b \land P_2)$.
□
The declaration command var $x$ introduces a new variable $x$:
\[ \text{var } x \triangleq \exists x \bullet skip \]
The undeclaration end $x$ ends the scope of variable $x$:
\[ \text{end } x \triangleq \exists x' \bullet skip \]
For simplicity we assume that no variable is allowed to be redeclared in its scope. A method call $m(e, v)$ is also a design. It introduces the formal parameters as local variables and passes the actual input parameter to the formal input parameter before executing the body of the method, and then passes the final value of the formal output parameter to the actual output parameter. Finally, it undeclares the formal parameters at the end of the scope of the method:
\[ m(e, v) \triangleq \text{var } in, out;\ (in := e;\ D;\ v := out);\ \text{end } in, out \]
where the design $D$ stands for the specification of method $m(in : U, out : V)$. Designs with the same alphabet can be composed using demonic and angelic choice operators:
\[ D_1 \sqcap D_2 \triangleq D_1 \lor D_2, \qquad D_1 \sqcup D_2 \triangleq D_1 \land D_2 \]
Lemma 7.3.4. Both demonic and angelic compositions of designs are designs too:
(1) $(p_1 \vdash R_1) \sqcap (p_2 \vdash R_2) = (p_1 \land p_2) \vdash (R_1 \lor R_2)$
(2) $(p_1 \vdash R_1) \sqcup (p_2 \vdash R_2) = (p_1 \lor p_2) \vdash ((p_1 \Rightarrow R_1) \land (p_2 \Rightarrow R_2))$
□
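Both equations of Lemma 7.3.4 can be confirmed by brute force on a small model. As before, this is a sketch under assumptions: one variable over `STATES = (0, 1)`, designs encoded as `(pre, post)` pairs of sets, and hypothetical helper names. `demonic` and `angelic` build the right-hand sides of the lemma, and `lemma_holds` compares them with the disjunction and conjunction of the design predicates at every observation.

```python
from itertools import combinations, product

STATES = (0, 1)
PAIRS = tuple(product(STATES, STATES))
OBSERVATIONS = tuple(product((False, True), STATES, (False, True), STATES))


def holds(design, ok, s, ok2, s2):
    """The predicate p |- R at one observation (ok, s, ok', s')."""
    pre, post = design
    return not (ok and s in pre) or (ok2 and (s, s2) in post)


def demonic(d1, d2):
    """Right-hand side of Lemma 7.3.4 (1): (p1 and p2) |- (R1 or R2)."""
    (p1, r1), (p2, r2) = d1, d2
    return (p1 & p2, r1 | r2)


def angelic(d1, d2):
    """Right-hand side of Lemma 7.3.4 (2)."""
    (p1, r1), (p2, r2) = d1, d2
    post = frozenset(
        (s, s2) for (s, s2) in PAIRS
        if (s not in p1 or (s, s2) in r1) and (s not in p2 or (s, s2) in r2)
    )
    return (p1 | p2, post)


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


DESIGNS = [(p, r) for p in powerset(STATES) for r in powerset(PAIRS)]


def lemma_holds(d1, d2):
    """Check both equations of Lemma 7.3.4 at every observation."""
    return all(
        holds(demonic(d1, d2), *o) == (holds(d1, *o) or holds(d2, *o))
        and holds(angelic(d1, d2), *o) == (holds(d1, *o) and holds(d2, *o))
        for o in OBSERVATIONS
    )
```

Running `lemma_holds` over all pairs of `DESIGNS` covers 64 x 64 combinations; this is evidence for the finite model, not a replacement for the proof in [19].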
Theorem 7.3.1. The set of designs forms a complete lattice with respect to the refinement ordering; $(\alpha, false \vdash true)$ is the bottom and $(\alpha, true \vdash false)$ the top of the lattice. □
Let $D$ be a design, and $b$ a Boolean expression which only refers to the input variables of $D$. The notation while $b$ do $D$ represents an iteration construct, and is defined as the worst (maximal) fixed point of the recursive equation
\[ X = \text{if } b \text{ then } (D; X) \text{ else } skip \]
For a detailed treatment of programming languages, readers are referred to Hoare and He's book on UTP [19] and its application to object-oriented programming languages in our related paper [14].
7.4. Contract
This section gives the formal definition of a contract of an interface, and of the composition and inheritance of contracts. We also study the refinement relation between contracts.
Definition 7.4.1. (Contract) A contract is a pair $Ctr = (I, MSpec)$, where
(1) $I$ is an interface,
(2) $MSpec$ maps each method $op(in : U, out : V)$ of $I$ to a specification of op with the alphabet $\alpha = \{in, out'\} \cup I.FDec \cup I.FDec'$, where $I.FDec' = \{x' \mid x \in I.FDec\}$.
□
For a contract $Ctr = (I, MSpec)$, we will use Ctr.I, Ctr.FDec, Ctr.MDec and Ctr.MSpec to denote respectively $I$, I.FDec, I.MDec and MSpec. If we combine an interface with its contract, it can be represented in the following format.
Contract Ctr {
  Interface I {
    Field : T x
    Method : op_1(U_1 in_1, V_1 out_1) {MSpec(op_1)}; ...; op_k(U_k in_k, V_k out_k) {MSpec(op_k)}
  }
}
Example 7.2. A contract for the interface CustomerService of ParcelCall assigns a specification to each method and can be written as follows, where MSpec(op) is given as the specification following the name op of each operation. We present a contract in a style such that the name of the interface is followed by the field declarations, then the initial condition, and finally the operations with their specifications:
Contract CS {
  Interface CustomerService
  Field : PNameSet P; // set of parcel names
          CNameSet S; // set of customer names
          CName × PName owns; // owns(s, p): s owns p
          PName → Position loc; // loc(p): the location of p
  Method : LocateParcel(in : PName pId, CName sId, out : Position location) {
             pId ∈ P ∧ sId ∈ S ∧ owns(sId, pId) ⊢ location' = loc(pId) };
           DispatchParcel(in : PName pId, CName sId) {
             pId ∉ P ⊢ (P' = P ∪ {pId}) ∧ (S' = S ∪ {sId}) ∧ (owns' = owns ∪ {(sId, pId)}) ∧ (loc' = loc ∪ {pId ↦ (0, 0)}) }
}
□
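The contract CS of Example 7.2 can be read as a pair of checkable predicates per method. The following is a hedged sketch of that reading in Python, not the authors' tooling: `CSState`, the `*_pre` and `*_post` functions and `dispatch_parcel` are illustrative names, and `dispatch_parcel` is just one implementation that happens to satisfy the contract.

```python
from dataclasses import dataclass, field


@dataclass
class CSState:
    """The fields of CustomerService: P, S, owns and loc."""
    P: set = field(default_factory=set)       # parcel names
    S: set = field(default_factory=set)       # customer names
    owns: set = field(default_factory=set)    # pairs (sId, pId)
    loc: dict = field(default_factory=dict)   # pId -> position


def locate_parcel_pre(st, p_id, s_id):
    """pId in P and sId in S and owns(sId, pId)."""
    return p_id in st.P and s_id in st.S and (s_id, p_id) in st.owns


def locate_parcel_post(st, p_id, location):
    """location' = loc(pId)."""
    return location == st.loc[p_id]


def dispatch_parcel_pre(st, p_id, s_id):
    """pId not in P."""
    return p_id not in st.P


def dispatch_parcel_post(before, after, p_id, s_id):
    """The four conjuncts of the postcondition of DispatchParcel."""
    return (after.P == before.P | {p_id}
            and after.S == before.S | {s_id}
            and after.owns == before.owns | {(s_id, p_id)}
            and after.loc == {**before.loc, p_id: (0, 0)})


def dispatch_parcel(st, p_id, s_id):
    """One possible implementation; it builds a new state and leaves st intact."""
    if not dispatch_parcel_pre(st, p_id, s_id):
        # Outside the precondition the contract promises nothing;
        # this sketch chooses to fail loudly.
        raise ValueError("precondition of DispatchParcel violated")
    return CSState(st.P | {p_id}, st.S | {s_id},
                   st.owns | {(s_id, p_id)}, {**st.loc, p_id: (0, 0)})
```

Separating the predicates from the implementation mirrors the separation between a contract and the component that realizes it: any other `dispatch_parcel` is acceptable as long as `dispatch_parcel_post` holds whenever `dispatch_parcel_pre` did.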
7.4.1. Composition and refinement of contracts
Two contracts of interfaces can be composed to extend both of them only when their interfaces are composable and the specifications of the common methods are consistent. This composition will be used to calculate the provided and required services when components are composed.
Definition 7.4.2. (Composable contracts) Contracts $Ctr_i = (I_i, MSpec_i)$, $i = 1, 2$, are composable if
(1) $I_1$ and $I_2$ are composable, and
(2) for any method op occurring in both $I_1$ and $I_2$,
\[ MSpec_1(op(x : U, y : V)) = MSpec_2(op(u : U, v : V))[x, x', y, y' / u, u', v, v'] \]
In this case their composition $Ctr_1 \parallel Ctr_2$ is defined by
\[ I \triangleq I_1 \text{ extends } I_2, \qquad MSpec \triangleq MSpec_1 \text{ extends } MSpec_2 \]
where $MSpec_1$ extends $MSpec_2$ denotes the overriding of the specification $MSpec_1(op)$ with $MSpec_2(op)$ if op occurs in both $I_1$ and $I_2$. □
Notice that, for the purpose of compositional reasoning, condition (2) makes the composition a conservative extension, so that it serves as a limited form of UML generalization.
Definition 7.4.3. (Nondeterministic choice) Let $Ctr_1$ and $Ctr_2$ be contracts on the same interface $I$. The notation $Ctr_1 \sqcap Ctr_2$ represents the contract defined by
\[ MSpec(op) \triangleq MSpec_1(op) \sqcap MSpec_2(op) \]
□
The properties of the composition and nondeterministic choice operators are summarised in the two theorems below.
Theorem 7.4.1. The contract composition operator is symmetric and associative:
(1) $Ctr_1 \parallel Ctr_2 = Ctr_2 \parallel Ctr_1$
(2) $(Ctr_1 \parallel Ctr_2) \parallel Ctr_3 = Ctr_1 \parallel (Ctr_2 \parallel Ctr_3)$
□
Notice that (1) does not hold for a composition which allows general overriding.
Theorem 7.4.2. The nondeterministic choice enjoys the following properties:
(1) Idempotent: $Ctr \sqcap Ctr = Ctr$
(2) Commutative: $Ctr_1 \sqcap Ctr_2 = Ctr_2 \sqcap Ctr_1$
(3) Associative: $Ctr_1 \sqcap (Ctr_2 \sqcap Ctr_3) = (Ctr_1 \sqcap Ctr_2) \sqcap Ctr_3$
(4) Distributive over composition: $(Ctr_1 \sqcap Ctr_2) \parallel Ctr_3 = (Ctr_1 \parallel Ctr_3) \sqcap (Ctr_2 \parallel Ctr_3)$, provided that no operation of $Ctr_1$ and $Ctr_2$ is overwritten by $Ctr_3$.
□
Definition 7.4.4. (Contract refinement) Contract $Ctr_1 = (I_1, MSpec_1)$ is refined by contract $Ctr_2 = (I_2, MSpec_2)$ if
(1) the set of methods $I_1.MDec$ of $I_1$ is a subset of the methods $I_2.MDec$ of $I_2$, and
(2) there exists a bijective mapping $\rho$ from the fields $I_1.FDec$ of $I_1$ to the fields $I_2.FDec$ of $I_2$ satisfying
\[ (MSpec_1(op); \rho) \sqsubseteq (\rho; MSpec_2(op)) \]
for all methods op declared in $I_1.MDec$.
We use $Ctr_1 \sqsubseteq Ctr_2$ to denote the refinement relation between contracts.
□
Clearly the binary relation $\sqsubseteq$ is reflexive and transitive. Furthermore, refinement has more interesting properties, which we present as theorems.
Theorem 7.4.3. Refinement reduces nondeterministic choices but increases deterministic choices of services, i.e. each of two contracts refines their nondeterministic choice, and the composition of two contracts refines each of the two contracts: for $i = 1, 2$,
(1) $(Ctr_1 \sqcap Ctr_2) \sqsubseteq Ctr_i$
(2) $Ctr_i \sqsubseteq (Ctr_1 \parallel Ctr_2)$
□
Theorem 7.4.4. $\parallel$ and $\sqcap$ are monotonic with respect to the contract refinement ordering. If $Ctr_1 \sqsubseteq Ctr_2$, then for any contract $Ctr$
(1) $(Ctr_1 \parallel Ctr) \sqsubseteq (Ctr_2 \parallel Ctr)$
(2) $(Ctr_1 \sqcap Ctr) \sqsubseteq (Ctr_2 \sqcap Ctr)$
□
The following are special cases of contract refinement; the first relates method refinement to contract refinement, and the second explores the relationship between interface hiding and contract refinement.
Theorem 7.4.5. Method refinement implies contract refinement, i.e. if $MSpec_1(op) \sqsubseteq MSpec_2(op)$ for all $op \in I.MDec$, then
\[ (I, MSpec_1) \sqsubseteq (I, MSpec_2) \]
□
Theorem 7.4.6. Interface hiding induces contract abstraction:
\[ (I \backslash M, MSpec) \sqsubseteq (I, MSpec) \]
□
7.4.2. Contract inheritance
When an interface is extended by inheritance, its contract can be extended too.
Definition 7.4.5. (Contract inheritance) Let $Ctr_i = (I_i, MSpec_i)$, $i = 1, 2$, be contracts. Assume that no field of $I_1$ is redefined in $Ctr_2$. Then the notation $Ctr_2$ extends $Ctr_1$ denotes the contract defined by
\[ I \triangleq I_1 \text{ extends } I_2, \qquad MSpec \triangleq MSpec_2 \text{ extends } MSpec_1 \]
□
The above definition allows $Ctr_2$ to overwrite a method declared in $Ctr_1$. The inheritance operator is also monotonic.
Theorem 7.4.7. If $Ctr_1 \sqsubseteq Ctr_2$, then for any contract $Ctr$
(1) $(Ctr \text{ extends } Ctr_1) \sqsubseteq (Ctr \text{ extends } Ctr_2)$
(2) $(Ctr_1 \text{ extends } Ctr) \sqsubseteq (Ctr_2 \text{ extends } Ctr)$
□
The following theorem states that contract composition can be treated as a special case of contract inheritance.
Theorem 7.4.8. If $Ctr_1$ and $Ctr_2$ do not share methods, then
\[ Ctr_1 \text{ extends } Ctr_2 = Ctr_2 \text{ extends } Ctr_1 = Ctr_1 \parallel Ctr_2 \]
□
7.5. Contracts with Private Methods
All the methods defined by an interface are public, i.e., they are directly accessible by the environment of the interface. We can restrict a client's access to some methods of the interface by the hiding operator. However, when a component is to be implemented, a method can be used in the code of another method. We would like to be able to hide the former from the interface, while at the same time the implementation of the latter method should still work without any modification. To handle this problem, we introduce in this section the notion of private (or internal) methods, which are not available to the public, but can be used by the component itself. For this we need to generalize the notion of contract.
Definition 7.5.1. (General contract) A general contract $GCtr$ is a contract extended with a set $PriMDec$ of private methods and their specifications $PriMSpec$:
\[ (Ctr, PriMDec, PriMSpec) \]
where $Ctr$ is a contract $(I, MSpec)$ and
(1) $I.MDec$ and $PriMDec$ are disjoint, and
(2) an internal operation can only refer to fields of $I.FDec$ and its input and output parameters.
□
Obviously the previously introduced contract concept is just the special case with $PriMDec = \emptyset$. We can also compose general contracts when the private methods do not interfere.
Definition 7.5.2. (Composing general contracts) Let $GCtr_i = (Ctr_i, PriMDec_i, PriMSpec_i)$ be general contracts over interfaces $I_i$, $i = 1, 2$. $GCtr_1$ and $GCtr_2$ are composable if
(1) $Ctr_1$ and $Ctr_2$ are composable, and
(2) $(I_i.MDec \cup PriMDec_i) \cap PriMDec_j = \emptyset$ for $i \neq j$.
When $GCtr_1$ and $GCtr_2$ are composable, their composition $GCtr_1 \parallel GCtr_2$ is defined by
\[ Ctr \triangleq Ctr_1 \parallel Ctr_2, \qquad PriMDec \triangleq PriMDec_1 \cup PriMDec_2, \qquad PriMSpec \triangleq PriMSpec_1 \cup PriMSpec_2 \]
□
We also extend the refinement relation to general contracts by defining
\[ GCtr_1 \sqsubseteq GCtr_2 \;\triangleq\; Ctr_1 \sqsubseteq Ctr_2 \]
Clearly the general contract composition operator obeys the same algebraic laws as the contract composition operator: for $i = 1, 2$,
\[ GCtr_i \sqsubseteq (GCtr_1 \parallel GCtr_2) \]
Now hiding a set of operations in the interface of a general contract makes these operations private to the contract.
Definition 7.5.3. (Hiding) Let $GCtr = (Ctr, PriMDec, PriMSpec)$ be a general contract over an interface $I$, and $M$ a set of its public methods in $I.MDec$. The notation $GCtr \backslash M$ represents the general contract $(Ctr_1, PriMDec_1, PriMSpec_1)$ over an interface $I_1$ whose components are defined by
\begin{align*}
I_1 &\triangleq I \backslash M \\
Ctr_1.MSpec &\triangleq (I.MDec \setminus M) \lhd Ctr.MSpec \\
PriMDec_1 &\triangleq PriMDec \cup (I.MDec \cap M) \\
PriMSpec_1 &\triangleq PriMSpec \cup ((I.MDec \cap M) \lhd Ctr.MSpec)
\end{align*}
where $X \lhd f$ represents the mapping $f$ restricted to the domain $X$.
□
The hiding operator on general contracts has the following algebraic laws, which show that its definition is appropriate.
Theorem 7.5.1.
(1) Hiding restricts the clients' choices among services: $(GCtr \backslash M) \sqsubseteq GCtr$
(2) Hiding an empty set of operations has no effect: $GCtr \backslash \emptyset = GCtr$
(3) Hiding two sets of operations separately is the same as hiding all operations in the two sets together: $(GCtr \backslash M_1) \backslash M_2 = GCtr \backslash (M_1 \cup M_2)$
(4) Hiding distributes over composition: $(GCtr_1 \parallel GCtr_2) \backslash M = (GCtr_1 \backslash M) \parallel (GCtr_2 \backslash M)$
□
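Hiding on general contracts (Definition 7.5.3) moves public methods, together with their specifications, into the private part. The sketch below illustrates this with dictionaries; it is an assumption-laden model in which specifications are opaque strings, `GeneralContract` and `hide` are hypothetical names, and the domain of `mspec` plays the role of I.MDec while the domain of `primspec` plays the role of PriMDec.

```python
from dataclasses import dataclass


@dataclass
class GeneralContract:
    fields: dict    # I.FDec: field name -> type
    mspec: dict     # public methods: name -> specification
    primspec: dict  # private methods: name -> specification


def hide(gctr, m):
    """GCtr \\ M: methods named in M become private, keeping their specs."""
    public = {op: sp for op, sp in gctr.mspec.items() if op not in m}
    moved = {op: sp for op, sp in gctr.mspec.items() if op in m}
    return GeneralContract(dict(gctr.fields), public,
                           {**gctr.primspec, **moved})


# A toy general contract; the specification strings are placeholders.
example = GeneralContract(
    fields={"P": "PNameSet", "loc": "PName -> Position"},
    mspec={"LocateParcel": "spec of LocateParcel",
           "DispatchParcel": "spec of DispatchParcel"},
    primspec={"Log": "spec of Log"},
)
```

With this encoding, laws (2) and (3) of Theorem 7.5.1 reduce to equalities between dataclass values. Law (1) concerns refinement and is not captured by this purely structural model.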
7.6. Component
A component has a set of provided interfaces, optionally a set of required interfaces, and executable code. The code of the component can be coupled to the code of other components via their provided interfaces. It is required that these coupled provided interfaces be among the required ones of the component. The external behavior of a component is specified by the contracts of its interfaces. A design of a component has to reorganize the data to realize the conceptual states defined by the fields. The code implements the design of the provided interface operations with program commands.
7.6.1. Semantics of components
We are now ready to formalize components and discuss their composition, verification and refinement.
Definition 7.6.1. (Component) A component $C$ is a tuple
\[ (I, MCode, PriMDec, PriMCode, InMDec) \]
where
(1) $I$ is an interface.
(2) $PriMDec$ is a set of method declarations which are private to the component.
(3) The tuple $(I, MCode, PriMDec, PriMCode)$ has the same structure as a general contract, except that the functions $MCode$ and $PriMCode$ map the methods in the sets $I.MDec$ and $PriMDec$ respectively to a program, written in the notation introduced in Section 7.3. $MCode(op)$ and $PriMCode(op)$ can only mention variables defined in $I.FDec$.
(4) $InMDec$ denotes the set of input methods which are called by public or internal methods, but not defined in $MDec \cup PriMDec$.
□
We use C.I, C.MCode, C.PriMDec, C.PriMCode and C.InMDec to denote the corresponding parts of $C$. We will also use, for example, C.FDec to denote (C.I).FDec. As shown in Section 7.3, each program command, and thus the code of each method in a component, is semantically a design specified in terms of a precondition and a postcondition. Therefore, the semantics of the code of a method can be treated as a specification, although it is at a lower level of abstraction than the specification of the method given in a contract. This implies that a component can be seen as a parameterized general contract in which the parameters are the specifications of the input methods in InMDec. Operations in InMDec can be seen as holes in the component, to be filled by specifications or implementations given in other components that are plugged in. Therefore, the provided services of a component depend on its required services plugged in from other components. For any specification InMSpec that maps each method in InMDec to a specification, let InFDec be the global variables with their types that appear in the
specifications of methods of InMDec given by InMSpec but not as parameters of these methods. T h e n InputlnterFace = (InFDec, InMDec) forms an interface and we call it an input or required interface for the component. T h e contract InCtr = (InputlnterFace, InMSpec) is a required or input contract for the required interface, t h a t specifies the required services. D e f i n i t i o n 7 . 6 . 2 . (Semantics of components) Let C = (I, MCode, PmMDec, PriMCode,
InMDec)
be a component. For any input contract InCtr of C. T h e notation C(InCtr), sometimes also denoted by C(InMSpec) if InCtr.MSpec is InMSpec, is the general contract t h a t is defined by C(InCtr).FDec C(InCtr).MDec C (InCtr). MSpec C{InCtr).PriMDec C'(InCtr).PriMSpec
± ^ = = =
C.FDec U InCtr.FDec C.MDec C.MDec <3 S p e c C.InCtrU C.PriMDec InCtr U C.PriMDec < S p e c
T h e mapping S p e c associates every (public and internal) method m of C to a design, and is defined by the recursive equations Spec(m)
j M{MCode(m))
m € C.MDec
\ M{PriMCode(m))
m £
C.PriMDec
where M. replaces every call of the method op of InMDec in the code of MCode(m(in : U,out : V) and PriMCode(m(in : U,out : V)) by its corresponding specification. / v a r in, out; \ in := inexp; InMSpec(m); M(m(inexp, outvar)) outvar := out; \ end in, out J if m(in : U, out : V) E InMDec / v a r in, out; \ in := inexp; M(m(inexp, outvar)) Spec(m); outvar := out; \ end in, out J if m(in : U, out : V) E C.MDec U C.PriMDec = c, c is of one of the forms in M(c) {skip, chaos, v := e, vara;, end a;} = D, D is a design M(D) ±M(pi);M( P2) M(pi;p2) ±M(pi)nM(P2) M(pi n p 2 ) M(±i bthenpi elsep2^ = if bthen.M(pi) e l s e M(p2) = while bdo M(p) M (while bdop)
Z. Liu, J. He and X. Li
If the program MCode(m) is simply a design, clearly one has Spec(m) = MCode(m). □
With the above definition, we can apply the operators on general contracts and the notion of refinement to a component C(InCtr).
Theorem 7.6.1. (Monotonicity) Let C = (I, MCode, PriMDec, PriMCode, InMDec) be a component and $InCtr_i$, i = 1, 2, be two input contracts. If $InCtr_1 \sqsupseteq InCtr_2$, then $C(InCtr_1) \sqsupseteq C(InCtr_2)$. □
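The substitution performed by the mapping M in Definition 7.6.2 can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the class names (Design, Seq, Cond, Call, Block), the function names plug_in and semantics, and the use of strings for predicates and expressions are hypothetical and not part of rCOS. The sketch also assumes that the component's own methods do not call each other recursively, so it computes no fixed point.

```python
from dataclasses import dataclass


# A hypothetical, much-reduced program syntax; predicates are plain strings.
@dataclass(frozen=True)
class Design:          # a specification of the form  pre |- post
    pre: str
    post: str


@dataclass(frozen=True)
class Seq:             # p1; p2
    first: object
    second: object


@dataclass(frozen=True)
class Cond:            # if b then p1 else p2
    guard: str
    then: object
    orelse: object


@dataclass(frozen=True)
class Call:            # m(inexp, outvar)
    method: str
    inexp: str
    outvar: str


@dataclass(frozen=True)
class Block:           # var in, out; in := inexp; body; outvar := out; end in, out
    inexp: str
    body: object
    outvar: str


def plug_in(program, code, in_spec):
    """Replace every method call in `program` by a block around its specification.

    `code` maps the component's own (public and private) methods to programs;
    `in_spec` maps the input methods to designs (the input contract).
    Assumption: no recursion among the component's own methods.
    """
    if isinstance(program, Design):
        return program
    if isinstance(program, Seq):
        return Seq(plug_in(program.first, code, in_spec),
                   plug_in(program.second, code, in_spec))
    if isinstance(program, Cond):
        return Cond(program.guard,
                    plug_in(program.then, code, in_spec),
                    plug_in(program.orelse, code, in_spec))
    if isinstance(program, Call):
        if program.method in in_spec:          # a hole: use the plugged-in spec
            body = in_spec[program.method]
        else:                                  # an own method: use its semantics
            body = plug_in(code[program.method], code, in_spec)
        return Block(program.inexp, body, program.outvar)
    raise TypeError(f"unknown program form: {program!r}")


def semantics(code, in_spec):
    """The mapping Spec: a design-level body for every method of the component."""
    return {m: plug_in(body, code, in_spec) for m, body in code.items()}


# Usage: a method that calls the required method Where.
chaos = Design("false", "true")
code = {"LocateParcel": Cond("pId in P", Call("Where", "pId", "location"), chaos)}
in_spec = {"Where": Design("pId in P", "location' = loc(pId)")}
spec = semantics(code, in_spec)
```

Plugging in a different input contract only changes the designs stored in in_spec, which is the sense in which the component acts as a contract parameterized by its required services.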
Therefore, if component $C_1$ depends on or requires services from component $C_2$, we can then replace $C_2$ with any component which provides a contract that refines the services that $C_2$ provides.
Definition 7.6.3. (Component refinement) Let $C_1$ and $C_2$ be components. We say $C_1$ is a refinement of component $C_2$, denoted by $C_1 \sqsupseteq C_2$, if for any input contract InCtr, $C_1(InCtr) \sqsupseteq C_2(InCtr)$. □
The following definition gives the strongest correctness of a component for a given general contract as its specification, in the sense that it should refine the general contract even with the weakest input contract.
Definition 7.6.4. (Component as an implementation of general contract) A component C implements a general contract GCtr, denoted C imp GCtr, if $C(ChaosCtr) \sqsupseteq GCtr$, where
\[
ChaosCtr.MSpec(op) \mathrel{\widehat{=}}
\begin{cases}
GCtr.PriMSpec(op) & op \in C.InMDec \cap GCtr.PriMDec\\
\mathbf{chaos} & op \in C.InMDec \setminus GCtr.PriMDec
\end{cases}
\]
□
Theorem 7.6.2. If $C_1 \sqsupseteq C_2$, then for all general contracts GCtr
\[
(C_2\ \mathbf{imp}\ GCtr) \text{ implies } (C_1\ \mathbf{imp}\ GCtr)
\]
□
This also implies that we can use a component $C_1$ wherever $C_2$ can be used if $C_1$ refines $C_2$.
Example 7.3. Now we define a component GIS in the ParcelCall system to provide the services to customers. We will use some Java conventions in writing the
specification, such as assignment to a variable with a method call that has an out parameter.

Component GIS {
  Output Interface CustomerService
    Field:
      ℙ PName P;                 //* set of parcel names
      ℙ CName S;                 //* set of customer names
      CName × PName owns;        //* owns(s, p): s owns p
      (PName ↦ Position) loc;    //* loc(p) returns the location of p
    Method:
      LocateParcel(in : PName pId, CName sId, out : Position location) {
        if pId ∈ P ∧ sId ∈ S ∧ owns(sId, pId)
        //* call required method
        then location := IParcelInfo.Where(pId)
        else chaos };
      DispatchParcel(in : PName pId, CName sId) {
        if pId ∉ P ∧ sId ∉ S
        then (P := P ∪ {pId}; S := S ∪ {sId}; owns := owns ∪ {(sId, pId)}; IParcelInfo.Deal(pId))
        else chaos };
  Input Interface ParcelInfo
    Field:
      ℙ PName P;                 //* set of parcel names
      (PName ↦ Position) loc;    //* loc(p) returns the location of p
    Method:
      Where(in : PName pId, out : Coordinates location);
      Deal(in : PName pId) };

Contract ParcelInfo
  IParcelInfo :: Init : P = ∅;
  IParcelInfo :: Where(in : PName pId, out : Position location) : pId ∈ P ⊢ location := loc(pId);
  IParcelInfo :: Deal(in : PName pId) : pId ∉ P ⊢ loc′(pId) = (0, 0)

We can calculate GIS(ParcelInfo) ⊑ CS. We have kept the attribute loc : PName ↦ Position to avoid defining a state mapping in the proof of the refinement. In the following part of the example, we provide a definition of component GTS to implement the contract ParcelInfo, where we use the specification of a class Parcel.

Component GTS
  Class Parcel {
    PName id;
    Position location = (0, 0);
    Position loc() { return := location } };
  Output Interface IParcelInfo
    Field:
      ℙ Parcel Parcels;
    Method:
      Deal(in : PName pId) {
        Parcel.New(p)[pId]; Parcels := Parcels ∪ {p}; end p };
      Where(in : PName pId, out : Position location) {
        var Parcel p = P.find(pId); location := p.loc(); end p }
Define the refinement mapping ρ from the attributes of Parcel to those of ParcelInfo:
\[
\begin{aligned}
\rho(P) &= \{p.id \mid p \in Parcel\}\\
\rho(loc(pId)) &= p.location \quad \text{for all } pId \in P \text{ such that } \exists p \in Parcel \cdot p.id = pId
\end{aligned}
\]
Then GTS imp ParcelInfo. □

7.6.2. Composition of components
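We can now extend the hiding operator to components.
Definition 7.6.5. (Hiding) Let C be a component, and $M \subseteq C.MDec$, then the notation C\M represents the component C after removal of the methods of M from its public method sector.
\[
\begin{aligned}
(C\backslash M).I &\mathrel{\widehat{=}} (C.I)\backslash M\\
(C\backslash M).MCode &\mathrel{\widehat{=}} (C.MDec \setminus M) \triangleleft C.MCode\\
(C\backslash M).PriMDec &\mathrel{\widehat{=}} C.PriMDec \cup (M \cap C.MDec)\\
(C\backslash M).PriMCode &\mathrel{\widehat{=}} C.PriMCode \cup ((M \cap C.MDec) \triangleleft C.MCode)\\
(C\backslash M).InMDec &\mathrel{\widehat{=}} C.InMDec
\end{aligned}
\]
□
Theorem 7.6.3. (Component hiding vs contract hiding)
\[
(C\backslash M)(InCtr) = C(InCtr)\backslash M
\]
□
The hiding operator is monotonic, and enjoys the same properties as the hiding operator of contracts.
Theorem 7.6.4. For any components C, $C_1$ and $C_2$, we have
(1) If $C_1 \sqsupseteq C_2$, then $(C_1\backslash M) \sqsupseteq (C_2\backslash M)$
(2) $C\backslash M \sqsubseteq C$
(3) $C\backslash\emptyset = C$
(4) $(C\backslash M_1)\backslash M_2 = C\backslash(M_1 \cup M_2) = (C\backslash M_2)\backslash M_1$
□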
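As a rough illustration of Definition 7.6.5, the following Python sketch represents a component by its fields, its two code maps and its input methods, and implements hiding as moving code from the public to the private map. The class name Component, its attribute names and the function hide are hypothetical; method bodies are kept as opaque strings, so the sketch only exercises the signature-level laws (3) and (4) of Theorem 7.6.4, not refinement.

```python
from dataclasses import dataclass


@dataclass
class Component:
    fdec: frozenset      # FDec: field names of the interface
    mcode: dict          # MCode; its keys play the role of MDec
    primcode: dict       # PriMCode; its keys play the role of PriMDec
    inmdec: frozenset    # InMDec: required (input) methods


def hide(c, m):
    """C\\M: move the public methods named in m to the private sector."""
    hidden = set(m) & set(c.mcode)           # only public methods can be hidden
    return Component(
        fdec=c.fdec,
        mcode={op: body for op, body in c.mcode.items() if op not in hidden},
        primcode={**c.primcode, **{op: c.mcode[op] for op in hidden}},
        inmdec=c.inmdec,
    )


# Usage, with the method names of component GTS from Example 7.3.
gts = Component(
    fdec=frozenset({"Parcels"}),
    mcode={"Deal": "<code of Deal>", "Where": "<code of Where>"},
    primcode={},
    inmdec=frozenset(),
)
only_where = hide(gts, {"Deal"})
```

Here only_where still offers Where publicly, while Deal has become a private method with the same code.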
Two totally independent components can be easily composed by putting their fields, input interfaces and codes of methods of their provided interfaces together.
Definition 7.6.6. (Disjoint parallel composition) Let $C_1$ and $C_2$ be components satisfying
(1) $C_1.FDec \cap C_2.FDec = \emptyset$, and
(2) $(C_1.MDec \cup C_1.PriMDec \cup C_1.InMDec) \cap (C_2.MDec \cup C_2.PriMDec \cup C_2.InMDec) = \emptyset$
The notation $C_1 \otimes C_2$ represents the component which integrates the corresponding method sectors of $C_1$ and $C_2$. Let C denote $C_1 \otimes C_2$ in the following definition.
\[
\begin{aligned}
C.I &\mathrel{\widehat{=}} C_1.I \uplus C_2.I\\
C.MCode &\mathrel{\widehat{=}} C_1.MCode \cup C_2.MCode\\
C.PriMDec &\mathrel{\widehat{=}} C_1.PriMDec \cup C_2.PriMDec\\
C.PriMCode &\mathrel{\widehat{=}} C_1.PriMCode \cup C_2.PriMCode\\
C.InMDec &\mathrel{\widehat{=}} C_1.InMDec \cup C_2.InMDec
\end{aligned}
\]
□
Theorem 7.6.5. For any input contract InCtr of the composed component $(C_1 \otimes C_2)$, we have the following compositional law.
\[
(C_1 \otimes C_2)(InCtr) = C_1(C_1.InMDec \triangleleft InCtr.MSpec) \parallel C_2(C_2.InMDec \triangleleft InCtr.MSpec)
\]
□
Theorem 7.6.6. The disjoint parallel operator is commutative, associative, distributive over hiding and monotonic with respect to refinement:
(1) $C_1 \otimes C_2 = C_2 \otimes C_1$
(2) $C_1 \otimes (C_2 \otimes C_3) = (C_1 \otimes C_2) \otimes C_3$
(3) $(C_1 \otimes C_2)\backslash M = (C_1\backslash(M \cap MDec_1)) \otimes (C_2\backslash(M \cap MDec_2))$
(4) If $C_1 \sqsubseteq C_2$ then $(C_1 \otimes C) \sqsubseteq (C_2 \otimes C)$
□
However, components are often interdependent, so that some public methods of one are among the required methods of the other. When they are composed, they will provide services to each other.
Definition 7.6.7. (Parallel composition) Let $C_1$ and $C_2$ be components satisfying
(1) $C_1.FDec \cap C_2.FDec = \emptyset$,
(2) $(C_1.MDec \cup C_1.PriMDec) \cap (C_2.MDec \cup C_2.PriMDec) = \emptyset$, and
(3) $(C_i.PriMDec \cap C_j.InMDec) = \emptyset$ for $i, j \in \{1, 2\} \wedge i \neq j$.
In this case, the composed component $C_1 \parallel C_2$, denoted by C, is defined below.
\[
\begin{aligned}
C.FDec &\mathrel{\widehat{=}} C_1.FDec \cup C_2.FDec\\
C.MDec &\mathrel{\widehat{=}} C_1.MDec \cup C_2.MDec\\
C.MCode &\mathrel{\widehat{=}} C_1.MCode \cup C_2.MCode\\
C.PriMDec &\mathrel{\widehat{=}} C_1.PriMDec \cup C_2.PriMDec\\
C.PriMCode &\mathrel{\widehat{=}} C_1.PriMCode \cup C_2.PriMCode\\
C.InMDec &\mathrel{\widehat{=}} (C_1.InMDec \setminus C_2.MDec) \cup (C_2.InMDec \setminus C_1.MDec)
\end{aligned}
\]
□
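The signature part of Definition 7.6.7 can be sketched in Python as follows. The class Component, its attribute names and the function compose are assumptions made for this illustration; method bodies are opaque strings, so the sketch shows only how the three side conditions are checked and how the required methods of the composition are computed, not the semantics of the composed code.

```python
from dataclasses import dataclass


@dataclass
class Component:
    fdec: frozenset      # FDec
    mcode: dict          # MCode; keys play the role of MDec
    primcode: dict       # PriMCode; keys play the role of PriMDec
    inmdec: frozenset    # InMDec


def compose(c1, c2):
    """C1 || C2 at the level of signatures; raises if a side condition fails."""
    if c1.fdec & c2.fdec:
        raise ValueError("condition (1) violated: shared fields")
    if (set(c1.mcode) | set(c1.primcode)) & (set(c2.mcode) | set(c2.primcode)):
        raise ValueError("condition (2) violated: shared method names")
    if set(c1.primcode) & c2.inmdec or set(c2.primcode) & c1.inmdec:
        raise ValueError("condition (3) violated: a private method is required")
    return Component(
        fdec=c1.fdec | c2.fdec,
        mcode={**c1.mcode, **c2.mcode},
        primcode={**c1.primcode, **c2.primcode},
        # methods required by one side and provided by the other are resolved
        inmdec=(c1.inmdec - set(c2.mcode)) | (c2.inmdec - set(c1.mcode)),
    )


# Usage, with the names of GIS and GTS from Example 7.3.
gis = Component(
    fdec=frozenset({"P", "S", "owns", "loc"}),
    mcode={"LocateParcel": "<code>", "DispatchParcel": "<code>"},
    primcode={},
    inmdec=frozenset({"Where", "Deal"}),
)
gts = Component(
    fdec=frozenset({"Parcels"}),
    mcode={"Deal": "<code>", "Where": "<code>"},
    primcode={},
    inmdec=frozenset(),
)
system = compose(gis, gts)
```

In this sketch the composition of GIS and GTS requires no further services, because both required methods of GIS are provided by GTS.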
Example 7.4. Let us compose GIS and GTS of Example 7.3 in parallel and then hide the required methods of GIS which are also the provided methods of GTS. The resulting component then implements the contract CS given in Example 7.2.
□
Disjoint parallel composition is a special case of parallel composition, as stated in the following theorem.
Theorem 7.6.7. For components $C_1$ and $C_2$, if $C_i.MDec \cap C_j.InMDec = \emptyset$ for $i \neq j \wedge i, j \in \{1, 2\}$, then $C_1 \parallel C_2 = C_1 \otimes C_2$. □
The following theorem, however, gives the semantics of a composition of mutually dependent components.
Theorem 7.6.8. Let $C_1$ and $C_2$ be components, and
\[
\begin{aligned}
M_1 &= (C_1.MDec \cap C_2.InMDec) = \{op_1, \ldots, op_k\}\\
M_2 &= (C_2.MDec \cap C_1.InMDec) = \{op_{k+1}, \ldots, op_{k+n}\}
\end{aligned}
\]
Then
\[
(C_1 \parallel C_2)(InMSpec) = C_1(InSpec_1) \parallel C_2(InSpec_2)
\]
where
\[
\begin{aligned}
InSpec_1 &= InMDec_1 \triangleleft (InMSpec \oplus \{op_i \mapsto D_i \mid k+1 \le i \le k+n\})\\
InSpec_2 &= InMDec_2 \triangleleft (InMSpec \cup \{op_i \mapsto D_i \mid 1 \le i \le k\})
\end{aligned}
\]
and where the designs $D_i$ ($1 \le i \le k+n$) are defined as the weakest fixed points of the equations
\[
\begin{aligned}
D_i &= C_1(InSpec_1).MSpec(op_i), & 1 \le i \le k\\
D_{k+j} &= C_2(InSpec_2).MSpec(op_{k+j}), & 1 \le j \le n
\end{aligned}
\]
□
Theorem 7.6.9. Parallel composition is commutative and associative
(1) $C_1 \parallel C_2 = C_2 \parallel C_1$
(2) $C_1 \parallel (C_2 \parallel C_3) = (C_1 \parallel C_2) \parallel C_3$
□
Theorem 7.6.10. It also enjoys the following distributivity and commutativity.
(1) If $M_1 \subseteq (C_1.MDec \setminus C_2.InMDec)$ and $M_2 \subseteq (C_2.MDec \setminus C_1.InMDec)$, then
\[
(C_1 \parallel C_2)\backslash(M_1 \cup M_2) = (C_1\backslash M_1) \parallel (C_2\backslash M_2)
\]
(2) If $C_i.MDec \cap F_j.InMDec = \emptyset$ and $F_i.MDec \cap C_j.InMDec = \emptyset$ for $i \neq j$, then
\[
(C_1 \otimes C_2) \parallel (F_1 \otimes F_2) = (C_1 \parallel F_1) \otimes (C_2 \parallel F_2)
\]
□
The parallel operator is monotonic with respect to the refinement ordering.
Theorem 7.6.11. If $C_1 \sqsubseteq C_2$ then $(C_1 \parallel C) \sqsubseteq (C_2 \parallel C)$. □
The algebraic equations presented above are easy to prove from the definitions and are essential to show the correctness of the semantic model.
7.7. Refinement Rules for Components
This section presents a set of refinement rules which enable us to develop component software from a contract. These rules can be divided into three categories.
(1) Enrichment of public and private methods. This is to provide formal support for incremental and iterative construction in a development process such as the Rational Unified Process [22] (RUP), and to unify refactorings [9] into the classical refinement framework.
(2) Delegation of tasks from public methods to private methods. This is to capture the essential principle of component-based and object-oriented decomposition, in which a functionality of a component is decomposed into sub-functionalities of a number of components. This is a component-based counterpart to the Expert Design Pattern in object-oriented development [23]. It is the basic pattern from which, together with other refinement rules, we can prove the general design patterns of the Gang of Four [10].
(3) Packaging private methods into a component. This captures the meaning of data encapsulation to make component software more robust and easier to maintain and reuse.
In the presentation of the refinement laws, we use the notation C[MDec, MCode, PriMDec, PriMCode, InMDec] to denote a component with explicit indication of its public and private method declarations, their codes and input method declarations. We can refine a component by adding more public methods or private methods, or by refining an existing method.
Law 2. Let $C_i[MDec_i, MCode_i, PriMDec, PriMCode, InMDec]$, i = 1, 2, be components. $C_1 \sqsubseteq C_2$ if $MDec_1 \subseteq MDec_2$ and $MDec_1 \triangleleft MCode_2 = MCode_1$. This law allows us to add public methods to refine a component. □
Law 3. Let $C_i[MDec, MCode, PriMDec_i, PriMCode_i, InMDec]$, i = 1, 2, be components. $C_1 \sqsubseteq C_2$ if $PriMDec_1 \subseteq PriMDec_2$ and $PriMDec_1 \triangleleft PriMCode_2 = PriMCode_1$. □
We can refine a component by refining a public method of it.
Law 4. Let C[MDec, MCode, PriMDec, PriMCode, InMDec] be a component, $m \in MDec$ and c a command such that $c \sqsupseteq MCode(m)$. Then
\[
C \sqsubseteq C[MDec, MCode \oplus \{m \mapsto c\}, PriMDec, PriMCode, InMDec]
\]
□
We can refine a component by refining a private method of it.
Law 5. Let C[MDec, MCode, PriMDec, PriMCode, InMDec] be a component, $m \in PriMDec$ and c a command such that $c \sqsupseteq PriMCode(m)$. Then
\[
C \sqsubseteq C[MDec, MCode, PriMDec, PriMCode \oplus \{m \mapsto c\}, InMDec]
\]
□
Law 6. Let $C_1$ and $C_2$ be components. If $C_2.InMDec \subseteq C_1.InMDec$, and $C_1 \otimes C_2$ is well-defined, then $C_1 \sqsubseteq (C_1 \otimes C_2)$. □
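The side conditions of Laws 2, 3 and 6 are purely syntactic, so they lend themselves to mechanical checking. The following Python sketch shows one way this might look; the function names restrict, adds_methods and needs_no_more_services are hypothetical, and code maps are plain dictionaries from method names to opaque code strings.

```python
def restrict(methods, code):
    """Domain restriction of a code map to a set of method names."""
    return {m: body for m, body in code.items() if m in methods}


def adds_methods(code1, code2):
    """Side condition shared by Laws 2 and 3.

    The declared methods of code1 are among those of code2, and code2
    restricted to them coincides with code1 (nothing existing is changed).
    """
    return set(code1) <= set(code2) and restrict(set(code1), code2) == code1


def needs_no_more_services(inmdec1, inmdec2):
    """Side condition of Law 6: C2 requires no input method that C1 lacks."""
    return set(inmdec2) <= set(inmdec1)


# Usage: adding a public method is allowed, changing an existing one is not.
before = {"Where": "<code A>"}
extended = {"Where": "<code A>", "Deal": "<code B>"}
changed = {"Where": "<code C>", "Deal": "<code B>"}
```

With these values, adds_methods(before, extended) holds, whereas adds_methods(before, changed) fails because the code of Where differs; changing existing code is the subject of Laws 4 and 5, which require a refinement proof rather than a syntactic check.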
The condition that $C_2.InMDec \subseteq C_1.InMDec$ ensures that the disjoint composition of $C_1$ and $C_2$ does not depend on more external services than $C_1$ itself. Therefore, the composition leads to the enlargement of the public and private methods of $C_1$. This refinement law is in fact derivable from Law 1 and Law 2. The following law shows that promoting a private method of a component to a public one also refines the component.
Law 7. Let C[MDec, MCode, PriMDec, PriMCode, InMDec] be a component and let $op \in PriMDec$. Then $C \sqsubseteq C[MDec_1, MCode_1, PriMDec_1, PriMCode_1, InMDec]$ where
\[
\begin{aligned}
MDec_1 &= MDec \cup \{op\}\\
MCode_1 &= MCode \cup \{op \mapsto PriMCode(op)\}\\
PriMDec_1 &= PriMDec \setminus \{op\}\\
PriMCode_1 &= (PriMDec \setminus \{op\}) \triangleleft PriMCode
\end{aligned}
\]
□
A public method can delegate some of its tasks to other methods, either private or public.
Law 8. Let C[MDec, MCode, PriMDec, PriMCode, InMDec] be a component, $op(in : U, out : V) \in MDec \cup PriMDec$, and $m \in MDec$. Assume that MCode(m) = P(c) such that P(c) is a command containing c as a subcommand, and
\[
c \sqsubseteq \mathbf{var}\ out, in := e;\ PriMCode(op);\ v := out;\ \mathbf{end}\ in, out
\]
then
\[
C \sqsubseteq C[MDec, MCode \oplus \{m \mapsto P(op(e, v))\}, PriMDec, PriMCode, InMDec]
\]
□
A private method can also delegate some of its tasks to other methods, either private or public.
Law 9. Let C[MDec, MCode, PriMDec, PriMCode, InMDec] be a component, $op(in : U, out : V) \in MDec \cup PriMDec$, and $m \in PriMDec$. Assume that PriMCode(m) = P(c) such that P(c) is a command containing c as a subcommand, and
\[
c \sqsubseteq \mathbf{var}\ out, in := e;\ PriMCode(op);\ v := out;\ \mathbf{end}\ in, out
\]
then
\[
C \sqsubseteq C[MDec, MCode, PriMDec, PriMCode \oplus \{m \mapsto P(op(e, v))\}, InMDec]
\]
□
Note that when two components $C_1$ and $C_2$ are composed to $C_1 \parallel C_2$, the provided and private methods of both components, including those which are input methods of one and provided methods of another, will become the provided methods of the composed component. Hiding can then also be used to make some methods private to the composed component. Therefore, the above law can be used to delegate a task in a provided interface or input methods of one component to a method provided by the other component. We can split a component into independent subcomponents by the following refinement law.
Law 10. Let $M_1$ and $M_2$ be a partition of the set MDec of method declarations. Let $FDec_i$ and $PriMDec_i$ be the sets of fields and private methods used by operations in $M_i$ respectively, i = 1, 2, and let PriMCode map each method in $PriMDec_1 \cup PriMDec_2$ to a program text. Assume that
(1) $FDec_1 \cap FDec_2 = \emptyset$.
(2) $PriMDec_1 \cap PriMDec_2 = \emptyset$.
(3) Methods in $M_i \cup PriMDec_i$ do not refer to fields in $FDec_j$ nor methods in $PriMDec_j$ for $i \neq j$.
Let, for i = 1, 2,
\[
\begin{aligned}
I_i &= (FDec_i, M_i), \text{ and}\\
C_i &= (I_i, M_i \triangleleft MCode, PriMDec_i, PriMDec_i \triangleleft PriMCode, InMDec)
\end{aligned}
\]
Then $C \sqsubseteq (C_1 \otimes C_2)$. □
We can package private methods into a new component.
Law 11. Let $M_1$ and $M_2$ be a partition of a set PriMDec of method declarations and $FDec_1$ be the set of fields in FDec used by methods in $M_1$. Let MDec be a set of method declarations such that it is disjoint with PriMDec, and none of the methods in $M_1$ calls the methods in $MDec \cup M_2$ or refers to fields in $FDec_1$. Let
\[
\begin{aligned}
I_1 &= (FDec_1, M_1)\\
I_2 &= (FDec \setminus FDec_1, MDec)\\
C_1 &= (I_1, (M_1 \triangleleft PriMCode), \emptyset, \emptyset, InMDec)\\
C_2 &= (I_2, MCode, M_2, (M_2 \triangleleft PriMCode), M_1 \cup InMDec)
\end{aligned}
\]
Then $C \sqsubseteq (C_1 \parallel C_2)\backslash M_1$. □
The above two laws allow us to decompose a complex component which may handle unrelated tasks into a number of simpler and more cohesive components. This improves the reusability and maintainability of the system.

7.8. Client-Server Systems
Client-server systems are often seen as applications in component software. The architecture of such a system is organized as a layered structure as illustrated in Figure 7.2. On the top are the clients that only require services from the components in the next layer. Components in a middle layer provide services to components in the layer above, but require services from the layer below. The components at the bottom are the basic server components that only provide services to the components above the bottom layer. Components of the same layer have disjoint provided interfaces. The layers are organized according to interface dependencies. The whole system is the composition of the components with the interfaces hidden:
\[
((C_{11} \parallel \cdots \parallel C_{1k_1}) \parallel \cdots \parallel (C_{n1} \parallel \cdots \parallel C_{nk_n}))\backslash All
\]
where All denotes all the linked interfaces.

Fig. 7.2. Client-Server Systems

The construction of such a system can be top-down:
\[
\begin{aligned}
S_1 &\mathrel{\widehat{=}} (C_{11} \parallel \cdots \parallel C_{1k_1})\\
S_j &\mathrel{\widehat{=}} (S_{j-1} \parallel C_{j1} \parallel \cdots \parallel C_{jk_j})\backslash M_{j-1}, \quad 1 < j \le n
\end{aligned}
\]
where $S_n$ is the resulting component, and $M_i$, $1 \le i < n$, are the linked interfaces. The system construction can also be bottom-up:
\[
\begin{aligned}
S_1 &\mathrel{\widehat{=}} C_{n1} \parallel \cdots \parallel C_{nk_n}\\
S_j &\mathrel{\widehat{=}} (S_{j-1} \parallel C_{(n-(j-1))1} \parallel \cdots \parallel C_{(n-(j-1))k_{n-(j-1)}})\backslash M_{n-(j-1)}, \quad 1 < j \le n
\end{aligned}
\]
In both ways we end with $S = S_n$.
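The top-down construction can be sketched as a fold over the layers. The Python below is a minimal illustration under stated assumptions: the names Component, compose, hide and build_top_down are hypothetical, fields and the side conditions of parallel composition are omitted for brevity, and the linked interface between two adjacent layers is taken to be the set of methods that the system built so far requires and the next layer provides.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Component:
    mcode: dict          # public methods and their code
    primcode: dict       # private methods and their code
    inmdec: frozenset    # required (input) methods


def compose(c1, c2):
    """Signature-level parallel composition (side conditions not checked)."""
    return Component(
        mcode={**c1.mcode, **c2.mcode},
        primcode={**c1.primcode, **c2.primcode},
        inmdec=(c1.inmdec - set(c2.mcode)) | (c2.inmdec - set(c1.mcode)),
    )


def hide(c, m):
    """Move the public methods named in m to the private sector."""
    hidden = set(m) & set(c.mcode)
    return Component(
        mcode={op: body for op, body in c.mcode.items() if op not in hidden},
        primcode={**c.primcode, **{op: c.mcode[op] for op in hidden}},
        inmdec=c.inmdec,
    )


def build_top_down(layers):
    """Fold the layers from the clients down, hiding each linked interface."""
    system = reduce(compose, layers[0])
    for layer in layers[1:]:
        lower = reduce(compose, layer)
        linked = system.inmdec & set(lower.mcode)   # assumed linked interface
        system = hide(compose(system, lower), linked)
    return system


# Usage: a hypothetical client on top of GIS and GTS from Example 7.3.
client = Component({"Track": "<code>"}, {}, frozenset({"LocateParcel"}))
gis = Component({"LocateParcel": "<code>", "DispatchParcel": "<code>"}, {},
                frozenset({"Where", "Deal"}))
gts = Component({"Deal": "<code>", "Where": "<code>"}, {}, frozenset())
system = build_top_down([[client], [gis], [gts]])
```

In this run the linked methods LocateParcel, Where and Deal end up private to the system, while methods that no upper layer requires, such as DispatchParcel, remain public.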
7.9. Conclusion
We have introduced a component calculus, whereby a system can be composed from a number of interconnected components. A component provides certain services and these services are specified in terms of the component's interface and contract. The designer of the component has to design and implement the component to satisfy its specification. In a model-based component development process, a developer may make use of the provided services of existing components. We model the behaviour of an individual service by a design, which enables one to separate the responsibility of clients from the commitment made by the component. We adopt the notion of refinement to formalise the replaceability of components. This allows developers to improve the system performance by reorganising its components. The algebraic laws for the composition operators on interfaces, contracts and components show the correctness of the calculus. The refinement laws support incremental and step-wise construction of a component. Merging and hiding of interfaces for components add more support to incremental construction of component software, and make it possible to restrict the use of some services by certain users.
From a Model Based Development perspective, the calculus supports the following development process, which starts by identifying system components and their dependencies in terms of provided and required services. The requirement analysis and specification mainly focus on obtaining the contracts of the components. Assuming that we are given a contract Ctr for a component, we can then conduct the following design activities on the component.
(1) Design a new contract $Ctr_1 = (I_1, MSpec_1)$ by reorganising the fields and presenting a mapping ρ from FDec to $FDec_1$ to ensure that for every $op \in MDec$
\[
(MSpec(op); \rho) \sqsubseteq (\rho; MSpec_1(op))
\]
(2) Introduce a component $C_1 = (I_1, MCode_1, \emptyset, \emptyset, \emptyset)$ to implement the general contract $GCtr = (Ctr_1, \emptyset, \emptyset)$ by ensuring for every $op \in MDec_1$
\[
MSpec_1(op) \sqsubseteq MCode_1(op)
\]
(3) Rebuild a new component $C_2 = (I_1, MCode_1, PriMDec, PriMCode, \emptyset)$. This is done by adding private methods whose specification is given by the function PriMCode.
(4) Replace $C_2$ by $C_3 = (I_1, MCode_2, PriMDec, PriMCode, \emptyset)$ after delegation of some tasks of public methods to the internal method sector using Law 8.
(5) Implement $C_3$ by
(a) either a parallel construct $(C_4 \parallel C_5)\backslash M$ by using Law 11,
(b) or disjoint parallel composition of components $C_4 \otimes C_5$ by using Law 10.
(6) Repeatedly apply the steps (3)-(5) to the subcomponents $C_4$ and $C_5$.
The decomposition is first done horizontally, i.e. decomposing according to the whole system's use cases, so that the use cases of one component involve a large number of common concepts (classes) and associations, while use cases of different components are less dependent. Then each group of use cases is decomposed vertically, such that use cases can be realized by interactions of a number of components that are invisible to the outside of the system. In fact, vertical decomposition is more relevant to component-based development, as horizontal decomposition is more or less about the design of independent components.
Based on this discussion, we propose an extension to the UML and RUP use-case driven approach to system development [22] so that components and their interfaces, instead of only classes and class associations [23], can be identified to create a component diagram. Then use cases can be described by component sequence diagrams or message sequence charts [24], rather than system sequence diagrams that only describe interactions of actors and the whole system treated as a black box [23]. UML 2.0 has made some progress in this direction, its component diagrams in particular. Therefore, the requirement analysis in component-based development is about the identification of these high level use cases and the creation of component sequence diagrams and component diagrams. The application of UML then follows in the development of a component if an object-oriented approach is thought appropriate for that component.
We have not introduced a special concept of connectors [7, 20] as they can be treated as (special) components. It is however interesting and important to define a special class of components for connectors and study their properties and uses. We will include this in our future work. Future work also includes the study of reactive components that only provide services according to pre-specified interaction protocols [13]. We are also working on how to compose active processes with components. The processes are used to glue and coordinate the behaviour of the components. The most recent development in rCOS concerns coordination and is presented in [6].

Acknowledgement
We would like to thank Anders Ravn and Jens Peter for their careful reading and detailed comments, and our colleague Dang Van Hung for his comments on the earlier version of the chapter. The work is partly supported by the HighQSoftD Project funded by the Macao Science and Technology Fund and the Chinese NSF Project 60573058.

Bibliography
1. L.F. Andrade and J.L. Fiadeiro. Interconnecting objects via contracts. In R. France and B. Rumpe, editors, UML'99 - Beyond the Standard, LNCS 1723. Springer-Verlag, 1999.
2. F. Arbab. Reo: A channel-based coordination model for component composition. Mathematical Structures in Computer Science, 14(3):329-366, 2004.
3. R.J.R. Back, L. Petre, and I.P. Paltor. Formalizing UML use cases in the refinement calculus. Technical Report 279, Turku Centre for Computer Science, Turku, Finland, May 1999.
4. M. Broy. A theory of requirements specification and architecture design of multifunctional software systems. In Chapter \ of this volume. 2006.
5. J. Cheesman and J. Daniels. UML Components. Component Software Series. Addison-Wesley, 2001.
6. X. Chen, J. He and Z. Liu. rCOS: A refinement calculus for object systems. Technical Report UNU-IIST Report No 335, UNU-IIST, P.O. Box 3058, Macau, March 2006. http://www.iist.unu.edu.
7. D. D'Souza and A.C. Wills. Objects, Components and Framework with UML: The Catalysis Approach. Addison-Wesley, 1998.
8. J.K. Filipe. A logic-based formalization for component specification. Journal of Object Technology, 1(3):231-248, 2002.
9. M. Fowler. Refactoring - Improving the design of existing code. Addison-Wesley, 1999.
10. E. Gamma, et al. Design Patterns. Addison-Wesley, 1995.
11. G. Gossler and J. Sifakis. Composition for component-based modeling. Science of Computer Programming, 55(1-3), 2005.
12. M. Gudgin. IDL: Interface design for COM. Reading, MA: Addison-Wesley, 2001.
13. J. He, X. Li, and Z. Liu. Component-based software engineering. In Proc. 2nd International Colloquium on Theoretical Aspects of Computing (ICTAC05), Lecture Notes in Computer Science 3722, pages 70-95. Springer, 2005.
14. J. He, Z. Liu, and X. Li. Towards a refinement calculus for object-oriented systems (keynote talk). In Proc. ICCI02, August 19-20, 2002, Alberta, Canada. IEEE Computer Society, 2002.
15. J. He, Z. Liu, and X. Li. rCOS: A refinement calculus for object systems. Technical Report UNU-IIST Report No 322, UNU-IIST, P.O. Box 3058, Macau, March 2005. Accepted for publication in Theoretical Computer Science - B.
16. J. He, Z. Liu, X. Li, and S. Qin. A relational model of object oriented programs. In Proceedings of the Second ASIAN Symposium on Programming Languages and Systems (APLAS04), Lecture Notes in Computer Science 3302, pages 415-436, Taiwan, March 2004. Springer.
17. G.T. Heineman and W.T. Councill. Component-Based Software Engineering, Putting the Pieces Together. Addison-Wesley, 2001.
18. R. Helm, I. Holland, and D. Gangopadhyay. Contracts: Specifying behavioral compositions in object-oriented systems. In Proc. OOPSLA'90/ECOOP'90, pages 169-180. ACM, 1990.
19. C.A.R. Hoare and J. He. Unifying theories of programming. Prentice-Hall International, 1998.
20. J.L. Fiadeiro and A. Lopes. Semantics of architectural connectors. In M. Bidoit and M. Dauchet, editors, Proc. TAPSOFT'97, LNCS 1214, pages 505-519. Springer-Verlag, 1997.
21. C.B. Jones. Systematic Software Development Using VDM. Upper Saddle River, NJ: Prentice Hall, 1990.
22. P. Kruchten. The Rational Unified Process - An Introduction (2nd Edition). Addison-Wesley, 2000.
23. C. Larman. Applying UML and Patterns. Prentice-Hall International, 2001.
24. S. Leue. Methods and Semantics for Telecommunications Systems Engineering. PhD thesis, Department of Computer Science and Applied Mathematics, University of Berne, Switzerland, December 1994.
25. Z. Liu, J. He, and X. Li. Contract-oriented development of component software. In Proc. 3rd IFIP International Conference on Theoretical Computer Science, pages 355-372, Toulouse, France, 2004. Kluwer.
26. Z. Liu, J. He, and X. Li. rCOS: Refinement of component and object systems. In Proc. 3rd International Symposium on Formal Methods for Components and Objects (FMCO04), Lecture Notes in Computer Science 3657, pages 222-250. Springer, 2005.
27. Z. Liu, J. He, X. Li, and Y. Chen. A relational model for object-oriented requirement analysis in UML. In Proc. of International Conference on Formal Engineering Methods (ICFEM), Lecture Notes in Computer Science 2885, Singapore, 5-7 November, 2003.
28. B. Meyer. Object-oriented Software Construction (2nd Edition). Prentice Hall, 1997.
29. Microsoft. The component object model specification. Report v0.99, Microsoft Standard, Redmond, WA: Microsoft, 1996.
30. Sun Microsystems. JavaBeans 1.01 specification. http://java.sun.com/beans.
31. OMG. The common object request broker: architecture and specification. Report v2.5, OMG Standard Collection, OMG, 2000.
32. C. Szyperski. Component Software: Beyond Object-Oriented Programming. Addison-Wesley, 2002.
33. J. Warmer and A. Kleppe. The Object Constraint Language. Addison-Wesley, 1999.
34. M. Wirsing and M. Broy. Algebraic state machines. In T. Rus, editor, Proc. 8th Internat. Conf. Algebraic Methodology and Software Technology, AMAST 2000. LNCS 1816, pages 89-118. Springer, 2000.
Chapter 8
Characterising Object-Based Frameworks in First-Order Predicate Logic
Shui-Ming Ho and Kung-Kiu Lau
School of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, U.K.
In the component-based approach Catalysis, a framework is a reusable artefact that can be adapted and composed into larger systems. The signed contract between components specifies how the required properties of one component are satisfied by the provided properties of another. We examine this concept in the context of framework-based development. Although Catalysis advocates rigorous development, frameworks lack a comprehensive formal foundation. We consider a simplified view of frameworks and their transformation into first-order logic. Theorem proving may be used to check the consistency of framework specifications and we identify ways in which these specifications may be simplified beforehand to reduce the burden of proof.
8.1. Introduction
Code is not the only reusable artefact obtained during system development. Specifications and designs also have the potential for reuse. In particular, there has been a growing interest in the identification and application of recurring patterns of interaction. The relationship between design patterns and components, for example, has previously been studied by Johnson [1] and Larsen [2]. Patterns may be applied during the development process and they may be realised by one or more software components. In the component-based development approach Catalysis [3], design patterns are at the heart of frameworks. Framework-based development shifts the focus of reuse from single classes to groups of classes. Underlying this approach are the concepts of abstraction and refinement: generic frameworks may be specialised and adapted to specific problem domains. The use of these frameworks may be subject to constraints, and ensuring these constraints are satisfied is integral to the Catalysis approach. Given its emergence as the de facto standard for object-oriented modelling, the Unified Modelling Language [4] (UML) is used to model frameworks in Catalysis. The Object Constraint Language [5] (OCL) can be used to specify their semantics. Different modelling approaches introduce their own modelling concepts and Catalysis is no exception. Although the use of UML and OCL is widespread, they are
not necessarily applicable to these approaches. Since the publication of D'Souza and Wills' text on Catalysis (Ref. 3), both UML and OCL have been revised, culminating in the definitions of UML 2.0 and OCL 2.0. Both languages can express more but because of the specifics of Catalysis' modelling approach, they are still inadequate for framework modelling. Prior to OCL 2.0, much work has been devoted to increasing OCL's expressiveness. Of importance has been the need to provide greater support for business modelling and the need to express different kinds of business rules, e.g., those classified by Eriksson and Penker [6]. Catalysis has relied on its own versions of UML and OCL, at the same time prescribing a different semantics for OCL. Different notations can be used in different situations, making it difficult to fix a standard notation for framework modelling. In this chapter, for the purpose of framework modelling, we will depart entirely from UML/OCL. Instead, we make use of a textual language, Framework Modelling Language (FML), for defining frameworks from scratch. The intention is that FML can be used to define the core structural and behavioural properties of frameworks. However, these are informal descriptions and we will examine how frameworks in FML might be formalised. Many existing approaches to formalising UML and OCL are based on the transformation of UML models and OCL constraints into an existing formalism. Given the differences in UML/OCL and Catalysis' version of these languages, the applicability of these approaches to frameworks is limited. In this chapter, we outline how frameworks defined in FML may be translated into first-order logic. The resulting specification may be submitted for theorem proving. Where appropriate, this specification may be simplified beforehand. These simplifications are explored in Sec. 8.7.
8.2. Catalysis Frameworks
That different aspects of a system can be modelled independently, then assembled together, is not new: role modelling in OORAM [7], aspect-oriented development, and framework modelling in Catalysis are variations of this concept. Traditionally, frameworks are defined as groups of interacting objects. Figures 8.1 and 8.2 show two such frameworks. These domain specific frameworks describe the roles a person plays in different contexts: that of a driver and an employee. Figure 8.1 shows the structural relationship between drivers and their cars. Collaborative behaviours can be expressed as joint actions, as illustrated in Fig. 8.2: employees and companies collaborate in the action employ. Joint actions may be thought of as system level behaviours, much like use cases in object-oriented analysis. They may be decomposed into much smaller actions (local actions, or messages) which occur at the object level. The decomposition of these actions may be represented in a similar manner to use cases in UML. The discussion of frameworks in this chapter, however, will not be concerned with such decompositions. The above frameworks can be augmented with constraints, which can be expressed in either OCL or in any other notation, e.g., natural language.
Fig. 8.1. People playing the role of drivers (framework Drivers, relating Person and Car)

Fig. 8.2. Companies and employees collaborating in the action employ (framework Employees, relating Company and Person)
Model Synthesis/Composition. The models of Figs. 8.1 and 8.2 may be synthesised, or composed, to form a new framework Drivers+Employees. Figure 8.3 shows the unfolded view of the framework. Whether this synthesis is possible in UML is dependent upon how well defined the package extension mechanism of UML is. For a discussion of UML's package extension mechanisms, the reader is referred to Cook et al.'s discussion [8]. In Catalysis, the derivation of Drivers+Employees is expressed either using a package dependency diagram (Fig. 8.4a) or as a pattern application (Fig. 8.4b).

Fig. 8.3. The synthesis of the Driver and Employee frameworks (framework Drivers+Employees, relating Company, Person and Car)
External Interactions. Usually, objects within a framework collectively maintain some invariant, which must be observed by interactions within the framework and also by interactions between objects in different frameworks. This latter situation arises when model composition occurs and the behavioural models of each component framework must be unified. The synthesis of frameworks may result in invalid models, either because constraints conflict or because existing ones are too weak, resulting in unexpected behavioural models. An effect invariant is a constraint on actions both within a framework (internal actions) and those defined elsewhere (external actions). Effect invariants are usually expressed as trigger rules, similar to the behavioural contracts of Helm et al. [9]. For example,
S.-M. Ho and K.-K. Lau
[Figure 8.4 diagram: panel (a) shows the packages Drivers, Employees, and Drivers+Employees; panel (b) shows the patterns Drivers and Employees applied to Car, Person, and Company]
Fig. 8.4. The derivation of Drivers+Employees can be expressed in two ways: (a) using package dependency diagrams; or (b) using pattern application diagrams
in the Observer pattern an effect invariant could specify the sequence of actions that take place whenever the state of a subject changes: the trigger (a change in the subject's state) results in the subject sending a notification to its observers, in turn causing them to update themselves. OCL 2.0 goes some way toward expressing these kinds of rules. However, these rules can only occur within the context of a single class although, in general, rules may be defined on the whole system.
From the informal to the formal
The remainder of this chapter is devoted to the problem of deriving a formal specification, spec(F), of an FML framework F. Ideally, we would like to fix a standard notation for defining frameworks and their properties. Catalysis makes use of its own extensions to UML/OCL. The notations may be intuitive but they lack a proper definition within the (informal) semantic framework of UML/OCL. The extensions either require subtle changes to the existing semantics of OCL, or complicate the language unnecessarily. For this reason, FML is used.
8.3. A Cursory Overview of FML
In the majority of UML models there are two kinds of model element: structural elements and behavioural elements. In FML, structural elements correspond to classes, attributes, and associations; behavioural elements correspond to events that occur within the framework.
8.3.1. Datatypes
Common to many object-oriented (or object-based) languages is the notion of a set of basic data types (e.g., Integer, String, etc.) from which more complex data structures can be built. UML is no exception and basic data types are defined within a package called DataTypes. In FML, we will assume the existence of a corresponding framework, DataTypes, which contains the ADT definitions of primitive types. Parameterised collection types (bags, sets, and sequences) also exist in UML/OCL. In the sequel we make use of one such collection: Set(Data), sets of Data elements. In addition to primitive types it is also possible to define additional datatypes within a framework. This is illustrated later.
8.3.2. Classes and Associations
The FML declaration
framework F
class C {}
defines a class C in framework F. The class has no features (or fields), which is denoted by the empty braces. The term feature is used to refer to a property of a class, because attributes and associations are not conceptually distinguished in FML as they are in UML. The features of a class may be represented by either functions or predicates. In the declaration
class C { f: -> T; p: (T) }
the class C has two features, denoted by the function f and the predicate p. In general, f and p may either be attributes of class C or associations in which instances of C participate. If T is a class, then f can be interpreted as the name of an association that is navigable from C to T, through which a single instance of T is returned. The feature p may be interpreted as the name of an association which, when navigated, results in any number of T-instances. This loosely corresponds to the situation in UML where an association p is marked with the variable multiplicity marker * at the association end at T. Given the above, the correspondence between features in FML and attributes and associations in UML can be described as follows. Let F be a framework expressed in Catalysis' extended UML notation and let F consist of the n classes read from the diagram: C1, ..., Cn. The corresponding framework in FML will be denoted by F and the classes of F will be denoted by C1, ..., Cn.
Attributes. Suppose the class Ci is defined as follows:
Ci { a: -> T; b: (t1, ..., tk, T) }
(1) The feature a corresponds to the UML attribute declaration a: T in Ci.
(2) The feature b corresponds to the (Catalysis) parameterised attribute b with parameter types t1, ..., tk and result type T.
Binary Associations. Suppose R is a binary association joining the classes C2 and C3 (where neither class has attributes or other associations) and that the association ends are labelled with the role names r2 and r3 respectively.
(1) The declaration C2 { r3: -> C3 } corresponds to the situation where the association end named r3 has the multiplicity constraint 1.
(2) The declaration C3 { r2: (C2) } corresponds to the situation where the association end named r2 has any of the following multiplicity constraints: 0..*, 1..*, or 0..1.
In the above, it is assumed that navigation is possible from one class to another if and only if there is a role name at the target end of the association. This gives rise to a third condition.
(3) In the absence of role names, the direction of navigation may be explicitly marked as being from C2 to C3, in which case we have either the declaration C2 { R: -> C3 } or C2 { R: (C3) }, depending on the multiplicity at C3 as outlined above.
Item (2) above represents a departure from UML/OCL in that it deviates from the usual notion of navigation through an association. An alternative is to allow collection types, i.e., r2: -> Set(C2). Other collections may be used. For example, if r2 had the UML stereotype ordered, then we might use the declaration r2: -> Sequence(C2) instead. The use of collection types allows us to express a greater range of multiplicity constraints on association ends.
n-ary Associations. These associations represent a general relationship between n participant classes. Unlike binary associations, however, an n-ary association is not navigable. If R is a ternary association between C1, C2, and C3, then each of those classes has a feature named R, e.g., a feature R: (C2, C3) in class C1.
It should be noted that the above covers only a subset of UML's repertoire of associations. The frameworks considered in this chapter are object-based frameworks, not object-oriented ones. Although inheritance may be used for more interesting designs, it may also introduce its own problems, particularly where behavioural overriding occurs [10]. Classes in FML are considered as defining traits rather than taxonomic structures. Additionally, UML has the relationships of composition and aggregation. How aggregation should be applied has been the subject of much discussion. Both these relationships, however, may be represented as normal associations coupled with suitable constraints to represent either a composition or an aggregation relationship.
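The correspondence between association ends and FML features described above can be sketched as a small lookup. This is only an illustration in Python, not part of FML or Catalysis: the function names fml_feature and fml_collection_feature are hypothetical, and the sketch covers only the multiplicities that the text lists.

```python
def fml_feature(role: str, target: str, multiplicity: str) -> str:
    """Return the FML feature declared in the source class for one association end.

    `role` is the role name at the target end and `target` the class at that end.
    (Hypothetical helper; only the multiplicities named in the text are handled.)
    """
    if multiplicity == "1":
        # Exactly one target instance: a function feature.
        return f"{role}: -> {target}"
    if multiplicity in {"0..*", "1..*", "0..1"}:
        # Variable multiplicity: a predicate feature.
        return f"{role}: ({target})"
    raise ValueError(f"multiplicity {multiplicity!r} is not covered by the mapping")


def fml_collection_feature(role: str, target: str, ordered: bool = False) -> str:
    """Alternative encoding: a function feature that returns a collection."""
    collection = "Sequence" if ordered else "Set"
    return f"{role}: -> {collection}({target})"


print(fml_feature("r3", "C3", "1"))       # feature declared in C2
print(fml_feature("r2", "C2", "0..*"))    # feature declared in C3
print(fml_collection_feature("r2", "C2", ordered=True))
```

The predicate encoding deliberately loses the distinction between 0..*, 1..*, and 0..1; recovering it needs either the collection encoding or an additional constraint on the framework.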
8.3.3. Behavioural Elements
Whereas Catalysis talks about actions, in FML observable behaviours are described using timed events. An event is instantaneous and corresponds to the occurrence of some action. Alternatively, it may represent some signal that is raised whenever some property holds at a given point in time. A timed event in F, declared as m(C1, ..., Cm, t1, ..., tn, Time), where C1, ..., Cm are the participant classes of the action and t1, ..., tn are additional parameter types of the action, can be interpreted in two ways:
(1) there is a joint action m(t1, ..., tn) in F in which instances of C1, ..., Cm participate; or
(2) there is a local action m initiated by an instance of C1, which involves objects of the other specified classes.
Actions, whether local or joint, are considered as globally observable behaviours in FML, occurring within an explicit time context, as dictated by the Time parameter.
8.3.4. Constraints
Conceptually, objects have state histories and in FML it is possible to refer to an object as it is at specific points in time. We denote the state of an object x at time t by the term $(x,t). As usual, the dot notation is used to represent feature access. Thus, if a is a function feature of x, then the term $(x,t).a denotes the value of a for object x at time t. Constraints on objects are introduced as facts of the framework. Facts are expressed as formulae in first-order logic using FML's notation. As is the case with events, constraints are always global properties, unlike in OCL, where constraints are always local properties of classes. As an example, the multiplicity on the role named employee in Fig. 8.2 may be expressed as the requirement that the set employee must not be empty:
fact oneOrMoreEmployees {
  all c: Company, t: Time:: !empty($(c,t).employee)
}
This assumes that association ends with variable multiplicity are represented as functions returning collections. In this case, employee is assumed to be a function with target type Set(Person) as opposed to a predicate employee: (Person). Preconditions and postconditions may be attached to events. The employ declaration may be given a pre- and postcondition as follows:
event employ(c: Company, p: Person, t: Time) {
  pre: !mem(p, $(c,t).employee)
  post: $(c,next(t)).employee = add(p, $(c,t).employee)
}
The term $(c,next(t)) denotes the state of the object c at time t+1, and the postcondition states that employ results in a person p being added to the set employee of c at time t+1.
8.3.5. Importing Mechanisms
The import mechanism of FML provides the basis for framework composition and extension. An importing framework may extend existing frameworks in a number of ways. It may define new features for classes, new events, and new constraints. In addition, the importing framework may rename elements from imported frameworks.
A statement "import F [A\B, A.f\g]" denotes the importing of the framework F subject to two conditions: the class (or type) A is renamed to B and the feature f of A is renamed to g. Renaming allows us to force classes from different frameworks to be considered as partial definitions of the same class.
Example: Suppose that two FML frameworks Drivers and Employees have been defined. Then the derivation of Drivers+Employees is accomplished by
framework Drivers+Employees
import Drivers
import Employees
This is the structured definition of the framework. The body of the framework module consists of two import statements, analogous to a textual inclusion of the bodies of the Drivers and Employees frameworks. The consequence is that there are two Person definitions: one with a feature drives and the other with a feature worksFor. Suppose there is a procedure flatten, which takes the structured definition of F and produces the unfolded view of F. The purpose of flatten is to resolve multiple definitions of the same class and to produce a single definition of that class. The unfolded view, or flat definition, of Drivers+Employees is shown in Fig. 8.5. The flatten procedure takes the definitions of Person from Drivers and Employees and amalgamates them into the single definition shown.
framework Drivers+Employees
class Person { drives: -> Car, worksFor: -> Company }
class Car { owner: -> Person }
class Company { employee: -> Set(Person) }
fact oneOrMoreEmployees {
  all c: Company, t: Time:: !empty($(c,t).employee)
}
event employ(c: Company, p: Person, t: Time) {
  pre: !mem(p, $(c,t).employee)
  post: $(c,next(t)).employee = add(p, $(c,t).employee)
}
Fig. 8.5. The unfolded Drivers+Employees framework in FML
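The amalgamation performed by flatten can be sketched as a merge of per-class feature tables. The text does not define flatten in detail, so the following Python is only one plausible reading: frameworks are modelled as dictionaries from class names to feature declarations, the split of features between DRIVERS and EMPLOYEES is an assumption based on the description of Figs. 8.1 and 8.2, and the conflict check is an added safeguard rather than something the chapter prescribes.

```python
def flatten(*frameworks: dict) -> dict:
    """Merge partial class definitions with the same name into one definition.

    Each framework maps a class name to a dict of feature name -> declared type.
    (Hypothetical representation; facts and events are omitted for brevity.)
    """
    merged: dict = {}
    for framework in frameworks:
        for cls, features in framework.items():
            target = merged.setdefault(cls, {})
            for name, declared in features.items():
                # Assumption: two declarations of one feature must agree.
                if name in target and target[name] != declared:
                    raise ValueError(f"conflicting declarations for {cls}.{name}")
                target[name] = declared
    return merged


# Assumed contents of the two imported frameworks.
DRIVERS = {
    "Person": {"drives": "-> Car"},
    "Car": {"owner": "-> Person"},
}
EMPLOYEES = {
    "Person": {"worksFor": "-> Company"},
    "Company": {"employee": "-> Set(Person)"},
}

flat = flatten(DRIVERS, EMPLOYEES)
for cls, features in flat.items():
    body = ", ".join(f"{name}: {declared}" for name, declared in features.items())
    print(f"class {cls} {{ {body} }}")
```

Renaming on import could be modelled as a preprocessing step on each dictionary before it is passed to flatten; that step is left out here.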
In the following sections we will look at how, given the FML definition of a framework, we can derive its specification in many-sorted first-order logic. For structured definitions, we can define corresponding structured specifications. Using the flatten procedure, we can obtain flat definitions and, correspondingly, obtain flat specifications. The next section explores the specification of the static structure of frameworks. In particular, the presentation will be concerned with the definition of state spaces for objects. Classes are given a semantics by defining how all possible states (or local object configurations) can be generated. Thereafter, the specification of constraints and the dynamic aspects of frameworks is discussed.
8.4. The Specification of Structure
For the time being, we concentrate on class specifications as opposed to framework specifications. The idea presented here is that from the FML definition of a class C, we can construct a state space specification vspec(C). The specification vspec(C) enumerates all the possible state values that instances of C may take, each value representing a unique assignment of values to the features of C.
8.4.1. Notation
Let F be a framework and let \(\mathcal{C} = \{C_1, \ldots, C_n\}\) be the set of classes that participate in F. The specification of F is denoted by spec(F) and the specification of class Ci will be denoted by spec(Ci). For each class Ci, let there be two lists, funs(Ci) and preds(Ci). The former is the lexicographically ordered list of the function features of Ci and the latter is a similarly ordered list of the predicate features of Ci. In the Drivers+Employees framework this would mean funs(Person) = [drives: -> Car, worksFor: -> Company] and preds(Person) = [].
8.4.2. Object Identity and State
At this point it should be noted that a class is defined by a pair of specifications: the specification of its object identities and the specification of its state space. As such, a class is considered purely in a structural sense: it is the framework that gives the class a context, specifying state constraints for objects and providing the link between object identities and state values.
8.4.2.1. Object Identity
Formally, each class Ci, where \(1 \le i \le n\), has associated with it a corresponding sort \(C_i\). We call the sorts \(C_1, \ldots, C_n\) the identifier (or reference) sorts for C1, ..., Cn. Intuitively, the identifier sorts serve two purposes: firstly, the data elements of the sort act as names (or object identifiers) for the individual instances of C1, ..., Cn; secondly, they act as object reference types in associations. A feature a: -> Cj of class Ci therefore represents an association from Ci to Cj. For each class Ci, we can associate with it an abstract datatype of object identifiers, i.e., an abstract datatype in which there is one constant \(c_{i0} : [\,] \to C_i\) and a function \(next : [C_i] \to C_i\) for generating successive object names. There are therefore infinitely many object names for each class.
8.4.2.2. Object States
Associated with each class Ci is a sort \(C_i\mathit{State}\). The sort \(C_i\mathit{State}\) is called the object value sort of class Ci and its data elements the object values or state values of Ci. The set of data elements of \(C_i\mathit{State}\) represents the state space of Ci. Each state value \(x : C_i\mathit{State}\) denotes an object configuration corresponding to a specific assignment of values to the function features of an object and a specific state over which properties (predicate features) hold.
8.4.2.3. Features
The state of an object is observable through its associated features. The state of a Ci-object at any point in time is given by a single state value \(x : C_i\mathit{State}\). For each function a: -> T in funs(Ci) the specification contains a corresponding function symbol \(a : [C_i\mathit{State}] \to T\). Similarly, for each predicate p: (t0, ..., tn) in preds(Ci) there is a corresponding predicate \(p : [C_i\mathit{State}, t_0, \ldots, t_n]\).
8.4.2.4. Generating Object States
Here, we consider the specifications of object states for the classes Ci. We denote these specifications by vspec(Ci). The specifications vspec(Ci) describe how object states \(x : C_i\mathit{State}\) are generated.
Undefined States. For each class Ci we require that among its object values is one which represents the undefined state for any Ci-instance. This may be represented by the constant \(null : [\,] \to C_i\mathit{State}\). In FML this undefined value is denoted as null.
Defined States. To enumerate all defined (non-null) states of Ci we require appropriate state generation functions. These functions may be derived according to the lists funs(Ci) and preds(Ci). If a class has function features then there must be a constructor which assigns a value to each of these features. If funs(Ci) = [a1: -> t1, ..., ak: -> tk], then a constructor
\[ C_i^* : [t_1, \ldots, t_k] \to C_i\mathit{State} \]
is introduced. A ground term \(C_i^*(x_1, \ldots, x_k)\) denotes a state in which the features a1, ..., ak are set to the values \(x_1, \ldots, x_k\) respectively. We call the states generated by the \(C_i^*\) constructors the initial states of Ci.
Previously, it was mentioned that one way of dealing with collections is to take a predicative approach whereby \(x : T\) is an element of a collection p at state \(c : C_i\mathit{State}\) if \(p(c, x)\) holds. In this case, the intended object state denoted by the term \(C_i^*(x_1, \ldots, x_k)\) is one where each of the collections p in preds(Ci) is empty. Note that if funs(Ci) = [] then the nullary function \(C_i^* : [\,] \to C_i\mathit{State}\) is introduced. Therefore, the simplest class Ci, which has no features, can be in either one of two states: null or \(C_i^*\), that is, in an undefined state or in some defined state. If we are dealing with a predicative approach to collections or if there are parameterised features then we need constructors for generating the different states over which each predicate holds. For each predicate \(p : [t_1, \ldots, t_k]\) in preds(Ci) we introduce a constructor
\[ assert_p : [C_i\mathit{State}, t_1, \ldots, t_k] \to C_i\mathit{State} . \]
Intuitively, a term \(assert_p(s, x_1, \ldots, x_k)\) denotes a state \(s' : C_i\mathit{State}\) for which the formula \(p(s', x_1, \ldots, x_k)\) holds.
8.4.2.5. Constructor and Feature Axioms
We can take advantage of the conventions used to generate constructors (i.e., the lexicographical ordering of arguments in \(C_i^*\) constructors) to derive appropriate axioms for both the constructors and features of each class. For functions, equations describe how they can be evaluated over each state; for predicates, we can describe over which states a property can be computed to hold; and for constructors, we can identify some standard properties and generate appropriate axioms accordingly.
Evaluating Features. Let funs(Ci) = [f1: -> t1, ..., fk: -> tk] and let fj (\(1 \le j \le k\)) be a function in funs(Ci). An axiom
\[ (\forall x_1 : t_1, \ldots, x_k : t_k)\ (f_j(C_i^*(x_1, \ldots, x_k)) = x_j) \]
describes the evaluation of the function fj over the initial states of Ci. Additional axioms are required to specify how a function f is evaluated over more complex states, i.e., those states reached through the application of the assert constructors. For each predicate p in preds(Ci), an axiom
\[ (\forall x : C_i\mathit{State}, x_1 : t_1, \ldots, x_l : t_l)\ (f(assert_p(x, x_1, \ldots, x_l)) = f(x)) \]
is generated. Similarly, for each predicate p, an axiom
\[ p(assert_p(x, x_0, \ldots, x_k), y_0, \ldots, y_k) \Leftrightarrow (x_0 = y_0 \land \ldots \land x_k = y_k) \lor p(x, y_0, \ldots, y_k) \]
is generated.
Assert Constructor Properties. For the assert constructors we can specify a number of properties. As discussed earlier, a predicative approach to representing collections can be used, in which case we would like the assert constructors to have the same properties as the member addition constructors of sets. The assert constructors are therefore idempotent and commutative, i.e., for each predicate p we have the equations
\[ assert_p(assert_p(x, y), y) = assert_p(x, y) \]
and
\[ assert_p(assert_p(x, y), z) = assert_p(assert_p(x, z), y) , \]
where y and z stand for the argument tuples of p.
The situation becomes more complex when dealing with multiple predicate features, as different terms involving different assert constructors may refer to the same state. For instance, a class Person with features
\[ name : [\,] \to String, \quad parents : [String], \quad \text{and} \quad children : [String] \]
has associated with it the constructor \(Person^* : [String] \to PersonState\) and the assert constructors \(assertParents : [PersonState, String] \to PersonState\) and \(assertChildren : [PersonState, String] \to PersonState\). Let \(x : PersonState\) be a state in which the following holds:
\[ name(x) = Noel, \quad parents(x, Mary), \quad children(x, Ann) . \]
The state x can be reached from the initial state \(Person^*(Noel)\) either by applying assertParents first, i.e.,
\[ assertChildren(assertParents(Person^*(Noel), Mary), Ann) , \]
or by applying assertChildren first, i.e.,
\[ assertParents(assertChildren(Person^*(Noel), Ann), Mary) . \]
Interpreting the Undefined State. There are two ways of viewing this. The first approach is to take the view that if an object is in an undefined state then any query on its features results in an undefined value. Thus, a function applied to the object value null would itself result in an undefined value \(\bot\). Each datatype is extended to include an undefined value, resulting in the underlying logical formalism being a three-valued logic. Alternatively, a function applied to the value null results in some unspecified value. Thus, the term \(f(null)\) may denote any value in the range of f. In this case, the features queried over null may be observationally equivalent to those of any other Ci-state although the two states may be distinct. The idea can be extended to predicates. Here, we make the assumption that any property trivially holds for the null state. That is, for each predicate p we include the axiom
\[ (\forall x_0 : t_0, \ldots, x_k : t_k)\ (p(null, x_0, \ldots, x_k)) . \]
A motivation for taking this stance is that any state that can be constructed from the undefined state must itself be the undefined state. In general, such states are constructed using the \(assert_p\) constructors, in which case we have the axiom
\[ (\forall x_0 : t_0, \ldots, x_k : t_k)\ (assert_p(null, x_0, \ldots, x_k) = null) . \]
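The Person example and the treatment of null can be made concrete with a small model in which a state is its name together with the sets of asserted parents and children. This is a hedged Python sketch rather than the chapter's construction: PersonState, person, assert_parents, assert_children, and parents are illustrative names, and using frozensets is a design choice that makes idempotence and commutativity of the assert constructors hold by construction instead of by axiom.

```python
from dataclasses import dataclass

NULL = None  # stands for the undefined state null


@dataclass(frozen=True)
class PersonState:
    """A defined Person state: one function feature and two predicate features."""
    name: str
    parents: frozenset = frozenset()
    children: frozenset = frozenset()


def person(name: str) -> PersonState:
    """Plays the role of Person*: an initial state with empty collections."""
    return PersonState(name)


def assert_parents(state, x: str):
    if state is NULL:            # assert applied to null yields null
        return NULL
    return PersonState(state.name, state.parents | {x}, state.children)


def assert_children(state, x: str):
    if state is NULL:
        return NULL
    return PersonState(state.name, state.parents, state.children | {x})


def parents(state, x: str) -> bool:
    # Every property holds trivially in the null state.
    return True if state is NULL else x in state.parents


a = assert_children(assert_parents(person("Noel"), "Mary"), "Ann")
b = assert_parents(assert_children(person("Noel"), "Ann"), "Mary")
print(a == b)  # both construction orders denote the same state
```

Function features such as name are left without a value on null here, which corresponds to the second reading in the text, where f(null) is unspecified.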
8.4.3. State Space Specifications
We can collect all the elements discussed in Sec. 8.4.2 and use them to form object value specifications vspec(Ci), for each class Ci. These specifications provide the focal point for studying class extension and composition in frameworks.
Example 8.1. Let A be the framework defined below.
framework A
class C { }
Thus C is the simplest possible definable class discussed earlier. Given the above FML model, we can generate the following object value specification \(vspec(C_A)\):
\(vspec(C_A)\) =
  sorts: \(\mathit{CState}\)
  funs: \(null, C^* : [\,] \to \mathit{CState}\)
Given that funs(C) = [] and preds(C) = [], according to the discussion in Sec. 8.4.2, \(vspec(C_A)\) has no axioms. Specifications are interpreted with initial semantics. Thus, the above specification is one in which there are two distinct states.
8.4.4. Object Diagrams
Up to this point we have only considered state spaces. We have yet to consider the binding of object states to object names. In order to bind a Ci-object to one of its possible states we need an assignment function
\[ \$ : [C_i] \to C_i\mathit{State} , \]
which tells us which state values are associated with each object in the framework. An instance diagram for framework A can be formalised using models of spec(A). Note that in our current understanding of framework specifications, spec(A) consists of the C-object identifier ADT specification, the object value specification vspec(C), and the valuation function $. An instance diagram for A can be described by one possible model for spec(A). For example, let \(\mathcal{A}\) be a model for \(vspec(C_A)\) extended with the valuation function \(\$ : [C] \to \mathit{CState}\) such that
\[ \begin{aligned} C^{\mathcal{A}} &\stackrel{\mathrm{def}}{=} \{c_1, c_2, c_3, \ldots\} \\ \mathit{CState}^{\mathcal{A}} &\stackrel{\mathrm{def}}{=} \{0, 1\} \\ null^{\mathcal{A}} &\stackrel{\mathrm{def}}{=} 0 \\ \$^{\mathcal{A}} &\stackrel{\mathrm{def}}{=} \{(c_1, 1), (c_2, 1), (c_3, 0), (c_4, 0), \ldots\} . \end{aligned} \]
This interpretation corresponds to an instance diagram where there are two objects in existence, \(c_1 : C\) and \(c_2 : C\), each of which is in the state denoted by the constant \(C^*\). All other objects \(c_3, c_4, \ldots\), which we take to be mapped to the value 0, are interpreted as being inactive or non-existent.
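The model above can be written down directly as a finite table with a default. The following Python sketch assumes, as the surrounding prose implies, that the constant C* is interpreted as the value 1; the names state_of and exists are illustrative only, and identifiers are represented as strings.

```python
NULL = 0      # interpretation of null in the model
C_STAR = 1    # assumed interpretation of C*, the only defined state of C
C_STATE = {NULL, C_STAR}

# Finite part of the valuation function $; every other identifier maps to NULL.
assignment = {"c1": C_STAR, "c2": C_STAR}


def state_of(obj: str) -> int:
    """The valuation function $ : [C] -> CState of the model."""
    return assignment.get(obj, NULL)


def exists(obj: str) -> bool:
    """An object is taken to exist when it is not mapped to the null state."""
    return state_of(obj) != NULL


live = [obj for obj in ("c1", "c2", "c3", "c4") if exists(obj)]
print(live)
```

Only the objects listed in the table appear in the corresponding instance diagram; the infinitely many remaining identifiers are covered by the default.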
This treatment of instance diagrams is similar to Bordeau and Cheng's [12] work on giving a formal semantics for OMT object diagrams and instance diagrams. In their work, object diagrams are formalised as algebraic specifications and instance diagrams as algebras satisfying these specifications. In their case, an instance diagram corresponds to an algebra with the addition of a special element \(err_{C_i}\), which denotes the error object of class Ci, and a special state \(undef_{C_i}\), which is the undefined state of Ci. An axiom
\[ \$(err_{C_i}) = undef_{C_i} \]
fixes the interpretation of the state of the error object. In our specifications, we do not have such objects. Although our treatment of static aspects is similar to their work as regards the introduction of object state sorts, it must be noted that we use the term state differently from Bordeau and Cheng. There, the term refers to the simplest possible observation of an object, i.e., observations of objects described in state charts: "object-states are the simplest kind of attribute, as they provide a simple summary of the condition of an object." Attributes are therefore different kinds of observations, each one of which is given a name. Thus, for each attribute a of Ci there is a valuation function
\[ a : [C_i] \to T , \]
with the proviso that if an object \(x : C_i\) contains a link to an object y and if x is an error object then y must also be an error object, i.e.,
\[ a(err_{C_i}) = err_T . \]
8.5. Structuring and Modularity
Many of the structuring mechanisms in algebraic specification (e.g., renaming, extension/enrichment, hiding, and union) form the basis for Catalysis' notion of package extension. In this section we will consider extension/composition in-the-small, i.e., composition/extension of classes. When defining frameworks in FML, we take the composition of classes to mean the union of their individual definitions. Whether a class is extended or composed, the state space of a class becomes larger. This is reflected in the object value specifications of the extended/composite class.
Example 8.2. Recall framework A from Ex. 8.1. Let B be the framework defined below by extending C with the features a and r and with the type T.
framework B
import A
extend class C { a: -> T, r: (T) }
type T { t1: -> T, t2: -> T }
The specification \(vspec(C_B)\), shown in Fig. 8.6, can be derived from \(vspec(C_A)\) by introducing new sorts, functions, and predicates, i.e., \(vspec(C_B)\) is obtained by extending \(vspec(C_A)\) with the function \(a : [\mathit{CState}] \to T\) and the predicate \(r : [\mathit{CState}, T]\). The state space of \(C_B\) consists of values which encapsulate the new data introduced by a and r. States encoding this data can be generated by the new constructors
\[ C^* : [T] \to \mathit{CState} \quad \text{and} \quad assert_r : [\mathit{CState}, T] \to \mathit{CState} . \]
\(vspec(C_B)\) = extend \(vspec(C_A)\) with
  sorts: \(T, \mathit{CState}\)
  funs:
    \(t_1, t_2 : [\,] \to T\)
    \(C^* : [T] \to \mathit{CState}\)
    \(assert_r : [\mathit{CState}, T] \to \mathit{CState}\)
    \(a : [\mathit{CState}] \to T\)
  preds: \(r : [\mathit{CState}, T]\)
  axioms:
    \((\exists! x : T)\ (C^* = C^*(x))\)
    \((\forall x : T)\ (a(C^*(x)) = x)\)
    \((\forall x : \mathit{CState}, y : T)\ (a(assert_r(x, y)) = a(x))\)
    \((\forall x : T)\ (assert_r(null, x) = null)\)
    \((\forall x : \mathit{CState}, y : T)\ (assert_r(assert_r(x, y), y) = assert_r(x, y))\)
    \((\forall x : \mathit{CState}, y, z : T)\ (assert_r(assert_r(x, y), z) = assert_r(assert_r(x, z), y))\)
    \((\forall x : T)\ (r(null, x))\)
    \((\forall x : \mathit{CState}, y, z : T)\ (r(assert_r(x, y), z) \Leftrightarrow (y = z \lor r(x, z)))\)
Fig. 8.6. Extending the specification \(vspec(C_A)\)
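The axioms of Fig. 8.6 can be animated by treating states as ground terms and evaluating a and r by structural recursion. The Python below is a sketch under two stated assumptions: r is taken to be false on initial states, following the initial-semantics reading in which nothing holds unless it is derivable, and a(null) is left unspecified, so the function refuses to evaluate it. The helper names init, observe, and reachable are illustrative and do not come from the chapter.

```python
T = ("t1", "t2")      # the two constants of sort T
NULL = ("null",)      # the undefined state


def init(x):
    """The constructor C* : [T] -> CState."""
    return ("C*", x)


def assert_r(s, y):
    """The constructor assert_r, with assert_r(null, x) = null built in."""
    return NULL if s == NULL else ("assert_r", s, y)


def a(s):
    if s[0] == "C*":
        return s[1]                 # a(C*(x)) = x
    if s[0] == "assert_r":
        return a(s[1])              # a(assert_r(x, y)) = a(x)
    raise ValueError("a(null) is left unspecified")


def r(s, z) -> bool:
    if s == NULL:
        return True                 # r(null, x) holds for every x
    if s[0] == "C*":
        return False                # assumption: initial states have empty r
    _, inner, y = s
    return y == z or r(inner, z)    # r(assert_r(x, y), z) <=> y = z or r(x, z)


def observe(s):
    """Identify a term with what a and r can observe about it."""
    return "null" if s == NULL else (a(s), frozenset(z for z in T if r(s, z)))


def reachable(depth: int) -> set:
    terms = {NULL} | {init(x) for x in T}
    for _ in range(depth):
        terms |= {assert_r(s, y) for s in terms for y in T}
    return terms


states = {observe(s) for s in reachable(2)}
print(len(states))
```

Distinct terms that the idempotence and commutativity axioms equate produce the same observation, so collecting observations rather than terms yields one representative per state.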
We can enumerate all the desired states of \(C_B\). The intended state space can be described as the union of the set of initial states
\[ \{ null, (t_1, \{\}), (t_2, \{\}) \} \]
and the set of states
\[ \{ (t_1, \{t_1\}), (t_1, \{t_2\}), (t_1, \{t_1, t_2\}), (t_2, \{t_1\}), (t_2, \{t_2\}), (t_2, \{t_1, t_2\}) \} \]
generated from the initial states via the \(assert_r\) constructor. The intention is that the undefined state null in \(vspec(C_A)\) corresponds to the undefined state in \(vspec(C_B)\) and that the state denoted by the constant \(C^*\) corresponds to any one of the initial states of \(C_B\) listed above. This condition is made explicit by the axiom \((\exists! x : T)\ (C^* = C^*(x))\).
8.5.1. State Models
We can make the following observations about the relation between the state space models of class C in frameworks A and B. Let \(\mathcal{A}\) and \(\mathcal{B}\) be models of \(vspec(C_A)\) and \(vspec(C_B)\) respectively; then
(1) \(\mathit{CState}^{\mathcal{A}} \subseteq \mathit{CState}^{\mathcal{B}}\);
(2) \(null^{\mathcal{A}} = null^{\mathcal{B}}\); and
(3) \(C^{\mathcal{A}} = C^{\mathcal{B}}\).
In fact, \(\mathit{CState}^{\mathcal{A}} \subset \mathit{CState}^{\mathcal{B}}\) since, in framework B, the state space CState is extended with new elements. In the general case, we make the following observations. Let C be a class in framework M and let C be extended in a framework M'. Thus, \(vspec(C_M)\) and \(vspec(C_{M'})\) are object value specifications for C in M and M' respectively, with models \(\mathcal{M}\) and \(\mathcal{M}'\). Then,
(1) \(s^{\mathcal{M}} \subseteq s^{\mathcal{M}'}\) for each sort s in \(vspec(C_M)\).
(2) \(c^{\mathcal{M}} = c^{\mathcal{M}'}\) for each constant c of sort s.
(3) \(f^{\mathcal{M}}(x_1, \ldots, x_m) = f^{\mathcal{M}'}(x_1, \ldots, x_m)\) for all functions \(f : [s_1, \ldots, s_m] \to s\) in \(vspec(C_M)\), where \(x_i : s_i^{\mathcal{M}}\) for \(1 \le i \le m\). That is, \(f^{\mathcal{M}} = f^{\mathcal{M}'} \restriction \mathrm{dom}(f^{\mathcal{M}})\), i.e., each m-placed function \(f^{\mathcal{M}}\) of \(\mathcal{M}\) is the restriction to \(\mathrm{dom}(f^{\mathcal{M}})\) of the corresponding function \(f^{\mathcal{M}'}\) of \(\mathcal{M}'\).
(4) \(p^{\mathcal{M}}(x_1, \ldots, x_n)\) iff \(p^{\mathcal{M}'}(x_1, \ldots, x_n)\) for all predicates \(p : [s_1, \ldots, s_n]\) in \(vspec(C_M)\), where \(x_i : s_i^{\mathcal{M}}\) for \(1 \le i \le n\).
That is, \(\mathcal{M}\) is a submodel of \(\mathcal{M}'\). The term extend is somewhat of a misnomer in our specifications: strictly speaking, \(vspec(C_B)\) does not extend \(vspec(C_A)\). We take extension to mean that the initial semantics properties of no junk and no confusion are preserved whenever new sorts and function and predicate symbols are added. A more appropriate term is enlargement [13]. Enlargement is similar to the extending mode of import used in OBJ3 [14], which preserves the no confusion property across module imports.
8.5.2. Class Extension and Composition
Extension. If A and B are two classes with specifications \(vspec(A) = (\Sigma_A, \Phi_A)\) and \(vspec(B)\), then "B extends A {...}" corresponds to
\[ vspec(B) = (\Sigma_A \cup \Sigma_\delta, \Phi_A \cup \Phi_\delta) . \]
As described in the example, \(\Sigma_\delta\) and \(\Phi_\delta\) emerge as a result of taking into consideration the new class definition B. New constructors are introduced to enlarge the state space of A, and function/predicate symbols are introduced for each newly declared feature symbol of B. The set \(\Phi_\delta\) consists of the axioms that can be generated according to the new definition of B: the axioms extend the interpretation of features over the newly introduced states. In addition, \(\Phi_\delta\) includes the equality axioms for constraining how \(A^*\)-generated states are mapped to equivalent \(B^*\)-generated states.
Composition. The case for class composition is similar. If C1 and C2 are classes such that \(vspec(C_1) = (\Sigma_{C_1}, \Phi_{C_1})\) and \(vspec(C_2) = (\Sigma_{C_2}, \Phi_{C_2})\), and \(\oplus\) is the implied class composition operator, then the composition \(C = C_1 \oplus C_2 = C_2 \oplus C_1\) means
\[ vspec(C) = (\Sigma_{C_1} \cup \Sigma_{C_2} \cup \Sigma_\delta, \Phi_{C_1} \cup \Phi_{C_2} \cup \Phi_\delta) , \]
the union of \(vspec(C_1)\) and \(vspec(C_2)\), followed by an extension by \((\Sigma_\delta, \Phi_\delta)\). As with extension, \(\Sigma_\delta\) and \(\Phi_\delta\) arise from the generation of new state constructors. However, whereas extension always results in the addition of \((\Sigma_\delta, \Phi_\delta)\), the composition of C1 and C2 may result in
\[ vspec(C) = (\Sigma_{C_1} \cup \Sigma_{C_2}, \Phi_{C_1} \cup \Phi_{C_2}) . \]
This situation can arise when \(C_1 = C_2\). Note that in the case of \(C_1 \oplus C_2 = C_2\) (\(C_1 \ne C_2\)) additional axioms \(\Phi_\delta\) are still required to equate \(C_1^*\)-generated states to a subset of \(C_2^*\)-generated states.
Framework Extension and Composition. The idiosyncrasies associated with class extension and composition are reflected at the framework level. For framework extension we have
\[ spec(G) = (\Sigma_F \cup \Sigma_\Delta, \Phi_F \cup \Phi_\Delta) , \]
where
\[ \Sigma_\Delta = \Sigma_\gamma \cup \bigcup_i \Sigma_{\delta_i} \quad \text{and} \quad \Phi_\Delta = \Phi_\gamma \cup \bigcup_i \Phi_{\delta_i} . \]
The extension of each class Ci introduces the extensions (E^, ^ ) as described above. The extension of a framework with new event declarations and new constraints is reflected by the introduction of E 7 and 7 respectively. Here, E 7 consists purely of the new event predicate symbols introduced in G whereas $ 7 may consist of (additional) rules specifying the effects of already declared events or new constraints. In the case of pure composition we have spec(F) = (E F l U S F a U E A , $ F l U $ F 2 U $ A ) • This time E A and 3>A are defined by
EA = U S * i
and
*A = U**ii
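As a rough illustration of the set-theoretic reading of extension and composition, the following Python sketch treats a specification as a pair of plain sets and models both operations as unions. The class Spec, the helper names, and the symbol and axiom strings are hypothetical stand-ins chosen for this sketch; they are not part of FML, and the sketch ignores sorts and arities altogether.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """A specification (Sigma, Phi): a signature and a set of axioms.

    Symbols and axioms are opaque strings (an assumption of this sketch).
    """
    sig: frozenset = frozenset()
    axioms: frozenset = frozenset()

    def union(self, other: "Spec") -> "Spec":
        return Spec(self.sig | other.sig, self.axioms | other.axioms)


def extend(base: Spec, delta: Spec) -> Spec:
    """vspec(B) = (Sigma_A u Sigma_delta, Phi_A u Phi_delta)."""
    return base.union(delta)


def compose(c1: Spec, c2: Spec, delta: Spec = Spec()) -> Spec:
    """Union of both specifications, followed by an optional extension.

    An empty delta gives the degenerate case, e.g. when C1 = C2.
    """
    return c1.union(c2).union(delta)


# Hypothetical class A with one state constructor and one feature.
spec_a = Spec(frozenset({"A*", "f"}), frozenset({"f(A*(n)) = n"}))
# Hypothetical delta introduced by "B extends A {...}" with a new feature g.
delta_b = Spec(frozenset({"B*", "g"}),
               frozenset({"g(B*(n, m)) = m", "A*(n) = B*(n, 0)"}))
spec_b = extend(spec_a, delta_b)
```

Because both operations are plain unions, composition in this sketch is commutative and idempotent, which mirrors the observation that composing a class with itself adds nothing new.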
8.6. Behavioural Modelling and Specification
For dynamic properties of frameworks we need to consider temporal versions of the state assignment functions described earlier: state history functions. When considering temporal aspects, we assume time is discrete and that there is a global (synchronous) time scale valid across frameworks, i.e., objects do not have local time. The use of time was illustrated in Fig. 8.5 in the definition of the fact oneOrMoreEmployees and the pre- and postcondition definition for the event employ in Drivers+Employees.
8.6.1. State History Functions
For each class \(C_i\), the framework specification contains a function
\[ \$ : [C_i, \mathit{Time}] \to C_i\mathit{State} . \]
The sort Time acts as an abstract time axis against which observations on framework state are made. The definition of time is assumed to reside in the aforementioned DataTypes framework. Here, Time is interpreted as the natural numbers with constant 0, successor function \(\mathit{next} : [\mathit{Time}] \to \mathit{Time}\) and ordering relation \(<\). The term \(\$(x,t)\), where \(x : C_i\) and \(t : \mathit{Time}\), denotes the state of the \(C_i\)-object named x at the t-th observable moment of the framework, and \(\$(x, \mathit{next}(t))\) the next observable state of x, as was illustrated previously. The issue of object existence was raised when we considered instance diagrams in Sec. 8.4.4. The scheme can be extended to state histories: an object x exists at time t if and only if \(\$(x,t) \neq \mathit{null}\). From this we can go on to define the usual notions of existence sets of objects and populations of objects within an OO system. In the framework of FML, these correspond to temporal variations of OCL's allInstances operator.
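The state history function and the derived notion of existence can be pictured with a small Python sketch. It is only a finite approximation under stated assumptions: time is a Python int, None plays the role of null, and the class StateHistory, its method names, and the sample objects are hypothetical and do not come from FML.

```python
from typing import Dict, Hashable, Optional, Set, Tuple


class StateHistory:
    """A finite table standing in for $ : [C, Time] -> CState."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, int], Hashable] = {}

    def record(self, obj: str, t: int, state: Hashable) -> None:
        """Record the observed state of `obj` at time `t`."""
        self._table[(obj, t)] = state

    def at(self, obj: str, t: int) -> Optional[Hashable]:
        """$(obj, t); unrecorded observations are treated as null (None)."""
        return self._table.get((obj, t))

    def exists(self, obj: str, t: int) -> bool:
        """An object exists at time t iff $(obj, t) != null."""
        return self.at(obj, t) is not None

    def population(self, t: int) -> Set[str]:
        """Existence set at time t (a temporal analogue of allInstances)."""
        return {obj for (obj, time) in self._table
                if time == t and self.exists(obj, t)}


def next_time(t: int) -> int:
    """The successor function next : [Time] -> Time."""
    return t + 1


history = StateHistory()
history.record("alice", 1, ("Person", "Alice"))
history.record("alice", 2, ("Person", "Alice"))
history.record("acme", 2, ("Company", frozenset({"alice"})))
```

In this sketch an object that has not yet been created simply has no entry, so its state at that time is null and it is absent from the population.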
8.6.2. Events
In FML, both the internal state of objects and the history of events are noted. State sequences allow us to talk about how the states of objects may change over time, while timed events allow us to describe simple interaction protocols and the effects of actions on the states of objects. The motivation for having both state sequences and externally observable events is that some state transitions are silent: they may occur as a result of some unknown event, i.e., the occurrence of an external action. Consequently, we do not consider the specification of locality axioms [15], which restrict what actions can modify an object's state. The approach adopted in this formalisation of Catalysis frameworks is based on non-reified temporal logic, i.e., events and state valuation functions are explicitly augmented with a time parameter. In Fig. 8.5 the declaration of the event employ illustrates the essence of the approach. This differs from reified approaches, in which explicit event sorts and event predicates are introduced.
8.6.3. Constraints in FML
In Sec. 2, we were interested in three kinds of constraints: invariants, expressed as facts in FML; pre- and postconditions, which in FML are attached to event declarations; and external effect invariants. According to the Catalysis definition, external effect invariants may be thought of as conditions that must be checked whenever a framework is imported into others. These are expressed in a similar manner to facts but introduced by the keyword assert. The translation of invariants and these assertions into first-order logic is straightforward. As we saw earlier, the invariants of FML are expressed as formulae in first-order logic but using FML's object-based syntax. The translation converts object terms/formulae into non-object-based terms/formulae. Thus, a term $(x,t).f is rewritten as the term f($(x,t)) and the formula $(x,t).g(y) is rewritten as g($(x,t),y).
8.6.3.1. Invariants
From the preceding sections it can be seen that we have a choice when it comes to constraining the state of an object. The first method is to address the state space of a class directly. That is, we can assert that all states of class C satisfy a given property P, i.e.,
\[ (\forall x : \mathit{CState})\ P(x) , \]
or, in FML, by the fact
    fact { all x: CState:: P(x) } .
The alternative is to constrain which states may be assigned to objects. In this case the property P is asserted to hold over assigned states:
\[ (\forall x : C, t : \mathit{Time})\ P(\$(x,t)) , \]
expressed by
    fact { all x: C, t: Time:: P($(x,t)) } .
The former approach requires that the sort CState is visible to the modeller. In the latter case, the sort is implicit and hidden. In FML, CState is considered a hidden sort: individual states may only be referenced using object terms of the form $(x,t).
Implicit Properties. It should be noted that there are a number of intrinsic properties applicable to all objects/classes. These properties are defined as axioms of the specification spec(F). An example of one such property is the persistence of objects: once an object becomes active or is created, it is not destroyed. For each class \(C_i\), the following axiom can be generated to assert this:
\[ (\forall x : C_i, t : \mathit{Time})\ (\$(x,t) \neq \mathit{null} \Rightarrow \$(x, \mathit{next}(t)) \neq \mathit{null}) . \]
This constraint ensures that an event does not cause an object to move from a defined state to an undefined state.
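To make the two axioms concrete, here is a minimal Python sketch that checks an invariant over assigned states and the persistence axiom on a finite trace. The bounded horizon, the dictionary encoding of the state history, and all names (History, is_persistent, the sample traces) are assumptions of this sketch; the axioms themselves quantify over all of Time.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

State = Optional[int]                    # None stands for `null`
History = Dict[Tuple[str, int], State]   # finite stand-in for $(x, t)


def state(history: History, obj: str, t: int) -> State:
    """$(obj, t); missing entries are read as null."""
    return history.get((obj, t))


def holds_over_assigned_states(history: History, objects: Iterable[str],
                               horizon: int,
                               prop: Callable[[State], bool]) -> bool:
    """(forall x, t) P($(x, t)), restricted to 0 <= t <= horizon."""
    return all(prop(state(history, x, t))
               for x in objects for t in range(horizon + 1))


def is_persistent(history: History, objects: Iterable[str],
                  horizon: int) -> bool:
    """(forall x, t) ($(x,t) != null => $(x,next(t)) != null), for t < horizon."""
    return all(state(history, x, t) is None
               or state(history, x, t + 1) is not None
               for x in objects for t in range(horizon))


# Hypothetical trace: object "a" is created at t = 1 and persists.
ok_trace: History = {("a", 1): 5, ("a", 2): 6, ("a", 3): 6}
# Object "b" becomes undefined again at t = 2, violating persistence.
bad_trace: History = {("b", 0): 1, ("b", 1): 2}
```

Note that the property passed to holds_over_assigned_states is also applied to null states, as in the axiom; a property intended only for defined states has to say so explicitly.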
8.6.3.2. Events
Let m be a timed event and \((m_{pre}, m_{post})\) denote the pre- and postcondition pair attached to m. Then, the event axiom
\[ (\forall \bar{x})\ (m(\bar{x}) \wedge m_{pre}(\bar{x}) \Rightarrow m_{post}(\bar{x})) \]
specifies the intension of m. In any interpretation of a framework, the extension of m gives its event history, describing when m occurs and which objects were involved in each occurrence. Underlying the event is the assumption that any object participating in it is one that exists. Thus, if \(\bar{x} = (x_1,\ldots,x_n,t)\) and \(x_i : C_i\), then the event m would have a guard
\[ (\forall \bar{x})\ \neg\Bigl(m(\bar{x}) \wedge \bigwedge_{i=1}^{n} \$(x_i,t) = \mathit{null}\Bigr) . \]
For example, the employ event would be described by the axiom
\[ (\forall c : \mathit{Company}, p : \mathit{Person}, t : \mathit{Time})\ (\mathit{employ}(c,p,t) \wedge \neg \mathit{mem}(p, \mathit{employee}(\$(c,t))) \Rightarrow \mathit{employee}(\$(c, \mathit{next}(t))) = \mathit{add}(p, \mathit{employee}(\$(c,t)))) \]
subject to the guard
\[ (\forall c : \mathit{Company}, p : \mathit{Person}, t : \mathit{Time})\ \neg(\mathit{employ}(c,p,t) \wedge \$(c,t) = \mathit{null} \wedge \$(p,t) = \mathit{null}) . \]
8.6.3.3. External Effect Invariants
One of the main features of Catalysis is the ability to choose any level of abstraction when modelling frameworks. This is reflected in the different ways in which a constraint may be expressed. For example, in the Observer pattern, the synchronisation of data between two objects may either be expressed as a static invariant between subject and observer, or using behavioural contracts. These contracts may be expressed in FML as facts. In contrast, external effect invariants act as additional assertions that must be checked whenever a framework is imported into others. If F and G are frameworks, where \(m_G\) is an event in G, then the presence of an external effect invariant \(E_F\) in F would require that
\[ I_{F+G} \wedge (m_G \wedge P_G \Rightarrow Q_G) \wedge E_F \]
holds whenever F and G are composed. In reality, there is little to distinguish effect invariants from invariants. Hence, in the sequel we do not concern ourselves with external effect invariants.
8.7. Framework Consistency
Intuitively, a framework specification \(\mathit{spec}(F) = (\Sigma_F, \Phi_F)\) is consistent if the axioms \(\Phi_F\) are not in some way contradictory. The danger of basing framework composition on union rather than disjoint union is that specifications of the same class in different frameworks may not be compatible. Catalysis defines two types of composition to address these issues, depending on the application of the framework. Firstly, a framework may define one of several slices of behaviour exhibited by a system. Classes are partially defined and, likewise, operations may also be partially defined. This corresponds to the idea in UML that complex constraints may be decomposed into smaller ones or that multiple constraints may be combined into single statements. In Catalysis the composition associated with this is known as joining. Secondly, the individual functionality of components may need preserving. Taking the intersection of classes ensures that the actions of composite classes adhere to mutually exclusive constraints. As we will see, both types of composition differ from the standard notion of subcontracting.
8.7.1. Contract Composition in Catalysis
Join. The join of n pre- and postcondition pairs \((P_i, Q_i)\) (\(1 \le i \le n\)) for an event m is given by the pair
\[ (P_1 \wedge \ldots \wedge P_n,\ Q_1 \wedge \ldots \wedge Q_n) . \]
This is not the only way of joining action specifications in Catalysis or in UML tools such as the KeY system [16]. An alternative way is to factor the preconditions of each specification of m into their postconditions, i.e., to derive the specification
\[ (\mathit{true},\ (P_1 \Rightarrow Q_1) \wedge \ldots \wedge (P_n \Rightarrow Q_n)) . \]
From this a resultant precondition can be computed to derive
\[ (P_1 \vee \ldots \vee P_n,\ (P_1 \Rightarrow Q_1) \wedge \ldots \wedge (P_n \Rightarrow Q_n)) . \]
The reader is referred to Hennicker et al. [17] for a discussion on the semantics of each kind of contract composition.
Intersection. The difference between joins and intersections lies in the fact that, for intersections, the invariants \(I_i\) and operation specifications \((P_i, Q_i)\) for each class \(C_i\) are assumed to be mutually inconsistent but nonetheless the composition of contracts should still be allowed. Intersection may result in the refactoring of a design. To avoid the problem of inconsistency between the invariants \(I_i\), the invariants must be factored into the method specifications as follows:
\[ \Bigl( \bigvee_i (P_i \wedge Q_i),\ \bigwedge_i (I_i \wedge P_i \Rightarrow Q_i) \Bigr) , \]
i.e., it involves a retraction of axioms.
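The two ways of joining contracts described above can be sketched in Python by treating a precondition as a predicate on the state before the event and a postcondition as a predicate on the states before and after it. The integer state, the helper names, and the two sample contracts are assumptions made for this illustration only.

```python
from typing import Callable, List, Tuple

Pre = Callable[[int], bool]          # predicate on the state before the event
Post = Callable[[int, int], bool]    # predicate on the states before and after
Contract = Tuple[Pre, Post]


def join(contracts: List[Contract]) -> Contract:
    """(P1 and ... and Pn, Q1 and ... and Qn)."""
    def pre(s: int) -> bool:
        return all(p(s) for p, _ in contracts)

    def post(s: int, s2: int) -> bool:
        return all(q(s, s2) for _, q in contracts)

    return pre, post


def join_factored(contracts: List[Contract]) -> Contract:
    """(P1 or ... or Pn, (P1 => Q1) and ... and (Pn => Qn))."""
    def pre(s: int) -> bool:
        return any(p(s) for p, _ in contracts)

    def post(s: int, s2: int) -> bool:
        return all((not p(s)) or q(s, s2) for p, q in contracts)

    return pre, post


# Two hypothetical contracts for an "increment" event on a counter.
c1: Contract = (lambda s: s >= 0, lambda s, s2: s2 == s + 1)
c2: Contract = (lambda s: s < 10, lambda s, s2: s2 > s)

joined = join([c1, c2])
factored = join_factored([c1, c2])
```

With these sample contracts, the state 12 lies outside the joined precondition (the second precondition fails) but inside the factored one, where only the first contract's postcondition is then enforced.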
8.7.2. Contracts in FML
It can be seen that the notions of join and intersection differ from the principle of subcontracting in Design by Contract [18], where preconditions are weakened and postconditions strengthened:
\[ (P_1 \vee \ldots \vee P_n,\ Q_1 \wedge \ldots \wedge Q_n) . \]
When class extension occurs there are a couple of ways we can deal with the refinement of an event specification. We could allow an extending class to provide an alternative axiom to describe the intension of the event. Following the subcontracting principle, we could replace the event axioms \(m \wedge P_i \Rightarrow Q_i\) by a single subcontracting-compliant axiom
\[ m \wedge (P_1 \vee \ldots \vee P_n) \Rightarrow Q_1 \wedge \ldots \wedge Q_n . \]
Instead we leave the set of imported axioms untouched, preserving the axiom set
\[ \{\, m \wedge P_1 \Rightarrow Q_1,\ \ldots,\ m \wedge P_n \Rightarrow Q_n \,\} . \]
This coincides with joining, i.e., we can show that the join
\[ m \wedge (P_1 \vee \ldots \vee P_n) \Rightarrow (m \wedge P_1 \Rightarrow Q_1) \wedge \ldots \wedge (m \wedge P_n \Rightarrow Q_n) \]
is a logical consequence of the above axiom set.
8.7.3. Consistency Checking
Consistency of a specification implies that there is some model which satisfies the specification. Tools such as Alcoa [19] typically check constraints in two ways: (i) by exercising invariants or operations, attempting to find satisfying states and transitions respectively; or (ii) by checking that some well-known property of a system is a logical consequence of the object model constraints. Our aim is to check that there are models which satisfy the axioms of a framework, i.e., ensuring that the invariants (axioms) are not so strong that they rule out all satisfying states and ensuring that the resulting states from events are reachable. One source of difficulty (and complexity) in consistency checking in specifications derived from FML models is the manner in which invariants and operations are specified. A lack of distinction between local object invariants and global invariants makes it difficult to direct the theorem proving process towards specific parts of the model. Another source of complexity arises from the use of time in specifications. As we saw earlier, many generated axioms are required to specify intrinsic properties of frameworks, e.g., persistence constraints. For theorem proving purposes, we can seek to reduce the number of axioms we need to deal with by examining which parts of a specification are unnecessary for invariant and operation checking. In the remainder of this section we will identify ways in which the complexity of a specification spec(F) may be reduced for theorem proving. The resulting specification may be processed by any number of theorem provers, e.g., the tableaux-based 3TAP [20], or the resolution-based OTTER [21]. In the latter case, the additional step of reducing the many-sorted specification into a single-sorted specification [22] is required.
8.7.3.1. (Static) Invariants
The decision to consider the sort CState as hidden means that a static invariant in OCL, for example, would be formulated as a temporal invariant in FML. Constraints on a state history function $ indirectly restrict which states in CState are applicable. We can view the constraint
\[ (\forall x : C, t : \mathit{Time})\ (a(\$(x,t)) > 0) \]
as being a weaker form of the static constraint
\[ (\forall x' : \mathit{CState})\ (a(x') > 0) . \]
That is, temporal constraints may be rewritten into static constraints. For any valid object x' we should be able to make a query
\[ (\exists x' : \mathit{CState})\ (a(x') > 0 \wedge x' \neq \mathit{null}) , \]
i.e., we want to see whether we can construct a state for which a(x') > 0 and one that happens not to be the undefined state. The condition that the state is not undefined is important for checking that actions do result in what we intuitively consider as valid states. These assumptions are made explicit by the introduction of persistence axioms and the event-guard axioms mentioned earlier.
8.7.3.2. Temporal Invariants and Events
The situation for handling temporal invariants that contain references to next-states and events is similar. Given a temporal invariant \(P \Rightarrow Q\), we wish to ensure that from all states satisfying a property P there exists a state satisfying Q. The persistence constraint is one such example. The statement may be recast: from all defined states, there exists a state which is also defined, resulting in
\[ (\forall x' : \mathit{CState})\ (x' \neq \mathit{null} \Rightarrow S(x') \neq \mathit{null}) . \]
As before, a term $(x,t) is replaced by the state variable x'. In addition, for each x' the term $(x,next(t)) may be replaced by a term S(x'). The consequence of these rewrites is that the datatype Time may be eliminated from the framework specification. An observation that can be made from the above is that event-guard axioms may also be discarded. The reason is that we wish to consider the effects of operations over valid (non-null) object states. Consequently, we can go further and discard the constant null and its related axioms.
8.7.3.3. Flat Specifications vs Structured Specifications
A final consideration is that the flat specification flatspec(F) of a framework can be used as the basis for consistency checking as opposed to the structured specification spec(F).
The flat specification for F is obtained from applying the flatten procedure to F, and then generating a specification from the flattened model. From the discussion of structuring and modularity at the class level in Sec. 8.5 it can be seen that in structured specifications some function symbols and axioms become redundant whenever classes are extended. These 'legacy' function symbols and axioms arise as a result of enlarging the state space of objects. As new state generators are added, existing constructors become redundant: if \(C_1\) is extended by \(C_2\), any \(C_1\)-state reachable using the constructor \(C_1^*\) is also reachable using the constructor \(C_2^*\). The introduction of the constructor \(C_2^*\) also brings about the introduction of new axioms ranging over \(C_2^*\)-generated states. These axioms describe the evaluation of features over \(C_2\)-states. However, these are a superset of the existing \(C_1\)-states. Consequently, the equations introduced for each function over the states of \(C_1\) may also be discarded. Flat specifications have the property that they consist of only symbols and axioms for the class \(C_2\). Axioms are not generated for \(C_1\) since the fact that \(C_2\) is derived from \(C_1\) is discarded when unfolding takes place. That is, flatspec(F) is a sub-specification of spec(F).
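The reductions described in this section turn temporal constraints into queries over states alone. As a minimal sketch of what such a query amounts to, the following Python code searches a small, explicitly enumerated state space for a defined state satisfying an invariant, and checks that a successor function S maps defined states to defined states. This brute-force search is an illustration under stated assumptions (finite attribute domains, None as null, hypothetical helper names); it is not how a theorem prover such as those cited would proceed.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Optional, Sequence, Tuple

State = Optional[Tuple[int, ...]]   # None stands for `null`


def state_space(domains: Sequence[Iterable[int]]) -> Iterator[State]:
    """Cartesian product of the attribute domains, plus the undefined state."""
    yield None
    for values in product(*domains):
        yield tuple(values)


def find_witness(states: Iterable[State],
                 invariant: Callable[[Tuple[int, ...]], bool]) -> State:
    """(exists x') (invariant(x') and x' != null): return a witness or None."""
    for s in states:
        if s is not None and invariant(s):
            return s
    return None


def preserves_definedness(states: Iterable[State],
                          successor: Callable[[Tuple[int, ...]], State]) -> bool:
    """(forall x') (x' != null => S(x') != null)."""
    return all(successor(s) is not None for s in states if s is not None)


# Hypothetical class with a single attribute a ranging over -1..2.
states = list(state_space([range(-1, 3)]))
witness = find_witness(states, lambda s: s[0] > 0)
```

A result of None from find_witness would indicate that, within the enumerated state space, the invariant rules out every defined state.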
8.8. Frameworks in Component Modelling
In this section, we consider how the concepts of framework and component are related by means of an example of a production planning system (PPS), adapted from Rausch's article on design by signed contract [23] for componentware. The goal of the PPS is to optimise the scheduling of jobs to robots. The operation of robots is constrained in the following ways: (1) each robot may process only one job at a time; and (2) no two robots may be assigned the same job. The first condition is violated if a robot is assigned two jobs whose scheduled times overlap. The PPS itself is modelled as a component-based system, constructed from two subcomponents: a scheduler and a robot component. This can be expressed using a UML component diagram, as illustrated in Fig. 8.7. Each component has a provided and required interface. The components are defined such that the provided interface of one satisfies the required interface of the other.
Fig. 8.7. Component diagram showing how the production planning system is assembled from subcomponents. [Diagram: the Scheduler and Robots components, connected through the Job and Robot interfaces.]
A framework representing the scheduling component is shown in Fig. 8.8. However, unlike the Scheduler component in Fig. 8.7, the framework does not distinguish between the provided and required services.
Fig. 8.8. A framework representing a job scheduling component. [Diagram: framework Scheduler with class Job (start: Nat, end: Nat) and class Robot, the associations scheduled and assigned (0..1), and the actions schedule and hasConflict().]
The Robots component is similar to Scheduler but describes only the static relationships between jobs and robots. It differs from Scheduler in the definition of jobs, to which it adds a (derivable) attribute duration, and in the absence of the schedule action. Figure 8.9 shows the FML definition for the component Scheduler. In particular, the fact labelled noConflicts expresses the condition that a robot may not be assigned two jobs whose scheduled times overlap. The event schedule, corresponding to an assignment of a job to a robot, should maintain this invariant. An event hasConflict is signalled whenever the invariant is violated. Thus, a valid model for this framework is one in which hasConflict does not occur. The robot handling component, Robots, is shown in Fig. 8.10. The fact conflictGuard provides an alternative way of expressing the invariant noConflicts in Scheduler. It states that the hasConflict property should not become true whenever there is a change in a robot's state. This is an effect invariant but not an external one as it applies equally to interactions within Robots and to Scheduler.
Implicit Properties. In the framework Scheduler, an intrinsic property is that every object referenced in the collections scheduled and assigned is a valid object. That is, we have the axiom
\[ (\forall r : \mathit{Robot}, j : \mathit{Job}, t : \mathit{Time})\ \neg(\mathit{mem}(j, \mathit{scheduled}(\$(r,t))) \wedge \$(j,t) = \mathit{null}) \]
for the feature scheduled, and a similar axiom is generated for assigned. Like persistence axioms and event-guard axioms, these axioms may be discarded before the theorem proving begins.
8.8.1. Signed Contracts and Composition
The signed contract between two components is a user's specification of the syntactic and behavioural mappings between them. It specifies how the required interface of one component is satisfied by the provided interface of another. The intention of signed contracts is to enable users or developers to check whether all required properties of a component are fulfilled by another component. Signed contracts are similar to specification fitting morphisms for instantiating parameterised algebraic specifications. In the following, to differentiate one class from another, elements in Scheduler will be subscripted by S and those in Robots by R. We wish to map the class \(\mathit{Robot}_R\) to \(\mathit{Robot}_S\) and \(\mathit{Job}_S\) to \(\mathit{Job}_R\). In frameworks the syntactic mapping is straightforward. In general, this can be achieved using the renaming mechanism of FML whenever the names of required components differ from those of the provided components. However, in addition to syntactic mappings, the signed contract may specify behavioural mappings between components. In the previous section, the discussion on consistency checking focused on ensuring that there exist valid object states satisfying the framework axioms. This is sufficient for checking the consistency of the operation schedule against the invariants in Scheduler and the temporal invariant
    fact { all r: Robot, t: Time:: $(r,t)!=$(r,next(t)) ==> !hasConflict(r,next(t)) }
in Robots.

    framework Scheduler
    import Sets[Data\Job]
    import Sets[Data\Robot]

    class Robot { scheduled: -> Set(Job) }
    class Job { assigned: -> Set(Robot), start: -> Nat, end: -> Nat }

    fact IntervalNonNegative { all j: Job, t: Time:: $(j,t).start < $(j,t).end }
    fact { all j: Job, r1,r2: Robot, t: Time::
           mem(r1,$(j,t).assigned) & mem(r2,$(j,t).assigned) ==> r1=r2 }
    fact noConflicts { all r: Robot, j1,j2: Job, t: Time::
           j1!=j2 & mem(j1,$(r,t).scheduled) & mem(j2,$(r,t).scheduled)
           & $(j1,t).start<=$(j2,t).start ==> $(j1,t).end<=$(j2,t).start }
    fact { all r: Robot, t: Time:: hasConflict(r,t) <==>
           exists j1,j2: Job:: j1!=j2 & mem(j1,$(r,t).scheduled) & mem(j2,$(r,t).scheduled)
           & $(j1,t).start<=$(j2,t).end & $(j2,t).start<=$(j1,t).end }

    event hasConflict(r: Robot, t: Time)
    event schedule(j1: Job, r: Robot, t: Time) {
      decls: all j2: Job
      pre:   empty($(j1,t).assigned) & j1!=j2 & mem(j2,$(r,t).assigned)
             & (($(j1,t).start<=$(j2,t).end & $(j2,t).start<=$(j1,t).end)
                or ($(j2,t).start<=$(j1,t).end & $(j1,t).start<=$(j2,t).end))
      post:  $(j1,next(t)).assigned=add(r,$(j1,t).assigned)
             & $(r,next(t)).scheduled=add(j1,$(r,t).scheduled)
    }

Fig. 8.9. The Scheduler component
In Rausch's signed contracts, however, a mapping between the constraints in one component and those in another may also be specified. For example, if Robots requires jobs to have the property nIntervalNonNegative and Scheduler provides jobs with this property (IntervalNonNegative), then this behavioural dependency may be expressed using signed contracts. In FML, the precise way in which one property should be matched with another should be made using explicit assertions. Thus, we have
    assert { IntervalNonNegative ==> nIntervalNonNegative }
to represent one mapping. Such assertions are required for event matching. Signed contracts follow the subcontracting principle: given the pre- and postcondition pair \((P_L, Q_L)\) of a provided operation L and the pre- and postcondition pair \((P_R, Q_R)\) of a required operation R, the conditions \(P_L \Rightarrow P_R\) and \(Q_R \Rightarrow Q_L\) should hold. However, not all events in FML are defined using pre- and postcondition pairs. The event hasConflict is one such example. In both frameworks the event is defined using a fact and a suitable condition must be formulated to test whether \(\mathit{hasConflict}_R\) is a suitable match for \(\mathit{hasConflict}_S\):
    assert { all r: Robot, t: Time:: Robots::hasConflict(r,t) ==> Scheduler::hasConflict(r,t) } .

    framework Robots
    import Sets[Data\Job]
    import Sets[Data\Robot]

    class Job { assigned: -> Set(Robot), start: -> Nat, end: -> Nat, duration: -> Nat }
    class Robot { scheduled: -> Set(Job) }

    fact nIntervalNonNegative { all j: Job, t: Time:: $(j,t).start < $(j,t).end }
    fact { all j: Job, t: Time:: $(j,t).start + $(j,t).duration = $(j,t).end }
    fact { all j: Job, r1,r2: Robot, t: Time::
           mem(r1,$(j,t).assigned) & mem(r2,$(j,t).assigned) ==> r1=r2 }
    fact { all r: Robot, t: Time:: hasConflict(r,t) <==>
           exists j1,j2: Job:: j1!=j2 & mem(j1,$(r,t).scheduled) & mem(j2,$(r,t).scheduled)
           & $(j1,t).start<=$(j2,t).end & $(j2,t).start<=$(j1,t).end }
    fact conflictGuard { all r: Robot, t: Time::
           $(r,t)!=$(r,next(t)) ==> !hasConflict(r,next(t)) }

    event hasConflict(r: Robot, t: Time)

Fig. 8.10. The Robots component in FML
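The facts noConflicts and hasConflict can be explored on concrete data with a small Python sketch. It models only a single robot's scheduled set at one moment in time; the Job dataclass, the function names, and the sample jobs are assumptions of this sketch, and the two predicates follow the facts as printed in the listings rather than any independent definition of overlap.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Job:
    name: str
    start: int
    end: int


def interval_ok(job: Job) -> bool:
    """The IntervalNonNegative fact: start < end."""
    return job.start < job.end


def has_conflict(scheduled: Iterable[Job]) -> bool:
    """exists j1 != j2 with j1.start <= j2.end and j2.start <= j1.end."""
    return any(j1.start <= j2.end and j2.start <= j1.end
               for j1, j2 in permutations(scheduled, 2))


def no_conflicts(scheduled: Iterable[Job]) -> bool:
    """all j1 != j2: j1.start <= j2.start implies j1.end <= j2.start."""
    return all(j1.end <= j2.start
               for j1, j2 in permutations(scheduled, 2)
               if j1.start <= j2.start)


def schedule(scheduled: FrozenSet[Job], job: Job) -> FrozenSet[Job]:
    """Return the next scheduled set, refusing a job that breaks no_conflicts."""
    nxt = scheduled | {job}
    if not interval_ok(job) or not no_conflicts(nxt):
        raise ValueError(f"cannot schedule {job.name}")
    return nxt


a, b = Job("a", 0, 2), Job("b", 2, 4)   # b starts exactly when a ends
c, d = Job("c", 3, 5), Job("d", 1, 3)   # c leaves a gap after a; d overlaps a
```

On this data the two predicates agree for jobs separated by a gap and for properly overlapping jobs. For jobs that merely touch, such as a and b, the noConflicts condition as printed holds while the hasConflict condition as printed is also satisfied, so an assertion relating the two formulations would need to take such boundary cases into account.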
8.9. Related Work
The formalisation of object-based (or, in the wider context, object-oriented) systems has been studied extensively within different disciplines of computer science. In systems modelling, much effort has been placed on giving a formal semantics for the different kinds of diagrams used in UML. Prominent among these is the work of Richters [24, 25], which is concerned with the formalisation of OCL with respect to a subset of UML's class diagrams. This work may be used as a basis for the validation of UML models and OCL constraints. The aforementioned KeY tool [16] may also be used to check the consistency of UML class diagrams but differs in its formalisation approach. Class diagrams and static OCL constraints are translated into first-order predicate logic [26], and a similar transformation might be defined to translate OCL (pre version 2.0) constraints into FML; OCL pre- and postcondition specifications are translated into dynamic logic [27]. Dynamic logic has been used elsewhere, firstly to examine formal foundations for conceptual modelling [28], and secondly to examine the way in which dynamic classes can be used to model the roles of objects [29], supporting the notion that objects may switch between different behavioural roles at different times. Temporal logic has also been applied extensively. For example, OCL expressions have been translated into the temporal logic BOTL [30]. The full expressivity of BOTL, however, is not exploited because of the lack of temporal operators in OCL. Increasing the temporal expressiveness of OCL has been a subject of great interest. There have been numerous proposals for the incorporation of time in OCL, e.g., by Hamie et al. [31] and Sendall and Strohmeier [32], either by defining temporal operators for OCL or adding explicit notions of discrete or real time for timing constraints. The need for a temporal OCL for modelling business components has been discussed by Conrad and Turowski [33].
Temporal constraints can be represented in UML's state charts but the definitions of UML/OCL limit what kind of temporal properties can be specified, hence the need for a proprietary OCL. Similarly, Ziemann and Gogolla describe their own version of OCL, TOCL (Temporal OCL) [34], which extends OCL with temporal operators and adapts existing OCL operators to a temporal context, and Bradfield et al. [35] define a template-based approach for specifying temporal properties. Increasing the expressivity of OCL, however, does not address the issue of a lack of integration between the different views (static, behavioural, interactive, and functional) of an object model. The issue has been addressed partially by the integration of class diagrams and sequence diagrams [36-38]. Different views of a model may introduce different kinds of constraints (e.g., timing constraints in sequence diagrams) without having to increase the expressivity of the constraint language. However, Conrad and Turowski's observations apply even to the interaction view in the integrated approach. In many of the above approaches, the object-oriented aspects of models are lost in their formalisation. Objects have a shared vocabulary for their attributes/associations and operations: in essence, local object features are projected onto a global context. This differs from class-as-template approaches, where classes are templates from which objects derive their own unique vocabulary [39-41].
The idea has been applied to Catalysis frameworks, exploring the formal semantics for frameworks and interaction [42] and embedding this static semantics into a temporal setting using modular distributed temporal logic (MDTL) [43]. Static semantics for frameworks are limited in the kinds of interactions that may be expressed: interactions occur via framework parameters. MDTL alters the situation and allows the specification of effect invariants while also allowing the specification of synchronous and asynchronous communication between objects from different frameworks. The formalisation of a class described in this chapter differs from the above approaches in one common aspect: the way in which object state is treated. In this text, there is an explicit specification of a class' state space. In model-based approaches this state space is similar to taking the Cartesian product of the attribute types of a class and augmenting this set with a special undefined element. Object value specifications are similar in style to formalising object state using algebraic specifications [44] where, with the addition of ordered sorts, it is possible to use subsorts as a mechanism for partitioning state spaces. Much of the work on formalising UML/OCL has concentrated on class diagrams. At the same time, the aforementioned shift toward increasing the expressivity of OCL for dynamic constraints (i.e., action clauses [45, 46] in OCL 2.0) has been realised to some degree. Indeed, Catalysis goes further and allows the specification of how actions are invoked and sequenced within constraints. In contrast, FML is less expressive owing to its simplified event model based on a synchronous time framework. Consequently, FML is restricted compared to OCL 2.0 or MDTL when it comes to specifying the interactions between objects.
8.10. Summary
This chapter has been concerned with how model frameworks in Catalysis, which makes use of extensions and adaptations to UML/OCL, may be formalised. There are many approaches to formalising object-based or object-oriented systems, some of which have been discussed. In this chapter first-order logic is used as the formal foundation for frameworks. From the FML definition of a framework F, we have looked at how a specification of F, spec(F), can be derived. Contained within this specification are the specifications of classes and object states. A semantics for a class C is given by its associated object value specification vspec(C), which enumerates the state space of C. This notion of explicitly enumerating object states influences the way in which framework composition and extension are defined. The composition of two frameworks F and G is defined if the axioms in the resulting framework F+G are consistent, which can be verified by a theorem prover. There are ways in which the specification spec(F+G) can be reduced prior to theorem proving: symbols and axioms which become redundant in spec(F+G) may be identified and discarded. Not all redundant formulae are discarded. Typically, one framework may strengthen the constraints of another but nonetheless these weaker constraints are retained in framework specifications.
It has been argued that the specification of software contracts is not enough for componentware. The notion of design by signed contract extends Design by Contract with the ability to specify the syntactic and behavioural mappings between the provided services of one component and the required services of another. This notion has been examined in the context of frameworks. Syntactic mappings may be identified during importing and renaming may be used to ensure that provided components are mapped to required components. The signed contract may also identify behavioural mappings between components. These are equivalently represented using assertions in FML.
Acknowledgements
The first author is indebted to the Engineering and Physical Sciences Research Council, without whose support this work would not have been possible.
Bibliography
1. R.E. Johnson. Frameworks = (Components+Patterns). Communications of the ACM, 40(10): 39-42, 1997.
2. G. Larsen. Designing Component-Based Frameworks Using Patterns in UML. Communications of the ACM, 42(10): 38-45, 1999.
3. D. D'Souza and A. Wills. Objects, Components, and Frameworks with UML. Addison-Wesley, 1998.
4. OMG Unified Modeling Language Specification, Version 1.5 (Draft), March 2003.
5. J. Warmer and A. Kleppe. The Object Constraint Language: Precise Modeling with UML. Addison-Wesley, 1999.
6. H.-E. Eriksson and M. Penker. Business Modeling with UML, Business Patterns at Work. Wiley, 2000.
7. T. Reenskaug et al. Working with Objects. Manning/Prentice-Hall, 1995.
8. T. Clark, A. Evans, and S. Kent. A Metamodel for Package Extension with Renaming. In J.-M. Jézéquel, H. Hussmann, and S. Cook, editors, Proc. UML 2002, LNCS 2460, pages 305-320, Springer-Verlag, 2002.
9. R. Helm, I. Holland and D. Gangopadhyay. Contracts: Specifying Behavioural Compositions in Object-Oriented Systems. In Proc. OOPSLA, pages 169-180, October 1990.
10. M. Vaziri and D. Jackson. Some Shortcomings of OCL, the Object Constraint Language of UML. Technical report, Massachusetts Institute of Technology, December 1999.
11. D. Jackson. Micromodels of Software: Lightweight Modelling and Analysis with Alloy (Draft). Technical report, Massachusetts Institute of Technology, February 2002.
12. R.H. Bourdeau and B.H.C. Cheng. A Formal Semantics for Object Model Diagrams. IEEE Transactions on Software Engineering, 21(10): 799-821, October 1995.
13. H. Ehrig and B. Mahr. Fundamentals of Algebraic Specification 1, Equations and Initial Semantics. Springer-Verlag, 1985.
14. J.A. Goguen, T. Winkler, J. Meseguer, K. Futatsugi and J.P. Jouannaud. Introducing OBJ. In J.A. Goguen, editor, Applications of Algebraic Specification using OBJ, Cambridge, 1993.
Characterising Object-Based Frameworks in First-Order Predicate Logic
15. N. Aguirre and T. Maibaum. A Temporal Logic Approach to Component-based System Specification and Verification. In Proc. ICSE02, 2002.
16. W. Ahrendt et al. The KeY Tool. Department of Computer Science, Chalmers University and Göteborg University, Technical Report in Computer Science No. 2003-5, February 2003.
17. R. Hennicker, H. Hussmann and M. Bidoit. On the Precise Meaning of OCL Constraints. In Object Modeling with the OCL, LNCS 2263, pages 69-84, Springer-Verlag, 2002.
18. B. Meyer. Design by Contract. Technical Report TR-EI-12/CO, ISE Inc., 1987.
19. D. Jackson, I. Schechter and H. Shlyakhter. Alcoa: The Alloy Constraint Analyzer. In Proc. 22nd Intl. Conf. on Software Engineering, pages 730-733, 2000.
20. B. Beckert, R. Hähnle, P. Oel, and M. Sulzmann. The Tableau-based Theorem Prover 3TAP, Version 4.0. In Proc. 13th Intl. Conf. on Automated Deduction, LNCS 1104, pages 303-307, Springer, 1996.
21. W.W. McCune. OTTER Reference Manual and Guide. Argonne National Laboratory, January 1994.
22. H.B. Enderton. A Mathematical Introduction to Logic. Academic Press, New York, 1972.
23. A. Rausch. "Design by Contract" + "Componentware" = "Design by Signed Contract". In Journal of Object Technology, Special issue: Proc. TOOLS USA 2002, 1(3): 19-36, 2002.
24. M. Richters and M. Gogolla. OCL: Syntax, Semantics, and Tools. In Object Modeling with the OCL, LNCS 2263, pages 42-68, Springer-Verlag, 2002.
25. M. Richters. A Precise Approach to Validating UML Models with OCL Constraints. PhD thesis, Universität Bremen, 2002.
26. B. Beckert, U. Keller, and P.H. Schmitt. Translating the Object Constraint Language into First-order Predicate Logic. In Proc. VERIFY, Workshop at Federated Logic Conferences, Copenhagen, Denmark, 2002.
27. T. Baar, B. Beckert, and P.H. Schmitt. Extension of Dynamic Logic for Modelling OCL's @pre Operator. In D. Bjørner, M. Broy, and A.V. Zamulin, editors, Proc. 4th Intl. Andrei Ershov Memorial Conf., Perspectives of Systems Informatics, LNCS 2244, pages 47-54, Springer-Verlag, 2001.
28. R.J. Wieringa. A Formalisation of Objects using Equational Dynamic Logic. In C. Delobel, M. Kifer and Y. Masunaga, editors, 2nd Intl. Congress on Deductive and Object-Oriented Databases, LNCS 566, pages 431-452, Springer-Verlag, 1991.
29. R.J. Wieringa, W. de Jonge, and P. Spruit. Using Dynamic Classes and Role Classes to Model Object Migration. In Theory and Practice of Object Systems, 1(1): 61-83, 1995.
30. D. Distefano, J.P. Katoen, and A. Rensink. On a Temporal Logic for Object-based Systems. In S.F. Smith and C.L. Talcott, editors, Formal Methods for Open Object-based Distributed Systems IV, pages 305-326, Kluwer Academic Publishers, September 2000.
31. A. Hamie, R. Mitchell, and J. Howse. Time-based Constraints in the Object Constraint Language. Technical Report CMS-00-01, University of Bristol, 2000.
32. S. Sendall and A. Strohmeier. Specifying Concurrent System Behaviour and Timing Constraints Using OCL and UML. In Proc. UML 2001, LNCS 2185, pages 391-405, Springer-Verlag, 2001.
33. S. Conrad and K. Turowski. Temporal OCL: Meeting Specification Demands for Business Components. In K. Siau and T. Halpin, editors, Unified Modeling Language: Systems Analysis, Design and Development Issues, Chapter 10, pages 151-166, Idea Publishing Group, 2001.
34. P. Ziemann and M. Gogolla. An Extension of OCL with Temporal Logic. In Critical Systems Development with UML - Proc. UML'02 Workshop, pages 53-62, Technische Universität München, Institut für Informatik, 2002.
35. J. Bradfield, J. Küster-Filipe, and P. Stevens. Enriching OCL using observational mu-calculus. In R.-D. Kutsche and H. Weber, editors, Proc. Fundamental Approaches to Software Engineering 2002, LNCS 2306, pages 203-217, Springer-Verlag, 2002.
36. J. Yang, Q. Long, Z. Liu and X. Li. A Formal Semantics of UML Sequence Diagrams. In Z. Liu and K. Araki, editors, Proc. of 1st International Colloquium on Theoretical Aspects of Computing (ICTAC 2004), Lecture Notes in Computer Science 3074, pages 170-186, Springer, 2005.
37. X. Li, Z. Liu and J. He. A Predicative Semantic Model for Integrating UML Models. In Proc. Australian Software Engineering Conference, pages 168-177, IEEE Computer Society, 2004.
38. Z. Liu, J. He, X. Li and J. Liu. Unifying views of UML. Electronic Notes in Theoretical Computer Science, Volume 101, pages 95-127, 2004.
39. E. Amir. Object-Oriented First-Order Logic. In Linköping University Electronic Articles in Computer and Information Science, ISSN 1401-9841, 4(1999): 042.
40. H.-D. Ehrich. Object Specification. In E. Astesiano, H.-J. Kreowski, and B. Krieg-Brückner, editors, Algebraic Specification, Chapter 12, pages 435-465, Springer, 1999.
41. K.-K. Lau and M. Ornaghi. Correct Object-Oriented Systems in Computational Logic. In A. Pettorossi, editor, Proc. LOPSTR '01, LNCS 2372, pages 168-190, Springer-Verlag, 2002.
42. K.-K. Lau, S. Lui, M. Ornaghi, and A. Wills. Interacting Frameworks in Catalysis. In Proc. 2nd Intl. Conf. on Formal Engineering Methods, pages 110-119, IEEE Computer Society Press, 1998.
43. J. Küster Filipe, K.-K. Lau, M. Ornaghi, and H. Yatsu. Intra- and Inter-OOD Framework Interactions in Component-based Software Development in Computational Logic. In A. Brogi and P. Hill, editors, Proc. 2nd Intl. Workshop on Software Development in Computational Logic, September 1999.
44. J.A. Goguen and R. Diaconescu. Towards an Algebraic Semantics for the Object Paradigm. In Recent Trends in Data Type Specification: Workshop on Specification of Abstract Data Types: COMPASS: Selected Papers, LNCS 785, Springer-Verlag, 1994.
45. A. Kleppe and J. Warmer. Extending OCL to Include Actions. In Proc. UML 2000, LNCS 1939, pages 440-450, Springer-Verlag, 2000.
46. A. Kleppe and J. Warmer. The Semantics of the OCL Action Clause. In Object Modeling with the OCL, LNCS 2263, pages 213-227, Springer-Verlag, 2002.
Chapter 9 Formalization in Component Based Development
Jens P. Holmegaard, John Knudsen, Piotr Makowski and Anders P. Ravn
Aalborg University
Fr. Bajers Vej IB, 9000 Aalborg, Denmark
[email protected]
We present a unifying conceptual framework for components, component interfaces, contracts and composition of components by focusing on the collection of properties or qualities that they must share. A specific property, such as signature, functionality, behavior or timing, is an aspect. Each aspect may be specified in a formal language convenient for its purpose and, in principle, unrelated to languages for other aspects. Each aspect forms its own semantic domain, although a semantic domain may be parameterized by values derived from other aspects. The notion of aspects is illustrated in terms of the currently most widely adopted terminology: we consider a contract as a collection of selected aspects compatible with some component. Keeping aspects separate opens up an opportunity to combine semantic models. The proposed conceptual framework is introduced by small examples, using UML as concrete syntax for various aspects, and is illustrated by one larger case study based on an industrial prototype of a complex component based system.
9.1. Introduction
A component is any part of which something is made (Oxford Advanced Learner's Dictionary). The idea of building something by fitting components together to a functioning whole is fundamental for engineering. In software engineering or programming, the idea has been there from the beginning: assembly programming is exactly the art of putting instructions together to make a computer execute an entire algorithm. And with the advent of problem oriented languages with subroutines etc., larger programs were always assembled from a collection of modules, some of which are made for the specific application, while others are ready-made and found in function or class libraries, including components provided by operating or run-time systems.

So one may certainly ask whether component based development is a new concept in software engineering. Consider how Szyperski defines a component [32]: a unit of composition with contractually specified interfaces and fully explicit context dependencies that can be deployed independently and is subject to third party composition.
J.P. Holmegaard, J. Knudsen, P. Makowski and A.P. Ravn
This definition summarizes Szyperski's analysis of prerequisites for software reuse, which is the major concern in his text. It contradicts the common myth of software development stating that reusability comes with generalization. Indeed, many software oriented companies associate the effort of providing reusable software with rewriting project specific code into a "generalized" form. Usually that process is accomplished by extrapolating implicit contextual assumptions, for instance by adding method parameters, or by introducing an abstract interface definition to a concrete implementation. As a result, intuition about the intended use is lost and interfaces are made complicated by mixing dynamic input and static configuration data; it becomes much more difficult to learn to use them and additional coding assistance is needed. With this process in mind, we observe a failure of component technology and reusability concepts, which for the last 20 years were perceived as "tempting but too expensive to apply in practice". Szyperski tries to overcome that impasse by defining four orthogonal properties for a truly reusable component:
(1) contractually specified interfaces,
(2) fully explicit context dependencies,
(3) independent deployment,
(4) third party composition.
In order to understand the implications of the definition, let us consider some potential candidates for being components: an assembly language instruction, a routine in a library, a class in a class library, and a software package.

An assembly language instruction has a very well defined interface through its operands and its operation code. Some may argue that its context, the processor architecture, is implicit, because it is a global state shared by all instructions. Nevertheless it is explicitly defined, because the instruction is always used in the context of a specific processor architecture. However, assembly instructions are very explicitly deployed by the programmer, because their role cannot be abstracted and interpreted outside of their context. It does not make sense to place them in an arbitrary order. Thus an assembly language instruction is not, according to the definition, a software component, although it is certainly a very reusable program component.

What then about a library routine? It has contractually specified interfaces, and the context dependencies are also explicit, although some of them are coded in a linkage format. Library routines are independently deployed in the sense that the application programmer does not care when and where they are placed by the underlying runtime system. Finally, they are subject to third party composition; thus they are in some application areas successfully used as components, although there may be some blurred context dependencies resolved by the linker and loader of the language in which they are embedded. Classes in a class library are a modern counterpart of the routines, and satisfy the criteria for being components, although some of the context dependencies might show up only when objects are loaded, just as for routines.

Packages in general have no interfaces. Their context dependencies are in many cases explicitly provided, e.g. by various Linux packaging systems.
However, they are usually expressed by lists of names and versions of the other packages rather
than required services. Independent deployment is addressed rather by providing a family of products released under the same name and dedicated to different fixed configurations than as a single customizable unit. They do not qualify as components. Thus we may suppose that if there is anything intrinsically new in the component concept, it must have to do with context dependencies and independent deployment. Note that independent deployment presupposes some knowledge about the permissible contexts. A 'one size fits all' component to fit in any context is hardly believable. The distinctive feature is thus that a component not only, as other module concepts, provides a functionality through a defined interface and is subject to composition; but furthermore has specified assumptions about its deployment context, which allows it to link into a fitting context. A component is explicit about the contracts on the required execution platform, about the required resources and about other required components. These aspects of a software module have up till now been hidden as linkage conventions and configuration requirements and taken care of by compilers, linkage loaders, and operating systems. Explicit specifications of these facets give opportunities for building systems with heterogeneous components. However, the platforms must be prepared for such information and provide compatible interfaces to those required by components. When the platform itself is heterogeneous, potentially distributed, and even dynamic, there is a massive amount of supporting middleware as seen in e.g., the CORBA ORB [25], or .Net [10]. In order to use components, not only an understanding of their functional properties as known for class or routine libraries is required, but as well an understanding of the platform dependencies of the component. 
One example is the protocols they use, e.g., a remote procedure or method call protocol with potential loss of connection or temporary failure to find a specific component. Other examples would be memory consumption, execution time or perhaps even power consumption for a range of platforms. A small example may illustrate this point.

Example: Component and Platform Dependencies

Figure 9.1 shows a generic organization of contemporary embedded systems architectures. The application component is in itself an assembly of components required to fulfill the specification of the system. Each of the components requires the services of the middleware, including operating system and hardware platform components.

Fig. 9.1. Layered Architecture (layers shown: Application, Middleware, Hardware Platform)

The middleware component offers services managing communication and synchronization between the components of the application. The execution ordering of the application component processes, as well as the distribution of these processes over the available resources provided by the hardware platform, is also managed by the middleware. The middleware component may take various shapes, but in an embedded systems context it is often a real-time operating system (RTOS) with some language specific run-time libraries.

The hardware platform component provides an execution engine supporting the software components of the system, both of the application and middleware. It may consist of one or more CPU cores along with memories and various peripherals organized in some configuration. The interfaces provided by the hardware platform component correspond to the Instruction Set Architectures (ISAs) of the CPU cores. For instance, the hardware platform component may incorporate an ARM core and hence offer the services of the ARM ISA.

Since both the application and middleware components rely on the hardware platform component, certain aspects of these and their sub-components depend on the particular hardware platform used. In particular, consider aspects such as program-memory consumption, execution time, and power consumption. These are directly dependent on the specific hardware. Due to the construction of the ISA, e.g., CISC or RISC, and also the instruction encoding, a certain functionality of a component may require more or less memory. Moreover, the number of instructions to be executed to perform a functionality is influenced by the complexity of the individual instructions. In addition, issues such as pipeline depth of the CPU core affect timing of the execution. All these facets influence the program-memory use, execution time, and power dissipation related to a particular sub-component of the application or middleware components. Hence, components built for hardware platform components based on, e.g., a Pentium CPU core, an ARM core, or a Texas Instruments DSP core, will observe different values for the same aspect.
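The platform dependency described in this example can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the chapter's framework: the class names `AspectValues` and `Component`, the field names, the units, and all numbers are hypothetical placeholders. It shows one component stating explicitly which ISAs it has been built for, and carrying different values of the same aspects on each of them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AspectValues:
    """Platform-dependent aspect values (names and units are assumptions)."""
    program_memory_bytes: int
    execution_time_us: float
    power_mw: float


@dataclass
class Component:
    name: str
    # Explicit context dependency: the ISAs this component has been built for,
    # each with the aspect values that hold on that platform.
    values_by_isa: dict = field(default_factory=dict)

    def fits(self, isa: str) -> bool:
        """True if the component states support for the given ISA."""
        return isa in self.values_by_isa

    def aspects_on(self, isa: str) -> AspectValues:
        """Return the aspect values for an ISA, or fail for an unstated context."""
        if not self.fits(isa):
            raise LookupError(f"{self.name} states no support for ISA {isa!r}")
        return self.values_by_isa[isa]


# Placeholder numbers, invented purely for illustration.
fir_filter = Component(
    name="fir_filter",
    values_by_isa={
        "ARM": AspectValues(2048, 120.0, 35.0),
        "DSP": AspectValues(1024, 40.0, 50.0),
    },
)
```

Asking for `fir_filter.aspects_on("ARM")` and `fir_filter.aspects_on("DSP")` yields different values for the same three aspects, while a platform the component does not mention is rejected rather than silently assumed to fit.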
9.1.1. Related Work
In embedded systems development in particular, there is a strong demand for tools supporting the entire development process, from design based on functional properties, to analysis of reactive behaviors, timing properties, and resource usage. An ambitious attempt in that direction is Metropolis [4]. The aim is to ensure
that implicit context dependencies do not show up as unexpected behaviors in implementations. In order to build tools, properties must be formalized and precise models of the phenomena subject to analysis must be available, hence motivating the development of formal semantics for component aspects. It appears obvious that a comprehensive semantic model for a full component framework in a traditional denotational, operational or axiomatic style is infeasible. It would have to straddle all the layers of hardware, network protocols, middleware, operating systems, programming languages etc. We see this symptom already with a notation like UML. In this situation, we see the absence of a comprehensive semantics as a virtue, and propose to use independent semantics for different aspects of a component. For each of them, there must be a notion of composability of the components, and a notion of compatibility, i.e., when some component can be substituted for another in a system. The vision is that a design and development framework will invoke specialized tools for the various aspects when connecting components. The notion of composability is used to derive the specific property for a given aspect for the composed subsystem, and compatibility is used to check that the connection is well formed for a particular aspect.

The Advanced Real-Time Systems Information Society Technologies (ARTIST) research consortium, whose objective is to coordinate research efforts in the field of advanced real-time systems, has defined component based design and development as a topic of research. The roadmap [8] distinguishes interfaces and rich interfaces. Interfaces specify syntax and functional semantics of components, whereas rich interfaces include timing and other extra-functional properties. The term contract is used as a general term for specification of any aspect. This corresponds to its use in Object-Oriented software engineering practices [23] and component-based modelling [15].
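The two notions required of every aspect, composability and compatibility, can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `Timing` and `Signature`, the method names `compose` and `can_replace`, the rule that sequential composition adds worst-case execution times, and the rule that a substitute must provide at least as much and require at most as much. The chapter itself does not prescribe these definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Timing aspect: a worst-case execution time in milliseconds (assumed)."""
    wcet_ms: float

    def compose(self, other: "Timing") -> "Timing":
        # Assumption: the two components run one after the other.
        return Timing(self.wcet_ms + other.wcet_ms)

    def can_replace(self, other: "Timing") -> bool:
        # A substitute is acceptable if it is at least as fast.
        return self.wcet_ms <= other.wcet_ms


@dataclass(frozen=True)
class Signature:
    """Signature aspect: names of provided and required services."""
    provided: frozenset
    required: frozenset

    def compose(self, other: "Signature") -> "Signature":
        provided = self.provided | other.provided
        # Requirements satisfied inside the composition are no longer open.
        required = (self.required | other.required) - provided
        return Signature(provided, required)

    def can_replace(self, other: "Signature") -> bool:
        # Provide at least as much, require at most as much.
        return self.provided >= other.provided and self.required <= other.required


sensor = Signature(frozenset({"read"}), frozenset({"clock"}))
clock = Signature(frozenset({"clock"}), frozenset())
system = sensor.compose(clock)          # the open requirement "clock" is discharged
budget = Timing(2.0).compose(Timing(3.0))
```

Each aspect is checked by its own small theory: a development framework in this style would call `compose` to derive the property of the assembled subsystem and `can_replace` to decide substitution, one aspect at a time.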
We adopt a more fine grained view of descriptions of the interaction between components than what can be expressed in terms of interfaces and rich interfaces, as we address the description of an aspect and consider any interface, rich or not, as a conjunction of such aspects. In the ARTIST study, the ordering of interfaces proposed in [6] is adopted. We do not agree that aspects should be ordered, although categorizing the aspects is very reasonable. The implication of adopting such an ordering is that one is not free to select relevant aspects in an arbitrary order. Ordering or importance of an aspect must be considered in the context of the system being developed. Thus requiring that the ARTIST level 1 properties, which refer to the syntactic interface, i.e. component signature, must be satisfied before e.g. addressing the semantic aspects seems an arbitrary choice. Components are to be independently deployed by the definition given in [32]; thus disregarding a component because it is syntactically incompatible from a signature point of view is counterproductive. If a component has the right semantic and extra-functional properties, it should be considered, especially if developing a connector requires less effort than developing a corresponding component considering all classes of aspects. However, we hasten to add that the aspect categories of the interface hierarchy in [6] seem to be a very reasonable taxonomy on aspects. A concrete example of a component language with an aspect style is the CoCo modeling language proposed as part of the PECOS project [22]. It allows for
modeling of both individual components and component compositions. In addition to component interface representation, the language allows attaching both functional and non-functional features to a component model, e.g., memory consumption, execution time, etc., using the 'property' keyword of the language. Such attributes may be explicitly specified on component models as well as on the individual instances of a composition. It is thus close to a free association of aspects with components. A similar trend is seen in UML2 with metamodels [26].
9.1.2. Overview
The following section makes the concepts of component, interface and aspect more precise, defining aspect languages as formal theories and illustrating aspect languages for well known aspects, such as signature, functionality, behavior, timing, platform dependency, and resource use. The section also discusses criteria for useful aspect languages: definedness, conformance checking, compatibility, and composability. A particular issue is interdependent aspects, where we suggest that theories for composition of specifications may be applied. Section 9.3 discusses component frameworks and supporting platforms and their role in design, validation, quality estimation and deployment activities. Section 9.4 summarizes some experiences with developing a component based system with the yet immature technology currently available. Finally, Section 9.5 concludes and gives a perspective on further work.
9.2. Components, Interfaces and Aspects

A component has one or more interfaces. Each interface is essentially a collection of names which makes it possible to speak about the properties of the component. Formally speaking, the names are special symbols in a special theory for some aspect of the particular component. Collectively, the aspects are the component model and it comprises a naming scheme for the interfaces which allows us to create a repository to deploy, browse and retrieve components. Names have different interpretations depending on the aspect. Consider for example a name inchannel in some communication component. It may be interpreted in a type system as having the property of receiving messages of a certain defined packet format; in a process calculus it may denote an element of the communication alphabet; or in a QoS aspect, it may denote a transmission channel with an average transmission capacity of 1.2 Mbit/sec. The theory used to reason about the names of an interface is thus dependent on our interest in understanding the component from a particular point of view. In summary, for each aspect the names are the special symbols of a logical theory, which usually is an extension of some rather general theory. In the example above, the general theories would be the type system of some programming language, the laws of a process calculus, or queuing theory respectively.

The component instance is a 'stand alone and deliverable' entity (binary code, electronic device) that can be treated as a building block of a system. It may
be provided together with some deployment data such as XML DTDs or skeletons of configuration files. Each component view or interface embraces a set of aspects expressing different and unrelated properties such as interface signatures or memory consumption of a component instance.
Fig. 9.2. Logical view of component (elements shown: Component Instance with deployment data, system framework, Component Model with naming scheme, «profile» Component View, «aspect» Component Aspect)
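The idea that one interface name receives a different interpretation in each aspect theory can be illustrated with a small Python sketch based on the inchannel example given earlier in this section. The dictionary layout and the helper names `interpret` and `aspects_mentioning` are hypothetical, chosen only for this illustration; the three interpretations themselves are the ones stated in the text.

```python
# Each aspect theory assigns its own meaning to the shared interface names.
interpretations = {
    "type system": {
        "inchannel": "receives messages of a certain defined packet format",
    },
    "process calculus": {
        "inchannel": "an element of the communication alphabet",
    },
    "QoS": {
        "inchannel": "a transmission channel with an average capacity of 1.2 Mbit/sec",
    },
}


def interpret(name: str, aspect: str) -> str:
    """Look up what a name means from the point of view of one aspect."""
    return interpretations[aspect][name]


def aspects_mentioning(name: str) -> list:
    """List, in sorted order, the aspects whose theory uses the given name."""
    return sorted(a for a, names in interpretations.items() if name in names)
```

The name is the only thing the three theories share: `interpret("inchannel", "QoS")` and `interpret("inchannel", "type system")` answer unrelated questions about the same connection point.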
Each aspect addresses a specific kind of requirement. The aspect provides a component instance abstraction restricted to a very limited context and expressed in a language specialized and dedicated to operate over it. In Figure 9.2 we give the relations among these concepts. We use the stereotype to mark possible definitions in different aspect languages. The interfaces are intended as generalizations of the component instance reflecting different uses. There can be several interfaces for a component. They are applied for organizing views depending both on the application domain and on the preferences of the user community. For example, the Rational Unified Process (RUP) [21] provides a component interface of different granularity for each development phase: inception, elaboration, or construction, and it relates them vertically with a refinement relation.

9.2.1. Aspect Representation
Our belief is that expressing different aspects in different dedicated languages will simplify reasoning about components and lead to more efficient and automated methods for assembling components into a working system. However, there are several arguments for a partially unified representation at the syntactic level:
• The aspects, although logically unrelated, are often structurally interdependent, e.g. a method signature and its execution time both refer to the same named connection point.
• The aspects are subject to a larger composition process and need to be manipulated in a uniform manner.
• Property violations should be reported for all aspects on the development framework level where composition takes place.
• The component model should be stored in and queried from a general repository regardless of the aspects it embraces.
• Automated tools should be able to decompose a model into aspects and delegate composition to appropriate subsystems. This step may also involve some syntactical validation.

These practical issues can be addressed either by syntactical constraints on aspect languages and aspect instances, e.g. using UML notation, or by building tools that can link to each other in a coroutine style, such that different syntax is resolved by using XML definitions that are interpreted by loosely coupled tools.

9.2.2. Aspect Languages
In general, aspect languages can be treated as universal many-sorted algebras, where a Representation Language covers syntax for both the system abstraction and the property specification, and a set of well defined operations over the language is used for model manipulation and formal reasoning. The semantics of an aspect language depends on its purpose, as should become clear from the following examples.

Type systems. If an aspect is intended to provide consistency checks for data representations and call conventions, it is a type system as known from programming languages. A central criterion for a good type system has been decidability, static typing or strong typing, such that consistency could be determined at compile time or at the latest at link time. Yet, there have always been loopholes in the strong typing, e.g. indexing of arrays, formal function parameters, or explicit type casting. In general, such features have been made safer by dynamic or run-time checks. With components and dynamic linking, run-time checks are more acceptable. However, when the applications are safety-critical, it may be hard to accept exceptions at run-time. Nevertheless, one could consider more flexible type systems as e.g. found in the PVS theory language [27]. It would then be up to the development frameworks to provide tools to assist in discharging type check obligations, or alternatively to embed them as run-time checks.

Functional specification. Another aspect that has been well researched is functionality. Essentially the language for this aspect would contain pre- and postconditions as well as state invariants for the component. It is not yet industrial state-of-the-art [3]; but it is gaining popularity as witnessed by the various assertion languages for Java [5]. Since such conditions usually employ free variables ranging over infinite sets, there is no general decidability; but development frameworks may offer assistance in discharging proof obligations, see e.g.
the LOOPS project [7]. Another approach is to detach the aspect language from the programming language and employ a specialized specification language like Raise [13], Z [31], or B [1] with various object-oriented
extensions and supporting tools. The benefit is that these languages have been thought out to support verification. Already with functionality there is a potential interdependence with the type system, because the type system typically includes machine limitations, e.g. limited range of integers. One approach is to define functionality in terms of the abstract mathematical types and then delegate it to the type checking system to validate platform limitations. Another would be to codify the machine types in the theory for the functional aspect, thus making it a conservative extension of the type system. We much prefer the former solution, because it separates concerns, as already pointed out by Dijkstra [11] in his discussion of how to deal with machine limitations. It underscores our point about dealing separately with aspects.

Reactive Specification. More recently, but yet with mature research results, we have seen aspect languages for handling the aspect of reactivity with process calculi [12, 17, 24], stream calculi [9], rCOS [15] and various state-machine based formalisms [14]. We include in reactivity the aspect of logical time and continuous observables, corresponding to the time in a dynamical system model controlled by an embedded software component. Here, extensions to the formalisms have been investigated in the last decade, and results about decidability and undecidability have been established [2, 16].

Other aspects. And then there are all the extra-functional aspects like resource consumption, a rich area for research and development of specialized theories with associated tools [19].

9.2.2.1. A Note on Interdependence

Treating the aspects as independent theories may lead to concerns about overall consistency - after all they model the same component instances. However, this is not a concern for users of the component.
Whether it is possible to construct a component instance is a concern for those who develop components, and thus a question for their development processes. If the aspects can be proved or validated on the concrete instance, there is consistency; if not, the component developer is selling a miracle or an unvalidated product, and this is not a scientific concern. However, there may be merit in component development in at least seeing some aspects as conservative extensions of a common core, for instance using a broad common specification language [18] or developing consistent semantics for particular aspects [30].

9.2.3. Compatibility
Given an assortment of components, an interesting question for a system developer is: which of them will satisfy a given purpose? An answer to the question requires some partial order on components, and thus on their interfaces and aspects. In
our formulation of aspects, the answer must be given as morphisms defined between the algebras or theories defining aspects. The form of the morphism may be a simple renaming or type coercion, or it may entail a specialized abstraction or simulation relation. The implementation counterpart of such a relation is a connector, for instance provided by the middleware. A connector is often divided into several parts, wrapping the individual components. We prefer to see connectors as components, perhaps specially developed for the purpose.

The partial order on individual aspects is easily extended pointwise to a partial order on a collection of aspects. However, this requires all aspects to be present in a given view, and that cannot in general be assumed. A solution would be to require definers of an aspect language to define bottom or top elements for the refinement order. The extension could then take the bottom element as a default if one wants to maximize compatibility, or the top element if one wants to be pessimistic and only allow proven compatibility. These questions have been treated extensively in the research community, for instance using category theory as a general setting, but the results have not really made it to the tool arena, although there are notable exceptions with the B-tool [1], the Raise tool [13] and the OBJ framework [29].

9.2.4. Composition
Assembly is the most common approach to integrating components into a system. It takes a collection of components and proceeds to "connect the wires", linking compatible interfaces through connectors. This composition requires the component to provide explicit structure for connectivity, as shown in the Structural View of Figure 9.3. Each component can aggregate several different structural views. Each view contains a number of Connection Points that can be either required or provided by the component. When several views are supported, they have independent connection points. An example is that each EJB (Enterprise Java Bean) object provides at least two interfaces: the home interface (for bean manipulation) and the local and/or remote interface (for bean services).

Fig. 9.3. Aspect Model (diagram elements: Component Model, Component, Aspect, Structural View, Connection Point, Requires Point, Provides Point; relations: language reference, functional dependency)
A prototypical language for assembly is a UML object diagram, as used for instance in the Rhapsody tool [20]. The assembly process entails a check of
compatibility of the interfaces and thus compatibility of the aspects, but it leaves open the meaning of the system: the aspects characterizing the assembly. Here, we must rely on combinators for the individual aspect languages. For type system and functional aspects, the result of the assembly is essentially the remaining open or unconnected interfaces. If such partial assembly is allowed, we can compose components and get a derived component as a result. A closed system is then a special case. Such composition would rely on scoping rules. From a reactive system perspective, an assembly of components is a network of processes, and the aspects are combined by a parallel composition construct. With component assembly, it is probably not useful to have dynamic creation of entities like channels or processes in the language. This is unlike object oriented systems, where the concept of recursion is a very useful feature.

9.2.5. An Aspect Framework
Considering the number of different theories that can enter in the context of aspects, it is important to have a common framework for manipulating them systematically. Such a framework should cover two basic mechanisms to unify aspect theories:

Syntactical abstraction provides common composition rules for syntactically different theories. It is convenient to define an aspect of one kind by a set of equations, another as a well structured diagram, yet another as a graph or a transition system. Consequently, composition can take the form of logical conjunction, term unification, superposition, edges joining vertices of two graphs, etc. We are looking for syntactical objects sufficient to express the subjects of composition and the defined relations of such underlying languages.

Semantic abstraction provides a uniform interpretation of composition for multiple semantic domains. An aspect language can essentially be built over any formal theory in which one denotes some elements as 'composable'. Similarly, semantics for composition can be retrieved as a special part of the theory that considers composable elements. We need to define rules for semantic abstraction so that composition has an interpretation in each of the underlying languages.
Fig. 9.4. The adder component, functional and signature aspects
Example: Adder

Consider a simple component realizing addition of two integer numbers. The instance may be implemented as a C function int add(short x, short y). One of the structural views defined for the component (Figure 9.4) comprises the three
connection points: x, y associated with the operation arguments, and z for the result. The view is shared by two aspects: functionality and signature. The signature associates the connection points with the types: $x, y \in short$, $z \in integer$, where short and integer are interpreted as well defined subsets of the whole numbers. The functionality aspect assigns semantics by a postcondition $z = x + y$, where $x, y, z$ are valuations of the connection points x, y, z, which are interpreted as program variables. We can also consider other aspects of that component. A memory consumption aspect (Figure 9.5) could be represented by a view with a single connection point interpreted as the amount of memory required by the component for its allocation.

Fig. 9.5. The adder component, memory consumption aspect
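The two aspects of the adder can be illustrated with a minimal Java sketch. It is only an illustration under stated assumptions: the class and method names are hypothetical, "integer" is assumed to be the range of a Java int, and the aspects are checked at run time rather than proved.

```java
// Hypothetical sketch: the adder component with two aspects checked at run time.
class AdderAspects {
    // Component instance, mirroring the C function int add(short x, short y).
    static int add(short x, short y) {
        return x + y; // shorts are promoted to int, so the sum cannot overflow
    }

    // Signature aspect: z must lie in the subset of whole numbers called "integer".
    // Assumption: that subset is the range of a Java int.
    static boolean signatureHolds(long z) {
        return z >= Integer.MIN_VALUE && z <= Integer.MAX_VALUE;
    }

    // Functionality aspect: the postcondition z = x + y, evaluated over a type
    // (long) that is wide enough to stand in for the mathematical integers here.
    static boolean postconditionHolds(short x, short y, int z) {
        return (long) z == (long) x + (long) y;
    }

    public static void main(String[] args) {
        short x = 32767, y = 1;
        int z = add(x, y);
        System.out.println(signatureHolds(z) && postconditionHolds(x, y, z));
    }
}
```

Keeping the postcondition in terms of a wider type and the range check in a separate predicate mirrors the separation of the functional aspect from the machine limitations of the signature.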
Example: Negation

For use in the following discussion of composition, we introduce another simple component: int neg(int p), computing the negation of an argument. The view (Figure 9.6) gives its functionality and signature aspects, which are respectively $q = -p$ and $q, p \in integer$.

Fig. 9.6. The negation component, functionality and signature aspects
9.2.5.1. Composition relation

Taking connection points as a basic abstraction of composable elements, it is now possible to formalize the notion of structural composition and compatibility within the aspect framework. In principle, composition can be any relation over the set of connection points. Within the aspect approach we consider only binary relations. These can be graphically expressed as links between pairs of connection points. For an arbitrary set $\mathcal{C}$ of components used in the development process, we denote by $\mathcal{P}$ the set of connection points. Let $\mathcal{V} = 2^{\mathcal{P}}$ be the powerset of connection points and $\mathcal{R} = 2^{\mathcal{P} \times \mathcal{P}}$ be the set of binary relations on $\mathcal{P}$. Let $P_1, P_2 \in \mathcal{V}$ be structural views defined for two distinct components $C_1, C_2 \in \mathcal{C}$, where $P_1 \cap P_2 = \emptyset$, and let $R_1, R_2 \in \mathcal{R}$ be internal connections defined respectively for $C_1$ and $C_2$, i.e. links already established between components constituting $C_1$ and $C_2$. Then structural composition is any operation $COMP : (\mathcal{V} \times \mathcal{R}) \times (\mathcal{V} \times \mathcal{R}) \to (\mathcal{V} \times \mathcal{R})$ such that, if $COMP((P_1, R_1), (P_2, R_2)) = (P, R)$, then:

$P \subseteq P_1 \cup P_2$  (1)
$R_1 \cup R_2 \subseteq R$  (2)
$R \setminus (R_1 \cup R_2) \neq \emptyset$  (3)
$R \setminus (R_1 \cup R_2) \subseteq (P_1 \cup P_2) \times (P_1 \cup P_2)$  (4)

The first condition says that composition does not introduce any new connection points. The resulting set of connection points is a subset of the union of the sets given as the arguments. Condition (2) says that internal connections are preserved. Condition (3) states that composition introduces at least one new internal connection. Condition (4) restricts new connections to pairs from the sets of connection points of the composition arguments. For compositional development of the system, one should not make any assumptions on internal connections of components. In this case we can simply require $R_1 = R_2 = \emptyset$ for the composition arguments. Note that structural composition is monotonic both in the set of connection points and in the connection relations.

9.2.5.2. Acceptance relation

The definition of structural composition provided in the previous section is certainly too weak for most aspects. We would want some particular compositions to be rejected on the basis of aspect language specific conventions. Recalling the example of the add and the minus sign components, we would reject compositions linking connection points representing the "results" (z and q) of the operations. That restriction is a specific property of the behavior aspect, which differentiates between the arguments and the results. It seems practical to introduce a flexible mechanism for restricting the composition for a particular aspect language. It shall capture compatibility of the components, i.e. questions about semantic correctness of the composition that do not involve validation against requirements. It would also allow inclusion of syntactical rules of an aspect within the framework. For such purposes we introduce an acceptance relation. Let $\mathcal{L}$ be an aspect language used in a composition. Then an acceptance relation is any relation $Acc_{\mathcal{L}} \subseteq \mathcal{V} \times \mathcal{R}$.
The acceptance relation specializes the notion of structural composition for a given language. Whenever some object is a well formed result of structural composition in aspect language $\mathcal{L}$, it belongs to the acceptance relation defined for that language. The following sections contain several examples of commonly used acceptance relations. Finally, as the component in the aspect framework can be composed independently in any aspect language, we need to assure global consistency of the operation.
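The four conditions on structural composition from Section 9.2.5.1 can be checked mechanically for a proposed result. The following Java sketch is an illustration only: the class and method names are hypothetical, connection points are modelled as strings, and a link is modelled as a two-element list.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: checks conditions (1)-(4) for a proposed composition result.
class StructuralComposition {
    // A link is an ordered pair of connection point names, stored as a 2-element list.
    static List<String> link(String from, String to) {
        return List.of(from, to);
    }

    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> u = new HashSet<>(a);
        u.addAll(b);
        return u;
    }

    // True if (p, r) is an admissible result of composing (p1, r1) with (p2, r2).
    static boolean isValidComposition(Set<String> p1, Set<List<String>> r1,
                                      Set<String> p2, Set<List<String>> r2,
                                      Set<String> p, Set<List<String>> r) {
        Set<String> points = union(p1, p2);
        Set<List<String>> oldLinks = union(r1, r2);
        Set<List<String>> newLinks = new HashSet<>(r);
        newLinks.removeAll(oldLinks);

        boolean c1 = points.containsAll(p);      // (1) no new connection points
        boolean c2 = r.containsAll(oldLinks);    // (2) internal links are preserved
        boolean c3 = !newLinks.isEmpty();        // (3) at least one new link
        boolean c4 = newLinks.stream()           // (4) new links join known points
                .allMatch(l -> points.contains(l.get(0)) && points.contains(l.get(1)));
        return c1 && c2 && c3 && c4;
    }

    public static void main(String[] args) {
        Set<String> adder = Set.of("x", "y", "z");
        Set<String> neg = Set.of("p", "q");
        Set<List<String>> none = Set.of();
        // Link the negation result q to the adder argument x.
        boolean ok = isValidComposition(adder, none, neg, none,
                Set.of("x", "y", "z", "p", "q"), Set.of(link("q", "x")));
        System.out.println(ok);
    }
}
```

An acceptance relation for a particular aspect language would then be a further predicate over the same pair (p, r), applied after this structural check.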
Taking the component examples from the previous sections: as behavior and signature aspects are unrelated theories, one could compose q with x in one aspect, and q with y in another (see Figure 9.7). However, such compositions, although admissible at the level of individual aspects, do not have a consistent interpretation when the entire set of components is considered. We require that if two aspects of a component share the same structural view, then structural composition in one of them enforces identical composition in the other.
Fig. 9.7. Inconsistent composition of add and minus sign components for two different aspects sharing the same view
Let $C_1, C_2$ be two components, and $\mathcal{L}_1, \mathcal{L}_2$ be two aspect languages defined for both $C_1$ and $C_2$. Assuming that $\mathcal{L}_1, \mathcal{L}_2$ share the same view $P_1$ for component $C_1$, and the same view $P_2$ for component $C_2$, we say that composition of $(P_1, R_1)$ and $(P_2, R_2)$ is consistent if and only if:

$COMP((P_1, R_1), (P_2, R_2)) \in Acc_{\mathcal{L}_1} \cap Acc_{\mathcal{L}_2}$

for some $R_1, R_2 \in \mathcal{R}$.

Example: Input/Output compatibility
Inputs and outputs are common concepts in the construction of directional relations. They are used for describing data or control flows such as calling sequences or communications. Generally, inputs and outputs can be seen as a simple example of a type system. The concept of inputs/outputs can be implemented as an aspect language IO by dividing the set of connection points $\mathcal{P}$ into two distinct sets $I, O$ such that $I \cup O = \mathcal{P}$ and $I \cap O = \emptyset$. The idea is that each element of $I$ represents some input of a component, and $O$ comprises the component outputs. The compatibility predicate for composition of $C_1$ and $C_2$ is then:

$Acc_{IO} = \{(P, R) \in \mathcal{V} \times \mathcal{R} \mid R \subseteq I \times O\}$  (1)
Considering the adder and the negation examples, the connection points can be classified within the I/O aspect as $x, y, p \in I$ and $z, q \in O$. Then the acceptance relation $Acc_{IO}$ rejects the set of compositions presented in Figure 9.8.
Fig. 9.8. Compositions rejected by input/output acceptance relation
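The input/output acceptance relation can be sketched as a simple predicate over a set of links. The Java code below is an illustration under stated assumptions: the class name is hypothetical, the sets I and O are fixed to the adder and negation connection points, and each link is read literally as a pair in $I \times O$, with the input first and the output second.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the acceptance relation Acc_IO: every link must lie in I x O.
class IoAcceptance {
    static final Set<String> INPUTS = Set.of("x", "y", "p");   // the set I
    static final Set<String> OUTPUTS = Set.of("z", "q");       // the set O

    // links: each element is a pair (first, second) of connection point names.
    static boolean accepts(Set<List<String>> links) {
        return links.stream().allMatch(
                l -> INPUTS.contains(l.get(0)) && OUTPUTS.contains(l.get(1)));
    }

    public static void main(String[] args) {
        System.out.println(accepts(Set.of(List.of("x", "q"))));  // input x fed by output q
        System.out.println(accepts(Set.of(List.of("z", "q"))));  // two outputs linked
        System.out.println(accepts(Set.of(List.of("p", "x"))));  // two inputs linked
    }
}
```

Run as is, the sketch prints true, false, false. Because the formula is read literally, a pair written in the order (output, input) is also rejected; a tool would have to fix one orientation convention for links.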
A note on implementation of acceptance relations

The idea of the acceptance relation is that a tool performing composition can generate acceptance conditions to be discharged by aspect specific tools, cf. the use of type check conditions in PVS or the generation of assert statements in programming languages.
9.2.5.3. Semantics
The aspect framework aggregates multiple domains, assigning them a common abstraction of the aspects. It aims at providing the richest possible description of a component, concerning many factors important in design and development. The idea is to cover a large set of heterogeneous system properties uniformly. That approach resembles the UML language, which provides several different perspectives on the same system. However, as the aspect framework is intended more as a practical development solution than a documentation system, we are primarily interested in the properties that can be precisely verified and validated against the requirements. We focus on aggregating a class of tool supported languages which provide modeling qualities such as simulation, model checking, formal verification, testing, or consistency checking. All these capabilities provide means for validation, which can generally be seen as some satisfiability relation between system and requirements. As the framework generalizes the syntax of the aspect languages, assigning them a uniform abstraction of the structural views, we need to define corresponding abstractions for the semantics. Interpretation of the structural view $sv \in SV$ of some component $C$ in the aspect language $\mathcal{L}$ is a model assigned to that view by a mapping function $Val : SV \to M_{\mathcal{L}}$ which is homomorphic with respect to the structural composition. That is, given
two structural views $a, b$, then:

$Val(COMP(a, b)) = comp_{\mathcal{L}}(Val(a), Val(b))$

where $comp_{\mathcal{L}}$ is an interpretation of the structural composition in the language $\mathcal{L}$. The models in an aspect language can be parametrized by a number of additional values. This reflects the situation in which component behavior can be customized by setting up some predefined configuration parameters. In this case the parametrized model represents a possibly infinite set of concrete models and could be represented as a function $f : T \to M_{\mathcal{L}}$, where $T$ is a Cartesian product of the parameter types. However, in the aspect framework we need not distinguish between parametrized and non-parametrized models, assuming that they are both expressed in the same aspect language, which provides some concretization scheme. A concretization function $con : M_{\mathcal{L}} \to M_{\mathcal{L}}$ in some language $\mathcal{L}$ is any function which provides a valuation of one or more of the model parameters. Finally, we can define the semantics of a composition within the aspect framework. The composition in an aspect language $\mathcal{L}$ is given by an algebra $(AM_{\mathcal{L}}, COMP_{\mathcal{L}} \cup CON_{\mathcal{L}})$, where $AM_{\mathcal{L}}$ is a set of aspect models, $COMP_{\mathcal{L}}$ a set of composition functors injective in $Acc_{\mathcal{L}}$, and $CON_{\mathcal{L}}$ a set of concretization functors.

9.2.6. Summary
We have introduced an aspect as a theory over the names or symbols introduced by a given view or interface of a component. We have recognized that in order to define compatibility, an aspect language must support a defined refinement relation and include at least a bottom element. Furthermore, the language must support composition of the theories in order to define the properties of an assembled system.

9.3. Component Based Development

The process of building a component-based system starts with a specification defining properties to be satisfied by the system. The specification is an abstract, normative description of the system to be built. Available for the process is a set of components stored in a repository. The process of composing components is known as assembly and may be accomplished manually, semiautomatically or fully automatically. It connects the components of the composition by adding the necessary middleware for communication and synchronization, for instance by means of method calls to middleware. Moreover, as some component interfaces may be incompatible, connectors and wrappers may be generated.

9.3.1. Model Based Development
Instead of building a system directly, assembling concrete, platform specific components into a system, a Model Based Development approach may be taken. In this case, a system model is built before the concrete components are assembled. To build the model, component models capturing the component interfaces must be
available, because they define how components may be combined with each other. Model assembly builds the system model, connecting component models based on a specification and inserting abstract communication and synchronization components representing the middleware. As the component models, and in turn the system model built from these, are abstract with respect to the actual components, the models are platform independent. Consequently, the system model represents different concrete systems for various platforms or even different frameworks. At the time in system development when the system model is built, the target platform is yet to be defined. Thus, the model must be represented independently of implementation details of concrete systems. A system model is only an intermediate step. In order to complete assembly, the model components must be replaced by actual components from a framework and platform specific repository. The component substitution step adds the concrete middleware components for component communication and synchronization. A great number of UML CASE tools are available for manual assembly of component models to produce a system model, cf. the examples in the ARTIST Roadmap [8]. These tools typically offer synthesis of the UML model into some programming language, either predefined or defined by the user, and they do not appear to support a component substitution step. This is similar to the semiautomatic assembly offered by the PECOS project: a C++ or Java assembly skeleton is produced from the system model, but the user manually adds component code before compilation. We have yet to see integrated component development support as envisaged in Figure 9.9.
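The component substitution step described above can be pictured as a lookup from model components to concrete components in a platform specific repository. The Java sketch below only illustrates that idea: all class, method and component names are hypothetical, and a real tool would also have to check aspect compatibility before substituting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the component substitution step.
class ComponentSubstitution {
    // Replaces each model component name by the concrete component registered
    // for it in a platform specific repository; fails if one is missing.
    static List<String> substitute(List<String> systemModel, Map<String, String> repository) {
        List<String> concrete = new ArrayList<>();
        for (String modelComponent : systemModel) {
            String impl = repository.get(modelComponent);
            if (impl == null) {
                throw new IllegalStateException("No concrete component for " + modelComponent);
            }
            concrete.add(impl);
        }
        return concrete;
    }

    public static void main(String[] args) {
        // Assumed names: neither the model nor the repository entries come from the text.
        List<String> model = List.of("RoutePlannerModel", "MiddlewareModel");
        Map<String, String> javaRepository = Map.of(
                "RoutePlannerModel", "JavaRoutePlanner",
                "MiddlewareModel", "JavaCorbaConnector");
        System.out.println(substitute(model, javaRepository));
    }
}
```

The same system model could be passed to `substitute` with a different repository to obtain a system for another platform, which is the point of keeping the model platform independent.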
Fig. 9.9. Component Integration Support System (diagram elements: Component Provider, Component abstraction tools, Component Library Server, Component System Developer, Component Development Platform; arrows: publish model, download models, query models, get instance, structured search filters)

9.4. Case Study

A full scale example of component based development is the Autonomous Plant Inspection (API) software. It is developed by a consortium of agronomists, control engineers and computer scientists [28]. The initial system mission is to provide an autonomous platform that can navigate a field with crops and use image processing to produce a weed distribution map.

The API architecture, Figure 9.10, has several communicating components. A major component is a farming database system. It defines the possible operations on the field, the physical placement of the field, the job schedule etc. An API station is an operator console which allows supervision of job progress and keeps track of working platforms. It has two subcomponents: a tactical planner and a route planner. The tactical planner coordinates job execution of multiple platforms, dynamically assigning part of the job to each of them. The route planner optimizes individual platform movement in order to minimize overall execution time. The station retrieves job plans and field data from the database system, which is also used as results storage. A remote access method like CORBA, .Net or RMI services can be provided by a vendor as a part of the farming system solution or implemented as an additional "connecting component". The job plans for the API are delivered as sets of way-points (locations with commands) to be visited on the field.

Fig. 9.10. Autonomous Plant Inspection System (diagram elements: Farming Database System, CORBA, DB Connector, API Station with Route Planner and Tactical Planner, Communication Protocol, API Platform, Implement)

Another important API component is the platform, i.e. a mobile vehicle capable of moving on a field to the predefined locations. The platform has several autonomous behaviors, e.g., obstacle avoidance or crop following. It carries and interacts with one or more implements such as cameras. The station communicates with the platform using a communication protocol that takes into account environment specific limitations (low network bandwidth or frequent packet drops are often experienced in wireless communication between moving objects). The communication protocol carries way-point data from the station to the platform and platform status back.
Whenever the platform reaches a way-point location it sends predefined commands to the installed implement(s).

Defining a component architecture for the API is an important step towards identifying reusable elements for future experiments and robots. For reusability support we need to identify which components are implementation context specific and which provide more general functionality. Both types should have minimal external dependencies and explicitly defined interfaces. In some cases the component must satisfy additional time constraints for proper system functioning. The interface and timing aspects will be illustrated by the example of how way-points pass through the API system.

In the API development, we have used a prototype database system developed by AGCO Corp. This is a relational database with associated file formats for logging data, planning data, measurement points, geographical data etc. The prototype 'Field Star Open Office' (FSOO), Figure 9.11, is a Microsoft Access database accessible through a COM interface with standard SQL queries. FSOO stores data both in database tables and as text files, when the amount of data is large, e.g., a log of individual operations or individual images. Thus there is a need for an intermediate component providing uniform access. The FSOO vendor solution, DAL (Data Access Layer), is implemented as a DCOM service and provides methods for Microsoft oriented systems. As platform independence, Java based implementation and remote data access were basic assumptions for the API, we decided to implement an additional interface to the FSOO, the API server.

Fig. 9.11. API - Farming Database System Access (legend: third party OTS components, lightweight connectors, API core deliveries; diagram elements: SQL/DML, File System Calls, Query Execution, Coordinates Converter, WGS2UTMConverter, POA Interface, FSOODB Interface, FSOOClient Interface, API Station)

The API server runs on the same machine as the FSOO system. It connects to the FSOO database via a JDBC (Java Database Connectivity) driver and allows execution of an arbitrary SQL or DML statement. It can also retrieve and store files within the FSOO file structure, accessing it directly with a set of operating system calls. The API server is a standard CORBA service, which makes it easily accessible for remote clients. The methods provided by the server are defined in IDL (Interface Definition Language) and are implementation language independent.
An example is part of the interface that defines a method for retrieving a job plan file from the database.

    module database {
      module server {
        FileData getJobPlanData(in long long jobID)
          raises (DBConnectionException, FileOpenException, FileReadException);

With a farming database system that provides its own uniform CORBA based port, the API server could be replaced with a vendor interface. However, with the current state of the art, it would almost certainly require some manually written glue code for matching the names and signatures of the interfaces, a task that could be streamlined in a model based development tool.

An FSOO Client communicates with the API Server via the IIOP CORBA protocol. It implements an abstract interface of the DB Connector, Figure 9.10, which provides an abstraction for any farming system data that is used by an API Station. A part of that interface defines how the Station should call for the job plan data:

    public class DBConnector{
      public JobPlanData getJobPlanData();
    }

The JobPlanData is not a component. It is a class implementing an abstract data type that stores the information on the job name, field location and list of way-points for the job.

    public class JobPlanData{
      public Vector getAllWaypoints();
    }

The job plan is retrieved as a vector of objects of the Waypoint class. Each Waypoint object stores information about its location and the set of actions to be taken. Additionally it can define the required speed of the platform, its desired orientation etc.

    public class Waypoint{
      public double getEastingCoordinate();
      public double getNorthingCoordinate();
      public Vector getWaypointActions();
    }

An action is dedicated to a specific implement defined by its type identifier. It contains the command identifier and a set of command parameters. The command and parameters are sent to the implement when the platform reaches the way-point location.
    public class ImplementAction{
      public int getImplementType();
      public int getCommandId();
      public Vector getParameters();
    }

Geographical data can be expressed in several standard systems, e.g. WGS84, UTM, or degrees. The API uses UTM coordinates, which are utilized by GPS devices installed on the platforms. As there are many standard coordinate systems (FSOO has used WGS84 since 2002), we decided to make coordinate system translation a separate component.

    public class WGS2UTMConverter{
      public static Point2D WGS2UTM(String longitude, String latitude);
    }

The FSOO Client provides its functionalities by strictly implementing the DB Connector interface. As it is the only means of communication between the farming system and the API station, we expect that it can be easily modified or replaced without affecting performance of the other parts of the system.

Fig. 9.12. API - Station, Platform and Implement (legend: third party OTS components, lightweight connectors, API core deliveries; diagram elements: InsertionPlanner, RouteAlgorithm Interface, SimpleTactic, TacticalPlanner Interface, FSOO Client Interface, PlatformRegistration Interface, API Station, Protocol, PlatformConnection Interface, UDP Connection, Platform Bus, API Platform, Implement, Timing Constraints)
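The expectation that the FSOO Client can be replaced without affecting the station rests on the station depending only on the DB Connector abstraction. The following Java sketch illustrates that design choice with simplified, hypothetical types: `JobPlanSource` stands in for the DB Connector interface, way-points are reduced to UTM coordinate pairs, and the coordinates are made-up values.

```java
import java.util.List;

// Hypothetical sketch: the station depends only on the connector abstraction.
class StationSketch {
    // Stands in for the DB Connector interface (simplified for illustration).
    interface JobPlanSource {
        List<double[]> getWaypoints();   // each entry: {easting, northing} in UTM
    }

    // A stub that could replace the FSOO Client, e.g. for simulation.
    static class InMemoryJobPlanSource implements JobPlanSource {
        public List<double[]> getWaypoints() {
            return List.of(new double[] {500000.0, 6300000.0},
                           new double[] {500010.0, 6300000.0});
        }
    }

    // Station code is unchanged whichever JobPlanSource implementation is supplied.
    static int countWaypoints(JobPlanSource source) {
        return source.getWaypoints().size();
    }

    public static void main(String[] args) {
        System.out.println(countWaypoints(new InMemoryJobPlanSource()));
    }
}
```

A CORBA based client and the in-memory stub would be interchangeable from the point of view of `countWaypoints`, as long as both satisfy the same interface aspects.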
The station provides the Graphical User Interface (GUI) visualizing a field, working platforms and planned routes for the operator. It also allows job selection, dynamic registration and detachment of platforms during job execution. Mechanisms for direct control of the platforms, like an emergency stop/resume or manual control, are also included for recovery from exceptional situations. The Station, Figure 9.12, maintains the current status of registered platforms, and plans and coordinates the job execution by distributing the way-points to the platforms. Dividing the job effectively is a non-trivial task that can take into account different platform configurations and physical capabilities, local density of way-points, risk of platform collision etc. The tactical planner operates on the list of the platform states and unprocessed way-points, providing a vector of sequences of way-points assigned to each platform.
    public class TacticalPlanner{
      public Vector doPlanning(Vector platformStates, Vector waypoints);
    }
In the API we have implemented a tactic (SimpleTactic) based on a greedy algorithm that divides the points proportionally among the platforms, prioritizing the closest platform over the others. The planned sequence of way-points is optimized by a component implementing a Route Planner interface.

    public class RoutePlanner{
      public Route planRoute(Vector waypoints);
    }

As route planning in general is an NP-complete problem (equivalent to the TSP problem), we can implement many different heuristics. In the API we have experimented with several different algorithms (brute force, insertion, genetic). The one shown in the diagram, InsertionPlanner, was selected as the one providing highly regular solutions with a relatively low computational effort.

An API protocol component encapsulates communication between the station and the platforms. It implements the PlatformConnection interface providing send/receive functionality. The PlatformConnection object maintains a dictionary of the most current information about the platform state. It also handles packaging and delivery of the scheduled way-points.

    public class PlatformConnection{
      public PlatformState get(int platformID);
      public void send(int platformID, PlatformCommand command);
    }

    public class PlatformState{
      public double getEastingCoordinate();
      public double getNorthningCoordinate();
    }

    public class PlatformCommand{
      public int getWaypointsCount();
      public WaypointWrapper[] getWaypoints();
    }

The WaypointWrapper is the class associating a way-point with an index maintained by the station. Extending the way-points with the index allows maintaining information about the job progress, invalidation and dynamic rearrangement of the platform task. The API protocol also provides the PlatformRegistration interface, which is used during initialization. It captures the platform IP address, installed implements etc. During registration each platform acquires a unique number for further identification.
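The text does not give the details of the InsertionPlanner, so the following Java sketch shows only a generic insertion heuristic for route planning, not the implementation used in the API. Assumptions: way-points are plain coordinate pairs, the route is an open path with no fixed start position, and each way-point is inserted, in input order, at the position that adds the least length. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an insertion heuristic for an open route.
class InsertionRouteSketch {
    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    // Extra route length caused by inserting p at index i of the route.
    static double extraLength(List<double[]> route, int i, double[] p) {
        if (i == 0) return dist(p, route.get(0));
        if (i == route.size()) return dist(route.get(i - 1), p);
        double[] prev = route.get(i - 1), next = route.get(i);
        return dist(prev, p) + dist(p, next) - dist(prev, next);
    }

    // Inserts every way-point at its cheapest position, in input order.
    static List<double[]> planRoute(List<double[]> waypoints) {
        List<double[]> route = new ArrayList<>();
        for (double[] p : waypoints) {
            if (route.isEmpty()) { route.add(p); continue; }
            int best = 0;
            for (int i = 1; i <= route.size(); i++) {
                if (extraLength(route, i, p) < extraLength(route, best, p)) best = i;
            }
            route.add(best, p);
        }
        return route;
    }

    static double length(List<double[]> route) {
        double sum = 0;
        for (int i = 1; i < route.size(); i++) sum += dist(route.get(i - 1), route.get(i));
        return sum;
    }

    public static void main(String[] args) {
        List<double[]> pts = List.of(new double[] {0, 0}, new double[] {10, 0}, new double[] {5, 0});
        System.out.println(length(planRoute(pts)));
    }
}
```

For the three collinear points in `main`, visiting them in input order gives a route of length 15, while the sketch places the middle point between the other two and prints 10.0. The heuristic needs a quadratic number of distance evaluations, which is in line with the relatively low computational effort reported for insertion.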
Another aspect that could be specified in the station requirements on the API protocol is timing. The station sends commands to the platforms with a frequency of one message per second, which explicitly limits platform control. It also expects the platform to report with a one second frequency. Whenever the platform state message is delayed for more than five seconds, the operator is notified about a potential problem. After that time the platform is disregarded in work planning until it regains the connection. After one minute the unprocessed way-points are reclaimed from the malfunctioning platform and distributed among the other ones.

9.4.1. Status
A prototype API platform has been operational since 2003 and is doing useful work for the agricultural scientists. The overall system has been implemented and tested by simulation, but currently job plans are directly transferred to the platform. The farming database system with its rigid discipline is not popular among agricultural researchers, and thus the station is rudimentary. Among the lessons we have learned about component based development, the most important ones are the following. There are really no tools that support development using components in different languages. We have used two different UML tools and they support code template generation for one language of your choice, not for a mix of languages. IDL is not supported either. Furthermore, there is little support for compatibility checking across component interfaces or for setting up the middleware, e.g. CORBA name servers. Finally, when we want to check aspects beyond those covered by the type systems, we have to manually translate the interface definitions into the aspect language, e.g. when checking reactivity including timing with the UPPAAL tool [33].
9.5. Conclusion
Components are reusable software with defined interfaces, deployable within specific frameworks. We have presented aspects, consisting of independent theories over a common set of names or symbols, as a common denominator for defining properties of components. In order to support component based development, the aspect languages should provide a notion of refinement allowing a partial compatibility order among components. The refinement order should have at least a bottom element which represents the default value of a particular aspect. Furthermore, an aspect language should support combination of aspects corresponding to assembly of components. Ideally, component based development uses the component models, i.e. interface views defining aspects, to design a particular application. Assembly for a particular platform is then a mechanisable process analogous to running a complex makefile with compilation of connectors, linking and loading. The ideal is far from the industrial reality. As illustrated by our case study, component based systems are at best supported by UML-based design tools that support development for a single
J.P. Holmegaard, J. Knudsen, P. Makowski and A.P. Ravn
language framework. The assembly process has to be done manually by developing programs and scripts that register components in name servers, and compatibility checking is supported only at the level of type systems, and only within computationally expensive frameworks like CORBA. Checks of further aspects have to be done by separate development for particular aspects. Much research has already been done in developing theories and notations for particular aspects of programs, like typing, functionality, reactivity and various performance measures. Our analysis suggests that searching for a unified semantics for all aspects, or combining assorted semantics, may be less fruitful than developing independent languages, each of which really supports automatic and semiautomatic verification of compatibility and composability, much as current type systems are used in programming languages. Furthermore, more research is needed on how these aspect languages and their reasoning systems can be rendered in a form suitable to be invoked from component development frameworks.
Acknowledgments
The presentation has been much influenced by discussions about the Roadmap for Component Based Systems within the ARTIST project (EU IST-2001-34820), where we wish to thank the partners. Informal discussions with Zhiming Liu during a visit in summer 2003 have also been very helpful.
Bibliography
1. J.R. Abrial. The B-Book: Assigning Programs to Meanings. Cambridge University Press, October 1996.
2. R. Alur and D.L. Dill. The Theory of Timed Automata. In REX Workshop, pages 45-73, 1991.
3. K. Arnout and B. Meyer. Uncovering Hidden Contracts: The .Net Example. IEEE Computer, 36(11):48-55, 2003.
4. F. Balarin, Y. Watanabe, H. Hsieh, L. Lavagno, C. Passerone, and A. Sangiovanni-Vincentelli. Metropolis: An Integrated Electronic System Design Environment. IEEE Computer, pages 45-52, April 2003.
5. D. Bartetzko, C. Fischer, M. Moller, and H. Wehrheim. Jass - Java with Assertions. In Workshop on Runtime Verification, 2001.
6. A. Beugnard, J.M. Jezequel, and N. Plouzeau. Making Component Contracts Aware. IEEE Computer, 32(7):38-45, 1999.
7. D. Bobrow, S. Mittal, S. Lanning, and M. Stefik. Programming Languages - the LOOPS Project. http://www2.parc.com/istl/members/stefik/loops.html, 1982-1986.
8. B. Bouyssounouse and J. Sifakis, editors. Embedded System Design: The ARTIST Roadmap for Research and Development, volume 3436 of Lecture Notes in Computer Science. Springer-Verlag GmbH, 2005.
9. M. Broy and G. Stefanescu. The Algebra of Stream Processing Functions. Technical report, Technische Universität München, 1996.
10. Microsoft Corporation. .Net Framework Developer's Guide. http://msdn.microsoft.com/library/default.asp, last visited April 2005.
11. E.W. Dijkstra. A Discipline of Programming. Automatic Computation. Prentice-Hall, 1976.
12. P.H.J. Van Eijk, C.A. Vissers, and M. Diaz. The Formal Description Technique LOTOS. North-Holland, April 1989.
13. Raise Language Group. The Raise Specification Language. BCS Practitioner Series. Prentice Hall, December 1992.
14. D. Harel. Statecharts: A Visual Formalism for Complex Systems. Science of Computer Programming, 8:231-274, 1987.
15. J. He, X. Li, and Z. Liu. Component-based software engineering. In Proc. 2nd International Colloquium on Theoretical Aspects of Computing (ICTAC05), Lecture Notes in Computer Science 3722, pages 70-95. Springer, 2005.
16. T.A. Henzinger and J-F. Raskin. Robust Undecidability of Timed and Hybrid Systems, volume 1790 of Lecture Notes in Computer Science. Springer-Verlag GmbH, 2000.
17. C.A.R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
18. C.A.R. Hoare and He Jifeng. Unifying Theories of Programming. Prentice Hall, 1998.
19. J.P. Holmegaard, A.P. Ravn, and P. Koch. An Approach to Quality Estimation in Model-Based Development. In Proceedings of the 1st International Workshop on Model-Based Methodologies for Pervasive and Embedded Software, MOMPES'04, pages 81-91, June 15 2004.
20. iLogix. ilogix. http://www.iLogix.com, last visited April 2005.
21. P. Kruchten. Rational Unified Process, An Introduction. Addison Wesley Professional, 3rd edition, 2003.
22. T. Genßler, A. Christoph, M. Winter, O. Nierstrasz, S. Ducasse, R. Wuyts, G. Arevalo, B. Schonhage, P. Müller, and C. Stich. Components for Embedded Software: the PECOS Approach. In CASES '02: Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, pages 19-26. ACM Press, 2002.
23. D. Mandrioli and B. Meyer, editors. Advances in Object-Oriented Software Engineering, pages 1-50. Prentice Hall, Englewood Cliffs, N.J., 1991.
24. R. Milner. A Calculus of Communicating Systems, volume 92 of Lecture Notes in Computer Science. Springer-Verlag GmbH, 1980.
25. OMG. Common Object Request Broker Architecture: Core Specification. http://www.omg.org/cgi-bin/doc?formal/04-03-01, last visited April 2005.
26. OMG. UML 2.0 Superstructure Specification. http://www.omg.org/cgi-bin/doc?ptc/2004-10-02, last visited April 2005.
27. S. Owre, J.M. Rushby, and N. Shankar. PVS: A Prototype Verification System. In CADE-11: Proceedings of the 11th International Conference on Automated Deduction, pages 748-752. Springer-Verlag, 1992.
28. API project. Api projektet. http://www.cs.auc.dk/~api/, last visited April 2005.
29. The Apache DB Project. http://db.apache.org/ojb/, last visited April 2005.
30. H. Rasch and H. Wehrheim. Formal Methods for Open Object-Based Distributed Systems: 6th IFIP WG 6.1 International Conference, volume 2884 of Lecture Notes in Computer Science, chapter Checking Consistency in UML Diagrams: Classes and State Machines, pages 229-243. Springer-Verlag GmbH, 2003.
31. J. M. Spivey. The Z Notation: A Reference Manual. International Series in Computer Science. Prentice Hall, second edition, 1992.
32. C. Szyperski. Component Software, Beyond Object-Oriented Programming. Addison-Wesley, 1997.
33. UPPAAL. Uppaal. http://www.uppaal.com/, last visited April 2005.
Chapter 10 A Model-Driven Approach for Building Business Components
Vinay Kulkarni* and Sreedhar Reddy†
Tata Research Development and Design Centre, 54B, Industrial Estate, Hadapsar, Pune, India
* vinay.vkulkarni@tcs.com
† sreedhar.reddy@tcs.com

Modern business systems need to cater to rapidly evolving business requirements in an ever-shrinking window of opportunity while keeping pace with advances in technology. Current industry practice for developing large and complex applications is expensive and error prone as it relies heavily on manual verification. The model-driven development approach addresses this problem by providing a set of modeling notations like UML for specifying the different aspects of a system and a set of code generators for transforming these models into code. These modeling notations can be used to specify an application as a set of interacting components that can be adapted along the dimensions of change. Typically, business applications vary along the dimensions of domain functionality, business process, architecture, design strategies and technology platforms. We present a methodology, emerging from and aimed at guiding the engineering practice, that uses aspect-orientation and model-driven development techniques for specifying different views of interest of a system as models and transforming them in successive stages of refinement, with specific aspects of interest being imparted at each stage. We discuss how this approach was used to restructure a model driven development environment, resulting in greater reuse and ease of evolution.
10.1. Introduction
Modern business systems need to cater to rapidly evolving business requirements in an ever-shrinking window of opportunity. It should be possible to quickly put together a system from reusable components with only the required parts built afresh. Modern business systems also need to keep pace with advances in technology. For developing large and complex applications, industry practice uses a combination of non-formal notations and methods. Different notations are used to specify the properties of different aspects of an application and these specifications are transformed into their corresponding implementations through the steps of a development process. The development process relies heavily on manual verification to ensure the different pieces integrate into a consistent whole. This is an expensive and error prone process.
The model-driven development approach addresses this problem by providing a set of modeling notations for specifying the different aspects of a system and a set of code generators for transforming the models into an implementation [10]. Models defined using these different notations are instances of a single meta model. This provides a means to unify the specifications of different aspects and leads to a simple and elegant implementation method. The method has been used extensively to construct medium and large-scale enterprise applications resulting in improved productivity, better quality and platform independence [5, 6]. An application can be modelled as a set of interacting components. A component can be adapted for different solution architectures and technology platforms by choosing an appropriate set of code generators. However, this limits the choice of adaptation to the set of available code generators, as the design and architectural decisions are hard coded in their implementations. One needs to write a new set of code generators to support new choices. There is a need to further raise the level of abstraction of system specification, by modeling all significant aspects of a system, and devising processes that transform these models in successive stages of refinement, at each stage imparting specific aspects of interest. It should be possible to decompose a system specification along multiple dimensions of concerns (e.g. Functionality, Architecture, Design, Process, Technology etc.) into a set of reusable building blocks. A building block is a first class entity that can be composed from other building blocks. Developing a custom solution can be seen as a refinement-based process in which appropriate building blocks are suitably adapted and composed to form a unified system specification that can be transformed into an integrated implementation.
The approach enables realization of component factories wherein business domain expertise is captured in a repository of reusable building blocks with tool support for selection, adaptation and composition of building blocks to quickly deliver custom solutions. Section 2 describes the traditional approach to software development. Section 3 introduces a model-driven development approach. Section 4 presents the proposed aspect-oriented model-driven development approach. Section 5 describes our model-driven development environment that supports the proposed aspect-oriented model-driven development approach along the dimensions of architecture, technology platform and design strategies. Section 6 presents conclusions and outlines future work.
10.2. Traditional development approach
Faced with the problem of developing large and complex applications, industry practice uses a combination of non-formal notations and methods. Different notations are used to specify the properties of different aspects of an application and these specifications are transformed into their corresponding implementations through the steps of a development process. The development process relies heavily on manual verification to ensure the different pieces integrate into a consistent whole. This is an expensive and error-prone process demanding large teams with broad-ranging expertise in business domain, architecture and technology platforms.
[Figure 10.1 relates the development phases to the architecture layers. Analysis: UI prototype, UML diagrams. Design: GUI standards, design strategies, ER diagrams and table design. Coding: JSP implementation, C++/Java code, RDBMS implementation.]
Fig. 10.1. Break up of application based on development phases and architecture layers
Industry practice addresses scale and complexity by breaking down the problem along different axes — functional, architecture and development process (fig. 10.1). In most business domains it is possible to identify clusters of conceptually coherent functionality. These clusters are the basis for identification of functional components having high internal cohesion and well-defined interactions with other functional components. For a layered architecture the application is split up so that each piece implements the solution corresponding to a layer in the architecture. Different phases of the development process determine the properties of the application that are to be implemented during a particular phase. For example, a banking system may be broken down into different functional components like Foreign Exchange, Retail banking etc. A functional component like Retail banking will have a User Interface layer describing the way a user interacts with the system, an Application layer implementing the business functionality and a Database layer making information persistent. A development process consisting of phases such as Analysis, Design and Implementation will implement different properties of a layer. The Analysis phase for the application layer of Retail Banking will define the domain object models, use-cases etc. The Design phase will define implementation architecture and identify various design strategies such as object-relation mapping, concurrency management, auditing strategy etc. The Implementation phase will code application logic, database queries etc in the chosen platform.
10.2.1. Discussion
In this approach, the components of a system are available only in the form of implementation artifacts. Aspects of interest such as design strategies, business process, architecture and technology platform are coded into the implementation artifacts in an entangled manner. As a result, a component can be used only in an 'as is' manner. Adaptation of a component with respect to an aspect of interest is difficult as it involves opening up the code, making changes manually and testing for correctness, activities that are highly effort-intensive and error-prone due to the entangled nature of the code.
10.3. Model driven development approach
We illustrate a model driven development approach in the context of a client-server application. The development of an application starts with an abstract specification that is to be transformed into a concrete implementation on a target architecture [5]. The target architecture is usually layered with each layer representing one view of the system e.g., Graphical User Interface (GUI) layer, application logic layer and database layer. Each layer is implemented on a different platform supporting different primitives. For example, User interface platforms like Visual Basic provide windows and controls as implementation primitives. Application logic is implemented in a programming language like C++ or Java with classes and methods as the primitives, while the database layer is implemented using tables and columns in a relational database system. These three layers are implemented independently and combined later to get an integrated implementation.
[Figure 10.2: the GUI layer, App layer and Db layer meta models are views of a Unified meta model. The Application specification decomposes into GUI layer, App layer and Db layer models, which are instances of these meta models. Model-to-code transformation yields GUI layer, App layer and Db layer code, of which the Application implementation is composed.]
Fig. 10.2. Model-based development approach
The modeling approach constructs the Application specification using different abstract views (GUI layer model, App layer model and Db layer model), each defining a set of properties corresponding to the layer it models as shown in fig. 10.2. Corresponding to these specifications are the three meta models (GUI layer meta model, App layer meta model and Db layer meta model), which are views of a single Unified meta model. Having a single meta model allows us to specify integrity constraints to be satisfied by the instances of related model elements within and across different layers. This enables independent transformation of GUI layer model, App layer model and Db layer model into their corresponding implementations namely
GUI layer code, App layer code and Db layer code. These transformations can be performed either manually or using code generators. The transformations are specified at meta model level and hence are applicable for all model instances. If each individual transformation implements the corresponding specification and its relationships with other specifications correctly, then the resulting implementations will glue together, giving a consistent implementation of the specification as depicted in fig. 10.17. Construction of the application specification in terms of independent models helps divide and conquer. Automated code generation results in higher productivity and uniformly high quality. Modeling helps in early detection of errors in the application development cycle. Associated with every model is a set of rules and constraints that define the validity of its instances. These rules and constraints could include rules for type checking and for consistency between specifications of different layers. The following sections describe the models for the different layers (Application, User Interface and Database) of a client server application in greater detail. For brevity, we have left out many details from each of the models. We also briefly describe how business logic, architectural choices and design strategies are specified.
10.3.1. Application layer
[Figure 10.3: meta model relating Class, Attribute, Method, Association, DataType, Process and Task; legible association labels include has, source, destination and ofType.]
Fig. 10.3. Meta model for application layer
The application layer implements the business functionality in terms of business logic, business rules and business process. The functionality is modeled using classes, attributes, methods and associations between classes. A business process models the application as a set of tasks and defines the order in which these tasks are executed. Each task is implemented by a method of a class. This layer can be specified as an instance of the meta model in fig. 10.3. Business logic is coded in a high level language and translated into a programming language of choice with code
fragments corresponding to the selected design and architectural strategies suitably woven in. Example: A banking system allows a user to open and operate an account with a bank. Two classes corresponding to this system are User and Account. An association between User and Account specifies the account belonging to a user. Account number is an attribute of Account and name is an attribute of User. The account opening process involves filling up an account opening form, verification of the form and approval of the form. A user can operate the account only after it is approved.
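The banking example can be made concrete with a small Java sketch. It is illustrative only and is not code produced by the authors' generators: the names BankingModelSketch, OpeningStep, complete and operate are assumptions introduced here, and the rule that process tasks must run in the listed order is one reading of the process description.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the banking example; all names are assumptions. */
final class BankingModelSketch {

    /** Tasks of the account opening process, in the order the text lists them. */
    enum OpeningStep { FORM_FILLED, FORM_VERIFIED, FORM_APPROVED }

    static final class Account {
        private final String accountNumber;                 // attribute of Account
        private final List<OpeningStep> completed = new ArrayList<>();

        Account(String accountNumber) { this.accountNumber = accountNumber; }

        String getAccountNumber() { return accountNumber; }

        /** Each process task is implemented by a method; tasks must run in order. */
        void complete(OpeningStep step) {
            if (step.ordinal() != completed.size()) {
                throw new IllegalStateException("Step out of order: " + step);
            }
            completed.add(step);
        }

        boolean isApproved() { return completed.contains(OpeningStep.FORM_APPROVED); }

        /** Operating the account is allowed only after approval. */
        void operate() {
            if (!isApproved()) {
                throw new IllegalStateException("Account not yet approved");
            }
        }
    }

    static final class User {
        private final String name;                          // attribute of User
        private Account account;                            // association User -> Account

        User(String name) { this.name = name; }

        String getName() { return name; }
        Account getAccount() { return account; }
        void setAccount(Account account) { this.account = account; }
    }

    public static void main(String[] args) {
        User user = new User("Asha");
        Account account = new Account("ACC-001");
        user.setAccount(account);
        for (OpeningStep step : OpeningStep.values()) {
            account.complete(step);
        }
        account.operate();
        System.out.println(user.getName() + " operates " + account.getAccountNumber());
    }
}
```

In a model-driven setting such classes would be generated from the application layer model rather than written by hand; the sketch only shows what the model elements (classes, attributes, association, process tasks) amount to.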
10.3.2. User interface layer
[Figure 10.4: model relating Window, Button, UIClass and UIAttribute to the application layer elements Class, Method and Attribute; legible association labels include has, calls and mapsto.]
Fig. 10.4. Model for user interface layer
A user interacts with an application through its user interface. The user feeds in information using forms and browses over available information using queries and reports. Forms, queries and reports are implemented in a target platform using standard graphical user interface primitives such as windows, controls and buttons. A window implements user interaction and is composed of controls and buttons. A control accepts or presents data in a specific format. The user can perform a specific task by clicking on a button. Example: The user of a banking system needs a query window to inquire about past transactions and the current balance. She also needs a form window to withdraw money from her account. These windows will use appropriate controls to represent account number and date of transaction. The user interface is best specified in terms of windows, data to be shown in each window, controls to be used to represent this data, possible navigation between windows and actions that can be performed.
The core of a model for such a specification is as shown in fig. 10.4. In the figure a UIClass represents a logical grouping of the data to be shown in a window. Each UIClass represents a view of the application data. The association mapsto between UIAttribute and Attribute defines the view. This enables type-correct representation of the value of the Attribute on the Window. This also ensures the user can enter only values that are valid for the attribute of a class. The association calls between Button and Operation enables type-correct invocation of the operation. This also ensures the right set of objects get created and passed as parameters to the method invocation. The mapsto association enables copying of the right values from the window to the parameter objects.
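How the mapsto association can enforce type-correct input may be sketched as follows. This is a minimal illustration under stated assumptions: the Java class names mirror the meta model elements, but the validity predicate on DataType and the ten-digit account number format are hypothetical and do not come from the chapter.

```java
import java.util.function.Predicate;

/** Illustrative sketch of the mapsto association; all names are assumptions. */
final class UiMappingSketch {

    /** A data type with a validity check for textual input (hypothetical). */
    static final class DataType {
        final String name;
        final Predicate<String> accepts;

        DataType(String name, Predicate<String> accepts) {
            this.name = name;
            this.accepts = accepts;
        }
    }

    /** Attribute of an application class, as in the application layer meta model. */
    static final class Attribute {
        final String name;
        final DataType type;

        Attribute(String name, DataType type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Field shown in a window; mapsTo links it to the Attribute it presents. */
    static final class UIAttribute {
        final String label;
        final Attribute mapsTo;

        UIAttribute(String label, Attribute mapsTo) {
            this.label = label;
            this.mapsTo = mapsTo;
        }

        /** Input is valid only if the mapped attribute's data type accepts it. */
        boolean accepts(String input) {
            return input != null && mapsTo.type.accepts.test(input);
        }
    }

    public static void main(String[] args) {
        // Assumed format: an account number is exactly ten digits.
        DataType accountNumberType = new DataType("AccountNumber", s -> s.matches("\\d{10}"));
        Attribute accountNumber = new Attribute("accountNumber", accountNumberType);
        UIAttribute field = new UIAttribute("Account no.", accountNumber);
        System.out.println(field.accepts("0123456789"));
        System.out.println(field.accepts("12ab"));
    }
}
```

Because the check is derived from the mapped Attribute rather than coded per window, every window that shows the same attribute applies the same rule.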
[Figure 10.5: Control and DataType are linked by the association representedBy; CalendarControl is an instance of Control and Date is an instance of DataType.]
Fig. 10.5. A model for specifying UI standards
Additionally, for a uniform look and feel of the user interface, a particular data type should be represented by the same control in all the windows. In the banking system the same format should be used for all dates in all the windows. Similarly, the same format should be used for the account number in all the windows. An association between data type and control, as shown in fig. 10.5, allows specification of such GUI standards.
10.3.3. Database layer
The database layer provides persistence for application objects using RDBMS tables, primary key and query based access to these tables, and an object oriented view of these accesses to the application layer. In a relational database, the schema is made up of tables, columns and keys where a column has a name and a simple data type, and relations between tables are specified using foreign keys. An object model specifies similar information in terms of classes, attributes and associations. A row in a table contains data for an instance of a class. Therefore, the mappings essential to object / relational integration are between a table and a class, between a column and an attribute, and between an association and a key as shown in fig. 10.6.
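The three mappings (class to table, attribute to column, association to key) can be sketched in Java as a small structure from which table definitions are derived. This is only an illustration: TableMapping and its methods are hypothetical names, the SQL types are arbitrary choices, and the table name BankUser is an assumption made to avoid the SQL reserved word USER.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of object/relational mappings; all names are assumptions. */
final class OrMappingSketch {

    /** Maps one class to one table, its attributes to columns, associations to keys. */
    static final class TableMapping {
        final String className;
        final String tableName;
        private final Map<String, String> columnByAttribute = new LinkedHashMap<>();
        private final List<String> columnDefinitions = new ArrayList<>();
        private final List<String> constraints = new ArrayList<>();
        private String primaryKey;

        TableMapping(String className, String tableName) {
            this.className = className;
            this.tableName = tableName;
        }

        /** mapsto: an attribute of the class is stored in a column of the table. */
        TableMapping mapAttribute(String attribute, String column, String sqlType) {
            columnByAttribute.put(attribute, column);
            columnDefinitions.add(column + " " + sqlType);
            return this;
        }

        TableMapping primaryKey(String column) {
            primaryKey = column;
            constraints.add("PRIMARY KEY (" + column + ")");
            return this;
        }

        /** implements: an association is realised by a foreign key column. */
        TableMapping mapAssociation(String column, String sqlType, TableMapping target) {
            if (target.primaryKey == null) {
                throw new IllegalStateException(target.tableName + " has no primary key");
            }
            columnDefinitions.add(column + " " + sqlType);
            constraints.add("FOREIGN KEY (" + column + ") REFERENCES "
                    + target.tableName + " (" + target.primaryKey + ")");
            return this;
        }

        String columnFor(String attribute) { return columnByAttribute.get(attribute); }

        String toDdl() {
            List<String> parts = new ArrayList<>(columnDefinitions);
            parts.addAll(constraints);
            return "CREATE TABLE " + tableName + " (" + String.join(", ", parts) + ")";
        }
    }

    public static void main(String[] args) {
        TableMapping account = new TableMapping("Account", "Account")
                .mapAttribute("accountNumber", "account_number", "VARCHAR(20)")
                .primaryKey("account_number");
        TableMapping user = new TableMapping("User", "BankUser")
                .mapAttribute("name", "name", "VARCHAR(100)")
                .mapAssociation("account_number", "VARCHAR(20)", account);
        System.out.println(account.toDdl());
        System.out.println(user.toDdl());
    }
}
```

The foreign key can only be emitted once the target table has a primary key, which is a simple instance of the kind of cross-element constraint a meta model can state.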
[Figure 10.6: meta model relating Class, Attribute, Association and DataType to Table and Column; legible association labels include mapsto, has, source, destination, ofType, implements and composed.]
Fig. 10.6. Meta model for database layer
Example: The persistent information for the banking system will include details about accounts and users. Two tables, User and Account, implement this persistent information. These tables have columns corresponding to user name and account number. The association between a user and an account is implemented by having account number as a foreign key in the User table and a primary key in the Account table. Similar to the mapsto association of the user interface model, the mapsto association between Attribute and Column ensures type correctness. The implements association allows correct coding of class associations using appropriate primary and foreign keys. This association uniquely identifies the related classes and the tables. These classes and tables must be related through the mapsto association. Such constraints can be specified in the meta model.
10.3.4. Business logic
Business logic specifies the computations to be performed by the application in terms of methods. The language for specifying business logic, Q++ [6], frees the application developer from low-level implementation concerns such as memory management, pointers, resource management etc. and is retargetable to programming languages of choice such as Java, C++, C# etc.
10.3.5. Component model
A Component is an abstraction to model functional decomposition. A component has an interface and a dependency. A component interface is specified in terms of the model elements such as classes, operations etc. A component explicitly models its dependencies on other components. A component can only use the model elements
specified in the interface of the component it depends upon. Explicit modeling of the dependencies enables automated consistency checking of the provider-supplier relationship between components. Code can be generated to target this model to various implementation platforms such as COM+, EJB, Web Services etc.
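The consistency check that explicit dependencies make possible can be sketched as follows. The sketch is an assumption-laden illustration, not the tool described in the chapter: Component, provides, dependsOn, uses and violations are hypothetical names, and model elements are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of the component model; all names are assumptions. */
final class ComponentModelSketch {

    static final class Component {
        final String name;
        private final Set<String> interfaceElements = new LinkedHashSet<>();
        private final List<Component> dependencies = new ArrayList<>();
        private final Set<String> usedElements = new LinkedHashSet<>();

        Component(String name) { this.name = name; }

        /** Adds a model element (class, operation, ...) to this component's interface. */
        Component provides(String element) { interfaceElements.add(element); return this; }

        /** Declares an explicit dependency on another component. */
        Component dependsOn(Component other) { dependencies.add(other); return this; }

        /** Records that this component uses an element owned by some other component. */
        Component uses(String element) { usedElements.add(element); return this; }

        /** Returns the used elements that no declared dependency exposes in its interface. */
        List<String> violations() {
            List<String> result = new ArrayList<>();
            for (String element : usedElements) {
                boolean provided = false;
                for (Component dependency : dependencies) {
                    if (dependency.interfaceElements.contains(element)) {
                        provided = true;
                        break;
                    }
                }
                if (!provided) {
                    result.add(element);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Component retail = new Component("RetailBanking")
                .provides("Account.open")
                .provides("Account.balance");
        Component forex = new Component("ForeignExchange")
                .dependsOn(retail)
                .uses("Account.balance")
                .uses("Account.auditTrail");   // not part of RetailBanking's interface
        System.out.println(forex.violations());
    }
}
```

Run on the example, the check reports Account.auditTrail, the one element used without being exposed by a declared dependency.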
10.3.6. Discussion
In this approach, the components of a system are available in model form. These models can be kept independent of implementation technology and can be targeted to multiple technology platforms through model-based code generation. However, not all aspects of interest are modeled. For example, the GUI layer needs to address the usability concerns of standard look and feel across screens and standard user interaction patterns; error handling and logging are some of the concerns in the application logic layer; and concurrency, auditing and locking are some of the concerns in the database layer. Due to the lack of support for modeling such concerns, the application developer has to implement them manually. Typically, many of these concerns cross cut, i.e., a given concern impacts multiple implementation points, e.g., error handling. Also, multiple concerns may impact the same implementation point, e.g., error handling and auditing. In the absence of a modeling mechanism for localization and composition of concerns, a concern specification gets scattered across models and code, leading to entanglement. Such scattering and entanglement result in problems for maintenance in general and for customizability with respect to a concern in particular.
10.4. Aspect-oriented model-driven development approach
Aspect-oriented programming addresses separation of concerns by treating concerns of interest as separately specified Aspects that are woven into computational Components as shown in fig. 10.7 [4].
[Figure 10.7: component code, which has an abstract structural representation with join points, and the code of aspects 1 to n are composed by an aspect weaver into woven code.]
Fig. 10.7. Aspect-oriented programming
Typically, components and aspects are specified using the same language (e.g. Java) and there is a fixed language for specifying join points (e.g. AspectJ [2] or Hyper/J [3]). Aspect-orientation can be elevated to model level by coming up with platform independent models (PIM) for these two languages as shown in fig. 10.8 [1].
[Figure 10.8: a UML profile (Component + Aspects) and a UML profile (Join points) are shown as abstractions of component code, aspect code and join points.]
Fig. 10.8. Aspect-oriented modeling: a simplistic approach
However, this approach does not allow an aspect to be specified in a language best suited for its expression. We propose a meta modeling approach to elevate aspect specification to the modeling world as shown in fig. 10.9. An aspect type, corresponding to a dimension of concern [11] e.g., business process, is specified in terms of a meta model which is the language for describing aspects along this dimension of concern. Similarly, the join point type for an aspect type is specified in terms of a meta model which is the language for specifying join points for instances of the aspect type.
[Figure 10.9: the Component Meta Model (CMM), the Aspect1 and Aspect2 Meta Models (AMM1, AMM2) and the Aspect1 and Aspect2 join point Meta Models (JPMM1, JPMM2), with the Component Model, the Aspect1 and Aspect2 Models and the Aspect1 join point model as their instances; legible labels are instanceOf and parameter.]
Fig. 10.9. Aspect-orientation at modeling level
An application is specified as a set of component, aspect and join point models as shown in fig. 10.9. These models are then transformed into a common
implementation model using model transformation as a mechanism. The responsibility of ensuring that a transformation preserves the semantics of the source model in the target model lies entirely with the user i.e., the one specifying the transformation, as the approach, per se, does not guarantee it.
[Figure 10.10: a transformation specification with the component meta model, the aspect meta model and the aspect join point meta model as inputs, and the common join point meta model on the output side.]
Fig. 10.10. Aspect transformation specification pattern
[Figure 10.11: a transformation takes the business process model (BPEL4WS) and the associations from process tasks to class methods, and produces the application and process model in UML (application classes and process classes) and a weaving specification (CJPMM) to invoke process class methods from app class methods.]
Fig. 10.11. Transformation of business process aspect
Model transformation is specified in terms of meta models as shown in fig. 10.10. The transformation creates an implementation model corresponding to the aspect model and also creates specifications for weaving this implementation model into the component implementation model. Figure 10.11 shows a transformation instance. The transformer takes application model (an instance of UML), business process model (an instance of BPEL4WS meta model) and their relationship in the form of associations between business process tasks and application class methods as
input and produces UML classes representing application and business process, and a weaving specification (an instance of common join point meta model). These platform independent implementation models can then be transformed into platform specific realizations such as Java and AspectJ. To sum up, aspect-oriented model-driven development approach aims to raise the level of abstraction of system specification, by modeling all significant aspects of a system, and devising processes that transform these artifacts in successive stages of refinement, at each stage imparting specific aspects of interest as shown in fig. 10.12.
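The business process transformation can be illustrated with a deliberately small Java sketch that turns task-to-method associations into weaving entries. Everything in it is an assumption made for illustration: TaskBinding, WeavingEntry and transform are hypothetical names, the naming scheme for process class methods ("<Process>Process.on<Task>") is invented, and the real transformer operates on UML, BPEL4WS and join point models rather than on strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; every name here is an assumption, not the authors' tool. */
final class ProcessWeavingSketch {

    /** Input: association between a business process task and an application class method. */
    static final class TaskBinding {
        final String task;
        final String appClass;
        final String appMethod;

        TaskBinding(String task, String appClass, String appMethod) {
            this.task = task;
            this.appClass = appClass;
            this.appMethod = appMethod;
        }
    }

    /** Output: one entry of the weaving specification. */
    static final class WeavingEntry {
        final String joinPoint;       // application class method acting as join point
        final String processMethod;   // process class method to invoke at that join point

        WeavingEntry(String joinPoint, String processMethod) {
            this.joinPoint = joinPoint;
            this.processMethod = processMethod;
        }

        @Override
        public String toString() { return joinPoint + " -> " + processMethod; }
    }

    /** Derives one weaving entry per task binding, using an assumed naming scheme. */
    static List<WeavingEntry> transform(String processName, List<TaskBinding> bindings) {
        List<WeavingEntry> specification = new ArrayList<>();
        for (TaskBinding binding : bindings) {
            String joinPoint = binding.appClass + "." + binding.appMethod;
            String processMethod = processName + "Process.on" + capitalize(binding.task);
            specification.add(new WeavingEntry(joinPoint, processMethod));
        }
        return specification;
    }

    private static String capitalize(String s) {
        if (s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        List<TaskBinding> bindings = Arrays.asList(
                new TaskBinding("fillForm", "AccountForm", "submit"),
                new TaskBinding("verifyForm", "AccountForm", "verify"),
                new TaskBinding("approveForm", "AccountForm", "approve"));
        for (WeavingEntry entry : transform("AccountOpening", bindings)) {
            System.out.println(entry);
        }
    }
}
```

A later model-to-code step could render each entry as advice in a language such as AspectJ; that step is outside this sketch.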
Fig. 10.12. Aspect-oriented model-driven development (transformation T1 for Aspect1 yields C1: the component model plus the implementation model and weaving model for Aspect1; transformation T2 for Aspect2 yields C2, which adds the implementation model and weaving model for Aspect2; a model-to-code transformation then produces Java code and AspectJ code).
We identify functionality, business process, architecture, design strategies and technology platform as the principal dimensions of concern for a business application. They are also typically the dimensions of customization for an application. Figure 10.13 shows these dimensions along with their meta models. We discuss two of these dimensions, namely business process and design strategies, in more detail in the example section.

Fig. 10.13. Dimensions of concern for business applications: Functionality (UML), Business Process (BPEL4WS meta model), Architecture (ADL meta model), Design strategies (Pattern meta model), Technology (Platform Profiles).

10.4.1. Transformations
Model transformations play a key role in the realization of the aspect-oriented model-driven development approach. OMG has issued an RFP requesting proposals for a suitable mechanism for model transformations [8]. We propose a hybrid approach that combines the benefits of the declarative and imperative paradigms [9].
Fig. 10.14. Model transformation
In our approach, a transformation is specified as a relationship between a set of model element domains. A domain essentially describes a set of sub-graph instances in a model graph and is specified using a combination of graph patterns and conditional expressions. There are two kinds of transformations — relations and mappings (fig. 10.14). A relation specifies a predicate relationship between a set of model domains i.e., it evaluates to true when the specified relationship holds between the sub-graph instances of the domains, and to false when it does not. Relations are not executable in the sense that they do not create or alter a model: they can however check whether two or more models satisfy the relationship. A relation can be refined into multiple mappings. A mapping specifies a functional relationship between a set of input domains and an output domain i.e., it computes sub-graph instances of the output domain from the sub-graph instances of the input domains. Thus, a relation can be seen as a specification of a transformation and a mapping as its implementation. A transformation can be composed from other transformations. There are three fundamental composition operators — Or, Parallel and Sequential. An Or composition specifies choice points. A Parallel composition specifies parallel execution of member transformations on the same source model-state and their results being merged to produce the target model state. A Sequential composition specifies sequential execution of member transformations with each transformation in the sequence producing the model state for the next transformation to work on. An application is specified as a composition of a set of aspects each imparting a set of properties (fig. 10.12). It is important to ensure that application of one aspect does not invalidate properties imparted by other aspects. Relation, the declarative specification of transformation, helps in this regard. 
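The distinction between relations and mappings, and the three composition operators, can be illustrated with a minimal Python sketch. It is not the transformation mechanism of [9]: the dictionary-based model state, the helper names (`class_to_table`, `add_order_key`, `choice`), the reading of Or as a guarded choice and the merge-by-union rule for Parallel are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Simplifying assumption: a model state is a dict of element name -> kind.
Model = Dict[str, str]
Mapping = Callable[[Model], Model]          # computes a target model state
Relation = Callable[[Model, Model], bool]   # only checks, never alters a model


def class_has_table(source: Model, target: Model) -> bool:
    """Relation: every class in the source has a table in the target."""
    return all("table:" + name[len("class:"):] in target
               for name in source if name.startswith("class:"))


def class_to_table(model: Model) -> Model:
    """Mapping refining the relation above: adds the missing tables."""
    out = dict(model)
    for name in model:
        if name.startswith("class:"):
            out["table:" + name[len("class:"):]] = "table"
    return out


def add_order_key(model: Model) -> Model:
    """A second, hypothetical mapping used to exercise the operators."""
    return {**model, "key:Order": "key"}


def sequential(*steps: Mapping) -> Mapping:
    """Each step works on the model state produced by the previous step."""
    def run(model: Model) -> Model:
        for step in steps:
            model = step(model)
        return model
    return run


def parallel(*members: Mapping) -> Mapping:
    """All members see the same source state; their results are merged."""
    def run(model: Model) -> Model:
        merged: Model = {}
        for member in members:
            merged.update(member(dict(model)))
        return merged
    return run


def choice(guard: Callable[[Model], bool], first: Mapping, second: Mapping) -> Mapping:
    """Or composition, read here as a guarded choice point (an assumption)."""
    def run(model: Model) -> Model:
        return first(model) if guard(model) else second(model)
    return run


source = {"class:Order": "class"}
target = parallel(class_to_table, add_order_key)(source)
```

Here `class_has_table(source, target)` holds after the mapping has run, whereas `class_has_table(source, source)` does not, which mirrors the idea of checking a relation on the result of a composed transformation.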
For example, in figure 10.12, if the relation specifications of transformations T1 and T2 hold in the component model C2, then we can be sure that application of Aspect2 has not invalidated application of Aspect1.

10.4.2. Building block abstraction
A building block is a structural and behavioural abstraction that encapsulates an aspect of interest. Figure 10.15 shows the building block meta model. A building block has an associated meta model that provides the language for describing the aspect it encapsulates. Building blocks are of two kinds: leaf building block and composite building block. The instantiation specification of a leaf building block specifies how to transform an aspect model into its implementation model and how to weave it into the component implementation model. A leaf building block may also have a model-to-code transformation specification to transform the platform independent implementation model into platform specific code. The composition specification of a composite building block specifies how to compose the transformations of its members to honour the dependencies between them. The meta model associated with a composite building block is constructed by merging the meta models associated with its member building blocks. Conversely, the meta models associated with its member building blocks should be projections of the meta model associated with the composite building block. It is possible to stamp out specific aspect type variants from a generic parameterized aspect type [12] by binding its parameters to suitable model elements. We use the mechanism proposed in [9] for model transformations and a transformation language called SpecL [6] for model-to-code transformation.

Fig. 10.15. Building-block meta model (a building block has an aspectMetaModel package and model element parameters; a leaf building block has an instantiation spec and an optional model-to-code spec; a composite building block has a composition spec and member building blocks).

Fig. 10.16. Building-block composition for the example (blocks shown, linked by member associations: Application, Functionality and Design strategies as composites; Business Process, Audit, Object-Relational map and Model-to-Java as leaves).

10.4.3. Example
We consider an order processing subsystem of a clearing and settlement system. Order processing consists of order entry, authorization, matching, clearing and settlement. Information regarding who placed the order, when, the progress of order processing etc., needs to be logged for auditing purposes. The information to be logged for auditing purposes may vary from time to time and from place to place. Details of order processing may also vary; for example, there may not be a need for a separate authorization step. Also, it is required to deliver the system on multiple technology platforms.

Fig. 10.17. Unified meta model for the example (database meta model: Table, Column, Key; functionality meta model: Class, Attribute, Operation, Parameter, BasicType; process meta model: Process, Task).

Fig. 10.18. Input model (class Order, persistent and auditable, with attribute partyId and operations Enter, Authorize, Match, Clear and Settle, which realize the tasks Entry, Authorization, Matching, Clearing and Settlement of the process OrderProcessing).
The system requirements identify functionality, business process, design strategy and technology platform as the principal dimensions of concern. In this example, we consider the design concerns of object-relational mapping and auditing, and Java as the technology platform. The building block model for this application is shown in fig. 10.16. Figure 10.17 shows a relevant part of the unified meta model for the Application building block and the meta model views corresponding to the Functionality, Business process and Design strategies building blocks. Figure 10.18 shows a sample input model comprising a class capturing order details, a process capturing order processing, and mappings between process tasks and class operations. Figure 10.19 shows the platform independent model for the example just before application of the Model-to-Java building block. Join points between business process class methods and application class methods, and between audit class methods and application class methods, are not shown for lack of space. However, the corresponding code fragments are shown later.

The Audit and Object-relational map building blocks interact with each other. The Audit building block introduces model elements that need to be persisted, and generates classes for realizing auditing-related functionality that needs to be woven into the functionality imparted by the Object-relational building block. The Object-relational building block makes the model elements introduced by the Audit building block persistent. This cyclic interaction is specified in the composition specification of the Persistence building block by invoking member transformations in the following order: Object-relational map, followed by Audit, followed by Object-relational map.
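The ordering of member transformations described above can be sketched in Python as follows. This is a simplified illustration, not the authors' composition specification: the set-based model state, the function names and the `Audit_` naming of the introduced element (borrowed from the Audit_Order element in fig. 10.19) are assumptions.

```python
from functools import reduce
from typing import Callable, Dict, Set

# Simplifying assumption: a model state is a dict of named sets of elements.
Model = Dict[str, Set[str]]


def object_relational_map(model: Model) -> Model:
    """Create a table for every persistent class currently in the model."""
    out = {key: set(values) for key, values in model.items()}
    out["tables"] |= out["persistent_classes"]
    return out


def audit(model: Model) -> Model:
    """Introduce an audit element, itself persistent, per auditable class."""
    out = {key: set(values) for key, values in model.items()}
    out["persistent_classes"] |= {"Audit_" + c for c in out["auditable_classes"]}
    return out


def in_order(*steps: Callable[[Model], Model]) -> Callable[[Model], Model]:
    """Sequential composition of member transformations."""
    return lambda model: reduce(lambda state, step: step(state), steps, model)


order_model: Model = {
    "persistent_classes": {"Order"},
    "auditable_classes": {"Order"},
    "tables": set(),
}

# Object-relational map, then Audit, then Object-relational map again.
persistence = in_order(object_relational_map, audit, object_relational_map)
result = persistence(order_model)
print(sorted(result["tables"]))  # ['Audit_Order', 'Order']
```

Dropping the final `object_relational_map` step leaves the element introduced by `audit` without a table, which is why the sketch runs the object-relational step twice.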
The Object-Relational map building block creates a table for every persistent class, with columns corresponding to attributes of the class, and generates operations that provide primary-key based access for creation, modification, deletion and fetching of a row. The Audit building block creates a table for persisting the audit information for every persistent auditable class and generates operations to store the pre- and post-image for every state change of an object. An audit image consists of the state of the object and the relevant information such as who made the change, when, where etc. The Business process building block generates a component with provided and required interfaces for every process. The provided interface comprises operations corresponding to the tasks of the business process, and the required interface comprises the class operations that realize the process tasks. This building block also generates a wrapper component for every class that realizes the business process tasks through its operations.

Model-to-code transformation results in generation of various method bodies and appropriate weaving specifications. Examples are the bodies for the operations imparting persistence (Create, Modify, Delete, Get and Exists), the bodies for the operations imparting auditing (GetImage, SetPreImage and SetPostImage), and the following weaving specification illustrating how to weave auditing functionality into the Modify method:

    Bracket Order::Modify()
        Before GetImage(), SetPreImage()
        After GetImage(), SetPostImage()

Processing of the above weaving specification produces the following code:

    Order::Modify()
        GetImage();
        SetPreImage();
        // Body of original Modify method
        GetImage();
        SetPostImage();
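The effect of such a bracket specification can be mimicked in a few lines of Python. The sketch below is only an analogy for the generated code: the `bracket` helper and the logging `Order` class are hypothetical, and the chapter's own realization targets Java and AspectJ rather than Python.

```python
from typing import Iterable


def bracket(cls: type, target: str, before: Iterable[str], after: Iterable[str]) -> None:
    """Replace cls.<target> by a method that runs advice around the original body."""
    original = getattr(cls, target)
    before, after = list(before), list(after)

    def woven(self, *args, **kwargs):
        for name in before:
            getattr(self, name)()
        result = original(self, *args, **kwargs)
        for name in after:
            getattr(self, name)()
        return result

    setattr(cls, target, woven)


class Order:
    """Hypothetical stand-in whose methods only record that they were called."""

    def __init__(self) -> None:
        self.log = []

    def Modify(self) -> None:
        self.log.append("Modify body")

    def GetImage(self) -> None:
        self.log.append("GetImage")

    def SetPreImage(self) -> None:
        self.log.append("SetPreImage")

    def SetPostImage(self) -> None:
        self.log.append("SetPostImage")


# Counterpart of: Bracket Order::Modify() Before ... After ...
bracket(Order, "Modify",
        before=["GetImage", "SetPreImage"],
        after=["GetImage", "SetPostImage"])
```

Calling `Modify()` on a fresh `Order` then records GetImage, SetPreImage, the original body, GetImage and SetPostImage, in that order.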
Fig. 10.19. Platform independent model just before code generation. Legend: [O] contribution of Object-Relational map building block; [A] contribution of Audit building block; [P] contribution of business process building block. The figure shows the Order class with its [O] Order table, orderId and partyId columns, Order key and Create, Modify, Delete, Get and Exists operations; the [A] Audit_Order table with timeStamp, PreImage and PostImage columns and the GetImage, SetPreImage and SetPostImage operations; and the [P] Order and Order Processing components with their interfaces and operations.

10.4.4. Discussion
In this approach, the components of a system are available in model form. These models can be kept independent of implementation technology and can be targeted to multiple technology platforms through model-based code generation. The approach enables all aspects of interest of a system to be modelled, resulting in localization of aspect specifications and their composition. For example, customizing an application for a different business process would require changing only the business process model and regenerating the application. Similarly, an application can be easily customized for other aspects of interest such as design strategies, architecture etc. This provides a handle for adaptation of a component for a specific usage context. The approach can be used to realize component factories wherein business domain expertise is captured in a repository of reusable building blocks, with tool support for selection, adaptation and composition of building blocks to quickly deliver custom solutions.

OMG advocates a model-driven approach for system development through its MDA initiative [7]. The objective is to organize system specifications along various layers of refinement, namely computation independent models (CIM) that capture requirements independent of their computational realization; platform independent models (PIM) that capture system specifications independent of the implementation technology concerns; and platform specific models (PSM) that capture implementation technology platform concerns. OMG envisages different languages to cater to the specification of these various layers and a set of standard transformations between these languages. This is essentially a generic architectural framework. Our approach can be seen as one instance of this generic architecture specialized for the domain of enterprise class business applications.
10.5. A case-study for the approach

We have used this approach to restructure our model-driven development framework itself. The model-driven development approach presented here has been realized in our model-driven development environment [6]. It comprises a model repository, an integrated development environment for a higher-level specification language (Q++), and a set of model-based code generators as shown in fig. 10.20. An application specification comprises extended UML models, business logic in Q++ and data accesses in terms of an extended query language. We extend UML models with appropriate properties and associations to capture design strategies such as object-relational mapping, auditing, concurrency management, logging etc. This framework has been used to develop several large business applications; a representative set is summarized in Table 10.1. The column Domain model refers to the domain classes and not to the implementation classes.

As can be seen from fig. 10.20, there is a separate code generator for each combination of an architecture layer and technology platform, and each of these code generators has to handle the design strategies as well. Since the design strategies cut across several architectural layers, changing an existing design strategy or adding a new design strategy requires modifications to multiple code generators at multiple places. Complete and consistent implementation of such a change requires thorough knowledge of all the concerned tools on the part of a tool implementer. This problem became more acute as the number of variants of the code generators grew, requiring a large development team. We ended up needing a team with thorough knowledge of all the design strategies for each technology platform. Thus, in effect, we had as many separate development teams as the number of platforms into which we were delivering.
Fig. 10.20. Code generator architecture (translators: Model to PB, Q++ to C++, Model to C++, Query to ProC, Model to JSP, Q++ to Java, Model to Java, Query to JDBC; generated layers: GUI layer in PB, App logic layer in C++, Data manager layer in ProC, GUI layer in JSP, App logic layer in Java, Data manager layer in JDBC).

10.5.1. Aspect-oriented restructuring
The application architecture consists of three layers, namely the GUI layer, the application logic layer and the data manager layer. There exists a choice of technology platforms for realization of each of these layers, for instance PowerBuilder and JSP for the GUI, C++ and Java for application logic, and ProC and JDBC for the data manager layer. We have a code generator for each such technology platform, as shown in fig. 10.20. Here the complexity is principally due to each code generator having to handle several design strategies and all code generators having to handle them in a consistent manner. This is a clear case for a separation-of-concerns approach. We organize the code generators along two dimensions of concern, namely technology platform and design strategies. We encapsulate each design strategy and each platform as an aspect. Figure 10.21 shows the new architecture of our code generators. Application specifications captured in terms of UML models, Q++ code and Query code are transformed in successive stages of refinement through application of various design strategies. The final specifications are then translated to a platform specific implementation by the respective platform specific code generators.
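As a rough illustration of this two-dimensional organization, the following Python sketch assembles a tool variant from a list of design strategies and one platform generator. All names here (`tool_variant`, `java_platform`, `cpp_platform`, the dictionary-based specification) are hypothetical; the layer names are taken from fig. 10.20, but the real generators operate on UML models, Q++ code and query code rather than on strings.

```python
from typing import Callable, Dict, List

# Simplifying assumption: a specification is a dict of named string lists.
Spec = Dict[str, List[str]]
Strategy = Callable[[Spec], Spec]
Platform = Callable[[Spec], List[str]]


def object_relational_mapping(spec: Spec) -> Spec:
    """Design strategy stage: refine the specification, platform unaware."""
    return {**spec, "strategies": spec["strategies"] + ["object-relational mapping"]}


def concurrency_management(spec: Spec) -> Spec:
    return {**spec, "strategies": spec["strategies"] + ["concurrency management"]}


def _emit(layers: List[str], spec: Spec) -> List[str]:
    applied = ", ".join(spec["strategies"])
    return [layer + " [" + applied + "]" for layer in layers]


def java_platform(spec: Spec) -> List[str]:
    """Platform stage: knows the target layers, not the design strategies."""
    return _emit(["GUI layer in JSP", "App logic layer in Java",
                  "Data manager layer in JDBC"], spec)


def cpp_platform(spec: Spec) -> List[str]:
    return _emit(["GUI layer in PB", "App logic layer in C++",
                  "Data manager layer in ProC"], spec)


def tool_variant(strategies: List[Strategy], platform: Platform) -> Callable[[Spec], List[str]]:
    """A tool variant: chosen design strategies followed by one platform."""
    def generate(spec: Spec) -> List[str]:
        for strategy in strategies:
            spec = strategy(spec)
        return platform(spec)
    return generate


spec: Spec = {"artifacts": ["UML models", "Q++ code", "Query code"], "strategies": []}
strategies = [object_relational_mapping, concurrency_management]
java_variant = tool_variant(strategies, java_platform)
cpp_variant = tool_variant(strategies, cpp_platform)
```

In this arrangement, adding a design strategy means writing one new function and appending it to `strategies`; neither platform generator needs to change.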
Table 10.1. Experience summary

| Project | Domain model (no. of classes / screens) | Specifications: size (kloc) | Generated code: size (kloc) | Technology platforms |
|---|---|---|---|---|
| Straight Through Processing System | 334/0 | 183 | 3271 | IBM S/390, Sun Solaris, WinNT, C++, Java, CICS, MQ Series, WebSphere, DB2 |
| Negotiated Dealing System | 303/0 | 46 | 627 | IBM S/390, WinNT, CICS, C++, Java, MQ Series, COM+, DB2 |
| Distributor Management System | 250/213 | 380 | 2670 | HP-UX, Java, JSP, WebLogic, Oracle, EJB |
| Insurance System | 334/0 | 183 | 3271 | IBM S/390, Sun Solaris, C++, Java, CICS, DB2, CORBA |

For every project, the specifications are of kind business logic, business rules and queries, and the generated code is of kind application layer, database layer and architectural glue.
In this approach, the platform specific code generators are independent of design strategy related issues. This enabled us to organize our development team along two independent streams namely technology platform experts and design experts. A single design team can now service all the technology platform teams. Separation of design strategies enables lean technology platform teams. Separation of concerns enables a tool variant to be viewed as a composition of design strategy and technology platform aspects of choice. Localization of concerns led to impact containment. Ability to compose aspects led to increased reuse. These in turn resulted in quicker turnaround times for delivering toolset variants.
Fig. 10.21. Aspect-oriented restructuring (UML models, Q++ code and Query code are refined in stages by design strategies such as object-relational mapping and concurrency management; the final UML model, Q++ code and Query code are then passed to the platform specific translators of fig. 10.20, which produce the GUI, application logic and data manager layers).
10.6. Conclusions and future work

The approach has been used extensively in several projects to develop large business applications. Many of these projects had a product-family nature wherein a product variant needed to be quickly put together and customized to meet the specific requirements of a customer. The model-driven development approach helped in quickly retargeting the application functionality to multiple technology platforms. This was achieved using a relatively unskilled workforce, as the technology and architecture concerns were largely taken care of by the code generators. The tool-assisted component-based development process helped in early detection of errors that would otherwise have led to late-stage integration problems. Also, all the projects reported significant improvements in productivity and quality.

Having part of the application specification in model form and part in code form resulted in synchronization problems. Raising the abstraction level resulted in ease of specification, but availability of debugging support only at the code level led to difficulties in debugging. Also, the cycle time required to effect a small change and verify its correctness was found to be significantly greater for the model-based approach than for the traditional approach. However, the fact that a model-level change got automatically reflected at multiple places in a consistent manner was appreciated.

The aspect-oriented model-driven approach presented here has been used to reorganize a model-driven development environment [6]. The reorganization has made customization easier and has resulted in increased reuse across tool variants. Localization of a concern helped in containing the impact of a change. Being able to selectively compose from aspects of interest resulted in faster customization. Clear separation of concerns has resulted in better traceability from requirements to implementation, leading to better change management. Use of higher-level languages for specifying model and model-to-code transformations has made maintenance and evolution of the tools simpler. However, this approach does not fully address the problem of correctness of composition. Building block constraints and type checking address the problem only partly. Ensuring the correctness of building block composition still lies largely with the user.

We are working on extending the building block abstraction with process specification. We are working on a building block construction framework that enables one to cater to new concerns through definition of new building blocks and/or customization of existing building blocks. We are also working on an open, extensible, repository-based model-driven development framework that enables one to set up a configuration of building blocks for a product family, using which variants of the product family can be developed easily.
Bibliography

1. Omar Aldawud, Tzila Elrad and Atef Bader. UML profile for aspect-oriented software development. Workshop on aspect-oriented modeling with UML, 2003. http://lglwww.epfl.ch/workshops/aosd2003/papers/AldawudAOSD_UMLJProfile.pdf.
2. Aspect-J: http://www.aspectj.org.
3. IBM research. Hyper/J: Multi-dimensional separation of concerns for Java. http://www.research.ibm.com/hyperspace/HyperJ/HyperJ.htm.
4. Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Videira Lopes, Jean-Marc Loingtier and John Irwin. Aspect oriented programming. ECOOP'97, LNCS 1241, pp 220-242. Springer-Verlag. June 1997.
5. Vinay Kulkarni, R. Venkatesh and Sreedhar Reddy. Generating enterprise applications from models. OOIS'02, LNCS 2426, pp 270-279. 2002.
6. MasterCraft: Component-based Development Environment. Technical Documents. Tata Research Development and Design Centre. http://www.tcs.com/0_products/mastercraft/index.htm.
7. MDA: Model Driven Architecture. http://www.omg.org/mda.
8. (Need to update) MOF 2.0 Query / View / Transformations RFP. http://www.omg.org/cgi-bin/doc?ad/02-04-10.
9. (Need to update) QVT Partners Initial submission to the MOF 2.0 Q / V / T RFP. http://www.omg.org/cgi-bin/doc?ad/03-03-27.
10. A. Sreenivas, R. Venkatesh and M. Joseph. Meta-modeling for Formal Software Development. Proceedings of Computing: the Australian Theory Symposium (CATS 2001). Gold Coast, Australia, 2001.
11. Peri Tarr, Harold Ossher, William Harrison and Stanley M. Sutton Jr. N Degrees of separation: Multi-dimensional separation of concerns. Proceedings of the International Conference on Software Engineering (ICSE'99), pp 107-119.
12. Tony Clark, Andy Evans and Stuart Kent. A Metamodel for Package Extension with Renaming. UML 2002, LNCS 2460, pp. 305-320. Springer, 2002.
Chapter 11

A Formal Approach to Constructing Well-Behaved Systems Using Components

Sotiris Moschoyiannis†, Juliana Küster-Filipe‡ and Michael W. Shields†

†Department of Computing, University of Surrey, Guildford, Surrey, GU2 7XH, England
{s.moschoyiannis, m.shields}@eim.surrey.ac.uk

‡School of Informatics, University of Edinburgh, The King's Buildings, Mayfield Road, Edinburgh EH9 3JZ, Scotland
jkfilipe@inf.ed.ac.uk

Present-day software systems are in increasing need of modification and evolution due to changing requirements. Component-based development constitutes a key methodology for creating large-scale, evolvable systems in a timely fashion as it advocates the (re)use of prefabricated replaceable software components. However, it is often the case that undesirable or unpredictable behaviour emerges when components are combined. This is partly due to lack of behavioural information about the individual components. In this paper, we describe a formal model for component specification which can be used to support the analysis and predictability of component composition and to identify undesirable behaviour. In our approach, component behaviour is modelled by so-called behavioural presentations, a powerful model of true concurrency. Moreover, the framework is compositional and supports the assembly of the final system from arbitrary components.
11.1. Introduction

A component-based approach to software engineering emerges as a key paradigm for creating software-intensive systems. The idea is that the final system can be assembled from prefabricated software components, thus increasing the scope for reuse and replacement. However, it is often the case that systems assembled from pre-built components exhibit undesirable or unpredictable behaviour. On some occasions this is due to an unfortunate interaction between concurrency and nondeterminism. Part of the problem has to do with the fact that when a component is designed, its interface often does not contain enough or sufficiently precise behavioural information. This makes it difficult to infer how the component will behave when placed in a particular context and combined with other components.

Current component technologies such as CORBA, EJB and COM/.NET facilitate the development of the final system from pre-built software components by addressing, mostly, the syntactic restrictions on component interoperability. However, these technologies seem to lack an adequate treatment of components at the specification level. In particular, they offer little support for reasoning about the final system before its parts have been combined, executed and tested as a whole, which is the major motivation for our work.

We argue in favour of an a priori reasoning approach to component-based software engineering in which reasoning about composition is based on the properties of the individual components [14]. Our efforts are directed at providing support for predicting properties of assemblies of components before the actual composition takes place. Given a partial description of the behaviour of a component, most likely described by a software engineer at the design level using the Unified Modeling Language (UML) [23], the proposed mathematical framework can be used, behind the scenes, for reasoning about generic issues related to components and their composition, aiming to reveal potential instances of undesirable behaviour. Ideally, appropriate tool support would allow feedback to be provided to the designer indicating what the component actually does (possibly when combined with others), how it is to be deployed, and so on. The feedback produced by such tools should be given using notation well known to the designer, which in our case is assumed to be UML. Consideration of the tools is, however, beyond the scope of the present paper.

In this paper we describe a mathematical framework for formalising components at a semantic modelling level. In particular, we provide a mathematical concept of a software component: the structure of a component is described by a sort while its dynamic characteristics are captured by component vectors. Essentially, these vectors are tuples of sequences of events which are used to model calls to interface operations of the component in question.
More importantly, we define conditions that ensure that a component is well-behaved. The notion of well-behavedness in this context is twofold. From a theoretical perspective, these conditions allow us to associate a component with an event structure-like behavioural model, called a behavioural presentation [25]. Using a behavioural presentation to capture component behaviour is advantageous in the sense that temporal relations among occurrences of events can be derived in such a way that non-determinism, concurrency and simultaneity are treated as distinct phenomena. From a more practical perspective, these conditions allow us to model the reactive behaviour of the component and establish a precise description of the time ordering between calls to component interface operations. The key point is that whenever this time ordering is respected, the component is guaranteed to behave in predictable ways.

The foundations of the mathematical framework for formalising components are essentially those described in previous work [19, 20, 28]. The main difference is that while previous work is concerned with the treatment of composition and the preservation of properties under composition, in this paper we focus on associating the model with behavioural presentations. In this way, the component model can be related to a general theory of non-interleaving representation of behaviour [26].
This paper is structured as follows. In Section 11.2 we present the foundations of the abstract mathematical model and introduce two basic conditions imposed on components: discreteness and local left-closure. These conditions inspire the definition of so-called well-behaved components. In Section 11.3 we associate components with behavioural presentations, which can be used to capture their potential behaviour. Section 11.4 provides a brief overview of composition of components within our model and shows that the proposed framework is compositional. The paper finishes with a discussion on future work and some concluding remarks in Section 11.5.
11.2. Component Model

A component, at a specification level, can be seen as a software entity that provides services to other components and possibly requires services from other components in order to deliver those promised. The offered services are made available via a set of 'provided' interfaces while the reciprocal obligations are to be satisfied via a set of 'required' interfaces. In this paper, we adopt the basic component concepts for specification as in Meyer and others [11, 15, 17, 29]. Pictorially, a component is a square box with a number of input and output ports [30]. Each port is associated with an interface. We shall assume a countably infinite set $I$ of interface names and a countably infinite set $Op$ of calls to operations of those interfaces, both sets remaining fixed for the remainder of this paper.

Definition 11.1. We define a (component) sort to be a tuple $\Sigma = (P_\Sigma, R_\Sigma, \beta_\Sigma)$ where

• $P_\Sigma \subseteq I$ is a set of provided interfaces
• $R_\Sigma \subseteq I$ is a set of required interfaces
• $\beta_\Sigma : P_\Sigma \cup R_\Sigma \to \wp(Op)$; hence, $\beta_\Sigma(i)$ is the set of calls to operations associated with interface $i$

and we require that $P_\Sigma \cap R_\Sigma = \emptyset$. Define $I_\Sigma = P_\Sigma \cup R_\Sigma$.

These sets and this function comprise the static structure of a typical software component. For simplicity, we refer to events that may occur on an interface as operation calls. However, these could be understood in more general terms as input actions (on provided interfaces), used to model operations/procedures/methods that can be called as well as the return locations of such calls, and as output actions (on required interfaces), which are used to model operation/procedure/method calls and exceptions that may arise during execution. In this sense, the notion of a component sort resembles the alphabet structure of interface automata [6].

As far as the dynamic characteristics of a component are concerned, we capture those by modelling the sequences of events experienced at the interfaces of a component. For this purpose, we introduce the notion of component vectors in our model.
S. Moschoyiannis, J. Küster-Filipe and M. W. Shields
Definition 11.2. Suppose that $\Sigma$ is a sort. We define $V_\Sigma$ to be the set of all functions $\underline{v} : I_\Sigma \to Op^*$ such that for each $i \in I_\Sigma$, $\underline{v}(i) \in \beta_\Sigma(i)^*$. We shall refer to elements of $V_\Sigma$ as component vectors. By $\beta_\Sigma(i)^*$ we denote the set of finite sequences over $\beta_\Sigma(i)$.
The idea is that the behaviour of the component as a whole can be described by assigning such a sequence to each interface of the component. The function $\underline{v}$ returns, for each interface $i$ of the component, a finite sequence of events (e.g. calls to operations) on $i$. Putting together such sequences, one for each interface, we form (a set of) vectors of sequences. Mathematically, the set of component vectors $V_\Sigma$ is the cartesian product of the sets $\beta_\Sigma(i)^*$. Each coordinate corresponds to an interface of the component and contains a finite sequence of calls to operations on that interface. Thus, the dynamics of a component consist of a set of possible behaviours.
Components are developed under (often differing) assumptions about the context in which they will be placed. For instance, one component may expect certain signals to arrive consecutively while another generates them concurrently. Or, more generally, a component may assume that calls to interface operations occur in a specific order and it may behave as desired only when this order is respected. It is the purpose of component-based design to document such assumptions and describe the behaviour of the component in contexts which satisfy those assumptions. Within our component model this amounts to restricting to an appropriate subset of $V_\Sigma$ comprising component vectors that describe intended or permitted behaviour only.
Definition 11.3. A component $c$ is a pair $(\Sigma, V)$, where
• $\Sigma$ is the sort of $c$,
• $V \subseteq V_\Sigma$ is the component language of $c$.
Thus, a component consists of the static structure (signature) described by a sort $\Sigma$ together with a 'language' $V$ of component vectors.
Intuitively, the idea is that the component language indicates possible constraints on the order in which several operations of the component can or should be called. It is worth noting that there are a number of ways to restrict the $\underline{v}(i)$ to allowed sequences of operation calls. Schmidt and Reussner [24] attach a finite state machine to each interface, in which case the allowed sequences are essentially given by the language accepted by the machine. Moschoyiannis [21] advocates the use of sequence diagrams, LSCs [4] in particular, for obtaining the component language based on the partial order induced by a sequence diagram, effectively building on Küster-Filipe's [12, 13] work on formalising the interactions that appear on sequence diagrams. Alternative options could be the use of regular expressions or simply a textual description (use cases) of intended behaviour. Without implying any strong preference among the various options, for simplicity we opt for the latter in the example we use to illustrate the basic ideas in this chapter.
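Definitions 11.1 to 11.3 can be made concrete with a small executable sketch. The code below is only an illustration under our own assumptions: the dictionary-based encoding, the function names (`make_sort`, `is_component_vector`, `is_component`) and the two-interface sort with calls `x1`, `x2`, `y1` are hypothetical and do not come from the formal model.

```python
# Illustrative encoding of Definitions 11.1-11.3; all names are hypothetical.

def make_sort(provided, required, beta):
    """Build a sort (P, R, beta) after checking the conditions of Definition 11.1."""
    provided, required = frozenset(provided), frozenset(required)
    if provided & required:
        raise ValueError("provided and required interfaces must be disjoint")
    if set(beta) != provided | required:
        raise ValueError("beta must be defined exactly on the interfaces of the sort")
    return {"P": provided, "R": required,
            "beta": {i: frozenset(ops) for i, ops in beta.items()}}

def is_component_vector(sort, v):
    """True iff v assigns to every interface i a finite sequence over beta(i)."""
    interfaces = sort["P"] | sort["R"]
    if set(v) != interfaces:
        return False
    return all(op in sort["beta"][i] for i in interfaces for op in v[i])

def is_component(sort, language):
    """True iff every element of the language is a component vector of the sort."""
    return all(is_component_vector(sort, v) for v in language)

# Hypothetical sort with one provided interface "p" and one required interface "r".
sort = make_sort(provided={"p"}, required={"r"},
                 beta={"p": {"x1", "x2"}, "r": {"y1"}})

# A candidate component language: one tuple of calls per interface.
language = [
    {"p": (), "r": ()},
    {"p": ("x1",), "r": ()},
    {"p": ("x1", "x2"), "r": ("y1",)},
]

print(is_component(sort, language))                        # True
print(is_component_vector(sort, {"p": ("y1",), "r": ()}))  # False: y1 is not a call on p
```

Representing a vector as a mapping from interface names to tuples mirrors the view of $V_\Sigma$ as a cartesian product with one coordinate per interface.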
Example 11.1. Consider a small and simplified extract of a TV platform, related to the MENU functionality of a TV set. The MANUAL STORE options are provided by the interaction of the components of Figure 11.1, which depicts the component specification architecture using UML. The component architecture of Figure 11.1 comprises a set of application level components together with their structural relationships and behavioural dependencies.
[Figure: UML diagram with «component» CMenu and «component» CTuner and the interfaces ISearchFre, IFineTune, IOutput, IDetectSignal and IChangeChannel.]
Fig. 11.1. Component specification architecture
The CMenu component establishes communication with users via its provided interfaces ISearchFre and IFineTune. The ISearchFre interface has operations highlightItem and startSearch. Calls to these operations shall be denoted by $a_1$, $a_2$ respectively, for abbreviation. The IFineTune interface has operations highlightItem, incrementFre and decrementFre, abbreviated by $b_1$, $b_2$ and $b_3$ respectively. A user requests to search the available frequency for a program via the ISearchFre interface. The CMenu component cannot satisfy the requested operation itself and requires a component providing the IDetectSignal interface to conduct the frequency search on its behalf. This is done by invocation of an operation detectingSignal (abbreviated by $c_1$) on its required interface IDetectSignal, which is implemented by the CTuner component.
In what follows, we apply the formalism presented earlier to model the CMenu component. The provided interfaces of CMenu are given by the set $P_\Sigma = \{\mathit{ISearchFre}, \mathit{IFineTune}\}$, and the required interfaces by the set $R_\Sigma = \{\mathit{IDetectSignal}\}$. Consequently, the complete set of interfaces is given by $I_\Sigma = \{\mathit{ISearchFre}, \mathit{IFineTune}, \mathit{IDetectSignal}\}$ and of course, $P_\Sigma \cap R_\Sigma = \emptyset$. The function $\beta_\Sigma$, as defined in Definition 11.1, provides the set of calls to operations associated with each interface. In this case we have
$\beta_\Sigma(\mathit{ISearchFre}) = \{a_1, a_2\}$
$\beta_\Sigma(\mathit{IFineTune}) = \{b_1, b_2, b_3\}$
$\beta_\Sigma(\mathit{IDetectSignal}) = \{c_1\}$
Suppose that a component developer considers the expected behaviour of CMenu fulfilling the following:
• The Fine Tune option should be highlighted before the user can change the default fine tune value.
• The Search option should be highlighted before the user can request a frequency search.
• Once the user requests a search (which corresponds to invoking operation $a_2$ on ISearchFre) the CMenu component calls the CTuner component (calling operation $c_1$ on IDetectSignal).
• An occurrence of an operation call $a_2$ on ISearchFre should be followed immediately by an operation call $c_1$ on IDetectSignal, and nothing should be allowed to happen in between.
Given this informal description of behaviour for the CMenu component, and if we write $(x, y, z)$ for the function $\underline{v}$ of Definition 11.2 with $\underline{v}(\mathit{ISearchFre}) = x$, $\underline{v}(\mathit{IFineTune}) = y$ and $\underline{v}(\mathit{IDetectSignal}) = z$, we obtain a set of behaviours that gives a partial description of what would be desirable behaviour of the CMenu component. In fact, such a description of behaviour from a component developer would be interpreted into the following set of component vectors. We use $\Lambda$ to denote the empty sequence.
$V = \{(\Lambda, \Lambda, \Lambda), (a_1, \Lambda, \Lambda), (\Lambda, b_1, \Lambda), (a_1a_2, \Lambda, \Lambda), (\Lambda, b_1b_2, \Lambda), (\Lambda, b_1b_3, \Lambda), (a_1, b_1, \Lambda), (a_1, b_1b_2, \Lambda), (a_1a_2, \Lambda, c_1), (a_1a_2, b_1, c_1), (a_1, b_1b_2b_3, \Lambda)\}$
In further explanation of the notation, there is an ordering between calls to operations on the same interface (within one coordinate) but not between different interfaces of the component (between two coordinates). The ordering amongst calls to operations on different interfaces is obtained from the behavioural presentation for the component, as will be demonstrated in Section 11.3. Indeed, $c = (\Sigma, V)$ is a component (recall Definition 11.3) where $\Sigma = (P_\Sigma, R_\Sigma, \beta_\Sigma)$ is a component sort and its component language $V$ is a subset of all possible component vectors in $V_\Sigma$. The mathematics of component vectors is given in a report by Shields [28] and is very similar to the behaviour vectors in Shields' earlier book [26].
The main difference is that while vectors in the book [26] describe behaviour of systems of sequential processes combined using something like the $\parallel$ operator of CSP [8], component vectors in this text describe behaviour of systems using something like the $|||$ operator of CSP. In this paper, we present the fairly basic properties of component vectors. More details can be found in the report [28].
If $x$ and $y$ are sequences we write $x.y$ for the concatenation of $x$ and $y$. As is well known, this operation is associative with identity $\Lambda$, where $\Lambda$ denotes the empty sequence. We also have a partial order on sequences given by $x \leq y$ if and only if there exists $z$ such that $x.z = y$, and this partial order has a bottom element $\Lambda$. It is also well known that concatenation is cancellative, thus $z$ is unique. Component vectors are essentially tuples of sequences. Thus, the above well-known operations on sequences can be applied to component vectors coordinatewise. Based on the ordering among sequences, the set of component vectors $V_\Sigma$ becomes a
partially ordered set with a partial order '$\leq$' and bottom element $\underline{\Lambda}$. The behaviour $\underline{\Lambda}$ assigns the empty sequence to each interface. The partial order on sequences, together with a relation that allows us to determine which vectors lie immediately underneath another vector (cf. Definition 11.8), gives the ordering of behaviour vectors in our model. Now based on the order theoretic properties of the set $V_\Sigma$, two basic operations on the set of behaviours of a component can be introduced.
Definition 11.4. Let $\underline{u}$ and $\underline{v}$ be component vectors in $V \subseteq V_\Sigma$. Then,
(1) $\underline{u} \sqcap \underline{v}$ is defined to be the vector $\underline{w}$ which satisfies $\underline{w}(i) = \min(\underline{u}(i), \underline{v}(i))$
(2) if it exists, $\underline{u} \sqcup \underline{v}$ is defined to be the vector $\underline{w}$ which satisfies $\underline{w}(i) = \max(\underline{u}(i), \underline{v}(i))$
where, for an arbitrary $i$, $\underline{w}(i)$, $\underline{u}(i)$, $\underline{v}(i)$ denote the sequence in the $i$-th coordinate of vector $\underline{w}$, $\underline{u}$, $\underline{v}$ respectively.
The minimum ($\min$) and the maximum ($\max$) among sequences appearing in coordinates of component vectors are determined by the prefix ordering defined on the set of sequences formed over $\beta_\Sigma(i)$, for each $i$. We write $\underline{u}(i) = \min(\underline{u}(i), \underline{v}(i))$ if $\underline{u}(i)$ is a prefix of $\underline{v}(i)$. Formally we have,
$\underline{u}(i) = \min(\underline{u}(i), \underline{v}(i)) \iff \exists \underline{z} : \underline{u}(i).\underline{z}(i) = \underline{v}(i), \quad i \in I_\Sigma$
and $\underline{u}(i) = \max(\underline{u}(i), \underline{v}(i))$ is defined similarly. In terms of partial orders, the above operations essentially give the greatest lower bound and the least upper bound of $\underline{u}, \underline{v} \in V$, in the usual sense of lattices and domain theory [5, 32]. It is well known that if $(X, \leq)$ is a partially ordered set (poset) then the least upper bound of $x_1, x_2 \in X$, if it exists, is the least element $x \in X$ such that $x_1, x_2 \leq x$. We denote it by $x_1 \sqcup x_2$. The greatest lower bound, denoted by $x_1 \sqcap x_2$, is the largest element $x \in X$ such that $x \leq x_1, x_2$. Notice that these are computed coordinatewise for the behaviour vectors of our model. Additionally, and based on the fact that if $x, y, z$ are sequences such that both $x$ and $y$ are prefixes of $z$, i.e. $x, y \leq z$, then either $x \leq y$ or $y \leq x$, we may infer for component vectors that if $\underline{u}, \underline{v} \leq \underline{w}$, then $\underline{u}(i), \underline{v}(i) \leq \underline{w}(i)$ for each $i$, so $\underline{u}(i) \leq \underline{v}(i)$ or $\underline{v}(i) \leq \underline{u}(i)$, for each $i$. We shall use this fact in the sequel without further comment.
We may now restrict to a class of components with desirable properties. These are properties of the corresponding component language and allow us to relate our component model to a behavioural model for non-interleaving representation of behaviour.
In particular, a component language, under certain conditions, may be translated into an object called behavioural presentation, introduced in Shields' article [25], which generalises the event structures of Nielsen, Plotkin and Winskel [22] in allowing time ordering of events to be a pre-order rather than a partial order, thereby allowing the representation of simultaneity as well as concurrency. Effectively, this association builds a bridge between algebraic and order-theoretic representation of component behaviour. How behavioural presentations [25, 26] can be used to model software components is discussed in Section 11.3. In what follows, we describe the conditions that allow this association between components and behavioural presentations.
A key property of the sets $V_\Sigma$ is that they possibly contain discrete subsets.
Definition 11.5. Let $c = (\Sigma, V)$ be a component. We shall say that $V$ is discrete, and consequently also that $c$ is discrete, iff $\underline{\Lambda} \in V$ and whenever $\underline{v}_1, \underline{v}_2, \underline{w} \in V$ such that $\underline{v}_1, \underline{v}_2 \leq \underline{w}$ then,
• $\underline{v}_1 \sqcup \underline{v}_2 \in V$
• $\underline{v}_1 \sqcap \underline{v}_2 \in V$
Note that $\underline{v}_1 \sqcup \underline{v}_2 \in V$ is understood as asserting that $\underline{v}_1 \sqcup \underline{v}_2$ is defined. Also, note that the notion of consistent completeness underlies the definition of discreteness. In short, consistent completeness for a poset dictates that whenever two of its elements are less than a third in the set, then their least upper bound not only exists but is also in the poset. Thus, discreteness simply imposes the additional requirement that the greatest lower bound also belongs to the poset.
In fact, we wish to constrain components in such a way that they can be associated with a subclass of behavioural presentations, namely those that are discrete. This guarantees that i) there are no infinite ascending or descending chains of occurrences of events, with respect to time ordering, which would give rise to Zeno type paradoxes, ii) there are no 'gaps' in the time continuum and iii) there is an initial point in which nothing has happened. Exactly how i), ii) and iii) relate to the notion of discreteness shall become clearer when we discuss discrete behavioural presentations in Section 11.3 (cf. Definition 11.13).
We also wish to ensure that the behavioural presentation for each component contains one occurrence for each call to operation on one of its interfaces. This can be guaranteed by a property called local left-closure, which we now define.
Definition 11.6. Suppose that $c = (\Sigma, V)$ is a component. We shall say that $V$ is locally left-closed, and consequently also that $c$ is locally left-closed, iff whenever $\underline{u} \in V$ and $i \in I_\Sigma$ and $x \in \beta_\Sigma(i)^*$ such that $\Lambda < x \leq \underline{u}(i)$, then there exists $\underline{v} \in V$ such that $\underline{v} \leq \underline{u}$ and $\underline{v}(i) = x$.
Definition 11.7.
If a component $c$ is discrete and locally left-closed, then we shall say it is well-behaved. Effectively, we require that every occurrence of an event is 'recorded' in the set of behaviours of the component. Otherwise, a component might be discrete but its behaviour vectors may represent the occurrence of the last operation call only, on each appropriate interface. Such situations can be eliminated by requiring the local left-closure property of components. For a well-behaved component, and based on consequences of local left-closure in particular, we may define a further ordering between component vectors in which one is 'immediately beneath' the other. Definition 11.8. Let $\underline{u}$ and $\underline{v}$ be behaviour vectors in $V \subseteq V_\Sigma$. Then we define, u
Intuitively, the relation
[Figure: Hasse diagram of the component vectors in V.]
Fig. 11.2. Order structure of elements in V, as given by the designer
framework this vector should be added in order to make the component language $V$, and consequently the CMenu component, discrete. By adding in vector $(a_1a_2, b_1, \Lambda)$ we get the following set of behaviours in $V$. Its order structure is depicted in the diagram of Figure 11.3.
$V = \{(\Lambda, \Lambda, \Lambda), (a_1, \Lambda, \Lambda), (\Lambda, b_1, \Lambda), (a_1a_2, \Lambda, \Lambda), (\Lambda, b_1b_2, \Lambda), (\Lambda, b_1b_3, \Lambda), (a_1, b_1, \Lambda), (a_1, b_1b_2, \Lambda), (a_1a_2, b_1, \Lambda), (a_1a_2, \Lambda, c_1), (a_1a_2, b_1, c_1), (a_1, b_1b_2b_3, \Lambda)\}$
This newly added vector describes behaviour of the component in which a call to operation $a_2$ is followed by a call to operation $b_1$. In terms of our example, the user requests a frequency search and, before the CMenu component deals with this request (by making a call to operation $c_1$ on IDetectSignal, according to the component designer), the user highlights the FineTune option (via a call to operation $b_1$ on IFineTune). Such a sequence of events might leave the CMenu component in an inconsistent state or even cause a system failure. Hence, the component vector $(a_1a_2, b_1, \Lambda)$ can be regarded as describing a potential instance of undesirable behaviour.
By comparing the two diagrams for the order structure of the component language $V$ in Figure 11.3 and Figure 11.2, it can be seen that the model is, in a sense, warning the component designer. The diagram of Figure 11.3 says that in the course of achieving the desirable behaviour, described by vector $(a_1a_2, b_1, c_1)$, the component might exhibit the potentially undesirable behaviour described by vector $(a_1a_2, b_1, \Lambda)$. In case this vector is indeed undesirable, some refinement of the component design is required in order to ensure that $(a_1a_2, b_1, c_1)$ could be reached only through vector $(a_1a_2, \Lambda, c_1)$, excluding any path that would involve
[Figure: Hasse diagram of the component vectors in V after adding the vector required for discreteness.]
Fig. 11.3. Order structure of elements in V, as given by the model
vector $(a_1a_2, b_1, \Lambda)$. If, on the contrary, vector $(a_1a_2, b_1, \Lambda)$ represents reasonable behaviour and such a sequence of calls to operations should be allowed, then our model is detecting it and serves as a designer's aid to find the complete set of allowed behaviours of the component.
Now based on Figure 11.3, we examine whether the discreteness property holds. That this is so is best illustrated diagrammatically. By inspection, we have the case depicted as a Hasse diagram in Figure 11.4, in which each subgraph below a given node exhibits the characteristic structure of a lattice. It can be seen in the illustration of Figure 11.4 that we only include those vectors of $V$ with at least two distinct immediate predecessors. To see that $(a_1a_2, b_1, c_1)$, $(a_1, b_1b_2, \Lambda)$, $(a_1a_2, b_1, \Lambda)$ and $(a_1, b_1, \Lambda)$ are such vectors, focus on the familiar lozenge shapes formed in Figure 11.3. The Hasse diagram of Figure 11.4 then demonstrates that together with their predecessors they constitute lattices. Indeed, the least upper bound and the greatest lower bound of the distinct immediate predecessors exist and are in $V$, in all four cases. This implies that the CMenu component is discrete (in conformance with Definition 11.5).
For local left-closure, we concentrate on those vectors in $V$ with at least one coordinate of length greater than one and examine their predecessors. Figure 11.5 demonstrates that for each vector in $V$ with at least two events in one of its coordinates there is some other vector in $V$ which has either the same sequence of events at that specific coordinate, or the same sequence reduced by one event. This implies that the CMenu component is locally left-closed. Having established both discreteness and local left-closure for the CMenu component, we have shown that it is well-behaved. Consequently, its set of behaviours can be associated with a behavioural presentation (see the discussion in the following
[Figure: Hasse diagrams of the vectors of V with at least two distinct immediate predecessors, together with their predecessors.]
Fig. 11.4. Discreteness of CMenu component
[Figure: vectors of V with at least two events in one coordinate, shown with their predecessors.]
Fig. 11.5. Local left closure of CMenu component
section for this association), modelling the potential behaviour of the CMenu component.
11.3. Behavioural Presentations for Components
In this section we relate the component model to behavioural presentations, which comprise the central behavioural model of our approach. We shall use behavioural presentations to model component behaviour. In this way, our component model is equipped with a semantics expressive enough to capture non-determinism, concurrency and simultaneity as distinct phenomena. First, we present this behavioural model and motivate our preoccupation with discrete behavioural presentations. Then, we describe how a component can be
associated with such a behavioural presentation. We start by outlining the basic concepts behind behavioural presentations.
Associated with any system is a set of events. For instance, a call to operation startSearch on interface ISearchFre of the CMenu component in our example is considered an event. When an event actually happens we talk about an occurrence of that event. Therefore, for any system there is a corresponding set $E$ of events and a set $O$ of occurrences of those events. A description of the possible behaviour of a system may consist of a set of assertions concerning what events have occurred during its execution. An assertion will be valid relative to some point in the space-time of the system. Therefore, each system is associated with a set of points. A point can be thought of as a "possible world" in which certain events have occurred. Each point is identified with the set of occurrences of events which have taken place prior to that point. The intuition is that each point represents that point in time reached after all occurrences which constitute it have taken place.
Events may have multiple occurrences. Two occurrences of the same event are the same if they have been preceded by the same sequences of events. Consider the sequences of events $aabab$ and $aaabab$. The second $b$ in the sequence $aabab$ is not the same occurrence as the second $b$ in $aaabab$. They take place in different "possible worlds". We may thus refer to occurrences by giving the sequence of which they are the last element. These concepts comprise the basic behavioural model of our approach, namely that of behavioural presentations. A behavioural presentation, introduced in an article by Shields [25], is defined as follows.
Definition 11.9. A behavioural presentation is a quadruple $B = (O, \Pi, E, \lambda)$, where
(1) $O$ is a set of occurrences
(2) $\Pi \subseteq \wp(O)$ is a non-empty set of points
(3) $E$ is a set of events
(4) $\lambda : O \to E$ is the occurrence function
which satisfies $\bigcup_{\pi \in \Pi} \pi = O$.
The requirement that $\bigcup_{\pi \in \Pi} \pi = O$ says that every occurrence belongs to some point and essentially reflects the fact that we should not be concerned with things that could never happen. The function $\lambda$ associates occurrences with events. Therefore, $\lambda(o) = e$ is to be read as '$o$ is an occurrence of $e$'. The behavioural presentation model is closely related to the event structures model [22]. In fact, behavioural presentations mildly generalise event structures [22] in allowing time ordering of events, given in Definition 11.10, to be a pre-order (a reflexive and transitive relation) rather than a partial order, thereby allowing the representation of simultaneity as well as concurrency. The relationship between the two is further examined in Shields' book [26] where it is shown that event structures correspond to behavioural presentations which are closed, in the sense of being prime algebraic and coherent [5]. We omit further details.
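To make Definition 11.9 tangible, the following sketch encodes a behavioural presentation as a small data structure and checks its defining conditions. Everything here is an assumption made for illustration: the class and field names are hypothetical, and the example presentation (events `a` and `b`, with occurrences named by the sequence they end) is our own.

```python
from dataclasses import dataclass

@dataclass
class BehaviouralPresentation:
    occurrences: frozenset   # O
    points: frozenset        # Pi, a set of frozensets of occurrences
    events: frozenset        # E
    occurrence_of: dict      # lambda : O -> E

    def is_valid(self):
        """Check the conditions of Definition 11.9."""
        if not self.points:
            return False                                  # Pi must be non-empty
        if any(not p <= self.occurrences for p in self.points):
            return False                                  # every point is a subset of O
        if set(self.occurrence_of) != set(self.occurrences):
            return False                                  # lambda is defined exactly on O
        if any(e not in self.events for e in self.occurrence_of.values()):
            return False                                  # lambda maps into E
        covered = frozenset().union(*self.points)
        return covered == self.occurrences                # every occurrence lies in some point

# Hypothetical example: after "a", either a second "a" or a "b" may occur.
bp = BehaviouralPresentation(
    occurrences=frozenset({"a", "aa", "ab"}),
    points=frozenset({frozenset(), frozenset({"a"}),
                      frozenset({"a", "aa"}), frozenset({"a", "ab"})}),
    events=frozenset({"a", "b"}),
    occurrence_of={"a": "a", "aa": "a", "ab": "b"},
)
print(bp.is_valid())   # True

# Adding an occurrence that belongs to no point violates the union condition.
bad = BehaviouralPresentation(
    occurrences=bp.occurrences | {"b"},
    points=bp.points,
    events=bp.events,
    occurrence_of={**bp.occurrence_of, "b": "b"},
)
print(bad.is_valid())  # False
```

The second example shows the role of the union condition: an occurrence that could never happen is rejected.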
In order to obtain a precise description of the behaviour of a component at its interfaces, we need to model: i) the order in which the component makes calls to operations of other components through its required interfaces, and ii) the order in which the component receives calls to operations from other components on its provided interfaces. A behavioural presentation can be rather useful to this end, since it determines various temporal relations on the set of occurrences of events.
Definition 11.10. Let $B$ be a behavioural presentation and suppose that $o_1, o_2 \in O$. Define,
• $o_1 \mathrel{\#} o_2 \iff \forall \pi \in \Pi : o_2 \in \pi \Rightarrow o_1 \notin \pi$ and we say $o_1, o_2$ are mutually exclusive
• $o_1 \to o_2 \iff \forall \pi \in \Pi : o_2 \in \pi \Rightarrow o_1 \in \pi$ and we say $o_1$ has happened no later than $o_2$
• $o_1 \approx o_2 \iff (o_1 \to o_2) \wedge (o_2 \to o_1)$ and we say $o_1, o_2$ occurred simultaneously
• $o_1 \mathrel{co} o_2 \iff \neg(o_1 \mathrel{\#} o_2) \wedge (o_1 \not\to o_2) \wedge (o_2 \not\to o_1)$ and we say $o_1, o_2$ occurred concurrently
• $o_1 < o_2 \iff (o_1 \to o_2) \wedge (o_2 \not\to o_1)$ and we say $o_1$ happened strictly before $o_2$
Using the above temporal relations we can determine the causal and temporal ordering amongst operation calls occurring at component interfaces, and in this way describe the observable behaviour of a component. It can be seen that the temporal relations derived from behavioural presentations are based on two fundamental relations: $\#$ and $\to$. These relations introduce concepts of mutual exclusion and time ordering among events, in a fashion similar to the well-known conflict and causal temporal relations in an article by Küster-Filipe et al. [10] and elsewhere. The relation $\to$ is a pre-order - that is, a transitive, reflexive relation. If $o_1 \to o_2$, then if $o_2$ has happened, then so must $o_1$. The relation $<$ is a strict pre-order. As for $\#$, if $o_1 \mathrel{\#} o_2$, then an occurrence of $o_2$ excludes future occurrence of $o_1$, and vice versa. It is this relation that allows us to introduce notions of non-determinism into the model.
In fact, $\#$ is an independence relation - that is, an irreflexive, symmetric relation. Put formally, $R$ is an independence relation on $X$ if and only if
$\forall x, y \in X : (\neg\, x R x) \wedge (x R y \Rightarrow y R x)$
The following remark gives the basic properties of all temporal relations derived from behavioural presentations.
Remark 11.1. Suppose that $B$ is a behavioural presentation, then
(1) $\to$ is a pre-order
(2) $\#$ is an independence relation, and whenever $o_1 \to o_2$ and $o_1' \to o_2'$, then $o_1 \mathrel{\#} o_1' \Rightarrow o_2 \mathrel{\#} o_2'$
(3) $\approx$ is an equivalence relation
(4) $co$ is an irreflexive and symmetric relation
(5) $<$ is a strict pre-order
A pre-ordered set is a pair $(X, \to)$ where $X$ is a set and $\to$ is a relation on the set which satisfies:
• $\forall x \in X : x \to x$ (reflexivity)
• $\forall x, y, z \in X : x \to y \wedge y \to z \Rightarrow x \to z$ (transitivity)
Notice that we do not require $\to$ to be antisymmetric. Thus, if $x \to y$ and $y \to x$, instead of $x = y$ we get $x \approx y$, which means that $x$ and $y$ are distinct but stand in an equivalence relation. This allows the formal treatment of simultaneity on top of concurrency. In further explanation of the notation, $\approx$ is the equivalence relation generated by the pre-order $\to$. Hence, if $o, o_1, o_2 \in O$ and $o_1 \approx o_2$, then
• $o_1 \to o \iff o_2 \to o$
• $o \to o_1 \iff o \to o_2$
• $o \mathrel{\#} o_1 \iff o \mathrel{\#} o_2$
In other words, two occurrences related by $\approx$ stand in exactly the same relationship to other occurrences. Further, suppose that $o_1 \approx o_2$, and assume that we have some means of deriving from the system the exact time $t$ at which $o_1$ occurred. Then, if $o_1$ is in a relationship with the clock occurrence $t$, then $o_2$ must also be in that relation (and vice versa). Thus, $o_2$ must also have occurred at $t$. The interpretation is that $o_1, o_2$ are simultaneous. Notice that this is not the same as saying that $o_1, o_2$ are concurrent: $o_1 \mathrel{co} o_2$ says that certainly neither precedes the other and they are not mutually exclusive.
Using the above temporal relations we may capture all occurrences in any behavioural presentation, as two occurrences are either:
(1) mutually exclusive
(2) ordered in time
(3) simultaneous
(4) concurrent
(5) strictly ordered in time,
and only one of these relations holds for a pair of occurrences [25].
Component-based systems are largely conceived of as proceeding in discrete steps. This implies that occurrences of events in the system do not blur into one another. In the spirit of event structures, as discussed by Winskel [31], this means that any two events in the system can be separated by an open neighbourhood. We would like to be able to describe such systems using our behavioural model. For this purpose, we shall consider a subclass of behavioural presentations which are well suited for representing discrete behaviour. Before defining discrete behavioural presentations we discuss related properties that motivate the definition.
First, we want to ensure that discrete systems do proceed in an orderly way in discrete steps. A step in a behavioural presentation is understood in the following terms. Assume that the system is in a state where its occurrences of events so far are described by $\pi$. An occurrence $o$ of some event takes place and the state including this additional occurrence is now described by $\pi'$. Thus, we obtain $\pi'$ by adding in $o$ to whatever occurrences were already in $\pi$. (In fact, we need to add in the entire simultaneity class of $o$ since, if $o'$ is some other occurrence such that $o \approx o'$, then $o'$ must also be in $\pi'$ (by Definition 11.10 of $\approx$ and $\to$).)
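The temporal relations of Definition 11.10 are determined entirely by the set of points, so they can be computed directly. The sketch below illustrates this under our own assumptions: the function names and the seven-point example presentation (occurrences `m`, `n`, `x`, `y`, `p`, `q`) are hypothetical and chosen so that each relation is exhibited once.

```python
# Each function takes the collection of points Pi and two occurrences.

def exclusive(points, o1, o2):
    """o1 # o2: no point that contains o2 contains o1."""
    return all(o1 not in p for p in points if o2 in p)

def no_later(points, o1, o2):
    """o1 -> o2: every point that contains o2 also contains o1."""
    return all(o1 in p for p in points if o2 in p)

def simultaneous(points, o1, o2):
    return no_later(points, o1, o2) and no_later(points, o2, o1)

def concurrent(points, o1, o2):
    return (not exclusive(points, o1, o2)
            and not no_later(points, o1, o2)
            and not no_later(points, o2, o1))

def strictly_before(points, o1, o2):
    return no_later(points, o1, o2) and not no_later(points, o2, o1)

# Hypothetical presentation: m and n are independent, then x or y is chosen,
# and after x the occurrences p and q always appear together.
PI = [frozenset(s) for s in
      ("", "m", "n", "mn", "mnx", "mny", "mnxpq")]

print(concurrent(PI, "m", "n"))       # True
print(exclusive(PI, "x", "y"))        # True
print(strictly_before(PI, "m", "x"))  # True
print(simultaneous(PI, "p", "q"))     # True
print(exclusive(PI, "x", "x"))        # False: # is irreflexive
```

Note that `exclusive` is irreflexive here only because every occurrence lies in some point, which is the union condition of Definition 11.9.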
Second, we want to ensure that a behavioural presentation contains enough points to separate events which are strictly ordered or non-simultaneous. This is the repletion property and is defined as follows.
Definition 11.11. If $B = (O, \Pi, E, \lambda)$ is a behavioural presentation, then $B$ will be said to be replete iff whenever $\pi_1, \pi_2 \in \Pi$ such that $\pi_1 \subseteq \pi_2$ and $o_1, o_2 \in \pi_2 \setminus \pi_1$, then
$o_2 \not\to o_1 \Rightarrow \exists \pi_3 \in \Pi : (\pi_1 \subseteq \pi_3 \subseteq \pi_2) \wedge (o_1 \in \pi_3) \wedge (o_2 \notin \pi_3)$
In further explanation, suppose that we have points $\pi_1$ and $\pi_2$ such that $\pi_1$ is before $\pi_2$, and $o_1, o_2$ occurred between $\pi_1$ and $\pi_2$. Now if $o_2$ occurred later than or concurrently with $o_1$ (i.e. $o_2 \not\to o_1$), then the repletion property says that there is a point (another possible world) $\pi_3$, after $\pi_1$ and before $\pi_2$, at which it is legitimate to assert that $o_1$ has happened but $o_2$ has not.
The idea of the repletion property can perhaps be best illustrated by the simple example given in Shields' book [26]. In short, a coin is tossed (occurrence $c$) and then it either lands with heads on top (occurrence $h$) or tails on top (occurrence $t$). A behavioural presentation in this case would give $\pi_0 = \emptyset$, $\pi_1 = \{c, h\}$, $\pi_2 = \{c, t\}$ and that $c < h$, $c < t$ and $h \mathrel{\#} t$. However, there is a possible world missing, namely $\pi_3$, in which the coin has been tossed but has not landed yet. That would be $\pi_3 = \{c\}$. Therefore, $\pi_3$ is in between $\pi_0$ and $\pi_1$ (similarly, for $\pi_0$ and $\pi_2$) and $c$ has happened but $h$ has not (similarly, $c$ has happened and $t$ has not). It is situations like this that the repletion property is intended to capture. Referring back to Definition 11.5 of discrete components, repletion ensures that there are no 'gaps' in the time continuum.
Finally, to define discrete behavioural presentations we also need an ordering on subsets of behavioural presentations.
Definition 11.12. If $B = (O, \Pi, E, \lambda)$ is a behavioural presentation and $X, Y \subseteq O$, then we shall say that $X$ is left-closed in $Y$ whenever
$X \sqsubseteq Y \iff X \subseteq Y \wedge (\forall o \in X, \forall o' \in Y : o' \to o \Rightarrow o' \in X)$
This ordering relation among subsets of behavioural presentations is reminiscent of the left-closed subsets in the prime event structures model of Winskel [31].
Remark 11.2. The relation $\sqsubseteq$ is a partial order on $\wp(O)$.
Proof. Reflexivity and antisymmetry of $\sqsubseteq$ follow from reflexivity and antisymmetry of $\subseteq$. Now suppose that $X, Y, Z \subseteq O$ with $X \sqsubseteq Y$ and $Y \sqsubseteq Z$. Then, we may deduce that $X \subseteq Z$. Let $o \in X$ and $o' \in Z$ such that $o' \to o$. It suffices to show that $o' \in X$. We have that $o \in Y$ as $X \sqsubseteq Y$. Since $Y \sqsubseteq Z$, we may deduce that $o' \in Y$. So, we have $o \in X$ and $o' \in Y$ with $o' \to o$. Since $X \sqsubseteq Y$, we can conclude that $o' \in X$. Hence, we have shown that $\sqsubseteq$ is also transitive, which completes the proof. •
Now we can define discrete behavioural presentations.
Definition 11.13. A behavioural presentation $B = (O, \Pi, E, \lambda)$ will be said to be discrete if and only if, for every $\pi \in \Pi$ we have,
(1) The set of =-classes of the elements of $\pi$ is finite
(2) If $X \sqsubseteq \pi$, then $X \in \Pi$
Point (1) of the above definition asserts that only a finite number of occurrences may take place within finite time. Hence, it is a finitary condition and, with regard to Definition 11.5 of discrete component languages, it excludes infinite ascending and descending chains of occurrences of events. By examining Definition 11.11 it can be seen that a non-replete behavioural presentation may have some points missing. Definition 11.12 says that such points must lie under existing points. Point (2) of the above definition then guarantees inclusion of those points and thus ensures that there will be no points missing. With regard to Definition 11.5 of discrete components, this ensures that there are no 'gaps' in the time continuum. Note that point (2) essentially reflects Definition 11.12, so that a discrete behavioural presentation is one that is left-closed and additionally satisfies the finitary condition (point (1) of Definition 11.13). Finally, note that since $\emptyset \sqsubseteq \pi$ for all $\pi \in \Pi$, it follows, again from point (2) of Definition 11.13, that discrete behavioural presentations have bottom elements. This is the initial point in which nothing has happened yet, in the sense of discrete component languages (Definition 11.5).
It might be worth adding a small note to clarify the terminology used so far. Discreteness and local left-closure in component languages are defined as two separate properties. Discreteness in behavioural presentations includes the notion of a left-closed behavioural presentation. Hence, a discrete component language is not necessarily locally left-closed, whereas a discrete behavioural presentation is always left-closed. The obvious connotations of the naming are intentional. Well-behavedness (i.e. discreteness and local left-closure) of component languages manifests itself in discrete behavioural presentations, as we will see in the sequel.
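The coin example can be checked mechanically on finite sets of points. The following is only a sketch under stated assumptions: points are modelled as Python frozensets of occurrence names, and the relation $o \rightarrow o'$ is read as "every point containing $o'$ also contains $o$" (an assumed reading, since the definition of the relation is not repeated here). The function names are illustrative, not part of the framework.

```python
from itertools import combinations, product


def precedes(o1, o2, points):
    """Assumed reading of o1 -> o2: every point containing o2 contains o1."""
    return all(o1 in p for p in points if o2 in p)


def is_replete(points):
    """Check the repletion condition (Definition 11.11) on a finite set of points."""
    for p1, p2 in product(points, repeat=2):
        if not p1 <= p2:
            continue
        for o1, o2 in product(p2 - p1, repeat=2):
            if precedes(o2, o1, points):
                continue  # the premise "o2 does not precede o1" fails
            # A separating point must hold o1 but not o2, between p1 and p2.
            if not any(p1 <= p3 <= p2 and o1 in p3 and o2 not in p3
                       for p3 in points):
                return False
    return True


def left_closed_in(x, y, points):
    """X is left-closed in Y (Definition 11.12)."""
    return x <= y and all(o2 in x for o in x for o2 in y
                          if precedes(o2, o, points))


def subsets(s):
    items = list(s)
    return (frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r))


def satisfies_point_2(points):
    """Point (2) of Definition 11.13: left-closed subsets of points are points."""
    return all(x in points for p in points for x in subsets(p)
               if left_closed_in(x, p, points))


# The coin example: toss (c), then heads (h) or tails (t).
coin = {frozenset(), frozenset({"c", "h"}), frozenset({"c", "t"})}
coin_fixed = coin | {frozenset({"c"})}  # add the missing point {c}
```

Under these assumptions `is_replete(coin)` and `satisfies_point_2(coin)` both fail because the point {c} is missing, while both hold for `coin_fixed`.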
Now we turn our attention to relating the (language part of the) component model, described in Section 11.2, to a behavioural presentation. More specifically, we describe a construction that takes a well-behaved component into a discrete behavioural presentation. The construction uses the familiar idea of taking the elements of the orders to be primes in the partial order of component vectors $(V, \leq)$. Recall that an element $x$ of a partially ordered set $(X, \leq)$ is prime if, whenever $U \subseteq X$ and $x \leq \bigsqcup U \in X$, then $x \leq u$, some $u \in U$. The set of all primes of $(X, \leq)$ will be denoted by $Pr(X)$.
In mapping a well-behaved (i.e. discrete and locally left-closed) component language onto a discrete behavioural presentation, the main challenge lies with left-closure of the corresponding behavioural presentation. The finitary condition (point (1) of Definition 11.13) that makes a left-closed behavioural presentation discrete can be guaranteed by discreteness of the component language (see Definition 11.5). In the study of left-closed behavioural presentations in Shields' book [26], which provides useful insights on the subject matter, it is shown that if $B = (O, \Pi, E, \lambda)$ is a left-closed behavioural presentation, then the poset $(\Pi, \leq)$ is prime algebraic and consistently complete, with the elements of $\downarrow o$ as primes, where $\downarrow o = \{o' \in O : o' \rightarrow o\}$. Subsequent analysis in the reports [27, 28] showed that in order to associate components with behavioural presentations we need to characterise primes in a component language and prove prime algebraicity and consistent completeness.
But, let us first describe a construction that maps a component language onto a quadruple $(O_c, \Pi_c, E_c, \lambda_c)$ which mirrors the behavioural presentation model. We start by exploiting the basic properties of the associated order-theoretic structures. Recall that the relation [...]
$\mathrm{ifs}_V(\underline{v}) = \{ i \in I_\Sigma \bullet (\mathrm{base}_V(\underline{v}))(i) < \underline{v}(i) \}$
Each primal vector $\underline{v} \in prms(V)$ represents behaviour in which an operation call (and any others simultaneous to it) has occurred on the corresponding interface(s), during the course of behaviour since that described by $\mathrm{base}_V(\underline{v})$. The key point is that at most one operation call has occurred per interface during the fragment of behaviour between $\mathrm{base}_V(\underline{v})$ and $\underline{v}$. We accordingly associate primals in $V$ with simultaneity classes of event occurrences, as we define next.
Definition 11.15. Suppose that $c = (\Sigma, V)$ is a well-behaved component and let
$O_c = \{ (\underline{v}, i) \in V \times I_\Sigma \bullet \underline{v} \in prms(V) \wedge i \in \mathrm{ifs}_V(\underline{v}) \}$
We define a function $\lambda_c : O_c \rightarrow Op \times I_\Sigma$ by
$\lambda_c(\underline{v}, i) = (m, i) \iff \exists y \in \beta_\Sigma(i)^*$ and $m \in \beta_\Sigma(i)$ such that $\underline{v}(i) = y.m$
The set $O_c$ comprises all possible occurrences of events in the behaviour of the component. As for the occurrence function, if $\lambda_c(o) = (m, i)$ then $o$ is the occurrence of an event consisting of a call to operation $m$ at interface $i$, during behaviour described by $\underline{v}$ and since that already described by $\mathrm{base}_V(\underline{v})$. Notice that since $\underline{v}(i) \neq \Lambda$ when $(\underline{v}, i) \in O_c$, there exists $y \in \beta_\Sigma(i)^*$ such that $\underline{v}(i) = y.m$. In effect, we isolate the last call out of the sequence of calls to operations at interface $i$. This is indeed the case since $\underline{v}$ is a primal vector and thus the sequence $y$ of events (operation calls) on its $i$-th coordinate will have been already described by the corresponding $\mathrm{base}_V(\underline{v})(i)$. In other words, we isolate $m$, which is what takes $\mathrm{base}_V(\underline{v})$ and 'stretches it up' to $\underline{v}$.
Next, we also need to define a set of points.
Definition 11.16. For $\underline{u} \in V$, we define
$\pi_{\underline{u}} = \{ (\underline{v}, i) \in O_c : \underline{v} \leq \underline{u} \}$
The set $\pi_{\underline{u}}$ is the set of all occurrences of events during the component behaviour represented by $\underline{u}$. The set of all sets $\pi_{\underline{u}}$, for $\underline{u} \in V$, constitutes the set of points $\Pi_c$, hence
$\Pi_c = \{ \pi_{\underline{u}} : \underline{u} \in V \}$
The structure $(O_c, \Pi_c, E_c, \lambda_c)$ is an instance of a behavioural presentation [26], where $O_c$ is the set of occurrences of events in the behaviour of the component $c = (\Sigma, V)$, $\Pi_c \subseteq \wp(O_c)$ is a set of points, $E_c$ is the set of events of the component and $\lambda_c$ is the occurrence function. Finally, $\bigcup_{\pi \in \Pi_c} \pi = O_c$.
We may now proceed to characterise primes in a well-behaved component language (in fact, discreteness suffices) as its primal vectors. This is established in the following proposition.
Proposition 11.1. Let $c = (\Sigma, V)$ be a well-behaved component and $\underline{z} \in V$ with $\underline{z} \neq \underline{\Lambda}$. Then, the following are equivalent:
(1) $\underline{z}$ is prime
(2) $\underline{z}$ is primal
Proof. See Proposition 4.2 in the report by Shields and Moschoyiannis [27], which also contains a series of useful intermediate results. □
Our remaining concerns have to do with proving prime algebraicity and consistent completeness of a well-behaved component language. In fact, consistent completeness follows from discreteness of the component language - it is inherent in the definition of discreteness as discussed before (see Definition 11.5). As for prime algebraicity, which is somewhat more tricky, we will need to invoke local left-closure, as this will allow us to construct primes 'to order'. More specifically, given a non-prime $\underline{u}$ we may 'pull it down' to an element $\underline{u}'$ of $V$ such that $\underline{u}' < \underline{u}$ by keeping some chosen coordinate fixed until we hit a prime. This entails that $Pr(\underline{u})$, defined by $Pr(\underline{u}) = \{ \underline{z} \in Pr(V) : \underline{z} \leq \underline{u} \}$, possesses an element $\underline{v}_i$ with
$\underline{v}_i(i) = \underline{u}(i)$, each $i$. We may thus argue that $(\bigsqcup Pr(\underline{u}))(i) = \underline{u}(i)$, all $i$, which is the hard direction in proving prime algebraicity of a component language.
Proposition 11.2. Let $c = (\Sigma, V)$ be a well-behaved component, then $V$ is prime algebraic with the primal vectors as primes.
Proof. Let $\underline{u} \in V$ and define $Pr(\underline{u})$ as above. We show that $\bigsqcup Pr(\underline{u}) = \underline{u}$ by proving that $\bigsqcup Pr(\underline{u}) \leq \underline{u}$ and then the reverse inequality. See Proposition 4.1 in the report [27] for the complete proof. □
The local left-closure property of components (Definition 11.6) picks up on ideas of left-closed subsets of the set of occurrences in a behavioural presentation. Consider the component $c = (\Sigma, V)$, where $I_\Sigma = \{1, 2\}$, with the set of behaviours $V = \{(\Lambda, \Lambda), (aa, \Lambda), (\Lambda, bb), (aa, bb)\}$. It can be shown that the component language $V$ constitutes a finite lattice, so $c$ is discrete. However, the corresponding behavioural presentation would have the counterintuitive property that although four operation calls have occurred there are only two elements in $O_c$ to describe them, namely $(aa, \Lambda)$ and $(\Lambda, bb)$. This is because the two primal vectors represent the occurrence of the second of the operation calls on each interface. It is for this reason that we require local left-closure of the component language $V$ of a component. In fact, 'local' comes from the fact that it is applied to each interface of the component (i.e. to each coordinate of a component vector in $V$).
Example 11.3. We extend Examples 11.1 and 11.2 with a sequence of demonstrations that will allow us to use a behavioural presentation to model the potential behaviour of the CMenu component. First, for each element $\underline{v}$ in $V$ of Example 11.1 we determine its corresponding set $\mathrm{pre}_V(\underline{v})$. Recall that this is the set of component vectors that lie immediately underneath $\underline{v}$.
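$\mathrm{pre}_V(\Lambda, \Lambda, \Lambda) = \emptyset$
$\mathrm{pre}_V(a_1, \Lambda, \Lambda) = \{(\Lambda, \Lambda, \Lambda)\}$
$\mathrm{pre}_V(\Lambda, b_1, \Lambda) = \{(\Lambda, \Lambda, \Lambda)\}$
$\mathrm{pre}_V(a_1a_2, \Lambda, \Lambda) = \{(a_1, \Lambda, \Lambda)\}$
$\mathrm{pre}_V(\Lambda, b_1b_2, \Lambda) = \mathrm{pre}_V(\Lambda, b_1b_3, \Lambda) = \{(\Lambda, b_1, \Lambda)\}$
$\mathrm{pre}_V(a_1, b_1, \Lambda) = \{(a_1, \Lambda, \Lambda), (\Lambda, b_1, \Lambda)\}$
$\mathrm{pre}_V(a_1, b_1b_2, \Lambda) = \{(a_1, b_1, \Lambda), (\Lambda, b_1b_2, \Lambda)\}$
$\mathrm{pre}_V(a_1a_2, b_1, \Lambda) = \{(a_1a_2, \Lambda, \Lambda), (a_1, b_1, \Lambda)\}$
$\mathrm{pre}_V(a_1a_2, \Lambda, c_1) = \{(a_1a_2, \Lambda, \Lambda)\}$
$\mathrm{pre}_V(a_1a_2, b_1, c_1) = \{(a_1a_2, \Lambda, c_1), (a_1a_2, b_1, \Lambda)\}$
$\mathrm{pre}_V(a_1, b_1b_2b_3, \Lambda) = \{(a_1, b_1b_2, \Lambda)\}$
All $\underline{v}$ that have a unique vector in $V$ immediately below them comprise the set $prms(V)$,
$prms(V) = \{(a_1, \Lambda, \Lambda), (\Lambda, b_1, \Lambda), (a_1a_2, \Lambda, \Lambda), (\Lambda, b_1b_2, \Lambda), (\Lambda, b_1b_3, \Lambda), (a_1a_2, \Lambda, c_1), (a_1, b_1b_2b_3, \Lambda)\}$
Next, we associate the behaviour vectors in $prms(V)$ with the interfaces to which the last call to an operation occurred. In effect, we are applying the definition of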
$O_c$ to each primal element in the component language of the CMenu component. The following occurrences make up the set $O_c$ for the component.
$o_1 = ((a_1, \Lambda, \Lambda), ISearchFre)$
$o_2 = ((\Lambda, b_1, \Lambda), IFineTune)$
$o_3 = ((a_1a_2, \Lambda, \Lambda), ISearchFre)$
$o_4 = ((\Lambda, b_1b_2, \Lambda), IFineTune)$
$o_5 = ((\Lambda, b_1b_3, \Lambda), IFineTune)$
$o_6 = ((a_1a_2, \Lambda, c_1), IDetectSignal)$
$o_7 = ((a_1, b_1b_2b_3, \Lambda), IFineTune)$
The occurrence function $\lambda_c$ can also be used to determine exactly which operation call, and on which interface, each occurrence refers to.
$\lambda_c(o_1) = (a_1, ISearchFre)$
$\lambda_c(o_2) = (b_1, IFineTune)$
$\lambda_c(o_3) = (a_2, ISearchFre)$
$\lambda_c(o_4) = (b_2, IFineTune)$
$\lambda_c(o_5) = (b_3, IFineTune)$
$\lambda_c(o_6) = (c_1, IDetectSignal)$
$\lambda_c(o_7) = (b_3, IFineTune)$
The occurrence function is to be read in conjunction with the corresponding occurrence given earlier. For instance, $\lambda_c(o_7) = (b_3, IFineTune)$ implies that $o_7$ is the occurrence of a call to operation $b_3$ on interface IFineTune. This is also the case for $\lambda_c(o_5)$ (with $\lambda_c(o_5) = (b_3, IFineTune)$). However, $\lambda_c(o_5)$ is associated with $o_5 = ((\Lambda, b_1b_3, \Lambda), IFineTune)$, which means that $o_5$ refers to the occurrence of $b_3$ when $b_1$ has happened earlier, whereas $\lambda_c(o_7) = (b_3, IFineTune)$ is associated with $o_7 = ((a_1, b_1b_2b_3, \Lambda), IFineTune)$ and refers to the occurrence of $b_3$ on IFineTune but only after the component has experienced calls to operations $a_1$, $b_1$ and $b_2$.
Based on Definition 11.16, we may now obtain the sets of all occurrences of calls to operations during the course of behaviour in question.
$\pi_{(\Lambda, \Lambda, \Lambda)} = \emptyset$
$\pi_{(a_1, \Lambda, \Lambda)} = \{o_1\}$
$\pi_{(\Lambda, b_1, \Lambda)} = \{o_2\}$
$\pi_{(a_1a_2, \Lambda, \Lambda)} = \{o_1, o_3\}$
$\pi_{(\Lambda, b_1b_2, \Lambda)} = \{o_2, o_4\}$
$\pi_{(\Lambda, b_1b_3, \Lambda)} = \{o_2, o_5\}$
$\pi_{(a_1, b_1, \Lambda)} = \{o_1, o_2\}$
$\pi_{(a_1a_2, \Lambda, c_1)} = \{o_1, o_3, o_6\}$
$\pi_{(a_1, b_1b_2, \Lambda)} = \{o_1, o_2, o_4\}$
$\pi_{(a_1, b_1b_2b_3, \Lambda)} = \{o_1, o_2, o_4, o_7\}$
$\pi_{(a_1a_2, b_1, c_1)} = \{o_1, o_2, o_3, o_6\}$
For instance, $\pi_{(a_1a_2, \Lambda, c_1)} = \{o_1, o_3, o_6\}$ contains the occurrence of event $a_1$ at interface ISearchFre (i.e. $o_1$), the occurrence of event $a_2$ at interface ISearchFre (i.e. $o_3$) and the occurrence of event $c_1$ at interface IDetectSignal (i.e. $o_6$). Occurrence of all three events comprises the behaviour of the component represented by $(a_1a_2, \Lambda, c_1)$.
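The steps of Example 11.3 can be replayed on a finite component language. The sketch below is illustrative only: it assumes vectors are tuples of event tuples ordered by coordinatewise prefix, takes the language V to be the twelve vectors that appear in the listing of immediate predecessors above, and uses hypothetical function names (`pre`, `primals`, `occurrences`, `lam`, `point`) for the sets and functions of Definitions 11.15 and 11.16.

```python
IFACES = ("ISearchFre", "IFineTune", "IDetectSignal")
E = ()  # the empty sequence (written Lambda in the text)

# Component language of the CMenu example, read off the predecessor listing.
V = {
    (E, E, E),
    (("a1",), E, E),
    (E, ("b1",), E),
    (("a1", "a2"), E, E),
    (E, ("b1", "b2"), E),
    (E, ("b1", "b3"), E),
    (("a1",), ("b1",), E),
    (("a1",), ("b1", "b2"), E),
    (("a1", "a2"), ("b1",), E),
    (("a1", "a2"), E, ("c1",)),
    (("a1", "a2"), ("b1",), ("c1",)),
    (("a1",), ("b1", "b2", "b3"), E),
}


def leq(u, v):
    """Coordinatewise prefix order on vectors."""
    return all(v[k][:len(u[k])] == u[k] for k in range(len(u)))


def pre(v, lang):
    """Vectors of lang lying immediately underneath v."""
    below = [u for u in lang if u != v and leq(u, v)]
    return {u for u in below
            if not any(w != u and leq(u, w) for w in below)}


def primals(lang):
    """Vectors with a unique vector immediately below them."""
    return {v for v in lang if len(pre(v, lang)) == 1}


def occurrences(lang, ifaces):
    """Pairs (v, i) with v primal and i an interface on which v extends its base."""
    occ = set()
    for v in primals(lang):
        (base,) = pre(v, lang)
        for k, name in enumerate(ifaces):
            if base[k] != v[k]:  # base[k] is then a strict prefix of v[k]
                occ.add((v, name))
    return occ


def lam(occ, ifaces):
    """Occurrence function: the last operation call on the interface."""
    v, name = occ
    return (v[ifaces.index(name)][-1], name)


def point(u, lang, ifaces):
    """All occurrences (v, i) with v below u."""
    return {(v, i) for (v, i) in occurrences(lang, ifaces) if leq(v, u)}


# The two-interface language discussed before Example 11.3.
V2 = {(E, E), (("a", "a"), E), (E, ("b", "b")), (("a", "a"), ("b", "b"))}
```

With these assumptions, `primals(V)` has seven elements and `occurrences(V, IFACES)` yields the seven occurrences listed in the example, while `occurrences(V2, ("1", "2"))` yields only two occurrences for four operation calls, which is the situation local left-closure is meant to rule out.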
Referring back to Definition 11.9, we have constructed the sets $O_c$ and $\Pi_c$ for the CMenu component. Finally, from the set of all occurrences of events given above and based on Definition 11.10 we extract the temporal relations among occurrences of events for the CMenu component shown in Table 11.1.
Table 11.1. Ordering on occurrences of events of the CMenu component
for o1:  o1 co o2   o1 < o3    o1 [?] o4   o1 [?] o5   o1 < o6   o1 < o7
for o2:  o2 [?] o3  o2 < o4    o2 < o5     o2 [?] o6   o2 < o7
for o3:  o3 [?] o4  o3 [?] o5  o3 < o6     o3 [?] o7
for o4:  o4 [?] o5  o4 [?] o6  o4 < o7
for o5:  o5 [?] o6  o5 [?] o7
for o6:  o6 [?] o7
As an illustration of how these temporal relations are obtained we examine the relation between $o_3$ and $o_6$. Occurrence $o_6$ refers to a call to operation $c_1$ at interface IDetectSignal, but only when calls to operations $a_1$ and then $a_2$ at ISearchFre have preceded it. Occurrence of a call to operation $a_1$ followed by $a_2$ at the same interface is precisely occurrence $o_3$. Thus, $o_3$ strictly precedes $o_6$ since $a_1$ and $a_2$ must have occurred at interface ISearchFre (this is $o_3$) before $c_1$ can occur at interface IDetectSignal (this is $o_6$).
Fig. 11.6. Behavioural presentation model for the CMenu component
A behavioural model of the CMenu component can be seen in Figure 11.6 which depicts the temporal relations among occurrences of events, as these are experienced by the component on its interfaces.
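The way temporal relations are read off the points can be sketched as follows. This is a sketch under assumptions rather than a restatement of Definition 11.10, which is not repeated in this section: precedence is taken to mean "every point containing the later occurrence contains the earlier one", concurrency ("co") to mean that the occurrences share a point without either preceding the other, and conflict ("#") to mean that no point contains both. The points are the eleven sets listed in Example 11.3, and the function names are illustrative. The sketch is not claimed to reproduce Table 11.1 entry by entry.

```python
# Points of the CMenu example, as listed after Definition 11.16 is applied.
POINTS = [
    set(),
    {"o1"}, {"o2"},
    {"o1", "o3"}, {"o2", "o4"}, {"o2", "o5"},
    {"o1", "o2"},
    {"o1", "o3", "o6"},
    {"o1", "o2", "o4"},
    {"o1", "o2", "o4", "o7"},
    {"o1", "o2", "o3", "o6"},
]


def weakly_precedes(a, b, points=POINTS):
    """Assumed reading of a -> b: every point containing b contains a."""
    return all(a in p for p in points if b in p)


def relation(a, b, points=POINTS):
    """Classify a pair of occurrences under the assumed definitions."""
    ab = weakly_precedes(a, b, points)
    ba = weakly_precedes(b, a, points)
    if ab and ba:
        return "="   # simultaneous: they occur in exactly the same points
    if ab:
        return "<"   # a strictly precedes b
    if ba:
        return ">"   # b strictly precedes a
    if any(a in p and b in p for p in points):
        return "co"  # concurrent: they can both occur, in either order
    return "#"       # conflict: no point contains both


if __name__ == "__main__":
    names = [f"o{k}" for k in range(1, 8)]
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(a, relation(a, b), b)
```

For the pair examined in the text, `relation("o3", "o6")` gives `"<"`, in line with the argument that $o_3$ strictly precedes $o_6$.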
11.4. Composition of Components
Up to this point we have been concerned with a single software component. We described a component model and identified conditions on component behaviours which enabled the link to a behavioural model, namely that of behavioural presentations [25]. As a result, component behaviour can be modelled using behavioural presentations. The major challenge in a component setting is that of understanding the consequences of interconnecting components. In this section, we give an overview of the
notion of composition of components within our framework. We shall not discuss composition in great detail, though, as it is not essential for the understanding of the current paper. A more rigorous treatment of composition can be found in the article by Moschoyiannis and Shields [20], while a previous paper [19] outlines the key technical results. The main purpose of this section is to show that the mathematical framework we have been concerned with so far is indeed compositional. In particular, we consider an operation on components which takes a set of components and combines them by allowing communication between their provided and required interfaces. The basic concept behind composition, as this is considered in our framework, is the following. If component $c_1$ provides interface $i$ and component $c_2$ requires interface $i$, then a behaviour of $c_1$ and a behaviour of c
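• $P_\Sigma = (P_{\Sigma_1} \cup P_{\Sigma_2}) \setminus (R_{\Sigma_1} \cup R_{\Sigma_2})$
• $R_\Sigma = (R_{\Sigma_1} \cup R_{\Sigma_2}) \setminus (P_{\Sigma_1} \cup P_{\Sigma_2})$
• $\beta_\Sigma(i) = \beta_{\Sigma_j}(i)$ wherever $i \in I_{\Sigma_j}$, $j = 1, 2$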
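The following lemma establishes that $\Sigma_1 \oplus \Sigma_2$ is indeed a sort whenever $\Sigma_1$ and $\Sigma_2$ are consistent sorts.
Lemma 11.1. Suppose that $\Sigma_1$, $\Sigma_2$ are consistent sorts, then $\Sigma_1 \oplus \Sigma_2$ is a sort.
Proof. Let $\Sigma_1 \oplus \Sigma_2 = \Sigma$. We first prove that $\beta_\Sigma$ is a well defined function. Since $I_\Sigma \subseteq I_{\Sigma_1} \cup I_{\Sigma_2}$, it suffices to show that if $i \in I_{\Sigma_1} \cap I_{\Sigma_2}$, then $\beta_{\Sigma_1}(i) = \beta_{\Sigma_2}(i)$. But this is precisely point (3) of Definition 11.17. Finally, we note that,
$P_\Sigma \cap R_\Sigma = ((P_{\Sigma_1} \cup P_{\Sigma_2}) \setminus (R_{\Sigma_1} \cup R_{\Sigma_2})) \cap ((R_{\Sigma_1} \cup R_{\Sigma_2}) \setminus (P_{\Sigma_1} \cup P_{\Sigma_2})) = \emptyset$
which completes the proof. □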
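The set-theoretic part of the sort composition used in the proof above can be sketched as follows. This is a deliberately reduced illustration: a sort is represented only by its sets of provided and required interface names, while the operation assignment $\beta_\Sigma$ and the consistency conditions of Definition 11.17 are left out. The function name and the interface names in the example are hypothetical.

```python
def compose_sorts(p1, r1, p2, r2):
    """Provided and required interface sets of the composite sort.

    p1, r1: provided / required interfaces of the first sort
    p2, r2: provided / required interfaces of the second sort
    """
    provided = (p1 | p2) - (r1 | r2)
    required = (r1 | r2) - (p1 | p2)
    return provided, required


# Hypothetical example: the first component provides "i" and requires "j",
# the second provides "j" and requires "k".
P, R = compose_sorts({"i"}, {"j"}, {"j"}, {"k"})
```

Here the connected interface "j" disappears from the composite, and the provided and required sets of the composite are disjoint by construction, which is the observation that closes the proof of Lemma 11.1.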
As for the behaviour of the resulting composite, it is an aggregate of behaviours from each of the components at the non-connected - or better, the non-complementary - interfaces, which remain essentially unaffected, and of its connected interfaces, on which the sequences of events must agree. We motivate the definition as follows. In any behaviour of the composite system, each component $c_j$ will have engaged in a piece of behaviour $\underline{v}_j$. If $i$ is a complementary interface of components $c_j$ and $c_k$, then it will be a provided interface of one and a required interface of the other. Without loss of generality, suppose it is an interface provided by $c_j$ and required by $c_k$. Then, $\underline{v}_j(i)$ represents the sequence of operations sent from $c_j$ to $c_k$, which (assuming no delays) is precisely $\underline{v}_k(i)$. In the following definition, $I_{\Sigma_1} \triangle I_{\Sigma_2}$ is the symmetric difference of $I_{\Sigma_1}$ and $I_{\Sigma_2}$, defined to be $(I_{\Sigma_1} \setminus I_{\Sigma_2}) \cup (I_{\Sigma_2} \setminus I_{\Sigma_1})$.
Definition 11.19. Suppose that $c_1 = (\Sigma_1, V_1)$ and $c_2 = (\Sigma_2, V_2)$ are components and let $I_{\Sigma_1}$, $I_{\Sigma_2}$ be their sets of interfaces, respectively. We shall say that vectors $\underline{u}_1 \in V_1$ and $\underline{u}_2 \in V_2$ are consistent, and we write
$\underline{u}_1 \downarrow \underline{u}_2 \iff \underline{u}_1 \lceil (I_{\Sigma_1} \cap I_{\Sigma_2}) = \underline{u}_2 \lceil (I_{\Sigma_1} \cap I_{\Sigma_2})$
where $f \lceil X$ denotes the restriction of function $f$ to the set $X$, in which case we define,
$\underline{u}_1 \oplus \underline{u}_2 = (\underline{u}_1 \cup \underline{u}_2) \lceil (I_{\Sigma_1} \triangle I_{\Sigma_2})$
where $\underline{u}_1 \cup \underline{u}_2$, defined on $I_{\Sigma_1} \cup I_{\Sigma_2}$, satisfies
$(\underline{u}_1 \cup \underline{u}_2)(i) = \underline{u}_1(i)$ if $i \in I_{\Sigma_1}$, and $(\underline{u}_1 \cup \underline{u}_2)(i) = \underline{u}_2(i)$ if $i \in I_{\Sigma_2}$
which is well defined if $\underline{u}_1 \downarrow \underline{u}_2$. This defines the operation $\oplus$ of composition on component languages.
Two component vectors can be composed following Definition 11.19 provided that they are consistent. Consistency basically ensures that the vectors from each language agree on complementary interfaces. There is no point in considering vectors whose
sequences of events on shared interfaces do not agree. In the resulting composite vector $\underline{u}_1 \oplus \underline{u}_2$ the coordinates that correspond to complementary interfaces are removed - notice the restriction to $I_{\Sigma_1} \triangle I_{\Sigma_2}$ in the definition of $\underline{u}_1 \oplus \underline{u}_2$. The remaining coordinates are essentially unaffected by composition and the components exhibit the same behaviour as that prior to composition, on the corresponding interfaces. Having composed their sorts (following Definition 11.18) and their component vectors (following Definition 11.19), we can now give the formal definition of composition of components.
Definition 11.20. Suppose that $c_1 = (\Sigma_1, V_1)$ and $c_2 = (\Sigma_2, V_2)$ are components. Then, we define their composition $c_1 \oplus c_2 = (\Sigma, V)$ where,
• $\Sigma = \Sigma_1 \oplus \Sigma_2$
• $V = V_1 \oplus V_2$ where, $V_1 \oplus V_2 = \{ \underline{v} \in V_\Sigma \mid \exists \underline{u}_1 \in V_1, \exists \underline{u}_2 \in V_2 : \underline{u}_1 \downarrow \underline{u}_2 \wedge \underline{v} = \underline{u}_1 \oplus \underline{u}_2 \}$
Informally, the definition says that the static structure of the composite is formed from those of the components, while the dynamics reflect the fact that behaviours of the composite comprise component vectors from each of the components on non-connected interfaces so long as they are consistent on the shared interfaces. Based on the above definitions, it can be shown that $c_1 \oplus c_2$ is a component whenever $c_1$, $c_2$ are consistent components. This is formally put in the following lemma.
Lemma 11.2. Suppose that $c_1$, $c_2$ are consistent components, then $c = c_1 \oplus c_2$ is a component.
Proof. Let $c = c_1 \oplus c_2 = (\Sigma, V)$, hence $\Sigma = \Sigma_1 \oplus \Sigma_2$ and $V = V_1 \oplus V_2$. Then, $\Sigma$ is a sort by Lemma 11.1 and $V \subseteq V_\Sigma$ by definition. □
The operation $\oplus$ of composition of components is commutative - notice that Definitions 11.17, 11.18, 11.19, 11.20 are all symmetric in $\Sigma_1$, $\Sigma_2$ or $c_1$, $c_2$ - and associative [20]. This makes it possible to build systems from generic components: we can take two components, put them together, and the resulting composite can then be further composed with another component or another composite.
The work presented in the related article [20] also deals with the all-important issue of preservation of well-behavedness (i.e. discreteness and local left-closure properties) under the operation of composition. It can be shown that under certain conditions, captured by the notion of compatibility between components [20], the composition of well-behaved components results in a well-behaved composite component. Local left-closure of the resulting composite is relatively straightforward. Discreteness requires further analysis of the interaction between $\sqcup$ and $\downarrow$ in the corresponding component languages. This is useful in showing that vectors bounded above in the composite component language have their least upper bound and greatest lower bound in it.
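The composition of component vectors can be sketched on small examples. In the sketch below, which is an illustration under assumptions and not the authors' formulation, a component vector is a Python dict from interface names to tuples of events, consistency is agreement on the shared interfaces, and the composite keeps only the interfaces in the symmetric difference. The membership condition in $V_\Sigma$ from Definition 11.20 is not checked, and all names (`consistent`, `compose_vectors`, `compose_languages`, the interfaces "i", "j", "k") are hypothetical.

```python
def consistent(u1, u2):
    """u1 and u2 agree on every interface they share."""
    return all(u1[i] == u2[i] for i in u1.keys() & u2.keys())


def compose_vectors(u1, u2):
    """Union of the two vectors restricted to the non-shared interfaces."""
    if not consistent(u1, u2):
        raise ValueError("vectors disagree on a shared interface")
    union = {**u1, **u2}
    keep = u1.keys() ^ u2.keys()  # symmetric difference of the interface sets
    return {i: union[i] for i in keep}


def compose_languages(lang1, lang2):
    """All compositions of consistent pairs, without duplicates."""
    result = []
    for u1 in lang1:
        for u2 in lang2:
            if consistent(u1, u2):
                v = compose_vectors(u1, u2)
                if v not in result:
                    result.append(v)
    return result


# Hypothetical components connected on interface "j".
V1 = [
    {"i": (), "j": ()},
    {"i": ("a",), "j": ()},
    {"i": ("a",), "j": ("m",)},
]
V2 = [
    {"j": (), "k": ()},
    {"j": ("m",), "k": ()},
    {"j": ("m",), "k": ("x",)},
]
COMPOSITE = compose_languages(V1, V2)
```

In this example the connected interface "j" no longer appears in the composite vectors, and only pairs whose histories on "j" coincide contribute a behaviour.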
The fact that well-behavedness is preserved under composition of components allows us to infer prime algebraicity of the composite component language (based on Proposition 11.2). In this way, the resulting well-behaved composite component can be readily associated with a discrete behavioural presentation. Therefore, its behaviour can again be described using the powerful model of true concurrency presented in Section 11.3. The interested reader is referred to the paper [19] or the article [20] for further details of the results establishing the algebraic properties of the operation $\oplus$ of composition as well as the issue of preservation of well-behavedness under composition.
11.5. Conclusions and Future Work
In this paper we presented a mathematical model for formalising software components based on the use of tuples of sequences of events to model component behaviour. These sets of tuples can be used to model the behaviour of components provided they are well-behaved, that is, they satisfy two conditions, namely discreteness and local left-closure. In that case, our component model corresponds to an order-theoretic model of behaviours with desirable properties. In this paper, the presentation of the mathematical framework has been mostly concerned with the behaviour of a single component. The fact that the framework is compositional was briefly discussed. Component composition is dealt with in the paper [19] and further in the article by Moschoyiannis and Shields [20], which contains an intensive study of the algebraic properties of the operation of composition within the framework.
The overall objective of the work presented in this paper is the application of mathematical methods in enhancing the design of component-based systems. Our approach relates to both building a component (how to specify a component taking into account structural and behavioural aspects) and reuse (how can the component be reused and combined with others, how can it be operated, what are the expected sequences of operations). In creating a component, a designer would be expected to specify the desired set of behaviours. Using our model, and in checking against discreteness and local left-closure, we identify possible missing behaviours and refine the set of behaviours for the component (i.e. additional vectors might be included in the set to make the component well-behaved; such vectors might indicate pathological behaviour, leading to refinement of the initial design).
The refined set of behaviours then describes what the component does as well as the ordering of calls to operations, which needs to be respected whenever the component is deployed in different configurations.
Probably the closest to our model is the algebraic specification model of Broy [1, 2]. The use of tuples of sequences to model component behaviour is reminiscent of the use of streams in Broy's approach to represent messages communicated along the channels of a component. Essentially, the set-up of our model is quite similar to that of Broy [1, 2], where a software component is modelled as an input/output function transferring input streams to the set of possible output streams. Semantically, a component in Broy's model is represented by a predicate defining a set of behaviours
where each behaviour is represented by a stream processing function [2]. In this respect, we take a different approach since our model is based on the order-theoretic structure of the set of behaviours of a component and is then related to behavioural presentations, which provide a denotational semantics expressive enough to capture non-determinism, concurrency and simultaneity.
Another approach to formalising software components is that of Küster-Filipe [9, 11], which describes a distributed logical framework for components and their composition. The initial set-up is quite different from our model since Küster-Filipe introduces a distributed temporal logic, MDTL, for specifying components. The semantics of MDTL is based on event structures which, similarly to behavioural presentations, can capture non-determinism and concurrency, but by contrast cannot express simultaneity [26].
At the design level a software engineer could envisage using UML [23] to specify component-based systems. Even though UML was not developed for component-based design, some of its notation can be useful (as shown for example by Cheesman and Daniels [3]). In particular, UML includes a constraint language called Object Constraint Language (OCL) which can be used for describing component contracts, mostly in terms of pre- and postconditions on interface operations [16]. OCL 1.x is still essentially a static language and lacks the appropriate expressiveness to capture provided/required dependencies, which are essential for describing component contracts precisely [29]. This is tackled in work by Küster-Filipe [11] and Küster-Filipe et al. [10] using a Catalysis-like notation [7] for describing component interactions and component frameworks, respectively. Work on increasing the expressive power of OCL is in progress, and possible correspondence to the temporal relations derived from behavioural presentations needs to be further investigated.
In any case, OCL 2.0 will have added expressive power and consequently provide more useful notation for describing some component contracts.
Finally, one natural extension of our work is to associate component behaviour with automata. This transition is straightforward because behavioural presentations give rise to a certain class of automata, building on consequences of discreteness and local left-closure. Preliminary work suggests that we may take vectors appearing in a component language as states and define transitions in a way that reflects the observation that behaviours may be seen to be built up from the empty vector $\underline{\Lambda}$ by repeatedly concatenating 'event vectors' to it. These are vectors in which each coordinate is either a single event or the empty sequence. An obvious advantage of automata is that they would make our formal approach to components amenable to automated verification and consequently tool support. Such considerations are, however, a subject for further work.
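The automaton view suggested above can be illustrated on a toy language. The sketch below is only one possible reading of the informal description and not a construction taken from the cited work: states are the vectors of a finite language, and there is a transition from u to v labelled by an event vector e whenever v is u with e concatenated coordinatewise, where each coordinate of e is a single event or empty and e is not entirely empty. The function names and the toy language are hypothetical.

```python
def event_step(u, v):
    """Return the event vector e with v = u.e, or None if there is none."""
    label = []
    for su, sv in zip(u, v):
        if sv[:len(su)] != su or len(sv) - len(su) > 1:
            return None  # not an extension by at most one event
        label.append(sv[len(su):])
    if all(len(e) == 0 for e in label):
        return None      # the empty event vector is not a step
    return tuple(label)


def transitions(lang):
    """All labelled transitions (u, e, v) between vectors of lang."""
    result = set()
    for u in lang:
        for v in lang:
            e = event_step(u, v)
            if e is not None:
                result.add((u, e, v))
    return result


E = ()
A, B = ("a",), ("b",)
# Toy language over two interfaces: one call on each, in any order or together.
TOY = {(E, E), (A, E), (E, B), (A, B)}
STEPS = transitions(TOY)
```

In the toy language the empty vector has three outgoing transitions, one of which carries the event vector with both coordinates non-empty and so corresponds to the two calls occurring simultaneously.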
Bibliography
1. M. Broy. Advanced Component Interface Specification. In Proceedings of Theory and Practice of Parallel Programming (TPPP '94), volume 307, pages 369-392. Lecture Notes in Computer Science, 1995.
2. M. Broy. Algebraic Specification of Reactive Systems. Theoretical Computer Science, 239(2000):3-40, 2000.
3. J. Cheesman and J. Daniels. UML Components. Component Software Series, Addison Wesley, 2001.
4. W. Damm and D. Harel. LSCs: Breathing Life into Message Sequence Charts. Formal Methods in System Design, 19(1):45-80, 2001.
5. B. A. Davey and H. A. Priestley. Introduction to Lattices and Order. Cambridge Mathematical Textbooks, Cambridge University Press, 1990.
6. L. de Alfaro and T. Henzinger. Interface Automata. In Proceedings of Foundations of Software Engineering (FSE'01), pages 109-120. ACM Press, 2001.
7. D. F. D'Souza and A. C. Wills. Objects, Components and Frameworks with UML: The Catalysis Approach. Object Technology Series. Addison Wesley, 1999.
8. C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
9. J. Küster-Filipe. Fundamentals of a Module Logic for Distributed Object Systems. Journal of Functional and Logic Programming, 2000(3), March 2000.
10. J. Küster-Filipe, K. K. Lau, M. Ornaghi, K. Taguchi, H. Yatsu, and A. C. Wills. Formal Specification of Catalysis Frameworks. In Proceedings of the 7th Asia-Pacific Software Engineering Conference, pages 180-187. IEEE Computer Society Press, 2000.
11. J. Küster-Filipe. A Logic-Based Formalization for Component Specification. Journal of Object Technology, special issue: TOOLS USA 2002 Proceedings, 1(3):231-248, 2002.
12. J. Küster-Filipe. Giving Life to Agent Interactions. In M. Ryan, J.-J. Meyer and H.-D. Ehrich, editors, Agents, Objects and Features: Structuring mechanisms for contemporary software, LNCS 2975, pages 98-116. Springer, 2004.
13. J. Küster-Filipe. Modelling Concurrent Interactions. In C. Rattray, S. Maharaj and C. Shankland, editors, Proceedings of Algebraic Methodology and Software Technology (AMAST 2004), LNCS 3116, pages 304-318. Springer, 2004.
14. K. K. Lau. Component Certification and System Prediction: Is there a Role for Formality? In Proceedings of ICSE'01, 4th International Workshop on Component-Based Software Engineering, Toronto, Canada, 2001.
15. F. Lüders and K. K. Lau. Specification of Software Components. In I. Crnkovic and M. Larsson, editors, Building Reliable Component-Based Software Systems, pages 23-38. Artech House, 2002.
16. B. Meyer. Object-Oriented Software Construction. Prentice Hall, 1997.
17. B. Meyer. Applying design by contract. IEEE Computer, 25(10):40-51, October 1992.
18. A. J. R. Milner. Communication and Concurrency. Prentice Hall, 1989.
19. S. Moschoyiannis and M. W. Shields. Component-Based Design: Towards Guided Composition. In Proceedings of Application of Concurrency to System Design (ACSD'03), pages 122-131. IEEE Computer Society, 2003.
20. S. Moschoyiannis and M. W. Shields. A Set-Theoretic Framework for Component Composition. Fundamenta Informaticae, 59(4):373-396, 2004.
21. S. Moschoyiannis. Generating Snapshots of a Component Setting. In Proceedings of ETAPS 2004 workshop on Formal Foundations of Embedded Software and Component-Based Software Architectures (FESCA '04), ENTCS 108, pages 83-98. Elsevier, 2004.
22. M. Nielsen, G. Plotkin, and G. Winskel. Petri Nets, Event Structures and Domains, part 1. Theoretical Computer Science, 13:85-108, 1981.
23. OMG. Unified Modelling Language Specification, version 1.5. OMG document formal/01-03-01, available from http://www.omg.org, March 2003.
24. H. W. Schmidt and R. H. Reussner. Generating Adapters for Concurrent Component Protocol Synchronisation. In Proceedings of 5th IFIP Conference on Formal Methods for Open Object-Based Distributed Systems, pages 213-229, Kluwer B. V., 2002.
25. M. W. Shields. Behavioural Presentations. In de Bakker, de Roever, and Rozenberg, editors, Linear Time, Branching Time and Partial Orders in Logics and Models for Concurrency, LNCS 354, pages 673-689. Springer-Verlag, 1988.
26. M. W. Shields. Semantics of Parallelism. Springer-Verlag London, 1997.
27. M. W. Shields and S. Moschoyiannis. Primes in Component Languages. Technical Report SCOMP-TC-01-04, Department of Computing, University of Surrey, 2004.
28. M. W. Shields and D. Pitt. Component-Based Systems I: Theory of a Single Component. Technical Report SCOMP-TC-01-01, Department of Computing, University of Surrey, 2001.
29. C. Szyperski. Component Software: Beyond Object-Oriented Programming. Addison Wesley, 1997.
30. R. van Ommering, F. van der Linden, J. Kramer, and J. Magee. The Koala Component Model for Consumer Electronics. IEEE Transactions on Computers, 33(3):78-85, 2000.
31. G. Winskel. An Introduction to Event Structures. In de Bakker, de Roever, and Rozenberg, editors, Linear Time, Branching Time and Partial Orders in Logics and Models for Concurrency, LNCS 354, pages 364-397. Springer-Verlag, 1988.
32. G. Winskel and M. Nielsen. Models for Concurrency. In S. Abramsky, D. Gabbay, and T. Maibaum, editors, Handbook of Logic in Computer Science, vol. 4, Semantic Modelling, pages 1-148. Oxford Science Publications, 1995.
Subject Index
abstract behavior composition, 50
abstract behavior type, 37, 46
abstract data types, 36
ABT, 37, 45, 46
action, 4, 11
active class, 163
activity diagram, 155, 174
actor, 156
ADL, 2-4
ADT, 36
alphabet, 215
analysis, 299
anamorphism, 76
architecture, 120, 212
architecture description language, 2
aspect-oriented programming, 306
aspects, 277
association, 19, 24, 98, 243
attribute, 10, 243

B, 70, 279
B-tool, 280
barrier synchronizers, 58
basic component, 4
behaviour, 73, 244, 342
behaviour tree, 174
behavioural presentation, 333
bisimulation, 73, 77
black box view, 127
business logic, 304

Cash Machine, 145
Catalysis, 239, 240
category, 81
CBT, 194
CCS, 343
channel, 52, 53, 124, 139
class, 12, 97, 243
class diagram, 96, 155
client-server, 234
coalgebra, 71, 73
coalgebraic specification, 97
code, 299
coinduction, 46, 76
COM, 207
COM.NET, 321
combinator, 86
communication, 157
communication link, 16
CommUnity, 4
CommUnity's channels, 4
compatibility, 279
complex component, 4, 5
component, 38, 123, 129, 161, 175, 224, 239, 271, 324
component architecture transformation, 180, 190
component assembly, 107
component behavior projection, 180
component behavior tree, 194
component calculus, 83
component composition, 228
component coordination, 211
component instances, 38
component model, 161, 304
component semantics, 164
component sort, 323
component system diagrams, 162
component-based programming, 208
components, 119
componentware, 70
composability, 275
composition, 255, 280
composition of components, 342
compositionality, 165, 167
concurrent combinator, 90
condition, 175
connector, 5, 52, 53, 236
consistency, 260
constant replacer, 59
constraint, 245
constraint automata, 41, 43
contract, 208, 218, 258, 259, 275
contract composition, 219
contract inheritance, 221
contract refinement, 219, 220
coordination, 52, 157
CORBA, 207, 273, 321
CSP, 157, 326

datatype, 8
DBT, 177, 190
deadlock, 62, 188
design, 177, 215, 299
design behavior tree, 177, 183, 190
design pattern, 239
design refinement, 216
dining philosophers, 49, 61
direct sum of histories, 125
domain theory, 327
dynamic aggregation, 16
dynamic configuration, 16
dynamic logic, 266
dynamic reconfiguration, 168

EJB, 321
Enterprise JavaBeans, 207
event, 175, 256
execution, 160
extension of services, 145

features, 119
Fibonacci series, 60
field, 212
final coalgebra, 76
FML, 266
Focus, 119
framework, 239, 240
framework modelling language (FML), 242
functions, 124

general contract, 222
generalisation, 99
glue code, 57

hiding operations, 214
hiding operator, 214
history, 124, 130
hook combinator, 91

inconsistency, 189
interaction, 179, 241
interface, 106, 123, 161, 209, 211
interface abstraction, 151
interface automata, 323
interface inheritance, 214
invariant, 257

labelled transition systems, 41
live-lock, 188
logic, 6
logical architectures, 120
logical specification, 130

message, 10
message sequence chart, 174
method, 212
Metropolis, 275
middleware, 273
mobile channel, 53, 167
model based development, 286
model composition, 240
model driven development, 298, 300
model synthesis, 240
model transformation, 309
modularity, 252
modularization, 119
morphism, 73
MSC, 174
multi-functional systems, 120

.NET, 273

OBJ, 280
object diagram, 102, 251
object orientation, 70, 208
OCL, 240, 266, 347
OORAM, 240
operational semantics, 158
ordering, 58

partial behaviour, 122
passive class, 163
Petri-net, 174
platform, 273
postcondition, 215, 245, 278
precondition, 178, 215, 245, 278
process calculi, 279
protocol, 157, 211
provided interface, 163, 210, 262, 323
provided services, 224, 323

QoS, 208

Raise, 279
Raise tool, 280
RBT, 177
rCOS, 207, 279
read variable, 10
refactoring, 259
refinement of services, 138
refinement rules, 231
remote procedure call, 5
rendez-vous, 157
Reo, 52
required interface, 163, 210, 262, 323
required services, 224, 323
requirements, 174, 178
requirements behavior tree, 177, 183
requirements integration, 180, 183
requirements translation, 179
role, 161

SA, 2
safety, 188
semantics of classes, 12
semantics of components, 224
sequence diagram, 155, 174
sequencer, 58
service, 123, 130
service combination, 147
service composition, 147
service control, 146
service interface, 128
services, 129
software architectures, 2
software component, 69
specification, 129, 247
specification of component, 9
specification of connectors, 15
specification of methods, 215
stack, 70
state, 175, 251
state diagram, 155
state machine, 134, 157
statechart, 174
statechart diagrams, 104
stream, 46, 123, 129
Stream Calculus, 210, 279
structured programming, 119
subsystem, 5, 16, 18
subsystem specification, 17
synchronisation, 5
syntactic interfaces, 126

temporal constraint, 266
temporal specification language, 6
termination, 215
time stream, 46
timed data stream, 45, 46
total behaviour, 122
trace semantics, 161
type, 123
types, 123

UML, 96, 155, 156, 174, 201, 214, 239, 266, 280, 299, 322, 347
unified system model, 199
use case, 103
use case diagram, 155
use cases, 121, 142
UTP, 209

variable, 215
VDM, 70

well-behaved components, 322, 328
wrapping, 85
Wright, 16
write-cue regulator, 57

Z, 70, 279
The range of component-based technology is both wide and diverse, but a common understanding is emerging through the ideas of model-based development. These ideas include the notions of interfaces, contracts, services, connectors and architectures. Key issues in applying the technology are also becoming clearer, including the consistent integration of different views of a component, component composition, component coordination, and transformation for platforms. However, we still know little about theories that support the analysis and synthesis of component-based systems.
The distinctive feature of this volume is its focus on mathematical models that identify the "core" concepts as first-class modeling elements, together with techniques for integrating and relating them. The volume contains eleven chapters by well-established researchers writing from different perspectives. Each chapter gives explicit definitions of components in terms of a set of key aspects and addresses some of the problems of integrating and analysing the various views: component specification, component composition, component coordination, and refinement and substitution. Each also presents techniques for solving these problems. The concepts and techniques are motivated and explained with the help of examples and case studies.
World Scientific
www.worldscientific.com