Fig. 4. Design flow for model-driven component/container implementation
In the first step, we define interfaces and complex types. Since an interface can be used by various components, it is crucial to define interfaces first, independent of the components. In a second step, we define components and their communication ports. Here, we also define some of the communication parameters, for example, whether the communication in a port should happen synchronously or asynchronously. Since this affects the API against which the implementation code is developed, these definitions have to be made before the application logic is implemented (manually) in step four. Before doing this, we generate the interface APIs as well as component base code, for example, base classes in OO languages or wrappers in C (see step 3). This concludes the component definition phase. In step five we define which component instances we will use, how their ports will be connected, and on which hardware devices these instances will be deployed, as well as other system constraints. All these models together are then used by the second generation step, which creates all the infrastructure code, that is, the container, the communication implementations, OS configuration files, build scripts, etc. In a final step (not shown) all the generated code will be compiled and linked using the generated build script, resulting in the final system.

Software System Families. Using a model-driven software development approach usually pays off only in the context of software system families. The various members, or products, of such a family have a spectrum of features in common, allowing systematic reuse. Specifically, the DSLs used to describe the members of the family are usually the same. In order to come up with a suitable DSL, transformer/generator, and platform (together called the domain architecture), the developers need to have a good understanding of the domain for which they develop the infrastructure.
Domain analysis techniques can help to deepen this understanding. In practice, developing a useful domain architecture happens incrementally and iteratively over a relatively long time, and it is based on experience.

Architecture. Software architecture plays an important role in the context of model-driven software development. Transformations rely on the availability of a well-defined meta-model for the source as well as for the target. They are literally rules describing how to map the concepts of the source meta-model to the concepts provided by the target meta-model. In order for transformations not to become overly complex, the concepts defined by the meta-models must be concise, precise, and limited in number. With respect to the final transformation step, the one that produces implementation code, this means that the architecture of the target platform, as well as the mapping of application concepts to this platform, needs to be clearly defined.
3 Example: A Simple Weather Station
Overview. This section aims at illustrating our approach with a more concrete example. We use a small distributed weather station for the purposes of the example. The weather station consists of several nodes (microcontrollers) connected by a bus; software components will be deployed on each of these nodes (see Fig. 5).
Fig. 5. Weather station example scenario
The example consists of three nodes, one outside node, the main node, and an inside node, connected by a bus. This could be a CAN bus, for example. In the course of this example, we will take a look at the following artifacts:
– The models necessary to describe a weather station,
– The tool chain necessary to validate the models and generate the code, and
– How code generation works in detail.
Models. We use three different models to describe the distributed embedded system of the weather station:
– a type model describes interfaces, components and their ports,
– a composition model describes component instances and how they are connected, and
– a deployment model describes the physical infrastructure and how component instances and connectors are mapped onto it.
Interfaces are specified with a textual DSL, similar to CORBA IDL. The example shown in Fig. 6 defines the Sensor interface to have three operations: start, stop, and measure. The Controller interface has a single operation, reportProblem, which sensors use to report problems with the measurement.

interface Sensor {
  operation start():void;
  operation stop():void;
  operation measure():float;
}

interface Controller {
  operation reportProblem(Sensor s, String errorDesc):void;
}

Fig. 6. Example for an interface description

Instead of a textual model, as shown in Fig. 6, we could also use a graphical model, as long as the same information is conveyed. The information included in an interface definition is described by a meta-model; the meta-model for interface definitions is given in Fig. 7. As explained above, the meta-model describes the constructs a DSL provides for building models. As one would expect, it defines interfaces as artifacts that own a number of operations, each of which has a name, a return type, a number of parameters (each with a name and a type), and a set of exceptions.
Fig. 7. Interface meta-model
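To make the meta-model of Fig. 7 a bit more tangible, the following is a minimal sketch of how such a meta-model might be represented as plain Java classes inside a generator. The class and field names are illustrative assumptions, not the actual generator API described later in this chapter.

import java.util.ArrayList;
import java.util.List;

// Illustrative meta-classes mirroring Fig. 7: an Interface owns an ordered
// list of Operations; each Operation has a name, a return type, a list of
// Parameters (each with a name and a type) and a list of Exceptions.
class InterfaceDef {
    String name;
    final List<OperationDef> operations = new ArrayList<>();   // {ordered}
}

class OperationDef {
    String name;
    String type;                                               // return type
    final List<ParameterDef> parameters = new ArrayList<>();
    final List<ExceptionDef> exceptions = new ArrayList<>();
}

class ParameterDef {
    String name;
    String type;
}

class ExceptionDef {
    String type;
}

A parser front-end for the textual DSL would populate instances of such classes; the chapter's actual meta-classes (shown later in Fig. 14) additionally inherit from the UML meta-model.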
The next step in describing a component-based system is the component model. We use a graphical notation for this aspect of the overall model, which uses UML syntax so that these models can be built in a UML (1.x) tool.
We first define two kinds of sensors, TemperatureSensor and HumiditySensor. Both have a provided port called measurementPort, which offers the operations defined in the Sensor interface shown in Fig. 6. In addition, both types of sensors have a required port called controllerPort, through which the sensors expect to communicate with their controller. On top of these two kinds of sensors, a control component is defined, which provides a controller port and requires a number of sensors. These artifacts are all illustrated in Fig. 8. Again, we show the meta-model for this aspect of the model in Fig. 9; as with the interface meta-model of Fig. 7, we use a UML-based concrete syntax, and here we extend the UML meta-model.
Fig. 8. Model of components and ports
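Step 3 of the design flow in Fig. 4 generates interface APIs and component base code from such definitions. As a rough, hypothetical illustration (the real generated code is not shown in this chapter, and the generator also targets plain C), a generated object-oriented base class for the TemperatureSensor component might look like this, with the application logic supplied in a hand-written subclass (step 4):

// Hypothetical generated code: the interfaces from Fig. 6 and a base class
// for the TemperatureSensor component of Fig. 8. The base class exposes the
// provided port (measurementPort, typed by Sensor) and holds the required
// port (controllerPort, typed by Controller); the measurement logic itself
// stays abstract and is written manually.
interface Sensor {
    void start();
    void stop();
    float measure();
}

interface Controller {
    void reportProblem(Sensor s, String errorDesc);
}

abstract class TemperatureSensorBase implements Sensor {
    private Controller controllerPort;   // required port, bound by the generated infrastructure

    public final void bindControllerPort(Controller c) {
        this.controllerPort = c;
    }

    protected final Controller controllerPort() {
        return controllerPort;
    }

    // Application logic (step 4 in Fig. 4) is implemented in a subclass.
    public abstract void start();
    public abstract void stop();
    public abstract float measure();
}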
The Component type extends UML::Class. This is why we model components as UML classes with a component stereotype in the meta-model shown in Fig. 9. A Component has a number of ports. A port is modeled as a subtype of UML::Association. A port references an InterfaceRef; it cannot technically reference interfaces directly, because they are defined in another meta-model. The InterfaceRef plays the role of a proxy [6] for the Interface. Ports are abstract; concrete subtypes are defined in the form of RequiredPort and ProvidedPort. Also note the concept of applications, which are components that do not offer any services themselves. This is expressed by the OCL constraint that requires the ports association to contain RequiredPort objects only (illustrated in Fig. 9). Finally, a concrete system must be specified by defining component instances, containers and (hardware) nodes, as well as connections on the physical and the logical level. We use an XML-based concrete syntax for this aspect. The XML code shown in Fig. 10 illustrates one part of the deployment definition of the weather station example. With the background of the previous explanations, and the meta-model for the deployment displayed in Fig. 11, the meaning of this model (in Fig. 10) should be understandable without further explanation.
Fig. 9. Meta-model for components and ports
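The OCL constraints attached to Fig. 9 (a ConfigParam must be of type String, an Application may only own required ports, and the two ends of a PortDependency must reference the same interface) can be enforced programmatically in the generator's meta-classes. The sketch below illustrates two of them; it is an illustrative assumption using hand-rolled meta-classes, not the chapter's actual generator framework:

import java.util.List;

// Illustrative checks corresponding to the OCL constraints of Fig. 9.
class PortMeta {
    boolean provided;          // true for a provided port, false for a required port
    String interfaceName;      // name of the referenced interface (via InterfaceRef)
}

class ApplicationMeta {
    List<PortMeta> ports;

    // context Application inv: ports->select(p | p.oclIsKindOf(ProvidedPort))->isEmpty()
    void checkConstraints() {
        for (PortMeta p : ports) {
            if (p.provided) {
                throw new IllegalStateException("an application must not own provided ports");
            }
        }
    }
}

class PortDependencyMeta {
    PortMeta from;
    PortMeta to;

    // context PortDependency inv: to.interface == from.interface
    void checkConstraints() {
        if (!from.interfaceName.equals(to.interfaceName)) {
            throw new IllegalStateException("a port dependency must connect ports of the same interface");
        }
    }
}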
The deployment part of the system has a meta-model, too; it is shown in Fig. 11. The central concept is the System. A System consists of a number of Nodes, and each Node itself consists of Containers. These contain a number of ComponentInstances, which reference a Component as their type. Systems also contain a number of Connectors. A Connector connects a provided and a required port of two ComponentInstances in order to allow these two instances to communicate through the respective ports. Finally, a Connector has a type, which implements one of several communication strategies, such as communication through a CAN bus, through a local direct call, or through shared memory (Fig. 11). From these three different models, an overall model can be composed (this is done in the code generator's first phase, on the AST level). This overall, merged model is subsequently used as the input to the code generation phase of the generator. The overall model thus consists of several partial models describing different aspects of the entire system. However, in order to generate a useful system, the code generator (described in more detail below) must consider all the aspects at the same time. This requires a way to join the models. Technically, this is done by using different parser front-ends in the generator. However, we also need to make sure that the models can be joined logically. For example, in the component model, we must reference an interface defined in a text file. As a consequence, we use proxies [6] in the meta-model (called references). Figure 12 illustrates how the various models are joined logically. A component model uses InterfaceRefs to reference interfaces defined in the interface model. The system model uses the type attribute of ComponentInstances to refer to Components defined in the component model, as well as PortRefs to reference the Ports defined as part of Components.
<system name="weatherStation">
  <node name="main">
  <node name="inside">
    <param name="unit" value="centigrade"/>
  <node name="outside">
  <providedPort instance="tempOutside" port="measurementPort">
  <requiredPort instance="controller" port="sensorsPort">
  <providedPort instance="controller" port="controllerPort">
  <requiredPort instance="tempOutside" port="controllerPort">
  ...
Fig. 10. Specification of nodes, instances and connectors
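Correspondingly, the deployment meta-model of Fig. 11, which gives this XML its meaning, could be rendered as plain Java meta-classes along the following lines. This is a sketch under the same assumptions as the interface meta-model sketch above, not the framework's actual implementation:

import java.util.ArrayList;
import java.util.List;

// Illustrative meta-classes for the deployment model (Fig. 11): a System
// owns Nodes and Connectors, Nodes own Containers, Containers own
// ComponentInstances, and a Connector links the provided port of one
// instance to the required port of another.
class SystemMeta {
    String name;
    final List<NodeMeta> nodes = new ArrayList<>();
    final List<ConnectorMeta> connectors = new ArrayList<>();
}

class NodeMeta {
    String name;
    final List<ContainerMeta> containers = new ArrayList<>();
}

class ContainerMeta {
    final List<ComponentInstanceMeta> instances = new ArrayList<>();
}

class ComponentInstanceMeta {
    String name;
    String type;                         // references a Component from the component model
}

class ConnectorMeta {
    String id;
    String type;                         // e.g. direct call, shared memory, or CAN bus
    String sourceInstance, sourcePort;   // provided port (source)
    String targetInstance, targetPort;   // required port (target)
}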
Tooling. In addition to a UML tool and a text editor for creating the various models shown earlier, the tooling mainly consists of a model-driven code generator, the openArchitectureWare [oAW] toolset in our case. The generator has three primary responsibilities:
– Parse the various models and join them together; inconsistencies must be detected and reported.
– Verify the model against the meta-model of the domain; if the model does not conform to the domain meta-model, report errors.
– In case the model is fine, generate the target code for the various platforms.
In the tradition of programming language compilers, the generator works in several phases, as illustrated in Fig. 13. In the first phase, one or several model parser front-ends read the model.
Fig. 11. Deployment meta-model
Fig. 12. Relationships among the meta-models
Fig. 13. Generator tool work flow
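The workflow of Fig. 13 boils down to three steps: parse the specifications into an instantiated meta-model, check the meta-model constraints, and hand the instantiated meta-model together with the templates to the template engine. A schematic sketch of that pipeline follows; the types used here are placeholders, not the openArchitectureWare API:

import java.util.ArrayList;
import java.util.List;

// Schematic generator pipeline mirroring Fig. 13 (placeholder types only).
interface MetaModelElement {
    void checkConstraints();                 // constraints live in the meta-classes
}

interface ModelParser {
    List<MetaModelElement> parse();          // front-end for one concrete syntax
}

interface TemplateEngine {
    void generate(List<MetaModelElement> model, String templateDir, String outputDir);
}

class GeneratorRun {
    static void run(List<ModelParser> frontEnds, TemplateEngine engine) {
        List<MetaModelElement> model = new ArrayList<>();
        for (ModelParser parser : frontEnds) {
            model.addAll(parser.parse());    // phase 1: build the object graph
        }
        for (MetaModelElement element : model) {
            element.checkConstraints();      // verify the model against the meta-model
        }
        engine.generate(model, "templates/", "gen/");   // phase 2: apply the templates
    }
}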
Reading the models results in a representation of the model as an object graph inside the code generator. The classes used to represent the object graph directly map to the domain meta-model. This is where meta-model constraints are checked (they are implemented as part of the meta-classes). In the second phase, code generation templates are used to actually generate output code. For illustrative purposes, Fig. 14 shows a skeleton implementation of the Interface meta-class. The class defined in Fig. 14 is an ordinary Java class. We inherit from the UML::Class meta-class because
– it makes the ECInterface a model element, i.e., a valid generator meta-class,
– it inherits the properties of UML::Classes, specifically the fact that it can have operations, that it is in a package, etc., and
– it allows us to use stereotypes on UML::Classes to represent instances of interfaces.

public class ECInterface extends generatorframework.meta.uml.Class { }

Fig. 14. Initial definition of the ECInterface meta-class
The implementation of constraint checks as part of the meta-classes is illustrated in Fig. 15. The same approach can be applied in many other circumstances, for example, to ensure that the port names of components are unique; Fig. 16 provides another example.

public class ECInterface
    extends generatorframework.meta.uml.Class {
  public String CheckConstraints() {
    Checks.assertEmpty( this, Attribute(),
        "must not have attributes." );
  }
  // more ...
}

Fig. 15. ECInterface meta-class with constraints
Generating Code. Code generation is based on templates. A template is basically a piece of code with escapes in it that can access the model (represented as an object graph in the generator). The code in Fig. 17 is a simple example that generates a C header file for a component implementation. Templates consist of two kinds of text:
– The commands within the guillemets (written as << and >> in Fig. 17) are used to iterate over the model and thus to control code generation.
– Text outside the guillemets is code to be generated. It is literally copied into the generated code file.
– Within the to-be-generated code, guillemet escapes can be used to reference properties of the respective model object.
public class Component
    extends generatorframework.meta.Class {
  public String CheckConstraints() {
    Checks.assertEmpty( this, Operation(),
        "must not have operations." );
    Checks.assertEmpty( this, Generalization(),
        "must not have superclasses or subclasses." );
    Checks.assertEmpty( this, Realization(),
        "must not implement any interface." );
    Checks.assertUniqueNames( this, Port(),
        "a component's ports must have unique names." );
  }
  // more ...
}

Fig. 16. Constraint checks for the Component meta-class
<> <>
/**** Port Header File ****
 *
 * Type: <>
 * Name: <>
 * Component: <>
 * Interface: <>
 */
<>
#ifndef <<portPrefixUpperCase>>_H
#define <<portPrefixUpperCase>>_H

#include <<middleware_types.h>>

<<EXPAND Body(connector)>>

#endif
<<ENDLET>>
<<ENDFILE>>
<<ENDDEFINE>>

<> <>
<<EXPAND Util::ExternDecl(Component.Name"_"Name) FOREACH Interface.Operation>>
<<ENDIF>>
<<ENDDEFINE>>

Fig. 17. Sample code generation template
Parsing Input Models. Parsing of the input models is done using generator front-ends, as shown above. Since we need to parse several models for a given generator run, we use the Composite design pattern [6] to build a front-end that itself contains front-ends for the various models we need; Fig. 18 provides an example. How the various front-ends work internally is beyond the scope of this chapter.
package util;

public class EmbeddedComponentsInstantiator
    extends CompositeInstantiator {

  private String systemConfFile = System.getProperty("EC.SYSTEM");
  private String interfaceFile  = System.getProperty("EC.INTERFACE");
  private String componentsFile = System.getProperty("EC.COMPONENTS");

  public EmbeddedComponentsInstantiator() {
    // a front-end that reads the UML model
    add( new XMIInstantiator( componentsFile ) );
    // a front-end that reads the XML system spec;
    // use ecMetamodel as package prefix when
    // attempting to load meta-model classes
    add( new XMLInstantiator( systemConfFile, "ecMetamodel" ) );
    // a front-end that reads the textual spec
    // for the interfaces
    add( new JCCInstantiator( interfaceFile ) );
  }
}

Fig. 18. Instantiator that reads the various models
Basically, they read the models and create an object graph from them.

Overall Setup. Since we use several aspect models with different concrete syntaxes, the actual setup is somewhat more complicated, as shown in Fig. 19. The interfaces are represented with a textual DSL, components are represented using profiled UML models, and the deployment (or system) is described using XML. All these different partial models refer to their respective parts of the meta-model. The complete meta-model is implemented as Java classes, as illustrated above, independent of the concrete syntaxes. So, while the model is represented in different files using different concrete syntaxes, all the model parts are represented as Java objects once they have been parsed by the respective instantiator, i.e., parser or front-end. This is also the point where the references among the model parts are dereferenced: each proxy is supplied with a reference to its delegate object. At this stage, the generator back-end uses the code generation templates to generate the output.
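The dereferencing of such references can be pictured as a simple name-based lookup once all partial models have been instantiated. The following sketch is an assumption about how that resolution step might look; it is not the framework's actual implementation:

import java.util.HashMap;
import java.util.Map;

// Illustrative proxy resolution: after parsing, every InterfaceRef from the
// component model is connected to the interface definition of the same name
// from the interface model.
class InterfaceModelElement {
    String name;
}

class InterfaceRefProxy {
    String name;                     // the name used to refer to the interface
    InterfaceModelElement resolved;  // filled in during dereferencing
}

class ReferenceResolver {
    static void resolve(Iterable<InterfaceRefProxy> refs,
                        Iterable<InterfaceModelElement> interfaces) {
        Map<String, InterfaceModelElement> byName = new HashMap<>();
        for (InterfaceModelElement i : interfaces) {
            byName.put(i.name, i);
        }
        for (InterfaceRefProxy ref : refs) {
            ref.resolved = byName.get(ref.name);
            if (ref.resolved == null) {
                // unresolved references are exactly the inconsistencies the
                // generator must detect and report
                throw new IllegalStateException("unresolved interface reference: " + ref.name);
            }
        }
    }
}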
Fig. 19. Overall tool chain and artifacts
3.1 Resource Optimization

Due to the generative approach, we were able to conduct experiments concerning optimized code generation for efficient resource allocation. Since this is one of the key requirements in the embedded world, we invested considerable effort in this topic. In our experiments, the additional memory needed by the middleware-based embedded software was 1 KB of ROM and 300 bytes of RAM for a typical event-based communication pattern in the automotive domain. The performance of the middleware-based software was not significantly reduced either, which supports the practicability of our approach. For reasons of brevity, more details are beyond the scope of this chapter.
4 Conclusions
Advantages of this Approach. A model-driven approach to software development yields a number of advantages, some of them especially important in the context of embedded systems. The following list briefly explains some of these; the order is not significant.
– First of all, developer productivity is improved, since repetitive aspects need not be coded manually over and over again. Many target platforms (such as real-time operating systems) require a lot of "bookkeeping" code and configuration that falls into this category.
– The models capture knowledge about the application structure and functionality much more explicitly, free from "implementation clutter".
– Different concerns are separated explicitly. Each of them (or subsets) is modelled using its own model, making it explicit and thus more tractable, easier to change and potentially reusable.
– Communication with the various stakeholders is simplified, since each stakeholder need only take a look at the models of the aspect they are interested in.
– It is easier to react to changes, since a change often affects one piece of code only. Code (and other artifacts) that needs to change as a consequence is simply regenerated.
– The transformations capture design knowledge for the target platform. They thus serve as a form of "codified best practices".
– Reuse (of the platform, DSLs, etc.) is made possible.
– Typically, the software architecture improves, since the definition of a stringent software and system architecture is necessary; otherwise code cannot be generated efficiently.
– Code quality is improved, since most of it is generated from templates. It is easier to ensure that templates generate high-quality code than to assure this for each piece of code manually.
– Portability is simplified. If a different platform should be used, only a new set of templates needs to be created (which, of course, can be non-trivial, too).
– MDSD increases flexibility without inducing runtime overhead. The generated code can be strongly typed, maybe even relying on static memory allocation only, while flexibility is still there. The flexibility is realized at generation time and compile time, not during runtime.
– Since the mapping from models to implementation code is determined by templates (i.e., is the same each time), the quality-of-service characteristics of the implementation (timing behavior, memory consumption, performance) are known to some extent. This can be a big advantage in embedded system development.
– Since error messages are not just reported by the compiler when compiling the implementation code, but also by the transformer/generator when reading the models, the error messages can be much more expressive. A model contains much more domain semantics that can be reported as error messages compared to implementation code.

Prejudices. Since we keep hearing the same prejudices against MDSD over and over again, here are some simple statements that developers should consider.
– MDSD does not require UML; use any DSL that is suitable for your domain.
– Generated code can be very readable and can include comments, etc. Generated code is often even easier to understand than manually written code, since it is more structured (it is based on the "rules" in the transformations).
– MDSD does not require a waterfall process. MDSD works well using incremental, iterative processes. This is true for application development as well as for the development of the domain architecture.
– MDSD is quite agile, since, once a domain architecture is in place, it allows developers to come up with running applications very quickly.
Challenges. There is no "free lunch". So, even though model-driven software development has a great number of benefits, there are also some drawbacks and challenges that need to be addressed. For reasons of space we cannot go into details; we recommend reading [21].
– The development process has to take into account the two development paths: domain architecture development and application development. An approach that has worked in practice is to have two kinds of teams (domain architecture and application development). The application development teams play the role of customers for the domain architecture development. An iterative process with regular "deliverables" will minimize the problems.
– There are no universal standards yet. Using MDSD will always tie the development to a number of tools. With the OMG's MDA standard, this should become less of a problem in the future. Today the impact of the problem can be minimized by relying on open source tools.
– The concepts and tools need to be understood. Specifically for "traditional" embedded developers this can be quite a 'cultural shock'. Terms like meta-model, DSL, etc. are often not very well known, and not readily accepted. The best approach to attack this problem is to run an example-driven education effort that first convinces people of the benefits of the approach, and then goes into some details of the concepts behind it.

Practical Experience and Related Work. Model-driven software development has a long history of success, although it did not always appear under that name. MDA [14], Generative Programming [3], Domain-Specific Modeling [2] and Domain-Driven Development are all either different names for MDSD or special "flavors" of the general MDSD approach. Specifically, MDA is gaining more and more importance in the enterprise software development area. In the embedded world, generating code from models is also a well-known approach, although the use of such code in production systems is only slowly being adopted. Tools like ASCET [4], Matlab/Simulink [9] or Statemate [8] are well known to embedded developers. Using MDSD to implement (component/container-based) middleware is a rather novel approach, though. The authors, as well as other people known to the authors, have been using the approach with overwhelming success in domains such as automotive, mobile phones and scientific computing. The productivity boosts promised by MDSD have largely been realized. Acceptance by developers was good after they had seen that they could understand the generated code, and that it also was efficient. In the automotive domain, the AUTOSAR consortium [1] is currently in the process of standardizing an architecture and process conceptually similar to the one explained in this chapter.
References

1. The AUTOSAR Consortium. AUTOSAR homepage. http://www.autosar.org/
2. Domain Specific Modelling Forum. http://www.dsmforum.org/
3. U. Eisenecker, K. Czarnecki. Generative Programming. Addison-Wesley, 2000
4. ETAS Group, ASCET homepage. http://en.etasgroup.com/products/ascet sd/index.shtml
5. T. Ewald. Transactional COM+: Building Scalable Applications. Addison-Wesley, 2001
6. Gamma, Helm, Johnson, Vlissides. Design Patterns. Addison-Wesley, 1995
7. K. Henney. Inside Requirements. Programmer's Workshop column in Application Development Advisor, May/June 2003 (http://www.two-sdg.demon.co.uk/curbralan/papers/InsideRequirements.pdf)
8. I-Logix, Statemate homepage. http://www.ilogix.com/statemate/statemate.cfm
9. The Mathworks, Matlab homepage. http://www.mathworks.com/
10. The openArchitectureWare generator framework. http://sourceforge.net/projects/architecturware/
11. Object Management Group, Minimum CORBA. http://www.omg.org/technology, 2004
12. Object Management Group, Real-Time CORBA. http://www.omg.org/technology, 2004
13. Object Management Group, CORBA Component Model Specification (CCM). http://www.omg.org/technology, 2004
14. Object Management Group, Model-Driven Architecture (MDA). http://www.omg.org/mda
15. OSGi Alliance. http://www.osgi.org, 2004
16. D.L. Parnas. On the Criteria To Be Used in Decomposing Systems into Modules. Communications of the ACM, Vol. 15, No. 12, December 1972
17. C. Schwanninger, E. Wuchner, M. Kircher. Encapsulating Cross-Cutting Concerns in System Software. Workshop on Aspects, Components, and Patterns for Infrastructure Software, AOSD 2004 conference, Lancaster, UK, March 22-26, 2004
18. Sun Microsystems, Java 2 Enterprise Edition (J2EE). http://java.sun.com/j2ee/, 2004
19. M. Voelter, M. Kircher, U. Zdun. Remoting Patterns: Foundations of Enterprise, Internet and Realtime Distributed Object Middleware. John Wiley & Sons, 2004
20. M. Voelter. MDSD Tutorial. http://www.voelter.de/services/mdsd-tutorial.html
21. M. Voelter, T. Stahl, J. Bettin. Modellgetriebene Softwareentwicklung. dPunkt, to be published in 2004; an English version is in preparation
22. M. Voelter, A. Schmid, E. Wolff. Server Component Patterns - Component Infrastructures Illustrated with EJB. John Wiley & Sons, 2002
A Component Framework for Consumer Electronics Middleware

Johan Muskens, Michel R.V. Chaudron, and Johan J. Lukkien

Department of Mathematics and Computer Science, Technische Universiteit Eindhoven,
P.O. Box 513, 5600 MB Eindhoven, The Netherlands
{J.Muskens,M.R.V.Chaudron,J.J.Lukkien}@tue.nl

Abstract. Developers of Consumer Electronics (CE) devices face the problem of the ever-increasing amount of software that needs to be developed. At the same time, the time-to-market of their products needs to decrease. In other domains, component-based software development aids in solving the resulting problems. However, existing component frameworks fail to meet some of the requirements specific to the CE domain. In order to improve this situation, a component-based framework has been developed. In this chapter we describe this framework and motivate the architectural choices. These choices are influenced by the requirements on the framework. Some of these requirements are specific to the CE domain, others are more general.
1 Introduction
1.1 Background

The component framework presented in this chapter has been developed in the context of the Robocop project [14]. The aim of Robocop is to define an open, component-based framework for the middleware layer in high-volume consumer electronics devices. The framework enables robust and reliable operation, upgrading, extension, and component trading. The appliances targeted by Robocop are consumer devices such as mobile phones, set-top boxes, DVD players, and network gateways.

1.2 Motivation

With the increasing capacities of Consumer Electronics (CE) devices, the amount and complexity of the software in these devices is growing rapidly. The software determines to a large extent what a device is or feels like. Producers of these devices face the challenge of developing this continually increasing amount of software while the time-to-market should preferably decrease. Component Based Software Engineering (CBSE) promises to aid in solving the resulting problems. Key success factors generally attributed to CBSE are:
1. the possibility of re-use at the level of components;
2. the support for component composition so as to build new applications;
3. the promise of improved reliability because of explicit specifications and subsequent convergence of individual components to comply with their specifications.
In short, the success should come from being able to use software components in a similar way as we do their hardware counterparts. This includes the way components are paid for or otherwise traded. Presenting a piece of software as a component should therefore really represent a gain in terms of abstraction. In particular, this calls for the elimination of as many dependencies as possible and, as far as dependencies exist, for specifying them explicitly and bringing them to the component interface. Several component frameworks have been developed over the last years, all addressing these three success factors to some extent. However, there are also a lot of differences between the existing component frameworks, mainly due to the different requirements in the individual problem domains. Successful adoption will depend on how well the three issues mentioned above are addressed, plus some other factors depending on domain-specific requirements. These factors are discussed in section 2. Existing component frameworks did not meet some of the requirements particularly important for the CE domain:
– Robust and reliable operation
– Run-time upgrading and extension
– Low resource footprint
– Support for component trading
In order to improve this situation, a number of European companies and universities have joined forces in an effort to develop a component-based framework for the middleware layer of network-enabled consumer devices, addressing several of the points above. This work was done in the context of the Robocop project [14], which was subsequently used as input for the Space4U project [19]. In this chapter we describe the approach defined by these projects.

1.3 Overview

This chapter is structured as follows. Section 2 describes the project context and the target requirements set out for the component framework. It also relates these to existing component frameworks. Section 3 discusses the architecture and how it relates to the targets. Section 4 presents the download framework which enables runtime upgrading and extensibility of a component assembly. Concluding remarks and related work follow in section 5.
2 Background and Requirements
In this section, we discuss the most important requirements for the Robocop component framework and how they influenced the architecture. There are quite some differences between existing component frameworks. These differences are due to different requirements in the targeted application domains. Figure 1 shows the ’features’ provided by component frameworks. Some of the features are mandatory (marked gray), for example Communication. Some of the features are optional (marked white), for example Language independence.
Fig. 1. Component framework features (grey=mandatory, white=optional)
2.1 Common Features of Component Frameworks

In this section we discuss some features that are common to many component frameworks. We distinguish the categories of features depicted in Figure 1:
– Infrastructure: All component frameworks provide an infrastructure. With infrastructure we mean mechanisms for component instantiation, binding, communication, distribution of components over hardware, announcing capabilities of components, and discovery of desired components. These mechanisms are needed to create a composition of components that can cooperate in performing a certain task.
  • Instantiation: A Component Instance is the instantiation of a Component implementation at a specific location in the memory of a device. The relation between a component instance and a component implementation is the same as that between an object and a class. Once in operation, each component instance may create and manage its own data.
There are a number of different ways in which instantiation can be achieved. The distinguishing factor is the element in the architecture that controls the instantiation. In existing component frameworks, instantiation is typically controlled either by the component infrastructure, a component container, or a component factory.
  • Binding: In the context of component-based systems, binding is the creation of a link between multiple component instances. Binding can be done at design time, compile time and run time. At design time and compile time the binding is done by the developer. The link between component instances may be used for communication and navigation between component instances. The distinguishing aspect of the different ways in which binding can be organized in a component framework is the party that initiates the binding. We distinguish 1st party binding and 3rd party binding. In case of 1st party binding, a component instance binds itself to another component. In case of 3rd party binding, a binding between component instances is created by a party that is not one of the subjects of the binding.
  • Communication: To facilitate communication between components, a component infrastructure must provide some interaction mechanisms. The interaction styles supported are partially defined by the architectural styles that the component framework supports. The communication styles that a component infrastructure supports determine a number of the quality properties that systems built using these components can obtain. For instance, some communication styles favor efficiency over flexibility. The most common style is request-response as implemented by procedure/method calling. This style is the basis of all imperative programming languages and does not require any special facilities from the component model. The next most commonly supported interaction style is events. Typically events are used for notification, e.g. of exceptions. Often this style is used in conjunction with request-response interaction. Publish/subscribe can be seen as a generalization of events to distributed systems. Component frameworks that are aimed at supporting multi-media processing often provide mechanisms that support streaming as an interaction style.
  • Discovery: Every component framework needs to define a mechanism by which the presence of components in the system can be discovered. Such a discovery mechanism is needed to support late and dynamic binding. Discovery mechanisms are most prominent in component frameworks with run-time changes/binding. In systems with design-time or compile-time binding the discovery is typically guided by the designer/developer. In systems with run-time adaptation a registry is commonly used for the discovery of components.
  • Announcement of capabilities: Usually the capabilities of a component are expressed by a number of interfaces that are implemented by the component. The way in which interfaces are specified differs between component frameworks. Some component frameworks introduce a special language for expressing interfaces, others use programming languages to specify the interfaces.
– Component and application development support: Components and applications are developed using a component framework. The component frameworks have different development features. For example, COM [1] and .NET support programming-language-independent development of components, whereas Enterprise Java Beans (EJB) [15] supports platform independence.
  • Language independence: Component frameworks often support component development in different programming languages. In order to achieve interoperability between the components developed in the different programming languages, the interfaces must be specified in a manner that is independent of the programming language. Usually an interface description language (IDL) is used for this purpose.
  • Platform independence: Some component frameworks offer platform independence; this means that executable components can be executed on different platforms. This is usually achieved using an intermediate language. This intermediate language can be interpreted at run-time or compiled by a Just In Time (JIT) compiler.
  • Analysis support: During development of individual components and applications it can be desirable to have analysis techniques. These techniques can be used to prove correctness of the software [7], or to predict extra functional properties [5, 11].
– Support for upgrading and extension: Software evolves over time. The value and the economic lifetime of a device and the software on it can be increased by supporting upgrading and extension of the software. Component frameworks can support upgrading and extension at different stages of the software life-cycle (design time, compile time, run time, etc.). The current trend is that upgrading and extension are shifting more and more to the run-time phase of the software life-cycle. In this way, devices can be customized to the needs of a consumer in the period that they are owned and used by the consumer.
– Support for extra functional properties: In addition to functional requirements, software also needs to satisfy extra functional properties like performance, security and reliability. Which extra functional properties are important for a component framework highly depends on the problem domain that it targets. The extra functional properties that are important can introduce all kinds of restrictions on a component framework. For example, when a low resource footprint is important, this can exclude the use of virtual machines for interpretation of programs.
– Support for trading: In order to gain the benefits of reuse and trading, a large market of components and hence of component producers is needed. The fact that components should be traded has technical implications for the component framework.

2.2 Focus for the CE Domain

The requirements on a specific component framework highly depend on the problem domain in which the framework will be used. Below we discuss the features that are specifically important for the consumer electronics domain and, consequently, for Robocop:
– Upgradability and Extensibility: Improvements in software are developed in rapid succession. To extend the economic lifetime of devices, it should be possible to upgrade software components with improved versions. In addition to upgradability, there is a need to be able to add new functionality to a device. The mechanism needed for uploading new functionality can largely be shared with that for upgrading.
– Robustness and Reliability: In the area of consumer electronics, it is unacceptable that systems break down. Building stable systems requires special effort during their design [2] as well as special mechanisms at run-time [3, 12, 17].
– Low resource footprint: Consumer electronics systems must be made cost-effective. This results in limited available resources on the targeted devices and imposes strict resource constraints on the components and the infrastructure.
– Trading: One of the main reasons for component-based development is decreasing time-to-market and increasing productivity by increasing the (re-)use of existing components. In order to maximally exploit (re-)using existing components, it should be possible to use third-party components. This requires support for trading. In Robocop this requirement highly influenced the choice for component packaging.
In section 3 we present the architecture and motivate how it was constructed, driven by the requirements mentioned above. Section 4 presents a download framework that can be used in addition to the architecture to realize support for run-time upgrading and extension. In the remainder of this section we discuss which features are realized by existing component frameworks and how they are realized. Table 1 shows which features are supported by the existing component frameworks. This table shows that there is a core set of features offered by all existing component frameworks. There are also a number of features by which component frameworks distinguish themselves. The features that were especially important for the CE domain are highlighted. Infrastructure features are offered by all component frameworks. These features, like instantiation, binding and communication, are needed to create a system out of components that cooperate to achieve a certain goal of the system. The largest differences between the individual component frameworks can be found at the level of support for extra functional properties, flexible upgrading/extension, development and trading. The need for these features, and therefore the selection of the component framework, depends highly on the target problem domain. In Table 2 we show some existing solutions for component framework features. Most features cannot be dealt with in isolation. Solutions for one feature can exclude, or negatively influence, other features. This means there is no such thing as a free lunch. For example, platform independence is usually realized using an interpreted or intermediate language, which negatively influences performance. Each component framework needs to make a trade-off between the different features.
3 Architecture of Framework and Component Model
In this section we discuss the architecture and component model that have been developed during the Robocop project: the Robocop component life-cycle, Robocop component packaging, the executable component structure, and the run-time execution model.
Table 1. Features of existing component frameworks (features particularly important for the CE domain are highlighted). Frameworks compared: COM, DCOM, EJB, .NET, CORBA, Koala, Robocop, PECOS, AutoComp.
Infrastructure
  Instantiation: + + + + + + + + +
  Binding: + + + + + + + + +
  Communication: + + + + + + + + +
  Distribution: + + + +
  Announcement of Cap.: + + + + + + + + +
  Discovery: + + + + +
Extra Functional Properties
  Robust. & Reliab.: + +
  Security: +/- +/-
  Low resources: + + + + + +
Upgrading and Extension
  Design time: + + + + + + + + +
  Compile time: + + + + + + + + +
  Runtime: + + + + + +
Development Support
  Language independ.: + + + +/- +/- +/-
  Platform independ.: + +
  Analysis techniques: + +/-
Trading: +
Table 2. Common solutions for component framework features (features particularly important for the CE domain are highlighted)
Infrastructure
  Instantiation by: 1) Component Container 2) Component Factory 3) Infrastructure
  Binding: 1) 1st party binding 2) 2nd party binding 3) 3rd party binding
  Communication: 1) (Remote) procedure calls 2) Events 3) Publish/subscribe 4) Blackboard communication 5) Streaming
  Distribution: 1) Use location-transparent communication and instantiation mechanisms
  Announcement of Cap.: 1) Provided interfaces 2) Component descriptors
  Discovery: 1) Registry 2) Publish discovery files on dedicated servers
Extra Functional Properties
  Robust. & Reliab.: 1) Explicit and clear dependencies/contracts
  Security: 1) Provide mechanisms for secure communication, authentication, etc.
  Low resources: 1) No intermediate/interpreted language, minimal runtime environment, etc.
Upgrading and Extension
  Design time: 1) Always possible
  Compile time: 1) Compile-time substitution of components
  Runtime: 1) Binary interfaces and run-time (un/re)binding of instances
Development Support
  Language independ.: 1) Use an interface description language
  Platform independ.: 1) Interpreted language 2) Intermediate language
  Analysis techniques: 1) Use models describing properties of a component for analysis 2) Component models for RT applications
Trading: 1) Use models to describe the interests of the different stakeholders
3.1 Component Life-Cycle

The Robocop life-cycle addresses Robocop components from development time until run-time instantiation of services. During their life-cycle, Robocop components manifest themselves in different ways. Once developed, Robocop components are published in a repository. Published Robocop components can be generic in the sense that they still need to be tailored to run on a specific platform. This may involve, e.g., compilation and linking for that platform. At this point the Robocop component can be loaded onto the specific target. When a Robocop component is resident on the target, it needs to be registered. After registration the component is ready to be used: the component can be loaded and services implemented by the component can be instantiated. The life-cycle is depicted in Figure 2.
Fig. 2. Component Life-cycle
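The life-cycle of Fig. 2 can be summarized as a simple progression of states. The enumeration below is only a sketch derived from the description above; it is not part of the Robocop framework's API:

// Illustrative life-cycle states of a Robocop component: developed,
// published in a repository, tailored (e.g. compiled and linked) for a
// platform, resident on the target, registered with the runtime, and
// finally used to instantiate services.
enum ComponentLifeCycleState {
    DEVELOPED,
    PUBLISHED,
    TAILORED_FOR_PLATFORM,
    RESIDENT_ON_TARGET,
    REGISTERED,
    IN_USE
}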
3.2 Component Packaging

Unlike Szyperski [20], we found that the unit of trading is not the same entity as the unit of deployment. We distinguish between Robocop components, which are the units of trading and configuration management, and executable components, which are executable code that is physically present in the memory of a device during execution. During the component life-cycle a number of stakeholders are involved. These stakeholders are interested in different aspects of a component. A user of a device is interested in executable code for his device. A system developer is interested in documentation as well, maybe even the source code. Some stakeholders are not interested in the executable code at all; a system architect can decide to use a component based on a specification or simulation model. The Robocop component model can support all these different uses by using different models for the individual aspects. Using multiple different models is very useful in component trading, since a number of stakeholders can be involved with different concerns. Furthermore, we found that it is often desirable to trade more than executable code (documentation, sources, etc.).
A Robocop component (component package) is a set of related models (see Figure 5). The set of models is not fixed, but may be extended if needed. Any particular Robocop component must consist of one or more of these models. The information in the models can be human-oriented or machine-oriented. An example of a human-oriented model is documentation. Typically, one of the machine-oriented models will be the executable component. In order to achieve technical interoperability of executable components, many issues concerning the run-time execution model need to be defined; more details about this are given in the remainder of this chapter. Other examples of (machine-oriented) models are:
– A simulation model. This type of model could be of use during design of an application to analyze the interaction between components. Such models should be supplied by the developers of the component. Colored Petri-Nets [8] are a candidate for this type of model.
– A resource model. This type of model describes the resource needs of a component. This can be used both during design and during dynamic upgrading to assess whether a configuration of components fits within the available resources [4].
– A functional model. This type of model specifies the functionality of the component. A candidate specification language for this type of model is Z [18].
– An interface model (Robocop IDL). This model describes the functions provided and required by the component. Based on this model it is possible to check whether or not all dependencies in a configuration can be resolved. An example of a Robocop IDL description can be found in Figure 3.
Each Robocop component contains a globally unique identifier (GUID). This is needed for configuration management. Additionally, each model is identified by a GUID. Figure 6 shows two examples of different Robocop components. Robocop component 1 contains one executable, one IDL, one simulation model and one resource model (see also Figure 4). Robocop component 2 contains two executables, two resource models, one IDL and one simulation model. Multiple executables implementing the same IDL can occur when a Robocop component provides executables for multiple platforms. In that case the resource consumption of the executables can be different, and thus different resource models for the executables are included. Relations are used to indicate which resource model is related to a specific executable model.

3.3 Executable Component Structure

The Robocop component package, as described in section 3.2, is applicable to any type of model. Therefore it can be applied to any kind of binaries, e.g. static or shared libraries, COM components, or even complete executables. However, the executable view of the component determines how applications can be built from these components. This view is different for the various executable models, and determines at which abstraction level systems are composed. We will now discuss the executable component model. In the remainder of this chapter we will use the term component for the executable component model; for the Robocop component package we will use the term Robocop component.
module Printers {
  interface IPrinter {6083D1C4-0643-4ce6-B1EA-66467A65840C} {
    void printLn( in string line );
    ...
  };

  service SPrinter {68CCA7C4-C24A-4bee-9A5E-AF79B806483D} {
    provides {
      IPrinter printer;
    };
  };

  service SPlotter {6782A98F-06D5-436f-AE89-0D6E064AB047} complies SPrinter {
    provides {
      IPrinter printer;
    };
  };

  component CLaserPrinter {B7621504-7FCD-42ea-BCF0-90F67FE557C7} {
    provides SPlotter;
  };
};
Fig. 3. Example Robocop IDL description
<models>
  <model guid='6083D1C4-0643-4ce6-B1EA-66467A65840C'
         type='resource' location='./component1.rm'/>
  <model guid='7EDB39A0-5326-D811-87C6-0008744C31AC'
         type='ridl' location='./component1.ridl'/>
  <model guid='D9A356E2-5873-40b8-8D95-EA2F5F4DE692'
         type='simulation' location='./component1.sim'/>
  <model guid='28B4E880-AF84-4c86-B5FD-FC82A9FE1746'
         type='executable' location='./component1.so.0.0.0'/>
</models>
Fig. 4. Example Robocop component description
Fig. 5. A Robocop component is a set of models
Fig. 6. An example Robocop Component
Within Robocop, a component contains a number of services. These services implement the functionality of the component. Each service offers this functionality through a set of named interfaces (ports). The interfaces used are COM [1] interfaces. The v-table approach used in these interfaces results in little communication overhead; the overhead is limited to one pointer dereference per operation invocation on an interface. Services have explicit dependencies through a set of named requires interfaces (ports). All interfaces have a standard error return mechanism, and there is a facility to pass additional error information from a service to the client using the service. The explicit dependencies and the error return mechanisms enable the development of a robust and reliable system. Every component implements at least one service manager. This service manager is used to instantiate services and initialize attributes of the created service instances. Each component has a fixed entry point; this entry point is used to get a service manager. Figure 7 illustrates the structure of the executable component.
The services implemented by components can be specified in the Robocop IDL (RIDL). RIDL is inspired by the CORBA [10] IDL. For each service, the provided ports and required ports are specified, as well as the public attributes of the service. RIDL allows a developer to specify that a service is compliant with another. That service X is compliant with service Y at a structural level means: service X provides at least the named interfaces that are provided by service Y, and service X requires the same named interfaces that are required by service Y. In this case, code written to use service Y will also work using service X.
Fig. 7. Executable component
3.4 Run-Time Execution Model

In this section we present the run-time execution model used in Robocop. The Robocop runtime environment (RRE) has a key role in this execution model. In order to enable a low resource footprint, the RRE is kept minimal and can be extended with optional frameworks. The RRE supports registration of components/services and instantiation of services. All other features, like for example resource management, are optional; if these features are not used, there is no resource overhead. Next we discuss how the RRE stores the component information in a registry, how components are registered, how services of a component can be instantiated, and how the functionality provided by a service can be used.

The RRE Registry. The core responsibility of the RRE is to handle requests for service instances (and service managers). The desired service instance is identified by its GUID, and the RRE needs to find a component that can deliver instances of the specified service. To that end, the RRE maintains sufficient information in a database (referred to as the registry). This registry contains three tables (see Figure 8). The first table contains the association between the component GUID and the physical location of the component.
This physical location is formatted in a URL-like fashion. When the component is stored in a file system, the location will be file:// followed by the actual file name. For systems that store the component in a directly addressable memory space, the location looks like address:// followed by the physical address containing the component code. The second table contains the relation between the component GUID and the service GUID. As indicated earlier, a component can implement multiple services. It is equally well possible that a particular service is implemented in more than one component. When multiple components are registered in this table for the same service, the implementation of the RRE is free to choose any of these components when there is a request for the service. The third table contains the complies relation for services. When a service is requested, the RRE can use this table to find compliant services that can be used to satisfy the request. Figure 9 gives a graphical representation of sample registry contents (the same contents as in Figure 8). The solid lines indicate the complies relation between services: e.g. the SPixelScreen service complies to the SCharacterScreen. So, the SPixelScreen can be used when a SCharacterScreen service is needed. The dotted lines indicate which component implements which service. In the example the CScreen component implements both the SPixelScreen and the SCharacterScreen service.
Fig. 8. The RRE registry with example contents
Fig. 9. Graphical representation of registry contents
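A minimal sketch of such a registry, mirroring the three tables of Figures 8 and 9, might look as follows in Java. The class and method names are our own assumptions; the real RRE interface is not shown in this chapter.

    import java.util.*;

    // Sketch of the three RRE registry tables and the lookup they support.
    class RreRegistry {
        // 1) component GUID -> physical location (file://... or address://...)
        private final Map<String, String> componentLocation = new HashMap<>();
        // 2) service GUID -> component GUIDs implementing that service
        private final Map<String, List<String>> serviceToComponents = new HashMap<>();
        // 3) complies relation: requested service GUID -> compliant service GUIDs
        private final Map<String, List<String>> complies = new HashMap<>();

        void registerComponent(String componentGuid, String location) {
            componentLocation.put(componentGuid, location);
        }
        void registerService(String serviceGuid, String componentGuid) {
            serviceToComponents.computeIfAbsent(serviceGuid, k -> new ArrayList<>()).add(componentGuid);
        }
        void registerComplies(String requestedService, String compliantService) {
            complies.computeIfAbsent(requestedService, k -> new ArrayList<>()).add(compliantService);
        }

        // Resolve a service request: first look for a direct implementation,
        // then fall back to compliant services (assumes an acyclic complies relation).
        Optional<String> findComponentFor(String serviceGuid) {
            List<String> direct = serviceToComponents.getOrDefault(serviceGuid, List.of());
            if (!direct.isEmpty()) return Optional.of(direct.get(0)); // RRE is free to choose
            for (String compliant : complies.getOrDefault(serviceGuid, List.of())) {
                Optional<String> component = findComponentFor(compliant);
                if (component.isPresent()) return component;
            }
            return Optional.empty();
        }

        String locationOf(String componentGuid) { return componentLocation.get(componentGuid); }
    }

Filled with the contents of Figure 9, a request for the SOutput service would then resolve to a component implementing one of its compliant services, for instance the CScreen component via SCharacterScreen or SPixelScreen.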
When the SCharacterScreen is requested, the RRE will activate the CScreen component and send it a request to instantiate the SCharacterScreen service. When the SOutput service is requested, there is no component registered that can instantiate the service. The RRE might activate the CLaserPrinter component and request it to instantiate the SPlotter service, or it might activate the CScreen component and request it to instantiate the SPixelScreen or the SCharacterScreen service. The actual component and service used are left to the RRE.
Registration. The behavior of the RRE is driven by the information in the registry. When components are resident on a system, information needs to be added to the registry (registration) in order for the RRE to be able to use the component and the implemented services. The components themselves play a passive role in this registration process: the system configurator controls the registration process. This role can be exercised by a person assembling the system or can be automated to some degree. We do not let components themselves be in control of this registration process, because it is a system-wide aspect, i.e. it can only be done well with system-wide knowledge. For example, when two components can offer the same service, neither of the components can decide which one should be used in the system. Instead of using an overwrite strategy (e.g. the component most recently registered takes precedence), this choice needs to be made external to the individual components. The registration information needed is either directly known to the system configurator (e.g. the location of the component on the target) or can be obtained from the RIDL description (component GUID, service GUIDs, complies relations) or via one of the other models in the Robocop component. Note that a system configurator may decide not to use all of the information for the registration. The configurator can decide not to register all of the services, or to ignore some of the complies relations. This is useful, because the complies relation on the RIDL level only expresses substitutability at a structural level. This does not address extra-functional aspects. The resource usage of the service may exceed what the system is willing to spend, and thus it makes sense not to register this service on the device.
Instantiating Services. Clients can get a service instance by calling functions in the RRE. Instantiating a service is a multi-step process, with shared responsibility between the RRE and the component that can instantiate the service. The process can be broken down into three steps:
– Locating and activating the component
– Retrieving the service manager
– Retrieving the service instance
Locating a component has been described earlier in this section (The RRE Registry). Once the correct component is located, the RRE will activate the component. This entails loading the component in OS terms, which means mapping the component into (executable) memory space, e.g. from the file system where the component is stored. Each component supports a fixed entry point. This entry point is used to get a service manager, and to check if a component can be unloaded.
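Building on the hypothetical RreRegistry and IServiceManager sketches above, the RRE-side part of this three-step process could look roughly as follows; the loading step is deliberately left abstract, since it is platform-specific.

    // Sketch of the instantiation flow performed by the RRE (names are assumptions).
    class RuntimeEnvironment {
        private final RreRegistry registry = new RreRegistry();

        Object getServiceInstance(String serviceGuid) {
            // Step 1: locate and activate the component that can deliver the service.
            String componentGuid = registry.findComponentFor(serviceGuid)
                    .orElseThrow(() -> new IllegalStateException("no component for " + serviceGuid));
            IServiceManager manager = loadComponent(registry.locationOf(componentGuid));
            // Step 2: the component's fixed entry point has yielded its service
            // manager, which may also be used to pre-set attributes.
            // Step 3: retrieve the service instance itself.
            return manager.createService(serviceGuid);
        }

        // Activation: map the executable component into memory (e.g. from its
        // file:// location) and call its fixed entry point.
        private IServiceManager loadComponent(String location) {
            throw new UnsupportedOperationException("platform-specific loading");
        }
    }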
The service manager is part of the factory pattern for the services [6]. The service instances are created using the service manager. Additionally, the service manager can be used to pre-set attributes of the service instances.
Interacting with Service Instances. As described earlier, the service instances expose their functionality through named interfaces (ports). Each service implements the rcIService interface. It supports methods to retrieve the provided ports of service instances, and the bindTo and unBind methods to bind and unbind interface instances (provided ports of a different service instance) to requires ports of the service. The methods to do this are not strongly typed. In order to have a type-safe mechanism, each service definition gives rise to an interface definition that is implemented by the service. This interface is a descendant of the rcIService interface. For each provided port, there is a specific method GetProvides<portname> in this interface that returns an interface of the type mentioned in the service definition. The <portname> is the actual port name used in the service definition. Similarly, for each requires port there are two methods in this interface to bind and unbind the requires interface: bindTo<portname> and unBind<portname>. In general, a service instance cannot operate when its requires ports are not bound. In order for a client to signal that it has finished binding its required interfaces and wants to start using the service instance, the client calls the Start method. This method can be called either on the rcIService interface, or on the service-specific descendant of that interface.
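In Java-like form, the relation between rcIService and a generated, service-specific interface could be pictured as below; the port names screen and input and the IKeyboard type are invented for the example, and ICharacterOut is reused from the earlier compliance sketch.

    // Generic, weakly typed interface implemented by every service instance.
    interface rcIService {
        Object getProvidedPort(String portName);            // retrieve a provides port
        void bindTo(String portName, Object providedPort);  // bind a requires port
        void unBind(String portName);
        void Start();                                       // client has finished binding
    }

    interface IKeyboard { int readKey(); }

    // Generated, strongly typed descendant for a service with provides port
    // "screen" (type ICharacterOut) and requires port "input" (type IKeyboard).
    interface SCharacterScreenService extends rcIService {
        ICharacterOut GetProvidesScreen();    // typed access to provides port "screen"
        void bindToInput(IKeyboard keyboard); // typed binding of requires port "input"
        void unBindInput();
    }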
4 Download Framework
In this section we will present the download framework. The download framework has been developed as part of Robocop to fulfill the requirements concerning upgrading and extension. During the period that a CE device is owned by a user there is a need for improving and extending the software on the device in order to extend the economic lifetime of the device. The download framework is responsible for transferring Robocop components from a repository to a target and their registration with the RRE. In the component life cycle (see 3.1) this corresponds to the phases tailoring, target loading and registration. Within the Robocop project little work has been done on tailoring of components and automated registration of components. For locating Robocop components and target loading a solution has been developed that consists of the following conceptual roles:
– Initiator
– Locator
– Decider
– Repository
– Target
At run-time, these roles can run on different devices. The target role runs on the terminal to which a component is transferred. The roles are discussed in detail in sections 4.1, 4.2, 4.3, 4.4, and 4.5, respectively. Within the download process we distinguish the following phases:
– Location of the entities that participate in the download process
– Decision about the feasibility of a given download
– Target loading, the actual transfer of the Robocop component to the device
– Confirmation of the download and registration of the downloaded component at the RRE
All roles that participate in the download process communicate using gSOAP [21]. The gSOAP technology enables communication between the different roles even though they might be deployed on different hardware nodes. The download process is depicted in Figure 10. Next, we will discuss the individual roles in more detail.
Fig. 10. The download procedure
4.1 Initiator Role
In this subsection we will discuss the initiator role in the download process. The main responsibility of the initiator is verifying the presence of all the entities in the download process and coordinating the download process. The initiator role can be implemented by a component on the target device. It is also possible that the initiator role is implemented on a different device; this can be useful in the case of remote maintenance. An initiator starts the download process in response to some external event. Such an event may be triggered, for instance, by the component upgrading process or as a result of a change in user preference settings. An initiator can participate in multiple download processes at the same time. The download processes are identified by the GUID of the Robocop component that needs to be transferred and a unique name for the target. In order to verify the presence of all the entities and to coordinate the download process, the initiator needs the addresses of all the entities. To that end the initiator has
a table with the addresses of one or more locators ordered by priority or proximity (see Figure 11). Using one of these locators, the initiator can retrieve the addresses of the other entities involved in the download process.
Fig. 11. Example of deployed download roles
4.2 Locator Role
In the process of downloading a Robocop component from a repository to a target, the locator is responsible for locating a repository which is to provide the Robocop component to be downloaded, and the target to which the Robocop component is to be transferred. In addition, the locator is responsible for locating the entity playing the decider role, which is responsible for deciding whether the given download will take place or not. Hence, the locator provides the addresses which can be used to contact the repository, the target and the decider. Although the functionality of the locator does not depend on the interconnection network used for the download, the latter determines the type of the address returned by the locator. For the sake of precision, the term address in the above definitions stands for a comprehensive descriptor of an entity in an interconnection network which allows its holder to contact that entity. For example, in an all-IP network this descriptor is the IP address of the entity. In order to be able to provide the address information, the locator maintains three tables. The first table is used to store all the registered targets, the second table is used to store the registered repositories, and the last is used to store the registered deciders.
For all these entities, the name and address are stored. For the deciders, the decider class is also stored, which can be used to distinguish different types of deciders. Information about which repository contains which components is not stored at the locator. When the locator receives a request to locate a repository that contains a Robocop component, it will query all known repositories for that specific Robocop component using its GUID. A locator provides a registration mechanism for targets, deciders and repositories. Entities implementing these roles use this mechanism to register themselves with a number of locators.
4.3 Decider Role
A decider awaits a request (typically from an initiator) to determine if it is possible that a given Robocop component can be downloaded onto and subsequently registered on a given target. This is done by investigating and matching the acquired profiles of repository-resident components and profiles of the target. After this matching process, the decider will notify the involved parties of the decision. The decision procedures performed by the decider may be sophisticated. For example, in the case of components that are subject to a real-time scheduling policy, a schedulability test may be performed. Within the Robocop project little work has been done on defining advanced decision procedures.
4.4 Repository Role
The main responsibility of a repository is the storage of Robocop components in order for them to be transferred to a target. It is also possible that the repository tailors Robocop components, for example by compiling a component for a specific platform. During the download process, the repository is first queried for the availability of a Robocop component. At a later phase in the download process, the decider will request the component profile. After the repository has received a positive decision from the decider, it must prepare the specific Robocop component to be transferred to the target. The actual transfer of the Robocop component is activated by the initiator. The download framework supports two types of transfer strategies: Push and Pull. Figure 10 illustrates the Push strategy. In the Push case, the initiator sends a message to the repository to start transferring the Robocop component; in the Pull case, the initiator sends a message to the target that it should fetch the Robocop component from the repository.
4.5 Target Role
The target role has two main responsibilities. The first is that a target should be ready to receive the Robocop component that is transferred during the download process and make it resident on the device. The second responsibility is providing a profile of the target device to the decider. This is necessary for the decider to be able to make sure that the downloaded Robocop component is suitable for the target. In the case of a Pull strategy for the transfer of the Robocop component, the target has an additional responsibility: fetching the Robocop component. The example download process in Figure 10 is based on the Push strategy, in which this is not the case.
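To show how the roles interact, the sketch below renders them as plain Java interfaces driven by an initiator. In the framework the roles are remote gSOAP endpoints; all interface and method names here are illustrative assumptions.

    // Hypothetical sketch of the download roles and the push/pull transfer choice.
    interface Repository { String getComponentProfile(String componentGuid); void push(String componentGuid, Target target); }
    interface Target     { String getTargetProfile(); void pull(String componentGuid, Repository repo); void confirmAndRegister(String componentGuid); }
    interface Decider    { boolean approve(String componentProfile, String targetProfile); }
    interface Locator    { Repository locateRepository(String componentGuid); Target locateTarget(String targetName); Decider locateDecider(String deciderClass); }

    enum TransferStrategy { PUSH, PULL }

    class Initiator {
        // Coordinates one download process, identified by component GUID and target name.
        boolean download(String componentGuid, String targetName, Locator locator, TransferStrategy strategy) {
            // Location phase: resolve the entities taking part in the download.
            Repository repo = locator.locateRepository(componentGuid);
            Target target = locator.locateTarget(targetName);
            Decider decider = locator.locateDecider("default");
            // Decision phase: match the component profile against the target profile.
            if (!decider.approve(repo.getComponentProfile(componentGuid), target.getTargetProfile()))
                return false;
            // Target loading: pushed by the repository or fetched by the target.
            if (strategy == TransferStrategy.PUSH) repo.push(componentGuid, target);
            else target.pull(componentGuid, repo);
            // Confirmation and registration of the component with the RRE.
            target.confirmAndRegister(componentGuid);
            return true;
        }
    }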
5 Concluding Remarks
5.1 Discussion
In recent years a number of component models have been developed. These component models aim to increase productivity and reduce time to market by enabling re-use of software. These benefits are also desirable in the CE domain. However, existing component models have failed to meet some of the specific requirements in the CE domain:
– Robust and reliable operation
– Run-time upgrading and extension
– Low resource footprint
– Support for component trading
In this chapter we presented the Robocop component framework, which has been specifically developed to satisfy these requirements in order to be suitable for the CE domain, or more generally for the domain of high-volume embedded systems.
Robust and reliable operation has been achieved by using explicit dependencies between components and by providing hooks for analysis techniques (models describing the properties of components and services).
Run-time upgrading and extension have been realized using dynamically loadable components. Each component provides a single entry point that gives access to the services provided by the component. In order to achieve run-time upgrading and extension, services needed a binary interface. At run-time, services can be instantiated and connected by a third party. We used the v-table approach that is also used in COM [1].
A low resource footprint was one of the high-priority requirements. To meet this requirement we adhered to the following principle: "Do not pay for what you don't use!". We designed a minimal runtime environment that offers the core functionality of a component framework, e.g. discovery, instantiation, binding, and an efficient communication mechanism. Additional features are provided using optional frameworks. If an optional framework is not used, no overhead is incurred at run-time.
Support for component trading should enlarge the market of components in order to increase the benefits of component-based development. We distinguish Robocop components, which are the unit of trading and configuration, from executable components, which are executable code that can be executed on the target device. To increase the tradeability of the components, it is desirable to have more information about a component than its executable only. For example, documentation, source code, and specifications may be very useful. Therefore, Robocop components are defined to be a set of models. One of these models typically is the executable component; other models may provide complementary information about the component.
5.2 Related Work
Over the last couple of years a number of component models have been developed. A taxonomy of the most well-known component models has been presented in section 2. Koala [13], PECOS [22], and AutoComp [16] are component models developed especially for embedded systems. These component models have some of the features
required in the CE domain; they enable robust and reliable operation and a low resource footprint. Although run-time upgrading was already supported by existing component models like COM [1], EJB [15] and .NET [9], this was not the case for component models for embedded systems. To remedy this, the Robocop component model enables run-time upgrading and increases the support for run-time trading of components. In the Robocop framework we combine ideas for run-time upgrading from COM with ideas for robust and reliable operation from Koala.
5.3 Contributions
With the development of the Robocop framework we combined a number of solutions from existing component models to create a component model suitable for the CE domain. In addition, we introduced the notion of a Robocop component as a set of models. Different model types can be used to describe different aspects of a component, similar to the way multiple views are used in describing software architectures. The different models of a component increase the tradeability, since customers are usually not interested in just the executable code, but need additional information to assess suitability.
Acknowledgments
We are grateful to all the partners of the Robocop project for their contributions: Philips Electronics, Nokia, CSEM, Saia Burgess, ESI, Fagor, Ikerlan, University Polytechnic de Madrid, and Eindhoven University of Technology. The Robocop project has been funded in part through the European ITEA programme.
References
1. D. Box. Essential COM. Object Technology Series. Addison-Wesley, 1997.
2. I. Crnkovic and M. Larsson. Building Reliable Component-Based Software Systems. Artech House Publishers, 2002.
3. E. Dashofy, A. van der Hoek, and R. Taylor. Towards architecture-based self-healing systems. In Proceedings of the First Workshop on Self-Healing Systems. ACM, Nov. 2002.
4. M. de Jonge, J. Muskens, and M. Chaudron. Scenario-based prediction of run-time resource consumption in component-based software systems. In Proceedings of the 6th ICSE Workshop on Component-Based Software Engineering: Automated Reasoning and Prediction. ACM, June 2003.
5. E. Eskenazi, A. Fioukov, D. Hammer, and M. Chaudron. Estimation of static memory consumption for systems built from source code components. In 9th IEEE Conference and Workshops on Engineering of Computer-Based Systems. IEEE Computer Society Press, Apr. 2002.
6. E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
7. G. Holzmann. Spin Model Checker. Addison-Wesley, 2003.
8. K. Jensen. Coloured Petri Nets: Basic Concepts, Analysis Methods and Practical Use. Springer-Verlag, 1997.
9. J. Lowy. .NET Components. O'Reilly and Associates, 2003.
10. T. Mowbray and R. Zahavi. Essential CORBA. John Wiley and Sons, 1995.
11. J. Muskens and M. Chaudron. Prediction of run-time resource consumption in multi-task component-based software systems. Technical Report TR-117, Technische Universiteit Eindhoven, 2003.
12. J. Muskens and M. Chaudron. Integrity management in component based systems. In Proceedings of the 30th EUROMICRO Conference, Rennes, France, Aug. 2004.
13. R. van Ommering, F. van der Linden, J. Kramer, and J. Magee. The Koala component model for consumer electronics software. IEEE Computer, 33(3):78–85, Mar. 2000.
14. Robocop Consortium. Robocop: Robust open component based software architecture for configurable devices project (http://www.extra.research.philips.com/euprojects/robocop/), 2001.
15. E. Roman, S. Ambler, and T. Jewell. Mastering Enterprise JavaBeans. John Wiley and Sons, 2001.
16. K. Sandstrom, J. Fredriksson, and M. Akerholm. Introducing a component technology for safety critical embedded real-time systems. In 7th ICSE Workshop on Component-Based Software Engineering, May 2004.
17. B. Schmerl and D. Garlan. Exploiting architectural design knowledge to support self-repairing systems. In Fourteenth International Conference on Software Engineering and Knowledge Engineering, 2002.
18. G. Smith. The Object-Z Specification Language. Kluwer Academic Publishers, Boston, 1999.
19. Space4U Consortium. Space4U: Software platform and component environment 4 you (http://www.extra.research.philips.com/euprojects/space4u/), 2003.
20. C. Szyperski. Component-based Software Engineering beyond object orientation. Addison-Wesley, 1998.
21. R. van Engelen and K. Gallivan. The gSOAP toolkit for web services and peer-to-peer computing networks. In 2nd IEEE International Symposium on Cluster Computing and the Grid. IEEE Computer Society Press, May 2002.
22. M. Winter, T. Genssler, A. Christoph, O. Nierstrasz, S. Ducasse, R. Wuyts, G. Arevalo, P. Muller, C. Stich, and B. Schonhage. Components for embedded software - the PECOS approach. In 2nd ECOOP Workshop on Composition Languages, 2002.
Connecting Embedded Devices Using a Component Platform for Adaptable Protocol Stacks
Sam Michiels, Nico Janssens, Lieven Desmet, Tom Mahieu, Wouter Joosen, and Pierre Verbaeten
K.U.Leuven, Dept. Computer Science, Celestijnenlaan 200A, B-3001 Leuven, Belgium
{sam.michiels,nico.janssens}@cs.kuleuven.ac.be
Abstract. Research domains such as sensor networks, ad-hoc networks, and pervasive computing, clearly illustrate that computer networks have become more complex and dynamic. This complexity is mainly introduced by unpredictable and varying network link characteristics, heterogeneous capabilities of attached nodes, and the increasing user expectations regarding reliability and quality of service. In order to deal with this complexity and dynamism of computer networks, the system’s protocol stack must be able to adapt itself at runtime. Yet, to handle this complex challenge effectively and efficiently, we claim that it is essential for protocol stacks to be developed with run-time adaptability in mind. This chapter presents a software architecture tailored to build highly adaptable protocol stacks, along with a component platform that enforces this architecture. Although the presented software architecture focuses on protocol stacks in general, we zoom in on the application of its founding principles in the domain of embedded network devices.
1 Introduction
The use of mobile embedded devices to offer users network connectivity anywhere and anytime is increasing significantly [1]. In order to achieve seamless interoperability of heterogeneous devices in a highly dynamic network, the protocol stack of each connected device often needs to exhibit a similar degree of dynamism. Connected devices can vary from powerful portable PCs or PDAs to resource-limited embedded devices like mobile phones or sensors. This chapter presents a software architecture [2] tailored to build highly adaptable protocol stacks, along with a component platform [3] that enforces this architecture. We refer to this combination as DiPS+, the Distrinet Protocol Stack [4]. The key focus in DiPS+ is run-time adaptability to application- and environment-specific requirements or characteristics. The strength of the DiPS+ approach is twofold. On the one hand, DiPS+ provides for two essential aspects of run-time adaptability: it offers support for controlling concurrency behavior of the protocol stack and for swapping components in a transparent manner, while sharing a common component platform core. This considerably facilitates system management, since it allows for modular integration of non-functional extensions that cross-cut the core protocol stack functionality.
On the other hand, DiPS+ proposes a design method that imposes the separation of basic protocol stack functionality from additional run-time adaptability support. As will be illustrated further in this chapter, the employed separation of concerns allows for a programmer to concentrate on a single concern (e.g. the behavior of a DiPS+ protocol stack) without being distracted by other concerns scattered across the same functional code (such as additional adaptability support). This is essential for making adaptable protocol stacks more comprehensible, reusable and flexible. We believe that the DiPS+ component platform is a convincing case study to illustrate the potential of using fine-grained components and separation of concerns in building highly adaptable network systems. We argue that (1) in order to achieve runtime adaptability, the software must be developed with flexibility in mind, and that (2) modularity and strict separation of concerns are two main characteristics of an adaptable design [4]. Obviously, there are many other specific concerns when developing embedded systems, such as performance control, resource awareness, and real-time constraints. Experience shows that at least the first two of these “embedded system characteristics” benefit from our software architecture as well. We do not claim that the DiPS+ component platform can be used as-is in networked embedded systems; however, we are convinced that its founding principles can be beneficial for this kind of software. Throughout the chapter, we will clarify the advantages of the DiPS+ ideas for embedded systems. We have validated DiPS+ successfully, a.o. in the context of concurrency control [5, 6]. The DiPS+ component platform has also been applied in research domains different from component swapping and concurrency control. Discussing all related research tracks would lead us too far and certainly transcends the scope of this chapter. In summary, DiPS+ offers support for unit testing [7], automatic component composition [8, 9] and framework optimization [4], while current research extends DiPS+ from a local node architecture to a distributed management platform that allows for controlling and adapting multiple connected protocol stacks [10]. The remainder of this chapter is structured as follows. Section 2 sketches the domain of our case study: it explains the need for providing flexibility in protocol stacks with respect to concurrency control and component hot-swapping. Section 3 presents the DiPS+ component platform, which offers core programming abstractions to improve the development of adaptable protocol stacks. The two sections that follow each describe a specific extension of the DiPS+ platform to control and manage the underlying protocol stack: Section 4 focuses on dynamic load management; Section 5 explains how transparent component swapping is supported. Section 6 describes the DiPS+ prototype and its validation in various research projects and Master’s theses. It also explains how our positive experiences in the domain of networking software can be applied in the broader domain of component-based embedded systems. Section 7 positions our work with respect to related research. Conclusions are presented in Sect. 8.
2 The Specific Case of Protocol Stacks
The development of protocol stacks is often complex and error-prone, especially when additional preconditions (such as the need for run-time adaptability) are imposed.
Before elaborating on how advanced separation of concerns contributes to making adaptable protocol stacks more comprehensible, accessible and reusable, we elaborate in this section on the importance of run-time adaptability in the domain of protocol stack software (whether or not in an embedded system). More precisely, we focus on non-functional adaptations (load management) and on functional adaptations (component hot-swapping).
2.1 Load Management
Management of system load in networked systems tries to prevent systems from being overwhelmed by arriving network packets. Load management is highly important both for embedded devices in an ad-hoc network (since they may have limited resources available) and for network access devices (which may receive considerable access demand peaks when a large group of users connects in parallel). Since cooperating nodes in an ad-hoc network may be highly heterogeneous with respect to available processing resources, memory, and data transfer capabilities, low-end router nodes easily get overloaded by data transfers induced by more powerful machines. By consequence, adaptive load management is highly relevant for embedded network devices.
Solutions for system load control often depend on run-time circumstances and/or application-specific requirements. In addition, system load should be controlled and managed at run-time to handle changing network circumstances gracefully. These changes can, for instance, be induced by (1) popular services being offered on the network, resulting in increasing network traffic to the server, (2) more clients being added dynamically and/or clients with varying quality of service requirements, or (3) decreasing processing capabilities when the battery of a stand-alone device is getting low. In other words, circumstances may vary at the side of the server, the clients, and the network nodes themselves.
In order to enable (low-end) devices to handle overload situations gracefully, our approach proposes to dynamically balance resource consumption based on application- and environment-specific requirements. This goal is achieved by detecting internal bottlenecks and deploying a solution to the problem in the running protocol stack. A bottleneck occurs when many more packets arrive at a component than can be processed immediately. Bottlenecks can be handled in many different ways. We concentrate on three approaches: packet classification and prioritization, input rate control, and thread reallocation from underloaded to overloaded areas. It is important that solutions (e.g. packet classification, input rate control, thread reallocation) can be performed at any place in the protocol stack and, by consequence, can be based on information not yet available when the packet arrives. Protocol headers, for instance, only release their information when they have been parsed; however, this information may considerably influence further processing of the packet. In addition, the classification strategy used to differentiate between packets may be based on application- and/or environment-specific requirements, in order to take into account changing circumstances at run-time (e.g. ad-hoc network topology, available system resources, network load, etc.). Thread reallocation focuses on tasks to be executed instead of packets. This allows processing of particular areas in the system to be customized by adding or removing threads locally.
For example, system performance is improved by increasing parallelism in areas that have become (temporary) I/O bottlenecks.
Our approach is complementary to existing load management techniques that are, or can be, used to handle overload situations gracefully. The most relevant techniques in this context are quality-of-service (QoS) protocols, load balancing [11], and active networks [12]. Our approach complements these distributed techniques by offering a local platform that is able to detect and (partially) handle overload situations.
2.2 Component Hot-Swapping
Research domains such as ad-hoc networks, sensor networks, 4G wireless networks and pervasive computing clearly indicate a trend towards more heterogeneous mobile computer networks. Network heterogeneity manifests itself in the form of increased diversity in the type of communication technology that devices are equipped with (such as Bluetooth, WiFi, HomeRF and satellite links), as well as in the types of embedded devices connected to the network (differing in memory capacity, processing power and battery autonomy). In addition, performance characteristics of network nodes and communication links most often change over time, a.o. due to disturbing influences. These heterogeneous and dynamic performance specifications will affect the inter-operability of connected nodes, and as a result are most likely to compromise the communication quality of the network, in particular when a best-effort communication model is employed. For instance, a Bluetooth scatternet (operating at 2 Mbps) will probably become a bottleneck when interconnecting a number of 802.11 MANETs (22 Mbps throughput).
To fully exploit the potential of such heterogeneous and dynamic networks, it is essential for the protocol stacks of the connected embedded devices to adapt themselves at run-time as the environment in which they execute changes (e.g. by installing a compression service to boost the quality of the slow Bluetooth scatternet). To this end, we aim at coping with the increasing user expectations regarding quality of service. By consequence, the underlying protocol stacks should exhibit a similar degree of dynamism, which illustrates the need for employing programmable [13] (i.e. adaptable) network nodes. These programmable networks are strongly motivated by their ability to rapidly change the protocol stack of network nodes without the need for protocol standardization. In addition, protocol stack reconfigurations should be performed at run-time (transparently for end-user applications) to promote permanent connectivity of the embedded devices and thus exploit the full potential of mobile wireless networks. This requires the node architecture to conduct adaptations (recomposition) of the protocol stack functionality without having to shut down and restart active connections. As a result, a running DiPS+ protocol stack can be customized by a third party (such as a network operator or intelligent self-healing network support), without interfering with the execution of applications using the network.
In more detail, we focus on unanticipated protocol adaptations, such as feature additions and protocol revisions. Since these adaptations are not anticipated at design-time or deployment-time, component hot-swapping is essential to achieve seamless run-time evolution of protocol stacks in mobile embedded devices. In addition, component
hot-swapping is justified by the memory constraints inherent in connected, resource-limited embedded devices, such as intelligent sensors and mobile phones. Depending on the protocol to be adapted, additional support is required to prevent the replacement of DiPS+ components from jeopardizing the functionality of a running stack, which would compromise the correct functioning of the ad-hoc network. This includes avoiding packet loss during a reconfiguration (a.o. essential when changing protocols like TCP that aim to provide full reliability) as well as imposing a safe state over the DiPS+ components before conducting the actual reconfiguration. As will be illustrated in Sect. 5, the latter is essential to prevent reconfiguration of a composition from breaking the consistency of the components making up the protocol stack [14, 15].
3 The DiPS+ Component Platform
As stated in the introduction, DiPS+ aims for modular integration of non-functional extensions (such as support for load management and component hot-swapping), which share a common component platform. Strict separation of such non-functional behavior has proven to be an essential feature of adaptable, maintainable and reusable software [16]. To separate non-functional behavior from basic protocol stack functionality, the DiPS+ architecture represents data (packet) processing and protocol stack management as two planes on top of each other, respectively the data and the management plane.
The data plane in the DiPS+ architecture houses the functional part of the system, i.e. the protocol stack. This plane identifies components and how they are connected on the one hand, and offers layers as a composition of basic components on the other hand. On top of the data plane, DiPS+ offers one or more management planes, which act as meta-levels to extract information from the data plane and control its behavior. Each management plane is responsible for a specific concern (e.g. load management or component hot-swapping) and is clearly separated from the data plane. In this way, a management plane can be added or removed without affecting components in the data plane.
In the remainder of this section, we elaborate on the architectural styles employed by the data plane and describe how the provided abstractions in the DiPS+ component platform enable run-time adaptability. Afterwards, in Sects. 4 and 5, the modular extendibility of the data plane with support for load management and component hot-swapping will be demonstrated.
3.1 Data Plane: Combination of Architectural Styles
When taking a closer look at the architecture of the data plane, we can identify three main architectural styles – the pipe-and-filter, the blackboard, and the layered style. By employing these architectural styles, the DiPS+ platform offers a number of framework abstractions (such as components, connectors, and packets) to ease development of adaptable protocol stacks.
Pipe-and-Filter Style. The pipe-and-filter style is very convenient for developing network software, which maps naturally to the pipeline style of programming. A protocol stack can be thought of as a down-going and an up-going packet flow.
Fig. 1. Example of a DiPS+ component pipeline with a dispatcher that splits the pipeline into two parallel component areas. More processing resources have been assigned to the upper concurrency component
The core abstractions of a typical pipe-and-filter software architecture are connectors (pipes) and components (filters). Connectors provide a means to glue components together into a flow. Each functional component in DiPS+ represents an entity with a well-defined and fine-grained functional task (e.g. constructing or parsing a network header, fragmenting a packet or reassembling its fragments, or encrypting or decrypting packet data). Our architecture distinguishes additional component types for dispatching and concurrency (see Fig. 1). These are not only highly relevant abstractions for protocol stack software, identifying them as explicit entities also facilitates their control. The dispatcher serves as a de-multiplexer, allowing to split a single flow into two or more sub-flows. Concurrency components and component areas are described in Sect. 3.3. Blackboard Style. The blackboard interaction style is characterized by an indirect way of passing messages from one component to another, using an in-between data source (blackboard). This style is very convenient in combination with the pipe-and-filter style to increase flexibility and component independence. The blackboard model is mapped onto the DiPS+ architecture as follows (see also Fig. 2). In order to finish a common task, DiPS+ components forward an explicit message (packet) object from the source to the sink of the component pipeline. In addition, each message can be annotated with meta-information. Attaching meta-information allows to push extra information through the pipeline along with the message, for instance
Fig. 2. Anonymous communication via a blackboard architectural style: a blackboard data structure has been coupled to each message to carry meta-information from one component to another
to specify how a particular message should be processed. The message represents the blackboard, which encapsulates both data and meta-information. In this way, components that consume specific meta-information do not have to know the producer of these data (and vice versa). By consequence, components become more independent and reusable since they do not rely on the presence of specific component instances. Layered Style. Introducing an explicit layer abstraction in a protocol stack architecture is highly relevant for several reasons. First and foremost, it is very natural to have a design entity that directly represents a key element of a protocol stack. Secondly, each layer offers an encapsulation boundary. Every protocol layer encapsulates data received from an upper layer by putting a header in front. Finally, from a protocol stack point of view, layers provide a unit of dispatching. The general advantage of applying the layered style is that it allows to zoom in and out to an appropriate level of detail. When not interested in the details of every fine-grained component, one can zoom out to a coarse-grained level, i.e. the layer. 3.2 Explicit Communication Ports The employed architectural styles have resulted in the design of the DiPS+ components. A component in DiPS+ is developed as a core surrounded by explicit component entry and exit ports. DiPS+ Component. Component activity is split into three sub-tasks: packet acceptance, packet processing, and packet delivery. The DiPS+ framework controls packet acceptance and delivery by means of explicit component entry and exit points (the packet receiver and forwarder). The design of a DiPS+ component consists of three entities (see also Fig. 3). Packet processing is taken care of by a DiPS+ Unit class, which forms the core of a component. The PacketReceiver (PR) and PacketForwarder (PF) classes act as unit wrappers and uncouple processing units. The DiPS+ Component class is a pure framework entity that is transparent to programmers. A component encapsulates and connects a unit together with its packet receiver and forwarder. All components in DiPS+ share a common functional packet interface incomingPacket(Packet p). Some components may offer one or more management interfaces next to their functional interface, as will be described further in Sect. 3.3. With an eye to enable fine-grained management, the DiPS+ data plane is designed to be open for customizations in a well-defined way [17, 18]. DiPS+ components allow for transparent packet interception at the communication ports via their associated Policy object (see Fig. 3). The policy delegates each packet to a number of pipelined ManagementModule objects, which may be registered by an administration tool at application level. Unlike functional components, management modules encapsulate non-functional behavior (e.g. throughput monitoring, logging, or packet blocking). Advantages. The combination of the pipe-and-filter and the blackboard architectural style results in two main advantages. First of all, it supports the design of so-called
plug-compatible components [19], i.e. components that are unaware of any other component, directly or indirectly. The pipe-and-filter style uncouples adjacent components by means of a connector (represented in DiPS+ by a PF-PR combination). The blackboard style, for its part, allows for anonymous component interaction (represented in DiPS+ by packets and their associated meta-information). Secondly, the combination enables fine-grained and unit-specific management and control. Both the PR and the PF serve as attachment hooks for the management plane. Such hooks are designed in DiPS+ as separate entities, called policies, which are responsible for the handling of incoming and outgoing packets. Thanks to this plug-compatible component model and fine-grained management and control of the data plane, extending a protocol stack with load management and/or component hot-swapping becomes much easier and understandable (see Sect. 2).
Fig. 3. A DiPS+ component (consisting of a packet receiver, the core unit, and a packet forwarder) with a policy object that intercepts incoming packets (p). The policy delegates incoming packets to a pipeline of management modules
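The framework entities of Fig. 3 can be approximated in Java as sketched below. This follows the structure described in the text (unit, packet receiver, packet forwarder, policy, management modules), but the concrete class bodies are our own simplification rather than DiPS+ code; a module dropping a packet is modelled by returning null.

    import java.util.*;

    // The packet doubles as the blackboard: payload plus meta-information.
    class Packet {
        final byte[] data;
        final Map<String, Object> meta = new HashMap<>();
        Packet(byte[] data) { this.data = data; }
    }

    // Management modules encapsulate non-functional behavior (monitoring,
    // logging, blocking); returning null drops the packet.
    interface ManagementModule { Packet process(Packet p); }

    // A policy delegates each packet to a pipeline of management modules.
    class Policy {
        private final List<ManagementModule> modules = new ArrayList<>();
        void register(ManagementModule m) { modules.add(m); }
        Packet apply(Packet p) {
            for (ManagementModule m : modules) {
                p = m.process(p);
                if (p == null) return null;
            }
            return p;
        }
    }

    // Functional core of a component: one fine-grained task (parse a header, ...).
    abstract class Unit {
        protected PacketForwarder forwarder;
        abstract void incomingPacket(Packet p);
    }

    // Explicit entry point of a component (packet acceptance).
    class PacketReceiver {
        final Policy policy = new Policy();
        private final Unit unit;
        PacketReceiver(Unit unit) { this.unit = unit; }
        void incomingPacket(Packet p) {
            Packet accepted = policy.apply(p);
            if (accepted != null) unit.incomingPacket(accepted);
        }
    }

    // Explicit exit point (packet delivery): a PF-PR pair forms the connector.
    class PacketForwarder {
        final Policy policy = new Policy();
        private PacketReceiver next;
        void connect(PacketReceiver next) { this.next = next; }
        void forward(Packet p) {
            Packet delivered = policy.apply(p);
            if (next != null && delivered != null) next.incomingPacket(delivered);
        }
    }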
3.3 Explicit Concurrency Components
Finally, to separate the employed concurrency model of a DiPS+ stack from basic functionality, functional components are complemented with concurrency components. This allows a developer to concentrate on the concurrency aspect of a DiPS+ stack, without being distracted by other concerns scattered across the same functional stack, and vice versa. A concurrency component makes it possible to increase or decrease the level of parallelism in the component area behind it. In addition, it controls which requests are scheduled and when. Each concurrency component breaks the pipeline into two independent component groups, which will be referred to as component areas (see also Fig. 1).
Concurrency components exploit the benefits of both the pipe-and-filter and the blackboard architectural style. The pipe-and-filter style divides the system into plug-compatible components. As a result, concurrency components can be added anywhere in the pipeline, without affecting the functional components within. The DiPS+ dispatcher makes it possible to split a component pipeline into parallel sub-pipes. In this way, each sub-pipe can be processed differently by putting a concurrency component in front of
it. Thanks to the blackboard style of data sharing associated with each individual message, component tasks are typically packet-based, i.e. each component handles incoming packets by interpreting or adding meta-information. This allows to increase parallelism since most components have no local state that is shared by multiple threads in parallel. The design of the concurrency component consists of three major entities: a packet queue, one or more packet handlers, and the scheduler strategy. Its behavior during overload or under-load can be customized via its management interface (see Figure 4), which allows to register specific overflow and underflow strategies. In this way, the concurrency component can be controlled without exposing its internal attributes (such as the packet queue). A packet handler is a thread that guides a packet through the component area behind its concurrency component. The scheduler strategy of a concurrency component decides which packet will be selected next from the packet queue. The scheduler strategy can be customized via the scheduler interface of a concurrency component (see Fig. 4).
Fig. 4. A DiPS+ concurrency component with its management and scheduler interface
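A concurrency component along these lines could be sketched as follows, building on the Packet and PacketForwarder classes from the previous sketch; the interface names and the threading details are illustrative assumptions, not the DiPS+ API.

    import java.util.*;

    // Strategy that selects the next packet from the queue (FIFO, priority-based, ...).
    interface SchedulerStrategy { Packet selectNext(Deque<Packet> queue); }
    // Strategy invoked when the packet queue overflows; replaceable at run-time.
    interface OverflowStrategy { void onOverflow(Packet dropped); }

    // Breaks the pipeline into two component areas and controls how many
    // packet-handler threads serve the area behind it.
    class ConcurrencyComponent {
        private final Deque<Packet> queue = new ArrayDeque<>();
        private final int capacity;
        private volatile SchedulerStrategy scheduler = q -> q.pollFirst(); // FIFO default
        private volatile OverflowStrategy overflow = dropped -> { };       // drop silently
        final PacketForwarder forwarder = new PacketForwarder();

        ConcurrencyComponent(int capacity) { this.capacity = capacity; }

        // Functional interface: accept a packet into the buffer.
        synchronized void incomingPacket(Packet p) {
            if (queue.size() >= capacity) { overflow.onOverflow(p); return; }
            queue.addLast(p);
            notifyAll();
        }

        // Scheduler interface: attach one more packet-handler thread to this area.
        void addPacketHandler() {
            Thread handler = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    Packet p;
                    synchronized (this) {
                        while (queue.isEmpty()) {
                            try { wait(); } catch (InterruptedException e) { return; }
                        }
                        p = scheduler.selectNext(queue);
                    }
                    if (p != null) forwarder.forward(p); // guide the packet through the area
                }
            });
            handler.setDaemon(true);
            handler.start();
        }

        // Management interface: swap strategies without exposing the packet queue.
        void setScheduler(SchedulerStrategy s) { scheduler = s; }
        void setOverflowStrategy(OverflowStrategy s) { overflow = s; }
    }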
Advantages. Having explicit concurrency components shows three major advantages. First of all, it allows not only to reuse functional components whether or not concurrency is present, but also to reconfigure and customize the system where concurrency needs to be added. In this way, the system’s structure can be fine-tuned to specific circumstances and requirements, for instance, by adding concurrency components only if needed. Secondly, it allows for fine-grained and distributed control of scheduling in the protocol stack. Each concurrency component may incorporate a customized scheduling strategy, using all meta-information attached to the request by upstream components. This information may not yet be available at the beginning of the component pipeline. In this way, packet processing can be adapted to both request-specific information (e.g. content type, size, or sender) and the system’s state (e.g. available resources) as the packet traverses the component pipeline. A third advantage of having concurrency components spread throughout the system, is that it allows to prioritize not only between incoming packets, but also between component areas. On the one hand, this considerably facilitates finding and solving I/O bottlenecks, i.e. component areas that are overwhelmed because too many arriving packets require I/O access. On the other hand, concurrency components may help prioritize particular component areas based on application-specific requirements. DiPS+
concurrency components allow, for instance, to associate additional threads with those component areas that are about to release resources that have become scarce.
4 Management Plane for Load Management
As a first validation of the flexibility of the abstractions offered in DiPS+, we illustrate how a DiPS+ composition is extended with load management support in a modular manner. The need for load management (as described in Sect. 2.1) has resulted in the DMonA (Dips+ Monitoring Architecture) management plane, which controls and customizes the behavior of the protocol stack. DMonA allows for handling certain overload situations in an application-specific manner via interventions at protocol stack level. These interventions focus on packet classification, controlling the packet arrival rate, and optimally distributing processing threads over the tasks to be executed. DMonA is a feedback-driven management platform. This means that DMonA (1) extracts information from the underlying protocol stack (via the policy associated with a PR and/or PF), (2) decides whether or not action must be taken (using a monitor policy), and (3) deploys this solution in the protocol stack. The rest of this section describes how DMonA handles load management, viewed from three complementary perspectives: packet classification, request control, and concurrency control. 4.1 Packet Classification Packet classification differentiates between packets based on meta-information that is collected in each packet as it traverses the protocol stack. By consequence, the further a packet has traversed the component pipeline, the more meta-information is available for its classification. Packet differentiation can be based, for instance, on parameters such as destination, data size, encapsulated protocol, packet type (connection establishment or data transfer), or on application-specific preferences passed via meta-information. Packet classification is highly relevant when different categories or types of packets can be recognized, and service quality should be guaranteed for specific categories. During overload, the most important packets can be handled with priority. Packet classification can easily be added to a protocol stack thanks to three abstractions offered in the DiPS+ component platform: meta-information, dispatchers, and concurrency components. Meta-information is used by applications or components to annotate packets. These annotations influence how dispatchers and concurrency components process packets. A dispatcher is associated with a specific classification strategy, which is used to demultiplex the component pipeline in parallel sub-pipelines based on meta-information. A concurrency component for its part encapsulates a packet buffer and a specific scheduler strategy, which decides what packet to process next from the buffer. Either, the dispatcher can delegate packets to different concurrency components, one for each category; in this case, the packet scheduler selects packets from multiple queues. Or, the dispatcher can delegate packets to one ordered buffer that puts high priority packets first; in this case the packet scheduler is associated with one packet buffer and fetches packets in priority order.
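For instance, a dispatcher that classifies packets into gold, silver and bronze categories (as in the RADIUS case study mentioned further on) could be sketched like this; the strategy interface and the "userClass" meta-information key are assumptions made for the example.

    import java.util.*;

    // Classification strategy used by a dispatcher to demultiplex the pipeline.
    interface ClassificationStrategy { String classify(Packet p); }

    class Dispatcher {
        private final ClassificationStrategy strategy;
        private final Map<String, ConcurrencyComponent> subPipes = new HashMap<>();

        Dispatcher(ClassificationStrategy strategy) { this.strategy = strategy; }
        void addSubPipe(String category, ConcurrencyComponent pipe) { subPipes.put(category, pipe); }

        void incomingPacket(Packet p) {
            // Classification may use any meta-information collected upstream, e.g.
            // a "userClass" annotation added when the protocol header was parsed.
            // (Assumes a sub-pipe has been registered for every category.)
            subPipes.get(strategy.classify(p)).incomingPacket(p);
        }
    }

    // Example strategy: differentiate gold, silver and bronze users.
    class UserClassStrategy implements ClassificationStrategy {
        public String classify(Packet p) {
            Object userClass = p.meta.get("userClass"); // hypothetical annotation
            return userClass != null ? userClass.toString() : "bronze";
        }
    }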
Given the flexibility of DiPS+, DMonA support can be limited to allowing system administrators to install specific classification strategies (in the dispatchers) and scheduler strategies (in the concurrency components). Packet classification has been validated in the context of an industrial case study that customized the RADIUS authentication protocol so as to differentiate between gold, silver, and bronze types of users [5, 6].
4.2 Controlling Arrival Rate
From a request control perspective, system load is managed by limiting or shaping the arrival rate of new requests to a sustainable level. Such traffic control may, for instance, selectively drop low-priority packets to preserve processing resources for the most important requests. This is crucial when too many requests arrive to be handled by the available processing resources. Request control is highly relevant to protect the system from packet bursts and to allow it to handle them gracefully by removing incoming packets early in the processing pipeline (e.g. in the protocol stack of the system). In addition, by prioritizing packets based on packet- and application-specific knowledge, the least important packets are removed first.
Traffic control has been effectively employed in networks, for example, to provide applications with quality-of-service guarantees by individually controlling network traffic flows (also known as traffic shaping) [20]. Typically, a leaky bucket algorithm [21] is used to adjust the rate at which incoming packets are forwarded. In addition, a variety of performance metrics have been studied in the context of overload management, including throughput and response-time targets [22–24], CPU utilization [25–27] and differentiated service metrics based on a given performance target [28, 29]. Welsh [22] proposes the 90th percentile response-time as a realistic and intuitive measure of client-perceived system performance. It is defined as follows: if the 90th percentile response-time is t, then 90% of the requests experience a response-time equal to or shorter than t.
When applying DMonA in the context of traffic control, we need to provide information collectors (i.e. sensors) at the entry of a monitored component area, a monitor policy that decides on the actions to be taken, and a component area to be controlled. As a concrete example, we use the 90th percentile approach of Welsh [22]. First of all, a response-time sensor measures the response-times for packets passing through a component area. Such sensors are installed at each concurrency component's packet forwarder and determine how long it takes between a request leaving the concurrency component and the release of the associated thread. A DMonA information collector collects the response-times of all packets that have passed through a component area. Secondly, the 90th percentile algorithm itself is offered as a monitor policy, which processes the collected information at regular intervals. In this case the algorithm checks whether 90% of the packets experience a response-time equal to or shorter than some pre-defined threshold t. Thirdly, the leaky bucket controls the admission rate of packets entering the monitored area. The leaky bucket is installed as a management module, associated with the packet receiver policy of the concurrency component in front of the area under control. This packet receiver is the perfect place for such control, since it represents the entry of a component area.
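A rough sketch of this feedback loop, with the leaky bucket as a management module and the 90th-percentile check as monitor policy, is given below; it builds on the earlier ManagementModule sketch, and the class names, the rate adjustment factors and the percentile computation are all simplifying assumptions.

    import java.util.*;

    // Management module limiting the admission rate at a component area's entry:
    // a packet is forwarded only if a token is available, otherwise it is dropped.
    class LeakyBucketModule implements ManagementModule {
        private volatile double ratePerSecond;
        private double tokens;
        private long last = System.nanoTime();

        LeakyBucketModule(double ratePerSecond) {
            this.ratePerSecond = ratePerSecond;
            this.tokens = ratePerSecond;
        }

        public synchronized Packet process(Packet p) {
            long now = System.nanoTime();
            tokens = Math.min(ratePerSecond, tokens + ratePerSecond * (now - last) / 1e9);
            last = now;
            if (tokens >= 1.0) { tokens -= 1.0; return p; }
            return null; // drop: sustainable rate exceeded
        }

        double getRate() { return ratePerSecond; }
        void setRate(double ratePerSecond) { this.ratePerSecond = ratePerSecond; }
    }

    // Monitor policy: 90th-percentile response-time check over one collection phase.
    // Response times are reported by a sensor installed at the packet forwarder.
    class PercentileMonitor {
        private final List<Long> responseTimesMs = Collections.synchronizedList(new ArrayList<>());

        void record(long millis) { responseTimesMs.add(millis); }

        // Called periodically by DMonA: tighten or relax the admission rate.
        void evaluate(LeakyBucketModule bucket, long targetMs) {
            List<Long> sorted;
            synchronized (responseTimesMs) {
                sorted = new ArrayList<>(responseTimesMs);
                responseTimesMs.clear();
            }
            if (sorted.isEmpty()) return;
            Collections.sort(sorted);
            long p90 = sorted.get((int) Math.floor(0.9 * (sorted.size() - 1)));
            double rate = bucket.getRate();
            bucket.setRate(p90 > targetMs ? rate * 0.9 : rate * 1.1);
        }
    }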
4.3 Concurrency Control While packet classification and request control focus on packets, concurrency control focuses on the tasks to be executed in the protocol stack. From a concurrency perspective, load management distributes the available processing power (i.e. threads) across the system’s component areas (tasks) such that the overall system performance is optimized [30]. This means that the DMonA management plane should be able to detect performance bottlenecks, i.e. component areas where packets arrive faster than they can be processed. In addition, the management plane should solve these bottlenecks by migrating processing resources associated with the concurrency component in front of a component area, from underloaded to overloaded component areas. Concurrency control is an effective technique for load management, since it allows to control how processing threads are applied at any time (e.g. to handle the highest priority tasks first), and compensates for blocking invocations inside the protocol stack. Because our approach allows for concurrency components to be added at arbitrary places in the protocol stack, bottleneck areas can easily be detected by measuring the throughput of each area. In addition, concurrency components allow for handling bottlenecks intelligently by increasing or decreasing the number of associated packet handler threads in certain component areas, which can be highly effective for parallel areas with blocking components. Moreover, as already described in Section 4.1, concurrency components support packet classification via their specific scheduling strategy.
Fig. 5. Illustration of DMonA attached to DiPS+ via two policies, one associated with the packet receiver and one with the packet forwarder. Processing resources are retrieved from a pool of free resources and allocated to a concurrency unit via its scheduler interface
More specifically, DMonA monitors the packet stream by installing throughput sensors, i.e. management modules that count the number of passed packets. Figure 5 shows how sensors are plugged in at both the packet receiver and forwarder of a concurrency component. The DMonA monitor collects on a regular basis the information stored in both sensors and resets them to start the next collecting phase. One possible monitor policy adjusts thread scheduling based on the concurrency component’s progress [4],
comparable to the feedback-driven approach proposed by Steere [31]. Based on this status information, the DMonA monitor decides when and how to adapt local concurrency behavior to improve performance. Proposed monitor decisions can be deployed in two ways. On the one hand, a concurrency component can be linked with or unlinked from a packet handler thread. This is done via the concurrency unit’s scheduler interface (see Figure 5). On the other hand, the buffer overflow and underflow strategies of a concurrency component can be replaced by calling its management interface.
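The throughput sensors and the reallocation decision could be sketched as follows, again on top of the earlier classes; the bottleneck threshold and the method names are illustrative, not taken from DMonA.

    import java.util.concurrent.atomic.AtomicLong;

    // Throughput sensor: a management module that simply counts passing packets.
    class ThroughputSensor implements ManagementModule {
        private final AtomicLong count = new AtomicLong();
        public Packet process(Packet p) { count.incrementAndGet(); return p; }
        long readAndReset() { return count.getAndSet(0); }
    }

    // DMonA-style monitor: compares arrival and departure counts of a component
    // area and assigns an extra packet-handler thread when the area falls behind.
    class AreaMonitor {
        private final ThroughputSensor in = new ThroughputSensor();  // at the packet receiver policy
        private final ThroughputSensor out = new ThroughputSensor(); // at the packet forwarder policy
        private final ConcurrencyComponent area;

        AreaMonitor(ConcurrencyComponent area) { this.area = area; }
        ThroughputSensor inSensor() { return in; }
        ThroughputSensor outSensor() { return out; }

        // Called at regular intervals: detect a bottleneck and react through the
        // scheduler interface of the concurrency component in front of the area.
        void evaluate() {
            long arrived = in.readAndReset();
            long processed = out.readAndReset();
            if (arrived > 2 * processed) {   // area cannot keep up: bottleneck
                area.addPacketHandler();     // allocate an extra processing thread
            }
        }
    }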
5 Management Plane for Transparent Component Hot-Swapping
The need for run-time adaptable protocol stacks (as described in Section 2.2) has resulted in the development of the CuPS (Customizable Protocol Stack) platform, a modular extension to the DiPS+ framework responsible for conducting seamless reconfigurations of a running protocol stack (illustrated in Figure 6). Since we aim for unanticipated adaptations, protocol stack reconfigurations imply changing a stack composition, rather than being limited to parameter tuning. The algorithm employed by the CuPS platform to orchestrate a reconfiguration of a protocol stack composition at run-time involves three stages:

Installation of the new component area. The adaptation process starts with the installation of the new functional components, resulting in the co-existence of the old component area (still in use) and the new version (not yet activated).

Activation of the new component area. Next, the newly installed functional components become activated. This is achieved by stopping and disconnecting the old component area and redirecting packets towards the new version. At this point in the adaptation procedure, the new component area is plugged into the stack composition and will process transmitted packets.

Removal of the old component area. Finally, the old component area is removed. Since it has been stopped during the activation stage, it can safely be removed.

In the remainder of this section, we elaborate on the activation phase to illustrate how an existing DiPS+ composition is extended with CuPS support in a modular manner. The terms reconfiguration and adaptation are used as alternatives for activation.

5.1 Self-contained Components in a Best-Effort Environment

A first category of reconfigurations covers the deployment of component areas strictly composed of functional components that are self-contained, i.e. components that do not depend on cooperation with other components to implement a service. Two examples of such self-contained protocol stack components are a filter component to relieve a congested node and a logging component. In addition, this class of reconfigurations assumes that packet loss or packet scrambling does not compromise the correct functioning of the network. Since performance (throughput) is an important characteristic of a protocol stack, most network protocols (such as IP) offer best-effort services and as such comply with this requirement.

When both conditions are fulfilled, activating such a component area boils down to adapting the current composition. No additional support is needed to control the state
(activity) of the DiPS+ component area that is subject to activation. Consequently, the activation phase is limited to removing the connectors that bind the old component area into the protocol stack, and plugging in the new area. As a result, packets that are being processed by the old component area during the activation stage (depending on the employed concurrency model) will get lost. Note that due to the use of plug-compatible components, the actual recomposition of DiPS+ component areas has been reduced to the trivial problem of adding and removing connectors.

5.2 Self-contained Components Demanding Safe Deployment

Depending on the properties of the network service that is subject to adaptation, packet loss during protocol reconfiguration could compromise the correct functioning of the protocol. As an example, consider the adaptation of a running TCP stack. When packets are lost during the activation process, TCP will interpret these errors as packet loss due to congestion and hence will reduce its congestion window [32]. This will cause a substantial degradation of performance in terms of throughput, even though sufficient bandwidth might be available. As such, this family of protocol stack reconfigurations covers seamless adaptation of self-contained components, and requires a safe state to be imposed on the component area under change. Since no other components depend on self-contained components to complete a service, such a safe state for reconfiguration is obtained when the component area is made passive. This implies that the functional components (1) are currently not processing any packets and (2) have no pending packets to be accepted and processed. In this way, packet loss caused by packets being processed while a component is swapped can be prevented.

1) Packet Blocking. Consequently, CuPS support is needed to block packet flows before they pass through a DiPS+ component area facing a reconfiguration. This is achieved by holding up all outgoing packets of adjacent packet forwarders directed to the component area that is subject to adaptation. When the reconfiguration is completed, the execution of these blocked packets is resumed. To extend the targeted DiPS+ components with such blocking support in a modular and transparent manner, their packet forwarders are equipped with special Policy objects for intercepting packets (conducted by the CuPS platform).

The employed separation between the functionality of DiPS+ components (offered by a programmer) on the one hand and additional CuPS support to deactivate other components on the other hand has a number of advantages. First of all, minimal interference with the rest of the system can be guaranteed. Interrupting interactions in a composition can be restricted to those locations where an actual reconfiguration is needed. Instead of stopping the concurrency components (as proposed in [33]), only the adjacent DiPS+ components that initiate interactions (by forwarding packets) on the component that is subject to adaptation need to be blocked (as illustrated in Fig. 6). With this, conducting a safe reconfiguration does not depend on the employed concurrency model, implemented by the number of concurrency components and their location (controlled by the DMonA platform). This implies that CuPS
and DMonA can operate simultaneously, but independently from each other, sharing the same DiPS+ protocol stack. Secondly, due to separating support to block outgoing packets from the functional behavior of a DiPS+ component, changing the way of holding up packets at the packet forwarder will not interfere with existing component functionality and vice versa. As an example, we demonstrate the possibility to choose between two different blocking strategies. To obtain a safe reconfiguration, one could decide to block the execution thread in which the outgoing packets are initiated, using a ThreadBlockingPolicy. An alternative could be to queue outgoing packets without interrupting the execution thread by selecting the PacketQueueingPolicy. Such a change can be achieved transparently by only adapting the packet forwarders of the DiPS+ components that are involved. Finally, the impact of a blocking operation on a DiPS+ component can be made more fine-grained. Instead of stopping all interactions initiated by a component (e.g. by interrupting the execution thread of that component), only the packet forwarders initiating interactions that engage components that need to become passive should get blocked (illustrated by Component A in Fig. 6). In this way, packets that are sent out using other packet forwarders can still be initiated.
Fig. 6. Illustration of CuPS attached to DiPS+
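The following sketch contrasts the two blocking strategies named above. Only the names ThreadBlockingPolicy and PacketQueueingPolicy come from the text; the ForwarderPolicy interface, the Packet type and the way packets are handed to the policy are assumptions made for this illustration.

import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of the two packet-blocking strategies, behind one (assumed) policy
// interface installed at a DiPS+ packet forwarder by CuPS.
interface Packet { }

interface ForwarderPolicy {
    void outgoing(Packet p) throws InterruptedException;  // called for every outgoing packet
    void resume();                                         // called when reconfiguration is done
}

class ThreadBlockingPolicy implements ForwarderPolicy {
    private final Object lock = new Object();
    private volatile boolean blocked = true;
    public void outgoing(Packet p) throws InterruptedException {
        synchronized (lock) {
            while (blocked) lock.wait();   // hold the execution thread until resumed
        }
        deliver(p);
    }
    public void resume() {
        synchronized (lock) { blocked = false; lock.notifyAll(); }
    }
    private void deliver(Packet p) { /* pass the packet on to the connector */ }
}

class PacketQueueingPolicy implements ForwarderPolicy {
    private final Queue<Packet> pending = new ArrayDeque<>();
    private boolean blocked = true;
    public synchronized void outgoing(Packet p) {
        if (blocked) pending.add(p);       // queue the packet, do not block the caller
        else deliver(p);
    }
    public synchronized void resume() {
        blocked = false;
        while (!pending.isEmpty()) deliver(pending.poll());
    }
    private void deliver(Packet p) { /* pass the packet on to the connector */ }
}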
2) Activity Monitoring. In addition to holding up packets to be accepted and processed by the component area subject to adaptation, safe adaptation also requires this component area to be inactive (i.e. currently not processing any packets). Due to the reactive behavior of a functional component, monitoring code to check whether such a DiPS+ component is active or idle can (automatically) be added by simply extending the policy employed by the packet receiver of this component. In case
of concurrent interactions, activity inside a DiPS+ component can be monitored by means of a counter situated at its packet receivers, which is incremented on invocation and decremented upon return [34] (illustrated by means of the ActivityMonitor Policy in Fig. 6). When only sequential interactions are used, a counter can be replaced by a boolean flag. This reduces the monitoring overhead for each interaction.
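A minimal sketch of this counter-based activity monitoring could look as follows. The policy shape is an assumption; only the idea of incrementing a counter on invocation and decrementing it upon return is taken from the text.

import java.util.concurrent.atomic.AtomicInteger;

// Sketch of activity monitoring at a packet receiver: the counter is
// incremented when an invocation enters the component and decremented when
// processing returns, so CuPS can test whether the component is idle.
class ActivityMonitorPolicy {
    private final AtomicInteger active = new AtomicInteger(0);

    void process(Runnable componentWork) {
        active.incrementAndGet();          // invocation enters the component
        try {
            componentWork.run();
        } finally {
            active.decrementAndGet();      // processing of this packet has finished
        }
    }

    boolean isIdle() { return active.get() == 0; }
}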
5.3 Safe Deployment of Tightly-Coupled Components

The last category of reconfigurations covers the activation of component areas containing tightly coupled components, i.e. components that depend on cooperation with other components (locally, or in a distributed fashion) to implement a service. This cooperation is formalized by means of a transaction, consisting of a sequence of one or more asynchronous interactions. Referring to a fragmentation service, a transaction to fragment and reassemble a packet encapsulates a number of interactions, each representing the transfer of one fragment (packet) from a fragmenter to a reassembler. This cooperation implies that, from a reconfiguration point of view, the cooperating components are only consistent after termination of a transaction (i.e. when all fragments have been received by the reassembler and the original packet has been restored). As a consequence, when imposing a safe state for reconfiguration of a tightly coupled component, the dependencies formalized by the transaction must be taken into account.

Kramer and Magee [33] have stated that achieving safe software reconfigurations requires the software modules that are subject to adaptation (in this context the reassembler) to be both consistent and frozen (passive). When software modules are consistent, they do not include results of partially completed services (or transactions). By forcing software modules to be frozen (passive), state changes caused by new transactions are impossible. Kramer and Magee describe this required consistent and frozen state as the quiescence of a component.

As stated in the previous section, forcing a component area to be frozen has been accomplished (in a modular manner) by separating the functional behavior of a module from potential support to block its outgoing interactions. Since there is no knowledge about the state of the tightly coupled component at the moment packets are blocked, reconfiguration may lead to inconsistency (caused by replacing the component when protocol transactions are only partially completed). Referring again to the fragmentation service, replacing the reassembler when it has not yet received all fragments (and thus could not reassemble the original packet) will break the consistency between the fragmenting and reassembling components (and in that way, the correct functioning of the fragmentation service).

Consequently, additional support is required to drive a component area into a consistent state. This has been achieved by extending DiPS+ packet forwarders with special policies allowing "controlled" packet blocking. After blocking the packet forwarders that direct packets to the component that will be replaced, it should be possible for the CuPS platform (which conducts the actual reconfiguration) to check whether safe reconfiguration of the component is achievable. When this is not the case, blocked interactions are resumed one by one until the required safe state for reconfiguration is attained.
Table 1. Source code example to illustrate the layer property description for the DiPS+ IPv4 protocol (XML listing of 26 lines, not reproduced here)
Checking whether safe reconfiguration of a component is achievable requires verification of its execution state. For that purpose, we have extended tightly-coupled DiPS+ components (that are eligible for reconfiguration) with monitoring code that reflects their current execution state. In more detail, verifying the internal state of a DiPS+ component is achieved by checking its internal Unit through introspection via the ManagementInterface. CuPS will only check this state in the face of an actual reconfiguration, when the targeted component is idle.
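Putting the pieces together, the activation of such a component area might be orchestrated roughly as sketched below. The name ManagementInterface comes from the text, but its methods, the controlled-blocking API and the activator class are assumptions; only the loop that resumes blocked packets one by one until a quiescent, consistent state is reached follows the description above.

import java.util.List;

// Hypothetical sketch of the controlled-blocking activation loop conducted by CuPS.
interface ManagementInterface {
    boolean isIdle();                      // no packet currently being processed
    boolean isConsistent();                // no partially completed transactions
}

interface ControlledForwarderPolicy {
    void block();
    boolean resumeOnePacket();             // release a single blocked packet, false if none pending
    void resumeAll();                      // release everything once reconfiguration is complete
}

class CupsActivator {
    boolean activate(ManagementInterface oldComponent,
                     List<ControlledForwarderPolicy> adjacentForwarders,
                     Runnable swapCompositions) {
        for (ControlledForwarderPolicy f : adjacentForwarders) f.block();
        // Resume blocked packets one by one until the old area is idle and consistent
        // (real code would wait for each released packet to be processed before re-checking).
        while (!(oldComponent.isIdle() && oldComponent.isConsistent())) {
            boolean released = false;
            for (ControlledForwarderPolicy f : adjacentForwarders) {
                if (f.resumeOnePacket()) { released = true; break; }
            }
            if (!released) return false;   // safe state not reachable at this time
        }
        swapCompositions.run();            // plug the new component area into the stack
        for (ControlledForwarderPolicy f : adjacentForwarders) f.resumeAll();
        return true;
    }
}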
6 DiPS+ Prototype and Validation
6.1 Prototype

To validate the DiPS+ component platform and its potential for supporting run-time adaptability, we have developed a proof-of-concept prototype in Java, running on standard PC hardware. The protocol stack in Java is integrated in the Linux OS using a virtual Ethernet device (via the ethertap module in the Linux kernel). The DiPS+ prototype allows for building a protocol stack from an architecture specification. The DiPS+ architecture is represented in XML [35]. This representation specifies the core architecture entities, like components and protocol layers, along with how these entities are interconnected. To this end, descriptions for component connectors and layer glues, dispatchers and concurrency components are provided as well.
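To give an impression of what the stack builder discussed below does with such a description, the following Java fragment reads an XML architecture description and instantiates the listed components by reflection. The element and attribute names ("component", "class", "name") are invented for this sketch; the actual DiPS+ vocabulary is the one used in the listing of Table 1, which is not reproduced here.

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative stack-builder fragment: parse an architecture description and
// create the declared components reflectively. A real builder would also create
// connectors, dispatchers and concurrency components, and wire them together.
public class StackBuilder {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File(args[0]));            // e.g. a hypothetical ip-layer.xml
        NodeList components = doc.getElementsByTagName("component");
        for (int i = 0; i < components.getLength(); i++) {
            Element c = (Element) components.item(i);
            String name = c.getAttribute("name");
            String impl = c.getAttribute("class");
            Object instance = Class.forName(impl)
                    .getDeclaredConstructor().newInstance();
            System.out.println("created component " + name + " -> " + instance);
        }
    }
}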
By way of example of a DiPS+ description, the source code listing in Table 1 zooms in on the IP layer of a protocol stack. It lists all essential elements that layers can be composed of: components (lines 3, 3-6, and 13), connectors (lines 8-11 and 22-24), a dispatcher (lines 17-20), a concurrency component (line 15), and the upper and lower entry points (lines 7 and 21). Each of these items is represented as such in the architecture description, which makes the listing self-explanatory (Table 1).

Having an architecture description separated from the implementation has major advantages. First of all, the internals of the DiPS+ platform are transparent to the protocol stack developer, resulting in a black-box framework. Developing a DiPS+ protocol stack boils down to designing the appropriate components and providing a correct composition description. A stack builder tool is used to automatically transform the architecture descriptions into a running protocol stack. Consequently, a developer can configure different compositions without having to write extra code, or having to change or recompile the source code of individual components. A second advantage is that the use of an architectural description allows specific (optimizing or test) builders to be applied to the same architecture description. Testing a protocol layer in isolation, for instance, reuses the architecture description, but creates the layer in a different context (i.e. a test case instead of a protocol stack). Finally, an architecture description allows for optimization, in the sense that an optimizer can analyze the architecture and change it in order to make it more efficient. When, for instance, a network router is known to be connected to two networks with the same maximum segment size, it can be reconfigured to omit reassembly and refragmentation of forwarded packets, since they are already fragmented in sufficiently small pieces. Only packets for local delivery must be reassembled in this case.

6.2 Validation

We have successfully validated the DiPS+ approach, among others in an industrial case study that compared a DiPS+ and a commercial Java implementation of the RADIUS authentication and authorization protocol [5, 6]. Performance results clearly show the advantage of using application-specific scheduling strategies during overload. Moreover, the DiPS+ RADIUS server is able to gracefully cope with varying (over)load conditions. DiPS+ did not only facilitate the development of the RADIUS protocol, it also allowed us to experiment with different scheduling strategies without having to change any functional code.

In addition, the DiPS+ framework has been validated in the context of on-demand composition of a protocol stack, based on application-specific requirements. We have built a prototype in DiPS+ that allows an application to express high-level service requirements (e.g. reliability of data transfer, encryption, local or networked transfer, etc.) that must be supported by the underlying protocol stack. Based on these requirements, a combination of protocol layers is suggested by a stack composition tool [8, 9], and a protocol stack is built by the DiPS+ builder [36]. This illustrates the flexibility of the DiPS+ platform.

Multiple Master's theses have explored and validated the DiPS+ component platform from various perspectives. First of all, DiPS+ has been used to design and implement particular protocols (e.g. SIP [37], IPv6 [38], a TCP booster [39], dynamic routing
protocols in [40, 41], an IPSec based VPN solution [42] and a stateful firewall in [43]). Secondly, DiPS+ has been applied in various domains to explore its applicability. The work in [44], for instance, describes how network management techniques can be used in combination with DiPS+ protocol stacks. Finally, more research related theses have explored fundamental extensions to the DiPS+ component framework and architecture (e.g. self-adaptability [45], and concurrency control [46]). Two main conclusions may be drawn from our experiences in guiding Master students during their thesis. First of all, the DiPS+ framework and architecture can quickly be assimilated, even by students with limited experience in software architectures and a mainly object-oriented design background. Nevertheless, creating a high-level modularized DiPS+ design of a network protocol was not always trivial, and sometimes required assistance of a DiPS+ team member to put the student on the right track. In our view, the students’ lack of design experience and the often poor documentation of protocol specifications lie at the basis of the complicated modularization process. The main advantage, compared to an object-oriented design, is that packet flows are clearly defined and well-identifiable, which makes a DiPS+ design much more understandable. Once the high-level design becomes clear, development of individual components is straightforward. Secondly, the theses show that DiPS+ allows for a highly incremental software development process. Stated differently, the first running prototype can usually be delivered quickly after implementation has started (i.e. after a few weeks). From then on, the prototype can easily be customized and extended towards the stated requirements. Although the DiPS+ component framework has been proposed in the context of protocol stacks, we are convinced of its applicability in other operating system domains. The first research results in the context of USB device drivers [47, 48] and file systems [30, 45] are very promising. These systems reflect a layered architecture, which perfectly matches the DiPS+ architecture.
7 Related Work
7.1 Protocol Stack Frameworks

Although multiple software design frameworks for protocol stack development have been described in the literature [49–52], we compare the DiPS+ approach to three software architectures, which are tailored to protocol stacks and/or concurrency control: SEDA [22], Click modular router [53], and Scout [54].

SEDA [22] offers an event-based architecture for supporting massively concurrent web servers. A stage in SEDA can be compared with a DiPS+ component area along with its preceding concurrency component. Yet, the SEDA controller and associated stage are tightly coupled, whereas DiPS+ clearly separates a concurrency component from the functional code. As such, SEDA does not provide a clean separation between the functional and the management level. In addition, SEDA does not provide developers with an architecture specification, which makes it difficult for developers to understand the data and control flow through the set of independent stages.

The Click modular router [53] is based on a design very analogous to DiPS+. Although one can recognize a pipe-and-filter architectural style, Click pays much less
attention to software architecture than DiPS+. Click supports two packet transfer mechanisms: push and pull. DiPS+ offers a uniform push packet transfer mechanism and allows for active behavior inside the component graph by means of explicit concurrency components.

The Scout operating system [54] uses a layered software architecture, yet does not offer fine-grained entities such as components for functionality, dispatching, or concurrency. Scout is designed around a communication-oriented abstraction called the path, which represents an I/O channel throughout a multi-layered system and essentially extends a network connection into the operating system.

7.2 Concurrency and Separation of Concerns

A critical element in our research is the separation of concurrency from functional code. Kiczales [55] defines non-functional concerns as aspects that cross-cut functional code. An aspect is written in a specific aspect language and is woven into the functional code by a so-called aspect weaver at pre-processing time. Although this approach clearly separates all aspects from the functional code (at design-time), aspects tend to disappear at run-time, which makes it very difficult (if not impossible) to adapt aspects dynamically. Apertos [16] introduces concurrent objects that separate mechanisms of synchronization, scheduling and interrupt mask handling from the functional code. This makes software more understandable, and reduces the risk of errors.
8 Conclusion
Our contribution represents a successful case study, DiPS+, on the development of component-based software for protocol stacks that are adaptable at run-time. The employed architectural styles and the resulting component abstractions (1) increase the flexibility and adaptability of protocol stack software and (2) facilitate the development process of software that is complex and error-prone by nature, especially when additional concerns (such as the need for run-time adaptability) are imposed.

The combination of the pipe-and-filter and the blackboard architectural style has resulted in the design of plug-compatible DiPS+ components. By consequence, DiPS+ components are unaware of other components they are connected to, directly or indirectly. This is a major advantage in terms of flexibility, as it allows for individual components to be reused in different compositions. In addition, by employing these architectural styles (together with the layered style), the DiPS+ platform offers a number of framework abstractions (such as components, connectors, and packets) to ease the development of adaptable protocol stacks. Finally, separate component types for functionality, concurrency and packet dispatching allow a developer to concentrate on a single concern (e.g. concurrency) without being distracted by other concerns that are scattered across the same functional code.

As stated in the introduction, a second objective of the DiPS+ component platform is to allow for modular integration of non-functional extensions that cross-cut the core protocol stack functionality. In this chapter, we have illustrated (by means of DMonA and CuPS) that the use of explicit communication ports is essential to transparently
extend a DiPS+ protocol stack with support for controlling the packet flow. More precisely, they serve as hooks for connecting the data and the management plane.

We have discussed our experiences with using the DiPS+ component platform in real-life situations. Although a seamless transformation of the DiPS+ platform towards embedded systems is not yet feasible, we argue that the principles behind DiPS+ (i.e. a combination of component-based development and separation of concerns) are crucial for component-based embedded network systems. In our opinion, this combination not only facilitates the implementation of component hot-swapping and concurrency control (as we have demonstrated), but also seems very useful for other concerns such as on-demand and safe software composition [8, 9, 56], transparent data flow inspection [22], performance optimization [53], isolated and incremental unit testing [7], and safe updates of distributed embedded systems [10].

In our opinion, concerns such as data flow monitoring, component hot-swapping, unit testing, performance optimization, and safe composition are crucial for embedded software and will become even more so with the ongoing trend towards mobile and ad-hoc network connectivity of (highly heterogeneous) embedded devices. We hope that this case study can convince embedded system developers of the need for, and the power of, a well-defined software architecture and component platform.
Acknowledgments

Part of the work described in this chapter has been carried out for Alcatel Bell and supported by the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT SCAN #010319, IWT PEPiTA #990219). Additional support came from the Fund for Scientific Research – Flanders (Belgium) (F.W.O. RACING #G.0323.01).
References 1. Hubaux, J.P., Gross, T., Boudec, J.Y.L., Vetterli, M.: Towards self-organized mobile ad hoc networks: the Terminodes project. IEEE Communications Magazine 31 (2001) 118–124 2. Shaw, M., Garlan, D.: Software Architecture - Perspectives on an emerging discipline. Prentice-Hall (1996) 3. Schneider, J.G., Nierstrasz, O.: Components, scripts and glue. In L. Barroca, J.H., Hall, P., eds.: Software Architectures – Advances and Applications. Springer-Verlag (1999) 13–25 4. Michiels, S.: Component Framework Technology for Adaptable and Manageable Protocol Stacks. PhD thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2003) 5. Michiels, S., Desmet, L., Joosen, W., Verbaeten, P.: The DiPS+ software architecture for self-healing protocol stacks. In: Proceedings of the 4th Working IEEE/IFIP Conference on Software Architecture (WICSA-4), Oslo, Norway, IEEE/IFIP, IEEE (2004) 6. Michiels, S., Desmet, L., Verbaeten, P.: A DiPS+ Case Study: A Self-healing RADIUS Server. Report CW-378, Dept. of Computer Science, K.U.Leuven, Leuven, Belgium (2004) 7. Michiels, S., Walravens, D., Janssens, N., Verbaeten, P.: DiPS: Filling the Gap between System Software and Testing. In: Proceedings of Workshop on Testing in XP (WiTXP2002), Alghero, Italy (2002) 8. S¸ora, I., Verbaeten, P., Berbers, Y.: A description language for composable components. In: Proceedings of 6th International Conference on Fundamental Approaches to Software Engineering (FASE 2003). Volume 2621., Warsaw, Poland, Springer-Verlag, Lecture Notes in Computer Science (2003) 22–36
9. S¸ora, I., Cretu, V., Verbaeten, P., Berbers, Y.: Automating decisions in component composition based on propagation of requirements. In: Proceedings of 7th International Conference on Fundamental Approaches to Software Engineering (FASE 2004), Barcelona, Spain (2004) 10. Janssens, N., Steegmans, E., Holvoet, T., Verbaeten, P.: An Agent Design Method Promoting Separation Between Computation and Coordination. In: Proceedings of the 2004 ACM Symposium on Applied Computing (SAC 2004), ACM Press (2004) 456–461 11. Joosen, W.: Load Balancing in Distributed and Parallel Systems. PhD thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (1996) 12. Wetherall, D., Legedza, U., Guttag, J.: Introducing new internet services: Why and how. IEEE Network, Special Issue on Active and Programmable Networks 12 (1998) 13. Campbell, A.T., De Meer, H.G., Kounavis, M.E., Miki, K., Vicente, J.B., Villela, D.: A survey of programmable networks. SIGCOMM Comput. Commun. Rev. 29 (1999) 7–23 14. Janssens, N., Michiels, S., Mahieu, T., Verbaeten, P.: Towards Transparent Hot-Swapping Support for Producer-Consumer Components. In: Proceedings of Second International Workshop on Unanticipated Software Evolution (USE 2003), Warsaw, Poland (2003) 15. Janssens, N., Michiels, S., Holvoet, T., Verbaeten, P.: A Modular Approach Enforcing Safe Reconfiguration of Producer-Consumer Applications. In: Proceedings of The 20th IEEE International Conference on Software Maintenance (ICSM 2004), Chicago Illinois, USA (2004) 16. Itoh, J., Yokote, Y., Tokoro, M.: Scone: using concurrent objects for low-level operating system programming. In: Proceedings of the tenth annual conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA’95), Austin, TX, USA, ACM Press, New York, NY, USA (1995) 385–398 17. Kiczales, G., Lamping, J., Lopes, C.V., Maeda, C., Mendhekar, A., Murphy, G.C.: Open implementation design guidelines. In: Proceedings of the 19th International Conference on Software Engineering (ICSE’97), Boston, MA, USA, ACM Press, New York, NY, USA (1997) 481–490 18. Kiczales, G., des Rivi`eres, J., Bobrow, D.G.: The Art of the Metaobject Protocol. MIT Press, Cambridge, MA (1991) 19. Szyperski, C.: Component Software: Beyond Object-Oriented Programming. ACM Press/Addison-Wesley Publishing Co., New York, NY, USA (1998) 20. Breslau, L., Jamin, S., Schenker, S.: Comments on the performance of measurement-based admission control algorithms. In: Proceedings of IEEE INFOCOM 2000. (2000) 1233–1242 21. Tanenbaum, A.S.: Computer Networks. Prentice Hall (1996) 22. Welsh, M.F.: An Architecture for Highly Concurrent, Well-Conditioned Internet Services. PhD thesis, University of California at Berkeley, Berkeley, CA, USA (2002) 23. Chen, H., Mohapatra, P.: Session-based overload control in QoS-aware web servers. In: Proceedings of IEEE INFOCOM 2002, New York, NY, USA (2002) 24. Chen, X., Mohapatra, P., Chen, H.: An admission control scheme for predictable server response time for web accesses. In: Proceedings of the tenth international conference on World Wide Web, ACM Press, New York, NY, USA (2001) 545–554 25. Abdelzaher, T.F., Lu, C.: Modeling and performance control of internet servers. Invited paper at 39th IEEE Conference on Decision and Control (2000) 26. Cherkasova, L., Phaal, P.: Session based admission control: a mechanism for improving the performance of an overloaded web server. Technical Report HPL-98-119, HP labs (1998) 27. 
Diao, Y., Gandhi, N., Hellerstein, J.L., Parekh, S., Tilbury, D.: Using mimo feedback control to enforce policies for interrelated metrics with application to the apache web server. In: Proceedings of Network Operations and Management Symposium, Florence, Italy (2002) 28. Kanodia, V., Knightly, E.: Multi-class latency-bounded web services. In: Proceedings of 8th IEEE/IFIP International Workshop on Quality of Service (IWQoS 2000), Pittsburgh, PA, USA (2000)
29. Lu, C., Abdelzaher, T., Stankovic, J., Son, S.: A feedback control approach for guaranteeing relative delays in web servers. In: Proceedings of the 7th IEEE Real-Time Technology and Applications Symposium (RTAS), Taipei, Taiwan (2001) 30. Michiels, S., Desmet, L., Janssens, N., Mahieu, T., Verbaeten, P.: Self-adapting concurrency: The DMonA architecture. In Garlan, D., Kramer, J., Wolf, A., eds.: Proceedings of the First Workshop on Self-Healing Systems (WOSS’02), Charleston, SC, USA, ACM SIGSOFT, ACM press (2002) 43–48 31. Steere, D.C., Goel, A., Gruenberg, J., McNamee, D., Pu, C., Walpole, J.: A feedback-driven proportion allocator for real-rate scheduling. In: Proceedings of the third USENIX Symposium on Operating Systems Design and Implementation (OSDI’99), New Orleans, LA, USA, USENIX Association, Berkeley, CA, USA (1999) 145–158 32. Hoebeke, J., Leeuwen, T.V., Peters, L., Cooreman, K., Moerman, I., Dhoedt, B., Demeester, P.: Development of a TCP protocol booster over a wireless link. In: Proceedings of the 9th Symposium on Communications and Vehicular Technology in the Benelux (SCVT 2002), Louvain la Neuve (2002) 33. Kramer, J., Magee, J.: The evolving philosophers problem: Dynamic change management. IEEE Transactions on Software Engineering 16 (1990) 1293–1306 34. McNamee, D., Walpole, J., Pu, C., Cowan, C., Krasic, C., Goel, A., Wagle, P., Consel, C., Muller, G., Marlet, R.: Specialization tools and techniques for systematic optimization of system software. ACM Transactions on Computer Systems 19 (2001) 217–251 35. Harold, E.R., Means, W.S.: XML in a Nutshell. Second edn. O’Reilly & Associates, Inc. (2002) 36. Michiels, S., Mahieu, T., Matthijs, F., Verbaeten, P.: Dynamic Protocol Stack Composition: Protocol Independent Addressing. In: Proceedings of the 4th ECOOP Workshop on Object-Orientation and Operating Systems (ECOOP-OOOSWS’2001), Budapest, Hungary, SERVITEC (2001) 37. Vandewoestyne, B.: Internet Telephony with the DiPS Framework. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2003) 38. Janssen, G.: Implementation of IPv6 in DiPS. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2002) 39. Larsen, T.: Implementation of a TCP booster in DiPS. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2004) 40. Buggenhout, B.V.: Study and Implementation of a QoS router. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2001) 41. Elen, B.: A flexible framework for routing protocols in DiPS. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2004) 42. Vandebroek, K.: Development of an IPSec based VPN solution with the DiPS component framework. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2004) 43. Cornelis, I., Weerdt, D.D.: Development of a stateful firewall with the DiPS component framework. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2004) 44. Bjerke, S.E.: Support for Network Management in the DiPS Component Framework. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2002) 45. Desmet, L.: Adaptive System Software with the DiPS Component Framework. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2002) 46. Michiels, D.: Concurrency Control in the DiPS framework. Master’s thesis, K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2003) 47. Coster, W.D., Krock, M.D.: CoFraDeD: a Component Framework for Device Drivers. 
Technical report, internal use only, PIMC/K.U.Leuven, Dept. of Computer Science, Leuven, Belgium (2001)
48. Michiels, S., Kenens, P., Matthijs, F., Walravens, D., Berbers, Y., Verbaeten, P.: Component Framework Support for developing Device Drivers. In Rozic, N., Begusic, D., Vrdoljak, M., eds.: International Conference on Software, Telecommunications and Computer Networks (SoftCOM). Volume 1., Split, Croatia, FESB (2000) 117–126 49. Hutchinson, N.C., Peterson, L.L.: The x-kernel: An architecture for implementing network protocols. IEEE Transactions on Software Engineering 17 (1991) 64–76 50. Bhatti, N.T.: A System for Constructing Configurable High-level Protocols. PhD thesis, Department of Computer Science, University of Arizona, Tucson, AZ, USA (1996) 51. Ballesteros, F.J., Kon, F., Campbell, R.: Off++: The Network in a Box. In: Proceedings of ECOOP Workshop on Object Orientation in Operating Systems (ECOOP-WOOOS 2000), Sophia Antipolis and Cannes, France (2000) 52. H¨uni, H., Johnson, R.E., Engel, R.: A framework for network protocol software. In: Proceedings of the tenth annual conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA’95), Austin, TX, USA, ACM Press, New York, NY, USA (1995) 358–369 53. Kohler, E.: The Click Modular Router. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA (2001) 54. Montz, A.B., Mosberger, D., O’Malley, S.W., Peterson, L.L.: Scout: A communicationsoriented operating system. In: Proceedings of Fifth Workshop on Hot Topics in Operating Systems (HotOS-V), Orcas Island, WA, USA, IEEE Computer Society Press (1995) 58–61 55. Kiczales, G., Lamping, J., Menhdhekar, A., Maeda, C., Lopes, C., Loingtier, J.M., Irwin, J.: Aspect-Oriented Programming. In Aks¸it, M., Matsuoka, S., eds.: Proceedings of 11th European Conference on Object-Oriented Programming (ECOOP’97). Volume 1241 of LNCS. Springer-Verlag, Jyv¨askyl¨a, Finland (1997) 220–242 56. Desmet, L., Piessens, F., Joosen, W., Verbaeten, P.: Improving software reliability in datacentered software systems by enforcing composition time constraints. In: Proceedings of Third Workshop on Architecting Dependable Systems (WADS2004), Edinburgh, Scotland (2004) 32–36
CoConES: An Approach for Components and Contracts in Embedded Systems

Yolande Berbers, Peter Rigole, Yves Vandewoude, and Stefan Van Baelen

DistriNet, Department of Computer Science, KULeuven, Celestijnenlaan 200A, B-3001 Heverlee
{Yolande.Berbers,Peter.Rigole,Yves.Vandewoude,Stefan.VanBaelen}@cs.kuleuven.ac.be
Abstract. This chapter presents CoConES (Components and Contracts for Embedded Software), a methodology for the development of embedded software, supported by a tool chain. The methodology is based on the composition of reusable components with the addition of a contract principle for modeling non-functional constraints. Non-functional constraints are an important aspect of embedded systems, and need to be modeled explicitly. The tool chain contains CCOM, a tool used for the design phase of software development, coupled with DRACO, a middleware layer that supports the component-based architecture at run-time.
1 Introduction
Embedded systems are typically characterized by a specific functionality in a specific domain, where the software element is taking an increasingly important role. When developing embedded software, besides a range of software quality and stability aspects, one has to consider non-functional aspects and resource constraints. Embedded systems often have limited processing power, storage capacity and network bandwidth. A developer has to cope with these constraints and make sure that the software will be able to run on the constrained system. Often, embedded systems also have timing constraints on their computations.

Today, embedded software is becoming complex; according to [8] the complexity of embedded-system applications is increasing by 140% a year. It is no longer feasible to build such systems from scratch. Reuse of existing software is becoming vital, especially in the light of today's tight time-to-market demands in industry. Reuse should ensure that one can use validated software; only then will reuse result in shorter development time. To enable reuse, we have chosen a component-based approach for building embedded systems.

Component software is quite common today in traditional applications. A large software system often consists of multiple interacting components. These components can be seen as large objects with a clear and well-defined task. Different definitions of a component exist; some see objects as components, while others define components as large parts of coherent code, intended to be reusable and highly documented. We base our definition on the one given by Szyperski [16], see Section 2.1. However, many definitions focus only on the functional aspect of a component. For embedded software
the non-functional constraints cannot be discarded. Modeling these non-functional constraints explicitly enables one to safely reuse components in a design, while being sure that the non-functional constraints will be met. This is the major motivation for the work presented in this chapter.

In the past few years, we have developed a methodology, CoConES (Components and Contracts for Embedded Software), for developing software for embedded systems using a component-oriented approach. Our approach uses contracts to model the non-functional constraints. The CoConES methodology is backed by a tool chain that spans both the design-time and the runtime phase. CCOM (Component and Contract-Oriented Modeling) is a software design tool, enabling the developer to specify components and their interactions, including contracts for the non-functional constraints. DRACO (DistriNet Reliable and Adaptive COmponents) is a middleware layer that at runtime supports the component-oriented software architecture on which our methodology is based. It allows components to be created and destroyed, organizes the communication between components, and monitors the contracts defined at design time. Currently, CoConES offers support for timing and bandwidth contracts. The CoConES methodology is aimed at computationally powerful systems running complex software. Although we address resource-constrained systems, we do not claim to support hard real-time applications, nor embedded systems with small footprints.

To validate the various elements of our approach, we have applied our methodology using our tools to various smaller examples and to a fully fledged embedded case study. This chapter gives a comprehensive overview of our methodology, our component-oriented software architecture, the supporting tool chain and the case study. Elements of this work have been published at conferences and workshops [10, 11, 17, 18]. The presented work was started during the SEESCOA project (Software Engineering for Embedded Systems, using a Component Oriented Approach), and is continued in the CoDAMoS project (Context Driven Adaptation of Mobile Services).

This chapter is organized as follows: Section 2 gives an overview of our component architecture. Sections 3 and 4 respectively describe the design-time tool and the runtime tool that together support our methodology. Section 5 presents a validation of our methodology, component architecture and tools, through a fully fledged embedded case study. We compare our work with related work in Sec. 6, and conclude in Sec. 7.
2 Core Concepts of the Proposed Component Architecture
This section describes the software architecture used in the Components and Contracts for Embedded Software methodology. Before giving details about CoConES, we list the main strengths and characteristics of CoConES:

1. CoConES components are loosely coupled to facilitate reuse.
2. CoConES components communicate through ports.
3. Connectors are used to connect communicating ports.
4. CoConES defines constructs for composing applications out of components:
   (a) some constructs describe design-time compositions (blueprints);
   (b) other constructs describe run-time compositions (instances).
5. Contracts are used to specify and verify non-functional constraints:
   (a) contracts can be used to specify and verify non-functional aspects of compositions at design-time;
   (b) contracts can be used to verify the correct execution of compositions with regard to their resource use at run-time;
   (c) currently, CoConES supports contracts for timing and for bandwidth requirements.
6. CoConES is a methodology, supported by a CASE tool and by a runtime environment. These tools are described in Sec. 3 and 4.

2.1 CoConES Components
The most common definition of a component was given by Szyperski in [16]: "A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties."

In CoConES, a distinction is made between components and component blueprints. The latter are reusable static entities that only exist at design time and contain a complete description of the type of a component and its implementation (the code). In addition, component blueprints have a unique identifier and a version number, and can be stored in a blueprint catalog. In contrast, the term component is reserved for a runtime component instance containing a certain runtime state.

A CoConES component complies with the definition of Szyperski: it is a reusable, documented software entity, offering a coherent behavior, and is used as a building block in applications. In addition, all inter-component communication is explicit and takes place by sending asynchronous messages through external interfaces. In general, interfaces are an abstraction of the behavior of a component and consist of a (subset of the) interactions of that component, together with a set of constraints on when these interactions may occur. In CoConES, a component interface consists of a group of messages that may be sent to or sent out from the component. These interfaces are formally specified using the port construct.
2.2 CoConES Ports
A CoConES port represents a bidirectional communication access point of a component, consisting of an interface for incoming messages and an interface for outgoing messages. As with components, a distinction is made between port blueprints and ports. In CoConES, a port is specified on three levels:

Syntactic Level: a syntactic description of the messages that can be sent and received.
Semantic Level: pre- and postconditions associated with the messages.
Synchronization Level: a description of the sequence in which the messages have to occur.

At the moment of writing, only the syntactic and the synchronization levels have been formally worked out in detail. Two ports can only be interconnected if their associated interfaces match on all levels. The number of connections that can be made with
a port is specified using the MNOI (Maximum Number of Instances) property of the port. A major advantage of this restriction is that, with this additional knowledge about the usage of the component, the developer can make more accurate QoS statements about the services the component delivers. Evidently, these restrictions are enforced at runtime by our execution environment (see Sec. 4). CoConES supports three types of ports with respect to this MNOI:

Single Port: A single port allows for one-on-one communication. This port is represented by a rectangle in our CCOM design tool (Fig. 1(a)).

Multiport: A multiport of dimension n is conceptually identical to n single ports, as it allows for n connectors to be attached simultaneously. Although messages can be sent to the entire multiport as such (in this case it behaves as a multicast port), the intended behavior of a multiport is to send messages to a specific index. Conceptually, a multiport is analogous to a call center: a connection is granted to a multiport unless it is already involved in its specified maximum number of connections. Once connected, conversation is one-to-one. As depicted in Fig. 1(b), the symbol of a multiport depends on its dimension.

Multicast Port: A multicast port of dimension n is a single port that can have n connectors attached to it. Messages sent to a multicast port are always sent to all connectors attached to it. It is therefore not possible to differentiate between different receivers. Also, a multicast port can never receive messages. The graphical notation of a multicast port is a trapezium (Fig. 1(c)).

The dimension of both multiports and multicast ports may be ∞.
2.3 CoConES Connectors
Ports are connected using the connector construct. Compatibility of the port interfaces is checked both at design time and at runtime by the CoConES tool chain. As such, connectors act as a kind of tunnel during message transmission. Connectors provide a layer of abstraction from component location, since they can cross node boundaries when different components are spread over various nodes in a distributed system: components are unaware whether they are communicating with local or with remote components. At runtime, the underlying middleware system (see Sec. 4) takes care of this transparency.
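To make the port and connector constructs more tangible, the following Java sketch shows one possible shape for them: a connector forwards asynchronous messages to the target port's incoming queue, and a port refuses connections beyond its MNOI. The class shapes are assumptions for illustration only and do not reflect the actual CCOM or DRACO APIs.

import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch (with assumed class shapes) of ports and connectors: asynchronous
// message delivery through a queue, and MNOI enforcement when attaching connectors.
class Message { final String name; Message(String name) { this.name = name; } }

class Port {
    private final int mnoi;                               // maximum number of connections
    private final List<Connector> connectors = new CopyOnWriteArrayList<>();
    private final BlockingQueue<Message> incoming = new LinkedBlockingQueue<>();
    Port(int mnoi) { this.mnoi = mnoi; }

    synchronized void attach(Connector c) {
        if (connectors.size() >= mnoi)
            throw new IllegalStateException("MNOI exceeded for this port");
        connectors.add(c);
    }
    void send(Message m) {                                // outgoing interface
        for (Connector c : connectors) c.transmit(m);     // to every attached connector
    }
    void deliver(Message m) { incoming.add(m); }          // incoming interface (asynchronous)
    Message take() throws InterruptedException { return incoming.take(); }
}

class Connector {
    private final Port target;
    Connector(Port source, Port target) { this.target = target; source.attach(this); }
    void transmit(Message m) { target.deliver(m); }       // could cross node boundaries
}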
2.4 CoConES Contracts
Contracts are used in CoConES to specify non-functional constraints. They allow a designer to impose constraints on the behavior of components and on the interactions between them. Contracts are attached by the designer when an application is constructed by composing components. They can be attached to all previously described constructs of the CoConES architecture. A CoConES contract is used both for annotation and for verification. It is an important tool for a designer when documenting applications. Furthermore, contracts are used to verify the correctness of a program with regard to its resource use. Some verifications can be done statically and are performed by CCOM, our component composition tool.
Fig. 1. Different ports in CoConES: (a) single port, (b) multiport, (c) multicast port
Other verifications are done dynamically by a contract monitoring module in DRACO (see Sec. 4). Although the CoConES contracts are a general construct, only timing contracts and bandwidth contracts have been worked out at the time of writing. Work is underway to support memory contracts as well.

A CoConES timing contract specifies and imposes the timing constraints to which communicating components have to adhere. Timing contracts can be attached both to connectors and to ports (to specify constraints concerning multiple connections – e.g. 500 ms after the arrival of message m on port p1, a response must be broadcast on port p2). Two types of timing contracts are currently supported: deadline timing contracts and periodicity timing contracts. A deadline timing contract imposes a constraint on the occurrence time of a particular event, given the occurrence time of an event that happened earlier. Possible events include the sending of a message, the receipt of a message, and the termination of the processing of a received message. A periodicity timing contract imposes a constraint on the periodic occurrence of a particular event.
CoConES bandwidth contracts specify constraints concerning the flux density of the information exchanged between two ports. By expressing characteristics of the amount of information exchanged per time unit, we can deduce how suitable a connector is in a distributed component configuration. Therefore, these bandwidth contracts improve the self-containedness of components with regard to their use in a distributed system, making components location transparent. By performing a design-time analysis step that checks the feasibility of the component's connectors over a given connection, the CCOM tool can reject or accept distributed configurations.

In order to make bandwidth contracts easily understandable by application engineers using the CCOM tool, CoConES bandwidth contracts are expressed in terms of concepts that are easy to reason about. Descriptions such as bits per second, available time frames, packets per time unit, etc. make no sense from a component's point of view. Components send out messages at a certain rate, so quantitative aspects of their port's communication behavior should be described in terms of message size (MS) and interval time (IT) between consecutive messages. CoConES bandwidth contracts consist of constraints on several statistical characteristics derived from these message sizes and interval times. Figure 2 gives a conceptual illustration of the relationship between them and the bandwidth they use.

Message Size (MS): the size of a message expressed in bytes.
Interval Time (IT): the time between the beginning of the transmission of two consecutive messages.
Fig. 2. Message timing
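A bandwidth-contract check along these lines could be sketched as follows in Java. The observer API and the feasibility rule (average MS divided by average IT must fit within the available bandwidth) are illustrative assumptions; the actual CoConES contracts constrain several statistical characteristics of MS and IT.

import java.util.ArrayList;
import java.util.List;

// Sketch of a bandwidth check expressed in the two quantities defined above:
// message size (MS, bytes) and interval time (IT, between the start of two
// consecutive transmissions).
class BandwidthObserver {
    private final List<Integer> messageSizes = new ArrayList<>();
    private final List<Long> intervalTimesMs = new ArrayList<>();
    private long lastSendMs = -1;

    void messageSent(int sizeBytes, long nowMs) {
        messageSizes.add(sizeBytes);
        if (lastSendMs >= 0) intervalTimesMs.add(nowMs - lastSendMs);
        lastSendMs = nowMs;
    }

    double averageMessageSize() {
        return messageSizes.stream().mapToInt(Integer::intValue).average().orElse(0);
    }
    double averageIntervalMs() {
        return intervalTimesMs.stream().mapToLong(Long::longValue).average().orElse(Double.MAX_VALUE);
    }

    // A connection is considered feasible if the average flux density implied
    // by MS and IT stays below the bandwidth available on the connection.
    boolean satisfies(double availableBytesPerSecond) {
        double requiredBytesPerSecond =
                averageMessageSize() / (averageIntervalMs() / 1000.0);
        return requiredBytesPerSecond <= availableBytesPerSecond;
    }
}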
2.5 CoConES Compositions
Applications are constructed by creating component compositions. In this process, guided by the CCOM design tool (see Sec. 3), existing component blueprints can be loaded from a component repository and visually connected to each other. Additional components can be created, and key properties (such as the number of ports and their interfaces) can be specified in the tool. The CCOM tool generates the necessary skeleton code that can be filled in by the developer. This process is bidirectional, in that the properties of an existing component can be retrieved from its source code. During the design of the application, the CCOM tool checks architectural consistency, such as the compatibility of the connected ports on the syntactic and synchronization level, and, where possible, the feasibility of contracts.

2.6 Supporting Tools

The entire methodology is supported by a tool chain. The CCOM (Component and Contract-Oriented Modeling) composition tool is a CASE design tool supporting the design and implementation of components and the construction of compositions. CCOM
is capable of generating skeleton code that assists the developer in the implementation process. The code is then converted to standard Java with a preprocessor, and compiled. At runtime, DRACO (DistriNet Reliable And Adaptive Components) is the middleware system responsible for the correct execution of CoConES compositions. We discuss CCOM and its code generation in Sec. 3. DRACO is discussed in Sec. 4.
3 CCOM Case Tool
The CCOM tool supports the development of applications using the architectural CoConES concepts described in the previous section. First, CCOM assists the developer during the creation and development of:

Component Blueprints: component blueprints can be graphically created, specified and stored in a repository for later use.
Compositions: compositions can be constructed using components (either custom made or retrieved from a repository), connectors and contracts.

In addition, the CCOM tool provides three views in order to decompose the structure of an application. These views allow the developer to focus on the issue at hand, and make sure that the relevant constructs are easily accessible:

Blueprint models: all component blueprints from which instances will be used in the composition are grouped in a blueprint model.
Instance models: the instance model gives a structural overview of the application. It consists of connected component instances.
Scenario models: a scenario model represents a specific action in the application. The focus is on non-functional constraints, which are represented by contracts that can be attached to component instances, port instances and/or connectors.

The following paragraphs elaborate on the different features of the CCOM tool, and how these features assist the developer in the construction of an application using the CoConES methodology.

3.1 Developing Component Blueprints

The development of a component blueprint comprises two steps:

1. The specification of the blueprints of both the component and its ports.
2. Providing an appropriate implementation of the messages a component can receive. This is achieved by filling in the skeleton code that was generated by the CCOM tool during the previous step.

Once a component has been specified and implemented, it is transformed into an XML representation and stored in the component repository.

Fig. 3 shows a screen shot of the tool during the development of a blueprint model. In our tool, blueprints are represented with dashed lines. Large rectangles are components, small ones are the ports attached to a component. On the left in the figure is the repository of component blueprints, ordered hierarchically. On the right, a blueprint model
Fig. 3. A blueprint model in the CCOM tool
is shown that groups several component blueprints needed in a car regulator or cruise control application. The cruise control application is used to align the speed of a car with a target speed requested by the driver using a cruise control. The regulator makes use of a speedometer to read the vehicle speed. At a frequency of 2 Hertz, that is every 500 ms, the regulator should calculate the new speed of the car, and pass this on to the engine. The regulator should be stopped, among other things, when the brakes are hit, the driver accelerates, the driver turns down the cruise control, or when the speed drops below a certain limit.

As discussed in Sec. 2.2, port interfaces are specified on multiple levels: the syntactic, the semantic (not yet implemented) and the synchronization level. The specification of these interfaces is shown in Fig. 4. The port in this figure is the SpeedUpdate port, which is part of the SpeedoMeter component. This component measures the speed of the car at a frequency of 2 Hertz. It can output its speed calculation to an unlimited number of other components. Among others, the speed is sent to the Input port of the SpeedDisplay component, which displays the speed on the dashboard of the car. Fig. 4(a) shows how the syntactic elements of the port blueprint can be filled in: every message can be described, including its name, parameters and direction. The synchronization level is shown in Fig. 4(b). Here, extended MSCs (Message Sequence Charts) are used to specify the interaction protocol. From the MSC it becomes clear that the SpeedUpdate port of the SpeedoMeter component first receives the
Start message and that it sends Update messages in a loop. The interaction stops when the port receives a Stop message. For each message interaction, three hook types can be distinguished (see Fig. 4(b)):

Send hooks (boxes with 'S'), representing the sending of messages.
Receive hooks (boxes with 'R'), representing the reception of messages.
End-of-activation hooks (boxes with 'X'), representing the termination of the processing of a received message.
(a) Specification of the Syntactic Interface
(b) Specification of the Synchronization Interface
Fig. 4. Specification of the interfaces of a port blueprint
The interface of a port blueprint is used to verify whether it can be connected to other port blueprints: connecting ports is only possible if their interfaces match. The compatibility of ports can be verified both at the syntactic level and at the synchronization level. Using the specifications of a component and its ports, the CCOM tool generates the necessary skeleton code to be filled in by the developer (see Sec. 3.3). The CCOM tool automatically keeps a component blueprint specification and its implementation synchronized.

3.2 Developing Compositions

A composition can be built by retrieving component blueprints from the repository and loading them into the composition.
Next, instantiations of these component blueprints can be created and put into instance models. Connecting component instances is done by (1) instantiating the port instances that will communicate with each other and (2) creating a connector and attaching it to the created port instances. The scenario model, used in the following step, enables the software developer to impose non-functional constraints on parts of a composition by attaching contracts. A CCOM contract can be attached to one or more participants (component instances, port instances and/or connectors). The actual number and type of participants in a contract depend on the particular type of contract: a contract constraining the memory usage of a component is attached to a component instance, while contracts imposing timing constraints on the interaction between components are attached to the ports involved in the interaction.

To make this more concrete, we elaborate here on the timing constraints. In CCOM, timing constraints are specified by means of templates with properties that have to be filled in by the application designer. Using templates makes it easier for a developer to specify constraints, without the need to learn a particular formal specification notation. In general, a CCOM timing contract specifies and imposes the timing constraints to which communicating components have to adhere. A timing contract is concerned with the communication between components. As such, it is straightforward to attach the timing contract to their ports, since these are the communication gateways between components. Furthermore, the communication between components is fully specified by the MSC of the involved ports, so this MSC plays a key role in the specification of a timing contract. A hook is a point on an MSC that represents a particular communication action: we distinguish a send hook, a receive hook and an end-of-activation (eoa) hook. A timing contract can be specified by means of these hooks. For example, a deadline contract could specify that the maximum duration between the send hook and the eoa hook may not exceed 500 milliseconds. A deadline contract thus has three parameters: a hook that starts the contract, a hook that ends the contract, and the maximum allowed time difference between the occurrences of these hooks. The second type of timing contract that CCOM supports is the periodicity timing contract. Fig. 5 shows how such a contract can be specified in our tool. The window in the tool shows the MSC, with the hooks in each message, and the message names. The four necessary parameters of the periodicity contract can be filled in at the bottom: in this example the contract starts at the sending of the Start message, the periodic event is the reception of the Update message, the contract ends with the sending of the Stop message, and the period is 500 ms.

3.3 From Design to Execution: Code Generation

To facilitate the development of components in CoCoNES, the skeleton code of a component and its ports is generated by the CCOM tool. CCOM also ensures synchronization between a component blueprint specification and its implementation. For the further implementation of the component, a custom language is used. This language is a superset of Java that supports relevant component-based constructs. The code is automatically preprocessed, compiled and packaged into a binary that is ready for deployment in the runtime environment. As such, the design is directly used as input for the implementation.
Fig. 5. The specification of a periodicity timing contract in the CCOM tool
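To make the deadline and periodicity templates described above more tangible, the following Java sketch models their parameters and a simple offline deadline check. It is purely illustrative: the type names (Hook, HookType, DeadlineContract, PeriodicityContract, DeadlineCheck) and the chosen representation are our own assumptions and are not part of the CCOM tool.

    // Hedged sketch: a possible data model for CCOM timing-contract templates.
    // All names are assumptions for illustration, not taken from the CCOM sources.
    enum HookType { SEND, RECEIVE, END_OF_ACTIVATION }

    // A hook identifies one communication action of a named message on an MSC.
    record Hook(String messageName, HookType type) { }

    // Deadline contract: the three parameters described in the text.
    record DeadlineContract(Hook start, Hook end, long maxMillis) { }

    // Periodicity contract: the four parameters, matching the example of Fig. 5.
    record PeriodicityContract(Hook start, Hook periodicEvent, Hook end, long periodMillis) { }

    class DeadlineCheck {
        // Given the observed times (in ms) of the start and end hooks,
        // verify that the deadline contract is respected.
        static boolean satisfies(DeadlineContract c, long startTime, long endTime) {
            return endTime - startTime <= c.maxMillis();
        }
    }

    class ContractExamples {
        // The periodicity contract of Fig. 5: it starts at the sending of Start,
        // the periodic event is the reception of Update, it ends at the sending
        // of Stop, and the period is 500 ms.
        static final PeriodicityContract SPEED_UPDATE_PERIOD =
            new PeriodicityContract(
                new Hook("Start", HookType.SEND),
                new Hook("Update", HookType.RECEIVE),
                new Hook("Stop", HookType.SEND),
                500);
    }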
We briefly illustrate the language with a small example consisting of two components that are used in our cruise control example: a SpeedoMeter and a SpeedDisplay. The first component measures the speed of a vehicle using an existing method (measureSpeed()) and broadcasts this information on its Out port. The second component prints out all values it receives through its Input port. The implementation of these two components is shown in Fig. 6. The implementation of a component starts with the component keyword. It consists of zero or more attributes (e.g. $activated in Fig. 6) and methods (not shown for these trivial components) and the description of its ports. The declaration of a multicast port is straightforward since it cannot accept messages. It suffices to specify its existence using the multicastport keyword. Two parameters are required: the name of the port and the maximum number of simultaneous connections that are allowed. In the example, the multicastport Out specifies an UNLIMITED number of simultaneous connections. As such, connections will never be refused at runtime. A multiport has a similar declaration, but since it can accept messages, these messages must be declared as well using the message keyword.
component SpeedoMeter {
    protected boolean $activated = false;
    multicastport Out UNLIMITED;
    multiport Control 1 {
        message Start { $activated = true; }
        message Stop { $activated = false; }
        message Update {
            if ($activated) {
                message x = Speed;
                x::value = measureSpeed();
                Out..x;
            }
        }
    }
}
component SpeedDisplay {
    multiport Input 1 {
        message Update {
            System.out.println(
                "The speed of " +
                "the vehicle is: " +
                $$inMessage::value);
        }
    }
}
Fig. 6. Two simple components in CoCoNES notation
The definition of a message includes the code to be executed when the message is received on the port. New messages can be created using the statement:

    message varName = messageName;

After its creation, this message can be sent out through any connected port. Message sending is asynchronous and, as such, the sending of a message always succeeds. If the component on the other side of the connector does not accept messageName, the system will return a CannotDeliverMessage message. Finally, two additional operators were added. Fields of a message are accessed using the :: operator:

    x::value = measureSpeed();

The .. operator is used on a port to send out a given message:

    Out..x;

Inside the implementation body for a message, the implicit parameter $$inMessage refers to the received message. In the Update message of the SpeedDisplay component, for instance (see Fig. 6), a parameter is retrieved from the incoming message and displayed on screen.
4 DRACO Runtime System
In addition to the CCOM tool, the tool chain supporting the CoCoNES methodology also contains a runtime environment capable of executing CoCoNES compositions: the DRACO component system. DRACO is a middleware system that provides the underlying infrastructure of an execution environment for component compositions. The runtime system is highly modularized. As such, it can be configured and targeted to specific applications, while guaranteeing a minimal memory footprint. The DRACO system is implemented in Java and targeted towards more powerful embedded systems such as an iPAQ or a robot used in manufacturing. Very small embedded systems or systems with
hard real-time deadlines, such as those often found in the automotive world, are not the focus of the DRACO middleware platform. The CoCoNES component design methodology can, however, be used on such systems as well. The architecture of DRACO is depicted in Fig. 7. It consists of a core system which provides the minimal functionality to execute CoCoNES applications. The most important tasks of the core system are as follows:
1. Management of component instances, connectors and contracts.
2. Support for introspection and naming.
3. Abstraction of the underlying hardware and OS.
4. Routing and scheduling of messages sent between components.
The core system consists of five units and its footprint is less than 65 kB, allowing it to be installed on embedded devices with stringent resource constraints. At startup, the core is dynamically assembled using the builder pattern [7]. Since the builder reads an XML file describing which implementation to use for each of the core units, modifying or replacing one core unit has no impact whatsoever on the rest of the system. The ability to easily customize its core makes DRACO an excellent platform for various assessments (e.g. replacing the scheduler allows us to investigate the influence of the scheduling algorithm on the execution of a component-based application). Furthermore, it allows for further customization depending on the target platform. Once instantiated, the core is considered to be fixed. In order to keep the complexity (and size) of DRACO sufficiently low, no attempt was made to allow for unanticipated modifications of the DRACO core at runtime. The five core modules are:

Component Manager: responsible for loading component blueprints, creating instances and removing them. It also keeps a repository of created component instances, with a basic directory mechanism mapping names onto component instances.
Connector Manager: a repository containing the connectors that exist between component instances in a composition. Each connector refers to the ports to which it is connected. Each port has a send message handler queue and a receive message handler queue associated with it.
Message Manager: responsible for delivering messages sent out by components. By means of the Connector Manager it retrieves the send message handler queue of the sending port and the receive message handler queue of the receiving port. The messages then traverse the send message handler queue of the sending port and arrive at the Scheduler.
Scheduler: accepts messages coming from a send message handler queue and schedules them for delivery to the appropriate message handler queue.
Module Manager: responsible for loading and unloading extension modules, which can be used to extend the functionality of the DRACO component system.

As shown on top of Fig. 7, each of the core modules exports a lightweight component interface. These appear as components in the runtime system and can be used by application components to query or configure the underlying middleware environment.
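To illustrate the builder-based assembly described above, the sketch below reads an XML description of the core and instantiates one implementation class per core unit via reflection. It is a minimal sketch under assumed names (DracoCoreBuilder, the <unit> element format); the actual DRACO builder and its configuration format are not shown in this chapter.

    // Hedged sketch of builder-pattern core assembly driven by an XML file.
    // All names here (DracoCoreBuilder, the unit element attributes) are
    // illustrative assumptions and not taken from the DRACO sources.
    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class DracoCoreBuilder {

        // Maps a core-unit name (e.g. "scheduler") to the instantiated object.
        public Map<String, Object> assembleCore(File configFile) throws Exception {
            Map<String, Object> core = new HashMap<>();
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(configFile);
            // Assumed format: <unit name="scheduler" class="..."/>
            NodeList units = doc.getElementsByTagName("unit");
            for (int i = 0; i < units.getLength(); i++) {
                Element unit = (Element) units.item(i);
                String name = unit.getAttribute("name");
                String className = unit.getAttribute("class");
                // Replacing one unit only requires editing the XML file:
                // the rest of the system is unaware of the concrete class.
                Object impl = Class.forName(className)
                        .getDeclaredConstructor().newInstance();
                core.put(name, impl);
            }
            return core;
        }
    }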
Fig. 7. Overview of the DRACO architecture
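Building on the module overview above, the following minimal sketch shows how a module manager might load and unload extension modules at runtime. ExtensionModule and ModuleManager are assumed names for illustration, not the DRACO API.

    // Hedged sketch of runtime loading/unloading of extension modules.
    // ExtensionModule and ModuleManager are illustrative names only.
    import java.util.HashMap;
    import java.util.Map;

    interface ExtensionModule {
        void start();   // called when the module is loaded into the core
        void stop();    // called before the module is unloaded
    }

    class ModuleManager {
        private final Map<String, ExtensionModule> loaded = new HashMap<>();

        void load(String name, ExtensionModule module) {
            module.start();
            loaded.put(name, module);
        }

        void unload(String name) {
            ExtensionModule module = loaded.remove(name);
            if (module != null) {
                module.stop();
            }
        }
    }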
Interaction between DRACO and the user is handled by an external shell, which is provided to DRACO at startup and resides in a different binary. By separating the user interaction from DRACO, it is possible to use different interaction shells depending on the situation. An interactive shell with scripting capabilities is available for use during development on a high-performance desktop machine, while a thin layer with minimal functionality can be used when resource consumption is an issue. Although the functionality of the core system is relatively limited, DRACO offers an infrastructure which allows the addition of functionality that may not always be required: extension modules. These modules can be loaded and unloaded at runtime by the module manager. The following extension modules have been worked out:

The distribution module (DM): this module adds distribution functionality to the core platform in a completely transparent way. It introduces the notion of proxy components, similar to the proxy pattern defined in [7]. These proxy components are lightweight components that represent remote components. As such, they offer the same ports and exactly the same semantic information. The DM is responsible for (1) setting up and tearing down connections between remote DRACO systems, and (2) managing proxy components and generating them based on real components. In DRACO, a connection is an abstract concept that can be implemented by any kind of physical wired or wireless connection. No stubs or other design-time entities need to be generated in advance in order to make components communicate in a distributed way. Instead, proxies are created dynamically on an as-needed basis. This constitutes a considerable advantage over traditional approaches that need additional constructs (e.g. the stubs and skeletons used by Java RMI).

The contract monitor: this module checks whether contracts are violated at runtime. Depending on the type of the contract, the monitoring differs. For the timing contracts, messages are intercepted and time-stamped by an event gathering unit. These time stamps are used by the event processing unit to verify the timing contract. The unit responsible for the verification of the contract can be moved to another node to minimize intrusion on the target platform. Currently, contract violations are reported offline: a developer can analyze the violations that occurred after an application's execution. In the future, a contract violation will be reported to the application, which must then take corrective measures.
The resource manager: this extension module is responsible for the negotiation of contracts with applications when these are started. The resource manager knows what contracts are currently active, and can accept new contracts as a function of the available resources.

The live update module (LUM): allows components to be replaced at runtime, even while they are part of a running application. The LUM achieves this by (1) putting the component to be replaced in an inactive state by temporarily holding back its messages, (2) instantiating the new version of the component, (3) possibly transferring the internal state from the old component to the new component, using routines provided by the developers of the components, (4) rewiring the connectors of the old version to the new component, (5) activating the new component by releasing the messages that were held back in step 1, and (6) removing the old component.

Since the exact tasks, and thus the requirements, of extension modules are unknown in advance, they can make use of reflection mechanisms and may subscribe to one of the many events triggered by the core system. In addition, they can interfere with the message flow and interact with the delivery of messages.

In DRACO, messages are sent asynchronously between components. The path followed by a message traveling from component A to component B consists of three major parts (see Fig. 8(a)): the sending message chain, the scheduler and the receiving message chain. Each extension module can add message handlers to these message chains to implement the features it needs. The sending message chain comprises the journey of a message from the moment it is sent through the port of the originating component until it is scheduled for execution by the scheduler. Its detailed implementation in DRACO is shown in Fig. 8(b). In the first step, the component contacts the port through which the message will be sent. Since ports are implemented as inner classes in DRACO, this is achieved with a local call (arrow 1). The port passes the message on to the message manager (arrow 2), which retrieves the attached connector from the connector manager (arrow 3). Each connector is associated with four message handlers (the first handler of the sending and the receiving chain for each direction: component A to B and vice versa). The message manager retrieves the two handlers associated with the current message direction. The receiving message handler is used for the delivery of the message after it has been scheduled for execution by the scheduler (see further). It is therefore simply passed on with the message to the sending message handler (arrow 4). This sending message handler is the first (and in the most basic scenario also the last) handler in a chain of message handlers. Each handler in the chain has the ability to intercept and modify the message, and then forwards it to the next handler in the chain (arrow(s) 5). The last handler is responsible for the delivery to the scheduler (arrow 6). After receiving both the message and its associated receiving message handler, the scheduler queues the message until it is ready for execution. The exact queuing mechanism depends on the scheduler that is used, but it is the responsibility of the scheduler to preserve the order of messages over a given connector. When the scheduler has selected a message for delivery, it allocates a thread for the execution of this message and passes the message on to its receiving message handler.
(a) Schematic overview of message journey
(b) Details of the sending message chain
Fig. 8. Message Delivery in DRACO
As shown in Fig. 8(a), the principle behind the receiving sequence is identical to that of the sending sequence: there is a chain of message handlers that process the message (e.g. the timing monitor can read out the time stamp added to the message by its peer in the sending message chain) and subsequently pass it on to the next handler in the chain. The last handler delivers the message to the port at the end of the connector. This port then dispatches the message to the actual method associated with the message. After message execution, control is returned to the scheduler.
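The chain-of-handlers mechanism just described can be summarized with the following hedged Java sketch. The interface and class names (Message, MessageHandler, TimestampHandler) are illustrative assumptions rather than the actual DRACO types; the TimestampHandler mirrors the kind of handler the contract monitor adds to the sending message chain.

    // Hedged sketch of a chain of message handlers, as used in the sending
    // and receiving message chains. Names are assumptions, not DRACO APIs.
    interface Message {
        void setProperty(String key, Object value);
        Object getProperty(String key);
    }

    interface MessageHandler {
        // Each handler may inspect or modify the message and then forwards it
        // to the next handler in the chain; the last handler delivers it
        // (to the scheduler on the sending side, to the port on the receiving side).
        void handle(Message msg);
    }

    // Example extension-module handler: time-stamps a message on the sending
    // side so that a peer handler on the receiving side can verify a timing contract.
    class TimestampHandler implements MessageHandler {
        private final MessageHandler next;

        TimestampHandler(MessageHandler next) {
            this.next = next;
        }

        public void handle(Message msg) {
            msg.setProperty("sendTimestampMillis", System.currentTimeMillis());
            next.handle(msg);   // forward to the next handler in the chain
        }
    }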
5 Extensive Case Study
Several embedded applications have been developed using the CoCoNES design methodology and the supporting tool chain. One of these is the car regulator introduced in Sec. 3. This application has been used as a case study to define and assess timing constraints. However, as our implementation was run neither on a car nor on embedded hardware, it was in our eyes still a toy application. We then developed a full-fledged embedded application, with the specific intention to validate our design methodology in a larger application, to verify the specific advantages of our component architecture and to test the reusability of our components and our designs. This led to a camera surveillance system of which several variations were designed and implemented. The surveillance system can be used for security-related purposes such as physical intrusion detection and registration of activity in home and office buildings. A PC/104 embedded computer (holding the operating system, a Java virtual machine, our component runtime and the test case code) was connected to a DFW-VL500 FireWire digital camera. It was linked over a TCP/IP network with a desktop PC serving as a storage and control station. The two bottom boxes in Fig. 9 (generated by the CCOM tool) give an overview of the component compositions in the surveillance case. Device boundaries are indicated by surrounding boxes.
Fig. 9. Overview of the Camera Surveillance Case
The central Camera component on the embedded device (PC 104) continuously grabs images from the camera at a predefined rate and multicasts them towards the MotionDetector and the Switch component. The motion detector analyzes the images and produces an alarm-start output message when motion is detected. The switch, receiving the message from the motion detector, forwards the video stream towards its output port until the alarm-stop message is received, meaning that motion has ceased. The suspicious images are sent to the StorageController component, which is located on the desktop PC. Proxy components are introduced for handling the remote communication transparently. The StorageController Proxy and the Switch Proxy allow for remote communication between StorageController and Switch. The Storage component, which encapsulates database access, eventually stores the images. The core application, as just described, was developed in CCOM and executes on top of DRACO, running a distribution module. In order to demonstrate reuse, several variations and extensions of the core application were developed. In one of the variations, the motion detector component and the switch component were replaced by a component that passes one image out of every 20 to the storage controller. This was a straightforward change, and all the other components could be reused without any modification. Subsequently, we wanted to take reuse one step further and ported our component system to an iPAQ. We then designed BlueGuard, which provides security guards with more information about the safety inside the building. This extension allows the guards to query the recorded images using a handheld device when they are in the neighborhood of an observation station. For this extension a BlueGuardClient component was created that is available on the handheld device.
(a) Viewing events
(b) Viewing an image
(c) Camera Control
Fig. 10. BlueGuard client
Furthermore, the distribution module was extended with Bluetooth [3] connection support to enable short-range wireless access to our observation stations. The BlueGuardClient component, instantiated on the iPAQ handheld devices (see the top box of Fig. 9), provides the user interface used by the security guards. This component can be connected to a StorageController component for querying purposes and to several Camera components for adjusting camera settings such as focus and zoom. As depicted in the example setup (Fig. 9), the BlueGuardClient may be connected to a StorageController component through the embedded PC 104 module, using proxy components for routing their messages. Fig. 10(a) shows the tab view of all recent events and the Bluetooth connection status. The image associated with each event can be requested and is shown in the view tab (Fig. 10(b)). The third tab (Fig. 10(c)) allows for changing the brightness, sharpness and zoom parameters of the camera. An in-depth description of the BlueGuard extension can be found in [11]. Both timing contracts and bandwidth contracts have proven their advantages in the surveillance case. In the distributed setup, the bandwidth contract attached to the connection between the Switch and StorageController components imposed limitations on the rate at which the camera's images could be sent. At a resolution of 320×240 in uncompressed RGB color format, we were limited to sending no more than 4 images per second over a 10 Mb Ethernet connection.
The message size is 230400 bytes (320×240 pixels with 3 bytes per pixel) and the interval time is 250 milliseconds; sending 230400 bytes every 250 ms corresponds to roughly 7.4 Mbit/s, which approaches the capacity of the 10 Mb link. The Bluetooth communication between the handheld device and the PC 104 module was fast enough for transmitting requested images to the handheld device because of the long interval times between consecutive images (only one image at a time is viewed). However, the timing contract describing the time between sending an image and receiving it only just met the timing requirements due to the rather low (0.4 Mbps) throughput. The periodicity contract attached to the image-sending hook of the camera was defined with a period of 250 milliseconds, fitting the requirements imposed by the network connection. Contract monitoring showed that the contracts were always adhered to by the components during the test period. Our case study also allowed us to validate the live update possibilities of our architecture. To do so, we replaced the MotionDetector component by a newer version, using a different algorithm, while the application was running. Both the old and the new version compare every image with the previous one. The state of this component consists of the current image. It was this state that was requested from the old component and fed to the new component. The application as a whole continued to work fine. No images were lost. The core camera surveillance system, its variations and its extensions have proven the soundness of our methodology and our approach, in which components and contracts play a central role; they have shown that our architecture is suitable for embedded platforms, and have validated our tool chain.
6 Relation to the State-of-the-Art
In [2], Beugnard et al. argue that component interfaces should be specified on four levels: basic, behavioral, synchronization and Quality of Service. This is similar to the specification of port interfaces in CoCoNES. The last level of the port specification was deliberately omitted, since Quality-of-Service properties are specified using contracts. CoCoNES contracts are more general and have a broader scope: they can be attached to components or connectors as well (e.g. memory contracts are likely to be attached to components) and are not tied exclusively to a port. The state of the art of component-based development is too large to be presented here; we merely contrast our work with alternative component-based frameworks specifically targeted at embedded systems that have been developed in recent years. In Koala [19], components are implemented in C and specify provides and requires interfaces that cannot be changed. Interfaces can be connected if the provided interface implements at least all methods of the required interface. The binding of these interfaces is made at the product level. All external information (including memory management) must be retrieved through requires interfaces. Other embedded component systems worth mentioning are PECOS [9, 20] (a model for field devices with an emphasis on formal execution models using Petri nets), Port-Based Objects [14, 15] (used in the Chimera RT operating system), VEST [13] (a toolset for constructing and analyzing component-based embedded systems) and DESS [5, 6] (a generic component architecture and notation for embedded software development).
CoCoNES has several aspects in common with the Ptolemy II project [1]. Ptolemy II studies the modeling, simulation and design of concurrent, real-time, embedded systems. It specifically focuses on the assembly of concurrent components and the use of models of computation that regulate the interaction between embedded components. One of its major problem areas is the use of heterogeneous mixtures of models of computation, including discrete-event systems, data flow, process networks, synchronous and reactive systems, and communicating sequential processes. A number of CoCoNES concepts are inspired by ROOM [12] and UML-RT, more specifically components, ports, and connectors. Although initially intended for designing and building telecommunication systems, the ROOM methodology can also be used for the design of other types of embedded systems. ROOM designs primarily contain actors, ports, bindings and state machines. The ROOM methodology has some new and interesting ideas: it introduces thread encapsulation, which hides the internal threading mechanisms; it offers an alternative way of connecting software components by means of bindings; the idea of port protocols is an advantage since it forces a designer to connect only compatible ports; and it offers executable models and the ability to generate code by putting code into the transitions of the state machines. ROOM, however, lacks a consistent way to annotate time in designs. In general, ROOM has no support for the annotation of non-functional constraints, such as memory and bandwidth constraints. The Fractal component model [4] in particular is very interesting because it is in several respects based on the same principles as the CoCoNES component model. Support for extension and adaptation is its prime concern, and it aims at a broad range of host devices, from embedded systems to application servers. It has, however, a less strictly defined component definition than CoCoNES and provides a language-independent interface definition for its components. This interface definition can be used to connect components written in different languages. In addition, there is also support for composite components, and (sub)components may be shared between components. In the Fractal model, connections between two or several components are called bindings. There are two types of bindings: primitive bindings and composite bindings. Primitive bindings are language-level bindings (synchronous or asynchronous), whereas composite bindings are compositions of primitive bindings and components. Using this definition of inter-component bindings, flexible distributed applications can be built. Each component has a component controller that can control all internal behavior of the component, such as affecting operation invocations, influencing the behavior of internal components, creating new components, etc. The Java-based Fractal framework that supports the Fractal component model consists of a core and several extensions, called increments. As in DRACO, these increments can add new functionality to the core. The core offers a basic API for performing actions such as creating components, adding bindings between components and managing the content of components. Some of the increments under development allow for component bootstrapping, component distribution (distributed bindings), mobility and protection (resource management and distribution).
7 Conclusion
This chapter has given an overview of CoCoNES, a methodology and architecture for developing software for embedded systems using a component-oriented approach, in which contracts are used to model the non-functional constraints. CoCoNES is backed by a tool chain that spans both the design-time and the runtime phase. Contracts can be specified at design-time and can be checked both at design-time and at runtime. The runtime environment offers, besides its core system, support for distribution and live updates. A full-fledged embedded case study has proven the soundness of our methodology, the applicability of our software architecture in the domain of embedded systems and the robustness of our runtime environment. In conclusion, we can say that our methodology is original in that it is supported by a tool chain in which both functional and non-functional constraints are checked, both at design-time and at runtime. These checks are generated by the tools, based on the design made by the developer. The key contribution of our work is therefore the integrated tool chain that spans design-time and runtime and covers functional and non-functional constraints. A more detailed description of this work can be found in [10, 17, 18].

CoCoNES is an ongoing project. We are currently pursuing different tracks to extend and improve our methodology and architecture. (1) We are extending the contracting framework to support general resource contracts in an extensible way. This framework will make it easy to add new types of contracts to the tool chain and monitoring mechanisms for new contracts to the DRACO runtime system. One track that is already being explored is the deployment of memory contracts and mechanisms to monitor a component's memory use. In addition, the new contract framework will allow resource contracts to be negotiable at deployment time and even renegotiable when resource availability in the system changes. This will enable flexible runtime deployment of programs on embedded systems without endangering the system's robustness and reliability. (2) In the future, our live updating tool will be able to automatically generate state transfer functions in order to enhance the support for updating components at runtime. (3) With the addition of the notion of context, our future runtime system will be able to discover, retrieve and process context information to improve its applications' capabilities to react to environmental conditions and events. Such context information could include information about the user of the system (such as his preferences), environmental information (such as temperature, location and time), platform information (such as CPU and memory information) and service information (available software services). (4) Furthermore, research is being conducted on runtime software adaptation and reconfiguration mechanisms that respect existing resource contracts. This way, programs can, guided by the runtime system, adjust their configuration and composition at runtime according to new contextual conditions.
Acknowledgment

Both projects introduced, SEESCOA and CoDAMoS, are funded by the Belgian Institute for the Promotion of Innovation by Science and Technology in Flanders.
References

1. Philip Baldwin, Sanjeev Kohli, Edward A. Lee, Xiaojun Liu, and Yang Zhao. Modeling of sensor nets in Ptolemy II. In Proceedings of Information Processing in Sensor Networks (IPSN), Berkeley, CA, USA, April 26-27, 2004.
2. Antoine Beugnard, Jean-Marc Jézéquel, Noël Plouzeau, and Damien Watkins. Making components contract aware. Computer, 32(7):38–45, July 1999.
3. Bluetooth. Bluetooth wireless protocol, 2003. http://www.bluetooth.com/ and http://www.bluetooth.org/.
4. Eric Bruneton, Thierry Coupaye, and Jean-Bernard Stefani. Recursive and dynamic software composition with sharing. In Proceedings of the Seventh International Workshop on Component-Oriented Programming, Malaga, Spain, 2002.
5. DESS Team. Definition of components and notation for components. Technical report, December 2001. http://www.dess-itea.org.
6. DESS Team. Timing, memory and other resource constraints. Technical report, 2001. http://www.dess-itea.org.
7. Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1994.
8. ITRS. International technology roadmap for semiconductors. Internet, 2004. http://public.itrs.net.
9. Oscar Nierstrasz, Gabriela Arévalo, Stéphane Ducasse, Roel Wuyts, Peter Müller, C. Zeidler, Thomas Genssler, and R. van den Born. A component model for field devices. In Proceedings of the IFIP/ACM Working Conference on Component Deployment, Berlin, 2002.
10. Peter Rigole, Yolande Berbers, and Tom Holvoet. Design and run-time bandwidth contracts for pervasive computing middleware. In C. Urarahy, A. Sztajnberg, and R. Cerqueira, editors, Proceedings of the First International Workshop on Middleware for Pervasive and Ad-Hoc Computing (MPAC), pages 5–12, Rio de Janeiro, Brazil, June 2003.
11. Peter Rigole, Yolande Berbers, and Tom Holvoet. Bluetooth enabled interaction in a distributed camera surveillance system. In Proceedings of the Thirty-Seventh Annual Hawaii International Conference on System Sciences, pages 1–10. IEEE Computer Society, 2004.
12. B. Selic, G. Gullekson, and P. Ward. Real-Time Object Oriented Modeling. Wiley, 1994. ISBN 0471599174.
13. John A. Stankovic. VEST – A toolset for constructing and analyzing component based embedded systems. Lecture Notes in Computer Science, 2211:390–402, 2001.
14. David B. Stewart and Pradeep K. Khosla. The Chimera methodology: Designing dynamically reconfigurable and reusable real-time software using port-based objects. International Journal of Software Engineering and Knowledge Engineering, 6(2):249–277, June 1996.
15. David B. Stewart, Richard A. Volpe, and Pradeep K. Khosla. Design of dynamically reconfigurable real-time software using port-based objects. Software Engineering, 23(12):759–776, 1997.
16. Clemens Szyperski. Component Software: Beyond Object-Oriented Programming. Addison-Wesley, November 2002.
17. David Urting, Stefan Van Baelen, Tom Holvoet, Peter Rigole, Yves Vandewoude, and Yolande Berbers. A tool for component based design of embedded software. In J. Noble and J. Potter, editors, Proceedings of the 40th International Conference on Technology of Object-Oriented Languages and Systems (TOOLS Pacific 2002), volume 10, pages 159–168, Sydney, Australia, February 2002. Australian Computer Society Inc.
18. David Urting, Tom Holvoet, and Yolande Berbers. Embedded software development: Components and contracts. In T. Gonzalez, editor, Proceedings of the IASTED Conference on Parallel and Distributed Computing and Systems, pages 685–690, 2001.
19. Rob van Ommering. Building Reliable Component-Based Software Systems, chapter The Koala Component Model. Artech House Publishers, July 2002.
20. Michael Winter, Thomas Genssler, Alexander Christoph, Oscar Nierstrasz, Stéphane Ducasse, Roel Wuyts, Gabriela Arévalo, Peter Müller, Chris Stich, and Bastiaan Schönhage. Components for embedded software – the PECOS approach. In Proceedings of the Second International Workshop on Composition Languages, Malaga, Spain, June 2002.
Adopting a Component-Based Software Architecture for an Industrial Control System – A Case Study

Frank Lüders¹, Ivica Crnkovic¹, and Per Runeson²

¹ Department of Computer Science and Engineering, Mälardalen University, Box 883, SE-721 23 Västerås, Sweden
{frank.luders,ivica.crnkovic}@mdh.se
² Department of Communication Systems, Lund University, Box 118, SE-221 00 Lund, Sweden
[email protected]
Abstract. This chapter presents a case study from a global company developing a new generation of programmable controllers to replace several existing products. The system needs to incorporate support for a large number of I/O systems, network types, and communication protocols. To leverage its global development resources and the competency of different development centers, the company decided to adopt a component-based software architecture that allows I/O and communication functions to be realized by independently developed components. The architecture incorporates a subset of a standard component model. The process of redesigning the software architecture is presented, along with the experiences made during and after the project. An analysis of these experiences shows that the component-based architecture effectively supports distributed development and that the effort required to implement certain functionality has been substantially reduced while, at the same time, the system's performance and other run-time quality attributes have been kept at a satisfactory level.
1 Introduction
Component-based software engineering (CBSE) denotes the disciplined practice of building software from pre-existing smaller products, generally called software components, in particular when this is done using standard or de-facto standard component models [7, 16]. The popularity of such models has increased greatly in the last decade, particularly in the development of desktop and server-side software, where the main expected benefits of CBSE are increased productivity and timeliness of software development projects. The last decade has also seen an unprecedented interest in the topic of software architecture [2, 15], in the research community as well as among software practitioners. CBSE has notable implications for a system's software architecture, and an architecture that supports CBSE, e.g. by mandating the use of a component model, is often called a component-based software architecture.

This chapter presents an industrial case study from the global company ABB, which is a major supplier of industrial automation systems, including programmable controllers. The company's new family of controllers is intended to replace several existing
products originally developed by different organizational units around the world, many of which were previously separate companies, targeting different, though partly overlapping, markets and industries. As a consequence, the new controller products must incorporate support for a large number of I/O systems, network types, and communication protocols. To leverage its global development resources and the competency of different development centers, ABB decided to adopt a component-based software architecture that allows I/O and communication functions to be realized by independently developed components.

This chapter is organized as follows. The remainder of this section describes the questions addressed by the case study and motivates the choice of method. Section 2 presents the context of the case study, including a description of the programmable controller and its I/O and communication functions as well as the organizational and business context. The process of componentizing the system's software architecture is presented in Section 3. Section 4 analyzes the results of the project and identifies some experiences of general interest. A brief overview of related work is provided in Section 5. Section 6 presents conclusions and some ideas for further work.

1.1 Questions Addressed by the Case Study

The general question addressed by the case study is what advantages and liabilities the use of a component-based software architecture entails for the development of an industrial control system. Due to the challenges of the industrial project studied, the potential benefit that a component-based architecture makes it easier to extend the functionality of the software has been singled out for investigation. More specifically, the project allows the two following situations to be compared:

– The system has a monolithic software architecture and all functionality is implemented at a single development center.
– The system has a component-based software architecture and pre-specified functional extensions can be made by different development centers.

By pre-specified functional extensions we mean extensions in the form of components adhering to interfaces already specified as part of the architecture. This fact is presumed to be significant, while the fact that the functionality in question happens to be related to I/O and communication is not. In addition to the question of whether the component-based architecture reduces the effort required to make such functional extensions, the study also addresses the questions of whether any such reduction is sufficient to justify the effort invested in redesigning the architecture and after how many extensions the saved effort surpasses the invested effort. Since the system in question is subject to hard real-time requirements, the potential effect of the architecture on the possibility of satisfying such requirements is also studied. Finally, the architecture's possible effect on performance is analyzed.

1.2 Case Study Method

The research methodology used is a flexible design study, conducted as a participant observation case study [14]. The overall goal of the study is to observe the process of
componentization, and to evaluate the gains of a component-based architecture. It is not possible to demarcate such a complex study object in a fixed design study. Neither is there an option to isolate and thereby study alternative options. Instead, we address the problem using a case study approach, where one study object is observed in detail and conclusions are drawn from this case. In order to enable the best possible access to information on the events in the case, the observations are performed by an active participant: the main researcher is also an active practitioner during the study. As a complement, interviews were conducted after the case study to collect data on the costs and gains of the component approach, thus providing data triangulation. Participatory research always includes a threat with respect to researcher bias. In order to increase the validity of the observations, a researcher was introduced late in the research process as a "critical friend". The long researcher involvement in this case study, on the other hand, reduces the threat with respect to respondent bias. Case studies are by definition weak with respect to generalization, in particular when only a single case is observed. However, to enable learning across organizational contexts, we present the context of the case study in some detail. Hence, the reader may find similarities and differences compared to their own environment, and thus judge the transferability of the research.
2 Context of the Case Study
Following a series of mergers and acquisitions, ABB became the supplier of several independently developed programmable controllers for the process and manufacturing industries. The company subsequently decided to continue development of only a single family of controllers for these and related industries, and to base all individual controller products on a common software platform. To be able to replace all the different existing products used in different regional areas and industry sectors, these controllers needed to incorporate support for a large number of communication protocols, network types, and I/O systems, including legacy systems from each of the previously existing controllers as well as current and emerging industry standards. A major challenge in the development of the new controller platform was to leverage the software development resources at different development centers around the world and their expertise in different areas. In particular, it was desirable to enable different development centers to implement different types of I/O and communication support. Additional challenges were to make the new platform sufficiently general, flexible, and extensible to replace existing controllers as well as to capture new markets. The solution chosen to meet these challenges was to base the new platform on one of the existing systems while adopting a component-based software architecture with well-defined interfaces for interaction between the main part of the software and I/O and communication components developed throughout the distributed organization.

As the starting point of the common controller software platform, one of the existing product lines was selected. This system is based on the IEC 61131-3 industry standard for programmable controllers [8]. The software has two main parts: 1) the ABB Control
Builder, which is a Windows application running on a standard PC, and 2) the system software of the ABB controller family, running on top of a real-time operating system (RTOS) on special-purpose hardware. The latter is also available as a Windows application, and is then called the ABB Soft Controller. A representative member of the ABB controller family is the AC 800M modular controller. This controller has two built-in serial communication ports as well as redundant Ethernet ports. In addition, the controller has two expansion buses. One of these is used to connect different types of input and output modules through which the controller can be attached to sensors and actuators. The other expansion bus is used to connect communication interfaces for different types of networks and protocols. The picture in Fig. 1 shows an AC 800M controller equipped with two communication interfaces (on the left) and one I/O module (on the right).
Fig. 1. An AC 800M programmable controller
The Control Builder is used to specify the hardware configuration of a control system, comprising one or more controllers, and to write the programs that will execute on the controllers. The configuration and the control programs together constitute a control project. When a control project is downloaded to the control system, the system software of the controllers is responsible for interpreting the configuration information and for scheduling and executing the control programs. Fig. 2 shows the Control Builder with a control project opened. The project consists of three structures, showing the libraries used by the control programs, the control programs themselves, and the hardware configuration, respectively. The latter structure is expanded to show a configuration of a single AC 800M controller, equipped with an analogue input module (AI810), a digital output module (DO810), and a communication interface (CI851) for the PROFIBUS-DP protocol [10]. To be attractive in all parts of the world and in a wide range of industry sectors, the common controller must incorporate support for a large number of I/O systems, communication interfaces, and communication protocols.
Fig. 2. The ABB Control Builder
During the normal operation of a controller, i.e. while the control programs are not being updated, there are two principal ways for it to communicate with its environment, denoted I/O (Input/Output) and variable communication, respectively. To use I/O, variables of the control programs are connected to channels of input and output modules using the program editor of the Control Builder. For instance, a Boolean variable may be connected to a channel of a digital output module. When the program executes, the value of the variable is transferred to the output channel at the end of every execution cycle. Variables connected to input channels are set at the beginning of every execution cycle. Real-valued variables may be attached to analogue I/O modules. Fig. 3 shows the program editor with a small program, declaring one input variable and one output variable. Notice that the I/O addresses specified for the two variables correspond to the positions of the two I/O modules (AI810 and DO810, respectively) in Fig. 2. Variable communication is a form of client/server communication and is not synchronized with the cyclic program execution in the way that I/O is. A server supports one of several possible protocols and has a set of named variables that may be read or written by clients that implement the same protocol. An ABB controller can be made a server by connecting program variables to so-called access variables in a special section of the Control Builder (see Fig. 2). Servers may also be other devices, such as field-bus devices [10]. A controller can act as a variable communication client by using special routines for connecting to a server and reading and writing variables via the connection. Such routines for a collection of protocols are available in the Communication Library, which is delivered with the Control Builder. The communication between a client and a server can take place over different physical media, which, in the case of the AC 800M, are accessed either via external communication interfaces or via the built-in Ethernet or serial ports.
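As a summary of the cyclic I/O semantics described above (inputs copied to program variables at the start of a cycle, outputs written at the end), the following sketch shows one scan cycle. It is written in Java purely for illustration; the class and method names are our own assumptions and do not correspond to the ABB controller software.

    // Hedged sketch of the cyclic execution model described in the text:
    // inputs are copied to program variables at the start of each cycle and
    // program variables are copied to output channels at the end of each cycle.
    // Method names are illustrative assumptions, not ABB APIs.
    public class ScanCycleSketch {

        public void runOneCycle() {
            readInputChannels();    // copy input channel values into connected variables
            executePrograms();      // run the control programs for this cycle
            writeOutputChannels();  // copy connected variable values to output channels
        }

        private void readInputChannels()   { /* e.g. copy AI810 channel -> input variable */ }
        private void executePrograms()     { /* evaluate the control programs */ }
        private void writeOutputChannels() { /* e.g. copy output variable -> DO810 channel */ }
    }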
Fig. 3. The program editor of the ABB Control Builder
Control projects are usually downloaded to the controllers via a TCP/IP/Ethernet-based control network, which may optionally be redundant. A control project may also be downloaded to a single controller via a serial link. In both cases, downloading is based on the Manufacturing Message Specification (MMS) protocol [5], which also supports run-time monitoring of hardware status and program execution. The system software of a controller, including the RTOS, can be updated from a PC via a serial link. Fig. 4 shows an example of a control system configuration.
Fig. 4. Example control system configuration
3 Componentization
3.1 Reverse Engineering of the Existing Software Architecture
Fig. 5. The original software architecture
The first step in the componentization of the architecture of the Control Builder and the controller system software was to get an overview of the existing architecture of the software, which was not explicitly described in any document. The software consists of a large number of source code modules, each of which is used to build the Control Builder or the controller system software or both, with an even larger number of interdependencies. An analysis of the software modules, with particular focus on I/O and communication functions, yielded the coarse-grained architecture depicted in Fig. 5. The boxes in the figure represent logical components of related functionality. Each box is implemented by a number of modules and is not readily visible in the source code. Many modules are also used as parts of other products, which are not discussed further here. This architecture is thus a product-line architecture [3], although the company has not yet adopted a systematic product-line approach. On the controller side, which is the focus of this chapter, the architecture has two distinct layers [15]. The lower layer (the box at the bottom of the figure) provides an interface to the upper layer (the rest of the boxes) that allows the source code of the upper layer to be used on different hardware platforms and operating systems. The complete set of interdependencies between modules within each layer was not captured by the analysis. To illustrate how some modules are used to build both the Control Builder and the controller system software, we consider the handling of hardware configurations. The hardware configuration is specified in the Controllers structure of the Control Builder. For each controller in the system, it is specified what additional hardware, such as I/O modules and communication interfaces, it is equipped with. Further configuration information can be supplied for each piece of hardware, leading to a hierarchic organization of information called the hardware configuration tree. The code that builds this tree in the Control Builder is also used in the controller system software to build the same tree there when the project is downloaded. If the configuration is modified in the Control Builder and downloaded again, only a description of what has changed in the tree is sent to the controller. The main problem with this software architecture is related to the work required to add support for new I/O modules, communication interfaces, and protocols. For instance, adding support for a new I/O system could require source code updates in all the components except the User Interface and the Communication Server, while a new communication interface and protocol could require all components except I/O Access to be updated.
As an example of what type of modifications may have been needed to the software, we consider the incorporation of a new type of I/O module. To be able to include a device (I/O module or communication device) in a configuration, a hardware definition file for that type of device must be present on the computer running the Control Builder. For an I/O module, this file defines the number and types of input and output channels. The Control Builder uses this information to allow the module and its channels to be configured using a generic configuration editor. This explains why the user interface did not need to be updated to support a new I/O module. The hardware definition file also defines the memory layout of the module, so that the transmission of data between program variables and I/O channels can be implemented in a generic way. For most I/O modules, however, the system is required to perform certain tasks, for instance when the configuration is compiled in the Control Builder or during start-up and shutdown in the controller. In the architecture described above, routines to handle such tasks had to be hard-coded for every type of I/O module supported. This required software developers with a thorough knowledge of the source code. The situation was similar when adding support for communication interfaces and protocols. The limited number of such developers therefore constituted a bottleneck in the effort to keep the system open to the many I/O and communication systems found in industry.

3.2 Component-Based Software Architecture

To make it much easier to add support for new types of I/O and communication, it was decided to split the logical components mentioned above into their generic and specific parts. The generic parts, commonly called the generic I/O and communication framework, contain code that is shared by all hardware and protocols implementing certain functionality. Routines that are specific to a particular type of hardware or protocol are implemented in separate components, called protocol handlers, installed on the PC running the Control Builder and on the controllers. This component-based architecture is illustrated in Fig. 6. Focusing again on the controller side, and comparing this architecture with the previous one, the protocol handlers can be seen as an additional half-layer between the framework and the bottom layer. To add support for a new I/O module, communication interface, or protocol in this architecture, it is only necessary to add protocol handlers for the PC and the controller along with a hardware definition file and possibly a device driver. The format of hardware definition files is extended to include the identities of the protocol handlers, as described below. Essential to the success of the approach is that the dependencies between the framework and the protocol handlers are fairly limited and, even more importantly, well specified. One common way of dealing with such dependencies is to specify the interfaces provided and required by each component [9]. The new control system uses the Component Object Model (COM) [4] to specify these interfaces, since COM provides suitable formats both for writing interface specifications, using the COM Interface Description Language (IDL), and for run-time interoperability between components. For each of the generic components, two interfaces are specified: one that is provided by the framework and one that may be provided by protocol handlers.
In addition, interfaces are defined to give protocol handlers access to device drivers and system functions.
Fig. 6. Component-based software architecture
The identities of protocol handlers are provided in the hardware definition files as the Globally Unique Identifiers (GUIDs) of the COM classes that implement them. COM allows several instances of the same protocol handler to be created. This is useful, for instance, when a controller is connected to two separate networks of the same type. Also, it is useful to have one object, implementing an interface provided by the framework, for each protocol handler that requires the interface. An additional reason that COM has been chosen is that commercial COM implementations are expected to be available on all operating systems that the software will be released on in the future. The Control Builder is only released on Windows, and it is expected that most future control products will be based on VxWorks, although some products are based on pSOS, for which a commercial COM implementation does not exist. In the first release of the component-based system the protocol handlers were implemented as C++ classes, which are linked statically with the framework. This works well because of the close correspondence between COM and C++, where every COM interface has an equivalent abstract C++ class. An important constraint on the design of the architecture is that hard real-time requirements, related to scheduling and execution of control programs, must not be affected by interaction with protocol handlers. Thus, all code in the framework responsible for instantiation and execution of protocol handlers always executes at a lower priority than code with hard deadlines.

3.3 Interaction Between Components

When a control system is configured to use a particular device or protocol, the Control Builder uses the information in the hardware definition file to load the protocol handler on the PC and execute the protocol-specific routines it implements. During download, the identity of the protocol handler on the controller is sent along with the other
configuration information. The controller system software then tries to load this protocol handler. If this fails, the download is aborted and an error message is displayed by the Control Builder. This is very similar to what happens if one tries to download a configuration that includes a device that is not physically present. If the protocol handler is available, an object is created and the required interface pointers obtained. Objects are then created in the framework and interface pointers to these passed to the protocol handler. After the connections between the framework and the protocol handler have been set up through the exchange of interface pointers, a method will usually be called on the protocol handler object that causes it to continue executing in a thread of its own. Since the interface pointers held by the protocol handler reference objects in the framework, which are not used by anyone else, all synchronization between concurrently active protocol handlers can be done inside the framework.
Fig. 7. Interfaces for communication servers
To make this more concrete, we now present a simplified description of the interaction between the framework and a protocol handler implementing the server side of a communication protocol on the controller. This relies mainly on the two interfaces IGenServer and IPhServer. The former is provided by the framework and the latter by protocol handlers implementing server-side functionality. Fig. 7 is a UML structure diagram showing the relationships between interfaces and classes involved in the interaction between the framework and such a protocol handler. The class CMyProtocol represents the protocol handler. The interface IGenDriver gives the protocol handler access to the device driver for a communication interface. A simplified definition of the IPhServer interface is shown below. The first two operations are used to pass interface pointers to objects implemented by the framework to the protocol handler. The other two operations are used to start and stop the execution of the protocol handler in a separate thread.
interface IPhServer : IUnknown
{
    HRESULT SetServerCallback([in] IGenServer *pGenSrv);
    HRESULT SetServerDriver([in] IGenDriver *pGenDrv);
    HRESULT ExecuteServer();
    HRESULT StopServer();
};
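Since, as noted above, every COM interface has an equivalent abstract C++ class, a statically linked protocol handler can implement IPhServer as an ordinary C++ class. The following sketch shows roughly what this mapping looks like; it is illustrative only: COM calling-convention macros and the inherited IUnknown methods (QueryInterface, AddRef, Release) are omitted, and the member variables of the CMyProtocol skeleton are invented for the example.

// Abstract C++ class corresponding to the IPhServer COM interface above.
class IPhServer : public IUnknown {
public:
    virtual HRESULT SetServerCallback(IGenServer *pGenSrv) = 0;
    virtual HRESULT SetServerDriver(IGenDriver *pGenDrv) = 0;
    virtual HRESULT ExecuteServer() = 0;
    virtual HRESULT StopServer() = 0;
};

// Skeleton of a statically linked protocol handler (cf. CMyProtocol in Fig. 7).
class CMyProtocol : public IPhServer {
public:
    HRESULT SetServerCallback(IGenServer *pGenSrv) { m_pGenSrv = pGenSrv; return S_OK; }
    HRESULT SetServerDriver(IGenDriver *pGenDrv)   { m_pGenDrv = pGenDrv; return S_OK; }
    HRESULT ExecuteServer();   // starts the server thread (Fig. 8)
    HRESULT StopServer();      // tears the connections down (Fig. 10)
private:
    IGenServer *m_pGenSrv = nullptr;   // framework callback object
    IGenDriver *m_pGenDrv = nullptr;   // access to the device driver
};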
The UML sequence diagram in Fig. 8 shows an example of what might happen when a configuration is downloaded to a controller, specifying that the controller should provide server-side functionality. The system software first invokes the COM operation CoCreateInstance to create a protocol handler object and obtain an IPhServer interface pointer. Next, an instance of CGenServer is created and a pointer to it passed to the protocol handler using SetServerCallback. Similarly, a pointer to a CGenDriver object is passed using SetServerDriver. Finally, ExecuteServer is invoked, causing the protocol handler to start running in a new thread.
Fig. 8. Call sequence to set up connections
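In code, the set-up shown in Fig. 8 amounts to something like the following sketch. It is illustrative only: error handling and reference counting are omitted, CLSID_MyProtocol and IID_IPhServer are hypothetical identifiers (the class identity actually comes from the hardware definition file), and in the first release the handler is linked statically rather than created through a full COM runtime.

IPhServer *pPh = nullptr;
HRESULT hr = CoCreateInstance(CLSID_MyProtocol,      // GUID taken from the hardware definition file
                              nullptr, CLSCTX_INPROC_SERVER,
                              IID_IPhServer, reinterpret_cast<void **>(&pPh));
if (SUCCEEDED(hr)) {
    CGenServer *pSrv = new CGenServer();   // framework object behind IGenServer
    CGenDriver *pDrv = new CGenDriver();   // framework object wrapping the device driver
    pPh->SetServerCallback(pSrv);          // pass the IGenServer pointer to the handler
    pPh->SetServerDriver(pDrv);            // pass the IGenDriver pointer to the handler
    pPh->ExecuteServer();                  // the handler continues in a thread of its own
}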
To see how the execution of the protocol handler proceeds, we first look at a simplified definition of IGenServer. The first two operations are used to inform the framework about incoming requests from clients to establish a connection and to take down an existing connection. The last two operations are used to handle requests to read and write named variables, respectively. The index parameter is used with variables that hold
structured data, such as records or arrays. All the methods have an output parameter that is used to return a status word.

interface IGenServer : IUnknown
{
    HRESULT Connect([out] short *stat);
    HRESULT Disconnect([out] short *stat);
    HRESULT ReadVariable([in] BSTR *name, [in] short index,
                         [out] tVal *pVal, [out] short *status);
    HRESULT WriteVariable([in] BSTR *name, [in] short index,
                          [in] tVal *pVal, [out] short *status);
};

Running in a thread of its own, the protocol handler uses the IGenDriver interface pointer to poll the driver for incoming requests from clients. When a request is encountered, the appropriate operation is invoked via the IGenServer interface pointer, and the result of the operation, specified by the status parameter, is reported back to the driver and ultimately to the communication client via the network. As an example, Fig. 9 shows how a read request is handled by calling ReadVariable. The definition of the IGenDriver interface is not included in this discussion for simplicity, so the names of the methods invoked on this interface are left unspecified in the diagram. Write and connection-oriented requests are handled in a very similar manner to read requests. The last scenario to be considered here is the one where configuration information is downloaded, specifying that a protocol handler that was used in the previous configuration should no longer be used. In this case, the connections between the objects in the framework and the protocol handler must be taken down and the resources allocated to them released. Fig. 10 shows how this is accomplished by the framework first invoking StopServer and then Release on the IPhServer interface pointer.
Fig. 9. Call sequence to handle variable read
Fig. 10. Call sequence to take down connections
This causes the protocol handler to decrement its reference count and to invoke Release on the interface pointers that have previously been passed to it. This, in turn, causes the objects behind these interface pointers in the framework to release themselves, since their reference count reaches zero. Assuming that its reference count is also zero, the protocol handler object also releases itself. If the same communication interface, and thus the protocol handler object, had also been used for different purposes, the reference count would have remained greater than zero and the object would not have been released.
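A hedged sketch of the protocol handler side of this interaction is shown below. Only the IGenServer and IPhServer calls correspond to the interfaces defined in this chapter; the Request record and the IGenDriver methods (PollRequest, SendReply) are purely hypothetical, since the chapter deliberately leaves the driver interface unspecified.

struct Request {                                     // hypothetical request record from the driver
    enum Kind { Read, Write, Connect, Disconnect } kind;
    BSTR  name;                                      // variable name (Read/Write)
    short index;
    tVal  value;                                     // value to write (Write)
};

// Server thread: poll the driver, forward requests to the framework, report the status back.
void CMyProtocol::ServerLoop() {
    while (m_running) {                              // m_running: flag cleared by StopServer()
        Request req;
        if (m_pGenDrv->PollRequest(&req) != S_OK)    // hypothetical IGenDriver call
            continue;
        short status = 0;
        tVal  result;
        switch (req.kind) {
        case Request::Read:       m_pGenSrv->ReadVariable(&req.name, req.index, &result, &status); break;
        case Request::Write:      m_pGenSrv->WriteVariable(&req.name, req.index, &req.value, &status); break;
        case Request::Connect:    m_pGenSrv->Connect(&status); break;
        case Request::Disconnect: m_pGenSrv->Disconnect(&status); break;
        }
        m_pGenDrv->SendReply(req, result, status);   // hypothetical IGenDriver call
    }
}

// Tear-down as in Fig. 10: stop the thread and release the framework objects.
HRESULT CMyProtocol::StopServer() {
    m_running = false;
    if (m_pGenSrv) { m_pGenSrv->Release(); m_pGenSrv = nullptr; }
    if (m_pGenDrv) { m_pGenDrv->Release(); m_pGenDrv = nullptr; }
    return S_OK;
}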
4 Experiences
The definitive measure of the success of the project described in this chapter is how large the effort required to redesign the software architecture has been compared to the effort saved by the new way of adding I/O and communication support. It is important to remember, however, that in addition to this cost balance, the business benefits gained by shortening the time to market must be taken into account. Also important, although harder to assess, are the long-term advantages of the increased flexibility that the component-based software architecture is hoped to provide. At the time of writing, the parts of the generic I/O and communication framework needed to support communication protocols have been completed, requiring an estimated effort of 15–20 person-years. A number of protocols have been implemented using the new architecture. The total effort required to implement a protocol (including the protocol handler, a device driver, firmware for the communication interface, and possibly IEC 61131-3 function blocks) is estimated to be 3–6 person-years. The reduction in effort compared to that required with the previous architecture is estimated to vary from one third to one half, i.e. 1–3 person-years per protocol.
Assuming an average saving of 2 person-years per protocol handler, the savings surpass the investment after the implementation of 8–10 protocols. Table 1 summarizes these effort estimations, which were made by technical management at ABB and are primarily based on reported working hours. System tests have shown that the adoption of the chosen subset of COM has resulted in acceptable system performance. The ability to meet hard real-time requirements has not been affected by the component-based architecture, since all such requirements are handled by threads that cannot be interrupted by the protocol handlers.

Table 1. Summary of effort estimation for the two software architectures
                                        Investment in framework   Cost per protocol   Saving per protocol   Return on investment
Original software architecture          0                         4–9 person-years    0                     –
Component-based software architecture   15–20 person-years        3–6 person-years    1–3 person-years      8–10 protocols
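The return-on-investment column follows from simple arithmetic on the other two estimates: with an investment of 15–20 person-years and an average saving of roughly 2 person-years per protocol, the break-even point lies at about 15 / 2 ≈ 8 to 20 / 2 = 10 protocols, which is the 8–10 protocols quoted above.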
An interesting experience from the project is that the componentization is believed to have resulted in a more modularized and better documented system, two characteristics generally believed to enhance quality. This experience concurs with the view of Szyperski [16] that adopting a component-based approach may be used to achieve modularization, and may therefore be effective even in the absence of externally developed components. The reduction in the effort required to implement communication protocols is partly due to the fact that the framework now provides some functionality that was previously provided by individual protocol implementations. This is also believed to have increased quality, since the risk of each protocol implementation introducing new errors in this functionality has been removed. Another interesting experience is that techniques that were originally developed to deal with dynamic hardware configurations have been successfully extended to cover dynamic configuration of software components. In the ABB control system, hardware definition files are used to specify what hardware components a controller may be equipped with and how the system software should interact with different types of components. In the redesigned system, the format of these files has been extended to specify which software components may be used in the system. The true power of this commonality is that existing mechanisms for handling hardware configurations, such as manipulating configuration trees in the Control Builder, downloading configuration information to a control system, and dealing with invalid configurations, can be reused largely as is. The idea that component-based software systems can benefit by learning from hardware design is also aired in [16]. Another lesson of general value is that it seems that a component technology, such as COM, can very well be used on embedded platforms and even platforms where runtime support for the technology is not available. Firstly, we have seen that the space and computation overhead that follows from using COM is not larger than what can be afforded in many embedded systems. In fact, used with some care, COM does not
introduce much more overhead than do virtual methods in C++. Secondly, in systems where no such overhead can be allowed, or systems that run on platforms without support for COM, IDL can still be used to define interfaces between components, thus making a future transition to COM straightforward. This takes advantage of the fact that the Microsoft IDL compiler generates C and C++ code corresponding to the interfaces defined in an IDL file as well as COM type libraries. Thus, the same interface definitions can be used with systems of separately linked COM components and statically linked systems where each component is realized as a C++ class or C module. Among the problems encountered with the componentization, the most noticeable was the difficulty of splitting functionality between independent components, i.e. between the framework and the protocol handlers, and thus determining the interfaces between these components. In all probability, this was in large part due to the lack of any prior experience with similar efforts within the development organization. Initially, the task of specifying interfaces was given to the development center responsible for developing the framework. This changed during the course of the project, however, and the interfaces ultimately used were in reality defined in an iterative way in cooperation between the organizational unit developing the framework and those developing protocol handlers. Other problems are of a non-technical nature. An example is the potential problem of what business processes to use if protocol handlers are to be deployed as stand-alone products. So far, protocol handlers have only been deployed as parts of complete controller products, comprising both hardware and software.
5 Related Work
A widely published case study with a focus on software architecture is that of the US Navy's A-7E avionics system [13]. Among other things, this study demonstrated the use of information hiding to enhance modifiability while preserving real-time performance. Although the architecture of the A-7E system is not component-based in the modern sense, an important step was taken in this direction by decomposing the software into loosely coupled modules with well-defined interfaces. A more recent study, describing the componentization of a system with the aim of making it easier to add new functionality, has been conducted in the telecommunications domain [1]. In this case study, the monolithic architecture of Ericsson's Billing Gateway Systems is redesigned into one based on distributed components, and a component-based prototype system is implemented. In contrast to our case, the system does not have hard real-time requirements, although performance is a major concern. The study shows that componentization of the architecture can improve the maintainability of the system while still satisfying the performance requirements. There is a substantial body of work on component-based software for control systems and other embedded real-time systems, which, unlike this chapter, focuses on the development of new component models to address the specific requirements that a system or application domain has with respect to performance, resource utilization, reliability, etc. One of the best-known examples is the Koala component model [12] for consumer electronics, which is internally developed and used by Philips. Two other examples with particular relation to the work presented in this chapter are the PECOS
component model [6], which was developed with the participation of ABB for use in industrial field-devices, and the DiPS+ component framework [11], which targets the development of flexible protocol stacks in embedded devices. The primary advantage of such models over more general-purpose models is their effective support for optimization with respect to the most important aspects for the particular application domain. A typical disadvantage is the lack of efficient and inexpensive tools on the market. For instance, building proprietary development tools in parallel with the actual product development may incur significant additional costs.
6 Conclusions and Future Work
The experiences described above show that the effort required to add support for communication protocols in the controller product has been considerably reduced since the adoption of the new architecture. Comparing the invested effort of 15–20 person-years with the saving of 1–3 person-years per protocol handler, it is concluded that the effort required to design the component-based software architecture is justified by the reduction in the effort required to make pre-specified functional extensions to the software, and that the savings surpass the investment after 8–10 such extensions. Based on current plans for protocol handlers to be implemented, it is expected that the savings will exceed the investment within 3 years from the start of the project. In addition to these effort savings and the perceived quality improvements, the component-based architecture has resulted in the removal of the bottleneck at the single development center and the possibility of developing the framework and several protocol handlers concurrently. This could potentially lead to business benefits such as reduced time to market. Concerning the overhead introduced by the component model, which is small in the current system but may be larger if and when more COM support is incorporated, we believe that the business climate in which industrial control systems are developed justifies a modest increase in hardware resource requirements in exchange for a noticeable reduction in development time. The experiences with the use of a component-based software architecture in ABB's control system could be further evaluated. For instance, as more protocol handlers are completed, the confidence in the estimated reduction of effort can be increased. Another opportunity is to study the effect on other system properties, such as performance or reliability. A challenge is that this would require that meaningful measures of such properties could be defined and that measures could be obtained from one or more versions of the system before the componentization. Since a number of protocol handlers have been implemented and even more are planned, there is probably a good opportunity to study the experiences of protocol implementers, which may shed additional light on the qualities of the adopted architecture and component model. One possibility would be to conduct a survey, which might include several development centers. Further opportunities to study the use of a software component model in a real-time system might be offered by a future version of the controller that adopts more of COM and possibly uses a commercial COM implementation. An issue that may be addressed in future development at ABB is the inclusion of a COM runtime system with support for dynamic linking between components.
Commercially available COM implementations will probably be used for systems based on Windows and VxWorks. Dynamic linking will simplify the process of developing and testing protocol handlers. A potentially substantial effect of dynamic linking is the possibility of adding and upgrading protocol handlers at runtime. This might allow costly production stops to be avoided while, for instance, a controller is updated with a new communication protocol. Another possible continuation of the work presented here would be to extend the component approach beyond I/O and communication. An architecture where general functionality can be easily integrated by adding independently developed components would be a great benefit to this type of system, which is intended for a large range of control applications.
References

1. Algestam, H., Offesson, M., Lundberg, L.: Using Components to Increase Maintainability in a Large Telecommunication System. In: Proceedings of the Ninth Asia-Pacific Software Engineering Conference (2002) 65–73.
2. Bass, L., Clements, P., Kazman, R.: Software Architecture in Practice. 2nd edition. Addison-Wesley, Reading, MA (2003).
3. Bosch, J.: Design & Use of Software Architectures – Adopting and Evolving a Product-Line Approach. Addison-Wesley, Reading, MA (2000).
4. Box, D.: Essential COM. Addison-Wesley, Reading, MA (1997).
5. ESPRIT Consortium CCE-CNMA (eds.): MMS: A Communication Language for Manufacturing. Springer-Verlag, Berlin Heidelberg New York (1995).
6. Genßler, T., Stich, C., Christoph, A., Winter, M., Nierstrasz, O., Ducasse, S., Wuyts, R., Arévalo, G., Schönhage, B., Müller, P.: Components for Embedded Software – The PECOS Approach. In: Proceedings of the 2002 International Conference on Compilers, Architectures and Synthesis for Embedded Systems (2002) 19–26.
7. Heineman, G., Councill, W. (eds.): Component-Based Software Engineering – Putting the Pieces Together. Addison-Wesley, Reading, MA (2001).
8. International Electrotechnical Commission: Programmable Controllers – Part 3: Programming Languages. 2nd edition. IEC Std. 61131-3 (2003).
9. Lüders, F., Lau, K., Ho, S.: Specification of Software Components. In: Crnkovic, I., Larsson, M. (eds.): Building Reliable Component-Based Software Systems. Artech House Books, Boston London (2000) 23–38.
10. Mahalik, N.: Fieldbus Technology. Springer-Verlag, Berlin Heidelberg New York (2003).
11. Michiels, S.: Component Framework Technology for Adaptable and Manageable Protocol Stacks. PhD thesis, K.U.Leuven, Leuven, Belgium (2004).
12. van Ommering, R., van der Linden, F., Kramer, J., Magee, J.: The Koala Component Model for Consumer Electronics Software. Computer, Vol. 33, Issue 3 (2000) 78–85.
13. Parnas, D., Clements, P., Weiss, D.: The Modular Structure of Complex Systems. IEEE Transactions on Software Engineering, Vol. 11, Issue 3 (1985) 259–266.
14. Robson, C.: Real World Research. 2nd edition. Blackwell Publishers, Oxford (2002).
15. Shaw, M., Garlan, D.: Software Architecture – Perspectives on an Emerging Discipline. Prentice-Hall, Upper Saddle River, NJ (1996).
16. Szyperski, C.: Component Software – Beyond Object-Oriented Programming. 2nd edition. Addison-Wesley, Reading, MA (2002).
Specification and Evaluation of Safety Properties in a Component-Based Software Engineering Process

Lars Grunske 1, Bernhard Kaiser 2, and Ralf H. Reussner 3

1 School of ITEE, The University of Queensland, St Lucia, Brisbane 4072, Australia, [email protected]
2 Fraunhofer Institute for Experimental Software Engineering, Sauerwiesen 6, 67661 Kaiserslautern, Germany, [email protected]
3 Software Engineering Group, University of Oldenburg, OFFIS, Escherweg 2, 26121 Oldenburg, Germany, [email protected]
Abstract. Over the past years, component-based software engineering has become an established paradigm in the area of complex software intensive systems. However, many techniques for analyzing these systems for critical properties currently do not make use of the component orientation. In particular, safety analysis of component-based systems is an open field of research. In this chapter we investigate the problems arising and define a set of requirements that apply when adapting the analysis of safety properties to a component-based software engineering process. Based on these requirements some important component-oriented safety evaluation approaches are examined and compared.
1 Introduction
Over the past years, the paradigm of component-based software engineering has been established in the construction of complex software-intensive systems [1], mainly in the context of large business software projects. Models and procedures have been developed that help in designing component-based systems and in assessing many relevant quality properties. Design by components is also a promising approach in the domain of embedded systems, where cost reduction, time-to-market and quality demands impose special constraints. In the context of embedded systems, in particular safety-critical systems, some new issues arise that are still the subject of current research. Some of these issues are:

– How to specify the failure behavior of a component when its usage and environment are unknown?
– How to evaluate the safety properties of a system built with components?
– How to adapt accepted safety assessment techniques to the special context of embedded and component-based systems?
– How to construct safety cases for a system built from components?

In this chapter we will discuss these problems in detail and give an overview of research work covering this problem domain.
The remainder of the chapter is organized as follows: In Sect. 2 an introduction to the general safety concepts is given and the relevant safety terms are defined. Sect. 3 then provides an overview of state-of-the-art safety analysis techniques. In the main part of this chapter, Sect. 4, we investigate the problems that arise and propose some requirements for safety analysis techniques when applying them to the construction and evaluation of component-based safety-critical systems. Furthermore, we summarize some important state-of-the-art techniques for the component-based analysis of safety properties. In Sect. 5 we compare these techniques and show how each of them fulfills the stated requirements. This provides support for the selection of a suitable analysis technique. Finally, Sect. 6 contains concluding remarks and points out directions for future work.
2 Basic Safety Concepts
To introduce the matter of safety analysis, we first define the relevant terms and concepts used in this chapter.

Definition 1 (Component). A component is an identifiable entity with a well-defined and specified behavior. In computer science and engineering it designates a self-contained, i.e. separately deployable, piece of hardware or software.

Definition 2 (System). A system is a set of components which act together as a whole and that is delimited by a system boundary.

This chapter deals with purely technical systems (while safety analysis in general considers non-technical components such as user interaction as well). Due to recursive decomposition the subsystems (components) of a system can be viewed as systems on their own, so the terms component and system are often used interchangeably.

Definition 3 (Failure). A failure is any behavior of a component or system which deviates from the specified behavior, although the environment conditions do not violate their specification.

Based on this definition a failure is basically a deviation from the specified behavior. However, from the practical viewpoint it is useful to introduce a failure classification of finer granularity by distinguishing different ways in which the provided behavior can deviate from the expectation. For dependable systems there is an accepted categorization which groups the failures into the following failure types or failure modes [2, 3]:

– tl: timing failure of a service (expected event or service is delivered after the defined deadline has expired – reaction too late)
– te: timing failure of a service (event or service is delivered before it was expected – reaction too early)
– v: incorrect result of a requested service (wrong data or service result – value)
– c: accomplishment of an unexpected service (unexpected event or service – commission)
– o: unavailable service (no event or service is delivered when it is expected – omission)
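Purely as an illustration of how this taxonomy might be carried into component annotations or analysis tools (the type and enumerator names below are not part of the cited classification), the five failure modes map naturally onto a simple enumeration:

enum class FailureMode {
    TimingTooLate,   // tl: delivered after the deadline expired
    TimingTooEarly,  // te: delivered before it was expected
    Value,           // v:  wrong data or service result
    Commission,      // c:  unexpected event or service delivered
    Omission         // o:  expected event or service not delivered
};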
Definition 4 (Fault). A fault is a state or constitution of a component that deviates from the specification and that can potentially lead to a failure.

Definition 5 (Accident). An accident is an undesired event that causes loss or impairment of human life or health, material, environment or other goods (similar to [4]).

To reduce the probability of an accident, the preconditions under the control of the system must be distinguished from uncontrollable ones, because the system designer can only take counter-measures for the controllable ones. These controllable preconditions are called hazards and can be defined as follows:

Definition 6 (Hazard). A hazard is a state of a system and its environment in which the occurrence of an accident only depends on factors which are not under control of the system.

An example of a hazard is a defective car air-bag, since the accident "driver is injured" occurs only if the car crashes. It depends on the environment whether a hazard leads to an accident, and thus the term hazard is always defined with respect to a given system environment and depends on the actual definition of the system boundary. To quantify safety it is important to consider how probable a hazard is and what the severity of the correlated accident or damage is. This is captured in the definition of risk.

Definition 7 (Risk). Risk is the severity combined with the probability of a hazard.

It is not practical to claim that risk be the product of hazard level and probability, since there are no universally accepted measures for the hazard level and the estimations of the probability are often very coarse. A practical way is to group both severity and probability into a few categories (negligible consequences . . . catastrophic, very rare . . . sure), as in [5, 6]. Both dimensions are independent of each other. A release of radioactivity in a nuclear power plant, for instance, can cost the lives of many people. Therefore, such a kind of accident is not acceptable, even with a very low likelihood.

Definition 8 (Acceptable Risk). Acceptable risk is the level of risk that has deliberately been defined to be supportable by the society, usually based on an agreed acceptance criterion.

The risk acceptance depends on social factors such as applicable laws or public opinion. According to standards (e.g. [5]) the acceptable risk can be identified based on various risk acceptance principles, depending on local legislation. Some known risk acceptance principles are ALARP (the residual risk shall be As Low As Reasonably Practicable), GAMAB (globalement au moins aussi bon, "globally at least as good": a French principle that assumes that there is already an acceptable system and that the risk of a new system shall be equivalent or lower) and MEM (Minimum Endogenous Mortality, where the individual risk due to a particular technical system must not exceed 1/20th of the minimum endogenous mortality).
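A minimal sketch of such a category-based risk assessment is shown below. The category names and, in particular, the concrete matrix entries are illustrative only; the actual assignment is defined by the applicable standard and acceptance principle. The catastrophic row reflects the nuclear-plant argument above (never acceptable, regardless of likelihood).

enum class Severity    { Negligible, Marginal, Critical, Catastrophic };
enum class Probability { Rare, Occasional, Probable, Frequent };
enum class RiskClass   { Acceptable, Tolerable, Unacceptable };

// Illustrative risk matrix: rows = severity, columns = probability.
RiskClass classifyRisk(Severity s, Probability p) {
    static const RiskClass matrix[4][4] = {
        // Rare                     Occasional                Probable                  Frequent
        { RiskClass::Acceptable,   RiskClass::Acceptable,   RiskClass::Tolerable,    RiskClass::Tolerable    },  // Negligible
        { RiskClass::Acceptable,   RiskClass::Tolerable,    RiskClass::Tolerable,    RiskClass::Unacceptable },  // Marginal
        { RiskClass::Tolerable,    RiskClass::Tolerable,    RiskClass::Unacceptable, RiskClass::Unacceptable },  // Critical
        { RiskClass::Unacceptable, RiskClass::Unacceptable, RiskClass::Unacceptable, RiskClass::Unacceptable }   // Catastrophic
    };
    return matrix[static_cast<int>(s)][static_cast<int>(p)];
}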
This definition of risk enables the definition of the terms SAFETY and SAFETY REQUIREMENTS.

Definition 9 (Safety). Safety is freedom from unacceptable risks [5].

In other words, safety is the situation where the risk is below the accepted risk level. Literally, safety is the situation where no hazard is present. Since this is not a practicable definition, the widely agreed definition refers to the risk level instead, incorporating the probability of a hazard.

Definition 10 (Safety Requirements). A safety requirement is a (more or less formal) description of a hazard combined with the tolerable probability of this hazard.

The tolerable hazard probability must be determined so that the combined risk for all hazards of the system is acceptable. This is the task of risk analysis. In summary, the aim of safety-critical systems construction is to build a system so that it fulfills all of its safety requirements. This comprises the following steps:

– Identification of all system-level hazards
– Determination of the acceptable hazard probabilities (safety requirements)
– Taking constructive measures in order to avoid or reduce anticipated hazards
– Proof that all of these safety requirements are fulfilled (safety cases)
If the proof fails on the first attempt, the last two steps have to be repeated iteratively.
3 Established Safety Analysis Techniques
There is an established set of safety analysis techniques for different purposes. Most of them were developed at a time when safety-critical tasks were exclusively performed by purely mechanical or electrical systems, and they do not especially consider the new aspects introduced by software control. The different techniques can be classified by different categories: they are used in different process phases, they use different formalisms, and they also differ in the kinds of qualitative and quantitative analyses that they provide. In the context of component-based system development, the techniques can also be divided into techniques that ignore the internal structure of the systems (as these are not concerned by the fact that a system is developed by components) and techniques that refer to a structural model of the systems (as these potentially need some adaptation when applied to component-based systems).

3.1 Safety Analysis Techniques on System Level

Techniques belong to the first category, for example, because they look at the system on a coarse and abstract level, focusing on black-box properties or the effects of system-level failures on the environment. In these cases it is irrelevant whether a system is monolithic or component-based and which of the components are implemented in software or hardware. These techniques are typically applied in early process phases. In the following we give an overview of some techniques belonging to this category. An example of an early safety analysis technique is Preliminary Hazard Analysis (PHA) [7], a technique that is applied during requirements analysis and early system
design. Its purpose is to identify potentially dangerous sources, to give an early assessment of the severity and probability of each hazard and to suggest constructive measures to avoid or reduce risks. PHA is an inductive technique that searches for the effects of identified hazards and the conditions in which they can arise. It is a manual and semi-formal technique that is applied on the system level. A similar technique is Functional Hazard Assessment (FHA) [6], which is increasingly used in the aerospace industries. It assesses system functions without reference to the (later) technical realization. Like PHA it is used to obtain a first safety study in early process phases. Based on the potential hazards that have been identified, all functions are categorized according to criticality levels. For each function and each of its failure modes the correlated effects, countermeasures and analysis or validation techniques are listed in a table. Although an FHA can be carried out on a subsystem level as well, it is a manual and rather coarse technique that does not require detailed information about the component structure of the system. Another example is Event Tree Analysis (ETA) [8], a graphical technique that uses a tree diagram to find and depict all potential effects of a given system-level hazard. The root of the tree is the hazard being analyzed. The branches are potential scenarios that lead to different consequences. Each branching point is associated with a condition which influences the further development of the scenario. For example, if the hazard is "fire in engine", the first branching point could be "automatic extinguishing system is working properly". The TRUE branch leads to a mitigation scenario (no accident), the FALSE branch to another branching point: "fire is immediately detected by operator". Again the two branches lead to a different continuation of the story, and finally each scenario either leads to an accident / damage or not. The technique can yield quantitative results if for each branching point the probabilities of taking the TRUE or the FALSE path are known. ETA is applied manually with computer support. Since all effects considered in an ETA happen in the system environment, the internal structure of the system is not of concern. As these techniques either regard the system as a black box or are applied in a stage where the actual implementation is yet unknown, they do not refer to components. Consequently, the aforementioned techniques can be applied to component-based systems without modification. In the following subsections we introduce some safety analysis techniques that refer to the internal structure of the system. Thus, we will have to discuss afterwards how far and with which modifications they can be applied in the context of component-based system design.

3.2 Failure Modes and Effects Analysis (FMEA)

Failure Modes and Effects Analysis (FMEA; extended variant: Failure Modes, Effects and Criticality Analysis, FMECA) is a table-based, semi-formal technique to identify possible safety or reliability issues with their effects in a systematic and roughly quantitative way. FMEA can be applied both to products (system or component level) and to processes (e.g. the software development process). FMEA has been standardized in IEC 60812 [9] and is today widely applied in industry, in particular in the automotive branch. The steps to be performed are:

1. Analysis of the system structure and identification of structure elements (hierarchically arranged in a structure tree diagram)
2. Identification of the functions of each identified structure element. The functional decomposition follows the structural decomposition, i.e. functions of sub-components contribute to the functions of their respective super-components.
3. Investigation and listing of all failure possibilities of each function, generating an FMEA table (see below) containing one row for each failure mode found.
4. Estimation of (a) the failure probability of each failure mode, (b) the criticality of the failure mode and (c) the probability that the failure is not discovered early enough to prevent its consequences. For each of these three dimensions a measure in the range 1 (most favorable case) to 10 (fatal case) is assigned. For this step the use of guiding words and predefined categories is recommended. Multiplication of the three numbers renders a Risk Priority Number (RPN) between 1 and 1000. The most critical failure modes are marked by the highest RPN.
5. Redesign or improvement of the system. The ameliorations begin with the failure modes that have the highest RPN. The main goal is to reduce the occurrence frequency of failures, followed by measures to improve the detection of failures (e.g. by alarm facilities). The RPN can be used to prioritize the amelioration efforts and to decide whether corrective actions are mandatory or not. After the changes, a re-assessment of the system quantifies the effect of the measures; the RPN must now be significantly lower than before.

The central document of an FMEA is the table, containing the columns (Structure Element, Failure Mode, Effect on System, Possible Hazards, Risk Priority Number, Detection Means, Applicable Controls / Countermeasures). This table helps to carry out the FMEA systematically and makes it a semi-formal method.
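The RPN bookkeeping in steps 4 and 5 is simple enough to sketch in a few lines of code; the struct layout and field names below are illustrative and are not prescribed by IEC 60812.

#include <algorithm>
#include <string>
#include <vector>

// One row of the FMEA table (simplified).
struct FmeaEntry {
    std::string structureElement;
    std::string failureMode;
    int occurrence;  // 1 (very unlikely)   .. 10 (almost certain)
    int severity;    // 1 (negligible)      .. 10 (fatal)
    int detection;   // 1 (surely detected) .. 10 (practically undetectable)
    int rpn() const { return occurrence * severity * detection; }  // 1 .. 1000
};

// Order the table so that redesign (step 5) starts with the highest RPN.
void prioritize(std::vector<FmeaEntry> &table) {
    std::sort(table.begin(), table.end(),
              [](const FmeaEntry &a, const FmeaEntry &b) { return a.rpn() > b.rpn(); });
}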
3.3 Hazard and Operability Studies (HAZOP)

Hazard and Operability Studies (HAZOP) [10, 11] is a criticality analysis technique that was developed in the 1970s in the context of the chemical process industry and has been transferred to other industry branches, including software engineering. It focuses on abnormal deviations of process parameters from their expected values. The key element is a set of keywords that qualify the kind of deviation (e.g. no, less, more, reverse, also, other, fluctuation, early, late) for each information or material flow. The use of predefined keywords assures the completeness and consistency of the whole study. The list can be adapted or extended as appropriate. HAZOP is a session technique, conducted by a team of domain experts as soon as a first material or data flow model for the system is available. The goal is to predict potential hazards that result from these deviations. The results are usually presented in a table, and in the end report the system design is either accepted or changes to improve safety are requested.

3.4 Fault Tree Analysis

Fault Tree Analysis (FTA) [12–14] is a graphical safety and reliability analysis technique which has been used in different industry branches for over 40 years. It is a deductive top-down analysis technique and a combinatorial technique. FTA allows tracing back influences to a given system failure, accident or hazard. Fault Trees (FTs) provide logical connectives (called gates) that allow decomposing the system-level hazard recursively. The AND gate indicates that all influence factors must apply together to cause the hazard, and the OR gate indicates that any of the influences causes the hazard alone. The logical structure is usually depicted as an upside-down tree with the hazard to be examined (called top-event) at its root and the lowest-level influence factors (called basic events) as the leaves. In the context of FTA the term "event" is applied in its probability theory meaning: an event is not necessarily some sudden phenomenon, but can be any proposition that is true with a certain probability. Based on an FT, several qualitative or quantitative analyses are possible. A qualitative analysis lists, for instance, all combinations of failures that must occur together to cause the top-level failure. Quantitative analysis calculates the probability of the top-event from the given probabilities of the basic events. Combinatorial formulas indicate for each type of gate how to calculate the output probability of a gate from the given input probabilities. The probabilities taken into account are the probabilities that an event occurs at least once over a given mission time, or they are probabilities of a failed state with respect to a given point in time. The evolution of a system over time or any dependencies between the present system behavior and the history cannot be modelled. An important assumption to obtain correct results is the stochastic independence of the basic events, which is hard to achieve in complex networked systems where often common cause failures occur [15]. Figure 1 shows a simple FT example.

[Fig. 1 depicts a fault tree with the top-event "Laptop Unavailable" (P = 0.314), an OR gate combining the basic event "CPU Defective" (P = 0.3) with an AND gate over the basic events "Battery Empty" (P = 0.2) and "Mains Power Down" (P = 0.1).]

Fig. 1. Fault Tree Example

The starting point of
the model construction is a hazard state or a failure event that has been identified before (e.g. by means of an FMEA). In the example, the unavailability of a laptop computer is analyzed. A creative process is carried out to investigate all factors that contribute to the occurrence of this top-event. The search is performed along the system structure and examines all system functions, environment conditions (e.g. ambient temperature) and auxiliaries (such as power supply). The decomposition is stopped at a granularity level where the individual influence factors cannot or need not be refined any further. These lowest-level factors are called basic events and are the leaves of the tree, depicted by circles. For a quantitative analysis a probability or probability distribution must be known or estimated for all basic events. This is usually achieved by probabilistic failure models, most of them empirical models.
In the example, we restrict ourselves to AND and OR gates, although many other gate types are provided by standards and FTA tools. In the figure, the graphical representation according to [13] has been chosen; in the United States different symbols are used. The AND gate is depicted by a & symbol and the OR gate by a ≥ 1 symbol (because at least one input must be true for the gate output to become true). The quantitative result shown in the figure has been obtained by application of the formulas associated with the AND and the OR gate. In the case of the AND gate, the input failure probabilities F_i are multiplied with each other: F_output = ∏_i F_input,i. In the case of the OR gate, De Morgan's theorem indicates that all input probabilities have to be inverted (subtracted from one), then multiplied, and finally the result has to be inverted again: F_output = 1 − ∏_i (1 − F_input,i).
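Applying these two formulas to the fault tree of Fig. 1 reproduces the probability annotated at the top-event:

F_AND = 0.2 · 0.1 = 0.02 (Battery Empty AND Mains Power Down)
F_top = 1 − (1 − 0.3) · (1 − 0.02) = 1 − 0.7 · 0.98 = 0.314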
3.5 State-Based Approaches

To explain the behavior of components with respect to safety it is often not sufficient to restrict oneself to a two-state (working and failed) abstraction, as FTA does. This accounts for a different class of analysis techniques, the state-based techniques. In the context of software engineering and systems safety, usually discrete-state approaches are applied. The practically relevant subclasses are:

– Statecharts, ROOMcharts, UML State Diagrams or similar notations
– Petri Nets
– Markov Chains

The use of Statecharts or similar notations can enhance safety during system construction by providing an intuitive notation with automatic consistency checking and by partially allowing for automatic code generation. Moreover, in safety-critical areas they are exploited for safety analysis as well: formal state-based models that describe the (intended) system behavior can serve as a base for model checking. Model checking is a qualitative technique that decides if a certain undesired state (hazard state) is definitely impossible to reach. If this cannot be proved, a counterexample is produced, which in turn helps the analyst to formulate countermeasures for how to avoid that hazard. Probabilistic variants of model checking algorithms are currently a major subject in formal methods research. Petri Nets exist in deterministic and in probabilistic variants. They are a good means to model concurrent or collaborating systems. They also allow for different qualitative or quantitative analyses that can be useful in safety validation. However, Petri Nets are mainly applied for performance evaluation. Markov Chains (MCs) are a probabilistic state-based modelling technique. A MC is a finite state machine where the transitions occur stochastically according to defined probability distributions. MC analysis plays an important role in reliability analysis and can be used to judge the reliability or availability of safety-relevant components within a system [16]. It is a discrete-state approach, and there exists a continuous-time variant and a discrete-time variant. A MC is mainly a state diagram that explicitly considers working states and failed states. In contrast to the combinatorial approaches, MCs allow more than two states for each component, so multiple failure modes or degrading failure (e.g. working – restricted service – completely failed) can be modelled. The states are usually depicted as circles and the state transitions as directed edges. The transition rates are annotated at the edges. A transition rate is the conditional probability that the state will change from Si to Sj in the next short time interval, under the condition that it is in state Si at time t. MC analysis is performed by formulating and solving differential equations (there are several transient and steady-state analysis or simulation techniques, and quite a number of tools are available). These equations can be imagined to describe the "probability flow" between different states.
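As a small, purely illustrative example of this "probability flow" view (the failure and repair rates below are arbitrary and not taken from any system discussed here), the differential equations of a two-state working/failed model can be integrated numerically:

#include <cstdio>

int main() {
    // Two-state continuous-time Markov chain: state 0 = working, state 1 = failed.
    const double lambda = 1e-3;   // failure rate per hour (illustrative)
    const double mu     = 1e-1;   // repair rate per hour (illustrative)
    double p0 = 1.0, p1 = 0.0;    // start in the working state

    // Euler integration of dp0/dt = -lambda*p0 + mu*p1 and dp1/dt = lambda*p0 - mu*p1.
    const double dt = 0.01;
    for (double t = 0.0; t < 1000.0; t += dt) {
        const double d0 = -lambda * p0 + mu * p1;
        const double d1 =  lambda * p0 - mu * p1;
        p0 += d0 * dt;
        p1 += d1 * dt;
    }
    // The availability converges to mu / (lambda + mu), here about 0.9901.
    std::printf("availability after 1000 h: %.4f\n", p0);
    return 0;
}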
4 Safety Analysis Techniques for Component-Based Systems
4.1 Problems

Since safety means freedom from unacceptable risks, the primary goal of safety analysis techniques is to identify all failures on the system level that cause hazardous situations and to demonstrate that their probabilities are sufficiently low. In the context of component-based systems this involves some additional problems that do not occur in the same way in monolithic systems. A principal question to be addressed is the composition of the property "safety". Is it permissible to say that a system is as safe as its components together (analogously to the way combinatorial reliability models judge system reliability from component reliability)? A small part of the system, in particular a piece of software, cannot do harm to the environment and thus cannot be unsafe. We find that safety as a property is not defined on an arbitrarily low granularity level and thus fine-grained components do not possess a quality attribute "safety" [4]. However, the influence of component behavior, in particular software behavior, on the safety of the whole system cannot be denied. In particular, component failures can compromise the safety of the system. In real-time systems this applies to timing failures as well as to value failures. More exactly, safety violations result from failures that propagate to the system boundary. Thus, component-based safety analysis means to conclude system safety from component behavioral models. On a higher abstraction level, the conclusion is drawn from various quality properties of the components (e.g. correctness, availability, reliability) to system safety. For instance, the availability of a protective device such as a car airbag or a fire detector directly influences the safety of the containing system. Consequently, the techniques applied at the component level do not necessarily need to be proper safety analysis techniques; analyzing the reliability or correctness of components can be a part of the overall safety argument, and the according techniques can be applied on the component level. The question is which techniques to choose and how to integrate the results into a system safety case. Another finding is that components are usually not isolated but require services from other components to provide their service correctly. Therefore, not only internal failures, but also failures that are propagated from a foreign component can cause a component to produce failures. The next issue concerns the development process: safety analysis techniques must integrate into the overall development process of the embedded system. In the case
of component-based design this means in particular that concurrent development at different places and design for later reuse in an unforeseen environment have to be considered. The different modelling techniques used within the same project should be compatible, which can be achieved e.g. by integrated tool-chains or model export and import facilities. Another big challenge is complexity. Component-based engineering is often applied to systems that are too complex to be understood in one piece. For example, a system composed of 10 components with only 2 states each has a state space of 1024 states; one can easily imagine the consequences for real-world systems with lots of states for each component. This problem is referred to as state explosion. However, not only state-based approaches but also other techniques suffer from complexity problems, e.g. from an excessive number of causal chains that hampers the readability of the model. This leads us to the related question of scope and granularity: it is impossible to consider all states and all behavioral aspects of a system. The challenge is to find the right abstraction level that makes a model expressive enough and yet analyzable. We found that, on the one hand, informal or even combinatorial models are sometimes not sufficient, but on the other hand, composing an integrated behavioral model and analyzing all possible sequences of actions, including failures, is far beyond feasibility. Techniques on a practical granularity level and with a limited scope (i.e. expressing just the facts of interest) are necessary.

4.2 Requirements

Being aware of these particular problems, we now map out some requirements that will help us to classify the safety analysis techniques that we will present in the subsequent sections.

Requirement 1 (Appropriate Component-Level Models). Each component must be annotated with an appropriate evaluation model.

Component-based safety analysis should decompose the system according to its architecture and then annotate each component with an appropriate model. The system-level analysis technique must finally integrate the results from all component annotations into a sound safety case for the whole system. Different components may be implemented by different technologies, and ideally it should be possible to choose for each component the most appropriate modelling technique. Due to the embedded nature of the systems, this includes techniques that are suitable for software and hardware aspects. The techniques should be able to describe the correct behavior and the failure behavior by appropriate means. Further, we saw that the property safety on the system level is influenced by different aspects of the component failure behavior, for instance by quality properties such as reliability, availability, timeliness and correctness on the component level. Accordingly, attaching models for different quality properties to different components in order to validate each of these properties by an appropriate technique is a suitable approach.

Requirement 2 (Encapsulation and Interfaces). The notation for the evaluation models should allow encapsulation and composition by interfaces similar to component-based design notations.
Many current component-based design notations (such as ROOM [17] or UML 2.0) offer mechanisms to define components as closed capsules and ports that serve as points of information exchange between components. These ports define the externally visible interface of a component. Their semantics varies with the different modelling techniques; examples are

– incoming and outgoing messages or signals
– incoming and outgoing continuous data flows
– provided and required services

In these design frameworks, it is usually possible to refine components recursively and to integrate components into new components. Every component can be exchanged for another with the same interface. The internal implementation, i.e. everything that does not belong to the interface, is hidden from the environment. An appropriate syntax and type system for interfaces allows one to check all component interconnections automatically for consistency. If even a formal semantics is associated with the interface notation (as is the case in interface automata, for instance), it is possible to derive the system semantics from the component semantics and the topology. A similar construction principle is also desirable for component safety evaluation models. The interfaces of the component safety evaluation models should correspond as closely as possible to the interfaces used in the functional models from the systems design phase.

Requirement 3 (Dependencies on External Components). Safety analysis techniques must be able to express the dependencies of failures regarding provided services on failures regarding required services and on internal failures of the component.

Due to the fact that most components are not self-contained and require external components to operate, the failure modes of the provided services depend on the failure modes of the services provided by other components, i.e. on the component's required services. In consequence, the failure probabilities of the provided services of a component are a function of (a) the probabilities of internal failure generation and (b) the probabilities of failures of the external environment the component interacts with.
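One very simple way to make this dependency concrete, assuming (purely for illustration) that internal failure generation and failure propagation from required services are stochastically independent for a given failure mode, is

P(failure of a provided service) = 1 − (1 − p_internal) · (1 − p_propagated),

where p_internal is the probability that the component itself generates the failure and p_propagated is the probability that a failure of a required service propagates to the provided service. The component-oriented techniques surveyed later in this chapter use richer models of this dependency, so the formula should be read as a sketch rather than as part of any particular approach.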
modelling techniques should be automated to a high degree, as manual copying or translation between different formats is error-prone and compromises the integrity of a safety analysis.

Requirement 5 (Practicable Granularity) The techniques applied should be, on the one hand, rich enough in detail to express how different kinds of component behavior can influence system safety, but on the other hand coarse enough to allow affordable analysis on the system level.

Regarding granularity and scope, a compromise between expressive power and analysis effort must be found. The approach of exhaustively modelling all possible behavior traces to explain how a system-level hazard can occur is infeasible. A plain parts-count approach (the system works correctly if all components work correctly), which is sometimes used in reliability analysis, is not sufficient to show how components interact and how, for instance, a safety subsystem mitigates failures of other components. Often the two-state abstraction (working versus failed) in combinatorial techniques is too coarse, but a full state-based approach is not manageable due to the state-explosion problem. A compromise could be to classify failures according to a few categories, which still allows one to formulate simple causal relations between failures of different classes at different interfaces. In the case of state-based approaches it is often not feasible to examine the whole state space as determined by the functional model of the system. Instead, it is preferable to take a coarser approach by only modelling the states that are involved in a safety-relevant scenario.

Requirement 6 (Tool Support) The safety analysis technique should be supported by appropriate and ergonomic tools.

Some of the safety analysis techniques (FMEA, for instance) were originally designed as paper-and-pencil methods. In the present context, manual application of the techniques is not practicable. First, systems that are designed from components are usually complex systems, so only computer-based tools allow humans to handle systems of high complexity without making errors. Important aspects are project browsing and history tracking facilities, model design assistants and consistency checking, an ergonomic user interface and a structured graphical representation. Second, one main purpose of components is to design them at different places (division of labor) or at different times (reuse). Traditionally, when one team at one place created a model, intuitive knowledge and implicitly agreed assumptions helped to overcome ambiguities. In the component-based process, when working at different places or when reusing a component that has been created years before, the lack of this direct communication will likely lead to misinterpretation. By enforcing a well-defined model syntax (and ideally also semantics) and by capturing all aspects of the model in a file or database, computer-based tools help to create reusable and exchangeable component analyses.

4.3 Running Example

To explain how safety evaluation of a system built from components works in practice, we present a steam boiler system as a running example. The left part of figure 2 shows a
schematic, similar to those process engineers use to describe the hardware of the plant. The process plant incorporates the pressure tank, a triple-redundant pressure sensor and a double-redundant safety valve. Further, the system contains a software controller that implements a two-out-of-three voter for the sensors and gives the command to open both valves if a pressure above the allowable level is detected. The voter pattern ensures that if at least two of the three sensors indicate the right value, the controller takes the correct decision. Furthermore, each of the valves is sufficient as a pressure relief, so if one fails, the system is still safe. In subsequent sections we will also discuss a variant of the example where it is possible to select either voter mode (three sensors) or single-sensor mode. The right part of the figure shows a structure diagram, as an
[Figure omitted: left, plant schematic with pressure tank, Sensor 1–3, Controller, Valve 1 and Valve 2; right, structure diagram :System containing S1:Sensor, S2:Sensor, S3:Sensor, C:Controller, V1:Valve and V2:Valve]
Fig. 2. Steam Boiler schematics and structure diagram
embedded systems engineer would use it to describe the system. The structure diagram describes the static architecture of a system, consisting of components and the interconnections between them. During the design phase, models for the behavior are attached to the components, for example state machine models that describe the reaction of components to trigger signals received via their ports [17]. During the construction phase, only the intended behavior is of relevance. Safety analysis, in contrast, focuses on possible deviations from the intended behavior that lead to hazardous situations. By definition, the ports of the components are the only spots where information is exchanged, and the interconnections in the structure diagram are the paths of information flow. Consequently, these are also the spots where failures are propagated between components. The idea behind the component-based safety analysis techniques discussed in the following subsections is to exploit the system architecture for safety analysis by attaching models for failure generation to the components.
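To make these propagation paths explicit, the structure diagram can be captured as plain data: components with named ports and the connections between them. The following sketch is our own illustration (component and port names follow the figure, but the representation is not part of any of the cited tools); it merely shows that the topology needed for component-based safety analysis is small and machine-readable.

from dataclasses import dataclass, field

@dataclass
class Component:
    name: str                      # instance name, e.g. "S1"
    ctype: str                     # component class, e.g. "Sensor"
    ports: list = field(default_factory=list)

# Components of the steam boiler structure diagram
components = [
    Component("S1", "Sensor", ["Pressure"]),
    Component("S2", "Sensor", ["Pressure"]),
    Component("S3", "Sensor", ["Pressure"]),
    Component("C",  "Controller", ["P1", "P2", "P3", "Cmd"]),
    Component("V1", "Valve", ["Cmd", "Open"]),
    Component("V2", "Valve", ["Cmd", "Open"]),
]

# Connections are the paths along which information, and hence failures, can propagate
connections = [
    (("S1", "Pressure"), ("C", "P1")),
    (("S2", "Pressure"), ("C", "P2")),
    (("S3", "Pressure"), ("C", "P3")),
    (("C", "Cmd"), ("V1", "Cmd")),
    (("C", "Cmd"), ("V2", "Cmd")),
]

def propagation_targets(source):
    """Ports that a failure at the given (component, port) pair can reach directly."""
    return [sink for src, sink in connections if src == source]

print(propagation_targets(("C", "Cmd")))   # [('V1', 'Cmd'), ('V2', 'Cmd')]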
4.4 Failure Propagation and Transformation Notation (FPTN)

The Failure Propagation and Transformation Notation (FPTN) described in [3, 18] is one of the first approaches that introduce modular concepts for the specification of the failure behavior of components. The basic entity of the FPTN is the FPTN-module. An FPTN-module contains a set of standardized sections. In the first section (the header section), an identifier (ID), a name and a criticality level (SIL) are given for each FPTN-module. The second section specifies the propagation of failures, the transformation of failures, the generation of internal failures and the detection of failures in the component. To this end, this section enumerates all failures in the environment that can affect the component and all failures of the component that can affect the environment. These failures are denoted as incoming and outgoing failures and are classified by the failure categorization presented above (reaction too late (tl), reaction too early (te), value failure (v), commission (c) and omission (o)). In the example given in figure 3 the incoming failures are A:tl, A:te, A:v and B:v, and the outgoing failures are C:tl, C:v, C:c and C:o. The propagation and transformation of failures is specified inside the module with a set of equations or predicates (e.g. for propagation C:tl=A:tl, and for transformation C:c=A:te&&A:v and C:v=A:tl||B:v). Furthermore, a component can also generate a failure (e.g. C:o) or handle an existing failure (e.g. B:v). For this it is necessary to specify a failure cause or a failure handling mechanism and a probability. FPTN-modules can also be nested
Fig. 3. Abstract FPTN-Module
hierarchically. Thus, FPTN is a hierarchical notation, which allows the decomposition of the evaluation model based upon the system architecture. If an FPTN-module contains embedded FPTN-modules, the incoming failures of one module can be connected with the outgoing failures of another module. Such a connection can be semantically interpreted as a failure propagation between these two modules. For the evaluation of an FPTN-module a fault tree is constructed for each outgoing failure, based on the predicates specified inside the FPTN-module. As a result of this interpretation, an FPTN-module can be seen as a forest of fault trees, where the leaf
nodes and their probabilities are extracted from the failure generation and the failure handling sections inside the FPTN-module. To show the applicability of FPTN, the failure behavior of the steam boiler system (cf. section 4.3) is modelled in figure 4. To keep the considerations simple, we assume only a few failure modes: A sensor fails with a value failure (wrong pressure indicated) if a mechanical or an electrical failure occurs. A valve can fail to open (omission) for electrical or mechanical reasons, but also as a result of a missing command (omission at the input Cmd). The controller fails to give the open commands (omission) either if at least two of the connected sensors give wrong signals (value failures corresponding to P1, P2 or P3) or if there is an internal hardware defect. Based on these assumptions, an FPTN-module describing the failure behavior is created for each component used. These FPTN-modules are embedded into the FPTN-module "Steam Boiler System" and connected with respect to the possible failure propagation. For the evaluation of the safety properties, the failure probabilities of both outgoing failures Open:o need to be calculated. As described earlier, this can be performed by an analysis of the corresponding fault trees.
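Since the FPTN predicates are plain Boolean equations over classified failures, a module can be prototyped in a few lines. The following sketch is our own illustration of the idea, not the FPTN tooling itself; it encodes the controller module just described and evaluates its outgoing failure for a given set of incoming and internal failures.

from itertools import combinations

# Failure categories used by FPTN: tl, te, v (value), c (commission), o (omission).
# Controller FPTN-module of the steam boiler example:
#   incoming failures: P1:v, P2:v, P3:v
#   internal failure:  Intern1 (hardware defect)
#   outgoing failure:  Cmd:o = Intern1 || (P1:v&&P2:v || P1:v&&P3:v || P2:v&&P3:v)
def controller_cmd_omission(present):
    """present: set of failure tokens that occur, e.g. {'P1:v', 'Intern1'}."""
    two_of_three = any(
        a in present and b in present
        for a, b in combinations(["P1:v", "P2:v", "P3:v"], 2)
    )
    return "Intern1" in present or two_of_three

print(controller_cmd_omission({"P1:v"}))          # False: the voter masks one wrong sensor
print(controller_cmd_omission({"P1:v", "P3:v"}))  # True: two wrong sensors defeat the voter
print(controller_cmd_omission({"Intern1"}))       # True: internal hardware defect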
[Figure omitted: FPTN-module "Steam Boiler System" (ID=System) containing modules S1, S2, S3 (Sensor), C (Controller), V1 and V2 (Valve), all SIL=4. Sensors: Pressure:v = Intern1 || Intern2, with Intern1 generated by an electrical defect and Intern2 by a mechanical defect, probability 0.1 each. Controller: Cmd:o = Intern1 || (P1:v&&P2:v || P1:v&&P3:v || P2:v&&P3:v), with Intern1 generated by a hardware defect, probability 0.1. Valves: Open:o = Cmd:o || Intern1 || Intern2, with Intern1 (electrical defect) and Intern2 (mechanical defect), probability 0.1 each.]
Fig. 4. Steam Boiler Example (FPTN)
4.5 CFT

Fault Tree Analysis is one of the most popular safety analysis techniques. Unfortunately, fault trees provide only a restricted decomposition mechanism: the decomposition into independent subtrees, called modules. To be compatible with the architecture model that shall serve for automatic construction of the safety case, the models for the failure behavior must be attachable to the components and account for the assignment of incoming and outgoing failures to the ports. They must take into account that the components are in general not independent from each other, because the ports are access points for possible influences from other components. FTs are compositional in the sense that independent subtrees can be cut off and handled separately. Technical components, however, are typically influenced by other components, and thus this assumption does not hold. To allow for a modularization that corresponds to the component and port concept, an extension of FTs has recently been proposed [19]. It is called Component Fault Trees (CFTs) and allows defining partial fault trees that reflect the actual technical components. These CFTs can be modelled and archived independently from each other. Input and output failure ports glue these parts together. While traditionally independent subtrees were regarded as compound events, CFTs are treated as a set of propositional formulas describing the truth-values of each output failure port as a function of the input failure ports and the internal events. CFTs can be acyclic graphs with one or more output failure ports. Each component constitutes a name space and hides all internal failure events from the environment. Components can be instantiated in different projects. Thus all necessary preconditions for an application of FTA to component-based systems are fulfilled. To model potential failures, a CFT is generated for each component class. This is a manual task that is conveniently performed with a graphical CFT editor. Each CFT has input failure ports and output failure ports that must be associated with failure categories with respect to the messages or services at the ports of the corresponding component classes. Between input failure ports and output failure ports, the failure propagation or transformation and the internal failure generation of the component classes are modelled. For the components in the steam boiler example (cf. section 4.3) this leads to the CFTs presented in figure 5, if the same failure modes and internal faults are assumed as in the FPTN section (cf. figure 4). The CFTs given so far, in conjunction with the structure diagram, allow the system-level CFT to be integrated. However,
[Figure omitted: CFTs of the three component classes. Valve: Open.Omission = Command.Omission OR Electrical Defect OR Mechanical Defect. Sensor: Pressure.Value = Electrical Defect OR Mechanical Defect. Controller: Command.Omission = Hardware Fault OR 2-out-of-3 of P1.Value, P2.Value, P3.Value.]
Fig. 5. Controller, Valve and Sensor CFTs
before starting the analysis, another manual step is necessary: the user must complete the system-level fault tree by specifying which system hazard is to be examined. This can be performed using the graphical editor of the CFT analyzer. The resulting fault tree is shown in figure 6, which is a screenshot taken from our analysis tool UWG3 that will be introduced in the following section. The lower part of the structure has been generated automatically; the top event and the AND gate have been added manually by the user. The AND gate attached to the failure output ports V1.Open.Omission and V2.Open.Omission specifies that if both valves fail to open when expected, the hazard to be examined is present. Assuming that all basic events have a constant failure probability of 0.1, we calculated the hazard probability to be 0.214 using the tool UWG3.
Fig. 6. System CFT
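The 0.214 can be reproduced by hand. Because both valves share the controller's Command.Omission as a common cause, the two valve failures are not independent, so the Boolean structure has to be evaluated over the independent basic events rather than by multiplying the two valve failure probabilities. The following sketch is our own check by exhaustive enumeration (UWG3 itself uses BDDs); the basic event names are ours.

from itertools import product

# Independent basic events of the system CFT, each assumed to have probability 0.1
basics = [
    "S1.el", "S1.mech", "S2.el", "S2.mech", "S3.el", "S3.mech",  # sensor defects
    "C.hw",                                                       # controller hardware fault
    "V1.el", "V1.mech", "V2.el", "V2.mech",                       # valve defects
]
P = 0.1

def hazard(on):
    s1 = on["S1.el"] or on["S1.mech"]
    s2 = on["S2.el"] or on["S2.mech"]
    s3 = on["S3.el"] or on["S3.mech"]
    cmd_omission = on["C.hw"] or (s1 and s2) or (s1 and s3) or (s2 and s3)
    v1_fails = cmd_omission or on["V1.el"] or on["V1.mech"]
    v2_fails = cmd_omission or on["V2.el"] or on["V2.mech"]
    return v1_fails and v2_fails          # top event: both valves fail to open

prob = 0.0
for values in product([False, True], repeat=len(basics)):
    on = dict(zip(basics, values))
    if hazard(on):
        weight = 1.0
        for v in values:
            weight *= P if v else (1.0 - P)
        prob += weight

print(round(prob, 4))   # 0.2145, i.e. the 0.214 quoted above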
4.6 Safety Analysis with Parametric Contracts

In the following, we describe how to model safety properties in the interface of a component, using the "Quality of Service Modeling Language" (QML) [20]. This language allows one to define arbitrary quality dimensions. We define failure class definitions as quality dimensions. As the QML assumes that quality attributes are fixed values for a component and neglects the context-dependency of quality attributes, we then couple that notation with an analysis technique called "Parametric Contracts". Parametric contracts allow one to model the context-dependencies of a component's safety attributes and thus enable the analysis of component-based systems. Parametric contracts have been used for general
reliability modelling before [21, 22], but are specialized for safety analysis here. The following section provides original research results.

As current interfaces ("signature-list based interfaces") specify the well-behavior of a component service (i.e., the behavior exposed without failures), these interfaces are unsuitable for specifying or analyzing failure propagation through a system. Therefore, signature-list interfaces have to be extended by two dimensions: (a) a specification of failure classes and (b) a specification of the dependency of a service's failure behavior on the failure behavior of its context.

The inclusion of a failure class specification into a service signature can be done with the QML. The QML allows the specification of quality dimensions as well as the specification of Quality of Service contracts (QoS contracts, for short), which specify the actual provided or required service quality for the dimensions defined before. In the following, for each of the failure types introduced in section 2 a "contract type" (i.e., quality dimension) is defined.

type TooEarly       = contract {numberOfFailures : <<decreasing>> no / year;}
type TooLate        = contract {numberOfFailures : <<decreasing>> no / year;}
type IncorrectValue = contract {numberOfFailures : <<decreasing>> no / year;}
type Commission     = contract {numberOfFailures : <<decreasing>> no / year;}
type Omission       = contract {numberOfFailures : <<decreasing>> no / year;}
The above list assigns to each failure type the dimension numberOfFailures, which is measured in occurrences per year (no / year). The keyword decreasing denotes that lower values relate to a higher "quality of service". This is important to know when matching component interfaces: in case two values are not the same, one has to know whether a higher or a lower value is acceptable.

The second extension of signature-list interfaces is modelling the context dependency of failures. Basically, for any failure of the above failure types, there are three causes:

Internal service error: a bug in the service's code causes a failure.
External call error: a call to an external service causes a failure. External calls can go to services of other domain components or to services provided by the run-time environment (operating system, middleware, etc.).
External interruption error: the run-time environment stops or interrupts the execution of the service pre-emptively and causes a failure.

In the following, a failure type is denoted by ft ::= tl | te | v | c | o, standing for the failure types TooEarly, TooLate, etc. The term P_X(Y) is used to denote the probability that the event specified by the subscript X occurs for the parameter Y. The subscripts is, es and ei signify internal service error, external call error and external interruption error, respectively. One can assume that on each execution trace of the software
at most one failure occurs. This assumption is justified by the fact that failure probabilities are very low. It allows us to simply sum up the failure probabilities over all possible failure causes, restricting the analysis effort to linear equations. Therefore, the probability that service A fails with a failure of failure type ft is

P^{ft}(A) = P^{ft}_{is}(A) + P^{ft}_{es}(A) + P^{ft}_{ei}(A)    (1)
Here, P^{ft}_{is}(A) is the probability that A fails (with a failure of failure type ft) because of an internal service error, P^{ft}_{es}(A) because of an external service error and P^{ft}_{ei}(A) because of an external interruption error. If 5 failure types have been defined, then 5 equations of this style are required. It is assumed that failures of one failure type cause only consecutive failures of the same type; e.g. if some service is provided too late, it may cause other services to be provided too late as well, but not too early or with a wrong value. This assumption holds in many practical cases. In principle, however, each initial failure can result in a failure of any of the above types. To capture this, the linear equations could be extended, which in turn increases the analysis effort.

When modelling the dependency of the failure probability (for each failure type) on the component environment, the latter two terms of the above equation are important (as the first one is purely internal). Hence, the following considerations deal with P^{ft}_{es} and P^{ft}_{ei}. In both cases, we need information on what happens if service A is called. For determining the probability of an external service error, one needs to know which external services are called (and how often they are called). For the external interruption error probability, one needs information on the length of the execution and assumes that the chance of an interruption is proportional to the execution time.

Both kinds of information are given by a so-called service effect automaton [23]. A service effect automaton (SEA) is a finite state machine describing, for each service implemented by a component, the set of possible sequences of calls to services of the context. A service effect automaton is therefore a control-flow abstraction: control statements (if, while, etc.) are neglected, unless they concern calls to the component's context. As the SEA is an automaton, it accepts a language. As the input symbols of the SEA are the names of the external services called, a word of the language is a trace of service calls. By traces(SEA) the set of traces of the SEA is denoted (which is the language accepted by the SEA).

Figure 7 presents the SEA of the control process of the steam boiler controller. We refer to the variant of our example where the user can select between voter mode and single-sensor mode. The automaton in the figure presents an abstraction of the software control flow of the boiler control process. It first reads the value of pressure sensor 1 (Read:P1), then it calls the user interface to determine whether the user selected 2-out-of-3 voting mode or single-sensor mode. According to this selection (let us assume a probability u for voter mode), either the other sensors are read and then the valve commands are issued, or the valve commands are issued directly after the first pressure sensor reading. Hence the set of all traces is traces(SEA_{ControlProcess}) = {(Read:P1, Read:UI, ((Read:P2, Read:P3, Cmd:Valve1) | (Cmd:Valve1)), Cmd:Valve2)^n | n ∈ N}. For our purpose the SEA is extended to a Markov model, i.e., each transition is annotated with a transition probability (while the constraint holds that for any state the
Fig. 7. Service Effect Automaton of the Steam Boiler Controller
sum of the probabilities of outgoing transitions never exceeds one). As a result, one has for each tr ∈ traces(SEA) a function P(tr) giving the probability of tr occurring in the SEA. Since execution traces of the main function in real-time systems are usually loops (repeating themselves over and over again until the device is switched off), let us first regard just the individual runs and consider repetition later. Services that are called by the main function have one start and one end point. In our example SEA, showing the main loop, one finds two branches and thus two possible traces per run, and one gets the probability P(tr) = u for the trace tr = Read:P1, Read:UI, Read:P2, Read:P3, Cmd:Valve1, Cmd:Valve2 and P(tr) = (1 − u) for the trace tr = Read:P1, Read:UI, Cmd:Valve1, Cmd:Valve2. On each trace, services are called and these services can cause a failure of one of the known failure types (we still assume that failures are so improbable that there is at most one failure per run). Now one has to add up the failure probabilities of all externally called services e in each trace tr to get the failure probability related to this trace, under the condition that this trace is taken:

P^{ft}_{es}(tr) := \sum_{e \in tr} P^{ft}(e)    (2)
To get the total probability for one run of the main function A, we refer to the definition of conditional probability. This allows one to specify the probability P^{ft}_{es}(A) that a failure of type ft occurs in an arbitrary trace tr as follows:

P^{ft}_{es}(A) := \sum_{tr \in traces(SEA(A))} P(tr) \cdot P^{ft}_{es}(tr)    (3)
This means that P^{ft}_{es}(A) sums, over all possible traces, the product of the probability that tr is executed and the probability that during one execution of tr a failure of type ft occurs. Regarding the main loop that runs continuously, one finds that the probability that after n runs no failure has occurred is the product of the probabilities that there is no failure in the first run, no failure in the second run, and so on until n. Consequently, denoting n runs of A as A^n, the probability P^{ft}_{es}(A^n) is defined as follows:

P^{ft}_{es}(A^n) := 1 - (1 - P^{ft}_{es}(A))^n    (4)
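To illustrate equations (2)–(4) numerically, the following sketch (our own; the per-call failure probabilities and the voter-mode probability u are invented for illustration) evaluates the two traces of the steam boiler controller's SEA and then applies the n-run formula.

# Hypothetical per-call failure probabilities for one failure type ft (e.g. omission)
p_ft = {
    "Read:P1": 1e-6, "Read:P2": 1e-6, "Read:P3": 1e-6,
    "Read:UI": 5e-7, "Cmd:Valve1": 2e-6, "Cmd:Valve2": 2e-6,
}

u = 0.8   # assumed probability of selecting 2-out-of-3 voter mode

traces = {
    ("Read:P1", "Read:UI", "Read:P2", "Read:P3", "Cmd:Valve1", "Cmd:Valve2"): u,  # voter mode
    ("Read:P1", "Read:UI", "Cmd:Valve1", "Cmd:Valve2"): 1 - u,                    # single-sensor mode
}

def p_es_trace(tr):        # equation (2): sum over the external calls of one trace
    return sum(p_ft[e] for e in tr)

def p_es_run():            # equation (3): weight each trace by its occurrence probability
    return sum(p * p_es_trace(tr) for tr, p in traces.items())

def p_es_n_runs(n):        # equation (4): at least one failure in n runs of the main loop
    return 1 - (1 - p_es_run()) ** n

print(p_es_run())                     # failure probability per run
print(p_es_n_runs(50 * 3600))         # e.g. one hour of runs at a 20 ms cycle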
The remaining step in order to obtain the failure probability per time unit is to estimate the number n, i.e. the number of main loop runs during one time unit. This task is feasible because the main loop is usually scheduled on a known and regular time basis, e.g. every 20 ms. If the main loop immediately starts again after completion, the number of runs per time unit can be estimated from the execution time per loop. As a result one obtains, for each failure type, the probability per unit of execution time that a failure caused by a call to a foreign component occurs. For practical application the method can be refined by correcting terms, e.g. the probability that the failure from the called component causes harm to the caller. These terms have to be specified by the component implementer, while the occurrence probabilities depend entirely on the usage context of the component.

The probability of an external interruption error P^{ft}_{ei} is modelled in linear dependency on the length of the service's execution code trace. In principle, the length of the code execution trace depends on the actual path the control flow takes through the code. The probability for a specific path taken is given by the transition probabilities of the service effect specification. The only missing information for specifying the control-flow path length (in number of instructions) is the number of instructions associated with each transition and with each state of the service effect specification. If the service effect specification is derived from existing component code, this data is available and simply needs to be attached to the service effect specification. However, without having the service implementation at hand for analyzing its code, these figures might be hard to estimate in advance. Note that this dependency of component specifications on the actual implementation means that we are talking about component implementation instances rather than component types. Mathematically, one models the influence of external interruption errors as a linear function mapping each implemented service A to the probability that a failure of failure type ft occurs. Again, we refer to the definition of conditional probability:

P^{ft}_{ei}(A) := \sum_{tr \in traces(SEA(A))} P(tr) \cdot L(tr) \cdot P^{ft}(tr)    (5)
Formally, it sums over all possible traces the product of the probability that the trace is executed (P(tr)), a measure for the length of the trace (L(tr)) and the probability P^{ft}(tr) that the occurrence of an external interruption error results in a failure of failure type ft.

After these definitions, it is time to step back and consider practical issues. First, let us summarize what our model needs as inputs:

1. The service effect automaton (SEA) as a Markov model (i.e., having for all traces tr the value P(tr)). See [21] for a detailed discussion of how to obtain these data by a combination of code analysis, monitoring or simply educated guessing. Even if the component vendor does not provide the SEA, it can be generated a posteriori from an existing component. In addition one needs L(tr), the length of a trace, but this is also given by the code of a component.
2. The failure probability P^{ft}(e) for each external service e and each failure type ft. These data have to be provided by the component deployer, as they are part of the component context. They can be measured (for basic operations) or predicted by using the presented model itself.

3. The probability P^{ft} that a failure of type ft is caused by an external interrupt.

The second question of practical concern is how to evaluate the above formulas. The main problem is that the number of traces can be infinite, hence the sums given above cannot simply be evaluated within a loop. (Even worse, one has to show their convergence.) Therefore, we refer to the Markov chain analysis for service effect automata extended to Markov models, as described in [21, 22].
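Because the trace set is infinite, the usual way around summing trace by trace is to work on the Markov model directly: for an absorbing Markov chain, the expected number of visits to each transient state is given by the fundamental matrix (I − Q)^{-1}, and with rare failures the per-run failure probability is approximately the visit-count-weighted sum of the per-call failure probabilities. The following sketch shows this standard construction on a toy chain; it is our own illustration and not necessarily the exact algorithm of [21, 22].

import numpy as np

# Toy absorbing Markov chain: states 0..2 are transient (each corresponds to an
# external call), absorption means "run finished".
# Q[i][j] = probability of moving from transient state i to transient state j.
Q = np.array([
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],   # from state 2 the chain is absorbed with probability 1
])

# Assumed failure probability of the external call made in each transient state
p_fail = np.array([1e-6, 5e-7, 2e-6])

# Fundamental matrix: N[i][j] = expected number of visits to state j starting from i
N = np.linalg.inv(np.eye(3) - Q)
visits = N[0]                    # expected visits per run when the run starts in state 0

p_run = float(visits @ p_fail)   # approximate per-run failure probability
print(p_run)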
5 Evaluation of Safety Analysis Techniques
In the following we classify and evaluate the techniques presented above according to the requirements for safety analysis methods introduced in section 4.2.

5.1 FPTN

Requirement 1: Appropriate Component Level Models: As presented in [3], the Failure Propagation and Transformation Notation provides a simple but comprehensive annotation of the failure behavior of a component. These annotations are easy to understand and to analyze. However, failures are only differentiated according to the given five categories.

Requirement 2 + 3: Encapsulation and Interfaces and Dependencies on External Components: An FPTN-module is encapsulated and, with its incoming and outgoing failures, provides a well-defined interface to its environment. To specify the relation between these incoming and outgoing failures, the failure transformation and propagation predicates are used. Based on these predicates, the dependency of the failure behavior of the modelled component on its environment is defined.

Requirement 4: Integration of Analysis Results: For a hierarchical composition of FPTN-modules it is necessary to specify which failures are propagated between components. Currently, no systematic procedure for identifying this information is specified in the literature.

Requirement 5: Practicable Granularity: FPTN utilizes the five relevant failure types [2] (reaction too late, reaction too early, value failure, failure of commission and failure of omission). However, the architect of a component can decide which failure types and which relations between these failure types are really needed. Because of this, the granularity is defined by the user of the notation, and thus even for a complex system the safety properties are still analyzable.

Requirement 6: Tool Support: Up to now there is no commercial tool that supports the specification and evaluation of FPTN-modules.
5.2 CFT

Requirement 1: Appropriate Component Level Models: Similar to FPTN, CFTs provide a simple but comprehensive annotation of the failure behavior of a component. The expressive power is restricted to combinatorial logic.

Requirement 2: Encapsulation and Interfaces: Each CFT is encapsulated, and failure ports are used as interfaces to the capsules. These failure ports are separated into input and output failure ports. Components are reusable entities, which makes the technique appropriate for component-based development processes.

Requirement 3: Dependencies on External Components: To describe the dependencies on external components, the input failure ports are used. If they are connected with an output failure port of another component, the associated failures are propagated between these two components.

Requirement 4: Integration of Analysis Results: Due to their structure, component fault trees are hierarchically decomposable. That means the CFT of a component can contain the CFTs of the embedded components. Furthermore, the embedded CFTs can be automatically connected, based on the interface specifications of the embedded components and a construction algorithm, which is presented in [24]. The quantitative analysis is usually performed with Binary Decision Diagrams (BDDs) [25], and the BDD fragments for each component can be automatically flattened into one analyzable BDD.

Requirement 5: Practicable Granularity: Similar to FPTN, component fault trees utilize the five relevant failure types, and the architect can decide which failure types and which relations between these failure types are modelled within the CFT. Because of this, the granularity is defined by the user, and thus even for a complex system the safety properties are still analyzable.

Requirement 6: Tool Support: The specification and evaluation of CFTs is supported by a commercial tool called UWG [19]. It has been developed in co-operation between the Hasso-Plattner-Institute and the companies Siemens and DaimlerChrysler over the last two years and has been used in several industrial projects, where it proved its intuitive handling. It incorporates all previously mentioned features of the Component Fault Tree concept. UWG3 provides an efficient analysis algorithm that makes use of BDDs to efficiently represent even large CFTs.

5.3 Parametric Contracts

Requirement 1: Appropriate Component Level Models: As parametric contracts are specified by service effect automata, the composition of the notation is given by the recursive composition of service effect automata. For that composition, a transition marked by a call to an external method (read access or command) is replaced by the service effect automaton of that call. This construction of substituting transitions in a reversible way by service effect automata is shown in detail in [23]. However, parametric contracts are tailored only to a certain class of measurable quality properties.

Requirement 2: Encapsulation and Interfaces: The service effect automata are used to describe the interface of a component.

Requirement 3: Dependencies on External Components: This requirement is fulfilled, as the service effect automata explicitly model calls to external components. Their
failure probabilities are explicitly considered in the analysis. Therefore, this requirement is fulfilled.

Requirement 4: Integration of Analysis Results: As composed service effect automata are again service effect automata (see above), one can apply the same analysis techniques. In fact, for given service effect automata, previously computed failure probabilities of their traces can be used directly for the analysis of the composed service effect automaton (even without explicitly constructing the composition).

Requirement 5: Practicable Granularity: Service effect automata abstract from internal computations and from the influence of parameters on the failure probability of calls. This is only valid if the parameters have no influence on the failure probabilities, which is the case e.g. if parameters are fixed or not existent (as in our example). The validity of this abstraction is not always given and has to be checked. However, current research is concerned with more detailed usage profile models, taking parameters into account.

Requirement 6: Tool Support: Tool support for the specification of parametric contracts is currently being developed by the Palladio research group in Oldenburg. Currently, the analysis is not supported by dedicated programs. Commercial tools for safety analysis are currently not available.

5.4 Comparison of the Three Evaluation Notations

Concluding the evaluation, we present in Table 1 a comparison of the three component-based analysis techniques for safety properties. In this table we assign, to the best of our knowledge, a quality mark ranging from −− (requirement not fulfilled) to ++ (requirement completely fulfilled) for each analysis technique and each requirement.

Table 1. A Comparative Evaluation

Requirement                              FPTN   CFT   Param. Contracts
Appropriate Component Level Models        +      +     +
Encapsulation and Interfaces              ++     ++    +
Dependencies on External Components       ++     ++    ++
Integration of Analysis Results           −      ++    +
Practicable Granularity                   +      +     +
Tool Support                              −      +     −

6 Conclusions

In this chapter, we have investigated the applicability of the component-based software engineering paradigm to the domain of safety-critical systems. For that reason, we have discussed the relevant problems in detail and given an overview of current approaches and research covering this problem domain. As a result, we have identified a set of requirements that are needed to evaluate safety properties for a system built with components. These requirements are used to compare the state of the art specification
techniques that allow for the evaluation of the probability of hazards or safety-critical failures. These specification techniques are Component Fault Trees (CFTs), Parametric Contracts and Failure Propagation and Transformation Notation modules (FPTN modules), which have partly been developed by the authors of this chapter and partly by other researchers. Each of these three evaluation notations has its own strengths and limitations. To build on the strengths and to reduce the limitations, we aim to unite the features of the three evaluation notations, which will ideally lead to a unified notation that completely fulfills all the requirements given in this chapter.
References

1. Szyperski, C.: Component Software: Beyond Object-Oriented Programming. ACM Press, Addison-Wesley, Reading, MA, USA (1998)
2. Bondavalli, A., Simoncini, L.: Failure Classification with respect to Detection. Esprit Project Nr 3092 (PDCS: Predictably Dependable Computing Systems) (1990)
3. Fenelon, P., McDermid, J., Nicholson, M., Pumfrey, D.J.: Towards integrated safety analysis and design. ACM Computing Reviews, 2 (1994) 21–32
4. Leveson, N.G.: SAFEWARE: System Safety and Computers. Addison-Wesley Publishing Company (1995)
5. CENELEC (European Committee for Electro-technical Standardisation): CENELEC EN 50126: Railway Applications – the specification and demonstration of Reliability, Availability, Maintainability and Safety. CENELEC EN 50128: Railway Applications: Software for Railway Control and Protection Systems. CENELEC, Brussels (2000)
6. SAE ARP 4754 (Society of Automotive Engineers Aerospace Recommended Practice): Certification Considerations for Highly Integrated or Complex Aircraft Systems (1996)
7. Department of Defense, United States of America: Military Standard 882C. System Safety Program Requirements (1999)
8. Deutsches Institut für Normung e.V.: DIN 25419: Ereignisablaufanalyse, Verfahren, graphische Symbole und Auswertung (German Standard) (1985)
9. IEC 60812 (International Electrotechnical Commission): Functional safety of electrical/electronic/programmable electronic safety-related systems, Analysis Techniques for System Reliability – Procedure for Failure Mode and Effect Analysis (FMEA) (1991)
10. IEC (International Electrotechnical Commission): Hazard and operability studies (HAZOP studies) – Application guide (2000)
11. UK Defence Standardization Organisation: Defence Standard 00-58, HAZOP Studies on Systems Containing Programmable Electronics, Part 1 and 2 (2000)
12. DIN 25424 (Deutsches Institut für Normung e.V.): Fault Tree Analysis: Part 1 (Method and graphical symbols) and Part 2 (Manual: calculation procedures for the evaluation of a fault tree) (1981/1990)
13. IEC 61025 (International Electrotechnical Commission): Fault-Tree-Analysis (FTA) (1990)
14. Vesely, W.E., Goldberg, F.F., Roberts, N.H., Haasl, D.F.: Fault Tree Handbook. U.S. Nuclear Regulatory Commission (1996)
15. Mauri, G.: Integrating Safety Analysis Techniques, Supporting Identification of Common Cause Failures. PhD thesis, Department of Computer Science, University of York (2001)
16. IEC (International Electrotechnical Commission): IEC 61165: Application of Markov techniques (1995–2003)
17. Selic, B., Gullekson, G., Ward, P.: Real-Time Object Oriented Modeling. John Wiley & Sons (1994)
18. Fenelon, P., McDermid, J.A.: An integrated toolset for software safety analysis. Journal of Systems and Software 21 (1993) 279–290
19. Kaiser, B., Liggesmeyer, P., Mäckel, O.: A new component concept for fault trees. In: Proceedings of the 8th Australian Workshop on Safety Critical Systems and Software (SCS'03), Adelaide (2003)
20. Frolund, S., Koistinen, J.: Quality-of-service specification in distributed object systems. Technical Report HPL-98-159, Hewlett Packard, Software Technology Laboratory (1998)
21. Reussner, R.H., Poernomo, I.H., Schmidt, H.W.: Reasoning on software architectures with contractually specified components. In Cechich, A., Piattini, M., Vallecillo, A., eds.: Component-Based Software Quality: Methods and Techniques. Number 2693 in LNCS. Springer-Verlag, Berlin, Germany (2003) 287–325
22. Reussner, R.H., Schmidt, H.W., Poernomo, I.: Reliability prediction for component-based software architectures. Journal of Systems and Software – Special Issue on Software Architecture – Engineering Quality Attributes 66 (2003) 241–252
23. Reussner, R.H.: Automatic Component Protocol Adaptation with the CoCoNut Tool Suite. Future Generation Computer Systems 19 (2003) 627–639
24. Grunske, L.: Annotation of component specifications with modular analysis models for safety properties. In: Proceedings of the 1st International Workshop on Component Engineering Methodology (WCEM), Erfurt (2003) 737–738
25. Bryant, R.: Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers 35 (1986) 677–691
Performance Evaluation Approaches for Software Architects

Anu Purhonen

VTT Technical Research Centre of Finland, P.O. Box 1100, FI-90571 Oulu, Finland
[email protected]
Abstract. Performance analysis techniques have already been developed for decades. As software architecture research has matured, performance analysis techniques have also been adapted to the evaluation of software architectures. However, the performance evaluation of software architectures is not yet systematically used in the industry. One of the reasons may be that it is difficult to select what method to use. The contribution of this work is to define a comparison framework for performance evaluation approaches. In addition, the framework is applied in comparing existing performance evaluation approaches. The framework can be used to select methods for evaluating architectures, to increase understanding of the methods, and to point out needs for future work.
1 Introduction
In this work, performance evaluation means evaluation of both the time and the resource behavior of the system. A number of techniques are available for evaluating hardware and software performance [1]. However, performance evaluation of software architectures has only been researched for a few years, because software architecture research itself is relatively young. Architecture is the fundamental organization of a software system embodied in its components, their relationships to each other and to the environment [2]. Because software architecture has a fundamental role in the quality of the final product, software architecture evaluation can reveal critical problems in the design at an early stage of the development, when modifications are still easy to make [3, 4].

Performance evaluation techniques are usually better known to developers of safety-critical systems such as avionics or medical systems. In non-safety-critical systems, missing a deadline does not have serious consequences such as loss of human lives; however, it may be harmful to the manufacturer's business. In addition to just meeting deadlines, performance in non-safety-critical products has become more and more important in terms of how good a service the product gives to the users, for example, what applications can be run concurrently on a mobile phone.

In embedded systems, hardware has often been designed first and has therefore had a strong influence on software design. Nowadays, a hardware platform can be just one component in the system that is utilized in several products. On the other hand, the same software components can be used in various hardware platforms. The hardware and software are no longer designed separately; the system has to be designed as a whole. Another challenge in contemporary product development is
that more and more components, either software or hardware, are purchased from third parties. Thus, it may be difficult for the integrator to know why a component behaves in a certain way during runtime. A third property of current systems that increases the complexity of estimating the performance of the system is that both software and hardware resources are reserved dynamically, based on the acute needs of the user. It may not be feasible to define beforehand what exactly is happening in the system at a specific moment in time, because there are too many variables.

The concurrent and separate development of components means that several assumptions have to be made about interfacing components until more information is available. In addition, there may be uncertainty regarding requirements; for example, the standards that should be supported may be unfinished when the development starts. Software architecture is a natural way of supporting this kind of development. Architectural diagrams allow easy analysis of design alternatives before the implementation is fixed.

Several of the existing performance evaluation approaches are based on Queuing Network Models (QNM) [5–8], Petri nets [9–11], or process algebras [12, 13]. QNMs are constructed from service centers and queues: service centers provide services and each service center has a queue attached to it; a minimal numerical sketch of such a model is given below. Compared to QNM, Petri nets and process algebras are more formal techniques. Petri nets describe the system using places, transitions, and tokens; timing information has been added by a number of extensions to traditional Petri nets. Process algebras are algebraic languages that are used for the description and formal verification of the functional properties of concurrent and distributed systems. Stochastic process algebras (SPA) are extensions to process algebras allowing the analysis of performance properties. In addition to the above approaches, real-time system scheduling theory [14] has been used to address the issues of priority-based scheduling of concurrent tasks with hard deadlines. Moreover, the software architecture community provides the Architecture Trade-off Analysis Method (ATAM) [4]. ATAM is a scenario-based method for evaluating architecture-level designs. It considers multiple quality attributes, including performance. However, if detailed analysis is needed, the method proposes that specialized performance evaluation techniques are used.

Although performance is one of the main quality attributes in many application areas, at the moment performance evaluation is often not systematically used for supporting architectural design decisions in the industry [11, 15]. However, this does not mean that performance evaluation is not used at all. On the contrary, support for design decisions is gathered, for example, using benchmarks, simulation, prototyping, and analysis of worst-case situations based on experience. Some specific problems of the current practices are that it is difficult to estimate the reliability of the results, the analysis cannot be easily repeated, and the results are not comparable with other, similar evaluations. Furthermore, each team or expert usually has their own practices, and the results are not stored so that they could be utilized in other evaluations. The main reason behind the lack of acceptance of performance evaluation techniques is that they are considered to be difficult and time-consuming [9, 11, 15].
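As referenced above, the queueing-network style of reasoning can be illustrated with a single service center treated as an M/M/1 queue: given an arrival rate and a service demand, it yields utilization and mean response time. The numbers below are invented; real QNM tools solve networks of such centers rather than one in isolation.

def mm1(arrival_rate, service_time):
    """Open M/M/1 service center: returns (utilization, mean response time)."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("unstable: the service center is saturated")
    response_time = service_time / (1.0 - utilization)
    return utilization, response_time

# Hypothetical load: 40 requests/s arriving at a component with a 20 ms service demand
u, r = mm1(arrival_rate=40.0, service_time=0.020)
print(f"utilization={u:.0%}, mean response time={r * 1000:.1f} ms")
# utilization=80%, mean response time=100.0 ms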
In order to make the methods easier to use and accept, developers are trying to connect the performance models directly to familiar architectural models and automate the whole
process from the software architecture description to the evaluation results [16]. However, an additional problem from the non-performance-specialist's point of view is that it is difficult to select a method to use, because the capabilities and differences of the current methods are difficult to sort out from the existing publications. The contribution of this work is the definition of a framework that can be used to compare different approaches to performance evaluation. The definition is based on the requirements of the stakeholders of the evaluation. The framework is applied in comparing different types of performance evaluation approaches that have published examples of how they have been used in software architecture evaluation. The following performance evaluation approaches were selected for comparison:

– Rate-Monotonic Analysis (RMA) is a collection of quantitative techniques that are used to understand, analyze, and predict the timing behavior of systems [14]. It is not one method developed by a certain group; there are, for example, several tools that are based on RMA. (A minimal sketch of the classical utilization-bound test is given after this overview.)
– PASA is a method for the Performance Assessment of Software Architectures [17]. PASA uses the principles and techniques of Software Performance Engineering (SPE) [3, 6].
– Layered Queuing Network modeling (LQN) [16, 18–20] is an extension of QNM. This approach is mainly developed by one research group.
– The Colored Petri Nets (CPN) approach evaluated here is based on just one case study [11], where CPN was used for evaluating alternative mechanisms and policies in an execution architecture. No further publications have been made on this approach so far, but because CPN in general is widely accepted, this is an interesting case study.
– Because ATAM is probably the best-known software architecture evaluation method, it is included even though it is not necessarily a performance evaluation method.
– Another approach, based on just one publication, is called the "metrics" approach in this work. Metrics were used for analyzing the performance of the high-level software architecture of a telecommunication system [21]. In this approach, the analysis can be made without making any assumption about the implementation of the components.

The chapter is organized as follows. Related work is presented in Section 2. Section 3 reviews the needs of performance evaluation in software architecture development and the expectations of the stakeholders. Section 4 introduces the elements of the comparison framework. An overview of the approaches selected for comparison is presented in Section 5. Section 6 discusses the results of the comparison and Section 7 concludes with some proposals for future work.
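As announced in the RMA item above, the classical Liu–Layland utilization-bound test is the simplest RMA-style schedulability check: a set of independent periodic tasks scheduled with rate-monotonic priorities is guaranteed schedulable if the total utilization stays below n(2^{1/n} − 1). The task set below is invented for illustration; practical RMA adds response-time analysis, blocking and jitter.

def rma_utilization_test(tasks):
    """tasks: list of (execution_time, period) pairs; returns (U, bound, passes)."""
    n = len(tasks)
    utilization = sum(c / t for c, t in tasks)
    bound = n * (2 ** (1 / n) - 1)       # Liu & Layland bound for n tasks
    return utilization, bound, utilization <= bound

# Hypothetical task set: (worst-case execution time, period), both in milliseconds
tasks = [(1, 10), (4, 40), (10, 100)]
u, bound, ok = rma_utilization_test(tasks)
print(f"U={u:.3f}, bound={bound:.3f}, schedulable={ok}")
# U=0.300, bound=0.780, schedulable=True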
2 Related Work
Balsamo and Simeoni [22] examine approaches for deriving performance models from software architecture specifications. Although they include many methods in their analysis, the comparison is quite narrow: they only compare the approaches based on the notation that is used to describe the architecture, the architectural constraints, and the performance model used.
Balsamo et al. [1] introduce several notations that can be used to describe the behavior of a software system and the main classes of stochastic performance models. Based on these introductions, they present integrated methods for software performance engineering. A summary of the characteristics is given using nine elements, three of which analyze the quality of the methods. They also classify the methods in a space of three dimensions: the integration level of performance analysis in the software life cycle, the degree of automation, and the integration level of the software model with the performance model. This survey is a good presentation of the available techniques for performance evaluation and in that way complements our survey. However, the survey appears to have started from the methods and their capabilities, whereas our goal has been to start from the needs of the users and the system.

A third survey from Balsamo and her colleagues [23] goes even further into the characteristics of queuing network models. Although it is very informative, it appears to be directed at performance specialists and people who are familiar with queuing theories.

Herzog and Rolia [13] compare the characteristics of layered queuing modeling and SPA in eight features that are close to some of the features we have used. However, they have not specifically concentrated on the needs of software architecture evaluation.
3 Performance Evaluation
The requirements for the performance evaluation techniques are derived from the features of the system and the needs and constraints of the development organization. Stakeholders are people who are interested in the results of the evaluation and in the way the evaluation is performed. Stakeholders include, for example, hardware and software architects, software developers, managers, and marketing people. The hardware architect needs to know the amount of resources the planned software architecture requires. The software architect is interested in whether the architecture meets its requirements; in particular, they can utilize direct improvement proposals and clarifications of the dependencies between performance and other quality attributes. Marketing can use the information about the capabilities of the planned system and the cost of those capabilities. Managers follow the costs and benefits of the evaluation itself; the possible costs include the time and work spent and the cost of tools. The component designers need performance budgets for their components in order to be able to tune them accordingly. The person who makes the evaluation prefers the evaluation to be as easy to perform as possible. In addition, all of the stakeholders are interested in getting the results as fast as possible, and again whenever the parameters or the architecture change.

In addition to the needs of the stakeholders, the constraints of the organization can also affect the selection of the methods used in the development. The history and size of the organization affect what methods the company acquires. Moreover, the background and experience of the people affect how well they can utilize the methods. Performance evaluation at the architectural level requires expertise in three subjects. First, the needs of the application domain from the performance point of view have to be understood. Secondly, the design decisions and concepts that are important in the software architecture should be considered. Finally, an understanding of the theory and concepts behind performance is needed.
The stakeholders set the goals for the evaluation. On the other hand, the goals depend on the stage of the system's life cycle at which the evaluation is made. One of the first situations is when the software architecture is being developed. The person making the evaluation is then the architect or a member of the architecture team. When comparing design decisions, the accuracy of the analysis result is unimportant as long as the analysis points out which of the candidates is the best solution. Moreover, in the early stages of development the requirements may still be changing; therefore it is not useful to derive accurate results from inaccurate input values.

Although some sort of estimate of the hardware architecture is usually needed in the evaluation, the hardware architecture is not yet fixed in the early phases of the development. The results of the performance evaluation can be used to support the definition of the hardware/software partition and to determine the correct amount of hardware resources for the platform. This is especially important when designing a platform for a product line. In order to be able to analyze the resource usage of a future architecture, estimates from the old products are utilized. Therefore, the estimates of a software component in different types of hardware platforms should be comparable.

The development organization should be interested in the quality of the architecture as well as the quality of the code. Fortunately, the advantages of architecture reviews are starting to be recognized [24]. Software architecture reviews may be internal reviews based on checklists or evaluations made by an external evaluation team. Usually the goal of the reviews is to identify possible points of improvement and problem areas, that is, to perform risk analysis. If the review is based on checklists, the architect should be able to answer the questions of the reviewers using the results of the evaluations made during the software architecture design. However, when an external evaluation team makes the evaluation, they can derive new quantitative performance models from the architecture to support their analysis of the system.

Finally, evaluation methods need a description of the system under evaluation. These descriptions include, for example, the runtime architecture of the system, performance objectives, and estimates of the resource usage of individual components. However, the architecture specifications often do not describe the runtime operation of the system, or the information is incomplete. Furthermore, the architecture can be evolving, so that modifications are made frequently. Software architecture is not the only thing that affects performance; the implementation of the components also affects the overall performance. Thus, at architecture evaluation time, the effect of the implementation has to be estimated until the real values are available.
4
Comparison Framework
The elements of the comparison framework are introduced in this section.

4.1 Context

This section presents the elements that are used for defining what types of purposes the method has been applied to and in what type of systems.
Evaluation goal – The evaluation may be performed differently depending on its purpose. For example, there can be different needs in different stages of the product's life cycle. Therefore, it is important to know what types of evaluations have already been performed with the approach or for what types of evaluations it was originally intended.

Application field – The application field describes what kinds of applications the approach is intended for or what applications it has already been applied to. The application field can affect, for example, the required accuracy of the results.

Product type – Products differ, for example, in size and complexity. The approaches can differ in terms of what support they give in evaluating different types of products.

4.2 Architecture

The elements in this section examine the evaluation approaches from the point of view of what information they require from the system under study.

Views – Software architecture descriptions are usually divided into views. The evaluation approach should define the architectural structures that need to be described in order to be able to make the evaluation.

Language – In order to allow easy integration with the other development activities, the language in which the architecture is assumed to be described is important. Architectures are depicted with architecture description languages (ADLs). For example, the Unified Modeling Language (UML) [25] is commonly used as an ADL, but there are also several languages that have been developed specifically for describing architectures [26].

Parameters – In addition to architectural structures, some other information may be needed. For example, in order to analyze the response time of requests, execution time estimates are required for the individual components that serve the request. The analysis technique may expect information about resource allocation policies such as the scheduling policy. Furthermore, the objectives for the performance evaluation, such as timing requirements and resource constraints, are needed.

4.3 Evaluation

The elements in this section describe how the evaluation is actually performed using the method.

Process – Guidance is needed for understanding what tasks belong to the evaluation and how the tasks should be performed. The theory behind the evaluation methods is often difficult for non-performance experts to understand, and thus an ambiguous process description can be a reason for not taking the method into use.

Performance Model – The performance model is the model on which the actual evaluation is performed. The performance model may be part of the architecture description, or the architectural diagrams may be transformed into a performance model.
Solution Technique – An approach may support the use of one or more solution techniques. A solution technique may be based on some mathematical theory, or it may be formed from rules and guidelines. Different techniques may be applied in different stages of product development.

Results – The performance-related design decisions that are made during software architecture design include the selection of architectural styles and patterns, task partitioning and deployment to the platform, task scheduling and communication, and decisions concerning the use of different types of resources. All these design decisions can be sources of improvement. In addition to software architecture changes, improvements to the hardware architecture or changes to the requirements, both functional and non-functional, can be proposed.

Tools – Tools are needed to support the different tasks in the evaluation. In addition to helping in the evaluation, transformation tools can be used between architecture models and performance models. Furthermore, the reuse of results is facilitated with tools.

4.4 Costs and Benefits

This section handles the elements that are used to examine how useful the method is to an organization and to the actual user.

Collaboration – Trade-off studies link the evaluation to the other needs of the stakeholders. Software architects have to know the possible dependencies between different quality attributes. Furthermore, there are several domains in product development that need to work together in order to produce a final product that meets all the requirements. Because expertise and information tend to be distributed among the experts in these domains, easy integration with other development activities is needed.

Effort – In addition to benefits, there are also costs associated with the evaluation. Furthermore, time-to-market requirements may lead to systematic architecture evaluation being omitted if it is too laborious.

Flexibility – The approach should be flexible with respect to the abstraction level of the architecture description and to the size of the product. For example, in the early stages of development the speed of the analysis is often more important than the accuracy of the results. In addition, when analyzing large systems, unnecessary details may be discarded in order to get an understanding of the efficiency of the whole system. On the other hand, sometimes a detailed analysis of the most critical areas is needed.

Reusability – In the early stages of product development there is a lot of uncertainty that has to be handled. The requirements change, and therefore the architecture also has to be changed. As development proceeds, the estimates become more accurate, and therefore it should be easy to modify the performance models and re-evaluate the architecture. In addition, one of the problems with current practices is that the results of the evaluations are not reused. Thus, the approach should support reuse between projects.
Maturity – Maturity is assessed by examining the amount of support that is available for the use of the method. Moreover, it is easier to convince the organization of the usefulness of the method if it has already been used widely. In that case the method is probably practical and stable, which means fewer problems while using it and that support is available if problems do occur.
5
Overview of the Selected Approaches
This section describes the approaches selected for the evaluation. All these approaches have published examples of their usage in software architecture evaluation.

5.1 RMA

Rate-monotonic analysis (RMA) [14] is perhaps the best-known performance analysis approach among hard real-time software developers. The basic groundwork for RMA can be traced back to rate-monotonic scheduling (RMS) theory [27]. Consequently, it has been used especially for analyzing schedulability. RMA has been widely used by several organizations in development efforts to mathematically guarantee that critical deadlines will always be met, even in worst-case situations [28]. Consequently, it has also been applied to evaluating software architectures, for example, for discovering the ratio of concurrently available functionality to the cost of required hardware resources [29] and for comparing architecture candidates [30]. Definitions of critical use cases and scenarios specify the scope of the analysis. In order to be able to use RMA, a task model of the system has to be defined. The model should specify the period, response time, and deadline of each task. RMA does not have a separate performance model, but the mathematical analysis can be made directly from the architectural diagrams. There are commercial tools based on RMA [31, 32], and the tools can be linked directly to UML modeling tools. There is also a special-purpose ADL and a tool-set for safety-critical systems [33].
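To make the task-model input concrete, the following sketch shows the classical utilization-bound test from RMS theory [27], on which RMA builds. It is a minimal illustration only, not taken from any of the tools cited above, and the task set with its periods and execution times is hypothetical.

```c
#include <math.h>
#include <stdio.h>

/* A periodic task as required by the RMA task model:
   worst-case execution time C and period T (deadline assumed equal to T). */
typedef struct {
    double wcet;    /* worst-case execution time */
    double period;  /* activation period */
} task_t;

int main(void) {
    /* Hypothetical task set of a small controller. */
    task_t tasks[] = { {1.0, 10.0}, {2.0, 40.0}, {5.0, 100.0} };
    const int n = sizeof(tasks) / sizeof(tasks[0]);

    double utilization = 0.0;
    for (int i = 0; i < n; i++)
        utilization += tasks[i].wcet / tasks[i].period;

    /* Liu-Layland bound: U <= n * (2^(1/n) - 1) guarantees schedulability
       under rate-monotonic priorities; exceeding it is inconclusive and
       calls for exact response-time analysis. */
    double bound = n * (pow(2.0, 1.0 / n) - 1.0);

    printf("U = %.3f, bound = %.3f -> %s\n", utilization, bound,
           utilization <= bound ? "schedulable" : "needs exact analysis");
    return 0;
}
```

For the invented task set the utilization is 0.20, well below the bound of roughly 0.78 for three tasks, so the set is schedulable; the commercial RMA tools perform far more detailed response-time analyses on the same kind of input.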
5.2 PASA

PASA [17] can be used as a framework when SPE [3, 6] techniques are applied to software architecture evaluation. PASA identifies deviations from an architectural style and proposes alternative interactions between components as well as refactorings to remove anti-patterns. PASA is intended for uncovering potential problems in new development or for deciding whether to continue to commit resources to the current architecture or to migrate to a new one. PASA was developed from experiences in the performance assessment of web-based systems, financial applications, and real-time systems.

The evaluation starts from critical use cases that are further specified as key performance scenarios. The scenarios are documented using augmented UML sequence diagrams. PASA uses three different approaches to the impact analysis: identification of architectural styles, identification of performance anti-patterns, and performance modeling and analysis. Performance analysis is made in two phases. Initially, a simple analysis of performance bounds may be sufficient. If the analysis of performance bounds indicates the need for more detailed modeling, this is done in the second phase. Detailed performance analysis is based on two models: the software execution model and the system execution model. The models are derived from the sequence diagrams. While the software execution models provide optimistic performance metrics, the system execution models are used for studying the effects of resource contention on the execution behavior. The results of solving the system execution model include, for example, metrics for resource contention, the sensitivity of performance metrics to variation in workload composition, and the identification of bottleneck resources. There is a commercial tool available for the performance analysis [34].
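The difference between the two kinds of models can be illustrated with a deliberately simplified calculation. The sketch below sums the resource demands of a hypothetical scenario (the optimistic, contention-free view of a software execution model) and then inflates each demand with a basic open single-server queueing formula to approximate contention; real system execution models in SPE and PASA are considerably richer, and all numbers here are invented.

```c
#include <stdio.h>

int main(void) {
    /* Hypothetical per-request CPU and disk demands (seconds) for one scenario. */
    double cpu_demand  = 0.020;
    double disk_demand = 0.050;
    double arrival_rate = 12.0;   /* requests per second, assumed */

    /* Software execution model view: no contention, response time is the
       plain sum of demands along the scenario. */
    double no_contention = cpu_demand + disk_demand;

    /* Crude system execution model view: treat each resource as an open
       M/M/1-style server, so residence time = demand / (1 - utilization). */
    double cpu_util  = arrival_rate * cpu_demand;    /* 0.24 */
    double disk_util = arrival_rate * disk_demand;   /* 0.60 -> bottleneck */
    double with_contention = cpu_demand  / (1.0 - cpu_util)
                           + disk_demand / (1.0 - disk_util);

    printf("optimistic: %.3f s, with contention: %.3f s\n",
           no_contention, with_contention);
    return 0;
}
```

Even this toy model shows why the optimistic software execution model alone can be misleading: the disk, at 60 percent utilization, more than doubles the scenario's response time once queueing is taken into account.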
5.3 LQN

Layered queuing network (LQN) modeling [16, 18–20] is an extension of the QNM approach. The main difference is that a server, to which customer requests arrive and queue for service, may itself become a client to other servers from which it requires nested services while serving its own clients. If tasks have no resource limits, then LQN gives the same predictions as a queuing network [18]. LQN was originally created for modeling client/server systems [16]. It has since been applied to database applications [20] and web servers [35]. Moreover, simulation of LQN models has been used to incrementally optimize the software configuration of electronic data interchange converter systems in the financial domain [36] and to determine the highest achievable throughput in different hardware and software configurations of a telecommunication system [16].

LQN starts with the identification of the critical scenarios with the most stringent performance constraints. Then the LQN models are prepared. An LQN model is represented as an acyclic graph whose nodes are software entities and hardware devices and whose arcs denote service requests. The LQN model is transformed from UML class, deployment, and interaction diagrams [37] or from Use Case Maps [38]. There are also transformation rules for creating LQN models from architectural patterns [16]. The LQN model produces results such as response time, throughput, utilization of servers on behalf of different types of requests, and queuing delays. The parameters of an LQN model are the average service time for each entry and the average number of visits for each request. LQN offers several analytical solvers. In case the scheduling policy or something else prevents an analytical solver from being used, simulation is available. A confidence interval can be given for the results. A research tool is available for solving LQN models [39].

5.4 CPN

In the Colored Petri Nets (CPN) case study, CPN has been used for evaluating alternative mechanisms and policies in the execution architecture [11]. The mechanisms include the task control and communication mechanisms, and the policies concern task division and allocation. In addition, CPN is used for setting timing requirements for component design and implementation when the available resources are already fixed. The authors estimate the message buffer usage and message delays based on the simulation of a large number of use cases. The application field is mobile phone software. The evaluation handles industrial-scale products and product families. The "4+1" views of the architecture design approach [40] and UML are used to describe the architecture. The parameters are the message delays on different message links, the task-switching time of the operating system, and the event processing time. Different probability distributions are used for the streams of user requests, events, and network signals. The module architecture in UML is mapped to the execution architecture as a CPN model. The analysis is made using a simulation tool maintained by the University of Aarhus [41]. The simulation gives the following results: the worst-case message buffer usage; the minimum, average, and maximum message delays; and the number of task switches needed for each transaction.
5.5 ATAM

The Architecture Trade-off Analysis Method (ATAM) is used to learn what the critical architectural design decisions are in the context of selected attributes [4, 42]. Those design decisions can then be modeled in subsequent analyses. One of the attributes supported in ATAM is performance. ATAM has been used for risk analysis, for example, in real-time systems and in aeronautical systems.

ATAM is a review type of activity that takes a few days to perform. It is carried out by an evaluation team that has several members with well-defined responsibilities. In addition, the development organization has to take part in the evaluation; in particular, the architect has a role in providing information on the architectural decisions. The quality goals in ATAM are characterized with scenarios. Three types of scenarios are used: usage scenarios, growth scenarios, and exploratory scenarios. The scenarios are created by studying the architecture and interviewing the stakeholders. A standard characterization of each attribute facilitates the elicitation of scenarios. The analysis is based on finding sensitivity points and trade-off points. A sensitivity point is a property that is critical for achieving a particular quality, and a trade-off point is a property that affects more than one attribute and is a sensitivity point for at least one of them.

An attribute-based architectural style (ABAS) adds to an architectural style the ability to reason based on quality attribute-specific models [43]. The analysis of an ABAS is based on a quality attribute-specific model that provides a way of reasoning about the behavior of component types that interact in the defined pattern. For example, the definition of a performance ABAS can include a queuing model and the rules for solving the model under varying sets of assumptions. The qualitative analysis that ATAM uses is based on asking questions regarding what kinds of quantitative evaluations have been performed on the system and how else the performance characteristics have been ensured. The definitions of ABASs help in eliciting these questions. In case the screening questions reveal potential problems, a more comprehensive model of the quality attribute aspect under scrutiny is built. ATAM has a clear process description and a definition of roles for the members of the evaluation team.
5.6 Metrics

The goal in the metrics case study has been to analyze the effect of architectural decisions so that the possible implementation of the components does not need to be taken into account [21]. In the case study, the metrics are applied to the high-level software architecture of a telecommunication system. The evaluation is scenario-based. A performance scenario describes a particular use of the system performed by its users, and its weight is derived from how frequently the scenario occurs.

The Architecture Description Language for Telecommunication (ADLT) is used for describing the architecture. It has been designed with an easy translation from UML diagrams in mind. ADLT uses four diagrams: the activity diagram, the architectural configuration, the sequence diagram, and the protocol diagram. Activity diagrams and sequence diagrams are used in the performance analysis. An activity diagram shows the dynamic component and connector behavior using simple finite state machines enriched with activities. A sequence diagram is similar to UML's sequence diagram. The analysis is performed using a metric called Elementary Stress:

\text{Elementary Stress} = \frac{\text{Presences} + \text{Queues}}{\text{Parallelisms} + 1}    (1)
Presences is the number of times that a component or connector appears in a scenario. Parallelisms is the number of parallelism symbols inside the component's or connector's activity diagrams. Queues is the number of queue-type structures inside the activity diagrams. The justification of the metric is that more occurrences of a component or connector imply a larger overhead associated with that scenario; this overhead is decreased by the use of parallelism and made worse by queues or stacks. The trade-off analysis is made using a table. The trade-off value indicates the number of attributes the element is sensitive to, and the average of all trade-off values gives the architecture trade-off value.
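As a purely illustrative application of (1) with invented counts, suppose a connector appears four times in a scenario (Presences = 4) and its activity diagram contains one queue structure (Queues = 1) and two parallelism symbols (Parallelisms = 2). Then

\text{Elementary Stress} = \frac{4 + 1}{2 + 1} = \frac{5}{3} \approx 1.67

Raising the number of parallelism symbols to four would lower the value to 5/5 = 1, while each additional queue would raise it, in line with the justification given above.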
6
Comparison Results
This section compares the approaches presented in Section 5 using the framework defined in Section 4. The properties of the approaches are derived from the published uses of the methods; therefore, this comparison may not cover all the ways the methods have been used. Furthermore, because the properties are derived directly from the publications of the methods, some of the properties can be overlapping. For example, mobile phones are telecommunication systems, and telecommunication systems are real-time systems. The term closest to the one used in the publication of the method is selected.

6.1 Context

A summary of the tasks that the evaluation approaches have been used for is presented in Table 1. All the approaches seem to be suitable for comparing architectural decisions.
Table 1. Evaluation goal – approaches compared: RMA, PASA, LQN, CPN, ATAM, Metrics; goals considered: HW/SW configuration, finding bottlenecks, requirements for components, risk analysis, comparison of candidate solutions, estimation of architecture capability
Risk analysis and the analysis of bottlenecks are other uses that do not necessarily require a method that gives precise values for the performance parameters; thus, ATAM and the metrics approach should be useful for those purposes. However, quantitative results are required for deciding the hardware and software configurations, for giving requirements to the component designers, and for estimating, for example, response time. Therefore, the actual performance analysis techniques should be more appropriate for those purposes.

Table 2 shows the applications the methods have been applied to. Here, telecommunication systems means systems other than mobile phones. If a publication explicitly mentions that the system is real-time, this is marked. The problem in this comparison was that although the method descriptions often imply that a method is suitable for certain types of applications, there are not many publications on how the methods have been used in those systems. Safety-critical systems probably need a more formal approach in architecture design than, for example, consumer electronics. CPN could be considered the most formal approach in this comparison. However, RMA has also been used for safety-critical systems with a special-purpose ADL and tool-set. Consequently, more examples are needed from all the approaches before any further conclusions can be drawn based on the application field.
Table 2. Application field – approaches compared: RMA, PASA, LQN, CPN, ATAM, Metrics; application fields: telecommunication, real-time systems, missile guidance, web systems, mobile phone, client-server systems, avionics and aeronautics, financial systems
The types of products that have been evaluated using the methods are characterized in Table 3. One of the reasons why these approaches were selected for comparison was that they should be suitable for evaluating embedded software. Although this was not explicitly stated in the metrics case, it is reasonable to believe that it is also suitable for that purpose. RMA may not always be practical in complex systems because it expects each task to be characterized by a single period, deadline, and response time [29]. This may lead to overly pessimistic results. On the other hand, RMA has been found useful in analyzing, for example, avionics systems with continuous data [33]. In systems where users may activate several applications concurrently, those situations are interesting from a resource usage point of view and should be modeled as concurrent usage scenarios; at least LQN seems to be suitable for that. The communication solution in the metrics case study integrated a GSM system with the Internet.

Table 3. Product type – approaches compared: RMA, PASA, LQN, CPN, ATAM, Metrics; product types: embedded software, large system, continuous data, communication solution, concurrent usage scenarios, product family

6.2 Architecture

The CPN case study starts from an architecture description based on the 4+1 views [40]. In addition, in at least one example where LQN has been utilized, the views that are needed have been defined [44]. However, otherwise the methods do not usually explicitly state the views of the architecture that should be available before evaluation. ATAM, the metrics approach, and the analysis of patterns in PASA should be suitable for conceptual-level analysis. For the other approaches, conceptual-level descriptions may be too general because these approaches require fairly detailed parameters as input, as is shown later.

Instead of specific views, most of the approaches describe the structures or diagrams that are needed. These diagrams are presented in Table 4.

Table 4. Architectural diagrams – approaches compared: RMA, PASA, LQN, CPN, ATAM, Metrics; diagrams: UML class diagram, UML deployment diagram, UML sequence diagram, UML use cases, augmented UML sequence diagram, MSC, Use Case Maps, ADLT diagrams, META-H diagrams

META-H [33] and ADLT [21] are languages that have been used for describing input diagrams. ADLT resembles UML. Furthermore, PASA uses augmented sequence diagrams that are similar to UML sequence diagrams but have some features from message sequence charts (MSC).
The architecture for LQN has been described in UML, Use Case Maps, or MSC diagrams. ATAM does not specify the structures that are needed for the analysis; in ATAM, the responsibility of the architect is to be able to describe the architecture and the main architectural decisions to the evaluation team. In addition to the architectural diagrams, each approach needs some additional information or parameters:

– CPN requires time parameters such as message delays, task-switching time, and event processing time.
– The metrics approach and ATAM need definitions of usage scenarios and weights for them. These scenarios are derived from requirements. Part of the ATAM approach is that it supports scenario elicitation.
– LQN requires execution time demands for each software component on behalf of different types of system requests, as well as demands for other resources such as I/O devices and communication networks. The parameters include, for example, average values for the arrival rate of requests, the execution time of requests, and message delays. In addition, the scheduling discipline of each software and hardware server is needed.
– In PASA the important performance metrics for each server are residence time, utilization, throughput, and average queue length. The scheduling discipline of the queues also restricts how the problem can be solved.
– In RMA, each task should have a period, deadline, and response time defined, possibly representing the worst case. In addition, the RMA tools provide support for different types of scheduling disciplines.

6.3 Evaluation

The level of process descriptions varies among the approaches:

– In the application of CPN, the problem seems to be that there are no guidelines regarding how to use it in architecture evaluation.
– For the metrics approach there is an architectural development process description.
– ATAM has a process description for a review type of evaluation in which the roles of the evaluation team are strictly defined.
– There are some guidelines on how to use LQN in the various publications, but not one specific process description for software architecture evaluation.
– PASA has a good process description of how to proceed from architectural diagrams to the evaluation. In addition, PASA is supported by the SPE methodology.

RMA analysis can be made directly from UML diagrams, and the metrics analysis is also made directly from ADLT diagrams. The other approaches use special performance models. CPN is based on the CPN model and a special inscription language for token definition and manipulation. PASA and LQN both use queuing network models. In case a detailed analysis needs to be included in ATAM, any model and method can be used with it.

The solution techniques used are listed in Table 5. Naturally, the metrics approach is based on a special metric. LQN and apparently PASA have several analytic solvers, but simulation is used when an analytic solution is difficult or impossible to find. The developers of ATAM provide support for the evaluation with a selection of ABASs. PASA also includes analysis of architectural patterns and anti-patterns. In addition, RMA is proposed for use with PASA for schedulability analysis.

Table 5. Solution techniques – approaches compared: RMA, PASA, LQN, CPN, ATAM, Metrics; techniques: performance bounds, architectural patterns, architectural anti-patterns, analytic solver, simulation, mathematical rules, ABAS, metrics

Based on the publications, the following types of results have been obtained with the solution techniques:

– By simulating a large number of use cases, CPN is used to obtain the maximum size of message buffers and the average transaction processing time. On the other hand, when the total size of the buffers is fixed, the timing parameters are results that can be further used as requirements for component design and implementation.
– The metrics are able to show the critical elements in the architecture.
– ATAM is used to find the points in the architecture that are sensitive to performance. These are further used in specifying trade-off points and risks.
– LQN results are response time, throughput, queuing delays, and the utilization of different software and hardware components. The more exact the solver, the more accurate the results.
– The main result of RMA is the schedulability of the system. However, other information can also be revealed, for example, response times for requests.
– PASA is intended for deciding end-to-end processing for messages and for analyzing scalability. PASA identifies problem areas that require correction in order to achieve the desired scalability and quantifies the alternatives so that developers can select the most cost-effective solution.

Tools were found to be available for all methods except ATAM and the metrics approach. RMA and PASA are the only ones that have commercial tools available. Moreover, RMA is supported by more than one evaluation tool, and the tools can be directly connected to well-known design tools. There are also some publications about transformation tools for LQN, but the tools were not publicly available. The tool used in the CPN case study was earlier a commercial tool, but it is now maintained by a university [41]. At least the CPN and RMA approaches have tools available for formal modeling and analysis. The problem with all the approaches seems to be that the tools do not yet give much advice on how to utilize the results in order to improve the architecture. Therefore, the user has to be able to translate the performance concepts back into architectural alternatives.
6.4 Costs and Benefits

Trade-offs with other attributes and integration with other development activities are supported as follows:

– The CPN approach supports analysis of functionality in addition to performance evaluation. However, it has been difficult to formally analyze the whole model because of processing power requirements [11]. Integration of the CPN model with UML diagrams is claimed to be easy, but this is not clearly demonstrated in the publication.
– In the metrics approach and ATAM, trade-off analysis is supported.
– In PASA, trade-off analysis is claimed to be supported, but it is not clearly demonstrated how it is done. Integration of PASA and SPE with other development efforts is left to the organization utilizing the approach.
– In the LQN publications there are some ideas on how to integrate it with software design environments, but more support is not yet available.
– The connection with other design activities in RMA is supported with tools. However, although the analysis can be made directly from UML diagrams, the support for feeding the results back to the architecture is unclear. In addition, no mentions of trade-off studies with non-performance-related attributes were discovered.

The effort used is not often reported in the publications, and the estimates that are given may not be comparable with each other. According to the CPN case study, it took one and a half months to learn the method and tools; after that, the modeling took two months for the first version and an additional three weeks for an update. The time spent on the actual evaluation was not given. According to the developers of the metrics approach, the costs of the evaluation do not seriously affect the development process. ATAM normally takes 3-4 calendar days to apply to the architecture of a medium-sized system. However, ATAM requires assembling the relevant stakeholders for a structured session of brainstorming, presentation, and analysis, costing a total of 40-60 staff days; the time needed by the managers and the architects to prepare for the ATAM is not included. The LQN publications do not give references for model building, but it has been mentioned that LQN analysis takes seconds and simulation minutes. The cost of following SPE (the approach upon which PASA is based) in performance-critical projects has been 1-10 percent of the overall project budget [6]. No references to the costs of RMA were discovered.

A flexible method should be suitable for small and large systems at the conceptual and concrete levels. It should be flexible enough to be used in the different stages of product development and for different levels of result accuracy. CPN supports hierarchical descriptions, which facilitates the modeling of larger systems. The metrics approach should be suitable for both large and small systems, but without a tool, large systems may be too laborious. Because the steps of ATAM are strictly defined, the effort of the review remains nearly the same whatever the size of the system to be analyzed. LQN has been utilized for analyzing large systems. One of the main features of LQN is that it supports reserving multiple resources at the same time. Furthermore, there is a selection of solvers depending on what kind of result accuracy is needed. The steps for larger and smaller systems are the same in PASA. The use of sub-models facilitates the evaluation
of larger systems. Evaluation of patterns and anti-patterns can already be performed from conceptual diagrams. RMA does not have any additional features concerning the flexibility of the method.

Reusability can be supported in different ways. For example, the evaluations should be easily repeatable when parameters change, and the results should be reusable between different projects. The selected approaches support reusability as follows:

– In the CPN approach the parameters can be updated, and the variation in products can be described with data variables and initial markings. Thus, the approach should support reusability between projects.
– The metrics approach apparently always gives the same result with the same input information regardless of the user, because it is a platform-independent approach.
– Because ATAM is based on the experience of the evaluators and on the composition of the expert team, it does not necessarily give the same result regardless of the evaluator. However, intermediate results, such as the definitions of the scenarios, can be reused.
– LQN models can be reused with different parameters and solvers. In addition, the behavior of the nodes can be described in more detail if needed.
– PASA supports the composition of sub-models, but the analysis of concurrent scenarios is supported only with a simulator. The tool provides a database where parameter values (e.g., resource utilization) can be stored and reused in different models that run in the same environment.
– RMA has good tool support; thus, the evaluation result should not depend on the user of the method. However, no examples were found regarding how the evaluation results are actually used to improve the architecture, which may cause variation in the actual outcome of the evaluation.

The properties for estimating maturity are presented in Table 6.

Table 6. Metrics for estimating maturity – approaches compared: RMA, PASA, LQN, CPN, ATAM, Metrics; properties: service marks, case studies, several developers, commercial tool, research tool, handbooks, several tool providers

In the CPN approach the main problem is the lack of documentation on how to model architectures with it. For the metrics approach, a validation of the metrics through Petri nets and QNM was said to be in progress. There is especially good guidance for the use of PASA in the form of two books; however, PASA evaluation without the tool may be difficult [45]. There are a lot of publications about LQN, but it needs a good book on how it should be used, especially for software architecture evaluation. RMA is the most mature method in that it has more than one commercial tool and it has been widely used in developing
hard real-time software. However, as a method for a software architect it is lacking in that, although there are books about it, they have not been updated since new concepts of software architecture development have emerged. The problem with the other service mark, ATAM, is that there are no tools and it does not support detailed analysis of performance. From the software architecture development point of view, the CPN and metrics approaches are the most immature ones because there are not yet many publications on them.

6.5 Summary

As anticipated, the comparison of the evaluation approaches showed that there is still a lot to do to make the methods more suitable for architects. Furthermore, it seems that one method is not enough; instead, different techniques should be used for different purposes. In addition, the background and experience of the organization and the availability of tools affect which method is taken into use.

RMA is a well-known method among real-time software developers, and it is the only method that has more than one commercial tool. The tools are also directly connected to design tools. However, more guidance is needed on how it can be utilized specifically by software architects. In addition, the fact that it can give overly pessimistic results may hinder its use for many problems in consumer electronics.

PASA also has a commercial tool, and it is based on the well-known performance engineering approach SPE. Two books provide good support for the different stages of the evaluation. Nonetheless, it is difficult to complete the evaluation without using their tool, and there is no support for integrating the approach with other development activities. In addition, the trade-off analysis phase is included in the method, but no detailed guidelines were found regarding how it is actually performed.

LQN seems to be suitable for many types of applications and, based on numerous publications, a lot of effort is being put into making it more user-friendly. The interesting feature of LQN is that it allows the reservation of multiple resources concurrently, which is important for real-world systems. However, the tool is not yet commercial and is only available for restricted environments. In addition, although the notation is easy to understand, a good manual is needed on how the method can be applied by architects.

CPN is especially good for systems where both functionality and performance need to be validated. Modeling with CPN seems to take a long time, and the evaluation is done using only a simulator, so the evaluation may be slow for some purposes. There seems to be no direct connection between design tools and the CPN tool. The results concerning CPN were based on only one case study, of which no further publications were found; thus, they may not be as reliable as those for approaches with more publications available.

ATAM has been developed especially for architecture evaluation, and it is supported by a handbook. In the evaluation, ATAM relies on the experience of the evaluation team. Consequently, it is not actually a performance evaluation method itself, but it creates a framework in which detailed performance evaluation techniques can be embedded.

The metrics approach is the other approach based on just one case study. The interesting point of this approach is that it does not require any estimates of the future implementation of the system; it is based solely on the evaluation of the architectural structures.
Unfortunately, the proposed metric has not yet been validated through other applications and performance evaluation techniques. In addition, tools are needed before the metrics approach can be applied to larger systems.
7
Conclusion
The use of systematic performance evaluation of software architectures is still rare, although the benefits of early analysis in software development have long been accepted. This chapter first described the needs of the stakeholders regarding performance evaluation. Based on the stakeholder requirements, the elements of the comparison framework were introduced, and the framework was used to compare six performance evaluation approaches.

At the moment the evaluation methods are not general-purpose, and all of them have their strengths and weaknesses. Different methods need to be used for different purposes. In addition, the background of the development organization and the availability of tools affect the selection of the method. Consequently, the proposed comparison framework can help in selecting the method that best suits the needs of the organization and the system. It also helps in estimating the status of performance evaluation research and in understanding the differences between the approaches.

Future work on the evaluation methods is needed to create guidelines and examples on how to use these methods in software architecture development. In addition, tool support is not yet sufficient for an architect who is not entirely familiar with performance concepts. Support is especially missing for transforming architectural diagrams into performance models and, conversely, for translating evaluation results back into architectural decisions.
Acknowledgments

This work was partly conducted in the Moose project, under ITEA cluster projects of the EUREKA network, and financially supported by Tekes (the National Technology Agency of Finland). The author was also supported by a grant from the Nokia Foundation.
References

1. S. Balsamo, A. D. Marco, P. Inverardi, and M. Simeoni. Software Performance: State of the Art and Perspectives. Dipartimento di Informatica, Università Ca' Foscari di Venezia, Research Report CS-2003-1, January 2003.
2. IEEE. IEEE Recommended Practice for Architectural Description of Software-Intensive Systems. IEEE Std 1471-2000, 2000.
3. C. U. Smith. Performance Engineering of Software Systems. Addison-Wesley, 1990.
4. P. Clements, R. Kazman, and M. Klein. Evaluating Software Architectures: Methods and Case Studies. Addison-Wesley, 2001.
5. F. Aquilani, S. Balsamo, and P. Inverardi. Performance Analysis at the Software Architectural Design Level. Performance Evaluation, vol. 45, pp. 147-178, 2001.
6. C. U. Smith and L. G. Williams. Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software. Addison-Wesley, Boston, 2002.
7. P. Kähkipuro. Performance Modeling Framework for CORBA Based Distributed Systems. University of Helsinki, 2000.
8. C. Shousha, D. Petriu, A. Jalnapurkar, and K. Ngo. Applying Performance Modelling to a Telecommunication System. In Proceedings of the First International Workshop on Software and Performance (WOSP'98), Santa Fe, New Mexico, USA, 1998.
9. P. King and R. Pooley. Derivation of Petri Net Performance Models from UML Specification of Communication Software. In Proceedings of the XV UK Performance Engineering Workshop, 1999.
10. K. Fukuzawa and M. Saeki. Evaluating Software Architectures by Coloured Petri Nets. In Proceedings of the 14th International Conference on Software Engineering and Knowledge Engineering (SEKE'02), Ischia, Italy, 2002.
11. J. Xu and J. Kuusela. Analyzing the Execution Architecture of Mobile Phone Software with Colored Petri Nets. International Journal on Software Tools for Technology Transfer, vol. 2, pp. 133-143, 1998.
12. M. Bernardo, P. Ciancarini, and L. Donatiello. Architecting Families of Software Systems with Process Algebras. ACM Transactions on Software Engineering and Methodology, vol. 11, pp. 386-426, 2002.
13. U. Herzog and J. Rolia. Performance Validation Tools for Software/Hardware Systems. Performance Evaluation, vol. 45, pp. 125-146, 2001.
14. M. H. Klein, T. Ralya, B. Pollak, R. Obenza, and M. G. Harbour. A Practitioner's Handbook for Real-Time Analysis: Guide to Rate Monotonic Analysis for Real-Time Systems. Kluwer, 1993.
15. R. Pooley. Software Engineering and Performance: A Roadmap. In Proceedings of the 22nd International Conference on Software Engineering, Future of Software Engineering Track, Limerick, Ireland, 2000.
16. D. Petriu, C. Shousha, and A. Jalnapurkar. Architecture-Based Performance Analysis Applied to a Telecommunication System. IEEE Transactions on Software Engineering, vol. 26, pp. 1049-1065, 2000.
17. L. G. Williams and C. U. Smith. PASA: A Method for the Performance Assessment of Software Architectures. In Proceedings of the 3rd International Workshop on Software and Performance, Rome, Italy, 2002.
18. C. E. Hrischuk, C. M. Woodside, J. A. Rolia, and R. Iversen. Trace-Based Load Characterization for Generating Performance Software Models. IEEE Transactions on Software Engineering, vol. 25, pp. 122-135, 1999.
19. J. A. Rolia and K. C. Sevcik. The Method of Layers. IEEE Transactions on Software Engineering, vol. 21, pp. 689-700, 1995.
20. G. Franks, A. Hubbard, S. Majumdar, D. Petriu, J. Rolia, and M. Woodside. A Toolset for Performance Engineering and Software Design of Client-Server Systems. Performance Evaluation, vol. 24, pp. 117-135, 1995.
21. S. Afsharian, M. Giuli, and G. Tarani. Quantitative Analysis for Telecom/Datacom Software Architecture. In Proceedings of the 3rd International Workshop on Software and Performance, Rome, Italy, 2002.
22. S. Balsamo and M. Simeoni. Deriving Performance Models from Software Architecture Specifications. In Proceedings of the European Simulation Multiconference (ESM 2001), Prague, 2001.
23. S. Balsamo, V. D. N. Personè, and P. Inverardi. A Review on Queuing Network Models with Finite Capacity Queues for Software Architectures Performance Prediction. Performance Evaluation, 2002.
24. R. Kazman and L. Bass. Making Architecture Reviews Work in the Real World. IEEE Software, vol. 19, pp. 67-73, 2002.
25. OMG. Unified Modeling Language. http://www.uml.org.
26. N. Medvidovic and R. N. Taylor. A Classification and Comparison Framework for Software Architecture Description Languages. IEEE Transactions on Software Engineering, vol. 26, pp. 70-93, 2000.
27. C. L. Liu and J. W. Layland. Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment. Journal of the Association for Computing Machinery, vol. 20, pp. 46-61, 1973.
28. R. Obenza. Guaranteeing Real-Time Performance Using RMA. Embedded Systems Programming, pp. 26-40, 1994.
29. A. Ran and R. Lencevicius. Making Sense of Runtime Architecture for Mobile Phone Software. In Proceedings of the 11th ACM SIGSOFT Symposium on Foundations of Software Engineering held jointly with the 9th European Software Engineering Conference (ESEC/FSE 2003), Helsinki, Finland, 2003.
30. A. Purhonen. Architecture Evaluation Strategy for DSP Software Development. In Proceedings of the 15th International Conference on Software & Systems Engineering and their Applications, Paris, 2002.
31. TimeSys Corporation. TimeWiz. www.timesys.com.
32. Tri-Pacific Software Inc. RAPID. www.tripac.com.
33. P. H. Feiler, B. Lewis, and S. Vestal. Improving Predictability in Embedded Real-Time Systems. Carnegie Mellon University, Software Engineering Institute, Technical Report CMU/SEI-2000-SR-011, December 2000.
34. Performance Engineering Services. SPE*ED. http://www.perfeng.com.
35. J. Dilley, R. Friedlich, T. Jin, and J. Rolia. Measurement Tool and Modelling Techniques for Evaluating Web Server Performance. In Proceedings of Computer Performance Evaluation: Modelling Techniques and Tools, 1997.
36. K. Aberer, T. Risse, and A. Wombacher. Configuration of Distributed Message Converter Systems Using Performance Modeling. In Proceedings of the 20th International Performance, Computation and Communication Conference, Phoenix, Arizona, 2001.
37. D. C. Petriu and H. Shen. Applying the UML Performance Profile: Graph Grammar-Based Derivation of LQN Models from UML Specifications. In Proceedings of TOOLS 2002.
38. D. C. Petriu and M. Woodside. Software Performance Models from System Scenarios in Use Case Maps. In Proceedings of TOOLS 2002.
39. Carleton University. LQNS Solver. http://www.sce.carleton.ca/rads/#softarch.
40. P. Kruchten. The 4+1 View Model of Architecture. IEEE Software, vol. 12, pp. 42-50, 1995.
41. University of Aarhus. CPN Tools. http://wiki.daimi.au.dk/cpntools/cpntools.wiki.
42. R. Kazman, M. Klein, and P. Clements. Evaluating Software Architectures for Real-Time Systems. Annals of Software Engineering, vol. 7, pp. 71-93, 1999.
43. M. Klein, R. Kazman, L. Bass, J. Carriere, M. Barbacci, and H. Lipson. Attribute-Based Architecture Styles. In Proceedings of the First Working IFIP Conference on Software Architecture (WICSA1), San Antonio, TX, 1999.
44. C.-H. Lung, A. Jalnapurkar, and A. El-Rayess. Performance-Oriented Software Architecture Engineering – An Experience Report. In Proceedings of the First International Workshop on Software and Performance, Santa Fe, New Mexico, 1998.
45. T. Kauppi. Performance Analysis at the Software Architectural Level. VTT Electronics, VTT Publications 512, 2003.
Component-Based Engineering of Distributed Embedded Control Software

J.H. Jahnke1, A. McNair1, J. Cockburn1, P. de Souza1, R.A. Furber2, and M. Lavender2

1 Department of Computer Science, University of Victoria, Victoria, B.C., V8W-3P6, Canada
{jens,amcnair,japc,pdesouza}@netlab.uvic.ca
2 Intec Automation Inc., 2751 Arbutus Rd., Victoria, B.C., V8N-5X7, Canada
{bob,mike}@microcommander.com
Abstract. Embedded control applications have become increasingly network-centric over the last few years. Inexpensive embedded hardware and the availability of pervasive networking infrastructure and standards have created a rapidly growing market place for distributed embedded control applications. Software construction for these applications should be inexpensive as well in order to satisfy mass-market demands. In this chapter, we present results from an industrial-driven collaborative project with the purpose of researching component-based software engineering technologies for mass-market network-centric embedded control applications. This project has led to the development and refinement of several tools in support of component-based software development. We describe these tools along with their underlying concepts and our experiences in using them.
1
Net-Centric Embedded Components
Embedded systems have become ubiquitous in our daily lives. Networking them via Internet and intranet infrastructures is one of the computer industry's fastest growing markets [6]. The omnipresence of digital communication infrastructures has created inexpensive media for tele-monitoring and distributed processing in embedded devices.

Traditionally, the primary concerns while developing software for embedded systems have been maximizing run-time and memory efficiency in order to minimize hardware costs. Due to continuously decreasing hardware costs and the increasing complexity of the tasks controlled by embedded systems, other goals like maintainability, reliability, security, and safety have gained great importance. Still, current industrial development practices (processes, tools, and techniques) for software in embedded systems lag behind the state of the art in other software engineering areas. Despite all the progress made in the general software engineering arena (e.g., model-driven specification and design, component-based software development, generative programming, framework reuse, etc.), much embedded software is still being developed at a low level of abstraction, using primitive programming languages like assembler and C. This development
practice is inefficient for complex systems because it impedes software reuse and maintenance. Moreover, it is human-intensive, requires a significant amount of experience, and is prone to error. These problems are growing even more severe with the current trend towards interconnecting embedded systems in net-centric architectures. The magnitude of the potential benefits of aggregating and connecting embedded systems over the Internet drives interest in the currently unsolved problem of how to design, test, maintain, and evolve such heterogeneous, collaborative systems.

1.1 The Minimal Footprint Challenge

A component-oriented approach can be used to tackle the problem stated above. The notion of reusable software components has proven beneficial in general software engineering domains. Current integrated software development environments provide an extensible library of front-end Graphical User Interface (GUI) components and back-end database components that can be used to rapidly compose applications. In the area of embedded control systems, component-oriented software development has been shown to cut production costs and improve the maintainability of systems [20]. Current embedded component models such as the Microsoft .NET Compact Framework and Java 2 Micro Edition (J2ME/Java Beans) are powerful platforms for component-based development on high- and mid-end devices. However, these frameworks are still far too resource-hungry for applications on low-cost, mass-produced 8- and 16-bit micro controller platforms. This is partially because both of these frameworks have been developed as stripped-down versions of component models defined for traditional workstations and servers. In contrast, the component model described in this chapter has been developed from the ground up, specifically targeting low-powered, small-scale micro controllers.

1.2 The Interactivity Challenge

A weakness of current component models targeted towards embedded controllers is that they are concerned solely with the code running on the embedded device. However, micro controllers embedded in smart appliances have become increasingly interactive, and users expect to interact with their devices from remote locations such as the PC at their work place. Therefore, the concept of an embedded software component deployed on such a micro controller has manifestations beyond the actual code running on the micro controller; it might also include the code for a GUI running on a client's PC in order to monitor and adjust the execution of the embedded component. Of course, an alternative to executing component UI code on client user devices would be to access embedded controllers with general-purpose thin-client software, such as Web browsers. However, this would require each micro controller to incur the overhead of operating an embedded Web server to render the user interfaces to many, potentially concurrent, clients. This approach has additional drawbacks, limiting the user interface to relatively simple controls and interaction paradigms, and making near real-time visualization of the components unlikely. Therefore, dedicated client UIs are often used to provide PC-based control of embedded devices, which has the advantage of offloading UI features onto host PCs,
allowing the embedded controllers on the smart devices to be dedicated to what they are good at: controlling in real time. However, to create such applications, engineers have to bridge between the disparate component models used on the embedded system and on the PC, respectively. Developing these bridges is often a costly and human-intensive process. The development of a holistic component model, which encapsulates embedded aspects as well as non-embedded (visual) aspects of controller components, could overcome this limitation.

1.3 The Integration Challenge

Tele-monitoring of smart appliances over the Internet can be seen as a form of distributed computing. However, there is also an increasing trend to directly integrate the operations of different embedded devices so that they can exchange signals and information in order to act in concert with each other. It is one of the challenges of component-based development to facilitate such integration without introducing additional context dependencies for components and thus decreasing their universal reusability in other application contexts. In other words, we are looking for ways to integrate distributed components and, at the same time, to maintain their ignorance of each other. A programming style that attempts to solve this challenge by introducing a level of indirection is connection-based programming [20]; a small sketch of this style is given at the end of this section.

In this chapter, we present a component-oriented approach to engineering embedded control software for network-centric systems. This approach particularly targets low-powered platforms and facilitates interaction with PC-hosted GUIs. We present microCommander, a commercial visual development environment for embedded control software developed by Intec Automation. The microCommander component model has been developed specifically as an answer to the first two challenges mentioned above. Furthermore, we have developed a research prototype technology in answer to the third challenge (integration): microSynergy is a model-driven development method for the connection-based integration of multiple embedded controllers. Intec and UVic have been collaborating on the research and development of these technologies since 2000, supported by the Advanced Systems Institute of British Columbia and the Natural Science and Engineering Research Council of Canada.

The next two sections introduce the microCommander component-based development model and environment, which also facilitates PC-based user monitoring and control of devices. Section 4 focuses on the microSynergy approach to integrating multiple embedded devices by model-driven, connection-based programming. Section 5 presents related work, and Section 6 contains an evaluation of our results and reports on our experiences.
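The following fragment is a minimal, hypothetical illustration of connection-based programming; it is not taken from microCommander or microSynergy, and all names are invented. Components expose typed input and output ports, and a separate connector wires a producer's output to a consumer's input, so neither component refers to the other directly.

```c
#include <stdio.h>

/* A component exposes an input port as a callback; it knows nothing about
   who produces the values it receives. */
typedef void (*input_port_t)(void *self, int value);

typedef struct {              /* hypothetical temperature sensor component */
    input_port_t out;         /* output port: bound later by a connector   */
    void        *out_target;
} sensor_t;

typedef struct { int setpoint; } heater_ctrl_t;   /* hypothetical consumer */

static void heater_ctrl_input(void *self, int temperature) {
    heater_ctrl_t *c = self;
    printf("heater %s\n", temperature < c->setpoint ? "on" : "off");
}

/* The connector is the only place where producer and consumer meet. */
static void connect(sensor_t *s, input_port_t port, void *target) {
    s->out = port;
    s->out_target = target;
}

static void sensor_publish(sensor_t *s, int temperature) {
    if (s->out) s->out(s->out_target, temperature);  /* forward via the connection */
}

int main(void) {
    sensor_t sensor = {0};
    heater_ctrl_t ctrl = { .setpoint = 21 };
    connect(&sensor, heater_ctrl_input, &ctrl);  /* wiring done by a third party */
    sensor_publish(&sensor, 18);                 /* prints "heater on"  */
    sensor_publish(&sensor, 23);                 /* prints "heater off" */
    return 0;
}
```

Because the sensor and the heater controller only meet in the connector, either one can be reused in another application context without modification, which is exactly the property the integration challenge asks for.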
2
Component-Based Development with microCommander
The microCommander embedded component technology consists of a component model, a library of embedded components, and an integrated visual application development environment. The component library consists of basic control components such as timers, logic gates, controllers, etc. These components are assembled, using the visual
development tools, into control applications. The combination of the software code that performs specific control functions on the micro controller with the code used to view and manipulate these functions from a remote host PC provides a single and coherent component model. We have chosen to present the perspective of using microCommander for application development first, before discussing the component model and infrastructure underneath microCommander.

2.1 Visual Component Assembly – The User's Perspective

microCommander was designed with interactivity and ease of use in mind. As a consequence, it deviates from the traditional "develop, build, download, test, debug" cycle by allowing developers to instantly assemble and visually monitor software components on the embedded hardware in real time. In microCommander, the line between development and deployment of an application is blurred: the same tool is used both to develop an application and to interact with it. microCommander applications are developed using mVisual, software that runs on a host PC and interacts with a target micro controller. An application is a collection of microCommander components (mComponents) that may interact with one another. Figure 1 shows the mVisual view of a simple greenhouse application. Besides the mComponents selected for the application, the figure shows a list of system components that belong to the component framework of microCommander, e.g., System Information, Job Scheduler, Tick List, and Second List. These components can be used to customize the way the microCommander framework executes the mComponents used in an application.
Fig. 1. mVisual component view
In order to develop an application, mComponents are instantiated by selecting them from a toolbar in mVisual. mComponents are then configured by customizing their properties through property sheets. For example, the left-hand side of Fig. 2 shows the property sheet for an on-off controller mComponent, which functions like a wall thermostat. It drives a digital component (e.g., a heater) to regulate a value (e.g., temperature) in a physical system (e.g., a greenhouse) towards a target value. Connections between different input and output mComponent instances are also made with the help of an mComponent's property sheet. The example below shows that the on-off controller mComponent is connected to a "Recirc Air Temp" converter component on its input (which converts the sensor reading from mV to °C), and to a "UHeater Ctrl" digital variable component on its output.
Fig. 2. On/Off controller property sheet (left-hand side) and corresponding visual control (righthand side)
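To give an idea of the behavior being configured here, the following C++ sketch implements the core of such an on-off controller as a plain class. The names, the hysteresis band and the hard-coded values are illustrative assumptions; in microCommander this logic is supplied by the pre-built mComponent and only parameterized through the property sheet:

#include <iostream>

// Hypothetical sketch of an on-off (bang-bang) controller in the spirit of the
// "UHeater Ctrl" example: it reads a converted temperature value and drives a
// digital output so that the measured value approaches a target value.
class OnOffController {
public:
    OnOffController(double target, double hysteresis)
        : target_(target), hysteresis_(hysteresis), output_(false) {}

    // Called by the framework's timed update service with the current
    // input value (e.g., recirculated air temperature in degrees C).
    void update(double input) {
        if (input < target_ - hysteresis_) output_ = true;        // heat
        else if (input > target_ + hysteresis_) output_ = false;  // idle
        // within the hysteresis band: keep the previous output
    }

    bool output() const { return output_; }  // fed to a digital variable component

private:
    double target_;
    double hysteresis_;
    bool output_;
};

int main() {
    OnOffController heater(21.0, 0.5);   // 21 degrees C target, illustrative values
    for (double t : {18.0, 20.4, 21.6, 22.0, 20.6}) {
        heater.update(t);
        std::cout << "temp=" << t << " heater=" << (heater.output() ? "on" : "off") << '\n';
    }
}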
Once an mComponent instance has been created, its data and possibly other properties may be viewed with an interactive dialog, called a visual control (right-hand side of Fig. 2). mVisual also serves as a micro controller operating interface that displays components in the form of visual controls, which might include switches, status lights, knobs, and sliders on one or more control panels. Adding descriptive wallpapers to the background of control panels can serve to further enhance the user guidance. mVisual also provides a comprehensive set of consistency checks such as dependency trees and hardware binding tables in order to prevent erroneous component compositions.
3
The microCommander Component Model
The microCommander component model is distributed in nature. A portion of an mComponent is executed autonomously on the embedded device and a portion is executed on a client PC in order to provide interactivity with the embedded application. For
instance, the greenhouse unit heater controller switches the unit heaters on/off to maintain the greenhouse temperature. When the user connects with mVisual, the mComponents are rendered on-demand through property sheets and visual controls, making the mComponents accessible to the user for configuration and interaction. This is done by a rendering engine in mVisual, as shown in Figure 3.
Fig. 3. User interaction with mComponent
For the microCommander component model, we have adopted the general definition of the concept of a software component by Szyperski: "a software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties" [20]. More specifically, mComponents have the following characteristics:
1. Uniqueness: each component has a unique identifier.
2. Embedded functionality: each component includes binary code to be executed on an embedded controller in order to perform a particular function.
3. User-interactivity: each component includes binary, data and meta-data to be processed on a remote computer for the purpose of monitoring and controlling the component's embedded function(s).
4. Parameterization-dialog: each component includes binary code to be executed on a remote computer for the purpose of customizing a component's embedded function.
5. Self-disclosure: components can disclose their status, type, and ID.
6. Persistence: component instances can serialize to a binary stream destined for permanent storage, e.g., on a flash memory provided by the embedded platform.
7. Embedded hardware-dependency: a specification of the hardware context necessary for executing the component.
8. Contractual inter-component interface: event and data flow-oriented interface specification for component composition.
mComponent instances are classified into types according to the embedded function they perform. mComponent instances of the same type share rules, properties and behavior. In addition, each mComponent instance hosts unique parameters and run-time information, making a particular instance different from other instances of the same type.
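As an illustration of how these characteristics could surface in code, the following C++ interface sketch covers the embedded side of an mComponent (identity, default action, self-disclosure, persistence and triggerable actions). It is a hypothetical rendering; the real mComponent API is proprietary and may look quite different:

#include <cstdint>
#include <string>
#include <vector>

// Hypothetical C++ rendering of the mComponent characteristics listed above.
// All names and signatures are invented purely for illustration.
struct ComponentConfigurationBlock {
    std::string name;    // e.g., "UHeater1"
    std::uint8_t flags;  // Enabled | Visible | Inverted ...
    std::uint32_t data;  // current value, e.g., On (1)
    std::string type;    // e.g., "Digital Out"
};

class MComponent {
public:
    virtual ~MComponent() = default;

    // 1. Uniqueness
    virtual std::uint32_t id() const = 0;

    // 2. Embedded functionality: the default action, driven by the update service
    virtual void update() = 0;

    // 5. Self-disclosure: status, type and ID
    virtual ComponentConfigurationBlock disclose() const = 0;

    // 6. Persistence: serialize instance state for storage in flash memory
    virtual std::vector<std::uint8_t> serialize() const = 0;

    // 8. Contractual interface: trigger a type-specific action by name
    virtual void trigger(const std::string& action) = 0;

    // 3./4. The visual control and property-sheet code live on the PC side and
    //       are looked up via the component's extended descriptor, so they do
    //       not appear in the embedded interface.
};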
When mVisual connects to a target micro controller, the target introspects its current inventory of mComponent instances and sends the list back to mVisual. Then mVisual asks each instance to disclose its Component Configuration Block (CCB), a record of component information. In the case of component "UHeater1", it might reply:
My Name:  "UHeater1"
My State: Enabled | Visible | Inverted
My Data:  On (1)
My Type:  "Digital Out"
Once mVisual knows the mComponent's type, it retrieves its extended descriptor, which contains component type related information such as:
Embedded function:        Turn on/off, toggle
Valid v-controls:         LEDs, toggle switches, momentary switches, etc.
Valid v-control messages: Set data, get data, set flags
Parameterisability:       Statements for building a property sheet
Valid sources:            Digi Var, Digi Op, Digi Out, Digi In, In Gate...
Events detected:          "on to off", "off to on"
Hardware dependency:      exists digital output pin
This information allows mVisual to represent the mComponent with suitable visual controls, to render the component's property sheet, and to verify related information when an mComponent is instantiated or modified. Extended descriptors are not stored on the embedded device; they are part of the microCommander environment installed on an operator's computer. They might also be served from a remote component descriptor server. As this information is not needed in real time, it can reside elsewhere, reducing the resources necessary on the micro controller.
3.1 Component Composition
mComponents are composed in a data flow-style architecture. There are two main data types, digital (0 or 1) and analog (unsigned integer). Interfaces of mComponents comprise data sources and outputs. mVisual ensures type-safe composition, i.e., that data sources can only be connected to compatible data outputs. This flow of data between components is controlled by a common timed update service. The update frequency can be chosen differently for each mComponent instance. Obviously, some mComponents in micro controller applications will have to interface directly with hardware ports, such as digital and analog I/O ports. For this purpose, microCommander uses special components called hardware agents that act as proxies for hardware features such as analog and digital pins. The framework loads the hardware profile of the target micro controller platform to make all I/O ports accessible to the appropriate source and output agents. Updating an mComponent instance causes the instance's default action to be performed. For instance, a timer update will cause it to increment its count. However, components can also be triggered to perform additional actions, drawn from a set of actions unique to each component type. For instance, a timer can be triggered to reset, reset and go, stop, and resume. Actions can be invoked in two ways: they can be scheduled or triggered periodically by system components that are part of the framework, or they can be triggered by events. Both alternatives are elaborated in more detail in the next section.
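The following sketch illustrates this data-flow style in code: a hardware agent exposes an analog source, a timer performs its default action on every update, and a stand-in for the timed update service drives both. The class names and the update mechanism are our own simplifications, not the actual microCommander implementation:

#include <functional>
#include <iostream>
#include <vector>

// Illustrative sketch (not the actual microCommander code) of the data-flow
// composition described above: components expose typed outputs, other
// components read them as sources, and a common update service drives the flow.
using AnalogValue = unsigned int;   // analog data type

struct AnalogSource { std::function<AnalogValue()> read; };

// A hardware agent standing in for an analog input pin.
struct AnalogInAgent {
    AnalogValue raw = 512;                        // pretend ADC reading
    AnalogSource asSource() { return { [this] { return raw; } }; }
};

// A timer component: its default action on update is to increment its count.
struct Timer {
    AnalogValue count = 0;
    void update() { ++count; }
    void trigger_reset() { count = 0; }           // one of its type-specific actions
};

// A minimal stand-in for the timed update service.
struct UpdateService {
    std::vector<std::function<void()>> jobs;
    void tick() { for (auto& job : jobs) job(); }
};

int main() {
    AnalogInAgent sensor;
    Timer timer;
    AnalogSource temp = sensor.asSource();

    UpdateService ticks;
    ticks.jobs.push_back([&] { timer.update(); });
    ticks.jobs.push_back([&] { std::cout << "temp raw=" << temp.read()
                                         << " ticks=" << timer.count << '\n'; });
    for (int i = 0; i < 3; ++i) ticks.tick();
    timer.trigger_reset();                        // action invoked by an event, say
}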
The component framework consists of a runtime engine for mComponents, a set of system components, and a set of rules to which each mComponent must comply. The core of the runtime engine comprises a number of system components supported by a number of services. On the micro controller, the runtime engine includes a boot loader (which checks for a default application saved to flash memory and instantiates the system components), a simple task scheduler, and a message dispatcher. On the PC, the runtime engine includes a number of services for managing components (saving, loading, tracking dependencies, enforcing rules, etc.), rendering engines for displaying component property sheets, and visual controls for user interaction with the components. The heart of the runtime engine is the Supervisor component, which is responsible for managing all components in an application, both on the PC and the micro controller. It is supported by a number of other system components:
– System Information: tracks and manages system resources such as memory, and maintains the target date and time.
– Tick, Second, and Day Lists: execute user-defined periodic jobs (a "tick" is the smallest time unit in a microCommander application; the actual duration of a tick in real time depends on the choice of the target micro controller hardware and how much work must be done during the worst-case tick).
– Job Scheduler: provides date-time scheduling for user-defined jobs.
– Event: watches a data source and triggers an action in another component when a specified change is detected.
– Security: provides a list of users, passwords and access levels, and provides an interface for user authentication.
Framework rules are enforced through component inheritance and meta-data. Adherence to these rules is what enables the runtime engine and system components to perform their required tasks, such as component introspection, composition and persistence. Event components can be seen as a component-based implementation of the Event-Condition-Action (ECA) paradigm. They can be deployed to watch specific data sources for change events which, on occurrence, cause a user-specified condition to be evaluated. If this condition is fulfilled, the event component can trigger other components to perform selected actions.
3.2 Resolving Hardware Heterogeneity
microCommander currently supports three unique CPU architectures and thirteen unique micro controller boards. To provide this multi-platform support, microCommander has been designed to abstract from the specific hardware interfaces provided by different chip makers and board manufacturers. The abstraction is achieved by using a layered model. There are two levels of hardware abstraction. The first is at the component level: microCommander provides a set of special components called agents (Fig. 4). All agent components have two properties, personality and pin, which enable them to interact with the hardware. A personality is a generic interface to common micro controller functionality such as digital output, analog input, PWM, etc. Micro controllers typically provide these hardware personalities in groups of pins, called ports,
Fig. 4. microCommander framework abstraction layers
and hence the pin property is used to specify a single pin within the port. The enumeration of personalities for a micro controller can be greater than the number of ports or pins, since ports and pins can often be configured to take on different behaviors. For example, a micro controller often provides the ability to configure a digital pin to be either an input or an output. This level of abstraction provides portability of user applications. The second level of abstraction is the personality table. It is an enumeration of all the port/personality combinations provided by the platform's hardware. Each platform's hardware layer is responsible for providing the functionality for each type of personality available on the micro controller. The personality table provides the link between the hardware that implements the functionality for each personality and the agent components that use the personality. This allows the agents to be unaware of the underlying hardware. The same personality may be offered on several ports, as illustrated in Fig. 5, while some personalities may be lacking on a given hardware platform (e.g., no analog input). The personality table is one of the interfaces that contribute to the portability of the microCommander runtime engine.
Fig. 5. Example personality table
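To make the two abstraction levels more tangible, the sketch below models a personality table that maps (personality, port) pairs to board-specific driver hooks, and an agent component that resolves its personality and pin through that table. All names and the driver-hook signature are hypothetical; the real framework's interfaces are not documented here:

#include <cstdint>
#include <map>
#include <utility>

// Hypothetical sketch of the layering described above. A "personality" is a
// generic capability; the personality table maps (personality, port) pairs to
// the board-specific driver that implements it.
enum class Personality { DigitalIn, DigitalOut, AnalogIn, Pwm };

struct PinDriver {
    void (*write)(std::uint8_t pin, std::uint32_t value);
    std::uint32_t (*read)(std::uint8_t pin);
};

using PersonalityTable =
    std::map<std::pair<Personality, std::uint8_t /*port*/>, PinDriver>;

// An agent component only knows its personality and pin; the table resolves
// which hardware implementation actually services the request.
class DigitalOutAgent {
public:
    DigitalOutAgent(const PersonalityTable& table, std::uint8_t port, std::uint8_t pin)
        : driver_(lookup(table, port)), pin_(pin) {}

    void set(bool on) { if (driver_) driver_->write(pin_, on ? 1u : 0u); }

private:
    static const PinDriver* lookup(const PersonalityTable& t, std::uint8_t port) {
        auto it = t.find({Personality::DigitalOut, port});
        return it == t.end() ? nullptr : &it->second;   // personality may be absent
    }
    const PinDriver* driver_;
    std::uint8_t pin_;
};

namespace {
void fakeWrite(std::uint8_t /*pin*/, std::uint32_t /*value*/) { /* would touch a real port */ }
std::uint32_t fakeRead(std::uint8_t /*pin*/) { return 0; }
}

int main() {
    PersonalityTable table{{{Personality::DigitalOut, 0}, PinDriver{fakeWrite, fakeRead}}};
    DigitalOutAgent heaterPin(table, /*port=*/0, /*pin=*/3);
    heaterPin.set(true);
}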
4
Coordinating Software Components on Distributed Controllers
Today, even inexpensive micro controller platforms are equipped with network interfaces such as Ethernet or CAN-bus. As outlined in the previous section, microCommander makes use of such interfaces for the purpose of PC-based remote interaction
with embedded micro controllers. Such interfaces can also be used for networking micro controllers among themselves in order to build more complex, distributed embedded networks. The purpose of this section is to present our approach to engineering these types of networks based on the paradigms of distributed components and connection-based programming [20]. Despite the fact that many current embedded software libraries provide implementations of standard data transport protocol stacks, the development of complex, distributed coordination logic among micro controllers still remains a complex task that lacks the methodological and technological support available today for general software development. High-level specification languages such as SDL [5], Statecharts [8], Petri nets [17], UML Collaboration Diagrams [7], and UML Sequence Diagrams have proven to be useful abstractions for analyzing requirements and designing protocols for distributed systems. However, there is still a significant chasm between such high-level formalisms describing the coordination logic of a controller network and the actual implementation of the different controllers in terms of low-level programming languages. Currently, programmers have to cross this chasm manually by repeatedly translating high-level (often diagrammatic) specifications into low-level program code. Driven by time-to-market pressure, programmers often take a shortcut and immediately start coding rather than designing high-level specifications. This practice has proven error-prone for complex systems. Moreover, it decreases the maintainability of a system. Notably, a similar gap in the development process existed for general software engineering not too long ago; however, it has recently been narrowed with the introduction of higher-level component-based language platforms (e.g., .NET and EJB) and the development of model-driven code generation mechanisms [4]. We have chosen an analogous approach for narrowing the gap between design and implementation of complex coordination logic among distributed embedded controllers. Based on the microCommander component platform, we describe the microSynergy method and technology for model-driven specification and code generation of coordination logic among distributed embedded controllers. The goal of microSynergy has been to realize the vision of the Object Management Group's (OMG) Model Driven Architecture (MDA) paradigm for the development of distributed embedded control software [13]. One key objective in MDA is to separate application logic from details specific to the platform technology. Application logic is analyzed and designed in a Platform Independent Model (PIM), which is later translated automatically or semi-automatically into a Platform Specific Model (PSM) and source code. In addition to increased productivity, this approach provides the benefit of application logic that is easier to change and to migrate to other platforms.
4.1 The microSynergy Composition Model
microSynergy's PIM consists of two diagram types, System Diagrams and Connector Diagrams, describing the static and dynamic aspects of the controller interaction, respectively. These two diagram types are based on UML Component Diagrams and State Diagrams. Analogously to UML Component Diagrams, microSynergy System Diagrams depict components, their interfaces and the connectors between them. However, System Diagrams are more restrictive than general UML Component Diagrams, in order
to specifically target them to the domain of embedded controllers and make diagrammatic specifications executable.
System Diagrams. The granularity of the components in a System Diagram corresponds to the deployment architecture of embedded controllers in the network. In other words, microSynergy System Diagrams treat embedded controllers as opaque entities and do not show their internal architecture, which can be developed, for example, with microCommander. Component interaction in microSynergy uses an event-based paradigm, meaning that there are no return parameters in interface signatures. These signatures each consist of a number of event gates, each one defining a signal that can be exchanged with the rest of the network. Gates can be defined as in-going or out-going and are graphically depicted as bubbles of different color at the border of component symbols. From a component's viewpoint, in-going gates are used to receive events from the network, whereas out-going gates are used to emit events to the network. Events have names and can carry payload data based on a number of predefined types, such as boolean, integer, and blob (binary large object). Connections between controller components can be as simple as direct event-forwarding channels. However, microSynergy supports the specification of more sophisticated connections, which are required in order to coordinate third-party components with complex interactions. In order to show the advantages of complex connections, let us consider the simple System Diagram in Fig. 6. In this scenario, an "environmental controller" can directly send Heat(int) events to a "burner controller" if the temperature is too low, and it can send Cool(int) events to a "window controller" if the temperature is too high. Each of these events can carry an integer payload specifying the intensity of the desired heating or cooling action (0-100 percent).
Fig. 6. Simple system diagram
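The following sketch renders the ingredients of such a System Diagram (components with named in-gates and out-gates, events with an integer payload, and a simple event-forwarding connection) as plain C++ data structures. The types and names are invented for illustration and are not part of microSynergy:

#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Illustrative data model for a System Diagram: controller components expose
// named in-gates and out-gates, and a simple binary conduit forwards events
// unchanged from an out-gate to an in-gate.
struct Event {
    std::string name;   // e.g., "Heat"
    int payload;        // e.g., desired intensity 0-100
};

struct InGate {
    std::string name;
    std::function<void(const Event&)> deliver;   // hands the event to the controller
};

struct OutGate {
    std::string name;
    std::vector<InGate*> conduits;               // simple event-forwarding channels
    void emit(const Event& e) { for (auto* in : conduits) in->deliver(e); }
};

int main() {
    // Burner controller with a Heat(int) in-gate.
    InGate heat{"Heat", [](const Event& e) {
        std::cout << "burner: heat at " << e.payload << "%\n";
    }};

    // Environmental controller with a Heat(int) out-gate wired to the burner.
    OutGate envHeat{"Heat", {&heat}};

    // Temperature too low: emit a Heat event over the conduit.
    envHeat.emit({"Heat", 60});
}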
A limitation of system-level designs using only such primitive connections is that interfaces have to be designed to work with each other. All complex application logic that specifies the coordination of the various distributed controllers has to be encapsulated in the controller components themselves. The main problem associated with this approach is the limited re-usability of controller components. However, it is often not practical to require that controllers be custom-programmed for a particular network configuration. In practice, system engineers reuse not only the hardware but also the software of pre-configured controller components. For example, the environment controller in the above example could be used in many other scenarios outside the domain of green house operations. In other scenarios, it might not be integrated with a win-
dow controller but with an air conditioner or no cooling device at all. Encapsulating the coordination logic for a network of embedded software components inside these components would decrease the re-usability of these components in other contexts. In other words, embedded components should be as ignorant about each other’s existence as possible in order to minimize their architectural dependencies and maximize their re-usability and maintainability. This statement is also valid in the general domain of software engineering, where its consideration has resulted in a new programming paradigm called connection-based programming [20]. The essence of connection-based programming is to promote connections among components to become first-order programming concepts, rather than being simple call-dependencies between component interfaces. While the ideas behind connection-based programming are not new, few language platforms exist that fully realize them. While current programming languages allow programmers to implement “connections” in form of traditional program components, these platforms lack dedicated language concepts. Introducing such dedicated language concepts for connections facilitates separation of concerns (component-internal application logic vs. network coordination logic) and thus promotes the reuse and maintenance of embedded networks. Therefore, we have provided the microSynergy modeling language with an explicit notion for complex connections. Complex Connections. Before we describe how complex connections are realized, let us outline at an abstract level the requirements behind this concept. The above discussion shows that connections should be able to mediate among different “third-party” component interfaces that might have been developed independently from each other. Consequently, connections should be able to change the types of events routed through them, as well as the event payload. Furthermore, there may be more than two component interfaces participating in one logical connection. Based on these considerations we can classify connections, as shown in Fig. 7. We denote the number of in-going gates and the number of out-going gates associated with a connection as #i and #o, respectively. In the terminology of this classification, which is based on [12], the “simple” connections discussed at the beginning of this section and illustrated in Fig. 6 are denoted as binary conduits: they connect a single out-gate to a single in-gate with preservation of event type and payload. Figure 8 shows a simple System Diagram with a binary transducer that connects incompatible gates. In this example, the Environment Control component has an out-gate TempOffset(int degrees), which emits events when the difference between the desired temperature and the actual temperature changes. Obviously, the interface of this component is no longer compatible with the interface of Burner Control. We need a connection that changes the event type as well as translates temperature offsets to percentages of desired burner intensity. If we are considering adding the window controller from Fig. 6 to our network, and, for reasons of energy efficiency, want to ensure that having the burner activated or the windows opened are two mutually exclusive states, we need a ternary connection. Figure 9 illustrates this example, which uses an analyzer transducer. Obviously, any connection more complex than a simple binary conduit requires further definition of coordination logic. 
microSynergy uses another diagram type called a Connector Diagram for defining the interaction semantics of each connector.
Fig. 7. Classification of connection types
Fig. 8. System Diagram using a binary transducer
Fig. 9. System Diagram with ternary connection
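As a concrete illustration of what a binary transducer such as the one in Fig. 8 has to do, the following sketch converts TempOffset(int degrees) events into Heat(int percent) events. The conversion formula and all identifiers are assumptions made for the example; the chapter does not specify the actual mapping:

#include <algorithm>
#include <iostream>
#include <string>

// Sketch of a binary transducer in the spirit of Fig. 8: it renames the event
// and rescales its payload so that two independently developed interfaces can
// be wired together. The scaling factor below is an assumption.
struct Event {
    std::string name;
    int payload;
};

// Maps TempOffset(int degrees) emitted by the environmental controller to
// Heat(int percent) expected by the burner controller.
Event tempOffsetToHeat(const Event& in) {
    // Assumed conversion: an offset of 10 degrees or more means full intensity.
    int percent = std::clamp(in.payload * 10, 0, 100);
    return Event{"Heat", percent};
}

int main() {
    Event offset{"TempOffset", 4};            // desired minus actual temperature
    Event heat = tempOffsetToHeat(offset);
    std::cout << heat.name << "(" << heat.payload << ")\n";   // prints Heat(40)
}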
Connector Diagrams. Connector Diagrams are based on UML State Diagrams and UML Activity Diagrams. As such, Connector Diagrams describe extended finite state machines. The new UML 2.0 standard supports different representations of state machines, defined by different language profiles. The traditional representation of UML State Machines is oriented towards Harel's original StateChart notation [8]. Another profile in UML 2.0 defines a representation of state machines that adopts the syntax introduced with ITU's Specification and Description Language (SDL) [21]. microSynergy uses the SDL-based syntax, since SDL is well accepted in the domain of embedded protocol design. Moreover, SDL renders state machines as acyclic graphs, which facilitates computer-based layout operations. Another benefit of using SDL syntax is that we can base the semantics of microSynergy specifications on the more precisely defined formal semantics of SDL. Figure 10 shows an example definition for the connection in Fig. 9. This definition is particularly simple, because it uses only a single state, "ready". The processing of in-going events is specified as a rectangle with an inward-pointing triangle, whereas out-going events are specified as rectangles with an outward-pointing triangle. Decision points are shown as diamonds. The connector diagram shows how the TempOffset event is cast to the event types Heat and Open. Moreover, it shows how the payload of the original event is mutated. The connector specification in Fig. 10 explicitly excludes the possibility that both systems are engaged at the same time, given that the system was started in a physical state with the windows closed and the burner turned off. Of course, it would be more desirable to be able to activate the controller network in any physical state, e.g., with open windows. The simplest way of achieving this would be to set the windows to an initial position at start-up, e.g., to close them automatically. However, this approach assumes that devices and communication links are 100% reliable. A better approach would require sensors to inquire about the status of both the windows and the burners. Figure 11 shows an example connection that uses this approach. In this case,
Fig. 10. Definition of simple ternary connection
Fig. 11. Definition of simple ternary connection
the window controllers must have a status interface that lets external components query the percentage to which a window is open. The connector remains in its initial state start until it receives an activate signal. The connector uses status signals to inquire about the physical status of the windows and assumes the state closed or venting as the outcome of this inquiry.
Controller identity and anonymity. The default semantics of microSynergy connection diagrams is to broadcast events to, and receive events from, all connected controllers with a gate of the specified name. This functionality might be desired in some cases. In others, the system engineer might want to specify the exact identity of the sources or destinations of signals. For example, in Fig. 11, we are only interested in inquiring about the status of the window controllers. This is specified by giving the window controller an explicit name ("w") and using this name to qualify event interaction in connector diagrams, e.g., w.status.
Controller multiplicity. Now let us assume that, for reasons of scalability, we want to permit many window controllers in our example controller network. Unless a controller component is annotated with the singleton stereotype, microSynergy's default
semantics is to broadcast signals to all controllers in the local network which have a type specified in the system diagram. In other words, a controller component that is not marked as a singleton actually stands for "one or many" controller components of the same type. This allows type-compatible hardware (e.g., additional window controllers) to be added without changing the coordination logic specified in microSynergy. While these semantics clearly benefit system maintainability, the fact that the number of connected controllers is undetermined at specification time causes a problem when we want to specify a condition that needs to be fulfilled for all controllers of a given type. We solve this problem by introducing timed default transitions at decision points. Default transitions are similar to else clauses in conditional statements. However, they are executed only after the time period chosen as their parameter has elapsed. In other words, this time period specifies a temporal window in which an event might still be received that triggers a different transition originating from the decision point. Our example in Fig. 11 uses this concept to specify that the window closed state can only be reached if no window controller responds with a status event containing a payload p > 0 within 1000 milliseconds of the status inquiry being broadcast.
4.2 Platform-Specific Execution of Coordination Logic
The previous section described the composition model of microSynergy. This section describes implementation concerns involved in creating a system that executes the composition logic.
The microSynergy execution architecture. Figure 12 gives an overview of the execution architecture of microSynergy. Its three main elements are the microSynergy Editor, the microSynergy run-time, and the embedded targets. The microSynergy Editor is used by the engineer to develop and maintain the coordination logic at the model level, as described in the previous section. This logic is automatically translated into a highly compact format called CEL (Connector Execution Language) and downloaded to mServer at deployment time. mServer is the micro controller that handles the message routing
Fig. 12. microSynergy execution architecture
between targets. microSynergy run-time is a software component on mServer that handles the message routing among targets according to the CEL logic downloaded from the Editor. A target is a micro controller that participates in a microSynergy network, denoted as a LCN (Local Controller Network) in Fig. 12. Each target has a small piece of software on it that allows the target to expose its in-gate and out-gate interface, interpret the messages sent to it by run-time, and format messages to send to run-time. microSynergy run-time is designed using a layered architecture, as seen in Figure 13. The execution layer can be in one of two modes. The first is administration, where its purpose is to respond to messages sent to it by the microSynergy Editor. For example, the Editor might query what targets are currently registered in the network. The second mode is execution, during which it is interpreting the CEL instructions and, at a high level, controlling what messages are sent to targets. Because these instructions represent an encoding of an extended finite state machine, the job of the execution layer is to decide, based on its current state, how to respond to target signals. The execution layer sends messages to and from targets through the abstract transport layer.
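Because the CEL instructions encode an extended finite state machine, the execution layer essentially runs an interpreter loop over states and incoming target signals. The following C++ sketch shows, for the ternary greenhouse connection of Figs. 9 and 10, what such interpreted coordination logic amounts to; the class, the payload arithmetic and the single "ready" state follow our reading of the figures and are not taken from the actual CEL implementation:

#include <algorithm>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>

// Very small sketch of the kind of coordination logic the execution layer
// interprets: a connector with a single state ("ready") reacts to TempOffset
// events and drives the burner and window controllers so that heating and
// venting are never active at the same time.
struct Event { std::string name; int payload; };

class TernaryConnector {
public:
    using Send = std::function<void(const std::string& target, const Event&)>;
    explicit TernaryConnector(Send send) : send_(std::move(send)) {}

    void onEvent(const Event& e) {
        if (state_ != State::Ready || e.name != "TempOffset") return;
        int intensity = std::min(std::abs(e.payload) * 10, 100);  // assumed scaling
        if (e.payload > 0) {                     // too cold: heat, keep windows shut
            send_("burner", {"Heat", intensity});
            send_("window", {"Open", 0});
        } else if (e.payload < 0) {              // too warm: vent, burner off
            send_("burner", {"Heat", 0});
            send_("window", {"Open", intensity});
        }
    }

private:
    enum class State { Ready };
    State state_ = State::Ready;
    Send send_;
};

int main() {
    TernaryConnector c([](const std::string& target, const Event& e) {
        std::cout << target << " <- " << e.name << "(" << e.payload << ")\n";
    });
    c.onEvent({"TempOffset", 3});    // heat at 30%, windows closed
    c.onEvent({"TempOffset", -2});   // burner off, windows open 20%
}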
Fig. 13. microSynergy runtime design
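The layering of Fig. 13 separates protocol-neutral routing from protocol-specific delivery, as elaborated in the next paragraph. The sketch below illustrates one way such a split could look in code; the class names and the two example protocols are our own choices, not microSynergy's actual interfaces:

#include <cstdint>
#include <iostream>
#include <map>
#include <memory>
#include <vector>

// Hypothetical sketch of the transport layering: the abstract transport layer
// keeps a mapping from target ids to concrete transport components, and each
// concrete component realizes the generic "send" request for one protocol.
using TargetId = std::uint16_t;
using Message = std::vector<std::uint8_t>;

class ConcreteTransport {
public:
    virtual ~ConcreteTransport() = default;
    virtual void send(TargetId target, const Message& msg) = 0;
};

class TcpIpTransport : public ConcreteTransport {
public:
    void send(TargetId target, const Message& msg) override {
        std::cout << "TCP/IP: " << msg.size() << " bytes to target " << target << '\n';
    }
};

class BluetoothTransport : public ConcreteTransport {
public:
    void send(TargetId target, const Message& msg) override {
        std::cout << "Bluetooth: " << msg.size() << " bytes to target " << target << '\n';
    }
};

class AbstractTransport {
public:
    void bind(TargetId target, std::shared_ptr<ConcreteTransport> t) {
        routes_[target] = std::move(t);
    }
    void send(TargetId target, const Message& msg) {
        auto it = routes_.find(target);
        if (it != routes_.end()) it->second->send(target, msg);  // protocol-specific realization
    }
private:
    std::map<TargetId, std::shared_ptr<ConcreteTransport>> routes_;
};

int main() {
    AbstractTransport transport;
    transport.bind(1, std::make_shared<TcpIpTransport>());
    transport.bind(2, std::make_shared<BluetoothTransport>());
    transport.send(1, {0x01, 0x02});
    transport.send(2, {0x03});
}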
The abstract and concrete transport layers are designed to handle communication with multiple micro controllers, taking into account their extremely heterogeneous nature. microSynergy must be able to support a multitude of communication protocols. We may have a network of small, low-cost devices using RS-232 communicating with a real-time system that uses CAN-bus, a home PC using TCP/IP, and a wireless device that uses Bluetooth. The solution is to put the protocol-neutral aspects of communication into the abstract transport layer, while the protocol-specific aspects are dealt with in the concrete transport layer. Within the concrete transport layer, each protocol the microSynergy run-time must support has a corresponding concrete transport component. In more detail, the abstract transport layer stores a mapping between a specific target id and an associated concrete transport component. When the execution layer instructs the abstract transport layer to send a message to a specific target id, the mapping is used to decide which concrete transport component to send the message to. The message is then transformed into a standard format and passed to the concrete transport layer. There, an attempt is made to translate the generic requests of the abstract transport layer, such as sending a message to a specific target, into a protocol-specific realization of that request, such as sending a message through TCP/IP to a specific target. Occasionally this results in slightly different semantic interpretations of the generic request, depending on the concrete transport component. For example, when the execution layer, in an administrative state, requests a list of the currently connected targets, the Bluetooth concrete transport component treats this as a request to discover any currently available Bluetooth targets, whereas the TCP/IP transport component treats it as a request for the list of TCP/IP targets that have previously connected to mServer. One transport component interprets the message as a request for targets that are participating in the network, the other as a request for targets that could participate in the network.
Implementation concerns. When working in the domain of embedded systems, it is clearly important to be concerned about resource usage, such as disk space and RAM. Although it was necessary that the microSynergy run-time be able to support multiple communication protocols, it was equally necessary that this be realized as efficiently as possible. The design solution was to make each deployed microSynergy run-time customizable: only the protocols that a particular network actually uses need to be available in its infrastructure footprint. The engineer can specify target protocols for each inter-controller connection during the design phase. She does this by adding stereotypes to the communication channels in a System Diagram. Annotations indicate which protocol(s) a particular target supports. In the context of model-driven design, such model refinements can be seen as PIM-to-PSM transformations. Before deployment, the annotated system diagram is analyzed to determine the configuration of protocols that need to be supported at runtime.
4.3 Using the microSynergy Method and Technology in Practice
The microSynergy Editor tool (see Fig. 14) was designed to support the developer in visualizing the embedded devices connected to the network, defining connector logic, as well as importing and exporting pre-built connector logic. As the Editor was designed for both expert and novice users, it supports several different development scenarios. We will now discuss three typical scenarios to give the reader an idea of the
Fig. 14. microSynergy editor
way microSynergy and microCommander can be used in concert for the development of component-based, distributed embedded controller applications. Green field network development. When a network of embedded devices is designed and implemented from scratch, engineers have the freedom to determine all the hardware and software components being used. Additionally, engineers will have some foreknowledge as to which devices will be communicating with one another. Most importantly, engineers developing embedded device networks typically have the expertise required to customize the embedded devices to meet their needs precisely. However, developing a network from scratch can be time consuming. Although engineers have the ability to design the embedded devices, they often do not have control over time and money constraints. The microSynergy Editor was designed to assist expert developers by minimizing the time and effort required to implement, maintain and evolve the logic controlling embedded device communication. Engineers can use the microSynergy Editor to create system and connector interaction diagrams. They can add target devices including their respective in-gates and out-gates to the system diagram as required, then define connector logic state machines in terms of states, inputs, outputs and conditionals. The result is a system specification document which can be downloaded and deployed to the network. Using microCommander, engineers then assemble each controller’s embedded software to meet their requirements. They then define an interface consisting of in-gates and out-gates for each controller as specified by the system diagram documentation developed using microSynergy. From a microCommander developer’s perspective, in-gates and out-gates are merely two additional mComponents that can be deployed and integrated with other software components. Finally, each controller, now equipped with the embedded software developed using microCommander, is connected to the network and the microSynergy coordination logic is downloaded to the dedicated controller hosting the microSynergy runtime engine. The engineers’ workload is significantly lightened by allowing them to download and deploy system specification diagrams developed using the microSynergy Editor rather than having to manually translate design documents into an executable format. In addition, subsequent examinations of the network result in documentation that is both up to date and accurate in terms of the devices connected to the network and the connectors defining device communication. Hence, the microSynergy Editor acts as a round-trip development tool, where design documentation can be generated from an existing implementation as well as downloaded and executed on the network. Network development with third-party controllers. In many cases, application developers themselves will not design and implement all embedded devices connected to a network from scratch. Often, they will purchase devices off-the-shelf, connect them to the network, and focus on creating custom connector logic to allow the devices to work together. microSynergy supports this method of network development, enabling the technology-savvy to quickly install and configure complex networks of third party controllers. Initially, the network developer purchases pre-programmed controllers they require from third party vendors and connects them to the network. They use microSynergy Editor to introspect the network and automatically create a System Diagram for the
current network topology. Connectors and their internal logic are then defined, anchored to the in-gates and out-gates associated with the controllers as required, and downloaded to the network. Here, we note that embedded device developers are alleviated from developing communication logic for their controllers and network developers are alleviated from the design and implementation of the embedded software controlling the devices in the network. This exemplifies one of the benefits of connection-based programming, that components can be bound together as required with little forethought concerning how they may actually be interconnected during their implementation. Development with third-party connectors. An increasing number of embedded automation and control applications target non-engineers as their customers. Home automation, for example, is a rapidly emerging market in many industrialized countries. The average layperson will typically have little knowledge of embedded devices, finite state machines or other concepts in the domain. However, such users should be empowered to configure and customize their network to meet their needs, e.g., their home environment and the processes they would like to automate. To address this issue, microSynergy supports the development and reuse of third-party connector logic in the form of templates. Using templates, end users can develop and configure their embedded device network without having to implement the connector logic themselves. To begin, the end user purchases the devices they require and connects them to the network. After examining the network, they can import predefined connector logic templates developed by third parties. The template logic can then be downloaded and deployed to the network. Templates can be seen as pre-defined, customizable interaction patterns between embedded devices.
5
Related Work
The idea of constructing software by configuring and connecting proven, reusable components (as opposed to manual programming) has existed for several decades. During the 1990s, component-oriented construction gained increasing interest in the commercial sector. This popularity has been driven by the availability of reusable frameworks and pattern libraries for object-oriented languages like C++ and Java [15], [11]. Johnson gives a good overview of the pros and cons of employing components and other reusability technology for software construction [9]. One prominent problem of component and framework reuse is how to efficiently store, maintain, and look up a generally very large number of reusable components. Several representations, query languages and algorithms have been proposed for this purpose, e.g., by Sahraoui and Benyahia [14]. Even though the problem of component-oriented construction for general software has not yet been sufficiently solved, current industrial practice proves that this approach is viable and productive for specific application domains. For example, component-oriented techniques play an important role in constructing current graphical user interfaces, e.g., Java Beans [2]. Stewart has shown that similar advantages of domain dedication apply to the use of component orientation in the design of embedded systems [18]. The notion of making component reuse feasible by focusing on a particular domain is related to the idea of
product lines as presented in [3]. In this sense, microCommander and microSynergy are clearly focused on supporting control applications. They do not provide adequate support for developing other types of embedded system applications, e.g., software for cellular phones. Using PC-based user interfaces for monitoring and controlling embedded systems is also the approach taken in National Instruments' LabVIEW [10]. However, the use of LabVIEW requires significant knowledge and skills, e.g., requiring users to program with iterative loops and conditional statements. Our approach is simpler, being based on traditional control components that represent such things as switches, timers, status indicators, etc. Moreover, our notion of an embedded component encapsulates the functional embedded code as well as the visual representation of interactive, PC-based dialogues. Our approach of connecting different application block components is related to work performed in the area of architectural interconnection as presented by Allen and Garlan [1]. The difference in our approach is that it is currently restricted to asynchronous (signal-based) communication only. Furthermore, we deal with a-posteriori integration. We have chosen SDL for specifying the integration among components. This is in contrast to many other modeling approaches that employ the Unified Modeling Language (UML) for this purpose [19]. We made this decision because SDL has formal semantics and is widely used in the embedded systems domain [5]. However, the UML 2.0 specification [21], which is currently nearing completion, has adopted SDL semantics within its scope.
6
Discussion and Experiences
Throughout our research investigating component-oriented engineering of embedded control software for network-centric systems, several new issues have been brought to light. In the following paragraphs we discuss some of these issues and the research opportunities they present.
6.1 3rd Party and Custom Components
No matter how many component types are available, there will always be applications requiring at least one more specialized component that is missing from the available libraries. Alternatively, a user may have valuable legacy code that she would like to run side by side with other microCommander components. That is, she would like to wrap her legacy code into an mComponent-like interface to enjoy all of the features of microCommander, while the code performs as before on the target controller. This brings up the following issues:
1. Already, some target platforms have insufficient resources to simultaneously support all existing component types. A dynamic linking solution is being considered, which would allow the user to load only the subset of component types required by her application. However, dynamic component linking would require the component infrastructure to be extended. The resulting increase in the memory and processing footprint of the component could lead to bottlenecks on small devices.
2. At some point, the number of components and the task of finding the right component for the job will overwhelm the user. One view is that this will eventually create demand for some type of discovery service. These issues are subject to ongoing research at the University of Victoria. Relationships between components. As microCommander applications become more complex, the user not only needs to be aware of the mComponents and their properties, but also needs to understand the dependencies between components in order to manage them efficiently. This is particularly true when many logic components are involved or when a number of components can trigger a job on another component. For instance, there could be an overtemperature alarm/shutoff on the greenhouse heaters that could be triggered from any of a dozen temperature sensors. Seeing the link to the offending sensor provides the engineer with cognitive support during application development. This is analogous to a visual cross-referencing browser in traditional programming. Intec Automation is currently experimenting with a dependency browser for component-based applications (see Fig. 15, the logic to accomplish wear leveling of the greenhouse unit heaters, as an example). The research challenge in developing such a browser is to devise different logical views, showing different concerns of a component composition without overloading the screen with clutter. Techniques like fish-eye views and dynamic component animations might represent appropriate mechanisms.
Fig. 15. Dependency view (excerpt) of a microCommander application
6.2 Multi-tiered Architecture microCommander is currently based on a two-tier architecture comprising an embedded application tier and a PC-based operator tier. Automation micro controllers are designed to interface with the real world and are not good communication engines. This presents a problem when multiple simultaneous users are introduced into this architecture. Intec and UVic have been working on an approach that uses a three-tier architecture to provide more scalability, security and reliability for interactive embedded applications. With a three-tier architecture much of the burden of serving multiple users can be moved from the micro controller to the middle tier. This separation of concerns allows the middle tier (mGateway) to act as a liaison between the micro controller and the end-users. This tier also includes a data cache for the values from the micro controller, which will reduce redundant communications in a multi-user scenario. Interesting research topics lie in the negotiation strategies of quality of service attributes between interactive
clients and the embedded targets. For example, the middle tier might use caching strategies to decrease the latency of fulfilling client requests at the cost of the currency of the requested information etc. 6.3 Semantic Interface Ontologies One of the goals of embedded device network development and research is to support a ubiquitous computing environment that automatically evolves in step with changing requirements. Embedded networks should adapt and evolve as transparently as possible to meet the changing needs of their users. However, as networks evolve, they become increasingly complex, making subsequent network modification increasingly difficult. We are currently researching the concept of embedded components with semantic interfaces in an attempt to address the problem of transparent network evolution. Our goal is to allow a controller to be connected to the network and to have the network automatically reconfigure itself to incorporate the new device. To achieve this, a controller will be ascribed a set of predicates with a tool like microCommander that will describe the device in detail. The predicates used to describe controllers will be defined in a universally accessible ontology [16] describing the micro controller domain. Though artificial intelligence literature offers many definitions of an ontology, for our purposes an ontology is a hierarchical structure that describes entities and their inter-relations within a specified domain. Predicates defined in the ontology are used to describe a controller’s role and context within the network. The roles that can be ascribed to a controller, such as a window or a burner, will include a description of the interface common to all controllers of the same type. For example, a burner may support a Heat in-gate as well as status and malfunction out-gates. Hence, the network will be able to introspect the predicates declared by a newly connected controller and determine how signals should be routed to and from it in accordance to the connector logic. This topic provides interesting and challenging research questions such as how to describe semantic interfaces and how to publish and evolve the common ontology.
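Since this is ongoing research, no concrete representation is prescribed; purely as a thought experiment, the sketch below shows how predicates announced by a newly connected controller might be matched against the gates a role requires. Every identifier in it is invented:

#include <iostream>
#include <set>
#include <string>

// Speculative sketch: a newly connected controller announces predicates drawn
// from a shared ontology, and the network checks whether the announced role
// implies the gates required by the existing connector logic.
struct ControllerDescription {
    std::string role;                 // e.g., "burner", "window"
    std::set<std::string> inGates;    // e.g., {"Heat"}
    std::set<std::string> outGates;   // e.g., {"status", "malfunction"}
};

bool matchesRole(const ControllerDescription& c,
                 const std::set<std::string>& requiredIn,
                 const std::set<std::string>& requiredOut) {
    for (const auto& g : requiredIn)  if (!c.inGates.count(g))  return false;
    for (const auto& g : requiredOut) if (!c.outGates.count(g)) return false;
    return true;
}

int main() {
    ControllerDescription burner{"burner", {"Heat"}, {"status", "malfunction"}};
    bool ok = matchesRole(burner, {"Heat"}, {"status"});
    std::cout << "new controller can take the burner role: " << std::boolalpha << ok << '\n';
}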
References
1. Allen, R. and D. Garlan. Beyond definition/use: architectural interconnection. Proceedings of the Workshop on Interface Definition Languages. 1994. Portland, USA: ACM Press.
2. Deitel, H.M. and P.J. Deitel, Java: How to Program. 1999, Prentice Hall: Upper Saddle River, N.J.
3. Donohoe, P.: Software Product Lines: Experience and Research Directions: Proceedings of the First Software Product Lines Conference (SPLC1), August 28-31, 2000, Denver, Colorado. 2000, Boston, MA: Kluwer Academic.
4. Eisenecker, U. and Czarnecki, K., Generative Programming: Methods, Tools, and Applications, Addison-Wesley, 2000.
5. Ellsberger, J., D. Hogrefe, and A. Sarma, SDL - Formal Object-oriented Language for Communicating Systems. 1997: Prentice Hall Europe.
6. Estrin, D., Govindan, R., and Heidemann, J., Embedding the Internet. Communications of the ACM, 2000. 43: p. 38-50.
7. Fowler, M. and Scott, K., UML Distilled: A Brief Guide to the Standard Object Modeling Language, 2nd ed., Addison Wesley Professional, ISBN 0-201-65783-X, 2000.
8. Harel, D. and Gery, E., Executable Object Modeling with Statecharts. Proceedings of the 18th Intl. Conf. on Software Engineering, pp. 246-257, IEEE CS / ACM Press, 1996.
9. Johnson, R. Components, frameworks, patterns. In 1997 Symposium on Software Reusability. 1997. Boston, USA: ACM Press.
10. LabVIEW - The Software That Powers Virtual Instruments, National Instruments Corporation, Austin, Texas. http://www.ni.com/labview
11. Leavens, G.T. and M. Sitaraman, Foundations of Component-Based Systems. 2000, Cambridge, England; New York: Cambridge University Press.
12. Lorenz, D. and Vlissides, J., Designing Components versus Objects: A Transformational Approach. ICSE 2001: 253-262, Toronto, Ontario, Canada, May 12-19, 2001.
13. MDA - The Architecture of Choice for a Changing World, Object Management Group. http://www.omg.org/mda/
14. Mili, H., H. Sahraoui, and I. Benyahia. Representing and querying reusable object frameworks. In Symposium on Software Reusability. 1997. Boston, USA: ACM Press.
15. Nierstrasz, O., S. Gibbs, and D. Tsichritzis, Component-Oriented Software Development. Communications of the ACM, 1992. 35(9): p. 160-165.
16. Noy, Natalya F. and McGuinness, Deborah L., Ontology Development 101: A Guide to Creating Your First Ontology. Stanford University, Stanford, CA, 94305. http://protege.stanford.edu/publications/ontology_development/ontology101.pdf
17. Petri, C., Concurrency Theory. Advanced Course on Petri Nets, pp. 1-22, Gesellschaft für Mathematik und Datenverarbeitung, St. Augustin, Germany, 1986.
18. Stewart, D. Designing Software Components for Real-Time Applications. In Embedded Systems Conference. 2000. San Jose, CA, USA.
19. Stevens, P. and R.J. Pooley, Using UML: Software Engineering with Objects and Components. 2000, New York: Addison-Wesley.
20. Szyperski, C., Component Software: Beyond Object-Oriented Programming. 1997: Addison-Wesley.
21. UML, Unified Modeling Language, UML 2.0 specification. http://www.uml.org/
Component-Based Development of Dependable Systems with UML
Jan Jürjens and Stefan Wagner
Software & Systems Engineering
Technische Universität München
Boltzmannstr. 3, D-85748 Garching, Germany
{juerjens,wagnerst}@in.tum.de
Abstract. Dependable systems have to be developed carefully to prevent loss of life and resources due to system failures. Some of their mechanisms (for example, those providing fault tolerance) can be complicated to design and use correctly in the system context and are thus error-prone. This chapter gives an overview of reliability-related analyses for the design of component-based software systems. These enable the identification of failure-prone components using complexity metrics and the operational profile, and the checking of reliability requirements using stereotypes. We report on the implementation of the checks in a tool inside a framework for tool-supported development of reliable systems with UML, and on two case studies used to validate the metrics and checks.
1
Introduction
There is an increasing desire to exploit the flexibility of software-based systems in the context of critical systems where predictability is essential. Examples include the use of embedded systems in various application domains, such as fly-by-wire in avionics, drive-by-wire in automotive systems, and so on. Given the high reliability requirements in such systems (such as a maximum of 10⁻⁹ failures per hour in the avionics sector), a thorough design method is necessary. We define reliability as the probability of failure-free functioning of a software component for a specified period in a specified environment. Reliability mechanisms cannot be "blindly" inserted into a critical system; rather, the overall system development must take these aspects into account. Furthermore, sometimes such mechanisms cannot be used off-the-shelf, but have to be designed specifically to satisfy given requirements. For example, the use of redundancy mechanisms to compensate for the failures that occur in any operational system may require complex protocols whose correctness can be non-obvious [41]. This can be non-trivial, as spectacular examples of software failures in practice demonstrate (such as the explosive failure of the Ariane 5 rocket in 1996). Any support to aid reliable systems development would thus be useful. In particular, it would be desirable to consider reliability aspects already in the design phase, before a system is actually implemented, since removing flaws in the design phase saves cost and time. This is significant; for example, in avionics, verification costs represent 50% of the overall costs. Moreover, a means to estimate reliability, or at least identify failure-prone components, early in the life-cycle of the software would be helpful to make
verification more efficient. We believe that the design models are the best indicator in early phases for the future behavior of the system and thus should be used for reliability estimation. Following an idea advocated in [1], we thus aim to incorporate quality attributes of models (such as measures derived from structural or behavioral attributes) into component-based models of software systems within the context of model-based software development. As a design notation, we use the Unified Modeling Language (UML) [35], the de facto industry-standard in object-oriented modeling. It offers an unprecedented opportunity for high-quality critical systems development that is feasible in an industrial context. Problems in critical systems development often arise when the conceptual independence of software from the underlying physical layer turns out to be an unfaithful abstraction (for example in settings such as real-time or more generally safety-critical systems, see [42]). Since UML allows the modeler to describe different views on a system, including the physical layer, it seems promising to try to use UML to address these problems by modeling the interdependencies between the system and its physical environment. To support safe systems development, safety checklists have been proposed for example in [15]. In the present chapter, based on an extending work presented in [20], we tailor UML in a similar approach to reliable systems by precisely defining some checks with stereotypes capturing reliability requirements and related physical properties. We also provide metrics that estimate the failure-proneness of a software system based on the complexity of its design models and its operational profile. The reliability requirements can then be compared with the results for failure-proneness. In this way we encapsulate knowledge on prudent reliability engineering and thereby make it available to developers who may not be specialized in reliable systems. A prototypical framework for tool-support for this approach is also presented within this chapter. Outline. In Sect. 2.1 we explain the foundation for checking the constraints associated with the stereotypes suggested for reliable systems development which are presented in Sect. 2.2, together with examples of their use. A metrics suite for models is defined in Sect. 3.1 and these metrics are used to analyze the failure-proneness of components in Sect. 3.2. In Sect. 4, we briefly describe the tool assisting our approach. Two case studies describing an automatic collision notification system in Sect. 5 and an automotive network controller in Sect. 6 are finally used to validate our work.
2 Model-Based Reliability Specification and Analysis
In safety-critical systems, an important concept also used here is that of a safety level (see, e.g., [39]). Since safety-critical systems generally need to provide a high degree of reliability, it makes sense to analyze these systems with respect to their maximum allowed failure rate. We thus define the concept of a reliability level analogous to the mentioned safety levels. As examples, we consider the following kinds of failure semantics in this chapter (other kinds have to be omitted for space reasons):
– crash/performance failure semantics means that a component may crash or may deliver the requested data only after the specified time limit, but it is assumed to be partially correct.
– value failure semantics means that a component may deliver incorrect values.

Possible failures include:

– message loss, which may be due to hardware failures or software failures (for example, buffer overflows),
– message delay, which may in turn result in the reordering of messages if the delay is variable,
– message corruption, when a message is modified in transit.

Forms of redundancy commonly employed against these failures include space redundancy (physical copies of a resource), time redundancy (rerunning functions) and information redundancy (error-detecting codes).

UML Profile Mechanisms. We use the three main profile mechanisms (stereotypes, tagged values and constraints) to include reliability requirements in a UML specification, together with the constraints formalizing the requirements. To evaluate a model against the requirements, we refer to a precise semantics for the used fragment of UML extended with a notion of failures, sketched in Sect. 2.1.

2.1 Evaluation of Reliability Requirements in UML Diagrams

We briefly give an idea how the constraints used in the UML extension presented in Sect. 2.2 can be checked in a precise and well-defined way. A precise semantics for a (restricted and simplified) fragment of UML supporting these ideas can be found in [21]. It includes activity diagrams, statecharts, sequence diagrams, composite structure diagrams, deployment diagrams, and subsystems, each restricted and simplified to keep feasible a mechanical analysis that is necessary for some of the more subtle behavioral reliability requirements. The subsystems integrate the information between the different kinds of diagrams and between different parts of the system specification. For reliability analysis, the reliability-relevant information from the reliability-oriented stereotypes is then incorporated as explained below.

Outline of Precise Semantics. In UML the objects or components communicate through messages received in their input queues and released to their output queues. Thus for each component C of a given system, the semantics defines a function [[C]]() which

– takes a multi-set I of input messages and a component state S (a multi-set, also called a bag, is a set whose elements may occur more than once) and
– outputs a set [[C]](I, S) of pairs (O, T), where O is a multi-set of output messages and T the new component state (it is a set of pairs because of the non-determinism that may arise),
together with an initial state S0 of the component. The behavioral semantics [[D]]() of a state machine diagram D models the run-to-completion semantics of UML state machines. Similarly, one can define the semantics for UML 1.5 activity diagrams. Given a sequence diagram S, we define the behavior [[S.C]]() of each contained component C.

Subsystems group together diagrams describing different parts of a system: a system component C given by a subsystem S may contain subcomponents C1, ..., Cn. The behavioral interpretation [[S]]() of S is defined by iterating the following steps:

1. It takes a multi-set of input events.
2. The events are distributed from the input multi-set and the link queues connecting the subcomponents and given as arguments to the functions defining the behavior of the intended recipients in S.
3. The output messages from these functions are distributed to the link queues of the links connecting the sender of a message to the receiver, or given as the output from [[S]]() when the receiver is not part of S.

When performing reliability analysis, after the last step, the failure model may corrupt the contents of the link queues in a certain way explained below. Note that this approach is similar to that taken in [21], where a security analysis is performed in place of the reliability analysis.

As an example, the state chart in Fig. 1 is executed as follows: The fuel controller starts out in state WheelsOut. It awaits either the message fuel() or wheelsin(). In the first case, the argument of the message is multiplied with the constant d and the result returned. In the second case, if the argument is false, no change occurs. In case of true, the state is switched to WheelsIn. In that state, the same behavior occurs, except that the argument of fuel() is now multiplied with the constant c.
(Figure: the «containment» component Fuel controller, with tag {reliable={fuel}} and operations fuel(x:Data):Data and wheelsin(x:Bool), realized by the part Fuel control; its state machine has the states WheelsOut and WheelsIn, switches between them on wheelsin(true) / wheelsin(false), and answers fuel(x) with return(d.x) in WheelsOut and return(c.x) in WheelsIn.)
Fig. 1. Example State chart
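For readers who prefer executable notation, the following is a minimal Python sketch of this semantics (our own illustration, not the formalization of [21]): a component behavior is a function from an input multi-set and a state to a list of possible (output multi-set, new state) pairs, and one subsystem iteration distributes messages via link queues. The fuel-controller behavior, the constants and the routing table are simplified, hypothetical stand-ins for the Fig. 1 example.

```python
from collections import Counter  # Counter serves as a multi-set (bag)

# A component's behavior [[C]](): (input bag, state) -> list of (output bag, new state)
def fuel_control(inputs, state):
    """Toy behavior loosely following the Fig. 1 fuel controller (c, d are made-up constants)."""
    c, d = 2, 3
    outputs = Counter()
    for (msg, arg), n in inputs.items():
        if msg == "fuel":
            factor = c if state == "WheelsIn" else d
            outputs[("return", factor * arg)] += n
        elif msg == "wheelsin" and arg is True:
            state = "WheelsIn"
        elif msg == "wheelsin" and arg is False:
            state = "WheelsOut"
    return [(outputs, state)]  # deterministic here: a single alternative

def subsystem_step(components, states, link_queues, external_inputs):
    """One iteration: distribute external inputs and link-queue contents to the
    components, run each component, and route outputs to link queues or outside."""
    external_outputs = Counter()
    new_states = {}
    for name, behavior in components.items():
        inbox = external_inputs.get(name, Counter()) + link_queues.get(name, Counter())
        link_queues[name] = Counter()
        out, new_states[name] = behavior(inbox, states[name])[0]  # pick one alternative
        for msg, n in out.items():
            receiver = routing.get(name)          # hypothetical routing table
            if receiver in components:
                link_queues[receiver][msg] += n
            else:
                external_outputs[msg] += n
    return new_states, external_outputs

routing = {"FuelControl": None}  # outputs of FuelControl leave the subsystem

states, queues = {"FuelControl": "WheelsOut"}, {}
_, out = subsystem_step({"FuelControl": fuel_control}, states, queues,
                        {"FuelControl": Counter({("fuel", 10): 1})})
print(out)  # Counter({('return', 30): 1}), since d = 3 applies in state WheelsOut
```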
As in standard terminology in high assurance systems, the values output by a component by means of call or send actions could be referred to as controlled quantities, whereas the values input as events are the monitored quantities [3].

Reliability Analysis. For a reliability analysis of a given UML subsystem specification S, we need to model potential failure behavior. We model specific types of failures that can corrupt different parts of the system in a specified way, depending on the used redundancy model. For this we assume a function Failures^R_s which takes a redundancy
model R and a stereotype s ∈ {crash/performance, value} and returns a set of expressions

Failures^R_s ⊆ {delay(t) : t ∈ N ∧ t > 0} ∪ {loss(p) : p ∈ [0, 1]} ∪ {corruption(q) : q ∈ [0, 1]}.

Here R is a name representing a redundancy mechanism (such as duplication of components together with a voting mechanism), which is semantically defined through the Failures sets. The natural number t represents the maximum delay to be expected in time units. p gives the probability that an expected data value is not delivered after the t time units specified in delay(t). Given a value delivered within this time period, q denotes the probability that this value is corrupted. As an example for a failures function, Table 1 gives the one for the absence of any redundancy mechanism (R = none). Here, the time and probability parameters are still included as parameters; for a given system, these will be concrete numeric values.

Table 1. Failure semantics
Risk              | Failures^none()
Crash/performance | {delay(t), loss(p)}
Value             | {corruption(q)}
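As a small illustration, the Failures function for the case R = none from Table 1 can be written down directly; the numeric defaults below are made-up placeholders, since for a real system they would be obtained from measurements:

```python
# Failure expressions Failures^R_s: delay(t) is a maximum delay in time units,
# loss(p) and corruption(q) are probabilities.
def failures(redundancy_model, stereotype, t=2, p=0.01, q=0.001):
    """Sketch of Failures^R_s for R = 'none' (Table 1); t, p, q are illustrative values."""
    if redundancy_model == "none":
        if stereotype == "crash/performance":
            return {("delay", t), ("loss", p)}
        if stereotype == "value":
            return {("corruption", q)}
    raise NotImplementedError("only R = 'none' is sketched here")

print(failures("none", "crash/performance"))  # {('delay', 2), ('loss', 0.01)}
```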
The consistency of the failure model with the physical reality (and in particular the completeness in the sense that no possible failures are missing from the failure model) can for example be established by simulating the model and comparing the results with data obtained from experiments on the physical systems. Similarly, the probabilistic values and other numerical data can also be derived. Note that this consistency cannot, for principled reasons, be proved, since mathematical proofs can only be constructed with respect to mathematical models of reality, not reality itself.

The consistency of the running code with the execution semantics of the UML diagrams used here can be guaranteed in two ways: Firstly, one can use automated code generation, where the code generator would (for high assurance applications) ideally be formally proved to be correct with respect to the semantics. Where code generation is not useful, the manually implemented code can still be checked against the semantics by using model-based test-sequence generation (this is not considered here; [19] gives an introduction).

Then we model the actual behavior of a failure, given a redundancy model R, as a failure function that, at each iteration of the system execution, non-deterministically maps the contents of the link queues in S and a state S to the new contents of the link queues in S and a new state T, as explained below. For this, for any link l, we use a sequence (lq^l_n)_{n∈N} of multi-sets such that at each iteration of the system, for any n, lq^l_n contains the messages that will be delayed for further n time units. Here lq^l_0 stands for the actual contents of the link queue l. At the beginning of the system execution, all these multi-sets are assumed to be empty. Also, for any execution trace h (that is, a particular sequence of system states and occurring failures describing a possible history of the system execution), we define a sequence (p^h_n)_{n∈N} of probabilities such that at the nth iteration of the system, the failure considered in the current execution trace happened with probability p^h_n. Thus the probability p^h that a trace h of length n will take place is the product of the values p^h_1, ..., p^h_n (since in our presentation here, we assume failures to be mutually independent, to keep the exposition accessible). Then for
an execution trace h, the failure function is defined as follows. It is non-deterministic in the sense that for each input, it may have a set of possible outputs. Failure behavior should be part of the trace.

– For any link l stereotyped s where loss(p) ∈ Failures^R_s, we
  • either define lq^l_0 := ∅ and append p to the sequence (p^h_n)_{n∈N},
  • or append 1 − p to the sequence (p^h_n)_{n∈N}.
– For any link l stereotyped s where corruption(q) ∈ Failures^R_s, we
  • either define lq^l_0 := {□} and append q to the sequence (p^h_n)_{n∈N},
  • or append 1 − q to the sequence (p^h_n)_{n∈N}.
– For any link l stereotyped s where delay(t) ∈ Failures^R_s and lq^l_0 ≠ ∅, we define lq^l_n := lq^l_0 for some n ≤ t and append 1/t to the sequence (p^h_n)_{n∈N}.
– Then for each n, we (simultaneously) define lq^l_n := lq^l_{n+1}.

The failure types define which kind of failure may happen to a communication link with a given stereotype, as explained above. Note that for simplicity we assume that delay times are uniformly distributed. Also, corrupted messages (symbolized by □) are assumed to be recognized (using error-detecting codes).

To evaluate the reliability of the system with respect to the given type of failure, we define the execution of the subsystem S in presence of a redundancy model R to be the function [[S]]^R() defined from [[S]]() by applying the failure function to the link queues as a fourth step in the definition of [[S]]() as follows:

4. The failure function is applied to the link queues as detailed above.
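The following minimal Python sketch (our own illustration, not the authors' tooling) applies loss, corruption and delay to a single link queue and records the per-step probabilities p^h_n of the chosen trace; the trace probability p^h is then the product of the recorded values.

```python
import random

CORRUPT = "\u25a1"  # distinguished symbol for a recognizably corrupted message

def apply_failures(queues, failure_set, trace_probs, rng=random):
    """One application of the failure function to one link.
    queues[n] holds the messages delayed for a further n time units (queues[0] is
    the actual link queue); trace_probs collects the probabilities p^h_n."""
    for kind, value in failure_set:
        if kind == "loss" and queues[0]:
            if rng.random() < value:              # lose the queue contents
                queues[0] = []
                trace_probs.append(value)
            else:
                trace_probs.append(1 - value)
        elif kind == "corruption" and queues[0]:
            if rng.random() < value:
                queues[0] = [CORRUPT]
                trace_probs.append(value)
            else:
                trace_probs.append(1 - value)
        elif kind == "delay" and queues[0]:
            n = rng.randrange(1, value + 1)       # uniformly distributed delay of 1..t units
            queues[n] = queues[n] + queues[0]
            queues[0] = []
            trace_probs.append(1.0 / value)
    # finally shift every queue one step closer to delivery (lq_n := lq_{n+1})
    return queues[1:] + [[]], trace_probs

t = 3
queues = [["fuel(10)"]] + [[] for _ in range(t)]
queues, probs = apply_failures(queues, {("delay", t)}, [])
print(queues, probs)
```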
Containment. A system ensures containment if there is no unreliable interference between components on different reliability levels (this is called non-interference in [8]). Intuitively, providing containment means that an output should in no way depend on inputs of a lower level.

We assume that we are given an ordered set Levels of reliability levels. Then the containment constraint is that in the system, the value of any data element of level l may only be influenced by data of the same or a higher reliability level: Write H(l) for the set of messages of level l or higher. Given a sequence m of messages, we write m↾H(l) for the sequence of messages derived from those in m by deleting all events the message names of which are not in H(l). For a set M of sequences of messages, we define M↾H := {m↾H : m ∈ M}.

Definition 1. Given a component C and a reliability level l, we say that C provides containment with respect to l if for any two sequences i, j of input messages, i↾H(l) = j↾H(l) implies [[C]](i)↾H(l) = [[C]](j)↾H(l).

2.2 Stereotypes for Reliability Analysis: The “Reliability Checklist”

In Table 2 we give some of the stereotypes, together with their tags and constraints, that we suggest to be used in the model-based development of reliable systems with UML, based on previous experience in the model-based development of reliable systems (for space restrictions, we can only give a representative selection). Thus, in a way, we define a UML-based “Reliability Checklist” (which one can verify mechanically on the design level). The constraints, which in the table are only named briefly, are formulated and explained in the remainder of the section. Table 3 gives the corresponding tags. The relations between the elements of the tables are explained below in detail. Note that some of the concepts introduced below are easier to apply at component rather than object level.

Table 2. Stereotypes
Stereotype          | Base Class            | Tags         | Constraints                             | Description
risk                | link, node            | failure      |                                         | risks
crash/performance   | link, node            |              |                                         | crash/performance failure semantics
value               | link, node            |              |                                         | value failure semantics
guarantee           | link, node            | goal         |                                         | guarantees
redundancy          | dependency, component | model        |                                         | redundancy model
reliable links      | subsystem             |              | dependency reliability matched by links | enforces reliable communication links
reliable dependency | subsystem             |              | call, send respect data reliability     | structural data reliability
critical            | object                | (level)      |                                         | critical object
reliable behavior   | subsystem             |              | behavior fulfills reliability           | reliable behavior
containment         | subsystem             |              | provides containment                    | containment
error handling      | subsystem             | error object |                                         | handles errors

Table 3. Tags
Tag          | Stereotype     | Type                                       | Multipl. | Description
failure      | risk           | P({delay(t), loss(p), corruption(q)})      | *        | specifies risks
goal         | guarantee      | P({immediate(t), eventual(p), correct(q)}) | *        | specifies guarantees
model        | redundancy     | {none, majority, fastest}                  | *        | redundancy model
error object | error handling | string                                     | 1        | error object

We explain the stereotypes and tags given in Tables 2 and 3 and give examples (which for space restrictions have to be kept simple). Note that the constraints considered here span a range in sophistication: Some of the constraints are relatively simple (comparable to type-checking in programming languages), can be enforced at the level of abstract syntax (such as reliable links), and can be used without the semantics sketched in Sect. 2.1. Others (such as containment) refer to the semantics and can only be checked reliably using tool support.

Overview. We give an overview of the syntactic extensions together with an informal explanation of their meaning. redundancy, with associated tag {model}, describes the redundancy model that should be implemented. risk describes the risks arising at the physical level using the associated tag {failure}. guarantee requires the goals described in the associated tag {goal} for communicated data. reliable links ensures that reliability requirements on the communication are met by the physical layer. critical labels critical objects using the associated tags {level} (for each reliability level level). reliable dependency ensures that communication dependencies respect reliability requirements on the communicated data. reliable behavior ensures that the system behaves reliably as required by
guarantee, in the presence of the specified failure model. containment ensures containment as defined in Definition 1. error handling with tag {error object} provides an object for handling errors. In the following paragraphs, we define the stereotypes and their constraints in detail.

Redundancy. The stereotype redundancy of dependencies and components and its associated tag {model} can be used to describe the redundancy model that should be implemented for the communication along the dependency or the values computed by the component. Here we consider the redundancy models none, majority, fastest, meaning that there is no redundancy, there is replication with majority vote, or replication where the fastest result is taken (but of course there are others, which can easily be incorporated in our approach).

Risk, Crash/Performance, Value. With the stereotype risk on links and nodes in deployment diagrams one can describe the risks arising at these links or nodes, using the associated tag {failure}, which may have any subset of {delay(t), loss(p), corruption(q)} as its value. In the case of nodes, these concern the respective communication links connected with the node. Alternatively, one may use the stereotypes crash/performance or value, which describe specific failure semantics (by giving the relevant subset of {delay(t), loss(p), corruption(q)}): For each redundancy model R, we have a function Failures^R_s from a given stereotype s ∈ {crash/performance, value} to a set of strings Failures^R_s ⊆ {delay(t), loss(p), corruption(q)}. If there are several such stereotypes relevant to a given link (possibly arising from a node connected to it), the union of the relevant failure sets is considered. This way we can evaluate UML specifications. We make use of this for the constraints of the remaining stereotypes. An example for a failures function was given above in Table 1.

Guarantee. call or send dependencies in object or component diagrams stereotyped guarantee are supposed to provide the goals described in the associated tag {goal} for the data that is sent along them as arguments or return values of operations or signals. The goals may be any subset of {immediate(t), eventual(p), correct(q)}. This stereotype is used in the constraints for the stereotypes reliable links and reliable behavior.

Reliable Links. The stereotype reliable links, which may label subsystems, is used to ensure that reliability requirements on the communication are met by the physical layer. We recall that in UML deployment diagrams, communication is specified on the logical level by communication dependencies between components, which is supported on the physical level by communication links between the nodes on which the components reside. More precisely then, the constraint enforces that for each dependency d with redundancy model R stereotyped guarantee between subsystems or objects on different nodes n, m, we have a communication link l between n and m with stereotype s such that

– if {goal} has immediate(t) as one of its values then delay(t′) ∈ Failures^R_s entails t′ ≤ t,
– if {goal} includes eventual(p) as one of its values then loss(p′) ∈ Failures^R_s entails p′ ≤ 1 − p, and
– if {goal} has correct(q) as one of its values then corruption(q′) ∈ Failures^R_s entails q′ ≤ 1 − q.

Example. In Fig. 2, given the redundancy model R = none, the constraint for the stereotype reliable links is fulfilled if and only if t ≤ T, where t is the delay to be expected according to the Failures^none(crash/performance) entry in Table 1.

(Figure: a subsystem client/server stereotyped «reliable links», with a client machine node containing client apps and a server machine node containing server apps; the «call» dependency from client apps to server apps carries «guarantee» {goal={immediate(T)}}, and the communication link between the two nodes is stereotyped «crash/performance».)
Fig. 2. Example reliable links usage
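Operationally, the reliable links constraint is a comparison between the {goal} tag of a dependency and the failure set of the supporting link. A minimal sketch of such a check, with goals and failures represented as plain tuples (our own encoding, not the tool's), could look as follows:

```python
def reliable_links_ok(goals, link_failures):
    """Check the reliable links constraint for one dependency/link pair.
    goals: e.g. {("immediate", T)}; link_failures: Failures^R_s of the link."""
    for goal, bound in goals:
        for failure, value in link_failures:
            if goal == "immediate" and failure == "delay" and not value <= bound:
                return False              # delay(t') must satisfy t' <= t
            if goal == "eventual" and failure == "loss" and not value <= 1 - bound:
                return False              # loss(p') must satisfy p' <= 1 - p
            if goal == "correct" and failure == "corruption" and not value <= 1 - bound:
                return False              # corruption(q') must satisfy q' <= 1 - q
    return True

# Fig. 2, R = none: «guarantee» {goal={immediate(T)}} over a «crash/performance» link
T, t = 5, 2
print(reliable_links_ok({("immediate", T)}, {("delay", t), ("loss", 0.01)}))  # True, since t <= T
```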
Reliable Behavior. The stereotype reliable behavior ensures that the specified system behavior, in the presence of the failure model under consideration, does provide the reliability goals stated in the tag {goal} associated with the stereotype guarantee, as follows, by referring to the semantics sketched in Sect. 2.1.

– immediate(t). In any trace h of the system, the value is delivered after at most t time steps in transmission from the sender to the receiver along the link l. Technically, the constraint is that after at most t steps the value is assigned to lq^l_0.
– eventual(p). In any trace h of the system, the probability that the value is lost during transmission is at most 1 − p. Technically, the sum of all p^h for such histories h is at most 1 − p.
– correct(q). In any trace h of the system, the probability that the delivered value is corrupted during transmission is at most 1 − q. Technically, the sum of all p^h for such histories h is at most 1 − q.
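The two probabilistic goals can be checked by summing trace probabilities, as the following sketch shows; trace generation itself is assumed, and the traces below are invented for illustration only.

```python
from math import prod

def check_reliable_behavior(traces, goal):
    """Check the eventual(p) / correct(q) goals of reliable behavior over a set of
    execution traces. Each trace is (per-step probabilities p^h_n, outcome), where
    outcome is 'ok', 'lost' or 'corrupted'."""
    kind, bound = goal
    bad = "lost" if kind == "eventual" else "corrupted"
    p_bad = sum(prod(steps) for steps, outcome in traces if outcome == bad)
    return p_bad <= 1 - bound

traces = [
    ([0.99], "ok"),     # value delivered
    ([0.01], "lost"),   # value lost with probability 0.01
]
print(check_reliable_behavior(traces, ("eventual", 0.95)))  # True: 0.01 <= 0.05
```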
3 Model-Based Reliability Metrics
This section describes the possibilities and benefits of a reliability-related analysis of models based on complexity metrics. We first explain the motivation and the assumed development process. Afterwards, specific metrics for structured classes and state machines are proposed and combined with established object-oriented design metrics. These metrics are joined with the operational profile of the system to find the failure-prone components. This information is finally used in combination with the reliability requirements from Sect. 2.
The main idea is to identify failure-prone components early in the life cycle of the software by their complexity measures and operational profile, and to use this information in checks regarding reliability requirements. The complexity information can help us to rethink design decisions and simplify the design in general. The further analysis can guide the test and review efforts to concentrate on the more critical and failure-prone components. Finally, annotated reliability requirements can be checked for consistency with this information.

3.1 Model Complexity Metrics

The complexity of software code has been studied to a large extent. The most widely known metrics concerning complexity are Halstead's Software Metric [12] and McCabe's Cyclomatic Complexity [29], and many variations of these. In [25, 32] it is shown that the reliability of a software system is related to its complexity. It is generally accepted that complexity is a good indicator for the reliability of a component. This means that a component with a high complexity is more likely to contain faults. Depending on the operational profile of the component, this can mean that the reliability is low. For example, it is stated in [40] that a combination of size and cyclomatic complexity delivers good results in reliability prediction.

Although the traditional complexity metrics are not easily applicable to design models, there are already a number of approaches that propose design metrics [4, 6, 48]. However, they concentrate mainly on the structure or do not support object-oriented designs. Nevertheless, there are also various metrics for object-oriented design models. The most important is the metrics suite proposed in [7], which concentrates on various aspects of classes. Most of these metrics were found to be good estimators of fault-prone classes in [2] and will be used and extended in the following. In using a suite of metrics we follow [11, 30], which state that a single measure is usually inappropriate to measure complexity.

Development Process. The metrics suite described below is generally applicable in all kinds of development processes. It does not need specific phases or sequences of phases to work. However, we need detailed design models of the software to which we apply the metrics. This is most rewarding in the early phases, as the models then can serve various purposes. Otherwise, we assume no specific process (apart from being model-based) and therefore omit details on possible process models. The idea is mainly to incorporate reliability aspects during the development of the model.

We base our metrics especially on those parts of UML 2.0 that are most relevant to embedded systems development. The parts that we will look at are classes, structured classes, components, and state machines. We adjust new metrics and the ones from [7] to parts of UML 2.0 based on the design approach taken in ROOM [43] or UML-RT [44], respectively. This means that we model the architecture of the software with structured classes (called actors in ROOM, capsules in UML-RT) that are connected by ports and connectors to describe the interfaces, and which can have associated state machines that describe their behavior. The structured classes can have parts that may themselves be structured. Thus a hierarchical system decomposition is possible.
The metrics defined below for the different model elements can predict the fault-proneness of the components. To be able to make a reliability analysis, we need information about the failure-proneness of components, i.e. the probability that a fault causes a failure. We will use a very simple form of an operational profile [33] to determine the usage level of a component. Therefore the development process must support the creation of operational profiles early in the development.

Structured Classes and Components. Structured classes and components are a new concept in UML 2.0, derived mainly from ROOM and UML-RT. It introduces composite structures that represent a composition of run-time instances collaborating over communication links. This allows UML components and classes to have an internal structure consisting of other components or classes that are bound by connectors. Furthermore, ports are introduced as a defined entry point to a class or component. A port can group various provided and required interfaces. A connection between two classes or components through ports can also be denoted by a connector. The parts of a class or component work together to achieve its behavior. A state machine can also be defined to describe additional behavior.

The metrics defined in this section are applicable to components as well as classes. However, we will concentrate on structured classes, following the usage of classes in ROOM. Therefore the set of documents under consideration in the following are composite structure diagrams of single classes or components with their parts, provided and required interfaces, connectors, and their state machines if existing. An example for this is depicted in Fig. 3.

(Figure: the structured class C1 with three parts P1, P2 and P3 and several ports; the resulting metric values are NOP = 3, NRI = 2, NPI = 2.)

Fig. 3. An example structured class with three parts and the corresponding metrics

Number of Parts (NOP). The number of parts of a structured class or component obviously contributes to its structural complexity. The more parts it has, the more coordination is necessary because of the additional dependencies. Therefore, we define NOP as the number of direct parts C_p of a class or component.

Number of Required Interfaces (NRI). This metric is (together with the NPI metric below) based on the fan-in and fan-out metrics from [16] and is also a substitute for the older Coupling Between Objects (CBO) metric, which was criticized in [28] for not representing the concept of coupling appropriately. It reduces ambiguity by giving a clear direction of the coupling. We use the required interfaces of a class to represent the usage of other classes. This is another increase of complexity which may as well lead to failure, for example if the interfaces are not correctly defined. Therefore we count the number of required interfaces I_r for this metric.

Number of Provided Interfaces (NPI). Very similar, but not as important as NRI, is the number of provided interfaces I_p. This is similarly a structural complexity measure that expresses the usage of a class by other entities in the system.

State Machines. State machines are used to describe the behavior of classes of a system. They describe the actions and state changes based on a partitioning of the state space of the class. Therefore the associated state machine is also an indicator of the complexity of a class and hence its fault-proneness. State machines consist of states
and transitions, where states can be hierarchical. Transitions carry event triggers, guard conditions, and actions.

We use cyclomatic complexity [29] to measure the complexity of behavioral models represented as state machines because it fits most naturally to these models as well as to code. This makes the lifting of the concepts from code to model straightforward. The basic concept is to transfer the metric from the realization of the state machine in code to the graphical representation. To find the cyclomatic complexity of a state machine we build a control flow graph similar to the one for a program in [29]. This is a digraph that represents the flow of control in a piece of software. For source code, a vertex is added for each statement in the program and arcs are added if there is a change in control, e.g. an if- or while-statement. This can be adjusted to state machines by considering their code implementation. For a possible code transformation of state machines see [43]. An example of a state machine and its control flow graph is depicted in Fig. 4.

(Figure: (a) the state machine “Example” with states S1–S4, one of them hierarchical, and transitions labeled e2[g1]/a1, e3[g2], e4 and e5[g3 && g4]/a2; (b) the corresponding control flow graph with 34 nodes and 46 edges, so that v(G) = 46 − 34 + 2 = 14.)

Fig. 4. (a) A simple state machine with one hierarchical state, event trigger, guard conditions, and actions. (b) Its corresponding control flow graph. The black vertices are predicate nodes. On the right the transitions for the respective part of the flow graph are noted

At first we need an entry point as the first vertex. The second vertex starts the loop over the automaton because we need to loop until the final state is reached, or infinitely if there is no final state. The next vertices represent transitions, atomic expressions of guard conditions, and event triggers of transitions. A guard condition can consist of several boolean expressions that are connected by conjunctions and disjunctions. An atomic expression is an expression only using other logical operators such as equivalence. For a more thorough definition see [29]. These vertices have two outgoing arcs each because of the two possibilities of the control flow, i.e. an evaluation to true or false. Such a branching flow is always joined in an additional vertex. The last vertex goes back to the loop vertex from the start, and the loop vertex has an additional arc to one vertex at the end that represents the end of the loop. This vertex finally has an arc to the last vertex, the exit point.

If we have such a graph, we can calculate the cyclomatic complexity using the formula v(G) = e − n + 2, where v is the complexity, G the control flow graph, e the number of arcs, and n the number of vertices (nodes). There is also an alternative formula, v(G) = p + 1, which can also be used, where p is the number of predicate nodes. Predicate nodes are vertices where the flow of control branches.

Hierarchical states in state machines are not incorporated in the metric. Therefore the state machine must be transformed into an equivalent state machine with simple
states. This appears to be preferable to viewing sub-states as a kind of subroutine and keeping them out of the complexity calculation, because this would lose a considerable amount of information on the complexity. Furthermore, internal transitions are counted equally to normal transitions. Pseudo states are not counted themselves, but their triggers and guard conditions are.

Cyclomatic Complexity of State Machine (CCS). Having explained the concepts based on the example flow graph above, the metric can be calculated directly from the state machine with a simplified complexity calculation. We count the atomic expressions and event triggers for each transition. Furthermore, we need to add 1 for each transition because we have the implicit condition that the corresponding source state is active. This results in the formula

CCS = |T| + |E| + |A_G| + 2,

where T is the multi-set of transitions, E is the multi-set of event triggers, and A_G is the multi-set of atomic expressions in the guard conditions. This formula yields exactly the same results as the longer version above but has the advantage that it is easier to calculate.

For this metric we have to consider two abstraction layers. First, we transform the state machine into its code representation and afterwards use the control flow graph of the code representation to measure structural complexity. Note that this is done only for measuring purposes; our approach also applies if the actual implementation is not automatically generated from the UML model but manually implemented.
The first “abstraction” is needed to establish the relationship to the corresponding code complexity. The code complexity is a good indicator of the fault-proneness of a program. The proposition is that the state machine reflects the major complexity attributes of the code that implements it. The next abstraction, to the control flow graph, was established in [29]. In [17] the correlation of metrics of design specifications and code metrics was analyzed. One of the main results was that code metrics such as the cyclomatic complexity are strongly dependent on the level of refinement of the specification, i.e. the metric has a lower value the more abstract the specification is. This also holds for the CCS metric. Models of software can be based on various different abstractions, such as functional or temporal abstractions [37]. Depending on the abstractions chosen for the model, various aspects may be omitted, which may have an effect on the metric. Therefore, it is prudent to consider a suite of metrics rather than a single metric when measuring design complexity to assess the fault-proneness of system components. In addition to the metrics which we defined above, we will now complete our metrics suite by adding two existing metrics from the literature.

Metrics Suite. Three of the metrics from [7] can be adjusted to be applicable to UML models. The metrics chosen are the ones that were found to be good indicators of fault-prone classes in [2]. However, we omit Response For a Class (RFC) and Coupling Between Objects (CBO) because they cannot be determined on the model level. The remaining two metrics, together with the new ones developed above, form our metrics suite in Table 4. We now describe these two adapted metrics.

Depth of Inheritance Tree (DIT). This is the maximum depth of the inheritance graph T to a class c. This can be determined in any class diagram that includes inheritance.

Number of Children (NOC). This is the number of direct descendants C_d in the inheritance graph. This can again be counted in a class diagram.

Table 4. A summary of the metrics suite with its calculation
Depth of Inheritance Tree (DIT): max(depth(T, c))
Number of Children (NOC): |C_d|
Number of Parts (NOP): |C_p|
Number of Required Interfaces (NRI): |I_r|
Number of Provided Interfaces (NPI): |I_p|
Cyclomatic Complexity of State Machine (CCS): |T| + |E| + |A_G| + 2
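To illustrate how the suite in Table 4 can be computed, the following sketch operates on a deliberately simple, hypothetical in-memory model representation (not the API of any particular UML tool); CCS uses the simplified formula derived above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Transition:
    trigger: Optional[str]          # event trigger, if any
    guard_atoms: int = 0            # number of atomic expressions in the guard

@dataclass
class StructuredClass:
    name: str
    parent: Optional["StructuredClass"] = None
    parts: List[str] = field(default_factory=list)
    required: List[str] = field(default_factory=list)   # required interfaces
    provided: List[str] = field(default_factory=list)   # provided interfaces
    transitions: List[Transition] = field(default_factory=list)

def dit(c):                         # Depth of Inheritance Tree
    return 0 if c.parent is None else 1 + dit(c.parent)

def noc(c, all_classes):            # Number of Children (direct descendants)
    return sum(1 for other in all_classes if other.parent is c)

def nop(c): return len(c.parts)     # Number of Parts
def nri(c): return len(c.required)  # Number of Required Interfaces
def npi(c): return len(c.provided)  # Number of Provided Interfaces

def ccs(c):                         # |T| + |E| + |A_G| + 2
    t = len(c.transitions)
    e = sum(1 for tr in c.transitions if tr.trigger is not None)
    a = sum(tr.guard_atoms for tr in c.transitions)
    return t + e + a + 2

# Roughly the structured class of Fig. 3, plus a small invented state machine
c1 = StructuredClass("C1", parts=["P1", "P2", "P3"],
                     required=["I1", "I2"], provided=["I3", "I4"],
                     transitions=[Transition("e1", 1), Transition("e2", 2)])
print(nop(c1), nri(c1), npi(c1), ccs(c1))  # 3 2 2 9
```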
We now consider whether our metrics are structural complexity measures according to the definition in [30]. The definition says that, for a set D of documents with a pre-order ≤_D and the usual ordering ≤_R on the real numbers R, a structural complexity measure is an order-preserving function m : (D, ≤_D) → (R, ≤_R). Each metric from the suite fulfills this definition with respect to a suitable pre-order on the relevant set of documents. The document set D under consideration depends on the metric: either a class diagram
that shows inheritance and possibly interfaces, a composite structure diagram showing parts and possibly interfaces, or a state machine diagram. All the metrics use specific model elements in these diagrams as a measure. Therefore there is a pre-order ≤_D between the documents of each type based on the metrics: We define d1 ≤_D d2 for two diagrams d1, d2 in D if d1 has fewer of the model elements specific to the metric under consideration than d2. The mapping function m maps a diagram to its metric, which is the number of these elements. Hence m is order-preserving and the metrics in the suite qualify as structural complexity measures.

As mentioned before, complexity metrics are good predictors for the reliability of components [25, 32]. Furthermore, the experiments in [2] show that most metrics from [7] are good estimators of fault-proneness. We adopted DIT and NOC from these metrics unchanged, therefore this relationship still holds. The cyclomatic complexity is also a good indicator for reliability [25], and this concept is used for CCS to be able to keep this relationship. The remaining three metrics were modeled similarly to existing metrics: NOP resembles NOC, while NRI and NPI are similar to CBO. NOC and CBO are estimators for fault-proneness, therefore it is expected that the new metrics behave accordingly.

This metrics suite can now be used to determine the most fault-prone classes and components in a system. However, different metrics are important for different components, so one cannot just take the sum over all metrics to find the most critical component. Some components may have an associated state machine, others not; this makes the sum meaningless. We propose instead to compute the metric values for each component and class and to consider, for each single metric, the ones that have the highest measures. This way we can, for example, determine the components with complex behavior or with high fan-in and fan-out.
3.2 Failure Proneness

We pointed out already that the fault-proneness of a component does not directly imply low reliability, because a high number of faults does not mean that there is a high number of failures [45, 46]. However, a direct reliability measurement is in general not possible on the model level. Nevertheless, we can get close by analyzing the failure-proneness of a component, i.e. the probability that a fault leads to a failure that occurs during software execution.

It is not possible to express the probability of failures with exact figures based on the design models. We propose therefore to use more coarse-grained failure levels, e.g. L_F = {high, medium, low}, where L_F is the set of failure levels. This allows an abstract assessment of the failure probability. It is still not reliability as generally defined, but the best estimate that we can get in early phases.

To determine the failure level of a component we use the metrics suite from above to define complexity levels L_C = {high, low}. We assign each component such a complexity level by looking at the extreme values in the metrics results. Each component that exhibits a high value in at least one of the metrics is considered to have the complexity level high; all other components have the level low. It depends on the actual distribution of values what is to be considered a high value.
Having assigned these complexity levels to the components, we know which components are highly fault-prone. The operational profile [33] is a description of the usage of the system, showing which functions are mostly used. We use this information to assign usage levels L_U to the components. This can be of various granularity; an example would be L_U = {high, medium, low}. When we know the usage of each component, we can analyze the probability that the faults in the component lead to a failure. The combination of complexity level and usage level leads us to the failure level L_F of the component. It expresses the probability that the component fails during software execution. We describe the mapping of the complexity level and usage level to the failure level with the function fp:

fp : L_C × L_U → L_F, where L_F = L_U ∪ {low}.

What the function does is simply to map all components with a high complexity level to their usage level, and all components with a low complexity level to low:

fp(x, y) = y if x = high, and fp(x, y) = low otherwise.

This means that a component with high fault-proneness has a failure probability that depends on its usage, and a component with low fault-proneness generally has a low failure probability. Having these failure levels for each component, we can use that information to guide the verification efforts in the project, e.g. assign the most inspection and testing effort to the components with a high failure level.

3.3 Checking of Reliability Requirements

In this section, we sketch, by way of example, how to use the information on fault-proneness of components obtained using the metrics in the previous section in combination with the checks of reliability requirements considered in Sect. 2. More specifically, we explain how to relate the levels derived from a UML model and the operational profile to reliability requirements formulated in the UML diagram using UML stereotypes.

Critical. As defined in Sect. 2, the stereotype critical labels classes whose instances are critical in some way, as specified by the associated tags {level} for each reliability level level ∈ Levels. The intention is now that for the failure level f ∈ L_F defined in Sect. 3.2, for any reliability level l ∈ Levels, and for any component C stereotyped with this level l, if the levels l and f are contradictory, C should be more closely inspected for possible flaws (for example, using a formal verification, which in general would be too costly to apply to the whole system). Contradictory means here a failure level and a reliability level that are not compatible, e.g. a high failure level and a high reliability level.

Containment. We use the stereotype containment of subsystems defined in Sect. 2 to detect system parts with a high failure level which may influence data values that are supposed to be highly reliable. These system parts can then be inspected more thoroughly for possible flaws.
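Before moving on to tool support, note that the mapping fp from Sect. 3.2 is simple enough to state directly in code; a minimal sketch using the example level sets from above:

```python
def fp(complexity_level, usage_level):
    """Failure level fp: L_C x L_U -> L_F with L_F = L_U ∪ {low} (Sect. 3.2)."""
    return usage_level if complexity_level == "high" else "low"

print(fp("high", "medium"))  # 'medium': a complex, moderately used component
print(fp("low", "high"))     # 'low': low fault-proneness dominates
```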
4 Tool Support for Model-Based Reliability and Safety Analysis
To support our approach, we developed automated tools for the formal verification of UML models against the constraints associated with the stereotypes introduced in Sects. 2 and 3. We describe a framework that incorporates several formal verification tools (including the model checker Spin and the automated theorem prover e-SETHEO).

Functionality. There are three consecutive stages in implementing full verification functionality for the formalized UML models. There exist verification tools for all stages. The framework is, however, designed to be extensible, so new analysis plugins can be added easily.

– Static features. Checkers for static features (for example, a type-checking-like enforcement of safety levels in class and deployment diagrams) have been implemented directly.
– Simple dynamic features. Checks of UML models of a bounded size for simple dynamic properties (for example, that a deterministic machine without interaction with the environment does not reach a certain critical state) can still be implemented directly.
– Complex dynamic features. Checks for complicated behavioral properties, or of large, highly non-deterministic or interactive UML models, require the use of sophisticated tools (such as model checkers). This is implemented by translating the required UML constructs into the model-checker input language (for example, a temporal logic formula).

To be able to apply sophisticated tools (such as model checkers) to compute a metric, one needs a front-end which automatically produces a semantic model and includes the relevant formalized safety requirements, when given a UML model. This avoids requiring the software developers themselves to perform this formalization, which usually needs a high level of specialized training in formal methods. Our UML extension supports this approach by offering predefined safety primitives (such as safety requirements or mechanisms) with a strictly defined semantics, which can be applied by a developer without extensive training in safety-critical systems by simply including the relevant stereotypes in the UML model. These primitives are translated into the targeted formal language, protecting from potential errors in a manual formalization of the safety properties. Since safety requirements are usually defined relative to a failure model, to analyze whether the UML specification fulfills a safety requirement, the tool support has to automatically include the failure model arising from the physical view contained in the UML specification.

We briefly describe the functionality of the UML tool that meets the listed requirements. The developer creates a model and stores it in the UML 1.5 / XMI 1.2 file format. The framework will be updated to UML 2.0 as soon as the official DTDs are available. The file is imported by the tool into the internal MDR repository. The tool accesses the model through the JMI interfaces generated by the MDR library. The checker parses the model and checks the constraints associated with the stereotypes. The results are delivered as a text report for the developer describing the problems found, and
a modified UML model, where the stereotypes whose constraints are violated are highlighted. The tool can be executed as a console application, a web application, or a GUI application.
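The plug-in mechanism is left abstract in the description above. Purely as an illustration of the kind of interface such an extensible checker framework might expose (this is not the actual tool's API, and all names are invented), a static check could be registered roughly as follows:

```python
from typing import Callable, Dict, List

# Hypothetical plug-in registry: each checker takes a parsed model (here just a
# dictionary of stereotyped elements) and returns a list of problem reports.
CHECKERS: Dict[str, Callable[[dict], List[str]]] = {}

def register(stereotype: str):
    def wrap(fn):
        CHECKERS[stereotype] = fn
        return fn
    return wrap

@register("reliable links")
def check_reliable_links(model: dict) -> List[str]:
    problems = []
    for dep in model.get("guarantee_dependencies", []):
        if dep.get("link_failures") is None:
            problems.append(f"dependency {dep['name']}: no supporting link found")
    return problems

def run_all(model: dict) -> List[str]:
    report = []
    for name, checker in CHECKERS.items():
        report += [f"[{name}] {p}" for p in checker(model)]
    return report

print(run_all({"guarantee_dependencies": [{"name": "client->server", "link_failures": None}]}))
```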
5 Case Study: Automatic Collision Notification
In this part of the chapter, we validate our proposed safety and reliability analyses in a case study of an automatic collision notification system, as used in cars to provide automatic emergency calls. First, the system is described and designed using the UML extension that we made in the previous sections; then we analyze the model and present the results.

Description. The case study that we used to validate our results was done in cooperation with the automotive manufacturer BMW; a similar project is currently in development. The problem to be solved is that many automobile accidents involve only a single vehicle. Therefore it is possible that no, or only a delayed, emergency call is made. The chances for the casualties are significantly higher if an accurate call is made quickly. This has led to the development of so-called Automatic Collision Notification (ACN) systems, sometimes also called mayday systems. They automatically notify an emergency call response center when a crash occurs, but a manual notification using the location data from a GPS device can also be made. We used the public specification from the Enterprise program [9, 10] as a basis for the design model because the joint work with BMW is at an early stage. In this case study, we will concentrate on the built-in device of the car and ignore the obviously necessary infrastructure such as the call center.

Device Design. Following [9], we will call the built-in device MaydayDevice and divide it into five components. The architecture is illustrated in Fig. 5 using a composite structure diagram of the device. The device is a processing unit that is built into the vehicle and has the ability to communicate with an emergency call center using a mobile telephone connection and to retrieve position data using a GPS device. The components that constitute the mayday device are:

– ProcessorModule. This is the central component of the device. It controls the other components, retrieves data from them, and stores it if necessary.
– AutomaticNotification. This component is responsible for notifying the processor module of a serious crash. It gets notified itself if an airbag is activated.
– LocationModule. The processor module requests the current position data from the location module, which gathers the data from a GPS device.
– CommunicationsModule. The communications module is called from the processor module to send the location information to an emergency call center. It uses a mobile communications device and is responsible for automatic retries if a connection fails.
– ButtonBox. This is finally the user interface that can be used to manually initiate an emergency call. It also controls a display that provides feedback to the user.
(Figure: composite structure diagram of the class MaydayDevice with the parts ButtonBox, AutomaticNotification, ProcessorModule, LocationModule, and CommunicationsModule, each annotated with a stereotype and a reliability level tag ({low} for one of them, {high} for the others).)
Fig. 5. The composite structure diagram of the mayday device
These components are again shown in Fig. 6 in a class diagram showing the attributes and methods of each. It also shows that we have exactly one instance of each class in the system. Furthermore, we used some tagged values based on Sect. 2 to describe safety requirements on some data values. The central ProcessorModule has the annotated requirements that the method getGpsData from the LocationModule delivers its data in real time and correctly, and that the data of the call is transferred correctly by the method makeCall of the CommunicationsModule. LocationModule and CommunicationsModule have the corresponding annotations; these requirements are therefore consistent, as defined in Sect. 2.2. Each of the components of the mayday device has an associated state machine describing its behavior. We do not show the state machines for space reasons, but they can be found in [47].

(Figure: the classes ButtonBox, AutomaticNotification, ProcessorModule, LocationModule and CommunicationsModule with their attributes and operations; ProcessorModule carries the tagged values {realtime = getGpsData(location), correct = getGpsData(location), correct = makeCall(callData)}, LocationModule carries {realtime = getGpsData(location), correct = getGpsData(location)}, and CommunicationsModule carries {correct = makeCall(callData)}.)

Fig. 6. The class diagram of the parts

Results. The five subcomponents of MaydayDevice are further analyzed in the following. At first we used our metrics suite from Sect. 3 to gather data about the model. The results can be found in Table 5. They show that we have no inheritance at the current abstraction level of our model and also that the considered classes have no parts. Therefore the metrics regarding these aspects are not helpful for this analysis.

Table 5. The results of the metrics suite for the components of MaydayDevice
Class                  DIT NOC NOP NRI NPI CCS
ProcessorModule         0   0   0   4   4  16
AutomaticNotification   0   0   0   2   1   4
LocationModule          0   0   0   1   2   4
CommunicationsModule    0   0   0   2   2  32
ButtonBox               0   0   0   2   2   8
More interesting are the metrics for the provided and required interfaces and the associated state machines. The class with the highest values for NRI and NPI is ProcessorModule. This shows that it has a high fan-in and fan-out and is therefore fault-prone. The same module has a high value for CCS, but CommunicationsModule has an even higher one and is also fault-prone. Therefore we assign the complexity level high to these two components; the others have the level low.

The documentation in [10] shows that the main failures that occurred were failures in connecting to the call center (even when the cellular signal strength was good), no voice connection to the call center, inability to clear the system after usage, and failures of the cancel function. These main failures can be attributed to the component ProcessorModule, which is responsible for controlling the other components, and to CommunicationsModule, which is responsible for the wireless communication. Therefore our reliability analysis labeled the correct components with a high failure level.
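The "high value in at least one metric" rule from Sect. 3.2 can be replayed on the numbers in Table 5. The threshold used below (a value counts as high if it is the non-zero maximum of its column) is our own simplification of the informal criterion in the text; DIT, NOC and NOP are omitted since they are zero throughout.

```python
metrics = {  # NRI, NPI, CCS for the MaydayDevice components (Table 5)
    "ProcessorModule":       {"NRI": 4, "NPI": 4, "CCS": 16},
    "AutomaticNotification": {"NRI": 2, "NPI": 1, "CCS": 4},
    "LocationModule":        {"NRI": 1, "NPI": 2, "CCS": 4},
    "CommunicationsModule":  {"NRI": 2, "NPI": 2, "CCS": 32},
    "ButtonBox":             {"NRI": 2, "NPI": 2, "CCS": 8},
}

def complexity_levels(metrics):
    names = next(iter(metrics.values())).keys()
    maxima = {m: max(vals[m] for vals in metrics.values()) for m in names}
    return {comp: "high" if any(vals[m] == maxima[m] and vals[m] > 0 for m in names)
            else "low"
            for comp, vals in metrics.items()}

print(complexity_levels(metrics))
# ProcessorModule and CommunicationsModule come out as 'high', matching the assessment above
```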
6 Case Study: MOST Network Master
We further validated our approach on the basis of the project results of an evaluation of model-based testing [38]. A network controller of an infotainment bus in the automotive domain, the MOST Network Master [31], was modeled with the CASE tool AutoFocus [18]; test cases were generated from that model and compared with traditional tests. We use all faults found by all test suites in the following, but as we mainly have fault
information, we concentrate on fault-proneness rather than failure-proneness. AutoFocus is quite similar to UML 2.0, and therefore the conversion was straightforward. The composite structure diagram of the Network Master is shown in Fig. 7.
(Figure: composite structure diagram of the class NetworkMaster with the parts Divide, MonitoringMgr, RegistryMgr, RequestMgr, and Merge.)
Fig. 7. The composite structure diagram of the MOST Network Master
We omit further parts of the design, especially the associated state machines, for space and confidentiality reasons. The corresponding metrics are summarized in Table 6.

Table 6. The results of the metrics suite for the NetworkMaster
Class          DIT NOC NOP NRI NPI CCS
NetworkMaster   0   0   5   4   4   0
Divide          0   0   0   1   3  11
Merge           0   0   0   3   1   8
MonitoringMgr   0   0   0   2   1   0
RequestMgr      0   0   0   2   1  14
RegistryMgr     0   0   0   4   7 197
The data in the table shows that the RegistryMgr has the highest complexity in most of the metrics. Therefore we classify it as highly fault-prone. As described in [38], several test suites were executed against an implementation of the Network Master. 24 faults were identified by the test activities, 21 of which can be attributed to the RegistryMgr and 3 to the RequestMgr. There were no revealed faults in the other components. Hence, the high fault-proneness of the RegistryMgr did indeed result in a high number of faults revealed during testing.
7 Related Work
In the related area of real-time systems there has been a substantial amount of work regarding the usage of UML. For example, [44] describes constructs to facilitate the design of software architectures in this domain which are specified using UML. [22–24] contain several approaches to developing systems with various criticality requirements using UML. In particular, [13] discusses a pattern-based approach for using UML use cases for safety-critical systems; the focus is on the development of a testing strategy rather than model analysis. [36] discusses methods and tools for the checking of UML statechart specifications of embedded controllers. The focus there is on the use of statecharts and on efficient methods for automated checking and does not include the use of other UML diagrams or the inclusion of safety requirements using stereotypes. Also relevant is the work towards a formal semantics of UML (see the proceedings of related conferences, including the UML and FASE conferences).

[27] proposes the automated generation of fault trees based on the source code of software, which may be combined with fault trees based on the electronic circuit design of the hardware, allowing the software and the hardware fault trees to be composed into a fault tree of the system. It presents a prototype of a fault tree generation tool that is capable of generating fault trees based on C++ code. [5] presents an integrated tool environment where automatic transformations of UML models can capture dependability requirements. The proposed metrics suite as a basis to find fault-prone components can also be found in [47]. Lano et al. [26] propose a method to analyze object-oriented models in terms of safety and security, but do not consider complexity directly. In [48] an approach is proposed that includes a reliability model based on the software architecture. A complexity metric that is in principle applicable to models as well as to code is discussed in [6], but it also only involves static structure. Another approach related to safety-critical systems is proposed in [20]; it annotates UML models with safety-related information for further analysis. In [4] the cyclomatic complexity is suggested for most aspects of a design metric. Safety checklists have been proposed, for example, in [15]. [14] uses Z and Petri nets for modeling safety-critical systems.
8 Conclusions
In this chapter we propose means to incorporate reliability requirements into UML models. This is achieved using the UML profile mechanisms of stereotypes and tagged values. It makes these important requirements visible in the model, helps to encapsulate knowledge of reliability mechanisms, and simplifies their use by non-experts. Furthermore, by formalizing the requirements, checks can be performed. Having annotated a model with the defined stereotypes and tagged values, one can check the consistency of the requirements throughout the model. This lends itself to tool support for automatic checking. We describe a framework in which several such checks are implemented.

To provide a reasonable basis for the reliability analysis of a system, we also present a metrics suite for UML models, based on the work of [7], to measure the structural
342
Jan J¨urjens and Stefan Wagner
complexity of the models. Specifically, we use the numbers of provided and required interfaces as a metric for fan-in and fan-out, and lift the cyclomatic complexity [29] to the machine level to measure the complexity of state machines. The suite is then used to find fault-prone components in a system. Fault-proneness, i.e. the probability of containing faults, is not a good measure for reliability because it does not take into account the probability of the faults of leading to a failure. Therefore the operational profile [33, 34] is used to estimate the usage of a component and the combination of fault-proneness and usage yields the failureproneness of the component. This information can finally be used to check consistency with earlier defined reliability requirements in the model and to improve the efficiency of verification efforts. We finish our chapter with two case studies. One of these describes an Automatic Collision Notification system for automobiles that sends automatically an emergency call in case of a crash. It shows that the metrics suite, especially in combination with an operational profile, indeed can identify failure-prone components. Furthermore, the case study illustrates the interplay of the metrics suite with the annotation with stereotypes. The second case study investigates only the capabilities of the metrics suite. It confirms that the suite helps to identify fault-prone components.
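The following short sketch is illustrative only; the chapter does not give these exact formulas. It shows one plausible reading of lifting the cyclomatic complexity to state machines (transitions minus states plus two, by analogy with V(G) = E - N + 2 on control-flow graphs) and of combining fault-proneness with operational-profile usage into failure-proneness as a simple product. All numeric values are invented.

    # Minimal sketch (not the authors' tooling); formulas and numbers are
    # illustrative assumptions.

    def statemachine_complexity(num_states: int, num_transitions: int) -> int:
        # Analogue of McCabe's V(G) = E - N + 2, applied to the
        # state/transition graph of a UML state machine.
        return num_transitions - num_states + 2

    def failure_proneness(fault_proneness: float, usage_probability: float) -> float:
        # Combine the estimated probability of containing faults with the
        # usage probability from the operational profile; here simply
        # their product.
        return fault_proneness * usage_probability

    # A heavily used component with moderate fault-proneness can be more
    # failure-prone than a rarely used but highly fault-prone one.
    print(statemachine_complexity(num_states=7, num_transitions=15))      # 10
    print(failure_proneness(fault_proneness=0.3, usage_probability=0.8))  # 0.24
    print(failure_proneness(fault_proneness=0.7, usage_probability=0.05)) # 0.035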
Acknowledgments

We gratefully acknowledge the joint work with Martin Baumgartner, Christian Kühnel, Alexander Pretschner, Wolfgang Prenninger, Bernd Sostawa, and Rüdiger Zölch on the MOST Network Master. Furthermore we are grateful to Manfred Broy and Wolfgang Prenninger for commenting on a draft version. This work was partly sponsored by the DFG within the project InTime and the German Ministry for Science and Education within the Verisoft project.
References

1. C. Atkinson, C. Bunse, and J. Wüst. Driving component-based software development through quality modelling. In A. Cechich, M. Piattini, and A. Vallecillo, editors, Component-Based Software Quality, volume 2693 of LNCS, pages 207–224. Springer, 2003.
2. V.R. Basili, L.C. Briand, and W.L. Melo. A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Trans. Software Eng., 22(10):751–761, 1996.
3. R. Bharadwaj and C. Heitmeyer. Developing high assurance avionics systems with the SCR requirements method. In 19th Digital Avionics Systems Conference, 2000.
4. J.K. Blundell, M.L. Hines, and J. Stach. The Measurement of Software Design Quality. Annals of Software Engineering, 4:235–255, 1997.
5. A. Bondavalli, M. Dal Cin, D. Latella, I. Majzik, A. Pataricza, and G. Savoia. Dependability analysis in the early phases of UML based system design. Journal of Computer Systems Science and Engineering, 16:265–275, 2001.
6. D.N. Card and W.W. Agresti. Measuring Software Design Complexity. The Journal of Systems and Software, 8:185–197, 1988.
7. S.R. Chidamber and C.F. Kemerer. A Metrics Suite for Object Oriented Design. IEEE Trans. Software Eng., 20(6):476–493, 1994.
8. B. Dutertre and V. Stavridou. A model of noninterference for integrating mixed-criticality software components. In DCCA, San Jose, CA, January 1999.
9. Mayday: System Specifications. The ENTERPRISE Program, 1997. Available at http://enterprise.prog.org/completed/ftp/mayday-spe.pdf (October 2004).
10. Colorado Mayday Final Report. The ENTERPRISE Program, 1998. Available at http://enterprise.prog.org/completed/ftp/maydayreport.pdf (October 2004).
11. N.E. Fenton and S.L. Pfleeger. Software Metrics. A Rigorous & Practical Approach. International Thomson Publishing, 2nd edition, 1997.
12. M.H. Halstead. Elements of Software Science. Elsevier North-Holland, 1977.
13. K. Hansen and I. Gullesen. Utilizing UML and patterns for safety critical systems. In Jürjens et al. [22], pages 147–154.
14. M. Heiner and M. Heisel. Modeling safety-critical systems with Z and Petri Nets. In M. Felici, K. Kanoun, and A. Pasquini, editors, 18th International Conference on Computer Safety, Reliability and Security (SAFECOMP'99), volume 1698, pages 361–374, 1999.
15. C. Heitmeyer, R. Jeffords, and B. Labaw. Automated consistency checking of requirements specifications. ACM Trans. on Software Eng. and Methodology, 5(3):231–261, July 1996.
16. S. Henry and D. Kafura. Software Structure Metrics Based on Information Flow. IEEE Trans. Software Engineering, 7:510–518, 1981.
17. S. Henry and C. Selig. Predicting Source-Code Complexity at the Design Stage. IEEE Software, 7:36–44, 1990.
18. F. Huber, B. Schätz, A. Schmidt, and K. Spies. AutoFocus: A tool for distributed systems specification. In B. Jonsson and J. Parrow, editors, Formal Techniques in Real-Time and Fault-Tolerant Systems, 4th International Symposium, FTRTFT'96, volume 1135 of LNCS, pages 467–470, Uppsala, Sweden, Sept. 9–13 1996. Springer.
19. J. Jürjens. Critical systems development with UML and model-based testing. In The 22nd International Conference on Computer Safety, Reliability and Security (SAFECOMP 2003), Edinburgh, Sept. 23–26 2003. Full-day tutorial.
20. J. Jürjens. Developing safety-critical systems with UML. In P. Stevens, editor, UML 2003 – The Unified Modeling Language, volume 2863 of LNCS, pages 360–372, San Francisco, CA, October 20–24, 2003. Springer.
21. J. Jürjens. Secure Systems Development with UML. Springer, 2004.
22. J. Jürjens, V. Cengarle, E.B. Fernandez, B. Rumpe, and R. Sandner, editors. Critical Systems Development with UML, number TUM-I0208 in TU München Technical Report, 2002. UML'02 satellite workshop proceedings.
23. J. Jürjens, B. Rumpe, R. France, and E.B. Fernandez, editors. Critical Systems Development with UML, number TUM-I0317 in TU München Technical Report, 2003. UML'03 satellite workshop proceedings.
24. J. Jürjens, B. Rumpe, R. France, and E.B. Fernandez, editors. Third International Workshop on Critical Systems Development with UML, TU München Technical Report, 2004. UML'04 satellite workshop proceedings.
25. T.M. Khoshgoftaar and T.G. Woodcock. Predicting Software Development Errors Using Software Complexity Metrics. IEEE Journal on Selected Areas in Communications, 8(2):253–261, 1990.
26. K. Lano, D. Clark, and K. Androutsopoulos. Safety and Security Analysis of Object-Oriented Models. In SAFECOMP 2002, volume 2434 of LNCS, pages 82–93. Springer, 2002.
27. P. Liggesmeyer and O. Maeckel. Quantifying the reliability of embedded systems by automated analysis. In 2001 International Conference on Dependable Systems and Networks (DSN 2001), pages 89–96. IEEE Computer Society, 2001.
28. T. Mayer and T. Hall. A Critical Analysis of Current OO Design Metrics. Software Quality Journal, 8:97–110, 1999.
29. T.J. McCabe. A Complexity Measure. IEEE Trans. Software Engineering, 5:45–50, 1976.
30. A. Melton, D. Gustafson, J. Bieman, and A. Baker. A Mathematical Perspective for Software Measures Research. IEE/BCS Software Engineering Journal, 5:246–254, 1990.
31. MOST Cooperation. MOST Media Oriented System Transport—Multimedia and Control Networking Technology. MOST Specification Rev. 2.3. August 2004.
32. J.C. Munson and T.M. Khoshgoftaar. Software Metrics for Reliability Assessment. In Michael R. Lyu, editor, Handbook of Software Reliability Engineering, chapter 12. IEEE Computer Society Press and McGraw-Hill, 1996.
33. J.D. Musa. Software Reliability Engineering. McGraw-Hill, 1999.
34. J.D. Musa, A. Iannino, and K. Okumoto. Software Reliability: Measurement, Prediction, Application. McGraw-Hill, 1987.
35. Object Management Group. UML 2.0 Superstructure Final Adopted Specification, August 2003. OMG Document ptc/03-08-02.
36. Z. Pap, I. Majzik, and A. Pataricza. Checking general safety criteria on UML statecharts. In U. Voges, editor, SAFECOMP 2001, volume 2187 of LNCS, pages 46–55. Springer, 2001.
37. W. Prenninger and A. Pretschner. Abstractions for Model-Based Testing. In M. Pezze, editor, Proc. Test and Analysis of Component-based Systems (TACoS'04), 2004.
38. A. Pretschner, W. Prenninger, S. Wagner, C. Kühnel, M. Baumgartner, B. Sostawa, R. Zölch, and T. Stauner. One Evaluation of Model-Based Testing and its Automation. In Proc. 27th International Conference on Software Engineering (ICSE), 2005. To appear.
39. F. Randimbivololona. Orientations in verification engineering of avionics software. In R. Wilhelm, editor, Informatics – 10 Years Back, 10 Years Ahead, LNCS, pages 131–137. Springer, 2000.
40. L. Rosenberg, T. Hammer, and J. Shaw. Software Metrics and Reliability. In Proc. 9th International Symposium on Software Reliability Engineering (ISSRE'98). IEEE, 1998.
41. J. Rushby. Critical system properties: Survey and taxonomy. Reliability Engineering and System Safety, 43(2):189–219, 1994.
42. B. Selic. Physical programming: Beyond mere logic. In A. Sangiovanni-Vincentelli and J. Sifakis, editors, Embedded Software Second International Conference (EMSOFT 2002), volume 2491 of LNCS, pages 399–406, 2002.
43. B. Selic, G. Gullekson, and P.T. Ward. Real-Time Object-Oriented Modeling. John Wiley & Sons, 1994.
44. B. Selic and J. Rumbaugh. Using UML for modeling complex real-time systems. Available at http://www-106.ibm.com/developerworks/rational/library/, 1998.
45. S. Wagner. Efficiency Analysis of Defect-Detection Techniques. Technical Report TUM-I0413, Institut für Informatik, Technische Universität München, 2004.
46. S. Wagner. Reliability Efficiency of Defect-Detection Techniques: A Field Study. In Suppl. Proc. 15th IEEE International Symposium on Software Reliability Engineering (ISSRE'04), 2004.
47. S. Wagner and J. Jürjens. Model-Based Identification of Fault-Prone Components. Draft.
48. W.-L. Wang, Y. Wu, and M.-H. Chen. An Architecture-Based Software Reliability Model. In Proc. Pacific Rim International Symposium on Dependable Computing (PRDC'99), pages 143–150, 1999.
Author Index

Atkinson, Colin 1
Berbers, Yolande 209
Bunse, Christian 1
Chaudron, Michel R.V. 164
Cockburn, J. 296
Crnkovic, Ivica 232
da Silva, Leandro Dias 35
de Souza, P. 296
Desmet, Lieven 185
Dietrich, Christian 8
Furber, R.A. 296
Gross, Hans-Gerhard 1, 107
Grunske, Lars 82, 249
Halang, Wolfgang A. 8, 123
Hansson, Jörgen 59
Jahnke, J.H. 296
Janssens, Nico 185
Joosen, Wouter 185
Jürjens, Jan 320
Kaiser, Bernhard 249
Kircher, Michael 143
Lavender, M. 296
Lu, Shourong 123
Lüders, Frank 232
Lukkien, Johan J. 164
Mahieu, Tom 185
Maydl, Walter 82
Mayer, Nikolas 107
McNair, A. 296
Michiels, Sam 185
Muskens, Johan 164
Nadjm-Tehrani, Simin 59
Paredes Riano, Javier 107
Peper, Christian 1
Perkusich, Angelo 35
Purhonen, Anu 275
Reussner, Ralf H. 249
Rigole, Peter 209
Runeson, Per 232
Salzmann, Christian 143
Tešanović, Aleksandra 59
Van Baelen, Stefan 209
Vandewoude, Yves 209
Verbaeten, Pierre 185
Voelter, Markus 143
Wagner, Stefan 320
Zhang, Wei 8