Schahram Dustdar Daniel Schall Florian Skopik Lukasz Juszczyk Harald Psaier Editors
Socially Enhanced Services Computing Modern Models and Algorithms for Distributed Systems
Editors Schahram Dustdar Daniel Schall Florian Skopik Lukasz Juszczyk Harald Psaier TU Wien Distributed Systems Group Argentinierstr. 8 1040 Wien Austria
[email protected] [email protected] [email protected] [email protected] [email protected]
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machines or similar means, and storage in data banks. Product Liability: The publisher can give no guarantee for all the information contained in this book. The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. © 2011 Springer-Verlag/Wien. Printed in Germany. SpringerWienNewYork is a part of Springer Science + Business Media, springer.at. Typesetting: SPI, Pondicherry, India. Printed on acid-free paper. SPIN 80030707
With 41 Figures Library of Congress Control Number: 2011930925 ISBN 978-3-7091-0812-3 e-ISBN 978-3-7091-0813-0 DOI 10.1007/978-3-7091-0813-0 SpringerWienNewYork
Preface
Service-oriented architecture (SOA) and service technology are established in practice: many commercial products supporting service-based applications are available and have been in production use for years, many projects in companies have been completed successfully, and the results of these projects help people solve their business problems more easily. A plethora of standards (aka Web Services standards) has been specified to ensure the interoperability of service-based solutions, and many of these standards are implemented in commercial products. Last but not least, a large number of research projects have been completed or are under way that explore the advanced use of services and extend the corresponding concepts and technologies where needed.

Historically, service technology was developed to solve several problems in integrating platforms and applications; thus, services are typically realized by programs. Increasingly, however, services are used to directly support and represent real-world business activities. This results in the requirement to support services that are realized not by programs but directly by the work performed by human beings. For example, Amazon's Human Intelligence Tasks (aka Amazon Mechanical Turk) render human work as Web services. As a result of this demand, a standard for requesting the work of human beings via Web services has been created, namely WS-HumanTask. Historically, workflow systems have been the originators of requests for human work as well as the consumers of the results of such work; consequently, WS-HumanTask has been adopted by the BPEL4People standard to support human work in business processes that are specified using BPEL. Furthermore, BPMN 2.0 references WS-HumanTask to model human work in BPMN-based business processes. But humans often interact much more dynamically, in unforeseen manners, i.e., modeling such interactions via process models is not possible.
The collaborators in such dynamic interactions often do not know each other from the outset; thus, trust between the collaborators must be established (ideally automatically), especially in cases where the interactions correspond to business activities. Also, services provided by human beings and (programmed) services need to interact, resulting in mixed service environments.
This book is about this new and thrilling subject of mixed service environments. The authors have pioneered this area; thus, readers of this book will get first-hand information on the subject: researchers will find plenty of stimulation for their own research, practitioners will be able to judge the relevance of this area in their domain, and developers of corresponding middleware will get ideas about potential extensions of their systems. I had a lot of fun while reading this book and learned a lot. My hope is that this book will find many readers who benefit from it in the same way. I am quite confident that the area of mixed service environments is only at its beginning, i.e., looking into this area is relevant and important.

Stuttgart, March 2011
Frank Leymann
Contents

1  The Human-Provided Services Framework
   Daniel Schall, Hong-Linh Truong, and Schahram Dustdar
   1.1  Introduction
        1.1.1  Approach
        1.1.2  Contributions
   1.2  Related Work
   1.3  Interaction Models
        1.3.1  HPS Interactions
   1.4  HPS Framework
        1.4.1  Middleware Platform
        1.4.2  Data Collections
   1.5  Implementation
   1.6  Using the HPS Framework in Ad-Hoc Collaborations
        1.6.1  Defining Service Interfaces
        1.6.2  XML Collections of Services
        1.6.3  Personal Services
   1.7  Conclusion and Future Work

2  Unifying Human and Software Services in Web-Scale Collaborations
   Daniel Schall, Hong-Linh Truong, and Schahram Dustdar
   2.1  Introduction
   2.2  Web 2.0's Collaboration Landscape
   2.3  Motivating Use Cases
        2.3.1  Ad Hoc Contribution Requests
        2.3.2  User-Defined Processes
        2.3.3  Interactions with Formalized Processes
   2.4  HPS in Web-Scale Collaborations
        2.4.1  The Framework
        2.4.2  Ad Hoc Collaboration Example
        2.4.3  Process-Centric Collaboration Example
   2.5  Future Work

3  Modeling and Mining of Dynamic Trust in Complex Service-Oriented Systems
   Florian Skopik, Daniel Schall, and Schahram Dustdar
   3.1  Introduction
   3.2  Service-Oriented Collaborations
   3.3  Communication, Coordination, and Composition
        3.3.1  Social Trust in Collaborations
        3.3.2  The Cycle of Trust
   3.4  From Interactions to Social Trust
        3.4.1  Interaction Layer
        3.4.2  Personalized Trust Inference
        3.4.3  Trust Projection Layer
   3.5  Fuzzy Set Theory for Trust Inference
   3.6  Trust Model Definitions
        3.6.1  Fundamental Trust Model
        3.6.2  Temporal Evaluation
        3.6.3  Trust Projection
   3.7  Towards Flexible Compositions
        3.7.1  Community Balancing Models
        3.7.2  Request Delegation Patterns
   3.8  Architecture and Implementation
        3.8.1  Interaction Monitoring
        3.8.2  Activity Management
        3.8.3  Trust Model Administration
        3.8.4  Personal Trust Rules Management
        3.8.5  Social Network Management and Provisioning
        3.8.6  VieTECore
        3.8.7  Human Provided Services in the Expert Web
        3.8.8  Interaction Monitoring and Logging
        3.8.9  Metric Calculation
        3.8.10 Trust Provisioning
   3.9  Evaluation and Discussion
        3.9.1  Computational Complexity of Trust Management
        3.9.2  Interaction Balancing in Large-Scale Networks
   3.10 Background and Related Work
        3.10.1 Flexible and Context-aware Collaborations
        3.10.2 Interactions in Mixed Systems
        3.10.3 Behavioral and Social Trust Models for SOA
   3.11 Conclusion and Further Work

4  Script-Based Generation of Dynamic Testbeds for SOA
   Lukasz Juszczyk and Schahram Dustdar
   4.1  Introduction
   4.2  SOA Testbeds
        4.2.1  Related Research on SOA Testing
        4.2.2  Evolution of Genesis
   4.3  The Genesis2 Testbed Generator
        4.3.1  Basic Concepts and Architecture
        4.3.2  Extensible Generation of Testbed Instances
        4.3.3  Exploitation of Groovy Features
        4.3.4  Multicast Testbed Control
   4.4  QoS Testbed Scenario
   4.5  Discussion and Future Work
   4.6  Conclusion

5  Behavior Monitoring in Self-Healing Service-Oriented Systems
   Harald Psaier, Florian Skopik, Daniel Schall, and Schahram Dustdar
   5.1  Introduction
        5.1.1  Self-Healing Principles
        5.1.2  Contributions
   5.2  Flexible Interactions and Compositions
        5.2.1  Scenario
        5.2.2  Delegation Behavior
   5.3  Architecture Overview
        5.3.1  Mixed SOA Environment
        5.3.2  Monitoring and Adaptation Layer
   5.4  VieCure Framework
        5.4.1  Interaction Monitoring
        5.4.2  Event Trigger, Diagnosis, and Recovery Actions
   5.5  Regulation of Behavior
        5.5.1  Trigger
        5.5.2  Diagnosis
        5.5.3  Recovery Actions
        5.5.4  Sink Behavior
        5.5.5  Factory Behavior
        5.5.6  Transient Behavior
   5.6  Simulation and Evaluation
        5.6.1  Simulation Setup
        5.6.2  Results and Discussion
   5.7  Related Work
   5.8  Conclusion and Outlook

6  Runtime Behavior Monitoring and Self-Adaptation in Service-Oriented Systems
   Harald Psaier, Lukasz Juszczyk, Florian Skopik, Daniel Schall, and Schahram Dustdar
   6.1  Introduction
   6.2  On Self-Adaptation in Collaborative SOA
   6.3  Profile Similarity and Dynamic Trust
        6.3.1  Interest Profile Creation
        6.3.2  The Interplay of Interest Similarity and Trust
   6.4  Design and Architecture
        6.4.1  Genesis2 Testbed Generator Framework
        6.4.2  Adaptation Framework
   6.5  Behavior Monitoring and Self-Adaptation
   6.6  Experiments
        6.6.1  Scenario Overview
        6.6.2  Experiment Setup
        6.6.3  Result Description
   6.7  Related Work
   6.8  Conclusion and Outlook

Index
Introduction
This book aims at introducing the main concepts of a novel field, which we refer to as "Socially Enhanced Services Computing". This area conducts research at the intersection of Services Computing, Social Computing, Crowd Computing, and Cloud Computing. Social Computing is gaining increasing momentum, but is perceived mainly as a vehicle to establish and maintain social (private) relations and to pursue political and social interests. Not surprisingly, social computing lacks substantial uptake in enterprises. Collaborative computing as such is clearly well established (as a niche); however, there is no tight integration of social and collaborative computing approaches into mainstream problem solving in and between enterprises or teams of people, nor in the area of Services Computing.

In this book we present a fresh look at this problem through a collection of our papers, discussing in some detail how to integrate people, in the form of human-provided computing, and software services into one composite system which can be modeled, programmed, and instantiated on a large scale.

This volume contains previously published papers by the editors of this book. We believe that the selected papers discuss the fundamental aspects of this area. It is clear, however, that as this research field evolves, novel contributions need to be taken into account. As one recently published example, we would like to refer to the concept of the "Social Compute Unit" [1]. This paper (not contained in this book) builds a conceptual social construct on top of Human-Provided Services and can be seen as a natural extension of the papers presented in this book.

In the first chapter of this book [2], we present the Human-Provided Services framework, which allows users to render their skills and capabilities as Web services.
The contribution of the framework is to support the user technically in doing so, but also to enable novel and complex interaction models between such services, thereby establishing large-scale solutions. The second chapter [3] discusses in more detail how human-provided and software services can actually be composed into such large-scale compositions. The third chapter [4] introduces a fundamental concept utilized in our work: trust. Dynamic trust concepts and their relationship to Human-Provided Services, as well as their link to software-based services, are presented in detail. Our assumption is that, as trust between
humans evolves over time, offered services and their interactions and access rights, amongst other things, need to take that dynamism into account. We show that dynamic trust is a powerful concept which can be used for service composition, thus enabling higher levels of automation in systems composed of Human-Provided Services and software services. Chapter 4 [5] addresses a major problem in most service-oriented architectures, namely how to test large-scale ecosystems of services. In today's literature, most approaches consider (only) the testing of individual Web services. In our approach we present a tool (G2) as well as an underlying approach and framework for generating large-scale dynamic testbeds, which also consider the plethora of service components that are typically part of large deployments, such as registries and Enterprise Service Buses, amongst others. Chapter 5 [6] discusses one important ingredient: how to monitor the interaction behavior of deployed services on the one hand, and how to cater for self-healing support on the other. Finally, Chapter 6 [7] shows how runtime behavior monitoring and self-adaptation can actually be achieved, highlighting our concepts as well as providing an implemented software prototype.

We hope that this book captures your imagination and enthusiasm for this novel research area and convincingly discusses some of the required technical background and conceptual foundations for these types of modern distributed systems.
References

1. Dustdar, S., Bhattacharya, K.: The social compute unit. IEEE Internet Computing, May/June 2011, pp. 64–69 (2011)
2. Schall, D., Truong, H.-L., Dustdar, S.: The human-provided services framework. In: IEEE 2008 Conference on Enterprise Computing, E-Commerce and E-Services (EEE '08), July 21–24, 2008, Crystal City, Washington, DC, USA (2008)
3. Schall, D., Truong, H.-L., Dustdar, S.: Unifying human and software services in web-scale collaborations. IEEE Internet Comput. 12(3), 62–68 (2008)
4. Skopik, F., Schall, D., Dustdar, S.: Modeling and mining of dynamic trust in complex service-oriented systems. Inf. Syst. 35(7), 735–757 (2010). Elsevier
5. Juszczyk, L., Dustdar, S.: Script-based generation of dynamic testbeds for SOA. In: 8th IEEE International Conference on Web Services (ICWS 2010), July 5–10, 2010, Miami, USA (2010)
6. Psaier, H., Skopik, F., Schall, D., Dustdar, S.: Behavior monitoring in self-healing service-oriented systems. In: 34th Annual IEEE Computer Software and Applications Conference (COMPSAC 2010), July 19–23, 2010, Seoul, South Korea. IEEE (2010)
7. Psaier, H., Juszczyk, L., Skopik, F., Schall, D., Dustdar, S.: Runtime behavior monitoring and self-adaptation in service-oriented systems. In: 4th IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO 2010), September 27–October 1, 2010, Budapest, Hungary. IEEE (2010)
Chapter 1
The Human-Provided Services Framework Daniel Schall, Hong-Linh Truong, and Schahram Dustdar
Abstract The collaboration landscape evolves rapidly by allowing people to participate in ad-hoc and process-centric collaborations. Thus, it is important to support humans in managing highly dynamic and complex interactions. The problem with managing interactions today is that humans can neither specify different interaction interfaces for various collaborations nor indicate their availability to participate in collaborations. This work introduces the Human-Provided Services (HPS) framework, which allows users to provide services based on their skills and expertise. Such services can be used by human actors and software services in both ad-hoc and process-centric collaborations. With the HPS framework, people can offer multiple services and manage complex interactions, while requesters can find the right experts and available users for performing specific tasks. In this work, we present the HPS middleware, which is the core of the HPS framework, and show how HPSs can be used in Web-scale ad-hoc collaboration scenarios.
D. Schall (✉) · H.-L. Truong · S. Dustdar
Distributed Systems Group, Vienna University of Technology, Argentinierstr. 8/184-1, 1040 Vienna, Austria
e-mail: [email protected]; [email protected]; [email protected]

© 2008 IEEE. Reprinted, with permission, from Schall, D., Truong, H.-L., Dustdar, S. (2008) The Human-Provided Services Framework, IEEE 2008 Conference on Enterprise Computing, E-Commerce and E-Services (EEE '08), July 21–24, 2008, Crystal City, Washington, DC, USA

S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0_1, © Springer-Verlag/Wien 2011

1.1 Introduction

Today's collaboration landscape has changed by allowing a large number of users to communicate and collaborate using Web-based platforms and messaging tools. Users collaborate with each other by sharing content that is made available on the Web. Also, collaborations within organizations are no longer closed ecosystems, as collaborations and interactions span multiple organizations or business units that
are scattered around the globe. However, it becomes increasingly challenging to manage collaborations that involve many people and comprise a large set of exchanged messages. In addition, users demand access to collaboration resources from pervasive devices in an always-on fashion. To address these challenges, collaboration platforms must support the user in managing complex interactions that span multiple organizations, hide the complexity caused by different message formats and diverse types of collaboration resources, and furthermore support different devices.

This work introduces the Human-Provided Services (HPS) framework, which lets users publish their capabilities and skills as services. Using the HPS framework, users are able to define and provide services for different collaborations. HPS allows users to control their interactions beyond the simple exchange of messages by defining multiple service interfaces and interaction rules to manage complex interactions. The novelty of HPS is that collaborations take place in a service-oriented framework, thus enabling a dynamic mix of human and software services. User- and service-related information is maintained in a service registry, which allows HPSs to be discovered by both human collaborators and software services (processes). Thus, HPS allows business processes that require human input or intervention to interact with humans using standardized Web services protocols, making the HPS framework a versatile collaboration and interaction framework.
1.1.1 Approach

Our approach is to build an HPS middleware platform that integrates Web technologies and Web services, with the goal of providing a framework that enables humans to publish services, thereby allowing humans and software to find and interact with HPS users. Figure 1.1 shows our approach, which comprises the specification and deployment of services as well as service discovery and interactions with HPSs. To this end, the HPS framework must support the following features:

• Ability to define services. Anyone has to be able to define services and corresponding interfaces, or simply reference or copy an existing interface and reuse or modify it. In step 1, users specify profile information and define service interfaces.
• Specification of interactions. Users must be able to specify their interaction protocols. Customized protocols allow interactions to be managed in a given context, that is, in a collaboration through services.
• User-centric service publishing/provisioning. This encompasses the ability to easily publish and interact with services. In step 2, users deploy and register personal services.
• Discovery and interactions with users/processes. Processes and human actors must be able to discover HPSs. HPS simplifies interactions with user-provided services by abstracting from service location and deployment. In step 3, requesters discover humans and services and interact with them through the HPS middleware.
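The three-step flow above can be sketched as a tiny in-memory registry. Everything here — class names, method names, and the keyword-matching rule — is an illustrative assumption for exposition, not the actual HPS middleware API:

```python
from dataclasses import dataclass, field

@dataclass
class PersonalService:
    """A user-provided service: an owner, an interface name, and skill keywords (step 1)."""
    owner: str
    interface: str
    skills: set = field(default_factory=set)

class ServiceRegistry:
    """Toy stand-in for the HPS registry: publish (step 2) and discover (step 3)."""
    def __init__(self):
        self._services = []

    def register(self, service):
        # Step 2: a user deploys and registers a personal service.
        self._services.append(service)

    def discover(self, *required_skills):
        """Step 3: return services whose skill keywords cover all requested ones."""
        needed = set(required_skills)
        return [s for s in self._services if needed <= s.skills]

registry = ServiceRegistry()
registry.register(PersonalService("alice", "DocumentReview", {"review", "legal"}))
registry.register(PersonalService("bob", "Translation", {"translate", "german"}))
print([s.owner for s in registry.discover("review", "legal")])  # ['alice']
```

A requester thus never addresses a concrete service endpoint directly; matching by declared skills is what lets the middleware abstract from service location and deployment.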
Fig. 1.1 The HPS approach
1.1.2 Contributions

Our contributions center around the definition of a novel framework that uses Web services in interactions and collaborations between people, or between people and software services. This work discusses the design, implementation, and evaluation of the HPS middleware platform. The goal of this work is to provide insight into the various components and services provided by the middleware. Legal, privacy, and security issues are out of scope.

Structure of this work: Interaction models applicable to human collaboration, ranging from ad-hoc to process-centric, are presented in Sect. 1.3. The HPS system architecture is detailed in Sect. 1.4, followed by a discussion of implementation aspects in Sect. 1.5. Section 1.6 describes how to use the HPS framework in ad-hoc collaboration scenarios.
1.2 Related Work

This work tackles several issues related to services on the Web and to human computation. In the following, we discuss significant related work in those areas. Human computation is a technique that lets humans solve tasks which cannot be solved by computers (see [5] for an overview). An application of genetic algorithms has been presented in [7]. The computer asks a person or a large number of people to solve a problem and then collects their solutions (e.g., see games that matter
[1]). Human computation applications can be realized in the HPS framework, as people are able to provide user-defined services. Additionally, the HPS framework allows users to manage their interactions. Web-based platforms inspired by human computation include, for example, Yahoo! Answers1 [10] and Amazon Mechanical Turk2, which employ human tasks that are claimed and processed by users. These platforms have several limitations which HPS addresses: (1) how to manage interactions, (2) how to find the right person (expert), and (3) how users can define their availability to participate in collaborations. Recently, specifications have been released which allow processes (i.e., BPEL) to be extended with human interactions, as defined in the WS-HumanTask specification [2]. Additionally, the work presented in [11] aimed at integrating humans into processes. The HPS framework can be used in such process-centric collaborations as well (e.g., a human task in a process). However, the HPS framework supports user-defined services for both ad-hoc and process-centric collaborations, and also allows humans (services) to be discovered. Expert-finder systems [3] commonly utilize semantic technologies to express users' expertise and skills as ontologies. In the HPS framework, we focus on interactions between humans and software using Web services technologies. However, the HPS framework can be extended with semantic technologies, for example, to express skills and social relations using ontologies.
1.3 Interaction Models

The interaction models in collaboration range from ad-hoc (informal) to predefined, formalized process models (see [4, 9]). Table 1.1 gives an overview of these different models. In the following, we discuss the concepts used to control interactions. In this work, we show how the HPS framework can be used in ad-hoc collaboration scenarios. The only requirement for users is to define human activities (at design time), which can be automatically mapped to specific Web services and actions. During the actual collaboration (at run time), requests to perform certain activities are sent to HPSs as XML documents that parameterize the request. In contrast to workflow-based systems, interactions need not follow predefined process models. In HPS, there is a distinction between a task announcement and an interaction control task, both using the Human Task structure.

Task announcements. Requesters have the ability to create a Human Task and to specify the number of available tasks. Tasks can be linked to HPS service categories to express which service (i.e., which expert) is needed to process the given task. This case is indicated by the link between Human Task and the Interaction Interfaces in Fig. 1.2a (Listing 1.1 shows an actual XML example of
1 http://answers.yahoo.com/
2 http://www.mturk.com/
1 The Human-Provided Services Framework
Table 1.1 Interaction models in human collaboration

Ad-hoc: Interactions are ad-hoc if there is no predefined control flow associated with an interaction. For example, interactions between requesters and HPS users simply take place by exchanging messages.

State-awareness: Tasks can be used to control the status of an interaction. Requesters have the ability to impose certain constraints on tasks, such as start-times (when users should start processing tasks) or deadlines (the maximum time by which tasks have to be finished).

Process-centric: Process-centric collaboration can be established by defining interaction rules. Tasks can be split into sub-tasks and forwarded to other people. Multiple HPSs could potentially be involved in interactions to solve complex problems.
Fig. 1.2 (a) Conceptual model of HPS interactions. (b) Example interaction flow
task announcements). Linking task announcements to services is accomplished by tagging task descriptions with keywords. Tasks can be linked to a logical People Group to specify conditions on the users that should be able to claim and process the task (e.g., user groups in an organization's human resources directory).
Interaction control tasks. If tasks are used in interactions, defined by using Human Tasks in Fig. 1.2a, requesters are aware of the state of a given request (e.g., accepted, in-progress, or completed). Task-state information can be retrieved via pull mechanisms or, alternatively, various actions can be automatically triggered, such as sending Notifications upon state changes.
Interactions. HPS interactions comprise a multitude of Messages in different formats (e.g., indicated as Email or SOAP messages in Fig. 1.2a). In addition, interactions generally comprise notifications, tasks, and the people/services that are involved in an interaction. Discussions on complex interaction flows are not in the scope of this work.
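The control-task states mentioned above (e.g., accepted, in-progress, completed) can be pictured as a small state machine with notification hooks. The sketch below is illustrative only: the state names, transitions, and listener mechanism are simplifying assumptions, not the actual WS-HumanTask state model.

```python
# Simplified Human Task lifecycle with push-style notifications.
# States and transitions are illustrative; WS-HumanTask defines a
# richer state model.

TRANSITIONS = {
    "created": {"claim": "accepted"},
    "accepted": {"start": "inprogress"},
    "inprogress": {"complete": "completed", "fail": "failed"},
}

class HumanTask:
    def __init__(self, name, listeners=None):
        self.name = name
        self.state = "created"
        self.listeners = listeners or []  # called on every state change

    def fire(self, action):
        try:
            new_state = TRANSITIONS[self.state][action]
        except KeyError:
            raise ValueError(f"'{action}' not allowed in state '{self.state}'")
        self.state = new_state
        for notify in self.listeners:  # notification upon state change
            notify(self.name, new_state)

events = []
task = HumanTask("review-document", [lambda n, s: events.append((n, s))])
task.fire("claim")
task.fire("start")
task.fire("complete")
```

A requester could equally poll `task.state` (the pull mechanism mentioned above) instead of registering a listener.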
1.3.1 HPS Interactions

In HPS, Web services are used to define interaction interfaces to humans. Typical interaction patterns found in the Web services domain, such as the synchronous exchange of messages, are not sufficient for modeling human interactions. Therefore,
D. Schall et al.
we introduce a new human-based service interaction model, allowing users to deal with requests in, for example, offline mode, or using different devices to process requests. Since today's collaboration landscape increasingly shifts toward pervasive collaboration and interactions, a system supporting HPSs must give users the flexibility to deploy user-defined services on a variety of devices (e.g., mobiles). Such devices are not always online or connected to the network. Thus, the HPS framework allows requests to be saved and retrieved whenever the users are available. An exemplary interaction flow is shown in Fig. 1.2b. Indeed, the number of actors involved in an interaction can be greater than two, and multiple tasks can be defined. As mentioned before, and as in most collaboration systems, interactions encompass a large number of messages in various formats (see HPS FS in Sect. 1.4, an XML-based file system that has been designed to accommodate those messages). Requests are sent toward the HPS middleware, which allows messages to be exchanged either synchronously or asynchronously. Requests can be forwarded to the corresponding user instantaneously (e.g., if the user is available) or saved in an XML-based repository.
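The store-and-forward behavior described above (saving requests for offline users and delivering them once the user becomes available) can be sketched as follows; class and method names are invented for illustration and do not reflect the actual middleware API.

```python
# Minimal store-and-forward sketch: requests to an offline HPS user
# are queued in a repository and flushed once the user comes online.
from collections import defaultdict, deque

class Middleware:
    def __init__(self):
        self.online = set()
        self.inbox = defaultdict(deque)    # per-user message repository
        self.delivered = defaultdict(list)

    def route(self, user, message):
        if user in self.online:
            self.delivered[user].append(message)   # immediate delivery
        else:
            self.inbox[user].append(message)       # saved for later retrieval

    def set_online(self, user):
        self.online.add(user)
        while self.inbox[user]:                    # flush queued requests
            self.delivered[user].append(self.inbox[user].popleft())

mw = Middleware()
mw.route("alice", "<request>review</request>")     # alice is offline: queued
mw.set_online("alice")                             # queued request delivered
```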
1.4 HPS Framework

HPS allows a seamless integration of human actors into service-oriented systems and collaborations that may require human input in process-centric collaborations. However, in contrast to existing work and specifications such as WS-HumanTask [2], people have the ability to define a set of user-provided services that can be used in ad-hoc collaborations and interactions between humans. The next section describes the HPS middleware platform.
1.4.1 Middleware Platform

HPS Middleware Interfaces. The middleware offers interfaces for the discovery of services and for interactions with HPS users. The hal interface (HPS Access Layer) is a REST interface that routes requests in various formats to the corresponding user/service. An atom interface can be used to discover services by retrieving Atom feeds3 that contain service-related information. Additionally, service lookup can be performed using the soap interface, facilitating the integration of the HPS framework with other Web services-based platforms.
HPS Invocation. Processes requests and sends messages in the appropriate format toward the HPS user. By specifying user or group identifiers (e.g., email addresses or distribution lists) and a service name, HPSs can be located and an interaction initiated
3 Atom Syndication Format - RFC 4287.
by directing the request toward the access layer (hal). Every request is then passed through a validation phase in which an authorization check is performed. The user can specify white/black lists as well as routing and interaction rules. White/black lists are used, for example, to prevent certain users from interacting with HPS services. The hal interface routes service requests to the desired service, thereby abstracting from actual service endpoints and service locations. Requests can be delivered to the corresponding service immediately, or through an offline interaction as illustrated in Fig. 1.2b. In the latter case, requests are saved in the Message Repository.
HPS FS. Manages a set of collections of diverse types of XML-based information. Collections in HPS are conceptually designed as a native XML-based file system that allows artifacts, messages, tasks, and user- and service-related information to be managed and retrieved. An XML database stores and manages the XML collections. XML documents can be retrieved by using XQuery to filter and aggregate information.
HPS Interaction Component. HPS users may define a set of interaction rules to manage their collaborations (based on a set of provided services). The HPS framework does not mandate which rules users can specify. The framework allows users to specify rule languages, which can be mapped into the Rules Engine. Therefore, rules can be tailored to the needs of specific domains by creating Domain Specific Languages (DSLs) to describe interaction models. For example, see [8] for related work on domain interaction models.
Interaction Analysis. Human and service interactions are recorded, archived, and analyzed. This information is used for ranking services based on a set of human metrics such as task processing performance, availability, or an expertise rank based on the interaction network structure. Ranking algorithms help to recommend the most relevant HPS and the right expert to perform a given task/request.
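As a rough illustration of such ranking, the following sketch scores services by a weighted combination of human metrics; the metric names and weights are assumptions made for the example, not values defined by the framework.

```python
# Illustrative ranking of HPSs by a weighted mix of human metrics
# (task performance, availability, expertise rank). Weights are
# assumed for the sketch only.

def score(svc, weights=(0.5, 0.2, 0.3)):
    w_perf, w_avail, w_exp = weights
    return (w_perf * svc["task_performance"]
            + w_avail * svc["availability"]
            + w_exp * svc["expertise_rank"])

services = [
    {"name": "reviewer-a", "task_performance": 0.9,
     "availability": 0.4, "expertise_rank": 0.8},
    {"name": "reviewer-b", "task_performance": 0.7,
     "availability": 1.0, "expertise_rank": 0.6},
]
# Most relevant service first.
ranked = sorted(services, key=score, reverse=True)
```

In the framework, such metric values would be derived from the recorded interaction logs rather than supplied by hand.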
1.4.2 Data Collections

Collections are managed by the HPS FS as XML documents. These collections can be manipulated by using the Atom Publishing Protocol; the standard protocol model includes get, post, put, delete, and head operations that allow resources/messages to be retrieved and updated.
User Profile and Metrics. Profiles are used to manage and store user-related information, described in XML. HPS users can specify basic information or simply import personal data that is already available (e.g., in vCard format). We categorize User Profile information into hard-facts and soft-facts. Hard-facts comprise information typically found in resumes, such as education, employment history (including organizational information and positions held by the user), and professional activities. Soft-facts are represented as competencies. A competency comprises weights (skill level of a user), classification (description of an area or link to a taxonomy), and evidence (external sources acting as references or recommendations). Soft-facts can
be automatically generated by the HPS middleware based on users' activities to indicate a user's expertise or skill level.
Service Registry. The registry maintains a number of XML documents describing services and allowing human and software services to be discovered. This information includes a set of service definitions, the list of available services, and information regarding a specific service provided by a user. A detailed discussion of these XML collections is given in Sect. 1.6.

<feed xmlns="http://www.w3.org/2005/Atom">
  <title>HPS Tasks</title>
  <updated>2007-09-24T18:30:02Z</updated>
  <id>urn:uuid:63a99c80-d399-12d9-b93C-0003939e0a</id>
  <entry>
    <title>HPS Public Tasks</title>
    <updated>2007-09-19T18:30:02Z</updated>
    <id>urn:uuid:1223c696-cfb8-4ebb-aaaa-80da344ea6</id>
    <category term="Review"/>
  </entry>
</feed>
Listing 1.1 Human task-to-service mapping
Task Registry. Manages Human Tasks, which can be either public tasks, used to advertise the need for HPS users to work on tasks, or private tasks that are added to interactions as control elements. Public tasks are associated with an interaction upon claiming and processing of tasks. In addition, tasks can be added to an interaction without defining public tasks beforehand. Listing 1.1 shows an example of a task announcement. The announcement contains a list of public tasks that reference the type of HPS service that should process the available tasks. In this example, task-related information is encapsulated as elements in Atom feed entries. The category element can be used to add tags to Human Tasks.
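To illustrate how a requester might consume such a task announcement, the following sketch filters a made-up Atom feed for entries tagged with a given category term; the feed content and category value are hypothetical.

```python
# Discover public tasks in a task-announcement Atom feed by category
# tag, using only the standard library.
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

feed_xml = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>HPS Public Tasks</title>
    <category term="Review"/>
  </entry>
  <entry>
    <title>Other Tasks</title>
    <category term="Translation"/>
  </entry>
</feed>"""

def tasks_for_category(xml_text, term):
    """Return titles of entries carrying a category with the given term."""
    root = ET.fromstring(xml_text)
    return [entry.findtext(f"{ATOM}title")
            for entry in root.iter(f"{ATOM}entry")
            if any(c.get("term") == term
                   for c in entry.findall(f"{ATOM}category"))]

matches = tasks_for_category(feed_xml, "Review")
```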
1.5 Implementation

The HPS middleware comprises the implementation of the XML-based file system (HPS FS) and XQuery-based filtering and retrieval of XML documents through the implementation of the XQuery API for Java (XQJ). Furthermore, the atom interface, which supports the Atom Protocol Model to manipulate resources, and the hal interface, which supports complex interactions with HPSs and the dispatching of
messages, are implemented. The HPS Interaction component is currently under development. We utilize the JBoss Drools4 system, which supports graphical Web-based editing tools with which HPS users can define interaction rules. User interfaces (e.g., Web browser clients) allow services to be discovered and enable service requesters to interact with HPS users. At the implementation level, we use a set of state-of-the-art Web 2.0 technologies such as AJAX to enable asynchronous interactions between the client and the middleware. In addition, context information can be used in the service discovery process, for example, by filtering XML documents based on users' availability.
Service Deployment. Services are deployed in the hosting environment, for example on PCs, Smartphones, or PDAs. This deployment strategy allows the HPS framework to scale to a large audience without being restricted to any specific technology. The framework supports the option to deploy services in a platform-independent manner. In our experiments, we have used an Apache Axis2 Web services environment embedded in an Equinox OSGi5 container. This solution is well suited for PCs, but not for mobile devices such as Smartphones. For resource-constrained devices, a combination of OSGi technology and SOAP servers with a small footprint can be used. Specifically for the Windows platform, the Windows Communication Foundation (WCF) can be used to develop Web services for Windows XP and Vista. We have developed SOAP- and REST-based (XML and JSON) services using the API provided by WCF.
User Interface Aspects. In the service discovery phase, the requester (client) receives an XML document from the middleware (registry). In Listing 1.2 and Listing 1.3 we see an example where user interfaces are represented using XForms technology6. XForms are automatically generated by the HPS framework based on WSDL descriptions (see the category and term specification in Listing 1.3).
Listing 1.2 shows the model specifying SOAP as the interaction message format and the HPS middleware access layer as the submission target.
Listing 1.2 SOAP interaction model
Listing 1.3 shows the actual interface representation that allows human requesters to insert the request parameters and also allows request messages to be rendered on various devices. The switch/case construct defines the behavior of the form: request
4 http://labs.jboss.com/drools/
5 http://www.osgi.org/osgi technology/
6 http://www.w3.org/MarkUp/Forms/
and response representation. These forms are platform- and device-independent and can be displayed, for example, on mobile devices or in a standard Web browser using a suitable forms plugin.
Input definitions... ... Submission output.
Listing 1.3 Snippet request input form
The actual instance model – i.e., the request message – is an XML document (SOAP envelope) as defined in Listing 1.2, which is dispatched by hal upon submission (submit-envelope).
1.6 Using the HPS Framework in Ad-Hoc Collaborations

We discuss the steps required to publish HPSs and show how requesters discover and interact with personal services using the middleware interfaces for HPS interactions. However, due to space limits, process-centric collaboration scenarios and interactions with (business) processes are not addressed. There are three phases in ad-hoc collaborations:
Service Definition. The user specifies messages and collaborative activities (at a high level) using the Management Tools provided by the middleware (see Fig. 1.3). Based on messages and activities, the middleware automatically generates low-level HPS interfaces using interface description languages such as WSDL or WADL. These descriptions are deployed as XML documents in the Service Registry.
Service Discovery. Requesters discover human and software services by browsing/filtering XML documents that contain the relevant users/services.
HPS Interaction. Requesters interact with services by issuing requests toward the middleware. Requests can be converted by the Protocol Handler to match different service interface types. For example, messages that are encoded in JSON notation can be converted to XML messages, and back. However, the Protocol Handler does not support message conversion from, for example, SOAP/XML to REST/JSON notation. Messages are routed by the Message Router to the corresponding user-provided service or saved in the XML Message Repository. The
Fig. 1.3 HPS middleware platform and architecture
actual interaction with the HPS, i.e., receiving and processing the request, can take place depending on the user's context (e.g., availability or specified interaction rules).
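The JSON-to-XML conversion attributed to the Protocol Handler can be sketched for the simplest case (flat, string-valued messages); this is not the framework's actual conversion logic, merely an illustration of the idea.

```python
# Sketch of the Protocol Handler's format conversion: a flat JSON
# request body is rewritten as an XML message, and back. A real
# handler would also cover nesting, types, and attributes.
import json
import xml.etree.ElementTree as ET

def json_to_xml(json_text, root_tag="request"):
    root = ET.Element(root_tag)
    for key, value in json.loads(json_text).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_json(xml_text):
    root = ET.fromstring(xml_text)
    return json.dumps({child.tag: child.text for child in root})

msg = '{"service": "Review", "document": "report.pdf"}'
xml_msg = json_to_xml(msg)        # JSON request rewritten as XML
round_trip = xml_to_json(xml_msg) # and converted back
```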
1.6.1 Defining Service Interfaces

An HPS interface definition is an XML document that contains four entries (see also the XML examples in Fig. 1.4).
Addressing information of personal services describes how to interact with a particular user providing the service. This information is used by requesters to locate and interact with personal services using the hal interface. Figure 1.4(3) shows the addressing information entry. The Web Services Resource Catalog (WS-RC) meta endpoint definition7 is used to express addressing information of personal services. WS-RC endpoint descriptions can be annotated using mex elements to describe meta data, for example taxonomies, that are applicable to all personal services of the same service type, regardless of the specific underlying protocol (SOAP or REST).
The ParameterMap element defines tokens in the service address, for example a uri that is replaced at run-time by HPS user information (e.g., user id or Email
7 Namespaces have been abbreviated for readability.
Fig. 1.4 HPS discovery and interaction
address). The entry in Fig. 1.4(1.a) shows an excerpt of the WSDL definition of an HPS, which contains a link to the WSDL file and a meta data section defining the service interface (i.e., the available human activities).
Service interface definitions are shown in Fig. 1.4(1.a), (1.b), and (1.c). Entry (1.a) shows a WSDL interface definition encapsulated in an Atom feed entry. Entries (1.b) and (1.c) show REST interface definitions using the Web Application Description Language (WADL) [6]. Fig. 1.4(1.b) denotes the interface that defines messages in
XML format (the full entry has been omitted) and (1.c) shows an entry for a REST/JSON service defined in WADL.
The technology choice depends on the specific application domain of user-provided services. At this point, the HPS middleware supports formats including SOAP/XML and REST/XML and the corresponding interface descriptions, which can be annotated with human-related information. As an example, Fig. 1.4(1.b) shows the definition of a REST HPS interface that defines the usage of JSON as the message format. This technology choice facilitates HPS service interactions in Web browser-based client environments. A request can be created by using JavaScript to issue JSON requests toward the HPS middleware.
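The run-time address resolution described in Sect. 1.6.1, where ParameterMap tokens in a service address are replaced by HPS user information, can be sketched as a simple template substitution; the template syntax, host name, and keys are illustrative assumptions.

```python
# Sketch of ParameterMap-style token substitution: placeholders in a
# service-address template are filled in at run-time with user
# information. The URL pattern and keys are made up for this example.
import string

template = string.Template("http://hps.example.org/hal/${userid}/${service}")

def resolve(template, user_info):
    """Replace all tokens in the address template with user data."""
    return template.substitute(user_info)

endpoint = resolve(template, {"userid": "u42", "service": "Review"})
```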
1.6.2 XML Collections of Services

In this particular scenario, shown in Fig. 1.4, requesters are able to retrieve a list of services encoded as Atom feeds. Feeds have been designed for access to (and subscriptions to) content that is frequently updated. Thus, requesters can subscribe to different categories of HPSs; this content changes frequently because HPSs rely on the availability of human actors. Category elements describe the type of available service interaction models (see Fig. 1.4(2)). Note that, for scalability reasons, XML collections of services can be created for specific categories, which can be distributed and hosted by different Service Registries. In addition, multiple copies of service collections can be stored on different servers and replicated.
1.6.3 Personal Services

Personal services are user-defined services; a user can provide different services suitable for various collaborative activities. Example services are a "document review" service, an "expert opinion" service, or a "news reporter" service, to name a few. These services can be used in various collaboration scenarios, and for complex problems, services can be composed by defining processes that span multiple users. However, the actors that should execute activities/tasks do not need to be determined beforehand, as personal services can be discovered on demand, thereby following a service-oriented approach to collaboration. Given a requester's (consumer's) query to discover services, the framework helps to find and select the most relevant personal services by (1) matching services that satisfy a given query, (2) filtering services based on context (e.g., availability, workload, etc.), and (3) ranking each service based on a set of metrics.
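The three selection steps can be pictured as a small pipeline; the field names, availability threshold, and single-metric ranking are simplifying assumptions for the sketch.

```python
# Discovery pipeline sketch: (1) match against the query,
# (2) filter by context, (3) rank by a metric.

def discover(services, query, min_availability=0.5):
    matched = [s for s in services if query in s["keywords"]]   # (1) match
    available = [s for s in matched
                 if s["availability"] >= min_availability]      # (2) context filter
    return sorted(available,
                  key=lambda s: s["expertise"], reverse=True)   # (3) rank

services = [
    {"name": "svc-a", "keywords": {"review"}, "availability": 0.9, "expertise": 0.6},
    {"name": "svc-b", "keywords": {"review"}, "availability": 0.2, "expertise": 0.9},
    {"name": "svc-c", "keywords": {"report"}, "availability": 1.0, "expertise": 0.8},
]
result = discover(services, "review")  # only svc-a survives all three steps
```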
An example description of a personal service is given in Listing 1.4. The XML description includes user-related information such as name, address, and additional contact information, which can be specified by the user and/or selected from the user's profile. The service model defines how to contact the user. In the given example, the category element contains information regarding the supported models. Note that the category element references elements in the Service Definitions document (e.g., a user-defined Review service whose service interface is defined in WSDL). Since interactions with services traverse the middleware platform, the endpoint information, encoded as a description element, is used to forward requests to a service endpoint. However, this information is only used within the middleware platform and is not exposed to potential service consumers.

<entry>
  <title>My HPS Review Service</title>
  <author>
    <name>Daniel Schall</name>
    <email>[email protected]</email>
  </author>
  <updated>2007-09-24T18:30:02Z</updated>
  <id>urn:uuid:1223c696-cfb8-4ebb-aaaa-80da34efa6a</id>
  <category term="Review"/>
  <description><![CDATA[...]]></description>
  <georss:point>48.19766 16.37146</georss:point>
</entry>
Listing 1.4 Personal service
Personal services can be annotated with expertise information using various taxonomies. This information is encapsulated in content elements and is valid for a particular personal service. Note that a user may want to provide different services and associate a different set of skills or expertise with each service. Nonetheless, expertise information can be specified in the user's profile, thereby being applicable to all personal services defined by a user. Context information such as location (e.g., geo tags) and a user's availability status can be used to find and filter services in the discovery phase.
1.7 Conclusion and Future Work

The convergence of human and software services in a single framework requires novel tools and platforms. By utilizing the HPS framework, users can manage their (complex) interactions, while requesters are able to find (discover) the right service. In this work we focused on the architectural aspects of the HPS framework
and implementation details of the middleware platform. However, we have not yet addressed legal or privacy issues, which are important for opening the HPS framework to a larger audience. The next steps include a detailed performance and scalability analysis of the HPS framework. By gaining insights into performance aspects, the HPS framework will be able to accommodate a large number of users by federating multiple middleware platforms. We will conduct a more detailed user validation considering different ad-hoc and process-centric interaction models. Another important aspect of the HPS framework is ranking and recommending services. We are currently defining a set of HPS-related metrics and algorithms to determine the most relevant service.
References

1. von Ahn, L.: Games with a purpose. IEEE Comput. 39(6), 92–94 (2006)
2. Amend, M., Das, M., Ford, M., Keller, C., Kloppmann, M., König, D., Leymann, F., Müller, R., Pfau, G., Plösser, K., Rangaswamy, R., Rickayzen, A., Rowley, M., Schmidt, P., Trickovic, I., Yiu, A., Zeller, M.: Web Services Human Task (WS-HumanTask), Version 1.0 (2007)
3. Becerra-Fernandez, I.: Searching for experts on the Web: A review of contemporary expertise locator systems. ACM Trans. Inter. Tech. 6(4), 333–355 (2006). DOI http://doi.acm.org/10.1145/1183463.1183464
4. Dustdar, S.: Caramba: a process-aware collaboration system supporting ad hoc and collaborative processes in virtual teams. Distrib. Parallel Databases 15(1), 45–66 (2004). DOI http://dx.doi.org/10.1023/B:DAPD.0000009431.20250.56
5. Gentry, C., Ramzan, Z., Stubblebine, S.: Secure distributed human computation. In: EC '05: Proc. of the 6th ACM conference on Electronic commerce, pp. 155–164. ACM, New York (2005). DOI http://doi.acm.org/10.1145/1064009.1064026
6. Hadley, M.: Web Application Description Language (WADL). Technical report, Sun Microsystems (2006)
7. Kosorukoff, A., Goldberg, D.E.: Genetic Algorithms for Social Innovation and Creativity. Technical report, University of Illinois at Urbana-Champaign (2001)
8. Nussbaumer, M., Freudenstein, P., Gaedke, M.: Stakeholder Collaboration: From Conversation to Contribution. In: ICWE '06: Proc. of the 6th int. conference on Web engineering, pp. 117–118. ACM, New York (2006). DOI http://doi.acm.org/10.1145/1145581.1145608
9. Schall, D., Truong, H.L., Dustdar, S.: Unifying Human and Software Services in Web-Scale Collaborations. IEEE Internet Comput. 12(3), 62–68 (2008). DOI http://doi.ieeecomputersociety.org/10.1109/MIC.2008.66
10. Su, Q., Pavlov, D., Chow, J.H., Baker, W.C.: Internet-scale collection of human-reviewed data. In: WWW '07: Proc. of the 16th int. conference on World Wide Web, pp. 231–240. ACM, New York (2007). DOI http://doi.acm.org/10.1145/1242572.1242604
11. Thomas, J., Paci, F., Bertino, E., Eugster, P.: User Tasks and Access Control over Web Services. In: Int. conf. on Web Services (ICWS'07), pp. 60–69. IEEE Computer Society, Salt Lake City, USA (2007). DOI 10.1109/ICWS.2007.182
Chapter 2
Unifying Human and Software Services in Web-Scale Collaborations Daniel Schall, Hong-Linh Truong, and Schahram Dustdar
Abstract As collaborative Web-based platforms evolve into service-oriented architectures, they promote composite and user-enriched services. In such platforms, collaborations typically involve both humans and software services, thus creating highly dynamic and complex interactions. However, today's collaboration tools don't let humans specify different interaction interfaces (services) that can be reused in various collaborations. Furthermore, humans need more ways to indicate their availability and desire to participate in collaborations. The Human-Provided Services (HPS) framework lets people manage their interactions and seamlessly integrate their capabilities into Web-scale workflows as services. It unifies humans and software services and supports ad hoc and process-centric collaborations.
2.1 Introduction

Web services have paved the way for a new type of collaborative system. Services let us design collaborative systems in a modular way in a distributed environment, adhering to standard interfaces using, for example, the Web Services Description Language (WSDL) [6]. Users can create collaborative features by (re)using and composing Web services. Services already play an important role in fulfilling organizations' business objectives because process stakeholders can design, implement, and execute business processes using Web services as well as languages such as the Business Process Execution Language (BPEL). Services have started exploiting the

D. Schall, H.-L. Truong, S. Dustdar
Distributed Systems Group, Vienna University of Technology, Argentinierstr 8/184-1, 1040 Vienna, Austria
e-mail: [email protected]; [email protected]; [email protected]

© 2008 IEEE. Reprinted, with permission, from Schall, D., Truong, H.-L., Dustdar, S. (2008) Unifying Human and Software Services in Web-Scale Collaborations. IEEE Internet Computing 12(3): 62–68 (2008)
S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0_2, © Springer-Verlag/Wien 2011
Fig. 2.1 Flexibility vs. reusability in collaboration. Opportunistic service composition represents the trade-off in loosely structured collaborations
Web and are increasingly found in Web-scale collaborations. Web services are tools that users and developers can reuse in various applications by exposing well-defined interfaces and APIs. The spectrum of collaboration ranges from process-centric to ad hoc collaboration models [4]. Process-centric collaboration defines process models and follows a top-down approach. The business analyst or process architect must fully understand the processes before modeling and then enacting (instantiating) them. Such models' reusability is generally high, because we can apply process models several times. However, flexibility is rather limited, because if changes occur (such as exceptions), process architects have to remodel the process. On the other hand, ad hoc collaboration (for example, situations in which people or businesses must act spontaneously and creatively) follows a bottom-up approach. It's more flexible but less reusable, because many aspects depend on the actual players (that is, humans) involved in the process (see Fig. 2.1). However, Web-scale collaborations demand a flexible yet reusable approach because they might involve numerous people and software services. Here, we introduce Human-Provided Services, which you can use in ad hoc or process-centric collaborations. The HPS framework helps integrate humans into service-oriented infrastructures, thus promoting reusability and flexibility.
2.2 Web 2.0's Collaboration Landscape

The Web 2.0 paradigm encourages users to collaborate and share knowledge and information, so the Web is no longer a 'read-only' information repository. Consider Fig. 2.2, in which User A publishes Web content using open service-oriented applications. Other Web users can then consume, aggregate, or filter the content. However, because the depicted scenario relies purely on ad hoc interactions, users can't apply the same procedure in other collaborations (for example, to share content including documents, videos, and photos). In addition, the collaboration isn't structured because there's no interaction link between the users. This makes
Fig. 2.2 Web-scale collaboration. We can see both (a) ad hoc and (b) process-centric collaboration models
it difficult, if not impossible, to manage interactions that might span multiple users and services. Figure 2.2 also shows a process-centric collaboration involving human actors (depicted as a human activity in the process model). An example of such a collaboration might be to model human interactions in BPEL processes as a BPEL4People activity. However, the applicability of such models in Web-scale collaborations is rather limited because you can't model emerging interactions between humans and services in advance. Opportunistic service composition is the trade-off in loosely structured collaborations (see Fig. 2.1); you lose high reusability of compositions (processes) but gain flexibility in collaboration. Web-scale collaboration demands the composition of complex systems, which comprise not only interactions between software services but also humans as parts of flexible compositions.
2.3 Motivating Use Cases

Consider the following motivating scenarios detailing the problems arising in collaborations involving software services and humans.
2.3.1 Ad Hoc Contribution Requests

In Fig. 2.2, User A records a video and posts it on the Web. However, current platforms don't let consumers (User B) actively find available users who can contribute to collaborations by producing the desired content on demand. In particular, users should be able to find any person who can deliver the desired content using whichever platform (service) has been chosen to host the Web content.
This use case depends on the activity to be performed and the involved services but not on the platform being used to share or host the content.
2.3.2 User-Defined Processes

Continuing the previous use case, the collaboration might involve numerous services and people. For example, a (software) service should automatically check the input User A receives (for example, for file format compatibility) and convert it into a suitable format if needed. The requester can then check whether the provided contribution needs to be revised or re-recorded. We observe the case in which interactions interleave tasks that humans and software services perform. However, current systems don't address reusability aspects of loosely structured processes in such collaboration scenarios, which would let users (requesters) manage interactions involving people as well as software services. This scenario targets opportunistic service composition comprising human and software services.
2.3.3 Interactions with Formalized Processes

It becomes increasingly important to enable interactions between business processes and human actors if human input is required in a process (Fig. 2.2). However, people are increasingly on the move, using different (mobile) devices for collaboration. So, we must consider mobility aspects, such as the location and limited processing power of a user's mobile devices, and we must adjust interactions according to the user's context. In particular, processes must be able to find and select the right person available for performing certain tasks, whereas humans involved in interactions with business processes must be able to define (interaction) rules to deal with requests (for example, to automatically pre-process certain requests). However, current systems can't cope with human-process interactions that scale to the Web. They can't find humans participating in Web-scale interactions with processes or manage interactions involving multiple people.
2.4 HPS in Web-Scale Collaborations

We introduce HPS because current systems don't sufficiently address the challenges and problems presented in the motivating use cases (see the 'Related Work in Collaboration Systems' sidebar). Here, we present the applicability of HPS in Web-scale collaborations and introduce a framework to embrace the integration of human capabilities and interactions in service-oriented collaborations and Web-scale workflows.
2 Unifying Human and Software Services in Web-Scale Collaborations
The HPS framework lets people supply services based on their skills and expertise. HPSs act as interaction interfaces toward humans, letting users define various HPSs for different collaborative activities, indicating their ability (and willingness) to participate in ad hoc as well as process-centric collaborations. The users can manage their interactions, which might span various platforms and services. Human actors benefit from HPSs because they can reuse different services in various collaborations (such as in different workflows), thus fostering the reusability of human capabilities. Moreover, HPSs can increase flexibility in collaborations because they let human actors provide services that can address problems that software services alone can't solve.
2.4.1 The Framework

However, this novel blend of service-oriented architectures requires a new platform to let humans effectively provide services and to efficiently deal with interactions through HPSs. Current service-oriented platforms can't sustain HPSs because:
• Conventional service registries don't offer suitable lookup interfaces for finding HPSs.
• Current platforms can't enhance service-related information by describing the human characteristics of HPSs as needed.
• Current platforms don't address HPS interaction patterns, so they can't introduce new service (HPS) interaction patterns to let human actors efficiently deal with requests.
Figure 2.3 shows the steps that the HPS framework takes to address these challenges, illustrating the scenario for ad hoc contribution requests.
• Step 1. Register the profile and service. Human actors define high-level collaboration activities (for example, createReport) using an HPS interface editor that the framework hosts. The HPS framework automatically translates these activities into low-level service interfaces described in WSDL. User profile information includes name, skills, and competency, which the HPS framework uses to enhance the discovery, selection, and recommendation process to find the most suitable HPS. The user specifies basic personal profile information or uploads this information as a vCard file. Humans provide a service by registering it as a personal service. The HPS scenario in Fig. 2.3 shows an example in which humans provide reporter services to contribute Web content such as news reports. The middleware hosts a set of XML documents in the service registry that manages the interface descriptions and personal service information. So, it's easier to achieve cross-organizational collaboration because companies can share information stored in the service registry, the very foundation of Web-scale workflows. Other people who want to provide the same type of service can then reuse the service interfaces. Figure 2.3 shows a snippet of the XML description of a personal service. The description contains user-related information, a reference to the service interface description, and information regarding the user's expertise rooted in taxonomies. This information is embedded in Atom feed entries. The Atom Syndication Format is an XML language describing frequently updated content such as news. Atom feeds contain, for example, author information, links to content, and summaries. The HPS framework uses Atom feeds as a container format for WSDL documents and various content, including taxonomies describing users' expertise; additional context information, such as location; and category information to tag services.

D. Schall et al.

Fig. 2.3 HPSs in Web-scale collaborations. This scenario illustrates ad hoc interactions between requesters and HPSs

The HPS framework supplies
the personal service hosting environment, which users can download to their desktop computers or mobile devices using mobile Java technology (Java ME). This environment lets the computer or device deploy software for personal services as gadgets. It comprises a micro-OSGi environment (www.osgi.org/osgi technology/), a set of tools to manage the gadgets (services), a common lightweight SOAP library, and a user-interface rendering engine displaying user interfaces described in XML.
• Step 2. Look up a service. HPSs can be discovered through an interface implementing the Atom protocol model or a Web service interface. Figure 2.3 shows an example in which location and availability information enhance the discovery process, given that requesters might want to find reporter services located in certain areas of interest. The Atom lookup interface returns a feed containing a ranked list of entries comprising personal HPS information. It ranks the services based on various HPS metrics, such as skill level and user response time. The lookup returns additional user-interface rendering information, for example XForms, which are automatically generated based on WSDL interfaces (step 2.1 in Fig. 2.3), if human requesters attempt to interact with HPSs. XForms is a forms technology expressed in XML that describes user interfaces in a device-independent way. For example, the lookup returns interface rendering information that can be embedded in markers of a geographical map (step 2.2).
• Step 3. Interact with HPS. Ajax scripts can issue requests asynchronously toward the middleware platform. The middleware implements an HPS Access Layer (HAL) interface to dispatch HPS requests. HAL provides a security module to prevent unauthorized access, policy management to protect users' privacy, and request filtering to shield HPSs from denial-of-service attacks. HAL dispatches and routes service requests to the appropriate HPS and device. The HAL interface description is denoted in Fig. 2.3 as HPS interaction using Web Services Resource Catalog (WS-RC) Meta-Endpoint definitions that are parameterized by HPS addressing information, such as user identifiers; for more information, see the WS-RC specification.1 HPSs aren't always online, because the personal service hosting environment might be deployed on mobile devices, which rely on wireless network availability and coverage. If the HPS isn't available at the time of interaction, an XML-based repository can store service requests (see HPS Middleware in Fig. 2.3) and process them whenever the HPS is back online (step 3.2). Pending requests can be received via push- and pull-based mechanisms, depending on the hosting environment's configuration. At this stage, HAL comprises request processing, routing, and filtering capabilities. Implementing the security and policy management features is our current work in progress; we plan to address it in the next steps.
1 www.ibm.com/developerworks/library/specification/ws-rc/index.html
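The Atom-entry container format described in Step 1 can be sketched as follows. This is a minimal illustration, not the framework's actual schema: the link relation, the category scheme URI, and the use of a summary element for location are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def personal_service_entry(user, wsdl_url, expertise, location):
    """Build an Atom entry packaging a personal-service description:
    author information, a link to the WSDL interface description,
    expertise modeled as Atom categories, and a location hint.
    (Element and attribute choices are illustrative, not normative.)"""
    ET.register_namespace("", ATOM)  # serialize Atom as the default namespace
    entry = ET.Element("{%s}entry" % ATOM)
    ET.SubElement(entry, "{%s}title" % ATOM).text = "%s's reporter service" % user
    author = ET.SubElement(entry, "{%s}author" % ATOM)
    ET.SubElement(author, "{%s}name" % ATOM).text = user
    # Reference to the WSDL interface description of the HPS.
    ET.SubElement(entry, "{%s}link" % ATOM,
                  {"rel": "service-interface", "href": wsdl_url})
    # Expertise rooted in a taxonomy, expressed as categories.
    for skill in expertise:
        ET.SubElement(entry, "{%s}category" % ATOM,
                      {"term": skill, "scheme": "urn:example:expertise"})
    ET.SubElement(entry, "{%s}summary" % ATOM).text = "location: %s" % location
    return ET.tostring(entry, encoding="unicode")

xml_doc = personal_service_entry("alice", "http://example.org/hps/alice.wsdl",
                                 ["journalism", "photography"], "Vienna")
```

Because the entry is plain Atom, any feed reader or the framework's Atom lookup interface can carry it alongside ordinary feed content.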
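Step 3's store-and-forward behavior (queue requests while an HPS is offline, flush them when it reconnects) can be sketched roughly like this; the class and method names are invented for illustration and do not reflect HAL's actual API.

```python
from collections import defaultdict, deque

class HpsAccessLayer:
    """Minimal sketch of HAL request routing: requests to an online HPS
    are delivered immediately; requests to an offline HPS are stored and
    flushed in FIFO order once the service announces availability."""

    def __init__(self):
        self.online = set()
        self.pending = defaultdict(deque)   # hps_id -> queued requests
        self.delivered = defaultdict(list)  # hps_id -> delivered requests

    def dispatch(self, hps_id, request):
        if hps_id in self.online:
            self.delivered[hps_id].append(request)
            return "delivered"
        self.pending[hps_id].append(request)  # persist until the HPS returns
        return "stored"

    def set_online(self, hps_id):
        self.online.add(hps_id)
        while self.pending[hps_id]:           # flush the backlog
            self.delivered[hps_id].append(self.pending[hps_id].popleft())

hal = HpsAccessLayer()
hal.dispatch("reporter-1", {"task": "createReport", "area": "district 3"})
hal.set_online("reporter-1")  # pending request is delivered on reconnect
```

A push-based variant would call back into the hosting environment from `set_online`; the pull-based variant shown here simply lets the client drain `delivered`.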
Related Work in Collaboration Systems

The first column in the table shows features, or capabilities, which collaboration systems must support to address the needs of large-scale collaborations and workflows involving human and software services. Human computation aims to leverage human capital in computational processes [1, 5]. For example, human actors perform certain tasks in a program. Related to human computation, shown as human-reviewed data in the table, are systems such as Yahoo! Answers (http://answers.yahoo.com) [7] and Amazon Mechanical Turk (www.mturk.com), which let people claim and process tasks that human or software requesters issue. BPEL4People and the WS-HumanTask specification [2] provide a design for enabling human and process interactions in Business Process Execution Language processes. Expert finder systems [3] aim to define ontologies that describe the skills and expertise of people to help others find the right person (expert) on the Web. The main difference between the systems shown in the table and Human-Provided Services is that the latter are user-defined interaction interfaces that people supply and compose for various collaborations. Compared to concepts in the BPEL4People specification, people using the HPS framework decide which service to provide, for example for a specific collaboration context, and manage their interactions using HPS interfaces. These interfaces let human and process requesters interact with HPSs, whereas BPEL4People specifies how the process architect can involve people in designed processes but doesn't specify how and which services people can offer. Thus, the BPEL4People specification doesn't let people specify their contributions as services in Web-scale collaborations; however, BPEL4People-based processes can interact with HPSs. HPS follows the Web 2.0 paradigm, in which services are user-driven contributions rather than tasks tailored to specific processes.
Feature | Human computation | Human-reviewed data | Human-process interactions | Expert finder systems
Human requesters | No | Yes | Yes | Yes
Modeling interactions in process | Yes | No | Yes | No
Interaction-based collaborations | No | No explicit collaboration link | Yes | No
Open collaboration environments | Yes | Yes | Enterprise-level collaboration | Yes
Context-dependent discovery | No | No | No | Ontologies describing skills
Expertise ranking | No | Skills | Role models | Skills
User-defined interactions | No | No | Defined by process designer | No
User-defined services | No | Yes | No | No
2.4.2 Ad Hoc Collaboration Example

The example in Fig. 2.3 shows an ad hoc interaction without any means for control or coordination. We can create tasks to control interactions and to share status information with the requester. Specifically, we need tasks for interactions between HPSs and processes to determine whether the HPS will process the requests. Task states include in-progress, rejected, or finished. Additionally, actions can be triggered automatically based on task-state changes, such as sending notifications. Interactions with HPSs may be long-running conversations comprising a multitude of messages, possibly in different formats (such as SOAP/XML, instant messaging, or email), as well as notifications, tasks, people, and documents. The HPS middleware implements an XML-based file system, which provides access to the XML repository and offers querying and filtering capabilities through XQuery. To manage complex interactions, the user can specify Interaction Rules (see Fig. 2.3) to create loosely structured (user-defined) processes, which the user can then apply and reuse in various interactions (such as interactions through services).
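The task-based coordination above, with states such as in-progress, rejected, or finished and actions triggered on state changes, might look like this minimal sketch. The states follow the text; the initial 'created' state, the transition rules, and the listener mechanism are assumptions for illustration.

```python
class Task:
    """Sketch of an HPS coordination task: a small state machine whose
    state changes notify observers (e.g., to share status with the requester)."""

    TRANSITIONS = {
        "created": {"in-progress", "rejected"},
        "in-progress": {"finished", "rejected"},
    }

    def __init__(self, requester, listeners=None):
        self.requester = requester
        self.state = "created"
        self.listeners = listeners or []

    def set_state(self, new_state):
        # Apply only legal transitions; each change fires all listeners.
        if new_state not in self.TRANSITIONS.get(self.state, set()):
            raise ValueError("illegal transition %s -> %s" % (self.state, new_state))
        self.state = new_state
        for notify in self.listeners:
            notify(self.requester, new_state)

log = []
task = Task("requester-7", listeners=[lambda who, s: log.append((who, s))])
task.set_state("in-progress")
task.set_state("finished")
```

A listener could just as well enqueue a notification message into the middleware's XML repository instead of appending to a list.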
2.4.3 Process-Centric Collaboration Example

Figure 2.4 employs HPS as part of a formalized process comprising interleaved human and service interactions. It shows a workflow that integrates HPSs and software services to respond to emergency situations by gathering information and input from various human and software services. First, the system receives video footage from a monitoring service, a surveillance system that has cameras
Fig. 2.4 The workflow for an emergency scenario using HPSs. In this case, the process involves both humans and software services
deployed to monitor certain areas of interest. A detection service processes the image data, detects incidents, and generates events accordingly. The policy service receives a stream of events and classifies the nature of certain events (for example, classifying events as suspicious activities). Events that the policy service can’t classify require human intervention. Classifying an emergency event constitutes an additional activity in the process, which the emergency expert service (that is, an HPS) performs. The process requires a human to evaluate the situation. The process accomplishes this by dynamically discovering a nearby HPS (user) who can review the situation and provide desired input for the process. Although not explicitly shown in Fig. 2.4, it’s possible to consult multiple HPSs. The process continues and invokes a notification service to inform local authorities (that is, the human operator) about the incident. The authorities invoke the emergency response service, which automatically deploys an emergency response team to the emergency area. The HPS framework’s contributions in this scenario are that software services and processes can discover HPSs using information available in a service registry, including users’ profiles, service-specific information, and context information. In addition, humans might use mobile devices to interact with processes, which is increasingly important in today’s collaborations.
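Viewed as code, the control flow of Fig. 2.4 is roughly the following pipeline. All service functions here are stand-ins, since the chapter describes the services only at the architectural level; the escalation-to-HPS step mirrors the text's "events the policy service can't classify require human intervention."

```python
def run_emergency_process(frames, detect, classify, ask_nearby_expert, notify):
    """Sketch of the Fig. 2.4 workflow: detection -> policy classification
    -> escalation to a dynamically discovered HPS -> notification."""
    incidents = []
    for event in detect(frames):              # detection service
        label = classify(event)               # policy service
        if label is None:                     # unclassifiable: human needed
            label = ask_nearby_expert(event)  # emergency expert HPS
        if label == "emergency":
            notify(event)                     # notification service
            incidents.append(event)
    return incidents

# Toy run: frame f2 cannot be classified automatically, so the HPS decides.
handled = run_emergency_process(
    frames=["f1", "f2"],
    detect=lambda fs: [{"frame": f} for f in fs],
    classify=lambda e: None if e["frame"] == "f2" else "harmless",
    ask_nearby_expert=lambda e: "emergency",
    notify=lambda e: None,
)
```

In the real framework the `ask_nearby_expert` step would itself be a registry lookup followed by a HAL-dispatched request, as described in Sect. 2.4.1.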
2.5 Future Work

The most promising direction for further HPS development is the automatic generation of service interfaces based on user skill and profile information. At this stage, users design services by specifying collaborative activities, which the HPS framework translates into XML interface descriptions. Furthermore, we're working on methods and algorithms for analyzing interactions to better understand complex behaviors in a mixed system of human and software services. Based on a set of HPS-related metrics, including social aspects and reputation, we rank the services to help requesters find and interact with the most suitable HPS. Because HPSs are user-driven services, we account for unexpected behavior by modeling HPS quality and reward models, taking into account performance and reliability aspects in processing tasks and requests. HPS ranking is essential in large-scale collaborations and workflows because HPSs can be dynamically discovered and because a potentially large number of users might provide a particular service; thus, recommendations must guide service selection. Additionally, we're working on improving tools for executing user-defined processes and on integrating a BPEL4People engine to use HPSs in BPEL processes.
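The metric-based ranking mentioned here might, for instance, combine normalized metrics such as skill level and response time with requester-chosen weights. The specific metrics and the linear weighting below are illustrative assumptions, not the chapter's actual ranking model.

```python
def rank_hps(candidates, weights):
    """Rank HPS candidates by a weighted score over normalized metrics in [0, 1].
    Higher skill is better; lower response time is better, so it is inverted."""
    def score(c):
        return (weights["skill"] * c["skill"]
                + weights["response"] * (1.0 - c["response_time_norm"]))
    return sorted(candidates, key=score, reverse=True)

# A slower but highly skilled reporter vs. a fast, moderately skilled one.
ranked = rank_hps(
    [{"id": "a", "skill": 0.9, "response_time_norm": 0.8},
     {"id": "b", "skill": 0.6, "response_time_norm": 0.1}],
    weights={"skill": 0.5, "response": 0.5})
```

With equal weights the fast responder wins; a requester who values expertise over latency would simply shift the weights.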
References

1. von Ahn, L.: Games with a purpose. IEEE Comput. 39(6), 92–94 (2006)
2. Amend, M., et al.: Web Services Human Task (WS-HumanTask), Version 1.0 (2007)
3. Becerra-Fernandez, I.: Searching for experts on the Web: A review of contemporary expertise locator systems. ACM Trans. Inter. Tech. 6(4), 333–355 (2006). DOI http://doi.acm.org/10.1145/1183463.1183464
4. Dustdar, S.: Caramba: a process-aware collaboration system supporting ad hoc and collaborative processes in virtual teams. Distrib. Parallel Databases 15(1), 45–66 (2004). DOI http://dx.doi.org/10.1023/B:DAPD.0000009431.20250.56
5. Gentry, C., Ramzan, Z., Stubblebine, S.: Secure distributed human computation. In: EC '05: Proceedings of the 6th ACM Conference on Electronic Commerce, pp. 155–164. ACM, New York (2005). DOI http://doi.acm.org/10.1145/1064009.1064026
6. Papazoglou, M.P., Traverso, P., Dustdar, S., Leymann, F.: Service-oriented computing: state of the art and research challenges. IEEE Comput. 40(11), 38–45 (2007)
7. Su, Q., Pavlov, D., Chow, J.H., Baker, W.C.: Internet-scale collection of human-reviewed data. In: WWW '07: Proceedings of the 16th International Conference on World Wide Web, pp. 231–240. ACM, New York (2007). DOI http://doi.acm.org/10.1145/1242572.1242604
Chapter 3
Modeling and Mining of Dynamic Trust in Complex Service-Oriented Systems
Florian Skopik, Daniel Schall, and Schahram Dustdar
Abstract The global scale and distribution of companies have changed the economy and dynamics of businesses. Web-based collaborations and cross-organizational processes typically require dynamic and context-based interactions between people and services. However, finding the right partner to work on joint tasks or to solve emerging problems in such scenarios is challenging due to the scale and temporary nature of collaborations. Furthermore, actor competencies evolve over time, thus requiring dynamic approaches for their management. Web services and SOA are the ideal technical framework to automate interactions spanning people and services. To support such complex interaction scenarios, we discuss mixed service-oriented systems that are composed of both humans and software services, interacting to perform certain activities. As an example, consider a professional online support community consisting of interactions between human participants and software-based services. We argue that trust between members is essential for successful collaborations. Unlike a security perspective, we focus on the notion of social trust in collaborative networks. We show an interpretative rule-based approach that enables humans and services to establish trust based on interactions and experiences, considering their context and subjective perceptions.
F. Skopik, D. Schall, S. Dustdar
Distributed Systems Group, Vienna University of Technology, Argentinierstr. 8/184-1, 1040 Vienna, Austria
e-mail: [email protected]; [email protected]; [email protected]

Reprinted from Skopik, F., Schall, D., Dustdar, S. (2010) Modeling and mining of dynamic trust in complex service-oriented systems. Information Systems 35(7): 735–757, with permission from Elsevier.

S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0_3, © Springer-Verlag/Wien 2011
3.1 Introduction

The way people interact in collaborative environments and social networks on the Web has evolved at a rapid pace over the last few years. Services have become a key enabling technology to support collaboration and interactions. Pervasiveness, context-awareness, and adaptiveness are some of the concepts that emerged recently in service-oriented systems. A system is no longer only designed, deployed, and executed; rather, it evolves and adapts over time. This paradigm shift from closed systems to open, loosely coupled Web services-based systems requires new approaches to support interactions [51]. We present a novel approach addressing the need for flexible discovery and involvement of experts and knowledge workers in distributed, cross-organizational collaboration scenarios. Experts register their skills and capabilities as Human-Provided Services (HPS) [47], using the very same technology as traditional Web services, to join a professional online help and support community. This approach is inspired by crowdsourcing techniques following the Web 2.0 paradigm. People can contribute HPSs to offer their skills to a broad number of Web users, service compositions, and enterprises that need on-demand access to experts. In such communities, not only humans participate and provide services to others, but also autonomous software agents and semantic Web services with sophisticated reasoning capabilities. A mixed service-oriented system comprises human and software services that can be flexibly and dynamically composed to perform various kinds of activities. Therefore, interactions in such a system do not only span humans, but also software services. Recently, trust has been identified as a beneficial concept in large-scale networks [5, 23].
Considering trust relations when selecting people for communication or collaboration, services to be utilized, and resources to be applied leads to more efficient cooperation and compositions of human and software services [49]. In contrast to many others, we do not discuss trust from a security perspective. In this work we share the view of [44], which relates trust to the degree to which humans or other systems can rely on services to accomplish their tasks. Unlike several other systems in the agent domain (e.g., see [20]), we follow a centralized trust management approach [49]. In SOA, central registries and logging facilities are common mechanisms. Applying them avoids various issues, such as the malicious manipulation of interaction data and dishonesty regarding recommendations. Moreover, some trust inference mechanisms are only applicable if the participants of the network have a global view on the system. On the other hand, a centralized approach may raise privacy issues that have to be considered in the system design. In this work, we present the following key contributions:
• Social and behavioral trust model. We define a trust model that relies on interaction dynamics, supporting wide personalization by accounting for user preferences, and discuss its realization in the introduced use case.
• VieTE framework. We outline VieTE (Vienna Trust Emergence), a modular framework that supports the management of trust in SOA-based environments. In particular, we introduce key implementation aspects, such as interaction mining and Web of Trust provisioning.
• Evaluation and discussion. Since our work is not only theoretical but closely coupled to SOA technology, we evaluate various functional and non-functional aspects of VieTE and its trust model.
This work is organized as follows. In Sect. 3.2, we introduce the Expert Web case, showing the need for flexible expert discovery and involvement. Our novel approach is based on social trust. We introduce trust concepts in collaborative environments in Sect. 3.3. Section 3.4 details the concept of interaction-based behavioral trust, which is the basis for our trust inference model. Trust can be based on different metrics whose meaning is highly subjective. In Sect. 3.5, we show our trust model based on fuzzy set theory. The trust model manages context-dependent trust between actors, i.e., humans and services, emerging from interactions. The subsequent Sect. 3.6 formalizes the fundamental trust model relying on captured and interpreted interactions. Successful, and thus highly trusted, network members are valuable collaborators; however, overload due to large amounts of work turns them into bottlenecks. In Sect. 3.7, we present a balancing approach to prevent inefficient interactions. Our architecture is implemented on top of SOA and Web services. We show the implementation details of the system in Sect. 3.8. Section 3.9 deals with evaluations to test the performance of the presented system as well as the effectiveness of the balancing algorithms. Finally, we discuss related work in the area of SOA, social trust, and flexible interaction models in Sect. 3.10 and conclude this work in Sect. 3.11.
3.2 Service-Oriented Collaborations

In virtual communities, where people dynamically interact to perform activities, reliable and dependable behavior promotes the emergence of trust. As collaborations are increasingly performed online, supported by service-oriented technologies such as communication, coordination, and resource-management services, interactions have become observable. By monitoring and analyzing interactions, trust can be automatically inferred [15, 20, 36, 51]. In contrast to manual rating approaches for mainly static communities, automatic inference is well suited for complex networks with short-running interactions between potentially thousands of rapidly changing network members. We motivate our work with a scenario showing discovery of experts and flexible interaction support, as depicted in Fig. 3.1. In this use case, a higher-level process model may be composed of single tasks assigned to responsible persons, describing the steps needed to produce a software module. After finishing a common requirements analysis and, in parallel, a reusability check of existing software artifacts produced in related projects, a software architect designs the actual software framework. The implementation task is carried out by a software developer, and additionally software test cases are generated with respect to functional properties (e.g., coverage of requirements) and non-functional properties
Fig. 3.1 Service-oriented large-scale collaboration in the Expert Web
(e.g., performance and memory consumption). We assume that this process is deployed in a global enterprise spanning multiple departments and locations. Thus, the single task owners in this process exchange only electronic files and interact by using communication tools. While various languages and techniques for modeling such processes already exist, for example BPEL, we focus on another aspect in this scenario: interactions with trusted experts. A language such as BPEL demands the precise definition of flows and input/output data. However, even in carefully planned processes with human participation, for example modeled as BPEL4People activities [3], ad hoc interactions and adaptation are required due to the complexity of human tasks, people's individual understanding, and unpredictable events. In Fig. 3.1, the software architect receives the requirement analysis document from a preceding step. But if people have not yet worked jointly on similar tasks, it is likely that they need to set up a meeting for discussing relevant information and process artifacts. Personal meetings may be time and cost intensive, especially in cases where people belong to different geographically distributed organizational units. Various Web 2.0 technologies, including forums, Wiki pages, and text chats, provide well-proven support for online work in collaborative environments. Several challenges remain unsolved: (a) If people participating in the whole process are not able to solve problems by discussion, who should be asked for support? (b) How can experts be flexibly involved in ongoing collaborations? (c) What are the influencing factors for favoring one expert over others? (d) How can we support trusted interactions in such dynamically changing environments, and how can this be supported by service-oriented systems?
Traditionally, support is discovered simply by asking third persons in the working environment whom the discussion participants are convinced are able to help, namely trusted experts. In an environment with a limited number of people, persons usually tend to know who can be trusted and what data have to be shared in order to proceed with solving problems of a particular nature. Furthermore, they easily find ways to contact trusted experts, e.g., via phone or e-mail. In case requesters do not know skilled persons, they may ask friends or colleagues who faced similar problems before to recommend experts. The drawbacks of this
approach are that people need extensive knowledge about the skills of colleagues and the internal structures of the organization (e.g., the expertise of people in other departments). Discovering support in such a manner is inefficient in large-scale enterprises with thousands of employees and unsatisfying if an inquiry for an expert becomes a major undertaking. Today's communication and collaboration technologies cannot fully address the mentioned challenges because many existing tools lack the capability of managing and utilizing dynamic trust.
The Expert Web. We propose the Expert Web, consisting of connected experts that provide help and support in a service-oriented manner. The members of this Expert Web are either humans, such as company employees offering help as online support services, or software services encapsulating knowledge bases. Such an enterprise service network, spanning various organizational units, can be consulted for efficient discovery of available support. Users, such as the engineer or drawer in our use case, send requests for support (RFSs). The users establish trust in experts' capabilities based on their response behavior (e.g., availability, response time, quality of support). This trust, reflecting personal positive or negative experiences, fundamentally influences future selections of experts. As in the previous case, experts may delegate RFSs to other experts in the network, for example, when they are overloaded or not able to provide satisfying responses. In this way, not only do users of the enterprise service network establish trust in experts; trust relations between experts also emerge.
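The trust dynamics described for the Expert Web, where each RFS outcome shifts a requester's trust in an expert, could be captured by a simple exponential update. The update rule and the learning rate below are assumptions for illustration, not the chapter's formal model.

```python
def update_trust(trust, outcome, rate=0.2):
    """Move trust toward 1.0 after a satisfying response and toward 0.0
    after a failed or unsatisfying one (exponential smoothing).
    'trust' and the result are values in [0, 1]."""
    target = 1.0 if outcome == "satisfied" else 0.0
    return trust + rate * (target - trust)

# A requester's trust in one expert after three RFS outcomes.
trust = 0.5
for outcome in ["satisfied", "satisfied", "failed"]:
    trust = update_trust(trust, outcome)
```

The same update could be kept per context (e.g., per problem domain), which matches the chapter's view that trust is valid within a particular scope only.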
3.3 Communication, Coordination, and Composition

3.3.1 Social Trust in Collaborations

In contrast to a common security perspective, social trust refers to the interpretation of previous collaboration behavior [51] and may additionally consider the similarity of dynamically adapting interests [14, 50]. Especially in collaborative environments, where users are exposed to higher risks than in common social network scenarios [13] and where business is at stake, considering social trust is essential to effectively guide interactions [33]. Hence, we define trust as follows (see also [15, 36, 51]): Trust reflects the expectation one actor has about another's future behavior to perform given activities dependably, securely, and reliably, based on experiences collected from previous interactions.
This definition includes several key characteristics that need to be supported by a foundational trust model: • Trust reflects an expectation and, therefore, cannot be expressed objectively. It is influenced by subjective perceptions of the involved actors. • Trust is context dependent and is basically valid within a particular scope only, such as the type of an activity or the membership in a certain team.
• Trust relies on previous interactions, i.e., from well-proven previous behavior a prediction of future behavior is inferred.
We strongly believe that trust and reputation mechanisms are key to the success of open, dynamic service-oriented environments. However, trust between human and software services emerges from interactions. Interactions, for example, may be categorized in terms of success (e.g., failed or finished) and importance. Therefore, a key aspect of our approach is the monitoring and analysis of interactions to automatically determine trust in mixed service-oriented systems. We argue that in large-scale SOA-based systems, only automatic trust determination is feasible. In particular, manually assigned ratings are time-intensive and suffer from several drawbacks, such as unfairness, discrimination, or low incentives for humans to provide trust ratings. Moreover, in the mentioned mixed system, software services demand mechanisms to determine trust relations to other services. Much research effort has been spent on defining and formalizing trust models (for instance, [2, 20, 36, 43]). Although most of these models are closely related, e.g., in terms of concepts for recommendation and reputation, we add the following novel contributions.
Personalized Trust Inference. A fundamental characteristic of trust is its subjective perception. Humans have different requirements for establishing trust in others. Therefore, we use a rule-based system, relying on fuzzy set theory, that allows each participant of the network to define his/her own rules and influencing factors to establish trust. Instead of a 'hard-wired' logic to determine trust, we enable participants to model their individual trust perception, e.g., their optimistic and pessimistic views.
Multi-Faceted Trust. We support the diversity of trust by enabling the flexible aggregation of various interaction metrics that are determined by observing ongoing collaborations.
Furthermore, data from other sources, such as human profiles and skills, as well as service features and capabilities, may influence the trust inference process.
Compositional Trust. The majority of today's trust models in the agent domain, such as typical buyer-seller scenarios, deal with the establishment of trust between exactly two entities. In contrast, we take a compositional perspective and study trust in group formation processes and compositions of services. In our environment, we understand compositions not from a structural perspective with pre-defined interaction paths (e.g., as in BPEL), but from a dynamic point of view, where members of the network select interaction partners flexibly [12].
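The personalized, rule-based inference outlined here can be illustrated with a toy fuzzy rule set. The membership functions, the two rules, and the defuzzification step are invented for this example; the chapter's actual model is developed in Sects. 3.5 and 3.6.

```python
def tri(x, a, b, c):
    """Triangular membership function on [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def infer_trust(metrics, rules):
    """Mamdani-style sketch: each rule fires with the minimum of its
    antecedent memberships; the output is the firing-strength-weighted
    mean of the rule consequents (a crude defuzzification)."""
    num = den = 0.0
    for antecedents, consequent in rules:
        strength = min(mf(metrics[name]) for name, mf in antecedents)
        num += strength * consequent
        den += strength
    return num / den if den else 0.0

# One user's personal rule set (illustrative):
#   IF success rate is high AND response time is low THEN trust = 0.9
#   IF success rate is medium THEN trust = 0.4
rules = [
    ([("success", lambda x: tri(x, 0.5, 1.0, 1.5)),
      ("resp",    lambda x: tri(x, -0.5, 0.0, 0.5))], 0.9),
    ([("success", lambda x: tri(x, 0.2, 0.5, 0.8))], 0.4),
]
t = infer_trust({"success": 0.8, "resp": 0.2}, rules)
```

Because each participant supplies their own `rules`, an optimistic and a pessimistic user can derive different trust values from the same interaction metrics, which is exactly the personalization argued for above.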
3.3.2 The Cycle of Trust

Previously, we introduced a conceptual approach for determining trust based on interactions: the Cycle of Trust [49]. This cycle, adopting the MAPE concept [21], consists of four phases, which are Monitor, Analyze, Plan, and Execute. Periodically
3 Modeling and Mining of Dynamic Trust in Complex Service-Oriented Systems
35
running through these four phases establishes a kind of environmental feedback control, and therefore allows to adapt to varying circumstances. Applied in our environment, we are able to infer trust dynamically during ongoing collaborations. In the Monitoring Phase the trust management system observes interactions between humans and services, including their types, context and success. In the Analyzing Phase interactions are used to infer trust relationships. For this purpose, interaction metrics are calculated and interpreted using personal trust rule sets that depend on the purpose of and situation for trust determination. The following Planning Phase covers the set up of collaboration scenarios, including user activities and human-, and service compositions, taking the inferred trust relations into account. The Execution Phase provides support to enhance the execution of planned collaboration, including observing activity deadlines, checking the availability of actors, and compensation of resource limitations. The interactions of actors are observed in the execution phase; and the loop is closed.
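This control loop can be sketched as a plain feedback cycle; all function names below are illustrative assumptions, not part of the actual trust management system:

```python
# Hypothetical skeleton of the Cycle of Trust (an adaptation of the MAPE
# loop): periodically Monitor, Analyze, Plan, and Execute.

def cycle_of_trust(collect, infer, compose, run, rounds=1):
    """One feedback-control loop: executed collaborations produce new
    interactions, which the next monitoring phase observes again."""
    for _ in range(rounds):
        interactions = collect()      # Monitor: observe interactions
        trust = infer(interactions)   # Analyze: metrics -> trust relations
        plan = compose(trust)         # Plan: set up collaboration scenarios
        run(plan)                     # Execute: run planned collaborations
```

The four phase functions are injected as callables, so the loop skeleton stays independent of any concrete monitoring or planning infrastructure.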
3.4 From Interactions to Social Trust

In this work, we demonstrate the inference of trust from captured collaboration data, considering individual trust perceptions. Conceptually, we follow a three-layer approach (Fig. 3.2), realizing trust emergence concepts that support our motivating scenario.

Interaction Layer. On the bottom layer, interactions are logged and analyzed. From atomic interactions, more meaningful and aggregated interaction metrics are extracted in subsequent time intervals by means of message correlation and pattern detection. Domain-specific interaction metrics are determined in preconfigured scopes.

Personal Trust Layer. On the middle layer, interaction metrics are combined and weighted individually ('interpreted') by applying configured rules, and a
Fig. 3.2 Layered trust emergence approach. Interaction Layer: interaction logging, context-aware interaction analysis, actor profile extraction; Personal Trust Layer: personalized metric interpretation, temporal trust evaluation, scope-dependent trust relations; Trust Projection Layer: reputation (trust aggregation), recommendation, trust mapping, trust mirroring, trust teleportation
F. Skopik et al.
fundamental trust network is established. These personal trust relations reflect the individual trust perception of the actors.

Trust Projection Layer. On the top layer, potential future trust relations are predicted where no personal trust has been established yet. While the interaction and trust metrics on the first two layers are calculated offline in fixed subsequent time intervals (due to the potentially high computational effort for personalized trust relations in large networks), trust projection on the top layer is performed dynamically when needed. In the following, we discuss the three layers in detail, focusing on personal trust.
3.4.1 Interaction Layer

The key concepts identified on this layer are (a) harnessing diverse available collaboration data, (b) enabling and capturing various types of interactions in SOA-based mixed systems environments, (c) accounting for context models associated with these interactions, and (d) defining trust scopes to allow a rule-based inference of trust from observed interactions.
3.4.1.1 Collaboration Data

Recently, various trust models have been published relying on various mathematical concepts, e.g., probability [22, 40, 55] or reasoning rules [38]. Most works make no assumption about the data used to determine trust, so trust models are completely decoupled from the underlying data. In particular, many approaches account only for one-dimensional manual user feedback (ratings) and categorize interactions only into cooperative and defective ones. However, the support of automatic inference demands observable evidence of trust. Therefore, the largest part of the data we use comes from observing interactions (see the research area of complex event processing1). Besides interactions, there are further data sources in collaborative and social networks that can be used to express the diversity of trust, and that are utilized for some higher-level trust projection concepts. We identified the following sources, common in most large-scale SOA-based networks, such as the one described in the motivating use case:

• Interactions: Interactions provide evidence about the success of previous collaboration encounters. We distinguish fundamental interactions, such as e-mail traffic, instant messaging, VoIP communication, and SOAP/REST-based service invocations, from higher-level interactions, which mostly rely on fundamental interactions but are annotated with their semantic meaning, e.g., file exchange, report submission, and task delegation.
1 http://complexevents.com/
• Profiles: Profiles contain valuable information about actors. Human profiles cover professional background, job position, skills, and expertise; service profiles can contain vendor information, features, and capabilities. Similarities of profiles are utilized by higher-level trust concepts, including trust mirroring and teleportation (detailed in the following sections). Profiles may be entered completely manually, or are (partly) determined dynamically based on interactions. An example for inferring expertise from activity involvements can be found in [45].

• Structural Relations and Hierarchies: Knowledge about memberships in groups, roles of humans or services, and joint activities and projects can be used to extend the notion and perception of trust. For instance, in a business environment someone may require that potentially trusted partners are members of the same team or are employed by a certain organization.

• Manually Declared Relations: Nowadays, standardized technologies for specifying friend (buddy) networks (FOAF2, XFN3) are in use. They enable people to define trust relationships explicitly, but also distrust ('foes'). Such explicit commitments may override and fine-tune automatically inferred relations.

Capturing all this information may raise privacy concerns. However, neither the content of interaction messages is stored nor is semantic analysis performed. We follow a purely structural approach, which makes extensive use of metadata instead of the actual message content. Furthermore, after analyzing interactions, logs can be deleted, so that only higher-level metrics remain. Data are stored and managed by a centralized architecture, so there is no need to propagate sensitive data through the network (as in peer-to-peer networks).
3.4.1.2 Context-Aware Interaction Observation

Our approach considers two different levels of interactions. On the bottom level we capture basic interactions, including fundamental exchanges of e-mail messages, VoIP calls, instant messages, and basic Web service invocations via SOAP or REST. In most cases there is no possibility to determine the semantic meaning of these interactions, e.g., the reason for initiating a Skype call; only whether a call is accepted, and its duration, can be observed. On the top level, however, Web service enabled interactions, using dedicated tools for delegating activities, performing periodic reports, and requesting help and support, provide more information on the reason for interactions and the nature of collaboration. Furthermore, in mixed service-oriented systems, consisting of humans and software services, we distinguish interactions according to the type of interacting
2 Friend-Of-A-Friend, http://xmlns.com/foaf/0.1/
3 XHTML Friends Network, http://www.gmpg.org/xfn/11
entities. Our framework – VieTE – accounts for the following types of interactions: (a) human-human, including instant messaging and e-mail via dedicated services with integrated interaction sensors; (b) human-service, typically service invocations via SOAP or REST interfaces, using an external access layer that intercepts and captures messages; (c) service-human, e.g., a reminder service or meeting scheduling service notifying humans about events; (d) service-service, in typical service compositions, e.g., modeled with BPEL.
3.4.1.3 Interaction Metrics and Scopes

Interaction metrics are calculated by observing (monitoring) and analyzing interactions. These metrics describe the interaction behavior of actors, either humans or services, in a mixed service-oriented system. Such metrics are, for instance, their responsiveness (e.g., measured by an average response time), the reliability in responding to requests, the ratio of performed to delegated tasks, or the variance in delivering periodic status reports. However, these interaction metrics are valid only in particular situations. For example, interaction behavior varies depending on the risk actors are facing, or the benefit they are receiving. Depending on the environment, several external factors may influence the interaction behavior of actors, such as their motivation, interest, or expertise. For example, in the help and support environment outlined in Sect. 3.2, people might be more responsive in their dedicated expertise areas than in topics outside their interests. Therefore, we introduce the notion of scopes. In general, scopes describe which activities are relevant for determining certain interaction metrics. All interactions within relevant activities are taken into account for metrics determination. Scopes are defined specifically for a domain and depend on business areas. In the previous Expert Web use case, for instance, one scope can define that all software implementation and software testing activities should be considered to describe someone's collaboration behavior in the area of software engineering. Other scopes could account for interactions regarding management activities of team leaders (e.g., delivering status reports, delegating tasks to team members, etc.), or aggregate interactions from risky activities only. The definition of scopes is supported in two different forms:

• Tag-based Scopes: Scopes are represented as lists of keywords (tags).
All interactions within activities whose descriptions incorporate these keywords are taken into account for metrics calculation.

• Activity-based Scopes: Scopes are determined by matching constraints on explicitly defined activities, e.g., matching activity type, a minimum team size to indicate mass collaboration, or a maximum risk. All interactions that take place in the context of matching activities are considered for interaction metrics calculation.

If more than one tag is set, or several constraints are defined, there will be interaction contexts (i.e., activities) that match only partly (e.g., only two of three tags). In this case, the impact of interactions on a metric is weighted based on the degree of match. Observing, logging, aggregating, and analyzing interactions in mixed systems, as well as calculating metrics, have been studied in [1, 45, 49].
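As a minimal sketch of tag-based scope matching with partial-match weighting (the function names and the simple overlap ratio are our assumptions for illustration):

```python
def match_degree(scope_tags, activity_tags):
    """Degree of match: fraction of scope tags found in the activity's
    description, e.g., two of three tags -> 2/3."""
    scope, activity = set(scope_tags), set(activity_tags)
    return len(scope & activity) / len(scope) if scope else 0.0

def weighted_metric(scope_tags, observations):
    """Weighted average of metric values observed in activity contexts;
    observations is a list of (activity_tags, metric_value) pairs."""
    pairs = [(match_degree(scope_tags, tags), value)
             for tags, value in observations]
    total = sum(w for w, _ in pairs)
    return sum(w * v for w, v in pairs) / total if total else None
```

Interactions from fully matching activities thus dominate the metric, while partly matching contexts still contribute proportionally.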
3.4.2 Personalized Trust Inference

We model the network of humans and services with their trust relations as a directed graph G = (V, E), where the vertices V denote the members of the network and the edges E reflect the trust relations between them. General profiles of network members are attached to the vertices. Furthermore, both vertices and edges are annotated with various metrics that describe the collaboration behaviors of network members and their relationships. A community comprises a subset of vertices (and references to their connecting edges). Figure 3.3 visualizes this model. We distinguish the following classes of metrics:

• Interaction Metrics (subsets of MEdge) describe the interaction behavior as explained before, such as an actor's responsiveness and reliability in distinct scopes.
Fig. 3.3 Social and collaborative network model. A Community within the collaborative network comprises Vertex and Edge elements, both annotated with Metric elements (MEdge, MCollaboration, MGroup). Annotations: (1) metrics depend on scopes; (2) MEdge covers interaction, similarity, or trust metrics
• Similarity Metrics (subsets of MEdge) provide information about skill, feature, or expertise similarities, depending on the type of actors.

• Trust Metrics (subsets of MEdge) are interpreted from interaction and similarity metrics, e.g., personal trust, symmetry of trust relations (bidirectional trust), and trust trends in certain time intervals.

• Collaboration Metrics (MCollaboration) are bound to a user and describe, independently of collaboration partners, someone's previous experiences, such as expertise collected by performing activities, and behavior, e.g., reciprocity [36]. Furthermore, edge metrics can be aggregated to calculate collaboration metrics; for instance, an average value of someone's responsiveness or availability.

• Group Metrics (MGroup) provide information about average values and the distribution of vertex and edge metrics in a community; therefore, they are a valuable means to determine a metric value relative to others in the same group.

Users of the system, i.e., the network members, specify rules for interpreting calculated interaction metrics as trust. For this purpose, we utilize an approach based on fuzzy set theory (see more details in the next section) that (a) enables users to express their rules in almost natural language (similar to a domain-specific language), and (b) offers elegant and efficient mechanisms to aggregate fuzzy expressions of trust. We decided to build a rule-based system, instead of a fixed analytical model, because of the flexibility of the environment. Humans and services (i.e., service vendors) should be able to define their personal rules that have to be satisfied to establish trust in others. For instance, let us assume the members of the network want to describe influencing factors for establishing trust in a software engineer.
There are various factors on which trust may rely: (a) formal skills provided by profiles, including certificates such as university degrees; (b) collected experience and previous success (e.g., performed activities of different types in the scope of software engineering); (c) particular interaction behavior, such as support reliability and quality in the previously introduced Expert Web use case; or (d) capabilities and behavior in related scopes of software engineering.
3.4.3 Trust Projection Layer

In large-scale networks with thousands of humans and services, each member interacts with only a small number of potential partners, leading to only a small portion of personal trust relations from each member's point of view. Therefore, several concepts have been introduced to predict relations that do not yet exist, e.g., recommendation by means of trust propagation, and reputation by means of trust aggregation. We extend this list by three novel concepts, based on the similarities of actors, their trust perceptions, and the situations described by context data:

• Trust mapping deals with using trust relations established in other, but to some extent similar, scopes (e.g., related expertise areas).
• Trust mirroring [50] implies that actors with similar profiles (interests, skills, community membership) tend to trust each other more than completely unknown actors.

• Trust teleportation [50] rests on the similarity of human or service capabilities, and describes that trust in a member of a certain community can be teleported to other members. For instance, if an actor belonging to a certain expert group is trusted because of his distinguished knowledge, other members of the same group may benefit from this trust relation as well.
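A hedged sketch of these projection concepts, using Jaccard overlap of profile tags as the similarity measure (the similarity function and the combination rules are illustrative assumptions, not prescribed by the model):

```python
def jaccard(a, b):
    """Set overlap in [0, 1] as a simple profile similarity measure."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def mirrored_trust(own_profile, other_profile):
    """Trust mirroring: similar profiles (interests, skills) raise the
    predicted trust toward an otherwise unknown actor."""
    return jaccard(own_profile, other_profile)

def teleported_trust(trust_in_member, member_profile, candidate_profile):
    """Trust teleportation: trust in one member of an expert group is
    projected to another member, discounted by capability similarity."""
    return trust_in_member * jaccard(member_profile, candidate_profile)
```

Both predictions are computed on demand, matching the dynamic evaluation of the trust projection layer.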
3.5 Fuzzy Set Theory for Trust Inference

Fuzzy set theory, developed by Zadeh [56], and fuzzy logic emerged in the domain of control engineering, but are nowadays increasingly used in computer science to enable lightweight reasoning on sets of imperfect data or knowledge. The concept of fuzziness has been used in trust models before [16, 42, 48], however, to the best of our knowledge not to enable a personalized interpretation of trust from larger and diverse sets of metrics calculated upon observable interactions. As fuzzy inference is a key mechanism of our trust model, we introduce the fundamental definitions in this section. Various further literature on fuzzy set theory exists, for instance [59].

Zadeh [56] defined a fuzzy set A in X (A ⊆ X) to be characterized by a membership function μ_A(x): X → [0, 1] which associates with each point in X a real number in the interval [0, 1], with the value of μ_A(x) at x representing the 'grade of membership' of x in A. Thus, the nearer the value of μ_A(x) to 1, the higher the grade of membership of x in A. When A is a set in the ordinary sense of the term, its membership function can take only two values (μ_A(x): X → {0, 1}, (3.1)), according as x does or does not belong to A. In this case μ_A(x) reduces to the familiar characteristic function of the set A.

    μ_A(x) = 1 if x ∈ A;  0 if x ∉ A                                (3.1)

Equation (3.2) depicts an example definition of a membership function μ_A(x) describing a fuzzy set. This membership function is part of the linguistic variable 'responsiveness' highlighted in Fig. 3.4a, left side.
    μ_A(x) = 0           if 0 ≤ x < 12
             x/12 − 1    if 12 ≤ x < 24
             1           if 24 ≤ x < 48
             5 − x/12    if 48 ≤ x < 60
             0           else                                       (3.2)
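The piecewise definition of (3.2) translates directly into code; a minimal sketch:

```python
def mu_A(x):
    """Membership function of (3.2): a trapezoid over response time in
    hours, part of the 'responsiveness' linguistic variable."""
    if 12 <= x < 24:
        return x / 12 - 1    # rises linearly from 0 to 1
    if 24 <= x < 48:
        return 1.0           # full membership
    if 48 <= x < 60:
        return 5 - x / 12    # falls linearly from 1 to 0
    return 0.0               # outside [12, 60), including 0 <= x < 12
```

For example, a response time of 18 h lies halfway up the rising edge and receives a membership grade of 0.5.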
Fig. 3.4 An example showing fuzzy trust inference. Applied interaction metrics are response time tr = 18 h and success rate sr = 75%. (a) Definition of membership functions (low, medium, high) and fuzzified interaction metrics for the linguistic variables response time tr [h] and success rate sr [%]; (b) four applied fuzzy rules following the max-min inference; (c) defuzzification by determining the center of gravity (trust τ = 0.61)
Two or more fuzzy sets, describing the same characteristic (i.e., metric), can be merged to a linguistic variable. For instance, in Fig. 3.4a the linguistic variable 'responsiveness' is described by three fuzzy sets: high, medium, and low.

if tr is low    and sr is high   then τ is full
if tr is low    and sr is medium then τ is high
if tr is low    and sr is low    then τ is low
if tr is medium and sr is high   then τ is high
if tr is medium and sr is medium then τ is medium
if tr is medium and sr is low    then τ is low
if tr is high   and sr is high   then τ is medium
if tr is high   and sr is medium then τ is low
if tr is high   and sr is low    then τ is low

Listing 3.1 Given the linguistic variables response time tr, success rate sr, and trust τ, with the membership functions as defined in Fig. 3.4, we provide this rule base to the fuzzy engine
The definition of linguistic variables (and their single membership functions, respectively) has to be performed carefully, as they determine the operation of the reasoning process. Linguistic variables are defined either for the whole community, for groups, or even for single network members, by:

• A domain expert, using his experience and expertise. However, depending on the complexity of the rules and aggregated metrics, continuous manual adjustments are needed (especially when bootstrapping the trust system).

• The system itself, based on knowledge about the whole community. For instance, the definition of a 'high' skill level is determined by the best 10% of all network members in certain areas.

• The users, based on individual constraints. For example, a 'high' skill level from user u's point of view starts with having more than double his own score.
Let X_A and X_B be two feature spaces, and A and B sets that are described by their membership functions μ_A and μ_B, respectively. A fuzzy relation μ_R(x_A, x_B): X_A × X_B → [0, 1] associates with each element (x_A, x_B) of the Cartesian product X_A × X_B a membership degree in [0, 1]. Fuzzy relations are defined by a rule base (see the example in Listing 3.1), where each rule, as shown in (3.3), comprises a premise p (condition to be met) and a conclusion c.

    IF p THEN c                                                     (3.3)
Approximate reasoning by evaluating the aforementioned rule base needs some fuzzy operators to be defined [56]: OR, AND, and NOT.

    A OR B:  μ_{A∪B}(x) = max(μ_A(x), μ_B(x))   for x ∈ X           (3.4)

    A AND B: μ_{A∩B}(x) = min(μ_A(x), μ_B(x))   for x ∈ X           (3.5)

    NOT A:   μ_{¬A}(x) = 1 − μ_A(x)             for x ∈ X           (3.6)
The defuzzification operation [26] determines a discrete (sharp) value x_s from the inferred fuzzy set X. For that purpose, all single results obtained by evaluating the rules (see Fig. 3.4b) are combined, forming a geometric shape. One of the most common defuzzification methods is to determine the center of gravity of this shape, as depicted in the example in Fig. 3.4c. In general, center of gravity defuzzification determines the x-component x_s of the center of gravity of the area below the membership function μ_X(x) (see (3.7)).

    x_s = ∫ x · μ_X(x) dx / ∫ μ_X(x) dx                             (3.7)

Figure 3.5 depicts possible trust values after defuzzification for the metrics tr and sr when applying the membership functions defined in Fig. 3.4 and the rule base in Listing 3.1.
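Max-min inference over the rule base of Listing 3.1, combined with center-of-gravity defuzzification (3.7), can be sketched as follows. The trapezoidal membership functions below are simplified stand-ins, not the exact curves of Fig. 3.4, so the resulting trust values only approximate those of the chapter's example:

```python
def trapezoid(x, a, b, c, d):
    """Membership rising on [a, b], 1 on [b, c], falling on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Linguistic variables with assumed breakpoints (cf. Fig. 3.4a).
TR = {  # response time in hours: lower is better
    "low":    lambda x: trapezoid(x, -1, 0, 12, 24),
    "medium": lambda x: trapezoid(x, 12, 24, 48, 60),
    "high":   lambda x: trapezoid(x, 48, 60, 96, 97),
}
SR = {  # success rate in percent: higher is better
    "low":    lambda x: trapezoid(x, -1, 0, 10, 50),
    "medium": lambda x: trapezoid(x, 10, 50, 50, 100),
    "high":   lambda x: trapezoid(x, 50, 100, 100, 101),
}
TRUST = {  # output variable on [0, 1]
    "low":    lambda x: trapezoid(x, -0.1, 0.0, 0.2, 0.4),
    "medium": lambda x: trapezoid(x, 0.2, 0.4, 0.6, 0.8),
    "high":   lambda x: trapezoid(x, 0.6, 0.8, 0.9, 1.0),
    "full":   lambda x: trapezoid(x, 0.8, 0.9, 1.0, 1.1),
}

# Rule base of Listing 3.1: (tr term, sr term) -> trust term.
RULES = {
    ("low", "high"): "full",        ("low", "medium"): "high",
    ("low", "low"): "low",          ("medium", "high"): "high",
    ("medium", "medium"): "medium", ("medium", "low"): "low",
    ("high", "high"): "medium",     ("high", "medium"): "low",
    ("high", "low"): "low",
}

def infer_trust(tr, sr, steps=1000):
    """Max-min inference: AND = min (3.5), rule aggregation = max (3.4),
    then center-of-gravity defuzzification (3.7), discretized on [0, 1]."""
    def aggregated(x):
        degree = 0.0
        for (t_term, s_term), out_term in RULES.items():
            firing = min(TR[t_term](tr), SR[s_term](sr))  # premise (AND)
            degree = max(degree, min(firing, TRUST[out_term](x)))
        return degree
    xs = [i / steps for i in range(steps + 1)]
    num = sum(x * aggregated(x) for x in xs)
    den = sum(aggregated(x) for x in xs)
    return num / den if den else 0.0

tau = infer_trust(18, 75)  # fast responder with a good success rate
```

A responsive actor with a high success rate thus ends up with a defuzzified trust value well above that of a slow, unreliable one.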
3.6 Trust Model Definitions

The trust model manages context-dependent trust between actors, i.e., humans and services, emerging from interactions (see Fig. 3.7) that are captured and interpreted. A trust relation is always asymmetric, i.e., a directed edge from one vertex to another in G. We call the trusting actor the trustor u (the source of an edge), and the trusted entity the trustee v (the sink of an edge). Analyzed interactions are any kind of communication, coordination, or execution actions initiated by u regarding v. The context of interactions reflects the situation and reason for their occurrence, and is modeled as activities. Activities, as presented in [49] and shortly discussed before, describe work-relevant context elements. When interactions are interpreted, only a minor subset of all describing context elements is relevant within a trust scope. In the motivating use case of this work, such a trust scope may describe the expertise area.
Fig. 3.5 Trust inference results visualization for the given rule base (trust τ over response time tr ∈ [0, 100] h and success rate sr ∈ [0, 100]%; e.g., tr = 18 h, sr = 75% yields τ = 0.609)
3.6.1 Fundamental Trust Model

Available metrics are processed by individually configured fuzzy (event-)condition-action rules. These rules define conditions to be met by metrics M for interpreting trustworthy behavior, e.g., 'the responsiveness of the trustee must be high' or 'a trustworthy software programmer must have collected at least average experience in software integration activities'. Rules reflect a user's trust perception; e.g., pessimists may demand stricter trustworthy behavior than optimists. On top of metrics, the confidence c^s(u, v) ∈ [0, 1] of u in v in scope s is determined. This confidence, built upon the available interaction, collaboration, and similarity metrics M(u, v) that describe the relationship from u to v, represents recent evidence that an actor behaves dependably, securely, and reliably. Besides highly dynamic interaction metrics, actor profiles P may be considered during calculation, e.g., a human actor's education or a service's vendor. The function f_c (3.8) evaluates u's fuzzy rule set R_c(u) to determine the confidence c in scope s in his collaboration partners (e.g., v). This confidence value is normalized to [0, 1] according to the evaluation results of the rule base.

    c^s(u, v) = f_c(u, M(u, v), P(v), R_c(u), s)                    (3.8)
The reliability of confidence, ρ(c^s(u, v)) ∈ [0, 1], ranging from totally uncertain to fully confirmed, depends mainly on the amount of data used to calculate confidence (more data provide higher evidence) and the variance of metric values collected over time (e.g., stable interaction behavior is more trustworthy; see later about temporal evolvement). The function f_ρ (3.9) determines the reliability of the confidence c^s(u, v), relying on the metrics utilized in R_c(u). As the determination of reliability can be quite complex (considering temporal trends and variances of metrics), and the additional personal setup of this measure could be very demanding for end-users, we let a domain expert configure a global reliability measure that accounts for the metrics in R_c(u) of the respective network members.

    ρ(c^s(u, v)) = f_ρ(u, M(u, v), P(v), R_c(u), s)                 (3.9)
Our engine infers personal trust τ^s(u, v) ∈ [0, 1] by combining confidence with its reliability (see the operator ⊗ in (3.10)). This can be performed either rule-based, by attenuating confidence with respect to reliability, or arithmetically, for instance by multiplying confidence with reliability (as both are scaled to the interval [0, 1]). Since trust relies directly on confidence that is inferred by evaluating personal rules, an actor's personal trust relation in this model indeed reflects its subjective criteria for trusting another actor.

    τ^s(u, v) = ⟨c^s(u, v), ρ(c^s(u, v)), ⊗⟩                        (3.10)
We introduce the trust vector T^s(u) to enable efficient trust management in the Web of Trust. This vector is composed of the single personal trust relations (outgoing edges of a vertex in G) from an actor u to others in scope s (3.11).

    T^s(u) = ⟨τ^s(u, v), τ^s(u, w), τ^s(u, x), …⟩                   (3.11)
The trust matrix T^s comprises the trust vectors of all actors in the environment, and is therefore the adjacency matrix of the mentioned trust graph G. In this matrix, as shown in (3.12) for four vertices V = {u, v, w, x}, each row vector describes the trusting behavior of a particular actor (T^s), while the column vectors describe how much an actor is trusted by others. If no relation exists, such as for self-connections, this is denoted by the symbol ⊥.

    T^s = ( ⊥          τ^s(u, v)   τ^s(u, w)   τ^s(u, x) )
          ( τ^s(v, u)  ⊥           τ^s(v, w)   τ^s(v, x) )
          ( τ^s(w, u)  τ^s(w, v)   ⊥           τ^s(w, x) )
          ( τ^s(x, u)  τ^s(x, v)   τ^s(x, w)   ⊥         )          (3.12)
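Since most relations are absent (⊥) in a large network, the trust matrix is naturally sparse; a minimal sketch of an assumed dict-of-dicts representation (not the authors' implementation):

```python
def trust_vector(T, u):
    """Row of the matrix: u's personal trust relations in scope s."""
    return T.get(u, {})

def trusted_by(T, v):
    """Column of the matrix: how much v is trusted by other actors."""
    return {u: row[v] for u, row in T.items() if v in row}

# Illustrative trust values in [0, 1]; missing entries stand for absent
# relations (the symbol for no relation, e.g., self-connections).
T = {
    "u": {"v": 0.8, "w": 0.3},
    "v": {"u": 0.6},
    "w": {"v": 0.9},
}
```

With this layout, `trust_vector` reads a row of the matrix and `trusted_by` a column, without storing the empty cells.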
In cases where actors define their personalized trust inference rules, the trust perception p^s(u) represents the 'trusting behavior' of u, i.e., its attitude to trust others in scope s. The absolute value of p^s(u) is not of major importance, but it is meaningful to compare the trust perceptions of various actors. Basically, this is performed by comparing their rule bases for trust inference (3.13), e.g., whether actors account for the same metrics, or whether they are shaped by optimism or pessimism. More similar rules therefore mean more similar requirements for trust establishment. The application of trust perception becomes clear when discussing the trust projection concepts, such as weighting received recommendations based on the similarity of the recommender's trust perception.

    sim_percep(p^s(u), p^s(v)) = sim(R_c^s(u), R_c^s(v))            (3.13)
3.6.2 Temporal Evaluation

Personal trust τ^s(u, v) from u in v is updated periodically in successive time intervals t_i, numbered with consecutive integers starting with zero. We denote the personal trust value calculated at time step i as τ_i^s. As trust evolves over time, we do not simply replace old values, i.e., τ_{i−1}^s, with newer ones, but merge them according to pre-defined rules. For this purpose we apply the concept of the exponential moving average4 (EMA) to smoothen the sequence of calculated trust values, as shown in (3.14), where τ̂_i^s denotes the trust value freshly calculated for interval i.

    τ_i^s = α · τ̂_i^s + (1 − α) · τ_{i−1}^s                         (3.14)
With this method, we are able to adjust the importance of the most recent trust behavior τ̂^s compared to the history trust values τ^s (smoothing factor α ∈ [0, 1]). In case there are no interactions between two entities, but an existing trust relation, the reliability of this trust relation is lowered by a small amount in each evaluation interval. Therefore, similar to reality, trust between entities is reduced stepwise if they do not interact frequently. Figure 3.6 shows an example of applied EMA. The dashed line represents the trustworthiness of an actor's behavior, i.e., τ̂_i^s for the i-th time interval, calculated independently from previous time intervals. In this extreme situation an actor behaves fully trustworthily, drops to zero, and behaves trustworthily again. Similar to reality, EMA enables us to memorize drops in recent behavior. If an actor once behaved untrustworthily, it will likely take some time to regain full trust. Therefore, depending on the selected tuning parameter α, different strategies for merging current trust values with the history can be realized. According to (3.14), for α > 0.5 the actual behavior is weighted more; otherwise the history gains more importance. Figure 3.6 shows three smoothened time lines, calculated with different smoothing factors. There exist several other approaches to trust evolution which work with deep histories, e.g., [53]; however, EMA requires less memory and lower computational effort.
4 http://www.itl.nist.gov/div898/handbook/
Fig. 3.6 Smoothing of trust values over time. (a) Evolution of trust applying EMA with smoothing factors α = 0.2, 0.5, and 0.8; (b) optimistic and pessimistic perception of trust modeled with adaptive EMA
As shown in Fig. 3.6, by applying EMA either previous or current behavior is given more importance. However, personal traits, such as being optimistic or pessimistic, demand more sophisticated rules of temporal evaluation. In our case, we define an optimist as somebody who predominantly remembers positive and contributing behavior and tends to quickly forgive short-term unreliability. In contrast, a pessimist loses trust even for short-term unreliability and needs more time to regain trust than the optimist. Examples of this behavior are depicted in Fig. 3.6. Optimistic and pessimistic perceptions are realized by adapting the smoothing factor α according to (3.15). Whenever the curve depicted in Fig. 3.6 changes its sign, τ_i^s is re-calculated with an adapted α. A small deviation denotes that the smoothing factor is either near 0 or near 1, depending on falling or rising trustworthiness. An enhanced version of this approach may adapt the parameters in more fine-grained intervals, for instance, by considering lower and higher drops/rises of trustworthiness.
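The plain EMA update (3.14) and a coarse adaptive smoothing factor for optimists and pessimists can be sketched as follows; the concrete α values (0.8/0.2) are assumptions for illustration only:

```python
def ema_update(tau_prev, behavior, alpha):
    """(3.14): new trust = alpha * fresh behavior + (1 - alpha) * history."""
    return alpha * behavior + (1 - alpha) * tau_prev

def adaptive_alpha(tau_prev, behavior, optimist=True):
    """On rising trustworthiness, optimists use a large alpha (quick to
    regain trust) and a small one on drops (quick to forgive); for
    pessimists it is the other way round."""
    rising = behavior >= tau_prev
    if optimist:
        return 0.8 if rising else 0.2
    return 0.2 if rising else 0.8

def evolve(behavior_series, optimist=True):
    """Smoothen a series of independently calculated trust values."""
    tau = behavior_series[0]
    history = [tau]
    for b in behavior_series[1:]:
        tau = ema_update(tau, b, adaptive_alpha(tau, b, optimist))
        history.append(tau)
    return history

# Fully trustworthy behavior, a drop to zero, then recovery (cf. Fig. 3.6b)
series = [1.0] * 5 + [0.0] * 5 + [1.0] * 5
```

Running `evolve(series)` for both perceptions reproduces the qualitative effect of Fig. 3.6b: the optimist's trust decays slowly during the drop and recovers quickly, while the pessimist's collapses fast and recovers slowly.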
Listing 3.3 HPS WSDL binding
The GenericResource defines common attributes and metadata associated with resources such as documents or policies. A GenericResource can encapsulate remote resources that are hosted by a collaboration infrastructure (e.g., document management). Request defines the structure of an RFS (here we show a simplified example). A Reply is the corresponding RFS response (we omitted the actual XML definition). The protocol (at the technical HPS middleware level) is asynchronous, allowing RFSs to be stored, retrieved, and processed. For that purpose we implemented a middleware service (the HPS Access Layer, HAL) which dispatches and routes RFSs. GetSupport depicts a WSDL message corresponding to the RFS SupportRequest. Upon receiving such a request, HAL generates a session identifier contained in the output message AckSupportRequest. A notification is sent to the requester (assuming a callback destination or notification endpoint has been provided), for example to deliver RFS status updates; processed RFSs can be retrieved via GetSupportReply. The detailed notification mechanism can be found in [45].
3.8.8 Interaction Monitoring and Logging

The HPS Access Layer logs each service interaction (request and response message) through a logging service. RFSs and their responses, exchanged between community members, are modeled as traditional SOAP calls, but with various header extensions, as shown in Listing 3.4. The most important extensions are:
• Timestamp captures the actual creation of the message and is used to calculate temporal interaction metrics, such as average response times.
• Delegation holds parameters that influence delegation behavior, such as the number of subsequent delegations numHops (to avoid circulating RFSs) and hard deadlines.
• Activity URI describes the context of interactions (see [46] for the activity model).
• MessageID enables message correlation, i.e., to properly match requests and responses.
• WS-Addressing tags, besides MessageID, are used to route RFSs through the network.
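To illustrate, a logged interaction carrying these header extensions might be represented as follows. The data model is a hypothetical sketch mirroring the extensions above (MessageID, Timestamp, Delegation numHops, Activity URI), not the actual HPS middleware schema:

```python
from dataclasses import dataclass
from datetime import datetime

# Illustrative per-message log record; field names and values are
# assumptions, not the actual HPS data model.

@dataclass
class InteractionLogEntry:
    message_id: str      # MessageID: correlates request and response
    timestamp: datetime  # creation time, for temporal metrics
    sender: str
    receiver: str
    activity_uri: str    # collaboration context of the interaction
    num_hops: int        # remaining delegations before the RFS is dropped
    is_response: bool

log = [
    InteractionLogEntry("uuid-1", datetime(2011, 5, 2, 10, 0),
                        "http://.../Actor#Florian", "http://.../Actor#Daniel",
                        "http://.../Activity#A1", 3, False),
    InteractionLogEntry("uuid-1", datetime(2011, 5, 2, 14, 30),
                        "http://.../Actor#Daniel", "http://.../Actor#Florian",
                        "http://.../Activity#A1", 3, True),
]

# correlate messages by MessageID, as the metric calculation requires
pairs = {}
for entry in log:
    pairs.setdefault(entry.message_id, []).append(entry)
response_hours = (pairs["uuid-1"][1].timestamp
                  - pairs["uuid-1"][0].timestamp).total_seconds() / 3600
print(response_hours)  # 4.5
```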
58
F. Skopik et al.
[Listing body, condensed: a SOAP RFS message with a uuid MessageID, sender http://.../Actor#Florian, receiver http://.../Actor#Daniel, type http://.../Type/RFS, subject "WSDL consumption with Axis2?", body "Axis2 reports a parsing error while consuming the given resource. What is wrong?", tags: WSDL, Axis2]
Listing 3.4 Simplified RFS via SOAP example
3.8.9 Metric Calculation

Metrics describe the interaction behavior and dynamically changing properties of actors. Currently, we account for the metrics described in Table 3.1 for trust interpretation upon logged SOAP calls in the Expert Web scenario. Note that, as described before, these metrics are determined for particular scopes, i.e., based on a subset of interactions that meet certain constraints. The availability of a service, whether provided by humans or implemented in software, can be high in one scope, but much lower in another one. Furthermore, these metrics are calculated for each directed relation between pairs of network members. An actor u might serve v reliably, but not a third party w.
Table 3.1 Metrics utilized for trust inference

Metric name       Range     Description
Availability      [0,100]   Ratio of accepted to received RFSs
Response time     [0,96]    Average response time in hours
Success rate      [0,100]   Amount of successfully served RFSs
Experience        [0,1]     Number of RFSs served
RFS reciprocity   [-1,1]    Ratio of processed to sent RFSs
Manual reward     [0,5]     Optional manually assigned scores
Costs             [0,5]     Price for serving RFSs
3 Modeling and Mining of Dynamic Trust in Complex Service-Oriented Systems
59
Our approach relies on mining of metrics; thus, values are not manually entered but are frequently updated by the system. This enables collaboration partners to keep track of the dynamics in highly flexible large-scale networks. Besides interaction behavior in terms of reliability or responsiveness, context-aware experience mining can also be conducted. This approach is explained in detail in [45]. In the trust inference examples in previous sections, we accounted for the average response time tr (3.21) of a service and its success rate sr (3.22). These are typical metrics for an emergency help and support environment, where fast and reliable support is absolutely required, but costs can be neglected. We assume similar complexity of requests for support (RFSs) in a scope s; thus, different RFSs require comparable efforts from services (similar to a traditional Internet forum). The response time is calculated as the duration between sending (or delegating) a request (t_send) to a service and receiving the corresponding response (t_receive), averaged over all served RFSs. Unique IDs of calls (see SOAP header in Listing 3.4) enable sophisticated message correlation to identify corresponding messages.

tr^s = ( Σ_{rfs ∈ RFS} (t_receive(rfs) - t_send(rfs)) ) / |RFS|     (3.21)
An RFS is considered successfully served (sRFS) if it leads to a result before a predefined deadline; otherwise it fails (fRFS).

sr^s = num(sRFS) / (num(sRFS) + num(fRFS))     (3.22)
3.8.10 Trust Provisioning

The Social Network Provisioning WS (see Fig. 3.9) is a WSDL-based Web service that provides the dynamically changing Web of Trust as a standardized directed graph model. It is a major part of the VieTE framework and is used by other services, such as partner discovery tools, to retrieve social relations for service personalization and customization in virtual communities. The Web service interface deals with the following fundamental types of entities:
• Vertex: A vertex describes either a human, a software service, or an HPS.
• Edge: An edge reflects the directed relation between two vertices.
• Metric: Metrics describe properties of either vertices (such as the number of interactions with all partners, or the number of involved activities) or edges (such as the number of invocations of a particular service by a particular human). Metrics are calculated from interactions and provided profiles with respect to pre-configured rule sets (e.g., only interactions of a particular type are considered in the trust determination process).
• Scope: Rules determine which interactions and collaboration metrics are used for trust calculation. These rules describe the constraints for the validity of calculated metrics, i.e., the scope of their application. Common scopes are pre-configured and can be selected via the Web service interface.

The Social Network Provisioning WS enables the successive retrieval of the Web of Trust starting with a predefined vertex, e.g., reflecting the current service user. We specify its interface as shown in Table 3.2. Note that, for data retrieval, metrics are merged into the entities vertex and edge. All entities are identified by a URI, which is a combination of a base path (e.g., www.infosys.tuwien.ac.at), the entity type (e.g., vertex), and an integer id.
Table 3.2 Social Network Provisioning WS interface specification

Method name         Parameter                        Description
getVertex           vertexURI                        Get the vertex object with the given URI
getVerticesByName   vertexName (regex)               Get a list of vertices with matching names
getAllVertices      –                                Get all vertices (can be restricted to a maximum number for performance reasons)
getEdge             edgeURI                          Get the specified edge
getEdges            sourceVertexURI, sinkVertexURI   Get all directed edges from sourceVertex to sinkVertex
getOutEdges         sourceVertexURI                  Get all out-edges of the specified vertex
getInEdges          sinkVertexURI                    Get all in-edges of the specified vertex
getScope            scopeURI                         Get one particular scope in the network
getAllScopes        –                                Get all available scopes in the network
getSourceVertex     edgeURI                          Get the vertex object which is the source of the given edge
getSinkVertex       edgeURI                          Get the vertex object which is the sink of the given edge
getNeighbours       vertexURI, numHops               Get neighbors (independent of edge orientation); the optional parameter numHops may set the maximum path length from the specified vertex to resulting neighbors
getSuccessors       sourceVertexURI                  Get successors of the specified vertex
getPredecessors     sinkVertexURI                    Get direct predecessors of the specified vertex
getVersion          –                                Get version string
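As an illustration of the successive retrieval, the following hypothetical client sketch crawls a segment of the Web of Trust via the getOutEdges/getSinkVertex methods of Table 3.2. The stub object and the edge representation (a dict with a 'uri' field) are assumptions, not the actual generated WSDL client:

```python
# Hypothetical client-side crawl of the Web of Trust: start from one
# vertex and follow out-edges up to a depth limit. Only method names
# from Table 3.2 are used; everything else is an assumption.

def crawl_web_of_trust(client, start_vertex_uri, max_depth=2):
    visited = {start_vertex_uri}
    frontier = [(start_vertex_uri, 0)]
    edges = []
    while frontier:
        vertex_uri, depth = frontier.pop()
        if depth >= max_depth:
            continue
        for edge in client.getOutEdges(vertex_uri):
            edges.append(edge)
            sink = client.getSinkVertex(edge["uri"])
            if sink not in visited:
                visited.add(sink)
                frontier.append((sink, depth + 1))
    return visited, edges

# a tiny in-memory stand-in for the remote service, for illustration
class FakeClient:
    def __init__(self, adjacency):
        self.adjacency = adjacency
    def getOutEdges(self, vertex_uri):
        return [{"uri": vertex_uri + "->" + s}
                for s in self.adjacency.get(vertex_uri, [])]
    def getSinkVertex(self, edge_uri):
        return edge_uri.split("->")[1]

client = FakeClient({"a": ["b"], "b": ["c"], "c": ["d"]})
visited, edges = crawl_web_of_trust(client, "a", max_depth=2)
print(sorted(visited))  # ['a', 'b', 'c'] -- the depth limit stops before 'd'
```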
3.9 Evaluation and Discussion

In this section, we present the results of performance evaluations that examine major design decisions and VieTE's applicability in large-scale networks, as well as a functional evaluation that deals with the actual application of our trust inference approach for balancing communities.
3.9.1 Computational Complexity of Trust Management

A fundamental aspect of our trust management approach is the context-awareness of data and social relations. Due to the high complexity of large-scale networks comprising various kinds of interactions and distinct scopes of trust, we evaluate the feasibility of our framework with targeted performance studies. We focus on the most critical parts, i.e., potential bottlenecks, in our system, in particular on (a) trust inference upon interaction logs, (b) profile similarity measurement for trust mirroring and teleportation, (c) the calculation of recommendations based on mined graph structures, and (d) provisioning of graph segments to users. The conducted experiments address general technical and research problems in complex networks, such as emerging relations in evolving structures, graph operations on large-scale networks, and information processing with respect to contextual constraints.
3.9.1.1 Experiments Setup and Data Generation

For conducting our performance studies, we generate an artificial interaction and trust network that we would expect to emerge under realistic conditions. For that purpose we utilize the preferential attachment model of Barabasi and Albert to create⁶ network structures that are characteristic for science collaborations [41]. As shown in Fig. 3.10 for a graph with 500 vertices, the output is a scale-free network with vertex degrees⁷ following a power-law distribution. These structures are the basis for creating realistic interaction logs that are used to conduct trust inference experiments. For a graph G, we generate in total 100 · |E| interactions between pairs of vertices (u, v). In our experiments, we assume that 80% of interactions take place between 20% of the most active users (reflected by hub vertices with high degree). Generated interactions have a particular type (support request/response, activity success/failure notification) and timestamp, and occur in one of two abstract scopes. While we paid attention to creating a realistic amount and distribution of interactions that are closely bound to vertex degrees, the interaction properties
⁶ See the JUNG graph library: http://jung.sourceforge.net
⁷ The vertex size is proportional to the degree; white vertices represent 'hubs'.
[Figure: (a) scale-free graph structure; (b) power-law degree distribution, N(k) ∼ k^-2.5, plotted log-log as N(k) over k]
Fig. 3.10 Generated network applying preferential attachment
themselves, i.e., type and timestamp, do not influence the actual performance study (because they do not influence the number of required operations to process the interaction logs). For the following experiments, VieTE's trust provisioning service is hosted on a server with an Intel Xeon 3.2 GHz (quad core) and 10 GB RAM, running Tomcat 6 with Axis2 1.4.1 on Ubuntu Linux, and MySQL 5.0 databases. The client simulation that retrieves elements from the managed trust graph runs on a Pentium 4 with 2 GB RAM on Windows XP, and is connected to the server through a local 100 MBit Ethernet.
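The preferential-attachment generation described above can be sketched as follows. This is a simplified stand-in for the JUNG-based generator, with illustrative parameters rather than the chapter's exact configuration:

```python
import random

# Barabasi-Albert style preferential attachment: each new vertex
# connects to m existing vertices, chosen proportionally to degree,
# which yields the scale-free hub structure used in the experiments.

def preferential_attachment(n, m=2, seed=42):
    rng = random.Random(seed)
    edges = []
    repeated = []  # each vertex listed once per incident edge
    for v in range(m, n):
        chosen = set()
        while len(chosen) < m:
            # pick attachment targets proportionally to their degree;
            # fall back to the initial vertices while no edges exist
            chosen.add(rng.choice(repeated) if repeated else rng.choice(range(m)))
        for u in chosen:
            edges.append((v, u))
            repeated += [v, u]
    return edges

edges = preferential_attachment(500)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
# the 20% highest-degree vertices ('hubs') attract most connections
hubs = sorted(degree, key=degree.get, reverse=True)[:len(degree) // 5]
```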
3.9.1.2 Trust Inference Performance

Utilizing the available interaction properties, we calculate the previously discussed metrics (a) average response time tr, and (b) success rate sr (ratio of successes to the sum of success and failure notifications). Individual response times are normalized to [0,1] with respect to the highest and lowest values in the whole network. The rule base to infer confidence between each pair of connected vertices has been shown in Listing 3.1. If the amount of interactions |I(u,v)| between a pair (u,v) is below 10, we set the reliability of confidence to |I(u,v)|/10; otherwise we assume a reliability of 1. Trust is calculated by multiplying confidence with its reliability.

Interactions take place in the context of activities. Instead of creating artificial activity structures, we randomly assign context elements to synthetic interactions. These elements are represented by tags that are randomly selected from a predefined list. This list holds 5 different tags, and each interaction gets 2 to 4 of them assigned. Such tags may describe the activity type where an interaction takes place, e.g., 'software development', but also certain constraints, e.g., 'high risk'. We define 5 scopes, each described by exactly one possible tag. Thus, each interaction belongs to 2 to 4 scopes; and scopes may overlap. Interactions are uniformly distributed among scopes.

We measure the required time to completely process the synthetic interaction logs, including reading logs from the interaction database (SQL), aggregating logs and calculating metrics, normalizing metrics (here only the response time, because
Table 3.3 Trust inference performance results

Network characteristics   Mode        Computation time
Small-scale               No scopes   1 min 11 s
                          5 scopes    1 min 56 s
Medium-scale              No scopes   11 min 41 s
                          5 scopes    19 min 48 s
Large-scale               No scopes   109 min 03 s
                          5 scopes    182 min 37 s
the values of the success rate are already in [0,1]), inferring trust upon a predefined rule base, and updating the trust graph (EMA with α = 0.25). Experiments are performed for three networks of different sizes: small-scale with 100 vertices and 200 trust edges; medium-scale with 1,000 vertices and 2,000 edges; and large-scale with 10,000 vertices and 20,000 edges. Furthermore, trust is inferred (a) neglecting scopes (i.e., tags), and (b) for the scopes defined above. The results in Table 3.3 show that, especially for medium and large networks, only a periodic offline calculation is feasible. Note that the difference in computational effort between accounting for no context (no scopes) and all scopes is not as high as one might expect. The reason is that a significant amount of time is required for SOAP communication in both cases.
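The per-edge trust update, combining the reliability weighting with the EMA merge (α = 0.25), can be sketched as:

```python
# Sketch of the trust update per processed log batch: confidence (from
# the rule base) is weighted by its reliability, and the result is
# merged into the trust graph via EMA with alpha = 0.25.

ALPHA = 0.25  # fixed EMA smoothing factor from the experiment setup

def reliability(num_interactions, threshold=10):
    """Below `threshold` logged interactions, confidence is discounted
    by num_interactions / threshold; otherwise reliability is 1."""
    return min(num_interactions / threshold, 1.0)

def update_edge_trust(old_trust, confidence, num_interactions):
    """Weight confidence by reliability, then merge via EMA."""
    new_value = confidence * reliability(num_interactions)
    return ALPHA * new_value + (1 - ALPHA) * old_trust

print(update_edge_trust(0.5, confidence=0.9, num_interactions=4))   # approx. 0.465
print(update_edge_trust(0.5, confidence=0.9, num_interactions=25))  # approx. 0.6
```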
3.9.1.3 Profile Similarity Measurement

Trust mirroring and trust teleportation, as explained in Sect. 3.6, rely on mechanisms that measure the similarity of actors in terms of skills, capabilities, expertise, and interests. In contrast to common top-down approaches that apply taxonomies and ontologies to define certain skills and expertise areas, we follow a mining approach that addresses the inherent dynamics of flexible collaboration environments. In particular, skills and expertise as well as interests change over time, but are rarely updated if they are managed manually in a registry. Hence, we determine and update them automatically through mining. However, since trust mirroring and teleportation are mainly used in the absence of interaction data, we need to acquire other data sources. The creation of interest profiles without explicit user input has been discussed in [50]. That work assumes that users tag resources, such as bookmarks, pictures, videos, and articles, and thus express their distinct interests. In particular, a dataset from citeulike⁸ expresses people's use and understanding of scientific articles through individually assigned tags. We use these data to create dynamically adapting interest profiles based on tags (ATPs - actor tagging profiles) and manage them in a vector space model [50]. However, since arbitrary tags may be freely assigned – there is no agreed taxonomy – no strict comparison can be performed. Therefore, we cluster tags according to
⁸ http://www.citeulike.org/
[Figure: grouped bar chart of the number of ATP similarity measurements per similarity bin, Sim (0.0,0.2] to Sim (0.8,1.0), for cluster levels of comparison L5 down to L0]
Fig. 3.11 Similarity results among 10 realistic actor tagging profiles (ATPs)
their similarities and compare the actors' usage of tags on higher cluster levels. For instance, actors using tags belonging to the same cluster have similar interests, even if they do not use exactly the same tags. Hierarchical clustering enables us to regulate the fuzziness of similarity measurements, i.e., the size of tag clusters. The concrete mechanisms and algorithms are described in [50] and are therefore out of scope of this work. However, we outline the evaluation results of [50] to demonstrate the applicability of automatic actor profile creation and cluster similarity measurement, supporting the realization of trust mirroring and teleportation.

We determine the tagging profiles (ATPs) of 10 representative citeulike users in the domain of social networks. Then we compare these ATPs to find out to which degree actors use similar or the same tags. The fundamental question is whether we are able to effectively distinguish similarities of different degrees among ATPs. In other words, in order to apply trust mirroring and teleportation we need distinguishable similarity results, and not, e.g., all ATPs being somehow similar. Figure 3.11 shows the results of various profile similarity measurements. As explained, we compare profiles with varying fuzziness, i.e., on 5 different tag cluster levels. While on L5 each tag is in its own cluster, these clusters are consecutively merged until all tags are in the same cluster (L0). Hence, on L5 the most fine-grained comparison is performed, while on L0 all profiles are virtually identical. As shown, on L2 and L3 a small set of highly similar ATPs is identified, while the majority is still recognized as different. This is the desired effect required to mirror/teleport trust only to a small subset of available actors.

From a performance perspective, retrieving tags, aggregating and clustering them, and creating profiles takes some time. Especially mining these data on the Web is time-intensive.
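The cluster-level profile comparison can be sketched as follows. The cluster map is a toy stand-in for the hierarchical clustering of [50]; tag names and the cosine aggregation are illustrative assumptions:

```python
from collections import Counter
from math import sqrt

# Sketch of ATP comparison at a coarser cluster level: tags are first
# mapped to their cluster, then profiles are compared by cosine
# similarity in the resulting vector space.

def cluster_profile(tags, cluster_of):
    """Map raw tags to cluster identifiers and count occurrences."""
    return Counter(cluster_of.get(t, t) for t in tags)

def cosine(p, q):
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm = (sqrt(sum(v * v for v in p.values()))
            * sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

# hypothetical cluster assignment merging two synonymous tags
cluster_of = {"sna": "social-networks",
              "social-network-analysis": "social-networks"}
a = cluster_profile(["sna", "trust"], cluster_of)
b = cluster_profile(["social-network-analysis", "trust"], cluster_of)
print(cosine(a, b))  # approx. 1.0 at this cluster level, although the raw tags differ
```

At a finer level (no clustering), the same two profiles would share only the tag 'trust' and thus score lower, which mirrors the L5 to L0 coarsening described above.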
The overall performance highly depends on external systems
Table 3.4 Calculation times for rec^s with 10 and 100 recommenders

Recommendation calculation method   10 recommenders   100 recommenders
Client-side                         1.1 s             6.3 s
Server-side (SQL)                   0.46 s            2.2 s
Server-side (in-memory model)       0.28 s            0.34 s
Server-side (pre-calculation)       0.18 s            0.18 s
that provide required data, such as citeulike in our case. Therefore, further performance studies have been omitted here.
3.9.1.4 Network Management

This set of experiments deals with managing trust in a graph model and the calculation of recommendation and reputation on top of a large-scale trust network with 10,000 vertices. Table 3.4 depicts the required time in seconds to calculate the recommendation rec^s(u, v), having 10 and 100 recommenders in the same scope (i.e., intermediate vertices on connecting parallel paths (u, v) of length 2). Several ways to implement recommendations exist. First, a client may request all recommender vertices and their relations and calculate recommendations on the client side. Although this method is simple to implement on the provider side, it is obviously the slowest one due to the large amounts of transferred data. Retrieving all recommenders and relations directly from the backend database, but performing the calculation server-side, dramatically improves the performance. However, this method produces heavy load on the provider and its database and does not seem to be scalable. Therefore, we map the network data, i.e., a directed graph model with annotated vertices and edges, into memory and perform operations without the backend database. Since all data are held in memory, the performance of calculating recommendations online is comparable to the provisioning of pre-calculated data only. Hence, we design our system with an in-memory graph model, and further measure some aspects of this design decision. Figure 3.12a illustrates the required time for mapping the whole graph from the backend database to its in-memory representation. The effort increases linearly with the number of vertices in the graph. Figure 3.12b shows the memory consumption for graph instances of different sizes, first for the whole Social Network Provisioning Service, and second only for the graph object itself.
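A minimal sketch of the server-side, in-memory recommendation over parallel paths of length 2. Averaging the combined path trust is an assumed aggregation; the chapter's exact formula for rec^s may differ:

```python
# rec(u, v) aggregated over all intermediate recommenders r with
# trust edges u->r and r->v, held entirely in memory (a dict keyed by
# directed vertex pairs). Aggregation by averaging is an assumption.

def recommendation(trust_edges, u, v):
    paths = [trust_edges[(u, r)] * trust_edges[(r, v)]
             for (a, r) in trust_edges
             if a == u and (r, v) in trust_edges]
    return sum(paths) / len(paths) if paths else 0.0

trust_edges = {("u", "r1"): 0.8, ("r1", "v"): 0.9,
               ("u", "r2"): 0.6, ("r2", "v"): 0.5}
print(recommendation(trust_edges, "u", "v"))  # approx. (0.72 + 0.30) / 2 = 0.51
```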
3.9.1.5 Trust Graph Provisioning

Retrieving trust values of certain relations, and even recommendations as shown before, causes minor computational effort. However, imagine someone frequently wants to calculate reputation based on network structures (see TrustRank [18]), would like to get notified if his neighborhood in the Web of Trust has grown to
[Figure: (a) graph mapping time in seconds over #vertices (10 to 100,000), log-log; (b) memory consumption in MB over #vertices, for the full service and for the graph model only]
Fig. 3.12 Performance tests for mapping the graph model
a certain size, or if his collaboration partners have reached a particular experience level. Then, periodically retrieving larger segments of the trust graph G from the Social Network Provisioning Service is required. Therefore, we run some experiments to estimate the produced load in such situations. The first experiment investigates the average number of potential collaboration partners who are either personally trusted or can be recommended (i.e., are connected through exactly one intermediate vertex). Experiments are conducted for various network sizes n and different average connection degrees of vertices. We pick random vertices from this set and run experiments for each of them until we obtain stable average results. Figure 3.13 shows that in highly cross-linked networks (i.e., #trustees > 2), personal relations and recommendations (so-called 'second hand experiences') seem to be sufficient to discover new collaboration partners. However, in case of sparsely connected graphs, other mechanisms, such as trust mirroring or teleportation, may be of high benefit.
[Figure: #connected vertices (0 to 70) over #trustees (avg) from 1 to 5, for network sizes n = 10, 100, 1000, 10000]
Fig. 3.13 Number of discovered potential collaboration partners through personal relations and recommendations for different network structures
[Figure: #graph operations (log scale, 1 to 10,000) over #trustees (avg) from 1 to 5, for propagation path lengths pp = 2 to 6]
Fig. 3.14 Average number of required graph operations (for different average number of trustees) to determine all neighbors of a given vertex that are reachable on a path not longer than pp
Propagating trust over more than one intermediate vertex is of course possible (and widely applied), but leads to significantly higher computational effort. Figure 3.14 depicts the number of required graph operations depending on the average number of trustees (average outdegree of vertices). These graph operations mainly consist of retrieving vertices and edges, including their assigned metrics and trust values. For higher propagation path lengths pp, costs increase exponentially.
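The exponential trend can be illustrated with a simple geometric-series estimate of the edges touched when expanding all paths up to length pp from a vertex with average outdegree d. This is an illustration of the growth behavior, not the chapter's exact cost model:

```python
# Estimate of graph operations for trust propagation: expanding all
# paths of length 1..pp from one vertex touches on the order of
# d + d^2 + ... + d^pp edges for average outdegree d.

def graph_operations(avg_outdegree, pp):
    return sum(avg_outdegree ** k for k in range(1, pp + 1))

for pp in range(2, 7):
    print(pp, graph_operations(4, pp))  # grows by roughly a factor d per extra hop
```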
3.9.2 Interaction Balancing in Large-Scale Networks

We evaluate the functional application of the VieTE framework by simulating typical scenarios in large-scale communities. In this experiment, we utilize the
popular Repast Simphony⁹ toolkit, a software bundle that enables round-based agent simulation. In contrast to researchers in the agent domain, we do not simulate our concepts by implementing different actor types and their behavior only, but we use a network of actors to provide stimuli for the actual VieTE framework. Therefore, we are not only able to evaluate the effectiveness of our new approach of fuzzy trust inference, but also the efficiency of the technical grounding based on Web service standards. We focus on the motivating Expert Web use case from Sect. 3.2. In this scenario, a small set of simulated network members interact (sending, responding, and delegating RFSs), and these interactions are provided to the logging facilities of VieTE. The framework infers trust by calculating the described metrics tr and sr, and using the rule set of Listing 3.1 for behavioral interpretation. Finally, emerging trust relations between the simulated actors influence the selection of receivers of RFSs. Hence, VieTE and the simulated actor network rely on each other and are used in a cyclic approach, exactly the same way VieTE would be used by a real Expert Web. For this demonstration, all interactions take place in the same scope.
3.9.2.1 Simulation Setup

3.9.2.2 Simulated Agent Network

Repast Simphony offers convenient support to model different actor behavior. As an inherent part of our environment, we make no distinction between human users and software services. Each actor owns a unique id (a number), creates SOAP requests, and follows one of the following behavior models: (a) malicious actors accept all RFSs but never delegate or respond, (b) erratic actors accept all RFSs but only process (respond directly or delegate) RFSs originally coming from requesters with odd-numbered IDs, (c) fair players process all requests if they are not overloaded, and delegate to trustworthy network neighbors otherwise. We set up a network comprising 15 actors, where only one is highly reputed and fully trusted by all others, as depicted in Fig. 3.15. This is the typical starting point of a newly created community, where one actor invites others to join.
3.9.2.3 VieTE Setup

After each simulation step (round), seven randomly picked actors send one RFS each to their most trusted actor (in the beginning this will only be the highly reputed one, who starts to delegate). Each actor's input queue has exactly 5 slots to buffer incoming RFSs. A request is always accepted and takes exactly one round to be served. An actor processes an RFS itself if it has a free slot in its input queue, otherwise
⁹ http://repast.sourceforge.net
[Figure: four network snapshots: (a) initial n=0, (b) intermediate n=100, (c) balanced n=250, (d) balanced (reduced)]
Fig. 3.15 Network structure after simulation rounds n = {0, 100, 250}. Elliptic vertices are fair players, rectangular shapes represent erratic actors, and diamond-shaped vertices reflect malicious actors
incoming RFSs are delegated to randomly picked trusted (trust > 0.8) neighbors in the network. Note that one actor does not delegate more than one RFS per round to the same neighbor; however, an actor may receive more than one RFS from different neighbors in the same round. Delegations require one additional simulation round. There is an upper limit of 15 rounds for an RFS to be served (deadline); otherwise it is considered failed. A request can be delegated only three times (hops), but not back to the original requester, to avoid circulating RFSs. Because the simulation utilizes only two fully automatically determined metrics (tr and sr), and no manual rewarding of responses, we assume an RFS is successfully served if a response arrives within 15 rounds (no fake or low-quality responses). After each fifth round, VieTE determines tr based on interactions in the most recent 25 rounds, and sr upon interactions in the last 50 rounds, and purges older logs. New values are merged with current ones using EMA with a fixed α = 0.25.
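One actor's per-round decision can be sketched as follows. Names and structure are illustrative, not actual Repast Simphony code; the trust threshold (0.8), queue size (5), and hop limit (3) follow the setup described above:

```python
import random

# Compact sketch of an actor's decision for one incoming RFS: serve
# it if a queue slot is free, otherwise delegate to a random trusted
# neighbour (trust > 0.8), respecting the hop limit.

def handle_rfs(rfs, queue, trusted_neighbours, max_slots=5, rng=random):
    if len(queue) < max_slots:
        queue.append(rfs)  # will be processed in the next round
        return "accepted"
    if rfs["hops"] >= 3:   # hop limit reached: the RFS fails
        return "failed"
    # never delegate back to the original requester
    candidates = [n for n in trusted_neighbours if n != rfs["requester"]]
    if not candidates:
        return "failed"
    rfs["hops"] += 1
    return ("delegated", rng.choice(candidates))

print(handle_rfs({"requester": "actor0", "hops": 0}, [], ["actor3", "actor9"]))  # accepted
```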
3.9.2.4 Simulation Results

We perform 250 simulation rounds of the described scenario with the aforementioned properties, and study the network structure at certain points of the simulation. The depicted networks in Fig. 3.15 show actors with different behavior and the temporal evolution of trust relations between them. The size of the graph's vertices
depends on the amount of trust established by network neighbors. Beginning with a star structure (Fig. 3.15a), the network structure in Fig. 3.15b emerges after 100 rounds, and that in Fig. 3.15c after 250 rounds, respectively. Note that, since the behavior of actors is not deterministic (i.e., RFSs are sent to random neighbors that are trusted with a value > 0.8, the lower bound of full trust; see Fig. 3.4), the simulation output looks different for each simulation run; however, the overall properties of the network are similar (number and strength of emerged trust relations). In the beginning, all RFSs are sent to actor 0, who delegates to randomly picked trusted actors. If they respond reliably, requesters establish trust in those third parties. Otherwise they lose trust in actor 0 (because of unsuccessful delegations). Therefore, actors with even-numbered IDs lose trust in actor 0 faster than odd-numbered actors, because if actor 0 delegates requests to erratic actors, those requests are not answered. As an additional feature, in round 100 actors that are not trusted with a value > 0.2 by at least one other network member are removed from the network, similar to Web communities where leechers (actors that do not contribute to the network) are banned. Therefore, actors with malicious behavior disappear, while actors with erratic behavior still remain in the network. Figure 3.15d shows a reduced view of the balanced network after 250 rounds. Only trust relations with a value > 0.8 are visualized. As expected, most vertices have strong trust relations to at least one fair player (actors who reliably respond and delegate RFSs). However, remember that erratic actors reliably serve only requests coming from actors with odd-numbered IDs. Therefore, actor 3 and actor 9 also establish full trust in actors from this class. Note that if actor 3 and actor 9 had re-delegated many RFSs coming from even-numbered actors to erratic actors, then those RFSs would have failed and only low trust would have emerged.
However, due to the comparatively low load of the network (less than half of the actors receive RFSs per round until n = 100), only a low number of re-delegations occurs (approx. 8% of RFSs).
3.10 Background and Related Work

3.10.1 Flexible and Context-Aware Collaborations

In collaborations, activities are the means to capture the context in which human interactions take place. Activities describe the goal of a task, the participants, utilized resources, and temporal constraints. Studies regarding activities in various work settings are described in [19]. They identify patterns of complex business activities, which are then used to derive relationships and activity patterns [34, 35]. The potential impact of activity-centric collaboration is highlighted in [46], with special focus on the value to individuals, teams, and enterprises. Studies on distributed teams focus on human performance and interactions [7, 39], even in Enterprise 2.0 environments [8]. Caramba [11] organizes work items of individuals as activities that can be used to manage collaborations. For example, one can see the status of
an activity, who contributed to an activity, documents created within a particular activity, etc. Based on log analysis, human interaction patterns can be extracted [12].
3.10.2 Interactions in Mixed Systems

Major software vendors have been working on standards addressing the lack of human interaction support in service-oriented systems. WS-HumanTask [4] and Bpel4People [3] were released to address the emergent need for human interactions in business processes. These standards specify languages to model human interactions, the lifecycle of human tasks, and generic role models. Role-based access models [4] are used to model responsibilities and potential task assignees in processes. While Bpel4People-based applications focus on top-down modeling of business processes, mixed systems target flexible interactions and compositions of Human-Provided and software-based services. This approach is aligned with the vision of Web 2.0, where people can actively provide services. An example of a mixed system is a virtual organization (VO) using Web 2.0 technologies. A VO is a temporary alliance of organizations that come together to share skills or core competencies and resources in order to better respond to business opportunities, and whose cooperation is supported by computer networks [9]. Nowadays, virtual organizations are more and more realized with SOA concepts, regarding service discovery, service descriptions (WSDL), dynamic binding, and SOAP-based interactions. In such networks, humans may participate and provide services in a uniform way by using the HPS framework [45, 47].
3.10.3 Behavioral and Social Trust Models for SOA

Marsh [29] introduced trust as a computational concept, including a fundamental definition, a model, and several related concepts impacting trust. Based on his work, various extended definitions and models have been developed. Several surveys on trust related to computer science have been performed [5, 15, 23], which outline common concepts of trust, clarify the terminology, and describe the most popular models. From the many existing definitions of trust, those from [15, 36] describe that trust relies on previous interactions and collaboration encounters, which fits our highly flexible environment best. Context-dependent trust was investigated in [5, 15, 23, 29]. Work on context-aware computing, focusing on the modeling and sensing of context, can be found in [6, 27]. Recently, trust in social environments and service-oriented systems has become a very important research area. SOA-based infrastructures are typically distributed, comprising a large number of available services and huge amounts of interaction logs. Therefore, trust in SOA has to be managed in an automatic manner. A trust management framework for service-oriented environments has been presented in
F. Skopik et al.
[10, 25, 28], however, without considering particular application scenarios with human actors in SOA. Although several models define trust on interactions and behavior, and account for reputation and recommendation, there is hardly any case study on the application of these models in service-oriented networks. While various theoretically sound models have been developed in recent years, fundamental research questions, such as the technical grounding in SOA and the complexity of trust-aware, context-sensitive data management in large-scale networks, remain widely unaddressed. Depending on the environment, trust may rely on the outcome of previous interactions [36, 51] and the similarity of interests and skills [14, 31, 50, 57]. Note that trust is not simply a synonym for quality of service (QoS). Instead, metrics expressing social behavior and influences are used in certain contexts. For instance, reciprocity [36] is a concept describing that humans tend to establish a balance between provided support and obtained benefit from collaboration partners. The application of trust relations in team formations and virtual organizations has been studied before, e.g., in [24] and [60]. Trust propagation models [17, 30, 54, 58] are intuitive methods to predict relations where no personal trust has emerged, e.g., through transitive recommendations. In this work, we described an approach to trust inference that is based on fuzzy set theories. This technique has been applied in trust models before [16, 42, 48], but, to the best of our knowledge, not to interpret diverse sets of interaction metrics. Utilizing interaction metrics, in particular those calculated between pairs of network members, enables us to incorporate a personalized and social perspective. For instance, an actor's behavior may vary toward different network members. This aspect is usually out of scope in Web services trust models, which are often closely connected to traditional QoS approaches [32].
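The rule-based, fuzzy interpretation of interaction metrics can be illustrated with a minimal sketch. The metric names (success ratio, normalized response delay), the triangular membership functions, and the single rule below are illustrative assumptions made for this sketch; they are not the actual rule base of the presented trust model.

```java
// Minimal sketch of fuzzy, rule-based trust inference from pairwise
// interaction metrics. Memberships and the rule are illustrative.
public class FuzzyTrustSketch {
    // Triangular membership function with corners a, b, c.
    static double tri(double x, double a, double b, double c) {
        if (x <= a || x >= c) return 0.0;
        return x < b ? (x - a) / (b - a) : (c - x) / (c - b);
    }

    // Degree to which a success ratio counts as "high" (1.0 at ratio 1).
    static double high(double x) { return tri(x, 0.5, 1.0, 1.5); }

    // Degree to which a normalized delay counts as "low" (1.0 at delay 0).
    static double low(double x) { return tri(x, -0.5, 0.0, 0.5); }

    // One rule: IF successRatio IS high AND delay IS low THEN trust IS high.
    // Mamdani-style: rule strength is the min of the antecedent degrees;
    // here the strength is used directly as a trust score in [0,1].
    static double inferTrust(double successRatio, double normalizedDelay) {
        return Math.min(high(successRatio), low(normalizedDelay));
    }

    public static void main(String[] args) {
        System.out.println(inferTrust(0.9, 0.1)); // mostly successful, fast partner
        System.out.println(inferTrust(0.3, 0.8)); // unreliable, slow partner
    }
}
```

Because the metrics are computed per pair of actors, the same actor can end up with different trust scores toward different partners, reflecting the personalized perspective discussed above.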
Bootstrapping addresses the cold start problem and refers to putting a system into operation. Trust – from our perspective – cannot be negotiated or defined in advance. It rather emerges from the interactions and behavior of actors and thus needs a certain time span to be built. However, until enough data has been collected, interests and skills can be used to predict potentially emerging trust relations. Mining, processing, and comparing user profiles are therefore key concepts [14, 50, 57].
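As a rough illustration of profile-based bootstrapping, predicted trust between two actors who have never interacted can be approximated by the overlap of their declared skill profiles. The use of Jaccard similarity and the concrete skill names below are our assumptions for this sketch, not the prediction method of the cited approaches.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: before any interaction data exists, approximate the trust
// that may emerge between two actors by comparing their skill profiles.
public class ProfileSimilarity {
    // Jaccard similarity: |A ∩ B| / |A ∪ B|, in [0,1].
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 0.0;
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) inter.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> alice = new HashSet<>(Set.of("soa", "bpel", "java"));
        Set<String> bob = new HashSet<>(Set.of("soa", "java", "groovy", "testing"));
        // Shared skills yield a non-zero predicted trust relation.
        System.out.println(jaccard(alice, bob));
    }
}
```

Once enough interaction logs accumulate, such similarity-based predictions would be replaced by trust inferred from the observed behavior itself.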
3.11 Conclusion and Further Work Emerging service-oriented platforms no longer operate in closed enterprises. An increasing trend towards temporary alliances between companies can be observed, requiring composition models to control and automate interactions between services. The resulting service-oriented applications need to be flexible and support adaptive interactions. In this work, we have motivated the need for adaptive interactions by discussing an Expert Web scenario where people can register their skills and capabilities as services. Mixed service-oriented systems are open ecosystems comprising human- and software-based services. Trust mechanisms become
3 Modeling and Mining of Dynamic Trust in Complex Service-Oriented Systems
essential in these systems because of changing actor interests and the dynamic discovery capabilities of SOA. Our trust model is based on fuzzy logic and rule-based interpretation of observed (logged) interactions. This makes the inference of trust in real systems possible, as interaction data is monitored and interpreted based on pre-specified rules. We have demonstrated the application of our trust model by supporting dynamic, trust-based partner discovery and selection mechanisms. This scenario is based on advanced interaction patterns in flexible compositions, such as trusted delegations to achieve load balancing and scalability in the Expert Web. Our future work will include the deployment and evaluation of the implemented framework in cross-organizational collaboration scenarios. This will be done within the EU FP7 COIN project, which focuses on collaboration in VOs. The emphasis of COIN is to study new concepts and develop tools for supporting the collaboration and interoperability of networked enterprises. The end-user evaluation in COIN will assess the usability of trusted expert discovery and balancing mechanisms. Acknowledgements This work is supported by the European Union through the FP7-216256 Project COIN.
References 1. van der Aalst, W.M.P., Song, M.: Mining social networks: Uncovering interaction patterns in business processes. In: International Conference on Business Process Management (BPM), vol. 3080, pp. 244–260 (2004) 2. Abdul-Rahman, A., Hailes, S.: Supporting trust in virtual communities. In: Hawaii International Conference on System Sciences (HICSS) (2000) 3. Agrawal, A. et al.: WS-BPEL Extension for People (BPEL4People), Version 1.0, 2007. Specification available online (2007) 4. Amend, M. et al.: Web Services Human Task (WS-HumanTask), Version 1.0, 2007. Specification available online (2007) 5. Artz, D., Gil, Y.: A survey of trust in computer science and the semantic web. Web Semantics 5(2), 58–71 (2007) 6. Baldauf, M., Dustdar, S., Rosenberg, F.: A survey on context aware systems. Int. J. Ad Hoc Ubiquitous Comput. 2(4), 263–277 (2007) 7. Balthazard, P.A., Potter, R.E., Warren, J.: Expertise, extraversion and group interaction styles as performance indicators in virtual teams: how do perceptions of IT's performance get formed? DATA BASE 35(1), 41–64 (2004) 8. Breslin, J., Passant, A., Decker, S.: Social web applications in enterprise. The Social Semantic Web 48, 251–267 (2009) 9. Camarinha-Matos, L.M., Afsarmanesh, H.: Collaborative networks - value creation in a knowledge society. In: PROLAMAT, pp. 26–40 (2006) 10. Conner, W., Iyengar, A., Mikalsen, T., Rouvellou, I., Nahrstedt, K.: A trust management framework for service-oriented environments. In: International World Wide Web Conference (WWW) (2009) 11. Dustdar, S.: Caramba - a process-aware collaboration system supporting ad hoc and collaborative processes in virtual teams. Distributed and Parallel Databases 15(1), 45–66 (2004) 12. Dustdar, S., Hoffmann, T.: Interaction pattern detection in process oriented information systems. Data and Knowledge Engineering (DKE) 62(1), 138–155 (2007)
13. Dwyer, C., Hiltz, S.R., Passerini, K.: Trust and privacy concern within social networking sites: A comparison of Facebook and MySpace. In: Americas Conference on Information Systems (AMCIS) (2007) 14. Golbeck, J.: Trust and nuanced profile similarity in online social networks. ACM Transactions on the Web (TWEB) 3(4), 1–33 (2009) 15. Grandison, T., Sloman, M.: A survey of trust in internet applications. IEEE Communications Surveys and Tutorials 3(4) (2000) 16. Griffiths, N.: A fuzzy approach to reasoning with trust, distrust and insufficient trust. In: CIA, vol. 4149, pp. 360–374 (2006) 17. Guha, R., Kumar, R., Raghavan, P., Tomkins, A.: Propagation of trust and distrust. In: International World Wide Web Conference (WWW), pp. 403–412 (2004) 18. Gyöngyi, Z., Garcia-Molina, H., Pedersen, J.: Combating web spam with TrustRank. In: International Conference on Very Large Data Bases (VLDB), pp. 576–587 (2004) 19. Harrison, B.L., Cozzi, A., Moran, T.P.: Roles and relationships for unified activity management. In: International Conference on Supporting Group Work (GROUP), pp. 236–245 (2005) 20. Huynh, T.D., Jennings, N.R., Shadbolt, N.R.: An integrated trust and reputation model for open multi-agent systems. Autonomous Agents and Multiagent Systems (AAMAS) 13(2), 119–154 (2006) 21. IBM: An architectural blueprint for autonomic computing. Whitepaper (2005) 22. Jøsang, A., Ismail, R.: The beta reputation system. In: Bled Electronic Commerce Conference (2002) 23. Jøsang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decision Support Systems 43(2), 618–644 (2007) 24. Kerschbaum, F., Haller, J., Karabulut, Y., Robinson, P.: PathTrust: A trust-based reputation service for virtual organization formation. In: International Conference on Trust Management (iTrust), pp. 193–205 (2006) 25. Kovac, D., Trcek, D.: Qualitative trust modeling in SOA. Journal of Systems Architecture 55(4), 255–263 (2009) 26.
Leekwijck, W.V., Kerre, E.E.: Defuzzification: criteria and classification. Fuzzy Sets and Systems 108(2), 159–178 (1999) 27. Loke, S.W.: Context-aware artifacts: Two development approaches. IEEE Pervasive Computing 5(2), 48–53 (2006) 28. Malik, Z., Bouguettaya, A.: Reputation bootstrapping for trust establishment among web services. IEEE Internet Computing 13(1), 40–47 (2009) 29. Marsh, S.P.: Formalising trust as a computational concept. Ph.D. thesis, University of Stirling (1994) 30. Massa, P., Avesani, P.: Trust-aware collaborative filtering for recommender systems. In: CoopIS, DOA, ODBASE, pp. 492–508 (2004) 31. Matsuo, Y., Yamamoto, H.: Community gravity: Measuring bidirectional effects by trust and rating on online social networks. In: International World Wide Web Conference (WWW), pp. 751–760 (2009) 32. Maximilien, E.M., Singh, M.P.: Toward autonomic web services trust and selection. In: International Conference on Service Oriented Computing (ICSOC), pp. 212–221 (2004) 33. Metzger, M.J.: Privacy, trust, and disclosure: Exploring barriers to electronic commerce. J. Computer-Mediated Communication 9(4) (2004) 34. Moody, P., Gruen, D., Muller, M.J., Tang, J.C., Moran, T.P.: Business activity patterns: A new model for collaborative business applications. IBM Systems Journal 45(4), 683–694 (2006) 35. Moran, T.P., Cozzi, A., Farrell, S.P.: Unified activity management: Supporting people in e-business. Communications of the ACM 48(12), 67–70 (2005) 36. Mui, L., Mohtashemi, M., Halberstadt, A.: A computational model of trust and reputation for e-businesses. In: Hawaii International Conference on System Sciences (HICSS), p. 188 (2002) 37. Nowak, M., Sigmund, K.: Evolution of indirect reciprocity by image scoring. Nature 393, 573–577 (1998)
38. Orgun, M.A., Liu, C.: Reasoning about dynamics of trust and agent beliefs. In: IEEE International Conference on Information Reuse and Integration (IRI), pp. 105–110 (2006) 39. Panteli, N., Davison, R.: The role of subgroups in the communication patterns of global virtual teams. IEEE Transactions on Professional Communication 48(2), 191–200 (2005) 40. Patel, J., Teacy, W.T.L., Jennings, N.R., Luck, M.: A probabilistic trust model for handling inaccurate reputation sources. In: International Conference on Trust Management (iTrust), vol. 3477, pp. 193–209. Springer (2005) 41. Albert, R., Barabási, A.L.: Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002) 42. Sabater, J., Sierra, C.: Reputation and social network analysis in multi-agent systems. In: International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 475–482. ACM, New York, NY, USA (2002) 43. Sabater, J., Sierra, C.: Social regret, a reputation model based on social relations. SIGecom Exchanges 3(1), 44–56 (2002) 44. Salehie, M., Tahvildari, L.: Self-adaptive software: Landscape and research challenges. ACM Transactions on Autonomous and Adaptive Systems 4(2), 1–42 (2009) 45. Schall, D.: Human interactions in mixed systems - architecture, protocols, and algorithms. Ph.D. thesis, Vienna University of Technology (2009) 46. Schall, D., Dorn, C., Dustdar, S., Dadduzio, I.: VieCAR - enabling self-adaptive collaboration services. In: Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 285–292 (2008) 47. Schall, D., Truong, H.L., Dustdar, S.: Unifying human and software services in web-scale collaborations. IEEE Internet Computing 12(3), 62–68 (2008) 48. Sherchan, W., Loke, S.W., Krishnaswamy, S.: A fuzzy model for reasoning about reputation in web services. In: ACM Symposium on Applied Computing (SAC), pp. 1886–1892 (2006) 49. Skopik, F., Schall, D., Dustdar, S.: The cycle of trust in mixed service-oriented systems.
In: Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 72–79. IEEE (2009) 50. Skopik, F., Schall, D., Dustdar, S.: Start trusting strangers? Bootstrapping and prediction of trust. In: International Conference on Web Information Systems Engineering (WISE), pp. 275–289. Springer (2009) 51. Skopik, F., Schall, D., Dustdar, S.: Trustworthy interaction balancing in mixed service-oriented systems. In: ACM Symposium on Applied Computing (SAC), pp. 801–808. ACM (2010) 52. Skopik, F., Truong, H.L., Dustdar, S.: VieTE - enabling trust emergence in service-oriented collaborative environments. In: International Conference on Web Information Systems and Technologies (WEBIST), pp. 471–478. INSTICC (2009) 53. Srivatsa, M., Xiong, L., Liu, L.: TrustGuard: countering vulnerabilities in reputation management for decentralized overlay networks. In: International World Wide Web Conference (WWW), pp. 422–431. ACM (2005) 54. Theodorakopoulos, G., Baras, J.S.: On trust models and trust evaluation metrics for ad hoc networks. IEEE Journal on Selected Areas in Communications 24(2), 318–328 (2006) 55. Wang, Y., Singh, M.P.: Formal trust model for multiagent systems. In: International Joint Conference on Artificial Intelligence (IJCAI), pp. 1551–1556 (2007) 56. Zadeh, L.A.: Fuzzy sets. Information and Control 8, 338–353 (1965) 57. Ziegler, C.N., Golbeck, J.: Investigating interactions of trust and interest similarity. Decision Support Systems 43(2), 460–475 (2007) 58. Ziegler, C.N., Lausen, G.: Propagation models for trust and distrust in social networks. Information Systems Frontiers 7(4-5), 337–358 (2005) 59. Zimmermann, H.J.: Fuzzy Set Theory and Its Applications, 3rd edn. Kluwer Academic Publishers (1996) 60. Zuo, Y., Panda, B.: Component based trust management in the context of a virtual organization. In: ACM Symposium on Applied Computing (SAC), pp. 1582–1588 (2005)
Chapter 4
Script-Based Generation of Dynamic Testbeds for SOA Lukasz Juszczyk and Schahram Dustdar
Abstract This chapter addresses one of the major problems of SOA software development: the lack of support for testing complex service-oriented systems. The research community has developed various means for checking individual Web services but has not come up with satisfactory solutions for testing systems that operate in service-based environments and, therefore, need realistic testbeds for evaluating their quality. We regard this as an unnecessary burden for SOA engineers. As a solution to this issue, we present the Genesis2 testbed generator framework. Genesis2 supports engineers in modeling testbeds and programming their behavior. Out of these models it generates running instances of Web services, clients, registries, and other entities in order to emulate realistic SOA environments. By generating real testbeds, our approach assists engineers in performing runtime tests of their systems; particular focus has been put on the framework's extensibility to allow the emulation of arbitrarily complex environments. Furthermore, by exploiting the advantages of the Groovy language, Genesis2 provides an intuitive yet powerful scripting interface for testbed control.
4.1 Introduction In recent years, the principles of Service-oriented Architecture (SOA) have gained high momentum in distributed systems research and wide acceptance in the software industry. The reasons for this trend are SOA's advantages in terms of communication interoperability, loose coupling between clients and services, reusability and
L. Juszczyk · S. Dustdar Distributed Systems Group, Institute of Information Systems at Vienna University of Technology e-mail:
[email protected];
[email protected] © 2010 IEEE. Reprinted, with permission, from Juszczyk L., Dustdar S. (2010) Script-based Generation of Dynamic Testbeds for SOA. 8th IEEE International Conference on Web Services (ICWS'10), July 5–10, 2010, Miami, FL, USA S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0_4, © Springer-Verlag/Wien 2011
composability of services, and many more. Moreover, novel features associated with SOA [22] are adaptivity [14], self-optimization and self-healing (self-* in general) [15], and autonomic behavior [29]. The result of this evolution is that, on the one hand, SOA is being increasingly used for building distributed systems but, on the other hand, is becoming more and more complex itself. As complexity implies error-proneness as well as the need to understand how and where such complexity emerges, SOA-based systems must be tested intensively during the whole development process and, therefore, require realistic testbeds. These testbeds must comprise emulated Web services, clients, registries, bus systems, mediators, and other SOA components to simulate real-world scenarios. However, due to missing tool support, the setup of such testbeds has been a major burden for SOA engineers. In general, the lack of proper testing support has been regarded as one of the main problems of SOA [13]. Looking at currently available solutions, it becomes evident that the majority aims only at testing single Web services [9, 25, 31] and composite ones [16, 17], which, however, only covers the service provider part of SOA. For testing systems which operate in service-based environments themselves, the engineer faces the problem of setting up realistic test scenarios which cover the system's whole functionality. Solutions for testbed generation do exist but are restricted to specific domains, e.g., for checking Service Level Agreements by emulating Quality of Service [11]. However, if engineers need generic support for creating customized testbeds covering various aspects of SOA, no solutions exist, to our knowledge, which would relieve them from this time-consuming task. We believe this issue is a severe drawback for the development of complex SOAs. In this chapter we present the current state of our work on a solution for this issue.
We introduce the Genesis2 framework (Generating SOA Testbed Infrastructures, in short, G2), which allows engineers to set up SOA testbeds and to manipulate their structure and behavior on-the-fly. It comprises a front-end from where testbeds are specified and a distributed back-end on which the generated testbed is hosted. At the front-end, engineers write Groovy scripts to model the entities of the testbed and to program their behavior, while the back-end interprets the model and generates real instances out of it. To ensure extensibility, G2 uses composable plugins which augment the testbed's functionality, making it possible to emulate diverse topologies, functional and non-functional properties, and behavior. The rest of the chapter presents our work as follows. In Sect. 4.2 we give an overview of related research. Section 4.3 is the main part of the chapter and describes the concepts of the G2 framework. Section 4.4 demonstrates the application of G2 via a sample scenario. Finally, Sects. 4.5 and 4.6 discuss open issues, present our plans for future work, and conclude.
4.2 SOA Testbeds Comparing the state of the art of research on SOA in general with the research on testing in/for SOA, an interesting divergence becomes evident. SOA itself has undergone an impressive evolution in recent years. At its beginning, Web service-based
SOA had been mistaken as yet another implementation of distributed objects and RPC and, therefore, had been abused for direct and tightly-coupled communication [27]. After these misconceptions were cleared up and SOA's benefits derived from decoupling were pointed out, SOA has been accepted as an architectural style for realizing flexible, document-oriented distributed computing. Today's SOAs comprise much more than just services, clients, and brokers (as depicted in the outdated Web service triangle [21]), but also message mediators, service buses, monitors, management and governance systems, workflow engines, and many more [22]. As a consequence, SOA is becoming increasingly powerful but also increasingly complex, which implies higher error-proneness [10] and, logically, requires thorough testing. But looking at available solutions for SOA testing (research prototypes as well as commercial products), one might get the feeling that SOA is still reduced to its find-bind-invoke interactions, because most approaches deal only with testing of individual Web services, and only few solutions deal to some extent with complex SOAs. All in all, it is possible to test whether a single Web service behaves correctly regarding its functional and non-functional properties, but testing systems operating on a whole environment of services is currently not supported. Let us take the case of an autonomic workflow engine [26] as an example. The engine must monitor available services, analyze their status, and decide whether to adapt running workflows. To verify the engine's correct execution it is necessary to perform runtime tests in a real(-istic) service environment, in short, a service testbed. The testbed must comprise running instances of all participants (in this simple case only Web services), emulate realistic behavior (e.g., Quality of Service, dependencies among services), and serve as an infrastructure on which the developed system can be tested.
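The monitor-analyze-adapt cycle of such an autonomic engine can be sketched minimally. The service names, the health source (a plain map standing in for live monitoring data from a testbed), and the fixed availability threshold are all assumptions made for this sketch; a real engine would observe running (emulated) services.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of the "analyze and decide" step of an autonomic workflow
// engine: given observed per-service availability, decide which
// services a running workflow should replace.
public class AdaptationLoopSketch {
    // Any service whose observed availability drops below the
    // threshold is flagged for replacement (sorted for determinism).
    static List<String> servicesToReplace(Map<String, Double> availability, double threshold) {
        return availability.entrySet().stream()
                .filter(e -> e.getValue() < threshold)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical monitoring snapshot of three testbed services.
        Map<String, Double> observed = Map.of("booking", 0.99, "payment", 0.42, "shipping", 0.95);
        System.out.println(servicesToReplace(observed, 0.9));
    }
}
```

Testing exactly this decision logic at runtime is what requires a testbed that can emulate degraded availability on demand.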
Of course, for more complex systems, more complex testbeds are required to emulate all characteristics of the destination environment. But how do engineers create such testbeds? Unfortunately, up to now, they have had to create them manually, as no proper support has been available. To be precise, some solutions do exist but are too restricted in their functionality and cannot create testbeds of arbitrarily complex structure and behavior. This has been our motivation for doing research on supporting the generation of customizable testbeds for SOA. In the following, we give an overview of the current state of the art of research and discuss the evolution of Genesis since its first version.
4.2.1 Related Research on SOA Testing Available solutions have mostly been limited to testing Web service implementations regarding their functional and non-functional properties. This includes, for instance, tests for performance and Quality of Service (QoS) [9, 23], robustness [19], reliability [31, 32], and message schema conformance [30], but also techniques for testing composed services [16, 17] as well as generic and customizable testing tools [25]. In spite of their importance, these solutions only support engineers in checking the service providers of a SOA, which means that they can only be used
for testing the very basic building blocks but not the whole integrated system. This places these works outside the scope of our current research and, therefore, we do not review them in detail. Unfortunately, the challenging task of testing complex SOAs and their components, such as governance systems which operate on and also depend on other services, has not gained enough attention in the research community. Some groups have done research on testbed generation but their investigations have focused only on specific domains such as QoS or workflows. For instance, SOABench [12] provides sophisticated support for benchmarking BPEL engines [5] via modeling experiments and generating service-based testbeds. It provides runtime control of test executions as well as mechanisms for test result evaluation. Regarding its features, SOABench is focused on performance evaluation and generates Web service stubs that emulate QoS properties, such as response time and throughput. Similar to SOABench, the authors of PUPPET [11] examine the generation of QoS-enriched testbeds for service compositions. PUPPET does not investigate performance but verifies the fulfillment of Service Level Agreements (SLA) of composite services. This is done by analyzing WSDL [7] and WS-Agreement (WSA) documents [8] and emulating the QoS of generated Web services in order to check the SLAs. Both tools, SOABench and PUPPET, support the generation of Web service-based testbeds, but both are restricted to a specific problem domain (workflows/compositions & QoS/SLA). In contrast, G2 provides generic support for generating and controlling customized testbeds. Though, if desired, G2 can also be used for emulating QoS. Further related work has been done on tools for controlling tests of distributed systems. Weevil [28], for example, supports experiments of “distributed systems on distributed testbeds” by generating workload.
It automates the deployment and execution of experiments and allows engineers to model an experiment's behavior via programs written in common programming languages linked to its workload generation library. We do not see Weevil as a direct competitor to G2, but rather as a complementary tool. While Weevil covers client-side tests of systems, G2 aims at generating testbeds. We believe that a combination of both systems would empower engineers in setting up and running sophisticated tests of complex SOAs, and we will investigate this in future work. Another possible synergy we see in combining G2 with DDSOS [24]. This framework deals with testing SOAs and provides model-and-run support for distributed simulation, multi-agent simulation, and an automated scenario code generator creating executable tests. Again, this framework could be used to control tests on G2-based testbeds.
4.2.2 Evolution of Genesis Our work on SOA testbeds first led to the development of Genesis [18] (in short, G1), the predecessor of G2. To our knowledge, G1 was the first available
“multi purpose” testbed generator for SOA, and we have published the prototype as open source [2]. Similar to G2, it is a Java-based framework for specifying the properties of SOAP-based Web services [6] and for generating real instances of these on a distributed back-end. Via a plugin facility, the service testbed can be enhanced with complex behavior (e.g., QoS, topology changes) and, furthermore, can be controlled remotely by changing plugin parameters. At the front-end, the framework offers an API which can be integrated, for instance, into the Bean Scripting Framework (BSF) [4] for convenient usage. However, G1 suffers from various restrictions which limit the framework's functionality and usability. First of all, the behavior of Web services is specified by aligning plugin invocations in simple structures (sequential, parallel, try/catch) without fine-grained control. This makes it hard to implement, for instance, fault injection on a message level [30]. Also, deployed testbeds can only be updated by altering one Web service at a time, which hampers the control of large-scale testbeds. Moreover, G1 is focused on Web services and does not offer the generation of other SOA components, such as clients or registries. In spite of G1's novel features, we regarded the listed shortcomings as an obstacle for further research and preferred to work on a new prototype. Learning from our experiences, we determined new requirements for SOA testbed generators: • Customizable control of the structure, composition, and behavior of testbeds. • Ability to generate not only Web services, but also other SOA components. • Ability to create and control large-scale testbeds in an efficient manner, supporting multicast-like updates. • A more convenient and intuitive way of modeling and programming the testbed. These requirements made it necessary to redesign Genesis and to rethink its concepts.
These efforts resulted in our new framework, Genesis2.
4.3 The Genesis2 Testbed Generator Due to the breadth of G2, it is not feasible to introduce the whole spectrum of concepts and features in a single chapter. Hence, we concentrate on the most relevant novelties and present an overall picture of our framework and its application. We give an overview of G2's capabilities, briefly explain how testbeds are generated and how G2 benefits from the Groovy language, and introduce the feature of multicast-based updates for managing large-scale testbeds. To avoid ambiguities, we use the following terminology: model schema for the syntax and semantics of a testbed specification, model types for the single elements of a schema, model for the actual testbed specification, and testbed (instance) for the whole generated testbed environment consisting of individual testbed elements, such as services, registries, etc.
4.3.1 Basic Concepts and Architecture G2 comprises a centralized front-end, from where testbeds are modeled and controlled, and a distributed back-end at which the models are transformed into real testbed instances. In a nutshell, the front-end maintains a virtual view on the testbed, allows engineers to manipulate it via scripts, and propagates changes to the back-end in order to adapt the running testbed. The G2 framework follows a modular approach and provides the functional grounding for composable plugins that implement generator functionality. The framework itself offers (a) generic features for modeling and manipulating testbeds, (b) extension points for plugins, (c) inter-plugin communication among remote instances, and (d) a runtime environment shared across the testbed. All in all, it provides the basic management and communication infrastructure which abstracts over the distributed nature of a testbed. The plugins, in turn, enhance the model schema by integrating custom model types and interpret these to generate deployable testbed elements at the back-end. Taking the provided WebServiceGenerator plugin as an example, it enhances the model schema with the types WebService, WsOperation, and DataType, integrates them into the model structure on top of the default root element Host (see Fig. 4.1), and, eventually, supports the generation of Web services at the back-end. Furthermore, the provided model types define customization points (e.g., for service binding and operation behavior) which provide the grounding for plugin composition. For instance, the CallInterceptor plugin attaches itself to the WebService type and allows engineers to program the intercepting behavior, which will then be automatically deployed with the services.
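The model structure just described (a Host root element aggregating WebService models, which in turn contain WsOperation elements) can be sketched as plain data classes. The class names follow the model types named in the text; the concrete fields and the construction style are our assumptions, not G2's actual Groovy model API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the hierarchical testbed model: Host -> WebService -> WsOperation.
public class ModelSketch {
    static class WsOperation {
        final String name;
        WsOperation(String name) { this.name = name; }
    }

    static class WebService {
        final String name;
        final List<WsOperation> operations = new ArrayList<>();
        WebService(String name) { this.name = name; }
    }

    static class Host {
        final String address;
        final List<WebService> services = new ArrayList<>();
        Host(String address) { this.address = address; }
    }

    public static void main(String[] args) {
        // Hypothetical back-end host with one service and one operation.
        Host backend = new Host("http://backend1:8080");
        WebService ws = new WebService("TestService");
        ws.operations.add(new WsOperation("sayHello"));
        backend.services.add(ws);
        System.out.println(backend.services.get(0).operations.size());
    }
}
```

In G2 itself, such a front-end model acts as a proxy: changing it triggers (re)generation of the corresponding real services on the back-end host.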
In G2's usage methodology, the engineer creates models according to the provided schema at the front-end, specifying what shall be generated where and with which customizations, and the framework takes care of synchronizing the model with the corresponding back-end hosts on which the testbed elements are generated and deployed. The front-end, moreover, maintains a permanent view on the testbed, allowing engineers to manipulate it on-the-fly by updating its model.
Fig. 4.1 Sample model schema
Fig. 4.2 Genesis2 architecture: infrastructure, plugins, and generated elements
For a better understanding of the internal procedures inside G2, we take a closer look at its architecture. Figure 4.2 depicts the layered components, comprising the base framework, installed plugins, and, on top of these, the generated testbed: • At the very bottom, the basic runtime consists of Java, Groovy, and third-party libraries. • At the framework layer, G2 provides itself via an API, and a shared runtime environment is established at which plugins and generated testbed elements can discover each other and interact. Moreover, an active repository distributes detected plugins among all hosts. • Based on that grounding, installed plugins register themselves at the shared runtime and integrate their functionality into the framework. • The top layer depicts the results of the engineer's activities. At the front-end, he/she operates on the created testbed model. The model comprises virtual objects which act as a view on the real testbed and as proxies for manipulation commands. At the back-end, the actual testbed is generated according to the specified model. However, Fig. 4.2 provides a rather static image of G2, which does not represent the system's inherent dynamics. Each layer establishes its own communication structures (see Fig. 4.3), which serve different purposes: • On the bottom layer, the G2 framework connects the front-end to the back-end hosts and automatically distributes plugins to maintain a homogeneous infrastructure. • For the plugins, G2 allows custom communication behavior to be implemented. For example, plugins can exchange data via undirected gossiping or, as done in the SimpleRegistry plugin, by directing requests (e.g., service lookups) to a dedicated instance.
L. Juszczyk and S. Dustdar
Fig. 4.3 Interactions within G2 layers
• The testbed control is strictly centralized around the front-end. Each model object has its pendant in the back-end and acts as a proxy for accessing it.
• Finally, in the running testbed, G2 does not restrict the type and topology of interactions but outsources this to the plugins and their application. For instance, Web services can interact via nested invocations and, in addition, can integrate registries, workflow engines, or even already existing legacy systems into the testbed.
The framework’s shared runtime environment deserves further explanation due to its importance. In G2, the SOA engineer writes Groovy scripts for modeling and programming testbeds. The capabilities of the system, however, are defined by the applied plugins, which provide custom extensions. The runtime environment constitutes a binding between these by acting as a distributed registry. Every object inside the testbed (e.g., plugin, model type, generated testbed instance, function/macro, class, variable) is registered at the environment via an alias in order to make it discoverable, and G2 provides a homogeneous runtime infrastructure on each host. This offers high flexibility, as it ensures that locally declared scripts which reference aliases are also executable on remote hosts. In the following sections we give a more detailed insight into selected features of G2 in order to convey its potential.
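To make the alias mechanism concrete, the following minimal sketch (written in Java rather than Groovy; `AliasRegistry` and its methods are invented names for illustration, not the G2 API) shows alias-based registration, lookup, and replication to a remote host:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of an alias-based runtime registry: every testbed
// object (plugin, model type, function, ...) is registered under an
// alias so that scripts referencing the alias resolve on any host.
public class AliasRegistry {
    private final Map<String, Object> bindings = new ConcurrentHashMap<>();

    public void register(String alias, Object obj) {
        bindings.put(alias, obj);
    }

    public Object lookup(String alias) {
        Object obj = bindings.get(alias);
        if (obj == null) throw new IllegalArgumentException("unknown alias: " + alias);
        return obj;
    }

    // Replication hook: a real implementation would propagate new
    // bindings to all back-end hosts to keep the runtime homogeneous.
    public void replicateTo(AliasRegistry remote) {
        bindings.forEach(remote::register);
    }
}
```

Replicating the bindings is what makes a script written at the front-end executable remotely: the aliases it references resolve to the same kinds of objects on every host.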
4.3.2 Extensible Generation of Testbed Instances

Because of its generic nature, which provides a high level of extensibility, the G2 framework outsources the generation of testbed elements to the plugins. It does
also not predefine a strict methodology for how they must be generated, but rather provides supporting features. This might raise the false impression that we just provide the base framework and leave the tricky part to the plugin developers. The truth is that we kept the framework generic on purpose, in order to have a basic grounding for future research on testbed generation, which might also include non-SOA domains. For our current needs, we have developed plugins covering basic SOA:
• WebServiceGenerator, creating SOAP Web services
• WebServiceInvoker, calling remote SOAP services, both generated and preexisting ones (e.g., third-party .NET-based)
• CallInterceptor, processing SOAP calls on the message level (e.g., for fault injection [30])
• DataPropagator, providing automated replication of data/functions among back-end hosts
• QOSEmulator, emulating Quality of Service properties
• SimpleRegistry, for global service lookups
• ClientGenerator, seeding testbeds with standalone clients (e.g., for bootstrapping testbed activities)
Of these, the WebServiceGenerator plays a major role and, therefore, serves as a good example for demonstrating the testbed generation process. We reused selected parts of the generation code from G1 [18]; however, we were able to simplify it significantly by using Groovy features. Basically, the process comprises the following steps:
1. Recursive analysis of the WebService model to determine the used customization plugins and message types.
2. Translation of message types (DataType models) to Java classes that represent the XSD-based data structures (using xjc, the Java XML Binding Compiler).
3. Automatic generation of Java/Groovy source code implementing the modeled Web service.
4. Compilation of the sources using Groovy’s built-in compiler.
5. Generation of customizations by the corresponding plugins.
6. Deployment of the completed Web service instance at a local Apache CXF [1] endpoint.
7. Subscription to model changes for automatic adaptation of the deployed Web service instance.
The generation procedure of each plugin depends completely on the plugin’s functional purpose. For instance, the CallInterceptor translates intercepting code into Apache CXF Features and binds them to service and client instances, the ClientGenerator simply implements a programmable thread, and the QOSEmulator does not generate any deployable elements but works in the background. Evidently, in G2, plugins are more than just simple extensions; they provide essential features for testbed generation. They define the model schema, implement testbed capabilities, and handle the actual generation of testbed instances.
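As a rough illustration (not G2 code; the step labels merely paraphrase the list above), the seven generation steps form a fixed sequence that each service model passes through:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the WebServiceGenerator steps as an ordered pipeline.
// The step names follow the text; the bodies are placeholder trace
// entries, not actual generation logic.
public class GenerationPipeline {
    public static List<String> generate(String serviceModel) {
        List<String> trace = new ArrayList<>();
        trace.add("analyze model " + serviceModel);  // 1. find plugins/types
        trace.add("bind XSD types via xjc");         // 2. DataType -> Java classes
        trace.add("emit Java/Groovy source");        // 3. service implementation
        trace.add("compile sources");                // 4. Groovy's built-in compiler
        trace.add("apply plugin customizations");    // 5. e.g., interceptors
        trace.add("deploy at CXF endpoint");         // 6. local deployment
        trace.add("subscribe to model changes");     // 7. automatic adaptation
        return trace;
    }
}
```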
Consequently, they can become quite complex. To support the implementation of new plugins, G2 provides a base class that carries out fundamental tasks for installation, deployment, and communication among remote instances, so that developers can focus on the plugin’s primary features.
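The division of labor between base class and plugin can be sketched roughly as follows (Java instead of Groovy; `TestbedPlugin`, `install()`, and `generate()` are invented names for illustration — the actual G2 base class is not shown in this chapter):

```java
// Hedged sketch of a plugin base class: installation/deployment
// plumbing lives in the base class, so a subclass only implements its
// primary feature (here, generating a testbed element).
public abstract class TestbedPlugin {
    private boolean installed;

    public final void install() {        // framework-side plumbing
        registerAliases();
        installed = true;
    }

    public final boolean isInstalled() { return installed; }

    protected void registerAliases() { } // default: nothing to register

    // The plugin's primary feature, e.g. generating a service instance.
    public abstract Object generate(Object model);
}

// A trivial concrete plugin focusing only on its feature.
class EchoServiceGenerator extends TestbedPlugin {
    @Override public Object generate(Object model) {
        return "deployed:" + model;      // placeholder for real deployment
    }
}
```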
4.3.3 Exploitation of Groovy Features

G2 derives a great deal of its flexibility and extensibility from Groovy [3]. In short, Groovy is a dynamic programming language for the Java Virtual Machine, providing modern features such as dynamic typing, closures, and support for meta programming. Also, it has a compact syntax and can be used as an easy-to-read scripting language. G2 uses Groovy’s dynamic Expando type as a base class for model types. This allows expanding the model (ergo the generated testbed) on the fly and observing changes, which facilitates automatic front-end/back-end synchronization. Moreover, by intercepting model manipulation requests, plugin developers can customize the handling of these (e.g., to log everything) and can restrict the model’s expandability. Internally, model objects are realized as flexible hash maps, and entire testbed models are constructed by aggregating these, e.g., by attaching a WsOperation instance to the corresponding list inside a WebService’s map. However, aggregating model objects by hand is rather cumbersome and inefficient, especially for complex testbeds. As a solution, we use Groovy’s Builder support, which helps to create nested data structures in an intuitive manner. The following sample demonstrates the convenience of builders:

  // hash-map-based creation of web service model
  def s1 = webservice.create("TestService")
  s1.binding = "doc,lit"
  s1.tags += "test"
  def op = wsoperation.create("SayHi")
  op.paramTypes += [name: String]
  op.resultType = String
  op.behavior = { return "hi $name" }
  ...
  // sample filter closures on the model:
  { s -> "test" in s.tags }
  { s -> s.operation.any { o -> o.name == "SayHi" } }
All in all, G2 benefits from its Groovy binding in a twofold manner: the dynamic features provide the functional grounding for generating extensible testbeds, while the language’s brevity helps to model them using a clear and compact syntax.
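The Expando-based mechanism can be mimicked in plain Java to clarify the idea: a map-backed object whose property changes are observable, which is what enables the automatic front-end/back-end synchronization (the names below are ours, not Groovy’s or G2’s):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Sketch of an Expando-like model object: properties live in a map and
// registered observers see every change, so a back-end can react to
// front-end manipulations automatically.
public class ModelObject {
    private final Map<String, Object> props = new HashMap<>();
    private final List<BiConsumer<String, Object>> observers = new ArrayList<>();

    public void onChange(BiConsumer<String, Object> observer) {
        observers.add(observer);
    }

    public void set(String name, Object value) {
        props.put(name, value);                        // expand on the fly
        observers.forEach(o -> o.accept(name, value)); // notify, e.g., back-end
    }

    public Object get(String name) { return props.get(name); }
}
```

An interception point like `set()` is also where logging or restrictions on expandability, as mentioned above, would hook in.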
4.3.4 Multicast Testbed Control

A drawback of G1 was that testbed manipulations had to be done in a point-to-point manner, updating one Web service at a time. This was an issue for controlling large-scale testbeds, such as the one used in the VReSCo project [20], consisting of up to 10,000 services. To overcome this issue, G2 supports multicast-based manipulations. This feature is inspired by multicast network communication, where a single transmitted packet can reach an arbitrarily large number of destination hosts with the help of replicating routers. To provide similar efficiency, G2 uses filter closures which specify the destinations of a change request, and thereby reduces the number of request messages. In detail, G2 applies the filter to the local testbed model to get the resulting set of designated elements and checks at which back-end hosts these are deployed. Then it wraps the change request, including the filter, and sends it to the involved hosts. Eventually, the hosts unwrap it, run the filter locally, and perform the changes on each matched testbed element. This way, G2 reduces the number of request messages to the number of involved back-end hosts, which significantly improves efficiency. The following snippet shows a sample multicast manipulation. It addresses Web services matching a namespace and performs a set of modifications on them, e.g., appending a new operation and setting model properties.
  def newOp = operation.create("newOperation")
  webservice(op: newOp) { s ->           // filter closure
    s.namespace =~ /infosys.tuwien.ac.at/
  } { s ->                               // command closure
    s.operations += op
    s.someProperty = "someValue"
  }
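The dispatch logic behind such a multicast manipulation can be approximated in a few lines (a Java sketch under our own simplifying assumptions; `Service`, `send`, and the per-host bookkeeping are invented for illustration):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Sketch of multicast manipulation: apply the filter locally to find
// which hosts run matching services, then send ONE wrapped request per
// host instead of one request per service.
public class MulticastUpdate {
    public record Service(String name, String host) {}

    // Returns the number of request messages sent; deliveredPerHost
    // records which services each host matched after re-running the filter.
    public static int send(List<Service> model, Predicate<Service> filter,
                           Map<String, List<Service>> deliveredPerHost) {
        Set<String> hosts = model.stream()
                .filter(filter)
                .map(Service::host)
                .collect(Collectors.toSet());
        for (String h : hosts) {
            // each host unwraps the request and re-runs the filter locally
            List<Service> matched = model.stream()
                    .filter(s -> s.host().equals(h))
                    .filter(filter)
                    .collect(Collectors.toList());
            deliveredPerHost.put(h, matched);
        }
        return hosts.size(); // messages == involved hosts, not services
    }
}
```

With 10,000 services spread over a handful of hosts, the message count thus drops from thousands to the number of hosts.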
4.4 QoS Testbed Scenario

In this chapter we do not evaluate the performance of G2. Instead, we chose to demonstrate G2 in practice in order to give a better understanding of the previously presented concepts and also to give an impression of the intuitiveness of G2’s script-based control. Our scenario covers the creation of a rather simple testbed for testing the QoS monitor [23] used in the VReSCo project [20]. The monitor performs periodic checks to determine a Web service’s execution time, latency, throughput, availability, robustness, and other QoS properties. Most of the monitoring is done in a non-intrusive manner, while for some checks local sensors need to be deployed at the service. For verifying the monitor’s correct functionality, runtime tests must be performed on a testbed of generated Web services simulating QoS properties. Furthermore, the QoS properties must be controllable during test execution, and the Web services must support the application of local sensors. Even though the creation of such a testbed is perfectly feasible with G2, we had to restrict its functionality due to space constraints. We omitted testbed features, such as the registration of generated services at a broker, and replaced the usage of the QoSEmulator. Instead, we simulate processing time and failure rate simply by delaying and throwing exceptions at the Web operations. However, for demonstration purposes, we have included some additional features, such as nested invocations, dynamic replacement of functionality, and generation of active clients. For setting up the testbed, we use the plugins WebServiceGenerator, WebServiceInvoker, CallInterceptor, ClientGenerator, SimpleRegistry, and DataPropagator, which establish the model schema depicted in Fig. 4.1.
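The way processing time and failure rate are simulated in this scenario boils down to a delay plus a probabilistic fault, as in the following sketch (Java instead of Groovy; `SimulatedWorker` and its fields are illustrative names, not the G2 model’s):

```java
import java.util.Random;

// Sketch of the simulated QoS behavior: each invocation sleeps for
// `delay` milliseconds and fails with probability `failureRate`. Both
// fields are steerable at runtime, mirroring the scenario's on-the-fly
// manipulation of worker services.
public class SimulatedWorker {
    volatile long delay = 0;           // processing time in ms
    volatile double failureRate = 0.0; // failure probability, 0.0..1.0
    private final Random rnd = new Random();

    public String process(String input) {
        try {
            Thread.sleep(delay);       // simulated processing time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (rnd.nextDouble() < failureRate) {
            throw new RuntimeException("sorry!"); // simulated fault
        }
        return "result:" + input;
    }
}
```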
We divided the scenario into three parts: in the first step we generate the service-based testbed, then we generate clients invoking the testbed’s services, and, finally, we show how the running testbed can be altered at runtime.

  // reference 10 back-end hosts
  1.upto(10) { n -> host.create("192.168.1.$n", 8080) }
  // load message type definitions from XSD file
  def inType  = datatype.create("types.xsd", "inputType")
  def outType = datatype.create("types.xsd", "outputType")
  prop.randomListItem = { list ->            // get random item
    list[new Random().nextInt(list.size())]
  }
  def serviceList = webservice.build {
    1.upto(100) { i ->                       // create Service1..Service100
      "Service$i"(delay: 0, failureRate: 0.0) {
        tags = ["worker"]
        // Web service operation "Process"
        Process(input: inType, response: outType) {
          Thread.sleep(delay)
          if (new Random().nextDouble() < failureRate) {
            throw new Exception("sorry!")
          }
          return outType.createDummy()
        }
      }
    }
    1.upto(20) { i ->                        // create 20 delegator services
      "CompositeService$i"() {
        tags = ["delegator", "composite"]
        processError = {}                    // initially empty function
        // Web service operation "Delegate"
        Delegate(input: inType, neededResults: hdr(int),
                 response: arrayOf(outType)) {
          def gotResults = 0
          def result = []
          while (gotResults < neededResults) {
            def workers = registry.get { "worker" in it.tags }
            def w = randomListItem(workers)  // pick random worker
            try {
              result += w.Process(input)     // invoke worker
              gotResults++
            } catch (Exception e) {
              processError(e)                // customizable error handling
            }
          }
          return result
        }
      }
    }
  }
  serviceList.each { s ->                    // deploy at random hosts
    s.deployAt(randomListItem(host.getAll()))
  }
Listing 4.1 Generation of Web services for task delegation example
Listing 4.1 covers the specification of the services. First, a set of back-end hosts is referenced and the services’ message types are imported from an XSD file. In Line 8, the DataPropagator plugin is invoked, via its alias prop, to bind a global function/closure to the shared runtime environment. The testbed itself comprises 100 simple worker services and, in addition, 20 delegators that dispatch invocations to the workers. In Lines 13–24, the worker services are built; for each we declare variables for controlling the simulation of QoS and add a tag
for distinction. For the worker’s Web service operation Process we specify its I/O message types and customize its behavior with simple code for simulating delay and failure rate, controlled via the service’s variables. The composite delegator services are created in a similar manner, but contain nested service invocations and a user-defined customization (processError()). Furthermore, a header argument is specified (neededResults), which means that it is declared as part of the SOAP header instead of the body. In Line 36 the SimpleRegistry is queried to get a list of references to worker services. Of these, random ones are picked and invoked (Line 39) in sequence, until the required number of correct responses has been reached. On faults, the customizable error handling routine named processError() is called. Eventually, the delegator service returns a list of responses. At the end of the script, the testbed is generated by deploying the modeled Web services on random hosts.

  def initClient = client.create()
  initClient.run = true               // boolean flag 'run'
  initClient.code = {                 // client code as closure
    while (run) {
      Thread.sleep(5000)              // every 5 seconds
      def refs = registry.get { "delegator" in it.tags }
      def r = randomListItem(refs)    // pick random
      def arg = inType.newInstance()
      r.Delegate(arg, 3)              // initiate delegation
    }
  }
  initClient.deployAt(host.getAll())  // run clients
Listing 4.2 Generation of clients invoking delegator Web services
In this state, however, the testbed contains only passive services awaiting invocations. In order to make it “alive” by generating activity, Listing 4.2 specifies and deploys clients which invoke random delegator services in 5-s intervals.

  def pi = callinterceptor.create()
  pi.hooks = [in: "RECEIVE", out: "SEND"]    // where to bind
  pi.code = { msg -> qosmon.analyze(msg) }   // sensor plugin
  webservice(i: pi) { s ->
    "delegator" in s.tags                    // filter closure
  } { s ->
    s.interceptors += i                      // attach to delegator services
    s.processError = { e ->
      def url = "http://somehost.com/reportError?WSDL"
      def reportWs = wsreference.create(url)
      reportWs.Report(my.webservice.name, e.message)
    }
  }
  int cycles = 1000
  while (cycles-- > 0) {
    Thread.sleep(2000)                       // every 2 seconds
    def workers = webservice.get { "worker" in it.tags }
    def w = randomListItem(workers)
    w.delay = new Random().nextInt(20 * 1000)  // 0-20 sec
    w.failureRate = new Random().nextFloat()   // 0.0-1.0
  }
  initClient.run = false                     // shut down all clients
Listing 4.3 On-the-fly manipulation/extension of running testbed
Finally, Listing 4.3 demonstrates how running testbeds can be altered at runtime. First, a call interceptor is created, which can be used, for instance, to place the QoS sensors. We make use of G2’s multicast updates and enhance all delegator services by appending the interceptor to the service model. In the same request we replace the (formerly empty) processError() routine and instruct the services to report errors to a third-party Web service. At the back-end, the WebServiceGenerator plugins will detect the change request and automatically adapt the addressed services. Furthermore, by making use of G2’s immediate synchronization of models with running testbed instances, the simulation of QoS is altered on the fly by changing the corresponding parameter variables of worker services in a random manner. In the end, the clients are shut down by changing their run flag. In this scenario we have tried to cover as many key features of G2 as possible, to demonstrate the simplicity of our scripting interface. We have used builders to create nested model structures (service→operation→datatype), designed Web services and clients with parameterizable behavior, customized behavior with closures, applied plugins (e.g., call interceptors and service invokers), performed a multicast manipulation request, and steered the running testbed via parameters. The generated testbed consists of interconnected Web services and active clients calling them. To facilitate proper testing of the QoS monitor [23], one would have to simulate not only processing time and fault rate, but also scalability, throughput, and other properties, which we have skipped for the sake of brevity. In any case, we believe that the presented scenario helps to understand how G2 is used and gives a good impression of its capabilities.
4.5 Discussion and Future Work

Certain concepts of G2 might be considered with skepticism by readers and, therefore, deserve discussion. First of all, the usage of closures, which encapsulate user-defined code, for customizations of behavior is definitely risky. As we do not check the closures for malicious code, it is, for instance, possible to assign {System.exit(0)} to some testbed instance at the back-end, to invoke it, and thereby to shut down the remote G2 instance. This security hole restricts G2 to being used only by trusted engineers. For the current prototype we accepted this restriction on purpose and kept closure-based customizations for the vast flexibility they offer.
Some readers may also consider the G2 framework too generic, since it does not generate the testbed instances itself but delegates this to the plugins, and may wonder whether it deserves to be called a “testbed generator framework” at all. In our opinion this is mainly a question of where to define the boundary between a framework and its extensions. We implemented a number of plugins which generate basic SOA artifacts, such as services, clients, and registries. If we decide to direct our future research towards non-SOA testbeds, we will be able to base this work on the G2 framework. Moreover, in the introduction we said that SOA comprises more than just Web services, but also clients, service buses, mediators, workflow engines, etc. Looking at the list of plugins which we developed (see Sect. 4.3.2), it becomes evident that we do not yet cover all these components. This is partially true, as this chapter presents the current state of our work in progress. However, we are continuously extending our plugin repertoire and will make up for the missing ones soon, e.g., by porting G1’s BPEL workflow plugin to G2. Also, G2 currently misses sophisticated support for WS-* standards, which are an essential asset for SOAP-based communication. Strictly speaking, it is possible to use call interceptors for WS-* processing, but the engineer must handle the complex processing himself/herself. We regard it as necessary to unburden him/her by providing plugins for the common standards (e.g., WS-Addressing for asynchronous communication, WS-Policy, WS-Security) and to support the creation of additional ones. Last but not least, the question might be raised why we prefer a script-based approach. The reason is that we derive a lot of flexibility from the Groovy language and see high potential in the ability to program the testbed’s behavior, compared to, for instance, composing everything in GUIs, which provides user convenience at the cost of flexibility.
4.6 Conclusion We have introduced Genesis2, a framework supporting engineers in generating testbed infrastructures for SOA. We have given an overview of the framework’s concepts and outlined its novel features which offer a high level of extensibility and customizability. Furthermore, we have used a scenario example to demonstrate how engineers can specify and program testbeds via an intuitive scripting language. We regard Genesis2 as an important contribution for the SOA testing community, as it is the first generic testbed generator that is not restricted to a specific domain but can be customized to set up testbeds of diverse components, structure, and behavior. We plan to release the software via our Web site [2] and expect that it will have significant impact on future research on automated testbed generation.
References

1. Apache CXF. http://cxf.apache.org/
2. Genesis Web site. http://www.infosys.tuwien.ac.at/prototype/Genesis/
3. Groovy Programming Language. http://groovy.codehaus.org/
4. Jakarta Bean Scripting Framework. http://jakarta.apache.org/bsf/
5. OASIS – Business Process Execution Language for Web Services. http://www.oasis-open.org/committees/wsbpel/
6. SOAP. http://www.w3.org/TR/soap/
7. Web Services Description Language. http://www.w3.org/TR/wsdl
8. WS-Agreement. http://www.ogf.org/documents/GFD.107.pdf
9. Barros, M.D., Shiau, J., Shang, C., Gidewall, K., Shi, H., Forsmann, J.: Web services wind tunnel: On performance testing large-scale stateful web services. In: DSN, pp. 612–617. IEEE Computer Society (2007)
10. Basili, V.R., Perricone, B.T.: Software errors and complexity: An empirical investigation. Commun. ACM 27(1), 42–52 (1984)
11. Bertolino, A., Angelis, G.D., Frantzen, L., Polini, A.: Model-based generation of testbeds for web services. In: TestCom/FATES, Lecture Notes in Computer Science, vol. 5047, pp. 266–282. Springer (2008)
12. Bianculli, D., Binder, W., Drago, M.L.: SOABench: performance evaluation of service-oriented middleware made easy. In: Kramer, J., Bishop, J., Devanbu, P.T., Uchitel, S. (eds.) ICSE (2), pp. 301–302. ACM (2010)
13. Canfora, G., Penta, M.D.: Testing services and service-centric systems: challenges and opportunities. IT Professional 8(2), 10–17 (2006)
14. Denaro, G., Pezzè, M., Tosi, D., Schilling, D.: Towards self-adaptive service-oriented architectures. In: TAV-WEB, pp. 10–16. ACM (2006)
15. Halima, R.B., Drira, K., Jmaiel, M.: A QoS-oriented reconfigurable middleware for self-healing web services. In: ICWS, pp. 104–111. IEEE Computer Society (2008)
16. Holanda, H.J.A., Barroso, G.C., de Barros Serra, A.: SPEWS: A framework for the performance analysis of web services orchestrated with BPEL4WS. In: ICIW, pp. 363–369. IEEE Computer Society (2009)
17. Huang, H., Tsai, W.T., Paul, R.A., Chen, Y.: Automated model checking and testing for composite web services. In: ISORC, pp. 300–307. IEEE Computer Society (2005)
18. Juszczyk, L., Truong, H.L., Dustdar, S.: Genesis – a framework for automatic generation and steering of testbeds of complex web services. In: ICECCS, pp. 131–140. IEEE Computer Society (2008)
19. Martin, E., Basu, S., Xie, T.: WebSob: A tool for robustness testing of web services. In: ICSE Companion, pp. 65–66. IEEE Computer Society (2007)
20. Michlmayr, A., Rosenberg, F., Leitner, P., Dustdar, S.: End-to-end support for QoS-aware service selection, binding and mediation in VReSCo. IEEE Trans. Services Computing (2010, forthcoming)
21. Michlmayr, A., Rosenberg, F., Platzer, C., Treiber, M., Dustdar, S.: Towards recovering the broken SOA triangle: a software engineering perspective. In: IW-SOSWE, pp. 22–28. ACM (2007)
22. Papazoglou, M.P., Traverso, P., Dustdar, S., Leymann, F.: Service-oriented computing: a research roadmap. Int. J. Cooperative Inf. Syst. 17(2), 223–255 (2008)
23. Rosenberg, F., Platzer, C., Dustdar, S.: Bootstrapping performance and dependability attributes of web services. In: ICWS, pp. 205–212. IEEE Computer Society (2006)
24. Tsai, W.T., Cao, Z., Wei, X., Paul, R.A., Huang, Q., Sun, X.: Modeling and simulation in service-oriented software development. Simulation 83(1), 7–32 (2007)
25. Tsai, W.T., Paul, R.A., Song, W., Cao, Z.: Coyote: An XML-based framework for web services testing. In: HASE, pp. 173–176. IEEE Computer Society (2002)
26. Verma, K., Sheth, A.P.: Autonomic web processes. In: ICSOC, Lecture Notes in Computer Science, vol. 3826, pp. 1–11. Springer (2005)
27. Vogels, W.: Web services are not distributed objects. IEEE Internet Computing 7(6), 59–66 (2003)
28. Wang, Y., Rutherford, M.J., Carzaniga, A., Wolf, A.L.: Automating experimentation on distributed testbeds. In: ASE, pp. 164–173. ACM (2005)
29. White, S.R., Hanson, J.E., Whalley, I., Chess, D.M., Kephart, J.O.: An architectural approach to autonomic computing. In: ICAC, pp. 2–9. IEEE Computer Society (2004)
30. Xu, W., Offutt, J., Luo, J.: Testing web services by XML perturbation. In: ISSRE, pp. 257–266. IEEE Computer Society (2005)
31. Zhang, J.: A mobile agents-based approach to test the reliability of web services. IJWGS 2(1), 92–117 (2006)
32. Zhang, J., Zhang, L.J.: Criteria analysis and validation of the reliability of web services-oriented systems. In: ICWS, pp. 621–628. IEEE Computer Society (2005)
Chapter 5
Behavior Monitoring in Self-Healing Service-Oriented Systems Harald Psaier, Florian Skopik, Daniel Schall, and Schahram Dustdar
Abstract Web services and service-oriented architecture (SOA) have become the de facto standard for designing distributed and loosely coupled applications. Many service-based applications demand a mix of interactions between humans and Software-Based Services (SBS). An example is a process model comprising SBS and services provided by human actors. Such applications are difficult to manage due to changing interaction patterns, behavior, and faults resulting from varying conditions in the environment. To address these complexities, we introduce a self-healing approach enabling recovery mechanisms to avoid degraded or stalled systems. The presented work extends the notion of self-healing by considering a mixture of human and service interactions and observing their behavior patterns. We present the design and architecture of the VieCure framework supporting fundamental principles for autonomic self-healing strategies. We validate our self-healing approach through simulations.
5.1 Introduction

Large-scale distributed applications become increasingly dynamic and complex. Adaptations are necessary to keep the system fit and running. New requirements and flexible component utilization call for updates and extensions. Thus, a challenge is the sound integration of new components and/or the redesign of established ones.

H. Psaier · F. Skopik · D. Schall · S. Dustdar
Distributed Systems Group, Vienna University of Technology, Argentinierstr. 8/184-1, 1040 Vienna, Austria

Integration
© 2010 IEEE. Reprinted, with permission, from Psaier, H., Skopik, F., Schall, D., Dustdar, S. (2010): Behavior Monitoring in Self-healing Service-oriented Systems. 34th Annual IEEE Computer Software and Applications Conference (COMPSAC), July 19–23, 2010, Seoul, South Korea. S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0_5, © Springer-Verlag/Wien 2011
must also consider changing dependencies. Unfortunately, coping with all these efforts, including deployment, integration, configuration, fine tuning, monitoring, and control of the system, has proven nearly impossible for humans alone [8]. Today’s SOAs are composed of loosely coupled services orchestrated to collaborate on various kinds of tasks. However, their benefit, modularity and an almost infinite number of combinations, fosters unpredictable behavior and, as a consequence, results in poor manageability. Mixed Systems extend the purely software-implemented capabilities of traditional service-oriented systems with human-provided services. The integration of humans and software-based services is motivated by the difficulty of adopting human expertise into software implementations. Rather than dispensing with this expertise, in Human-Provided Services (HPSs) a human handles tasks [21] behind a traditional service interface. The mix of common services based purely on software, denoted as Software-Based Services (SBSs), and HPSs forms a Mixed System. Systems with self-healing properties are part of the Autonomic Computing [8] and Self-adaptive Systems [19] research. The self-healing properties of a system enhance new or existing, unpredictably and unsatisfactorily manageable environments with self-aware recovery strategies. Hence, self-healing is considered a property of a system that comprises fault-tolerance, self-stabilization, and survivability capabilities and, on exceptions, relies also on human intervention [9, 18]. A certain self-awareness is guaranteed by a continuous flow of status information between the self-healing enhancement and the environment. Inherited from fault-tolerant systems, the success of self-healing strategies depends on the recognition of the system’s current state.
5.1.1 Self-Healing Principles

Mixed Systems are designed and built for long-term use. Once available, they are expected to remain accessible and tend to grow in size. To keep the system prevalent, new services are integrated and legacy ones are updated. New requirements and advances in novel technologies involve necessary changes. Therefore, a certain adaptability is required and expected from the system. However, the required flexibility increases the complexity of the system, and adaptations possibly cause unexpected behavior. The main goal of a self-healing approach is to avoid unpredictable behavior leading to faults. Filtered events are correlated to analyze the health of the system. The problem is identified and appropriate recovery actions are deployed [16]. The current health is usually mapped to recognizable system states, as provided by the generic three-state model for self-healing discussed, for example, in [9]. According to this classification, a system is considered to be in a healthy state when not compromised by any faults. Once a degradation of system performance caused by faults is detected, the system moves to a degraded state but still functions.
This situation is observed in particular in large-scale systems. It provides self-healing extensions with time for carefully planned recovery strategies that include not only fault recovery by repair actions, but also sound deployment and compensation of side effects. Finally, if the faults affect essential parts or a majority of the nodes, the system’s behavior becomes unpredictable and the system ultimately stalls. The system is then considered to be in an unhealthy state. Self-healing tries to avoid a stalled system. This state is prevented by a combination of self-diagnosing and self-repairing capabilities [19]. A compelling precondition for any self-healing enhancement is a continuous data-flow between the enhancement and the guarded system. According to [8], a control loop is the essence of automation in a system. In detail, [13] presents the autonomic manager as a generic layout for any self-management property, including self-healing. The manager relies on a control loop and includes monitor, analyze, plan, and execute modules.
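The three-state model and the analyze/plan part of such a control loop can be sketched as follows (an illustrative Java fragment; the thresholds and action strings are our own placeholders, not taken from [9] or [13]):

```java
// Sketch of the three-state health model driving a control loop:
// monitored fault events are analyzed into a health state, from which
// a recovery plan is chosen. Thresholds are arbitrary illustrations.
public class HealthModel {
    public enum State { HEALTHY, DEGRADED, UNHEALTHY }

    // analyze: map an observed fault ratio to a health state
    public static State analyze(double faultRatio) {
        if (faultRatio == 0.0) return State.HEALTHY;
        if (faultRatio < 0.5)  return State.DEGRADED;  // degraded but functioning
        return State.UNHEALTHY;                        // behavior unpredictable
    }

    // plan: pick a recovery action for the diagnosed state
    public static String plan(State s) {
        switch (s) {
            case HEALTHY:   return "observe";
            case DEGRADED:  return "deploy recovery actions";
            default:        return "escalate to human intervention";
        }
    }
}
```

In a full autonomic manager, monitor and execute modules would close the loop around these two steps, feeding fault events in and applying the chosen actions to the guarded system.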
5.1.2 Contributions

Possible fault sources in Mixed Systems are manifold. Failures occur on all layers: the infrastructure layer (e.g., hardware and communication channels), the implementation (e.g., mistakes and errors in application software), and the application layer (e.g., errors in utilization and incomprehensible administration). In this work we focus on a novel kind of fault source: unpredictable and faulty behavior of services in a Mixed System. For that purpose, we observe the behavior of the heterogeneous services and their interactions. In particular, we focus on task delegation behavior in a collaborative scenario. Services have a limited buffer for tasks, and excessive delegations to single nodes in the network can cause buffer overloads and, furthermore, may lead to service degradation or ultimately to failure. It is thus essential that we identify misbehavior, analyze the cause, and heal the affected services. Moreover, we use a non-intrusive healing approach which punishes misbehavior by protecting affected nodes from load and restricting the delegation options of misbehaving nodes. In this chapter we present the following contributions:
• Delegation Behavior Models. We identify the fundamental delegation behavior models and their effects on the health state of the network.
• Failure Models. We outline failure models in the system caused by misbehavior and analyze their root cause.
• VieCure Architecture. We present our self-healing framework using state-of-the-art Web services technologies.
• Recovery Strategies. We formulate algorithms to compensate the effects of misbehavior and facilitate fast system recovery.
• Evaluation. We simulate the discussed recovery strategies to enable sophisticated self-healing in mixed service-oriented networks.
H. Psaier et al.
The rest of the chapter is structured as follows. In Sect. 5.2 we outline our motivation for the chosen approach, give a guiding example scenario, and identify two types of misbehavior. Sections 5.3 and 5.4 describe the components and architecture and detail our self-healing framework. The algorithm presented in Sect. 5.5 represents our misbehavior healing approach. An evaluation with experiments follows in Sect. 5.6. Related work is discussed in Sect. 5.7, and the chapter is concluded in Sect. 5.8.
5.2 Flexible Interactions and Compositions In this section we introduce a cooperative system environment, explain the motivation for our work, and deal with the major challenges of self-healing in mixed SOA.
5.2.1 Scenario Today, processes in collaborative environments are not restricted to single companies only, but may span multiple organizations, sites, and partners. External consultants and third-party experts may be involved in certain steps of such processes. These actors perform assigned tasks with respect to previously negotiated agreements. Single task owners may consume services from external expert communities. For a single service consumer this scenario is shown in Fig. 5.1. We model a mixed expert network consisting of HPSs [21] and SBSs that belong to different communities. The members of these communities are discovered based on their main expertise areas (depicted as shaded areas), and are connected through
Fig. 5.1 Flexible cooperation of actors in an expert network (symbols: service consumer request, HPS, SBS, WSDL interface, delegation relation, profile data, expertise area)
5 Behavior Monitoring in Self-Healing Service-Oriented Systems
certain relations (see later for details). Community members receive requests from external service consumers, process them, and respond with appropriate answers. A typical use case is the evaluation of experiment results and preparation of test reports in biology, physics, or computer science by third-party consultants (i.e., the Expert Network). While the results of certain simple but often repeated experiments can be efficiently processed by SBSs, analyzing more complex data usually needs human assistance. For that purpose, HPS offers the advantage of loose coupling and flexible involvement of human experts in a service-oriented manner. Therefore, our environment uses standardized SOA infrastructures, relying on widely adopted standards, such as SOAP and the Web Service Description Language (WSDL), to unify humans and software services in one harmonized environment. Various circumstances may be the cause of inefficient task assignments in expert communities. Performance degradations can be expected when a minority of distinguished experts become flooded with tasks while the majority remains idle. Load distribution problems can be compensated by means of delegations [23]. Each expert in a community knows (i.e., realized as a ‘knows’ relation in FOAF profiles1 ) some other experts that may potentially receive delegations. We assume that experts delegate work they are not able to perform because of missing mandatory skills or due to overload conditions. Delegation receivers can accept or reject task delegations. Community members usually have explicit incentives to accept tasks, such as collecting rewards for successfully performed work to increase their community standing (reputation). Delegations work well as long as there is some agreement on members’ delegation behavior: How many tasks should be delegated to the same partner in a certain time frame? How many tasks can a community member accept without neglecting other work?
However, if misbehavior cannot be avoided in the network, its effects need to be compensated. Consider the following scenario: Someone is invited to join a community, e.g., computer scientists, in the expert network. Since she/he is new and does not know many other members, she/he is poorly connected in the network. She/he will thus receive tasks that match her/his expertise profile, but is not able to delegate to other members. Hence, she/he may get overloaded if several tasks arrive within short time spans. A straightforward solution is to find another member with similar capabilities that has free capacities. A central question in this work is how to support this process in an effective manner considering global network properties. In this chapter we focus on failures in the ad-hoc expert network. Such failures impact the network in a harmful manner by causing degradations. In particular, we deal with misbehavior of community members and highlight concepts for self-healing to recover from degraded states in SOA-based environments comprising human and software services.
1 FOAF: http://xmlns.com/foaf/spec/
Fig. 5.2 Delegation behavior models
5.2.2 Delegation Behavior
Each node, i.e., community member, has a pool of open tasks. Therefore, the load of each node varies with the amount of assigned tasks. In Fig. 5.2 the load of nodes is depicted by vertical bars. If a single node cannot process assigned tasks or is temporarily overloaded, it may delegate work to neighbor nodes. The usual delegation scenario is shown in Fig. 5.2. In that case, node a delegates work to its partner nodes b, c, and d, which are connected by channels. A channel is an abstract description of any kind of link that can transport various information for communication, coordination, and collaboration. In particular, a delegation channel has a certain capacity that determines the amount of tasks that may be delegated from a node a to a node b in a fixed time frame. In the healthy state, none of the nodes is overloaded with work.

5.2.2.1 Delegation Factory
As depicted in Fig. 5.2, a delegation factory produces unusual (i.e., unhealthy) amounts of task delegations, leading to a performance degradation of the entire network. In the example, node a accepts large amounts of tasks without actually performing them, and simply delegates them to its neighbor node d. Hence, a’s misbehavior produces high load at this node. Work overloads lead to delays and, since tasks are blocked for longer periods, to a performance degradation from a global network point of view.

5.2.2.2 Delegation Sink
A delegation sink behaves as shown in Fig. 5.2. Node d accepts more task delegations from a, b, and c than it is actually able to handle. In our collaborative network, this may happen because d either underestimates the workload or wants to increase its reputation as a valuable collaboration partner in a doubtful manner. Since d is neither able to perform all tasks nor to delegate them to colleagues (because of missing outgoing delegation channels), accepted tasks remain in its task pool.
Again, we observe misbehavior as the delegation receiver causes blocked tasks and performance degradation from a network perspective.
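The notions of capacity-limited delegation channels and bounded task buffers described above can be illustrated with a small Java sketch. All class names, buffer sizes, and capacities below are illustrative assumptions, not the chapter’s implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of capacity-limited delegation channels between nodes with
// bounded task buffers; values and names are illustrative only.
public class DelegationNetwork {
    static class Node {
        final String id;
        final int bufferSize;
        int queued = 0;
        Node(String id, int bufferSize) { this.id = id; this.bufferSize = bufferSize; }
        boolean accept() {                 // a full buffer refuses further tasks
            if (queued >= bufferSize) return false;
            queued++;
            return true;
        }
    }

    // channel capacity: maximum delegations per time frame on the edge a->b
    final Map<String, Integer> capacity = new HashMap<>();
    final Map<String, Integer> used = new HashMap<>();

    void addChannel(Node from, Node to, int cap) {
        capacity.put(from.id + "->" + to.id, cap);
        used.put(from.id + "->" + to.id, 0);
    }

    // A delegation succeeds only if channel capacity and buffer allow it.
    boolean delegate(Node from, Node to) {
        String edge = from.id + "->" + to.id;
        Integer cap = capacity.get(edge);
        if (cap == null || used.get(edge) >= cap) return false;
        if (!to.accept()) return false;
        used.put(edge, used.get(edge) + 1);
        return true;
    }
}
```

In this sketch a factory’s excess delegations are eventually refused once the receiver’s buffer is full, which is exactly the blocked-task situation the healing mechanism must compensate.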
Healing refers to compensating the effects of delegation misbehavior by adapting structures in the delegation network. This includes modifying the capacity of delegation channels, as well as adding new channels and removing existing ones.
5.3 Architecture Overview
One of the biggest challenges in Mixed Systems is to support flexible interactions while keeping the system within boundaries to avoid degraded or stalled system states. Thus, adaptation mechanisms are needed to guide and control interactions. In this section we introduce the VieCure framework to support self-healing principles in mixed service-oriented systems. Such environments demand additional tools and services to account for human behavior models and complex interactions. In the following, we present the overall architecture, inspired by existing architectural models in the self-healing and autonomic computing domain, and introduce novel components such as a behavior registry holding information regarding HPS delegation behavior. Figure 5.3 shows the overall framework model comprising three main building blocks: the SOA Environment consisting of human and software services, the Monitoring and Adaptation Layer to observe and control the actual environment, and the VieCure framework providing the main features to support self-healing actions.
5.3.1 Mixed SOA Environment
Many collaboration and composition scenarios involve interactions spanning human actors as well as software services. Traditional SOA architectures were designed to host SBSs without considering Human-Provided Services. We extend the architectural model by introducing:
• A service registry maintaining information related to human and software services.
• Definition of interaction patterns and interaction constraints using Web service technology.
• Enhanced service-related information describing human characteristics and capabilities.
The resulting environment is dynamic, because of changing behavior and profiles, and requires adaptation mechanisms due to variable load conditions (e.g., changing availability of human actors and a changing amount of tasks that need to be processed).
Fig. 5.3 Environment overview and the VieCure framework
5.3.2 Monitoring and Adaptation Layer The main building block of an environment enhanced with self-* capabilities is a feedback loop enabling adaptation of complex systems. The functions of a feedback loop can be realized as a MAPE-K cycle (Monitor, Analyze, Plan, Execute, and K denoting the Knowledge) [13]. Therefore our architecture needs to integrate the functions of this loop by performing two essential steps:
5.3.2.1 Observations Part of the knowledge base is provided by observations. Observations constitute most of the current knowledge of the system. Interaction data is gathered from the mixed system environment and stored in the logging database (denoted as Logs). Events are registered and captured in the environment, stored in historical logs, and serve as input for triggers and the diagnosis.
5.3.2.2 Recovery Actions By filtering, analyzing, and diagnosing events, an adaptation may need to be performed. Recovery actions are parts of a whole adaptation plan determined by diagnosis. Single recovery actions are deployed in the correct order and applied to the environment by the Recovery module.
5.4 VieCure Framework
The building blocks of the VieCure framework are detailed in this section. Figure 5.4 shows the fundamental interplay of VieCure’s components. The Monitoring and Adaptation Layer is the interface to the controlled environment that is observed by the framework and influenced afterward through corrective actions. All monitored interactions, such as SOAP-based task delegations (see Listing 5.1), are stored for later analysis by Interaction Logging Facilities. Environment events, including adding/removing services or state changes of nodes, are stored by similar Event Logging Facilities. Logs, events, and initial environment information represent the aggregated knowledge used by the VieCure framework to apply self-healing mechanisms. The effectiveness and accuracy of the healing techniques strongly depend on data accuracy. The Event Monitor is periodically scheduled to collect recent interactions and events from the logging facilities. Upon this data, the monitor infers higher-level composite events (c-event). Pre-configured triggers for such events, e.g., events reporting agreement violations, inform the Diagnosis Module about deviations from desired behavior. Furthermore, the actual interaction behavior of nodes is periodically updated and stored in the Behavior Registry. This mechanism assists the following diagnosis in correlating behavior changes and environment events. Furthermore, profiles in conjunction with the concept of HPSs allow us to categorize these services and determine root causes.
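The Event Monitor’s inference of composite events from logged interactions can be sketched as a windowed aggregation over log records. The record layout and the trigger threshold below are assumptions for illustration, not VieCure’s actual data model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of composite-event inference from interaction logs; the log
// record layout and the trigger threshold are illustrative assumptions.
public class EventMonitor {
    static class Interaction {
        final String fromNode;
        final String toNode;
        final long round;
        Interaction(String fromNode, String toNode, long round) {
            this.fromNode = fromNode; this.toNode = toNode; this.round = round;
        }
    }

    // Count delegations issued per node within [windowStart, windowEnd].
    static Map<String, Integer> delegationCounts(List<Interaction> log,
                                                 long windowStart, long windowEnd) {
        Map<String, Integer> counts = new HashMap<>();
        for (Interaction i : log)
            if (i.round >= windowStart && i.round <= windowEnd)
                counts.merge(i.fromNode, 1, Integer::sum);
        return counts;
    }

    // Emit a composite event for every node whose rate violates the trigger.
    static List<String> inferCompositeEvents(List<Interaction> log,
                                             long start, long end, int maxPerWindow) {
        List<String> events = new ArrayList<>();
        delegationCounts(log, start, end).forEach((node, n) -> {
            if (n > maxPerWindow) events.add("c-event: unusualDelegationRate(" + node + ")");
        });
        return events;
    }
}
```

Such composite events are what the pre-configured triggers react to before handing the affected node over to diagnosis.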
Fig. 5.4 VieCure’s fundamental mode of operation
Once a deviation-indicating composite event has triggered the Diagnosis Module, a root cause analysis is initiated. Previously captured and filtered interaction logs, as well as actual node behaviors, assist a sophisticated diagnosis in recognizing the mixed system’s health state. On failure, a set of corrective recovery actions is submitted to the Recovery module. A substantial part of recovery is the self-healing policy registry (underneath the Recovery block in Fig. 5.3), which manages the available adaptation methods. As mentioned before, adaptations and constraints applied by self-healing policies include, for example, boundaries and agreements imposed on the services, defining the interaction paths and limiting recovery strategies. The Recovery module executes the recovery actions and influences the mixed system environment through the Monitoring and Adaptation Layer.
5.4.1 Interaction Monitoring
Interactions between community members of the expert network are modeled as standardized SOAP messages with header extensions (see also [23]), as shown in Listing 5.1.

rule "..."
when
    node:Node(numTasksQueued > 50 && role == "worker")
    recoveryActionList:ArrayList()
then
    Node neighbor = Utils.lookupNodeSimilarCapabilities(node);
    RecoveryAction ctlCapacity = new CtlCapacity(neighbor, node);
    recoveryActionList.add(ctlCapacity);
end

rule "TriggerUnusualDelegationRateWorker"
when
    node:Node(numTasksQueued > 15 && delegationRate < 2)
then
    ...
end
Listing 5.3 Triggering events and setting recovery actions
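Outside a rule engine, the first trigger rule of Listing 5.3 amounts to a threshold check over the node population. The following plain-Java paraphrase is only a sketch: the Node fields mirror those used in the listing, while the returned action strings are stand-ins for real RecoveryAction objects.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java paraphrase of the trigger rule in Listing 5.3; Node and
// the capability lookup are simplified stand-ins, not VieCure classes.
public class TriggerSketch {
    static class Node {
        final String id;
        final String role;
        final int numTasksQueued;
        Node(String id, String role, int numTasksQueued) {
            this.id = id; this.role = role; this.numTasksQueued = numTasksQueued;
        }
    }

    // when node(numTasksQueued > 50 && role == "worker")
    // then plan a ctlCapacity recovery action for that node
    static List<String> fire(List<Node> nodes) {
        List<String> recoveryActionList = new ArrayList<>();
        for (Node node : nodes)
            if (node.numTasksQueued > 50 && node.role.equals("worker"))
                recoveryActionList.add("ctlCapacity(neighborOf(" + node.id + "), " + node.id + ")");
        return recoveryActionList;
    }
}
```

A rule engine such as Drools evaluates these conditions declaratively over inserted facts; the sketch only shows the equivalent predicate logic.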
The final step in the healing process is to execute recovery actions. Listing 5.4 shows an example of how such recovery actions can be performed in our system. As mentioned previously, an approach for recovering from a degraded system state is the regulation of delegation behavior between actors (HPSs). This is accomplished by sending the corresponding recovery action to an Activity Management Service (see [20] for details). In Listing 5.4, a ControlAction of type Coordination is depicted, regulating the flow of delegations between two actors. Each Coordination action has a unique identifier and is applied in the context of an activity. The ControlAction also specifies what kind of ActionType has to be regulated as a result of a recovery. In this example, regulation applies to Delegation actions by changing the capacity of delegation channels.

http://www.expertweb.org/Actor#Harald
http://myhps.org/Action/Delegation
http://www.expertweb.org/Actor#Florian
Listing 5.4 Control action to recover from degraded system state
4 http://jboss.org/drools
Fig. 5.5 Self-healing recovery actions for a failure-affected node
5.5 Regulation of Behavior
In our self-healing algorithm for Mixed Systems we opted to regulate a node’s behavior in a non-intrusive manner. Instead of healing misbehavior directly at the nodes, we influence their behavior by restricting delegations, establishing new delegation channels, and redirecting work. Next, we outline the modules of our self-healing mechanism in Algorithm 1 and detail and analyze the concepts with respect to the failure scenario in Fig. 5.5.
5.5.1 Trigger The first module (lines 1–5), a trigger, represents a filter for the failure scenario in Fig. 5.5. As a prerequisite, any agreements and constraints monitored by this self-healing approach need to be expressed as threshold values. These values are an integral part of the decision logic of a trigger module.
5.5.2 Diagnosis A recognized violation fires the second module (line 6–23), the diagnosis. It defines the necessary recovery actions by analyzing the result of the task history evaluation of the failing node.
5.5.3 Recovery Actions The possible resulting recovery actions are listed in the last three modules (lines 24–37). The first balances the load of a failing node by restricting incoming delegations. The second provides the failing node with new delegation channels for blocked tasks. The last assists neighbors by providing new delegation channels to alternative nodes. As mentioned before, a loop-style data-flow between the guarded system and the self-healing mechanism allows us to observe changes. Changes leading to possible failures are recognized by the mechanism by directing the data-flow through the trigger modules’ logic. In Algorithm 1, trigger triggerQueueOverload filters events which indicate a threshold violation of the task queue capacity of a node (line 3). Such an event causes triggerQueueOverload to fire the related diagnosis diagnoseBehavior, passing on the information of the failure-affected node. For example, in Fig. 5.5 the congestion of node b is reported as such an event.
Algorithm 1 Detection of misbehavior and recovery actions
Require: Monitoring of all nodes
Require: Listen to events
1: Trigger triggerQueueOverload(event)
2:   node ← event:node /*affected node*/
3:   if q > #q then
4:     fire diagnoseBehavior(node)
5:   end if
6: /*diagnose sink and factory behavior*/
7: Diagnosis diagnoseBehavior(node)
8:   recActs ← ∅ /*set of returned recovery actions*/
9:   recActs.add(addChannel(node))
10:  analyzeTaskHistory(node)
11:  for neighbor ∈ affectedNeighbors(node)
12:    if (rankTasks(node) > #pref) or (p < #p) then
13:      /*root cause: sink behavior*/
14:      recActs.add(redDeleg(neighbor))
15:      recActs.add(ctlCapacity(neighbor, node))
16:    else if (q < #q) and (d > #d) then
17:      /*root cause: factory behavior*/
18:      recActs.add(ctlCapacity(neighbor, node))
19:    else
20:      /*root cause: transient degradation*/
21:      recActs.add(redDeleg(neighbor))
22:    end if
23:  return recActs
24: /*recovery action: control capacity*/
25: Recovery Action ctlCapacity(neighbor, node)
26:   cap ← estimateCapacity(neighbor, node)
27:   setCapacity(cap)
28: /*recovery action: add channel*/
29: Recovery Action addChannel(node)
30:   simNode ← lookupNodeSameCapabilities(node)
31:   addDelChannel(node, simNode)
32:   ctlCapacity(node, simNode)
33: /*recovery action: redirect delegations*/
34: Recovery Action redDeleg(neighbor)
35:   simNode ← lookupNodeRequiredCapabilities(neighbor)
36:   addDelChannel(neighbor, simNode)
37:   ctlCapacity(neighbor, simNode)
As a first precaution in diagnoseBehavior, the algorithm balances the load at node and adds recovery action addChannel to the recovery result-set recActs. The idea is to relieve node by providing it with new delegation options to nodes with sufficiently free capacities. The task of this recovery action is to discover a node that has capabilities similar to node. Once the delegation channel is added, estimateCapacity in the ctlCapacity method estimates the maximum possible task transfer with regard to the discovered node’s processing capabilities. Finally, setCapacity controls the throughput accordingly. Next, in analyzeTaskHistory the diagnosis derives a root cause from the reported node’s task history. A repository of classified failure patterns is compared to the recent behavior patterns of the node, and the corresponding root cause is returned. In a loop over the affected neighbors (line 11), the behavior is then analyzed.
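The two recovery primitives used above can be sketched together: addChannel discovers a similarly capable node and then bounds the new channel via ctlCapacity. The free-slot-based capacity estimate below is an illustrative assumption, not the chapter’s estimateCapacity implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the two recovery primitives of Algorithm 1; the free-slot
// based capacity estimate is an assumption for illustration only.
public class RecoverySketch {
    // delegation channel capacities, keyed by "from->to"
    final Map<String, Integer> channels = new HashMap<>();

    // ctlCapacity: bound a channel by the receiver's free queue slots,
    // never dropping below a minimal capacity of one task per time frame.
    int ctlCapacity(String from, String to, int receiverFreeSlots) {
        int cap = Math.max(1, receiverFreeSlots / 2);  // assumed estimate
        channels.put(from + "->" + to, cap);
        return cap;
    }

    // addChannel: create a channel to a similarly capable node, then bound it.
    int addChannel(String node, String simNode, int simNodeFreeSlots) {
        return ctlCapacity(node, simNode, simNodeFreeSlots);
    }
}
```

The design point is that recovery never commands a node directly; it only reshapes the channel structure the node delegates through.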
5.5.4 Sink Behavior
Line 12 identifies sink behavior. The result of the pattern analysis shows that node is still accepting tasks from any neighbor, but prefers to work on tasks of a certain neighbor and delays the tasks of the other nodes. The second misbehavior of a sink is to perform tasks below an expected rate (p < #p). The additional counter actions try to provide options for the set of affected delegating neighbor nodes and to decouple the sink. Recovery action redDeleg finds the alternatives and again estimates the adequate capacity of the new delegation channels. Recovery action ctlCapacity sets the delegation rate between the sink and its neighbors to a minimum. The situation is depicted in Fig. 5.5. Delegation channel (ii) is added from b to the similarly capable node d and allows b to dispense a certain amount of capability-matching tasks. Delegation channel (iii) from a to d is a result of redDeleg. In our example, d has enough resources to process blocked tasks (from b) and new tasks (from a). The amount of recently delegated tasks is balanced in estimateCapacity. Thereafter, the capacity of delegation channel (i) is minimized. The limitation of the delegations depends on the content of b’s task queue. The example assumes that it mostly contains tasks from a. If the capacity of delegation channel (iii) is too low for a’s delegation requirements, a might consider processing the tasks itself, or discover an additional node for delegation. The whole scenario is also applicable to factory behavior of a. In that case, further uncontrolled delegations of a are avoided and no new delegation channel (iii) would be added.
5.5.5 Factory Behavior
Line 16 detects delegation factory behavior. A factory is identified by moderate use of queue capacity (q < #q) in contrast to high and exceeding delegation rates (d > #d), causing overloaded nodes despite available alternatives. Recovery restricts the delegations from the factories to node, expecting that the factories either increase their task processing performance or find other nodes for delegations.
Besides releasing the load from node, ctlCapacity ensures that the delegation of tasks from a factory to node is set to a minimum.
5.5.6 Transient Behavior
If, in line 19, neither factory nor sink behavior is recognized, diagnoseBehavior must assume a temporary overload of node. As a second precaution the algorithm determines alternative delegation nodes in redDeleg for the neighbors of node.
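The three diagnosis branches of Algorithm 1 (lines 12, 16, and 19) reduce to comparing a node’s queue usage, processing rate, and delegation rate against thresholds. The classifier below is a sketch; the normalized metrics and the concrete threshold values are assumed for illustration and do not come from the chapter.

```java
// Sketch of the diagnosis branching in Algorithm 1; the thresholds
// standing in for #q, #p, and #d are illustrative assumptions.
public class DiagnosisSketch {
    static final double Q_MAX = 0.8;   // stand-in for #q: queue usage threshold
    static final double P_MIN = 0.3;   // stand-in for #p: expected processing rate
    static final double D_MAX = 0.5;   // stand-in for #d: tolerated delegation rate

    // All three metrics are assumed to be normalized to [0,1].
    static String rootCause(double queueUsage, double processingRate,
                            double delegationRate, boolean prefersSingleNeighbor) {
        if (prefersSingleNeighbor || processingRate < P_MIN)
            return "sink";                       // accepts work it never performs
        if (queueUsage < Q_MAX && delegationRate > D_MAX)
            return "factory";                    // delegates instead of processing
        return "transient";                      // temporary overload, no misbehavior
    }
}
```

Note that the order of the checks matters: a node that both hoards and delegates is treated as a sink first, matching the branch order of the algorithm.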
5.6 Simulation and Evaluation
In our experiments we evaluate the effectiveness of the previously presented recovery algorithms (cf. Sect. 5.5) in a simulated mixed SOA environment. Figure 5.6 outlines the controllable simulation environment (left-hand side of the figure) used for our experiments. We took interaction logs from the real mixed SOA environment (right-hand side) to reconstruct its main characteristics.
5.6.1 Simulation Setup
5.6.1.1 Simulated Heterogeneous Service Environment
The simulated interaction network comprises a node actor framework implemented in Java. At bootstrapping, the nodes receive a profile including different
Fig. 5.6 Simulation setup
behavior models. Each node has a task list with limited capacity. Depending on the deployed behavior model, a node tends either to delegate or to process tasks, or exposes a balanced behavior. New tasks are constantly provided to a quarter of the nodes via connected entry points. Tasks have an effort of three units. A global timer initiates the simulation rounds. Depending on the behavior model, in each round a node decides to process tasks or to delegate one task. A node is able to process the effort of a whole task or, if delegating, only one effort unit. For the delegation activity, a node holds a current neighbor list which is ordered according to the neighbors’ task processing tendency. The delegating node prefers nodes with processing behavior and assigns the selected node the longest-remaining task. A receiving node with a task queue at its upper boundary refuses additional tasks. However, each task is limited by a ten-round expiry. If a task is not processed entirely within this period, it is considered a failed task.
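The task lifecycle of the simulation (an effort of three units per task, failure on a ten-round expiry) can be captured in a small sketch. Only the two constants are taken from the text; the API around them is assumed.

```java
// Sketch of the task-expiry rule from the simulation setup: a task has
// an effort of three units and fails if not fully processed within ten
// rounds of its creation. The constants mirror the text; the API is
// an illustrative assumption.
public class TaskSketch {
    static final int EFFORT = 3;
    static final int EXPIRY_ROUNDS = 10;

    static class Task {
        final int createdRound;
        int remainingEffort = EFFORT;
        Task(int createdRound) { this.createdRound = createdRound; }
        void process() { remainingEffort = 0; }       // a processing node finishes it
        boolean failed(int currentRound) {            // expired before completion?
            return remainingEffort > 0 && currentRound - createdRound > EXPIRY_ROUNDS;
        }
    }
}
```

The failed-task count over all rounds is exactly the metric the experiments below use to compare recovery strategies.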
5.6.1.2 VieCure Setup
At bootstrapping, the VieCure monitoring and adaptation layer is instantiated. In our simulated environment the monitor has an overview of all nodes. Thus, the monitor provides the VieCure framework with a current node list together with the nodes’ task queue levels. A trigger filters the queues’ levels and reports to diagnosis if the lower threshold value is exceeded. Diagnosis then estimates the actual level and decides, based on the recorded history together with the current situation, which recovery action to choose. For the purpose of evaluating the recovery actions, we required diagnosis to act predictably and to select recovery actions according to our configuration.
5.6.1.3 Recovery Actions
Two of the recovery actions outlined in Sect. 5.5 were implemented. In control capacity, the delegation throughput to the affected node is adapted according to its current task queue level. In add channel, the filtered node is provided with a new channel to the node with the currently lowest task queue load factor. In order to evaluate the effects of the recovery actions we executed four different runs with the same setting. At the end of each experiment the logging facilities of the VieCure framework provided us with all the information needed for analysis. The results are presented next.
5.6.2 Results and Discussion The experiments measure the efficiency of a recovery action by the amount of failed tasks. An experiment consists of a total number of 150 rounds and a simulation
Fig. 5.7 Equal distribution of behavior models. (a) Current failure rate. (b) Final overall success rate
environment with 128 nodes. During an experiment, 4,736 tasks are assigned to the node network. In order to prevent an initial overload of a single node as a result of too many neighbor relations, we limited the number of incoming delegation channels to a maximum of 6 at start-up. The resulting figures present on their left the total of failed tasks after a certain simulation round; the curves show the progress of different configurations of VieCure’s diagnosis module. The figures on the right represent the ratio of failed to processed tasks in percent at the end of the experiments with an equal setting. The setting for the results in Fig. 5.7 consisted of an equal number of the three behavior models distributed among the nodes. Whilst the nodes on their own produce a total of 2,083 failed tasks (top continuous curve), the two different recovery actions separately expose an almost equal progress and finish at almost half as much: 1,171 failed tasks for the add channel action and 1,164 for the control capacity action, respectively. Combining both diminishes the failure rate to a quarter compared to no action, namely 482 failed tasks (lower continuous curve). The results demonstrate that in an equilibrated environment our two recovery actions perform almost equally and complement each other when combined. In Fig. 5.8 the setting configured a tenth of the nodes with factory tendency and an equal distribution of the other two models across the remaining nodes. An immediate result of the dominance of task processing nodes is that fewer tasks fail overall. The failure rate for the experiment with no recovery falls to a total of 1,693 (top continuous curve). The success of add channel (dashed curve) remains almost the same (1,143 failed tasks), since with this unbalanced setting the potential neighbors for a channel addition remain the same as in the previous setting.
In contrast, the success of control capacity (dotted curve, 535) relies on the fact that regulating channels ensures that the number of tasks in a queue relates to the task processing capability given by a node’s behavior. In the strategy combination (lower continuous curve, 77), this balancing mechanism is supported by additional channels to eventually still failing nodes. The results are also reflected by the success rate figure. In Fig. 5.9 the setting was changed to a 10% trend towards sink behavior.
Fig. 5.8 Distribution with a trend for 10% factory behavior. (a) Current failure rate. (b) Final overall success rate
Fig. 5.9 Distribution with a trend for 10% sink behavior. (a) Current failure rate. (b) Final overall success rate
Without a recovery strategy the environment performs almost the same as in the previous setting (top continuous curve, 1,815). The strategy of just adding channels to overloaded nodes fails: instead of relieving nodes from the task load, tasks circulate until they expire. Thus, 2,022 tasks fail for add channel (dashed curve). The figure further shows that this problem also impacts the combination of the two strategies (lower continuous curve, 1,157). The best solution for this setting is to inhibit the dominating sink behavior by controlling the channels’ capacity (dotted curve, 753).
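The final success rates follow directly from the failed-task counts and the 4,736 tasks assigned per experiment. For the equal-distribution setting of Fig. 5.7, a quick recomputation confirms that the combined strategy improves the success rate by roughly 34 percentage points over no recovery:

```java
// Recomputing final success rates from the reported failed-task counts
// (4,736 tasks per experiment): no recovery fails 2,083 tasks, the
// combined strategy fails 482.
public class SuccessRate {
    static double successPercent(int failed, int total) {
        return 100.0 * (total - failed) / total;
    }
}
```

This matches the “about 30% higher success rate” summarized in the conclusion: no recovery yields roughly 56% success, the combined strategy roughly 90%.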
5.7 Related Work
The concepts of self-healing are applicable in various research domains [9]. Thus, a vast amount of research is available on self-healing designs for different areas. These include higher layers, such as models and systems’ architecture [4, 7] and the application layer; of particular interest for our research are large-scale agent-based systems [2, 6, 26], Web services [11] and their orchestration [1]. In the middle,
self-healing ideas can be found for middleware [3, 17], and at a lower layer self-healing designs include operating systems [22, 25], embedded systems, networks, and hardware [10]. The two main emerging directions that include self-healing research are autonomic computing [13, 24] and self-adaptive systems [19]. Whilst autonomic computing includes research on all possible layers, self-adaptive systems focus primarily on research above the middleware layer with a more general approach. With current systems growing in size and ever-changing requirements, plenty of challenges remain to be faced, such as autonomic adaptations [16] and service behavior modeling [15]. The self-healing research demonstrated in this chapter relates strongly to the challenges in Web services and workflow systems. Apart from the cited work, substantial research on self-healing techniques in Web service environments has been conducted in the course of the European Web service technology research project WS-Diamond (Web Service DIAgnosability, MONitoring and Diagnosis). The recent contributions focus in particular on QoS-related self-healing strategies and the adaptation of BPEL processes [11, 12]. Others are theoretical discussions of self-healing methodologies [5]. Human-Provided Services [21] close the gap between software-based services and humans who want to provide their skills and expertise as a service in a collaborative process. Instead of a strict, predefined process flow, these systems are characterized by ad-hoc contribution requests and loosely structured collaborations. The required flexibility makes the system even more unpredictable, a property responsible for various faults. In our approach we monitor failures caused by misbehavior of service nodes. The contributed self-healing method recovers by soundly restricting delegation paths or establishing new connections between the nodes.
5.8 Conclusion and Outlook
In our work we analyze misbehavior in Mixed Systems with our novel VieCure framework, comprising an assembly of cooperating self-healing modules. We extract the monitored misbehaviors into models and diagnose them with our self-healing algorithms. The recovery actions of the algorithm heal the identified misbehaviors in a non-intrusive manner. The evaluations in this work have shown that our recovery actions satisfactorily compensate the misbehaviors in a Mixed System (about 30% higher success rate with an equal distribution of behavior models). The success rates of the recovery actions depend on the environment settings. In all but one of the cases, deploying recovery actions supports the overloaded nodes, resulting in a higher task processing rate. It is important to note that the failure rate increases nearly linearly even when recovery actions adjust the nodes’ network structure. This observation supports our aim of implementing non-intrusive self-healing recovery strategies. Future work will involve the integration of VieCure into the GENESIS testbed framework [14] in order to interface the controlling capabilities of the framework
with VieCure's self-healing implementations. Experiments in this testbed environment will provide us with more accurate data when extending VieCure with additional self-healing policies to cover new models of misbehavior in Mixed Systems.
References

1. Baresi, L., Guinea, S., Pasquale, L.: Self-healing BPEL processes with Dynamo and the JBoss rule engine. In: ESSPE, pp. 11–20 (2007)
2. Bigus, J.P., Schlosnagle, D.A., Pilgrim, J.R., Mills III, W.N., Diao, Y.: ABLE: A toolkit for building multiagent autonomic systems. IBM Syst. J. 41(3), 350–371 (2002)
3. Blair, G.S., Coulson, G., Blair, L., Duran-Limon, H., Grace, P., Moreira, R., Parlavantzas, N.: Reflection, self-awareness and self-healing in OpenORB. In: WOSS, pp. 9–14 (2002)
4. Cheng, S.W., Garlan, D., Schmerl, B.R., Sousa, J.P., Spitnagel, B., Steenkiste, P.: Using architectural style as a basis for system self-repair. In: WICSA, pp. 45–59 (2002)
5. Cordier, M., Pencolé, Y., Travé-Massuyès, L., Vidal, T.: Characterizing and checking self-healability. In: ECAI, pp. 789–790 (2008)
6. Corsava, S., Getov, V.: Intelligent architecture for automatic resource allocation in computer clusters. In: IPDPS, p. 201.1 (2003)
7. Dashofy, E.M., van der Hoek, A., Taylor, R.N.: Towards architecture-based self-healing systems. In: WOSS, pp. 21–26 (2002)
8. Ganek, A.G., Corbi, T.A.: The dawning of the autonomic computing era. IBM Syst. J. 42(1), 5–18 (2003)
9. Ghosh, D., Sharman, R., Raghav Rao, H., Upadhyaya, S.: Self-healing systems – survey and synthesis. Decis. Support Syst. 42(4), 2164–2185 (2007)
10. Glass, M., Lukasiewycz, M., Reimann, F., Haubelt, C., Teich, J.: Symbolic reliability analysis of self-healing networked embedded systems. In: SAFECOMP, pp. 139–152 (2008)
11. Halima, R., Drira, K., Jmaiel, M.: A QoS-oriented reconfigurable middleware for self-healing Web services. In: ICWS, pp. 104–111 (2008)
12. Halima, R., Guennoun, K., Drira, K., Jmaiel, M.: Non-intrusive QoS monitoring and analysis for self-healing Web services. In: ICADIWT, pp. 549–554 (2008)
13. IBM: An architectural blueprint for autonomic computing. IBM White Paper (2005)
14. Juszczyk, L., Truong, H.L., Dustdar, S.: GENESIS – a framework for automatic generation and steering of testbeds of complex Web services. In: ICECCS'08, pp. 131–140 (2008)
15. Kaschner, K., Wolf, K.: Set algebra for service behavior: Applications and constructions. In: BPM '09, pp. 193–210. Springer, Berlin, Heidelberg (2009). DOI 10.1007/978-3-642-03848-8_14
16. Kephart, J.O.: Research challenges of autonomic computing. In: ICSE, pp. 15–22 (2005)
17. Ledoux, T.: OpenCorba: A reflective open broker. In: Reflection, pp. 197–214 (1999)
18. Psaier, H., Dustdar, S.: A survey on self-healing systems – approaches and systems. Computing 87(1) (2010)
19. Salehie, M., Tahvildari, L.: Self-adaptive software: Landscape and research challenges. ACM TAAS 4(2), 1–42 (2009)
20. Schall, D., Dorn, C., Dustdar, S., Dadduzio, I.: VieCAR – enabling self-adaptive collaboration services. In: SEAA '08, pp. 285–292. IEEE Computer Society, Washington, DC (2008). DOI 10.1109/SEAA.2008.25
21. Schall, D., Truong, H.L., Dustdar, S.: Unifying human and software services in Web-scale collaborations. IEEE Internet Comput. 12(3), 62–68 (2008)
22. Shapiro, M.W.: Self-healing in modern operating systems. ACM Queue 2(9), 66–75 (2005)
23. Skopik, F., Schall, D., Dustdar, S.: Trusted interaction patterns in large-scale enterprise service networks. In: Euromicro PDP, pp. 367–374 (2010)
24. Sterritt, R.: Autonomic computing. ISSE 1(1), 79–88 (2005)
25. Tanenbaum, A., Herder, J., Bos, H.: Can we make operating systems reliable and secure? Computer 39(5), 44–51 (2006)
26. Tesauro, G., Chess, D.M., Walsh, W.E., Das, R., Segal, A., Whalley, I., Kephart, J.O., White, S.R.: A multi-agent systems approach to autonomic computing. In: AAMAS, pp. 464–471 (2004)
Chapter 6
Runtime Behavior Monitoring and Self-Adaptation in Service-Oriented Systems

Harald Psaier, Lukasz Juszczyk, Florian Skopik, Daniel Schall, and Schahram Dustdar
Abstract Mixed service-oriented systems composed of human actors and software services build up complex interaction networks. Without any coordination, such systems may exhibit undesirable properties due to unexpected behavior. Moreover, communications and interactions in such networks are not preplanned by top-down composition models. Consequently, the management of service-oriented applications is difficult because changing interaction and behavior patterns may contradict each other and, under varying conditions and misbehavior in the network, result in faults. In this chapter we present a self-adaptation approach that regulates local interactions to maintain desired system functionality. To prevent degraded or stalled systems, adaptations operate by link modification or substitution of actors based on similarity and trust metrics. Unlike a security perspective on trust, we focus on the notion of socially inspired trust. We design an architecture based on two separate, independent frameworks: one provides a real Web service testbed that is extensible with dynamic adaptation actions; the other is our self-adaptation framework, including all modules required by systems with self-* properties. In our experiments we study a trust- and similarity-based adaptation approach by simulating dynamic interactions in the real Web services testbed.
H. Psaier () L. Juszczyk F. Skopik D. Schall S. Dustdar Distributed Systems Group, Vienna University of Technology, Argentinierstr 8/184-1, 1040 Vienna, Austria e-mail:
[email protected];
[email protected];
[email protected];
[email protected] c 2010 IEEE. Reprinted, with permission, from Psaier, H., Juszczyk, L., Skopik, F., Schall, D., Dustdar, S. (2010) Runtime Behavior Monitoring and Self-Adaptation in Service-Oriented Systems. 4th IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO’10), September 27 – October 1, 2010. Budapest, Hungary S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0 6, © Springer-Verlag/Wien 2011
6.1 Introduction

Service-oriented architecture (SOA) implementations are typically designed as large-scale systems. Applications are composed from the capabilities of distributed services that are discovered at runtime. Dynamic, loosely bound systems make the management of large-scale distributed applications increasingly complex. Adaptations are necessary to keep the system within well-defined boundaries such as expected load or desired behavior. Changing requirements and flexible utilization demand comprehensive analysis of the resulting effects prior to integration. Changes interfere with established services, connections, or policies and, above all, affect dependencies. However, service compositions must be maintained and adapted depending on predefined runtime properties such as quality of service (QoS) [32] and behavior [22]. In this work we propose a monitoring and self-adaptation approach for service-oriented collaboration networks. We consider systems that are based on the capabilities of human actors, defined as Human-Provided Services (HPSs) [25], and traditional Software-Based Services (SBSs). The integration of humans and software-based services is motivated by the difficulty of adopting human expertise in software implementations. Instead of dispensing with human capabilities, people handle tasks behind traditional service interfaces. In contrast to process-centric flows (top-down compositions), we advocate flexible compositions wherein services can be added at any time, exhibiting new behavior properties. However, especially the involvement of and dependencies on humans as part of flexible compositions make the functioning of applications difficult to determine. Heterogeneity has a major impact on all aspects of the system since system dynamics and evolution are driven by software services and human behavior [22]. A main challenge is to monitor, analyze, and evaluate specific behaviors which may affect system performance or reliability.
We present a solution to this problem based on an architecture with a Web services testbed [17] at its core. The testbed makes it possible to simulate and track the effects on a composition resulting from different environmental conditions. The success of self-adaptation strategies commonly depends on the recognition of the system's current state and of potential actions to achieve the desired improvements. This chapter presents the following novel key contributions:
• Modeling and simulating human behavior in service-oriented collaboration networks.
• A flexible interaction model for service-oriented systems. The interaction model is based on delegation actions performed by actors. Associated tasks are routed through the system following standard WS-Addressing techniques.
• Models for misbehavior and related repair actions to prevent inefficient or degraded system performance. We identify the delegation factory and delegation sink patterns and their behavior.
• Discovery of delegation receivers to prevent or mitigate misbehavior. We present a novel trust metric based on profile similarity measurements.
The chapter’s structure is as follows. Section 6.2 provides a motivating scenario for service-oriented collaboration systems. Section 6.3 explains the concepts of similarity and trust used for adaptation strategies. Section 6.4 outlines the twofold system architecture. Section 6.5 details the aspects of behavior monitoring. Experiments and results are discussed in Sect. 6.6 followed by related work in Sect. 6.7. Section 6.8 concludes the chapter.
6.2 On Self-Adaptation in Collaborative SOA

The goal of self-adaptation in service-oriented systems is to prevent the running system from drifting toward unexpectedly low performance. As in autonomic computing, the aim is to create robust, dependable, self-managing systems [28]. The established methodology of autonomic computing [10], as well as that of self-adaptive systems [23], is to design and implement a control-feedback loop. This feedback loop is known as the MAPE cycle, consisting of four essential steps: monitor, analyze, plan, and execute. Systems that adapt themselves autonomously are enhanced with sensors and effectors that allow network model creation and adaptation strategies. This provides the necessary self-awareness to manage the system autonomously. Figure 6.1 illustrates the proposed approach to manage and adapt service-oriented collaboration networks. Such systems comprise different kinds of actors, services, and compositions thereof. Interactions are captured from the system through interceptor and logging capabilities. The monitoring component feeds interaction logs into a network representation of actors and their relations. Behavior patterns are analyzed based on a network model. A self-adaptation engine evaluates policies to trigger potential adaptation strategies. Adaptations include structural change (link modification) and actor substitution. Our approach with two frameworks allows testing of adaptation strategies in versatile service-based application scenarios. Examples are crowdsourcing applications [30] in enterprise environments or open Internet-based platforms. These online platforms distribute problem-solving tasks among a group of humans. Crowdsourcing follows the 'open world' assumption, allowing humans to provide
Fig. 6.1 Self-adaptation and behavior monitoring approach
their capabilities to the platform by registering themselves as services. Some of the major challenges [6] are the monitoring of crowd capabilities, the detection of missing capabilities, strategies to gather those capabilities, and the tracking of tasks' status. In the following we discuss collaborations in service-oriented networks. Processes in collaborative environments are not restricted to single companies only, but may span multiple organizations, sites, and partners. External consultants and third-party experts may be involved in certain steps of such processes. These actors perform assigned tasks with respect to previously negotiated agreements. Single task owners may consume services from external expert communities. A typical use case is the evaluation of experiment results and the preparation of test reports in biology, physics, or computer science by third-party consultants (i.e., the network of experts). While the results of certain simple but often repeated experiments can be efficiently processed by software services, analyzing more complex data usually needs human assistance. We model a mixed expert network consisting of Human-Provided and Software-Based Services belonging to different communities. The members of these communities are discovered based on their main expertise areas, and connected through certain relations (detailed in the following sections). Community members receive requests from external service consumers, process them, and respond to the requests. Our environment uses standardized SOA infrastructures, relying on widely adopted standards, such as SOAP and the Web Service Description Language (WSDL), to combine the capabilities of humans and software services. Various circumstances may cause inefficient task assignments in expert communities. Performance degradations can be expected when a minority of distinguished experts becomes flooded with tasks while the majority remains idle. Load distribution problems can be compensated with delegations [9, 27].
Each expert in a community is connected to other experts that may potentially receive delegations. We assume that experts delegate work they are not able to perform because of missing mandatory skills or due to overload conditions. Delegation receivers can accept or reject task delegations. Community members usually have explicit incentives to accept tasks, such as collecting rewards for successfully performed work to increase their community standing (reputation). Delegations work well as long as there is some agreement on members' delegation behavior: How many tasks should be delegated to the same partner in a certain time frame? How many tasks can a community member accept without neglecting other work? However, if misbehavior cannot be avoided in the network, its effects need to be compensated. We identify two types of misbehavior: delegation factory and delegation sink. A delegation factory produces unusual (i.e., unhealthy) amounts of task delegations, leading to a performance degradation of the entire network. For example (see Fig. 6.1), a node v may accept large amounts of tasks without actually performing them, simply delegating them to one of its neighboring nodes (e.g., w). Hence, v's misbehavior produces high load at the neighboring node w. Work overloads lead to delays and, since tasks are blocked for longer, to a performance degradation from a global network point of view. A delegation sink can be characterized by the following behavior. Node w accepts more task delegations from u, v, and x than it is actually able to handle. In our collaborative network, this may happen
due to the fact that w either underestimates the workload or wants to increase its reputation as a valuable collaboration partner in a dubious manner. Since w is actually neither able to perform all tasks nor to delegate them to colleagues (because of missing outgoing delegation links), accepted tasks remain in its task pool. Again, we observe misbehavior, as the delegation receiver causes blocked tasks and performance degradation from a network perspective. Our approach provides a testing environment for such applications to address the related challenges.
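The two misbehavior patterns can be made concrete with a small counting heuristic over a delegation log. The log format, thresholds, and function names below are invented for illustration; they are not the trigger rules actually used by the framework:

```python
# Sketch: labeling nodes as delegation factories or sinks from an event log.
# The event format and thresholds are illustrative assumptions.

def classify(log, delegate_limit=5, backlog_limit=5):
    accepted, delegated, completed = {}, {}, {}
    counters = {"accept": accepted, "delegate": delegated, "complete": completed}
    for event, node in log:
        counters[event][node] = counters[event].get(node, 0) + 1
    labels = {}
    for node in set(accepted) | set(delegated):
        if delegated.get(node, 0) >= delegate_limit:
            labels[node] = "factory"   # floods its neighbors with delegations
        elif accepted.get(node, 0) - completed.get(node, 0) >= backlog_limit:
            labels[node] = "sink"      # accepts far more than it completes
    return labels

# node v forwards everything it accepts; node w hoards accepted tasks
log = ([("accept", "v"), ("delegate", "v")] * 5 +
       [("accept", "w")] * 6 + [("complete", "w")])
labels = classify(log)   # {"v": "factory", "w": "sink"}
```

The heuristic mirrors the prose above: a factory is conspicuous through its outgoing delegation volume, a sink through the gap between accepted and completed tasks.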
6.3 Profile Similarity and Dynamic Trust

Collaborative networks, as outlined in the previous sections, are the subject of our trust studies. Unlike a security view, we focus on the notion of dynamic trust from a social perspective [33]. We argue that trust between community members is essential for successful collaborations. The notion of dynamic trust refers to the interpretation of previous collaboration behavior [12, 27] and considers the similarity of dynamically adapting skills and interests [11, 21]. Especially in collaborative environments, where users are exposed to higher risks than in common social network scenarios, and where business is at stake, considering trust is essential to effectively guide human interactions. In this chapter, we particularly focus on the establishment of trust through measuring interest similarities [27]:
• Trust Mirroring implies that actors with similar profiles (interests, skills, community membership) tend to trust each other more than completely unknown actors.
• Trust Teleportation rests on the similarity of human or service capabilities, and describes that trust in a member of a certain community can be teleported to other members. For instance, if an actor belonging to a certain expert group is trusted because of his distinguished knowledge, other members of the same group may benefit from this trust relation as well.
6.3.1 Interest Profile Creation

In contrast to common top-down approaches that apply taxonomies and ontologies to define certain skills and expertise areas, we follow a mining approach that addresses the inherent dynamics of flexible collaboration environments. In particular, skills and expertise as well as interests change over time, but are rarely updated if they are managed manually in a registry. Hence, we determine and update them automatically through mining. The creation of interest profiles without explicit user input has been studied in [27]. As discussed before, interactions, i.e., delegation requests, are tagged with
keywords. As delegation receivers process tasks, our system is able to learn how well people cope with certain tagged tasks, and is therefore able to determine their centers of interest. We use task keywords to create dynamically adapting interest profiles based on tags, and manage them in a vector space model. The utilized concepts are well known from the area of information retrieval (see for instance [24]). However, while they are commonly used to determine the similarities of given documents, we create these documents (which reflect user profiles) dynamically on the fly from the used tags. The profile vector p_u of actor u in (6.1) describes the frequencies f with which the tags T = {t_1, t_2, t_3, ...} are used in delegated tasks accepted by actor u.

    p_u = ⟨f(t_1), f(t_2), f(t_3), ...⟩    (6.1)

The tag frequency matrix T in (6.2), built from profile vectors, describes the frequencies of the tags T = {t_1, t_2, t_3, ...} used by all actors A = {u, v, w, ...}.

    T = ⟨p_u, p_v, p_w, ...⟩_{|T| × |A|}    (6.2)

The popular tf·idf model [24] introduces tag weighting based on the relative distinctiveness of tags; see (6.3). Each entry in T is weighted by the log of the total number of actors |A|, divided by the number n_t = |{u ∈ A | tf(t, u) > 0}| of actors who used tag t.

    tf·idf(t, u) = tf(t, u) · log(|A| / n_t)    (6.3)

Finally, the cosine similarity, a popular measure to determine the similarity of two vectors in a vector space model, is applied to determine the similarity of two actor profiles p_u and p_v; see (6.4).

    sim_profile(p_u, p_v) = cos(p_u, p_v) = (p_u · p_v) / (||p_u|| ||p_v||)    (6.4)
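The profile-similarity pipeline of this section (tag frequencies, tf·idf weighting, cosine similarity) can be sketched in a few lines; the tag data below is made up purely for illustration:

```python
# Sketch of Sect. 6.3.1: raw tag frequencies per actor -> tf-idf
# weighting (Eq. 6.3) -> cosine similarity (Eq. 6.4).
# Actor names and tag counts are illustrative assumptions.
import math

profiles = {                      # raw tag frequencies f(t) per actor
    "u": {"soa": 4, "bpel": 2},
    "v": {"soa": 3, "bpel": 1, "java": 2},
    "w": {"java": 5},
}
actors = list(profiles)
tags = sorted({t for p in profiles.values() for t in p})

def tfidf_vector(actor):
    vec = []
    for t in tags:
        # n_t: number of actors that used tag t at least once
        n_t = sum(1 for a in actors if profiles[a].get(t, 0) > 0)
        tf = profiles[actor].get(t, 0)
        vec.append(tf * math.log(len(actors) / n_t))
    return vec

def cosine(p, q):
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0

sim_uv = cosine(tfidf_vector("u"), tfidf_vector("v"))  # overlapping interests
sim_uw = cosine(tfidf_vector("u"), tfidf_vector("w"))  # disjoint interests -> 0.0
```

With these toy profiles, u and v share the soa and bpel tags and obtain a high similarity, whereas u and w share no tags and obtain similarity 0.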
6.3.2 The Interplay of Interest Similarity and Trust

In our model, a trust relation τ(u, v) mainly relies on the interest and expertise similarities of actors. We apply various concepts to facilitate the emergence of trust among network members.
6.3.2.1 Trust Mirroring

Trust mirroring τ_mir (Fig. 6.2a) is typically applied in environments where actors have the same roles (e.g., online social platforms). Depending on the environment, interest and competency similarities of people can be interpreted directly as an indicator for
Fig. 6.2 Concepts for the establishment of trust through interest similarities: (a) Trust Mirroring, (b) Trust Teleportation
future trust (6.5). There is strong evidence that similar-minded actors tend to trust each other more than random actors [21, 33]; e.g., movie recommendations of people with the same interests are usually more trustworthy than the opinions of unknown persons. Mirrored trust relations are directed iff sim_profile(p_u, p_v) ≠ sim_profile(p_v, p_u). For instance, an experienced actor v might have at least the same competencies as a novice u. Therefore, v covers most of the competencies of u and τ_mir(u, v) is high, while this is not true for τ_mir(v, u).

    τ_mir(u, v) = sim_profile(p_u, p_v)    (6.5)
6.3.2.2 Trust Teleportation

Trust teleportation τ_tele is applied as depicted in Fig. 6.2b. We assume that u has established a trust relationship to w in the past, for example, based on w's capabilities to assist u in work activities. Therefore, others having interests and capabilities similar to w may become similarly trusted by u in the future. In contrast to mirroring, trust teleportation may also be applied in environments comprising actors with different roles. For example, a manager might trust a software developer belonging to a certain group. Other members of the same group may benefit from the existing trust relationship by being recommended as trustworthy as well. We attempt to predict the amount of future trust from u to v by comparing w's and v's profiles.

    τ_tele(u, v) = [ Σ_{w ∈ M'} τ(u, w) · (sim_profile(p_w, p_v))² ] / [ Σ_{w ∈ M'} sim_profile(p_w, p_v) ]    (6.6)

Equation (6.6) deals with a generalized case where several trust relations from u to members of a group M' are teleported to a still untrusted actor v. Teleported relations are weighted and attenuated by the similarity measurement results of the actor profiles.
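Equation (6.6) transcribes directly into code; the trust and similarity values below are made-up illustrations, not measured data:

```python
# Sketch of trust teleportation (Eq. 6.6): existing trust from u to
# members of a group M' is propagated to an untrusted actor v, weighted
# by the squared profile similarity. All values are invented examples.

def tau_tele(trust_u, similarity_to_v):
    """trust_u[w] = tau(u, w) for w in M'; similarity_to_v[w] = sim(p_w, p_v)."""
    num = sum(trust_u[w] * similarity_to_v[w] ** 2 for w in trust_u)
    den = sum(similarity_to_v[w] for w in trust_u)
    return num / den if den else 0.0

trust_u = {"w1": 0.9, "w2": 0.5}          # u's existing trust in group members
similarity_to_v = {"w1": 0.8, "w2": 0.2}  # profile similarity of each w_i to v
t = tau_tele(trust_u, similarity_to_v)
```

Because v's profile resembles the highly trusted w1 much more than w2, the teleported trust lands close to τ(u, w1), which is exactly the attenuation effect the squared similarity weighting is meant to produce.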
6.4 Design and Architecture

This section provides an overview of the components and services that allow simulations and tests of adaptation scenarios in collaborative service-oriented systems. Our architecture (see Fig. 6.3) consists of two main building blocks: the testbed runtime
Fig. 6.3 Architecture for self-adaptation in service-oriented systems
environment based on the Genesis2 framework [17] and the VieCure adaptation and self-healing framework, partly adopted from our previous work [22]. The integration of both systems enables the realization of the control-feedback loop as illustrated in Fig. 6.1.
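The control-feedback loop realized by coupling the two frameworks follows the MAPE cycle introduced in Sect. 6.2. Purely as an illustration, a minimal monitor-analyze-plan-execute iteration can be sketched as follows; the data model, threshold, and redistribution policy are invented for this sketch and are not part of either framework:

```python
# Minimal MAPE (monitor, analyze, plan, execute) loop sketch.
# Node names, the overload threshold, and the policy are assumptions.

def monitor(system):
    """Collect raw observations: task-queue length per node."""
    return {node: len(tasks) for node, tasks in system.items()}

def analyze(observations, threshold=4):
    """Detect symptoms: nodes whose queue exceeds the threshold."""
    return [node for node, load in observations.items() if load > threshold]

def plan(symptoms):
    """Map each symptom to an adaptation action."""
    return [("redistribute", node) for node in symptoms]

def execute(system, actions, threshold=4):
    """Apply actions: move excess tasks to the currently least-loaded node."""
    for _, overloaded in actions:
        idle = min(system, key=lambda n: len(system[n]))
        while len(system[overloaded]) > threshold and idle != overloaded:
            system[idle].append(system[overloaded].pop())

def mape_iteration(system):
    execute(system, plan(analyze(monitor(system))))

system = {"u": list(range(7)), "v": [], "w": [1]}
mape_iteration(system)   # node u is relieved down to the threshold
```

One iteration shifts node u's excess tasks to the idle node v while conserving the total number of tasks, which is the basic shape of the monitoring-diagnosis-recovery cycle realized by the architecture.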
6.4.1 Genesis2 Testbed Generator Framework

The purpose of the Genesis2 framework (in short, G2) is to support software engineers in setting up testbeds for the runtime evaluation of SOA-based concepts and implementations, in particular also collaboration networks. It makes it possible to establish environments consisting of services, clients, registries, and other SOA components, to program the structure and behavior of the whole testbed, and to steer the execution of test cases on the fly. G2's most distinctive feature is its ability to generate real testbed instances (instead of just performing simulations), which allows engineers to integrate these testbeds into existing SOA environments and, based on these infrastructures, to perform realistic tests at runtime. As depicted in Fig. 6.3, the G2 framework comprises a centralized front-end, from where testbeds are modeled and controlled, and a distributed back-end at which the models are transformed into real testbed instances. The front-end maintains a virtual view of the testbed, allows engineers to manipulate it via Groovy [13] scripts, and propagates changes to the back-end in order to adapt the running testbed. To ensure extensibility, G2 follows a modular approach where a base runtime framework provides a functional grounding for composable plugins. These augment the testbed's functionality, making it possible to emulate diverse topologies, functional and non-functional properties, and behavior. Furthermore, each plugin registers itself at the shared runtime in order to offer its functionality via the framework's script API. The sample script in Listing 6.1 demonstrates the specification of a Web service which queries a registry plugin, applies a delegation strategy, and forwards the
li = callinterceptor.create()                      // logging interceptor
li.hooks = [in: "RECEIVE", out: "PRE_STREAM"]      // bind to phases
li.code = { ctx -> logger.logToDB(ctx.soapMsg) }   // process msg
msgType = datatype.create("file.xsd", "typeName")  // xsd import
sList = webservice.build {                         // create web service
    Proxy(binding: "doc,lit", namespace: "http://...") {
        // attach logging interceptor
        interceptors += li
        // create web service operation
        Delegate(input: msgType, response: msgType) {
            refs = registry.get { s -> "Worker" in s.tags }  // by tag
            r = dStrat(refs)
            return r.Process(input).response
        }
        // delegation strategy as closure variable
        dStrat = { refs -> return refs[0] }  // default: take first
    }
}
srv = sList[0]                    // only one service declared, take it
h = host.create("somehost:8181")  // import backend host
srv.deployAt(h)                   // deploy service at remote backend host
srv.dStrat = { refs -> /*...*/ }  // adapt strategy at runtime
Listing 6.1 Groovy script specifying delegator service
request message to a worker service. First, a call interceptor is created and customized with a Groovy closure which passes the SOAP message to the logger plugin. Then, a data type definition is imported from an XML Schema file to be later applied as the message type of the subsequently defined web service Proxy. The proxy service first attaches the created call interceptor to itself and then defines an operation which delegates the request. This procedure is split into querying the registry for tagged Web services, applying the delegation strategy (dStrat) to determine the destination, and invoking the Process operation on it. For later adaptations, the delegation behavior itself is not hardcoded into the operation but outsourced as a service variable containing the delegation code. This makes it possible to update the deployed service's behavior at runtime by replacing the variable. Finally, a back-end host is referenced and the proxy service is deployed on it. Due to space constraints, this demo script covers only a heavily restricted specification of the testbed and also lacks the definition of other participants, such as worker services and clients for bootstrapping the testbed's activity. In our evaluation, we have applied G2 in order to have a customizable Web service testbed for verifying the quality of our concepts in realistic scenarios, e.g., for a detailed analysis of performance and scalability. For a more detailed description of the G2 framework and its capabilities we refer readers to [17].
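The pattern of outsourcing dStrat into a replaceable variable is not Groovy-specific. A minimal sketch in Python (class and names invented for illustration) shows the same runtime-swappable strategy:

```python
# Sketch of the "strategy held in a replaceable variable" pattern used
# for dStrat in Listing 6.1: the delegation behavior is a plain attribute,
# so it can be swapped at runtime without redeploying the service.
# ProxyService and all names here are hypothetical.
class ProxyService:
    def __init__(self):
        # default strategy: take the first registered worker
        self.d_strat = lambda refs: refs[0]

    def delegate(self, refs, task):
        target = self.d_strat(refs)   # current strategy picks the receiver
        return (target, task)

proxy = ProxyService()
workers = ["worker-a", "worker-b", "worker-c"]
first = proxy.delegate(workers, "task-1")    # ("worker-a", "task-1")
proxy.d_strat = lambda refs: refs[-1]        # adapt strategy at runtime
second = proxy.delegate(workers, "task-2")   # ("worker-c", "task-2")
```

Because callers only ever invoke the attribute, replacing it takes effect on the very next delegation, which is what enables the adaptation framework to rewire running services.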
6.4.2 Adaptation Framework

The adaptation framework is located on the right side of Fig. 6.3. The framework has monitoring features including logging, eventing, and a component for capturing actor behavior. Based on observations obtained from the testbed, adaptation actions are taken.
• The Logging Service is used by the logger plugin (see PLogger in Fig. 6.3). Logged messages are persistently saved in a database for analysis. The logging service also implements a publish/subscribe mechanism to offer distributed event notification capabilities. Subscribers can specify filters using XPath statements which are evaluated against received logged messages. A short example is shown in Listing 6.2. Its header extensions (Lines 7–22) include the context of interactions (i.e., the activity that is performed) and delegation restrictions, identify the sender and receivers using WS-Addressing [31] mechanisms, and hold some meta-information about the activity type itself. MessageIDs enable message correlation to correctly match requests and responses. Timestamps capture the actual creation of the message and are used for message ordering. For HPSs, SOAP messages are mapped to user interfaces by the HPS framework [25]. Task-context-related information is also transported via header mechanisms. While activities depict what kind of information is exchanged between actors (type system) and how collaborations are structured, tasks control the status of interactions and the constraints on processing certain activities. In the example, the activity is tagged with WS, Adaptation, and Trust.
Listing 6.2 Simplified SOAP interaction example
Multiple instances of the logging service can be deployed to achieve scalability in large-scale environments.
• Event Subscribers receive events based on filters that can be specified for different types of (inter-)actions, for example, to capture only delegation flows. Subscribers are used to capture the runtime state of nodes within the testbed environment, such as the current load of a node.
• The Behavior Monitor periodically updates and stores the actual interaction behavior of nodes as profiles in the behavior registry. This mechanism assists the subsequent diagnosis in correlating environment events and behavior changes.
• Diagnosis and Analysis algorithms are initiated to evaluate the root cause of undesirable system states. Pre-configured triggers for such events, e.g., events reporting violations, inform the diagnosis module about deviations from desired behavior. Captured and filtered interaction logs as well as actual node behaviors assist in recognizing the system's health state.
• The Similarity Service uses the tag database to search for actors based on profile keywords (i.e., to replace an actor or to establish a new link to actors). Tags are obtained from logged interactions.
• The Adaptation Module deploys appropriate adaptation actions. An example of an adaptation action is updating a node's delegation strategy, as indicated in Fig. 6.3. For that purpose, the PAction plugin communicates with G2's control interface.
A set of Web-based Admin Tools have been implemented to offer graphical user interfaces for configuring and visualizing the properties of testbeds. User tools include, for example, policy design for adaptations or visualizations of monitored interactions.
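The XPath-based subscription filters of the Logging Service can be illustrated in a few lines; the message layout and filter expressions below are invented for the sketch (namespaces are omitted for brevity), and the snippet uses Python's standard-library ElementTree rather than the framework's actual filter engine:

```python
# Sketch of the publish/subscribe filtering idea: a subscriber registers
# an XPath expression that is evaluated against each logged SOAP message.
# Message shape and filter paths are illustrative assumptions.
import xml.etree.ElementTree as ET

soap_msg = """
<Envelope>
  <Header>
    <activity type="delegation"><tags>WS, Adaptation, Trust</tags></activity>
  </Header>
  <Body><Delegate/></Body>
</Envelope>"""

def matches(message, xpath):
    """Deliver the event only if the filter selects at least one node."""
    return ET.fromstring(message).find(xpath) is not None

delegation_only = ".//activity[@type='delegation']"
m = matches(soap_msg, delegation_only)   # this message passes the filter
```

A subscriber interested only in delegation flows would register a filter like `delegation_only`; messages for other activity types simply never reach it.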
6.5 Behavior Monitoring and Self-Adaptation

The design of the architecture presented in the previous section provides a variety of possibilities for self-adaptation strategies. Figure 6.3 shows that the adaptation framework is loosely coupled to the testbed. Furthermore, logging interactions is a very generic approach to monitoring the environment. The focus of this work is the adaptation of service misbehavior. Misbehavior is any unexpected change in the behavior of a testbed component with noticeable function degradation impacts on the whole testbed or major parts of it. Our monitoring and adaptation strategies follow the principle of smooth integration with the least interference. However, a loosely coupled design often results in delayed and unclear state information. This can delay the deployment and application of adaptations. On the other hand, the testbed remains more authentic and true to current real environments, which lack direct monitoring and adaptation functionality. Monitoring in this architecture relies on the accuracy and timeliness of the Logging Service. Diagnosis and Analysis get all required status updates with the help of
the Event Subscriber mechanism. Filtered status information populates the network model held by the Diagnosis and Analysis module. During start-up, the first interaction information is used to build the initial structure of the model. During runtime, this information synchronizes the model with the actual status changes observed in the network. Especially the interaction data filtered by the Behavior Monitor module allows Diagnosis and Analysis to draw conclusions from interactions about possible misbehavior at the services. Detectable misbehavior patterns are described in the Policy Store together with related recovery strategies. The components of the store include trigger, diagnosis, and recovery action modules (cf. [22]). Whilst the trigger defines potential misbehavior in a rule, the fired diagnosis analyzes the detected incident using its network model. The model information, in combination with current interaction facts from the log history, is used to estimate the necessary recovery actions. Finally, recovery strategies are estimated and deployed to adapt the real network. Referring, for example, to the misbehavior patterns presented in Sect. 6.2, a sink-behavior trigger could be expressed, according to the description given previously, by a threshold value defining the admissible amount of tasks at a monitored node. A fired diagnosis would further inspect the delegation history of a suspected node by consulting its task delegation log data, an integral part of its network model. If a sink behavior is identified, the diagnosis plans recovery actions. Actions are situation dependent and there are possibly multiple options for recovery. In this chapter the recovery approach is to reconfigure the network by adapting the interaction channels between the service nodes. Channels are opened to provide new interactions to alternative nodes, and closed to hinder misbehaving nodes from further affecting the surrounding nodes and degrading the environment's function.
The challenge is not only to detect misbehaving nodes but also to find alternative interaction channels for them. A feasible adaptation must temporarily decouple misbehaving nodes from the network and instantly find possible candidates for substitution. Potential candidates must expose properties similar to the misbehaving node, e.g., have similar capabilities, and, additionally, have the least tendency toward misbehavior, e.g., the lowest current task load. In a real mixed system environment, nodes' capabilities change, and the initially registered profiles diverge over time from the current ones. Therefore our framework includes a Similarity Service that keeps track of profile changes and provides alternatives to nodes according to their current snapshot profiles. In the following we show how the misbehavior patterns introduced in the scenario of Sect. 6.2 can be detected and adapted with the tools of our adaptation framework. Sink behavior is observed when a node persists in accepting tasks from other nodes but prefers to work on tasks of certain neighbors, or under-performs in task processing. This behavior is recognizable by a dense delegation of tasks to the sink, possibly requiring different capabilities, and a low number of task completion notifications in the observed time span. Using the notion of Groovy scripts introduced in Sect. 6.4, Listing 6.3 shows the procedure used to detect and adapt nodes with sink behavior in the testbed framework.
6 Runtime Behavior Monitoring and Self-Adaptation in Service-Oriented Systems
// in the monitoring loop
def sinkNode = env.triggerSink(4)                  // sink trigger with threshold 4 tasks
if (sinkNode) {                                    // sink suspected
  if (env.analyzeTaskQueueBehavior(sinkNode)) {    // analyze task history
    def simNodes = sim.getSimilar(sinkNode)        // call similarity service
    altNodes = []
    simNodes.each { s ->
      if (env.loadTolerable(s)) altNodes += s      // find nodes with tolerable load
    }
    def neighborNodes = env.getNeighbors(sinkNode) // affected neighbors
    neighborNodes.each { n ->
      n.dStrat = { refs ->                         // overwrite dStrat from Listing 1
        refs += altNodes                           // add alternative channels
        refs -= sinkNode                           // remove channel to sink
        ...                                        // selection strategy
      }
    }
  }
}
Listing 6.3 Code example for sink adaptation
The script extract first defines the task queue trigger triggerSink with its threshold. If the limit of four tasks is exceeded by a node, the analysis analyzeTaskQueueBehavior scans the affiliated task history and compares the latest delegation and task status reporting patterns of the node. If a sink is detected, the Similarity Service sim is called and returns a set simNodes of possible candidates for replacement. In the next loop the candidates' current task queue sizes are examined (loadTolerable); only those with few tasks are added to the final list of alternative nodes altNodes. In the last step the delegation strategies of the sink node's neighbors are updated: the alternatives are added to the possible delegation candidates and the sinkNode is avoided.

Moderate use of queue capacity combined with high, even excessive, delegation rates despite available alternatives causes overload at single nodes; this identifies the factory behavior. Again, interaction data uncovers the misbehavior, expressed by a high outflow of tasks from the factory and a low task completion rate in the monitored interval. The Groovy script in Listing 6.4 presents our factory adaptation algorithm for the testbed framework. The factory trigger triggerFactory fires the diagnosis on task queue sizes below two tasks. If analyzeDelegationBehavior confirms a pattern with high delegation frequency, a factory node is detected. As with a sink, a selection of alternative nodes for replacing the factory node is collected, and from this list only those with minor load are considered further. Then the affected delegating neighbors (getDelegators) are detached from the factory and provided with the alternative nodes. Finally, the delegation strategy of the delegating neighbors is adapted. In contrast to the sink case, in the last step all of the factory's delegation channels are closed temporarily.
// in the monitoring loop
def factoryNode = env.triggerFactory(2)              // factory trigger with threshold 2 tasks
if (factoryNode) {                                   // factory suspected
  if (env.analyzeDelegationBehavior(factoryNode)) {  // analyze task history
    def simNodes = sim.getSimilar(factoryNode)       // call similarity service
    altNodes = []
    simNodes.each { s ->
      if (env.loadTolerable(s)) altNodes += s        // find nodes with tolerable load
    }
    def neighborNodes = env.getDelegator(factoryNode) // affected delegators
    neighborNodes.each { n ->
      n.dStrat = { refs ->                           // overwrite dStrat from Listing 1
        refs += altNodes                             // add alternative channels
        refs -= factoryNode                          // remove channel to factory
        ...                                          // selection strategy
      }
    }
    factoryNode.dStrat = {}                          // no delegations allowed
  }
}
Listing 6.4 Code example for factory adaptation
6.6 Experiments

In our experiments we evaluate the efficiency of similarity-based adaptation in a virtual team drawn from a crowd of task-based services. This team comprises a few hundred collaborators. The assumption is that some of the HPSs begin to misbehave as time progresses. Misbehavior is caused by team members that, for various reasons such as task assignment overload, change of interest, or preference for particular tasks, start to process assigned tasks irregularly. Our strategy is to detect misbehavior by analyzing the task processing performance of the team: a degrading task processing rate indicates misbehavior. The main idea is to detect these degradations, identify the misbehaving team members through a task history analysis, and, in time, provide a fitting replacement for the misbehaving member. This member match is provided by our Similarity Service, which mines the capabilities of the members and the changes observed in them. The main information source of our misbehavior analysis and detection is the data contained in the delegated tasks.
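The degradation signal described above, a falling task processing rate, can be sketched as follows. This is not the framework's actual detection code; the window size and the drop factor are assumptions chosen for illustration.

```python
def completion_rates(completed_per_interval, window=3):
    """Moving average of tasks completed per monitoring interval."""
    rates = []
    for i in range(len(completed_per_interval) - window + 1):
        chunk = completed_per_interval[i:i + window]
        rates.append(sum(chunk) / window)
    return rates

def degradation_detected(completed_per_interval, window=3, drop=0.5):
    """Flag misbehavior when the latest moving rate falls below
    `drop` times the best rate seen so far (illustrative rule)."""
    rates = completion_rates(completed_per_interval, window)
    return len(rates) >= 2 and rates[-1] < drop * max(rates)

healthy = [10, 11, 9, 10, 11, 10]    # stable throughput
degrading = [10, 11, 9, 5, 2, 1]     # throughput collapses over time
print(degradation_detected(healthy))    # → False
print(degradation_detected(degrading))  # → True
```

A smoothed rate rather than a single interval count is used here so that one slow slot does not immediately trigger an adaptation; the diagnosis step then attributes the drop to concrete members via the task history.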
6.6.1 Scenario Overview

Following the concept of crowdsourcing, we modeled a scenario showcasing the interaction dynamics of a specific sector composed of several teams. Interested parties wish to outsource multiple tasks to a crowd. In order to get their tasks completed they refer to an entry point service that forwards tasks to multiple teams
of the crowd. A team comprises two types of members. The first, the delegators, receive new tasks directly from the entry point. Instead of working on the tasks, their concern is to redistribute them to their neighbors; these neighbors are also called workers. A delegator picks its most capable and trusted workers able to process the assigned task. Each team is specialized in a particular type of task, and tasks carry keyword information so that it can be decided which team receives a particular task. A task's life-cycle starts at the entry point, which constantly provides the team with new tasks. It acts as a proxy between the team and the actual task owner, and its main assignment is to decide which of the team members is suitable for processing. The question is how to find the appropriate worker for a task. All services, including their capabilities, are registered in the registry at startup. However, the registry's information remains static and becomes outdated over the course of time. Members' processing behaviors can change once tasks start to be delegated and processing loads vary. Thus, the entry point can refer to the environment's registry for candidates at the beginning, shortly after bootstrapping, but once profiles start to change the lookup information becomes inaccurate. The solution is the Similarity Service, which is aware of these changes. It tracks the shift of interests by monitoring the delegation behavior between interacting neighbors and therefore provides the most accurate candidates for a delegation during runtime. Conversely, the Similarity Service cannot provide satisfying results from the beginning because of the lack of interaction data. Once the appropriate candidate is selected by the entry point, it delegates the task. Teams in our scenario are composed of a sub-community of HPSs that know and trust each other and, hence, keep references to each other in a neighbor-list.
Delegations in the team are only issued between these trusted neighbors. Tasks are associated with a deadline defining the task's latest demanded completion time, and with a processing effort. Each worker has its individual task processing speed, depending on its knowledge compared to the task's requirements and on its current work load. At the end of a task's life-cycle, a worker reports the task as complete or, if the deadline is missed, as expired. The main focus of the misbehavior regulation is to prevent tasks from expiring. Our algorithm identifies failing services by observing the task throughput: it filters tasks that missed their deadline in a certain period. Such misbehavior is then adapted with the help of the knowledge of the Similarity Service and the task history. First, the members most similar to the misbehaving one are selected; then, with a task queue size analysis, the least loaded is chosen for an adaptation. Depending on the current trust-based adaptation strategy, channels between working nodes are added, or delegations are shifted to competent but less busy workers.
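The runtime profile matching performed by the Similarity Service can be approximated with a small sketch. The keyword-count profiles built from delegated tasks and the cosine measure below are our illustrative assumptions, not the service's published metric.

```python
import math

def profile(delegated_tasks):
    """Build a snapshot profile from the keywords of recently delegated tasks."""
    counts = {}
    for task_keywords in delegated_tasks:
        for kw in task_keywords:
            counts[kw] = counts.get(kw, 0) + 1
    return counts

def cosine(p, q):
    """Cosine similarity between two keyword-count profiles."""
    dot = sum(p[k] * q.get(k, 0) for k in p)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def most_similar(node, profiles):
    """Rank all other nodes by snapshot similarity to `node`."""
    return sorted((n for n in profiles if n != node),
                  key=lambda n: cosine(profiles[node], profiles[n]),
                  reverse=True)

profiles = {
    "w1": profile([["java", "soa"], ["java"]]),
    "w2": profile([["java", "soa"]]),
    "w3": profile([["graphics"]]),
}
print(most_similar("w1", profiles))  # → ['w2', 'w3']
```

Because the profiles are rebuilt from the ongoing delegation stream rather than from the static registry, the ranking follows members' actual current interests, which is precisely why such a service outperforms the registry once profiles drift.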
6.6.2 Experiment Setup

In order to simulate the described medium-size teams of the aforementioned crowdsourcing model, we set up the following environment. The teams comprise a total of 200 collaborators, represented by Web services created by G2 scripts deployed
to one backend instance. 20% of these members expose a delegation behavior; the rest work on assigned tasks. All services are equipped with a task queue. As in the real world, the services are not synchronized and have their individual working slots. Usually a worker processes one entire task per slot. A worker starts to misbehave once its task queue is filled past the threshold of 6 tasks; it then reduces its working speed to one third. A total of 600 tasks are assigned to the environment. We do not adapt from the start: there is an initial period of 200 tasks with no adaptation. Then, in an adaptation cycle, the workers' task queue sizes are monitored by tracing the delegation flow among the nodes. The difference between acknowledged assignments and tasks reported complete or expired yields the current task queue size at a particular worker. Once this number exceeds the preset task queue threshold, which we vary across the runs of our experiments, the Similarity Service is invoked for a list of workers with similar capabilities. In a loop over this list, sorted by best match, the candidate with the currently smallest task queue size is picked. The applied adaptation action depends on the experiment's current adaptation strategy. In trust mirroring, a channel between two similar workers is opened, which additionally allows the overloaded node to delegate one task per slot over the new channel. In trust teleportation, the overloaded worker is detached from its most delegating neighbor, and a new channel is opened from that delegator to a substitute worker. Figure 6.4 shows the temporal evolution of the dynamic interactions under the different adaptation actions. Its three sub-figures show the changes in interactions for a threshold of 6 tasks. A node's size represents the total number of incoming delegations; larger edges indicate a high number of delegations across a channel, with the arrow pointing in the delegation direction.
Therefore, the node in the middle is easily identified as the entry point; it alone provides tasks to all the connected delegators. Figure 6.4a shows that these delegators prefer selected workers to complete their tasks: six extremely overloaded workers are present after the first 200 tasks have left the entry point, and only a few others are called sporadically. Figure 6.4b represents the state at the end of the experiment for the mirroring strategy. The effects of this strategy are clearly visible: the load is better distributed between the workers. A few, though more evenly loaded, worker nodes remain compared to taking no action, because the delegators still prefer to assign tasks to their most trusted workers. However, a larger number of new workers is added at the outer leaves of the tree, relieving these nodes of their task load. Figure 6.4c highlights the situation with the trust teleportation strategy. The side-effects here show that the number of loaded nodes remains almost the same; however, the load peak at the preferred workers is kept below the predefined threshold. Once the threshold is exceeded, the worker is detached from its delegator and a replacement is found. With this strategy, workers are loaded up to their boundary and are then replaced with new workers. In our experiments we tested the effectiveness of adaptations with different task queue threshold triggers. The effectiveness is measured by the total task processing performance at the end of the experiment; only completely processed and reported tasks count toward the final result.
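The queue estimation rule and the two adaptation actions described above can be sketched as operations on a delegation graph. The queue estimate follows the acknowledged-minus-reported rule from the setup; the function names and the log encoding are our own assumptions.

```python
def queue_size(node, log):
    """Current queue estimate: acknowledged assignments minus tasks
    reported complete or expired (as in the experiment setup)."""
    acked = sum(1 for e in log if e == ("ack", node))
    done = sum(1 for e in log if e[0] in ("complete", "expired") and e[1] == node)
    return acked - done

def mirror(channels, overloaded, substitute):
    """Trust mirroring: open an extra channel so the overloaded worker
    can offload one task per slot to a similar substitute."""
    channels.add((overloaded, substitute))

def teleport(channels, delegator, overloaded, substitute):
    """Trust teleportation: detach the overloaded worker from its most
    delegating neighbor and rewire that delegator to the substitute."""
    channels.discard((delegator, overloaded))
    channels.add((delegator, substitute))

log = [("ack", "w1")] * 8 + [("complete", "w1")]
channels = {("d1", "w1")}
if queue_size("w1", log) > 6:        # the worker threshold from the setup
    teleport(channels, "d1", "w1", "w2")
print(channels)  # → {('d1', 'w2')}

mirrored = {("d1", "w1")}
mirror(mirrored, "w1", "w2")         # w1 keeps its delegator but gains an outlet
```

The sketch makes the difference between the strategies explicit: mirroring only adds an edge and leaves the original delegation in place, while teleportation removes the overloaded worker's incoming edge and redirects it.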
Fig. 6.4 Evolving interaction networks based on adaptation actions: (a) No adaptation applied; (b) Adaptation through mirroring; (c) Adaptation through teleportation.
6.6.3 Result Description

Figure 6.5 presents the results of our simulation runs. Both diagrams plot the time-line in minutes on the x-axis and the number of tasks completed at the end of each period on the y-axis. In both cases there is a clearly noticeable increase in completed tasks until minute 4, when the first 200 tasks have been distributed to the workers. The task distribution is not linear over the measured period because bootstrapping delays in the G2 backend limit how many tasks can be distributed at the beginning. Minute 4 is also when the first adaptations are deployed. While the task completion rate decreases rapidly at this point if no adaptation actions are taken (the dashed line), the other lines represent the progress of task completion when the different threshold triggers, together with their reconfigurations, are applied. The diagrams in Fig. 6.6 again show the time-line on the x-axis and the number of adaptation actions applied at the end of each period on the y-axis.

[Fig. 6.5 Adaptations using different thresholds for mirroring and teleportation: (a) Mirroring; (b) Teleportation. Both panels plot the number of completed tasks over time in minutes for no adaptation and for thresholds 4, 6, 8, and 10.]

[Fig. 6.6 Number of adaptation actions applied using different strategies: (a) Actions applied in mirroring; (b) Actions applied in teleportation. Both panels plot the number of adaptation actions over time in minutes for thresholds 4, 6, 8, and 10.]

Figure 6.5a details the results of the trust mirroring strategy. Generally, all strategies perform better than taking no action. With a trigger threshold of 4 tasks and approximately 3 actions per minute, the curve oscillates between roughly 70 and 50 completed tasks per minute. The pattern is similar to the curve for a threshold of 8, although Fig. 6.6a shows that fewer adaptations are applied there and the changes of direction in Fig. 6.5a are slower. The smoothest adaptation results from a trigger matching the workers' real threshold of 6 tasks. Comparing the figures, a smaller growth in task completion success is noticeable after the deployment of 3 and then 4 adaptations between minutes 4 and 6. A threshold of 10 tasks decreases more slowly than the adaptation-free environment, but yields only about 20 more successfully processed tasks. With the same adaptation effort as threshold 8, this strategy exhibits overall inconvenient timing of its adaptations and can be considered impractical.

The situation is different for teleportation in Fig. 6.5b. As Fig. 6.6b shows, more adaptations are deployed with this strategy, though not without side-effects. The curve of adaptations triggered at threshold 4 increases rapidly after minute 5, when a total of 11 new channels to new workers are provided within one minute. The curve for threshold 6, although again the smoothest among the successful strategies, cannot reach the top performance of its neighbors (thresholds 4 and 8); instead, the 20 new channels set up between minutes 4 and 6 let the system performance progress evenly. Finally, the curve for threshold 10 shows a noticeable regression between minutes 3 and 4, caused by the dynamics of the system. Afterwards this strategy, with only 9 adaptations in total, is not able to recover and is even outperformed by the no-adaptation run.

The final results show that the precise timing of multiple adaptations within a short term benefits the environment most. However, this tends to produce highly fluctuating task processing results (e.g., a swing of approximately 40 tasks for threshold 8 in Fig. 6.5). Comparing both strategies, a trigger matching the environment actors' threshold of 6 is most practical in a balanced environment. Strategies with a threshold above 8 are infeasible for this setup.
Generally, the teleportation strategy performs better than mirroring; however, it requires twice as many adaptation actions, or more.
6.7 Related Work

Two main research directions on self-adaptive properties have emerged in the past years: one initiated by IBM and represented by the research on autonomic computing [16, 29], and the other manifested in the research on self-adaptive systems [23]. While autonomic computing includes research on all possible system layers and an alignment of self-* properties to all available system parts, self-adaptive systems research pursues a more global and general approach. The efforts in this area focus primarily on research above the middleware layer and consider self-* methodologies that adapt the system as a whole. These include higher layers such as models and system architecture [7] and the application layer; of particular interest for our research are large-scale agent-based systems [4], Web services, and their orchestration [1]. Self-adaptive ideas can also be found for middleware [5] and at lower layers, e.g., operating systems [26]. With current systems growing in size and requirements constantly changing, plenty of challenges remain, such as autonomic adaptations [19] and service behavior modeling [18]. The self-adaptive research demonstrated in this chapter strongly
relates to the challenges in Web services and workflow systems. Apart from the work cited above, substantial research on self-adaptive techniques in Web service environments has been conducted in the course of the European Web service technology research project WS-Diamond (Web Services DIAgnosability, MONitoring and Diagnosis). Its recent contributions focus in particular on QoS-related self-adaptive strategies and the adaptation of BPEL processes [14, 15]; others are theoretical discussions of self-adaptive methodologies [8]. Regarding runtime evaluation, several approaches have been developed that could be applied to testing adaptation mechanisms. SOABench [3] and PUPPET [2], for instance, support the creation of mock-up services in order to test workflows. However, these prototypes are restricted to emulating non-functional properties (QoS) and cannot be enhanced with programmable behavior. By using Genesis2 [17], which allows testbeds to be extended with plugins, we were able to implement a testbed flexible enough to test diverse adaptation mechanisms. Human-Provided Services [25] close the gap between software-based services and humans who wish to provide their skills and expertise as a service in a collaborative process. Instead of a strict predefined process flow [20], these systems are characterized by ad-hoc contribution requests and loosely structured collaborations. The required flexibility makes the system even more unpredictable, a property responsible for various faults. In our approach we monitor failures caused by the misbehavior of service nodes; the contributed self-adaptive method recovers by soundly restricting delegation paths or establishing new connections between the nodes. Over the last years, trust has been defined from several points of view [12]; however, no agreed definition exists to date. Unlike the area of network and computer security, we focus on the notion of dynamic trust from a social perspective [33].
Our notion of trust [27] is based on the interpretation of collaboration behavior [12, 27] and on dynamically adapting skill and interest similarities [11, 21]. In the introduced environment we make explicit use of the latter.
6.8 Conclusion and Outlook

The main objective of this work was to demonstrate the successful integration of two frameworks: on one side the G2 [17] SOA testbed, and on the other the extensible VieCure [22] adaptation framework. The two remain separate and independent frameworks and are only loosely coupled. As a first extension, in this chapter we added to the adaptation loop a module providing similarity ratings for the testbed services. The results of our evaluation confirm that the deployed task processing team scenario and the two adaptation strategies, trust mirroring and trust teleportation, interplay satisfactorily. Precise timing and a carefully aligned threshold for the actions are essential to reach high task completion rates. This observation underpins our attempt at implementing non-intrusive self-healing
recovery strategies that cannot always rely on accurate status information for a decision. In our future work we plan to deploy a complete crowdsourcing environment with miscellaneous teams on a distributed testbed. It will then also become essential to distribute and duplicate some of the components of the adaptation framework, e.g., the logging, diagnosis, and analysis modules. We plan a layered adaptation strategy that provides an interface for deploying local adaptations and allows global adaptations on a higher layer, involving utility-based changes for the whole crowd. New models of Mixed Systems misbehavior and extended rules for the detection and diagnosis of behavior will become necessary.
References

1. Baresi, L., Guinea, S.: Dynamo and self-healing BPEL compositions. In: ICSE, pp. 69–70 (2007)
2. Bertolino, A., Angelis, G.D., Frantzen, L., Polini, A.: Model-based generation of testbeds for web services. In: TestCom/FATES, Lecture Notes in Computer Science, vol. 5047, pp. 266–282. Springer (2008)
3. Bianculli, D., Binder, W., Drago, M.L.: Automated performance assessment for service-oriented middleware. Tech. Rep. 2009/07, Faculty of Informatics, University of Lugano (2009). URL http://www.inf.usi.ch/research publication.htm?id=55
4. Bigus, J.P., Schlosnagle, D.A., Pilgrim, J.R., Mills, I.W.N., Diao, Y.: ABLE: A toolkit for building multiagent autonomic systems. IBM Syst. J. 41(3), 350–371 (2002)
5. Blair, G.S., Coulson, G., Blair, L., Duran-Limon, H., Grace, P., Moreira, R., Parlavantzas, N.: Reflection, self-awareness and self-healing in OpenORB. In: WOSS, pp. 9–14 (2002)
6. Brabham, D.: Crowdsourcing as a model for problem solving: An introduction and cases. Convergence 14(1), 75 (2008)
7. Cheng, S.W., Garlan, D., Schmerl, B.: Architecture-based self-adaptation in the presence of multiple objectives. In: SEAMS, pp. 2–8 (2006)
8. Cordier, M., Pencolé, Y., Travé-Massuyès, L., Vidal, T.: Characterizing and checking self-healability. In: ECAI, pp. 789–790 (2008)
9. Dustdar, S.: Caramba—a process-aware collaboration system supporting ad hoc and collaborative processes in virtual teams. Distrib. Parallel Databases 15(1), 45–66 (2004)
10. Ganek, A.G., Corbi, T.A.: The dawning of the autonomic computing era. IBM Syst. J. 42(1), 5–18 (2003)
11. Golbeck, J.: Trust and nuanced profile similarity in online social networks. ACM Trans. on the Web 3(4), 1–33 (2009)
12. Grandison, T., Sloman, M.: A survey of trust in internet applications. IEEE Communications Surveys and Tutorials 3(4) (2000)
13. Groovy Programming Language: http://groovy.codehaus.org/
14. Halima, R., Drira, K., Jmaiel, M.: A QoS-oriented reconfigurable middleware for self-healing web services. In: ICWS, pp. 104–111 (2008)
15. Halima, R., Guennoun, K., Drira, K., Jmaiel, M.: Non-intrusive QoS monitoring and analysis for self-healing web services. In: ICADIWT, pp. 549–554 (2008)
16. IBM: An architectural blueprint for autonomic computing. IBM White Paper (2005)
17. Juszczyk, L., Dustdar, S.: Script-based generation of dynamic testbeds for SOA. In: ICWS. IEEE Computer Society (2010)
18. Kaschner, K., Wolf, K.: Set algebra for service behavior: Applications and constructions. In: BPM, pp. 193–210. Springer-Verlag, Berlin, Heidelberg (2009). DOI 10.1007/978-3-642-03848-8_14
19. Kephart, J.O.: Research challenges of autonomic computing. In: ICSE, pp. 15–22 (2005)
20. Leymann, F.: Workflow-based coordination and cooperation in a service world. In: CoopIS, DOA, GADA, and ODBASE, pp. 2–16 (2006)
21. Matsuo, Y., Yamamoto, H.: Community gravity: Measuring bidirectional effects by trust and rating on online social networks. In: WWW, pp. 751–760 (2009)
22. Psaier, H., Skopik, F., Schall, D., Dustdar, S.: Behavior monitoring in self-healing service-oriented systems. In: COMPSAC. IEEE (2010)
23. Salehie, M., Tahvildari, L.: Self-adaptive software: Landscape and research challenges. ACM Trans. Auton. Adapt. Syst. 4(2), 1–42 (2009)
24. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Proc. and Mgmt. 24(5), 513–523 (1988)
25. Schall, D., Truong, H.L., Dustdar, S.: Unifying human and software services in web-scale collaborations. IEEE Internet Computing 12(3), 62–68 (2008)
26. Shapiro, M.W.: Self-healing in modern operating systems. ACM Queue 2(9), 66–75 (2005)
27. Skopik, F., Schall, D., Dustdar, S.: Modeling and mining of dynamic trust in complex service-oriented systems. Information Systems 35(7), 735–757 (2010). DOI 10.1016/j.is.2010.03.001
28. Sterritt, R.: Autonomic computing. Innovations in Systems and Software Engineering 1(1), 79–88 (2005)
29. Sterritt, R.: Autonomic computing. ISSE 1(1), 79–88 (2005)
30. Vukovic, M.: Crowdsourcing for enterprises. In: Proceedings of the 2009 Congress on Services, pp. 686–692. IEEE Computer Society (2009)
31. WS-Addressing: http://www.w3.org/Submission/ws-addressing/
32. Zeng, L., Benatallah, B., Ngu, A.H., Dumas, M., Kalagnanam, J., Chang, H.: QoS-aware middleware for web services composition. IEEE Trans. on Softw. Eng. 30, 311–327 (2004). DOI 10.1109/TSE.2004.11
33. Ziegler, C.N., Golbeck, J.: Investigating interactions of trust and interest similarity. Dec. Sup. Syst. 43(2), 460–475 (2007)
S. Dustdar et al. (eds.), Socially Enhanced Services Computing, DOI 10.1007/978-3-7091-0813-0, © Springer-Verlag/Wien 2011