Information Modelling and Knowledge Bases XXII - Volume 225 Frontiers in Artificial Intelligence and Applications

INFORMATION MODELLING AND KNOWLEDGE BASES XXII Frontiers in Artificial Intelligence and Applications FAIA covers all a...

Author: A. Heimbrrger | Y. Kiyoki | T. Tokuda | H. Jaakkola | N. Yoshida

159 downloads 1003 Views 10MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Report copyright / DMCA form

DOWNLOAD PDF

INFORMATION MODELLING AND KNOWLEDGE BASES XXII

Frontiers in Artificial Intelligence and Applications FAIA covers all aspects of theoretical and applied artificial intelligence research in the form of monographs, doctoral dissertations, textbooks, handbooks and proceedings volumes. The FAIA series contains several sub-series, including “Information Modelling and Knowledge Bases” and “Knowledge-Based Intelligent Engineering Systems”. It also includes the biennial ECAI, the European Conference on Artificial Intelligence, proceedings volumes, and other ECCAI – the European Coordinating Committee on Artificial Intelligence – sponsored publications. An editorial panel of internationally well-known scholars is appointed to provide a high quality selection. Series Editors: J. Breuker, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong

Volume 225 Recently published in this series Vol. 224. J. Barzdins and M. Kirikova (Eds.), Databases and Information Systems VI – Selected Papers from the Ninth International Baltic Conference, DB&IS 2010 Vol. 223. R.G.F. Winkels (Ed.), Legal Knowledge and Information Systems – JURIX 2010: The Twenty-Third Annual Conference Vol. 222. T. Ågotnes (Ed.), STAIRS 2010 – Proceedings of the Fifth Starting AI Researchers’ Symposium Vol. 221. A.V. Samsonovich, K.R. Jóhannsdóttir, A. Chella and B. Goertzel (Eds.), Biologically Inspired Cognitive Architectures 2010 – Proceedings of the First Annual Meeting of the BICA Society Vol. 220. R. Alquézar, A. Moreno and J. Aguilar (Eds.), Artificial Intelligence Research and Development – Proceedings of the 13th International Conference of the Catalan Association for Artificial Intelligence Vol. 219. I. Skadiņa and A. Vasiļjevs (Eds.), Human Language Technologies – The Baltic Perspective – Proceedings of the Fourth Conference Baltic HLT 2010 Vol. 218. C. Soares and R. Ghani (Eds.), Data Mining for Business Applications Vol. 217. H. Fujita (Ed.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the 9th SoMeT_10 Vol. 216. P. Baroni, F. Cerutti, M. Giacomin and G.R. Simari (Eds.), Computational Models of Argument – Proceedings of COMMA 2010 Vol. 215. H. Coelho, R. Studer and M. Wooldridge (Eds.), ECAI 2010 – 19th European Conference on Artificial Intelligence

ISSN 0922-6389 (print) ISSN 1879-8314 (online)

Information Modelling and Knowledge Bases XXII

Edited by

Anneli Heimbürger University of Jyväskylä, Finland

Yasushi Kiyoki Keio University, Japan

Takehiro Tokuda Tokyo Institute of Technology, Japan

Hannu Jaakkola Tampere University of Technology, Finland

and

Naofumi Yoshida Komazawa University, Japan

Amsterdam • Berlin • Tokyo • Washington, DC

© 2011 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher. ISBN 978-1-60750-689-8 (print) ISBN 978-1-60750-690-4 (online) Library of Congress Control Number: 2010942038 Publisher IOS Press BV Nieuwe Hemweg 6B 1013 BG Amsterdam Netherlands fax: +31 20 687 0019 e-mail: [email protected] Distributor in the USA and Canada IOS Press, Inc. 4502 Rachael Manor Drive Fairfax, VA 22032 USA fax: +1 703 323 3668 e-mail: [email protected]

LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved.

v

Preface In recent decades information modeling and knowledge bases have become hot topics, not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The 20th European-Japanese Conference on Information Modeling and Knowledge Bases (EJC2010) continues the series of events that originally started as a co-operation initiative between Japan and Finland, back in the second half of the 1980’s. Later (1991) the geographical scope of these conferences expanded to cover the whole of Europe and other countries as well. The EJC conferences constitute a worldwide research forum for the exchange of scientific results and experiences achieved in computer science and other related disciplines using innovative methods and progressive approaches. In this way a platform has been established drawing together both researchers and practitioners who deal with information modelling and knowledge bases. The main topics of EJC conferences target the variety of themes in the domain of information modeling: conceptual analysis, the design and specification of information systems, multimedia information modelling, multimedia systems, ontology, software engineering, knowledge and process management, knowledge bases, cross-cultural communication and context modelling. We also aim at applying new progressive theories. To this end much attention is also paid to theoretical disciplines including cognitive science, artificial intelligence, logic, linguistics and analytical philosophy. In order to achieve the targets of the EJC, an international program committee selected 15 full papers and 10 short papers in a rigorous reviewing process from 34 submissions. The selected papers cover many areas of information modelling, namely the theory of concepts, database semantics, knowledge representation, software engineering, WWW information management, context-based information retrieval, ontological technology, image databases, temporal and spatial databases, document data management, process management, cultural modelling and many others. The conference could not be a success without a lot of effort on the part of many people and organizations. In the program committee, 29 reputable researchers devoted a lot of energy to the review process, selecting the best papers and creating the EJC2010 program, and we are very grateful to them. Professor Yasushi Kiyoki and Professor Takehiro Tokuda acted as co-chairs of the program committee while Senior Researcher, Dr. Anneli Heimbürger, and her team took care of the conference venue and local arrangements. Professor Hannu Jaakkola acted as the general organizing chair and Ms. Ulla Nevanranta as conference secretary for the general organizational matters necessary for running the annual conference series. Dr. Naofumi Yoshida and his Program Coordination Team managed the review process and the conference program. We also gratefully appreciate the efforts of all our supporters, especially the Department of Mathematical Information Technology at the University of Jyväskylä (Finland), for supporting this annual event and the 20th jubilee year of EJC.

vi

We believe that the conference was productive and fruitful in the advance of research and application of information modelling and knowledge bases. This book features papers edited as a result of the presentation and discussion at the conference. The Editors Anneli Heimbürger, University of Jyväskylä, Finland Yasushi Kiyoki, Keio University, Japan Takehiro Tokuda, Tokyo Institute of Technology, Japan Hannu Jaakkola, Tampere University of Technology (Pori), Finland Naofumi Yoshida, Komazawa University, Japan

vii

Conference Committee General Programme Chair Hannu Kangassalo, University of Tampere, Finland Co-Chairs Yasushi Kiyoki, Keio University, Japan Takehiro Tokuda, Tokyo Institute of Technology, Japan Members Maria Bielikova, Slovak University of Technology in Bratislava, Slovakia Boštjan Brumen, University of Maribor, Slovenia Pierre-Jean Charrel, University of Toulouse and IRIT, France Xing Chen, Kanagawa Institute of Technology, Japan Alfredo Cuzzocrea, ICAR Institute and University of Calabria, Italy Marie Duží, VSB-Technical University Ostrava, Czech Republic Jørgen Fischer Nilsson, Techinical University of Denmark, Denmark Hele-Mai Haav, Institute of Cybernetics at Tallinn University of Technology, Estonia Roland Hausser, Erlangen University, Germany Anneli Heimbürger, University of Jyväskylä, Finland Jaak Henno, Tallinn University of Technology, Estonia Yoshihide Hosokawa, Gunma University, Japan Hannu Jaakkola, Tampere University of Technology, Pori, Finland Ahto Kalja, Tallinn University of Technology, Estonia Eiji Kawaguchi, Kyushu Institute of Technology, Japan Mauri Leppänen, University of Jyväskylä, Finland Sebastian Link, Victoria University of Wellington, New Zealand Tommi Mikkonen, Tampere University of Technology, Finland Jari Palomäki, Tampere University of Technology, Pori, Finland Hideyasu Sasaki, Ritsumeikan University, Japan Tetsuya Suzuki, Shibaura Institute of Technology, Japan Bernhard Thalheim, Kiel University, Germany Peter Vojtáš, Charles University Pragu, Czech Republic Yoshimichi Watanabe, University of Yamanashi, Japan Naofumi Yoshida, Komazawa University, Japan Koji Zettsu, NICT, Japan General Organizing Chair Hannu Jaakkola, Tampere University of Technology, Pori, Finland

viii

Organizing Committee Anneli Heimbürger, University of Jyväskylä, Finland Xing Chen, Kanagawa Institute of Technology, Japan Ulla Nevanranta, Tampere University of Technology, Pori, Finland Program Coordination Team Naofumi Yoshida, Komazawa University, Japan Xing Chen, Kanagawa Institute of Technology, Japan Anneli Heimbürger, University of Jyväskylä, Finland Jari Palomäki, Tampere University of Technology, Pori, Finland Teppo Räisänen, University of Oulu, Finland Daniela Ďuráková, Technical University of Ostrava, Czech Republic Akio Takashima, Hokkaido University, Japan Tomoya Noro, Tokyo Institute of Technology, Japan Turkka Näppilä, University of Tampere, Finland Jukka Aaltonen, University of Lapland, Finland External Reviewers Thomas Proisl Besim Kabashi

ix

Contents Preface Anneli Heimbürger, Yasushi Kiyoki, Takehiro Tokuda, Hannu Jaakkola and Naofumi Yoshida

v

Ontology As a Logic of Intensions Marie Duží, Martina Číhalová and Marek Menšík

1

A Three-Layered Architecture for Event-Centric Interconnections Among Heterogeneous Data Repositories and Its Application to Space Weather Takafumi Nakanishi, Hidenori Homma, Kyoung-Sook Kim, Koji Zettsu, Yutaka Kidawara and Yasushi Kiyoki

21

Partial Updates in Complex-Value Databases Klaus-Dieter Schewe and Qing Wang

37

Inferencing in Database Semantics Roland Hausser

57

Modelling a Query Space Using Associations Mika Timonen, Paula Silvonen and Melissa Kasari

77

Architecture-Driven Modelling Methodologies Hannu Jaakkola and Bernhard Thalheim

97

An Emotion-Oriented Image Search System with Cluster Based Similarity Measurement Using Pillar-Kmeans Algorithm Ali Ridho Barakbah and Yasushi Kiyoki

117

The Quadrupel – A Model for Automating Intermediary Selection in Supply Chain Management Remy Flatt, Markus Kirchberg and Sebastian Link

137

A Simple Model of Negotiation for Cooperative Updates on Database Schema Components Stephen J. Hegner

154

A Description-Based Approach to Mashup of Web Applications, Web Services and Mobile Phone Applications Prach Chaisatien and Takehiro Tokuda

174

A Formal Presentation of the Process-Ontological Model Jari Palomäki and Harri Keto

194

Performance Forecasting for Performance Critical Huge Databases Bernhard Thalheim and Marina Tropmann

206

Specification of Games Jaak Henno

226

x

Bridging Topics for Story Generation Makoto Sato, Mina Akaishi and Koichi Hori A Combined Image-Query Creation Method for Expressing User’s Intentions with Shape and Color Features in Multiple Digital Images Yasuhiro Hayashi, Yasushi Kiyoki and Xing Chen Towards Context Modelling and Reasoning in a Ubiquitous Campus Ekaterina Gilman, Xiang Su and Jukka Riekki A Phenomena-of-Interest Approach for the Interconnection of Sensor Data and Spatiotemporal Web Contents Kyoung-Sook Kim, Takafumi Nakanishi, Hidenori Homma, Koji Zettsu, Yutaka Kidawara and Yasushi Kiyoki

247

258 278

288

Modelling Contexts in Cross-Cultural Communication Environments Anneli Heimbürger, Miika Nurminen, Teijo Venäläinen and Suna Kinnunen

301

Towards Semantic Modelling of Cultural Historical Data Ari Häyrinen

312

A Collaboration Model for Global Multicultural Software Development Taavi Ylikotila and Petri Linna

321

A Culture-Dependent Metadata Creation Method for Color-Based Impression Extraction with Cultural Color Spaces Totok Suhardijanto, Kiyoki Yasushi and Ali Ridho Barakbah

333

R-Web: A Role Accessibility Definition Based Web Application Generation Yusuke Nishimura, Kosuke Maebara, Tomoya Noro and Takehiro Tokuda

344

NULL ‘Value’ Algebras and Logics Bernhard Thalheim and Klaus-Dieter Schewe

354

Ontology Representation and Inference Based on State Controlled Coloured Petri Nets Ke Wang, James N.K. Liu and Wei-min Ma

368

The Discourse Tool: A Support Environment for Collaborative Modeling Efforts Denis Kozlov, Tore Hoel, Mirja Pulkkinen and Jan M. Pawlowski

378

On Context Modelling in Systems and Applications Development Anneli Heimbürger, Yasushi Kiyoki, Tommi Kärkkäinen, Ekaterina Gilman, Kyoung-Sook Kim and Naofumi Yoshida

396

Future Directions of Knowledge Systems Environments for Web 3.0 Koji Zettsu, Bernhard Thalheim, Yutaka Kidawara, Elina Karttunen and Hannu Jaakkola

413

Subject Index

447

Author Index

449

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-1

1

Ontology as a Logic of Intensions Marie DUŽÍa,1, Martina ÍHALOVÁ a, Marek MENŠÍKa,b a b

VSB-Technical University Ostrava, 17. listopadu 15, 708 33 Ostrava, Czech Republic Institute of Computer Science, FPF, Silesian University in Opava, Bezruovo nám. 13, 746 01 Opava, Czech Republic [email protected], [email protected], [email protected]

Abstract. We view the content of ontology via a logic of intensions. This is due to the fact that particular intensions like properties, roles, attributes and propositions can stand in mutual necessary relations which should be registered in the ontology of a given domain, unlike some contingent facts. The latter are a subject of updates and are stored in a knowledge-base state. Thus we examine (higher-order) properties of intensions like being necessarily reflexive, irreflexive, symmetric, anti-symmetric, transitive, etc., mutual relations between intensions like being incompatible, being a requisite, being complementary, and so like. We also define two kinds of entailment relation between propositions, viz. mere entailment and presupposition. Finally, we show that higher-order properties of propositions trigger necessary integrity constraints that should also be included in the ontology. As the logic of intensions we vote for Transparent Intensional Logic (TIL), because TIL framework is smoothly applicable to all three kinds of context, viz. extensional context of individuals, numbers and functions-in-extension (mappings), intensional context of properties, roles, attributes and propositions, and finally hyper-intensional context of procedures producing intensional and extensional entities as their products. Keywords. Ontology, intension, hyperintension, Transparent Intensional Logic, integrity constraint.

Introduction In informatics, the term ‘ontology’ has been borrowed from philosophy, where ontology is a systematic account of existence. In most general, what exists is that what can be represented. Thus in recent Artificial Intelligence and information systems a formal ontology is an explicit and systematic conceptualization of a domain of interest. Given a domain, ontological analysis should clarify the structure of knowledge on what exists in the domain. A formal ontology is, or should be, a stable heart of an information system that makes knowledge sharing, reuse and reasoning possible. As J. Sowa says in [14, p. 51], “logic itself has no vocabulary for describing the things that exist. Ontology fills that gap: it is the study of existence, of all the kinds of entities abstract and concrete that make up the world”. Current languages and tools applicable in the area of an ontology design focus in particular on the form of ontological representation rather than what a semantic content of ontology should be. Of course, a unified syntax is useful, but the problems of syntax 1

Corresponding Author.

2

M. Duží et al. / Ontology As a Logic of Intensions

are almost trivial compared to the problems of developing a common semantics for any domain. In this paper we focus on ontology content rather than a form. We concentrate on describing concepts necessary for the specification of relations between higher-order entities like properties, roles/offices, attributes and propositions, which are all modelled as PWS (possible-world semantics) intensions, i.e. functions with the set of possible worlds as their domain. To this end we apply the procedural semantics of Transparent Intensional Logic (TIL), which provides a universal framework applicable smoothly in all three kinds of context, namely extensional context of individuals, numbers and functions-in-extension, intensional context of PWS-intensions and finally hyperintensional context of concepts viewed as abstract procedures producing extensional as well as intensional entities as their products.2 The paper is organised as follows. Ontology content and languages for ontology specification are introduced in Section 1. Here we also provide a brief introduction to Transparent Intensional Logic, the tool we are going to apply throughout the paper. In Section 2 we introduce our logic of intensions, in particular the logic of requisites. Section 3 tackles the phenomenon of presupposition and compares it with mere entailment. Finally, concluding Section 4 outlines further research.

1. Ontology content and knowledge representation Knowledge representation is a multidisciplinary discipline that applies theories and tools of logic and ontology. It comprises both knowledge base and ontology design. Yet there is a substantial distinction between the former and the latter. Whereas the content of a knowledge base state consists in particular of contingent values of (empirical) attributes, the ontology content comprises in particular the taxonomy of entities that should not depend on contingent facts. Thus, for instance in Description Logic (DL) we distinguish between definitional and incidental part, the former containing concepts of attributes rather than their values. The main reason for building knowledge-based systems comprising ontologies can be characterized as making hidden knowledge explicit and logically tractable. To this end it is desirable to apply an expressive semantic framework in order that all the semantically salient features of knowledge specification can be adequately represented so that reasoning based on this representation is logically adequate and does not yield paradoxes. In general, current ontology languages are mostly based on the 1st-order predicate logic (FOL). Though FOL has become stenography of mathematics, it is not expressive enough when applied in other areas such as ontology specification. The obvious shortcoming of the FOL approach is this: in FOL we must treat higher-order intensions and hyper-intensions as elements of a flat universe, due to which knowledge representation is not comprehensible enough. Moreover, when representing knowledge in FOL, the well-known problem of the paradox of omniscience is almost inevitable. For applications where FOL is not adequate, it would be desirable to extend the framework to a higher-order logic (HOL). A general objection against using HOL logic is its computational intractability. However, HOL formulas are relatively well understood, and reasoning systems for HOLs do already exist, e.g., HOL [6] and Isabel [13].

2

Recent most up-to-date results and applications of TIL can be found in [5].

M. Duží et al. / Ontology As a Logic of Intensions

3

1.1. Standard ontological languages There are a number of languages which have been developed for knowledge representation. They provide tools for knowledge-base specification and deductive reasoning using the specified knowledge. Of these, perhaps the best known and broadly used logical calculi are F-logic and Description Logic (DL) in their various variants.3 The F-logic arose from the practice of frame systems. Thus it can be viewed as a hierarchy of classes of elements which are furnished with attributes, accompanied by inference rules. The DL-philosophy is different; it makes use of the notion of a logical theory defined as a set of special axioms built over the first-order predicate logic calculus. Particular classes and their mutual relations are defined by logical formulas. Thus in DL the class hierarchy typical for frame systems is not directly specified. Rather, it is dynamically derived using logical definitions (class descriptions). Though the existing ontology languages have been enriched by a few constructs exceeding the power of FOL, these additional constructs are usually not well defined and understood. Moreover, particular languages are neither syntactically nor semantically compatible. The W3C efforts at standardization resulted in accepting the Resource Description Framework (RDF) language as the Web ontological recommendation. However, this situation is far from satisfactory. Quoting from Horrocks and Schneider [8]: “The thesis of representation underlying RDF and RDFS is particularly troublesome in this regard, as it has several unusual aspects, both semantic and syntactic. A more-standard thesis of representation would result in the ability to reuse existing results and tools in the Semantic Web.” RDF includes three basic elements. Resources are anything with an URI address. Properties specify attributes and/or (binary) relations between resources and an object used to describe resources. Statements of the form ‘subject, predicate, object’ associate a resource and a specific value of its property. RDF has unusual aspects that make its use as the foundation of representation in the area of ontology building and Semantic Web difficult at best. In particular, RDF has a very limited collection of syntactic constructs, and these are treated in a very uniform manner in the semantics of RDF. The RDF syntax consists of the so-called triples – subject, predicate and object, where only binary predicates are allowed. This causes serious problems concerning compatibility with more expressive languages. The RDF thesis requires that no other syntactic constructs than the RDF triples are to be used and that the uniform semantic treatment of syntactic constructs cannot be changed only augmented. In RDFS we can specify classes and properties of individuals, constraints on properties, and the relation of subsumption (subclass, subproperty). It is not possible, for instance, to specify properties of properties, e.g., that the relation (property) is functional or transitive. Neither it is possible to define classes by means of properties of individuals that belong to the class. The RDF like languages originally did not have a model theoretic semantics, which led to many discrepancies. As stated above, RDF(S) is recommended by W3C, and its usage is world spread. The question is whether it is a good decision. A classical FOL approach would be better, or even its standard extension to HOL would be more suitable for ontologies. Formalisation in HOL is much more natural and comprehensive, the universe of discourse is not a flat set of ‘individuals’; rather, properties and relations can be naturally talked about as well, which is much more apt for representation of ontologies. 3

For details on Description Logic and F-logic see, for instance, [1] and [11], respectively.

4

M. Duží et al. / Ontology As a Logic of Intensions

Recognition of the limitations of RDFS led to the development of ontology languages such as OIL, DAML-ONT and DAML+OIL, which resulted into the OWL. OWL has been developed as an extension of RDFS. OWL (like DAML+OIL) uses the same syntax as RDF (and RDFS) to represent ontologies, the two languages are syntactically compatible. However, the semantic layering of the two languages is more problematical. The difficulty stems from the fact that OWL (like DAML+OIL) is largely based on DL, the semantics of which would normally be given by a classical first-order model theory in which individuals are interpreted as elements of some domain (a set), classes are interpreted as subsets of the domain and properties are interpreted as binary relations on the domain. The semantics of RDFS, on the other hand, are given by a non-standard model theory, where individuals, classes and properties are all elements in the domain. Properties are further interpreted as having extensions which are binary relations on the domain, and class extensions are only implicitly defined by the extension of the rdf:type property. Moreover, RDFS supports reflection on its own syntax: interpretation of classes and properties can be extended by statements in the language. Thus language layering is much more complex, because different layers subscribe to these two different approaches. A bit more sophisticated approach is provided by the OWL (Ontology Web Language) that is also recommended by W3C, which is based on DL framework. In DL we talk about individuals that are elements of a universe domain. The individuals are members of subclasses of the domain, and can be related to other individuals (or data values) by means of properties (n-ary relations are called properties in Web ontologies, for they are decomposed into n properties). The universe of discourse is divided into two disjoint sorts: the object domain of individuals and the data value domain of numbers. Thus the interpretation function assigns elements of the object domain to individual constants, elements of data value domain to value constants, and subclasses of the data domain to data types. Further, object and data predicates are distinguished, the former being interpreted as a subset of the Cartesian product of object domain, the latter a subset of the Cartesian product of value domain. DL is rather rich, though being an FOL language. It makes it possible to distinguish intensional knowledge (knowledge on the analytically necessary relations between concepts) and extensional knowledge (of contingent facts). To this end DL knowledge base includes the so-called T-boxes (terminology or taxonomy) and A-boxes (contingent attributes of objects). T-box contains verbal definitions, i.e., a new concept is defined composing known concepts. For instance, a woman can be defined: WOMAN = PERSON & SEXFEMALE, and a mother: MOTHER = WOMAN & child(HASchild). Thus the fact that, e.g., mother is a woman is analytic (necessary) truth. In T-boxes there are also specifications of necessary properties of concepts and relations between concepts: the property satisfiability corresponds to a nonempty concept, the relation of subsumption (intensionally contained concepts), equivalence and disjointness (incompatibility). Thus, e.g., that a bachelor is not married is analytically true proposition. On the other hand, the fact that, e.g., Mr. Jones is a bachelor is a contingent unnecessary fact. Such contingent properties (attributes) of objects are recorded in A-boxes. The third group of ontology languages lies somewhere between the FOL framework and RDFS. This group comprises SKIF and Common Logic [7]. The SKIF syntax is compatible with functional language LISP, but in principle it is an FOL syntax. These languages also have a non standard model theory, with predicates being interpreted as individuals, i.e., elements of a domain. Classes are however treated as subsets of the domain, and their redefinition in the language syntax is not allowed.

M. Duží et al. / Ontology As a Logic of Intensions

5

Based on common logic, the SKIF language accommodates some higher-order constructs. The SKIF languages are syntactically compatible with LISP, i.e., the FOL syntax is extended with the possibility to mention properties and use variables ranging over properties. For instance, we can specify that John and Peter have a common property: p.p(John) & p(Peter). The property they have in common can be, e.g., that they both love their wives. We can also specify that a property P is true of John, and the P has the property Q: P(John) & Q(P). If P is being honest and Q is being eligible, the sentence can be read as that John is honest, which is eligible. The interpretation structure is a triple ¢D, ext, V², where D is the universe, V is the function that maps predicates, variables and constants to the elements of D, and ext is the function that maps D into sets of n-tuples of elements of D. SKIF does not reduce the arity of predicates. To our best knowledge, the only ontology language supporting inferences at this level is a Semantic Web Rule Language (SWRL) combining OWL and RuleML [9]. According to the OWL (Web Ontology Language) overview [19], OWL is intended to be used when information contained in documents needs to be processed by applications, as opposed to situations where the contents only need to be presented to humans. OWL can be used to represent the meaning of terms in vocabularies and relationships between those terms. OWL has been designed on the top of XML, XLink, RDF and RDFS in order to provide more facilities for expressing meaning and semantics to represent machine interpretable content on the Web. Summarising, well-defined ontology should serve at least these goals: (1) universal library to be accessed and used by humans in a variety of information use contexts, (2) the backdrop work of computational agents carrying out activities on behalf of humans, and (3) a method for integrating knowledge bases and databases to perform tasks for humans. Current ontology languages, however, are far from meeting these goals, and their expressive power does not enable computational agents to make use of an adequate inference machine. Still worse, from a logical-semantic point of view these languages suffer the following shortcomings. None of them (perhaps with an exception of languages based on DL) makes it possible to express modalities (what is necessary and what is contingent), to distinguish three kinds of context, viz. extensional level of objects like individuals, numbers, functions (-in-extension), intensional level of properties, propositions, offices and roles, and finally hyperintensional level of concepts (i.e. algorithmically structured procedures). Concepts of n-ary relations are unreasonably modelled by properties. True, each n-ary relation can be expressed by n unary relations (properties) but such a representation is misleading and incomprehensible. Ontology language should be, however, universal, highly expressive, with transparent semantics and meaning driven axiomatisation. For these reasons we vote for an expressive system of Transparent Intensional Logic (TIL). From the formal point of view, TIL is a hyper-intensional, partial, typed O-calculus. Hyperintensional, because we apply top-down approach to semantics, from hyper-intensional (conceptual) level of procedures, via intensional down to extensional level of abstraction. Basic semantic construct is an abstract procedure known as TIL construction. Since TIL has been referred to in numerous EJC papers, in the next paragraph we only briefly recapitulate basic principles of TIL. For the most up-to-date exposition, see [5] and also [10].

6

M. Duží et al. / Ontology As a Logic of Intensions

1.2. A brief introduction to TIL TIL is an overarching semantic theory for all sorts of discourse, whether colloquial, scientific, mathematical or logical. The theory is a procedural one, according to which sense is an abstract, pre-linguistic procedure detailing what operations to apply to what procedural constituents to arrive at the product (if any) of the procedure. Such procedures are rigorously defined as TIL constructions. The semantics is entirely anticontextual and compositional and it is, to the best of our knowledge, the only one that deals with all kinds of context in a uniform way. Thus the sense of a sentence is an algorithmically structured construction of the proposition denoted by the sentence. The denoted proposition is a flat, or unstructured, mapping with domain in a logical space of possible worlds. Our motive for working ‘top-down’ has to do with anticontextualism: any given unambiguous term or expression (even one involving indexicals or anaphoric pronouns) expresses the same construction as its sense whatever sort of context the term or expression is embedded within. And the meaning of an expression determines the respective denoted entity (if any), but not vice versa. The denoted entities are (possibly 0-ary) functions understood as set-theoretical mappings. Thus we strictly distinguish between a procedure (construction) and its product (here, a constructed function), and between a function and its value. Intuitively, construction C is a procedure (a generalised algorithm). Constructions are structured in the following way. Each construction C consists of sub-instructions (constituents), each of which needs to be executed when executing C. Thus a specification of a construction is a specification of an instruction on how to proceed in order to obtain the output entity given some input entities. There are two kinds of constructions, atomic and compound (molecular). Atomic constructions (Variables and Trivializations) do not contain any other constituent but themselves; they specify objects (of any type) on which compound constructions operate. The variables x, y, p, q, …, construct objects dependently on a valuation; they v-construct. The Trivialisation of an object X (of any type, even a construction), in symbols 0X, constructs simply X without the mediation of any other construction. Compound constructions, which consist of other constituents as well, are Composition and Closure. Composition [F A1…An] is the operation of functional application. It vconstructs the value of the function f (valuation-, or v-, -constructed by F) at a tuple argument A (v-constructed by A1, …, An), if the function f is defined at A, otherwise the Composition is v-improper, i.e., it fails to v-construct anything.4 Closure [Ox1…xn X] spells out the instruction to v-construct a function by abstracting over the values of the variables x1,…,xn in the ordinary manner of the O-calculi. Finally, higher-order constructions can be used twice over as constituents of composite constructions. This is achieved by a fifth construction called Double Execution, 2X, that behaves as follows: If X v-constructs a construction X’, and X’ v-constructs an entity Y, then 2X v-constructs Y; otherwise 2X is v-improper, failing as it does to v-construct anything. TIL constructions, as well as the entities they construct, all receive a type. The formal ontology of TIL is bi-dimensional; one dimension is made up of constructions, the other dimension encompasses non-constructions. On the ground level of the type hierarchy, there are non-constructional entities unstructured from the algorithmic point of view belonging to a type of order 1. Given a so-called epistemic (or objectual) base 4 As mentioned above, we treat functions as partial mappings, i.e., set-theoretical objects, unlike the constructions of functions.

M. Duží et al. / Ontology As a Logic of Intensions

7

of atomic types (R-truth values, L-individuals, W-time moments / real numbers, Zpossible worlds), the induction rule for forming functional types is applied: where D, E1,…,En are types of order 1, the set of partial mappings from E1 u…u En to D, denoted ‘(D E1…En)’, is a type of order 1 as well.5 Constructions that construct entities of order 1 are constructions of order 1. They belong to a type of order 2, denoted ‘*1’. The type *1 together with atomic types of order 1 serves as a base for the induction rule: any collection of partial mappings, type (D E1…En), involving *1 in their domain or range is a type of order 2. Constructions belonging to a type *2 that identify entities of order 1 or 2, and partial mappings involving such constructions, belong to a type of order 3. And so on ad infinitum. The sense of an empirical expression is a hyperintension that is a construction that produces a (possible world) D-intension, where D-intensions are members of type (DZ), i.e., functions from possible worlds to an arbitrary type D. On the other hand, Dextensions are members of a type D, where D is not equal to (EZ) for any E, i.e., extensions are functions whose domain is not the set of possible worlds. Intensions are frequently functions of a type ((DW)Z), i.e., functions from possible worlds to chronologies of the type D (in symbols: DWZ), where a chronology is a function of type (DW). Some important kinds of intensions are: Propositions, type RWZ. They are denoted by empirical sentences. Properties of members of a type D, or simply D-properties, type (RD)WZ.6 General terms, some substantives, intransitive verbs (‘student’, ‘walks’) denote properties, mostly of individuals. Relations-in-intension, type (RE1…Em)WZ. For example transitive empirical verbs (‘like’, ‘worship’), also attitudinal verbs denote these relations. D-roles, also D-offices, type DWZ, where D (RE). Frequently LWZ. Often denoted by concatenation of a superlative and a noun (‘the highest mountain’). An object A of a type D is denoted ‘A/D’. That a construction C/ n v-constructs an object of type D is denoted ‘C ov D’. We use variables w and t as v-constructing elements of type Z (possible worlds) and W (times), respectively. If C ov DWZ vconstructs an D-intension, the frequently used Composition of the form [[Cw]t], the intensional descent of the D-intension, is abbreviated ‘Cwt’. The analysis of a sentence consists in discovering the logical construction (procedure) encoded by a given sentence. To this end we apply a method of analysis that consists of three steps:7 1) Type-theoretical analysis, i.e., assigning types to the objects that receive mention in the analysed sentence. 2) Synthesis, i.e., combining the constructions of the objects ad (1) in order to construct the proposition of type RWZ denoted by the whole sentence. 3) Type-Theoretical checking. 5 TIL is an open-ended system. The above epistemic base {R, L, W, Z} was chosen, because it is apt for natural-language analysis, but the choice of base depends on the area and language to be analysed. For instance, possible worlds and times are out of place in case of mathematics, and the base might consist of, e.g., R and Q, where Q is the type of natural numbers. 6 We model D-sets and (D1…Dn)-relations by their characteristic functions of type (RD), (RD1…Dn), respectively. Thus an D-property is an empirical function that dependently on states-of-affairs (WZ) picks-up a set of D-individuals, the population of the property. 7 For details see, e.g.,[12].

8

M. Duží et al. / Ontology As a Logic of Intensions

To illustrate the method, let us analyse the sentence “All drivers are persons”. Ad (1) The objects mentioned by the sentence are individual properties of being a Driver and being a Person, and the quantifier All. Individual properties receive the type (((RL)W)Z), RWZ for short. Given a world-time pair ¢w, t², a property applied to world w and time t returns a class of individuals, its population at ¢w, t². Yet the sentence does not mention any particular individual, be it a driver or a person. It says that the population of drivers is a subset of persons. Thus the type of the (restricted) quantifier All is ((R(RL))(RL)). Given a set M/(RL) of individuals, the quantifier All returns all the supersets of M. Thus we have [0All 0M] o (R(RL)). Ad (2) Now we combine constructions of the objects ad (1) in order to construct the proposition (of type RWZ) denoted by the whole sentence. Since we aim at discovering the literal analysis of the sentence, objects denoted by semantically simple expressions ‘driver’, ‘person’ and ‘all’ are constructed by their Trivialisations: 0Driver, 0Person, 0 All. By Composing these constructions, we obtain a truth-value (T or F), according as the population of people belongs to the set of supersets of the population of drivers. Thus we have, [[0All 0Driverwt] 0Personwt] ov R. Finally, by abstracting over the values of the variables w and t, we construct the proposition: OwOt [[0All 0Driverwt] 0Personwt]. Ad (3). By drawing a type-theoretical structural tree, we check whether particular constituents of the above Closure are combined in a type-theoretically correct way. Ow Ot [[0All

0

Driverwt]

((R(RL))(RL))

0

Personwt]

(RL)

(R(RL))

(RL) R

(RW) ((RW)Z)

the type of a proposition, RWZ for short.

So much for the method of analysis and the semantic schema of TIL. 1.3. Ontology content Formal ontology is a result of the conceptualization of a given domain. It contains definitions of the most important entities, forms a conceptual hierarchy together with the most important attributes and relations between entities. Material individuals are mereological sums of other individuals, but only contingently so. Similarly, values of attributes and properties are ascribed to individuals contingently, provided a given property is purely contingent, that is without an essential core. Thus we advocate for a (modest) individual anti-essentialism. On the other hand, on the intensional level of propositions, properties, offices and roles, that is entities which we call ‘intensions’, the most important relation to be observed is that of requisite. For instance, the property of being a mammal is a requisite of the property of being a whale. It is an analytically necessary relation between intensions that gives rise to the so-called ISA hierarchy. Thus on the intensional level we advocate for intensional essentialism; an essence of a

M. Duží et al. / Ontology As a Logic of Intensions

9

property is the set of all its requisites. Finally, on the hyper-intensional level of concepts, relations to be observed are equivalence (i.e. producing the same entity), refinement (a compound concept is substituted for a simpler yet equivalent concept), entailment and presupposition. The structure of ontology building starts on the hyper-intensional level with the specification of primitive concepts. Next we specify compound concepts as ontological definitions of entities of a given domain. Having defined entities, we can specify their most important descriptive attributes. The building process continues by specifying particular (empirical) relations between entities and analytical relations of requisites that serve to build up ontological hierarchy. Finally, the most important general rules that govern behaviour of the system are specified. Here again we distinguish analytically necessary constraints from nomic and common necessities that are given by laws and conventions, respectively; they are not valid analytically necessary. For instance, mathematical laws are analytically necessary, they hold independently of states of affairs. On the other hand, laws of physics are not logically or analytically necessary, they are only nomically necessary. It is even disputable whether these laws are eternal in our world. Yet still weaker constraints are, for instance, traffic laws. That we drive on the right-hand side of a lane is valid only by convention and locally. Summarising, basic parts of a formal ontology should encompass: (1) Conceptual (terminological) dictionary which contains: a) primitive concepts b) compound concepts (ontological definitions of entities) c) the most important descriptive attributes, in particular identification of entities (2) Relations a) contingent empirical relations between entities, in particular the part-whole relation b) analytical relations between intensions, i.e., requisites and essence, which give rise to ISA hierarchy (3) Integrity constraints a) Analytically necessary rules b) Nomologically necessary rules c) Common rules of ‘necessity by convention’ Concerning ad (1), in particular ontological definitions, this topic has been dealt with in [4]. Briefly, ontological definition of an entity is a compound construction of the entity. Such a definition often serves as a refinement of a primitive concept of the entity, which makes it possible to prove some analytic statements about the entity. For example, the sentence “Whales are not dolphins” contains the empirical predicates ‘is a whale’ and ‘is a dolphin’, yet the sentence is analytic truth. At no world/time are the properties being a whale and being a dolphin co-instantiated by the same individual. The proposition constructed by the sentence is the necessary proposition TRUE. In order to prove it, we need to refine the concept of a whale. To this end we make use of the fact that the property of being a whale can be defined as the property of being a marine mammal of the order Cetacea that is neither a dolphin nor a porpoise. 8 Thus the ontological definition of the property of being a whale is

8

See, for instance, http://mmc.gov/species/speciesglobal.html#cetaceans or http://www.crru.org.uk/education/factfiles/taxonomy.htm

10

M. Duží et al. / Ontology As a Logic of Intensions

OwOt Ox [[0Mammalwt x] [0Marinewt x] [0Cetaceawt x] [0Dolphinwt x] [0Porpoisewt x]] Types: x o L; Cetacea, Mammal, Marine, Dolphin, Porpoise/(RL)WZ. Using this definition instead of the primitive concept 0Whale we get: OwOt [0No Ox [[0Mammalwt x] [0Marinewt x] [0Cetaceawt x] [0Dolphinwt x] [0Porpoisewt x]] 0Dolphinwt]. Gloss: “No individual x such that x is a marine mammal of the order Cetacea and x is neither a dolphin nor a porpoise is a dolphin”. In this paper we focus problems ad (2) and (3), that is, we will examine relations between intensions, properties of intensions and various integrity constraints viewed via the logic of intensions.

2. Logic of intensions 2.1. Requisites and ISA hierarchies. It is important to distinguish between purely contingent propositions and the proposition TRUE that takes the value T in all ¢w, t²-pairs. The latter is denoted by analytically true sentences such as the above analysed sentence “No whale is a dolphin” or “All drivers are persons”. We have seen that the literal analysis does not make it possible to prove the analytic truth of the sentence. To this end we have to possibilities. Either we can record ontological definitions refining the primitive concepts of the objects talked about (as illustrated by the above whale-example), or we need to explicitly record in our ontology the fact that there is a necessary relation (-inextension) between the two properties. We call this relation a requisite, in this case Req1/(R(RL)WZ(RL)WZ) and it receives this definition: [0Req1 0Person 0Driver] =df wt [x [[0Driverwt x] [0Personwt x]]] Gloss. Being a person is a requisite of being a driver. In other words, necessarily and for any individual x, if x instantiates the property of being a driver then x also instantiates the property of being a person. Now we set out the logic of requisites, because this relation is the basic relation that gives rise to ISA taxonomies.9 The requisite relations Req are a family of relationsin-extension between two intensions, hence of the polymorphous type (RDWZEWZ), where possibly D = E. The relation of a requisite can be defined between intensions of any type. For instance, a requisite of finding is the existence of a sought object. Infinitely many combinations of Req are possible, but the following four are the relevant ones we wish to consider: (1) Req1 /(R (RL)WZ (RL)WZ): an individual property is a requisite of another such property. (2) Req2 /(R LWZ LWZ): an individual office is a requisite of another such office. (3) Req3 /(R (RL)WZ LWZ): an individual property is a requisite of an individual office. (4) Req4 /(R LWZ (RL)WZ): an individual office is a requisite of an individual property. 9

Parts of this section draw on material presented in [5], Chapter 4.

M. Duží et al. / Ontology As a Logic of Intensions

11

Neglecting complications due to partiality, definitions of particular kinds of requisites should be obvious: “Y is a requisite of X” iff “necessarily whatever occupies/ instantiates X at ¢w, t² it also occupies/instantiates Y at this ¢w, t².” Examples. Being a Person and being a Driver is an example of Req1. An example of Req2 is The Commander-in-Chief and the President of USA. The former office is a requisite of the latter, such that whoever is the President is also the Commander-inChief. However, it may happen that the Presidency goes vacant, while somebody occupies the office of Commander-in-Chief. As an example of Req3 we can adduce the property of being a US citizen and the office President of USA. Finally, an example of Req4 is the pair of God-office and the property of being Omniscient. Note that while Req1/(R(RL)WZ(RL)WZ) and Req2/(RLWZLWZ) are homogeneous, Req3, Req4 are heterogeneous. Since the latter two do not have a unique domain, it is not sensible to ask what sort of ordering they are. Not so with the former two. We define them as quasi-orders (a.k.a. pre-orders) over (R(RL)WZ), (RLWZ), respectively, that can be strengthened to weak partial orderings. However, they cannot be strengthened to strict orderings on pain of paradox, since they would then both be reflexive and irreflexive. We wish to retain reflexivity, such that any intension having requisites will count itself among its requisites. Since intensions are properly partial functions, in order to deal with partiality we make use of three properties of propositions True, False, Undef/(RRWZ)WZ. If P o RWZ is a construction of a proposition, [0Truewt P] returns T if the proposition takes the truthvalue T in a given ¢w, t², otherwise F. [0Falsewt P] returns T if the proposition takes the truth-value F in a given ¢w, t², otherwise F. [0Undefwt P] returns T in a given ¢w, t² if neither [0Truewt P] nor [0Falsewt P] returns T, otherwise F. Claim 1 Req1 is a quasi-order on the set of L-properties. Proof. Let X, Y o (RL)WZ. Then Req1 belongs to the class QO/(R(R(RL)WZ(RL)WZ)) of quasi-orders over the set of individual properties: Reflexivity.

[0Req1 X X] = wt [x [[0Truewt OwOt [Xwt x]] [0Truewt OwOt [Xwt x]]]]

Transitivity.

[[[0Req1 Y X] [0Req1 Z Y]] [0Req1 Z X]] =

[wt [x [[0Truewt OwOt [Xwt x]] [0Truewt OwOt [Ywt x]]] [[0Truewt OwOt [Ywt x]] [0Truewt OwOt [Zwt x]]]] wt [x [[0Truewt OwOt [Xwt x]] [0Truewt OwOt [Zwt x]]]]] In order for a requisite relation to be a weak partial order, it will need to be also anti-symmetric. The Req1 relation is, however, not anti-symmetric. If properties X, Y are mutually in the Req1 relation, i.e., if [[0Req1 Y X] [0Req1 X Y]] then at each ¢w, t² the two properties are truly ascribed to exactly the same individuals. This does not entail, however, that X, Y are identical. It may be the case that there is an individual a such that [Xwt a] v-constructs F whereas [Ywt a] is v-improper. For instance, the following properties X, Y differ only in truth-values for those individuals who never

12

M. Duží et al. / Ontology As a Logic of Intensions

smoked (let StopSmoke/(RL)WZ: the property of having stopped smoking).10 Whereas X yields truth-value gaps on such individuals, Y is false of them: X = OwOt Ox [0StopSmokewt x] Y = OwOt Ox [0Truewt OwOt [0StopSmokewt x]]. In order to abstract from such an insignificant difference, we introduce the equivalence relation Eq/(R(RL)WZ(RL)WZ) on the set of individual properties; p, q o (RL)WZ; =/(RRR): Eq = Opq [x [[0Truewt OwOt [pwt x]] = [0Truewt OwOt [qwt x]]]].

0

Now we define the Req1’ relation on the factor set of the set of L-properties as follows. Let [p]eq = Oq [0Eq p q] and [Req1’ [p]eq [q]eq] = [Req1 p q]. Then: Claim 2 Req1’ is a weak partial order on the factor set of the set of L-properties with respect to Eq. Proof. It is sufficient to prove that Req1’ is well-defined. Let p’, q’ be L-properties such that [0Eq p p’] and [0Eq q q’]. Then [Req1’ [p]eq [q]eq] = [Req1 p q] = wt [x [[0Truewt OwOt [pwt x]] [0Truewt OwOt [qwt x]]]] = wt [x [[0Truewt OwOt [p’wt x]] [0Truewt OwOt [q’wt x]]]] = [Req1’ [p’]eq [q’]eq]. Now obviously the relation Req1’ is antisymmetric: [[0Req1’ [p]eq [q]eq] [0Req1’ [q]eq [p]eq]] [[p]eq = [q]eq]. Claim 3 Req2 is a weak partial order defined on the set of L-offices. Proof. Let X, Y o LWZ. Then the Req2 relation belongs to the class WO/(R(R LWZLWZ)) of weak partial orders over the set of individual offices. Reflexivity.

[0Req2 X X] = [wt [[0Occwt X] [0Truewt OwOt [Xwt = Xwt]]]].

Antisymmetry.

[[[0Req2 Y X] [0Req2 X Y]] [X = Y]] =

[wt [[[0Occwt X] [0Truewt OwOt [Xwt = Ywt]]] [[0Occwt Y] [0Truewt OwOt [Xwt = Ywt]]]] [X = Y]] Transitivity.

[[[0Req2 Y X] [0Req2 Z Y]] [0Req2 Z X]] =

[wt [[[0Occwt X] [0Truewt OwOt [Xwt = Ywt]]] [[0Occwt Y] [0Truewt OwOt [Ywt = Zwt]]]] wt [[0Occwt X] [0Truewt OwOt [Xwt = Zwt]]]]. Remark. Antisymmetry requires the consistent identity of the offices constructed by X, Y: [X = Y]. The two offices are identical iff at all worlds/times they are either co10

We take the property of having stopped smoking as presupposing that the individual previously smoked. For instance, that Tom stopped smoking can be true or false only if Tom was once a smoker. Similarly for the property of having stopped whacking one’s wife.

M. Duží et al. / Ontology As a Logic of Intensions

13

occupied by the same individual or are both vacant: wt [[0Truewt OwOt [Xwt = Ywt]] [0Undefwt OwOt [Xwt = Ywt]]] = wt [0Falsewt OwOt [Xwt = Ywt]], which is the case here. It is a well-known fact that hierarchies of intensions based on requisite relations establish inheritance of attributes and possibly also of operations. For instance, a driver in addition to his/her special attributes like having a driving license inherits all the attributes of a person. This is another reason for including such a hierarchy into ontology. This concludes our definition of the logic of the requisite relations. We turn now to dealing with a part-whole relation. 2.2. Part-whole relation We advocate for the thesis of modest individual anti-essentialism: If an individual I has a property P necessarily (i.e., in all worlds and times), then P is a constant or partly constant function. In other words, the property has a non-empty essential core Ess, where Ess is a set of individuals that have the property necessarily, and I is an element of Ess. There is, however, a frequently voiced objection to individual anti-essentialism. If, for instance, Tom’s only car is disassembled into its elementary physical parts, then Tom’s car no longer exists; hence, the property of being a car is essential of the individual referred to by ‘Tom’s only car’. Our response to the objection is this. First, what is denoted (as opposed to referred to) by ‘Tom’s only car’ is not an individual, but an individual office/role, which is an intension of type LWZ having occasionally different individuals, and occasionally none, as values in different possible worlds at different times. Whenever Tom does buy a car, it is not logically necessary that Tom buy some one particular car rather than any other. Second, the individual referred to as ‘Tom’s only car’ does not cease to exist even after having been taken apart into its most elementary parts. It has simply lost some properties, among them the property of being a car, the property of being composed of its current parts, etc, while acquiring some other properties. Suppose somebody by chance happened to reassemble the parts so that the individual would regain the property of being a car. Then Tom would have no right to claim that this individual was his car, in case it was allowed that the individual had ceased to exist. Yet Tom should be entitled to claim the reassembled car as his.11 Therefore, when disassembled, Tom’s individual did not cease to exist; it had simply (unfortunately) obtained the property of completely disintegrating into its elementary physical parts. So much for modest individual anti-essentialism. The second thesis we are going to argue for is this. A material entity that is a mereological sum of a number of parts, such as a particular car, is from a logical point of view a simple, hence unstructured individual. Only its design, or construction, is a complex entity, namely a structured procedure. This is to say that a car is not a structured whole that organizes its parts in a particular manner. Tichý says: [A] car is a simple entity. But is this not a reductio ad absurdum? Are cars not complex, as anyone who has tried to fix one will readily testify? No, they are not. If a car were a complex then it would be legitimate to ask: Exactly how complex is it? Now how many parts does a car consist of? One plausible answer which may suggest itself is that it has three parts: an engine, a chassis, and a body. But an equally plausible answer can be given in terms of a much longer list: several spark plugs, several pistons, a 11 As Tichý argues in [16], where he uses the example of a watch being ‘repaired’ by a watchmaker in such a way as to become a key.

14

M. Duží et al. / Ontology As a Logic of Intensions

starter, a carburettor, four tyres, two axles, six windows, etc. Despite being longer the latter list does not overlap with the former: neither the engine, nor the chassis nor the body appears on it. How can that be? How can an engine, for example, both be and not be a part of one and the very same car? There is no mystery, however. It is a commonplace that a car can be decomposed in several alternative ways. … Put in other words, a car can be constructed in a very simple way as a mereological sum of three things, or in a more elaborate way as a mereological sum of a much larger set of things. ([17], pp. 179-80.)

It is a contingent fact that this or that individual consists of other individuals and thereby creates a mereological sum. Importantly, being a part of is a relation between individuals, not between intensions. There can be no inheritance or implicative relation between the respective properties ascribed to a whole and its individual parts. Thus it is vital not to confuse the requisite relation, which obtains between intensions, with the part-whole relation, which obtains between individuals. The former relation obtains of necessity (e.g., necessarily, any individual that is an elephant is a mammal), while the latter relation obtains contingently. Logically speaking, any two individuals can enter into the part-whole relation. One possible combination has Saturn a part of Socrates (or vice versa). There will be restrictions on possible combinations, but these restrictions are anchored to nomic necessity (provided a given possible world at which a combination of individuals is attempted has laws of nature at all). One impossible combination would have the largest mountain on Saturn be a part of S (or vice versa). Why impossible? Because of wrong typing: the arguments of the part-whole relation must be individuals (i.e., entities of type L), but the largest mountain on Saturn is an individual office while S is a real number. Yet there is another question interesting from the ontological point of view: which parts are essential for an individual in order to have a property P? For instance, the property of having an engine is essential for the property of being a car, because something designed without an engine does not qualify as a car, but at most as a toy car, which is not a car. The answer to the question which parts are essential in order to have a property P is, in the car/engine example, that the property of having an engine is a requisite of the property of being a car. What is necessary is that a car, any car, should have an engine. It is even necessary that it should have a particular kind of engine, where being a kind of engine is a property of a property of individuals. This kind of a requisite relation should be also included into ontology. What is not necessary is that any car should have some one particular engine belonging to a particular kind of engine: mutatis mutandi, any two members of a particular kind of engine will be mutually replaceable.12 Thus the relation Part_of is of type (RLL)WZ. 2.3. Some other properties of intensions In addition to the above described higher-degree relations of requisite it is also useful to include into ontology some other higher-degree relations between and properties of intensions. In particular, we examine properties of relations-in-intension. For instance, that a given relation is necessarily reflexive, anti-symmetric and transitive, like the partial order induced by a requisite relation. 12

This problem is connected with the analysis of property modification, including being a malfunctioning P.

M. Duží et al. / Ontology As a Logic of Intensions

15

These higher-order properties of intensions are necessarily valid due to the way they are constructed. Since we explicate concepts as closed constructions modulo D- and Ktransformation, we can also speak about mutual relations between and properties of concepts which define particular intensions. Those that deserve our attention are in particular: Incompatibility of concepts defining particular properties, i.e., the respective populations are necessarily disjoint; example: bachelor vs. married man. Equivalence of concepts, i.e., the defined properties are one and the same property Week-equivalence of concepts, i.e., the defined properties are ‘almost the same’; as an example we echo the relation Eq between individual properties defined in the previous paragraph Functionality of a relation-in-intension, that is necessarily, in each ¢w, t²-pair, a given relation R Awt uBwt is a mapping fR: Awt Æ Bwt assigning to each element of A at most one element of B Inverse functionality of a relation-in-intension, that is necessarily, in each ¢w, t²pair, a given relation-in-extension R Awt u Bwt is a mapping fR–1: Bwt Æ Awt assigning to each element of Bwt at most one element of Awt. We also often need to specify some restrictions on the domain or range of a given mapping. Such local restrictions are specified as integrity constraints which we are going to deal with in the next paragraph.13 2.4. Integrity constraints Classical integrity constraints specify whether a given function-in-intension (i.e. an attribute) must be singular or may be multi-valued, and whether it is mandatory or optional. These constraints are analytically necessary. As an example of a cardinality constraint we can adduce the constraint that everybody has just one (biological) mother and father. That each order must concern a customer, a producer/seller and some products is an example of a constraint on mandatory relation. In addition to these analytical constraints it is useful to specify restrictions on cardinality in case of multi-valued attributes, or particular roles of individuals that enter into a given relation, etc. These constraints have the character of nomically necessary constraints given by some conventions valid in a given domain. For instance, there can be a constraint valid in a given organization that each exporter can have five customers at maximum. Regardless of the character of a given domain, we should always specify the degree of necessity of a given integrity constraint. If C ov R v-constructs the respective condition to be met, the basic kinds of constraints ordered from the highest to the lowest are: a) Analytically necessary rules; these are specified by constructions of the form wt C. b) Nomologically necessary rules; these are specified by constructions of the form Owt C.

13 In the terminology of standard ontology languages, the so-called “properties” are actually relationsin-intension with ‘slots’. Thus we can speak about ‘slot constraints’ and facets that are local slot constraints. See [15].

16

M. Duží et al. / Ontology As a Logic of Intensions

c)

Common rules of ‘necessity by convention’; these are specified by constructions of the form OwOt x [C …x …]. To adduce an example, imagine a mobile agent (typically a car) that encounters an obstacle on his way. In order to specify the behaviour of the agent properly, we must take into account priorities of particular constraints. First, the agent must take into account analytical constraints like that there cannot be two material objects at the same position at the same time. Second, physical laws must be considered; for instance, we must calculate vehicle stopping distance taking into account the speed of the agent as well as of the obstacle and the direction of their move. Only then conventional laws like traffic rules can be considered. If the agent comes to a conclusion that the stopping distance is greater than the distance of an obstacle then, of course, the rules like driving on the right-hand side of a lane or traffic sings cannot be followed. So much for the logic of intensions. In the next section we tackle another important phenomenon that is useful to include into ontology so that reasoning of agents can be properly specified, namely two kinds of entailment relation which also can be viewed as higher-order integrity constraints. They are presupposition vs. mere entailment.

3. Presupposition and entailment When used in a communicative act, a sentence communicates something (the focus F) about something (the topic T). Thus the schematic structure of a sentence is F(T). The topic T of a sentence S is often associated with a presupposition P of S such that P is entailed both by S and non-S. On the other hand, the clause in the focus usually triggers a mere entailment of some P by S. Schematically, (i) S |= P and non-S |= P (P is a presupposition of S); Corollary: If non-P then neither S nor non-S is true. (ii) S |= P and neither (non-S |= P) nor (non-S |= non-P)

(mere entailment).

More precisely, the entailment relation obtains between hyperpropositions P, S, i.e., the meaning of P is entailed or presupposed by the meaning of S. For the precise definition of entailment and presupposition, see [5], Section 1.5. The phenomenon of topic-focus is associated de dicto – de re ambivalence. Consider a pair of sentences differing only in terms of topic-focus articulation: (1) (2)

The critical situation on the highway D1 was caused by the agent a. The agent a caused the critical situation on the highway D1.

While (1) not only entails but also presupposes that there be a critical situation on D1, the truth-conditions of (2) are different, as our analysis clarifies. First, (1) as well as (1’), (1’)

The critical situation on the highway D1 was not caused by the agent a.

are about the critical situation, and that there is a such a situation is not only entailed but also presupposed by both the sentences. As we have seen above, the meaning of a sentence is a procedure producing a proposition, i.e. an object of type RWZ. Execution of this procedure in any world/time yields a truth-value T, F or nothing. Thus we can conceive the sense of a sentence as an

M. Duží et al. / Ontology As a Logic of Intensions

17

instruction on how to evaluate its truth-conditions in any world/time. The instruction encoded by (1) formulated in logician’s English is this: If there is a critical situation on the highway D1 then return T or F according as the situation was caused by the agent a, else fail (to produce a truth-value). Applying our method of analysis introduced in Section 1, we start with assigning types to the objects that receive mention in the sentence. Simplifying a bit let the objects be: Crisis/RWZ: the proposition that there is a critical situation on the highway D1; Cause/(RLRWZ)WZ: the relation-in-intension between an individual and a proposition which has been caused to be true by the individual; Agent_a/L. A schematic analysis of (1) comes down to this procedure: (1s)

OwOt [if 0Crisiswt then [0Causewt 0Agent_a 0Crisis] else Fail]

So far so good; yet there is a problem of how to analyse the connective if-then-else. There has been much dispute over the semantics of ‘if-then-else’ among computer scientists. We cannot simply apply material implication, . For instance, it might seem that the instruction expressed by “If 5=5 then output 1, else output the result of 1 divided by 0” received the analysis [[[05=05] [n=01]] [[05=05] [n=[0Div 01 00]]]], where n is the output number. But the output of the above procedure should be the number 1 because the else clause is never executed. However, due to the strict principle of compositionality that TIL observes, the above analysis fails to produce anything, the construction being improper. The reason is this. The Composition [0Div 01 00] does not produce anything: it is improper because the division function takes no value at the argument <1, 0>. Thus the Composition [n = [0Div 01 00]] is v-improper for any valuation v, because the identity relation = does not receive an argument, and so any other Composition containing the improper Composition [0Div 01 00] as a constituent also comes out v-improper. The underlying principle is that partiality is being strictly propagated up. This is the reason why the if-then-else connective is often said to be a non-strict function. However, there is no cogent reason to settle for non-strictness. We suggest applying a mechanism known in computer science as lazy evaluation. The procedural semantics of TIL operates smoothly even at the level of constructions. Thus it enables us to specify a strict definition of if-then-else that meets the compositionality constraint. The analysis of “If P then C1, else C2” is a procedure that decomposes into two phases. First, on the basis of the condition P ov R, select one of C1, C2 as the procedure to be executed. Second, execute the selected procedure. The first phase, viz. the selection, is realized by the Composition [0the_only Oc [[P [c=0C]] [P [c=0D]]]]. The Composition [[P [c=0C]] [P [c=0D]]] v-constructs T in two cases. If P v-constructs T then the variable c receives as its value the construction C, and if P vconstructs F then the variable c receives the construction D as its value. In either case the set v-constructed by Oc [[P [c=0C]] [P [c=0D]]] is a singleton. Applying the singulariser the_only to this set returns as its value the only member of the set, i.e., either the construction C or D.

18

M. Duží et al. / Ontology As a Logic of Intensions

Second, the chosen construction c is executed. As a result, the schematic analysis of “If P then C else D” turns out to be (*)

[ L Oc [[P [c=0C]] [P [c=0D]]]].

2 0

Types: PoR (the condition of the choice between the execution of C or D); C, D/ n; variable c ov n; the_only/( n(R n)): the singulariser function that associates a singleton set of constructions with the only construction that is an element of this singleton, and which is otherwise (i.e., if the set is empty or many-valued) undefined. Note that we do need a hyperintensional, procedural semantics here. First of all, we need a variable c ranging over constructions. Moreover, the evaluation of the first phase does not involve the execution of the constructions C and D. These constructions are only arguments of other constructions. Returning to the analysis of (1), in our case the condition P is that there be a crisis on the highway D1, i.e., 0Crisiswt. The construction C that is to be executed if P yields T is [0Causewt 0Agent_a 0Crisis]], and if P yields F then no construction is to be selected. Thus the analysis of the sentence (1) comes down to this Closure: (1*)

OwOt 2[0LOc [[0Crisiswt [c = 0[0Causewt 0Agent_a 0Crisis]]]] [0Crisiswt 0F]]]

The evaluation of (1*) in any ¢w, t²-pair depends on whether the presupposition 0 Crisiswt is true in ¢w, t². If true, then the singleton v-constructed by Oc [ … ] contains as the only construction the Composition [0Causewt 0Agent_a 0Crisis]], which is afterwards executed to return T or F, according as the agent a caused the crisis. If false, then the second conjunct in Oc […] comes down to [0T 0F] and thus we get Oc 0F. The v-constructed set is empty. Hence, 2[LOc 0F] is v-improper, that is the Double Execution fails to produce a truth-value. To generalise, an analytic schema of a sentence S associated with a presupposition P is a procedure of the form If P then S else Fail. The corresponding schematic TIL construction is (**)

OwOt 2[0LOc [[Pwt [c=0Swt]] [Pwt 0F]]]. The truth-conditions of the other reading, i.e. the reading of (2)

(2)

“The agent a caused the critical situation on the highway D1”

are different. Now the sentence (2) is about the agent a (topic), ascribing to a the property that it caused the crisis (focus). Thus the scenario of truly asserting that (2) is not true can be, for instance, this. Though it is true that the agent a is known as a hit and run driver, this time he behaved well and prevented a critical situation from arising. Or, a less optimistic scenario is thinkable. The critical situation on D1 is not because of the agent a’s risky driving but because the highway is in a very bad condition. Hence, that there is a crisis is not presupposed by (2), and its analysis is this Closure: (2*)

OwOt [0Causewt 0Agent_a 0Crisis]

M. Duží et al. / Ontology As a Logic of Intensions

19

The moral we can extract from these examples is this. Logical analysis cannot disambiguate any sentence, because it presupposes full linguistic competence. Thus we should include into our formal ontology the schematic rules that accompany activities like agents’ seeking and finding, causing something, etc. Then our fine-grained method can contribute to a language disambiguation by making these hidden features explicit and logically tractable. In case there are more non-equivalent senses of a sentence we furnish the sentence with different TIL constructions. If an agent receives an ambiguous message, he/she can answer by asking for disambiguation. Having a formal fine-grained encoding of a sense, the agent can then infer the relevant consequences.

4. Conclusion The theoretical specification of particular rules is only the first step. When making these features explicit we keep in mind an automatic deduction that will make use of these rules. To this end we currently develop a computational FIPA compliant variant of TIL, the functional programming language TIL-Script (see [3]). The direction of further research is clear. We are going to continue the development the TIL-Script language in its full-fledged version equivalent to TIL calculus. The development of TIL-Script is still a work in progress, in particular the implementation of its inference machine. From the theoretical point of view, the calculus and the rules of inference have been specified in [5], Sections 2.6 and 2.7. Yet its full implementation is a subject of further research. Currently we proceed in stages. First we implemented a method that decides a subset of the TIL-Script language computable by Prolog (see [2]). This subset has been now extended to the subset equivalent to standard FOL. For ontology building we combine traditional tools and languages like OWL (Ontology Web Language) with TIL-Script. We developed an extension of the editor Protégé-OWL so that to create an interface between OWL and TIL-Script. The whole method has been tested within the project ‘Logic and Artificial Intelligence for Multi-Agent Systems’ (see http://labis.vsb.cz/) using a traffic system as a case study. The sample test contained five mobile agents (cars), three car parks and a GIS agent. The GIS agents provided mobile agents with ‘visibility’, i.e., the coordinates of the objects within their visibility. All the agents communicated in TILScript and started with minimal (but not overlapping) ontologies. During the test they learned new concepts and enriched their ontology in order to be able to meet their goals. The agents’ goal was to find a vacant parking lot and park the car. All the agents succeeded and parked in a few seconds, which proved that the method is applicable and usable not only as an interesting theory but also in practice.

Acknowledgements. This research has been supported by the Grant Agency of the Czech Republic, projects No. 401/09/H007 ‘Logical Foundations of Semantics’ and 401/10/0792, ‘Temporal aspects of knowledge and information’, and by the internal grant agency of FEECS VSB-Technical University Ostrava, project No. IGA 22/2009, ‘Modeling, simulation and verification of software processes’.

20

M. Duží et al. / Ontology As a Logic of Intensions

References [1] Baader, F., Calvanese, D., McGuinness, D., L., Nardi, D., and Patel-Schneider, P., F., editors. The Description Logic Handbook: Theory, Implementation and Application. Cambridge University Press, 2002. [2] íhalová, M., Ciprich, N., Duží, M., Menšík, M. (2009): Agents' reasoning using TIL-Script and Prolog. In 19th Information Modelling and Knowledge Bases, ed. T. Tokuda, Y. Kiyoki, H. Jaakkola, T. Welzer, Družovec, Slovenia: University of Maribor, 137-156. [3] Ciprich, N., Duží, M. and Košinár, M.: The TIL-Script language. In Kiyoki, Y., Tokuda, T. (eds.): EJC 2008, Tsukuba, Japan 2008, pp. 167-182. [4] Duží, M., Materna, P. (2009): Concepts and Ontologies. In Information Modelling and Knowledge Bases XX . Y. Kiyoki, T. Tokuda, H. Jaakola, X. Chen, N. Yoshida (eds.), Amsterdam: IOS Press, pp. 45-64. [5] Duží, M., Jespersen, B. and P. Materna: Procedural Semantics for Hyperintensional Logic; Foundations and Applications of Transparent Intensional Logic. Springer: series for Logic, Epistemology and the Unity of Science, Vol. 17, 2010, ISBN: 978-90-481-8811-6. [6] Gordon, M. J. C. and Melham, T. F. (eds.) 1993: Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge: Cambridge University Press. [7] Hayes, P., Menzel, C., 2001. Semantics of knowledge interchange format. In: IJCAI 2001 Workshop on the IEEE Standard Upper Ontology. [8] Horrocks, I. and Patel-Schneider, P.F. 2003: Three These of Representation in the Semantic Web. WWW2003, May 20-24, Budapest, Hungary, 2003, (retrieved 10.1.2005) . [9] Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B. and Dean, M 2004: SWRL: A Semantic Web Rule Language Combiming OWL and RuleML. W3C Member Submission, May 2004, (retrieved 10.1.2010), . [10] Jespersen, B. (2008): ‘Predication and extensionalization’. Journal of Philosophical Logic, vol. 37, 479 – 499. [11] Kifer, M., Lausen, G., and James Wu. Logical foundations of object-oriented and frame-based languages. Journal of the ACM, 42(4):741-843, 1995. [12] Materna, P. and Duží M. (2005): ‘The Parmenides principle’, Philosophia, 32, 155-80. [13] Paulson. L. C. 1994: Isabelle: A Generic Theorem Prover. Number 828 in LNCS. Berlin: Springer. [14] Sowa, John, F.: Knowledge Representation. Logical, Philosophical, and Computational Foundations. Brooks/Cole 2000. [15] Svátek, V. Ontologie a WWW. www source: http://nb.vse.cz/~svatek/onto-www.pdf [16] Tichý, P. 1987. Individuals and their roles (in German; in Slovak in 1994). Reprinted in (Tichý 2004: 710-748). [17] Tichý, P. 1995. Constructions as the subject-matter of mathematics. In The Foundational Debate: Complexity and Constructivity in Mathematics and Physics, eds. W. DePauli-Schimanovich, E. Köhler and F. Stadler, 175-185. Dordrecht, Boston, London, and Vienna: Kluwer. Reprinted in (Tichý 2004: 873-885). [18] Tichý, P. 2004. Collected Papers in Logic and Philosophy, eds. V. Svoboda, B. Jespersen, C. Cheyne. Prague: Filosofia, Czech Academy of Sciences, and Dunedin: University of Otago Press. [19] W3C 2004: The World Wide Web Consortium: OWL Web Ontology Language Overview W3C Recommendation 10 February 2004, (retrieved 10.1.2010) .

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-21

21

A Three-layered Architecture for Event-centric Interconnections among Heterogeneous Data Repositories and its Application to Space Weather Takafumi NAKANISHIa, Hidenori HOMMAa, Kyoung-Sook KIM a, Koji ZETTSU a, Yutaka KIDAWARAa and Yasushi KIYOKIa,b a National Institute of Information and Communication Technology(NICT), Japan b Keio University, Japan

Abstract. Various knowledge resources are spread to a world-wide scope. Unfortunately, most of them are community-based and never thought to be used among different communities. That makes it difficult to gain “connection merits” in a web-scale information space. This paper presents a three-layered system architecture for computing dynamic associations of events to related knowledge resources. The important feature of our system is to realize dynamic interconnection among heterogeneous knowledge resources by event-driven and event-centric computing with resolvers for uncertainties existing among those resources. This system navigates various associated data including heterogeneous data-types and fields depending on user's purpose and standpoint. It also leads to effective use for the sensor data because the sensor data can be interconnected with those knowledge resources. This paper also represents application to the space weather sensor data. Keywords. Event-centric interconnections, heterogeneous data repositories, threelayered architecture, uncertainties for interrelationships, space weather sensor data

Introduction A wide variety of knowledge resources are spread to a worldwide scope via Internet with WWW. Most knowledge resources are provided through community-based creation and they are not shared and used well among different communities. In fact, most data repositories are constructed and used in the local community independently. It is difficult for users to interconnect these widely distributed data according to their purposes, tasks, or interests. That makes it difficult to gain “connection merits” in a web-scale information space. The difficulty in retrieving and interconnecting various knowledge resources arises because of heterogeneities of data-types, contents and utilization objectives. Recently, various sensor data resources are also created widely and spread to the worldwide areas. It is becoming very important to find how to utilize them in related applications. For specialists in different fields from the community sharing the sensor data, it is difficult to use those data effectively because their usage and definitions are not clearly recognized. Each research community focuses on the sensor for research

22

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

purpose dependent of the community. In the current state, most sensor data are not used effectively widely because each research community installs the sensor of each research purpose. It is necessary to share the sensor data with the information on the purpose of use and the background knowledge. For users in the other fields, it is difficult to understand how the sensor data are related to their lives and what the sensor data means. Generally, the expression of the sensor data is an enumeration of the numerical values with domain-specific formatting. For making it possible to utilize those data by other domain-specialists, it is important to show what the sensor data mean and what influence the sensor data cause. Some methods of annotating and connecting the sensor data are expected directly. However, it is too hard and complex. An interpretation and utilization of the sensor data are different according to user's background knowledge and his/her purposes. It is important to realize interconnection mechanisms depending on user's background knowledge and his/her purpose for sensor data. Currently, we have organized a joint research with the Space Environment Group of NICT, to solve how to share sensor data related to the space weather field. The aim of this research is to create new applications of space-weather sensor-data by combining the related knowledge resources. Space Environment Group of NICT is delivering sensor data of solar activities and space environment that is called space weather by RSS [1]. Space weather shows conditions on the Sun and in the solar wind, magnetosphere, ionosphere, and thermosphere. These can endanger human life or health by affecting the performance and reliability of space-borne and ground-based man-made systems [2] such as communication failure, damage of electric devices for space satellite, bombing, etc. The group is delivering these data so that various users may use them. In our current global environment, it is important to transmit significant knowledge to actual users from various data resources. In fact, most events affect various aspects of other areas, fields and communities. For example, in the case of the space weather, a sensor data representing abnormality of Dst index, which is one of the sensor data on the space weather related to Geomagnetic storm event, and news articles on interruption of relay broadcast for XVI Olympic Winter Games are interrelated in the context of “watching TV.” The Dst index and those news articles are individually published from different communities. In order to understand a concept in its entirety on user’s standpoint, a user would need to know the various interrelationships between data in interdisciplinary fields. By only using existing search engines, however, it is difficult to find various data resources in interdisciplinary fields. Moreover, the interconnection will change over time. In order to manage ever-changing interrelations among a wide variety of data repositories, it is important to realize an approach for discovering “event-centric interrelations” of various types of data on each different community depending on user’s standpoint. In this paper, we present a three-layered system architecture for computing dynamic associations of events in nature to related knowledge resources. The important feature of our system is to realize dynamic interconnection among heterogeneous data resources by event-driven and event-centric commuting with resolvers for uncertainties existing among those resources. This realizes interconnection indirectly and dynamically by semantic units for the data of various types such as text data, multimedia data, sensor data etc. In other words, it navigates various appropriate data including data of heterogeneous data-type and heterogeneous fields depending on user's purpose and standpoint. In addition, it leads to effective use for the sensor data because

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

23

the sensor data are interconnected with various data. We also propose a three-layer data structure for representing semantic units extracted from all type of data. The data structure represents semantic units depending on a constraint in each layer. By this data structure, we can compute interconnection between heterogeneous data in some semantic units. Actually, we consider that it is difficult to construct only static basic interrelationships that are acceptable in any cases. It is effective to provide the interrelationships corresponding to user’s standpoint dynamically. The essence of our system is to dynamically select, integrate and operate various appropriate content resources for distributed environment. We define constraints in each layer of the threelayer data stricture for semantic units –event, occurrence and scene. Therefore, our framework is important and effective to realize the distributed heterogeneous data resources. This paper is organized as follows. In section 1, we present a three-layer data structure for interconnection. ries. In section 2, we present the overview of interconnection for heterogeneous content repositories. In section 3, 4, and 5, we describe detail data structures and operations of an event, an occurrence, and a scene. In section 6, we describe the related works. Finally, in section 7, we conclude this paper.

1. Three-layer Data Structure for Interconnection In this section, we present a three-layer data structure for realizing event-centric interconnection of heterogeneous data repositories. Currently, a relationship between each data is represented in a static link. We consider that there are limitations to uniquely represent global static interrelationships. Because interrelationships keep changing in various factors such as spatiotemporal condition, background field, situation. Of course, the interrelation that everyone supports might exist, too. However, it is important to dynamically represent interrelationships depending on an arbitrary situation. It is difficult to represent unique and global interrelationship because it has uncertainties. We define the constraint for reducing the uncertainties, and design the method for representation of various interrelationships. In section 1.1, we describe uncertainties for interrelationships between heterogeneous data. In section 1.2, we define a three-layer data structure for interrelationships that considers these uncertainties. Furthermore, in section 1.3, we consider why we apply interconnection not integration from the standpoint of three uncertainties. 1.1. Uncertainties of Interrelationships between Heterogeneous Data Generally, it is difficult to represent static interrelationships between heterogeneous data because it has uncertainties. However, most current systems utilize static link representation. They implicitly have limitation of interconnection such as limitation of domain, data-type, and field. For realizing interconnection between heterogeneous data, we have to clear uncertainty items. There are three uncertainties for interrelationship between heterogeneous data as follows: (1) Which part of data to focus on.

24

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

It is necessary to extract metadata set as a semantic unit from target data in order to target heterogeneous data. In this case, the extracted semantic unit depends on which part of data to focus on. For example, it is assumed to extract semantic unit from the sensor data of precipitation. In the case that you focus when precipitation is zero, you can detect semantic unit that represents fine or cloudy weather. In the case that you focus when precipitation is higher than the threshold, you can detect semantic unit that represents heavy rain. A different semantic unit can be extracted from the same data source by changing the constraint. That is, it is important to clarify focus point of the data as constraint. (2) What standpoint to interpret data. An interpretation of each extracted semantic unit is changing by user’s background knowledge, standpoint, etc. For example, it assumes that there are disaster ontology and climate changing ontology. When the heavy rain semantic unit is mapped to disaster ontology, the event will be semantically arranged close to swollen river, traffic damage, etc. When the same heavy rain semantic unit is mapped to climate changing ontology, the event will be semantically arranged close to global warming. By this example, you can find various interpretations of the semantic unit are possible by changing the constraint. That is, it is important to clarify what standpoint to interpret data as constraint. (3) What standpoint to interrelate between each data. An interrelationship of each extracted semantic unit is also changing by user’s background knowledge, standpoint, etc. Actually, most interconnection depends on a situation. In such case, we should represent the interrelationship according to the situation. That is, it is important to clarify what standpoint to interrelate between each data as constraint. We consider that we can uniquely represent an interconnection on the constraints if we apply the constraints that exclude three above-mentioned uncertainties. Therefore, it is important to design a data structure for defining the constraints that represent three uncertainties. 1.2. Three-layer Data Structure—Event, Occurrence and Scene For representing interrelationship between heterogeneous data with such three uncertainties, we realize event-centric interconnection for heterogeneous data. It is necessary to design a new data structure for solving the uncertainties. In this section, we design a new three-layer data structure for interconnection of heterogeneous data. The data structure consists of three layer based on three uncertainties. By this data structure, we can represent interconnection between heterogeneous data depending on user’s purpose and standpoint. The data structure consists of three data-types in each layer –event, occurrence and scene. Figure 1 shows overview of the data structure and its layers. Each data has constraints – condition, context and viewpoint. à Event An event is a minimum semantic unit extracted from delivered target data. An event consists of set of various metadata that represent its features. For detecting event from target data, we have to determine a constraint. The constraint for event detection is called a condition. The condition represents which part of target data to focus on. In other words, the condition is constraints that represent

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

25

Figure 1. Overview of three-layer data structure for interconnection. The data structure consists of event, occurrence, and scene. There are three types of constraint – condition, context and viewpoint –for avoiding the uncertainties.

how to summarize target data and how to composite an event. Various events can be detected by setting various conditions from same target data. That is, this solves uncertainty (1) shown in section 1.1. The event also has its condition. It becomes possible to process unitedly by making various different kinds of data resources an event. à Occurrence An occurrence is a projected event according to a constraint that is called context. The interpretation of the event is different according to the standpoint, the background knowledge, etc. The context is a constraint for uniquely providing the interpretation of an event such as user's standpoint, background knowledge, etc. A occurrence is projection data of event along context. That is, the context solves uncertainty (2) shown in section 1.1. By the context, we can specify semantic of an event. Conversely, various occurrences can be composited by setting various contexts from same event. The occurrence consists of projected metadata with contexts. à Scene A scene is set of relationships between occurrences according to a constraint that is called viewpoint. The interconnection of occurrences is different according to the standpoint, the background knowledge, etc. The viewpoint is a constraint for uniquely providing the interconnection of occurrences such as user's standpoint, background knowledge, etc. That is, the viewpoint solves uncertainty (3) shown in section 1.1. By the viewpoint, we can specify interconnection. Conversely, various scenes can be composited by setting various viewpoints from same occurrences. The various interconnections between heterogeneous data can be represented by this data structure of three layers. For representing interconnection between heterogeneous data, events are detected from target data according to condition; occurrences are constructed by projection of events according to context; and scenes

26

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

Table 1. Summary of integration and interconnection

are constructed by interconnection of each occurrence according to viewpoint. The interconnection of heterogeneous data in the three constraints – condition, context and viewpoint– can be found if tracing this data structure oppositely according to the three constraints. 1.3. Integration or Interconnection Generally, techniques for arranging two or more resources include integration and interconnection. In this section, we consider whether integration or interconnection is effective in this case. Table 1 shows a summary for general features of integration and interconnection. For realizing an integration technique, we have to reconstruct all system in most cases because it is necessary to consolidate the system that distributes. However, an integration technique provides efficient computation for arranging two or more resources. An integration technique can arrange static, usual interrelationships fast. Oppositely, it is not possible to apply to the arrangement of various dynamic relationships. On the other hand, it is easy to implement an interconnection technique in most case because it is possible to mount making the best use of existing systems. However, the computational complexity tends to increase. It is better to apply an integration technique not an interconnection technique to arrange static, usual interrelationships because there are a lot of computational complexities. It is possible to apply an interconnection technique to arrangement of various dynamic interrelationships. In this paper, we focus on interrelationships of heterogeneous data. It is difficult to represent static interrelationships between heterogeneous data because it has the uncertainties shown in section 1.1. In this assumption, we should present the method for representing various interrelationships that change dynamically depending on the various constraints by avoiding these uncertainties. The interconnection can realize such an environment. Recently, a lot of data repositories and resources have been widely spread on the Internet. It is difficult to integrate these environments. Of course, it is not impossible to construct the integration system with a part of them. From the standpoint of the extendibility, it is reasonable to apply the interconnection to this environment that increases every day. An interconnection can be applied without changing the arrangement of the resource of the distributed environment. Actually, effectively using the heterogeneous data repositories scattered in the distributed environment is becoming important. In this case, we also take care of three uncertainties for interrelationship. In the case of space weather sensor data derived by Space

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

27

Environment Group of NICT, we are grappling with the similar issue. They require also representing various relationships between their space weather sensor data and other data. Furthermore, we are working “knowledge cluster systems” for knowledge sharing, analysis, and delivery among remote knowledge sites on a knowledge grid [3]. In this environment, we have constructed and allocated over 400 knowledge bases to each site. One of the important issues in this environment is how to arrange and interrelate among these knowledge bases. We have proposed a viewpoint-dependent interconnection method of knowledge bases by focus on concept words in each knowledge base [4]. In this case, to arrange each knowledge base maintaining a distributed environment, the interconnection is applied. Therefore, in order to compute interrelation among various resources in distributed environment, it is important to realize an interconnection mechanism depending on some constraint for avoiding uncertainties.

2. Overview of Interconnection for Heterogeneous Content Repositories In this section, we describe an overview of event-centric interconnection of heterogeneous content repositories. This is a model for interconnection of interdisciplinary data resources in distributed environment depending on some constraint for avoiding uncertainties shown in section 1. In today’s global environment, it is important to transmit significant knowledge to actual users from various data resources. In order to realize this environment, it is important to interrelate among data resources depending on some constraint for avoiding uncertainties. This framework realizes interconnection indirectly and dynamically for the data of various types such as text data, multimedia data, sensor data etc. That is, it helps a user to obtain various appropriate data including data of heterogeneous data-type and heterogeneous fields depending on user's purpose and standpoint. The overview of an event-centric interconnection for heterogeneous contents repositories is shown in Figure 2. Here, for realizing the framework, there are four modules – event detection module, event projection module, correlation analysis module and codifier module. à Event detection module: An event detection module extracts events shown in section 1.2 from target data depending on a condition. The condition is a kind of constraint for avoiding uncertainty shown in section 1. The event detection module can composite various events by setting various condition from same target data. The diversity of data itself that is one of the uncertainties when an event is extracted is avoided by a condition. The input of the module is target data. It must be set in each data repository. The output of the module consists of extracted event set. It is possible to process unitedly by making various heterogeneous data resources an event. à Event projection module: An event projection module projects detected event depending on a context. We call a projected event a occurrence shown in section 1.2. The projection process corresponds to the interpretation of the event according to the context. For example, it assumes that an event detection module extracts a heavy rain event from article data and there are disaster ontology and climate changing ontology. When a context is disaster, heavy rain event will be projected in disaster ontology, and construct a new occurrence. The occurrence

28

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

Figure 2. The overview of an event-centric interconnection for heterogeneous contents repositories. This method consists of four modules—event detection module, event projection module, correlation analysis module, and codifier module.

à

à

will be semantically arranged close to swollen river, traffic damage, etc. When a context is climate changing, heavy rain event will be projected in climate changing ontology, and construct a new occurrence. The occurrence will be semantically arranged close to global warming. In these two case, an event projection module projects thematic metadata described in the heavy rain event to each ontology as a new occurrence. When a context is a spatiotemporal constraint, a new occurrence may be constructed as a shape that represents spatiotemporal region on 3D axis (latitude, longitude, and time) from heavy rain event. In this case, an event projection module projects spatiotemporal metadata described in the heavy rain event to 3D shape as a new occurrence. An event projection module can composite various occurrences by setting various contexts from same event. The occurrence consists of projected metadata with contexts. Correlation analysis module: A correlation analysis module interconnects occurrences depending on a viewpoint based on computing correlation. We call a set of interconnection between occurrences a scene shown in section 1.2. The interconnection of occurrences is different according to the standpoint, the background knowledge, etc. The viewpoint is a constraint for uniquely providing the interconnection of occurrences such as user's standpoint, background knowledge, etc. By the viewpoint, we can specify interconnection. Conversely, A correlation analysis module can composite various scenes by setting various viewpoints from same occurrences. This module can indirectly interconnect heterogeneous data by utilizing occurrences. Codifier module: A codifier module arranges and organizes scenes extracted from a correlation analysis module. The interconnection of heterogeneous data in

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

29

the three constraints – condition, context and viewpoint– can be found if tracing this data structure oppositely according to the three constraints. The process of event-centric interconnection of heterogeneous content repositories is described as follows: Step1. Detecting events from heterogeneous data An event detection module extracts an event from target data along an event class database. In the event class database, event models and their conditions are stored. This step produces semantic units that are unified data-type from various data as events. By this step, it is possible to process unitedly by making various heterogeneous data resources an event. Step2. Projecting events as occurrences An event projection module projects detected event along a occurrence class database. In the occurrence class database, occurrence models and their context are stored. This step produces projected events as occurrences. An event projection module can composite various occurrences by setting various contexts. An occurrence is an event interpreted by the context by projection. Therefore, for representing various interconnections, this step should produce various occurrences from a same event. Step3. Interconnecting occurrences as scenes A correlation analysis module interconnects occurrences depending on a viewpoint along a scene class database. In the scene class database is stored scene models and their viewpoints. This step produces interconnection set of occurrences as scenes. This step can composite various scenes by setting various viewpoints from same occurrences. This set can indirectly interconnect heterogeneous data represented in interconnection set of occurrences. Step4. Providing organized scenes as event-centric interrelationships between heterogeneous data A codifier module arranges and organizes scenes extracted from a correlation analysis module. When a user gives some queries representing a condition, a context and a viewpoint, this step provides appropriate scene set dynamically. By this process, a user obtains interconnection between heterogeneous data depending on three constraints for avoiding uncertainties. Figure 3 shows three important operations for representation of interrelationships between heterogeneous data. These are detection, projection and interconnection. Each operation has a constraint—condition, context, and viewpoint. On the viewpoint from target data, it is possible to expand various interconnections of target data by these constraints. Conversely, on the viewpoint from a user, it is possible to narrow interconnections candidate of target data by these constraints. The computation result by this process can represent relationships between heterogeneous data by utilizing scene data in RDF etc. With regard to each step, any method is acceptable. Please note that this process dynamically represents interrelationships between heterogeneous data depending on a condition, a context and a viewpoint. Conversely, by this process, we can find the approval constraints for the interrelationships (e.g. which data, which part of data, what standpoint to interpret data, and what standpoint to interrelate). This process dynamically represents various interconnections with the condition, context, and viewpoint. That is, it helps a user to obtain various appropriate data including data of heterogeneous data-type and

30

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

Figure 3. Three important operations for representation of interrelationships between heterogeneous data— detection, projection and interconnection— and the data structure.

Figure 4. Overview of an event and its condition. An event data extracted from target data depending on event model including condition. An event consists of a basic attribute (e.g. event label), feature attributes (e.g. date, place, keywords), and origin attributes (e.g. event type, source and condition).

heterogeneous fields depending on user's purpose and standpoint while user’s understanding.

3. Event—Detection Figure 4 shows an overview of event detection. An event is extracted from target data by an event model and its condition in event class database shown in Figure 2. An event consists of seven attributes as follows: event=<eventLabel, eventType, date, place, keywords, source, condition>, where eventLabel means the name of the event, eventType means the kind of the event and represents to which an event model to belong, date means temporal annotations, place means spatial annotations, keywords represents thematic annotations, source means URI of source data, and condition represents condition expression used for the event detection. Please note that not only each detected event but also each event model stored in event class database shown in Figure 2 has same seven attributes. These event models are used as basic patterns when the events are extracted. These attributes are roughly divided into the basic attribute (eventLabel) that represents basic information, the feature attributes (date, place, keywords) that represent the feature of the event and the origin attributes (eventType, source, condition) that represent how to extract themselves. That is, an event consists of two

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

31

Figure 5 Overview of occurrences and their contexts. An occurrence is projected event depending on a context.

types of attribute—feature attribute and origin attribute. The feature attributes are used for interconnecting target data that is represented by event. The origin attributes are used to navigate source data and represent as reason. Furthermore, each attribute is permitted to have two or more elements. The elements given to each attributes are roughly classified into two types—inheritance element and data dependence element. The inheritance element is an element decided depending on the event model. Both events extracted by using the same event model have the same elements. These elements are called inheritance elements because they are inherited from the model. That is, the inheritance element represents features of its event type. The data dependence element is extracted from target data itself. Elements of this type change depending on the target data even if both events are extracted from the same event model. That is, data dependence element represents features of itself. An event is detected from target data by using a condition in each event module; some elements of each attribute are inherited from event module; and some other elements of each attribute are extracted from the target data. By this process, it is possible to unitedly process various heterogeneous data resources by extracting minimum semantic units as event.

4. Occurrence—Projection Figure 5 shows an overview of projection of an event as occurrences. An occurrence is a projected event by occurrence models including its context in occurrence class database shown in Figure 2. The occurrence model represents how to project events in each context. An occurrence represents as follow: occurrence=, where occurrenceLabel means the name of the occurrence, occurrenceType means the kind of the occurrence and represents to which occurrence models to belong, eventSource means URI of target event data, context represents context expression used for the event projection as the occurrence, and an attrii’ represents projected feature attributes depending on a context.As with an event, a occurrence has three types of

32

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

Figure 6. Overview of a scene and its viewpoint. A scene is a record including an interrelationship between occurrences

attributes— a basic attribute (occurrenceLabel), feature attributes (attrii’), and origin attributes (occurrenceType, eventSource, context). Please note that feature attributes set of a occurrence foccurrence is changing depending on an occurrence model including a context Pcontext. foccurrence=(attri1’, attri2’,…, attrin’)= Pcontext (fevent), fevent=(attri1, attri2,…, attrim), where attrij is feature attribute of an event, attrii ’ is feature attribute set of a occurrence, and Pcontext. is an occurrence model with a context. That is, an occurrence model Pcontext. projects event feature attributes attrij to occurrence feature attributes attrii’. Various occurrences can be composited by setting various occurrence models with contexts from same event. Composing various occurrences by using various occurrence models depending on the context means various interpretations of an event are introduced. Therefore, for representing various interconnections, various occurrences should be produced from a same event. When this data structure applies to the system, you can uniquely clarify interpretation of an event by a context that represents user's standpoint, background knowledge, etc. We specify semantic of an event by a occurrence.

5. Scene —Interconnection Figure 6 shows an overview of a scene. A scene is a record including interrelationship of occurrences by a scene model including its viewpoint in scene class database shown in Figure 2. A scene represents as follow: Scene=<sceneLabel, scenType, interrelationship, viewpoint>, Interrelationship=, where sceneLabel means the name of the scene, sceneType means the kind of the scene and represents to which scene models to belong, interrelationship means an interrelationship of maters, and viewpoint represents viewpoint expression used for the occurrence interconnection as the scene. The interrelationship has two types of occurrences. It consists of fromOccurrenceURI that represents cause occurrences for relationship and toOccurrenceURI that represents effect occurrences for relationship.

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

33

Please note that not only each scene but also each scene model stored in scene class database shown in Figure 2 has same attribute sets. These scene models are used as basic patterns when the occurrences are interconnected by correlation analysis. Various scenes can be composited by setting various viewpoints from same occurrences. This data set can indirectly interconnect heterogeneous data represented in interconnection set of occurrences. When this data structure applies to the system, you can uniquely clarify interrelationships of a occurrence by a viewpoint that represents user's standpoint, background knowledge, etc. We specify interconnection of occurrences by a scene. This process dynamically represents interrelationships between heterogeneous data depending on a viewpoint. Conversely, we can find the approval viewpoints for the interrelationships.

6. Implementation Example—Application to the Space Weather Figure 7 shows an implementation for interconnection of heterogeneous contents repositories applying to space weather data as an example. Currently, we are coworking with the Space Environment Group of NICT. Space Environment Group of NICT is delivering sensor data of solar activities and space environment that is called space weather by RSS. One of the important problems is groping for effective use of the space weather data. One of the effective uses is to show how the event that these sensors represent influences our life of every day. For realizing it, we are developing an interconnection method for space weather sensor data and other data such as meteorological sensor data, general newspaper article, etc by using the three-layered architecture. It means this system bridges the gap between general facts such as events in our life of everyday and concepts in specific field such as space weather sensor data. In Figure 7, the system consists of event extraction modules, a correlation analysis management module, correlation analysis modules and codifier module. à Event extraction modules Each event extraction module detects events from each data such as news article data, meteorological sensor data that are AMeDAS (Automated Meteorological Data Acquisition System) data by Japan Meteorological Agency, Space weather sensor data, etc. These modules produce semantic units shown in section 3 that are unified data-type from various data as events. à Correlation analysis management module A correlation analysis management module has two operations. One is projection of each detected event data to correlation analysis modules as occurrences. An occurrence is an event interpreted by the context by projection. Another is organization of correlation analysis modules. In this system, various types of correlation analysis modules provide various scenes that represent interrelationships between occurrences (projected events). The correlation analysis management module should organize these data. That is, this module is input/output interfaces for correlation analysis modules. à Correlation analysis modules A correlation analysis module interconnects occurrences depending on a viewpoint. In this system, we are developing two types of correlation analysis modules—spatiotemporal correlation analysis module and semantic correlation analysis module. Spatiotemporal correlation analysis module

34

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

Figure 7. An implementation for interconnection of heterogeneous contests repositories applying to space weather data

A spatiotemporal correlation analysis module is an analysis module that specializes in the axis of time and spaces. This finds interrelationships of the projected events (occurrences) into which the region and time hour by hour change as phenomenon. We are developing this module based on a moving phenomenon model [5] Semantic correlation analysis module A semantic correlation analysis module is an analysis module that specializes in the semantics. This finds interrelationships of the projected events (occurrences) depending on viewpoint. We are developing this module based on this reference [4] The interrelation is extracted by mutual constraint between these analysis modules. à Codifier module A codifier module arranges and organizes scenes extracted from a correlation analysis management module as shown in section 2. When a user gives some queries representing a condition, a context and a viewpoint, this module provides appropriate scene set dynamically by RDF. By these modules, we can obtain interrelationships between heterogeneous data by bridging the gap between general facts and specific concepts. For example, in the case of the space weather, a sensor data that shows abnormality of Dst index, which is one of the sensor data on the space weather related to Geomagnetic storm event, and an news article on interruption of relay broadcast for XVI Olympic Winter Games are interrelated in the viewpoint of “watching TV” while they are individually published from different communities.

7. Related Works The relationships among concepts are predefined on the basis of a bridge concept. Schema mappings [6] and bridge ontologies [7] are typically used for the bridge concept. These methods are employed to predefine the universal relationships between two different domains; however, it is quite difficult to understand these relationships in

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

35

most cases. As a result, conventional approaches can be employed only on a small scale. The QOM [8] realizes semi-automatic alignment of different ontologies quickly. However, there is no concern about contexts. That is, it is purpose to create static whole ontologies. The feature of our method is dynamic extraction of event-centric interrelationships depending on the content of web feeds selected by a user. The essences of our purpose are to dynamically select, integrate and operate various appropriate data resources depending on a context for distributed environment. Therefore, our method is important and effective to realize interconnection of the distributed heterogeneous data repositories. Recently, linked data [9] that connects various resources at the instance level have attracted attention. Especially, the Linking Open Data community project [10] tries to connect various RDF data. The project enables us to use a large number of open interlinked datasets as structured data. Some works extracts structured data from Wikipedia such as DBpedia [11] and YAGO [12]. These works provide static interlinks for RDF data. In near future, these interlinks apply to not only data but also device, environment, resources, etc. In this sense, it is difficult to expand various interlinks without excluding three uncertainties shown in section 1.1 because of heterogeneities of data-type, content and utilization purpose. Our system realizes dynamic interconnection among heterogeneous data resources by event-driven and event-centric computing with resolvers for uncertainties existing among those resources. Therefore, Our architecture can solve these problems.

8. Conclusion In this paper, we presented a three-layered system architecture for computing dynamic associations of events in nature to related knowledge resources. The important feature of our system is to realize dynamic interconnection among heterogeneous data resources by event-driven and event-centric commuting with resolvers for uncertainties existing among those resources. This realizes interconnection indirectly and dynamically by semantic units for the data of various types such as text data, multimedia data, sensor data etc. In other words, it navigates various appropriate data including data of heterogeneous data-type and heterogeneous fields depending on user's purpose and standpoint. In our current global environment, it is important to transmit significant knowledge to actual users from various data resources. In fact, most events affect various aspects of other areas, fields and communities. This helps a user to obtain related information on heterogeneous data-type, contents and fields while providing a wide understanding of the relationships between them depending on user's standpoint. As our future study, we will extend the system to peer-to-peer environment. We will also formulate the evaluation indexes of represented concepts and contents. Furthermore, we will apply our method to various fields and communities.

References [1] Space Weather Information Center, NICT, http://swc.nict.go.jp/contents/. [2] National Space Weather Program Implementation Plan, 2nd Edition, FCM-P31-2000, Washington, DC, July 2000.Available in PDF at http://www.ofcm.gov/nswp-ip/tableofcontents.htm.

36

T. Nakanishi et al. / A Three-Layered Architecture for Event-Centric Interconnections

[3] K. Zettsu, T. Nakanishi, M. Iwazume, Y. Kidawara, Y. Kiyoki: Knowledge cluster systems for knowledge sharing, analysis and delivery among remote sites, Information Modelling and Knowledge Bases, vol. 19, pp. 282–289, 2008. [4] T. Nakanishi, K. Zettsu, K. Kidawara, Y. Kiyoki: A Context Dependent Dynamic Interconnection Method of Heterogeneous Knowledge Bases by Interrelation Management Function, In proceedings of the 19th European-Japanese Conference on Information Modelling and Knowledge Bases, Maribor, Slovenia, June, 2009. [5] K.-S. Kim, K. Zettsu, K. Kidawara, Y. Kiyoki: Moving Phenomenon: Aggregation and Analysis of Geotime-Tagged Contents on the Web, In proceedings of the 9 th international symposium on Web & Geographical Information Systems (W2GIS2009), pp.7-24, 2009. [6] R. J. Miller, L. M. Haas, M. A. Hernandez: Schema Mapping as Query Discovery, Proc. of the 26th International Conference on Very Large Data Bases (VLDB2000), pp. 77–88, 2000. [7] A. H. Doan, J. Madhavan, P. Domingos, A. Halevy: Learning to Map between Ontologies on the Semantic Web, Proc. of the 11th international conference on World Wide Web, pp. 662–673, 2002. [8] M. Ehrig, S.Staab: QOM–Quick Ontology Mapping, In Proc. of Third International Semantic Web Conference (ISWC 2004), pp. 683–697, Hiroshima, Japan (2004). [9] T. Berners-Lee, Linked Data, http://www.w3.org/DesignIssues/LinkedData.html, 2006. [10] Linking Open Data W3C SWEO Community Project, http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/. [11] S. Auer, C. Bizer, J. Lehmann, G. Kobilarov, R. Cyganiak, Z. Ives: DBpedia: A Nucleus for a Web of Open Data, In proceedings of the 6th International and 2nd Asian Semantic Web Conference (ISWC2007+ASWC2007), pp.715-728, 2007. [12] F.M. Suchanek, G. Kansneci, G Weikum: YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia, In proceedings of the 16th international conference on World Wide Web, pp.697-706, 2007.

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-37

37

Partial Updates in Complex-Value Databases Klaus-Dieter SCHEWE a,1 and Qing WANG b,2 a Software Competence Centre Hagenberg, Austria b University of Otago, Dunedin, New Zealand Abstract. Partial updates arise when a location bound to a complex value is updated in parallel. Compatibility of such partial updates to disjoint locations can be assured by applying applicative algebras. However, due to the arbitrary nesting of type constructors, locations of complex-value database are often deﬁned at multiple abstraction levels and thereby non-disjoint. Thus, applicative algebras is not as smooth as its simple deﬁnition suggests. In this paper, we investigate this problem in the context of complex-value databases, where partial updates arise naturally in database transformations. We show that a more efﬁcient solution can be obtained when generalising the notion of location and thus permitting dependencies between locations. On these grounds we develop a systematic approach to consistency checking for update sets that involve partial updates. Keywords. Abstract State Machine, partial update, complex value, applicative algebra, database transformation

1. Introduction According to Blass’s and Gurevich’s sequential and parallel ASM theses sequential3 and parallel algorithms are captured by sequential and general Abstract State Machines (ASMs), respectively [3,6] (see also [4]). A decisive characteristic of ASMs is that states are ﬁrst-order structures consisting of updatable (partial) functions. Thus, in each step a set of locations is updated to new values, where a location is deﬁned by an n-ary function symbol f in the (ﬁxed) state signature of the ASM, and n values a1 , . . . , an in the (ﬁxed) base set B of the structures deﬁning states. That is, in a state S the function symbol f is interpreted by a function fS : B n → B, and an update of f (a1 , . . . , an ) to a new value b ∈ B gives rise to fS (a1 , . . . , an ) = b in the successor state S . The progression from a state S to a successor state S is deﬁned by an update set Δ, i.e. a set of updates (, b) with a location and a new value b for this location, provided Δ is consistent, where consistency of an update set is deﬁned by the uniqueness of new values for all locations, i.e. whenever (, b), (, b ) ∈ Δ hold, we must have b = b . However, this requirement is too strict, if the base set B contains values that themselves 1 E-mail:

[email protected] [email protected] 3 In Gurevich’s seminal work “parallelism” actually means unbounded parallelism, whereas algorithms with an a priori given bound to parallelism in elementary computation steps are still considered to be sequential. 2 E-mail:

38

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

have a complex structure. For instance, if the values for a location are tuples (A1 : a1 , . . . , Ak : ak ), then updates to different attributes Ai and Aj can still be compatible. The same applies to lists, ﬁnite sets, counters, labelled ordered trees, etc., and is therefore of particular interest for database transformations over complex-value databases. It is therefore desirable to distinguish between total and partial updates. For the former ones consistency of an update set should remain unchanged, whereas for the latter ones we should strive to ﬁnd a way to guarantee compatibility and then merge partial updates to a location in an update set into a single total update on . The problem of partial updates in ASMs was ﬁrst observed by the research group on Foundations of Software Engineering at Microsoft Research during the development of the executable ASM speciﬁcation language AsmL [7,8]. This motivated Gurevich’s and Tillmann’s investigation on the problem of partial updates over data types counter, set and map [9]. An algebraic framework was established by deﬁning particles as unary operations over a datatype, and the parallel composition of particles as an abstraction of order-independent sequential composition. However, this fails to address partial updates over data types such as sequence as exempliﬁed in [10]. This limitation led to the proposal of applicative algebras as a general solution to the problem of partial updates [11]. It was shown that the problem of partial updates over sequences and labeled ordered trees could be solved in this algebraic framework, and the approach in [9] was a special kind of an applicative algebra. Deﬁnition 1.1 An applicative algebra consists of elements, which comprise a trivial element ⊥ and a non-empty set denoted by a client type τ , a monoid of total unary operations (called particles) over the elements including a null particle λ, and a parallel composition operation Ω, which assigns a particle ΩM to each ﬁnite multiset M of particles, such that the following two conditions (AA1) and (AA2) are satisﬁed: (AA1) f (⊥) = ⊥ for each particle f , and λ(x) = ⊥ for every element x. (AA2) Ω{{f }} = f , Ω(M {{id}}) = ΩM , and Ω(M {{λ}}) = λ. A multiset M of particles is called consistent iff ΩM = λ. When applying applicative algebras to the problem of partial updates each partial update (, b) has to be interpreted as a partical applied to the content of in state S (denoted by valS ()) and all these particles form a multiset M that is aggregated to ΩM such that valS () = ΩM (valS ()) holds, provided M is consistent. In this paper, we investigate the partial update problem in the context of complexvalue databases. In database transformations, bounded parallelism is intrinsic and complex data structures form the core of each data model. Thus, the problem of partial updates arises naturally. Several examples of partial update problems encountered in complex-value database are provided in Section 2. Furthermore, in Section 2, we discuss the reasons why using applicative algebras is not as smooth as the simple deﬁnition above suggests. One of important assumptions of applicative algebra is that locations of partial updates must be disjoint. However, it is common in data models to permit the arbitrary nesting of complex-value constructors. Consequently, we need particles for each position in a complex value, and each nested structure requires its own parallel composition operation. It means that we have to deal with the theoretical possibility of inﬁnitely many applicative algebras, which requires a

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

39

mechanism for the construction of such algebras out of algebras for parts of the type of every object in a complex-value database. This leads to the question of how to efﬁciently check consistency for sets of partial updates. In view of these problems we propose an alternative solution to the problem of partial updates. The preliminaries such as the deﬁnition of partial locations, partial updates, and different kinds of dependencies among partial locations are handled in Section 3. We relax the disjointness assumption on the notion of location in order to reﬂect a natural and ﬂexible computing environment for database computations. While in principle the prime locations bound to complex values are not independent from each other, we may consider each position within a complex value as a sublocation, which for simplicity of terminology we prefer to call also location. Then a partial update to a location is in fact a (partial) update to a sublocation. In doing so, we can transform the problems of consistency checking and parallel composition into two stages: normalisation of shared updates and integration of total updates, which are discussed in Section 4 and Section 5, correspondingly. The ﬁrst stage deals with compatibility of operators in shared updates and the second one deals with compatibility of clusters of exclusive updates. The work in this paper is part of our research on formal foundations of database transformations. Taking an approach analogous to the ASM thesis we demonstrated that all database transformations are captured by a variant of Abstract State Machines [13]. Decisive for this work is the exploitation of meta-ﬁnite states [5] in order to capture the intrinsic ﬁniteness of databases, the explicit use of background structures [2] to capture the requirements of data models, and the handling of genericity [1]. For XML database transformations the requirements for tree-based backgrounds were made explicit in [12], and a more convenient machine model called XML machines was developed permitting the use of monadic second-order logic. On these grounds we developed a logic to reason about database transformations [14].

2. Motivation We begin with modiﬁcations on tuples in a relation since tuples represent a common view for locations in the relational model. As will be revealed in the following example, parallel manipulations on distinct attributes of a tuple are prohibited if only tuples are permissible locations in a state. Example 2.1 Let S be a state containing a nested relation schema R = {A1 : {A11 : D11 , A12 : D22 }, A2 : D2 , A3 : D3 } and a nested relation I(R) over R as shown in Figure 1 where oi (i = 1, 3) are tuple identiﬁers in I(R) and oij (j = 1, 2) are tuple identifers in the relations in the attribute A1 of tuples oi . Suppose that the following two rules execute in parallel, modifying values of attributes A2 and A3 of the same tuple. forall x, y, z with R(x, y, z) ∧ y = b3 do par R(x, y, z) := f alse R(x, y, c2 ) := true par enddo

forall x, y, z with R(x, y, z) ∧ y = b3 do par R(x, y, z) := f alse R(x, b, z) := true par enddo

40

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

o1 o3

o11 o12 o31

A1 A11 {(a11 , (a11 , {(a31 ,

A12 a12 ), a12 )} a32 )}

A2

A3

b

c1

b3

c3

Figure 1. A relation I(R) in nested relational databases

The right rule changes the attribute value b3 in the second tuple to b, meanwhile the left rule changes the attribute value c3 in the same tuple to c2 . They yield pairs of updates {(R({(a31 , a32 )}, b, c3 ), true), (R({(a31 , a32 )}, b3 , c3 ), false)} and {(R({[a31 , a32 )}, b3 , c3 ), false), (R({(a31 , a32 )}, b3 , c2 ), true)}, respectively. Since the rules are running in parallel, we get a set of updates, i.e., {(R({(a31 , a32 )}, b, c3 ), true), (R({(a31 , a32 )}, b3 , c3 ), false), (R({(a31 , a32 )}, b3 , c2 ), true)}. However, applying such a set of updates results in replacing the tuple R({(a31 , a32 )}, b3 , c3 ) by two tuples R({(a31 , a32 )}, b, c3 ), R({(a31 , a32 )}, b3 , c2 ) rather than a single tuple R({(a31 , a32 )}, b, c2 ) as expected. A straightforward solution of solving this problem is to add a ﬁnite number of attribute functions as locations for accessing attributes of tuples. Thus, locations are extended to either an n-ary relational function symbol R with n arguments such as R(a1 , ..., an ), or a unary attribute function symbol with an argument in the form of fR.A1 ....Ak (o) for a relation name R, attributes A1 , . . . , Ak and an identiﬁer o. Note that, attribute functions cannot entirely replace relational functions. To delete a tuple from or add a tuple into a relation, we must still use relational functions. Attribute functions can only be used to modify the values of attributes, including NULL values. The following example illustrates how values of distinct attributes in the same tuple can be modiﬁed in parallel by using this approach. Example 2.2 Let us consider again the nested relation I(R) in Figure 1. Assume that there is a set of attribute functions with a one-to-one corresponding to the attributes in R, i.e., for each Ak ∈ {A1 , A1 .A11 , A1 .A12 , A2 , A3 }, there is a fR.Ak (x) = y for a tuple identiﬁer x in I(R) of a state S and a value y in the domain of Ak . Thus, we have the following locations and their interpretations for the second tuple of I(R). • • • • • • •

valS (fR.A1 (o3 )) = {(a31 , a32 )} valS (fR.A2 (o3 )) = b3 valS (fR.A3 (o3 )) = c3 valS (fR.A1 .A11 (o31 )) = a31 valS (fR.A1 .A12 (o31 )) = a32 valS (fR .A1 (o3 )(a31 , a32 )) = true valS (R({(a31 , a32 )}, b3 , c3 )) = true

Using this approach, the following rule is able to modify values of attributes A2 and A3 of the same tuple in parallel. forall x with R(x) ∧ fR.A2 (x) = b3 do par fR.A2 (x) := b

41

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

o1 o3

o11 o12 o31

A1 A11 {(a11 , (a11 , {(a31 ,

A12 a12 ), a12 )} a32 )}

A2

A3

[b11 , b12 ]

{{c1 , c1 }}

[b3 ]

{{c31 , c32 , c33 }}

Figure 2. A relation I(R ) in complex-value databases

fR.A3 (x) := c2 par enddo The nested relation is just one example of complex-value databases. Other complexvalue data models are possible by allowing arbitrary nesting of various type constructors over base domains. Next we propose the locations necessary for two other common type constructors: list and multiset. Following the terminology in [11], we call a position of a list as the number referring to an element of the list, and a place of a list to be before the ﬁrst element, between two adjacent elements or after the last element of the list, which both start from zero and counts from left-to-right. Let us take the list [b11 , b12 ] as an example. There are three positions in [b11 , b12 ], where b11 is in position 0 and b12 is in position 1. Moreover, the list [b11 , b12 ] has three places, where place 0 is just before b11 , place 1 is between b11 and b12 and place 2 is after b12 . Here, we prefer to consider that, for a ﬁnite list s with length n, the locations of s are in form of fs (k, k) for k = 0, . . . , n and fs (k, k + 1) for k = 0, . . . , n − 1. That is, a location fs (k, k) indicates an insertion point of the list s, while a location fs (k, k + 1) targets an element in the list. The symbol ↓ is used to indicate a deletion operation. For a multiset, we associate it with a pair (D, f ), where D is a domain of elements and f : D → N is a function from D to the set of natural numbers. Correspondingly, the locations referring to elements of a multiset M are expressed as unary functions in form of fM (x), and an update (fM (x), y) speciﬁes that there are y occurrences of the element x in M. If y is zero, then we say that the element x does not exist in M.

Example 2.3 Let us extend the relation I(R) in Figure 1 to a relation I(R ) with R = {A1 : {A11 : D11 , A12 : D22 }, A2 : N (D2 ), A3 : M(D3 )} in Figure 2 where N (D2 ) denotes the set of all ﬁnite lists over the domain D2 and M(D3 ) denotes the set of all ﬁnite multisets over the domain D3 . The attribute functions for attributes A2 and A3 thus need to be changed, e.g., • valS (fR .A2 (o3 )) = [b3 ] • valS (fR .A3 (o3 )) = {{c31 , c32 , c33 }} Therefore, for the attribute value [b3 ], we may have (fR .A2 (o3 )(0, 0), b31 ) to insert b31 before b3 , (fR .A2 (o3 )(0, 1), b3 ) to replace b3 with b3 or (fR .A2 (o3 )(0, 1), ↓) to delete b3 . For the attribute value {{c31 , c32 , c33 }}, we may have (fR .A3 (o3 )(c34 ), 2) to add a new element c34 with the number of occurrence 2, (fR .A3 (o3 )(c32 ), 0) to delete c32 from the multiset.

42

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

In addition, we may want to increase the number of occurrence of an element in a multiset by a number k on top of the original occurrence. In this case, we do not care about the original number of occurrence as long as the number of occurrence of the element has been increased by k. For this kind of modiﬁcation, it would be natural to associate an additional operator with an update so as to describe how the number of occurrence will be changed, for example, increase or decrease. The above approach of adding attribute functions works quite well in resolving the partial update problem on distinct attributes of a tuple or a subtuple. Nevertheless, the co-existence of locations R(a1 , ..., an ) for relational functions and fR.A1 ....Ak (o) for attribute functions give rise to new problems as illustrated by the following example. Example 2.4 Suppose that we have the following rule executing over I(R) in Figure 1. Then, the rule yields an update set containing two updates (fR.A2 (o3 ), b) and (R({(a31 , a32 )}, b3 , c3 ), false). By using the standard deﬁnition for a consistent update set, we know that this update set is consistent. However, they are actually conﬂicting each other: The update (fR.A2 (o3 ), b) intends to change the value of attribute A2 of the tuple with identiﬁer o3 to b, while the update (R({(a31 , a32 )}, b3 , c3 ), false) intends to delete this tuple. par forall x with R(x) ∧ fR.A2 (x) = b3 do fR.A2 (x) := b enddo forall x, y, z with R(x, y, z) ∧ y = b3 do R(x, y, z) := f alse enddo par Due to the arbitrary nesting of type constructors, locations of complex-value database are often deﬁned at multiple abstraction levels and thereby non-disjoint. In fact, allowing locations of different abstraction levels plays a vital role in supporting the requests of updating complex-values of a database at different granularity. This brings us to the question of how to utilise applicative algebra to solve the partial update problem in the setting of non-disjoint locations. One possible approach is to transform updates with nested locations into updates with nested modiﬁcations but disjoint locations and then apply applicative algebras as suggested in [11]. Because all sorts of particles used to modify the nested internal structure of an element have to be deﬁned on the outermost type of the element, this immediately leads to particles with complicated controls which encode the nested modiﬁcations. The second approach is to establish a mechanism for the nested construction of applicative algebras in accordance with complex data structures used in a data model. Let us take I(R ) in Figure 2 for example. Assume that Ai (i = 11, 12, 2, 3) are applicative algebras built upon the domains Di , and we use the notations set(A), tup(A), lis(A) and mul(A) to denote applicative algebras built upon an applicative algebras A for types set, tuple, list and multiset, respectively. Then, the following nested applicative algebra needs to be constructed for I(R ): set(tup(set(tup(A11 , A12 )), lis(A2 ), mul(A3 ))).

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

43

Clearly, this kind of construction is quite complicated. Furthermore, there are two issues to be considered: (1) How to properly reﬂect the consistency and integration of partial updates at a particular level in the consistency and integration of partial updates at higher levels? (2) Is there an efﬁcient algorithm that can handle the consistency and integration of a multiset of partial updates at different abstraction levels? In the rest of this paper, we will develop a customised and efﬁcient mechanism to handle these problems.

3. Preliminaries on Partial Updates We ﬁrst formalise the notion of partial location and then formally deﬁne partial updates. Deﬁnition 3.1 Let S be a state, f be an auxiliary dynamic function symbol of arity n in the state signature, and a1 , ..., an be elements in the base set of S. Then, f (a1 , ..., an ) is called a non-prime location. Deﬁnition 3.2 A location 1 subsumes a location 2 (notation: 2 1 ) if, for all states S, valS (1 ) uniquely determines valS (2 ). While, in principle, prime locations bounded to complex values are independent from each other, we may consider each position within a complex value as a non-prime location. We call a location 2 the sublocation of a location 1 iff 2 1 holds. A location is the sublocation of itself. A trivial location ⊥ is the sublocation of every location. Example 3.1 fR .A2 (o3 )(0, 0) and fR .A3 (o3 )(c32 ) discussed in Example 2.3 are non prime locations and also the sublocations of R ({(a31 , a32 )}, [b3 ], {{c31 , c32 , c33 }}). From a constructive point of view, a prime location may be considered as an algebraic structure in which its sublocations refer to parts of the structure. Since such a structure is always constructed by using type constructors like set, tuple, list, multiset, etc. from a speciﬁc data model, we only allow sublocations of a prime location, which either subsume or disjoint one another, to be partial locations by the following deﬁnition. This restriction is more a technicality so that we can focus on discussing the integration and consistency checking of partial updates. Extending to the general case would be straightforward after adding a decomposition procedure to eliminate sublocations that are overlapping but do not subsume one another. Let 1 , 2 , 3 be any prime or non-prime locations. Then 1 2 = 3 if 1 3 , 2 3 and there is no other ∈ L such that = 3 , 1 , 2 and 3 . Similarly, 1 2 = 3 if 3 1 , 3 2 and there is no other ∈ L such that = 3 , 1 , 2 and 3 . We say 1 2 = ⊥ if 1 and 2 are disjoint. Deﬁnition 3.3 Let S be a state. Then, the set of partial locations of S is the smallest set such that • each prime location is a partial location, and • if each prime location is an algebraic structure (L , , , , ⊥ ) satisfying the following conditions, then each sublocation of is a partial location.

44

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

0 1 2 3 11 111 112 21 22 23 31 32 33 ⊥

R ({(a31 , a32 )}, (b3 ), {{c31 , c32 , c33 }}) fR .A1 (o3 ) fR .A2 (o3 ) fR .A3 (o3 ) fR .A1 (o3 )(a31 , a32 ) fR .A1 .A11 (o31 ) fR .A1 .A12 (o31 ) fR .A2 (o3 )(0, 0) fR .A2 (o3 )(0, 1) fR .A2 (o3 )(1, 1) fR .A3 (o3 )(c31 ) fR .A3 (o3 )(c32 ) fR .A3 (o3 )(c33 )

Figure 3. An algebraic structure of a prime location

— (L , , ) is a lattice, consisting of a set L of all sublocations of , and two binary operations (i.e., join) and (i.e., meet) on L , — is the identity element for the join operation , — ⊥ is the identity element for the meet operation , and — for any 1 and 2 in L , one of the following conditions must be satisﬁed: (1) 1 2 = 1 , (2) 1 2 = 2 , or (3) 1 2 = ⊥ .

Example 3.2 Let us consider the prime location R ({(a31 , a32 )}, [b3 ], {{c31 , c32 , c33 }}) in the relation I(R ) of Figure 2. This prime location can be regarded as an algebraic structure in Figure 3, where the label i of a node in the picture at the left hand side corresponds to the index i of the sublocation i at the right hand side. As all conditions required in Deﬁnition 3.3 are satisﬁed, therefore, these sublocations are partial locations. In addition to the subsumption relation, one partial location may be dependent on another partial location, i.e., the dependence relation over partial locations of a state. Deﬁnition 3.4 A location 1 depends on a location 2 (notation: 2 1 ) if valS (2 ) = ⊥ implies valS (1 ) = ⊥ for all states S. The dependency relation is said to be strict on the location , if for all 1 , 2 , 3 ∈ L = { | }, we have that whenever 1 2 and 1 3 hold, then either 2 3 or 3 2 holds as well. B+-trees provide examples for non-strict dependency relations that are at the same time not induced by subsumption. However, such a dependency may also occur without nesting, the prominent examples being sequences and trees. Example 3.3 Consider the partial locations fR .A2 (o3 )(0, 0), fR .A2 (o3 )(0, 1) and fR .A2 (o3 )(1, 1) in the relation I(R ) of Figure 2. As fR .A2 (o3 )(k1 , k2 )fR .A2 (o3 )(k1 , k2 ) holds for k1 < k2 , the dependency relation is strict on fR .A2 (o3 ).

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

45

A partial location 2 that is subsumed by a partial location 1 certainly depends on it in the sense that if it is bound to a value other than ⊥ (representing undeﬁnedness), then also 1 cannot be bound to ⊥. So the following lemma is straightforward. Lemma 3.1 For two partial locations 1 , 2 with 2 1 we also have 1 2 . Proof. Let S be a state. As valS (1 ) uniquely determines valS (2 ), clearly valS (l1 ) = ⊥ implies valS (l2 ) = ⊥. That is, 2 depends on 1 , i.e. 1 2 . To formalise the deﬁnition of partial updates, we associate a type with each partial location = f (a1 , ..., an ), such that the type τ () of f (a1 , ..., an ) is the codomain of the function f : D1 × ... × Dn → D, i.e., τ () = D. Therefore, a type of partial locations can be a built-in type provided by database systems, such as String, Int, Date, etc., a complex-value type constructed by using type constructors in a data model, such as set, tuple, list and multiset constructors, or a customised type deﬁned by users, i.e., user-deﬁned types (UDTs) used in database applications.

Example 3.4 Reconsider the partial locations 1 , 2 and 3 of I(R ) in Figure 3. They have the following types: τ (1 ) = P(N T 2 (D11 , D12 )), τ (2 ) = N (D2 ) and τ (3 ) = M(D3 ), where P(D) denotes the set of all subsets over the domain D, and N T 2 (D1 , D2 ) denotes the set of all 2-ary tuples over the domains D1 and D2 . Instead of particles, we will formalise partial updates of a database transformation in terms of exclusive and shared updates. Deﬁnition 3.5 An exclusive update is a pair (, b) consisting of a location and a value b of the same type τ as . A shared update is a triple (, b, μ) consisting of a location of type τ , a value b of type τ and a binary operator μ : τ × τ → τ . For a state S and an update set Δ containing a single (exclusive or shared) update, we have valS+Δ ()

=

b if Δ = {(, b)} μ(valS (), b) if Δ = {(, b, μ)}

Although exclusive updates have the similar form to updates of ASMs deﬁned in a standard way, exclusive updates are allowed to have partial locations. It means that the locations of two exclusive updates may have a dependency relationship, whereas the locations of two standard updates of ASMs are assumed to be disjoint. Therefore, the notion of exclusive update generalises the notion of update in ASMs. Updates deﬁned in ASMs become exclusive updates to prime locations in our deﬁnition. In a shared update (, b, μ), the binary operator μ is used to specify how the value b partially affects the content of in a state. When multiple partial updates are generated to the same location simultaneously, a multiset of partial updates is obtained. For example, a location of type N may associate with a multiset of shared updates {{(, 10, +), (, 10, +), (, 5, −)}} (i.e., increase the content of by 10 twice and decrease the content of by 5 once). The use of a binary operator μ in shared updates helps us to separate the concerns relating to database instance and database schema. By this separa-

46

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

tion, the consistency checking of incompatible operators can be conducted at a database schema level, which will be further discussed in the next section. So this viewpoint is efﬁcient in practice, particularly for those database applications with large data-sets. We provide different update rules to generate exclusive and shared updates. Deﬁnition 3.6 Let t1 and t2 be terms of type τ , and μ be a binary operator over type τ , then the partial update rules take one of the following two forms • the rule for exclusive updates: t1 ⇔ t2 ; • the rule for shared updates: t1 ⇔μ t2 . Semantically, the partial update rules generate updates in a multiset. Let S be a ¨ 1 ⇔μ ¨ 1 ⇔ t2 , S, ζ) = {{(, b)}} and Δ(t state and ζ be a variable assignment, then Δ(t t2 , S, ζ) = {{(, b, μ)}}, where = t1 [a1 /x1 , . . . , an /xn ] for var(t1 ) = {x1 , . . . , xn } and ζ(xi ) = ai (i = 0, . . . , n), and valS,ζ (t2 ) = b. Remark 3.1 The addition of auxiliary functions as locations of a state requires a shifted view for partial updates in our deﬁnition. In contrast to an update (, b) deﬁned in standard ASMs, in which valS+{(,b)} () = b holds for every state S, the partial updates considered here do not satisfy such a condition.

Example 3.5 Consider a state S that has the relation I(R ) of Figure 2 and the partial updates (fR .A2 (o3 )(0, 0), d31 ) and (fR .A2 (o3 )(0, 1), d32 ). Applying these partial updates will change the value of attribute A2 at the tuple with identiﬁer o3 from [d3 ] in the state S to [d31 , d32 ] in the successor state S = S + {(fR .A2 (o3 )(0, 0), d31 ), (fR .A2 (o3 )(0, 1), d32 )}. However, valS (fR .A2 (o3 )(0, 0)) = d31 , and similarly, valS (fR .A2 (o3 )(0, 1)) = d32 . Instead, we have valS (fR .A2 (o3 )(0, 0)) = null and valS (fR .A2 (o3 )(0, 1)) = d31 . For simplicity, we will call partial location as location in the rest of this paper.

4. Normalisation of Shared Updates ¨ of partial updates is the process of merging all shared Normalisation of a multiset Δ ¨ is transformed into updates to the same location into a single exclusive update. Thus, Δ an update set Δ containing only exclusive updates. ¨ is in the normal form if each update in it is an Deﬁnition 4.1 An update multiset Δ exclusive update with multiplicity 1. ¨ and Opt(Δ) ¨ denote the set of locations and the set of As a convention, let Loc(Δ) ¨ ¨ denotes the submultiset operators occurring in an update multiset Δ, respectively, and Δ ¨ containing all shared updates that have the location . of an update multiset Δ

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

47

4.1. Operator-Compatibility The notion of operator-compatible addresses the inconsistencies arising from shared updates to the same location in an update multiset, no matter which abstraction level their locations reside at and whether they are dependent on other locations in the same update multiset. Example 4.1 Let Q∗ be the set of rational numbers excluding zero and R be the set of real numbers, then addition + and substraction − are operators over R, and multiplication × and division ÷ are operators over Q∗ , respectively. Suppose that is a location of type Q∗ , then the following modiﬁcations can be executed in parallel. par ⇔+ b1 ⇔− b2 ⇔× b3 ⇔÷ b4 par ¨ = {{(, b1 , +), (, b2 , −), (, b3 , ×), (, b4 , ÷)}} For this rule, the update multiset Δ is obtained. The operators in the submultisets {{(, b1 , +), (, b2 , −)}} and {{(, b3 , ×), ¨ is not compatible, (, b4 , ÷)}} are compatible. Nevertheless, the operators in Δ ¨ in different orders yields different results. because applying updates in Δ Many languages developed for database manipulations have set-theoretic operations, such as Structured Query Language (SQL), Relational Algebra (RA), etc. The partialupdate problem relating to set-theoretic operations is about the parallel manipulations on sets via various set-based operations. The following example illustrates that, after a main computation initializes a set of subcomputations, each of subcomputations may yield a set of values that are then unioned into the ﬁnal result in parallel. Example 4.2 Let P(D) be the powerset of the domain D, then the set-based operations: union ∪, intersection ∩, difference −, symmetric difference , etc. can be regarded as common operators over domain P(D). The following rule produces an operatorcompatible update multiset {{(, {b1 , b2 }, ∪), (, {b2 , b3 , b4 }, ∪)}}. par ⇔∪ {b1 , b2 } ⇔∪ {b2 , b3 , b4 } par These examples motivate a straightforward deﬁnition of operator-compatibility in terms of order-independent application of shared updates to the same location. ¨ = {{(, ai , μi ) | i = 1, ..., k}} be a multiset of shared upDeﬁnition 4.2 Let Δ ¨ is operator-compatible if for any two permudates on the same location . Then Δ tations (p1 , ..., pk ) and (q1 , ..., qk ), we have, for all x, μpk (...μp1 (x, ap1 )..., apk ) = ¨ is operator-compatible if Δ ¨ is μqk (...μq1 (x, aq1 )..., aqk ). An update multiset Δ ¨ operator-compatible for each ∈ Loc(Δ).

48

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

As illustrated in Example 4.1, the order-independence of operators is easy to check when the number of shared updates is small. However, in case of a large number of shared updates, compatibility checking by means of exploring all possible orderings is far too time-consuming. Therefore, we introduce an algebraic approach to characterize the operator-compatibility of shared updates to the same location. Deﬁnition 4.3 A binary operator μ1 (over the domain D) is compatible to the binary operator μ2 (notation: μ1 μ2 ) (over D) iff μ2 is associative and commutative and for all x ∈ D there is some x˙ ∈ D that for all y ∈ D we have y μ1 x = y μ2 x. ˙ Obviously, each associative and commutative operator μ is compatible to itself (i.e., self-compatible). The following Lemma gives a sufﬁcient condition for compatibility. Lemma 4.1 Let μ1 and μ2 be two binary operators over domain D such that (D, μ2 ) deﬁnes a commutative group, and (x μ1 y) μ2 y = x holds for all x, y ∈ D. Then μ1 μ2 holds. Proof. Let e ∈ D be the neutral element for μ2 and x˙ be the inverse of x. Then we get, y μ1 x = (y μ1 x) μ2 e = (y μ1 x) μ2 (x μ2 x) ˙ = ((y μ1 x) μ2 x) μ2 x˙ = y μ2 x. ˙

Example 4.3 Let us look back Example 4.1. Both (R, +) and (Q∗ , ×) are abelian groups, the duality property in Lemma 4.1 is satisﬁed by addition + and substraction − on R, and multiplication × and division ÷ on Q∗ , respectively. Thus, − + and ÷ ∗ hold on R and Q∗ , respectively. Similarly, set operations such as union ∪, intersection ∩, symmetric difference are self-compatible. Moreover, as x − y = x ∩ y¯ holds with the complement y¯ of the set y, set difference − is compatible to intersection ∩. Compatibility ı1 ı2 permits replacing each shared update (, v, ı1 ) by the shared update (, v, ˙ ı2 ). Then the associativity and commutativity of ı2 guarantees orderindependence. Thus, we obtain the following theorem. ¨ of shared updates on the same location is Theorem 4.1 A non-empty multiset Δ ¨ ¨ ) such that, operator-compatible if either |Δ | = 1 holds or there exists a μ ∈ Opt(Δ ¨ for all μ1 ∈ Opt(Δ ), μ1 μ holds. Proof. The ﬁrst case is trivial. In the second case, if μ1 μ holds, we can replace ¨ with μ1 by shared updates with μ. In doing so we obtain an all shared updates in Δ update multiset, in which only the self-compatible operator μ is used. The associativity and commutativity of μ implies (. . . ((x μ b1 ) μ b2 ) . . . μ bk ) = (. . . ((x μ bς(1) ) μ bς(2) ) . . . μ bς(k) ) for all x, b1 , . . . , bk and all permutations ς as desired.

¨ 1 ) = {+, −}, Δ ¨ 2 with Opt(Δ ¨ 2 ) = ¨ 1 with Opt(Δ Example 4.4 Suppose that we have Δ ¨ 3 with Opt(Δ ¨ 3 ) = {∩, −} and Δ ¨ 4 with Opt(Δ ¨ 4 ) = {∩, ∪}. {×, ÷}, Δ ¨ 1 , Δ ¨ 2 and Δ ¨ 3 are operator-compatible, and From Theorem 4.1, we obtain that Δ ¨ 4 is not operator-compatible. Δ

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

49

Remark 4.1 Theorem 4.1 allows checking of the operator-compatibility of shared updates to the same location by only utilising the schema information. This approach can ensure conformance to the genericity principle of database transformations, while considerably improving database performance. 4.2. Normalisation Algorithm ¨ we denote the normalisation of the update multiset Δ. ¨ FurBy the notation norm(Δ), thermore, let Δλ be a trivial update set, indicating that an update set is inconsistent. This comes into play when we do not have operator-compatibility. ¨ is conducted for each location apNormalisation of a given update multiset Δ ¨ ¨ ¨ is transformed into a set conpearing in Δ, i.e. we normalise Δ . In doing so, Δ ¨ taining exactly one exclusive update, provided Δ is operator-compatible. Otherwise ¨ ) = Δλ . The following algorithm describes the normalisation process in detail. norm(Δ Algorithm 4.1 ¨ and a state S Input: An update multiset Δ ¨ Output: An update set norm(Δ) Procedure: ¨ the set of locations Loc(Δ) ¨ appearing in Δ ¨ (i) By scanning through updates in Δ, ¨ and all exclusive is obtained, shared updates to each location are put into Δ updates a ¨ , the following steps are processed: (ii) For each Δ ¨ = {{(, b, μ)}}, then norm(Δ ¨ ) = {(, μ(valS (), b)}; (a) If Δ ¨ (b) otherwise, check Opt(Δ ): ¨ ) such that for all μ ∈ Opt(Δ ¨ ), μ μ holds, i. If there exists μ ∈ Opt(Δ then ¨ where μ = μ into the form • translate each update (, b, μ) ∈ Δ (, b , μ ) according to the results from Lemma 4.1; • assume that the update multiset after ﬁnishing the translation on ¨ is {{(, b , μ ), ..., (, b , μ )}}, Δ ¨ can be inteeach update in Δ 1 k ¨ ) = {(, b )}, where b = grated into the update set norm(Δ valS () μ b1 μ ... μ bk ¨ = Δλ and then exit the algorithm. ii. otherwise, norm(Δ) ¨ is obtained by norm(Δ) ¨ = ¨ ) ∪ Δ ¨ excl . (iii) norm(Δ) norm(Δ ¨ ∈Loc(Δ)

The following result is a direct consequence of the algorithm. ¨ its normalisation norm(Δ) ¨ is different from Corollary 4.1 For an update multiset Δ, ¨ Δλ iff Δ is operator-compatible. ¨ we can immediately draw the con¨ = Δλ for an update multiset Δ, If norm(Δ) ¨ is not consistent. Otherwise, we obtain an update set containing only exclusion that Δ ¨ = Δλ , clusive updates. In the following section we will therefore assume norm(Δ) and investigate further inconsistencies among exclusive updates in an update set after normalisation.

50

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

5. Integration of Exclusive Updates In this section, we will deal with the second stage of consistency checking starting from a normalised update set that only contain exclusive updates. Since exclusive updates may have partial locations, the deﬁnition for the consistency of an update set can not be directly taken from the standard deﬁnition of ASMs. Even if, for values b and b of any two exclusive updates to the same location in an update set Δ, we have b = b , Δ still might not be consistent. It is possible that inconsistencies arise from updates of distinct but non-disjoint locations, as illustrated in Example 2.4. Therefore, instead of consistent, we call an update set value-compatible if such a condition is satisﬁed. Deﬁnition 5.1 A set Δ of exclusive updates is value-compatible if, for each location in Δ, whenever (, b), (, b ) ∈ Δ holds, we have b = b . An update set that contains exclusive updates may be value-compatible but not consistent. On the other hand, following the standard deﬁnition for the consistency of an update set, we can have the following fact. Fact 5.1 Let Δ = {( 1 , v1 ), ..., (k , vk )} be an update set containing exclusive updates. If the condition i j is satisﬁed, then Δ is consistent. 1≤i=j≤k

Obviously, the condition

1≤i=j≤k

i j is sufﬁcient but not necessary. There are

cases in which a set of exclusive updates to non-disjoint locations is consistent.

Example 5.1 For the relation I(R ) in Example 2.3, suppose that we have Δ1 = {(fR .A2 (o3 )(0, 1), b31 ), (fR .A2 (o3 )(1, 1), b32 ), (fR .A2 (o3 ), (b31 , b32 ))} which mean to add b31 before the ﬁrst element of [b3 ], to replace the ﬁrst element of [b3 ] and to change [b3 ] with [b31 , b32 ]. As applying the updates (fR .A2 (o3 )(0, 1), b31 ) and (fR .A2 (o3 )(1, 1), b32 ) simultaneously over the relation I(R ) results in the update (fR .A2 (o3 ), (b31 , b32 )), which coincides with the third update in Δ1 , Δ1 is consistent. The above example demonstrates that, in order to check the consistency of exclusive updates that may have non-disjoint locations, we need to compose exclusive updates to locations deﬁned at the same abstraction level. 5.1. Parallel Composition We start with the parallel composition operations for updates, which have locations constructed by using common type constructors set, multiset, list and tuple. Set Set constructor has been widely used in various data modeling. Assume that we have a location representing a set in a state S, i.e., valS () = f , and locations referring to the elements a of the set f are expressed as f (a). For the set Δ of updates in which the locations refer only to the elements of the set f , if Δ is value-compatible, then the set of updates in Δ can be integrated into an update such that Ω(Δ) = (, b) where b = valS () ∪ {ai |bi = true ∧ (f (ai ), bi ) ∈ Δ} −{ai |bi = f alse ∧ (f (ai ), bi ) ∈ Δ}.

51

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

Multiset Multiset constructor is also known as bag constructor in data modeling. Assume that we have a location representing a multiset in a state S, i.e., valS () = M, and a location referring to an element a of the multiset M is expressed as f (a) as discussed in Section 2. Alternatively, a multiset M may be represented as a set of elements of the form (a, c) where a is the element of M and c its number of occurrence in the multiset M. A value compatible set Δ of updates in which the locations refer only to the elements of the multiset M can be integrated into an update such that Ω(Δ) = (, b) where

b = valS () − {(a, b )|(a, b ) ∈ valS () ∧ (f (a), b) ∈ Δ ∧ b = b } ∪{(a, b)|(f (a), b) ∈ Δ}. List List constructor provides the capability of modelling the order of elements when such an order is of interest. Consequently, the sublocations constructed by applying a list constructor are ordered, which we can capture by using a strict dependence relation among them as discussed in Section 3. Assume that we have a location representing a list f in a state S, and the locations referring to the parts of the list are expressed by f (k1 , k2 ) as discussed in Section 2. Then, a value-compatible set Δ = {(1 , b1 ), ..., (n , bn )} of updates, in which the locations i (i = 1, ..., n) refer to the elements of the list f can be integrated into an update such that Ω(Δ) = (, b) where b = valS+{(p1 ,bp1 )}+...+{(pn ,bpn )} () and pi pi+1 for a permutation p1 , . . . , pn of the updates in Δ and i = 1, . . . , n − 1. That is, b is the list obtained by applying Δ over the list f in the current state S in the order of ﬁrst taking the update whose location is being dependent by the locations of other updates. Tuple Tuple constructor can be treated in a similar way to list constructor, except that the order of applying updates in an update set can be arbitrarily chosen. Assume that the location representing a tuple in a state S. Then, a value-compatible set Δ = {(1 , b1 ), ..., (n , bn )} of updates, in which the locations only refer to the attribute values of the tuple represented by can be integrated into an update such that Ω(Δ) = (, b) where b = valS+Δ (). 5.2. Location-Based Partitions To efﬁciently handle dependencies between partial locations, we propose to partition a given update set containing only exclusive updates into a family of update sets. Each update set in such a family is called a cluster which has an update subsuming all other updates. The notation SubL() denotes the set of all the sublocations of a location . Lemma 5.1 LetLS denote the set of locations in a state S. Then there exists a unique partition LS = Li such that i∈I

• for all i, j ∈ I with i = j we have i j and j i for all i ∈ Li and j ∈ Lj , and • for each i ∈ I there exists a location i ∈ LS with Li = SubL(i ).

52

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

Proof. By taking connected components of the graph deﬁned by (LS , ) we can partition LS into Li (i ∈ I) satisfying the ﬁrst property. Moreover, none of the Li can be further decomposed while still satisfying the ﬁrst property and we cannot combine multiple partition classes such that the second property holds. Thus, this partition is unique. According to the deﬁnition of the subsumption relation , each SubL() is contained in one Li , and SubL(2) ⊆ SubL(1 ) holds for 2 1 . On the other hand, maximal elements with respect to deﬁne disjoint locations. Therefore, for a maximal element with respect to we must have SubL() = Li for some i ∈ I, which shows the second property.

Now let Δ be an update set containing exclusive updates. Using the partition of LS from Lemma 5.1 we obtain a partition Δ = Δi where Δi = {(, b) ∈ Δ | ∈ Li } i∈I

and I = {i ∈ I | Δi = ∅}. The following lemma is a direct consequence of the independence of locations in different set Li . Lemma 5.2 Δ is consistent iff each Δi for i ∈ I is consistent. As not all locations in Li appear in an update set Δ, we may further decompose each Δi for i ∈ I . For this let L(Δi ) ⊆ Li be the set of locations appearing in Δi . By taking connected components of the graph deﬁned by (L(Δi ), ) we can get partition L(Δi ) = j∈Ji Lij such that for all j1 , j2 ∈ Ji with j1 = j2 we have j1 j2 and j2 j1 for all j1 ∈ Lij1 and j2 ∈ Lij2 . As none of the Lij can be further decomposed, this partition is also unique. Taking Δij = {(, b) ∈ Δi | ∈ Lij } and omitting those of these update sets that are empty, we obtain a unique partition of Δi . Lemma 5.3 Δi is consistent for i ∈ I iff each Δij with j ∈ Ji is consistent. Proof. Consider the maximal elements i1 , ..., ik in L(Δi ) with respect to and the unique values vij (j = 1, ..., k) with (ij , vij ) ∈ Δi . Let S be a state with valS (i ) = vi . If Δij is consistent, then valS+{(ij ,vij )} () = valS+Δij () for all (, v) ∈ Δij . As the locations ij are pairwise disjoint, according to Fact 5.1 we may simultaneously apply all updates (ij , vij ) to vi to obtain a value vi , thus valS+{(i ,vi )} () = valS+{(i1 ,vi1 ),...,(ik ,vik )} () for all (, v) ∈ Δi . The converse, that Δi (i.e., the union of all Δij ) is not consistent if any Δij is not consistent, is obvious.

In the proof we actually showed more, as we only need “upward consistency” for the set of locations below the maximal elements ij . Corollary 5.1 For the maximal elements i1 , . . . , ik in L(Δi ) with respect to , let Δij = {(, v) ∈ Δi | ij }. Then Δi is consistent iff all Δij (j = 1, . . . , k) are consistent. Note that the update sets Δij in Corollary 5.1 are uniquely determined by Δ. There exist locations i and ij such that ij i and for all updates (, v) ∈ Δij we have ij . We call such an update set Δij a cluster below ij . With respect to the subsumption relation , locations in Li may be assigned with levels. Assume that the length of the longest downward path to a minimal element from the maximal element in Li is n. Then,

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

53

• the maximal element is a location at the level n, • the elements which are the children of a location at the level k are locations at the level k − 1. Thus, the maximal element i ∈ Li (as in Lemma 5.1) resides at the highest level, the minimal element in Li resides at the lowest level and other locations in Li are arranged at levels in the middle. A location at the level n is denoted as n . For a cluster Δij below ij , the level of ij is called the height of Δij and is denoted as height(Δij ).

Example 5.2 Let us consider again the prime location R ({(a31 , a32 )}, [b3 ], {{c31 , c32 , c33 }}) and its sublocations (see Example 3.2).

• Suppose that we have Δ = {(112 , a32 ), (22 , b31 ), (23 , b32 ), (2 , [b31 , b32 ])}. Because 22 2 and 23 2 , thus (22 , b31 ), (23 , b32 ) and (2 , [b31 , b32 ]) are partitioned into one cluster, while (112 , a32 ) is in another cluster by itself. • Suppose that we have Δ = {(112 , a32 ), (22 , b31 ), (23 , b32 ), (2 , [b31 , b32 ]), (0 , (∅, [b3 ], {{c31 , c32 }}))}. As 112 , 22 , 23 and 2 are all subsumed by the location 0 , they are all in one cluster. 5.3. Cluster-Compatibility In light of Corollary 5.1, the problem of consistency checking is reduced to that of verifying the consistency of clusters. Lemma 5.4 Let Δ be a cluster below the location . If the set {(n1 , b1 ), ..., (ni , bi )} of all updates in Δ at a level n < height(Δ ) is value-compatible, then, as discussed in Subsection 5.1, it is possible to deﬁne a set {(n+1 , b1 ), ..., (n+1 , bj )} of updates at the 1 j level n + 1 such that, for all states S and any location ∈ LS , we have valS+{(n1 ,b1 ),...,(ni,bi )} ( ) = valS+{(n+1 ,b ),...,(n+1,b )} ( ). 1

1

j

j

Proof. Since the level n is less than height(Δ ), the set {(n1 , b1 ), ..., (ni , bi )} of updates can be grouped based on the condition whether their locations are subsumed by the same location at the level n + 1, e.g., {(nk1 , bk1 ), ..., (nkp , bkp )} ⊆ {(n1 , b1 ), ..., (ni , bi )} is the group in which the locations nk1 ,..., nkp are subsumed by some location n+1 ∈ m n+1 n+1 {1 , . . . , j }. Then, for each group of updates with the locations at the level n, if they are value-compatible, then they can be integrated into an exclusive update that has a location at the level n + 1 as follows:

Ω{(nk1 , bk1 ), ..., (nkp , bkp )} = (n+1 m , bm ), where valS+{(nk ,bk1 ),...,(nk ,bkp )} ( ) = valS+{(n+1 ( ) for each state S and all m ,bm )} p 1 n n ∈ LS . In doing so, the set of updates {(1 , b1 ), ..., (i , bi )} deﬁnes a set of exclusive , b1 ), ..., (n+1 , bj )} in which the locations are one level higher than n if updates {(n+1 1 j it is value-compatible.

We ﬁnally obtain the following main result on the consistency of clusters.

54

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

Theorem 5.1 Let Δ be a cluster below the location . If Δ is “level-by-level” valuecompatible, then Δ is consistent. Proof. If Δ is “level-by-level” value-compatible, then for any state S and starting from updates on locations at the lowest level, exclusive updates on locations at the same level in Δ can be replaced by exclusive updates on one-level-higher locations as stated in Lemma 5.4. As the set of exclusive updates at each level is value-compatible, this procedure continues until we reach the highest level in Δ , i.e., the height of Δ . Finally, all the updates at the level Δ are combined into a single exclusive update (, b) if they

are value-compatible, i.e., valS+{(,b)} ( ) = valS+Δ ( ) for all ∈ LS . Example 5.3 Let us look back again the cluster below the location 0 in the sec ond case of Example 5.2. First, {(112 , a32 ) at level 0 can be integrated into update (11 , (a31 , a32 )) at level 1. Then (11 , (a31 , a32 )) at level 1 is integrated into update (1 , {(a31 , a32 )}) at level 2. Similarly, integrating (22 , b31 ) and (23 , b32 ) at level 2 results in update (2 , [b31 , b32 ]) at level 2, which is identical with the original up date to the location 2 in the cluster. As (1 , {(a31 , a32 )}) and (2 , [b31 , b32 ]) are also value-compatible, they can be integrated to check for consistency against (0 , (∅, [b3 ], {{c31 , c32 }})). Since the resulting update (0 , ({(a31 , a32 )}, [b31 , b32 ], {{c31 , c32 , c33 }})) at level 3 is not value-compatible with the update (0 , (∅, [b3 ], {{c31 , c32 }})) at level 3, thus this cluster above 0 is not consistent. 5.4. Integration Algorithm In this subsection, we present how to algorithmically integrate exclusive updates. For clarity, the procedure is given in terms of two algorithms. The ﬁrst algorithm clusters the updates in a given set of exclusive updates. Every update is initially assumed to deﬁne a cluster. We then successively consider each pair of updates where one update subsumed the other, and amalgamate their respective clusters into larger ones until no more changes can be made. Algorithm 5.1 Input: An update set Δ that only contain exclusive updates Output: A set clus(Δ) of clusters Procedure: (i) starting with P = ∅ and clus(Δ) = {{u}|u ∈ Δ}; (ii) checking the subsumption relation for any two updates ux , uy ∈ Δ, • if the locations of ux and uy are related by subsumption, then add {ux , uy } into P such that P = P ∪ {{ux , uy }}; (iii) doing the following as long as there are changes to clus(Δ): • for each element V in P , do the following — V = {x|x ∈ clus(Δ) and x ∩ V = ∅}, — clus(Δ)=clus(Δ) ∪ {V } − {x|x ⊆ V and x ∈ clus(Δ)}.

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

55

The second algorithm then take the set of clusters and transforms it into a set of exclusive updates in which locations are pairwise disjoint. This is done in accordance with Theorem 5.1, that is, through level-by-level integration provided the updates in each cluster at each level is value-compatible. Algorithm 5.2 Input: A set clus(Δ) of clusters Output: An update set Δ Procedure: (i) Δ = ∅; (ii) For each cluster Δi ∈ clus(Δ), apply the following steps: • Assigning a level to each location in Loc(Δi ) in accordance with the schema information provided by the database environment; • V = Δi ; • Doing the following until the height of the cluster Δi is reached: — P = {(, b)|(, b) ∈ V and the level level() of is minimal in V }; — partition updates in P such that, for each partition class {(1 , b1 ), ..., (n , bn )} ⊆ P , there exists a location with level() = i + 1 and i (i = 1, ..., n); — For each partition class {(1 , b1 ), ..., (n , bn )} ⊆ P , checking the valuecompatibility of the update set {(1 , b1 ), ..., (n , bn )}. (a) if it is value-compatible, then do the following: ∗ apply the parallel composition operation (, b) = Ω{(1 , b1 ), ..., (n , bn )}; ∗ V = V − P ∪ {(, b)}. (b) otherwise, Δ = Δλ and then exit the algorithm. • Δ=Δ∪V. (iii) Exit the algorithm with Δ.

6. Conclusion In this paper, we presented our research on the problem of partial updates in the context of complex-value databases. The work was motivated by the need for an efﬁcient approach for checking the consistency of partial updates, in which locations may refer to parts of a complex object. We proposed an efﬁcient approach for checking whether a given set of partial update is consistent. In the approach, partial updates are classiﬁed into exclusive and shared updates, and the consistency checking consists of two stages. The ﬁrst stage uses an algebraic approach to normalize shared updates based on the compatibility of operators among shared updates, while the second stage checks the compatibility of clusters by integrating exclusive updates level-by-level. In the future, we will continue to exploit the use of partial updates in optimising, rewriting and maintaining aggregate computations in database applications.

56

K.-D. Schewe and Q. Wang / Partial Updates in Complex-Value Databases

References S. Abiteboul and V. Vianu. Datalog extensions for database queries and updates. Journal of Computer and System Sciences, 43(1):62–124, 1991. [2] A. Blass and Y. Gurevich. Background, reserve, and Gandy machines. In Proceedings of the 14th Annual Conference of the EACSL on Computer Science Logic, pages 1–17, London, UK, 2000. Springer-Verlag. [3] A. Blass and Y. Gurevich. Abstract state machines capture parallel algorithms. ACM Transactions on Computational Logic, 4(4):578–651, October 2003. [4] E. B¨orger and R. F. St¨ark. Abstract State Machines: A Method for High-Level System Design and Analysis. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2003. [5] E. Gr¨adel and Y. Gurevich. Metaﬁnite model theory. Information and Computation, 140(1):26–81, 1998. [6] Y. Gurevich. Sequential abstract state machines capture sequential algorithms. ACM Transactions on Computational Logic, 1(1):77–111, July 2000. [7] Y. Gurevich, B. Rossman, and W. Schulte. Semantic essence of AsmL. Theoretical Computer Science, 343(3):370–412, 2005. [8] Y. Gurevich, W. Schulte, and M. Veanes. Rich sequential-time ASMs. In Formal Methods and Tools for Computer Science, pages 291–293, Canary Islands, Spain, 2001. Universidad de Las Palmas de Gran Canaria. [9] Y. Gurevich and N. Tillmann. Partial updates: Exploration. Journal of Universal Computer Science, 7(11):917–951, November 2001. [10] Y. Gurevich and N. Tillmann. Partial updates exploration II. In Abstract State Machines, 2003. [11] Y. Gurevich and N. Tillmann. Partial updates. Theoretical Computer Science, 336(2-3):311–342, 2005. [12] K.-D. Schewe and Q. Wang. XML database transformations, 2009. submitted for publication. [13] K.-D. Schewe and Q. Wang. A customised ASM thesis for database transformations. Acta Cybernetica, 19:765–805, 2010. [14] Q. Wang and K.-D. Schewe. Towards a logic for abstract metaﬁnite state machines. In S. Hartmann and G. Kern-Isberner, editors, Lecture Notes in Computer Science, volume 4932, pages 365–380. Springer, 2008. [1]

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-57

57

Inferencing in Database Semantics Roland HAUSSER Abteilung Computerlinguistik, Universität Erlangen-Nürnberg (CLUE) Bismarckstr. 6, 91054 Erlangen, Germany [email protected] Abstract. As a computational model of natural language communication, Database Semantics1 (DBS) includes a hearer mode and a speaker mode. For the content to be mapped into language expressions, the speaker mode requires an autonomous control. The control is driven by the overall task of maintaining the agent in a state of balance by connecting the interfaces for recognition with those for action. This paper proposes to realize the principle of balance by sequences of inferences which respond to a deviation from the agent’s balance (trigger situation) with a suitable blueprint for action (countermeasure). The control system is evaluated in terms of the agent’s relative success in comparison other agents and the absolute success in terms of survival, including the adaptation to new situations (learning). From a software engineering point of view, the central question of an autonomous control is how to structure the content in the agent’s memory so that the agent’s cognition can precisely select what is relevant and helpful to remedy a current imbalance in real time. Our solution is based on the content-addressable memory of a Word Bank, the data structure of proplets deﬁned as non-recursive feature structures, and the time-linear algorithm of Left-Associative grammar.

Introduction Designing an autonomous control as a software system requires a functional principle to drive it. Following earlier work such as [Bernard 1865] and [Wiener 1948], DBS control is based on the principle of balance, i.e., it is designed to maintain the agent in a steady state (equilibrium, homeostasis) relative to a continuously changing external and internal environment, short-, mid-, and long-term.2 In this way, changes of the environment are utilized as the main motor activating the agent’s cognitive operations. The balance principle guides behavior towards daily survival in the agent’s ecological niche. Behavior driven by instinct and by human desires not directly related to survival, such as power, love, belonging, freedom, and fun, may also be subsumed under the balance principle by treating them as part of the internal environment – like hunger. The agent’s balancing operations provide the foundation for a computational reconstruction of intention in DBS, just as the agent’s recognition and action procedures provide the foundation for a computational reconstruction of concepts and of meanings 1 For

an introduction to DBS see [NLC’06]. For a concise summery see [Hausser 2009a]. conceptually much different from previous and current approaches to autonomous control, our mechanism is closer in spirit to circular causal systems in ecology [Hutchinson 1948] than to the more recent systems of control with a stratiﬁed architecture structured into the levels of organization, coordination, and execution [Antsaklis and Passino 1993]. 2 Though

58

R. Hausser / Inferencing in Database Semantics

(cf. [AIJ’01]). This differs from [Grice 1965], who bases his notion of meaning on an elementary (undeﬁned, atomic) notion of intention – which is unsuitable for computation.3 An autonomous control maintaining a balance by relating recognition to the evaluated outcome of possible reactions is decentralized,4 in line with [Brooks 1985].

1. Inferences of Database Semantics Maintaining the agent in a state of balance is based on three kinds of DBS inference, called R(eactor), D(eductor), and E(ffector) inferences.5 R inferences are initiated by a trigger provided (i) by the agent’s current external or internal recognition or (ii) by currently activated memories (subactivation, cf. Sect. 6). D and E inferences, in contrast, are initiated by other already active inferences, resulting in chaining. As a ﬁrst, simple method of chaining, let us assume that the consequent of inference n must equal the antecedent of inference n+1. R(eactor) inferences provide a response to actual or potential deviations from the agent’s balance (cf. 1.1, 4.1, 12.1). A given trigger automatically initiates exactly those R inferences which contain the trigger concept, e.g., hot or hungry, in their antecedent. D(eductor) inferences establish semantic relations of content, and are illustrated by summarizing (cf. 3.2), downward traversal (cf. 10.1), and upward traversal (cf. 10.4). Other kinds of D inferences are precondition and cause and effect. Triggered initially by an R inference, a D inference may activate another D inference or an E inference. E(ffector) inferences provide blueprints for the agent’s action components.6 Because E inferences connect central cognition with peripheral cognition, their deﬁnition has to be hand-in-glove with the robotic hardware they are intended to control. The interaction of reactor, deductor, and effector inferences is illustrated by the following chain, using English rather than the formal data structure of proplets7 for simplicity: 1.1. C HAINING R, D, AND E INFERENCES 1. R: β is hungry cm β eats food. 2. D: β eats food pre β gets food. 3. D: β gets food ⇓ β gets α, where α {apple, pear, salad, steak}. 4. E: β gets α exec β locates α at γ. 5. E: β locates α at γ exec β takes α. 6. E: β takes α exec β eats α. 7. D: β eats α ⇑ β eats food. Step 1 is an R inference with the connective cm (for countermeasure) and triggered by a sensation of hunger. Step 2 is a D inference with the connective pre (for precondition), 3 Cf.

[FoCL’99], Sect. 4.5, Example II. behavior of social animals, e.g., ants in a colony, may also be described in terms of balance. 5 This terminology is intended to distinguish DBS inferences from the inferences of symbolic logic. For example, while a deductive inference like modus ponens is based on form, the deductor inferences of DBS take content into account. 6 In robotics, effectors range from legs and wheels to arms and ﬁngers. The E inferences of DBS should also include gaze control. 7 Proplets are deﬁned as non-recursive (ﬂat) feature structures and serve as the basic elements of propositions. Like the cell in biology, the proplet is a fundamental unit of structure, function, and organization in DBS. 4 The cooperative

R. Hausser / Inferencing in Database Semantics

59

while step 3 is the D inference for downward traversal with the connective ⇓ (cf. 10.1). Steps 4, 5, and 6 are E inferences with the connective exec (for execute). Step 4 may be tried iteratively for the instantiations of food provided by the consequent of step 3 (see the restriction on the variable α). If the agent cannot locate an apple, for example, it tries next to locate a pear, etc. Individual food preferences of the agent may be expressed by the order of the elements in the variable restriction. Step 7 is based on the D inference for upward traversal with the connective ⇑ (cf. 10.4). This step is called the completion of the chain because the consequent of the inference equals the consequent of step 1. The completion indicates the successful execution of the countermeasure to the imbalance indicated by the antecedent of the initial reactor inference.

2. Coreference-by-Address The implementation of DBS inferences depends on the DBS memory structure. Called Word Bank, it is content-addressable8 in that it does not require a separate index (inverted ﬁle) for the storage and retrieval of proplets. A content-addressable memory is especially suitable for ﬁxed content, i.e., content is written once and never changed. This provides a major speed advantage over the more widely used coordinate-addressable memory (as in a relational database) because internal access may be based on pointers enabling direct access to data. In DBS, the requirement of ﬁxed content is accommodated by adding content instead of revising it, and by connecting the new content to the old by means of pointers. Consider, for example, a cognitive agent observing at moment ti that Julia is sleeping and at tj that Julia is awake, referring to the same person. Instead of representing this change by revising the ﬁrst proposition into the second,9 the second proposition is added as new content, leaving the ﬁrst proposition unaltered: 2.1. C OREFERENTIAL COORDINATION IN A W ORD BANK STORING PROPLETS ...

noun: Julia fnc: sleep . . . prn: 675

...

... ...

member proplets noun: (Julia 675) fnc: wake prn: 702

...

...

verb: sleep arg: Julia ... prn: 675

verb: wake arg: (Julia 675) prn: 702

owner proplets

. . . core: Julia

. . . core: wake

. . . core: sleep

In a proplet, the part-of-speech attribute, e.g., noun or verb, is called the core attribute and its value is called the core value. A Word Bank stores proplets with equivalent core values in the same token line in the order of their arrival. The occurrence of Julia in the 8 See 9A

[Chisvin and Duckworth 1992] for an overview. more application-oriented example would be fuel level high at ti and fuel level low at tj .

60

R. Hausser / Inferencing in Database Semantics

second proposition is represented by a proplet with a core attribute containing an address value, i.e., [noun: (Julia 675)], instead of a regular core value, e.g., [noun: Julia]. Coreference-by-address enables a given proplet to code as many semantic relations to other proplets as needed. For example, the proplets representing Julia in 2.1 have the fnc value sleep in proposition 675, but wake in proposition 702. The most recent (and thus most up-to-date) content relating to the original proplet is found by searching the relevant token line from right to left, i.e., in the anti-temporal direction. Coreference-by-address combines with the semantic relations of functor-argument and coordination structure, as in the following example: 2.2. C OREFERENCE - BY- ADDRESS CONNECTING NEW TO OLD CONTENT verb: sleep 1 arg: Julia ↔ prn: 675

noun: Julia fnc: sleep prn: 675

2 ←

noun: (Julia 675) fnc: wake prn: 702

3 ↔

verb: wake arg: (Julia 675) prn: 702

The connections 1 and 3 are intrapropositional and based on the functor-argument relations between Julia and sleep, and Julia and wake, respectively. Connection 2 is extrapropositional and based on the coreference between the pointer proplet of proposition 702 and the original Julia proplet of proposition 675.10 One way to realize 2.2 in English would be Julia was asleep. Now she is awake. 3. Inference for Creating Summaries Coreference-by-address allows not only (i) to revise the ﬁxed information in a contentaddressable memory by extending it, as in 2.1, but also (ii) to derive new content from stored content by means of inferencing. One kind of DBS inference is condensing content into a meaningful summary. As an example, consider a short text, derived in detail in Chapts. 13 (hearer mode) and 14 (speaker mode) of [NLC’06]: The heavy old car hit a beautiful tree. The car had been speeding. A farmer gave the driver a lift.

A reasonable summary of this content would be car accident. This summary may be represented in the agent’s Word Bank as follows: 3.1. R ELATING SUMMARY TO TEXT member proplets ...

noun: accident mdr: (car 1) prn: 67

... ...

noun: car fnc: hit prn: 1

noun: (car 1) noun: (car 1) fnc: speed . . . mdd: accident prn: 2 prn: 67

owner proplets

. . . core: accident

. . . core: car

... 10 In its basic form, coreference-by-address is one-directional, from the pointer proplet to the original. The inverse direction may be handled by building an additional index. As usual, the proplets in 2.2 are order-free. During language production, an order is re-introduced by navigating from one proplet to the next.

61

R. Hausser / Inferencing in Database Semantics

⎡

⎤

verb: hit ⎢arg: car tree⎥ ⎢ ⎥ . . . ⎢nc: 2 speed ⎥. . . ⎣pc: ⎦ prn: 1 ⎡ ⎤ verb: speed ⎢arg: (car 1) ⎥ ⎢ ⎥ ... ⎢pc: 1 hit ⎥ ⎣nc: 3 give ⎦ prn: 2 ...

. . . core: hit

. . . core: speed

Propositions 1 and 2 are connected (i) by adjacency-based coordination coded in the nc (next conjunct) and pc (previous conjunct) attribute values of their verb proplets hit and speed, and (ii) by coreferential coordination based on the original car proplet in proposition 1 and the corresponding pointer proplet in proposition 2. The summary consists of another car pointer proplet and the accident proplet, each with the same prn value (here 67) and related to each other by the modiﬁer-modiﬁed relation. The connection between the summary and the original text is based on the address value (car 1), which serves as the core value of the rightmost car proplet as well as the mdr (modiﬁer) value of the accident proplet. The summary-creating inference deriving the new content with the prn value 67 is formally deﬁned as the following D(eductor) inference rule, shown with the sample input and output of 3.1 at the content level: 3.2. S UMMARY- CREATING D INFERENCE antecedent

rule level

consequent

noun: α verb: hit noun: β noun: (α K) noun: accident ⇒ mdd: accident fnc: hit arg: α β fnc: hit mdr: (α K) prn: K prn: K prn: K prn: K+M prn: K+M where α {car, truck, boat, ship, plane, ...} and β {tree, rock, wall, mountain, ...} ∪ α matching and binding

noun: car content fnc: hit level prn: 1 input

⎡

⎤

verb: hit ⎢arg: car tree⎥ noun: tree ⎥ ⎢ ⎢nc: 2 speed ⎥ fnc: hit ⎦ prn: 1 ⎣pc: prn: 1

noun: (car 1) mdd: accident prn: 67

noun: accident mdr: (car 1) prn: 67

output

The rule level shows two sets of pattern proplets, called the antecedent and the consequent, and connected by the operator ⇒. Pattern proplets are deﬁned as proplets with variables as values, while the proplets at the content level do not contain any variables. The consequent pattern uses the address (or pointer, cf. Sect. 2) value (α K) to relate to the antecedent and has the new prn value K+M, with M > 0. In the rule, the possible values which α and β may be bound to during matching are restricted by the co-domains of these variables: the restricted variable α generalizes the summary-creating inference to different kinds of accidents, e.g., car accident, truck accident, etc., while the restricted variable β limits the objects to be hit to trees, rocks, etc., as well as cars, trucks, etc. Any content represented by the proplet hit with a subject

62

R. Hausser / Inferencing in Database Semantics

and an object proplet satisfying the variable restrictions of α and β, respectively, will be automatically (i) summarized as an accident of a certain kind whereby (ii) the summary is related to the summarized by means of an address value, here (car 1), thus fulﬁlling the condition that the data in a content-addressable memory may not be modiﬁed. By summarizing content into shorter and shorter versions, there emerges a hierarchy which provides retrieval relations for upward or downward traversal (cf. Sect. 10). An upward traversal supplies more and more general notions, which may be used by the agent to access inferences deﬁned at the higher levels. A downward traversal supplies the agent with more and more concrete instantiations. 4. Horizontal and Vertical Aspects of Applying DBS Inferences DBS inferences are deﬁned as formal rules which are applied to content in the agent’s Word Bank by means of pattern matching. As a software operation, such an application may be divided into phases which happen to have horizontal and vertical aspects. The horizontal aspect concerns the relation between the antecedent and the consequent of an inference and the chaining of inferences. The vertical aspect concerns the relation between the rule level and the content level, within an inference and in a chain of inferences. Consider the formal deﬁnition of the ﬁrst inference in 1.1, applied to a suitable content: 4.1. F ORMAL DEFINITION OF THE hungry-eat R( EACTOR ) INFERENCE

rule level

antecedent consequent noun: β verb: hungry noun: (β K) cm fnc: eat fnc: hungry arg: β prn: K prn: K prn: K+M matching and binding

noun: Julia content fnc: hungry level prn: 211

verb: hungry arg: Julia prn: 211

noun: (Julia 211) fnc: eat prn: 220

verb: eat arg: (β K) food prn: K+M where 0 < M < θ

verb: eat arg: (Julia 211) food prn: 220

noun: food fnc: eat prn: K+M

noun: food fnc: eat prn: 220

The upper bound θ is intended to ensure that the content of the consequent closely follows the content of the antecedent. Furthermore, the inclusion of the antecedent’s subject in the consequent by means of the address value (β K) excludes cases in which one agent is hungry and another one eats food – which would fail as an effective countermeasure. The rule application starts with the vertical grounding of the antecedent in the trigger situation by matching and binding. Next there is the horizontal relation between the grounded antecedent and the consequent, which formalizes a countermeasure (cm) connected to the antecedent and its trigger situation. Finally, the patterns of the consequent vertically derive a new content as a (preliminary) blueprint for action which may horizontally activate another inference, as shown in 1.1. 5. Schema Derivation and Intersection The sets of connected pattern proplets constituting the antecedent and the consequent of an inference like 3.2 or 4.1 are each called a DBS schema. Schemata are used in

R. Hausser / Inferencing in Database Semantics

63

general for retrieving (visiting, activating) relevant content in a Word Bank. A schema is derived from a content, represented as a set of proplets, by simultaneously substituting all occurrences of a constant with a restricted variable. Consider the following example of a content: 5.1. P ROPLETS CODING THE CONTENT OF Julia knows John.

noun: Julia fnc: know prn: 625

verb: know arg: Julia John prn: 625

noun: John fnc: know prn: 625

This representation characterizes functor-argument structure in that the Julia and John proplets11 specify know as the value of their fnc attributes,12 and the know proplet speciﬁes Julia and John as the values of its arg attribute. The content may be turned into a schema by replacing its prn value 625 with the variable K, restricted to the positive integers. This schema will select all propositions in a Word Bank with a content equivalent to 5.1 The set of proplets matched by a schema is called its yield. The yield of a schema relative to a given Word Bank may be controlled precisely by two complementary methods. One is by the choice and number of constants in a content which are replaced by restricted variables. For example, the following schema results from replacing the constants Julia, John, and 625 in content 5.1 with the variables α, β, and K, respectively: 5.2. P OSSIBLE SCHEMA RESULTING FROM 5.1

noun: α fnc: know prn: K

verb: know arg: α β prn: K

noun: β fnc: know prn: K

The yield of this schema are all contents in which someone knows someone. However, if only John and 625 in content 5.1 are replaced by variables, the resulting schema has a smaller, more speciﬁc yield, namely all contents in which Julia knows someone. When a schema with several pattern proplets is used as a query, its yield is obtained by “intersecting” the token lines corresponding to the pattern proplets’ core values (provided the latter are constants). As an example, consider the schema for hot potato: 5.3. S CHEMA FOR hot potato

adj: hot mdd: potato prn: K

noun: potato mdr: hot prn: K

The functor-argument structure of this example (consisting of a modiﬁer and a modiﬁed) is a schema because the prn value is the variable K. Applying the schema to the corresponding token lines in the following example results in two intersections: 11 When 12 When

we refer to a proplet by its core value, we use Italic, e.g., John. we refer to an attribute or a value within a proplet, we use Helvetica, e.g., fnc or know.

64

R. Hausser / Inferencing in Database Semantics

5.4. I NTERSECTING TOKEN LINES FOR hot AND potato ... ...

adj: hot mdd: potato prn: 20

⎡

adj: hot mdd: water prn: 32

member proplets

adj: hot mdd: potato prn: 55

⎤⎡

⎤⎡

adj: hot mdd: day prn: 79

⎤⎡

owner proplets

core: hot

⎤

noun: potato noun: potato noun: potato noun: potato ⎥⎢fnc: eat ⎥ ⎢fnc: look_for⎥⎢fnc: cook ⎥ ⎢fnc: ﬁnd ... ⎣ ⎦⎣mdd: small ⎦ ⎦⎣mdr: big ⎦ ⎣mdr: hot mdr: hot prn: 20 prn: 35 prn: 55 prn: 88

core: potato

The intersections contain the proplets with the prn values 20 and 55. They are selected because the pattern proplets of schema 5.3 match only hot proplets with the mdd (modiﬁed) value potato and only potato proplets with the mdr (modiﬁer) value hot. The other method to control and adjust the yield of a schema is in terms of the restrictions on the variables. Restrictions may consist in an explicit enumeration of what a variable may be bound to (cf. 3.2). Restrictions may also be speciﬁed by constants, like vehicle or obstacle, which lexically provide similar sets as the enumeration method by using a thesaurus, an ontology, WordNet, or the like. The two methods of ﬁne-tuning a DBS schema result in practically13 perfect recall and precision. This is crucial for autonomous control because the effective activation of relevant data is essential for the artiﬁcial agent to make good decisions.

6. Subactivation (Selective Attention) In DBS, the selection of content by means of schemata is complemented by the equally powerful method of subactivation: the concepts provided by recognition and inferencing are used as a continuous stream of triggers which select corresponding data in the Word Bank. As an example, consider the following subactivation of a token line: 6.1. T RIGGER CONCEPT SUBACTIVATING A CORRESPONDING TOKEN LINE member proplets

adj: hot mdd: potato prn: 20

adj: hot mdd: water prn: 32

adj: hot mdd: potato prn: 55

owner proplet

trigger concept

adj: hot mdd: day . . . core: hot ⇐ hot prn: 79

Subactivation is an automatic mechanism of association,14 resulting in a mild form of selective attention. It works like a dragnet, pulled by the incoming concepts serving as triggers and accompanying them with corresponding experiences from the agent’s past. Intuitively, subactivation may be viewed as highlighting an area of content at half strength, setting it off against the rest of the Word Bank, but such that exceptional evaluations (cf. Sect. 8) are still visible as brighter spots. In this way, the agent will be alerted to potential threats or opportunities even in current situations which would otherwise seem innocuous – resulting in virtual triggers for suitable inferences. 13 Recall 14 Like

and precision are deﬁned in terms of subjective user satisfaction. Cf. [Salton 1989]. associating a certain place with a happy memory.

R. Hausser / Inferencing in Database Semantics

65

The primary subactivation 6.1 may be extended into a secondary and tertiary one by spreading activation15 [Quillian 1968]. For example, using the semantic relations coded by the left-most proplet in 6.1, the following proposition may be subactivated, based on the continuation and prn values potato 20, look_for 20, and John 20: 6.2. S ECONDARY SUBACTIVATION OF A PROPOSITION ⎤ ⎡ ⎡ ⎤ verb: look_for noun: potato ⎢arg: John, potato⎥ adj: hot noun: John ⎥ ⎢fnc: look_for⎥ ⎢ fnc: look_for ⎢pc: cook 19 ⎥ ⎣mdr: hot ⎦ mdd: potato ⎦ ⎣nc: eat 21 prn: 20 prn: 20 prn: 20

prn: 20

While a secondary subactivation utilizes the intrapropositional relations of functorargument and coordination structure (cf. [NLC’06], Chapts. 6 and 8), a tertiary subactivation is based on the corresponding extrapropositional relations (cf. [NLC’06], Chapts. 7 and 9). For example, using the pc (previous conjunct) and nc (next conjunct) values of the look_for proplet in 6.2, the tertiary subactivation may spread from John looked for a hot potato to the predecessor and successor propositions with the verb values cook and eat, and the prn values 19 and 21, respectively.

7. Semantic Relations Subactivation may spread along any semantic relations between proplets. By coding the semantic relations inside and between propositions solely as proplet-internal values, proplets become order-free and are therefore suitable for efﬁcient storage and retrieval in the content-addressable memory of a Word Bank. Subactivation is made especially efﬁcient by coding the semantic relations as pointers (cf. Sect. 2). In DBS, the semantic relations are of two kinds, (i) form and (ii) content. The semantic relations of form are functor-argument and coordination structure, intra- and extrapropositionally; they are established during recognition and are utilized in the encoding of blueprints for action. In natural language communication, for example, the semantic relations of grammatical form are established in the hearer mode (recognition) and encoded in the speaker mode (action). The semantic relations of content are exempliﬁed by cause and effect, precondition, the semantic hierarchies, etc. Content relations have been used to deﬁne associative (or semantic) networks (cf. [Brachman 1979] for an overview). In DBS, semantic relations of content are established by inferences. The topic of semantic relations in general and of content relations in particular is widely discussed in linguistics, psychology, and philosophy. Content relations in lexicography, for example, are classiﬁed in terms of synonymy, antonymy, hypernymy, hyponymy, meronymy, and holonymy. In philosophy, content relations are viewed from a different perspective, described by [Wiener 1948], p. 133, as follows: According to Locke, this [i.e., the subactivation of ideas, R.H.] occurs according to three principles: the principle of contiguity, the principle of similarity, and the principle of cause and 15 In ﬁction, our notion of triggering a spreading subactivation is illustrated by the madeleine experience of [Proust 1913], which brings back an almost forgotten area of what he calls "l’édiﬁce immense du souvenir."

66

R. Hausser / Inferencing in Database Semantics

effect. The third of these is reduced by Locke, and even more deﬁnitely by Hume, to nothing but constant concomitance, and so is subsumed under the ﬁrst, contiguity.

Formal examples of semantic relations of content in DBS are the summary inference 3.2, the hungry-eat inference 4.1, and the hierarchy inferences for downward traversal 10.1 and for upward traversal 10.4. DBS inferences serve not only to maintain the agent’s balance, but also code a kind of knowledge which is different from a content like 5.1. 8. Evaluation of Content If a cognitive agent were to value all subactivated contents the same, they would provide little guidance towards successful behavior – neither absolute in terms of the agent’s survival nor relative in comparison to other agents. Even the path of daily routine, of least resistance, or of following some majority is ultimately the result of choices based on evaluation. As a general notion, content evaluation has been investigated in philosophy, linguistics, psychology, and neurology. In today’s natural language processing, it has reappeared as the sentiment detection of data mining [Turney 2002]. In modern psychology, evaluation is analyzed in emotion theory [Arnold 1993] and in appraisal theory [Lazarus and Lazarus 1994]. For a software model of control, evaluations are not so much a question of how they are expressed or which of them are universal,16 but how they are assigned internally by individual agents. In DBS, evaluations are assigned when new content is read into the agent’s Word Bank – by recognition or by inference. At their lowest level, recognition-based evaluations must be integrated into the agent’s hardware (else they would be ﬁgments of imagination). For example, hot and cold require a sensor for temperature. Evaluations have been classiﬁed in terms of joy, sadness, fear, or anger, and are expressed in terms of good vs. bad, true vs. false, excellent vs. poor, virtuous vs. depraved, brave vs. cowardly, generous vs. cheap, loyal vs. treacherous, desirable vs. undesirable, acceptable vs. unacceptable, etc. For guiding the autonomous control of a cognitive agent, DBS uses the features [eval: attract] and [eval: avoid]. They are of a more basic and more neutral nature, and ﬁt into the data structure of proplets. Their values may be scalar and may be set between neutral (0) and the extremes asymptotically approaching -1 or +1. The overall purpose of DBS evaluation is to record (i) any actual deviation from the agent’s state of balance, (ii) any impending threat to the agent’s balance, and (iii) any possibility to secure positive aspects of maintaining the agent’s balance mid- and longterm. Each is used as a trigger for selecting an inference which provides an appropriate reaction. For example, if it is too hot (evaluation-based trigger), go to where it is cooler (inference-based reaction). 9. Adaptation and Learning The mechanism of deriving and adjusting DBS schemata (cf. Sect. 5) holds at a level of abstraction which applies to natural and artiﬁcial agents alike. Because of the simplicity 16 Cf.

[Darwin 1872], Chapt. XIV, pp. 351–360.

R. Hausser / Inferencing in Database Semantics

67

of this mechanism, artiﬁcial agents may be designed like natural agents in that they adjust automatically over time. Thereby, the following differences between natural and artiﬁcial agents do not stand in the way: In natural agents, adjusting to a changing environment as well as optimizing come in two varieties, (i) the biological adaptation of a species in which physical abilities and cognition are co-evolved, and (ii) the learning of individuals which is mostly limited to cognition. Adaptation and learning differ also in that they apply to different ranges of time and different media of storage (gene memory vs. brain memory). In artiﬁcial agents, in contrast, improvement of the hardware is the work of engineers, while development of an automatically adjusting cognition is the work of software designers. Because of this division between hardware and software, the automatic adjustment of artiﬁcial agents corresponds more to learning than to adaptation. Fortunately, the absence of natural inheritance in artiﬁcial agents may be easily compensated by copying the cognition software (including the artiﬁcial agent’s experiences and adaptations) from the current hardware model to the next. The DBS mechanism underlying adaptation as well as learning is based on (i) deriving schemata from sets of content proplets17 by replacing constants with variables and on (ii) adjusting the restrictions of the variables (cf. Sect. 5). This mechanism may be automated based on the frequency of partially overlapping contents: 9.1. A SET OF CONTENTS WITH PARTIAL OVERLAP Julia eats an apple Julia eats a pear Julia eats a salad Julia eats a steak For simplicity, the propositions are presented in English rather than by corresponding sets of proplets. Because of their partial overlap, the propositions may be automatically summarized as the following schema: 9.2. S UMMARIZING THE SET 9.1 WITH A SCHEMA Julia eats α, where α {apple, pear, salad, steak} Due to the restriction on the variable α, 9.2 is strictly equivalent to 9.1. The next step is to replace α by a concept serving as a hypernym, here food: 9.3. R EPLACING THE RESTRICTED VARIABLE BY A HYPERNYM Julia eats food, where food {apple, pear, salad, steak} This concept may serve as the literal meaning of the word food in English, aliment in French, Nahrung in German, etc. (cf. [Hausser 2009b]). Implicit in the content of 9.3 is the following semantic hierarchy: 17 Content proplets consist of context proplets and language proplets (cf. [NLC’06], Sect. 3.2). Language proplets consist of unconnected lexical proplets (e.g., [NLC’06], 5.6.1) and the connected proplets of languagebased propositions (e.g., [NLC’06], 3.2.4).

68

R. Hausser / Inferencing in Database Semantics

9.4. R EPRESENTING THE SEMANTIC HIERARCHY IMPLICIT IN 9.3 AS A TREE food

apple

pear

salad

steak

The automatic derivation of a semantic hierarchy illustrated in 9.1 – 9.3 is empirically adequate if the resulting class containing the instantiations corresponds to that of the surrounding humans. For example, if the artiﬁcial agent observes humans to habitually (frequency) eat müsli, the restriction list of α must be adjusted correspondingly.18 Furthermore, the language surface chosen by the artiﬁcial agent for the hypernym concept (cf. 9.3) must correspond to that of the natural language in use. 10. Hierarchy Inferences An agent looking for food must know that food is instantiated by apples, pairs, salad, or steaks, just as an agent recognizing an apple must know that it can be used as food. In DBS, this knowledge is implemented in terms of inferences for the downward and the upward traversal of semantic hierarchies like 9.4. For example, if Julia is looking for food, the following downward inference will derive the new content that Julia is looking for an apple, a pear, a salad, or a steak: 10.1. H IERARCHY- INFERENCE FOR DOWNWARD TRAVERSAL antecedent

rule level

noun: Julia content level fnc: look_for prn: 18

consequent

noun: food noun: α ⇓ fnc: β fnc: (β K) prn: K prn: K+M where α {apple, pear, salad, steak} matching and binding

verb: look_for arg: Julia food prn: 18

noun: food fnc: look_for prn: 18

noun: α verb: (look_for 18) prn: 25

The antecedent consists of a single pattern proplet with the core value food. When this pattern matches a corresponding proplet at the content level, the consequent derives a new content containing the following disjunction19 of several proplets with core values corresponding to the elements of the restriction set of α: 10.2. O UTPUT DISJUNCTION OF THE DOWNWARD INFERENCE APPLICATION 12.1 ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ noun: apple or

noun: pear

noun: salad

noun: steak

prn: 25

prn: 25

prn: 25

prn: 25

⎢fnc: (look_for 18)⎥ ⎢pc: apple ⎥ ⎢pc: pear ⎥ ⎢pc: salad ⎥ ⎣nc: pear ⎦ ⎣nc: salad ⎦ ⎣nc: steak ⎦ ⎣nc: ⎦ 18 This method resembles the establishment of inductive inferences in logic, though based on individual agents. 19 See [NLC’06], Chapt. 8, for a detailed discussion of intrapropositional coordination such as conjunction and disjunction.

69

R. Hausser / Inferencing in Database Semantics

The proplets of the output disjunction are concatenated by the pc (for previous conjunct) and nc (for next conjunct) features, and have the new prn value 25. They are related to the original proposition by the pointer address (look_for 18) serving as the fnc value of the ﬁrst disjunct. The output disjunction may be completed automatically into the new proposition Julia looks_for apple or pear or salad or steak, represented as follows: 10.3. P ROPOSITION RESULTING FROM APPLYING DOWNWARD INFERENCE 12.1

noun: (Julia 18) fnc: (look_for 18) prn: 25

⎡

⎤ ⎡

verb: (look_for 18) arg: (Julia 18) apple or prn: 25

⎤⎡

⎡

⎤

noun: pear noun: apple or ⎢fnc: (look_for 18)⎥ ⎢pc: apple ⎥ ⎦ ⎣nc: salad ⎦ ⎣nc: pear prn: 25 prn: 25

⎤

noun: salad noun: steak ⎢pc: pear ⎥ ⎢pc: salad ⎥ ⎦ ⎣nc: steak ⎦ ⎣nc: prn: 25 prn: 25

This new proposition with the prn value 25 is derived from the given proposition with the prn value 18 shown at the content level of 10.1, and related to it by pointer values. The inverse of downward traversal is the upward traversal of a semantic hierarchy. An upward inference assigns a hypernym like food to concepts like salad or steak. Consider the following deﬁnition with an associated sample input and output at the content level: 10.4. H IERARCHY- INFERENCE FOR UPWARD TRAVERSAL antecedent rule level

α {apple, pear, salad, steak}

&

consequent

noun: α fnc: β prn: K

⇑

noun: food fnc: (β k) prn: K+M

matching and binding

content level

noun: Julia fnc: prepare prn: 23

verb: prepare arg: Julia salad prn: 23

noun: salad fnc: prepare prn: 23

noun: food fnc: (prepare 23) prn: 29

Like the downward inference 10.1, the antecedent of the upward inference consists of a single pattern proplet with the restricted variable α as the core value. Due to the use of a pointer address as the fnc value of the output (required anyway by the contentaddressable memory of DBS), there is sufﬁcient information to complete the output proplets into the proposition Julia prepares food, with the prn value 29 and pointer proplets for Julia and prepare. The limited matching used by the upward and downward inferences has the advantage of generality. The automatic derivation and restriction of schemata (cf. Sect. 9) directly controls the automatic adaptation of the hierarchy inferences. They illustrate how DBS is intended to fulﬁll the three functions which deﬁne an autonomic system: “automatically conﬁgure itself in an environment, optimize its performance using the environment and mechanisms for performance, and continually adapt to improve performance and heal itself in a changing environment” [Naphade and Smith 2009].

70

R. Hausser / Inferencing in Database Semantics

11. Analogical Models as Blueprints for Action To obtain a suitable blueprint for an action, the agent may assemble reactor, deductor, and effector inferences creatively into a new chain – which may or may not turn out to be successful. Most of the time, however, it will be easier and safer for the agent to re-use an earlier action sequence, successfully self-performed or observed in others, provided such an analogical model is available in the agent’s memory. These earlier models are contained at various levels of detail in the contents subactivated by the initial R inference. The R inference deﬁned in 4.1, for example, subactivates all contents matching the β is hungry schema (antecedent), the β eats food schema (consequent), as well the token lines of the inference’s constants, here hungry, eat, and food. By spreading to secondary and tertiary subactivations (cf. Sect. 6), the initial R inference may subactivate a large set of contents in the agent’s Word Bank. These serve to illustrate the trigger situation with a cloud of subactivations (cf. [NLC’06], Sect. 5.6), but their precision is too low as to provide a speciﬁc blueprint for practical, goal-directed action. In order for a content stored in memory to be useful for resolving the agent’s current challenge, it must (i) ﬁt the trigger situation as precisely as possible and (ii) have a positively evaluated outcome. For this, our method of choice is DBS intersection (cf. Sect. 5). Assume that the agent is alone in Mary’s house – which serves as a trigger (cf.6.1) subactivating the token line of Mary in the agent’s Word Bank. Furthermore, the agent is hungry, which triggers the hungry-eat inference 4.1. The constant eat in the consequent subactivates the corresponding token line, resulting in intersections between the Mary and eat token lines such as the following: 11.1. E XAMPLE OF TWO Mary eat INTERSECTIONS

noun: (Mary 25) fnc: eat prn: 49

⎡

⎤

verb: eat ⎢arg: (Mary 25) apple⎥ ⎦ ⎣pc: take 48 prn: 49

noun: (Mary 25) fnc: eat prn: 82

⎡

⎤

verb: eat ⎢arg: (Mary 25) müsli⎥ ⎦ ⎣pc: take 81 prn: 82

In other words, the agent remembers Mary once eating an apple and once eating müsli. The two proplets in each intersection share a prn value, namely 49 and 82, respectively, and are in a grammatical relation, namely functor-argument structure. In both intersections, the verb proplet eat provides two continuations. For example, the verb of the ﬁrst intersection provides the continuation values apple and take 48, which may result in the following secondary and tertiary subactivations (cf. Sect. 6). 11.2. S UBACTIVATION SPREADING FROM Mary eat TO Mary take apple. ⎡

⎤⎡

⎤

verb: eat noun: apple noun: (Mary 25) ⎥ ⎢arg: (Mary 25) apple⎥ ⎢fnc: eat fnc: eat ⎦ ⎣eval: attract ⎦ ⎣pc: take 48 prn: 49 prn: 49 prn: 49 ⎤ ⎡ verb: take noun: (Mary 25) ⎢arg: (Mary 25) apple⎥ noun: apple ⎥ ⎢ fnc: take ⎥ fnc: take ⎢nc: eat 49 ⎦ prn: 48 ⎣pc: locate 47 prn: 48 prn: 48

R. Hausser / Inferencing in Database Semantics

71

The anti-temporal order corresponds to the spreading direction of the subactivation. The apple 49 proplet (secondary subactivation) contains the eval attribute with the value attract. Assuming that the corresponding subactivation for the second intersection happens to evaluate the müsli 82 proplet as eval: avoid20 (not shown), the agent would pursue only the tertiary subactivation from the ﬁrst (and not the second) intersection in 11.1 as a possible candidate for an analogical model for regaining balance. To get at the information relevant for ﬁnding something to eat in Mary’s house, the subactivation 11.2 may spread further, based on the pc (for previous conjunct) value locate 47 of the take 48 proplet. In this way, the subactivation of the earlier eating event may be completed into the following backward sequence of propositions: 11.3. S UBACTIVATED SEQUENCE OF PROPOSITIONS ( ANTI - TEMPORAL ORDER ) Mary eat apple [prn: 49]. Mary take apple [prn: 48]. Mary locate apple in blue cupboard [prn: 47]. The information relevant for the hungry agent is the location from where Mary got the apple, i.e., the blue cupboard. If the anti-temporal order is reversed, the propositions in 11.3 will match the antecedent of step 5 in Example 1.1 all the way to the consequent of step 7. This completes the chain relative to the consequent of the initial R inference 4.1 at the level of content, obviating steps 1–4 and thus without any assertion that Mary was hungry when she ate the apple.21 From the content 11.3 provided by memory via intersection, the agent may obtain an analogical model by (i) reversing the order and (ii) replacing the value Mary with a pointer to the agent, represented as moi: 11.4. R ESULTING ANALOGICAL MODEL exec Moi locate apple in blue cupboard [prn: 102] exec Moi take apple [prn: 103] exec Moi eat apple [prn: 104] ⇑ Moi eat food [prn: 105] Whether or not these blueprints for the agent’s action components will result in a successful countermeasure depends on whether proposition 102 turns out to hold in the agent’s current situation or not. 12. Learning by Imitation The purposeful subactivation of an earlier content in the Word Bank by means of intersection provides the agent with an analogical model potentially suitable to remedy its current imbalance. For example, instead of looking randomly through Mary’s house for something to eat, the agent will begin with searching for an apple in the blue cupboard. To implement such a system requires an agent with interfaces for recognition and action of a quality not yet available. Therefore, let us consider a simpler example, namely a robot loading its battery at one of several loading stations in its environment. In analogy to 1.1, this behavior may be controlled by the following chain of inferences: 20 The

assumed evaluations reﬂect the agent’s preference of eating apples over eating müsli. the agent were to assume (unnecessarily) that Mary must have been hungry, then this would correspond to an abductive inference in logic. The point is that observing Mary eating is sufﬁcient for the purpose at hand. 21 If

72

R. Hausser / Inferencing in Database Semantics

12.1. AUTONOMOUS CONTROL AS A CHAIN OF R-D-E INFERENCES 1. R: β low battery cm β load battery. 2. D: β load battery pre β locate station. 3. D: β locate station ⇓ β locate α, where α {1, 2, 3, etc. }. 4. E: β locate α exec β attach to α. 5. D: β attach to α ⇑ β attach to station. 6. E: β attach to station exec β load battery. The connectives cm (countermeasure), pre (precondition), ⇓ (is instantiated by), ⇑ (hypernym), and exec (execute) are as in 1.1. Steps 3 and 5 show a primitive semantic hierarchy, namely the term station for the instantiations of α. The consequent of step 6 provides completion. In terms of current technology, each notion used in this software program, e.g., locate, attach, or load, has a rather straightforward procedural counterpart. It is therefore possible even today to build a real robot in a real environment performing this routine. Instead of programming the robot’s operations directly, for example in C or Java, let us use a declarative speciﬁcation in terms of proplets in a Word Bank. In other words, the robots’ recognitions, e.g., locate α, are stored in its Word Bank as sets of proplets and the robot’s actions, e.g., attach_to α, are controlled by sequences of proplets. To simulate learning by imitation, let us use two such robots, called A and B. Initially, each is training in its own environment, whereby A has the loading stations 1 and 2, and B has the loading stations 3, 4, and 5 – with their respective α variables deﬁned accordingly. Once the individual loading routines are well established for both, A is put into the environment of B. To simplify A’s recognition of loading events by B, let us assume that B emits a signal every time it is loading and that A can correctly interpret the signal. In order for A to imitate B, A must follow B, remember the new locations, and adapt A’s deﬁnition of α to the new environment. The new loading stations may differ in hight, which may cause different efforts of reach, thus inducing preferences (evaluation). After following B around, A’s battery is low. This imbalance triggers step 1 in 12.1. Being in B’s environment, A subactivates the token line of B in A’s Word Bank, while the consequent of step 1 subactivates the token line of load, leading to their intersection – in analogy to 11.1. Spreading results in secondary and tertiary subactivations: 12.2. S UBACTIVATED SEQUENCE OF PROPOSITIONS ( ANTI - TEMPORAL ORDER ) B load battery [prn: 69]. B attach to station 3 [prn: 68]. B locate station 3 [prn: 67]. By reversing the spreading order into the temporal order and by replacing B by A, the visiting robot obtains the following blueprints for its action components: 12.3. B LUEPRINTS FOR ACTION A locate station 3 [prn: 87]. A attach to station 3 [prn: 88]. A load battery [prn: 89]. Except for the replaced subject, these propositions consist of recognition content from A’s memory. Therefore, their core values are tokens carrying sensory, motor, and conceptual information which is not provided by the types of the inference chain 12.1, but essential for action blueprints sufﬁciently detailed to master the situation at hand.

R. Hausser / Inferencing in Database Semantics

73

13. Fixed vs. Adaptive Behavior The behavior of robot A described above is ﬂexible in that it can adapt to different environments of a known kind, here two rooms which differ in the number and location of loading stations. In this example, the artiﬁcial agents and their artiﬁcial environments are co-designed by the engineers. A more demanding setup is to take a given natural environment and to design a robot able to maintain a balance relative to internal and external changes. This requires (i) analysis of the external environment, (ii) construction of interfaces for the agent’s recognition of, and action in, the external environment, and (iii) deﬁnition of R(eactor), D(eductor), and E(ffector) inferences for optimal survival. The ultimate goal, however, is to design a robot with a basic learning software. It should be capable of deriving schemata (cf. Sect. 5) and semantic relations of content (cf. Sect. 7), and of automatically establishing and adapting instantiation classes22 (cf. Sect. 9). In this way, it should be able to continuously optimize behavior for daily survival in the agent’s ecological niche. This may be done in small steps, ﬁrst testing the artiﬁcial agent in artiﬁcial environments it was speciﬁcally designed for, and then in new environments. By putting the artiﬁcial agent into more and more challenging test situations, the control software may be ﬁne-tuned in small steps, by hand and by automatic adaptation.

14. Component Structure and Functional Flow At any moment in time, the DBS model of a cognitive agent distinguishes three kinds of content: (i) old content stored in the Wordbank, (ii) new content provided by recognition, and (iii) new content provided by inference. Recognition, including language interpretation in the hearer mode, interprets the data stream provided by the external and internal interfaces non-selectively and adds the resulting content to the Word Bank. Inferences, in contrast, are triggered selectively by items which match their antecedent. Their derivation of new content is usually based on the subactivation of stored data (cf. Sect.11), and is used as blueprints for action, including language production in the speaker mode. Memories of these actions are added non-selectively23 to the Word Bank. The procedures of recognition and of inference are formally based on small sets of connected pattern proplets, called DBS schemata, which operate on corresponding sets of content proplets by means of pattern matching. The matching between individual pattern proplets and content proplets is greatly facilitated by their non-recursive feature structures (cf. [NLC’06], Sect. 3.2). So far, this method has been used for the following cognitive operations: 14.1. C OGNITIVE OPERATIONS BASED ON MATCHING a. natural language interpretation: matching between LA-hear grammar rules and language proplets (cf. [TCS’92], [NLC’06], Sect. 3.4) 22 [Steels 1999] presents algorithms for automatically evolving new classes from similar data by abstracting from what they take to be accidental (in the sense of Aristotle). 23 We are leaving aside the psychological phenomenon of repression (Unterdrückung) in natural agents.

74

R. Hausser / Inferencing in Database Semantics

b. navigation: matching between LA-think grammar rules and content proplets (cf. [NLC’06], Sect. 3.5, [Hausser 2009a]) c. querying: matching between query patterns and content proplets (cf. [NLC’06], Sect. 5.1) d. inferencing: matching between inference rules and content proplets (cf. 3.2, 4.1, 10.1, 10.4). Navigation (b) and inferencing (d) jointly provide the conceptualization (what to say?) and substantial parts of the realization (how to say it) for language production. The different kinds of matching between pattern proplets and content proplets in combination with the agent’s cognitive input and output suggest the following component structure:24 14.2. C OMPONENT STRUCTURE OF A COGNITIVE AGENT cognitive agent peripheral cognition

rule component

I/O component

1

5 8 6

2 3

central cognition

4

7

content component

1 = external recognition 2 = external action 3 = internal recognition 4 = internal action 5 = input to rule component 6 = output of content component 7 = rule-content interaction 8 = content-rule interaction

The diagram shows three general components, (i) an I/O (input-output) component for recognition and action, (ii) a rule component for interpretation and production, and (iii) a content component for language and context (or non-language) data. The separation of patterns and of contents into distinct components provides a uniform structural basis for the rule component to govern the processing of content (7) – with data-driven feedback from the content component (8), including automatic schema derivation (Sect. 9). The rule and the content component are each connected unidirectionally to the I/O component. All recognition output of this I/O component is input to the rule component (5), where it is processed and passed on to the content component (7). All action input to the I/O component comes from the content component (6), derived in frequent (8, 7) interaction with the rule component. 24 The component structure 14.2 raises the question of how it relates to an earlier proposal, presented in [NLC’06] as diagram 2.4.1. The [NLC’06] diagram models reference in the sense of analytic philosophy and linguistics, namely as a vertical relation between a horizontal language level and a horizontal context level – which is helpful for explaining the Seven Principles of Pragmatics (see [NLC’06], Sect. 2.6, for a summary). In diagram 14.2, this earlier component structure is embedded into the content component. Technically, the [NLC’06] diagram is integrated into 14.2 by changing to a different view: instead of viewing content proplets as sets with a common prn value (propositions), and separated into a language and a context level, the same proplets are viewed as items to be sorted into token lines according to their core value. Treating the [NLC’06] diagram as part of the content component in 14.2 serves to explain the separate

R. Hausser / Inferencing in Database Semantics

75

Conclusion Language production in the speaker mode of a cognitive agent raises the question of where the content to be realized should come from. The cycle of natural language communication modeled in DBS answers this question by providing two sources: (i) content provided by recognition, either current or stored in the agent’s memory, and (ii) blueprints for action derived on-the-ﬂy by the agent to maintain a state of balance (equilibrium, homeostasis) vis-à-vis a constantly changing external and internal environment. So far, work on the speaker mode in DBS has concentrated on a systematic description of (i), i.e., production from recognition content (cf. [NLC’06], [Hausser 2009b]). This paper, in contrast, explores the foundations of (ii), i.e., a general solution to providing blue-prints for meaningful actions by the agent, including natural language production. As a consequence, our focus here is on the what to say aspect of natural language production (conceptualization) rather than the how to say it aspect (realization). A conceptualization based on a cognitive agent with a memory and interfaces to the external and internal environment is in a principled contrast to a language production for weather reports or query answering for ship locations, train schedules, and the like. The latter are agentless applications; they are popular in the research literature because they allow to fudge the absence of an autonomous control. Their disadvantage, however, is that they cannot be extended to agent-based applications such as free dialog [Schegloff 2007], whereas the inverse direction from an agent-based to an agentless application is comparatively easy. Proceeding on the assumption that a sound theoretical solution to natural language production must be agent-based, this paper shows how an autonomous control based on the principle of balance may be embedded into the cycle of natural language communication as formally modeled and computationally veriﬁed in DBS [NLC’06]. Founded technically on a content-addressable memory and coreference-by-address (pointers), this extension of the existing system requires a number of new procedures, such as automatic schema derivation, the subactivation and evaluation of content, adaptation and learning, the deﬁnition and chaining of inferences for deriving action blueprints, etc. The resulting conceptual model of a cognitive agent is summarized by showing the basic components and the functional ﬂow connecting the interfaces for recognition with those for action. To bring across the basic ideas, the presentation tries to be as intuitive as possible. Nevertheless, the formal illustrations of contents, patterns, rules, intersections, etc., provide the outline of a declarative speciﬁcation for a straightforward transfer into efﬁciently running code. Acknowledgements This paper beneﬁtted from comments by Johannes Handl, Thomas Proisl, Besim Kabashi, and Carsten Weber, research and teaching associates at the Abteilung für Computer-Linguistik Uni Erlangen (CLUE). input-output channels for the language and the context component in the earlier diagram: The I/O component of 14.2 provides the rule component with a (usually clear) distinction between language and non-language surfaces, resulting in a distinction between language proplets and context proplets during lexical lookup [Handl et al. 2009]. Therefore, the input channel to the content component 7 and the output channel 8 may each be divided into a part for language proplets and a part for context proplets.

76

R. Hausser / Inferencing in Database Semantics

References [AIJ’01] Hausser, R. (2001). Database Semantics for natural language, Artiﬁcial Intelligence, 130.1:27–74, Elsevier. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html [Anderson 1983] Anderson, J. R. (1983). A spreading activation theory of memory, Journal of Verbal Learning and Verbal Behavior, 22:261-295 [Antsaklis and Passino 1993] Antsaklis, P.J., and K. M. Passino, eds. (1993). An Introduction to Intelligent and Autonomous Control, Dordrecht: Kluwer Academic [Arnold 1993] Arnold, M. B. (1984). Memory and the Brain, Hillsdale, NJ: Erlbaum [Bernard 1865] Bernard, C. (1865). Introduction à l’étude de la médecine expérimentale, ﬁrst English translation by Henry Copley Greene, published by Macmillan, 1927; reprinted in 1949 [Brachman 1979] Brachman, R.J. (1979). On the Epistemological Status of Semantic Networks, in N. Findler (ed.) Associative Networks, pp. 3–50, Academic Press [Brooks 1985] Brooks, R. (1985). A Robust Layered Control System for a Mobile Robot Cambridge, MA: MIT AI Lab Memo 864, 227–270 [Chisvin and Duckworth 1992] Chisvin, L., and R. J. Duckworth (1992). Content-Addressable and Associative Memory In M.C. Yovits (ed.) Advances in Computer Science, 2nd ed. pp. 159–235, Academic Press [Darwin 1872] Darwin, C. (1872/1998). The Expression of the Emotions in Man and Animals. 3rd edition. London: Harper Collins [FoCL’99] Hausser, R. (1999). Foundations of Computational Linguistics, 2nd ed.. Heidelberg Berlin New York: Springer [Grice 1965] Grice, P. (1965). Utterer’s meaning, sentence meaning, and word meaning, Foundations of Language, 4:1–18 [Handl et al. 2009] Handl, J., B. Kabashi, T. Proisl, and C. Weber (2009). JSLIM - Computational morphology in the framework of the SLIM theory of language, in C. Mahlow and M. Piotrowski (eds.) State of the Art in Computational Morphology, Berlin Heidelberg New York: Springer [Hausser 2009a] Hausser, R. (2009). Modeling Natural Language Communication in Database Semantics, Proceedings of the APCCM 2009, Australian Comp. Sci. Inc., CIPRIT, Vol. 96. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html [Hausser 2009b] Hausser, R. (2009). From Word Form Surfaces to Communication, in T. Tokuda et al. (eds.) Information Modelling and Knowledge Bases XXI, Amsterdam: IOS Press Ohmsha. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html [Hutchinson 1948] Hutchinson, G.E. (1948). Circular Causal Systems in Ecology, Ann. New York Acad. Science 50:221-246 [Lazarus and Lazarus 1994] Lazarus, R., and B. Lazarus (1994). Passion and Reason: Making Sense of Our Emotions, New York: Oxford University Press [Naphade and Smith 2009] Naphade, M.R., and J. R. Smith (2009.) Computer program product and system for autonomous classiﬁcation, Patent Application #:20090037358 - Class: 706 46 (USPTO) [NLC’06] Hausser, R. (2006). A Computational Model of Natural Language Communication. Berlin Heidelberg New York: Springer [Proust 1913] Proust, M. (1913). Du côté de chez Swann, ed. by Jean-Yves Tadie et al., Bibliotheque de la Pleiade, Paris: Gallimard,1987-89 [Quillian 1968] Quillian, M. (1968). Semantic memory in M. Minsky (ed.), Semantic Information Processing, 227–270, Cambridge, MA: MIT Press [Salton 1989] Salton, G. (1989). Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Reading, Mass.: Addison-Wesley [Schegloff 2007] Schegloff, E. (2007). Sequence Organization in Interaction, New York: CUP [Steels 1999] Steels, L. (1999). The Talking Heads Experiment. Antwerp: limited pre-edition for the Laboratorium exhibition [TCS’92] Hausser, R. (1992). Complexity in Left-Associative Grammar. Theoretical Computer Science 106.2:283-308, Elsevier. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html [Turney 2002] Turney, P. (2002). Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classiﬁcation of Reviews, Association for Computational Linguistics (ACL), 417-424 [Wiener 1948] Wiener, N. (1948). Cybernetics: Or the Control and Communication in the Animal and the Machine, Cambridge, MA: MIT Press

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-77

77

Modelling a Query Space Using Associations Mika TIMONEN a,1 , Paula SILVONEN a,2 and Melissa KASARI b,3 a Technical Research Centre of Finland, PO Box 1000, FI-02044 VTT, Finland b Department of Computer Science, PO Box 68, FI-00014 University of Helsinki, Finland Abstract. We all use our associative memory constantly. Words and concepts form paths that we can follow to ﬁnd new related concepts; for example, when we think about a car we may associate it with driving, roads or Japan, a country that produces cars. In this paper we present an approach for information modelling that is derived from human associative memory. The idea is to create a network of concepts where the links model the strength of the association between the concepts instead of, for example, semantics. The network, called association network, can be learned with an unsupervised network learning algorithm using concept co-occurrences, frequencies and concept distances. The possibility to create the network with unsupervised learning brings a great beneﬁt when compared to semantic networks, where the ontology development usually requires a lot of manual labour. We present a case where the associations bring beneﬁts over semantics due to easier implementation and the overall concept. The case focuses on a business intelligence search engine where we modelled its query space using association modelling. We utilised the model in information retrieval and system development. Keywords. Association network, Association modelling, Human Associative Memory, Query space modelling, Information retrieval

Introduction Information modelling has been researched rigorously in recent years. The aim has been to present the complex set of information related to different domains in a structured manner so it can be utilised in different applications. The information is reﬁned into knowledge that can be understood by intelligent agents; both human and artiﬁcial. A lot of work in this ﬁeld has been done on ontologies, knowledge bases and semantic networks. Ontologies aim to deﬁne concepts in the abstract level, providing semantics to the knowledge located in knowledge bases. Semantic networks model the relationships between concepts; for example, ’car’ is a ’transport device’ that has ’tyres’. Even though semantic networks are useful, their implementation is very labour intensive as the ontology is usually created manually; an example of this can be found in [1]. 1 E-mail:

Mika.Timonen@vtt.ﬁ Paula.Silvonen@vtt.ﬁ 3 E-mail: [email protected].ﬁ 2 E-mail:

78

M. Timonen et al. / Modelling a Query Space Using Associations

This is the biggest drawback with ontologies. As there are cases when a simpler model of the domain is enough, implementing an ontology and a semantic network is not a suitable option. Especially when we want to link related concepts together to be used, for example, in a search engine or in a recommendation system, we do not necessarily need to identify their semantics. In these cases, a lighter approach is usually preferred. The term query space refers here to the collection of concepts found in the documents of the given domain. For example, a database consisting of research articles makes a query space which consists of concepts that are the terms found from the documents. By modelling this space we can map the concepts and ﬁnd links between them. The mappings can then be used when processing users’ queries by ﬁnding related terms and expanding the query. For example, if there is a relation between terms ’car’ and ’tyre’ and user searches for tyres, the query can be expanded to include also cars especially if the initial search does not produce any results. In this paper, we present a method for modelling a business intelligence related query space by identifying associations between the concepts. We model the associations using an association network that mimics the human associative memory. For some reason, when we think of a concept, e.g., ’car’, our ﬁrst association may be something completely unrelated in the semantic sense, e.g., ’Australia’. This association has been formed by our experiences; for instance a long road trip in Australia. In a semantic network, the concept ’car’ would most probably be linked with concepts like ’vehicle’, ’automobile’ or ’tyre’. The idea behind association modelling is not to model the semantics of a domain but the associative relationship of the concepts in a domain. It does not necessarily link semantically similar concepts closely together; in association network two concepts may have a strong association even if they do not have any semantic relationship. We used our association modelling approach to model a query space of a business intelligence search engine called BI-search. The idea was to tackle two major problems with the search engine: (1) as the searched databases are fairly limited, the query space of each database is also limited. Therefore, users often used terms not found from the query space (databases) even though related terms were found. To address this problem, we needed to map related terms together and use the mapping in query expansion. We also wanted to (2) facilitate the search process by providing an intuitive and easy to use graphical user interface that presents the related terms and provides a possibility to reﬁne and continue the search. We implemented the network using a project database, which contains information about approximately 9 000 on-going and completed projects done in Technical Research Centre of Finland (VTT). The information includes project name, start and end year, abstract and keywords. For each project there are two or more keywords that describe the relevant concepts of the given project. We assumed that (1) the keyword list holds the relevant concepts of the project in a concise way, and (2) if two keywords appear with each other they will form an association. The more often they appear with each other, the stronger the association is. We developed an unsupervised graph learning algorithm to create the association network from the keywords. The biggest challenge with the algorithm is the way the association weights are learned. We used conﬁdence, which is an important metric in association rule mining [2], as the starting point for calculating the association weight. We calculated the conﬁdence of each keyword pair, i.e., probability that keyword B is linked to the project when keyword A is, and weighted it with the average distance of

M. Timonen et al. / Modelling a Query Space Using Associations

79

the keywords in the keyword lists and the age of the keyword list (i.e., the project). This mimics the human associative memory by giving stronger associations to concepts that are "fresh in the memory" and that appear often with each other. We assessed the created network by manually evaluating the association. We also compared the utilisation of the network, i.e., query expansion, with information retrieval and other query expansion methods. We concluded that our approach brings several beneﬁts over the compared methods. For example, space consumption was lower than with thesauri and term frequency - inverse document frequency methods. The precision (percentage of relevant results in the result set) was lower after using the method but recall (percentage of relevant results compared to all relevant results in the query space) was better as was expected. By scoring and ranking the results, the negative effects of the lower precision were diminished and the beneﬁts of higher recall emphasized. This document is organised as follows. We review related work in Section 1. In Section 2 we describe the BI-search engine and give the background for this work. In Section 3 we present association modelling using association network and its implementation in an abstract level. In Sections 4 and 5 we describe the case study and the method we used to automatically model the query space associations. Section 6 presents the evaluation and its results. We conclude the paper in Section 7.

1. Related Work Association network represents a conceptual model of a domain by modelling the associations between concepts. Therefore it should not be confused with neural networks [3] and association neural networks [4] that concentrate on, for example, pattern recognition and classiﬁcation. In this section we survey psychology and neurobiology, information modelling and information retrieval as they are closely related to the method presented in this paper. 1.1. Psychology and Neurobiology Associationism, the theory that associations between concepts operate the mental processes, was ﬁrst presented by Plato. Later, philosophers like David Hume, John Locke and James Mill continued this work [5]. Nowadays, associations are the corner stone of psychology as they are studied from cognition and memory modelling perspective. Search of Associative Memory (SAM) [6] was initially created to model episodic memory. According to SAM, associations are made when two concepts occupy the same memory buffer at the same time. The more often this happens, the stronger the association gets. In other words, often co-occurring concepts will have a stronger association. Context is also included to the associations. The longer a concept is present in the given context, the higher the association between the concept and context. The activation of the associated concept - context or concept - concept pair will be determined by the strength of the association. In the synaptic level, the neurons will have a higher degree of connections if they have a strong association. Hebb, the father of the Hebbian theory, which concerns how neurons might connect themselves to become engrams (the way memory traces are stored with biochemical changes in the brain), stated in [7] that when two cells are repeatedly

80

M. Timonen et al. / Modelling a Query Space Using Associations

activated, they tend to be associated; meaning that an activation in one tends to lead to activation in the other. However, these associations will gradually deteriorate if they are not used; newer concepts will have stronger association than the older ones. Our work on association modelling is based on these theories. 1.2. Information Modelling Ontologies, knowledge bases and semantic networks are the most relevant information modelling methods related to association modelling. They are usually used for formally modelling the concepts of a domain and the relationships between the concepts [8]. There are two major characteristics of a semantic network: ﬁrst, the nodes, which contain the concepts, are usually linked to ontology or taxonomy. This deﬁnes the nodes formally by stating an upper level concept to which they are mapped. Second, the links between the nodes are labelled and they deﬁne the type of relationship between the nodes. The type of relationships can be freely deﬁned; is_a, is_part_of, has_synonym, is_needed, and so on. For example, the following could be found from a semantic network: wheel is_part_of car, where ’wheel’ and ’car’ are the nodes and is_part_of is the link between the nodes. Node ’wheel’ may be mapped to the upper level concept ’steering device’ and ’car’ to ’vehicle’ found from taxonomy. By linking the nodes to the taxonomy and deﬁning the relationships with each other, a semantic network is created. Ontology engineering is the research ﬁeld concentrating on implementation process and methods. There are some methods for automatic implementation of ontologies, for example [9], but usually the development process is done manually due to the complexity of the domain [1]. The difference between a semantic network and an association network is clear. In semantic networks the network holds more knowledge about the entities, i.e., the nodes, and the relationships between the entities. In association networks there are only the entities and the weight between them. The biggest beneﬁt of an association network when compared to semantic networks is that it is easy to implement by training the network unsupervised. It should be noted, however, that combining a semantic network and an association network could produce even greater beneﬁts than using either one alone. 1.3. Information Retrieval Information retrieval aims to ﬁnd documents that are relevant to a user’s information need. User satisﬁes his or her information need by doing a search, i.e., a query. The problem with the search is usually how the query is formulated. When the query is well formulated the results are also good but more often than not the query is too short or does not hold all the terms needed to satisfy the user’s information need. In this case, there is a need for reformulating the query by adding new search terms to it. This method is called query expansion which is a widely researched method for improving the performance of information retrieval. Expanding a query is a difﬁcult but important problem when doing information retrieval. There are a lot of different approaches how this reformulation is done. There are several relevant methods related to our approach, including: • Relevance feedback, • Pseudo-relevance feedback,

M. Timonen et al. / Modelling a Query Space Using Associations

• • • •

81

Statistically co-occurring words, WordNet, Term frequency - inverse document frequency, Spreading activation

Relevance feedback [10] is one of the ﬁrst methods proposed for query expansion. The idea is that the user can select the relevant documents from the result set and do the search again. The query is reformulated by adding terms from the relevant documents to the query. Pseudo-relevance feedback is a method that does not require any input from the user [11,12]. It is based on automatic calculation of document relevance and using the top k most relevant documents from the result set as an input to the relevance feedback method. Another approach is to expand the query before the initial search. This can be done, for example, by creating a list of terms that map terms together. For instance, if a term A is present in the query, the list could hold that terms B and C should be also added to the query. One way of storing the term - term mappings is using a thesaurus. There can be different types of thesauri but usually a thesaurus is deﬁned as a set of mappings from terms to other related terms [13]. The classical way is to use semantic relations mappings such as synonym, hyponym and antonym. A good example of this is WordNet [14]. Thesaurus can be built using different methods, the most notable being manually built thesaurus, co-occurrence-based thesaurus and linguistic relations based thesaurus. Building a thesaurus from linguistic relations is based on the idea that terms that appear in similar context, e.g., have similar verbs near them, are similar [15]. Co-occurrence-based approach is fairly similar to our approach. The method is based on the assumption that terms that often appear together in the same document are similar in some way. Hearst [16] proposed a method that divides the document into pseudosentences of size n terms and calculates the similarity between the terms by checking how often the terms appear together in the pseudo-sentences. We have taken this approach further, as described later in this paper. Term frequency - inverse document frequency (tf-idf) [17] is the classic method used in information retrieval. The method weights the terms in each document by calculating how frequent the term is in the document, and in the collection of documents. The term’s weight is larger if the term is frequent in one document and infrequent in the collection of documents, i.e., appears in only a few documents. This method is used by search engines to rank the documents in comparison to the user’s search string. Even though tf-idf is mostly used for ranking the documents it can also be used to tackle the problem of query expansion. One approach is to use it for ﬁnding documents that are related to the original search string by comparing the documents’ term vectors; if two documents have similar term vectors they contain similar information even if they do not use the same terms. For example, the term ’road’ may appear when talking about ’cars’ and ’trucks’, making their document vectors similar. We can then deduce that ’cars’ and ’trucks’ have a connection between them. Spreading activation [18] is a method developed for searching a semantic or neural network. In the network, the nodes or the edges need to be weighted as the activation is spread between the strongest weighted nodes. The activation is continued until the activation value reaches below a given threshold. There is also a decay factor that lowers the activation value after each jump. Even though developed for a different case, this has also been used in information retrieval where nodes present the documents and their

82

M. Timonen et al. / Modelling a Query Space Using Associations

Figure 1. Search page.

terms [18]. We utilised this approach in our method to break the expansion when the activation was spread enough.

2. Background We decided to use the business intelligence search (BI-search) engine as the test case for the association network. The BI-search application was implemented in collaboration with the Technical Research Centre of Finland (VTT) and Fujitsu Laboratories Japan. BI-search is an application that queries internal and external databases, integrates the information and presents the results to the user. The users of the system are researchers who want to do a quick business intelligence check related to a project idea or proposal they have. The idea behind the system is to provide a comprehensive and intuitive report of patents, projects, persons and companies that are relevant to the new project and its proposal. The search page is presented in Figure 1. The data sources we have integrated to the system include: (1) project database called Research Register that contains approximately 9 000 on-going and completed projects done within VTT, (2) personnel database called SkillBase that holds information about the employees and their skills, (3) patent database called Patent Register, and (4) Yahoo! search engine. Research Register is used for ﬁnding completed and on-going projects to support the project or the project proposal writing process. SkillBase, which holds a large collection of skills relevant to VTT, is organised into taxonomy to form a hierarchy. These skills include java programming, which is a sub-skill of programming; data mining, subskill of Technologies and methods; and customer relationship management, sub-skill of Competence areas. Each employee has rated their skill level in each of the skills listed in SkillBase. In BI-search, SkillBase is used for ﬁnding if there are persons who can do the tasks required in the project.

M. Timonen et al. / Modelling a Query Space Using Associations

83

Figure 2. The front page of the report view and an example of the term - company relationship graph presented to the user.

Patent Register is used for getting relevant patent information and ﬁnding which companies have relevant patents in this ﬁeld. The Yahoo! search engine is used for ﬁnding companies related to the search terms. The system works as follows: user inputs a set of search terms that are relevant to the new project. The different terms are separated and forwarded to the search engine. The set of search terms makes the query set Q. The search engine queries the different data sources using the query set Q. It should be noted that standard pre-processing of the terms is done before the queries. This includes lower casing, and transforming the terms to singular form. The results are processed and analysed using different methods and heuristics to create an informative and intuitive report for the user. The results from Yahoo are processed using a text mining pipeline that extracts company names and locations from the results. The documents not containing any company names are discarded. Result analysis process includes scoring of the results. The results are shown in the report page that holds information that was found from the databases. The information is presented in descending order starting with the highest ranking score. Some of the information is also presented in different types of graphs. An example report view is shown in Figure 2. More information about the implementation of the association network can be found in Sections 3 and 4. Query expansion and result scoring are described in Section 5.1.

3. Association Network The idea behind association network is to mimic the way human associative memory works. The method is based on the theory that when two concepts appear often with each

84

M. Timonen et al. / Modelling a Query Space Using Associations

Figure 3. An example association network.

other, they tend to get a stronger association between them [6]. However, the associations are probabilistic in nature; we do not always follow the same association path but the paths vary. For example, we may usually associate the concept ’car’ to ’driving’, but we may also think hundreds of other concepts, like ’road’, ’wheel’ and ’pavement’ among other things. We model the associations using a network. The nodes in the network represent the concepts that can be words, terms or phrases like ’car’, ’arctic regions’ and ’road trip across Australia’. The nodes are linked together with directed edges that represent association and are weighted with the strength of the association. Figure 3 represents a small example of association network. The network and its notation is nothing new in computer science; Bayesian networks look similar as they consist of nodes that model concepts, and edges that model the probabilities between the concepts. Therefore the contribution of the association network is more abstract than concrete: the idea of modelling the associations instead of semantics or probabilities. Associations between concepts are formed when we experience something [7,6]. The experiences usually consist of several unrelated concepts that we then associate with each other. For example, a road trip across Australia may form associations between concepts like ’Australia’, ’driving’, ’car’, and ’kangaroos’. The stronger the experience is, the stronger the association. In human brain, stronger associations have more neural pathways between them [7]; in association network we use a decimal value to indicate how strong the association is. The experiences can be just about anything, including actual events from every day life, textual documents, signals and images. In our work we have concentrated on textual information found from databases. From machine learning perspective, it is usually difﬁcult, if not impossible to identify how strong an "experience" is. Therefore we have based our association weighting method on a concept used in association rule mining: conﬁdence. Conﬁdence is the probability of concept A appearing when the concept B appears. For instance, when talking about cars, we might talk about tyres 25% of times, making the conﬁdence between cars and tyres 0.25. This is not symmetric, i.e., the conﬁdence will be different when talking about tyres; cars may be talked about 50% of the times, making the conﬁdence of tyres and cars 0.5. Using only the conﬁdence is not enough as we usually make a stronger association between the concepts that were experienced closely together. If using only the conﬁdence to indicate the association weight, all of the concepts from the same experience would have the same weight. In addition, the association tends to be stronger with newer experiences and gradually deteriorate as time passes.

M. Timonen et al. / Modelling a Query Space Using Associations

85

Algorithm 1. Representation of an abstract level implementation of association network.

for Each concept c in experience E do Create node n n←c for Each Concept ce in E \ c do Create node m m ← ce Create edge e Calculate weight w(c, ce ) we ← w(c, ce ) end for end for Therefore we include two additional parameters to weight the association value: distance, which indicates how closely together the concepts were experienced, and time, which indicates the age of the concept pairing. Distance is an attribute that can vary depending on the data source. In an unstructured text, distance can be measured as the number of words, noun phrases, sentences or even paragraphs between the concepts. In time series data, the distance can be temporal. In some cases, it may be possible to use Euclidean distance. When the age of the experience can be deduced or extracted from the data, it can be used to simulate the natural deterioration of neural pathways. In Section 4 we give a more detailed description on association weight calculation. Algorithm 1 presents an abstract level algorithm of association network implementation. Eq. (1) presents a simple approach for calculating the association weight that takes the distance and conﬁdence into consideration. In Eq. (1) c denotes the concepts, ce the concepts it will have an association with, s conﬁdence, which is usually calculated with Eq. (2), and d the distance between c and ce . In Eq. (2) f req(c) is the frequency of concept c (how many times c has appeared), and f req(ce |c) is the frequency of concept ce ’s co-appearances with the concept c. w(c, ce ) = s(c, ce ) × d(c, ce ) s(c, ce ) =

f req(ce |c) f req(c)

(1) (2)

The association network has some similarities with a semantic network. Both have nodes and links but the idea behind association network is to remove the elements that require a lot of manual work. Therefore there is no ontology or taxonomy that would give semantics to the nodes. Also, the links between the nodes are a bit simpler as the labels are replaced by weights. These modiﬁcations are made so that the network would be lighter and it can be automatically implemented. We did not see any reason to add the semantics to the network but in case the semantics are needed (like ’car’ is a ’vehicle’), new information can be added to the network. Also, the relations do not have to be labelled as we only need the information about the weight of the relationship, i.e., how strong the association is. However, it may be useful to include the type of the relationship in the future as it may hold interesting information. The more information the network contains, the more usable it becomes but also more work is required in implementation. In our opinion, if seman-

86

M. Timonen et al. / Modelling a Query Space Using Associations

tic network and association network would be combined, the resulting network would provide the best uses. When possible, it may be a good idea to add the associations to the semantic network as extracting them is fast when compared to the arduous task of modelling the semantics.

4. Query Space Model When we started implementing the BI-search engine, we faced several challenges. The ﬁrst and the biggest challenge was mapping the search terms to the terms found from the databases. This was a major challenge due to the limitations of SkillBase. SkillBase consists of, approximately, 100 concepts; it was likely that the search term did not match any of the SkillBase concepts. For example, a search like ’knowledge base’, even though related to ’ontology’ which is found from SkillBase, did not produce any results. Another challenge was the usability of the system. When the results are not good, i.e., when some of the data sources produce no or incomplete results, the users wanted to update their search. For example, a search did not produce any results from SkillBase even though the user knows there are people who have expertise in ’knowledge bases’. The problem was that as the feasible search term did not produce results, it is difﬁcult to guess the related term that would generate the desired outcome. We addressed these issues by modelling the query space using the association network described in Section 3. A query space S is a collection of terms and concepts t that have some relevance to the domain in question. In the case of documents, query space consists of the terms found from the documents. An association network G(V, E) holds a set V of nodes (or vertices) and a set E of directed edges. Each node n ∈ V represents a term t ∈ S. If terms tn and tm are experienced together (for example found in the same document), the corresponding nodes n and m are linked with directed edges (n, m) and (m, n) in G. Each edge e ∈ E has a weight we (the strength of the association). We chose this approach as the association network can link related terms in the query space with very little effort. We base the work on the assumption that if a concept A appears with concept B often, there is a good chance that concept B will be interesting from the user’s point of view. Even though the relationship between the terms is not deﬁned in the network, terms that have high association will be relevant in most cases. We used the VTT’s Research Register when creating the network as it holds the key concepts of the query space. Each project found from Research Register holds several attributes; title, abstract, start and end years, and keywords being the most relevant. For implementing the network, we used only the keywords of each project as they hold the key concepts in a concise way. When compared to abstracts, the biggest beneﬁt with keywords is that they usually hold the same information but extracting them is notably easier. As described previously, we have based the implementation of the association network on two assumptions: (1) if two concepts appear often in the same context, their association is stronger, and (2) if two concepts appear often closely together, i.e., their average distance is small, the association between them is even stronger. We also adjust the weight with gradual deterioration. Algorithm 2 presents the association network creation. The ﬁrst step when implementing the association network is to pre-process the input data; in this case the key-

M. Timonen et al. / Modelling a Query Space Using Associations

87

Algorithm 2. Representation of the association network implementation algorithm that is used to create the query space model.

for Project p, collect keywords K do for Each Keyword kn in K do Create node n n ← kn for Each Keyword km in K \ kn do Create node m m ← km Create edge e Calculate weight w(kn , km ) we ← w(kn , km ) end for end for end for words. As the keywords were comma separated, the keyword extraction was a trivial task. After the keywords of each project are extracted, each keyword pair (kn , km ) linked to a project p is used to create the network. If a node for keyword kn or km does not exist, it will be created. The edge e between the nodes kn and km is created and its weight calculated. Calculating the weight between the nodes is the most crucial part of the algorithm as it indicates the strength of the association between two concepts. In order to mimic the associative memory, we base the weights on co-occurrence and frequencies. Our assumption is that when two concepts, i.e., keywords, occur together, association between them will be formed. If the occurrence of the pair is rare, the association is weak. On the other hand, if they occur together often, they will have a strong association. We used this idea when we developed the calculation scheme for the association network. We started out by calculating the frequencies of each keyword pair (kn , km ). The frequencies were then used to calculate the conﬁdence S(kn , km ) as described in Eq. (2). For example, when keyword A appears 10 times, and of those 10 times keyword B co-appears 7 times, the conﬁdence S(kA , kB ) = 0.7. This indicates that the association between kA , kB is 0.7. It should be noted that the edge between kn , km is directed (from kn to km ). The weight of the association from km to kn is calculated separately. The intuition behind this is that when we think of a term ’tyre’ we may think of ’car’ 70% of the time, but when we think of car we may think of ’tyre’ 10% of the time. If we use only the conﬁdence for the association weight we will lose an important element. Consider a case where you will have to memorise a list of words. When memorising, the words that appear next to each other will get a higher association when recollecting the words. As the keyword lists often consist of several keywords, we will utilise this by taking the distance between the keywords into consideration; if two concepts appear close to each other in the keyword list, they will get a stronger association. It is clear that in the keyword lists some of the keywords appear next to each other by chance. But it is highly unlikely that they would appear together often enough to merit a high association value. In other words, if two terms appear closely together often, they will get a higher association weight; otherwise the weight will be lower.

88

M. Timonen et al. / Modelling a Query Space Using Associations

Table 1. Effect of conﬁdence and distance to the association weight. Conﬁdence / Distance

1

2

3

5

7

9

1.0

1.0

1.0

1.0

1.0

1.0

1.0

0.8

1.0

1.0

1.0

1.0

0.95

0.89

0.6

1.0

1.0

1.0

0.86

0.71

0.63

0.5

1.0

1.0

1.0

0.72

0.59

0.52

0.3

1.0

0.997

0.63

0.43

0.35

0.31

0.2

1.0

0.66

0.42

0.29

0.24

0.21

0.1

1.0

0.33

0.21

0.14

0.11

0.10

0.05

1.0

0.17

0.10

0.07

0.06

0.05

We added this distance factor to the calculations by taking the average distance of two keywords and calculating the logarithm of the distance. The distance d between two terms is simply: d = n - m

(3)

where n is the order number of the nth keyword (kn ) and m is the order number of the mth keyword (km ). If the average distance is 1 (terms always appear next to each other), making log(1) = 0, we deﬁned this factor to be 0.01. If the distance was more than 10, we deﬁned the factor as 1.0. This way we will get factor values that vary between 0.01 and 1.0. Eq. (4) shows how we used the distance when calculating the weight. w(kn , km ) =

S(kn , km ) log10 (dkn ,km )

(4)

Table 1 presents how the weights range depending on the distance and conﬁdence of the keyword pair (kn , km ). The distance makes a big difference only when it is small. When the distance is near 2, the weight is approximately three times the conﬁdence. The average distance between the keywords we used for creating the network was 3.7, making the average impact on weight 176%. As there are keywords that appear only once, these keywords will have too much weight when compared with other keywords, especially with their neighbouring keywords. Therefore, we made a small adjustment to the distance calculation. This adjustment a, which can be seen in Eq. (5), gives more weight to the keywords that appear often. a(kn , km ) =

1 f rec(km |kn )

(5)

The distance is calculated now: d(kn , km ) = n − m + a(kn , km )

(6)

When a keyword appears only once, its distance will be ’penalised’ 100%, but when it appears ten times, the penalty is at most 10% of the original distance. Eq. (7) presents the way we calculate the weight for each keyword pair (kn , km ) after the adjustment a.

M. Timonen et al. / Modelling a Query Space Using Associations

w(kn , km ) =

S(kn , km ) log10 (n − m + a(kn , km ))

89

(7)

It is possible that the weight is above 1, especially if the term appears only once. In this case, we normalise the value to be 1 or smaller. This is done with Eq. (8), where max w(kn , N ) refers to the maximum weight in the node kn ’s neighbourhood N . w(kn , km ) =

w(kn , km ) max w(kn , N )

(8)

For example, if the weight w(kn , km ) is 1.20, and there is a keyword kp in the kn ’s neighbourhood N to which the weight is 1.40 (making the max w(kn , N ) = 1.40), weight w(kn , km ) will be normalised to 0.86. Finally, we included the gradual deterioration of the associations to the weighting schema. The motivation for this is the fact that when there are two associations with otherwise similar attributes (distance, co-occurence frequency), the newer one should have a greater probability to activate. Especially in our case, we feel that the younger associations are more interesting to the users: for example, a research project conducted in the 1970’s is far less interesting than a research project done last year. As Research Register holds the start and end years of the projects we were able to extract and use this information. Eq. (9) presents how we calculate the gradual deterioration gd function. gd(kn , km ) = 1 −

ln kage α

(9)

We used α = 30 in the calculations to make the values fall between 1.15 and 0.85. The value kage denotes the average age of the keyword that is calculated by taking the current year minus the average of the end years of the projects where kn and km occur together. If the average age 0 or below (the concept pairing is new), we assign kage = 0.01. The ﬁnal adjustment for the weight is done by multiplying it with the gradual deterioration, as shown in Eq. (10). The effect of the gd adjustment is small but noticable. If the concept’s average age is less than one year the weight will increase slightly. If the age is ﬁve years, the weight will decrease approximately 5.5%. By changing α we can give more emphasis to the age factor and make these changes more signiﬁcant. For example, if α = 10, ﬁve year old pairings would get 16% lower association weight and the new pairings would get 46% higher weight. w(kn , km ) = w(kn , km ) × gd(kn , km )

(10)

The result of this process was a network that contains approximately 14 000 nodes and 300 000 edges. It should be noted that there are always two edges between two nodes; from A to B and from B to A.

5. Utilisation of the Associations We implemented the association network to tackle the following three problems: (1) facilitate search and query expansion, (2) integrate data sources, and (3) improve the user interface and usability of the system.

90

M. Timonen et al. / Modelling a Query Space Using Associations

5.1. Query Expansion Before including the association network to the search engine, the biggest problem with BI-search was the null results. It was too common that search terms that should have produced results returned nothing. The feedback received from the users indicated that this was a clear problem. This problem was due to the limitations of the queried data sources; it could have been tackled manually but mapping hundreds of related search terms and database concepts together seemed too big of a task. We based our query expansion algorithm on spreading activation [18]. Algorithm 3 presents the pseudo code of the query expansion; for each query term q, the algorithm ﬁnds the corresponding node n from the network. The query is expanded by extracting the neighbours of the node n to the set N . The top k neighbours, i.e., the nodes with the highest association weights w, are added to the expansion set E. Next, each of the nodes ne located in E are expanded by extracting their neighbours. The association weight between the original query node n and the expanded node ne 2 (which is the neighbour of the neighbour) is calculated by multiplying the weights between the path from n to ne2 , as shown in Eq. (11). w(n, nej ) = n nej (11) In Eq. (11) ne j indicates that the link distance between node n and ne j is j; for example, ne 2 is directly linked to ne 1 which is directly linked to n. The node ne j is added to E if it has greater association weight than the smallest weight in E, i.e., ne j > min wn e , or if E does not hold k nodes, i.e., E < k. After expansion ﬁnishes, the nodes in set E are added to the query and the different databases are searched with this new set of query terms. The results of the search are analysed and the report is printed out for the user. We use different types of heuristics to score the results and to order them. The scoring will usually rank the results from the expanded terms lower than the ones found using user’s original search terms; however, if a result contains both expanded and user’s terms, its score will be high. We score the results using the following method: ﬁrst, a result is scored by checking the query term that produced the result. If the query term is found only in the set E (i.e., E \ Q, where Q is the original query set), the score is calculated by multiplying the score with the term’s association weight. For example, if we have expanded the query ’car’ with the term ’road’ (w = 0.7), the results that were received with the query ’car’ will receive the weight 1 and ’road’ 0.7. If the result holds both, its score will be 1.7. We also check other information about the result, such as what the spatial location of the resulted company, patent or person is, and how old the document is. These will affect the ranking only a little. As with all query expansion methods, it is evident that using query expansion will lower the overall precision of the results but the recall will be much higher. But by scoring the results and weighting the score with association value we ensure that the lower precision will not irritate the users. However, the higher recall will be noticed when it is needed, i.e., when the query would not otherwise produce any results. 5.2. Associative Search The null results produced also another problem for the users. As the user’s search terms were feasible, users commented that they did not know how to modify their search to

M. Timonen et al. / Modelling a Query Space Using Associations

91

Algorithm 3. Algorithm for query expansion using association network.

for Each query term q in Q do Find corresponding node n = q N ← n’s neighbours Order N by association weight w E ← N ’s top k nodes for Each node ne in E do Extract ne ’s neighbours Ne j for Each node ne j in Ne j do Calculate weight w between n and ne j if w > min wn e then E ← ne j end if end for end for end for produce the results they wanted. And even if the results were good, we wanted to provide an intuitive search option to continue and expand the search manually in case more information is needed. To address these issues we included an intuitive search option to the user interface called Associative search. The idea behind the search is that user can see the terms that have some association with the original search terms and use them to manually form the next query. We also included the SkillBase taxonomy to this search. We had to limit the expansion set to top k nodes as the precision of the search would otherwise be too low. When we present the nodes to the user we can set the limit higher. Therefore when expanding the search with top k nodes, as described in 5.1, we also get additional top j nodes that are presented to the user but not included in the query expansion. These k + j nodes are presented to the user in the user interface. Figure 4 presents the user interface of the Associative Search which provides the user with the possibility of manually expanding the search by selecting new search terms from the list of concepts. The list also includes the association weight (relevance weight from the user’s point of view) and the original search term to which it was mapped to. The concepts can be added to form a new search by clicking them on the list.

6. Experiments 6.1. Evaluation Setup Evaluation of the network is a difﬁcult task as it is hard to assess if the association weight between two concepts is feasible. It may not even be sensible to assess the associations as they are, in fact, associations. Nonetheless, we conducted a small scale evaluation of the network by manually checking approximately 300 of the associations and their weights, concentrating mostly on the top weighted associations. This sample contains approximately 1% of the top weighted associations; we considered that the sample of this

92

M. Timonen et al. / Modelling a Query Space Using Associations

Figure 4. Associative search, located on the left, can be used to manually expand the query with the related terms found from the association network.

size gives a good indication of the feasibility of the results. When assessing the results, it was difﬁcult to know if the result was good, as can be seen from Table 2. We evaluated the query expansion by comparing the space consumption and the results of our approach against other query expansion methods. The associative search was evaluated by collecting feedback from the users. 6.2. Results This section describes the evaluation results of the association network, information retrieval and associative search. 6.2.1. Association Network Approximately 9% of the associations were weighted 1 and approximately 1% of the associations were weighted 0.9 < w < 1. Approximately 26% of the associations were weighted over 0.5 and 33% below 0.1. Table 2 presents 30 example associations and their weights. Figure 5 shows an example association in regards to Table 2. From Table 2 we can see that most of the associations that have weight over 0.9 are feasible. Some of them, such as satellite picture - satellite image, are synonyms. Several of them have a strong association in a real life setting, such as GPRS - UMTS and road - asphalt. The table shows also the effect of the age factor. In most cases age lowers

M. Timonen et al. / Modelling a Query Space Using Associations

93

Figure 5. An example association, where concept (from) is GPRS, concept (to) is UMTS and weight is 1. In other words, association from GPRS to UMTS is weighted 1.

the weight but in some the weight is increased. In our opinion, the impact of weight is feasible as the newer association are usually more relevant from the user’s stand-point. We evaluated 300 randomly selected associations, 200 of them having weight 1.0; 50 of them were weighted 0.3 < w < 0.7; and 50 of them below 0.1. The evaluation was difﬁcult as there are several concepts that are unclear to us. In addition, assessing the associations may not be feasible. Therefore when we classiﬁed an association as a "negative hit" we consider that the mapping would produce negative search results with a high probability. If we consider that the weight should be higher, we indicate it in the "higher" column of Table 3. It should be noted that when we evaluated the network, we discarded the misspelled concepts that were present in the training data. Table 3 presents the results of the evaluation. As can be seen from the table, we considered most of the associations with weight 1 as correct. The associations with weight between 0.3 and 0.7 were mostly correct but there were a great number of associations that were considered too lightly weighted. However, this is a gray area as the association is quite strong. With the association weight below 0.1, approximately half of the association weights were too low. However, it is important to note that when assessing the associations we did not have all the information available. For instance, the same concept may have stronger association with other concepts, making the lower weight sensible as there should not be several strong associations for one concept. This is especially true in cases when a concept has dozens of associations. Therefore, even though the evaluation may seem to produce poor results when association weights are small, we think that these results can be justiﬁed in most cases. 6.2.2. Query Expansion We compared the query expansion with association network against other information retrieval and query expansion methods. We used projects found from Research Register and the data from SkillBase to compare the methods. Term frequency - inverse document frequency produced quite good results but there were two major problems. First, as we used several different data sources we could not tackle the problem of mapping the search terms to related terms found from SkillBase with tf-idf. For ﬁnding similar projects from the Research Register, tf-idf produced good results. However, it produced more true negative results, i.e., projects that are not relevant, than our query expansion method. This was due to the way tf-idf works: it takes all of the keywords and creates a vector that is compared to the original search string. Second, the space consumption of the method was substantial. We used n × d matrix, where n is the number of terms and d is the number of documents, making the size of the matrix 126 000 000 entries. Thesaurus approach takes even more space than tf-idf as it requires n × n matrix to store the weights between the terms. We can weight the terms the same way we weight the edges in the association network, making the space consumption the only difference

94

M. Timonen et al. / Modelling a Query Space Using Associations

Table 2. Example set of associations. The last column indicates the weight after the age has been factored in. Concept (from)

Concept (to)

Weight

Weight with age factor

satellite picture

satellite image

1.0

1.0

building information modelling

safety

1.0

1.0

waste combustion

biomass

1.0

0.977

ontology

reasoning

1.0

0.967

regional construction

energy distribution

1.0

0.96

oulu

energy conservation

1.0

0.96

rﬁd tag

barcode

1.0

0.96

pulping process

pulping industry

1.0

0.95

gprs

umts

1.0

0.935

road

asphalt

1.0

0.92

competitor survey

SME

1.0

0.91

sun

isotropy

1.0

0.91

rime

ice formation

1.0

0.90

respirator

occupational safety

1.0

0.896

screwdriver

hand saw

1.0

0.895

polymer

plastic

1.0

0.85

apms

paper machine

0.96

0.889

iron

steel

0.95

0.81

organic contaminant

enzyme

0.93

0.998

sea level

climatic change

0.93

0.877

felling

pulpwood

0.90

0.845

mobile telephone

local area network

0.90

0.85

lightweight concrete

stiffness

0.90

0.81

aerial photography

aerial survey

0.83

0.76

online measurement technology

high pressure

0.71

0.65

atmosphere

scanning

0.63

0.58

testing methods

failure

0.55

0.5

process simulation

processes

0.52

0.49

rye

wheat

0.42

0.45

energy conservation

fuel consumption

0.22

0.21

food processing

electric device

0.09

0.07

enzyme

health care

0.013

0.015

between the methods. With this approach, the space consumption is 196 000 000 entries. Association network requires only space for each node, and for the edges between the nodes. In our network, there are 13 712 nodes and 291 536 edges between the nodes, making the space requirement for the network approximately 300 000 entries. Pseudo-relevance feedback requires only the space for the results making the space consumption 0. We tested pseudo-relevance feedback method by extracting the keywords from the result set and expanded the search using these keywords. This lowered the precision drastically as there were on average 4 new search terms added per project in the result set. As there were approximately 28 projects in the result set, the number of new search terms was approximately 110. We did not weight or prune the set of new search

95

M. Timonen et al. / Modelling a Query Space Using Associations

Table 3. Results of the network evaluation. Positives were considered as correctly weighted, negatives incorrectly weighted. Higher and lower indicates whether the negatives should be valued higher or lower. Weight

Positives

Negatives

Higher

Lower

1.0

92%

8%

0%

100%

0.3 - 0.7

60%

40%

85%

15%

< 0.1

45%

55%

100%

0%

terms. It could be possible to use a variation of our association weighting method here; however this would create more time consumption for the algorithm as the weights need to be calculated on each run separately. As it is expected, the precision of the results is lower when using the query expansion. This is due to new search terms that are added to the search. On the other hand, recall is much higher for the same reason. Finding the balance between precision and recall is difﬁcult but as described in Section 5.1, we have avoided this problem with the result weighting schema in BI-search. 6.2.3. Associative Search During the ﬁnal stages of the development we conducted user tests on the system and collected feedback about the search engine and the associative search. The test setup was simple: user does a search after which he or she checks the results and is asked to look the associative search panel on the screen. If there are interesting terms present, a new search is made. The system received favourable comments especially about the usability. First, it was easy to continue the search after the initial results as the related concepts were present. A couple of users commented, that by doing a new search using the related terms produced new ideas for the project by pointing towards a possible domain for test cases and directed towards persons with similar completed projects within the company; even though the original search did not produce such results.

7. Conclusions In this paper we presented an unsupervised method for implementing association networks. We used the method for modelling a query space and utilised the network in query expansion and in enhancing the usability of the BI-search system by presenting the relevant associative concepts to the user. We used keywords rather than free text as they contain approximately the same information in a more concise way, making it easier to extract the concepts of the domain. The network itself is a useful and intuitive tool to present the associations between the concepts. When compared, for example, to matrices, the network requires much less space and is more intuitive and efﬁcient to use. The results proved this approach useful for our needs. Even though precision was lower, as was expected, recall was high. The network was able to make two improvements to the search: (1) to provide results when null results would otherwise occur, and (2) to provide additional results that could interest the user. The user interface and usability of the system were also successfully improved, as the user feedback indicated.

96

M. Timonen et al. / Modelling a Query Space Using Associations

In the future we will experiment with the association network in other domains such as content-based recommendation systems. An interesting challenge is extracting the concepts from free text, such as abstracts. Future improvements to query expansion may be to ﬁnd the strongest paths between the query terms and expanding the search using the concepts on each path. This may be efﬁcient as it concentrates on several query terms at the same time instead of just one.

References [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

[11]

[12] [13] [14] [15] [16] [17] [18]

Timonen, M., Implementation of an Ontology-Based Biological Knowledge Base, Master’s Thesis, Department of Computer Science, University of Helsinki, Helsinki, 2007. Agrawal, R., Imielinski, T., Swami, A., Mining association rules between sets of items in large databases, SIGMOD rec., 22(2), 1993, pp. 207-216. Gurney, K., Neural Networks, CRC, 1997. Tetko, I., Associative Neural Networks, Neural Processing Letters, 16(2), 2002, pp. 187-199. Anderson, J., Bower, G,. Human Associative Memory: A brief edition, Psychology press, 1980. Raaijmakers, J., Schiffrin, R., Search of associative memory, Psychological Review, 8(2), 1981, pp. 98134. Hebb, D., The organization of behavior, New York: Wiley, 1949 Gruber, T., A translation approach to portable ontology speciﬁcations, Knowledge Acquisition, 5, 1993, pp. 199-220. Blomqvist, E., OntoCase - A Pattern-Based Ontology Construction Approach, On the Move to Meaningful Internet Systems 2007: CoopIS, DOA, ODBASE, GADA, and IS, pp. 971-988. Kelly, D., Belkin, N., Reading Time, Scrolling and Interaction: Exploring Implicit Sources of User Preferences for Relevance Feedback During Interactive Information Retrieval, SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, New Orleans, Louisiana, United States, 2001, pp. 408-409. Buckley, C., Salton, G., and Allan, J., Automatic Retrieval with Locality Information Using Smart, Text REtrieval Conference (TREC-1), National Institute of Standards and Technology, Gaithersburg, MD, 1992, pp. 59-72. Efthimiadis, N. E., Query Expansion, Annual Review of Information Systems and Technology, 31, 1996, pp. 121-187. Schutze, H., Pedersen, J., A cooccurrence-based thesaurus and two applications to information retrieval, Information Processing & Management, 33(3), 1997, pp. 307-318. WordNet: An Electronic Lexical Database, http://wordnet.princeton.edu/ Wang, Y., Vandendorpe, J., Relational Thesauri in Information Retrieval, Journal of the American Society for Information Science, 36(1), 1985, pp. 15-27. Hearst, M., Multi-paragraph segmentation of expository text, Proceedings of the 32nd annual meeting on Association for Computational Linguistics, Las Cruces, New Mexico, United States, 1994, pp. 9-16. Salton, G., Buckley, C., Term-weighting approaches in automatic text retrieval, Information Processing & Management, 24(5), 1988, pp. 513-523. Crestani, F., Application of Spreading Activation Techniques, Information Retrieval, Artiﬁcial Intelligence Review, 11(6), 1997, pp. 453-482.

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-97

97

Architecture-Driven Modelling Methodologies Hannu JAAKKOLA a,1 and Bernhard THALHEIM b,2 Tampere University of Technology, P.O.Box 300, FI-28101 Pori, Finland b Christian-Albrechts-University Kiel, Computer Science Institute, 24098 Kiel, Germany a

Abstract. Classical software development methodologies take architectural issues as granted or pre-determined. They thus neglect the impact decisions for architecture have within the development process. This omission is applicable as long as we are considering monolithic systems. It cannot however been kept whenever we move to distributed systems. Web information systems pay far more attention to users support and thus require sophisticated layout and playout systems. These systems go beyond what has been known for presentation systems. We thus discover that architecture plays a major role during systems analysis, design and development. We thus target on building a framework that is based on early architectural decisions or on integration of new solutions into existing architectures. We aim at development of novel approaches to web information systems development that allow a co-evolution of architectures and software systems. Keywords. architecture-driven development, software development, web, information systems, modelling.

1. Introduction Typical components of modern information systems are large databases, which are utilized trough internet connections. The applications - Web Information systems (WIS) are usually large and the structure of them is complex covering different types of assets from reusable architectures to COTS components and tailored software elements. The complexity of information systems is increased also the growing demand of interoperability expectations. Larry Boehm - in his conference paper [1] - is using the term “complex systems of systems” in this context. His message is that modern information systems are layered and complex structures based on interoperability between individual systems, products and services. There is no commonly agreed deﬁnition for the notion of a software architecture 3 . Some of the notions we found in the literature are too broad, some others are too narrow4 . Boehm [2] approaches the topic by analyzing the trends that are worth of knowing in 1 Corresponding

Author: hannu.jaakkola@tut.ﬁ http://www.pori.tut.ﬁ/∼ hj http://www.is.informatik.uni-kiel.de/∼ thalheim 3 Compare the large list more than hundred of deﬁnitions collected from contributors to http://www.sei.cmu.edu/architecture/start/community.cfm 4 Compare http://www.sei.cmu.edu/architecture/start/moderndefs.cfm 2 [email protected]

98

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

adapting the software engineering practices and methods in the current needs. One of his ﬁndings points out the importance of architectures. Architectures are means to communicate about software, to set up preconditions to the components and interfaces, to adopt beneﬁcial approaches for strategic reuse in software development, etc. Architecture has three roles: • to explain: architecture explains the structure of software; • to guide: architecture guides the designer to follow the predeﬁned commonly accepted rules; • to enable: architecture provides high level mechanism to implement the requirements set to the product. In modern software development especially the role of enabling architectures has been growing as the role of reuse as a part of development is increasing. A similar observation has been made for advanced database system architectures [6,14]. A key observation for database management systems has been that the invariants in database processing determine the architecture of a system. [6] predicted that novel systems such as native XML systems must either use novel architectures or let the user experience the “performance catastrophe”. Business information systems applications that target novel applications, e.g., SOA [15,21], require completely different architectures. Architecture is a term that must cope with a variety of different aspect reﬂections and viewpoints. The Quasar model of sd& m [23]) distinguished between the application architecture that reﬂects the outside or gray-box view of a system, the technical or module construction architecture that separates components or modules for construction and implementation, and the technical infrastructure architecture that considers the embedding of the system into a larger system or into the supporting infrastructure. This separation of concern is similar to different viewpoints in geometry that uses the top view, the proﬁle view, and the ground view. These views are only three views out of a large variety of views. We use the following deﬁnition of the notion architecture: A system architecture represents the conceptual model 5 of a system together with models derived from it that represent (1) different viewpoints deﬁned as views on top of the conceptual model, (2) facets or concerns of the system in dependence on the scope and abstraction level of various stakeholders, (3) restrictions for the deployment of the system and description of the quality warranties of the system, and (4) embeddings into other (software) systems. We can distinguish ﬁve standard views in an architectural framework: (I) The information or data view represents the data that is required by the business to support its activities. This answers the what information is being processed question. (II) The functional business or domain view represents all the business processes and activities that must be supported. This answers the “what business activities are being carried out”. (III) The integration or data-ﬂow view represents the ﬂow of information through the business, where it comes from and where it needs to go. This answers the which business activities require it question. (IV) The deployment or technology view represents the 5 The conceptual model includes structural, behavioural and collaboration elements. Systems might be modularised or can also be monolithic. The conceptual model allows us to derive a speciﬁcation of the system capacity. We may distinguish between standard views and views that support different purposes such as system construction, system componentisation, documentation, communication, analysis, evolution or migration, mastering of system complexity, system embedding, system examination or assessment, etc.

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

99

physical conﬁguration and technology components used to deploy the architecture in the operating environment. This answers the where is the information located question. (V) The infrastructure or embedment view represents the system as a black- or grey-box and concentrates on the embedding of the system into other systems that are either supporting the system or are using systems services. Web information systems are typically layered or distributed systems. Layering and distribution results in rather speciﬁc data structures and functions that are injected in order to cope with the speciﬁc services provided by layers or components. The CottbusNet projects used a multi-layer and distributed environment. For instance, the events calendar in city information systems may use a dozen or more different database systems and a view tower. A view tower of such systems must provide advanced search facilities [4]. It uses views that compile a variety of ETL results into a common view for events, an extraction view for presentation of events at a classical website or at other media such as video text canvas or smart phone display, a derived search functionality for these data, and a collection view for a shopping cart of a event shopper. A similar observation can be made for OLTP-OLAP systems [12,13]. OLAP systems are typically built on top of OLTP systems by applying ﬁrst grouping and aggregation functions and second by integrating data obtained into a data mart presentation. In projects aiming in developing web information systems [25] we discovered that interactivity required redevelopment and adjustment of functionality and of structuring of supporting database systems. Therefore, the presentation layer of a system “struck through” to the support system and resulted in change of this system. This observation complements the observations such as [6,14,21] and shows that web information systems must be build on a more ﬂexible consideration of architectures. These observations can be summarized into the architecture/application impedance mismatch: Architecture solutions heavily inﬂuence the capability of a system and must be considered as an orthogonal dimension during systems development. Outline of the Paper This paper opens discussion on Architecture-Driven Modelling Methodologies in the connection with large Web Information Systems. The paper has its roots in a joint research project of the co-authors; the project has had connections to other related research activities of the participating organisations, and it is funded by DAAD in Germany and Academy of Finland. This paper provides an overview to the approach and methodology developed in the project. Sections 2 introduces the key concepts of the paper. Sections 3 and 4 cover the bindings of the topic to the state-of-the-art of classical IS methodologies and to the Co-Design approach developed by one of the co-authors [25,19]. Architecture Driven Methodologies are discussed in Section 5. The paper summarises the ﬁndings of the project by introducing a four-dimensional or four-facetted model to software development in Section 6. 2. Architecture-Driven Modelling of Web Information Systems 2.1. The Challenges of Modern Web-Based and Web Information Systems Web information systems (WIS) [3,9,20] augment classical information systems by modern Web technologies. They require at the same time a careful development and sup-

100

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

port for the interaction or story spaces beside the classical support for the working space of users. These dimensions complicate the system development process. Usually, WIS are data-intensive applications which are backed by a database. While the development of information systems is seen as a complex process, Web information systems engineering adds additional obstacles to this process because of technical and organizational speciﬁcs: • WIS are open systems from any point of view. For example, the user dimension is a challenge. Although purpose and usage of the system can be formulated in advance, user characteristics cannot be completely predeﬁned. Applications have to be intuitively usable because there cannot be training courses for the users. Non-functional properties of the application like ‘nice looking’ user interfaces are far more important compared with standard business software. WIS-E is not only restricted to enterprises but is also driven by an enthusiastic community fulﬁlling different goals with different tools. • WIS are based on Web technologies and standards. Important aspects are only covered by RFCs because of the conception of the Internet. These (quasi)standards usually reﬂect the ‘common sense’ only, while important aspects are handled individually. • Looking at the complete infrastructure, a WIS contains software components with uncontrollable properties like faulty, incomplete, or individualistically implemented Web browsers. • Base technologies and protocols for the Web were deﬁned more than 10 years ago to fulﬁll the tasks of the World Wide Web as they had been considered at this time. For example, the HTTP protocol was deﬁned to transfer hypertext documents to enable users to browse the Web. The nature of the Web changed signiﬁcantly since these days, but there were only minor changes to protocols to keep the Holy Cow of Compatibility alive. Today, HTTP is used as a general purpose transfer protocol which is used as the backbone for complex interactive applications. Shortcomings like statelessness, loose coupling of client and server, or the restrictions of the request-response communication paradigm are covered by proprietary and heavy-weight frameworks on top of HTTP. Therefore, they are not covered by the standard and handled individually by the framework and the browser, e.g., session management. Small errors may cause unwanted or uncontrollable behavior of the whole application or even security risks. WIS can be considered from two perspectives: the system perspective and the user perspective. These perspectives are tightly related to each other. We consider the presentation system as an integral part of WIS. It satisﬁes all user requirements. It is based on real life cases. Software engineering has divided properties into functional and nonfunctional properties, restrictions and pseudo-properties. This separation can be understood as a separation into essential properties and non-essential ones. If we consider the dichotomy of a WIS then this separation leads to a far more natural separation into information system requirements and presentation systems requirements. The system perspective considers properties such as performance, efﬁciency, maintainability, portability, and other classical functional and non-functional requirements. Typical presentation system requirements are usability, reliability, and requirements oriented to high quality in use, e.g., effectiveness, productivity, safety, privacy, and satisfaction. Safety and security are also considered to be restrictions since they specify undesired behavior of systems.

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

101

Pseudo-properties are concerned with technological decisions such as language, middleware, operating system or are imposed by the user environment, the channel to be used, or the variety of client systems. WIS must provide a sophisticated support for a large variety of users, a large variety of usage stories, and for different (technical) environments. Due to this ﬂexibility the development of WIS differs from the development of information systems by careful elaboration of the application domain, by adaptation to users, stories, environments, etc. Classical software engineering typically climbs down the system ladder to the implementation layer in order to create a productive system. The usual way in today’s WIS development is a manual approach: human modelling experts interpret the speciﬁcation to enrich and transform it along the system ladder. This way of developing speciﬁcations is error-prone: even if the speciﬁcation on a certain layer is given in a formal language, the modelling expert as a human being will not interpret it in a formal way. Misinterpretations, misunderstandings, and therefore the loss of already speciﬁed system properties is the usual business. 2.2. The Classical Presentation System Development for Web Information Systems Classical approaches to web information systems are often based on late integration of presentation systems into the WIS information system. This approach is depicted in in Figure 1. Classically several layers of abstraction are identiﬁed. The top layer is called the application domain layer. It is used to describe the system in a general way: What are the intentions? Who are the expected users? The next lower layer is called the requirements prescription layer, which is used to concretise the ideas gathered on the application domain layer. This means to get a clearer picture of the different kinds of users and their proﬁles. This may also include the different roles of users and tasks associated with these roles. The major part of this layer, however, deals with the description of the story board. Stories identify possible paths through the system and the information that is requested to enable such paths. So the general purpose of the business layer is to anticipate the behaviour of the system’s users in order to set up the system in a way that supports the users as much as possible. The central layer is the conceptual layer. Whilst the requirements prescription layer did not pay much attention to technical issues, they come into play on the conceptual layer. The various scenes appearing in the story board have to be analysed and integrated, so that each scene can be supported by a unit combining some site content with some functionality. This will lead to designing abstract media types. The information content of the media types must be combined to design the structure of an underlying database. The next lower layer is the presentation layer which is devoted to the problem of associating presentation options to the media types. This can be seen as a step towards implementing the system. Finally, the lowest layer is the implementation layer. All the aspects of the physical implementation have to addressed on this layer. This includes setting up the logical and physical database schemata, the page layout, the realisation of functionality using scripting languages, etc. As far as possible, components on the implementation layer, especially web-pages, should be generated from the description on the higher layers.

102

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

PP

Description/ prescription layer

Conceptual layer

Implementation layer

PP PP PP Application PP

area PP PP description PP PP PP PP PP PP P P PP PP PP Design P PP Requirements Reﬁnement PP PP prescriptions P WIS description PPP PP P and prescription P

Information system

speciﬁcation Presentation system speciﬁcation WIS speciﬁcation Information system Implementation Transformation Presentation system Web information system

Figure 1. The classical dichotomy of human-computer systems and the systems ladder

This approach has the advantage that the presentation system speciﬁcation is based on database views. The entire presentation depends on the maturity of the information systems speciﬁcation. For this reason we may prefer the development according to the methodology depicted in Figure 1 or better in Figure 4.

3. State of the Art and Classical (Web) Information Systems Methodologies ARIS (Architecture of Integrated Information Systems) [16] deﬁnes a framework with ﬁve views (functional, organizational, data, product, controlling) and three layers (conceptual (‘Fachkonzept’), technical (‘DV-Konzept’), and implementation). ARIS was designed as a general architecture for information systems in enterprise environments. Therefore, it is too general to cover directly the speciﬁcs of Web information systems and needs to be tailored. The Rational Uniﬁed Process (RUP) [10] is an iterative methodology incorporating different interleaving development phases. RUP is backed by sets of development tools.

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

103

RUP is strongly bound to the Uniﬁed Modelling Language (UML). Therefore, RUP limits the capabilities of customization. Like ARIS, RUP does not address the speciﬁcs of WIS-E. A similar discussion can be made for other general purpose approaches from software engineering. OOHDM [22] is a methodology which deals with WIS-E speciﬁcs. It deﬁnes an iterative process with ﬁve subsequent activities: requirements gathering, conceptual design, navigational design, abstract interface design, and implementation. OOHDM considers Web Applications to be hypermedia applications. Therefore, it assumes an inherent navigational structure which is derived from the conceptual model of the application domain. This is a valid assumption for data-driven (hypermedia-driven) Web applications but does not ﬁt the requirements for Web information systems with dominating interactive components (e.g., entertainment sites) or process-driven applications. There are several other methodologies similar to OOHDM. Like OOHDM, most of these methodologies agree in an iterative process with a strict top-down ordering of steps in each phase. Surprisingly, most of these methodologies consider the implementation step as an ‘obvious’ one which is done by the way, although speciﬁcs of Web applications cause several pitfalls for the unexperienced programmer especially in the implementation step. Knowledge management during the development cycles is usually neglected. There are several methodologies that cope with personalization of WIS. For example, the HERA methodology [7] provides a model-driven speciﬁcation framework for personalized WIS supporting automated generation of presentation for different channels, integration and transformation of distributed data and integration of Semantic Web technologies. Although some methodologies provide a solid ground for WIS-E, there is still a need for enhancing the possibilities for specifying the interaction space of the Web information system, especially interaction stories based on the portfolio of personal tasks and goals. This list of projects is not complete. Most of the project are not supporting conceptual development but provide services for presentation layout or playout. The Yahoo pipes project6 uses mashup services for remixing popular feed types. The Active Record pattern embeds the knowledge of how to interact with the database directly into the class performing the interaction.

4. Co-Design of Web Information Systems We distinguish a number of facets or views on the application domain. Typical facets to be considered are business procedure and rule facets, intrinsic facets, support technology facets, management and organization facets, script facets, and human behavior. These facets are combined into the following aspects that describe different separate concerns: • The structural aspect deals with the data which is processed by the system. Schemata are developed which express the characteristics of data such as types, classes, or static integrity constraints. • The functional aspect considers functions and processes of the application. • The interactivity aspect describes the handling of the system by the user on the basis of foreseen stories for a number of envisioned actors and is based on media 6 See:

http//pipes.yahoo.com

104

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

objects which are used to deliver the content of the database to users or to receive new content. • The distribution aspect deals with the integration of different parts of the system which are (physically or logically) distributed by the explicit speciﬁcation of services and exchange frames. Each aspect provides different modelling languages which focus on speciﬁc needs. While higher layers are usually based on speciﬁcations in natural language, lower layers facilitate formally given modelling languages. For example, the classical WIS Co-Design approach uses the Higher-Order Entity Relationship Modelling language for modelling structures, transition systems and Abstract State Machines for modelling functionality, Sitelang for the speciﬁcation of interactivity, and collaboration frames for expressing distribution. Other languages such as UML may be used depending on the skills of modelers and programmers involved in the development process. A speciﬁcation of a WIS consists of a speciﬁcation for each aspect such that the combination of these speciﬁcations (the integrated speciﬁcation) fulﬁlls the given requirements. Integrated speciﬁcations are considered on different levels of abstraction (see Figure 2) while associations between speciﬁcations on different levels of abstraction reﬂect the progress of the development process as well as versions and variations of speciﬁcations. Unfortunately, the given aspects are not orthogonal to each other in a mathematical sense. Different combinations of speciﬁcations for structure, functionality, interactivity, and distribution can be used to fulﬁll given requirements while the deﬁnition of the ‘best combination’ relies on non-functional parameters which are only partially given in a formal way. Especially the user perspective of a WIS contributes many informal and vague parameters possibly depending on intuition. For example, ordering an article in an online shop may be modelled as a workﬂow. Alternatively, the same situation may be modelled by storyboards for the dialog ﬂow emphasizing the interactivity part. This principle of designing complex systems is called Co-Design, known from the design process of embedded systems where certain aspects can be realized alternatively in hardware or software (Hardware Software Co-Design). The Co-Design approach for WIS-E developed in the Kiel project group deﬁnes the modelling spaces according to this perception. We can identify two extremes of WIS development. Turnkey development is typically started from scratch in a response to a speciﬁc development call. Commercial offthe-shelf development is based on software and infrastructure whose functionality is decided upon by the makers of the software and the infrastructure than by the customers. A number of software engineering models has been proposed in the past: waterfall model, iterative models, rapid prototyping models, etc. The Co-Design approach can be integrated with all these methods. At the same time, developers need certain ﬂexibility during WIS engineering. Some information may not be available. We need to consider feedback loops for redoing work that has been considered to be complete. All dependencies and assumptions must be explicit in this case. In [5] we discussed one strategy to early incorporate architectural concerns into website development. The outcome was a methodology with a third development step that aims in the development of a systems architecture before any requirements elicitation is deployed.

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

Application domain layer

Scoping

? Requirements acquisition layer

Variating

? Business user layer

Designing

A A

X XX X

A

A

A

A A

XXX X XX Distribution speciﬁcation XXX XXX XX

105

? Conceptual layer Structuring ImpleXX speciﬁcation XX menting XX ? XX XX Implementation Functionality layer speciﬁcation

A

A A

A

A

A

A A Dialogue speciﬁcation

Figure 2. Abstraction Layers and Model Categories in WIS Co-Design

Architectural styles provide an abstract description of general characteristics of a solution. The following table list some of the styles. Style Client-Server ComponentBased Architecture Layered Arch. Message-Bus

N-tier / 3-tier

ObjectOriented Separated Presentation SOA

Description Segregates the system into two applications, where the client makes a service request to the server. Decomposes application design into reusable functional or logical components that are location-transparent and expose well-deﬁned communication interfaces. Partitions the concerns of the application into stacked groups (layers). A software system that can receive and send messages that are based on a set of known formats, so that systems can communicate with each other without needing to know the actual recipient. Segregates functionality into separate segments in much the same way as the layered style, but with each segment being a tier located on a physically separate computer. An architectural style based on division of tasks for an application or system into individual reusable and self-sufﬁcient objects, each containing the data and the behavior relevant to the object. Separates the logic for managing user interaction from the user interface (UI) view and from the data with which the user works. Refers to Applications that expose and consume functionality as a service using contracts and messages.

Each of these styles has strengthes, weaknesses, opportunities, and threats. Strengths and opportunities of certain architectural styles are widely discussed. Weaknesses and threats are discovered after implementing and deploying the decision. For instance, the strengths of SOA (service oriented architecture) are domain alignment, abstraction, reusable com-

106

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

ponents, and discoverability. Weaknesses of SOA are acceptance for SOA within the organization, harder aspects of architecture and service modeling, implementation difﬁculties for a team, methodologies and approaches for implementing SOA, and missing evaluations of various commercial products that purport to help with SOA rollouts. Threats of SOA are the development of a proper architectural plan, the process plan, resource scope, the application of an iterative methodology, the existence of a governance strategy, and the agreement on clear acceptance criteria. Therefore, a selection of an architecture has a deep impact on the web information system itself and drives the analysis, design and development of such systems. Figures 1 and 4 consider a separation of systems into a presentation system and a support system, i.e. the classical client-server decision. The picture is more complex if we decide to use 3-tier, SOA or other architectures. The structuring and the functionality that are provided by each of the subsystems must be properly designed. Therefore, the architectural style is going to drive the development process. 5. Architecture-Driven and Application-Domain-Ruled Modelling Methodologies The project we report was aiming in bridging two technologies developed in the research groups at Kiel and Tampere universities. The Tampere team has been concentrating in the past on software development technologies and methodologies. They have been contributing to corresponding standards. The Kiel team has gained deep insight into web information systems development. In the past the two groups have already been collaborating for the development of a web information systems design methodology. We built a framework that is based on early architectural decisions or on integration of new solutions into existing architectures. We aim in development of novel approaches to web information systems development that allow a co-evolution of architectures and software systems. WIS development results in a number of implemented features and aspects. These features and aspects are typically well-understood since they are similar to classical software products. One dimension that has often been taken into consideration at the intentional level is the level of detail or granularity of the description. Classical databases schemata are, for instance, schemata at the schema level of detail. This schema level is extended by views within the three-level architecture of database systems. These views are typically based on macro-schemata. Online analytical processing and data warehouse applications brought another level of detail and are based on aggregated data. Content management systems are additionally based on annotations of data sets and on concepts that explain these data sets and provide their foundation. Finally, scientiﬁc applications require another schema design since they use sensor data which are compacted and coded. These data must be stored together with the ‘normal’ data. The architectural component has been neglected for most systems since architecture has been assumed to be canonically given. This non-consideration has led to a number of competing architectures for distributed, main-frame or client-server systems. These architectures can however been considered as elements of the architecture solution space. Therefore the development space for software systems development can be considered to be three-dimensional. Figure 3 displays this space. Web information systems development has sharpened the conﬂicting goals of system development. We must consider at the same time a bundle of different levels of details,

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

107

Signatures/ schemata/ speciﬁcation languages 6founded annotated aggregated macro-schema annotated aggregated macro-schema aggregated macro-schema macro-schema schema (sensor) micro-schema Processes

Feature A Aspect B mainframe client/server federated collaborated collaborating on demand

-

and products of development

Architectures Figure 3. The Development Space for Web Information Systems

languages and schemata. Systems will not provide all features and aspects to all users. Users will only get those services that are necessary for their work. At the same time, a number of architectural solutions must co-exist. 5.1. Development by Separation of Concern Our approach concentrates on the separation of concern for development. We shall distinguish the user request diploid within a development: Application domain modelling aims in meeting the expectations of the user depending on their proﬁle and their work portfolio. Users want to see a system as companion and do not wish to get another additional education before they can use a system. Architecture modelling proposes a realisation alternative. This architecture is typically either based on already existing solutions or must be combined with the user system. Separation of concern for development allows to decompose an application into ﬁelds of action, thought or inﬂuence. All components have an internal structure formed from a set of smaller interlocking components (sub-component) performing well-deﬁned functions within the overall application domain. Separation of concern covers the what, who, when and (if its relevant) the why aspects of the business and allows us to identify ‘owners’ and ‘inﬂuencers’ of each signiﬁcant business activity that we need to consult whenever we want to change any of these aspects. A prescriptive (i.e., principles driven) separation is easier to justify to business stakeholders when proposals are put forward to restructure a business activity to improve overall efﬁciency. Functional business areas have a high inﬂuence on a system. They are identiﬁable vertical business areas such as ﬁnance, sales & marketing, human resources or product manufacturing; and in other cases, they are cross-functional “horizontal” areas such as customer service or business intelligence. Therefore, the business areas already govern the architecture of a system. The establishment of an “ownership” of an information ﬂow assigns the owner to be responsible for making the data available to other business areas as and when those business areas require it. “Inﬂuencers” of an information ﬂow need to be consulted when any changes are proposed to ensure that they can comply with the

108

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

change. Coherence boundaries are the points at which different functional business areas have to communicate with the outside world in a consistent and grammatically structured language. This request diploid is mapped then to different systems and can be separated as shown in Figure 1. We typically distinguish between the user system e.g. consisting of the presentation system and possibly of supporting systems and the computer system which uses a certain architecture, platform and leads to an implementation. Based on the abstraction layer model in Figure 2 we may distinguish different realisations of systems: Information-systems-driven development is based on late integration of the presentation and user system. Presentation systems are either developed after the conceptualisation has been ﬁnished (this leads to the typical ladder in Figure 1) or are started after the implementation has been developed. In this case we distinguish the following phases: 1. application domain description; 2. requirements elicitation, acquisition, and compilation prescription; 3. business user layer; 4. conceptual layer; 5. implementation layer. Web information systems use more ﬂexible architectures. Their development is intentionally often already based on development methodologies presented in Figure 4. So far, no systematic development of an methodology beside the methodology developed in our collaboration has been made. We typically may distinguish the following phases: 1. application domain description; 2. requirements elicitation, acquisition, and compilation prescription; 3. conceptual systems layer; 4. presentation systems layer; 5. implementation layer. Additionally we may also consider the deployment, maintenance, ... etc. layers. We restricted our project to the layers discussed above. 5.2. Abstraction Layering During Systems Development Our approach allows to integrate architecture development with architecture development. Top-down development of systems seems to be the most appropriate whenever a system is developed from scratch or a system is extended. For this reason, we may differentiate among three layers: the systems description and prescription layer, the conceptual speciﬁcation layer, and the systems layer. These layers may be extended by the the strategic layer that describes the general intention of the system, by the business user layer that describes how business users will see the system and by the logical layer that relates the conceptual layer to the systems layer by using the systems languages for pro-

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

109

gramming and speciﬁcation. Figure 4 relates the three main layers of systems development. The system ladder distinguishes at least between the following reﬁnement layers: description / prescription, speciﬁcation, and implementation. The reﬁnement layers allow to concentrate on different aspects of concern. At the same time, reﬁnement is based on reﬁnement decisions which should be explicitly recorded. The implementation is the basis for the usage. The dichotomy distinguishes between the user world and the system world. They are related to each other through user interfaces. So, we can base WIS engineering on either the user world description, the systems prescription, the developers presentation speciﬁcation, the developers systems speciﬁcation. We may extend the ladder by introduction layer, the deployment layer, and the maintenance layer. Since the last layers are often considered to be orthogonal to each other and we are mainly discussing WIS engineering the three layers are out of our scope. 5.3. Another Dichotomy for Web Information Systems Development We thus develop another methodology for web information systems. WIS have two different faces: the systems perspective and the user perspective. These perspectives are tightly related to each other. We consider the presentation system as an integral part of WIS. It satisﬁes all user requirements. It is based on real life cases. The dichotomy is displayed in Figure 4 where the right side represents the system perspective and the left side of the ladder represents the user perspective. Software engineering has divided properties into functional and non-functional properties, restrictions and pseudo-properties. This separation can be understood as a separation into essential properties and non-essential ones. If we consider the dichotomy of a WIS then this separation leads to a far more natural separation into information system requirements and presentation systems requirements. The system perspective considers properties such as performance, efﬁciency, maintainability, portability, and other classical functional requirements. Typical presentation system requirements are usability, reliability, and requirements oriented to high quality in use, e.g., effectiveness, productivity, safety, privacy, and satisfaction. Safety and security are also considered to be restrictions since they specify undesired behaviour of systems. Pseudo-properties are concerned with technological decisions such as language, middleware, operating system or are imposed by the user environment, the channel to be used, or the variety of client systems.

6. Extending the Triptych to the Software Modelling Quadruple We are going to combine the results of the ﬁrst three solutions into architecture development. One dimension of software engineering that has not been yet integrated well is the software architecture. Modelling has different targets and quality demands depending on the architecture. For instance, mainframe-oriented modelling concentrates on the development of a monolithic schema with a support by view schemata for different aspects of the application. Three-tier architectures separate the system schema into presentation schemata, business process schemata and supporting database schemata based on separation of concern and information hiding. Component architectures are based on

110

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

Description/ prescription layer

Conceptual layer

Implementation layer

PP PP PP Application PPP PP area PP description PP PP PP PP PP PP PP P P Design PP PP P P Requirements PP Reﬁnement PP PPprescriptions PP PP WIS description PP PP PP and prescription PP PP Presentation system PP PP speciﬁcation PP PP PP PP PP PP PP PP PP Implementation PPP P PP Information systems Transformation PP PP speciﬁcation PP PP PP PP PP PP WIS speciﬁcation P PP PP Presentation P system PP PP PP PP PP PP PP PP PP PP P P PP PP Information PP PP system PP Web information system PPP P

Figure 4. The dichotomy of human-computer systems and the systems ladder

‘meta-schemata’ that describe the intention of the component, the interfaces provided by the component, and the bindings among the interfaces. SOA architectures encapsulate functionality and structuring into services and use orchestration for realisation of business tasks through mediators. Therefore, application domain description is going to be extended by consideration of architectures and environments. Software architecture is often considered from the technical or structural point of view and shows the association of modules or packages of software. Beside this structural point of view we consider the application architecture that illustrates the structure of the software from the application domain perspective. Additionally we might include the perspective of the technical infrastructure, e.g. periphery of the system. These three viewpoints are one the most important viewpoints of the same architecture. We call an architecture documentation architecture blueprint. Summarizing we ﬁnd four interwoven parts of a software system documentation that we need to consider and that is depicted in Figure 5. The tasks and the objective of (conceptual) modelling changes depending on the architecture that has been chosen for the system. 6.1. The Prescription of Requirements Architecture has an impact on development of early phases. We consider ﬁrst requirements description.

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

111

Application domain description @ @ @ Architecture @ Requirements blueprint @ prescription @ @ @ Software speciﬁcation Figure 5. The Software Engineering Quadruple

Software engineering has divided properties into functional and non-functional properties, restrictions and pseudo-properties. This separation can be understood as a separation into essential properties and non-essential ones. If we consider the dichotomy of a WIS then this separation leads to a far more natural separation into information system requirements and presentation systems requirements. The system perspective considers properties such as performance, efﬁciency, maintainability, portability, and other classical functional requirements. Typical presentation system requirements are usability, reliability, and requirements oriented to high quality in use, e.g., effectiveness, productivity, safety, privacy, and satisfaction. Safety and security are also considered to be restrictions since they specify undesired behaviour of systems. Pseudo-properties are concerned with technological decisions such as language, middleware, operating system or are imposed by the user environment, the channel to be used, or the variety of client systems. Properties are often difﬁcult to specify and to check. We should concentrate on those and only those properties that can be shown to hold for the desired system. Since we are interested in prooﬁng or checking the adherence of the system to the properties we need to deﬁne properties in such a way that tests or proofs can be formulated. They need to be adequate, i.e. cover what business users expect. At the same time, they need to be implementable. We also must be sure that they can be veriﬁed and validated. 6.2. Architecture-Driven System Development WIS speciﬁcation is often based on an incremental development of WIS components, their quality control and their immediate deployment when the component is approved. The development method is different from those we have used in the ﬁrst layers. Application domain description aims in capturing the entire application based on exploration techniques. Requirements prescription is reﬁning the application domain description. Speciﬁcation is based on incremental development, veriﬁcation, model checking, and testing. This incremental process leads to different versions of the WIS: demo WIS, skeleton WIS, prototype WIS, and ﬁnally approved WIS. Software becomes surveyable, extensible and maintainable if a clear separation of concerns and application parts is applied. In this case, a skeleton of the application structure is developed. This skeleton separates parts or services. Parts are connected through interfaces. Based on this architecture blueprint, an application can be developed part by part. We combine modularity, star structuring, co-design, and architecture development to a novel framework based on components. Such combination seems to be not feasible. We discover, however, that we may integrate all these approaches by using a component-

112

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

based approach [26,27]. This skeleton can be reﬁned during evolution of the schema. Then, each component is developed step by step. Structuring in component-based codesign is based on two constructs: Components: Components are the main building blocks. They are used for structuring of the main data. The association among components is based on ‘connector’ types (called hinge or bridge types) that enable in associating the components in a variable fashion. Skeleton-based construction: Components are assembled together by application of connector types. These connector types are usually relationship types. A typical engineering approach to development of large conceptual models is based on general solutions, on an architecture of the solution and on combination operations for parts of the solution. We may use a two-layer approach for this kind of modelling. First, generic solutions are developed. We call these solutions conceptual schema pattern set. The architecture provides a general development contract for subparts of a schema under development. The theory of conceptual modelling may also be used for a selection and development of an assembly of modelling styles and perspectives. Typical wellknown styles [24] are inside-out reﬁnement, top-down reﬁnement, bottom-up reﬁnement, modular reﬁnement, and mixed skeleton-driven reﬁnement. A typical perspective is the three-layer architecture that uses a conceptual model together with a number of external models and an implementation model. Another perspective might be the separation into an OTP-OLAP-DW system. The adaptation of a conceptual schema pattern set to development contracts and of styles and perspectives leads to a conceptual schema grid. 6.3. Architecture Blueprint An architecture blueprint consists of models, documents, artifacts, deliverables etc. which are classiﬁed by the following states: The architecture framework consists of the information or data view, functional business or domain view, integration or data-ﬂow view, deployment or technology view, and infrastructure or embedment view. The WIS development architectures guide: The current architecture is the set all solution architecture models that have been developed by the delivery projects to date. Ownership of the solution architecture models are transferred to the current Enterprise Architecture when the delivery project is closed. The development state architecture represents the total set of architecture models that are currently under development within the current development projects. The target vision state architecture provides a blueprint for the future state of the architecture needed in order to satisfy the application domain descriptions and target operating model. 7. Applying Architecture-Driven and Application-Domain-Ruled Modelling Methodologies 7.1. The CottbusNet Design and Development Decisions Let us consider the event calendar in an infotainment setting of a city information system. This calendar must provide a variety of very different information from various heterogeneous resources:

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

113

• Event-related information: Which event is performed by whom? Where from are the actors? How the event is going on? • Location-based information: Which location can be reached by which trafﬁc under which conditions with whose support? • Audience information: Which audience is sought under which conditions, regulations and with which support? • Marketing information: Which provider or supplier markets the event under which time restrictions with which business rules? • Time-related information: Which speciﬁc time data should be provided together with events? • Intention information: Are there intentions of the event that should be provided? The event calendar is based on a different databases: event databases for big events, marketing events, sport events, cultural events, minor art events etc.; location databases for support of visitors of the event providing also trafﬁc, parking etc. information; auxiliary databases for business rules, time, regulations, ofﬁcial restrictions, art or sport activists, reports on former events etc. It is not surprising that this information is provided by heterogeneous databases, in a variety of formats, in a large bandwidth of data quality, in a variety of update policies. Additionally, it is required to deliver the data to the user in the right size and structuring, at the right moment and under consideration of the user’s information demand. Consider, for instance, minor art events such as a cabaret event held in a restaurant. The information on this event is typically incomplete, not very actual, partially inexact and partially authorised. The infotainment site policy requires however also to cope with such kinds of events. We might consider now a number of architectures,e.g., the following one: • Server-servlet-applet-client layered systems typically use a ground database system with the production data, a number of serving databases systems with the summarised and aggregated data based on media type technology [17], and playouting systems based on container technology [13] depending on adaption to the storyboard [18]. • OLTP-OLAP-Warehouse systems [11,12] use a ground database system for OLTP computing, a derived (summarised, aggregated) OLAP system for comprehensive data delivery to the user, and a number of data warehouses for data playout to the various kinds of users. Depending on these architectures we must enhance and extend the conceptual schema for the different databases, the workﬂow schemata for data input, storage, and data playout to the user. 7.2. The Resulting Quality of Service and Tracking Back Problems to Decisions Made Quality of WIS is characterized depending on the abstraction layers [8]: Quality parameters at business user layer may include ubiquity ( access unrestricted in time and space) and security/privacy (against failures, attacks, errors; trustworthy; privacy maintenance). Quality parameters at conceptual layer subsume interpretability (formal framework for interpretation) and consistency (of data and functions).

114

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

Quality parameters at implementation layer include durability (access to the entire information unless it is explicitly overwritten), robustness (based on a failure model for resilience, conﬂicts, and persistency), performance (depending on the cost model, response time and throughput), and scalability (to changes in services, number of clients and servers). We use a number of measures that deﬁne quality of service (QoS) for WIS: • Deadline Miss Ratio of User Transactions: In a WIS QoS speciﬁcation, a developer can specify the target deadline miss ratio that can be tolerated for a speciﬁc real-time application. • Data Freshness: We categorize data freshness into database freshness and perceived freshness. Database freshness is the ratio of fresh data to the entire temporal data in a database. Perceived freshness is the ratio of fresh data accessed to the total data accessed by timely transactions - transactions which ﬁnish within their deadlines. • Overshoot is the worst-case system performance in the transient system state. In this paper, it is considered the highest miss ratio over the miss ratio threshold in the transient state. In general, a high transient miss ratio may imply a loss of proﬁt in e-commerce. • Settling time is the time for the transient overshoot to decay and reach the steady state performance. • Freshness of Derived Data: To maintain the freshness, a derived data object has to be recomputed as the related ground database changes. A recomputation of derived data can be relatively expensive compared to a base data update. • Differentiated Timeliness: In WIS QoS requirements, relative response time between service classes can be speciﬁed. For example, relative response time can be speciﬁed as 1:2 between premium and basic classes. We observe that these quality of services characteristics are difﬁcult to specify in systems if architecture is not taken into consideration. Let us consider data freshness as an example for WIS. Data freshness results is related to information logistics that aims in providing the correct data at the best point of time, in the agreed format and quality for the right user with the at the right location and context. Methods for achieving the logistics goals are the analysis of the information demand, storyboarding of the WIS, an intelligent information system, the optimization of the ﬂow of data and the technical and organizational ﬂexibility. Therefore, data freshness can be considered to be a measure for appropriateness of the system. Depending on the requested data freshness we derive the right architecture of the system. 7.3. Resolution and Toleration of QoS Problems Based on our co-design modelling appraoch and as a result separation of concern within the software engineering quadruple be can derive a number of techniques for architecture-driven and application-domain-rules modelling of high quality WIS: • Introduction of artiﬁcial bottlenecks: Instead of replicating data at different sites or databases we may introduce a central data store that exhibits a numbe rof versions to each of the clients that require different data.

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

115

• Introduction of a tolerance model: We may introduce an explicit tolerance model that decreases the burden of data actuality to those web pages for which complete actuality is essential. • A cost-beneﬁt model of updates: Updates may sometimes causes a large overhead of internal computing due to constraint maintenance and due to propagation of the update to all derived data. We thus may introduce delays of updates and speciﬁc update obligations for certain time points. Typical resulting techniques are dynamic adaptation of updates and the explicit treatment by an update policy. • Data replication in a distributed environment: Data access can be limited in networking environments. The architecture may however introduce explicit data replication and speciﬁc update models for websites. This list of techniques is not complete but demonstrates the potential of architecturedriven WIS development. 8. Conclusion This paper discusses the results of a project that was aiming in developing a methodological approach to web information systems development. Most approaches known so far did not take into consideration architectural issues. Typically, they are taken for granted or assumed on default. This paper shows that architectures have a deep impact on the development methodology. We took as an example web information systems development. These systems are typically based on the 2-tier architectures. The information system development part is very-well considered. The presentation system development is often mixed with the information system development. It cannot however be mixed. We separate these two systems from each other. While separating we discover that in this case the application domain description ﬁts very well with the support by the presentation system. This description is the source for requirements prescription. The later results in software speciﬁcation and later development and coding of the system. The presentation system conceptualisation and coding can either be done before considering the information system or done afterwards. Classical approaches consider the three facets of system development: application domain description, requirements prescription and software speciﬁcation. We discover in this paper that there is a fourth facet that cannot be neglected: architecture of the system. Therefore, we extend the classical framework to the software modelling quadruple. References [1] B. Boehm. A view of 20th and 21st century software engineering. In Proc. ICSE’06, pages 12–29, ACM Press, 2006. [2] B. Boehm, D. Port, and K. Sullivan. White paper for value based software engineering. http://www.isis.vanderbilt.edu/sdp/Papers/, May 2007. [3] Stefano Ceri, Piero Fraternali, and Maristella Matera. Conceptual modeling of data-intensive web applications. IEEE Internet Computing, 6(4):20–30, 2002. [4] A. D¨usterh¨oft and B. Thalheim. Linguistic based search facilities in snowﬂake-like database schemes. Data and Knowledge Engineering, 48:177–198, 2004. [5] G. Fiedler, H. Jaakkola, T. M¨akinen, B. Thalheim, and T. Varkoi. Co-design of web information systems supported by SPICE. In Information Modelling and Knowledge Bases, volume XX, pages 123–138, Amsterdam, 2009. IOS Press.

116

H. Jaakkola and B. Thalheim / Architecture-Driven Modelling Methodologies

[6] T. H¨arder. XML databases and beyond - plenty of architectural challenges ahead. In ADBIS, volume 3631 of Lecture Notes in Computer Science, pages 1–16. Springer, 2005. [7] G.-J. Houben, P. Barna, F. Frasincar, and R. Vdovjak. HERA: Development of semantic web information systems. In Third International Conference on Web Engineering – ICWE 2003, volume 2722 of LNCS, pages 529–538. Springer-Verlag, 2003. [8] H. Jaakkola and B. Thalheim. A framework for high quality software design and development: A systematic approach. IET Software, 2010. to appear. [9] G. Kappel, B. Pr¨oll, S. Reich, and W. Retschitzegger, editors. Web Engineering: Systematische Entwicklung von Web-Anwendungen. dpunkt, 2003. [10] Philippe Kruchten. The Rational Uniﬁed Process - An Introduction. Addison-Wesley, 1998. [11] H.-J. Lenz and B. Thalheim. OLAP databases and aggregation functions. In Proc. SSDBM 2001, pages 91–100. IEEE, 2001. [12] H.-J. Lenz and B. Thalheim. OLTP-OLAP schemes for sound applications. In TEAA 2005, volume LNCS 3888, pages 99–113, Trondheim, 2005. Springer. [13] J. Lewerenz, K.-D. Schewe, and B. Thalheim. Modeling data warehouses and OLAP applications by means of dialogue objects. In Proc. ER’99, LNCS 1728, pages 354–368. Springer, Berlin, 1999. [14] Peter C. Lockemann. Information system architectures: From art to science. In BTW, volume 26 of LNI, pages 30–56. GI, 2003. [15] C. Pahl, W. Hasselbring, and M. Voss. Service-centric integration architecture for enterprise software systems. J. Inf. Sci. Eng., 25(5):1321–1336, 2009. [16] A.-W. Scheer. Architektur integrierter Informationssysteme - Grundlagen der Unternehmensmodellierung. Springer, Berlin, 1992. [17] K.-D. Schewe and B. Thalheim. Modeling interaction and media objects. In Proc. NLDB 2000, LNCS 1959, pages 313–324. Springer, 2001. [18] K.-D. Schewe and B. Thalheim. Reasoning about web information systems using story algebra. In ADBIS’2004, LNCS 3255, pages 54–66, 2004. [19] K.-D. Schewe and B. Thalheim. The co-design approach to web information systems development. International Journal of Web Information Systems, 1(1):5–14, March 2005. [20] K.-D. Schewe and B. Thalheim. Conceptual modelling of web information systems. Data and Knowledge Engineering, 54:147–188, 2005. [21] T. Schmedes. Entwurfsmethode f¨ur service-orientierte Architekturen im dezentralen Energiemanagement. In Multikonferenz Wirtschaftsinformatik. GITO-Verlag, Berlin, 2008. [22] D. Schwabe, G. Rossi, and S. Barbosa. Systematic hypermedia design with OOHDM. In Proc. Hypertext ’96, pages 116–128. ACM Press, 1996. [23] J. Siedersleben. Moderne Softwarearchitektur. dpunkt-Verlag, Heidelberg, 2004. [24] B. Thalheim. Entity-relationship modeling – Foundations of database technology. Springer, Berlin, 2000. [25] B. Thalheim. Co-design of structuring, functionality, distribution, and interactivity of large information systems. Technical Report 15/03, BTU Cottbus, Computer Science Institute, Cottbus, September 2003. 190pp. [26] B. Thalheim. Application development based on database components. In Y. Kiyoki H. Jaakkola, editor, EJC’2004, Information Modeling and Knowledge Bases XVI. IOS Press, 2004. [27] B. Thalheim. Component development and construction for database design. Data and Knowledge Engineering, 54:77–95, 2005.

Acknowledgement We would like to thank the Academy of Finland and the German Academic Exchange Service (DAAD) for the support of this research.

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-117

117

An Emotion-Oriented Image Search System with Cluster based Similarity Measurement using Pillar-Kmeans Algorithm a

Ali Ridho BARAKBAHa and Yasushi KIYOKI b Graduate School of Media and Governance, Keio University, Japan b Faculty of Environmental Information, Keio University, Japan 5322 Endoh, Fujisawa, Kanagawa, Japan, 252-8520 [email protected], [email protected] Abstract. This paper presents an image search system with an emotion-oriented context recognition mechanism. Our motivation implementing an emotional context is to express user’s impressions for retrieval process in the image search system. This emotional context recognizes the most important features by connecting the user’s impressions to the image queries. The Mathematical Model of Meaning (MMM: [2], [4] and [5]) is applied for recognizing a series of emotional contexts for retrieving the most highly correlated impressions to the context. These impressions are then projected to a color impression metric to obtain the most significant colors for subspace feature selection. After applying subspace feature selection, the system then clusters the subspace color features of the image dataset using our proposed Pillar-Kmeans algorithm. Pillar algorithm is an algorithm to optimize the initial centroids for K-means clustering. This algorithm is very robust and superior for initial centroids optimization for K-means by positioning all centroids far separately among them in the data distribution. It is inspiring that by distributing the pillars as far as possible from each other within the pressure distribution of a roof, the pillars can withstand the roof’s pressure and stabilize a house or building. It considers the pillars which should be located as far as possible from each other to withstand against the pressure distribution of a roof, as number of centroids among the gravity weight of data distribution in the vector space. Therefore, this algorithm designates positions of initial centroids in the farthest accumulated distance between them in the data distribution. The cluster based similarity measurement also involves a semantic filtering mechanism. This mechanism filters out the unimportant image data items to the context in order to speed up the computational execution for image search process. The system then clusters the image dataset using our Pillar-Kmeans algorithm. The centroids of clustering results are used for calculating the similarity measurements to the image query. We perform our proposed system for experimental purpose with the Ukiyo-e image dataset from Tokyo Metropolitan Library for representing the Japanese cultural image collections. Keywords. Image search, emotional context, multi-query images, subspace feature selection, cluster based similarity.

1. Introduction The World Wide Web has become a significant source of information, including image data. Everyday abundant information resources are transformed and collected into huge databases which make difficult in processing and analyzing data without the use of automatic approaches and techniques. Related to image data, many researchers and developers developed an efficient image searching, browsing, and retrieval systems in order to provide better ways and approaches for such kinds of activities.

118

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

The image retrieval systems based on the contents are attracting and challenging in research areas of image searching. Many content-based image retrieval (CBIR) systems have been proposed and widely applied to both commercial purposes and research systems. The system analyzes the content of an image by extracting primitive features such as color, shape, texture, etc. Most approaches have been introduced to explore the content of an image and identify the primary and dominant features inside the image. QBIC [3] introduced an image retrieval system based on color information inside an image. VisualSeek [7] represented a system by diagramming spatial arrangements based on representation of color regions. NETRA [8] developed a CBIR system by extracting color and texture features. Virage [6] utilized color, texture, and shape features for the image retrieval engine. CoIRS [10] also introduced a cluster oriented image retrieval system based on color, shape, and texture features. Veltkamp and Tanase [9] and Liu et al [11] presented a survey to many image retrieval systems using diverse features. Barakbah and Kiyoki introduced an image retrieval system by combining color, shape and structure features [12].

Figure 1. System architecture of our proposed image search system

Several researches addressed emotional recognition problems for the image retrieval system. The search system commonly constructs the emotion model driven by the user interaction to the system [17]. Park and Lee [18] introduced an emotion-based image retrieval driven by users. The system constructed emotion recognition by analyzing consistency feedbacks from the users. Solli and Lenz [19] developed an image retrieval system involving bags of emotion. The system used color emotion models derived from psychophysical experiments which are activity, weight and heat. However, it has not connected directly the queries of emotional expressions to the models yet. Wang and He [20] presented a survey on emotional semantic image

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

119

retrieval. The supervised learning techniques usually used to bridge semantic gap between image features and emotional semantics. This paper presents an image search system with an emotion oriented context recognition mechanism by connecting a series of emotion expressions to the color based impression. Our search system addresses a dynamic manipulation of unsupervised emotion recognition. Our motivation implementing an emotional context in the image search system is to express user’s impressions for retrieval process. This emotional context recognizes the most important features by connecting the user’s impressions to the image queries. In this system, the Mathematical Model of Meaning (MMM: [2], [4] and [5]) is applied and transformed to the color features with a color impression metric for subspace feature selection. Our previous work [14] presented how to connect the user’s impressions to the queries by involving a series of emotional contexts (such as “happy”, “calm”, “beautiful”, “luxurious”, etc.) and recognize the most important features for the image dataset and the image query. This paper continues our previous work by expanding the MMM vector space ([2], [4], [5]) with the lists impressions in the Color Image Scale. This paper also introduces a multi-query image search system by applying an aggregation mechanism to generate representative query colors for processing multi-query images. The Mathematical Model of Meaning (MMM) is applied and transformed to the color features with a color impression metric for subspace feature selection. This paper implements a cluster based similarity measurement in order to tie the similar colors of the subspace color features in a same group in the process of similarity measurement. We apply our previous work, Pillar-Kmeans algorithm, for the cluster based similarity measurement with involving a semantic filtering mechanism to filter out the irrelevant data. The Pillar-Kmeans algorithm is an optimized K-means clustering with our Pillar algorithm by generating initial centroids for K-means. Applying our Pillar-Kmeans algorithm for cluster based similarity measurement is important to reach high precision of the clustering result as well as to speed up the computational time of the clustering. Figure 1 shows the system architecture of the proposed image search system. We organize this paper as follows. In Section 2, the emotional context recognition mechanism using MMM is described. Section 3 discusses the feature extraction, representative query color generation of multi-image queries and subspace feature selection with a color impression metric. The cluster based similarity measurement using Pillar-Kmeans algorithm with a semantic filtering mechanism is described in Section 4. Section 5 describes the experimental results using the Ukiyo-e image dataset and discusses the performance analysis, and then followed by concluding remarks in Section 6.

2. Emotional Context Recognition Mechanism Our idea to recognize an emotional context in the image search system is to provide a function in which the users can express their impressions (such as “happy”, “calm”, “beautiful”, “luxurious”, etc.) for image search. This function finds the most essential features related to an emotional context, given as the user’s impressions to the image query. The Mathematical Model of Meaning (MMM) is applied for recognizing a series of emotional contexts for retrieving the most highly correlated impressions to the

120

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

context. In this section, the outline of the Mathematical Model of Meaning (MMM) is briefly reviewed. This model has been presented in [2], [4] and [5] in detail. 2.1. An overview of the Mathematical Model of Meaning In the Mathematical Model of Meaning [2][4][5], an orthogonal semantic space is created for semantic associative search. Retrieval candidates and queries are mapped onto the semantic space. The semantic associative search is performed by calculating the correlation of the retrieval candidates and the queries on the semantic space in the following steps: (1) A context represented as a set of impression words is given by a user, as shown in Figure 2(a). (2) A subspace is selected according to the given context as shown in Figure 2(b). (3) Each information resource is mapped onto the subspace and the norm of p is calculated as the correlation value between the context and the information resource, as shown in Figure 2(c).

Figure 2. Semantic associative search in MMM

2.2. The outline of semantic associative search in MMM The outline of the MMM is expressed as follows [2][4][5]: (1) A set of m words is given, and each word is characterized by n features. That is, an m by n matrix M is given as the data matrix. (2) The correlation matrix MTM with respect to the n features is constructed from the matrix M. Then, the eigen value decomposition of the correlation matrix is computed and the eigenvectors are normalized. The orthogonal semantic space MDS is created as the span of the eigenvectors which correspond to nonzero eigen values. (3) Context words are characterized by using the n features and representing them as n-dimensional vectors. (4) The context words are mapped into the orthogonal semantic space by computing the Fourier expansion for the n-dimensional vectors. (5) A set of all the projections from the orthogonal semantic space to the invariant subspaces (eigen spaces) is defined. Each subspace represents a phase of meaning, and it corresponds to a context or situation. (6) A subspace of the orthogonal semantic space is selected according to the user's impression expressed in n-dimensional vectors as context words, which are given as a context represented by a sequence of words.

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

121

The dynamic interpretation of meaning of data according to the given context words is realized through the selection of a semantic subspace from the entire semantic space that consists of approximately 2000 orthogonal vectors. A subspace is extracted by the semantic projection operator when context words, or the user’s impressions, are given. Thus, vectors of document data in the semantic subspace have norms adjusted accordingly with the given context words. The semantic interpretation is performed as projections of the semantic space dynamically, according to the given contexts, as shown in Figure 3. This process has been presented in our previous works [2][4][5] which describe as follows. 1. Defining a set of the semantic projections Πν: We consider the set of all the projections from the semantic space I to the invariant subspaces (eigen spaces). We refer to the projection as the semantic projection and the corresponding projected space as the semantic subspace. Since the number of i dimensional invariant subspaces is (v (v – 1)…(v – i + 1)) / i ! , the total number of the semantic projections is 2v. That is, this model can express 2v different phases of the meaning. 2. Constructing the Semantic Operator Sp: Suppose a sequence sℓ of ℓ words which determines the context is given. We construct an operator Sp to determine the semantic projection according to the context. We call the operator a semantic operator. (a) First we map the ℓ context words in databases to the semantic space I. This mathematically means that we execute the Fourier expansion of the sequence sℓ in I and seek the Fourier coefficients of the words with respect to the semantic elements. This corresponds to seeking the correlation between each context word of sℓ and each semantic element. (b) Then we sum up the values of the Fourier coefficients for each semantic element. (We call this sum corresponding axis’ weight). This corresponds to finding the correlation between the sequence sℓ and each semantic element. Since we have v semantic elements, we can constitute a v dimensional vector. We call the vector normalized in the infinity norm the semantic center of the sequence sℓ . (c) If the sum obtained in (b) for a semantic element is greater than a given threshold ε, we employ the semantic element to form the projected semantic subspace. We define the semantic projection by the sum of such projections. This operator automatically selects the semantic subspace which is highly correlated with the sequence sℓ of ℓ the context words which determines the context. This model makes dynamic semantic interpretation possible. We emphasize here that, in our model, the “meaning” is the selection of the semantic subspace, namely, the selection of the semantic projection and the “interpretation” is the best approximation in the selected subspace. Figure 3 shows the semantic interpretation according to contexts in MMM. The most correlated information resources to the given context are extracted in the selected subspace by applying the metric defined in the semantic space. We expand the 2000 Longman vector space in MMM that was used in our previous work [14] to 180 impression words of Color Image Scale. The most highly correlated words to the

122

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

context are the representative impressions for Color Image Scale in order to select subspace color features.

Figure 3. Semantic interpretation according to contexts in MMM

3. Feature extraction and Subspace Selection This section consists of three discussions: (1) the color feature extraction in the image dataset and the image query with quantization of RGB color system using Color Image Scale, (2) the aggregation mechanism of representative query color generation for processing multi-query images, and (3) subspace feature selection with a color impression metric.

Figure 4. The 130 basic color features are mapped on RGB color space and used for expressing relations between colors and impressions

123

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

3.1. Color Feature Extraction The system extracts color features using 130 basic color features of Color Image Scale [1]. These features consist of non-uniform quantization of RGB color space based on human impression. The features contain 120 chromatic colors and 10 achromatic colors. These features have encompasses 10 hues and 12 tones. Each hue may be bright or dull, showy or sober, and has a number of tones. The tone of a color [1] is the result of the interaction of two factors: brightness or value, and color saturation or chroma. Colors of the same tone are arranged in order of hue, starting from red at the left of the scale. The lines linking colors of the same tone show the range of images that tone can convey [1]. Figure 4 shows the 130 non-uniform quantization of RGB color space by Color Image Scale for expressing relations between color and impressions. These 130 basic color features will be projected to the lists of impressions discussed in Section 3.3. 3.2. Representative Query Color Generation In this paper, our image search system provides a multi-query input that allows users to assign the image query more than one image. With this multi-query input, the users have more spaces and flexibility to express what they want to search in the image dataset. For realizing this, we construct an aggregation mechanism of representative query color generation for processing multi-query images. The mechanism works by the following steps. Step 1: Extracting the color features f of the n image queries into 130 color features of Color Image Scale. ⎡ f1,1 L f1, 130 ⎤ ⎢ M O M ⎥ ⎢f ⎥ L f n , 130 ⎥ ⎣⎢ n,1 ⎦ where: fq,c is c-th color feature of image query q

(1)

Step 2: Calculating local average L of each image query for normalizing the value of histogram bin for each image query ⎡ L1,1 L L1, 130 ⎤ ⎢ M O M ⎥ ⎥ ⎢L ⎢⎣ n,1 L Ln, 130 ⎦⎥

(2)

where: Lq,c is local average of c-th color feature for image query q and be defined in Eq (3) Lq ,c =

f q, c fq

(3)

Step 3: Accumulating values of local average for each feature [ M 1 L M 130 ]

where:

(4)

124

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

Mc =

Σ nq =1 Lq, c n

(5)

Step 4: Calculating average A and standard deviation S of M, as shown in Eq (6). ⎡ A1 L A130 ⎤ ⎢⎣ S1 L S130 ⎥⎦

(6)

Step 5: Calculating density D of each color feature. Because a color feature which is a candidate as representative color feature is identified to have high A and low S, the density D of each color feature can be defined in Eq (7). Dc =

Ac + α Sc + α

(7)

where : α is a small number to avoid zero-division Step 6: Filtering out the irrelevant Dc which closes to zero. In this case, it is very important to filter out the irrelevant data adjusting to the data distribution. Because of that, an automatic clustering which can recognize number of clusters automatically is applied using our previous work Hill Climbing Automatic Clustering [15]. The Hill Climbing Automatic Clustering analyzes moving variances of clusters, and then observes the pattern to find the global optimum for number of clusters. After clustering the density D, the cluster members those are belonging to the cluster which locates closest to the zero point is filtering out. The rest of cluster members are selected to be representative color features. Figure 5 shows the visual representation of representative query color generation. Our approach can identified non-representative feature (indicated by red color in Figure 5) and remove them from the selection.

Figure 5. The identified non-representative colors (indicated by red color) will be removed from query feature extraction

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

125

3.3. Subspace Feature Selection The most highly correlated impressions from MMM (discussed in Section 2), is projected to the Color Impression Metric defined by Color Image Scale [1]. The Color Impression Metric consists of 130 basic color features and 180 key impression words. The projection calculates the relationships between the representative impressions from MMM and key image impression words in the Color Image Scale. The most significant colors which have the highest values of the projection is obtained and then used for selecting the color features among 130 color features of the image dataset and the representative image query colors.

4. Cluster Based Similarity Measurement After applying subspace color feature selection to the image features, a cluster based similarity measurement is calculated with involving a semantic filtering mechanism. This mechanism filters out the unimportant image datasets to the context in order to speed up the computational execution for image search process. The system then clusters the subspace color features of the image dataset using our Pillar-Kmeans algorithm. 4.1. Semantic Filtering Mechanism Before clustering the selected subspace color features for similarity calculation, it is important to filter out the irrelevant data items those have low correlation to the emotional contexts. The semantic information filtering was introduced in our previous work [16]. It works by providing a mechanism with a way to express user’s impressions. When the users give contexts to express their impressions to the system, the contexts lead number of data items to be low and high correlation to the contexts. By filtering out retrieval candidate data items with low semantic information retrieval with the given contexts, the retrieval process becomes effective because analysis of data is only performed on data items with high correlation with the contexts. By filtering out the irrelevant data, it can reduce number of data items and speed up the computational time.

Figure 6. Semantic filtering mechanism for filtering out irrelevant data

126

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

The irrelevant data semantically locates close to zero point in the vector space of the subspace color features. A case-dependent threshold th is used for selecting semantic information filtering. The vectors with norms less than th are considered unnecessary and filtered out from the subspace, as shown in Figure 6. The users can decide high threshold if they want to filter out a relatively large amount of data and retrieve limited data which are highly related to their impressions or set the threshold at a lower value so that they gain most data for thorough analysis. In our case, we set th as average color distances to the zero. 4.2. Pillar-Kmeans Algorithm After applying subspace feature selection, the system then clusters the subspace color features of the image dataset using our previous work, Pillar-Kmeans algorithm [13]. Pillar algorithm is an algorithm to optimize the initial centroids for K-means clustering. This algorithm is very robust and superior for initial centroids optimization for Kmeans by positioning all centroids far separately among them in the data distribution. Pillar algorithm is inspired by the thought process of determining a set of pillars’ locations in order to make a stable house or building. Figure 7 illustrates the locating of two, three, and four pillars, in order to withstand the pressure distributions of several different roof structures composed of discrete points. It is inspiring that by distributing the pillars as far as possible from each other within the pressure distribution of a roof, the pillars can withstand the roof’s pressure and stabilize a house or building. It considers the pillars which should be located as far as possible from each other to withstand against the pressure distribution of a roof, as number of centroids among the gravity weight of data distribution in the vector space. Therefore, this algorithm designates positions of initial centroids in the farthest accumulated distance between them in the data distribution.

Figure 7. Illustration of locating a set of pillars (white points) withstanding against different pressure distribution of roofs

The process of determining the initial centroids by Pillar algorithm have been presented in our previous work [13] described as follows. First of all, the grand mean of data points is calculated as the gravity center of the data distribution. The distance metric D (let D1 be D in this early step), is then created between each data point and the grand mean. A data point which has the highest distance in D1 will be selected as the first candidate of the initial centroid ж. Figure 8(a) illustrates m as the grand mean of data points and ж which is has the farthest distance to m is the candidate of the first

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

127

initial centroid. If ж is not an outlier, it will be promoted to the first initial centroid c1. We then recalculate D (D2 in this step), which is the distance metric between each data points and c1. Starting from this step, we use the accumulated distance metric DM and assign D2 to DM. This step which initiates the creation of DM is an improvement part of our previous work, MDC algorithm [16], that the construction of DM is started from D1. To select a candidate for the second initial centroid, the same mechanism is applied using DM instead of D. The data point with the highest distance of DM will be selected as the second initial centroid candidate ж, as shown in Figure 8(b). If ж is not classified as an outlier, it becomes c2. To select a next ж for the candidate of the rest initial centroids, Dt (where t is the current iteration step) is recalculated between each data points and ct-1. The Dt is then added to the accumulated distance metric DM (DM DM + Dt). This accumulation scheme can avoid the nearest data points to ct-1 being chosen as the candidate of the next initial centroid. It consequently can spread out the next initial centroids far away from the previous ones. The data points with the highest distance in DM will then be selected as ж, as shown in Figure 8(c). If ж is not an outlier, it will become ct. The iterative process guarantees that all initial centroids are designated. In this way, all centroids can be located as far as possible from each other within the data distribution.

Figure 8. Selection for several candidates of the initial centroids

Here is the detail sequence of Pillar algorithm. Let X={xi | i=1,…,n} be data, k be number of clusters, C={ci | i=1,…,k} be initial centroids, SX ⊆ X be identification for X which are already selected in the sequence of process, DM={xi | i=1,…,n} be accumulated distance metric, D={xi | i=1,…,n} be distance metric for each iteration, and m be the grand mean of X. The following execution steps of the proposed algorithm are described as follows: 1. Set C=∅, SX=∅, and DM=[] 2. Calculate D dis(X,m) 3. Set number of neighbors nmin = α . n / k 4. Assign dmax argmax(D) 5. Set neighborhood boundary nbdis = β . dmax 6. Set i=1 as counter to determine the i-th initial centroid 7. DM = DM + D 8. Select ж xargmax(DM) as the candidate for i-th initial centroids 9. SX=SX ∪ ж 10. Set D as the distance metric between X to ж. 11. Set nonumber of data points fulfilling D ≤ nbdis

128

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

12. 13. 14. 15. 16. 17. 18.

Assign DM(ж)=0 If no < nmin, go to step 8 Assign D(SX)=0 C=C∪ж i=i+1 If i ≤ k, go back to step 7 Finish in which C is the solution as optimized initial centroids.

The centroids of clustering results from Pillar-Kmeans algorithm discussed in previous Section are used for calculating the similarity measurements to the representative query color features of the image queries. In this case, we use Cosine distance metric for similarity calculation.

5. Experimental Results To apply our emotion-oriented image search system, we implement it for cultural image datasets. For experimental study, we use the Ukiyo-e image dataset from Tokyo Metropolitan Library for representing the Japanese cultural image collections. It contains 8743 typical images and artworks of famous paintings in Edo and Meiji era, including Hiroshige, Toyokuni, Kunisada, Yoshitoshi, Kunichika, Sadahige, Kuniteru, etc. We set the highest ranks of 15 number of image retrieved results. For performance analysis, we compare the highest 10 impression words of retrieved results using color impression metric to the given emotional contexts, as shown in Eq (8). In this case, the comparison to the given emotional contexts encompasses two things: (1) comparison definitely to the contexts, and (2) comparison semantically to the closest meanings of the given contexts. 15 ⎧ prec i = 1 ← imprs (retrvimg i ) = contexts precision = ∑ prec i ⎨ i =1 ⎩ preci = 0 ← otherwise

(8)

5.1. Experiment 1 Four 4 images are given as multiple queries, shown in Figure 9. We set two emotional contexts which are “calm” and “quiet”, for expressing the impressions to the queries in which we want to retrieve in the image search system.

Figure 9. Multiple queries given to the search system

The computational steps are described as follows. First, the given contexts “calm” and “quiet” are computed by MMM to calculate the most highly correlated words to the context. We add the 2000 Longman vector space in MMM that was used in our

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

129

previous work [14] with 180 impression words of Color Image Scale. We compute a series of correlated words to the given contexts by MMM and obtain the most highly 10 correlated words which are "calm", "peaceful", "clean", "fresh", "quiet", "rich", "tender", "pretty", "bitter", and "rational". These most highly correlated words are then projected to the color impression metric to obtain the most significant colors for subspace feature selection. The result of this projection is that 78 most significant colors related to the impression words are selected among 130 color features. This color feature subspace selection is applied for both the image dataset and the image queries. Before applying the color feature selection, the features of the image dataset and the image queries are extracted. Because the multiple queries are given to the system, we need to aggregate the color features and generate their representative color features, as described in Section 3.2. Figure 10 shows the 130 histogram bins of extracted color features of the 4 image queries in Figure 9. The histogram of representative query colors by our proposed representative query color generation is performed in Figure 11. As shown in Figure 12, the selection of representative colors are not applied for all query color data items those have values in the histogram bins, but only applied for those have high average value and low standard deviation value of histogram bins.

Figure 10. Histogram of the multiple image queries

After applying extracting the features of the image dataset and the representative query colors, the 78 most significant colors resulted by the projection of MMM to the color impression metric is used for subspace color feature selection.

Figure 11. Histogram of the representative colors of image queries

130

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

The next step is the similarity calculation between the features of the subspace colors and the representative query colors. The semantic filtering mechanism is applied to filter out the irrelevant data items of subspace color features of the image dataset those have low correlation to the emotional contexts. In this experiment, our semantic filtering mechanism selected 2893 of 8743 data items and filtered out the irrelevant rests of data items. After filtering out the irrelevant data items, the clustering is applied to grouping the similarity distribution of the relevant data items. Our Pillar-Kmeans algorithm is used to cluster the data. In this case, we set 20 numbers of clusters for clustering process. After grouping the data items, the cosine distance metric is used for the similarity calculation. The result of the calculation is ranked to obtain the best retrieved image results. Figure 12 shows the top 15 retrieval results of our image search system. For performance analysis, we extract the most highly computed impression words from each retrieved image results using color impression metric. Table 1 shows the lists of 10-impression words from each retrieved image results. Table 1 performs 8 of 15 retrieved image results (indicated by red font color) containing “calm quiet” context. Moreover, if we refer to human perception which the given context “calm quiet” may relatively consist of several meanings which are “restfull”, “tranquil”, “sedate”, “solemn”, “sober”, “placid”, “quiet_and_sophisticated”, and “simple_quiet_elegant”, the experimental results achieved all correct retrieved image results (indicated by blue font color). This experimental result performed that our proposed system is able to reach high precision for image retrieval in accordance with the given context by the users.

Figure 12. The top 15 retrieved image results of “calm quiet” emotional contexts

Figure 13 shows the precision of the retrieval results in line with i-th number of image results. In that figure, PR1 indicates the precision of the image results containing impressions those are definitely same to the contexts, PR2 indicates the precision of the image results containing impressions those are very close to the similar impressions of the contexts (or in other word, semantically same to the contexts), and MaxPR is the maximum bound of the precision. Even though PR1 just reached 53.33% of the

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

131

precision in line with the top i image results, but PR2 performed the all correct retrieval results.

Figure 13. The precision of the retrieval results in line with i-th number image results

Table 1. The impression words of retrieved images with contexts “calm quiet” Rank retrieved image results 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Impression words dry, familiar, gentle, pleasant, Large_hearted, agreeable_to, restful, amiable, gentle_and_elegant, simple_and_appealing dry, familiar, gentle, pleasant, gentle_and_elegant, agreeable_to, Large_hearted, amiable, mild, restful dry, gentle, familiar, nostalgic, agreeable_to, pleasant, gentle_and_elegant, Large_hearted, calm, simple_and_appealing dry, simple_and_appealing, familiar, tranquil, gentle, pleasant, Large_hearted, restful, simple_quiet_elegant, gentle_and_elegant dry, familiar, restful, tranquil, simple_and_appealing, Large_hearted, pleasant, gentle_and_elegant, amiable, gentle dry, familiar, gentle, subtle_and_mysterious, nostalgic, calm, pleasant, gentle_and_elegant, Large_hearted, agreeable_to dry, familiar, restful, pleasant, gentle_and_elegant, Large_hearted, simple_and_appealing, calm, tranquil, gentle dry, subtle_and_mysterious, familiar, simple_and_appealing, calm, pleasant, Large_hearted, gentle, nostalgic, gentle_and_elegant dry, familiar, restful, tranquil, pleasant, Large_hearted, gentle_and_elegant, simple_and_appealing, gentle, plain dry, familiar, restful, tranquil, pleasant, Large_hearted, gentle_and_elegant, simple_and_appealing, gentle, amiable dry, familiar, pleasant, subtle_and_mysterious, restful, gentle_and_elegant, Large_hearted, simple_and_appealing, nostalgic, calm dry, familiar, mild, gentle_and_elegant, pleasant, restful, Large_hearted, amiable, gentle, calm dry, gentle_and_elegant, familiar, mild, simple_and_appealing, pleasant, restful, tranquil, calm, amiable dry, simple_and_appealing, tranquil, familiar, gentle_and_elegant, mild, restful, amiable, pleasant, Large_hearted dry, familiar, gentle_and_elegant, restful, pleasant, calm, Large_hearted, tranquil, simple_and_appealing, gentle

132

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

5.2. Experiment 2 Figure 14 shows eight images as queries. We set two emotional contexts which are “luxurious” and “elegant”, for expressing the impressions to the queries in which we want to retrieve in the image search system.

Figure 14. Multiple queries given to the search system

We compute a series of correlated words to the given contexts by MMM and obtain the most highly 10 correlated words which are "elegant", "graceful", "luxurious", "stylish", "grand", "precious", "chic", "youthful", "masculine", and "feminine". These most highly correlated words are then projected to the color impression metric to obtain the most significant colors for subspace feature selection. This projection selected 61 of 130 most significant colors related to the impression words. This color feature subspace selection is applied for both the image dataset and the image queries. Before applying the subspace color feature selection, we need to aggregate the query color features and generate their representative features. Figure 15 shows the 130 histogram bins of extracted color features of the 4 image queries in Figure 14. The histogram of representative query colors by our proposed representative query color generation is performed in Figure 16. After applying extracting the features of the image dataset and the representative query colors, the 61 most significant colors resulted by the projection of MMM to the color impression metric is used for subspace color feature selection. For the similarity calculation, our semantic filtering mechanism selected 2960 of 8743 data items and filtered out the irrelevant rests of data items.

Figure 15. Histogram of the multiple image queries

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

133

Figure 16. Histogram of the representative colors of image queries

After filtering out the irrelevant data items, the clustering using our Pillar-Kmeans algorithm is applied to grouping the similarity distribution of the relevant data items. In this case, we set 20 numbers of clusters for clustering process. After the clustering is applied to the data items, the cosine distance metric is used for the similarity calculation between the representative query colors and the clustered data items. The result of the calculation is ranked to obtain the best retrieved image results. Figure 17 shows the top 15 retrieval results of our image search system.

Figure 17. The top 15 retrieved image results of “luxurious elegant” emotional contexts

For performance analysis, we extract the most highly computed impression words from each retrieved image results using color impression metric. Table 2 shows the lists of 10-impression words from each retrieved image results. Table 2 performs 11 of 15 retrieved image results (indicated by red font color) containing “luxurious elegant” context. Moreover, if we refer to human perception which the given context “luxurious elegant” may relatively consist of several meanings which are "rich", "simple_quiet_elegant", "gentle_elegant", "grand", and "tasteful", the experimental results achieved all correct retrieved image results (indicated by blue font color). This experimental result performed that our proposed system is able to reach high precision for image retrieval in accordance with the given context by the users.

134

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

Figure 18. The precision of the retrieval results in line with i-th number image results

Figure 18 shows the precision of the retrieval results in line with i-th number of image results. In that figure, PR1 indicates the precision of the image results containing impressions those are definitely same to the contexts, PR2 indicates the precision of the image results containing impressions those are very close to the similar impressions of the contexts, and MaxPR is the maximum bound of the precision. Figure 18 shows that PR1 reached 73.33% correct retrieval results, and PR2 performed the all correct results in line with the top i image results. Table 2. The impression words of retrieved images with contexts “luxurious elegant” Rank retrieved image results 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Impression words robust, rich, dynamic_and_active, mellow, luxurious, mature, untamed, authoritative, folksy, elaborate robust, mellow, dynamic_and_active, rich, luxurious, subtle_and_mysterious, elaborate, provincial, untamed, simple_quiet_elegant robust, mellow, subtle_and_mysterious, luxurious, rich, chic, dynamic_and_active, nostalgic, elaborate, gentle_and_elegant robust, dynamic_and_active, rich, provincial, calm, mellow, aromatic, simple_quiet_elegant, untamed, nostalgic robust, subtle_and_mysterious, mellow, rich, luxurious, dynamic_and_active, chic, elaborate, folksy, solemn robust, elaborate, rich, substantial, mellow, luxurious, old-fashioned, traditional, dynamic_and_active, folksy robust, rich, luxurious, elaborate, dynamic_and_active, mellow, untamed, folksy, substantial, subtle_and_mysterious robust, dynamic_and_active, rich, formal, mellow, untamed, stout, elaborate, heavy_and_deep, authoritative robust, dynamic_and_active, mellow, rich, elaborate, luxurious, untamed, heavy_and_deep, formal, authoritative rich, robust, mellow, dynamic_and_active, mature, luxurious, untamed, elaborate, folksy, fruitful robust, intellectual, dynamic_and_active, authoritative, rich, luxurious, formal, subtle_and_mysterious, elaborate, mellow robust, chic, dynamic_and_active, rich, metallic, formal, stout, subtle_and_mysterious, untamed, solemn robust, dynamic_and_active, rich, untamed, mellow, luxurious, elaborate, substantial, folksy, stout robust, dynamic_and_active, mellow, subtle_and_mysterious, stout, rich, authoritative, formal, solemn, elaborate robust, dynamic_and_active, rich, mellow, luxurious, subtle_and_mysterious, chic, intellectual, authoritative, formal

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

135

6. Conclusion and future works This paper presented a semantic image search system by applying the emotional contexts of user’s impressions for retrieval process. The system provided a function in which the users can express their impressions (such as “happy”, “calm”, “beautiful”, “luxurious”, etc.) for image search. This emotional context recognizes the most important features by connecting the user’s impressions to the image queries. A multiquery input is applied in the system so that the users have more spaces and flexibility to express what they want to retrieve in the image search system. The Mathematical Model of Meaning is applied and transformed to the color features with a color impression metric for subspace feature selection. After applying subspace color feature selection to the image features, our Pillar-Kmeans algorithm is applied for the cluster based similarity measurement with involving a semantic filtering mechanism to filter out the irrelevant data. The Pillar algorithm designates positions of initial centroids in the farthest accumulated distance between them in the data distribution in order to improve the precision of K-means clustering and speed up the computational time of clustering. Our image search system was examined in the experimental study with the 8743 Ukiyo-e image datasets from Tokyo Metropolitan Library for representing the Japanese cultural image collections. The experimental results described in Section 5 showed that the proposed system reached 53.33% and 73.33% of the precision rate to the given context respectively in the Experiment 1 and Experiment 2, and performed all correct retrieval results to the close meanings of the given emotional contexts. In our future work, we will integrate our emotion-oriented image search system with our previous work of image retrieval system involving shape and structure features.

References [1] [2]

Shigenobu Kobayashi, Color Image Scale, 1-st edition, Kodansha International publisher, 1992. T. Kitagawa, Y. Kiyoki, A mathematical model of meaning and its application to multidatabase systems, Proc. 3rd IEEE International Workshop on Research Issues on Data Engineering: Interoperability in Multidatabase Systems, pp.130-135, 1993. [3] C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, W. Equitz, Efficient and effective querying by image content, Journal of Intelligent Information Systems 3 (3–4), pp. 231–262, 1994. [4] Y. Kiyoki, T. Kitagawa, T. Hayama, A metadatabase system for semantic image search by a mathematical model of meaning, ACM SIGMOD Record, Vol.23, No. 4, pp.34-41, 1994. [5] Y. Kiyoki, , T. Kitagawa, Y. Hitomi, A fundamental framework for realizing semantic interoperability in a multidatabase environment, International Journal of Integrated Computer-Aided Engineering, Vol.2, No.1 (Special Issue on Multidatabase and Interoperable Systems), pp.3-20, John Wiley & Sons, 1995. [6] J. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Gorowitz, R. Humphrey, R. Jain, C. Shu, Virage image search engine: an open framework for image management, Proc. The SPIE, Storage and Retrieval for Image and Video Databases IV, San Jose, CA, pp. 76–87, 1996. [7] J.R. Smith, S.F.Chang, VisualSEEk: a fully automated content-based image query system, Proc. The Fourth ACM International Conference on Multimedia, Boston, MA, pp. 87-98, 1996. [8] W.Y. Ma, B.S. Manjunath, Netra: A toolbox for navigating large image databases, Multimedia Systems 7 (3), pp. 184–198, 1999. [9] R.C. Veltkamp, M. Tanase, Content-Based Image Retrieval Systems: A survey, Technical Report UUCS-2000-34, 2000. [10] H.M. Lotfy, A.S. Elmaghraby, CoIRS: Cluster-oriented Image Retrieval System, Proc. 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004) 00, pp. 224-231, 2004. [11] Y. Liu, D. Zhang, G. Lu, W.Y. Ma, A survey of content-based image retrieval with high-level semantics, Pattern Recognition 40, pp. 262–282, 2007.

136

A.R. Barakbah and Y. Kiyoki / An Emotion-Oriented Image Search System

[12] A.R. Barakbah, Y. Kiyoki, An Image Database Retrieval System with 3D Color Vector Quantization and Cluster-based Shape and Structure Features, The 19th European-Japanese Conference on Information Modelling and Knowledge Bases, Maribor, 2009. [13] A.R. Barakbah, Y. Kiyoki, A Pillar Algorithm for K-Means Optimization by Distance Maximization for Initial Centroid Designation, IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Nashville-Tennessee, 2009. [14] A.R. Barakbah, Y. Kiyoki, Cluster Oriented Image Retrieval System with Context Based Color Feature Subspace Selection, In. Proc. Industrial Electronics Seminar (IES) 2009, pp. C101-C106, Surabaya, 2009. [15] A.R. Barakbah, K. Arai, Determining Constraints of Moving Variance to Find Global Optimum and Make Automatic Clustering, In. Proc. Industrial Electronics Seminar (IES) 2004, p.409-413, Surabaya, Indonesia, 2004. [16] D. Sakai, Y. Kiyoki, N. Yoshida, T. Kitagawa, A Semantic Information Filtering and Clustering Method for Document Data with a Context Recognition Mechanism, Journal of Information Modelling and Knowledge Base, Vol. XIII, pp. 325-343, 2002. [17] S. Wang, X. Wang, Emotion Semantics Image Retrieval: An Brief Overview, ACII 2005, LNCS 3784, pp. 490–497, Springer-Verlag Berlin Heidelberg, 2005. [18] E.J. Park, J.W. Lee, Emotion-Based Image Retrieval Using Multiple-Queries and Consistency Feedback, The 6th IEEE International Conference on Industrial Informatics (INDIN) 2008, pp. 16541659, 2008. [19] M. Solli, R. Lenz, Color Based Bags-of-Emotions, CAIP 2009, LNCS 5702, pp. 573–580, SpringerVerlag Berlin Heidelberg, 2009. [20] W. Wang, Q. He, A Survey On Emotional Semantic Image Retrieval, The 15th IEEE International Conference on Image Processing (ICIP) 2008, San Diego, USA, 2008.

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-137

137

The Quadrupel -A Model for Automating Intermediary Selection in Supply Chain Management Remy FLATT b , Markus KIRCHBERG a , and Sebastian LINK b,1 a Agency for Science, Technology and Research (A∗ STAR), Singapore b School of Information Management, Victoria University, New Zealand Abstract. The selection of intermediaries is a fundamental and challenging problem in supply chain management. We propose a conceptual process model to guide the supply chain coordinator through the selection process. Besides the support of our model for the agility, adaptability and alignment of the target supply chain, it also provides extensive automated assistance for the selection of tactics by off-theshelf tools from the area of artiﬁcial intelligence. Keywords. Supply Chain Modeling, Strategic Concept Development, Intermediary Selection, Decision Support, Artiﬁcial Intelligence

1. Introduction Supply chain management (SCM) evolved from a traditional focus on purchasing and logistics practised between the mid-1960s and mid-1990s, to a broader, more integrated emphasis on value creation in the new millennium. Leading companies increasingly view supply chain excellence as more than just a source of cost reduction - rather, they see it as a source of competitive advantage, with the potential to drive performance improvement in customer service, proﬁt generation, asset utilization, and cost reduction. The effective selection of intermediaries is essential to achieve these goals, individually and collectively. In electronic markets, the dynamics of market restructuring may lead some intermediaries to extinction, but the overall market picture will compensate for the losses by providing opportunities for both existing and new intermediaries to enter the market through providing value-added services to electronic transactions. The opportunities for dis-intermediation, re-intermediation and cyber-mediation in electronic markets are contingent on their market structures, products and services as well as relationships between the various market participants. On balance, the world of electronic commerce will be characterized by an increasing differentiation of market channels. The resulting outcome is a dynamic mixed-mode structure that represents a continuum of combinations of traditional channels, dis-, re- and cyber-mediation [7]. 1 Corresponding Author: Sebastian Link, School of Information Management, Victoria University, Wellington, New Zealand; E-mail: [email protected].

138

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

The design of the supply chain is a complex decision that involves the strategic choice of the appropriate channel structure, and the tactical selection of the appropriate intermediaries. In general, if there are n intermediaries that are candidates for selection, then 2n different selections are possible; and, hypothetically, each of these selections must be considered. Due to the required ﬂexibility of the supply chain, selecting intermediaries is not a one-time process. These arguments suggests that the supply chain coordinator requires assistance in the selection process, e.g., in form of advice by experts that are currently available, by automated decision support, or by a process model that guides the intermediary selection process to support an agile, adaptable and aligned supply chain [16]. Contributions. As the ﬁrst main contribution of this paper, we propose such a process model. The framework is generic in the sense that it is not tailored to any kinds of products, or to any speciﬁc part of a supply chain. Reﬁnements and specializations of our model will be investigated in the future. Our model deals with the high complexity of the selection process by following a divide-and-conquer approach. That is, based on events, sudden changes and the expertise currently available, the supply chain is divided into different fragments. The domain experts then develop new or adjust existing recommendations for the fragments of their expertise to adapt to the current circumstances. In our model, these recommendations are abstract summaries of a careful analysis process, which we do not specify in detail to guarantee maximal generality of our model. Indeed, the recommendations are speciﬁed in a certain language (possibly by some language expert). This language restriction serves as a coordination mechanism which enables the supply chain coordinator to integrate and align the different recommendations into an overall strategy for selecting intermediaries. In fact, this mechanism guarantees off-the-shelf support for generating automatically all tactics available to implement such a strategy. Subsequently, there is also support to narrow down the choices for a preferred tactic, or to approximate a tactic as closely as possible. Our four-stage process model is iterative to accommodate the constant changes in the supply chain. Note that our framework may also be seen as a model for integrating different supply chains. It ﬁts well into already existing models: it is an instance of the dynamic e-business strategy model [12], supports strategy deﬁnition in the generic strategy process model [4], and strongly supports the derivation and maintenance of the triple-A chain [16]. As a byproduct, we also propose explicit deﬁnitions of what a strategy and a tactic constitute, which we think is interesting in its own right. As the second major contribution, we demonstrate how off-the-shelf tools from artiﬁcial intelligence provide automatic assistance for the supply chain coordinator in selecting intermediaries. Organization. We introduce our model in Section 2. In Section 3 we comment on the division of the supply chain into fragments, and introduce a running example. We explain the syntax and semantics of propositional logic in Section 4. In Section 5 we show how to specify local plans for individual supply chain fragments. In Section 6 we deﬁne strategies and tactics, and describe how the supply chain coordinator can use offthe-shelf tools to reason about the consistency of local plans. Section 7 shows how all available tactics of a plan can be determined. We discuss approximations of strategies in Section 8. Heuristics for selecting preferred tactics of approximations are analyzed in Section 9. Methods for evaluating the suitability of current tactics proposed in Section

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

139

1 Strategic analysis Internal resources

2 Strategic objectives Vision and Mission

Objectives

3 Strategic definition Option generation

Option evaluation

Option selection

Monitoring, Evaluation and Response

External environment

4 Strategic implementation Planning

Execution

Control

Figure 1. A generic strategy process model

10. We brieﬂy discuss related work in Section 11, and conclude in Section 12 with an outlook to future work.

2. The Quadruple-A Model The selection of intermediaries in the supply chain involves a strategic decision on the channel structures involved and a tactical decision on the appropriate intermediaries in each of the channel structures. As such, intermediary selection naturally belongs to the third phase of the generic four-stage strategy process model [4], illustrated in Figure 1. As part of the deﬁnition of the business strategy, options for intermediaries are generated, evaluated and selected. Since the supply chain is highly complex in nature it is nearly impossible that a single supply chain coordinator can select the intermediaries. Instead, we propose an agile, adaptable, and aligned process model that also provides automated assistance to the supply chain coordinator. Our model is iterative, and each iteration consists of four phases. The iterations can be triggered by events, and therefore support the agility of the target supply chain. Examples of such events may be sudden changes in supply or demand, revised sets of strategic objectives or any types of disasters. In the ﬁrst phase of every iteration, the supply chain is divided into different (possibly overlapping) fragments such that each intermediary candidate is covered by at least one fragment. The supply chain coordinator engages (a team of) domain experts to coordinate the fragments. In fact, the fragmentation may be based on the scope of the domain knowledge for which experts are currently available. Based on their key insights, the domain experts develop local plans for the selection of intermediaries within their fragments. These plans are abstract recommendations in some suitable formal language that we will specify later. Essentially, the recommenda-

140

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM Events Agility Feedback

Selection of preferred tactics

Fragmentation of supply chain and allocation of experts

Revision

Key insights

Adaptability Development of local plans for individual fragments

Alignment

Priorities

Inference of a strategy or approximations for the whole supply chain

Integration

Assistance

Figure 2. The Quadruple-A Model for Intermediary Selection

tions are summaries of a careful analysis of the fragment that adapt the target supply chain to local market situations or changes. For the purpose of this paper, we view this careful analysis as a black box. The local plans may consist of speciﬁc recommendations for the selection of intermediaries already, or specify complex conditions under which such selections take place. One example of a suitable formal language that speciﬁes the local plans is discussed in Section 5. Subsequently, the supply chain coordinator attempts to align the local plans into a strategy for the whole supply chain, i.e., a selection of intermediaries that satisﬁes all the recommendations set out in the local plans. At this stage, it may well turn out that the recommendations of different local plans contradict one another. In that case, the coordinator may ask (some of) the domain experts to align their local plans, possibly by collaboration of different teams. This process will be iterated until the local plans become consistent, or the decision is made that the inconsistencies cannot be resolved presently. In the latter case, approximations of a strategy are developed subsequently. At the end of this stage, the supply chain coordinator may have the choice between several tactics available for either a strategy or approximations of a strategy. In the ﬁnal step of one iteration, the coordinator applies some heuristics to narrow down the choice of the tactics available for a strategy or for approximations thereof. These heuristics are based on corporate strategies, for instance to minimize the number of intermediaries. The preferred tactic identiﬁes a unique selection of intermediaries that meets the strategy for the supply chain, or an approximation thereof. Our process model is illustrated in Figure 2. From the description so far, it becomes apparent that automated assistance is necessary for: 1. 2. 3. 4.

the decision whether an alignment of local plans into a strategy is possible at all. inferring all tactics available for a strategy. approximating a strategy as closely as possible. narrowing down the choices of tactics for a strategy or approximations thereof.

The ﬁrst item requires us to reason about the consistency between local plans. For example, one expert may recommend to select an intermediary while another advises the

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

141

opposite (recall that the same intermediary may be part of different fragments). Basically, such contradictions can be hidden deeply inside the speciﬁcations of the local plans, and reasoning about consistency means to detect any such contradictions. That implies that we need a formal language expressive enough for the domain experts to specify their local plans, and which allows us to reason about consistency efﬁciently at the same time. As a ﬁrst example of such a language, we choose Boolean propositional logic in this paper. We believe this language to be expressive enough to accommodate many recommendations that result from a careful analysis of the fragment under consideration. The limits of propositional logic can be seen as some kind of coordination mechanism by which the supply chain coordinators forces the domain experts to express their key insights. On the other hand, propositional logic has been studied extensively in Artiﬁcial intelligence, and there are off-the-shelf tools available for us to reason efﬁciently about the consistency between local plans speciﬁed in this language. We will also describe what automated support propositional logic has to offer for the remaining items listed above. It may well turn out that there are other suitable candidates for such languages. These can simply be plugged into our framework. In summary, we propose a Quadruple-A model for intermediary selection that provides strong support for the agility, adaptability and alignment of a Triple-A supply chain [16], but also offers extensive automated assistance.

3. Dividing the Supply Chain The ﬁrst step in a single iteration of our process model consists of the division of the supply chain into different fragments, and the allocation of domain experts to these fragments. Formally, the supply chain candidates form a non-empty set, denoted by SCC, of potential intermediaries, i.e., SCC = {I1 , . . . , In } for some positive integer n, and where each Ij denotes some intermediary. A fragmentation of the supply chain candidates SCC is some collection F(SCC) ⊆ 2SCC of subsets of SCC such that every element of SCC is an element of at least one fragment, i.e., for every I ∈ SCC there is some fragment F ∈ F(SCC) such that I ∈ F . The elements of a fragment F ∈ F(SCC) are also called the intermediaries of F . Example 1 For our running example we consider a down-stream supply chain. The supply chain candidates consist of four different intermediaries. These are two wholesalers W1 and W2 , and two retailers R1 and R2 . That is, SCC = {W1 , W2 , R1 , R2 }. Incidentally, there are four different domain experts assigned to the task of selecting intermediaries. The ﬁrst two are experts in the geographical location of the ﬁrst wholesaler and ﬁrst retailer, and the second wholesaler and second retailer, respectively. Furthermore, there is an expert in the domain of the wholesalers, and an expert in the domain of the retailers. More formally, the fragmentation F(SCC) consists of the following four fragments F1 = {W1 , R1 }, F2 = {W2 , R2 }, F3 = {W1 , W2 }, and F4 = {R1 , R2 }. Indeed, every intermediary of SCC belongs to at least one of the overlapping fragments. For example, W1 is an intermediary of F1 and F3 . We assume implicitly that each fragment is allocated to some (team of) domain experts, for instance based on the scope of the expert’s knowledge. A fragmentation of the

142

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

F3 Expert: Wholesalers

F4 Expert: Retailers

F1 Expert: Geographic Region 1

W1

R1

F2 Expert: Geographic Region 2

W2

R2

Figure 3. A fragmentation of the supply chain candidates SCC = {W1 , W2 , R1 , R2 }

supply chain candidates can be illustrated by a hypergraph. The nodes of the hypergraph are given by the underlying supply chain candidates, and the edges of the hypergraph are given by the elements of the fragmentation. For instance, Figure 3 illustrates the fragmentation F(SCC) of the supply chain candidates from Example 1. Alternatively, we could deﬁne a fragmentation to be a multiset F(SCC) of subsets of SCC. In that case, duplicate elements of F(SCC) may represent the fact that different agents work on the same fragment. As yet another alternative, we may deﬁne a fragmentation to be an anti-chain F(SCC) of subsets of SCC, i.e., for any two fragments F and F of F(SCC) it holds that F is not a subset of F and F is not a subset of F . For the framework of this paper, it does not matter which deﬁnition we pick, we just offer some alternatives here. The local plans for the supply chain candidates will be speciﬁed over each of the fragments in the fragmentation of the supply chain candidates. More speciﬁcally, a local plan over fragment F will be a propositional formulae over F . Before we introduce the local plans in Section 5, we will therefore deﬁne the syntax and semantics of propositional logic in the next section. The reader that is already familiar with propositional logic, may skip this section.

4. A Primer on Propositional Logic In this section, we give a self-contained summary of the syntax and semantics of Boolean propositional logic [6]. We will also brieﬂy comment on the state-of-the-art of a decision problem associated with formulae in propositional logic, and one of its search variants. In subsequent sections we will see that these problems naturally occur in the process of intermediary selection. 4.1. Syntax We deﬁne the language of Boolean propositional logic, i.e., we specify which objects belong to this language. In a ﬁrst step we ﬁx a countably inﬁnite set of propositional variables, denoted by V. The elements of V form the atomic objects of our language, and all other objects will be derived from them. That is, we now specify the set of formulae over V, denoted by FV . In fact, we deﬁne FV to be the smallest set that satisﬁes the following rules:

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

143

• every propositional variable in V is a formulae in FV , i.e., V ⊆ FV . • if ϕ ∈ FV , then (¬ϕ) ∈ FV , and we say that (¬ϕ) is the negation of ϕ, • if ψ1 , ψ2 ∈ FV , then (ψ1 ∧ψ2 ) ∈ FV , and we say that (ψ1 ∧ψ2 ) is the conjunction of ψ1 and ψ2 . Suppose that V1 , V2 , and V3 are propositional variables in V. Then the following objects are examples of formulae in FV : (¬V2 ), (V1 ∧ (¬V2 )), (¬(V1 ∧ (¬V2 ))). For convenience, we introduce the following shortcuts. The formula (ψ1 ∨ ψ2 ) is a shortcut for ¬(¬ψ1 ∧ ¬ψ2 ), (ψ1 ⇒ ψ2 ) denotes (¬ψ1 ∨ ψ2 ), and (ψ1 ⇔ ψ2 ) denotes ((ψ1 ⇒ ψ2 ) ∧ (ψ2 ⇒ ψ1 )). We call (ψ1 ∨ ψ2 ) the disjunction of ψ1 and ψ2 , (ψ1 ⇒ ψ2 ) the material implication of ψ2 by ψ1 , and (ψ1 ⇔ ψ2 ) the equivalence between ψ1 and ψ2 . The operators of negation, conjunction, disjunction, material implication and equivalence are also known as connectives. For convenience, we also introduce the following rules of precedence: ¬ binds stronger than ∧ and ∨, which both bind stronger than ⇒, which binds stronger than ⇔. We may also omit the out-most parentheses in a formula. For example, the formula (¬(V1 ∧ (¬V2 ))) reduces to ¬(V1 ∧ ¬V2 ). 4.2. Semantics Now we attach some meaning to the formulae in FV , i.e., we will specify the conditions under which any element ϕ of FV will be true given an assignment of truth values to the propositional variables that occur in ϕ. That is, the truth of a complex formula ϕ in FV can be derived from the truth values assigned to the variables that occur in ϕ. Let false and true denote the Boolean propositional truth values. A truth assignment over V is a mapping θ : V → {0, 1} that assigns to each variable in V either true or false. We extend θ to a function Θ : FV → {false, true} that maps every formula ϕ in FV to its truth value Θ(ϕ) as follows: • if ϕ ∈ V, then Θ(ϕ) = θ(ϕ), • if ϕ = (¬ψ) for some ψ ∈ FV , then Θ(ϕ) =

true false

, if Θ(ψ) = false , , otherwise

• if ϕ = (ψ1 ∧ ψ2 ) for some ψ1 , ψ2 ∈ FV , then Θ(ϕ) =

true false

, if Θ(ψ1 ) = Θ(ψ2 ) = true . , otherwise

Even though the semantics of the shortcut connectives can be derived from the semantics of negation ¬ and conjunction ∧, we make this explicit in Table 1. The names of these connectives becomes apparent when we look at their semantics. Negation negates the truth value, a conjunction ψ1 ∧ ψ2 is true precisely when both of its conjuncts ψ1 and ψ2 are, a disjunction ψ1 ∨ ψ2 is true precisely when at least one of its disjuncts ψ1 or ψ2 is, etc.

144

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

ψ1

ψ2 ψ1 ∨ ψ2 ψ1 ⇒ ψ2 ψ1 ⇔ ψ2

true true

true

true

true

true false false true false false

true true false

false true true

false false true

Table 1. The semantics of disjunction, material implication and equivalence

4.3. SAT We say that a truth assignment θ over V satisﬁes the formula ϕ in FV , denoted by |=θ ϕ, if and only if Θ(ϕ) = true. If θ satisﬁes ϕ, we also call θ a model of ϕ. We say that θ is a model of a set Σ of propositional formulae, if it is a model of every element of Σ. If θ is not a model of ϕ (Σ), we also say that θ violates ϕ (Σ). A set Σ of propositional formulae over V is said to be satisﬁable if there is some model of Σ. Satisﬁable sets of propositional formulae are also said to be consistent. The satisﬁability problem, SAT, is to decide whether an arbitrary set Σ of propositional formulae is satisﬁable. For instance, the set Σ1 = {V1 , V1 ⇒ V2 , V1 ⇒ ¬V2 } is not satisﬁable while the set Σ2 = {V1 ⇒ V2 , V1 ⇒ ¬V2 } is indeed satisﬁable. SAT was the ﬁrst problem to be shown NP-complete [5]. That means that, unless P=NP, there is no deterministic polynomial time algorithm for deciding SAT. Despite this suspected intractability, there are SAT-solvers that can deal efﬁciently with instances of SAT that contain up to a million different variables [14]. For a comprehensive survey on SAT-solvers we recommend [9]. 4.4. ALLSAT A search version of SAT computes a satisfying truth assignment for an arbitrary given set Σ of propositional formulae, if there is one. For the purpose of this paper, we are interested in a search variant of SAT known as ALLSAT where the aim is to enumerate all satisfying truth assignments of an arbitrary given set of formulae. For instance, for the input Σ2 = {V1 ⇒ V2 , V1 ⇒ ¬V2 } an ALLSAT-solver would return two truth assignment both of which assign false to V1 , and one assigns false to V2 and the other assigns true to V2 . SAT-solvers only require modest modiﬁcations to solve ALLSAT. A popular approach is the use of blocking clauses where the negation of each satisfying truth assignment that is found is added to the original problem, and the computation restarts. There are several optimizations for this method, focused on minimizing the number of assigned variables in a solution, such that each blocking clause represents a set of solutions. It is known that enumerating all satisfying truth assignments is proportional to the number of all satisfying truth assignments and the effort required to generate each satisfying truth assignment in isolation. For most of the instances in intermediary selection, the application of ALLSAT-solvers is feasible in practice.

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

145

5. Development of Local Plans In this section, we start to describe how the language of propositional logic can be applied to our process model for intermediary selection. Given a fragmentation F(SCC) of the set SCC of supply chain candidates, we will deﬁne what a local plan (for F(SCC)) constitutes. In order to get a feeling for what kind of plans we have in mind, we start with an example in which the plan is given in natural language. Example 2 Recall our fragmentation from Example 1 where F(SCC) consists of the following four fragments F1 = {W1 , R1 }, F2 = {W2 , R2 }, F3 = {W1 , W2 }, and F4 = {R1 , R2 }. The domain expert for geographical region 1, i.e., for fragment F1 , develops the following local plan (LSF1 ): if W1 is selected as an intermediary, then R1 is selected as an intermediary as well. The domain expert for geographical region 2, i.e., for fragment F2 , follows the same local plan in her domain (LSF2 ): if W2 is selected as an intermediary, then R2 is selected as an intermediary as well. The domain expert for the wholesalers, i.e., for fragment F3 , decides to select both W1 and W2 (LSF3 ). The expert for the retailers, i.e., for fragment F4 , develops the local plan (LSF4 ) that either R1 is selected or R2 (i.e. precisely one of them). Local plans are deﬁned for each fragment F of a fragmentation F(SCC) for a set SCC of supply chain candidates. Therefore, we ﬁx the propositional language FF for each of the fragments F . That is, the set of propositional variables of FF is given by F . Therefore, each supply chain candidate I of F is an atomic formula of FF . We interpret the atomic formula I ∈ FF as “the domain expert allocated to F recommends to select I for the supply chain”. From this interpretation of the atomic formula, the interpretation of the more complex formulae in FF can be derived. For example, ¬I ∈ FF means that it is recommended not to select I; or the formula R1 ⇔ ¬R2 ∈ FF4 expresses the fact that the domain expert of fragment F4 recommends to select R1 if and only if R2 is not selected. That is, it is recommended to select precisely one of R1 or R2 . Let F ∈ F(SCC) denote a fragment of SCC. A local plan for F is a propositional formula over F , i.e., an element of FF which we usually denote by λπF . A local plan for F(SCC) is a local plan for some fragment in F(SCC). Note that the condition of having just one formula represent a local plan is not a restriction: if there are several formulae, then let λπF just be the conjunction of these. Example 3 Using our interpretation of the variables for the intermediaries, the local plans LSF1 to LSF4 of Example 2 are speciﬁed by the following propositional formulae: • • • •

λπF1 λπF2 λπF3 λπF4

= W 1 ⇒ R1 , = W 2 ⇒ R2 , = R1 ⇔ ¬R2 , and = W1 ∧ W2 .

Note that local plans can be rather complex. For example, suppose that we have three different manufacturers M1 , M2 and M3 , and a distributor D. A local plan could be to

146

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

select precisely two manufacturers when the distributor is not selected. In this case, the plan is formalized by ¬D ⇒ (M1 ∧ M2 ∧ ¬M3 ) ∨ (M1 ∧ ¬M2 ∧ M3 ) ∨ (¬M1 ∧ M2 ∧ M3 ).

6. Conquer the Supply Chain: Strategies and Tactics In this section, we continue to describe the application of propositional logic to our framework. We will give an explicit deﬁnition of what a strategy and tactic for intermediary selection constitute, identify a decision problem fundamental to our framework, and the decision support available for it. A plan for an intermediary selection with respect to a fragmentation F(SCC) of supply chain candidates is the union π = ∪F ∈F (SCC) {λπF }. A policy ϑ of a plan π for an intermediary selection with respect to a fragmentation F(SCC) is a truth assignment ϑ : SCC → {true,false}. A policy ϑ of a plan π is said to be a tactic of π if ϑ is a model of π. A plan is said to be a strategy, usually denoted by ζ, if there is some tactic for ζ. An intermediary selection from a set SCC of supply chain candidates with respect to a fragmentation F(SCC) of SCC is a subset ι ⊆ SCC such that there is a strategy ζ with respect to F(SCC) and a tactic ϑ of ζ such that for all I ∈ SCC we have: I ∈ ι if and only if ϑ(I) = true. Hence, each tactic ϑ of a strategy ζ deﬁnes the intermediary selection ιϑ = {I ∈ SCC ||=ϑ I}. We say that the intermediary selection ιϑ is deﬁned by the tactic ϑ. This terminology results in the following decision problem for the supply chain coordinator: Problem: Strategy INPUT: A plan π QUESTION: Is π a strategy? Example 4 Let π = {λπF1 , λπF2 , λπF3 , λπF4 } be a plan for an intermediary selection with respect to the fragmentation F(SCC) from Example 1 that results from the local plans of Example 3. Table 2 enumerates all policies of π. However, none of these policies satisﬁes all local plans of π. Consequently, there is no tactic for π, or in other words, π is not a strategy for an intermediary selection. Let ζ = {λπF1 , λπF2 , λπF3 } denote another plan without the local plan λπF4 . In this case, the plan ζ is indeed a strategy. For example, the policy ϑ that assigns true to the intermediary R1 and false to the intermediaries W1 , W2 , and R2 is a tactic for ζ. Consequently, the intermediary selection deﬁned by ϑ is {R1 }. In our process model illustrated in Figure 2, the supply chain coordinator accumulates the local plans λπF for all fragments F ∈ F(SCC) into the plan π. Before different tactics will be identiﬁed to implement this plan, it would be helpful to decide whether there are any such tactics at all. If not, then either the local plans need to be revised, or the plan π can only be approximated. Consequently, decision support for the problem Strategy is fundamental to the framework we propose.

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

147

Intermediary Plan π = {λπF1 , λπF2 , λπF3 , λπF4 } Policy W1 W2 R1 R2 W1 ⇒ R1 W2 ⇒ R2 R1 ⇔ ¬R2 W1 ∧ W2 ϑ1 ϑ2 ϑ3 ϑ4 ϑ5 ϑ6 ϑ7 ϑ8 ϑ9 ϑ10 ϑ11 ϑ12 ϑ13 ϑ14 ϑ15 ϑ16

true true true true true true true false true true false true true true false false true false true true true false true false true false false true true false false false false true true true false true true false false true false true false true false false false false true true false false true false false false false true false false false false

true true false false true true false false true true true true true true true true

true false true false true true true true true false true false true true true true

false true true false false true true false false true true false false true true false

true true true true false false false false false false false false false false false false

Table 2. Enumeration of all policies of plan π

However, the problem Strategy is nothing else but the satisﬁability problem SAT, i.e., to decide whether there is a model for the set π of propositional formulae. Since SAT is one of the most studied problems in AI, there is plenty off-the-shelf state-of-the-art decision support available [14].

7. Enumerating all tactics of a strategy Once the supply chain coordinator knows that the plan ζ is actually a strategy, i.e., the problem Strategy with input ζ has an afﬁrmative answer, then the question is what the tactics of this strategy are. In a nutshell, there might be plenty of tactics and it might not be wise to let an automated procedure pick such a tactic. Instead, the supply chain coordinator should be aware of all such tactics to ensure that the best tactic has not been overlooked. On the other hand, all policies that are not tactics should be removed from the attention of the supply chain coordinator. Hence, we have the following problem. Problem: All-Tactics INPUT: A plan π QUESTION: What are all tactics for π? Note that All-Tactics is a more general problem than Strategy: if there are no tactics for input π, then π is not a strategy, and if there is at least one tactic, then π is a strategy.

148

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

However, it is generally more efﬁcient to decide Strategy before moving on to enumerate all tactics of a strategy. As was the case with Strategy, the problem All-Tactics enjoys full decision support since the problem is equivalent to the well-studied problem ALLSAT in AI. Example 5 Let ζ = {λπF1 , λπF2 , λπF3 } denote the plan that is input to the problem AllTactics. The table Tactic ϑi W1 W2 R1 R2 Selection ιϑi ϑ6 true false true false {W1 , R1 } ϑ11 false true false true {W2 , R2 } ϑ14 false false true false {R1 } ϑ15 false false false true {R2 } shows all four tactics ϑi of ζ, and the associated intermediary selections ιϑi deﬁned by ϑi .

8. Approximations of Strategies As mentioned previously, it might become necessary to decide that contradictions between the local plans cannot be resolved, and therefore that a strategy cannot be obtained. In that situation, it would be helpful to approximate a strategy as closely as possible. Informally, an approximation of a plan is a strategy that contains as many simultaneously satisﬁable local plans of the plan as possible. A maximal approximation of a plan is an approximation of a plan of maximum cardinality. Formally, an approximation of a plan π is a maximal sub-strategy of π, i.e., a subset ς ⊆ π such that ς is a strategy and no strategy ς ⊆ π is a proper superset of ς. Note that an approximation of a strategy ζ is unique, and that it is ζ itself. A best approximation of a plan π is an approximation ς of π with a maximum number of local plans, i.e., there is no approximation ς of π that consists of more local plans than ς. Considering our framework, we would be looking for automated support for the following problem. Problem: All-Best-Approximations INPUT: A plan π QUESTION: What are all best approximations of π? The problem All-Best-Approximations is what is known in the AI literature as the problem ALL-MC. The problem ALL-MC is to enumerate for an arbitrary given ﬁnite set Σ of propositional formulae all maximally satisﬁable subsets of Σ with maximum cardinality. Again, this problem and variations thereof have been well-studied in the AI literature [3]. Example 6 Let π = {λπF1 , λπF2 , λπF3 , λπF4 } denote the plan that is input to the problem All-Best-Approximations. In this case we obtain four best approximations of π which are the three-element sub-strategies of π. The table

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

149

Best Approximation Tactic of α α (W1 , W2 , R1 , R2 ) {λπF1 , λπF2 , λπF3 } (true,false,true,false) (false,true,false,true) (false,false,true,false) (false,false,false,true) {λπF1 , λπF2 , λπF4 } (true,true,true,true) {λπF1 , λπF3 , λπF4 } {λπF2 , λπF3 , λπF4 }

Selection ι {W1 , R1 } {W2 , R2 } {R1 } {R2 } {W1 , W2 , R1 , R2 } (true,true,true,false) {W1 , W2 , R1 } (true,true,false,true) {W1 , W2 , R2 }

shows all four best approximations α of π, all their available tactics, and the associated intermediary selections ι. 9. Heuristics for Intermediary Selection At the ﬁnal stage of an iteration in our process model, the supply chain coordinator applies heuristics to narrow down the choices for the tactics of a strategy or an approximation thereof. The heuristics can be derived from corporate objectives or preferences. A prime example of such an objective could be to minimize the number of intermediaries. Informally, a minimal tactic selects a minimal number of intermediaries among all tactics. Formally, a minimal tactic of a plan π is a tactic ϑ of π such that there is no other tactic ϑ of π which deﬁnes an intermediary selection ιϑ that is a proper subset of the intermediary selection ιϑ . Problem: All-Minimal-Tactics INPUT: A plan π QUESTION: What are all minimal tactics for π? The problem All-Minimal-Tactics is what is known in the AI literature as the problem ALL-MINIMAL. The problem ALL-MINIMAL is to enumerate for an arbitrary given ﬁnite set Σ of propositional formulae all minimal models of Σ. A minimal model of a formula ϕ is a model θ of ϕ such that there is no other model θ of ϕ where {V | θ (V ) = true} is a proper subset of {V | θ(V ) = true}. This problem has been well-studied in the AI literature [2]. Example 7 Suppose that the previous steps of our process model have resulted in the approximation α = {λπF1 , λπF2 , λπF3 }. The table of Example 6 shows the four different tactics available for α. If α is the input to the problem All-Minimal-Tactics, then the tactics: Tactic Selection (W1 , W2 , R1 , R2 ) ι (false,false,true,false) {R1 } (false,false,false,true) {R2 }

150

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

are returned. For example, the tactic (true,false,true,false) is not minimal since it deﬁnes the selection of both W1 and R1 , but the tactic (false,false,true,false) deﬁnes the selection of R1 only. The corporate objective to minimize the number of selected intermediaries might be more reﬁned, e.g., the minimum requirement may only apply to a certain selection X of candidate intermediaries. Let X ⊆ SCC be a subset of candidate intermediaries. An X-minimal tactic of a plan π is a tactic ϑ of π such that there is no other tactic ϑ of π where ιϑ ∩ X is a proper subset of ιϑ ∩ X. Problem: All-X-Minimal-Tactics INPUT: A plan π, a subset X of candidate intermediaries QUESTION: What are all X-minimal tactics for π? Note that All-X-Minimal-Tactics subsumes the problem All-Minimal-Tactics for the special case where X = SCC. The problem All-X-Minimal-Tactics is what is known in the AI literature as the problem ALL-X-MINIMAL. The problem ALL-X-MINIMAL is to enumerate for an arbitrary given ﬁnite set Σ of propositional formulae all X-minimal models of Σ. An X−minimal model of a formula ϕ is a model θ of ϕ such that there is no other model θ of ϕ where {V | θ (V ) = true} ∩ X is a proper subset of {V | θ(V ) = true} ∩ X. Again, this problem has been well-studied in the AI literature [1]. Example 8 Consider again the approximation α = {λπF1 , λπF2 , λπF3 }. with the four different tactics available for α illustrated in Example 6. Let X denote the collection {W1 , R1 } of intermediaries, i.e., it is the corporate strategy to minimize the selection of W1 and R1 . If α and X form the input to the problem All-X-Minimal-Tactics, then the tactics: Tactic Selection (W1 , W2 , R1 , R2 ) ι (false,true,false,true) {W2 , R2 } (false,false,false,true) {R2 } are returned. For example, the tactic (false,false,true,false) is not X−minimal, since it deﬁnes the selection of R1 , but the tactic (false,true,false,true) deﬁnes a selection that does neither select W1 nor R1 . The corporate strategy may also suggest some order of priority for the different fragments of the supply chain. In other words, the selection of the tactics might be based on a ranking of the local plans. The following example illustrates how such a ranking can be combined with the approximation of a strategy. Example 9 As before, let π = {λπF1 , λπF2 , λπF3 , λπF4 }. Since π does not have a strategy, we determine the best approximations of π in a ﬁrst step to evaluate our options. These

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

Policy

Intermediaries

π π π Plan π = {λπ F1 , λF2 , λF3 , λF4 }

151

Selection

ϑ

W1 W2 R1 R2 W1 ⇒ R1 W2 ⇒ R2 R1 ⇔ ¬R2 W1 ∧ W2

ιϑ

ϑ3

true true false true

false

true

true

true

ϑ2

true true true false

true

false

true

true

{W1 , W2 , R1 }

ϑ1

true true true true

true

true

false

true

{W1 , W2 , R1 , R2 }

ϑ6 true false true false ϑ11 false true false true ϑ14 false false true false

true true true

true true true

true true true

false false false

{W1 , R1 } {W2 , R2 } {R1 }

ϑ15 false false false true

true

true

true

false

{R2 }

{W1 , W2 , R2 }

Table 3. Ranking of best approximations of π according to preference order 4,3,2,1

best approximations are the seven policies ϑ1 , ϑ2 , ϑ3 , ϑ6 , ϑ11 , ϑ14 and ϑ15 from Table 2. The corporate strategy tells us that policies that satisfy λπF4 have highest priority: this leaves us with ϑ1 , ϑ2 , and ϑ3 . The next highest priority is given to those policies that satisfy λπF3 which gives us an option between ϑ2 and ϑ3 . Finally, priority of λπF2 over λπF1 determines the preferred tactic ϑ3 . This policy deﬁnes the intermediary selection ιϑ3 = {W1 , W2 , R2 }. The rankings of the approximations of π is illustrated in Table 3.

10. Assessment of Intermediary Selections In this section we brieﬂy mention two related problems that are of value when the current selection of intermediaries is to be assessed with respect to a plan. Such situations may occur, for example, when a plan has been revised but the current tactic has not. The ﬁrst problem is to decide whether an arbitrary given policy ϑ is a tactic for an arbitrary given plan π. Problem: Tactic INPUT: A plan π, a policy ϑ QUESTION: Is ϑ a tactic for π? In terms of propositional logic, this is the model checking problem MODEL, i.e., given a ﬁnite set Σ of propositional formulae and some truth assignment θ, decide whether θ is a model of Σ. A related problem is to decide whether an arbitrary given policy ϑ is a minimal tactic for an arbitrary given plan π. Problem: Minimal Tactic INPUT: A plan π, a policy ϑ QUESTION: Is ϑ a minimal tactic for π? In terms of propositional logic, this is the minimal model checking problem MINMODEL, i.e., given a ﬁnite set Σ of propositional formulae and some truth assignment θ, decide whether θ is a minimal model of Σ.

152

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

11. Related Work As explained previously, intermediary selection directly ﬁts into the strategic deﬁnition within the strategy process [4]. In the context of e-business, it is a specialization of the dynamic e-business strategy model [12]. Our model supports the development and maintenance of the Triple-A supply chain [16] and adds considerable automated decision support. The authors of [8,13] call for supply chain collaboration. In [13], a model of iterative loops is suggested which is similar to ours: choosing strategic partners, aligning supply chain strategy and corporate strategy, and identifying the most appropriate supply chain strategy. Our model can thus be viewed as a collaborative way of selecting intermediaries. It also demonstrates what decision support might be of use [13]. Other models for intermediary selection have been proposed in the literature. An example is the model in [20] which focuses on the development of local plans in our model (without specifying a language), based on the strategy to maximize proﬁts within the given budget constraints. To the author’s best knowledge, our model is the ﬁrst to suggest a divide-and-conquer approach that enjoys full decision support. In particular, the abstract speciﬁcation of our local plans results in the ability to generate optimal tactics that may not only accommodate a single parameter, but may show the relative impact of altering different parameters in the supply chain. This property was identiﬁed as one of the future modelling opportunities in supply chain management [21]. Supply Chain Management views a business as a chain of inter-connected entities of commercial activities. Therefore, multi-agent systems may be utilized to explore optimum chain connections from the procurement to the customer [11]. We refer the interested reader to [10], or to [15] for a more recent survey.

12. Conclusion and Future Work We have proposed a process model that assists supply chain coordinators in their task to select intermediaries. Our model follows an iterative, four-stage, divide-and-conquer approach that fosters the idea of a quadruple-A supply chain: agility, adaptability, alignment, and assistance. We have proposed to use propositional logic as a formal language to specify local plans for each of the fragments of a supply chain. This results in a concise representation of the key insights of each of the domain experts assigned to the fragments. Most importantly, it enables automated assistance for many tasks in the selection process. We have identiﬁed at least seven different problems that are fundamental to our process model. Each of the problems has a counterpart in propositional logic that has been well-studied by the Artiﬁcial Intelligence community. Table 4 provides a summary of these problems and their relationship. Even though the problems are, in general, perceived to be intractable, modern algorithms can deal efﬁciently with instances of the problem that contain a huge number of variables [14]. This number is usually signiﬁcantly greater than the number of intermediary candidates in any supply chain. In future work we will test our process model in various case studies. This will provide useful insight into the level of support that our framework has to offer, but also on its limits. We would also like to analyze the potential of other formal languages, e.g., ﬁrst-order logic and modal logics. Since different domain experts are likely to have different opinions it seems also natural to look at various approaches to dealing with inconsistencies.

R. Flatt et al. / The Quadrupel – A Model for Automating Intermediary Selection in SCM

Problem

Related AI Problem

Strategy All-Best-Approximations All-Tactics

SAT ALL-MC ALLSAT

All-Minimal-Tactics All-X-Minimal-Tactics Tactic Minimal Tactic

ALL-MINIMAL ALL-X-MINIMAL MODEL MIN-MODEL

153

Table 4. Correspondences of Problems in Intermediary Selection and AI

References [1] C. Avin and R. Ben-Eliyahu-Zohary. An upper bound on computing all X-minimal models. AI Commun., 20(2):87–92, 2007. [2] R. Ben-Eliyahu-Zohary. An incremental algorithm for generating all minimal models. Artif. Intell., 169(1):1–22, 2005. [3] E. Birnbaum and E. Lozinskii. Consistent subsets of inconsistent systems: structure and behaviour. J. Exp. Theor. Artif. Intell., 15(1):25–46, 2003. [4] D. Chaffey. E-Business and E-commerce management. Prentice-Hall, 2007. [5] S. Cook. The complexity of theorem-proving procedures. In ACM Symposium on Theory of Computing, pages 151–158, 1971. [6] H. Enderton. A mathematical introduction to logic: Second Edition. Academic Press, 2001. [7] G. Giaglis, S. Klein, and R. O’Keefe. The role of intermediaries in electronic marketplaces: developing a contingency model. Information Systems Journal, 12:231–246, 2002. [8] M. Grieger. Electronic marketplaces: A literature review and a call for supply chain management research. European Journal of Operational Research, 144:280–294, 2003. [9] J. Gu, P. Purdom, J. Franco, and B. Wah. Algorithms for the satisﬁability (SAT) problem: A survey. In Satisﬁability Problem: Theory and Applications, pages 19–152. Amer. Math. Soc., 1997. [10] R. Guttman, A. Moukas, and P. Maes. Agent-mediated electronic commerce: a survey. The Knowledge Engineering Review, 13(2):147–159, 1998. [11] B. Hellingrath, C. Böhle, and J. van Hueth. A framework for the development of multi-agent systems in supply chain management. In HICSS, pages 1–9, 2009. [12] R. Kalakota and M. Robinson. E-business. Roadmap for success. Addison-Wesley, 2000. [13] P. Kampstra, J. Ashayeri, and J. Gattorna. Realities of supply chain collaboration. CentER Discussion Paper Series No. 2006-59. Available at SSRN: http://ssrn.com/abstract=919813, 2006. [14] H. Kautz and B. Selman. The state of SAT. Discrete Applied Mathematics, 155(12):1514–1524, 2007. [15] N. Lang, H. Moonen, F. Srour, and R. Zuidwijk. Multi-agent systems in logistics: A literature and state-of-the-art review. ERIM Report Series Reference No. ERS-2008-043-LIS, available at SSRN: http://ssrn.com/abstract=1206705, 2008. [16] H. Lee. The Triple-A supply chain. Harvard Business Review, 10(11):102–112, 2004. [17] S. Link. On the implication of multivalued dependencies in partial database relations. Int. J. Found. Comp. Sci., Volume 19(3): 691-715, 2008. [18] S. Link. On the logical implication of multivalued dependencies with null values. Conferences in Research and Practice in Information Technology, Volume 51: 113-122, 2006. [19] S. Link. Consistency enforcement in databases. Semantics in Databases, Lecture Notes in Computer Science, Volume 2582:139-159, 2001. [20] V. Rangan, A. Zoltners, and R. Becker. The channel intemediary selection decision: A model and an application. Management Science, 32(9):1114–1122, 1986. [21] J. Swaminathan and S. Tayur. Models for supply chains in E-business. Management Science, 49(10):1387–1406, 2003.

154

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-154

A Simple Model of Negotiation for Cooperative Updates on Database Schema Components Stephen J. HEGNER Umeå University, Department of Computing Science SE-901 87 Umeå, Sweden [email protected] http://www.cs.umu.se/~hegner Abstract. Modern applications involving information systems often require the cooperation of several distinct users, and many models of such cooperation have arisen over the years. One way to model such situations is via a cooperative update on a database; that is, an update for which no single user has the necessary access rights, so that several users, each with distinct rights, must cooperate to achieve the desired goal. However, cooperative update mandates new ways of modelling and extending certain fundamentals of database systems. In this paper, such extensions are explored, using database schema components as the underlying model. The main contribution is an effective three-stage process for inter-component negotiation. Keywords. database, component

Introduction The idea of modelling large software systems as the interconnection of simpler components, or componentware [3], has long been a central topic of investigation. In recent work, Thalheim has forwarded the idea that a similar approach, that of database componentware, is a fruitful direction for the modelling of large database systems [23]. Database componentware is a true software-component approach, in that it embodies the principle of co-design [24] [10] — that applications should be integrated into the design of information systems. Indeed, the formal model [25] is closely related to that of the software components of Broy [5] [6]. While this approach has obvious merits, it does involve one substantial compromise; namely, the classical notion of conceptual data independence [17, p. 33] is sacriﬁced, since the applications are integral to the design. As new applications become necessary, or as existing applications must be modiﬁed, a change to the entire design may become necessary. It is therefore appropriate to ask whether a component-based approach to modelling database systems which preserves conceptual data independence, and thus mirrors more closely the traditional notions of a database schema, is feasible. In [12], the foundations for such a framework were presented. The core idea is that of a schema component, consisting of database schema and a collection

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

155

of its views, called ports. Interconnections are formed by connecting ports; that is, by requiring the states of connected ports to match. Such an interconnection deﬁnes a composite database schema. The idea is closely related to lossless and dependency-preserving decomposition, but it is really a theory of composition — the main schema is constructed from components rather than decomposed into constituents. The structure necessary to connect components together is part of the deﬁnition of the components themselves. The ultimate value of any concept lies in its applicability. In [15], initial ideas surrounding the use of schema components as the underlying framework for the support of cooperative update were presented. The model developed was a proof-of-concept effort, and many simplifying assumptions were made. Furthermore, the focus was upon a formal computational model rather than upon an illustration of how the technique may be used to model situations requiring cooperative update. The goal of this paper is to complement and extend [15]. The main contribution the presentation of a simple yet effective negotiation process. Any approach to cooperative update must support negotiation while still providing for reasonable convergence. While the process described in [15] is guaranteed to converge, the number of steps which are possible can be very large [15, 3.5(a)]. In this paper, a much more efﬁcient negotiation process is developed in which each component executes at most three negotiating steps. This process is illustrated via an extended and annotated example, rather than via a completely formal model. There are a number of other aspects of cooperative update which were not even mentioned, much less addressed, in [15]. In this paper, several of the most important are discussed brieﬂy, and illustrated relative to the running example. One of the most important is relative authority. Even in cooperative situations, there will typically be a hierarchy of authority, so that some players will be obligated in certain ways to accommodate the proposals of others. Others include models of behavior when actors are presented with choices for supporting an update request, and models for ensuring the cooperation does not lead to corruption. There has been considerable research on the general topic of cooperative work in general and cooperative transactions in particular [16] [22] [28]. There has also been some very recent work on synchronizing updates to repositories [18]. Relative to these, the focus of this paper is upon how an update which is proposed by a single agent (the initiator) to a single schema component may be realized via suitable updates to other components. It does not address more general situations in which a group of agents must begin from scratch to produce a desired ﬁnal result, although such situations could conceivably be modelled within the context of schema components also.

1. Fundamentals of Schema Components and Cooperative Update The work of this paper is based upon the formal foundations of schema components and cooperative update, as presented in [12] and [15], respectively. While a complete understanding of the formalisms of those papers is not absolutely necessary for this paper, it is nevertheless useful for the reader to be familiar with the basic concepts and notation. The purpose of this section is to summarize the material from those two references which is central to the rest of this paper. The reader may wish to skim this section rather brieﬂy, referring back to it as the need arises. In any case, the reader is referred to those papers for details and a more systematic presentation. The ideas are presented in terms of the

156

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

classical relational model, although they may easily be generalized to any data model admitting the notions of state and of view. 1.1. Schema Components Let E0 be the relational schema with the single relation symbol R[ABCDE], constrained by the functional dependencies (FDs) F = {B → C,C → DE}. The notation LDB(E0 ) is used to represent the set of all legal databases of E0 ; that is, the set of all relations on ABCDE which satisfy the FDs in F, while DB(E0 ) denotes the set of all databases on E0 which may or may not satisfy the constraints of F. Consider the decomposition of this schema into its four projections in {AB, BC, CD,CE}. Using classical relational database theory, it is easy to establish that this decomposition is lossless, in the sense that the original database may be reconstructed by joining together the projections, and dependency preserving in the sense that the elements of F may be recovered from the dependencies which are implied on the projections. Together, these two properties imply that there is a natural bijective correspondence between LDB(E0 ) and the decomposed databases. More precisely, if N = NAB , NBC , NCD , NCE is a quadruple of databases, with NAB a relation on AB which satisﬁes all of the dependencies in (the closure of) F which embed into AB, and likewise for NBC , NCD , and NCE on their respective projections, then there is an M ∈ LDB(E0 ) which decomposes into N. E0 = To proceed further, a more comprehensive notation is essential. Deﬁne ΠBC E BC 0 BC (E0 , πBC ) to be the view which is the projection of R onto BC. Here E0 is the relational schema with the single relation symbol RBC , constrained by FAB = {B → C}, and E0 E0 E0 E πBC0 : E0 → EBC 0 is the projection of R onto RBC . The views ΠAB , ΠCD , and ΠCE are deﬁned in a completely analogous fashion, with analogous notation, as the projections onto the given sets of attributes. Modelling using components embraces explicitly two related notions which are only implicit in the above view-based approach. First, the model is totally distributed, in the sense that no reference to a main schema is necessary. Second, because of this lack of an explicit main schema, the means by which the components are interconnected must be made explicit. These ideas are now examined in more detail in the light of the above example. E0 The component corresponding to ΠAB consists of the schema EAB 0 together with the EAB

EAB

EAB

AB 0 0 view ΠB0 of EAB 0 which projects AB onto B. Write KAB = (E0 , {ΠB }). The view ΠB is called a port of KAB because it is used to connect to other components. A component EBC

EBC

0 0 may have more than one port. Indeed, KBC = (EBC 0 , {ΠB , ΠC }) has two ports. The ECD

ECE

BC 0 0 components KCD = (EBC 0 , {ΠC }) and KCE = (E0 , {ΠC }), each with a single port, are deﬁned similarly. For each of these components, the ﬁrst entry is the schema and the second its set of ports. It is convenient to have a graphical notation for the representation of interconnected components. Figure 1 illustrates this notation for the example just given. The components are represented as rectangles, with the ports depicted as circles. When two ports are connected, they are shown as a single circle. The interconnection family for Figure 1 speciﬁes how the components are interconnected, and gives the sets of ports which are connected together. In this case, it is EAB

EBC

EBC

ECD

ECE

J0 = {{ΠB0 , ΠB0 }, {ΠC0 , ΠC0 , ΠC0 }}. A single member of an interconnection fam-

157

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

KCD KAB RAB [AB]

KBC

πBAB

RB [B]

πBBC

RBC [BC] B →C

πCBC

πCCD RC [C]

RCD [CD] C→D KCE

πCCE

RCE [CE] C→E

Figure 1. An interconnection of components

ily is called a star interconnection. Thus, J0 consists of two star interconnections. For this notation to be unambiguous, the set of components must be name normalized, in that globally, over all components, no two ports have the same name. Since this is just a naming convention, it can always be met through suitable renaming. Note, on the other hand, for two ports to be members of the same star interconnection, they must have idenEAB

EBC

tical schemata. For example, even though ΠB0 and ΠB0 are distinct ports, from distinct components, they have identical (and not just isomorphic) schemata. This condition is essential because the semantic condition on such an interconnection is that the states of all such view schemata must be identical. When the port schema (deﬁned by RB in this E case) is from a view of a main schema (ΠB0 in this case), this happens automatically, but in the case of component interconnection without reference to a main schema, it must be enforced explicitly. Note further that the graphical notation of Figure 1 embodies this idea implicitly, since each common port schema is represented by a single circle. 1.2. Cooperative Update For convenience, assume that the current state of the main schema is M = E0 {R(a1 , b1 , c1 , d1 , e1 ), R(a2 , b2 , c2 , d2 , e2 )}. The state of ΠAB is then MAB = {RAB (a1 , b1 ), E0 E0 E0 RAB (a2 , b2 )}, with the states MBC , MCD , and MCE of ΠBC , ΠCD , and ΠCE obtained simE0 ilarly. Suppose that a given user aAB has access to the database only through view ΠAB , E0 and wishes to insert RAB (a3 , b2 ). This update can be realized entirely within ΠAB . By inE0 serting R(a3 , b2 , c2 , d2 , e2 ) into M, the desired update to ΠAB is achieved without altering the state of any of the other three views. Indeed, this is an instance of update via the E classical constant-complement strategy [2]. The mutual view ΠB0 , the projection onto B, E0 E0 is called the meet of ΠAB and ΠBC , and is precisely that which must be held constant under the constant complement strategy [11]. Now suppose that instead that user aAB wishes to insert RAB (a3 , b3 ). This update E0 which holds the states of the other cannot be realized by a change to the state of ΠAB three views constant. Indeed, it is necessary to insert a tuple of the form RBC (b3 , c? ) into E0 E0 . Since user aAB does not have write access to view ΠBC , the cooperation the state of ΠBC of another user who has such write access, say aBC , is necessary. If that user chooses to insert, say, RBC (b3 , c2 ), then the process terminates without any need for cooperation from E0 E0 or ΠCE . However, if user aBC chooses to cooperate by inserting, say, RBC (b3 , c3 ), then ΠCD E0 E0 the cooperation of additional users, one for ΠCD and one for ΠCE is necessary. Finally, if these additional users choose to insert RCD (c3 , d3 ) and RCE (c3 , e3 ), respectively, then

158

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

the tuple R(a3 , b3 , c3 , d3 , e3 ) may be inserted into the state M of E0 to achieve the desired result. Note that no single user, of a single view, could effect this update; by its very nature it requires the cooperation of distinct views, likely controlled by distinct users.

2. Three-Stage Negotiation for Cooperative Update In this section, a three-stage negotiation process for cooperative update on an interconnection of schema components is developed. Rather than presenting a completely formal model, the main ideas are developed in detail in the context of a simple business process, the approval of a travel request. This example is superﬁcially similar to that found in [15]; however, not only the example process but also the underlying schema differs substantially, because the points which require emphasis are quite different. 2.1. The Schemata and Components of the Example Figures 2 and 3, together with Table 1, provide the basic deﬁnitions for the example, which is presented in the relational model. In Figure 2, the immutable relations of the model; that is, the ones which may not be updated (at least for the purposes of servicing a business process) are shown. Keys are marked with an underline, while set-valued attributes (i.e., multisets in the terminology of SQL:2003 [8]) are marked with a :::: wavy underscore. Thus, each employee has an employee ID, a home department deﬁned by the ID of that department, and a set of assigned projects. Similarly, each department has a supervisor, each account has an account manager, and each project has a supervisor and a set of accounts (for travel funds). These relations are shared by all components. Employee [ EmpID, DeptID, ProjIDs ] :::::

Project [ ProjID, SupID, ProjAccts ]

Department [ DeptID, SupID ] Account [ AcctID, AMgrID ]

:::::::

Figure 2. The immutable relations of the running example

Figure 3, which employs the symbolic notation which was introduced in [12] and is summarized in Section 1, shows the basic schema components and ports. The upper line in each rectangle (e.g., Accounting) gives the name of the associated component, while the lower line (e.g., RActg SBank ) identiﬁes the mutable relations which deﬁne the schema of that component; that is, the relations which may be modiﬁed in the course of servicing a travel request. Shown within each circle is the relation deﬁning the schema of the associated port. Information on the attributes of the individual relations of the components, aside from the port relations, is given in Table 1. For each attribute name, a checkmark in the column of a relation indicates that the attribute is included in that relation, and an underline of a checkmark indicates that the given attribute is a key. Thus, for example, RActg may be expressed more completely in standard relational notation as RActg [TripID, EmpID, ProjID, TotalCost, AcctID, ApprvAcct]. Since TripID is a key for every relation of the form Rxxx (i.e., every relation except SBank ), those relations may be joined together to form one large relation R on the set of all attributes

159

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

Accounting RActg SBank

RAcAm

AccountMgr RActtMgr

RSeHt

RSeAc

RSeDm

ProjectMgr RProjMgr

RSeEm

Secretariat RSecrt

RSePm

DeptMgr RDeptMgr

Hotel RHotel

Employee REmpl

Figure 3. The components of the running example and their relations

Travel

REmpl

RSecrt

RHotel

RActg

RActtMgr

TripID

✓

✓

✓

✓

✓

✓

✓

EmpID

✓

✓

✓

✓

✓

✓

ProjID

✓

✓

✓

✓

✓

✓

Purpose

✓

✓

✓

✓

StartDate

✓

✓

✓

✓

✓

EndDate

✓

✓

✓

✓

✓

Location

✓

✓

✓

✓

✓

HotelCost

✓

✓

✓

TotalCost

✓

✓

✓

✓

AcctID

✓

✓

✓

ApprvProj

✓

ApprvSup

✓

ApprvAcct

✓

HotelName

✓

✓

RProjMgr

RDeptMgr

SBank

✓ ✓ ✓

✓

✓

✓ ✓

Balance Table 1. The mutable relations of the running example

shown in Table 1, save for the last one, Balance, which is used only in SBank . Then SBank may be joined with R, since AcctID is a key for it, and thus a universal relation Travel on all of the attributes may be obtained, with each of the component relations a projection of Travel. Each relation associated with a port is also a projection of Travel; the attributes of a port schema are given by the intersection of the attributes associated with the connecting components. For example, the attributes of RSeAc are {TripID, EmpID, ProjID, TotalCost, AcctID, ApprvAcct}. The semantics of the attributes of Table 1 are self explanatory, for the most part. Each trip is taken by a single employee and is associated with a single project. It has a purpose, a start date, and end date, and a location. There is a total cost for the entire trip, as well as the cost of just the hotel. The costs are charged to a single account. A trip must receive three distinct approvals, one by the project supervisor, one by the department supervisor, and one by account manager for the account to which the charges are made.

160

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

Finally, the relation SBank recaptures that each account has a balance, which is reduced accordingly when a trip is charged to that account. The component interconnection of Figure 3 illustrates a spoke-and-hub topology, in that there is a central vertex (in this case Secretariat) which embodies most, but not all, of the mutable information. This is not an essential feature of the schema-component model, but it is a very useful architecture for many applications, such as the travel-request example considered here. Also, in Figure 3, each port schema connects only two components, but this is not a general requirement either, as the example of Section 1 illustrates. 2.2. The Representation of a Simple Update Request In principle, a travel request may be initiated as an update to any of the components. Indeed, this is one of advantages of the using schema components to model business processes — the actual control ﬂow need not be speciﬁed; rather, only the constraints on that ﬂow imposed by the model need be respected. One of the most common cases is that that an employee, say Annie for the sake of concreteness, initiates a request for her own travel. Annie has write access only to the component Employee, and indeed, only to tuples of REmpl which are associated with her EmpID. Suppose that she is working on the French project and wishes to travel to one of Nantes or Nice from April 1 to April 5. To express this request as an update, she obtains a new TripID from a server and proposes an insertion of a single tuple into REmpl satisfying the following expression. uEmpl:0 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), 1000 ≤ TotalCost ≤ 1500, HotelCost ≤ 1500, HotelName = ∗. The plus sign indicates that the update is an insertion; that is, the tuple(s) indicated by the expression are to be inserted. It actually represents many possibilities, and so is termed a nondeterminstic update request, and the expression uEmpl:0 identiﬁes an update family. Each possible update inserts only one tuple, but the values of the TotalCost, HotelCost, and HotelName ﬁelds are not ﬁxed. No values for HotelCost and HotelName are excluded. Since Annie does not know Nantes, she has used the ∗ wildcard to indicate that she expresses no preference for a hotel, and allows a cost up to and including the total amount for the trip. Similarly, any value for TotalCost between 1000 and 1500 Euros inclusive is a possibility. In effect, an update family may be thought of as a set of ordinary, deterministic updates. In this case, there is one deterministic update in uEmpl:0 for each quadruple (Loc, TC, HC, HN) in which Loc ∈ {Nantes, Nice}, 1000 ≤ TC ≤ 1500, 0 ≤ HC ≤ 1500, and HN is the name of a hotel in the appropriate city. It is assumed that all such update families are checked for integrity with the given constraints. For example, the relation Employee must reﬂect that Annie is a member of the French project. 2.3. The Three Stages of the Negotiation Process Annie has the authority to update REmpl only insofar as that update does not affect the other components. However, any of these proposed updates would affect the state of

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

161

RSeEm as well. Thus, the cooperation of neighboring components, in this case the Secretariat component, must be obtained in order to obtain a completion of her initial request. The component Secretariat will then need to cooperate with other components. The process by which all components come to agreement on a completion of the initial update request uEmpl:0 is called negotiation. In [15], a negotiation process is described in which any component can make a decision at any time. While such a model is very attractive theoretically and is well suited for the formal model presented there, convergence may be very slow. Here, a simple negotiation process is described in which each component goes through three distinct stages, although different components may be in different stages at different times. For a given component, each stage requires the execution of one well-speciﬁed task. Once these tasks are completed, the negotiation process is complete. In particular, negotiation cannot continue indeﬁnitely in a back-and-forth fashion. The description given below assumes that the interconnection are acyclic [12, Sec. 3], in the sense that there are no cycles in the graph which represents the interconnection of the components. The example interconnection of Figure 3 is acyclic. It also requires a few simple deﬁnitions. For components K and K , a simple path from K to K goes through no component more than once. For example, in Figure 3, Employee, Secretariat, DeptMgr is a simple path from Employee to DeptMgr, while Employee, Secretariat, ProjectMgr, Secretariat, DeptMgr is a path which is not simple. For an acyclic graph, there is at most one simple path between any two components. Let Γ be a port of K . Call Γ inner relative to K if it occurs on the simple path from K to K , and outer otherwise. For example, the port of Accounting deﬁned by RSeAc is inner with respect to Employee, while the port deﬁned by RAcAm is outer. Call a component K extremal with respect to another component K if there is a simple path K = K0 , K1 , . . . , Kn = K from K to K and this path cannot be extended beyond K while keeping it simple. Relative to Employee, the components Hotel, AccountMgr, ProjectMgr, and DeptMgr are extremal, while the others are not. The three stages of the negotiation process are described as follows. Stage 1 — Outward propagation: During Stage 1, the initial update request is radiated from the initiating component outwards to the other components. Each user of a given component, as it receives information about the initial update request, makes a decision regarding the way in which it is willing to support that request. It is only during this stage that such decisions may be made. In the later stages, each user must respect the decisions which were made in Stage 1. Since the underlying graph is assumed to be acyclic, each component receives information about the proposed update from at most one of its neighbors. Thus, there is no need to integrate information from different sources during this step. The component which initiates the update request enters Stage 1 immediately. It then projects this request onto its ports; each neighboring component then lifts state on the port to an update request on its own schema. These neighboring components enter Stage 1 as soon as they have performed this lifting. The process then continues, with each component which are newly in Stage 1 projecting its lifting onto their inner ports relative to the initiating component. It ends when the liftings have been propagated to the extremal components.

162

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

Stage 2 — Propagate inward and merge: During Stage 2, the liftings which were chosen during Stage 1 are radiated back inwards towards the initiating component. In each component, the information from its neighbors which are connected to its outer ports is merged into a single update family. Since an extremal component has no outer ports, it enters Stage 2 as soon as it has decided upon a lifting for the update request. After that decision has been made, it is transmitted it back to the component from which the initial update request was received during Stage 1 by projecting it onto the appropriate port. Components which are not extremal enter Stage 2 when they have received a return update request from each neighbor which is connected to an external port, and then have merged the possibilities of these into a single update family. This merged update family is then transmitted back towards the initiating component via the inner ports of the current component. This merger may be empty, in which case it is impossible to realize the initial update request. However, even if it is empty, it is transmitted back. Stage 3 — Choose ﬁnal state and commit: Once the initiator of the update request has received and merged all of the incoming requests, it has reached Stage 2, and that marks the end of Stage 2 for all components, since all components have now merged the information from their more outward neighbors. The ﬁnal step is for the initiating component to select one of the possibilities which it has computed in its merge as the actual update to its schema. (If this set of possibilities is empty, the update request fails.) Once it has chosen a possibility, it transmits this decision outward, just as in Stage 1. Each component must make a decision as to which of the possibilities in the update family determined in Stage 2 will be the actual update. This decision process is called Stage 3. Once all of these decisions are made, the update can be committed to the database. There is one detail which was not elaborated in the above description. It is possible that some components will not need to be involved in the negotiation process, because none of the possible liftings will change their states. These components are simply ignored in the process. 2.4. The Negotiation Process via Example The three-stage process described above is now illustrated on the running example, using the update family uEmpl:0 deﬁned in 2.2. In the ﬁrst step, the update to the component Employee is projected onto the view RSeEm ; in this case RSeEm and REmpl have the same attributes and so this projection is the identity. At this point, Employee has completed Stage 1. Next, this projection must be lifted to an update family on the schema of the component Secretariat, which must include values for every attribute of RSecrt ; that is, every attribute listed in Table 1 save for Balance. Without further restrictions, a user of the Secretariat component (a human secretary, say) could choose any subset of the set of possible liftings to propagate forward, including the empty set, which would abort the proposed update. This liberal model is in fact used in [15]. In a real modelling situation, the set of liftings which are allowed must be regulated in some way; this topic is discussed further in 3.3. For now, assume that the rôle of the Secretariat carries no decision-making authority; thus, it must allow all possible liftings which do not involve extraneous riders, such as additional

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

163

travel for someone else. See 3.2 for an elaboration of this notion. The lifting will then have a representation of the following form. uSecrt:0 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), HotelName = ∗, HotelCost ≤ 1500, 1000 ≤ TotalCost ≤ 1500, ApprvProj = Carl, ApprvSup = Barbara, ( AcctID = A1, ApprvAcct = AM1 ∨AcctID = A2, ApprvAcct = AM2 ∨AcctID = A3, ApprvAcct = AM3 ∨AcctID = A4, ApprvAcct = AM4) The IDs for the project supervisor and department manager have been ﬁlled in, since these are single valued and given in the immutable tables Project and Department. Similarly, the identities of the four accounts which are associated with the French project, together with their managers, are obtained from the table Account. No decision on the part of the secretariat is required to determine these values. To complete the process for Stage 1 for component Secretariat, uSecrt:0 is projected onto each outer port. At this point, Stage 1 for component Secretariat is complete. Consider ﬁrst the communication with the component Hotel, which is assumed to be autonomous (with no decision-making authority) and simply returns a list of available hotel rooms for the given time interval. Suppose that the following lifting is obtained. uHotel:0 := +TripID = 12345, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, ( HotelCost = 1600, HotelName = TrèsCher ∨HotelCost = 1200, HotelName = AssezCher ∨HotelCost = 400, HotelName = PasCher ∨HotelCost = 200, HotelName = Simple) Thus, there are no hotels available in Nice for the request period of time, but there are four from which to choose in Nantes (although one turns out to be too expensive). Hotel is an extremal component, so upon placing this lifting on the port deﬁned by RSeHt , both Stage 1 and Stage 2 for that component are complete. This result is held by Secretariat until the other responses are received and it can complete its processing for Stage 2. Next, consider the projection onto the outer port deﬁned by RSeAc , connected to component Accounting. Only the values for TripID, EmpID, ProjID, and TotalCost, as well as the alternatives for AcctID and ApprvAcct, are included. The lifting to the component Accounting must add information on the relation SBank , as shown below. uActg:0 := +TripID = 12345, EmpID = Annie, ProjID = French, ( AcctID = A1, 1000 ≤ TotalCost ≤ 1500, ApprvAcct = AM1 ∨AcctID = A2, TotalCost = 1000, ApprvAcct = AM2 ∨AcctID = A3, 1000 ≤ TotalCost ≤ 1100, ApprvAcct = AM3) ∪ ±Balance ← Balance − TotalCost

164

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

The account A4 has been excluded because the balance was insufﬁcient to fund the trip. (Assume that it was 900 Euros, say.) Similarly, the amounts allowed for accounts A2 and A3 are below those of the initial request, since these accounts cannot fund the entire 1500 Euros. This process of reducing the allowed liftings is called trimming. A decision to exclude other accounts, such as A2, might also be made; whether or not this would be allowed would depend upon the authority of the user of this component (see 3.3). However, in this example, all applicable accounts with sufﬁcient balance have been included. Also, in this model, the entire cost of the trip must be paid from one account; the cost of a single trip may not be shared amongst accounts. In contrast to the update families which have been obtained thus far, this one is not a pure insertion. In order to pay for the trip, funds must be removed from the paying account. Thus, the update, which is tagged with a “+” indicating an insertion, also has a sub-update which is tagged with a “±”, indicating a modiﬁcation. Standard imperative programming notation has been used to express this. To complete Stage 1 for Accounting, this update family is passed to component AccountMgr via the port with schema RAcAm . Here there is not a single user which must construct a lifting; rather, each account manager must make a decision, and these decisions subsequently combined into a single lifting. However, no negotiation amongst these managers is required; the individual decisions are independent of one another. Suppose that two of the account managers agree to funding, each at a different level, but a third (AM2 for account A2) does not, so that the lifting in AccountMgr is given by the following expression. uActg:0 := +TripID = 12345, EmpID = Annie, ProjID = French, ( AcctID = A1, 1000 ≤ TotalCost ≤ 1500, ApprvAcct = AM1 ∨AcctID = A3, 1100 ≤ TotalCost ≤ 1100, ApprvAcct = AM3) Since AccountMgr is an extremal component, this lifting is transmitted back to component Accounting, thus completing not only Stage 1 but also Stage 2 for AccountMgr. This information requires that component Accounting trim its initial proposal to remove the possibility of using account A2. The following is computed as the ﬁnal lifting in Accounting. uActg:1 := +TripID = 12345, EmpID = Annie, ProjID = French, ( AcctID = A1, 1000 ≤ TotalCost ≤ 1500, ApprvAcct = AM1 ∨AcctID = A3, 1000 ≤ TotalCost ≤ 1100, ApprvAcct = AM3) ∪ ±Balance ← Balance − TotalCost Component Accounting now projects this result back to its inner port deﬁned by RSeAc , thus completing its Stage 2. The component Secretariat is still in Stage 1, and must communicate the initial update request to the other two manager components, ProjectMgr and DeptMgr. The project manager and department manager make only approve/disapprove decisions; no other parameters are involved. They are presented only with the proposed values for TripID, EmpID, ProjID, Purpose, StartDate, EndDate, and Location. They indicate approval by placing their IDs in the respective approval ﬁelds: ApprvProj or ApprvSup. For example, the update expression which is passed to the component ProjectMgr is

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

165

uSePm:0 := TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), ApprvProj = Carl Observe in particular that the location is given as either Nantes or else Nice. Even though there are no hotels available in Nice, for this simple model, the communication of component Secretariat with Hotel, Accounting, ProjectMgr, and DeptMgr occurs in parallel. Thus, it is not necessarily known that there are no hotels available in Nice when this update request is sent to ProjectMgr. Furthermore, even if Secretariat had received the reply from Hotel before initiating communication with ProjectMgr, it may not have the authority to pass this information along to that component. See 3.1 and 3.3 for a further discussion of this type of situation. Returning to the communication with ProjectMgr, it indicates approval by returning this same expression, and indicates rejection by returning the empty expression. In either case, since it is an extremal component, returning the decision completes Stages 1 and 2 for it. An analogous expression applies for communication with the component DeptMgr. In the decision ﬂow of this example, assume that both return positive decisions. At this point the Secretariat component has received all of the responses, and is in a position to complete its Stage 2. To do this, it merges all of these responses to ﬁnd a greatest common expression; that is, the largest update family which respects each of the update families which was reﬂected back to it. The expression which is obtained is the following. uSecrt:1 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, ApprvSup = Barbara, ApprvProj = Carl, ( 1200 ≤ TotalCost ≤ 1300, AcctID = A1, ApprvAcct = AM1, HotelCost = 1200, HotelName = AssezCher ∨1000 ≤ TotalCost ≤ 1300, AcctID = A1, ApprvAcct = AM1, ( HotelCost = 400, HotelName = PasCher ∨HotelCost = 200, HotelName = Simple) ∨1000 ≤ TotalCost = 1100, AcctID = A3, ApprvAcct = AM3, ( HotelCost = 400, HotelName = PasCher ∨HotelCost = 200, HotelName = Simple)) To complete Stage 2 for Secretariat, this expression is projected back to component Employee as the following. Note that details about approval and about which account can fund the trip are not included; such information is not part of the view for Employee.

166

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

uEmpl:1 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, ( 1200 ≤ TotalCost ≤ 1300, HotelCost = 1200, HotelName = AssezCher ∨1000 ≤ TotalCost ≤ 1300, ( HotelCost = 400, HotelName = PasCher ∨HotelCost = 200, HotelName = Simple)) This completes Stage 2 for Employee. Now, for Stage 3, Annie must choose one of the possibilities. If she decides to take as much travel funds as possible, namely 1300 Euros, she will have only 100 Euros left for the hotel. So, she chooses the hotel PasCher for 400 Euros. Because she is a very responsible person, and because the hotel is so inexpensive, she decides to take only 1100 Euros in total expenses, since 700 is more than enough to cover the other expenses. Her ﬁnal, deterministic update request is thus the following. uEmpl:2 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, TotalCost = 1100, HotelCost = 400, HotelName = PasCher To complete Stage 3 for all components, this decision must be propagated to the other components, and then committed to the database. This is not quite trivial, because even though Annie has made a decision, there is still a choice to be made in another component. In this example, since she chose to take only 1100 Euros, either account A1 or account A3 may be charged. It is within the domain of the administrator who has update rights on the Accounting component to make this decision. In any case, the process of propagating the decision to the other components is again a simple project-lift process, which will not be elaborated further here. Once these decisions are made, the update may be committed to the database, completing Stage 3. 2.5. Analysis of the Three-Stage Negotiation Process The process presented here is a very simple one. Basically, there are only two points at which an actor may make a decision. The ﬁrst is during Stage 1, when the set of alternatives which the actor will accept is asserted. In effect, the actor agrees to support each of these alternatives for the life of the negotiation process. This stands in sharp contrast to the model forwarded in [15], in which an actor may at any time decide to withdraw alternatives which it previously agreed to support. Similarly, in Stage 3, an actor must decide which of the alternatives to support in the ﬁnal update, but this is also a single decision which may not be modiﬁed once it is made. Stage 2 does not involve any decisions at all. Rather, its purpose is to merge the decisions made in Stage 1, and may be carried out in an entirely automated fashion, without any input at all from the actors. Again, this is in contrast to the approach of [15], in which the actors may examine the results of merging the previous results and make new decisions as to which alternatives to support and which to reject. The upshot is that the total number of steps required in the negotiation process is effectively independent of the number of alternatives considered.

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

167

In contrast, the process described in [15] will in the worst case require a number of steps proportional to the total number of alternatives possible for satisfying the update request. Of course, this reduction comes at the expense of some ﬂexibility in the process itself, but for many applications it should be more than adequate. The dominant cost for this approach is governed not by the number of decisions but rather by the resources required to specify and manage nondeterministic update speciﬁcations. This is indeed an important issue which requires further work. It may be addressed both by exploring efﬁcient methods for representing such speciﬁcations, as discussed in Section 4.2, and by controlling the number of such alternatives and the ways in which they are propagated, as discussed further in Sections 3.1 and 3.2. However, the point is that with the approach to negotiation presented here, the evolution of that process itself is not the bottleneck.

3. Further Modelling Issues for Cooperative Update In describing the update and negotiation process via the running example of Section 2, some issues were glossed over in the interest of not clouding the main ideas with details. In this section, some of these more important details are elaborated. On the other hand, issues which are not addressed at all in this paper, such as concurrency control, are discussed in 4.2. 3.1. Context Sensitivity of the Lifting Strategy In the lifting uactg:0 in the example of Section 2, employee Annie made a request to travel either to Nantes or else to Nice for the French project, and department manager Barbara approved this request. However, suppose that Barbara had instead rejected this request, but would have approved a reduced request which includes only the possibility to travel to Nantes, but not to Nice. In other words, she would reject the request to travel to Nantes were it accompanied by an alternative to travel to Nice, but not if Nantes were given as the sole possibility for the destination. In this case, it is said that her decision is context sensitive. Although context-sensitive lifting behavior might seem less than completely rational, it must be acknowledged that human actors may sometimes exhibit such characteristics in their decision making. This work is not primarily about modelling human decision makers. However, context sensitivity in lifting behavior does have important implications. Suppose that, for efﬁciency purposes, the component Secretariat were allowed to check hotel availability before forwarding travel requests on to the managers. In that case, since no hotel is available in Nice for the requested time period, the department manager would not see that Annie had requested also to travel to that city, since that information would be ﬁltered out before being transmitted to DeptMgr. Thus, Barbara would see only the request to travel to Nantes, and so would approve it. In this case, whether or not the travel request is approved depends upon the order in which impossibilities are ﬁltered out. On the other hand, if Barbara exhibited a context-free decision behavior; that is, if whether she would approve the trip to Nantes were independent of any other requests which Annie had made, allowing the Secretariat to check hotel availability before forwarding the request on to the managers would not affect the ﬁnal outcome.

168

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

It is important to emphasize that this notion of context sensitivity relates to alternatives in the update family, and not upon conjunctive combinations. For example, if the request of Annie contained two alternatives, one to travel just to Nantes, and a second to travel both to Nantes and to Nice, then to approve the travel to Nantes, but not the combined travel to both Nantes and Nice would be perfectly context free. Context sensitivity has only to do with rejecting a given alternative on the grounds of the presence of other alternatives. 3.2. Admissibility for the Lifting Strategy In Stage 1 of the negotiation process, the liftings should be minimal in the sense that they do not make any changes which are not essential to the update request. Within the limited framework of the running example, it is difﬁcult to illustrate liftings which are not minimal. However, suppose that the component DeptMgr contains an additional relation SBudget (DeptID, Amount) which represents the department budget, and this component is connected to an additional component UpperMgt representing upper management, as illustrated in Figure 4.

RSePm

DeptMgr RDeptMgr SBudget

RDmUm

UpperMgt SBudget

Figure 4. Additional component for rider update

Now, suppose that in approving the travel for the trip of Annie, the department manager also adds an increase of 100000 Euros to the department budget to the lifting, so that it becomes uDeptMgr:0 := TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), ApprvProj = Carl ∪ DeptID = CDpt, ; Amount ← Amount + 100000 Here Carl has added a rider to the update request; to be approved, an additional update which is irrelevant to the original request must be realized as well. This lifting is not minimal because the rider could be removed without compromising support for the original update request. It may not always be possible to characterize minimality of a lifting in terms of inserting and deleting the minimal number of tuples. There might be a situation, such as a funds transfer, in which the amount should be minimal. However, the principle remains clear.

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

169

3.3. The Model of Authority A suitable framework for describing and managing access rights in the context of cooperative update requires certain special features beyond those of conventional database systems, since traditional access rights do not take into account any form of cooperation. One suitable model builds upon the widely-used notion of rôle-based access control, which was introduced in [1] using the terminology named protection domain or NPD, and which is elaborated more fully in articles such as [20]. The key idea is that rights are assigned not to individual users, but to rôles. Each user may have one or more rôles, and each rôle may have one or more users as members. For example, Barbara may have the rôle of manager of the French project, but she may also be an ordinary employee when making a travel request for herself. In addition to the usual privileges hierarchy, in which A ≤ B means that B has all privileges which A has, there is a authority hierarchy, in which A ≤ B means that A must support fully the requests of B. A possible authority hierarchy for the example of Section 2 might be the following, in which the ordering is represented from left to right. TravelAgent

<

< <

Scientist

Secretary

Manager

< Accountant

The employee Annie might make the travel request from the component Employee in the rôle of Scientist, in which case someone (or something — a program perhaps) in the rôle of Secretary using the component Secretariat and someone/something in the rôle of TravelAgent using the component Hotel would need to respect the update request of Annie, but those assuming the rôles of Accountant or of Manager (in the components with corresponding names) would have the right to trim her request as they see ﬁt. This is only a sketch of how the model of authority works; the details will appear in a forthcoming paper. 3.4. The Model of Representation and Computation The representation of update families, and the computations involved in lifting and merging them, are illustrated via example in Section 2, with the basic ideas hopefully clear. It is nevertheless appropriate to provide a bit more information as to what is allowed. First of all, update families are generally taken to be ﬁnite; that is, they represent only a ﬁnite number of alternatives. This means that, at least in theory, the liftings of Stage 1 of the negotiation process can be computed on a case-by-case basis. Consider the initial update request uEmpl:0 of 2.2. While the ranges on values for TotalCost and HotelCost are ﬁnite, the ranges for HotelName is speciﬁed by a wildcard and thus appear to be unconstrained. However, it is assumed that there are only a ﬁnite number of hotels, so this range may be taken to be ﬁnite. A second, computational issue arises in the context of computing merges in Stage 2 of the negotiation process. Here the set of liftings which agree with the update requests on each of several ports must be computed. In the most general case, this is an unsolvable problem. There is nevertheless a very natural case in which such problems do not arise. If the port views are deﬁned by basic SPJ (select-project-join) queries, and if the schema

170

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

has the ﬁnite-extension property [14, Def. 28]; that is, if the classical chase procedure [9] always terminates with a ﬁnite structure, then the merger can be computed as the result of the chase. Of course, there will be one such chase for each set of alternatives in the respective update families, but the total number of such alternatives is ﬁnite. In [19], many cases which guarantee such termination, and thus the semantic-extension property, are identiﬁed. Included in these is the classical situation of schemata constrained by functional dependencies and unary inclusion dependencies (which include in particular foreign-key dependencies), provided that the latter have the property of being acyclic [7]. The bottom line is that, from a theoretical standpoint, there are no problems with representation and computation. However, further work is needed to identify suitable cases which are both useful and efﬁciently solvable. See 4.2 for a further discussion.

4. Conclusions and Further Directions 4.1. Conclusions A straightforward but useful model of negotiation for cooperative update on database schemata deﬁned by components has been presented. In contrast to the approach given in [15], the method presented here involves only three simple stages for each component and thus terminates rapidly. The key idea is that decisions are made only during the ﬁrst stage; thereafter the operations involve only merging those decisions and then selecting one of them as the ﬁnal result. Other aspects of the modelling process, such as the representation of update requests, have been illustrated via a detailed example. This has illustrated that, at least for some examples, such representation is a viable alternative to more traditional, task-based representations. Nevertheless, there are many issues which remain to be solved before the ideas can be put into practice. 4.2. Further Directions Relationship to workﬂow and business-process modelling formalisms The kinds of applications which can be modelled effectively via cooperative update overlap in substantial part those which are typically modelled using workﬂow [26] and/or businessprocess modelling languages [4]. Furthermore, some database transaction models, such as the ConTract model [27], [21], are oriented towards modelling these sorts of processes. Relative to all of these, the cooperative update approach developed here is constraint based, in that it does not specify any ﬂow of control explicitly; rather, it places constraints on what that ﬂow may be. The identiﬁcation of workﬂow and business-process representations for those ﬂows of control which are representable by cooperative update, as well as a way to translate between the various representations, is an important direction which warrants further investigation. An appropriate model of concurrency control Update requests to databases, whether cooperative or not, typically overlap, thus requiring some form of concurrency control. However, traditional approaches are generally inadequate for cooperative update. Since they typically involve at least some human interaction, cooperative update processes are by their very nature long running, and so locking large parts of the database in order to avoid unwanted interaction of distinct transactions is not a feasible solu-

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

171

tion. On the other hand, cooperative transactions typically involve changes to only a very small part of the overall database. Work is currently underway on a non-locking approach which uses information contained in the initial update request to identify tight bounds on the part of the database which must be protected during a cooperative transaction [13]. A distributed model of control and communication The operation of a database system constructed from schema components, particularly in the context of cooperative updates, involves the passing of messages (i.e., projections and liftings) from component to component. Thus, a uniﬁed model of control and communication which is distributed amongst the components is essential to an effective realization of systems with this architecture. Future work will look at the properties and realization of such models. An efﬁcient representation for nondeterministic update families This issue has already been discussed brieﬂy in 3.4. Work is currently underway in two areas. The ﬁrst is to identify economical and computationally ﬂexible representations for nondeterministic update families. The second is to identify ways of computing merges of such nondeterministic update families using only one, or at least relatively few, instances of the chase procedure. More complex models of negotiation The model of negotiation which has been developed and presented in this paper is a very simple one. Although it is useful in modelling many business processes, there is clearly also a need for more complex negotiation processes, particularly ones with a back-and-forth nature in which parties compromise to reach a decision. Future work will look at such general notions of negotiation.

Acknowledgments For three to four months each year from 2005-2008, the author was a guest researcher at the Information Systems Engineering Group at Christian-Albrechts-Universität zu Kiel, and many of the ideas in this paper were developed during that time. He is particularly indebted to Bernhard Thalheim for suggesting the idea that his ideas of database components and the author’s work on views and view updates could have a fruitful intersection, as well as for inviting him to work with his group on this problem. He is furthermore indebted to Peggy Schmidt, for countless discussions and also for fruitful collaboration on the ideas of schema components. She furthermore read initial drafts of this paper and made several insightful comments.

References [1]

[2] [3]

R. W. Baldwin. Naming and grouping privileges to simplify security management in large databases. In Proc. 1990 IEEE Symposium on Research in Security and Privacy, pages 116–132. IEEE Computer Society Press, 1990. F. Bancilhon and N. Spyratos. Update semantics of relational views. ACM Trans. Database Systems, 6:557–575, 1981. G. Beneken, U. Hammerschall, M. Broy, M. V. Cengarle, J. Jürjens, B. Rumpe, and M. Schoenmakers. Componentware - State of the Art 2003. In Proceedings of the CUE Workshop Venedig, 2003.

172 [4] [5]

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

Business process modeling notation v1.1. http://www.omg.org/spec/BPMN/1.1/PDF, 2008. M. Broy. A logical basis for modular software and systems engineering. In B. Rovan, editor, SOFSEM, volume 1521 of Lecture Notes in Computer Science, pages 19–35. Springer, 1998. [6] M. Broy. Model-driven architecture-centric engineering of (embedded) software intensive systems: modeling theories and architectural milestones. Innovations Syst. Softw. Eng., 3(1):75–102, 2007. [7] S. S. Cosmadakis and P. C. Kanellakis. Functional and inclusion dependencies. Advances in Computing Research, 3:163–184, 1986. [8] A. Eisenberg, J. Melton, K. G. Kulkarni, J.-E. Michels, and F. Zemke. SQL:2003 has been published. SIGMOD Record, 33(1):119–126, 2004. [9] R. Fagin, P. G. Kolaitis, R. J. Miller, and L. Popa. Data exchange: Semantics and query answering. Theoret. Comput. Sci., 336:89–124, 2005. [10] G. Fiedler, H. Jaakkola, T. Mäkinen, B. Thalheim, and T. Varkoi. Co-design of Web information systems supported by SPICE. In Y. Kiyoki, T. Tokuda, H. Jaakkola, X. Chen, and N. Yoshida, editors, Information Modelling and Knowledge Bases XX, 18th European-Japanese Conference on Information Modelling and Knowledge Bases (EJC 2008), Tsukuba, Japan, June 2-6, 2008, volume 190 of Frontiers in Artiﬁcial Intelligence and Applications, pages 123–138. IOS Press, 2008. [11] S. J. Hegner. An order-based theory of updates for closed database views. Ann. Math. Art. Intell., 40:63–125, 2004. [12] S. J. Hegner. A model of database components and their interconnection based upon communicating views. In H. Jakkola, Y. Kiyoki, and T. Tokuda, editors, Information Modelling and Knowledge Systems XIX, Frontiers in Artiﬁcial Intelligence and Applications, pages 79–100. IOS Press, 2008. [13] S. J. Hegner. A model of independence and overlap for transactions on database schemata. In B. Catania, M. Ivanovic, and B. Thalheim, editors, Advances in Databases and Information Systems, 14th East European Conference, ADBIS 2010, Novi Sad, Serbia, September 20-24, 2010, Proceedings, volume 6295 of Lecture Notes in Computer Science, pages 209–223. Springer-Verlag, 2010. [14] S. J. Hegner. Internal representation of database views. J. Universal Comp. Sci., 17:–, 2011. in press. [15] S. J. Hegner and P. Schmidt. Update support for database views via cooperation. In Y. Ioannis, B. Novikov, and B. Rachev, editors, Advances in Databases and Information Systems, 11th East European Conference, ADBIS 2007, Varna, Bulgaria, September 29 - October 3, 2007, Proceedings, volume 4690 of Lecture Notes in Computer Science, pages 98–113. Springer-Verlag, 2007. [16] G. E. Kaiser. Cooperative transactions for multiuser environments. In W. Kim, editor, Modern Database Systems: The Object Model, Interoperability, and Beyond, pages 409–433. ACM Press and AddisonWesley, 1995. [17] M. Kifer, A. Bernstein, and P. M. Lewis. Database Systems: An Application-Oriented Approach. Addison-Wesley, second edition, 2006. [18] L. Kot and C. Koch. Cooperative update exchange in the Youtopia system. Proc. VLDB Endow., 2(1):193–204, 2009. [19] M. Meier, M. Schmidt, and G. Lausen. On chase termination beyond stratiﬁcation. CoRR, abs/0906.4228, 2009. [20] S. L. Osborn and Y. Guo. Modeling users in role-based access control. In ACM Workshop on Role-Based Access Control, pages 31–37, 2000. [21] A. Reuter and F. Schwenkreis. ConTracts – a low-level mechanism for building general-purpose workﬂow management-systems. IEEE Data Eng. Bull., 18(1):4–10, 1995. [22] M. C. Sampaio and S. Turc. Cooperative transactions: A data-driven approach. In 29th Annual Hawaii International Conference on System Sciences (HICSS-29), January 3-6, 1996, Maui, Hawaii, pages 41– 50. IEEE Computer Society, 1996. [23] B. Thalheim. Database component ware. In K.-D. Schewe and X. Zhou, editors, Database Technologies 2003, Proceedings of the 14th Australasian Database Conference, ADC 2003, Adelaide, South Australia, February 2003, volume 17 of CRPIT, pages 13–26. Australian Computer Society, 2003. [24] B. Thalheim. Co-design of structuring, functionality, distribution, and interactivity for information systems. In S. Hartmann and J. F. Roddick, editors, APCCM, volume 31 of CRPIT, pages 3–12. Australian Computer Society, 2004. [25] B. Thalheim. Component development and construction for database design. Data Knowl. Eng., 54(1):77–95, 2005. [26] W. van der Aalst and K. van Hee. Workﬂow Management: Models, Methods, and Systems. MIT Press, 2002.

S.J. Hegner / A Simple Model of Negotiation for Cooperative Updates

[27] [28]

173

H. Wächter and A. Reuter. The ConTract model. In A. K. Elmagarmid, editor, Database Transaction Models for Advanced Applications, pages 219–263. Morgan Kaufmann, 1992. W. Wieczerzycki. Multiuser transactions for collaborative database applications. In G. Quirchmayr, E. Schweighofer, and T. J. M. Bench-Capon, editors, Database and Expert Systems Applications, 9th International Conference, DEXA ’98, Vienna, Austria, August 24-28, 1998, Proceedings, volume 1460 of Lecture Notes in Computer Science, pages 145–154. Springer, 1998.

174

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-174

A Description-based Approach to Mashup of Web Applications, Web Services and Mobile Phone Applications Prach CHAISATIEN, Takehiro TOKUDA {prach, tokuda} @tt.cs.titech.ac.jp Department of Computer Science, Tokyo Institute of Technology Meguro, Tokyo 152-8552, Japan

Abstract. Recent developments in mobile technology have enabled mobile phones to work as mobile Web servers. However, the composition of mobile phone applications and Web resources to form new mashup applications requires mobile programming knowledge ranging from how to create user interfaces, network connections and access to Web resources. Furthermore, the unique capabilities of mobile phone applications such as access to camera inputs or sensor data are often limited to local use only. To address these problems, we present a descriptionbased approach and an Integration Model for the composition of mobile mashup applications combining Web applications, Web services and mobile phone applications (i.e., generic components). The compositions appear to require less native mobile programming knowledge. In the current work, to leverage access to these services and applications, an Interface Wrapper was used to transform generic components into mashup components. Composers were able to transform and reuse form-based query results from Web applications and integrate them with wrapped output from users’ interaction with mobile phone applications, and other Web services. The final applications can be configured to work two ways: 1) as native mobile phone applications or 2) as a Web application accessible externally via a mobile Web server application. Keywords. Mobile phone application, mobile Web server, Web service, Web application, mobile mashup, Interface Wrapper

1. Introduction Mobile phone applications deliver unique capabilities such as GPS location services, voice recognition and camera/image processing applications. There are some problems related to the composition of mashup applications of these components with existing Web resources. One of these problems is related to the lack of mobile programming language knowledge needed for the creation of user interfaces and control parts. Another issue is that composers need not only to know how to create a standalone mobile application, but also need additional skills to program the mobile phones to access and reuse Web resources. To address these problems, this paper presents a description-based approach to flexibly compose mashup applications from 3 generic component categories: Web applications, Web services, and mobile phone applications. With minimum

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

175

configuration required, our approach allows composers to accomplish the following tasks in the aforementioned categories: • Simplify and reuse form-based query results from Web applications. • Extract selected portions from Web services’ outputs. • Generate and configure Web service interfaces for mobile phone applications. In the composition procedures, first, the Integration Model is used to describe and plan data flows of the mashup components. The Integration Model later is expanded into configurations of a Mobile Integration Description file (MID file). Then the mashup application generator uses the file to generate the actual mashup applications. In the similar a manner, composers are required to fill the control parameters of each component in the MID file. Thus a mashup application is generated according to those configurations. Lastly, composers were able to configure the final mashup application to run on the device as a mobile phone application or to be accessed externally as a Web application via the mobile Web server application. To leverage access to each mashup component, we termed the components that transform communication interfaces between component categories “Interface Wrappers”. For instance, the Web service wrapper detailed in this study enables intercomponent communication and external access to a mobile phone application using a Web service interface.

Figure 1. Overview of the mashup applications, Interface Wrappers and their relation to outputs and clients

This study’s contribution is the presentation of the model and the methodology to reuse non-API Web resources with existent mobile phone applications to form mashup applications. The established method is to use Java classes to build and to connect components, while our method controls the data flows of existent mashup components through the utilization of configuration files and its parameters. The implementation of this study shows that our approach allows composers to flexibly reuse capabilities of sensors and peripherals controlled by mobile phone applications to integrate them with Web resources and generate new mashup applications. The organization of the rest of this paper is as follows. Related work and research background are reviewed in Section 2. A mashup example is presented in Section 3 to demonstrate our approach. Section 4 explains the method of composing a mobile mashup application. The composition is divided into three working processes: planning

176

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

process, configuration process, and application generation process. Also Section 4, we present the Web information extraction tool used in the configuration process. In Section 5, we provide detailed mashup composition examples, then evaluate them by presenting the applications’ actual drawbacks and problems when applying the same model with other resources. In Section 6 we give a general discussion by making a comparison to the conventional approaches in terms of generation process, objectives and limitations. In Section 7, we describe this study’s future work and present our concluding remarks.

2. Background and Related Work Research disciplines in mobile mashup are usually related to these fields of study: 1. Web page tailoring and adaptation. 2. Web information extraction and reproduction. 3. Mobile mashup languages, modeling and their applications. 4. Mobile servers and ubiquitous environments. Generally, the conventional focus in tailoring and adapting Web pages for viewing on mobile devices gives more importance to extracting and simplifying visual outputs. DOM tree based extraction and proxy server architecture, as presented in [1] [4] and [11], are used to adapt the presentation of a Web page on mobile devices to assist navigation effectiveness. Although these methods promote minimization of information and visualization, when composing a mashup application for mobile phones they appear to support less functionality in communications and integrations over multiple working components. Research in the field of Web information extraction emphasizes methods to correctly indicate and reproduce parts of Web applications for creating new mashup Web applications. The study in [7] proposed a Web information extraction method to generate the virtual Web service functions from Web applications at the client side. This research targeted static contents, a limitation which was later corrected in [8] by allowing dynamic Web contents created by client-side scripts. These two systems are implemented using large external Java libraries including Java Applet. In our case, a mobile device cannot handle loads from external libraries to extract and simulate entire Web pages. In approaches to use description language based on XML, research in [13] and [15] has shown that the majority of description-based XML languages are designed to support content delivery to mobile phones and handheld devices. However, most languages target user interface design and do not facilitate integration with Web information. XISL [12], which extends interaction and input methods, requires an implementation of interpreter and the dialog manager modules. One substantial difference when compared to our approach is that we reuse interactions from existent mobile phone applications, and do not create new applications concerning users’ interaction from the description file. A method to generate a mobile phone application using configuration files was presented in mFoundry’s Mojax [14]. This framework borrows syntaxes from JavaScript, CSS and XML. Mojax applications are compiled and run as native Java code (J2ME) on a device. Mojax also supports development of plug-ins that can access device capabilities such as location services, address book, audio and video. Our approach introduces the transformation of generic components into mashup

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

177

components. Moreover, developers are able to write the optional control parts using Web or mobile programming. In 2002, Intel Research introduced the Personal Server [19], which wirelessly connected to the local network environment. The personal server allows an HTTP connection to access personal pages or file storage. A more specific study of component-based infrastructure can be found in [16]. This system used abstract user interface descriptions to represent software components on an embedded hardware system. Although a method to display system information and control the hardware system using a variety of clients was presented, those connections were specific to the transport layer. The use of information for integration with Web information was also not found. The sharing of mobile services presented in [10] is a system based on websites that support user generated mobile services. Our approach instead promotes the use of mobile integration with Web content, allowing contributors from these two platforms to share their works. The mobile service system in [17] provides an extension of presence awareness to mobile users. Without the implementation of the central server system, our approach is applicable with always-on HTTP connection via the mobile Web server, which allows quick access to shared information anywhere and anytime. Concerning development of mobile Web servers, ServersMan [18] is a mobile application targeted on major mobile platform (iPhone, Windows Mobile and Android). The application enables Web access to devices’ file storage and other parameters such as GPS latitude and longitude. The operations to access devices’ other resources, such as digital compass or accelerometer, are not yet defined. Moreover, the reuse of existent mobile application in the device is also not presented.

3. Mobile Mashup Example We present our example to demonstrate how the description-based approach is used in composing a mashup application. In Example 5.1.1, we show the composition of a mashup application for displaying nearest Wikipedia article and local weather information according to the mobile phone’s location.

Figure 2. Mobile Mashup Example

178

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

This application is targeted for a mobile Internet device (i.e., Output, Web application, iPod Touch, Safari Browser) by compositing a location service from a mobile phone application (i.e., Component A, publisher, GPS Locator, mobile application) with Web services. With no built-in GPS hardware at the client side, component B and C can alternatively retrieve information from a location service on component A and perform queries for their Web service outputs (i.e., Component B, subscriber, Wikinear, Web service and component C, subscriber, LocalWeather, Web service). Working procedures in composing this mashup application are as follow. 1. Specify a starting component: The data flows in this mashup application begin from a component that accesses GPS parameters from a mobile application (GPSLocator). For compatibility to the next components, we first transform the parameters by applying the Web service wrapper to GPSLocator. Composers must specify the Intent parameters and Intent’s extra parameters to retrieve data from this mobile application. It is also required that composers specify the provider role to the component including provider’s ID. Composers need to specify the Web service wrapper’s JSON message as well. 2. Specify next components: The required parameters for the next components are Web services’ URL, query field names, and each field’s value. In this example both Wikinear and LocalWeather used fields named lat for latitude and lng for longitude. The components’ role must be set to subscriber and use publisher’s ID where lat and lng are referred to. 3. Specify the output component: The output component, which is in the form of Web applications, uses query results from Web services described in 2. Composers must specify the mobile Web server’s access path and the output page in the form of HTML code and refer to parameters from Web services’ output. 4. Generate the final mashup application: Composers input information in item 1 - 3 to the MID file and generate the output Web application which is placed on the mobile Web server. Users can access it using a mobile Web server host name and access path according to the configuration. To support the compositions of mashup applications, the Integration Model is used to plan the data flows in the mashup applications.

4. Method in the Composition of Mobile Mashup Application Our method of composing mobile mashup applications consists of planning, configuration, and generation processes. In the planning process, the Integration Model is used to outline the component’s roles, the data flows, and the format of output forms for mashup applications. In the configuration process, the Integration Model is adapted and expanded into the actual configuration of MID files. We use the Web extraction tool to aid composers in retrieving configurations for data extraction from Web applications. Later in the application generation process, the mobile mashup application generator uses MID files to generate the actual mobile mashup application. The data flows involve different components located in different locations of a large system. Therefore, we explain the current system architecture to assist in understanding data flows and how the generated mobile mashup applications are to be placed in the system. Then we present the detailed processes of generating mashup applications.

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

179

4.1. Planning Process: Integration Model Table 1. Model representation of mashup components, roles and output forms Category Mashup component Role Output form

C C C R R R O O O

Model Representation [Web Service (name)] [Web Application (name)] [Mobile Phone Application (name)] [Publisher] [Medium] [Subscriber] [Web Service] [Web Application] [Mobile Phone Application]

Abbreviation

C C C R R R O O O

[WS (name)] [WA (name)] [MA (name)] [Pi] [Mij] [Sj] [WS] [WA] [MA]

A model representation of three mashup components, roles and output forms are shown in Table 1. Parameter indices indicate the publisher-subscriber relationship of component couples. As an extension to publisher (P) and subscriber (S) roles, we use medium role (M) to describe the component that is publishing a subscribed output from another component. The representation of component A in mobile mashup example 5.1.1, which is the Web service wrapper applied to mobile phone applications can be written as C [Mobile Phone Application (GPS Locator)], O [Web Service]

We call these one-tier compositions Interface Wrappers, which are used in order to transform an output’s interface for communication between other mashup components. Table 2 contains a model representation of wrappers, their corresponding functions and sample usages. Table 2. Model representation of Interface Wrappers, their corresponding functions and sample usages. Case

Model Representation

Function

(a)

C [Web Application] O [Web Service] = W [WS[WA (name)]]

Web content extractor functioning as a Web service

(b)

C [Mobile Phone Application] O [Web Service] = W [WS[MA (name)]]

Mobile Web Service Wrapper

(c)

C [Web Application] O [Mobile Phone Application] = W [MA(name)[WA(name)]]*

Mobile application functioning as a Web content extractor

(d)

C [Web Service] O [Mobile Phone Application] = W [MA(name)[WS(name)]] *

Mobile application functioning as a Web service connector

(e)

C [Web Service] O [Web Application] = W [WA[WS(name)]]

Web application functioning as a Web Service connector

(f)

C [Mobile Phone Application] O [Web Application] = W [WA[MA(name)]]

Mobile Web Application Wrapper

Sample Usage Extracts texts from querybased Web page (e.g. product search, book reviews, game ratings) Retrieves GPS coordinates from mobile phone application via a Web Service Displays part of querybased Web page Selects and displays texts from Web service’s result using native mobile phone application Searches and displays results from a Web service Searches for contact info, media or database query on the mobile phone

*Since usage and output of mobile applications are different, application name has to be declared.

180

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

Outputs in the form of Web application (WA) or mobile applications (MA) can be used as end points in creating a mashup application. A composition that contains only one wrapper might not be enough to form a meaningful application. In Section 5, we select case (a) (b) and (c) for our implementation because of the following reasons. • Case (a) and case (e) are considered as existent Web extraction techniques. Web service (WS) output from wrapped WA in case (a) would be more appropriate for showing complexity in creating a mashup application. • Case (b) and (f) are similar to each other, only the output forms are different. We select case (b) to show further integration of its WS output. • Case (c) is more complex than (d). We would like to show how information is extracted from WA in (c) while (d) contains only simple operations to use WS output. By applying new interface syntax (b) to example 5.1.1, a new abstract model can be declared as W C C O

[WS[MA(GPS Locator)]], R [P1], [WS(Wikinear)], R [S1], [WS(Weather Report)], R [S1], [WA(iPod Touch, Desktop)]

In the next section, we describe how to adapt and expand this model as an actual configuration file. 4.2. Configuration Process: MID File and Web Extraction Tool In the configuration process, the Integration Model is adapted and is expanded into the actual configuration of MID files. We use the Web extraction tool to aid composers in retrieving configurations for data extraction from Web applications. 4.2.1. MID File Table 3. Structure of the <project> scope in MID files. Scope <project>

<wrapper>

Possible Child Element (…) or <wrapper> (…) (…) NAME (…) <webapplication> (…) or <webservice> (…) or <mobileapplication> (…) NAME (…) <webapplication> (…) or <webservice> (…) or <mobileapplication> (…) (…) <webapplication> (…) or <webservice> (…) or <mobileapplication> (…)

Composition rule One or more, in any order One or none One One One One One One One One

Note: Italic capital letters denote that an element’s content has to follow definition rules. Notation (…) means that element’s inner contents are in the form of tags.

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

181

A mobile integration description file or MID file is an XML description file and is used for the configuration of working components in our mashup applications. Since each component has unique configurations, we begin by explaining the structure of MID files in Table 3, then we provide a detailed description of each element. The <project> scope of the MID file represents the mashup application according to the Integration Model. Composers are able to put as many and <wrapper> inside this scope, in any order. The name of each component and wrapper (NAME) must be unique since the namespaces are used to access data between components. We then specify the category of each element using one of <webapplication>, <webservice> or <mobileapplication>. This <project> scope ends with one element if the last component is not a wrapper. We do not use when the last component is a wrapper since it contains its own output element. How a MID file is used is shown in Example 5.1.1 and in Figure 3. The MID file includes the Web service wrapper of GPSLocator mobile application and other two Web services. Another example of the Web Service wrapper is shown using the Amazon Web application [3] in Figure 4.

Figure 3. MID file of example 6.1.1 (simplified)

182

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

As shown in Figure 3, the followings are important control parts in a MID file: a) Subscriber ID must match the publisher ID of the component it subscribes to. The publisher.output namespace is used to access parameters in scope of the corresponding publisher. b) Inside the <wrapper> scope of Interface Wrappers, the namespace results is used to refer to data in the scope of the component. c) The element is placed under the <project> and <wrapper> scopes. In the case of project’s output, components’ names are used in front of the namespace results. The namespace results is used to access to the corresponding component’s results (i.e., Wikinear.results and LocalWeather.results). d) In the current development version, Web applications can be generated using settings limited to the combination of <string> and . It is recommended that composers use external Web builders and integrate the working Web pages with a Web service output. e) The parameter in <mode> under the scope of <mobileapplication> is used to determine whether the information from mobile applications is retrieved actively from a foreground process, or passively from a background process. The <mode> parameter of the output mobile application is only allowed to be set as active.

Figure 4. Part of a MID file showing the Web service wrapper of the Amazon Web Application (simplified).

In Figure 4, the following are important control parts in a MID file regarding <webapplication> scope: f) The parameter in <mode> under the scope of <webapplication> is used to determine whether the information to be input into the query form of the Web application is passed from other components passively (passive) or requires user input (active). g) The parameter in under the scope of and <webapplication> is defined as the input of Web application components. This parameter can be set as

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

183

a namespace or a user input (userinput). A Web application’s mode has to be set to active in order to use the parameter userinput. For the next element, if the resulting pages require clicks to continue to the next page, we set parameter click to the scope. h) The parameter in under the scope of and <webapplication> is defined as the result of Web application components. This parameter can be set as a transfer of the entire information of this node to results scope (self) or a text attribute of the node (text). i) In Section 4.3, we show the Web extraction tool which automatically generates parameters used in the <webapplication> scope. 4.2.2. Web Extraction Tool To aid composers in correctly specifying the desired information from the form-based Web application, we use the web extraction tool to help in generating parameters for Web application components (i.e., <webapplication> scope) in the composition of mashup application. The tool is written in JavaScript which can be run on any browser. Figure 5 shows the sample use of the Web extraction tool applied to www.amazon.com [3] to search for a product title using a barcode number. We later used this page as a Web application component to integrate with a barcode reader mobile application in Section 5.1.2. Working steps to acquire the parameters are listed below.

Figure 5. Sample use of the Web extraction tool.

1. Composers select the desired input box and perform right click. At this point the tool will recognize ID, name and class name attribute of the input box. It also calculates the distance of two nodes between the form node to the selected input box node. 2. Next, composers input an example into the input box and submit the page (we used barcode number of a book to search for result).

184

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

3. After this step, composers highlight the desired content on the resulting page and perform right click to the element. 4. Results from step 1 are applied to the element of the Web application component’s setting in the MID file. In the case that the resulting page requires clicks over links, a parameter click can be applied to element and repeat step 3. Setting for multiple resulting pages can be accomplished by adding more element to the configuration. 5. When composers reach the final resulting page, they can perform right clicks to more than one element to generate the corresponding parameters. These parameters will be placed in the element of the Web application component’s setting in the MID file. 4.3. Generation Process: System and Mobile Mashup Application Generator In this section, we begin by explain the current system architecture for understanding data flows and how the generated mashup applications are to be placed in the system. Then we present the detailed processes of generating mashup applications. In the last part, we describe the Web extraction tool which aid composers in data extraction from Web applications. 4.3.1. System Architecture

Figure 6. System architecture

In Figure 6, the physical appearance of the system’s hardware is an Android mobile phone which can function as a generic mobile phone capable of running mobile applications. We used i-jetty open-source Web container [9] to extend its capabilities to

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

185

function as a mobile Web server. As the categories of final mashup applications are of two kinds, describes as follows: In the first case, the mashup applications are generated as mobile applications and are installed to the application storage section like other typical mobile applications. This case occurs when one or more components in the composition are used as an active component. Active components consists of active mobile applications (AMA), active Web applications (AWA), wrappers containing mobile application (WAMA) and wrappers containing active Web application (WAWA). This means that the mashup application used user interaction with a mobile phone application or a Web application. Therefore, an entry point on the mobile phone had to be created as an installable mobile application (using Android Package or APK files). Additional control parts to access to Web resources are stored in the Web Files section of the mobile Web server. In the second case, the mashup applications are generated as Web applications and placed on the Web Files section of the mobile Web server. This case occurs when there is no AMA or WAMA found in the composition. Therefore, the mobile Web server can passively access all sensors’ parameters from mobile applications without any user interaction with a mobile phone application. The Web Files section stores generated Java Servlet for external access, and Java classes which enable Web server module to access information from other existent mobile applications in the same machine. 4.3.2. Mobile Mashup Application Generator The main process of generating mashup applications is performed according to configurations in the MID file. The process follows these steps: 1. The generator notes the presence of active components in the composition. 2. From the realizations in step 1, the generator will decide whether to create a switcher mobile application (SMA, see Figure 5) which can switch through user interfaces (UI) in mobile applications according to the number of active components found. Rules to determine the generation of SMA are as follow. 2.1. If one or more active components are found then the generator will generate an SMA as APK file and control parts will be generated and placed in the Web Files section on the mobile Web server. 2.2. If no active component is found then an SMA will not be created. Only the control parts will be generated and placed in the Web Files section on the mobile Web server. 3. The generator then begins generating each component’s working class files which will be assigned to the corresponding section decided in step 2. 4. Additional control class files work as linkages between components: these file are assigned the Web Files section. 5. The generator then compiles these assigned class file as follows 5.1. Class files assigned to SMA are compiled into APK files. 5.2. Class files assigned to Web Files are compiled into the Web application archive (WAR) files. The installation tool for the generated files is still being developing. However, composers are currently able to install APK files to the mobile phone via Android debug bridge tool. WAR files can be installed to the mobile Web server module using the i-jetty user interface [9] which requires the files to be downloaded through network connection. In Figure 7 we show the relationship of each mashup component to class files and compilation files in the generation process.

186

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

Figure 7. Relationship of each component to class files and compilation files in the generation process

5. Evaluation As there are many components composers can use in the composition of mashup applications, they appear to have endless combinations. Therefore, we show the successful composition examples at first, and then review the specific problems in applying the same method to other similar cases. In our general discussion in Section 6, we compare our composition method with the conventional method in terms of: 1) composition processes, 2) objectives and limitations of the composition. 5.1. Mashup Examples 5.1.1. Geolocation Mashup: Web Service Interface Wrapper for Sensors In Figure 8, the Geolocation Mashup application shows how the composition aids a device which lacks sensors, and which can indirectly use parameters from mobile phone applications via Interface Wrapper. The latitude and longitude parameters from GPSLocator mobile application are then used as a part of query strings of Wikinear and LocalWeather Web service. The output of this mashup application is configured as a Web application. The type of Interface Wrapper used in this mashup application is selected from case (b) in Table 2.

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

187

Figure 8. Model representation and screen shots of Geolocation Mashup

The MID file for this mashup application (see Figure 3) contains no active component. Therefore, this application is only generated as a WAR file and is installed in the Web Files section of the mobile Web server. In the process of generating a mashup application, the WAR file contains 5 class files. Each generated class file performs a specific task as follows: a) GPSLocatorMAInfoAccess.class - accesses location’s latitude and longitude. b) GPSLocatorWSOutput.class - generates the JSON message of a) c) WikinearWSOutput.class - queries and generates outputs from the Wikinear Web service using parameters from b). d) LocalWeatherWSOutput.class - queries and generates outputs from the LocalWeather Web service using parameters parameters from b). e) ProjectWAOutput.class - generates outputs as a Web application using information from c) and d). The ProjectWAOutput.class in e) is a Java Servlet which other devices can access as a webpage. One can access this page using the mobile Web server hostname and the path specified in the MID file (e.g. http://[hostname]/location/mashup-output). 5.1.2. Book Reviews Barcode Reader Mashup: Web Content Extraction Figure 9 demonstrates the composition of a mobile phone’s user input interface with external Web resources. The mashup application aids users by shortening the interaction sequences. These sequences may conventionally require users to copy and paste one output from one section for use in another section. The type of Interface Wrapper used in this mashup application is selected from case (a) in Table 2.

188

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

Figure 9. Model representation and screen shots of Book Reviews Barcode Reader Mashup

The objective is to compose a mashup application which can translate a product’s barcode into a product name and then search for related information via Web resources, displaying the output on the mobile phone. The following must be considered prior to the configuration process: • Active components such as the Barcode scanner component and the Web display component. • Web application components which can use parameters generated from the Web extraction tool (i.e., www.amazon.com [3] and www.goodreads.com [6]). First, composers use the Web extraction tool to generate parameters for their compositions, and then begin to configure each component in the composition. Composers are required to install the barcode reader application manually and composers need to know the Intent configurations of the barcode reader application. In the process for generating this sort of mashup application, the APK file contains 2 class files concerning the access to user interface while the WAR file contains 3 class files. Each generated class file performs a specific task as follows: a) BarcodeReaderUIAccess.class - accesses the barcode reader interface via Intent. b) AmazonWAExtractor.class - extracts the information from www.amazon.com [3] using parameters from a). c) AmazonWSOutput.class - generates JSON message from b). d) GoodReadsWAExtractor.class - extracts information from www.goodreads.com [6] using parameters from c). e) ProjectMAUIAccess.class - accesses the user interface on the mobile phone to display output from d). In intended use, after users manually install the APK file and the WAR file to the mobile phone, user can access the barcode scanner interface from the home screen. We discuss the performance of our Web content extraction method in Section 5.2.2.

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

189

5.1.3. Japanese-English Talking Dictionary Mashup: Active Components

Figure 10. Model representation and screen shots of Japanese-English Talking Dictionary Mashup

Figure 10 demonstrates the fewest possible number of active components a mashup application can contain. The Integration Model consists of one Interface Wrapper which transforms a Web application into a mobile application. The objective of this composition is to test the active Web application component, a component which works with the active mobile application component. The type of Interface Wrapper used in this mashup application is selected from case (c) in Table 2. In the configuration process, first composers first use the Web extraction tool to generate parameters from ALC Japanese dictionary Web application [2] (e.g. www.alc.co.jp) and then create a Web application component. This active publisher component requires a user’s input. After the query in the publisher component has finished, the mashup application starts the TextToSpeech mobile application, reading the translated words in English. The mashup application consists of 4 class files according to case (c) of Table 2. Each generated class file performs a specific task as follows: a) ProjectWAUIAccess.class - provides a user interface which accesses the Web application www.alc.co.jp [2] for entering querying words. b) ALCWAExtractor.class - extracts the information from www.alc.co.jp [2] using parameters from a). c) TextToSpeechUIAccess.class: passes the extracted information from b) to the TextToSpeech mobile application and start the application. This mashup application contains only an APK file (as demonstrated in Figure 7 using the model W [MA*[WA*]]). After the file is manually installed, the icon on the mobile phone’s menu works as an entry point to the mashup application.

190

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

5.2. Problems regarding composition and usage, related case studies, and proposed solutions 5.2.1. Geolocation Mashup: Web Service Interface Wrapper for Sensors There are several problems found in the composition of the Geolocation Mashup. First, the composition output as a Web application can only be configured using strings and variables – this limitation provides less flexibility compared to other Web builder tools or dynamic Web programming languages. Since the Web service wrapper’s outputs provide a location’s latitude and longitude, those outputs can be accessed as a Web service. Composers may combine this Web service wrapper’s output with their HTML, JavaScript code or other dynamic Web programming language (instead of our provided template) to generate Web application output. The second problem occurred when we applied the Web service Interface Wrapper to the GPSLocation mobile application to passively access the location’s latitude and longitude. In random cases, when the location’s parameters were accessed through this method they were not updated and the system’s GPS module had to be restarted. When we further applied the mashup application to passively access the accelerometer of the mobile phone, the parameters were also not automatically updated. However, in the case of the accelerometer, we found that there is a background process to retrieve these parameters. To solve this problem, the mobile application has to be reprogrammed to bring the process to the active state. Then our mashup application would be able to access the parameters correctly. The other solution is that we can periodically connect to the Web service output using JavaScript. In this way the connection provides active access to the parameters. In sum, our method to access mobile phones’ sensors’ parameters via the Web service Interface Wrapper is bound by the limitations of the wrapped mobile phone applications. To passively retrieve sensors’ parameters, composers have to carefully test the mobile phone application to see how the parameters update themselves. It is recommended that the mashup applications access the parameters through a connection via the Web service Interface Wrapper. The access also provides compatibility with typical JavaScript and HTML code. 5.2.2. Book Reviews Barcode Reader Mashup: Web Content Extraction In the configuration process of the mashup application presented in Example 5.1.2, composers are required to use our Web extraction tool. We use the tool to specify and generate parameters to point to one result from a result list. To show more results, more parameters have to be specified which causes composers to repeatedly use the Web extraction tool to generate more parameters. In general programming, programmers can write extraction loops; however, extraction loops is not presented in our method. Our objective for this mobile mashup application was to use one result from the search result of the product barcode number. And we found that the product’s barcode number rarely matches more than one result. In the case where many results are preferred, it is recommended that the information that is extracted from the Web application is wrapped using the Web service Interface Wrapper so that composers can apply the wrapper’s outputs with typical programming. Concerning the performance of reusing a non-API Web application (as presented in Example 5.1.2), to display book reviews provided by www.goodreads.com [6], a navigation through 3 Web pages is needed (i.e., search page, result list page, and

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

191

reviews page). The GoodReadsWAExtractor.class requires time to finish parsing the DOM tree in each page and yields a time lag due to the network latency and hardware constraints of mobile phones. Composers may confront more time lags when working with Web pages containing many contents. Another problem found was that the information in the final resulting page is difficult to extract. We have applied the same method to retrieve the ESRB software ratings [5] using video games’ product barcodes. The essential information to retrieve from the final resulting page is in the form of a table with text. Composers can select the entire tag or specify all portions to be taken. To conclude, our method to perform Web extractions on mobile phones using description-based configurations requires more manual work than conventional programming does. The solution is to make the data flows diverge to parse the DOM tree remotely with other Web servers. In this case, we have to redesign the system architecture to be able to connect to external Web servers to do the configured tasks. Mobile phone hardware constraints and network latency also yields time lag when performing Web extraction. It is better that the complexity of the DOM tree is reduced before the parsing process begins. 5.2.3. Japanese-English Talking Dictionary Mashup: Active Components We use a similar method to generate parameters for a Web application component using the Web extraction tool. The difference from the other example is that this Web application component is set as active and prompts for user input before proceeding to the next step. To support this active component, we derived the display screen of the active Web application component from the Android WebView. This WebView functions similarly to the mobile phone’s default browser. The WebView is configured to suppress all user interaction with the screen except for the selected input box. Later, the user submits words for a result which is passed to the TextToSpeech application. We found that when this active Web application component is configured to use more than one input box, the display screen provided by WebView was quite confusing - users may not be able to see and use the selected two input boxes since the boxes are located far from each other. We propose that the WebView should be able hide the visual display of unrelated items. To do this, first, the Web extraction tool has to be modified to be able to generate parameters used in the display configuration. Second, the structure of MID file has to be changed to support these parameters. This proposed method may lower the domain of error concerning user interaction in the mobile mashup application (an application contains many active components).

6. General Discussion In this study, our approach to creating mashup applications is to plan them as an Integration Model. Our model expands into real configurations and generates a mobile mashup application according to those configurations. The main difference between conventional composition of mobile mashup application using Java mobile programming is that, instead of programming code, we attempt to reuse the existent components by applying configuration via parameters. We discuss and compare two approaches as follows.

192

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

6.1. Composition Processes Our composition process to create mashup applications starts from the composer’s plan for what components will be used. We limit the composition domain to 3 component categories. Composition of the components requires less API and documentation when compared to conventional programming. To begin composing a mashup application, composers may simply define a minimum of two Web resources and combine them as a mashup application using the MID file. On the other hand, to begin Java programming, composers need to study and indicate what APIs used. To create even a single Web application component, the programming tasks ranged from visual display of a Web application, network programming to retrieve information from Web resources, and creation of other controls such as navigation buttons. Our approach aimed to reduce unnecessary programming work; we use only parameters to compose mashup applications. The approach also limits the domain of error which can occur in conventional mashup composition with mobile programming language. Alternatively, composers can create Web service wrappers to share the output of mashup applications through mobile Web servers. This also provides flexibility to composers who are skilled at Web programming by allowing them to program or reuse mobile phone applications without the knowledge of mobile programming APIs. 6.2. Objectives and Limitations Our approach to composing a mashup application was not aiming to build new user interfaces or new logical algorithms. Our goal of reusing parts of Web resources with mobile applications includes a substantial limitation to composing mashup applications. The combination domain is only limited to the content on the Web resources and the existent mobile applications. For instance, creating a new camera application to detect and translate specific sign language into other representations cannot be accomplished with our approach. Such work still requires low-level implementation using mobile programming language. Moreover, our approach does not provide logic or mathematic functions within data flows. In our composition method, composers are required to add more mashup components to perform logic or mathematical calculations. For instance, to calculate the distance between the user’s current position (latitude and longitude) and the closest Wikipedia article based on the article’s GPS coordinates. As shown in Example 5.1.1, composers have to work through these following tasks: search for a distance calculator Web application which uses two pairs of coordinates, generate parameters from the Web application, configure the MID file, and generate the entire mashup application again. In conventional programming, by adding only a few lines of mathematical calculation to the proper part of the program, the calculation can be performed. The constraints of mobile phone hardware are also a significant limitation to our approach. When the system is running many processes (e.g., a mobile Web server module, the Web information extractor classes and accesses to sensors’ parameter), it tends to run slower as many background processes share memory resources. The drawback is also caused by the use of many classes connecting mashup components together. We believe that a specific system resource manager for running mashup application is needed.

P. Chaisatien and T. Tokuda / A Description-Based Approach to Mashup

193

7. Conclusion and Future Work In this study we presented a description-based integration approach to aid the composition of mobile mashup applications. We used the Integration Model and Interface Wrappers in the planning process. Later we expand the model into the actual configuration for generating a mobile mashup application for placement in the mobile Web server system. Our approach allows composers to access mobile phone sensors’ parameters and to create mobile mashup applications with less programming effort, using parameters to describe the data flows. We composed three mashup applications to demonstrate how actual configuration works. The example compositions have many limitations that need to be considered and resolved in the future. Our future work includes the research area of Web extraction performance, mashup composition flexibility, and performance improvement of the mobile Web Server.

8. Reference [1]

Boonlit Adipat and Dongsong Zhang. Adaptive and Personalized Interfaces for Mobile Web, Proceedings of the 15th Annual Workshop on Information Technologies and Systems, 2005. [2] ALC. Http://www.alc.co.jp/ [3] Amazon. Http://www.amazon.com/ [4] Nilton Bila, Troy Ronda, Iqbal Mohomed, Khai N. Truong and Eyal de Lara. PageTailor: Reusable End-User Customization for the Mobile Web. In Proceedings of the 5th International Conference on Mobile Systems, Applications and Services, 2007. [5] Entertainment Software Rating Board. Http://www.esrb.org/ [6] Goodreads. Http://www.goodreads.com/ [7] Junxia Guo, Hao Han and Takehiro Tokuda, A New Partial Information Extraction Method for Personal Mashup construction, Proceedings of the 19th European - Japanese Conference on Information Modeling and Knowledge Bases, 2009. [8] Hao Han and Takehiro Tokuda, A Method for Integration of Web Applications Based on Information Extraction, Proceedings of the 8th International Conference on Web Engineering, 2008. [9] I-jetty. Http://code.google.com/p/i-jetty/ [10] Christian S. Jensen, Carmen Ruiz Vicente, and Rico Wind, User-Generated Content: The Case for Mobile Services. IEEE Computer 41 (2008), 116–118. [11] Yung-Wei Kaoa, Tzu-Han Kaoa, Chi-Yang Tsaia and Shyan-Ming Yuana, A personal Web page tailoring toolkit for mobile devices, Computer Standards & Interfaces 31 (2009), 437-453. [12] Kouichi Katsurada, Yusaku Nakamura, Hirobumi Yamada and Tsuneo Nitta, XISL: A Language for Describing Multimodal Interaction Scenarios, Proceedings of the 5th International Conference on Multimodal Interfaces, 2003. [13] Kris Luyten and Karin Coninx, An XML-based runtime user interface description language for mobile computing devices, Proceedings of the 8th International Workshop on Design, Specification, and Verification of Interactive Systems, 2001. [14] mFoundry mojax. Http://www.mfoundry.com/platform.html [15] Nathalie Souchon and Jean Vanderdonckt, A Review of XML-compliant User Interface Description Languages, Proceedings of 10th International Conference on Design, Specification, and Verification of Interactive Systems, 2003. [16] Peter Rigole, Chris Vandervelpen, Kris Luyten, Karin Coninx, Yolande Berbers and Yves Vandewoude, A Component-Based Infrastructure for Pervasive User Interaction, Proceedings of Software Techniques for Embedded and Pervasive Systems, International Workshop on Software Techniques for Embedded and Pervasive Systems, 2005. [17] Xueshan Shan, A Presence-enabled Mobile Service System for Integrating Mobile Devices with Enterprise Collaborative Environment, Proceedings of the International Workshop on Wireless Ad-hoc Networks, 2005. [18] ServerMans. Http://serversman.com/index_en.jsp [19] Roy Want , Trevor Pering , Gunner Danneels , Muthu Kumar , Murali Sundar and John Light, The Personal Server: Changing the Way We Think about Ubiquitous Computing, Proceedings of the 4th International Conference on Ubiquitous Computing, 2002.

194

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-194

A Formal Presentation of the Pr ocess-Ontological Model Jari PALOMÄKI & Harri KETO Tampere University of Technology/Pori Department of Information Technology Pohjoisranta 11, P.O.Box 300 FI-28101 Pori, Finland {jari.palomaki, harri.keto}@tut.fi Abstract. The term of a “process” is used in Software Engineering (SE) theories and practices in many different ways, which cause confusion. In this paper we give a more formal description a Process-Ontological Model which can be used to analyze some problematic nature of software engineering. Firstly we present a process ontology in which everything is in a process. There are two kinds of processes: “eternal” and actual, where actual processes are divided into physical and mental processes. Secondly, we propose a topological model T for actual processes. Thirdly we propose an algebraic model for eternal processes, i.e. concepts. Lastly, by using category theory we connect these two models of processes together in order to get a category theoretical description of the Process-Ontological Model. That model is a functor category CO(T)op, i.e. the category of presheaves of concepts on T. Moreover, by using the Yoneda embedding we can represent the Process-Ontological Model as certain “structured sets”, and all of their “homomorphisms”. Keywords. Process, ontology, modelling, software engineering, concept, category theory

1

Introduction

The term of a “process” is used in many widely known theories and essential practices of Software Engineering (SE). It is applied both in practical business modelling situations and the scope of interest is also widened to an engineering discipline, which concerns all aspects of software production including management and the improvement of processes. “Process thinking” has proved its power as a technique and it is applied in many standards and guidelines. It has become one of the major efforts to make software engineering to an engineering, which has a qualitative value. To serve this aim we are developing a Process-Ontological Model (POM), see [1], [2], [3]. In this paper we will present some of the basic ideas of POM more formally. We think that a process-ontology will provide an appropriate philosophical and conceptual framework for the SE researches as well as for the SE practice. It gives a possibility to compare different SE models and concepts, and to interpret the dependencies between them. For example, in [2] that model was using to positioning different standards. This paper is composed as follows. Firstly we will consider a process ontology in which everything is in a process. Our overall view will be that everything in the world is composed of processes. Secondly, we propose a topological model for actual

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

195

processes. Thirdly we propose an algebraic model for eternal processes, i.e. concepts. Lastly, by using a category theory we are connecting these two models together to get a category theoretical description of the Process-Ontological Model. 2

A Process-Ontology

In Process and Reality [4], Alfred North Whitehead presented a view that the world can best be understood as interrelated systems of larger and smaller events, some of which are relatively stable. Events are always changing. Change represents the actualization of certain potentialities and the disappearances of others. The world does not simply exist, it is always becoming. More exactly, the world is a process which is the becoming of actual entities (or actual occasions). They endure only a short time, and they are processes of their own self creation. There are also eternal objects to be understood as conceptual objects. As potentialities they enter into the actual entity becoming concrete without being actualities themselves. Although novel actual entities are progressively added to the world, there are no new eternal objects. They are the same for all actual entities. From Whitehead’s “process philosophy” we will just adopt an idea that everything consists of processes, and that these processes are divided into eternal processes interpreted as concepts, and actual processes, which we will interpret to be events occupying a finite amount of a four dimensional space-time. Thus, the world is constructed out of events. Every event in space-time is overlapped by other events, i.e., events are not impenetrable. A space-time order results from a relation between events. Moreover, in terms of events spatio-temporal point-instants, lines, surfaces, and regions can be defined by using the Method of Extensive Abstraction as follows, (see [5], Chapters XXVIII and XXIX, and [4], Part IV). A fundamental relation in construction of point-instants in a four dimensional space-time is a five-term relation of co-punctuality, which holds between five events having a common area to all of them. A set of five or more events is called co-punctual if every quintet chosen out of the set has the relation of co-punctuality. A point-instant is a co-punctual set which cannot be enlarged without ceasing to be a co-punctual. The existence of point-instant so defined is provided if all events can be well-ordered, i.e. if the Axiom of Choice is true, (cf. [5, p. 299]). Given two point-instants N and O, we denote by NO their logical product, i.e., the events which are members of both. If NO is non-empty, then N and O are said to be connected. A set of point-instants is defined to be collinear, if every pair of pointinstants are connected, and every triad of point-instants D, E, J are such that either DE is contained in J, or DJ is contained in E. A set of point-instants is defined to be a line, if it is collinear, and it is not contained in any larger collinear set. The lines so defined are not supposed to be straight. That definition of a line is analogous to that of a point-instant. It is possible to extend this method to obtain surfaces and regions, as well, (see [5, p.311 ff]). A set of lines is called co-superficial, if any two lines intersect, i.e. they have a common pointinstant, but there is no point common to all the lines of the set. A surface is a cosuperficial set of lines which cannot be extended without ceasing to be co-superficial. A set of surfaces is called co-regional, if any two surfaces have a line in common, but no line is common to all the surfaces of the set. A region is a co-regional set of surfaces which cannot be extended without ceasing to be co-regional.

196

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

A space-time order is constructed out of the relation between events as follows. Two events are said to be compresent when they overlap in space-time. With respect to a given event it is possible to divide events into zones as follows: In the first zone there are those events that are compresent with a given event. Then, in the second zone, there are those events which are not compresent with a given event, but compresent with an event compresent with it, and so on. The nth zone will consist of events that can be reached in n steps, but not in n–1 steps, in which a step is taken to be as the passage from an event to another which is compresent with it. Assuming a minimum size of events, it is possible to pass from one event to another by a finite number of steps. Two point-instants are connected, if there is an event which is a member of both. Thus, point-instants can be collected into zones as well, and the passage from event to event by the relation of compresence can be replaced by the passage from point-instant to point-instant by the relation of connection. Accordingly, suppose there are n events, e1, e2, … , en, and suppose e1 is compresent only with e2, e2 is compresent with e1 and e3, e3 with e2 and e4, and so on. We can then construct the order e1, e2, … , en. The relation of connection is a causal relation between events, where the cause of an event occurs earlier than its effect. We shall also distinguish events in a living brain from events elsewhere, [6, p. 246]. So thoughts should be among the events of which the brains consist, i.e., each region of the brain is a set of events. These events are called mental events. Mental events can be known without inferences and they consist of bundles of compresent qualities. Events, which are not mental, are called physical events, and they, if known at all, are known only by inference so far as their space-time structure is concerned. Accordingly, from the ontological point of view, everything consists of processes. Among processes, firstly, there are eternal processes and actual processes. Eternal processes are interpreted as concepts, whereas actual processes are interpreted as spacetime events. Eternal processes are instantiated in actual processes. Secondly, among actual processes there are mental events and physical events. Mental events consist of bundles of compresent qualities which can be known without inferences, whereas physical events, if known, are known only by inference as regards to their space-time structure.

3

A Topological Model for Actual Processes

We shall give a topological model for actual processes, in which events are interpreted as open sets, i.e., space-time events will have a one-one correspondence with fourdimensional open sets. To get an idea, a few topological concepts are defined as follows. Consider a set T. Let {OiI} to be a set of open subsets of T satisfying the following axioms: A1 A2 A3

The union of any number of open sets is an open set. The intersection of two open sets is an open set. T itself and the empty set are open sets.

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

197

A topology on a set T is then the specification of open subsets of T which satisfy these axioms, and this set T is called a topological space. A set of open subsets {OiI} of T is said to be an open covering of T, if the union of OiI contains T. An open covering {VjI} of a space T is said to be a refinement of an open covering {OiI}, if for each element Vj of {VjI} there is an element Oi of {OiI} such that Vj Oi. If {OiI} is any open covering of T, and there is some finite subset {Oi1, Oi2, … , Oin} of {OiI}, then a space T is called a compact. A topological space T is separated, if it is the union of two disjoint, non-empty open sets. A space T is connected, if it is not separated. A space T is said to be pathconnected if for any two points x and y in T there exists a continuous function f from the unit interval [0, 1] to T with f(0) = x and f(1) = y. This function is called a path from x to y. A space T is simply connected if only if it is a path connected, and it has no “holes”.1 A space T, which is connected, but not simply connected, is called multiply connected. Given two points a and b of a space T, a set {O1, O2, … , On} of open sets is a simple chain from a to b provided that O1 (and only O1) contains a, On (and only On) contains b, and Oi Oj is non-empty if and only if¸i - j« 1. That is, each link intersects just the one before it and the one after it, and, of course, itself. It can be proved that if a and b are two points of connected space T, and {OiI} is a set of open sets covering T, then there is a simple chain of elements of {OiI} from a to b, (for the proof, see the Theorem 3-4 in [7]. Moreover, let C1 = {O11, O12, … , O1n} and C2 = {O21, O22, … , O2m} be simple chains from a point a to a point b in a space T. The chain C2 will be said to go straight through C1 provided that i) every set O2i is contained to some set O1j and ii) if O2i and O2k, i k, both lie in a set O1r, then for every integer j, i j k, O2j also lies in O1r. Accordingly, the finer chain C2 goes straight through the coarser chain C1. Given two points a and b of a space T, we can define a valuation on T as a function v: T o R, where R is the set of real numbers, having the following properties: i) v(a) = r, and ii) v(b) = s. Then there is a non-negative real number ¸s -r«called the distance between a and b, which is denoted as dv(a,b). For each point x, y, and z in T, the distance from x to y satisfy the following four properties: 1) dv(x,y) 0, 3) dv(x,y) = 0 l x = y, 3) dv(x,y) = dv(y,x), and 4) dv(x,z) dv(x,y) + dv(y,z). Thus the space T is a metrizable. A topological model for actual processes is used as follows: a process as a whole is interpreted as a topological space T, which, at least for empirical reasons, is compact and, depending on the number of parallel processes, is either a simply- or a multiply connected. The space T contains a start point a and an end point b of the process. The start point a is an event, which is included in the open set O1, and, similarly, the endpoint b is an event, which is included in the open set On. The simple chain from a to b consists of sequences of events interpreted as a set {O1, O2, … , On} of open sets. Moreover, it is possible to get as coarse or as fine a chain from a to b as necessary. In a 1

More formally, a path-connected space T is simply connected if given two points a and b in T and two paths p : [0,1] ĺ T and q : [0,1] ĺ T joining a and b, i.e., p(0) = q(0) = a and p(1) = q(1) = b, there exists a homotopy in T between p and q. Two maps p, q : Xĺ Y are said to be homotopic if there is a map H : [0, 1] × Xĺ Y such that for each point x in X, H(0, x) = p(x) and H(1, x) = q(x). The map H is called a homotopy between p and q. Intuitively, maps p and q are homotopic, if p can be continuously deformed to get q while keeping the endpoints fixed, and a path-connected space T is simply connected, if every closed path in T can be continuously deformed into a point.

198

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

case there are parallel processes, i.e., processes which we want to keep distinct in a certain moment, for example feedbacks, we just add “holes” to our space T. This prevents the parallel processes from deforming to each other. The space T will then be multiply connected. In addition, the space T can be made a metrizable space as well.2 4

An Algebraic Model for Eternal Processes

We will interpret eternal processes as concepts, i.e. they are like “frozen” processes. The relations between concepts enable us to make conceptual structures, where the basic relation between concepts is an intensional containment relation [8], [9], [10], [11], [12]. That is, given two concepts a and b, when a concept a contains intensionally a concept b, we may say that the intension of a concept a contains the intension of a concept b. Then, based on the intensional containment relation, we can present an axiomatic intensional concept theory, denoted by KC, in a first-order language L that contains individual variables a, b, c,..., which range over the concepts, and one nonlogical 2-place intensional containment relation, denoted by “t”. We shall first present four basic relations between concepts defined by “t”, and then, briefly, the basic axioms of the theory. A more complete presentation of this theory, see [8], and [11]. Two concepts a and b are said to be comparable, denoted by a H b, if there exists a concept x which is intensionally contained in both. DfH

a H b =df (x) ( a t x b t x).

If two concepts a and b are not comparable, they are incomparable, which is denoted by a I b. a I b =df ~ a H b. DfI Dually, two concepts a and b are said to be compatible, denoted by a A b, if there exists a concept x which contains intensionally both. DfA

a A b =df (x) ( x t a x t b).

If two concepts a and b are not compatible, they are incompatible, which is denoted by a Y b. DfY

a Y b =df ~ a A b.

The two first axioms of KC state that the intensional containment relation is a reflexive and transitive relation. AxRefl AxTrans

a t a. a t b b t c o a t c.

2 Since we have a space-time instead of space and time, we do not have a distance in space and a distance in lapse of time. There is, however, one relation between two events, which is the same for all observers, which is called an interval. There are still two kinds of intervals, one space-like and the other time-like. In special theory of relativity, if the distance between two events is r, and the lapse of time between them is t, then, if c is the velocity of light, the square of the space-like interval is r2 – c2t2, whereas if it is time-like, it is c2t2 – r2. If gravitational or electromagnetic forces are involved as in the general theory of relativity, a modified definition of “interval” is introduced.

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

199

Two concepts a and b are said to be intensionally identical, denoted by a§ b, if the concept a intensionally contains the concept b, and the concept b intensionally contains the concept a. Df§

a§ b =df a t b b t a.

The intensional identity is clearly a reflexive, symmetric and transitive relation, hence an equivalence relation. A concept c is called an intensional product of two concepts a and b, if any concept x is intensionally contained in c if and only if it is intensionally contained in both a and b. If two concepts a and b have an intensional product, it is unique up to the intensional identity and we denote it then by a b. Df

c§ a b =df (x) (c t x l a t x b t x).

The following axiom Ax of KC states that if two concepts a and b are comparable, there exists a concept x which is their intensional product. Ax

a H b o (x) (x§ a b).

It is easy to show that the intensional product is idempotent, commutative, and associative. A concept c is called an intensional sum of two concepts a and b, if the concept c is intensionally contained in any concept x if and only if it contains intensionally both a and b. If two concepts a and b have an intensional sum, it is unique up to the intensional identity and we denote it then by a b. Df

c§ a b =df (x) (x t c l x t a x t b).

The following axiom Ax of KC states that if two concepts a and b are compatible, there exists a concept x which is their intensional sum. Ax

a A b o (x) (x § a b).

The intensional sum is idempotent, commutative, and associative. The intensional product of two concepts a and b is intensionally contained in their intensional sum whenever both sides are defined. Th 1

a b t a b.

The next axiom of KC concerns the distributivity of an intensional sum and a product whenever both sides are defined, AxDistr

(a b) (a c) t a (b c).

A concept b is an intensional negation of a concept a, denoted by ¬a, if and only if it is intensionally contained in all those concepts x, which are intensionally incompatible with the concept a. When ¬a exists, it is unique up to the intensional identity. Df¬

b§ ¬a =df (x) (x t b l x Y a).

The following axiom Ax¬ of KC states that if there is a concept x which is incompatible with the concept a, there exists a concept y, which is the intensional negation of the concept a. Ax¬

(x) (x Y a) o (y) (y§ ¬a).

200

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

It can be proved that a concept a contains intensionally its intensional double negation provided that it exists. Th 2

a t ¬¬a.

This relation does not hold conversely without stating a further axiom, Ax¬¬

b Y ¬a o b t a.

Also, the following forms of the De Morgan’s formulas can be proved whenever both sides are defined: Th 3

i) ¬a ¬b§ ¬(a b), ii) ¬(a b) § ¬a ¬b.

If a concept a is intensionally contained in every concept x, the concept a is called a general concept, and it is denoted by G. The general concept is unique up to the intensional identity, and it is defined as follows: DfG

a§ G =df (x) (x t a).

The next axiom of KC states that there is a concept, which is intensionally contained in every concept. AxG

(x)(y) (y t x).

Adopting the axiom of the general concept it follows that all concepts are to be comparable. Since the general concept is compatible with every concept, it has no intensional negation. A special concept is a concept a, which is not intensionally contained in any other concept except for concepts intensionally identical to itself. Thus, there can be many special concepts. DfS

S(a) =df (x) (x t a o a t x).

The last axiom of KC states that there is for any concept y a special concept x in which it is intensionally contained. AxS

(y)(x) (S(x) x t y).

Since the special concept s is either compatible or incompatible with every concept, the law of excluded middle holds for s so that for any concept x, which has an intensional negation, either the concept x or its intensional negation x is intensionally contained in it. Hence, we have Th 4

(x) S(s) o (s t x s t x).

A special concept would contain one member of every pair of mutually incompatible concepts. By the Completeness Theorem, every consistent first-order theory has a model. Accordingly, it is shown in [11] that a model of KC is a complete semilattice, where every concept a C defines a Boolean algebra Ba = <pa,,,,G,a>, where pa is an ideal, known as the principal ideal generated by a, i.e. pa =df {x C | a t x}, and the intensional negation of a concept b pa is interpreted as a relative complement of a.

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

5

201

Putting Things Together: A Process-Ontological Model

A basic idea behind a Process-Ontological Model is that everything consists of processes. There are two kinds of processes: eternal processes, which are interpreted as concepts, and actual processes, which are interpreted as space-time events. Moreover, actual processes are either mental or physical. Now, eternal processes are instantiated in actual processes. So, given the models for actual processes and eternal processes, i.e. the topological model for actual processes and the algebraic model for eternal processes, we should put these two models together. For this purpose we will use category theory, but owing to the limitation of space, only rudiment of it is presented just to get an idea. Let X be a set of object, x, y, z,… together with two functions as follows: A function assigning to each pair (x,y) of objects of X a set homX(x,y). An element f homX(x,y) is called an arrow f: x ĺ y, with domain x and codomain y. ii) A function assigning to each triple (x,y,z) of objects of X a function homX(x,z) u homX(x,y) ĺ homX(x,z). For arrows g: y o z and f: x o y, the function is written as g o f: x o z, and it is called the composite of f and g. i)

The set X with these two functions is called a category, if the following two axioms hold: C1 C2

Associativity: If h: z o w, g: y o z and f: x o y are arrows of X with indicated domains and codomains, then h o (g o f) = (h o g) o f. Identity: For each object y in X there exists an identity arrow 1y: y o y such that if f: x o y, then 1y o f = f, and if g: y o z, then g o 1y = g.

Given a category X, we can form a new category Xop, called the opposite category of X, by taking the same objects but reversing the direction of all arrows and the order of compositions. If X and Y are two categories, a functor F: X o Y is a pair of functions; an object function, which assigns to each object x of X an object F(x) of Y, and a mapping function, which assigns to each arrow f: x o y of X an arrow F(f): F(x) o F(y) of Y. These functions are to satisfy two requirements: i) ii) in X.

F(1x) = 1F(x), F(g o f) = F(g) o F(f),

for each identity 1x of X; and for each composite g o f defined

For categories X and Y, a functor F: Xop o Y is called a contravariant functor from X to Y. Ordinary functors from X to Y are sometimes called covariant functors.

202

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

If F, G: X o Y are two functors, a natural transformation W: F o G from F to G is a function, which assigns to each object x of X an arrow Wx: F(x) o G(x) of Y in such a way that every arrow f: x o y of X it follows, G(f) o Wx = Wy o F(f). In case each Wx is invertible in Y, we call W: F o G a natural isomorphism. Two categories X and Y yields a new category YX, called a functor category. The objects of YX are functors from X to Y, while the arrows of YX are natural transformations between such functors. Accordingly, a functor is a morphism of categories, whereas a natural transformation is a morphism of functors. Let us have two categories X and Y, and two functors F: X o Y and G: Y o X in opposite directions between them. For an object x in X and an object y in Y we may compare the set homY(F(x),y) of all arrows in Y from F(x) to y with the set homX(x,G(y)) of all arrows in X from x to G(y). Now, an adjunction of the functor F to the functor G is a natural isomorphism ĳ: homY(F(x),y) o homX(x,G(y)), defined for all objects x in X and y in Y, and moreover, this natural isomorphism ĳ is natural in these arguments x and y, which means that it preserves categorical structure as x and y vary. The functor F is called a left adjoint of G, and G is called a right adjoint of F, which is denoted as F – _ G. An important corollary for our purpose is the following one, [see 13, p. 83]: Corollary If the functor F: X o Y has two right adjoints G and H, then G and H are naturally isomorphic. The same is true for left adjoints. Conversely, if F is left adjoint to G, and G is naturally isomorphic to H, then F is also left adjoint to H. Now, since a category can be seen as a set of objects with a structure, we can think our topological model for actual processes as a category, where open subsets are objects and subset relations between open sets are arrows. Similarly, our algebraic model for eternal processes can be thought to be a category as well, where concepts are objects and intensional containment relations between concepts are arrows. Thus we can denote the category of topological space as O(T) and the category of concepts as C.3 Given the categories O(T) and C, they can be connected by two functors P: O(T) o C and S: C o O(T), which are opposite between them. For a concept a in C and an open set O in O(T) we may compare the set homC(P(O),a) of all intension containmentrelations in C from P(O) to a with the set homO(T)(O,S(a)) of all subset-relations in O(T) from O to S(a). Hence, an adjunction of the functor P to the functor S is a natural isomorphism Ĳ: homC(P(O),a) o homO(T)(O,S(a)), defined for all open sets O in O(T) and a in C and is natural in these arguments O and a. This adjunction Ĳ can now be interpreted as what Whitehead calls “ingression”, that is, eternal processes participating actual processes. 4 Also, since both mental processes and physical processes are space-time events, they are actual processes, and 3 Of course, more category theoretical notions should be defined to get a more exact category theoretical descriptions for the underlying topological and algebraic models. However, in this paper we give only those which are necessary to understand the basic idea. 4 “The term ‘ingression’ refers to the particular mode in which the potentiality of an eternal object is realized in a particular actual entity, contributing to the definiteness of that actual entity”, [4]. Also, “[t]he actualities constituting the process of the world are conceived as exemplifying the ingression (or ‘participation’) of other things which constitute the potentialities of definiteness for any actual existence. The things which are temporal arise by their participation in the things which are eternal. The two sets are mediated by a thing which combines the actuality of what is temporal with the timeless of what is potentiality”, (ibid., 63-64).

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

203

thus they are modeled by topological model, that is O(T). Now the connection of mental processes and physical processes can be modelled, based on the Corollary mentioned above, by natural tranformations. These natural transformations are not necessarily isomorphism, i.e. the arrows are not necessarily reversible. Moreover, a C-valued presheaf on T is the same as a contravariant functor E from O(T) to C,which is same as a covariant functor F: O(T)op o C, i.e. Eop = F.5 This means, that if U and O are open subsets of T, and U O, then F(U O): F(O) o F(U).6 Accordingly, we can describe the Process-Ontological Model as a functor category from the category of topological space O(T) to the category of concepts C, where the objects are contravariant functors, i.e. presheaves, and the arrows are natural transformations between these functors. This functor category is the category of presheaves of concepts on T. 6 A Category Theoretical Representation of the Process Ontological Model We can give even more concise category theoretical representation of the Process Ontological Model. Let Sets be a category of sets, i.e. sets as objects and function beween sets as arrows. A functor Hom(x,-): X o Sets to the category of sets is called the (covariant) representable functor of x. It supposed that the category X is a locally small, that is, a category X is a locally small if for all objects x, x’ in X, the collection HomX(x,x’) of arrows of X is a set. A category X is called small if both the collection of objects and arrows are sets. Otherwise it is called large. The Yoneda embedding is the functor y: X o SetsXop taking the object x X to the contravariant representable functor, yx = HomX(-, x): Xop o Sets, and taking f: x o x’ to natural transformation, yf = HomX(-, f): HomX(-, x) o HomX(-, x’). Thus, we can think of the Yoneda embedding y as a “representation” of X in a category of set-valued functors and natural transformations on some index category, i.e. the Yoneda embedding y represents the objects and arrows of X as certain “structured sets”, and all of their “homomorphisms”.7 We are now able to give the representation of the Process Ontological Model as follow. Let F: O(T)op o C be a contravariant functor between the category of topological space

5 If F is a C-valued presheaf on T, and O is an open subset of T, then F(O) is called the sections of F over O. Each element of F(O) is called a section. A section over T is called a global section. This terminology is by analogy with sections of fiber bundles or sections of the étale space of a sheaf. Since there are some reservations for the unique “collatable” conditions for the presheaves of concepts, we restrain describing the Process-Ontological Model as a category of sheaves. 6 The transition from intensions to extensions reverses the containment relation, i.e., the intensional containment relation between oncepts a and b is converse to the extensional set-theoretical subset relation between the sets of their extension. For example, if the concept of a dog contains intensionally the concept of a quadruped, then the extension of the concept of the quadruped, i.e., the set of four-footed animals, contains extensionally as a subset the extension of the concept of the dog, i.e., the set of dogs, (see [8], [11], [12]). 7 Thus, we can also state the important Yoneda Lemma as follows: Let X be locally small. For any object x X and functor F SetsXop there is an isomorphism HomSetsXop(yx,F) # F(x), which is natural in both F and x. The Yoneda Lemma says that the representable contravariant Sets-valued functors on a category X constitute a full subcategory of all contravariant Sets-valued functors on X, which is equivalent to the category X itself.

204

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

O(T) to the category of concepts C. This gives rise to a functor F : SetsCop o SetsO(T)op, called an “inverse image” functor, which has both left and right adjoints F! –_ F –_ F .8 Moreover, there is a natural isomorphism F! o yO(T)op # yC o F as indicated in the following diagram.

SetsO(T)

F * F*

SetsC

op

F! y2(T)op

y& op

O(T)

F

C

7 Conclusion In this paper we have introduced a formal presentation of the Process-Ontological Model. Firstly we considered a process ontology in which everything is in a process. Our overall view is that everything in the world is composed of processes. There are two kinds of processes, “eternal” and actual, where actual processes are divided into physical and mental processes. Secondly, we proposed a topological model for actual processes, which is a four-dimensional, simply or multiply connected, and metrizable topological space T. Thirdly we proposed an algebraic model for eternal processes, i.e. concepts, that is, a complete semi-lattice, where every concepts as a principal ideal determines a Boolean algebra. Lastly, by using category theory we connected these two models of processes in order to get the category theoretical description of the ProcessOntological Model. That model is a functor category CO(T)op, i.e. the category of presheaves of concepts on T. Moreover, by using the Yoneda embedding, we can represent the Process-Ontological Model as certain “structured sets”, and all of their “homomorphisms”. We think that a process-ontology will provide an appropriate philosophical and conceptual framework for the SE researches as well as for the SE practice. It gives a possibility to classify, systematise, and valuate SE models and concepts and to interpret the dependencies between them. For example, in [2] that model was using to positioning different standards. As a comprehensive theory it can be used to evaluate

8 The induced functors F! and F are sometimes referred to in the literature as left and right Kan extensions, the definitions of which, see [13, p. 232].

J. Palomäki and H. Keto / A Formal Presentation of the Process-Ontological Model

205

and develop SE methodologies and guidelines. It should be used together with appropriate theories and methods to expand their interpretation. References: [1] Palomäki, J. & Keto, H.: “A Process-Ontological Model for Software Engineering”. CAiSE’06. The 18th International Conference on Advanced Information Systems Engineering – Trusted Information Systems. Proceedings of the Workshops and Doctoral Consortium. Eds. T. Latour and M. Petit. Namur: Namur University Press, 720-726. (2006) [2] Keto, H. & Palomäki, J.: “Applying a Process-Ontological Model in Practice”. Proceedings of ONTOSE 2007. The 2nd International Workshop on Ontology, Conceptualization and Epistemology for Software and System Engineering. Eds. Daniela Micucci, Fabio Sartori and Miguel Ângel Sicilia. CCB – Centoro Copie Bicocca. (2007) [3] Keto, H., Palomäki, J. & Jaakkola, H.,: “Towards the Process-Ontological Modelling”. Proceedings of the19th European Japanese Conference on Information Modelling and Knowledge Bases: EJC 2009. Eds. T. Tokuda, Y. Kiyoki, H. Jaakkola, and T. Welzer-Druzovec. University of Maribor, Maribor, Slovenia, 2009, 294-301. (2009) [4] Whitehead, A. N.: Process and Reality: An Essay in Cosmology. New York: The Macmillan Co. (1929) [5] Russell, B.: The Analysis of Matter. London: Allen & Unwin. (1927) [6] Russell, B.: Human Knowledge: Its Scope and Limits. London: Allen & Unwin. (1948) [7] Hocking, J. G. & Young, G. S.: Topology. New York: Dover (1961). [8] Kauppi, R.: Einführung in die Theorie der Begriffssysteme. Acta Universitatis Tamperensis, Ser. A. Vol. 15. Tampere: University of Tampere. (1967) [9] Kangassalo, H.: “COMIC: A system and methodology for conceptual modelling and information construction”, Data and Knowledge Engineering 9, 287-319 (1992/93). [10] Kangassalo, H.: “Approaches to the Active Conceptual Modelling of Learning”. Active Conceptual Modeling of Learning: Next Generation Learning-Base System Development. LNCS 4512. Eds. P.P. Chen and L.Y. Wong. Berlin, Heidelberg: Springer-Verlag,168–193 (2007). [11] Palomäki, J.: From Concepts to Concept Theory: Discoveries, Connections, and Results. Acta Universitatis Tamperensis, Ser. A. Vol. 416. Tampere: University of Tampere. (1994) [12] Palomäki, J.: “Three Kinds of Containment Relations of Concepts”. Information Modelling and Knowledge Bases VIII. Eds. H. Kangassalo, J.F. Nilsson, H. Jaakkola, and S. Ohsuga. Frontiers in Artificial Intelligence and Applications. Amsterdam, Berlin, Oxford, Tokyo, Washington, DC.: IOS Press, 261-277 (1997). [13] Mac Lane, S.: Categories for Working Mathematician. New York et al.: SpringerVerlag (1971).

206

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-206

Performance Forecasting for Performance Critical Huge Databases a

Bernhard THALHEIM a,1 , Marina TROPMANN b,2 Christian-Albrechts-University Kiel, Computer Science Institute, 24098 Kiel, Germany b Crosssoft GmbH, Holtenauer Strae 129 24118 Kiel Abstract. Fast databases are no longer nice-to-have – they are a necessity. Many modern applications are becoming performance critical. At the same time, the size of some databases has been increasing to levels that cannot be well supported by current technology. Performance engineering is now becoming a buzzword for database systems. At ﬁrst physical and partially logical tuning methods have been used for support of high performance systems, but they are mainly based on large and not well understood performance and tuning parameters. Nowadays it becomes obvious that we need methods for systematic performance design. Performance engineering also means, however, support for database’s daily operating. Most methods are reactive, i.e. they are using runtime information, e.g. performance monitoring techniques. It is then the operators or administrators business to ﬁnd appropriate solutions. We target at active methods for performance improvement. One of the potential methods for active performance improvement is performance forecasting based on assumptions on future operating and on extrapolations for the current situation. This paper shows that conceptual performance tuning supersedes physical and logical performance tuning. As a proof of concept we applied our approach within a consolidation project for a databases-intensive infrastructure. Keywords. Performance forecasting, performance engineering, system improvement, high-performance databases.

1. Introduction 1.1. The Performance Problem System performance is simply the fulﬁlment of requirements or properties by a system. These parameters form the systems’ performance objective. Typical such requirements are throughput and answer time for query sets. Speed, security, reliability and availability requirements are typical functionality-oriented ones. Precision, capacity and scalability requirements are data-oriented ones. Modern DBMS allow to measure many of these requirements through performance properties. Performance measurement means in the context of computing, quantitative ﬁgures for the use of computing resources: time, 1 [email protected] 2 Corresponding

http://www.is.informatik.uni-kiel.de/∼ thalheim Author: [email protected] http://www.crosssoft.de

B. Thalheim and M. Tropmann / Performance Forecasting

207

storage space, equipment. Performance requirements are deﬁned through performance properties. A number of performance tests (e.g. timing tests, stress tests, recovery tests, volume tests, conﬁguration tests, compatibility tests, quality tests) have been developed for database systems. The performance problem gains more attention due to a number of reasons. Databases are becoming huge. The size of large production databases goes beyond the tens of TB margin. DBMS are typically supporting a number of databases that are sharing resources and thus competing in their performance. Applications are becoming more complex and thus the database schemata and functions supporting them. Database applications are typically also becoming more interdependent. And ﬁnally, delivery time is becoming more crucial than ever. We may use a general characterisation of a performance problem and a performance solution. A performance problem is characterised by four main characteristics: state space with speciﬁc goal, desired and problematic sub-spaces; actions for changing a state space with a characterisation whether the space change has taken place and with conditions for their applicability; goal tests that determine whether a given state or state set satisﬁes the goals and to which degree the state satisﬁes the goals; problem solution controllers that evaluate the actions undertaken. This characterisation generalises the classical pattern-based framework used in Standards, e.g. [7]. They use the characterisation: content - motivation - examples - ﬁt criterion - measurement units - measurement - considerations. Our characterisation can be extended to a performance framework similar to the quality framework [8]. A performance solution can be characterised by the following seven parameters: conditions C required in the environment for the solution to work; audience A or performers; behaviour B that can be demonstrated, observed/measured, and developed; location L where the performance will take place; frequency F characterising how often the behaviour must be exhibited; degree D describing the required level of performance; metrics M supporting measurement and observation of the behaviour. These parameters may also be interdependent. For example, ‘when given C all A will be able to B in L F-times per day resulting in D and observed by M’. 1.2. Performance Critical Databases Databases are becoming larger and more performance critical than ever. For-instance, ecommerce databases that power globally used web sites must complete user transactions and present information at a rate fast enough that prevents impatient customers from clicking to a competitors web site. Corporations needing up-to-date internal information cannot wait for long drawn out processes that crunch numbers and detail competitive statistics. Instead, they need databases capable of quickly churning out the data necessary to compete in today’s economy. Physical database design is today still considered the Holy Grail of better performance and for elimination of system bottlenecks. It is difﬁcult to perform correctly and it takes time. Designing a high performance database is complicated work. It takes skill and experience to get a design that runs as fast as a lightning. But sadly, experienced personnel are at a premium these days. Moreover, logical design was discounted with respect to its importance in the past. Tools never have been able to keep their promises. Therefore, performance stewards got encouraged to use them.

208

B. Thalheim and M. Tropmann / Performance Forecasting

The second reason quality why designs are overlooked when the topic of performance is discussed is that a lot of up-front time is needed to create a good design. The application lifecycle has never been shorter in corporations than it is right now. Performance engineering needs superior personnel that is using state-of-the-art software tools or that is able to eliminate unnecessary tasks from the application system. Performance monitoring is used for reactive performance management since the monitor validates the physical design implementation. If the performance monitor shows alerts then it’s probably because your physical design is failing. There are, however, performance situations that really aren’t impacted by the physical design directly. Lock contention, for example, is mostly an application or coding issue. Physical and logical designs are interdependent. Performance problems are typically observed at the runtime and can be directly tracked back to physical problems. For instance, the performance category ‘space’ with the performance problem ‘out of space conditions (storage structures)’ can be tracked back to performance causes in Oracle ‘poorly forecasted data volumes in physical design’, ‘tablespace fragmentation’, ‘invalid settings for either object space sizes or tablespace object settings’, or ‘not using locallymanaged tablespaces’. This Category-Problem-Cause model has been used in our industrial performance projects for physical redesign. It is combined with the characterisation by content, motivation, examples, ﬁt criterion, measurement, and considerations. Levels for performance improvement are the hardware level the storage level, conﬁguration level, logical level, conceptual level, and application level. Many take the quick ﬁx approach to performance problems, which equates to throwing hardware or storage at the situation in most cases. Altering and improving a database’s physical design may yield the largest possible performance beneﬁt. But modifying the design of a physical database - especially one that is currently in production is no easy task and oftentimes requires healthy amounts of off-hours work by the administrator. At the ﬁrst, logical design seems to be an alternative way for performance improvement, but physical performance improvement methods are not well represented in logical schemata. We claim that this enhancement can easily be made since most physical parameters can be injected via hint tables to logical tables. Once methods can be described we also are able to cross the divide between logical and physical design. 1.3. Performance Facilitation The Performance Facilitation Model provides performance consultants with a proactive, solution-focused template. They continuously engender support to both the personnel with the performance gap and the manager or application engineers of the affected function. Through this continuous involvement, the Performance Facilitation Model creates awareness, ownership, and commitment to the success of the selected solution(s). Proactive performance improvement can inherit techniques and solutions developed for active performance improvement on demand. There techniques are based on analysis of performance problems, development and deployment of improvement proposals. It typically consists of eight stages: Inquire - Discovering the symptoms: When a performance problem occurs an initial assessment of the performance gap is started. It generates information about symptoms instead of root causes and guides analysis, e.g. by detection of inci-

B. Thalheim and M. Tropmann / Performance Forecasting

209

dents/symptoms that prompt the problem, by comparison of current and desired performance for potential reasons, by search for critical and problematic processes, by shaping affected infrastructure, by demarcating applications that are causing or are hindered by current performance problems and by evaluating performance parameters. Effects, side effects and after effects of the performance gap, the frequency of these effects, their impacts and the costs of each effect is described. Investigate - Deﬁning the current state: The current performance of the affected applications, functions, or components must be accurately deﬁned. Otherwise the selected solutions will be inappropriate and ineffective. A performance system model may be based on an input-function-output process model. It is extended by resource consumption estimation and by dependences among processes that are running in parallel. After demarcation of possible performance causes we may use analysis tools for further investigation of the current state. Vision - Deﬁning the possibilities: It is necessary to clearly understand the gap between current and desired performance. The performance problem and solution characterisations are used for shaping potential solutions to current performance gaps. We aim at facilitation of a safe, positive and future-focused discussion to get at the best possible results. The expected performance must be validated for relevance prior to use. Brainstorm - Generating a list of potential solutions: Performance problems and their effects are interrelated among each other. Solutions may lead to new problems. Therefore, problems can be classiﬁed into eliminated, modiﬁed, reinforced, and created problems. The interdependence will guide the facilitated brainstorming session aiming in detecting potential solutions for each of the four change operations (eliminate, modify, reinforce, create). Qualify- Narrowing solutions down to those with the greatest leverage: Every potential solution is evaluated according to a number of criteria, e.g. resources, total costs, short-term results, long-term results, alignment with the organisational vision, appropriateness in the eyes of management or customers, and beneﬁts. Looking at the weight of each potential solution after the Impact Elements have been prioritized will leave a clear ranking of the ﬁnal solutions. Plan - Securing ownership, commitment, permission: A proposal should be generated for each of the remaining solutions. The proposal template should include: current state of performance; steps listed in order with the predicted outcome of each step or features of the solution and beneﬁts of each feature; predicted ﬁnal sustainable outcome including the impact that the solution would have; resources and investment required for full implementation. Apply - Managing the realisation of the solution(s): Prior to starting any intervention the metrics that will prove efﬁcacy must be selected and measured. Details and steps of the performance improvement plan are developed at this stage. Roles, responsibilities, resources, timeframes/milestones, and consequences for each step are assigned to participants. Then the plan is realised step-by-step. Report - Measuring the ﬁnal outcome and capturing experience: The ﬁnal outcomes are documented . Using the metrics that were selected during the process, performance improvements should be quantiﬁed and reported. Typical elements are: new techniques, ideas, or knowledge; initial and ﬁnal states of performance; original and modiﬁed solution(s) with tangible (metrics) and intangible results of each solution

210

B. Thalheim and M. Tropmann / Performance Forecasting

implemented; total time, cost and other resources utilized; new experience and insight generated and key learning points. Proactive, solution focused performance improvement is still a matter of handicraft and mastering by performance specialists or stewards. There is no common foundation for these stages. Therefore, it is our goal to develop a general conceptual approach to proactive performance improvement. This conceptual approach consists of the following conceptual performance forecasting programme: Conceptualisation of performance solutions: Performance improvement problems can be categorised through the Category-Problem-Cause frame. Frequency and occurrence conditions are risk factors for problems. Performance problems may be resolved through certain techniques. Problems, risk factors and solutions must be understood, i.e. get their conceptualisation. Enhancement of conceptual schemata by performance templates: Schemata and schema elements may be enhanced by templates that characterise performance problems they may cause under certain conditions. Development of control and measurement practices: Proactive performance improvement needs a dynamic adaptation technique. Therefore, a performance monitor based on measurement practices allows to react to peak situations and to evaluate potential performance bottlenecks. Development of parameter set reduction and dependence representation techniques: Since performance and behaviour of a database system is typically characterised by a large number of parameter that have different importance and interactions depending on the current situation we need to apply techniques for reduction of sets of parameters. Substantiation of data mining and statistics techniques for performance analysis: Statistics and data mining are prominent techniques for detection and derivation of behavioural pattern. They can be enhanced and substantiated for speciﬁc needs of performance forecasting. Development of a forecasting framework: These procedures, techniques, practices and theories can be combined within a forecasting framework. There are very few publications that propose solutions to this kind of performance improvement. The Psychic-Skeptic Prediction (PSP) framework [5] allows an autonomic DBMS to efﬁciently learn about a workload’s dynamic behaviour over time and to forecast when a change in the workload might occur in order to proactively reset the DBMS parameters to suit the new situation. The PSP framework combines features of both online and ofﬂine prediction approaches. The general performance forecasting programme goes far beyond classical workload modelling and workload-shift detection. 1.4. Overview on this Paper Section 2 surveys known approaches to performance improvement and their potential. Section 3 develops an approach for conceptual performance forecasting. Section 4 brieﬂy surveys the realisation of the conceptual performance forecasting within a challenging project. Section 5 introduces fundamentals of the forecasting approach and discusses its potential. Section 6 brieﬂy discusses results, solutions and practices of our approach to performance forecasting.

B. Thalheim and M. Tropmann / Performance Forecasting

211

2. Classical Approaches to Performance-Oriented Tuning 2.1. Performance and Tuning Performance tuning is the process of modifying and adjusting the parameters of the underlying DBMS to improve performance. Performance is measured by a number of parameters such as response time and throughput time. The parameter set may be rather large. For instance, Oracle 10 supports more than 200 parameters3 . Tuning is typically not aiming at the change of the database model, its semantics or structuring. The ﬁrst step in tuning is typically to determine bottlenecks. The applications and the DBMS typically form an extremely complex system. The state-of-art in tuning applications is still based on physical performance improvement. The exceeding complexity of the large DBMS parameter set results in a ‘nightmare’ situation. The database programmer must detect those parameters that can be changed with performance improvement and must know which changes might have a bad impact on behaviour of other parameters. Most textbooks, manuals and papers thus advice to tune the database system at the physical level. Since tuning is teamwork and performance improvement is a part of continuous database engineering, tuning decisions are neither coherent nor well-documented nor error-prone nor well understood. Physical tuning techniques are still relying on human intervention and manual programming. They do not allow to react to dynamic changes. A large number of books and papers exist that explain physical and logical performance techniques, e.g. [2,10,11,13,12,4,17]. Almost all tuning techniques are deﬁned at the physical level. It can also be shown that these techniques can automatically be deducted [3]. Some of these techniques are already implemented in tools for the speciﬁc DBMS and thus becoming ﬁrmware, e.g. [21]. 2.2. Logical Tuning Logical tuning techniques are far better understood and far easier to use. Typical techniques are: • Tuning the disk cache: We may introduce separate caches for support of critical processes. These caches may be of different size and collected into special pools. We may use prefetching and replacement strategies for improving behaviour of these caches. • Tuning the logical schema: Indices seem to be the ultimate tuning device. They may however also cause performance problems. Therefore, index optimisation is typically a careful negotiation process. A number of equivalent schemata can be considered for the same application. Therefore schema restructuring techniques such as management of repeating groups and partitioning on the basis of horizontal decompositions lead to better performance. • Query optimisation support: Query optimisation is a matured science within DBMS technology. The optimisation aims at ﬁnding the best possible query plan for computation of a set of queries. It may however also fail. Therefore, DBMS 3 Tuning on the basis of these parameters might look similar to music production with large production boards with many keys for regulating the sound etc. Sound production becomes an art that is only mastered by ﬁrst-class operators.

212

B. Thalheim and M. Tropmann / Performance Forecasting

•

•

•

•

•

systems such as Oracle, Sybase and DB2 allow to deﬁne hints for the query optimiser. These hints4 are used for the generation of better query plans. Denormalisation of logical schemata for performance improvement: Database application might be either update-intensive or query-intensive. Query-intensive applications often suffer from normalisation since queries tend to use many joins. Therefore, adding redundant data to tables, handling non-normalised tables, master-slave data modiﬁcation, and derived materialised attributes are used for performance improvement. Materialisation and layered architectures: Queries or subqueries with a high frequency and low update rates may be materialised if actuality may be compromised. Data warehouses use this approach. Layered architectures can be built through view technology. Query ‘gardening’: Although a system might function efﬁciently when it is initially conﬁgured, performance might degrade even though the load is unchanged, for instance due to changes in the state (e.g., size of tables) of the database or due to competition with other coexisting systems or database systems. Typical ‘gardening’ functions are statistics monitors and recompilation procedures that are activated from time to time. Transaction processing tuning: Lock control, transient versioning, versions for read-only transactions, parallel thread processing by nested transactions are programming techniques that can easily be used for extension of transaction processing. Load control techniques: Feedback-driven techniques support admission, cancellation, restart control based on estimations on the conﬂict ratio. Transactions can also be ordered depending on their wait-depth according to completion of other transactions.

2.3. Conceptual Tuning Instead of physical or logical tuning we have developed an approach to conceptual tuning [1,14,18,19,20]. • Optimising the conceptual schema: Co-design of structuring, functionality and interactivity considers explicit restructuring depending on functionality and interactivity. • Performance-oriented translation to logical and physical schemata: Translation of the conceptual schema may depend on performance requirements and thus be used for direct optimisation of the logical and physical schemata. • Adaptation to the optimisation strategy of the DBMS: Optimiser of modern DBMS may be conﬁgured. This conﬁguration is typically used by operators for reorganisation on demand. It may however be integrated into the translation. • Technical tuning at the conceptual level: The conceptual schema may be translated to an optimised conceptual schema. In this case we consider more than one equivalent conceptual schemata. Typical parameters that might be used are selectivity and size of views, query results and relations. 4 The database group at Kiel holds a patent together with a large airline for the dynamic generation and injection of hints into database query optimisers.

B. Thalheim and M. Tropmann / Performance Forecasting

213

• Revision and optimisation of the logical schema: The translation of the conceptual schema to the logical schema may include another optimisation pass that optimises bottleneck sources such as identity management, aggregated columns, key-keeping tables, exclusion of register tables, introduction of controlled data and relation redundancy, and building of table extracts. • Explicit performance-oriented control strategies for integrity maintenance: Integrity constraints are an important instrument in structure modelling but may hinder performance. Performance-oriented integrity control strategies allow to schedule maintenance at the moment when performance is not inﬂuenced or inﬂuenced to a small intent. We may additionally distinguish between soft and hard constraints with lazy or eager enforcement. Hard constraints are those that must be preserved at any moment of time. • Explicit introduction of parameters for performance collapses: Co-design allows to derive estimations of query and subquery result sizes and of performancecritical functions. These estimations may be used for deﬁnition of schema-based special performance parameters and threshold values. • Optimisation of functions and queries depending on translation choices, with explicit introduction of alternatives. All tuning strategies known so far5 can be mapped to conceptual tuning strategies.

3. The Kiel Approach to (Conceptual) Performance Modelling 3.1. The General Approach to Performance Modelling Database systems (DBS) consist of a database management system (DBMS) and a number of databases. Typically, a DBS is based on support systems such as operating and graphical systems. The performance model is thus based on the architecture of the application system. Let us consider a typical 3-tier architecture in the sequel. For any system we may distinguish the supply proﬁle and the demand portfolio. Applications have their speciﬁc demands. A DBS may provide a number of services. These services are combined within a supply proﬁle. If there is a mismatch then it might result in a performance problem or in insufﬁcient utilisation of resources. In the ﬁrst case a performance solution is sought. In the second case, consolidation of systems may decrease the unnecessary overhead. In the project reported below we are interested in both solutions. 5 We analysed tuning principles in [17] and detected [16] that all known tuning approaches can be directed by additional schema information that can be kept with the conceptual schema. This state has not been changed in 2009 [16].

214

B. Thalheim and M. Tropmann / Performance Forecasting

In a similar form we may now develop a model suite consisting in models for monitoring, analysis, design, forecasting, tuning, and capability management. It is our intention to conceptualise these models in a form that allows from one side to reason about system properties together with stakeholders in the application and from the other side to map the models to physical implementations. The right part of the picture displays our Kiel approach to general performance management. The performance objective is therefore based on the applications and more specifically on the application tasks. These tasks are typically combined with performance properties. If the performance of the DBS does not match the performance properties then the performance problem must be described. We use a portfolio/proﬁle description. A portfolio consists of a set or collection of tasks. A proﬁle speciﬁes the services and the capability of a system. This separation allows us to specify requests issued by an application through the application demand portfolio and to describe the services provided by the DBS through the DBS supply proﬁle. In a similar form we describe the demand of the DBS by the DBS demand portfolio and the services provided by the support systems by the system supply proﬁle. The DBS supply proﬁle consists of the characterisation of the DBMS, the extended data dictionary with the system variables and parameters, and the DBMS utilities. Typical DBS supply proﬁles are, for instance, indexing, tuning, redundancy, and ﬂexibility proﬁles [2,4,10,11,12,13,17]. The support systems’ proﬁle is based on operating and graphics systems parameters. The detailed description of these proﬁles is beyond the scope and the page limits of this paper. Instead, we concentrate on the novel ideas and approaches discussed in the sequel in this Section. 3.2. Modelling the Application Demand Portfolio The DBS modelling process results in a speciﬁcation of the conceptual schema and of business processes. We combine this speciﬁcation with the conceptualisation of performance properties. The extended database application schema consists of the database schema, the business processes and the characterisation of the application demand by the characterisation of the kind of computation based on the description of the operations involved, the operation support, and the data volumina transferred for support of computation, the visibility description of processes for the business user that includes frequency of operations and their relation to business processes, the description of the modes of computation such as online, batch and interactive mode of computation or deferrable and immediate application of computation, the performance properties and quality based on the expected execution time for online etc. modes, based on the throughput expectation for queries, modiﬁcations and transactions, based on restrictions such as suitability or response time, and based on priority claims issued by the business user, the criticality level of the processes. We may use the transformation approach of [19,20] for optimisation of the extended conceptual speciﬁcation. This extended database application schema can be used for derivation of the DBS demand portfolio. The extended DBS speciﬁcation can be mapped to logical and physical schema. Most mappings are interpretations of the conceptual structure and business processes to

B. Thalheim and M. Tropmann / Performance Forecasting

215

logical structures and programs. We use, however, the translation portfolio [22] that allows us to take into account the style of the logical schema, e.g., the treatment of hierarchies, integrity control, quality characteristics, null values, key management by identiﬁers or surrogates, view management, error management, and controlled redundancy. The logical schema and the programs are extended by access proﬁles, storage portfolio, and an index portfolio. These portfolios may also be conceptualised since they are typically standardised within a DBMS through a number of conﬁguration templates. We also may use a tuning portfolio, a redundancy portfolio, a query computation hint portfolio. 3.3. Layered Architectures and Performance Models The layered architecture discussed above supports a separation of concern into macroscopic conceptual descriptions, mesoscopic logical descriptions and microscopic physical descriptions. Microscopic descriptions are typically based on a complete knowledge of all parameters. At this level we may consider the subsystems Ui of the system U = (U1 , ...Us ) as well as their interaction and evolution. The evolution can be given in an analytical form by a general equation of the form U˙ = N (U, Λ, Ω) + F where N is a nonlinear vector ﬁeld, Λ denotes inhomogeneities, Ω the set of external parameters, and F the ﬂuctuations. This equation is very general. It can be used if the state space is low-dimensional and inhomogeneities, external parameters and ﬂuctuations are known. 3.4. Indicators for Performance Problems Performance problems can be characterised on the basis of the category-problem-cause template with the description of content, motivation for their solution, ﬁt criteria, measures and considerations. Since these problems may occur in a large variety we concentrate on speciﬁc kinds: substantial derivation from normal operating. A deviation may be considered to be substantial if we observe a certain threshold increase, e.g. by 200 %. We may also concentrate on peaks. Each DBMS provides a number of parameters that might be used for characterisation of its performance, e.g., service time, arrival rate, service trafﬁc, trafﬁc intensity, utilisation, queue time, and response time. These parameters form the parameter space. For instance, Oracle provides more than 600 metrics and a number of views within the active workload directory such as dba hist database instance, dba hist snapshot, dba hist osstat, dba hist sysmetric summary, dba hist system event, dba hist sysstat, dba hist buffer pool stat, dba hist sys time model, and dba hist ﬁlestatxs. These parameters can be classiﬁed into performance parameters (e.g., NUM CPUS, BUSY TIME, USER TIME, SYS TIME, IOWAIT TIME, DB Time, DB CPU (time), CPU used by this session, READTIME, WRITETIME, WAIT COUNT, WAIT TIME, TIME WAITED ) and workload parameters characterising physical write blocks, physical read total blocks, executes, transactions, user calls, recursive calls, buffer gets, table scan blocks, logons, db block changes, table fetch rows, and sorts (memory). 3.5. Performance Improvement based on Tuning The performance solution model (C,A,B,L,F,D,M) introduced above considers

216

· · · · · · ·

B. Thalheim and M. Tropmann / Performance Forecasting

conditions C required in the environment for the solution to work, the audience A or scope, the behaviour B that can be demonstrated, observed/measured, and developed, the location L where the performance will take place, the frequency F characterising how often the behaviour must be exhibited, the degree D describing the required level of performance, and the metrics M supporting measurement and observation of the behaviour.

We could now test whether performance improvement principles or conceptual tuning techniques result in performance improvement. Instead, the application demand portfolio is matched against the workload parameters. This approach enables us to detect whether performance parameter behaviour depends on tasks in the demand portfolio. 3.6. Conceptual Control Posts and Monitors for Performance Forecasting The classical approach to performance improvement assumes a direct link between performance monitor results and physical design. Performance stewards try to analyse SQL scripts and database internals and regurgitate mountains of difﬁcult to interpret statistics. The key to understanding the discipline of performance monitoring is, however, not to really validate the physical design implementation but to understand the behaviour of the system. That means we need a conceptual understanding of databases’ performance and a mapping to facilities at the logical level of the DBMS. The tragic thing is that much of today’s mindset dismisses the idea that altering and improving a database’s physical design based on conceptualisation of performance requirements will yield the largest possible performance beneﬁt. We share the opinion of performance stewards that there are performance situations that really aren’t impacted by the physical design directly. Lock contention, for example, is mostly an application or coding issue. I/O contention problems are likely to be caused by improper segmentation of tables, indexes, and storage structures in physical design. Long table scans can be avoided if proper indexing strategy has been chosen. Out-ofspace problems with either your storage structures or objects are a result of improper size estimation in initial physical design.

4. An Application Case 4.1. The Project Scope Our approach to performance forecasting has been applied in a project [23] that aims in systematic consolidation of the computational infrastructure of one of the largest energy corporations in Europe. All systems must be loaded with an acceptable workload. Therefore, the systems landscape must be restructured, must be dynamically reconﬁgurable, must allow to report low workload as well peaks, and ﬁnally should be based on reliable prognoses of the landscape capabilities. The performance forecasting project we report in this paper aims in providing a solution for the ﬁrst three stages of performance facilitation. It uses a number of assumptions:

B. Thalheim and M. Tropmann / Performance Forecasting

217

Inertia of application demand portfolio: Application processes are typically running within a certain repetition and certain periods of time. They may be characterised by their resource consumption and by their I/O behaviour. We may therefore assume that at a certain point of time in future those processes are running that have been running in the past at some time points. Therefore, we may assume that behaviour in future can be modelled by behaviour in the past. Matching requests with resources: Applications are typically known to the database operators or application stewards. Therefore, we may characterise requests to the DBS by the tasks of the application demand portfolio. The utilisation of DBS resources is trackable through the DBMS statistics. Restriction for the scope: Performance problems are caused by speciﬁc tasks. Their inﬂuence on performance can be characterised by a small number of performance and workload parameters. We assume that workload parameters drive performance parameters. We restricted consideration to some parameters: Executes, Physical Writes, Physical Reads, Logical Reads, and Transactions. Hypothesis-based forecasting: We restrict the scope of forecasting to those dependences in the performance parameter space that can be modelled through regression between one of several performance parameters at the assignment side and workload parameters at the characterisation side. Restriction of platform and of applications: Since, a general framework has not yet been developed we use an actual project environment for testing and continuous development of the framework. The system landscape consists of IBM System P 550 LPAR with an AIX Version 5.3 operating system, with 8 Power6 processors, 128 GB DDR2-SDRAM, 6 3,5-Zoll-SAS (73,4 GB) disks, external storage arrays, a single 10 GBit/s network and a number of database applications. We choose an experimentation template, e.g. using a sampling rate of 10 minutes. 4.2. Modelling the Database System Performance Parameter Space

Database management systems provide a number of statistics and parameters. The schema on the left displays a collection of almost hundred of them. These parameters can be recorded and used for the development of statistics. These data cannot however be used in the recorded form. Data must be ﬁrst analysed whether they are complete, whether errors can be detected in the data, and whether duplicates or outliers lead to a wrong understanding. Moreover, systems are changing. Therefore, statistics must be transformed to robust statistics.

218

B. Thalheim and M. Tropmann / Performance Forecasting

4.3. Goals and Tasks for Performance Forecasting in the Project The consolidation project is supported by the performance forecasting project. The aim is to permanently and efﬁciently monitor systems workload, to forecast the behaviour based on mathematical and computational models, to optimise deployment of landscape resources, and to control capabilities of each of the systems. The forecasting solution is thus supporting restructuring of the landscape and acceptable deployment of each system. At the same time, a software system must be developed that does not signiﬁcantly increase workload of the landscape. In the ﬁrst phase of the project a number of statistics have been collected in a systematic way based on the approach presented in the Sections above. We aimed at the characteristic workload, the visual detection of dependences among parameters in the parameter space, and the systematic derivation of hypotheses for these dependences. In the second phase of the project these hypotheses have been checked against the statistics and used for the development of a conceptual performance model for the application demand portfolio. In the third phase this model can be used for performance forecasting based on the assumption of inertia of the application demand portfolio. If a performance bottleneck can be derived then we may apply proactive tuning or we may use capability management approaches for the resolution of performance problems. This approach to direct derivation of causes for performance problems enables us to provide a number of solutions for envisioned performance problems. These solutions are heuristic ones since they are based on observations. 4.4. From Performance Statistics to Heuristic Performance Models The statistics on the left displays behaviour of (1) CPU usage, (2) physical writes, (3) transactions, and (4) logical reads for a selected time period. By visual inspection we may envision that these parameters are associated. We also conclude that none of the last three parameters explains the behaviour of executes. We are interested in a small set of parameters that explains the behaviour of the given performance parameter. We may derive a number of hypotheses and check these hypotheses against our data after cleaning and consolidating the data themselves. 1. Hypothesis: The CPU time usage depends on executes (number of SQL statements). Result: correlation coefﬁcient 0,22698 with a certainty coefﬁcient of 0,05. The correlation strength is low and the hypothesis is not accepted. 2. Hypothesis: The CPU time usage depends on physical writes (number of blocks written to disk). Result: correlation coefﬁcient 0,64261 with a certainty coefﬁcient of 0,41. The correlation strength is moderate and the hypothesis must be reﬁned.

B. Thalheim and M. Tropmann / Performance Forecasting

219

3. Hypothesis: The CPU time usage depends on physical reads (number of blocks read from disk). Result: correlation coefﬁcient 0,09669 with a certainty coefﬁcient of 0,01. The correlation strength is negligible and the hypothesis is not accepted. 4. Hypothesis: The CPU time usage depends on logical reads (buffer gets, frequency of reads from cache). Result: correlation coefﬁcient 0,61149 with a certainty coefﬁcient of 0,37. The correlation strength is moderate and the hypothesis must be reﬁned. 5. Hypothesis: The CPU time usage depends on transactions (number of active transactions). Result: correlation coefﬁcient 0,28607 with a certainty coefﬁcient of 0,08. The correlation strength is low and the hypothesis is not accepted. 6. Combined and reﬁned hypothesis: The CPU time usage depends on transactions, physical writes, and logical reads. Result: correlation coefﬁcient 0,96 with an average error of 1, 85−10..12 . The correlation strength is very high and the hypothesis is used for the derivation of a performance model. The last hypothesis is the basis for a conceptual model. This model depends on the application demand portfolio and the DBS supply proﬁle.

5. The Foundation Framework for Performance Forecasting 5.1. The General Performance Forecasting Model A general performance forecasting model displayed in Figure 1 consists of • a set {o1 , ...., om } of performance objective parameters that can be observed for the system, • a set {c1 , ...., cn } of control parameters through which the behaviour of the system can be changed, • a set {p1 , ...., pl } of system-immanent characteristic parameters, and c, p¯) mapping control and characteristic param• a set of functions or relations Si (¯ eter values to values for performance objective parameters oi . c1 c2 {Si (¯ c, p¯)|1 ≤ i ≤ m} ... cn -

6 p1

6 p2

...

o1 o2 ... om -

6 pl

Figure 1. The general model for performance forecasting

The classical forecasting model is mainly considering control parameters. Since we know the DBMS and some of its system-immanent parameters we are also using the systemimmanent parameters as far as we know them and can measure their behaviour.

220

B. Thalheim and M. Tropmann / Performance Forecasting

The functions Si (¯ c, p¯) can be given as analytical functions, simulation functions or benchmarks. Analytical functions describe correlation of input parameters to output parameters. Simulation functions map the behaviour of the system to an abstract machine that is only based on relevant parameters. Benchmarks are mainly systematic experiments. A small number of portfolios from certain application domain is given and potentially a proﬁle of the DBMS is assumed. The ultimate performance forecasting problem solution would be the description on the basis of analytical functions. Analytical functions can, for instance, be obtained by learning algorithms used in algorithmic learning theory, neural networks, or machine learning. They are widely used in databases technology, e.g., memory allocation functions, memory estimation functions, compression functions, and estimation functions used for query optimisation. We share however the disbelieve of performance stewards in the nonexistence of such functions. This doubt is grounded by the large number of system-immanent parameters that must be taken into consideration. Benchmarking approaches are useful if the application domain is stable, the database size can be estimated to a number of size frames, the database structure is stable, and the DBMS is not evolving. In these cases, a similar application is used for observations of potential behaviour of the database systems. Benchmark development is an important research topic. If we face however a situation where applications are changing and where the DBMS carries a number of applications based on some assignment policy of companies, then benchmarking cannot be used for forecasting. Simulation functions are used as an experimental ‘proxy’ environment for performance reasoning. Typically, a performance demand load is used as an input parameter set. This parameter set is systematically changed. The performance forecasting model thus simulates the performance behaviour of the real system. We have reﬁned this approach by splitting the input parameters of the performance load into proﬁle parameters and portfolio parameters. The behaviour of these parameters is captured by experiments. The portfolio parameters and the proﬁle parameters are given by the applications that are running on the system and by the behavioural facilities of the DBMS itself. This approach is only applicable under the performance forecasting assumptions discussed in Section 1. The development of simulation models can also be considered to be the input stage for the development of analytical functions. In the sequel we shall demonstrate that this approach allows performance forecasting for database applications. 5.2. Synergetics of Performance Parameters The dimensionality of the performance parameter space constitutes a serious obstacle to efﬁciency and applicability of most tuning or performance improvement approaches. Performance stewards (or worker) are able to reason in low dimensions and therefore cannot provide any meaningful results when the number of parameters goes beyond a ‘modest’ size of 10 parameters. Therefore, parameter selection is used for dimension reduction. The objective of parameter selection is to identify parameters as important and discard any other parameter as irrelevant and redundant ones. Parameter selection is currently mainly performed as a manual ﬁltering procedures and is thus entirely based on the skills and on the experience of performance stewards. Classical methods to dimension reduction like eigenvalue transformation are not applicable due to insufﬁcient knowledge

B. Thalheim and M. Tropmann / Performance Forecasting

221

on systems actual behaviour. Therefore, performance assessment might be the work of a highly-skilled artisan. Therefore, we need to develop our own approach to handle this situation. We have chosen synergetics [6] that allows a separation of parameters into order parameters and enslaved parameters. This approach is based on the separation of consideration into microscopic state spaces and macroscopic state spaces discussed in Section 3. Order parameters force in general the behaviour of enslaved parameters. Therefore, we could abstract from enslaved parameters and consider order parameters for performance forecasting. Typically there are other external parameters that inﬂuence behaviour of order parameters. These parameters can thus be used to regulate behaviour of a system. These parameter thus control the behaviour of order parameters and are called control parameters. This approach has another beneﬁt for simulation models. Based on experiments or observations of behaviour of order parameters and extraction of corresponding values for control parameters we may derive behavioural pattern for these order parameters. If we know such pattern then the performance forecasting and monitoring allows to derive solutions for performance problems by pattern recognition and control regulation. The general performance facilitation can be based on the separation of parameters. 5.3. Fundamentals Provided by Data Mining and Statistics Data mining and statistics developed a rich body of knowledge that can directly be applied to development of performance models. Since there are many surveys, monographs and textbooks6 that provide a matured background theory we may restrict the scope of this subchapter to our approach that aims in use of statistics and data mining for performance forecasting. We based the development of models on a number of assumptions and ﬁndings. Workload is accumulative if we do not consider conﬂict management situations. Therefore, we may assume that dependences among parameters are linear. This assumption can be checked through crosstabulations, comparison of group means and analysis of residuals. The experimentation setup uses also techniques for outlier detection and outlier removal, techniques for bias removal or adjustment, techniques for visual inspection of relations, discriminant analysis for identiﬁcation of variables that best predict behaviour pattern membership, factor analysis for reduction of the number of parameters for later analysis, logistic regression for causal analysis within dichotomous dependent parameters, multiple regression and multiple correlation for the analysis of the joint impact of a set of independent parameters, and Anova and multivariate analysis of variance for tests of the statistical signiﬁcance between the means. This list of techniques 6 Since

we do not aim in surveying this research we link the reader to [9] instead of a long list of references.

222

B. Thalheim and M. Tropmann / Performance Forecasting

is not complete. It shows however the rich reservoir of methods those are applicable to performance analysis. Within our project we restricted ourselves to independence of groups of parameters and thus applied correlation and regression analysis. These techniques focus on analysing direct relationships between parameters. The models we use are based on the assumption that workload causes performance problems. This causal model can be directly extended by DBMS, DBS and operating systems models. There is - in most ‘normal’ database operating situations - a strong evidence of a causal relationship between workload and supply proﬁles of DBS and performance parameters. One weakness is however that this general assumption about relationship is not good as establishing how or why workload parameters and the supply proﬁle have the effect they do. Therefore, we need a sophisticated machine model that allows to separate different concerns. The synergetics model and elaboration analysis allow to reduce the impact of indirect and transitive causal chains. Factor analysis identiﬁes latent causal factors that explain the covariance among items.

6. Performance Forecasting in Practice 6.1. Implementation of the Approach to Performance Forecasting The performance framework presented in this paper has been tested in the project and is supported by a system that has been implemented in an experimental prototype. The monitor interval (e.g., daily monitor) and the parameters from the parameter space are going to be selected. A number of statistics is generated after the run call. These statistics can also be plotted. The data are collected by our performance value capturing toolbox. These statistics are used detection of hypotheses. The screen shot displayed above shows now that these hypotheses may be used for the derivation of a model. At the left side we show one derived model. This model coincides with the one discussed at the end of Section 4. This model uses linear regression of the CPU usage from transaction, physical writes, and logical reads. The result of the model evaluation is displayed based on the F-test and signiﬁcance coefﬁcients, the expectation for CPU usage and the average error.

B. Thalheim and M. Tropmann / Performance Forecasting

223

6.2. Conservative Performance Forecasting The result of conservative performance forecasting is displayed at the left. We target at pessimistic or conservative forecasting models. Such models give an upper estimation of the performance forecast. The graphic on the left side shows the difference between the forecast and the real value. Therefore, the derived model can be used for conservative performance forecasting. 6.3. Forecasting Model Evaluation Regression analysis resulted, for instance, in an equation of the following kind [15]: CPUBusyTime = 50844 + 0.027 NoTableScanBlocks + 0.56 NoPhysWriteBlocks + 0.155 NoPhysTotalReadBlocks + 5.996 NoTA + 0.18 No TableFetchRows + 0.04 NoDBBlockChanges + residuum.

The expectation of the residuum or the error expectation is estimated by −1.25 · 10−11 .

The last equation shows that the extended database application schema can be used for analysis of this performance parameter and therefore for conceptual performance forecasting. The three workload parameters of the two subsections above form the main portion of the workload. The result of model evaluation is displayed at the left side. Models can be numerically evaluated. We may also use visualisations. For instance, the diagram displays a comparison of the conﬁdence intervals of forecasted CPU utilisation values with the CPU utilisation values observed for the database system. Almost all CPU values are within the conﬁdence intervals. Therefore, the model approximates the systems behaviour. 6.4. Development of Performance Forecasting Questionnaires The models for dependences of parameters from other parameters can be now used for answering questions like the following ones: • What is the behaviour of CPU workload if the application demand portfolio increases requests to the DBS? • How many CPU’s are sufﬁcient if general DBS workload does not exceed 75 %? • Is it possible to add tasks to the application demand portfolio without increase of performance peaks?

224

B. Thalheim and M. Tropmann / Performance Forecasting

The answers to these questions are however dependent on the explicit assumptions discussed in Section 4, e.g., relative inertia of previous application demand portfolio. 7. Conclusion 7.1. Summary The paper shows that performance management can be performed at the conceptual level. We start with a description of performance objectives and their discrepancy from real behaviour. These discrepancies cause performance problems. Performance solutions and the introduction of a performance framework are attempts to overcome these problems. We analyse causes of different categories of performance problems. Most solutions for performance problems have been developed at the physical and in some cases at the logical levels through application of tuning techniques. These techniques can however be conceptually speciﬁed. We thus introduce a conceptual performance forecasting programme that is based on a separation of concern into a demand portfolio and supply proﬁle. The portfolio and the proﬁles are the basis for an extension of the conceptual schema and the business process speciﬁcation to an extended conceptual schema. This schema is used for the derivation of solutions for performance problems. Solutions can be based on explicit models describing interference of performance parameters. We show how these models can be developed through modelling techniques that detect correlation and dependences among parameters. The models we developed are based on dependences between workload parameters and performance parameters. Therefore, performance conceptualisation is achievable since workload parameters are part of the extended conceptual schema. 7.2. Future Research The models we developed describe relations among application demand portfolio and DBS supply proﬁles. Within the project reported above we also tried to develop models for relations between the DBS demand portfolio and the supporting system supply proﬁles. This task is more difﬁcult due to the number of parameters and due to interaction of different processes at the operating system. So far we used mathematical models based on a separation into macroscopic and microscopic levels. We need however also simulation models and benchmarks for development of systems, for their maintenance and their evolution. The models we have used so far are linear regression models describing dependences among parameters. Linear regression models are not sufﬁcient for description of all performance situations. For instance, schemata with intertwined integrity constraint sets have a modiﬁcation workload that results in modiﬁcation of many related objects whenever one object is modiﬁed. A path towards more complex models based on queuing theories and time series analysis has already been highlighted in [15]. The ﬁnal target of a performance framework is the development of sophisticated performance facilitation templates. Such templates naturally depend on the DBMS and the platform. Therefore, a deep study must be initiated whether such facilitation is describable at the logical level in all situations and can be conceptualised. We expect a positive answer for this question.

B. Thalheim and M. Tropmann / Performance Forecasting

225

References [1] M. Albrecht, M. Altus, and M. Steeg. Application-oriented design of behavior: A transformational approach using RADD. LNCS 1331, pages 323–332, Los Angeles, USA, Nov. 3 - 5, 1997, 1997. Springer, Berlin. [2] N. Bruno and S. Chaudhuri. Constrained physical design tuning. VLDB, 1(1):4–15, 2008. [3] S. Chaudhuri and G. Weikum. Foundations of automated database tuning. In VLDB, page 1265, 2006. [4] P. Corrigan and M. Gurry. Oracle performance tuning. O’Reilly & Associates, Sebastopol, 1993. [5] S. Elnaffar and P. Martin. The psychic-skeptic prediction framework for effective monitoring of DBMS workloads. Data Knowl. Eng., 68(4):393–414, 2009. [6] H. Haken, A. Wunderlin, and S. Yigitbasi. An introduction to synergetics. Open Systems and Information Dynamics, 3:97–130, 1995. [7] ISO/IEC. 9126-3 (Software engineering - product quality - part 3: Internal metrics). ISO/IEC JTC1/SC7 N2416, 2001. [8] H. Jaakkola and B. Thalheim. A framework for high quality software design and development: A systematic approach. IET Software, 2010. to appear. [9] O. Maimon and L. Rokach, editors. Data mining and knowledge discovery handbook. Springer, 2005. [10] S. T. March and J. V. Carlis. Physical database design: Techniques for improved database performance. In W. Kim, D. S. Reiner, and D. S. Batory, editors, Query Processing in Database Systems, pages 279– 296. Springer, New York, 1985. [11] S. S. Mittra. Database performance tuning and optimization. Springer Verlag, 2003. [12] R. J. Niemiec. Oracle database 10g performance tuning tips & techniques. McGraw-Hill, 2007. [13] P. O’Neil. Database principles, programming, performance. Morgan Kaufmann, Los Altos, 1994. [14] N. Runge. Scheme transformations on the basis of optimizing combinations of partially applicable elementary transformation methods. PhD thesis, Karlsruhe University, Computer Science Dept., 1994. [15] A. Seifert. Methoden und Verfahren f¨ur Performance Forecasting in datenintensiven Anwendungen. Master’s thesis, Christian-Albrechts University Kiel, Department of Computer Science, 2009. [16] D. Shasha and B. Thalheim. Personal communication. Discussion on performance problems, tuning techniques and their conceptual representation, 2000-2009. [17] D. E. Shasha. Database tuning-a principle approach. Prentice-Hall,Englewood Cliffs, 1992. [18] M. Steeg. The conceptual database design optimizer CoDO - concepts, implementation, application. LNCS 1157, pages 105–120, Cottbus, Germany, Oct. 7 - 10, 1996, 1996. Springer, Berlin. [19] M. Steeg. RADD/raddstar - A rule-based database schema compiler, evaluator, and optimizer. PhD thesis, BTU Cottbus, Computer Science Institute, Cottbus, October 2000. [20] M. Steeg and B. Thalheim. Conceptual database application tuning. In Multiconference on Systemics, Cybernetics and Informatics, volume VIII, pages 226 – 231. IEEE, 2000. [21] M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2’s learning optimizer. In VLDB, pages 19–28, 2001. [22] B. Thalheim. Entity-relationship modeling – Foundations of database technology. Springer, Berlin, 2000. [23] M. Tropmann, B. Thalheim, and R. Korff. Performance forecasting. 21. GI-Workshop on Foundations of Databases (Grundlagen von Datenbanken), pages 9–13, 2009.

Acknowledgement We thankfully acknowledge advices given by Theo H¨arder. The project has been performed by a project team at Kiel University CAU. We gratefully acknowledge the contributions of our project team members G¨unter Millahn and Andrej Seifert [15]. We also acknowledge the support of our industrial partners in the project.

226

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-226

Specification of Games Jaak HENNO Tallinn University of Technology [email protected]

Abstract. Game programming is part of many IT study programs. In these courses and in game-programming texts games are not considered on abstract, implementation-independent level, but discussion is based on some specific implementation environment: a programming language (C, C++), a software package, preprogrammed libraries etc. Thus instead of discussing games on general, implementation-independent level only specific features of these programming environments are considered. Here is presented a framework for object-oriented, structural description and specification of games as event-driven object-oriented systems. At first should be considered game visual appearance and game mechanics – they create the “feel”, player’s experience. Game structure and logic are considered on implementationindependent level using an easy-to understand specification language. The framework emphasizes separation of data structures from game engine and game logic and thus facilitates maintenance and reuse of game assets and objects. Mechanisms for automatic adjustment of game’s difficulty - so that it will be just suitable, not too easy but also not too difficult – are also considered. The specification method is illustrated with several examples. Specifications of games created with this method are easy to transform into implementations using some concrete game programming environment; this has been tested in a game programming course. Keywords. Game, game programming system, specification, event-driven architecture, formal description, emergence

1. Introduction Games and virtual words are rapidly becoming procedural literacy of the youth of 20th century, the generation of the “digital natives” [1]. Bits and bytes are the first common language for the whole mankind after the Babel catastrophe and games are new “algorithmic cultural objects” [2]. Approximately 80 percent of the population of USA plays some kind of computer games and from the total of 2.1GB of data what each American consumes each hour, video games are responsible for 55% of the information received by people at home [3]. Therefore, in many universities have been introduced courses covering video game creation. In these courses, video game creation is considered using some concrete programming language or game programming environment, thus instead of “Creation of Video Games” it were correct to call these courses “Creation of Video Games with C++” or “Game Creation with Gamemaker” and they consider games in the terms of concrete programming environment. In terms of programming language can’t be discussed features of games as a new art form [4]. Features of games are considered using natural language, but this makes discussion rather ambiguous. It is difficult to discuss video games, their structure, how and why they work etc in terms, which are more succinct, technical than natural language, since we do not have proper language for this.

J. Henno / Speciﬁcation of Games

227

Creating of whatever program begins with discussion, presentation of the task in natural, human language. This natural-language description of the task is then used to create equivalent form in programming language. Programming is a transformation from very flexible, loose, often ambiguous natural-language description of a task into a concise, unambiguous description in some programming language. Cognitive distance between these two representations is big. It is easier to overcome this distance, if some intermediate forms of description are used – descriptions in some half-formal specification language, which helps to understand the problem better and helps transformation from natural language into more succinct and formal form. Specification of a problem allows discussing the problem, communicating ideas to wider audience, helps to clarify the essence without involving implementation details, syntax and many peculiar features of a programming language. Existing specification formalisms were designed for static domains with comparatively small number of data types and non-interacting objects. They are not suitable to present dynamic nature of video games, which have numerous visual data types, data types, containing behaviors and animations, data types describing laws of physics, sometimes irrational physics of some imaginary word, event-driven, emerging behaviors of thousands of game objects and their interactions with each other and with players. Game specification has to cover several levels. The logical structure – game rules, what players can do - can be presented using a logic-based language, similar to Prolog or some extension, e.g. discrete time-based logical system. Such a description language is the Game Description Language [ 5 ] (GDL) created in Stanford AI department. This is a logic language, which describes games as finite state machines with one start and several final states. This is suitable to describe the procedural, functional structure of a game and to test game-playing programs. It is not suitable to present a new game, especially to programmers, who have little experience with logic languages. For instance, the main game loop is described with the following rule (the system predicate next declares, that the argument clause will be true in the next game frame) [6]: (⇐ ( next (cell ? x ? y ? player )) ( does ? player ( mark ? x ? y )))

Such a description is understandable to GDL (Prolog) interpreter/compiler, but for humans is practically useless. For player a game is a visual system, which constantly presents new challenges; player interacts with game to solve presented problems and this interaction creates ‘fun’. Game attraction depends both on visual beauty (2D, 3D graphics), but at least as much on algorithmic, time dimension – succession, nature, temp of challenges, presented to player. This distinguishes games from all other types of software – if e.g. a database creates ‘fun’, then something is terribly wrong. How to influence humans in order to create some response or feeling – this is not a problem of IT, this problem belongs to psychology. Humans are several orders more complex than computers and we have (yet) very limited understanding of mechanisms of our brain. Using the GDL language we can prove, that “…game state is playable if and only if it is either terminal or there are legal moves for each player. A game is playable if and only if every reachable state in the game is playable” [ 7], but this is playability for a computer, not for a human, this does not distinguish e.g. Tetris states where player has managed to create a strait level without holes. This makes game description and specification very different from specification of “ordinary” software, where the task is helping, automating some aspects of user’s every-day activities. Raph Koster, author of many successful games suggests: „Start with the experience design, and if your core is rotten or an afterthought, you’ll be put-

228

J. Henno / Speciﬁcation of Games

ting lipstick on a pig“; „there are two rare and vital skills a game designer needs to acquire: the ability to see the game in their head with no dressing at all; and then the ability to see the game in their head with no mechanics at all, as a player sees it“ [ 8], [9].

2. The most important: idea “Start with the experience design” [8]. Player’s feelings, play experience depends on many factors. How game designers create and affect players experiences – games “feel”[10] is only partially understood and often overlooked as a method or course of study, yet the game's feel is central to game's success. This is a hidden language in game design that no one has fully articulated yet, but it creates the meta-sensation of involvement with a game. The language could be compared to the building blocks of music - time signatures, tempo, chord progressions, loudness. Feel and sensation are similar building blocks for a game. The sensation of controlling a digital object is one of the most powerful and overlooked phenomena, which has emerged from the interaction of humans and computers [11]. This control - the games user interface, its visual and dynamic components, game’s experience - should be considered first. 2.1. Visual appearance The first what is noticed is game’s visual appearance, how beautiful the game is, games visual space, effects – sparks, particle systems, smog etc. Well-known are games from PopCap Games [12]. This company has perfected the art of “visual sugar”; their flagship game Bejeweled was an instant hit after publishing, got Computer Gaming World's Puzzle Game of the Year award in 2001 and in the next year was entered into the Computer Gaming Magazine, Computer Gaming World (CGW) Hall of Fame the first puzzle game since Tetris. Braid [13] is also very beautiful, although in totally different style. Games visual appearance depends on (graphic) processor, thus improves approximately according to Moore’s law – everything gets twice better (graphically more beautiful) in 18 months. The Ghostbusters movie, which shows that a well-aimed proton stream and a handy trap can bag any ghoul, has inspired more than eight video games, which are based on the movie's supernatural feel and sci-fi effects. Every new game had better effects and is much more impressive. The newest version: “Ghostbusters. The Video Game” [14] has received good reviews mainly because it’s updated graphical effects. Developers want their game to distinguish visually and often start with graphical effects. Polishing graphic effects can take years and may finally kill the whole project – the Duke Nukem Forever project, developed for 12 years was finally abandoned, when there was lot of code (over 4 hours) covering movements and effects, but no game [ 15]. You should have also dynamics, algorithmic – the 4th dimension. 2.2. Game mechanics Even more essential than good “look” is game’s mechanics, which creates players sensation of manipulating a digital object – running, bouncing, spinning, experiencing the joy of being in total control. The player’s satisfaction is the reason why games are played: “games are about one thing: entertaining people...You have to make your game fun” [16]. “A game designer needs to acquire: the ability to see the game in their head with no dressing at all” [8], be able to create a new miniworld with its own laws, its own

J. Henno / Speciﬁcation of Games

229

“feel”. Already the first, graphically very simple videogames: Pong, Mario, Tetris, Breakout all had a definite, quite individual feel. Game mechanics is best tested with minimal “visual dressing”, representing game objects with simple geometric figures – circles and rectangles. Such bare-bones representations reveals early what is wrong, where the behavior of game objects feels wrong. Below is such a “raw” representation for a Christmas game, where flying Santa-Clause drops packets which player has to collect.

Fig. 1. . Bare-bones game mechanics test format and the final game.

To emphasize mechanics, the 4th dimension, some games use deliberately very simple, 4-color Commodore/Atari-style graphics, but with clever modification and twist of laws of physics and common game expectations create lot of “fun” and exiting experience [17]. There are many examples. 2.2.1. This Is The Only Level A very simple game [18] – you have to jump-drive a small blue elephant to exit; only one room, no scrolling, classical platform-game template– jump and avoid falling on spikes; very player-friendly – if you loose your elephant the Great Elephant’s god above the tube drops you a new one – no punishment whatever, elephants are eternal and will always re-burn.

Fig. 2 This is the only level

But when you finish with your first elephant, the game sends you another. Everything looks the same (this is the same and only level !), but now the action of arrow keys is reversed, right moves left and left moves right. Understanding this is a small surprise and learning to move is a pleasant experience. And this is only the beginning – game presents 40 new tasks where the settings are the same (the same room) but the game mechanics (action of keys, sometimes also the mouse) are every time surprisingly different. The game is breaking all canons of classical user interface design, e.g. Nielsen’s rule “Consistency and standards” [19]. Words, situations, actions mean here never the same thing; player has to discover, what is the meaning of keys this time. This is surprisingly “fun” experience, just what a good game should provide. 2.2.2. VVVVVV, Gnop Another seemingly very simple game, a classical 2D indie platformer with pixilated graphics, where player has to move the hero through rooms with lot of different dangers, e.g. deep pits. To pass a pit player has to rotate the room, so that pit becomes horizontal corridor - the hero has a supernatural ability of quick gravity-flipping, so

230

J. Henno / Speciﬁcation of Games

that instead of falling down he adheres to pit wall and can walk along the wall, thus avoiding danger. Games surreal physics creates a very distinctive “feel” and lot of fun. Even simpler is game “Gnop” (Pong reversed), where instead of moving paddles player moves the ball up or down, and again the weird setup creates fun. Game mechanics, feel is movement, kinetics, tactile sensation of manipulation steering around, running, jumping, performing acrobatic maneuvers are a very important part of a game. Mario’s movements were polished for months before the actual game creation begin: “Before any of the levels had been created Mr. Miyamoto had Mario running around and picking up objects in a small ‘garden’… They spent a lot of time working on the swimming, it's harder than running to get the feeling right, they didn't want you to avoid the water, the wanted to make it an advantage and fun to dive in.” [20]. For creation of game kinetics, object control several components should be considered: Input – what keys, buttons player can use; Response -- how game processes and responds to player input; Context -- how constraints (e.g. room geometry, collisions, invisible forces etc) modify current state and movements; Complete game environment, created by animation, sounds and visual effects; Metaphor and context, which create emotional meaning of gameplay and provide familiarity which helps learning the environment; Rules – what is the goal, what player has to achieve and how game variables create higher-level meaning and challenge to motion and control. 2.3. Well-designed tasks Gameplay is fun only if the game constantly presents new challenges and is correctly balanced – not too easy, but also not too difficult, so that players manage to solve presented problems. Koster: “Games that are too hard kind of bore me and games that are too easy also kind of bore me” [21]. Balancing game problems so that they are not too difficult, but also not too easy and follow each other in constant flow is a difficult task. Gamers have different abilities, what for some is simple is for others quite difficult. For big games have been created cheats and walkthroughs; this has created a twisted category of gamers, who with every new game first download cheats. And for complex games they are a great help for many players. For instance, consider the beginning of a cheat for the game Machinarium [22], below is the corresponding scene: “Click on the overturned bathtub twice, then click on the metal body of our robot friend twice to make it fall to the ground, landing just next to his head. Now click on his head to attach it to the body.”

Fig. 3. The first scene of Machinarium

J. Henno / Speciﬁcation of Games

231

Thus the search space here is a graph with (at least) eight nodes (many arrows have been omitted from the diagram) – player should have good deal of ingenuity to discover the right sequence of clicks and double-clicks:

Fig. 4. Search space for Machinarium start scene.

2.3.1. Complexity manager Games should be self-regulating, adjusting their difficulty according to players actions; especially important is this with educational games – education should always adjust itself to learner’s level. Many good games have features in this direction, e.g. the problem is solved automatically if player has made already too many attempts. Problem difficulty could be estimated and adjusted, presenting the search space formally and simulating the search. Game state space can be searched, recording the number of steps with simple predicate: next (This, Steps) : − can _ go(This, There), not _ visited (There), assert (visited (This )), Steps1 is Steps + 1, next (There, Steps1). next (This, Steps ) : − can _ go(This, There), assert (visited (This )), assert (dead _ end (This )), next (There, Steps ).

Simulating search in game tree allows estimating beforehand the “raw” difficulty of the game, i.e. the size of the search tree. However, this does not take into account the psychological aspects based on unusual setups (the examples above), the difficulty of reaching a new state (the “psychological” length of the path). In educational (“serious”) games where player has to solve tests, game state can have several parameters which influence difficulty of tests; if player makes too many mistakes these parameters will be modified. 2.3.2. Recording gameplay Game design is engineering of emotions. There are some studies of player’s emotional state: what happened before, during, and after play [23], [24], but in those studies player’s emotions were explained only with the name of the game, no further details. A scientific theory of something is based on facts, gathered from experiments. To build a theory of game’s “mechanical and predictable heart, built on the foundation of basic human psychology, (which) beats at the core of every single successful game”

232

J. Henno / Speciﬁcation of Games

[25] we need data, gathered by observing gamers and corresponding game states - the emotions, satisfaction, fun depending on state of the game. Gameplay is execution of a program. It consists of elementary actions (linear code) and logic, which ties these elementary actions together. In gameplay, the elementary actions (what player can do) are fixed by game rules. But logic, which ties these actions into gameplay, is created by player(s). Rules give only flow schema, decision blocks are from player. Games are programs with variable logic, ballet with fixed pa’s, but free choreography, tales, where character types and plot elements are known, but the player determines how the story unfolds. Gameplay is dynamic mapping from game states into game’s emotional and cognitive states: “game designer produces rules for interaction that, with the participation of the player, generate game states that induce emotions in the player” [26]. We need data about this mapping, information about what happened in game and caused this or another reaction, otherwise understanding of games is comparable with alchemy, the state of chemistry before discovery of the molecular weight: “…it was clear to the alchemists that "something" was generally being conserved in chemical processes, even in the most dramatic changes of physical state and appearance; that is, that substances contained some "principles" that could be hidden under many outer forms, and revealed by proper manipulation”[27]. Unfortunately we (yet) do not have a periodic table of games. Game feel could be studied recording gameplay: player’s actions, rewards, feelings; some experiments in this direction were described in [28]; a Game Play Analysis system is discussed in [29].

3. Game Specification framework After considering graphical and dynamic aspects of a new game should be described game(play). In the following is presented a (half-abstract) framework for stepwise game description and specification of a game. The system grow out from a game programming course [ 30 ] where students used popular game programming systems: Gamemaker [31], Microsoft XNA Game Studio [32], Pygame [33], the Flash programming language AS3, Panda3D [34] etc. The system was used to present new game idea before implementation, i.e. for game (informal) specification. Games are event-driven systems, presented in functional style, considering also game objects interactions, game visual and organizational structure. The main elements of specification are game objects, events/messages and object’s reactions to events - actions. Game logic is considered only on the level of interactions in gameplay – how events are dispatched and what actions they trigger. The formal level is not suitable for proving correctness of game-playing programs as in GDL [6], but allows considering information flows in game program and thus finding inconsistencies in game and its specification. Game description, specification and implementation cover following steps: • Informal presentation of the idea, using natural-language description and example images; explanation of game mechanics – how the “feel” is created, implementation (at least with elementary geometric figures) of key actions (Mario’s jump), what player can perform in gameplay, i.e. testing game actions and kinematics. • Game elements: objects, events, interaction diagrams and messages; they determine the hierarchical structure of game objects (see example below). • Explanation of data types and implementation of visual (2D sprites, 3D objects, animations) and audio data. • Selection of implementation environment, programming basic gameplay. • Play-testing and polishing – improving interactions, adding “visual sugar”.

J. Henno / Speciﬁcation of Games

233

4. Requirements for implementation environment The main purpose of described here formalism is to help to describe a game structure on high level, describing actions using pseudo code and not to deal with low-level programming/implementation issues. Created specifications are easy to implement without essential changes in specification’s structure, if the implementation environment satisfies certain (minimal) requirements. It should implement the basic game loop: Get player’s input, i.e. capture IO events (keyboard, mouse) and game-specific (i.e. defined in this particular game) events; Calculate the next state and its visual representation on screen – frame; Create the step (next frame) event and other (game-specific) events (i.e. check collisions), send information to players and create visual output. For video games the visual side is very essential, thus the environment should implement (at least) sprite graphics (drawing sprites on screen, scaling, rotating, flipping, alpha blending, color transform) and (desirably) have vector graphics. Graphic objects sprites and vector objects - should have coordinates, attributes to describe movement – direction, speed and collision detecting. Several popular game programming environments: Gamemaker, Flash (AS3), Panda3D, XNA Game Studio, Pygame etc satisfy (most) of these requirements, thus are suitable as implementation environments for games, specified using this language. New (open source) systems appear constantly and there is good competition in game development tools.

5. Game Specification Language The main categories of game specification are game objects (classes), events and object’s reactions to events (actions) [35]. Objects can be passive (non-playing, background objects), which do not receive events and active, which receive events and in response perform actions (reactions). Classes (both active and passive) have attributes; e.g. object coordinates on screen (game window). Actions of active objects change game state; this is projected to screen as frame-based animation. Game specification is a 6-typle of disjoint sets G =< Description, O , E , A, D , S >

Description – a brief description of the game in natural language (with images); it should communicate the general idea of the game, game experience. Games have already established rather consistent structure (opening screen, option selection menus, rooms, levels) and functions to navigate between rooms and menus, (auto)save, exit and general attributes (screen/window size, number of frames per sec fps). These are stored in data type S (Setup). O is the hierarchical structure of classes of game objects. All the other parts of specification may have rather complex structure and considered in following sections. 5.1. Game objects The set of classes O of game objects is hierarchically ordered, i.e. a new class can extend some already existing class or some class defined by implementation environment; notation O1 ⊆ O2 means that class O2 extends class O1 . Even simple casual games (puzzle, Tetris) with small number of objects may contain lot of structure, e.g. implementation of the Pong game consists of 250- 500 LOC

234

J. Henno / Speciﬁcation of Games

(Line Of Code) in C-language, Mario – ca 1000 and Age Of Empire - millions [36]. Student projects with Gamemaker (horizontal Mario-type scroller) have often hundreds of objects. Big games may contain large number of objects, e.g. the game Dungeon Siege (2002, developed by Gas Powered Games and published by Microsoft Game Studios) had 7500 object classes [37]. Objects can use objects or data structures from other objects and/or their attributes as arguments when new objects are created, e.g. object ship needs arrays of missiles, enemies and explosions and on firing passes them to missiles; missile seeks collisions with enemies and starts explosions (example 2 below). The top object is constant controller ∈ O - this is the game engine, i.e. the program, running the game. Common tasks for controller is bookkeeping: managing points, lives, rooms, generating objects (in next room) etc. But controller may have also other tasks, e.g. in considered earlier game “This is only one room” where meaning of input keys varies, but game object moves (more or less) using the same procedures (with parameters) controller can serve as a switch, which decides, what kind of movement commands send to objects depending on current game state. Structure of object hierarchy in a game depends on interactions of objects and often differs from the naïve, “natural” hierarchy. For instant, the “natural” hierarchy of objects in a shooting scene may look like this: Game Player

Enemy

Explosion

Gun Bullet Fig. 5. .”Natural” hierachy of objects in a shooting scene.

But shooting (bullets appear) is initiated by player, bullets should in every frame check collisions with enemy and in case of collision initiate an explosion, thus structure of game objects in the shooting scene is better to present like this: Player Bullet Enemy

Explosion

Fig. 6. Hierarchy of game objects resulting from game logic.

Objects can receive events from the set E , created by other objects. In response to an event they perform actions and/or create new events. In event-driven systems events are usually considered to be properties of the whole class, i.e. they are sent to all members of the class, not to some object of the class, but in specifications it is often very convenient to consider events which are sent to some distinguished object of the class and not to whole class. For instance, in turn-based game (Tic-Tac-Toe) event turn is sent only to one player and it does not have sense to send event next (next frame) to enemies, who are dead. This behavior is easy to implement using additional attributes in object specifications.

J. Henno / Speciﬁcation of Games

235

Classes of objects (both active and passive) have the set Attr (O) of properties (attributes), i.e. class variables (data fields). Attributes have a fixed domain of allowed values. All objects, which have visual representation on screen, have an attribute sprite - sprite of the object. Sprite can be one image or a sequence of several images and/or animations, e.g. a DirectX graphical 3D-object can have built-in animation walk . Game objects are very diverse, contain different types of data and are very dynamic – data and fields change often, during the gameplay may emerge new behaviors, i.e. data values may change. 5.2. Data types In games are used many different types of data – visual, audio, formal (vector) representations of visual structures (e.g. race-track), formal representations of animations etc. For instance, in video game series developed by Electronic Arts about American football [38] each player (object) has 448 attributes with different types of data as attribute values. To make game easier to maintain, enable re-use , speed up execution, data should be “taken out” from the main game loop (engine) and presented as separate (formal) structures. The simplest are data types provided by implementation environment (e.g. programming language: int, Double, String, Boolean etc). Used in games numeric values have often complex semantics, e.g. it is assumed, that implementation environment “understands” semantics of attributes speed (t ) , direction(t ) - if the object is visible at the moment t, then its graphical representation, sprite, should move at the moment t with corresponding speed in corresponding direction. Attribute values are abstract data types, there could be dependencies and operations defined on attribute domains (e.g. the negation operation on the domain of marks in example 1). A common example of such an abstract type is movement, which together with speed, direction has also elements speedX, speedY, i.e. speed projections on corresponding coordinate axes. Thus together with attributes speed(t), direction(t) object always has also attributes speedX, speedY. Object’s movement can be described using any of these, e.g. effect of bouncing a (ideal) ball from horizontal plane is natural to describe by ball.speedY *= -1; and movement of fragments after explosion is most natural to describe using direction angle and linear speed along the direction line (polar representation). If movement obeys laws of physics, then its movement data type should contain also functions for gravitation force, functions describing friction, rotation, elasticity etc. Usually physical laws are implemented as a separate physics package, in specification should be described the signature of this package (which variables it describes). Many data types describe visual representation of game objects. The simplest are two-dimensional (2D) sprites. Classically they are small (to enable quick redrawing on screen), e.g. 32x32 px images. For animated sprites or sprites which are used to create 3D-impression a sprite can contain series of images, e.g. a sprite can contain “snapshots” of a 3D-model, taken after small rotations along 360degree turn - this allows to create impression of a 3D object and is computationally much “cheaper” than real-time rendering of the 3D model.

Fig. 7. Some frames of a car sprite, used to create 3D car image.

Data types for animated 3D figures can be quite complex. Even a simple 3Dmodel, e.g. DuckPrince from standard figures of Alice [39] contains subobjects for all body

236

J. Henno / Speciﬁcation of Games

parts and has methods: move, turn, roll, resize, say, think, playSound, …,orientTo, turnToFace, pointAt, setPose, standUp; these methods have modifiers (parameters), e.g. method move has refinements moveTo, moveTowards, moveAwayFrom, moveWithSpeed etc.

DuckPrince RightLeg

Chest

LeftLeg

RightFoot

Head

LeftFoot

RightWing

Crown

Beak

LeftWing

Fig. 8. Subobject hierarchy of data structure DuckPrince.

A very common event is collision of objects. Result of collision depends on many factors – properties of colliding objects and methods used to compute collision. Exact computation of collisions and other effects caused by movements may be rather computation-heavy, therefore different methods to “cut corners” are used, e.g. instead of exact surfaces of objects are used some approximations: collision rectangles, collision spheres, collisions are tested only for some points, together with non-planar surfaces are stored also surface normals (in some points) – they are used for computation of bounces; normals are also used when shaders are applied to surfaces to obtain better 3D impression.

Fig. 9. Additional structures stored with visual objects – shadow (used for detecting plane’s position and missiles explosions on ground), collision polygon for spaceship (if the ship can be fired from every direction), surface normals for bounce direction detection.

In isometric (2.5D) games for collision calculation are used separate images – masks, which mimic objects shadow on the ground, e.g. car collisions with houses are calculated using the house’s mask – area on ground covered by house.

J. Henno / Speciﬁcation of Games

237

Fig. 10. Isometric house and its mask.

In on-line multiplayer games collision figure (circle or oval) is used to determine whom an avatar sees in the scene.

Fig 11 Avatar “sees” only agents who are inside his collision oval.

Discrete frame-based execution can create for objects with wrongly defined collision mask rather strange, non-natural effects. For instance, if collision mask for the pad is the whole pad rectangle, ball collision with the pad (bounce) reverses vertical component of ball speed and ball movement in one frame is less than pad’s height, then approaching the pad from a side ball is after the first collision still in pad’s collision rectangle, its vertical speed is again reversed and it starts a trembling dance along the pad until it gets free on the other end of the pad.

Fig. 12. Ball “Flutter” caused by erroneous collision rectangle.

5.3. Events and event listeners The objects receive and dispatches events e ∈ E . The set of events contains constant create ∈ E - this is the event, when an object (instance of a class) is added to game (in object-oriented systems this means executing the constructor function). Another common event is step - frame change, i.e. new frame. Sometimes it is convenient to divide the step event into several consecutive events: begin step, step, end step (e.g., this is used in Gamemaker). The reaction to step event can be e.g. change in object coordinates – this creates movement. Common events are IO-events, mouse and keyboard events: click, over, out, keypress and interaction, collision of objects – reaction to this event can make ball to bounce from a pad. The collision event is not symmetrical; it occurs only when an object (e.g. object bullet) executes testfunction collision(Other) and result is passed only to the object who initiated test. Sending an event e with attributes Attrs to object O is denoted by ⇒ (O, e, Attrs)

Game begins with event

238

J. Henno / Speciﬁcation of Games

⇒ (controller , create)

This is a unique event, create (for the object controller) can not occur later. This start event may also have attributes – e.g. difficulty of level, addresses of servers where the game will be hosted etc. To receive an event, object should be made event listener. In the following, the listener declarations are omitted. It is assumed that when an event e ∈ E is sent to some class O or to some specific object o ∈ O it has reaction (action) f (O, e, Attrs) . It is often convenient to allow objects to have states. Very common are states based on attribute visible – non-visible objects usually do not perform many actions, thus they also do not respond to several events, e.g. to animation event next. Structure of object states can be more complex, e.g. a fighter can have states Creeping, Shooting, Dead , a rocket can have states Flying, Landed, Fuelling etc. 5.4. Actions In response to events objects execute actions – perform computations using object attributes. A common action what objects e.g. controller can perform is creating and sending (dispatching) an event to another object, e.g. in the example drop(c,i) - dropping playmark with color c into i-th column of the board is an event which player sends to controller. Actions performed during gameplay change the game state and are described using ordinary procedural assignments. 6. Examples The following examples intend to illustrate the main ideas behind the proposed high-level specification method. In the examples variable identifiers begin with capital letter, constant (object) identifiers – lowercase latter, “_” stands for arbitrary value, prefix Arr denotes an array (actually set, the order and indexing are not used). Actions are described in a pseudocode; semantics (axioms) of abstract data types - using the Colog [40],[41] language. Colog (Algol in Logic) is an extension of predicate logic over manysorted propositional variables and predicates with procedural operators if_then_else and while. When domains are finite, this is actually only syntactic shorthand which allows more compact representation of functions and predicates; Colog has interpreters in several versions of Prolog, e.g. in Swi-Prolog [42], Eclipse Prolog [43] etc.. Many details are not specified but left for common sense and previous knowledge about specifications and programming. In “pure” object-oriented specifications objects, their attributes, events and actions are usually presented together, as one meta-object. But for games it is more convenient, easier to understand when presentation follows the flow of game – attributes, events, actions of objects are introduced in order how they are needed to explain gameplay. 6.1. Example1: ConnectFour Description: This is a game for two players with different playmarks (e.g. one has Red playmarks, another – Blue ones), who by turn drop one of their playmarks into one of N columns, where the marks fall straight down, occupying the next available space (“hole”) within the column. Each column contains M positions for marks; common values are N = 7, M=6. The player who succeeds to create a line of four of his playmarks vertically, horizontally, or diagonally wins. There are several nice implementations on Internet – [44], [45].

J. Henno / Speciﬁcation of Games

239

Fig. 13. The player on move wins.

6.1.1. Data types Signature of elementary data types consists of sorts Color = {r , b} - colors, the operator ¬ acts on the set of colors as negation, i.e. ¬ r = b, ¬ b = r . C - column. Signatures of (partial) operations are h : C → [0..M ] - height of the (filled part of) column drop : C × Color → C - dropping a playmark into the column look : C × i → Color - checking the i-th hole in column The following axioms describe semantics of signature operations; it is assumed, that all free variables col ∈ C , c ∈ Color are universally quantified. h(C.create) = 0 - height of the empty column is 0 h(col ) = i < M ⇒ h(drop(col , c)) = i + 1

⎧look (drop(col , c), i + 1) = c ⎫ h(col ) = i < M ⇒ ⎨ ⎬ ⎩look (drop(col , c), j ) = look (col , j ),1 ≤ j ≤ i ⎭ - semantics of dropping a playmark – checking the top-most hole reveals the dropped color, checking lower holes reveal the content as it was before dropping. h(col ) = i < M ⇒ drop(look (col , j ), c) = drop(col , c),1 ≤ j ≤ i drop(h(col )) = drop(col )

- checking a hole’s content or columns height does not change effect of dropping. With these signature operations can be described formula for derived function check _ line(column, c) - does dropping a playmark with color c into column c of the board b created a line of 4 consecutive playmarks of color c in a line. 6.1.2. Objects and attributes Object classes for this game are: {Column, Color , Player , Controller}

Active objects which receive events are: Controller , Player .

240

J. Henno / Speciﬁcation of Games

E (controller ) = {create, drop(c, i ),1 ≤ i ≤ N , turn, won, loose, draw} attr (controller ) = {board } attr ( Player ) = {c ∈ Color , b ∈ Board} controller.create = { for (i = 1..N )columni = Column.create(); board = U columni ; i

player1 = Player.create(c = r , b = board ); player 2 = Player.create(c = b, b = board ); ⇒ ( player1, turn)}

player.turn = {(∃j )(h(b.column j ) < M ) (b.column j = drop (b.column j , c ), ⇒ (controller , drop (b.column j , c ))}

This means, that on his turn player selects a column which is not yet full, drops his playmark into this column (updates his representation of the board) and sends the corresponding event to controller. controller.drop(b.column j , c) = {b.column j = drop(b.column j , c); if (check _ line(b.column j , c) ⇒ ( player (color = c), won) else{ if (∃j )(h(b.column j < M ) ⇒ ( player (color = ¬c), turn) else{ ⇒ ( player1, draw); ⇒ ( player 2," draw ") }}}

6.1.3. Complexity manager The presented above axiomatic representation of game actions allows to estimate difficulty of various variations of the game – how change of game board size M, N or target pattern (instead of four in a straight line some other pattern) influence game difficulty, e.g. average length of a play session (depth of game tree) if skills of both players are similar, i.e. they use the same search algorithm. In the following table are results of simulations, where both players used for next move the minimax algorithm with alphabeta pruning, the search depth was 4. For all considered settings the game was played 20 times, blue started; l is the average length of these 20 games. Table 1. Game difficulty measuring experiment.

Blue won Red won Draw

N=6,M=7 l % 28.5 40 35.6 35 42 25

N=7. M=8 l % 41.7 45 39 50 56 5

N=8. M=10 l % 36.5 60 34.7 40 0 0

N=10,M=12 l % 35 44 44.2 33 20.8 23

J. Henno / Speciﬁcation of Games

241

The first player will with optimal strategy always win [ 46], [47], but alpha-beta with limited search depth was not able to detect this. Human players are not able to implement optimal strategy (perform search with depth > 4), the sub-optimal strategy which was used to obtain results in the above table corresponds to real situation. The above table shows, that difficulty of gameplay (length of game) does not increase dramatically, but the percentage of draws decreases, thus N,M could be increased to make game more interesting. These results allow creating difficulty regulating sub-system – when player’s results are close or better than presented in above table, the game difficulty, i.e. values of parameters N, M should be increased in order to keep up tension. 6.2. Example 2: Star Track Battle

Fig. 14. .Star Track Battle.

Description: on the background of moving planets a player’s battleship is fighting enemies, who constantly enter the screen from right. Battleship can move only vertically and shoot missiles in direction of enemies. Enemies do not shoot, but if they collide with battleship the ship gets damages (its health decreases). Player wins, if he can shoot down all enemies, but if ships health becomes zero, player looses. 6.2.1. Graphic data (sprites) with collision figures Since ships and enemies see and fire each other only from one direction, their collisions (for ship with enemies, for enemies – with ship’s bullets) can occur also only from one direction. Thus collisions of ship and enemies can be calculated only from rectangles covering their “head” (in direction of their movement); missiles collisions are calculated only for a single point in missile’s head.

Fig. 15. Ship and enemy collisions are calculated from triangles, missile collisions – from a single point.

Explosions are small animations which themselves quickly vanish (become invisible):

Fig. 16. Different stadiums of explosion animation.

The following specification demonstrates safe use of memory. It uses the ship and fixed number of all other objects - missiles, enemies, explosions. Enemies fly over the screen from right to left and if they remain alive, they return to right side of the screen

242

J. Henno / Speciﬁcation of Games

and re-appear; if they are hit, they become invisible and are removed from screen (but not from game memory). Missiles become active (visible, moving) when they are fired; when they hit an enemy or leave the screen area they become invisible and do not receive event step , which makes them move. Explosions are small animations (growing and getting darker) and behave the same way – they become active, visible and are shown only when needed; when they finish playing they become invisible and do not receive step events. This is a safe method of memory handling. If on firing a new missile objects were created and lost missiles where removed from game then the game program should constantly interact with operating system and memory leaks become very possible. The Game = < {controller , ship, Star ⊂ Enemy, Missile, Explosion}, {E ( ship) = {create, step, move, shoot , collision( Enemy )}, E ( Enemy ) = {create, step, damaged }, E ( Missile) = {activate, step, collision( Enemy )} >

Stars recognize only the event step, which they use for their movement; on creation they get random position, size and horizontal speed: Attr ( Star ) = {x, y, size, speed x } E ( Star ) = {create, step} f ( Star , create) = { x = random(0, screen.width ); y = random (0, screen.height ); size = random ( sizemin , sizemax ); speed x = − random( speed min , speed max )}

When star moves over screen edge on the left, it reappears from right:

f (Star , step) = {x + = speed x ; if ( x < 0) x = screen.width} Explosions (there is all the time a fixed number of explosions) know only their position: Attr ( Explosion) = {x, y, visible}

On creation explosions are hidden (not visible); when an explosion is needed, the array of explosions is searched for a hidden explosion, which gets message Start ( x, y ) with coordinates indicating where it should appear; explosions have built-in animation, which automatically makes them first bigger and then to disappear, so that they do not need other events: f ( Explosion, Start ( x, y )) = { Explosion.x = x; Explosion. y = y; Explosion.visible = true; Explosion. play ();}

Missiles (rockets) know only their position, speed (horizontal) and weather they are currently fired, i.e. visible on screen or not: Attr ( Missile) = {x, y, vx , visible}

On every step missiles (which are currently fired, i.e. visible) check, whether they have a collision with some enemy ship. If they have, they activate an explosion and

J. Henno / Speciﬁcation of Games

243

vanish themselves; if missile did not hit an enemy, but fled out from screen, it also becomes invisible: f ( Missile, step) = { if (visible = true){ ( for (eny ∈ Arrenemies , eny.active = true){ if (collision(eny )){ eny.active = false; eny.visible = false; visible = false; (∃expl ∋ Arrexplosions , expl.visible = false) ⇒ (expl , Start ( x, y ))} else if ( x > Screen.width) visible = false} }}

Controller manages the ship and arrays of stars, enemies, explosions and missiles: Attr (controller ) = {ship, Arrstars , Arrenemies , Arrexplosions , Arrmissiles }

f (controller , create) = { for (i = 1,STARS)( star = Stars.create, Arrstars . push( star )) for (i = 1, EXPLOSIONS)(expl = Explosion.create(visible = false), Arrexplosions . push(expl )) for (i = 1, ENEMIES)(eny = Enemy.create, Arrenemies . push(eny )) for (i = 1, MISSILES)(m = Missile.create(visible = false, speed x = 6), Arrmissiles . push(m))}

Ship knows only its vertical position, vertical speed and whether it is shooting: Attr ( ship ) = { y, v y , shooting}

Ship recognizes three events (keys) - keyup - start moving upwards, keydown - start moving down and keyshoot - search a non-active missile (which is currently not flying towards enemy) and make it activated (visible) – the missile starts automatically fly to right: f ( Ship, keyup ) = {if ( y > 0) y − −}

f ( Ship, keydown ) = {if ( y < screen.height ) y + +} f ( Ship, keyshoot ) = { (∃m ∈ Arrmissiles , m.visible = false){ m.x = ship.x; m. y = ship. y; m.visible = true}}

Movement of enemies is similar to movement of stars – constantly over the screen, therefore the class Enemy extends the class Star and does not need any additional properties concerning movement.

244

J. Henno / Speciﬁcation of Games

Attr ( Enemy ) = Attr ( Star ) ∪ {visible, active} f ( Enamy, create) = {visible = true, active = true} f ( Enemy, damaged ) = {visible = false, active = false}

The last specification for f ( Missile, step) says, that if missile has been fired (is visible) and if it collides with active enemy, then enemy becomes non-active, vanishes and the array of explosions is searched for currently “non-playing” (i.e. not visible) explosion, which is shown in the place of collision. Total number of explosions (length of the array of explosions) is set higher than the number of missiles, thus for every missile (if it hits an enemy) there is always a non-playing explosion, which can be used. 6.2.2. Complexity Manager Difficulty of this game depends on speed and number of enemies, dexterity of the ship (how quickly it can change its direction and how quickly it can move) and amount and speed of ship’s missiles – player may run out of missiles (for a time) when ship constantly fires. The complexity-regulating function should use variables, which are easily available, so that complexity regulation would not burden the game engine. Here could be used pe - probability of enemy ship to activate (start flying) in frame, ve - average speed of activated enemy ship; the product pe ve length( Arrenemies ) measures “mass” of approaching enemies. Player’s resources are v y - ship vertical speed and length( Arrmissiles ) − #(missile.visible) . When length( Arrmissiles ) is constant (changing game memory structures at run-time is rather dangerous), the latter depends both on player’s activity (how intense is shooting) and bullet’s speed – bullets become again useable when they exit screen. Player’s current achievement can be measured with its ship damage and the average number #(enemies.x < 0) of enemies, which passed (fled out from left) per frame. All these quantities are easy to record from frame events. Thus the gameplay may be kept exciting with following adjustments: if (#(enemies.x < 0) / length ( Arrenemies ) + ship.damage > 0.7) {ve − −; Missile.vx + +} if (#(enemies.x < 0) / length ( Arrenemies ) + ship.damage < 0.3) {ve + +; Missile.vx − −}

7. Conclusions “Creating games is much greater fun than playing them!” – a statement from a Game Programming Course [48] participant. Creation/programming of novel games, which do not repeat pre-used commercial models brings us back to the “era of great discoveries” – men like Konrad Zuse [49], John Backus [50], Grace Murray Hopper [51], Donald Knuth [52] and many others. From their programs emerged theories and technologies which nowadays are the theoretical base of Information Technology: formal grammars, compilers, functional programming. Game creation is still an art, comparable to alchemy, but hopefully we will sometimes also have a TeX–like language/system for describing games. Games are very different from “classical” software systems (e.g. Information Systems) and classical methods for specification do not work here. Most books about game programming are based on implementation environment (C, C++, C# etc) and instead of discussing games are actually teaching these concrete software environments.

J. Henno / Speciﬁcation of Games

245

Here is presented a high-level specification framework (language) for of games as event-driven object-oriented systems. This framework allows describing games without using low-level details of programming languages, supports separate representation of game data types and reuse of game objects and assets. The language allows specification of game’s object structure; conversion of specifications to working programs using suitable game programming environment is rather straightforward. Together with a game is considered also sub-system for manipulating and varying game’s difficulty, which allows creating game levels with different difficulties. References [1] [2] [3] [4] [5]

Mark Prensky. Digital Natives, Digital Immigrants, On-line: http://www.marcprensky.com/writing/ (14.01.2009) Alexander R. Galloway, Gaming: Essays on Algorithmic Culture. Univ Of Minnesota Press; 1 edition (May 27, 2006) ISBN-10: 0816648514 How Much Information? 2009 Report on American Consumers. http://hmi.ucsd.edu/pdf/HMI_2009_ConsumerReport_Dec9_2009.pdf James Paul Gee. Why Game Studies Now? Video Games: A New Art Form. Games and Culture Vol 1 Number 1, January 2006 pp 58-61 General Game Playing: Game Description Language Specifica tion. http://games.stanford.edu/language/spec/gdl_spec_2008_03.pdf

[6] [7]

http://games.stanford.edu/language/language.html T. Hinrichs. Automatically Proving Playability through Abstraction. http://people.cs.uchicago.edu/~thinrich/papers/playability.pdf

[8] [9]

Raph Koster (2004) Theory of Fun for Game Design. Paraglyph Press, ISBN 10: 1-932111-97-2 Raph Koster. Theory of Fun for Games. Keynote speech at Game Developer s Conference 2005, http://kotaku.com/tag/game-developers-conference/ (14.01.2009)

[10] S. Swink. Game Feel. A Game Designer's Guide to Virtual Sensation. Focal Press 2008 [11] S. Swink. Game Feel: The Secret Ingredient. http://www.gamasutra.com/view/feature/2322/game_feel_the_secret_ingredient.php [12] http://www.popcap.com/ [13] Braid. http://braid-game.com/ [14] http://www.ghostbustersgame.com/ [15] Learn to Let Go: How Success Killed Duke Nukem. http://www.wired.com/magazine/2009/12/fail_duke_nukem/ (15.01.2010) [16] Ernest Adams and Andrew Rollings (2003). Andrew Rollings and Ernest Adams on Game Design, New Riders 2003, ISBN-10: 1592730019 [17] http://www.goodexperience.com/ [18] http://www.kongregate.com/games/ArmorGames/this-is-the-only-level [19]

http://www.useit.com/papers/heuristic/heuristic_list.html

[20] The Making of Mario 64. http://www.miyamotoshrine.com/kong/features/mario64/index.shtml [21] Raph Koster (2004) Theory of Fun for Game Design. Paraglyph Press, ISBN 10: 1-932111-97-2 [22] Machinarium. http://machinarium.net/ [23] J. Asensio, M. Jiménez, S. Fernández, D. Borrajo. A Social and Emotional Model for Obtaining Believable Emergent Behaviors. LNCS 5253 (2008), pp 395-399 [24] Amyris Fernandez. Fun Experience with Digital Games: a Model Proposition. Interact 2007, Sept 1014, Brazil. http://www.fun-of-use.org/interact2007 [25] Daniel Cook. The Chemistry Of Game Design. http://www.gamasutra.com/view/feature/1524/the_chemistry_of_game_design.php (14.01.2009) [26] S.Bura. Emotion Engineering in Videogames. http://www.stephanebura.com/emotion/ (12.01.2009)

246

J. Henno / Speciﬁcation of Games

[27] D. Cook. The Chemistry Of Game Design. http://www.gamasutra.com/view/feature/1524/the_chemistry_of_game_design.php [28] Henno, J. (2010). On Structure of Games. T. Welzer Družovec, H. Jaakkola, Y. Kiyoki, T. Tokuda, N. Yoshida (Toim.). Information Modelling and Knowledge Bases XXI (344 - 361). Amsterdam; Berlin;T okyo; Washington, DC: IOS Press [29] R. Appelman. Experiential Modes of Game Play. http://www.digra.org/dl/db/07311.16497.pdf [30] J. Henno. Game and Virtual Environment Programming (in Estonian). Tallinn 2009, ISBN 978-998559-889-4, 1-206 [31] http://www.yoyogames.com/make (14.04.2009) [32] http://creators.xna.com/en-US/news/xnagamestudio3.1 [33] http://www.pygame.org (14.04.2009) [34] http://www.panda3d.org/ (14.04.2009) [35] Henno, J. High-Level Specification of Games. Studies in Computational Intelligence, Springer Berlin / Heidelberg, 2009, pp 307-322 [36] C Programming Reference. http://www.cprogrammingreference.com/Tutorials/Games_Programming/PingPong.php [37]

Game Object Systems. http://chrishecker.com/Game_Object_Systems

[38] http://www.easports.com/ [39] 3D Programming Environment Alice. http://www.alice.org/ [40] Levesque, H., Reiter, R., Lesp´erance, Y., Lin, F. and Scherl, R.. GOLOG: A logic programming language for dynamic domains. Journal of Logic Programming, 31(1–3):59–83, 1997. [41] G. de Giacomo, Y. Lesp´erance, H. J. Levesque. ConGolog, a concurrent programming language based on the situation calculus. Artificial Intelligence, 121(1-2):109–169, 2000. [42]

http://www.cs.toronto.edu/cogrobo/Systems/golog_swi.pl

[43]

http://www.cs.toronto.edu/cogrobo/kia/gologinterpreter

[44] 4 in a Row. http://www.fierz.ch/4inarow.htm [45] Mustrum. http://www.lbremer.de/mustrum_e.html [46] Allis, V. "A Knowledge-Based Approach of Connect-Four--The Game Is Solved: White Wins." Report IR-163 by the Faculty of Mathematics and Computer Science at the Vrije Universiteit Amsterdam, The Netherlands. 1988 [47] Allis, V. "A Knowledge-Based Approach of Connect-Four." http://www.connectfour.net/Files/connect4.pdf [48] Game and Virtual Worlds programming. http://www.ttu.ee/users/jaak [49] Konrad Zuse. http://www.idsia.ch/~juergen/zuse.html [50] http://en.wikipedia.org/wiki/Fortran [51] http://cs-www.cs.yale.edu/homes/tap/Files/hopper-story.html [52]

http://www-cs-faculty.stanford.edu/~uno/

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-247

247

Bridging Topics for Story Generation Makoto SATO a,1 , Mina AKAISHI a and Koichi HORI a of Aeronautics and Astronautics, The University of Tokyo, Japan

a Department

Abstract. This paper introduces a method for bridging topics designed to facilitate generating stories over documents. First, we present a method for topic extraction based on narrative structure with k-means algorithm. We then model the story generation process and present a method for ﬁnding a bridge document between two documents. Keywords. story generation, topic bridging, topic extraction

Introduction In many business areas dealing with document databases, identifying topics and analyzing their dynamics is important. For example, grasping the transitions of the causes of accidents or discovering the possibility of an accident from reports of incidents is desired. A good way to communicate information is by accompanying narratives with context. However, the context of information when it is accumulated is different from the context when it is used. Generating a story in a new context from an enormous number of documents is difﬁcult. Thus, we proposed an information access framework [1,2] that supports the decomposition/recomposition of documents by enabling the manipulation of their narrative structures. In this framework, the system classiﬁes patterns of topic transitions and suggests sequence of scenes for generating a story. With this strategy, narrowing candidates of scenes is difﬁcult because the criterial parameter is quite simple. In this paper, we present a new strategy for generating a story, bridging topics strategy. Bridging topics strategy aims to ﬁnd a gap between two topics and ﬁnd documents related to the gap. For example, the background of our study is “dealing with text databases” and the objective of it is “communicating information with its context”. Our approach to it is “developing a framework for story generation”. When someone uses this idea, alternative ideas should be considered. In such a case, we want to ﬁnd external information bridging a gap between the topic “dealing with text databases” and the topic “communicating information with its context”. Figure 1 shows the overview of the process of bridging topics. At ﬁrst, we prepare two documents. One is the source document of the bridge and another is the target of the bridge. We deﬁne a matrix that features its content for each document based on our previous work reviewed in the section 1. Then, we extract multiple topics from the 1 Corresponding Author: Makoto SATO, Department of Aeronautics and Astronautics, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan; E-mail: [email protected].

248

M. Sato et al. / Bridging Topics for Story Generation

documents with a method based on K-means algorithm introduced in the section 2. We model the process of story generation and ﬁnd a gap bridging topics that is represented as a matrix with a method introduced in the section 3. In the section 4, we will illustrate an example of bridging topics from “dealing with text databases” to “communicating information”.

($%$ *%$ "$'$ $($+

($%$ *'$$($ $##+

,$%" $"(

)$"( -(

)$"( -(

$$"$&## $"(-(#

$$"$&## $"( -(#

-

-

$"#$ " "$# ##$) $# "#

$ # "$$ $) )# %$#

-

-

""$&# %$ ) "$

#$") %" "$ "%# %$#

($"$$ #

"$ # "" )$"( !(

($"$$ #

$$"$&##$"( !(!

!

!

!

%$# "$ #$") $($ ,%$

"$ #$") %$ ""$&# $($

"$ ""$&# %$ $($ %%$

"$%$ $#

#$"$#

Figure 1. Overview of the process of bridging topics from “dealing with text databases” to “communicating information”.

1. Narrative Navigator Narrative Navigator (NANA) is an information access framework based on the narrative structure model. NANA supports the generation of new stories in context when it is used

M. Sato et al. / Bridging Topics for Story Generation

249

in the decomposition and composition of documents. In order to support generating new stories, patterns of transitions of term popularity in scenes are classiﬁed and sequences of scenes are suggested. 1.1. Narrative Structure Model The narrative structure model is a model for the hierarchical structure of narrative elements. The elements of documents are mapped to the narrative components listed in Table 1. A set of terms in a sentence is mapped as an event. A chunk of events is regarded as a scene, and a sequence of scenes is regarded as a story. A world model is regarded as a set of documents. NANA gives suggestions for decomposing stories and composing scenes by using vocabulary chain graphs. Two notions, term dependency and term attractiveness, deﬁne a vocabulary chain graph. Table 1. Mapping of Narrative Components to Text Elements Narrative component

Text element

world model

set of stories

story

sequence of scenes (documents)

scene

chunk of events

event

set of terms (sentence)

character

term

1.2. Term Dependency and Term Attractiveness Term dependency and term attractiveness are the basis of NANA. The dependency of term ti on term t j in scene s, ds (ti , t j ), is given by conditional probability as follows: ds (ti , t j ) =

sentencess (ti , t j ) , sentencess (ti )

(1)

where sentencess (ti ) is the number of sentences that contain term ti in scene s and sentencess (ti , t j ) is the number of sentences that contain both term ti and term t j in scene s. The attractiveness of term t j in scene s, as (t j ), is the sum of the dependency of term ti on term t j over all terms ti : ds (ti , t j ), (2) as (t j ) = ti ⊂T

where T is the set of all terms in the document. 1.3. Term Context-dependent Attractiveness In addition, we deﬁned term context-dependent attractiveness as an extension from term dependency and attractiveness [9]. A term depended on by terms that have high attractiveness in a scene has more attractiveness in the next scene than a term depended on by terms that have low attractiveness.

250

M. Sato et al. / Bridging Topics for Story Generation

The term context-dependent attractiveness cτ (t j ) of a term t j in a scene at time τ is the sum of the products of term context-dependent attractiveness cτ −1 in the previous scene at time τ − 1 and term dependency ds (ti , t j )in the present scene in a scene s over all the terms ti : cτ (t j ) = cτ −1 (ti )ds (ti , t j ). (3) ti ⊂T

2. Extracting Multiple Topics We present a new method that discovers a set of topics expressed by documents, providing quantitative measures that can be used to identify the content of those documents. This method takes as input the number of topics to generate and the term dependencies of the document. It returns the importance of each topics and the term popularity of the term in a topic. The output of our approach is similar to topic models [10]. Topic models are based upon the idea that documents are mixtures of topics, where a topic is a probability distribution over words. While a topic model is a generative model for documents, our approach focuses on the superﬁcial information as term count or term co-occurrence. 2.1. Features of Documents Vectors characterize documents and terms. In terms of vector space model, a topic is represented as a vector. When the component of a vector correspond to term dependency, let d(t j ) be the term dependency vector of term t j : d(ti ) = [d(ti , t1 ) . . . d(ti , t N )] .

(4)

This vector represents the distribution that indicates how much is the term depended by other terms. 2.2. K-means Algorithm The vectors are clustered with K-means algorithm [4]. K-means clustering is a simple method of cluster analysis which aims to partition N observations into K clusters in which each observation belongs to the cluster with the nearest mean. Given a set of term dependency vectors (d(t1 ), . . . , d(t N )), where each observation is a dimensional real vector, then k-means clustering aims to partition the n observations into K sets (K < N ) S = S1 , . . . , S K so as to minimize the within-cluster sum of squares: arg min

K d(t j ) − μi 2 ,

(5)

i=1 d j ⊂Si

where is the centroid, or the mean of Si . The centroid is representative word of the cluster. ||d(t j )−μi ||2 is the square of Euclidean norm of the difference between the term dependency vector and the centroid vector. After clustering, the relations among terms are visualized as Figure 2 with WordColony [2].

M. Sato et al. / Bridging Topics for Story Generation

251

Figure 2. Visualized clustered term netrwork. The node corresponds to the term, the edge direction corresponds to the difference of term dependencies, and the color corresponds to the cluster.

2.3. Term Topical Attractiveness and Topic Attractiveness We deﬁne term topical attractiveness and topic attractiveness. Term topical attractiveness a (k) (t j ) is the popularity of term t j in a topic k. It is the sum of the dependency of terms in a topic. The term topical attractiveness vector a(k) which component is the term topical attractiveness is as follows: a(k) =

d(t j ).

(6)

d(t j )⊂Sk

Topic popularity p (k) indicates the impotence of the topic k. It is deﬁned as 1-norm of the term topical attractiveness vector: p (k) = ||a(k) ||1 .

(7)

Considering multiple topics, term topical context-dependent attractiveness, the extension of term context-dependent attractiveness explained in the section 2.3, is deﬁned as follows: cτ (t j )(k) =

ti ⊂T

(k)

cτ −1 (ti )ds (ti , t j ).

(8)

Term topical context-dependent attractiveness is a term weighting method taking into account context dependency for each topic. When the number of topic is one (K = 1), it is the same as term context-dependent attractiveness.

252

M. Sato et al. / Bridging Topics for Story Generation

3. Bridging Topics Our goal is to ﬁnd a scene as external information when we have two output scenes, Sτ −1 and Sτ . We regard the story generation process as follows: Scenes are outputted as an expression of the topics (showed in the ﬁgure 3. Topics are changed by external information. In practice, the topics of the output scenes are characterized by term topical attractiveness and topic attractiveness and changes of topics are analyzed by contextdependent model.

Figure 3. Overview of the process of bridging topics from “dealing with text databases” to “communicating information”.

At ﬁrst, multiple topics are extracted from the scene Sτ −1 and scene Sτ , with the method explained in the chapter 2. Term topical attractiveness and topic popularity of the two scenes are calculated. Term topical attractiveness in scene Sτ −1 is regarded as term topical context-dependent attractiveness in the scene before changed and term attractiveness in scene Sτ is regarded as term topical context-dependent attractiveness in the scene after changed. Then, we solve inverse problem of term context-dependent attractiveness corresponding to the equation (8). Let D be the co-occurrence dependency matrix: D = [d1 . . . d N ]

(9)

And let cτ(k) be the term topical context-dependent attractiveness vector in topic k and let Cτ be the term topical context-dependent attractiveness matrix: (10) cτ(k) = cτ(k) (t1 ) . . . cτ(k) (t N ) (K ) . . . c Cτ = c(1) τ τ

(11)

The relationship between term topical context-dependent attractiveness and term dependency in the equation (8) is simply represented as Cτ = Dq Cτ −1 .

(12)

M. Sato et al. / Bridging Topics for Story Generation

253

That is, the term dependency matrix Dq of the document we want to ﬁnd should satisfy this condition. This equation (12) usually has many solutions because it is underdetermined for its sparseness. So we use the pseudoinverse, a generalization of the inverse matrix, for getting one solution. Dq = Cτ C+ τ −1 .

(13)

where C+ τ −1 is the psudoinverse of Cτ −1 . The psudoinvese gives the "least-squares" answer. The number of paths is factorial of K. Finally, we ﬁnd the targeted information with similarity searching based on cosine similarity between term topical attractiveness vector extracted from term dependency matrix Dq as query and term topical attractiveness vector in the targeted information database. Topics of each document in the database are extracted in advance.

4. Example of Bridging Topics We implemented the bridging topic strategy into NANA and generated a story about “information access”. Here, we focused on our study described in the Introduction. The background of our study is “dealing with text databases” and the objective of it is “communicating information with its context”. Our approach to it is “developing a framework for story generation”. When someone uses this idea, alternative ideas should be considered. Thus, we tried to ﬁnd external information bridging a gap between the topic “dealing with text databases” and the topic “communicating information with its context”. We analyzed a subset of abstracts of articles from Computer, IEEE (1970-2008) 2 , the magazine covers all aspects of computer science, as external information. The overview of the process is showed in the ﬁgure 1. We prepared two texts about “dealing with text databases” and “communicating information with its context”. Then, topics were extracted topics from the texts by the method introduced in the chapter 2. We set the number of topics two (K s = 2). Table 2 shows the top ﬁve attractive terms in a topic for each text. For example, “transition” is (1) the most attractive term in the topic cτ in the text about “dealing with text databases”. Topic of the text are chracterized by the attractive terms. We bridged a gap between the two text by the method introduced in the chpater 3 and we got a dependency matrix that is inferred to feature a document bridging topics. Then, we extracted topics from the inferred dependency matrix. We set the number of topics is three (K q = 3). We searhed for the candidate document for bridging topics in the target text database, a subset of abstracts from Computer, IEEE, based on the value of inner product between the vector that represents topic of the document bridging topics and the vector that represents topic of the document in the targeted database. Topics of each text in the targeted text database were extracted in advance. We set the number of topics three for the text in the database. Table 3 shows the top ﬁve attractive terms for each topic and the candidate titles of articles. For example, “documents”, “generating”, “story”, “context” and “difﬁcult” are the top ﬁve popular terms in the topic cq(1) . Based on similarity searching with the term topical attractive vector that features the topic cq(1) , the article “Building 2 http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=2

254

M. Sato et al. / Bridging Topics for Story Generation

Knowledge: What’s beyond Keyword Search?” is the most suitable candidate document. These results said that there are many alternatives for the story about “information access” from “dealing with text databases” to “communicating information with its context” to “developing a framework for story generation”. Table 2. Top ﬁve topical attractive terms of (a) the text about “dealing with text databases” and (b) the text about “communicating information with its context”. The number of topics is two (K s = 2) for each text. (a) (b) (1) (2) (1) (2) Topic cτ −1 Topic cτ −1 Topic cτ Topic cτ transition

topics

narratives

story

reports

important

possibility incidents

identifying dynamics

good communicate

number generating

grasping

document

accompanying information

enormous documents

Table 3. Top ﬁve attractive terms for each topic and the candidate titles of articles. The number of topics is three (K q = 3) Topic

(1) Topic cq

Terms

Building Knowledge: What’s beyond Keyword Search?

generating

New Applications & Recent Research

story context

A software infrastructure for authenticated Web metering Genetic search based on multiple mutations

difﬁcult

Extending telecommunications systems: the feature-interaction problem

information (2)

Topic cq

story communicate narratives context information

(3)

Topic cq

Title of article

documents

narratives

Internal Accounting Controls in the Ofﬁce of the Future Microsystems Opinion: Critique of the F8 Microprocessor Search in vain, challenges for Internet search At a crossroads: Connectivity model determines distribution of power Automatic indexing and content-based retrieval of captioned images Toward a PeopleWeb Array Processor Architecture

communicate

Building community information systems: the Connected Kids case

context accumulated

At a crossroads: Connectivity model determines distribution of power Automatic indexing and content-based retrieval of captioned images

5. Sample Application We tried to ﬁnd the bridge topics from "Aeronautics" to "global warming", and from "global warming" to "Aeronautics" and generate stories from the viewpoint of computer technology. In the area of Aeronautics, the design of body frame or engine of airplane for efﬁcient aviation are well focused in order to address changes to the Earth’s climate. However, there may be other ways and they should be consider various viewpoints.

M. Sato et al. / Bridging Topics for Story Generation

255

The explanations texts of "Aeronautics" and "global warming" are referred to Wikipedia 3 . Common words, such as "a", "is" or "the", are treated as stop words. We analyzed two subsets. One is the subset of abstracts of articles from Proceedings of the IEEE 4 (2007–2009), which is a peer-reviewed scientiﬁc journal published by the Institute of Electrical and Electronics Engineers. Another is the subset of abstracts of articles from Communications of the ACM 5 (2007-2009), which is the ﬂagship monthly journal of the Association for Computing Machinery. First, the topics from two texts are extracted. Then, the characterizations of the bridge between the topics are estimated as the term dependency matrix. Finally, the candidate documents are accessed with the term dependency matrix as query. The number of source and target topics is set to three (K s = 3). The number of bridge query topic is set to three (K q = 3). The candidate paths of stories are visualized in the ﬁgure 4. This shows one of the story paths. The lexical chains enables to explain the cohesion of the scenes. For example, the title of one of the bridge candidatess is “Scanning the Issue: Special Issue on Aviation Information Systems.” The abstract of this article is “This special issue focuses on three signiﬁcant technologies for restructuring air trafﬁc management: bounded-error navigation, aviation communications networks, and automated algorithms to increase air trafﬁc capacity”[11]. This article suggests that the trafﬁc management involves the atmosphere of the earth. The content of this article doesn’t have relations with “global warming” directly but helps to consider the relationship. Additionally, we searched with the top ten popular terms as the query on the web. Table 4 shows the titles of the search results with Google 6 . Our method is helpful for suggesting the terms that connect between two topics. Compared to Proceedings of the IEEE and Communications of the ACM, the results covers many topics. Selection of the target database depends on the intended use. Table 4. Search results of Google with queries, “aeronautics,” “atmosphere,” “seamanship,” “protocol,” “caused,” “global,” “design,” “machines,” “techniques” and “capable.” 1 2

Complete Annual Report, 06-07 - Cooperative Institute for Research ... Global Warming: March 2009

3 4

Forensic expert, Expert Witness Referral to Medical and Technical ... HIGH SPEED CRAFT HUMAN FACTORS ENGINEERING DESIGN GUIDE

5

ADASS2009:PROGRAM

6. Related Works There are a lot of researches about the story generation in the ﬁeld of educational applications or entertainment applications [3,5]. One approach is to generate stories dy3 http://wikipedia.org/ 4 http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=5 5 http://cacm.acm.org/magazines 6 http://www.google.com

256

M. Sato et al. / Bridging Topics for Story Generation

7/*-+,753+754':9/)895-25('2 <'73/4-/8;/'9/544,573'9/54 #>89+3 #5:7)+$56/) $./886+)/'2/88:+,5):8+8549.7++8/-4/?)'49 9+).4525-/+8,577+897:)9:7/4-'/797',?) 3'4'-+3+49(5:4*+* +77574';/-'9/54 ';/'9/54)533:4/)'9/5484+9<5718'4* ':953'9+*'2-57/9.3895/4)7+'8+'/797',?) )'6')/9>

+754':9/)8 +754':9/)8,7537++1MNHAB7 <./).3+'48'/7'4*GCJIEFN 4':9/1B<./).3+'484';/-'9/54 8+'3'48./6/ + 4';/-'9/545,9.+ '/7/89.+8)/+4)+/4;52;+*

7/*-+)'4*/*'9+8 +><57*8

533:4/)'9/5485,9.+

!75)++*/4-85,9.+

! ! " # $% & ' (

:953'9+*>6+786+)97'2 :+/4-,57/;/2/'4#+'7).'4* "+8):+ /7(574+533:4/)'9/54 +9<5718,57#3'22%43'44+* /7)7',9#>89+38 $8#97'9+->,57*:)'9/54'2 $+).4525->445;'9/54 =

#6+)/'288:+54'7-+ #)'2+ >4'3/)#>89+38 *;'4)+8/4'-4+9/)'9' #957'-+$+).4525-/+8 #)'44/4-9.+88:+#6+)/'2 88:+54;/'9/544,573'9/54 #>89+38

667+49/)+8./6+'74/4-,57 +2/)569+7549752 5<"!897:)9:7+8/98 75(59/)8675-7'3895/3675;+ 25)5359/54'4*4';/-'9/54 4!:(2/)#+7;/)+'4* 536:9+7#)/+4)+

4;/8/54/4-/49+22/-+49 /4,573'9/549+).4525-/+8 9.75:-.9.+67/835,<+( /49+22/-+4)+ 3'-+7/8/8486/7/4-'4+< -+4+7'9/545,)536:9+7 8)/+49/898 536:9/4-*:)'9/54'99+78

&+2+(7'9+8&53+4 /4536:9/47+:8/4+88+9.5*8 !'9+49'(2+ !++7956'9+494++*8>5:7 +=6+79/8+

$'7-+9$56/) -25('2<'73/425('2<'73/4-/89.+/4)7+'8+/4 9.+';+7'-+9+36+7'9:7+5,'79.8 4+'7 8:7,')+'/7'4*5)+'488/4)+ 9.+3/* 9.)+49:7>'4*/98 6750+)9+*)549/4:'9/54

Figure 4. Story paths from “Aeronautics” to “global warming”. One of the bridge candidates suggests a story via “Aviation Information System”.

namically or on a per-session bias [8]. The system can adapt narratives to the user’s preferences and abilities thorough interactions between the system and the users. Based on such interaction, we propose a system that can segment accumulated documents and generate new stories with connecting the suggested segments. Our methods are close to multi-document summarization techniques [7] because we use the characteristics extracted from texts. For example, [6] tried to express documents as graph structure and summarize similarities and differences. However, our objective is not to generate a new story complete automatically but to support the user’s ability to create a new story.

M. Sato et al. / Bridging Topics for Story Generation

257

7. Conclusion We have presented methods of extracting multiple topics from document and bridging topics for generating new stories. Topics are extracted based on narrative structure model and featured by term topical attractiveness. The method for bridging topics is by solving inverse problem of term context-dependent attractiveness. We showed the sample application that is for generating stories between "Aeronautics" and "global warming". The results we have presented use the simple strategy especially in contextdependency model. In future, we intend to extend this work by exploring more complex model and evaluating tools for generated stories. Moreover, the evaluation for the topic bridging is needed.

Acknowledgements This work was supported in part by the Global COE Program “Global Center of Excellence for Mechanical Systems Innovation,” MEXT, Japan.

References [1]

[2] [3] [4] [5]

[6] [7] [8] [9] [10] [11]

M. Akaishi, “A Dynamic Decomposition/Recomposition Framework for Documents based on Narrative Structure Model,” Transactions of the Japanese Society for Artiﬁcial Intelligence, Vol. 21, No. 5, pp. 428–438, 2006 (Japanese). M. Akaishi, Y. Kato, K. Satoh and K. Hori, “Narrative based Topic Visualization for Chronological Data,” 11th International Conference Information Visualization (IV ’07), pp. 139–144, 2007. M. Cavazza, F. Charles, and S. Mead, “Character-based interactive storytelling,” IEEE Intelligent Systems, vol. 17, no. 4, pp. 17–24, 2002. C. Ding and X. He. “K-means Clustering via Principal Component Analysis,” Proc. of Int’l Conf. Machine Learning (ICML 2004), pp 225-232. July 2004. A. Gordon, et al., “Branching storylines in virtual reality environments for leadership development,” in PROCEEDINGS OF THE NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, pp. 844–851, AAAI Press, 2004. I. Mani and E. Bloedorn, “Multi-document summarization by graph search and matching,” cmplg/9712004, Dec. 1997. I. Mani and M. Maybury, Advances in automatic text summarization. MIT Press, 1999. M. Riedl and R. Young, “From linear story generation to branching story graphs,” Computer Graphics and Applications, IEEE, vol. 26, no. 3, pp. 23–31, 2006. M. Sato, M. Akaihisi, K. Hori, “Analyzing Topic Transitions using Term Context-dependent Attractiveness,” EJC 2009, 2009. M. Steyvers and T. Grifﬁths, “Probabilistic topic models,” In T. Landauer, D McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Laurence Erlbaum, 2006. Bandic, J. and Tomlin, C., “Scanning the Issue: Special Issue on Aviation Information Systems,” Proceedings of the IEEE, Dec. 2008.

258

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-258

A Combined Image-Query Creation Method for Expressing User’s Intentions with Shape and Color Features in Multiple Digital Images a

Yasuhiro HAYASHI a Yasushi KIYOKI b and Xing CHEN c Graduate School of Media and Governance, Keio University, 5322 Endo, Fujisawa, Kanagawa, 252-8520, JAPAN b Faculty of Environmental Information, Keio University, 5322 Endo, Fujisawa, Kanagawa, 252-8520, JAPAN c Department of Information and Computer Sciences, Kanagawa Institute of Technology, 1030 Shimo-ogino, Atsugi, Kanagawa, 243-0292, JAPAN

Abstract. This paper presents a combined-image query creation method for expressing user’s intentions by combining multiple digital images for image retrieval. This method uses image databases provided for query-creation and performs several set-operators to express user’s imagination by combining user’s imaginary images and real scenes. The user’s intentions are expressed by the operation of subspace projection in the image feature space. This method makes it possible to create an imaginary image as the combined-image query for expressing user’s intentions by combining several images and operators in the query creation process. The important feature of this method is to use shape and color features for expressing imaginations by extending our previously proposed method. This paper shows several experimental results to clarify the feasibility and effectiveness of our method. Keywords. Image-Query Creation, User’s Intentions, Content-based Image Retrieval, Query-By-An Image, Multimedia Database

1. Introduction There is a rapid increase in the size of digital resources together with the fast growth of the media data volume in Internet. How to deal with “user’s intentions” is one of the important subjects in the multimedia database research field. In a typical scenario for the image retrieval, the users supply image-queries and the system is expected to return a set of similar digital images adapted to the “users’ intentions.” Significant researches have focused on determining efficient methodologies for the image retrieval in image databases. One approach is to implement image retrieval by attaching keywords or text to images. The systems, such as Flickr [2] and Google Images [3], adopt this approach. The other approach which is called Content-based Image Retrieval (CBIR) is implemented by extracting low-level visual features like color histogram, shapes and textures from the image data [4, 5, 6]. A lot of systems have been developed on this approach, ART MUSEUM [7] is one of the earliest systems and QBIC [8] is the first commercial retrieval engine. Other popular research systems are BlobWorld [9],

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

259

VIPER/GIFT [10], Photobook [11], VisualSEEk [12], Netra [13], SIMBA [14], SIMPLIcity [15], Virage [16], and RetrievalWare [17]. Although many research efforts for the image retrieval have been performed, it is difficult to retrieve the images corresponding to the “users’ intention” in a query in visual similarity. In order to solve this problem, several approaches exist. The MARS system introduced Relevance Feedback that was first used in text retrieval into CBIR in the 1990s [18, 19]. The key point of this technique is to learn the user’s intentions through analyses of positive examples and negative examples that are provided by interaction with the user. On the basis of this learning, the algorithm of the technique refines the weight of similarity calculation and returns the retrieval results corresponding to the user’s intentions. Additionally, a semantic image retrieval method based on the Mathematical Model of Meaning (MMM) has been proposed to judge the similarity of images on the semantic level [20]. This method extracts the semantic similarity factors from image groups. Moreover, several query methods have been proposed to exactly express the user’s intention. Query-by-An-Image is a typical query method to express the user’s intentions with a sample image, and also Query-by-Sketch is to express the user’s intentions by sketching features of the image which the user is eager to retrieve [21]. We have noticed that to directly express the user’s intentions hasn’t been established on conventional query methods. Especially, in the case of Query-by-AnImage, the user’s intentions are only reflected to a sample image. Consequently, we have proposed an image-query creation method for expressing the user’s intentions by combining multiple digital images for the image retrieval [1]. Difference between our method and the conventional researches is that our method applies the image databases to query creation for expressing the user’s intentions while the conventional researches efficiently retrieve the image databases.

2. An Combined Image-Query Creation Method for Expressing User’s Intentions with Shape and Color Features in Multiple Digital Images This section describes the image-query creation method for expressing user’s imaginations and intentions by combining several images and operators in the query creation process. Colors of user's interest in images and locations of colors related to shapes of the object are important elements in order to express the user's intensions. The image-query creation method is used to express the user’s intentions about colors and shapes of images by combining the multiple digital images, as shown in Figure 1. Features of this method and differences in comparing with our proposed method in [1] are summarized as follows: •

•

This method provides some images in the image databases and several setoperators on a query part of the image databases to express image-context as the intentions that is the user’s imagination process for matching of the user’s imaginary images and real scenes. This method creates the image-query expressing user’s intentions by combining the multiple digital images and the several operators. Concretely, the operators execute the color-based set-operations such as AND, OR, MINUS and NOT to the image features shown as M x N blocks.

260

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

Figure 1. An Overview of the Image-Query Creation Method

•

•

In this method, a user can express imaginations and intentions by combining colors and shapes of objects existing in multiple images. However, if the user cannot prepare images for the intention expression, the user cannot express the intentions by using this method. Combining images and operators based on image features by representative colors realizes similar processing of the image trimming in image processing by using transparent color, which is colorless. In the image-query, the blocks of the user’s interest are shown as chromatic colors and also others are shown

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

•

•

261

as the transparent color. Although the output of the image processing is the mere image against the input image, this method outputs information of the subspace projection on the image feature space as the image-query. The user's intentions of both the colors and the shapes of the objects in the image are expressed by adopting color histogram matrix for storing image features and color description matrix for storing color indexes of regions of the user's interest. In our previously proposed method [1], the user’s intentions about the colors and the shapes of the objects in the image are independently expressed as one dimension vector, and also the both intentions are expressed by integrating each result of similarity calculation. Similar images corresponding to a user's imagination are selected as the retrieval results, based on the color features on the subspace by inputting the created image-query to image databases.

2.1. Color Histogram Matrix and Color Description Matrix This method handles color information actually existing in images and allocates the regions of colors, according to the user's interest specified by the user, as two kinds of image features. The first kind of image feature is expressed as a three-dimensional vector, and a set of those features are expressed as color histogram matrix. This matrix stores color distribution information in K-representative colors of each block where the image is divided into M x N. The other kind of image feature is expressed as twodimension vector indicating an image region in which a user has interests, and a set of those vectors are expressed as the color description matrix. This image feature is used to represent the shape of objects in the image, as the imagination and intention of a user. In the color histogram matrix, an original image is divided into M x N and represented as M x N blocks. The colors of each block are classified into the Krepresentative colors and the color histogram is created by the ratio of the number of pixels of the K-representative colors. The matrix stores the color histogram information of each block. The process of creating the color histogram matrix is shown in Figure 2. The color histogram matrix is defined to be:

ColorHistogramMatrix = {hijk | i = 1...M , j = 1...N , k = 1...K}

(1)

hijk are color histograms of each block. The domain of M is {M | M=1 … width of the image} and also the domain of N is {N | N=1 … height of the image}. In addition, pixel-remainders left by the image quantization are rounded down. Finally, the color histogram matrix is transformed to a one-dimension vector for storing to the image feature space about the colors in image databases. This method manipulates color description matrix for two-dimension vectors to describe local color features in the image. The process of creating the color description matrix is shown in Figure 3. A method of color description matrix has been proposed by I. Valova, et al. [22]. In this method, an original image is divided into M x N and is represented as M x N blocks. The matrix stores dominant colors for each block. In each block, one color corresponding to the average color of the block is selected from the Krepresentative colors. The color description matrix is defined as:

ColorDescriptionMatrix = {c ij | i = 1...M , j = 1...N }

(2)

262

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

The values of cij are color indexes for the dominant color of each block. (If cij = 0, then, cij means the transparent color.) The domain of M is {M | M=1 … width of the image} and also the domain of N is {N | N=1 … height of the image}. In addition, pixel-remainders left by the image quantization are rounded down. In the case for expressing more detailed intentions about shapes, this matrix is enlarged by extending the values of M and N. Finally, the color description matrix is transformed to a onedimension vector for storing to the image feature space about the color indexes in the image databases.

Figure 2. Process of Creating Color histogram matrix

Figure 3. Process of Creating Color description matrix

2.2. Query Creation Operators Query creation operators calculate the blocks of user's interest (i.e. the colors and the shapes of the objects in the image) by set-operations of the representative color of each block of the color description matrix. The blocks of user's interest are drawn as chromatic color, and others are drawn as transparent color. In this regard, these operators doesn’t change color ratio of each block. Additionally, the operators have

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

263

each threshold for the set-operations by using similar colors. The query creation operators for the image-query creation are defined as follows: 2.2.1. Operator 1: AND

Figure 4. AND Operation

As shown in Figure 4, this operator extracts the blocks with the same color between two images. In multiple images, when more than two color description matrixes are given, this operator compares the colors of each block (cij, c’ij) of two color description matrixes (C, C’) as Equation 3. A color distance calculation function of which the concrete process is presented in sub section 2.2.4 is defined to ColorDistance. A threshold for extracting the blocks with similar colors is defined to dth. Numbers of each block are defined to i, j. The domain of i is {i | i=1 ... M} and also the domain of j is {j | j=1 … N}.

if (c ij is chromatic color) { if (c' ij is chromatic color) { if (ColorDisctance(c ij , c' ij ) <= d th ) { c ij } else { c ij is changed to transparent color } } else { c ij is changed to transparent color

(3)

} } else { c ij } If the colors between the corresponding blocks in two images are equivalent or similar, then, the block color is kept in the result image. Also, if the colors between the corresponding blocks in two images are different, then, the block color is replaced with transparent color in the result image. For example, in a case of expressing an intention

264

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

about an apple, the intention is expressed by AND operation between an apple image with green background and an image with all red color. Moreover, This operator functions as a filter that extracts only the blocks of the chromatic colors between the corresponding blocks in two images by incrementing the value of threshold enough without considering the difference of the colors. 2.2.2. Operator 2: OR

Figure 5. OR Operation

As shown in Figure 5, this operator overlaps the blocks between two images. In multiple images, when more than two color description matrixes are given, this operator confirms the colors of each blocks (cij, c’ij) of two color description matrixes (C, C’) as Equation 4. Numbers of each block are defined to i, j. The domain of i is {i | i=1 ... M} and also the domain of j is {j | j=1 … N}.

if (c ij is transparent color) { c' ij } else { c ij

(4)

} If the colors of the blocks cij are the transparent color, then, the block color replaced c’ij in the result image. Also, if the colors of the blocks cij are the chromatic colors, then, the block color is kept in the result image. For example, in a case of expressing an intention about two apples, the intention is expressed by OR operation between an image where an apple is in the right side and an image where an apple is in the left. 2.2.3. Operator 3: MINUS As shown in Figure 6, this operator removes the blocks with the same color between two images. In multiple images, when more than two color description matrixes are given, this operator compares the colors of each block (cij, c’ij) of each color description matrixes (C, C’) as Equation 5. A color distance calculation function of which the concrete process is presented in sub section 2.2.4 is defined to ColorDistance. A threshold for extracting the blocks with similar colors is defined to dth. Numbers of

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

265

each block are defined to i, j. The domain of i is {i | i=1 ... M} and also the domain of j is {j | j=1 … N}.

Figure 6. MINUS Operation

if (c ij is chromatic color) { if (c' ij is chromatic color) { if (ColorDisctance(c ij , c' ij ) <= d th ) { c ij is changed to transparent color } else { c ij }

(5)

} else { c ij } } else { c' ij } If the colors between the corresponding blocks in two images are equivalent or similar, then, the block color is replaced with transparent color in the result image. Also, if the colors between the corresponding blocks in two images are different, then, the block color is kept in the result image. For example, the intention about an apple, the intention is expressed by MINUS operation between an apple image with green background and an image with all green color. Moreover, this operator functions as a filter that removes only the blocks of the chromatic colors between the corresponding blocks in two images by incrementing the value of threshold enough without considering the difference of the colors. 2.2.4. Operator 4: NOT As shown in Figure 7, this operator reverses the chromatic colors and the transparent color in each block in an image. When one color description matrix C is given, this operator confirms the colors of each block cij in the color description matrix as

266

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

Equation 6. Numbers of each block are defined to i, j. The domain of i is {i | i=1 ... M} and also the domain of j is {j | j=1 … N}.

Figure 7. NOT Operation

if (c ij is transparent color) { c ij is changed to chromatic color } else { c ij is changed to transparent color

(6)

} If the colors of the blocks cij are the chromatic colors, then the block color is replaced the transparent color in the result image. Also, if the colors of the blocks cij are the transparent color, then the block color is replaced the chromatic colors that cij originally has in the result image. For example, in a case of expressing an intention about green background, the intention is expressed by NOT operation of an apple image whose green background is the transparent color. 2.3. Color Distance Calculation Color distance calculation is needed in the color clustering for color histogram matrix and the set-operations with the similar colors for the query creation operators in this method. It is inconvenient to calculate two colors difference in the Munsell color system by Euclidean distance in the color space. The color difference that is recognized as human perception is calculated by Godlove’s color distance formula on Munsell color space model [23]. When two colors are defined as c and c’, and also each Hue, Saturation and Value of c and c’ are represented as H, S and V, the color distance d is defined as:

⎛ ⎛ πΔH ⎞⎞ 2 2 d = 2Sc Sc ' ⎜1 − cos⎜ ⎟⎟ + (ΔS) + (4ΔV ) ⎝ ⎠ 180 ⎝ ⎠ ΔH = | H c − H c ' |, ΔS =| Sc − Sc ' |, ΔV =| Vc − Vc ' |

(7)

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

267

2.4. Subspace Projection on Image Feature Space with Color Descriptions

Figure 8. A Process of Subspace Projection and Creation of Color Histograms for Objects in the Image

As shown in Figure 8, the chromatic color areas of the color description matrix, which are computed by the query creation operators, correspond to subspace and column selection on the image feature space which consists of data sets of the color description matrixes and the color histogram matrixes. In each subspace, the dominant colors of each area are extracted in the image. Information on the user’s intentions is reflected to the image database as this subspace projection. Processes for the subspace projection in this method are similar to the process of trimming on the image processing. In the case of trimming, the user's intentions are output again as an image. Additionally, the image processing ignores the intention information of the trimmed image which is input to the image database as an image-query because complicated image processing in CBIR doesn't consider the user' intentions. Therefore, the trimming only executes the same process as the Query-by-An-Image. The intention expressed by this method is directly expressed as the subspace in the image feature space on the image database by combining images. This part is a unique point of the image-query creation method in comparing with the Query-by-An-Image method.

268

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

The detail intentions of colors in the user’s interest areas are calculated by multiplication of each element of the color description matrix and the color histogram matrix. The result of this calculation is the image-query expressing the user’s intentions in this method. As shown in Equation 8, when a color description matrix cij is given, cij is replaced to the numerical value. Numbers of each block are defined to i, j in the color description matrix. The domain of i is {i | i=1 ... M} and also the domain of j is {j | j=1 … N}. If cij is the transparent color, the value of cij becomes 0. Also, if cij is the chromatic color, the value of cij becomes 1.

if (c ij is transparent color) { c ij = 0 } else { c ij = 1

(8)

} Then, when the color description matrix cij and the color histogram matrix hijk are given, this method calculates the product of cij and hijk as Equation 9. The color histograms of each area of the user's interest are extracted by this calculation. Numbers of each block are defined to i, j in the color description matrix. k is each element in color histograms. The domains of each variable are {i | i=1 ... M}, {j | j=1 … N} and {k | k=1 ... K}.

hijk = c ij • hijk

(9)

3. Implementation of Image Retrieval with Created Image-Query This section presents the implementation method of similarity calculation in this method for image-databases. In the method, a vector space, referred to as the retrieval space, is dynamically created for each query based on an image feature extraction method. The created retrieval space has a characteristic of consisting of the color histograms of image areas which correspond to the user's intentions. By projecting all images onto the retrieval space, the image retrieval on the visual similarity calculation is performed. 3.1. An Image Feature Extraction Method To construct the color-based image feature spaces represented by the color histogram matrix and the color description matrix, we use Color Image Scale which has been proposed by S. Kobayashi [24] as the K-representative colors. Color Image Scale has 130 representative colors consisted of 120 chromatic colors and 10 achromatic colors on Munsell color system applied to human perception, and also represents relationship between colors and impression. A series of target images is mapped to the feature spaces after extracting the image feature. As shown in Figure 9, an extraction process of the image features is as follows:

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

269

Figure 9. An Extraction Process of Image Features

STEP 1: All images are converted into HSV color information and also various image sizes are transformed into a constant size to normalization. Additionally, all images are divided into M x N blocks. If there are the remainders of the pixel by the image quantization, those pixels are rounded down. STEP 2: Each block stores dominant colors which are each of one color in the K-representative colors similar to the average color of each block. The similar color is computed by the color distance calculation on the Munsell color system. STEP 3: In parallel with step 2, the colors of all pixels of each block cluster in K-representative colors, and color histograms for each block of all target images are created. To reduce low frequencies of each color histogram that would be noise in the similarity calculation, the values hk of the frequency of the color histograms that are smaller than a threshold hth is adjusted to 0. STEP 4: The blocks of each image that store the dominant colors are mapped onto the feature space expressed by the color description matrix, and then the created color histograms of each image are mapped onto the feature space expressed by the color histogram matrix after location information of each block is added. 3.2. Construction of the Retrieval Space In this method, multiple images are required to be given in the query step for expressing user’s intentions. The extracted color histogram H by the image feature extraction method is defined as a feature vector shown in Equation 10. The color histogram has K color elements.

H = {h1 , h2 , ..., hK }

(10)

To create the retrieval space using the color features in the images, summation HAccumulation of each element of the color histograms in each block on the image are calculated as Equation 11. Numbers of each block are defined to i, j. The domain of i is {i | i=1 ... M} and also the domain of j is {j | j=1 … N}.

270

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

M

H ACCUMULATION =

N

∑ ∑ (h i =0 j =0

ij1

, hij 2 , ..., hijK )

(11)

HAccumulation is corresponding to the color distribution information in the image areas of the user’s intentions. Eventually, all the images in the databases are projected onto the retrieval space from same processing.

Figure 10. A Construction Process of the Retrieval Space

3.3. Similarity Calculation between the Image-Query and Target Images An Algorithm of similarity calculations is used to compute the similarity degree between the image-query and the images in the image-database, which provides a ranking to arrange the order of the images. For the comparison of the results of image retrieval based on different equations in the similarity calculation, the histogram intersection method and inner product of two vectors are used for the similarity calculation between the image-query and target images. The histogram intersection is a method of the similarity calculation between two histograms [25]. When each element Q(hk) of color histograms for the image-query Q and the color histogram T(hk) for the target image T are given, at the histogram intersection shown as Equation 12, the function Min computes minimum value between two values. Small values of each element of two histograms are left as values of each element of a histogram of result. Then, a summation of each element of the result histogram is calculated as similarity degree Similarity(Q,T) between the two histograms. In the same way, the inner product of two vectors is calculated as Equation 13.

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

271

K

Similarity(Q, T ) =

∑ Min(Q(h ), T (h )) k

(12)

k

k=1

K

Similarity(Q, T ) =

∑ Q(h ) •T (h ) k

(13)

k

k=1

4. Experiments and Discussions To verify the feasibility and effectiveness of our method, we performed qualitative experiments with implemented system. In the experiments, we created several imagequeries for expressing specific intentions by combining multiple images, and retrieved images in image databases by the created image-queries. 1000 images in Corel Test Collection and 130 images painted out with each color in Color Image Scale are used for creating the image-queries. Moreover, the 1000 Corel images, 130 images with a left black horse and 130 images with a right white horse are used for target images. The black horse and the white horse are extracted from image A with an image-processing software, and then these horses are pasted in a image of which background color is painted out by each color of Color Image Scale for the experiments. All images are converted into HSV color model and divided into 20 x 20 blocks. Additionally, 10 high-ranks of retrieval results of created image-queries are demonstrated. 4.1. Intention 1: A Black Horse in Left Side on An Image We combined multiple images and the operators to create an image-query expressing the intention (intension-1) about a black horse in left side on an image as shown Table 1. Number of the images used to express the intention is 6 images. Names of each image for creating the image-query is described upper each image in the table as alphabets from A to F. A concrete operation, an image of color description matrix of image A and the created image-query are shown in columns of the Table 1. In this operation, we extracted a black horse in the left side on the image A by only using the MINUS operator. The selected images are a combination to remove of background colors and colors of a white horse from image A. By the operation, the area of the black horse in left side in the image A corresponding to the user’s intention was extracted. Table 1. The Intention Expression about Images with A Black Horse in the Left Side on the Image ImageQuery Creation Method

A

B

C

D

E

Operation: (((((A MINUS B) MINUS C) MINUS D) MINUS E) MINUS F) Color Description Matrix

Created ImageQuery

F

272

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

Results of image retrieval with the created image-query for the intention-1 are shown in Table 2. In the case of the similarity calculation with the histogram intersection, in the top 10, the number of images with the black horse in left side was 4. In this case, the similarity degree of the 4 images was the same in score. (That is, the 4 images are basically in the same rank.) This result shows that the specific intention is expressed by our image-query method using multiple images. In the case of the similarity calculation with the inner product, the number of images corresponding to the intention was 5 in the top 10. ‘47b.png’ and ‘23b.png’ with the background colors with each color on Color Image Scale and the black horse, which is cropped from image A by image processing, are included in the result. However, the result was worse than the result of histogram intersection. This is because the irrelevant values of the color histogram to the similarity calculation influence the results. Table 2. Results of Image Retrieval with the Created Image-Query for the Intention 1 Histogram Intersection

1.

760.jpg 1900.0

2. 765.jpg 1900.0

3.

777.jpg 1900.0

4.

790.jpg 1900.0

5.

923.jpg 1899.0

6.

927.jpg 1899.0

7. 930.jpg 1899.0

8.

966.jpg 1899.0

9.

969.jpg 1899.0

10.

970.jpg 1899.0

1.

47b.png 2.2706586E7

2. 23b.png 1.6288367E7

3. 679.jpg 1.1313964E7

4.

268.jpg 9573303.0

5.

6.

771.jpg 9024865.0

7. 709.jpg 8915054.0

8.

9.

687.jpg 8503187.0

10. 249.jpg 8320241.0

Inner Product

770.jpg 8631645.0

795.jpg 9308472.0

4.2. Intention 2: A White Horse in Right Side on An Image For expressing another user’s intention (intension-2), we combined multiple images and the operators to create an image-query about a white horse in the right side on an image as shown in Table 3. In this query, we intend to extract the white horse in the image A by using the combination of the AND operator and the OR operator. The number of the images used to express the intention is 5 images. The selected images are a combination to extract colors of the white horse from image A. By the query

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

273

execution, the area of the white horse in the right side in the image A corresponding to the user’s intention was extracted as shown in Table 4. Table 3. The Intention Expression about Images with A White Horse in the Right Side on the Image A

ImageQuery Creation Method

B

C

D

E

Operation: ((((A AND B) OR (A AND C)) OR (A AND D)) OR (A AND E)) Color Description Matrix

Created ImageQuery

Table 4. Results of Image Retrieval with the Created Image-Query for the Intention 2 Histogram Intersection 1.

760.jpg 2600.0

2. 37w.png 2600.0

3. 38w.png 2600.0

4. 45w.png 2597.0

5.

6.

36w.png 2595.0

7. 39w.png 2594.0

8.

9.

44w.png 2594.0

10. 995.jpg 2592.0

1.

113b.png 3.0738157E7

2. 113w.png 2.9942159E7

3. 425.jpg 2.3003473E7

4.

7w.jpg 1.7109689E7

5.

6.

258.jpg 1.4304373E7

7. 79.jpg 1.0014234E7

8.

9.

866.jpg 9274182.0

10. 124.jpg 8834907.0

42w.png 2594.0

516.jpg 2596.0

Inner Product

847.jpg 9730030.0

7b.jpg 1.6993741E7

In the case of the similarity calculation with the histogram intersection, in the top 10, the number of images with the black horse in left side was 8. In this result, the image-query created by our image-query method using multiple images enough expresses the specific intention. In this query, the scores of ‘760.jpg’, ‘37w.png’ and ‘38w.png’ were same. And also, the score of ‘39w.png’, ‘42w.png’, and ‘44w.png’ were same, too. Moreover, in the case of the similarity calculation with the inner product, the number of images corresponding to the intention was only 2 in the top 10. The result was worse than the result of histogram intersection. This is because the irrelevant values of the color histogram to the similarity calculation influence the

274

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

results and the light colors like white color are converted into gray colors in clustering in the K-representative colors. 4.3. Intention 3: A Black Horse in Left Side and A White Horse in Right Side on An Image For expressing a user’s intention (intension-3), we combined multiple images and operators to create an image-query about a black horse in the left side and a white horse in the right side on an image as shown in Table 5. The extraction of two horses are intended by combining all operators including the NOT operator. The number of the images used in this method is 6 images. The selected images are a combination to extract the horses by the AND operator and the OR operator, and also to remove green background by the NOT operator and the MINUS operator from image A. By the query execution, the area of the black horse in left side and the white horse in right side in image A corresponding to the user’s intention was extracted as shown in Table 5. Table 5. The Intention Expression about Images with A Black Horse in the Right Side and A White Horse in the Left Side ImageQuery Creation Method

A

B

C

D

E

F

Operation: (A MINUS NOT((((((A AND B) OR (A AND C)) OR (A AND D)) OR (A AND E)) OR (A AND F)))) Color Description Matrix

Created ImageQuery

Results of image retrieval with the created image-query for the intention 3 are shown in Table 6. In the case of the similarity calculation with the histogram intersection, in the top 10, the number of images with the black hose in right side and the white horse in right side was 4. This result shows that the specific intention is expressed by this method. The irrelevant images to the intention were included in the result because the colors of these images were converted into gray or brown in the feature extraction. Additionally, in the case of similarity calculation with the inner product, the number of images corresponding to the intention was 1. The result was worse than the result of histogram intersection. In our previously proposed method, the user's intentions about the colors and shapes of the objects in the image are independently expressed as one dimension vector, and also the both intentions are expressed by integrating each result of similarity degree. Especially, the intentions about colors are expressed by operating percentage of each element of the color histograms. These operations are similar to user's drawing the image newly. However, because locations of the colors aren't considered, a created image-query in our previously proposed method is only expressed about usages of the colors based on the user's intentions. In this method, we considered the image painting and the intention expression separately. The image processing deals with the processing of the image painting. The

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

275

user's intention expression is enabled by the combinations of multiple images in the image-databases on this method. Moreover, gaps of the blocks aren't considered in the comparison of the shapes of the objects in the similarity calculation currently. It is difficult for the user to express locations of the blocks accurately as the intentions. Contrivance such as moving the blocks from side to side and up and down is required on the similarity calculation to allow the gaps of the blocks. Additionally, this method enables the intention expression like a black horse on the left side or a white horse on the right side, etc. However, this method isn't able to express the intention like only a black horse on the left side or only a white horse on the right side, etc. This is because the user doesn't express the intention in the right side of the image. The user tends to express only intended information. In the image-query creation, we consider that an automatic supplementation using scene completion technology [28] for background colors is necessary. As an application of this method, the image filtering corresponding to the intention is considered by defining the combination of the images and the operators that expresses a specific intention. Table 6. Results of Image Retrieval with the Created Image-Query for the Intention 3

5. Conclusions and Future Work In this paper, we have presented a combined-image query creation method for expressing user’s intentions by combining multiple digital images and operators for image retrieval. This method provides some images in the image database and several color-based set-operators on the query part of the image databases to express imagecontexts as the intentions. This query creation corresponds to the user’s imagination process for matching of the user’s imaginary images and real scenes. The user’s

276

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

intentions are expressed as the subspace projection of the image feature space by combining the images and the operators in the query creation with the image databases. The process of the subspace projection is similar to the trimming process of the image processing. Our method realizes the actual imagination expression with the colors and the shapes of the objects by using multiple images and operators for adaptively combining those colors and shapes. As the future work, to refine the precision of the retrieval results, we improve the similarity calculation by using global color features and local color features in the images. Moreover, we intend to allow the gaps of the blocks on the similarity calculation and to express more detail intentions for the users. Additionally, we confirm the application range of this method for adopting video retrieval systems.

References [1] Yasuhiro Hayashi, Yasushi Kiyoki and Xing Chen, "An Image-Query Creation Method for Expressing User’s Intentions by Combining Multiple Images," The 19th European- Japanese Conference on Information Modeling and Knowledge Bases, June 2009. [2] Flickr, “http://www.flickr.com/”. [3] Google Image, “http://image.google.com/”. [4] Chang, S.K., "Image Database Systems," Handbook of Pattern Recognition and Image Processing, Young, T. Y. and Fu, K. S. (eds), pp. 371-393, Academic Press, 1986. [5] R. Datta, et al. “Image Retrieval: Ideas, Influences, and Trends of the New Age,'' ACM Computing Surveys, vol. 40, no. 2, article 5, 60 pages, 2008. [6] David A. Forsyth, Jean Ponce: ”Computer Vision: A Modern Approach, ” Prentice Hall, 2002. [7] Hirata, K. and Katzo, T. “Query by visual example, content based image retrieval,” in Advances in Database Technology-EDBT’92, Vol. 508, Pirotte, A., Delobel, C. and Gottlob, G., Eds., 1992. [8] Niblack, W. et al., “The QBIC project: Quering images by content using color, texture and shape,” in Proc. SPIE Storage and Retrieval for Image and Video Data Bases, pp. 172-187, 1994. [9] Carson, C.; Thomas, M.; Belongie, S.; Hellerstein, J. M. & Malik, J. Blobworld: "A system for regionbased image indexing and retrieval," Third International Conference on Visual Information Systems, Springer, 1999. [10] VIPER/GIFT, “http://viper.unige.ch/doku.php” [11] Pentland, A., Picard, R. W. and Sclaroff, S., "Photobook: Content-based manipulation of image database," Int. J. Comput. Vis., Vol. 18, pp. 233-254, 1996. [12] Smith, J. R. and Chang, S., “VisualSEEk: A fully automated content-based query system,” in Proc. ACM Multimedia’96, pp 87-98. [13] Ma, W.-Y. & Manjunath, B. S. "NeTra: A Toolbox for Navigating Large Image Databases," Multimedia Systems, 1999, 7, 184-198 [14] Siggelkow, S.; Schael, M. & Burkhardt, H. SIMBA - Search IMages By Appearance in Pattern Recognition, Proc. of 23rd DAGM Symposium, B. Radig and S. Florczyk, Eds. Sept. 2001, number 2191 in LNCS Pattern Recognition, Springer Verlag, 2001, 9-17 [15] James Z. Wang, Jia Li, Gio Wiederhold, "SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries," IEEE Trans. on Pattern Analysis and Ma-chine Intelligence, vol 23, no.9, pp. 947963, 2001. [16] J. R. Bach, C. Fuller, A Gupta, A Hampapur, B. Horowitz, R. Humphrey, R. Jain, and C. Shu. The virage image search engine: An open framework for image management. Proceedings of SPIE: Symposium on Electronic Imaging, 2670:76-87, 1996. [17] Excalibur RetrievalWare, "http://www.excalib.com/demos.html". [18] Rui, Y., Huang, T. S. and Mehrotra, S., “Content-based image retrieval with relevance feedback in Mars,” in Proc. IEEE Conf. Image Processing, pp. 815-818, 1997. [19] Rocchio JJ, "Relevance Feedback in Information Retrieval. "In: The SMART Retrieval System, 1971, pp. 313-323, Prentice Hall. [20] Yasushi Kiyoki, Takashi Kitagawa and Takanari Hayama: "A meta-database system for semantic image search by a mathematical model of meaning," ACM SIGMOD Record, Vol. 23 Issue 4, December 1994. [21] W.H. Leung, T. Chen, "Trademark retrieval using contour-skeleton stroke classification," IEEE Int. Conf. on Multimedia and Expo., vol. 2, 2002, pp. 517-520.

Y. Hayashi et al. / A Combined Image-Query Creation Method for Expressing User’s Intentions

277

[22] I. Valove and B. Rachev, “Retrieval by Color Features in Image Databases,” Proceedings of Eighth East-European Conference on Advances in Databases and Information Systems, September 2004. [23] I. H. Godlove, “Improved Color-difference Formula,” Applications to the Perceptibility and Acceptability of Fadings, J. Opt. Soc. Amer. 41, 760, Nov. 1951. [24] Shigenobu Kobayashi, “Color Image Scale,” The Nippon Color & Design Research Institute ed., translated by Louella Matsunaga, Kodansha International, 1992. [25] Michael J. Swain, Dana H. Ballard: “Color Indexing,” International Journal of Computer vision, Vol. 7, No.1, pp. 11-32, 1991. [26] James Z. Wang, Jia Li, Gio Wiederhold, “SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol 23, no.9, pp. 947963, 2001.A.N. Author, Book Title, Publisher Name, Publisher Location, 1995. [27] Xing Chen and Yasushi Kiyoki, “A Visual and Semantic Image Retrieval Method Based on Similarity Computing with Query-Context Recognition,” Information Modeling and Knowledge Bases. Vol. XVIII, pp. 245-252, May 2007. [28] James Hays and Alexei A Efros, "Scene Completion Using Millions of Photographs, " ACM Transactions on Graphics (SIGGRAPH 2007), vol. 26, number 3, 2007.

278

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-278

Towards Context Modelling and Reasoning in a Ubiquitous Campus Ekaterina GILMAN, Xiang SU and Jukka RIEKKI Intelligent Systems Group and Infotech Oulu, University of Oulu, Finland Abstract. This paper proposes context modelling and reasoning to enable intelligent services in a ubiquitous campus. An ontology-based modelling includes upper level context modelling and domain-speciﬁc modelling for the campus area. Ontological and rule-based inferencing, which facilitate ubiquitous functionality for daily life, are implemented by utilizing the context model developed. A student assistant scenario is presented, demonstrating the usefulness of ontological context modelling and reasoning for highly distributed environments, such as a university campus. Keywords. Context-awareness, Ontology, Inference, User-centric Services

Introduction Ubiquitous computing paradigm assumes the provision of the computing environment which enables an access to services at the right time and place by embedding the computing resources in human surroundings. Basically, a ubiquitous environment is a network of distributed heterogeneous devices that interact with each other via available communication networks (WiFi, GPRS, LAN, etc.) to provide real-time user-centric services to users. By user-centric services, we mean the personalization based on users’ interests, settings, current activity and computing resources at certain time and place. Information like that constitutes the context, which might be quite dynamic in a ubiquitous environment. To be able to provide user-speciﬁc services, the system should be context-aware, which means an ability to adapt itself to the changing environment and user behaviour. Context modelling and reasoning are initial steps to employ knowledge to understand the context and share this information in support of intelligent functionalities and interfaces. Naturally, a context model should represent the part of the real world relevant to the context-aware application designed. This representation should be adequate, faithful and allow effective manipulations. The context model facilitates appropriate reasoning techniques. In this paper, we present how context modelling and reasoning will facilitate intelligent services within a ubiquitous campus. A university campus is an attractive environment for ubiquitous systems development. It provides a comprehensive set of technologies used for infrastructure, endless types of services to develop, and a lot of motivated test users. Intelligent services aim to support the students, researchers, professors and all other university staff in their daily university life, by providing the needed information or actions based on the current user’s context. We present a general ontological context

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

279

model for the ubiquitous campus and verify it with a speciﬁc student assistant application scenario. The main contribution of this article is the framework developed for the context modelling and reasoning within a ubiquitous campus. Our framework facilitates usercentric services development. Adaptation to the context is realized by ontological and rule-based reasoning. We start by describing a typical student assistant scenario. Then, we introduce the framework architecture in section 2, after that we present our context ontology model in more detail in section 3. Section 4 describes our inference mechanism for realizing context-awareness in a ubiquitous campus. We compare our model with related work in section 5 and conclude the paper in section 6.

1. Application Scenarios for the Ubiquitous Campus Under the term ubiquitous campus, we consider a ubiquitous environment together with the supporting infrastructure and ubiquitous services at the university area. This section presents a possible student’s daily life assistant application scenario which utilizes such ubiquitous services. The inference rules facilitating the described functionality are presented in section 4. Alice is a student of the University of Oulu. Every morning her mobile phone checks her calendar automatically, decides the time to ring her up based on her personal proﬁle. When Alice is awake, the ring can be switched off by the phone itself. While Alice is doing her morning routine, the phone checks the weather and bus schedule. In addition, the status of the library service is checked as well. If the deadline to return some books is close, mobile phone informs Alice about that and provides the interface to prolong the loan. If it is not possible to prolong some books, a reminder will be shown on the mobile phone’s display to return the books in question. The mobile phone also synchronizes the mail data, in case if Alice would like to read mail from her mobile phone on her way to the university. Alice takes the suggested bus. When she enters the campus area, her mobile phone shows the recent university news. When Alice has entered the university, her mobile phone asks her if she wants to be visible for her friends. By enabling this service, Alice can see where her friends are in the campus area. In the lecture room, her phone activates the silent mode, and all information related to the lecture is loaded to her terminal, such as home assignments marks, lecture material, coming deadlines. The mobile phone sets itself to the normal mode during lectures breaks. Alice can always print the collected material by touching the printer’s RFID tag with her mobile phone. When the lunchtime is getting close, the mobile phone shows the menus of the university restaurants to Alice.

2. Framework of Context Modelling and Reasoning Context modelling is the speciﬁcation of all entities and relations between them which are needed to describe the context as a whole, e.g. information on location, time, the user and its current or planned activity, and computational entities [1]. Wei and Chan [2] consider three basic aspects a well-designed context model should address: data structure, its integrity and manipulation. Strang et al. [3] point out requirements that a contextaware system puts into context modelling, such as incompleteness and ambiguity, due to the fact that we cannot guarantee correct and complete information from physical sen-

280

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

sors, e.g. There are quite many knowledge representation (KR) approaches to classify the context model. In the same paper, Strang et al. present their classiﬁcation: Key-value, Markup Scheme models, Graphical models, Object Oriented models, Logic based models, Ontology based models. The authors evaluated approaches under their requirements for context-aware computing systems and demonstrated that ontology-based models can meet them all. Our conceptual framework of context modelling and reasoning is given in Figure 1. Context is captured by heterogeneous sensors, distributed within the campus area or attached to the user (e.g. GPS receiver of a user’s mobile phone), by users’ speciﬁc settings (e.g. preferences or proﬁles), and other campus-related information (such as lessons schedule). Context modelling includes the upper level ontology and lower level Resource Description Framework (RDF) base, which deals with how contexts are represented, organized, and stored. Context reasoner performs the inference to adapt the system to changing situations and to trigger the services to the user.

Figure 1. Framework of Context Modelling and Reasoning in Ubiquitous Campus

The Ubiquitous campus ontology serves as a knowledge base, and the context model should capture context information from physical and social environments. In our ubiquitous campus, the context-data include the data from the physical environment (such as location, temperature), digital environment (such as digital devices, sensors) and social environment (such as person and her proﬁle data).

3. Ontological Modelling for the Ubiquitous Campus Context Among the many candidates of context modelling, ontology-based representation is a good solution. Firstly, ontolgies can serve as a formal context model that computers can interpret. Secondly, ontologies enable the interoperability of devices in the environment in a unambiguous way. Thirdly, ontologies facilitate advanced knowledge processing like reasoning. Our ubiquitous campus ontology model provides a formal vocabulary set for

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

281

representing and sharing context knowledge in a campus domain, including a deﬁnition of basic concepts and relationships among them. Web Ontology Language (OWL) is adopted for representing the ontology. 3.1. Design Considerations of the Ubiquitous Campus Ontology Context information can be highly dynamic, while the ontology model has a more stable structure. This conﬂict makes our design challenging. As a solution, we propose a twolayer ontology structure. The upper layer includes the general concepts and relationships for all ubiquitous environments, and the lower layer emphasizes application speciﬁc, teaching and research related entities for our university campus scenario. The common concepts for all ubiquitous environments include mobile and stable devices, services, persons, etc. The design of two-layer ontology structure facilitates the extensibility of the upper level ontology. Concepts from different sub-domains can be easily plugged into the upper level when necessary. We classify sources of context information into three different classes, social environment context, digital environment context, and physical environment context. Context in the social environment describes people and their relationships. Both personal and social context are included in this catalogue. Physical environment context depicts physical entities in our surroundings, most of which can be measured by some devices like sensors. Digital environment context includes all the digital devices, networks, services that facilitate the life of the people in the environments. 3.2. Structure of the Ubiquitous Campus Ontology Figure 2 presents the overall structure of our ubiquitous campus ontology. The upperlevel context ontology describes the basic concepts relating to the context entities, while the lower-level ontology describes detailed concepts in the campus domain. The loosely coupled two-layer ontology structure enables a uniform upper level context model for all possible scenarios and applications. Ubiquitous applications have their own low level ontology, which suits the hierarchy of upper level ontology. When a user changes among scenarios and applications in ubiquitous environments, the low level ontology will change accordingly. As mentioned above, social environment context, digital environment context, and physical environment context are our main divisions in the context modelling. Concepts in the social environment include Person, Proﬁle, Social Activity, and Social Service. No doubt, it is Person who plays the most important role within the ubiquitous environment, and all the services facilitate person’s activity as a ﬁnal goal. Capturing the information about the person is the most straightforward enabler for the user-centric service adaption. The concept Proﬁle interprets the name, gender, preference, mode, plan of a speciﬁc person. The concept Social Activity describes what activities the user is doing with other people, like meeting, learning, photo sharing, etc. Finally, Social Service is the service that can be offered from other people to the user, like restaurant service, lecture service (a student is taught by a teacher), and post delivery service, for instance. A physical environment is another important set of entities for the ubiquitous computing, which includes location, time, sound, temperature, illumination, etc. System behaviour can be adapted automatically based on the environmental data. Sensors and sen-

282

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

Figure 2. Ubiquitous Campus Context Ontology Model

sor networks are important enablers to measure the data in the campus area and transfer it to the ontology model. Physical content, like books, is an indispensable context for the campus scenario, in supporting of learning and teaching for campus people. Digital environment has rich computing resources and services. Different kinds of devices, like PDAs, sensors, RFID, cameras, robots are deployed in our surroundings, and different networks are accessible for devices. Some devices in the digital environment can interact and measure the physical environment, while others should provide digital services to entities in the social environment (like person and social activity), at the right place and time. We also consider digital content (e.g. schedules, videos, presentations, web pages), which plays an important role in campus. We consider both digital and social services which could be simple and composed ones. A simple service provides the functionality by utilizing its internal resources, while the composed one, in addition, uses other simple services. For example, Library Service is a composed service that includes Book Reminder Service and Loan Prolong Services. A simple service can be achieved like a reﬂection, for example, a mobile phone can stop the ringing when its noisy sensor measurement shows the user has got up. A composed service needs combinations or interactions among simple services, such as environmental data combined with personal information.

4. Context Reasoning for the Student Assistant Application Scenario Context reasoning plays one of the key roles in ubiquitous computing. Nurmi et al. [4] deﬁne context reasoning as deducing new and relevant information to the use of application(s) and user(s) from various sources of context-data. Context reasoning is responsi-

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

283

ble for the intelligent data processing to provide for the user the right service at the right time. There are several reasoning mechanisms for pervasive systems available, including rule-based reasoning, probabilistic reasoning, case-based reasoning, etc. [5] The selection of an appropriate reasoning technique depends on both the context model representation, its structure, environment knowledge and concrete application requirements. For example, if a system should make some decisions based on the user’s history, then casebased reasoning might be considered as a candidate. However, when we cannot be sure about the certainty of the contextual data, some probabilistic methods could be considered (e.g. Bayesian networks, probabilistic logic, Dempster-Shafer theory). Moreover, it is seldom the cases that only one reasoning mechanism is used, a hybrid approach is utilized for the inferencing within ubiquitous systems. More details are given in the following surveys [5][6][7]. Our ontological context model representation for the ubiquitous campus scenario allows us to apply two inference mechanisms: ontological (based on Description Logics) and rule-based. Description Logics (DLs) is the name for a family of KR formalisms that represent the knowledge of an application domain (the "world") by ﬁrst deﬁning the relevant concepts and roles (concepts relations) of the domain (its terminology), and then using these concepts to specify properties of objects and individuals occurring in the domain (the world description). DL deals with two components, TBox introduces the terminology, i.e. the vocabulary of an application domain, while ABox contains assertions about named individuals in terms of this vocabulary [8]. The basic operation of description logic reasoners is subsumption determination, i.e. checking whether a concept is more general than the other. Key inference operations with respect to ABox are realization, which means "determining the concepts instantiated by a given individual", and retrieval, which, in turn, means "determining the set of individuals that instantiate a given concept" [5]. A rule-based system uses rules to infer the conclusions from the premises. Rules are considered as instructions for actions that can be applied to a certain situation. Knowledge base contains the application domain data, all facts relevant to the system. Rules are applied to the facts of knowledge base and if data satisfy the conditions in the rules, actions are triggered. Due to their expressiveness, rules are quite often applied for reasoning. For our system, DL reasoning supports the concepts and entities relations inference, while rule-based reasoning facilitates intelligent system behaviour. We have implemented the context model for our student assistant application scenario with OWL. Currently, we have not utilized the full DL support of OWL-DL (property characteristics, complex classes); however, we are able to make the basic inference about the concepts relations and individuals types. For our system, we concentrate more on rule-based inferencing. Our ubiquitous campus scenario deﬁnes an extensive set of heterogeneous services and resources, its designed ontology model facilitates the data and control exchange between utilized services. Rules deﬁne when certain services should be executed, what data should be utilized by them, based on the user proﬁle and current context. Rule inference is applied directly to the context model, so data are obtained from ontology model and are put there as well. Table 1 shows some rules deﬁned for our student assistant application scenario. The left column describes the rule in natural language and the right column presents the Jena [9] rules , applied to the context model.

284

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

Table 1. Rules for Ubiquitous campus scenario Situation

Reasoning rule [rule1: (?s rdf:type sw:Student)(?s sw:hasLocation "home") (?s sw:usesPersonalAssistant ?pa)(?pa sw:usesRingService ?rs) (?pa sw:usesCalendarService ?calendar) now(?current_time)

If a student is at home and according to her calendar she has lectures today, then her mobile phone starts ringing at a certain time she has speciﬁed in her proﬁle

(?calendar sw:currentLectureTime ?lecture_time) (?s sw:usesProﬁle ?proﬁle) (?proﬁle sw:hasWakeupTime ?wakeup_before) difference(?lecture_time,?wakeup_before,?diff) equal(?current_time,?diff) (?pa sw:currentService ?cs) -> drop(11)(?pa sw:currentService ?rs)(?rs sw:hasStatus "ringing")] [rule2: (?s rdf:type sw:Student)(?s sw:usesPersonalAssistant ?pa) (?s sw:hasLocation "home")(?pa sw:currentService ?service)

When the ringing was turned off, the mobile phone checks the weather, bus timetable, mail and library services

(?service rdf:type sw:RingService)(?service sw:hasStatus "off") (?pa sw:usesWeatherService ?ws)(?ws sw:currentWeather ?weather) (?pa sw:usesBusTimetableService ?bs)(?bs sw:nearestBus ?bus) (?pa sw:hasLibraryService ?ls)(?pa sw:usesMailService ?mail) -> (?pa sw:currentWeather ?weather^^xsd:int) (?pa sw:currentBus ?bus^^xsd:int) (?mail sw:hasStatus "synchronize")drop(3)(?pa sw:currentService ?ls)] [rule3: (?s rdf:type sw:Student)(?s sw:usesPersonalAssistant ?pa)

If the student has some books with close loan expiration time, then reminder service is activated

(?s sw:hasLocation "home")(?pa sw:currentService ?service) (?service rdf:type sw:LibraryService) (?service sw:hasBooksToExpire true^^xsd:boolean) (?pa sw:usesBookReminderService ?book_reminder) -> (?book_reminder sw:hasStatus "active") drop(3)(?pa sw:currentService ?pa)] [rule4: (?s rdf:type sw:Student)(?s sw:usesPersonalAssistant ?pa) (?pa sw:hasCalendarService ?ca)

If the student is at the lecture, then switch her mobile phone to the calm mode

(?ca sw:hasCurrentLectureLocation ?lecture_location) (?s sw:hasLocation ?lecture_location)(?s sw:usesMobilePhone ?phone) (?phone sw:hasMode ?mode) -> drop(6)(?phone sw:hasMode "calm")] [rule5:(?s rdf:type sw:Student)(?s sw:usesPersonalAssistant ?pa) (?s sw:usesProﬁle ?proﬁle)(?proﬁle sw:hasLunchTime ?lunch_time)

If the student’s speciﬁed lunch time is coming, then restaurant menus are shown on her mobile phone

now(?current_time)equal(?lunch_time,?current_time) (pa sw:usesRestaurantService ?restaurant_service) (?restaurant_service sw:hasStatus ?st) -> drop(7)(?restaurant_service sw:hasStatus "show")]

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

285

In our case, ontological representation is used as the general knowledge base for all services utilized in our scenario. Rules trigger the services behaviour adaptation by ontology modiﬁcation. As mentioned above, our context model allows implementation of both simple and composed services. Table 1 demonstrates that to achieve its functionality PersonalAssistantService uses other simple services, such as RingingService, WeatherService, CalendarService, etc. This part demonstrated that reasoning can be implemented successfully for our personal assistant application scenario.

5. Related Work A lot of research for campus domain has been done: MyCampus [10] is a system supporting agent-based context-aware mobile services. Burrell et al. [11] reported a campus tour guide system, which enables social navigation based on the user’s location. Chan-Sik et al. [12] presented their sensor architecture for collecting environmental data in a campus. Erdur et al. [13] have developed an environment that provides semantic matching-based information gathering capability for mobile users, in their case study the mobile user gets the location based information about the closest places to reside or eat something. Gavrilova and Jin [14] present an ontology-based approach for knowledge based portal creation for the international students affairs domain. Maron et al. [15] present Campus News Information System, a ubiquitous system that provides announcements and university advertisements based on users proﬁles. However, all these context-aware systems lack a common model for the context information. Our work conceptually differs from listed approaches: we develop the campus ontology, and based on it, make context-aware services available for users. On the other hand, ontology-based solution have been extensively utilized in context modelling and reasoning. Wang et al. [16] propose CONtext Ontology (CONON) for context modelling in a pervasive environment. They provide two layered ontology model, where the upper ontology describes the general concepts and the domain-speciﬁc ontology holds the set of speciﬁc ontologies describing the concepts within a certain domain (e.g. home-domain, ofﬁce-domain, etc.). COBRA-ONT [17] provides an agentbased architecture, which includes an ontology model of context on behalf of a community of agents, services, and devices. And COBRA-ONT was extended to SOUPA [18], a generic ontology with a common vocabulary for a different pervasive computing environment. Compared to these models, our ontology also has a two-layer structure (similliar to CONON), but differs at the conceptual division.

6. Discussion In this article, we discussed context modelling and reasoning within the ubiquitous campus, in support of offering intelligent services to users according to their context information. In the context model, we provided a clear distinction between digital, physical and social environments, and categorized the campus related entities. The general model is expressive for the utilization of different services within such a huge and heterogeneous university campus environment. At the same time, our approach is extensible to any other domains. Furthermore, we built a scenario-speciﬁc ontology based on the general model to satisfy our student assistant application scenario.

286

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

We can see that rule-based inference, applied to our ontology, facilitates the intelligent services development. We have applied inference directly to ontology and campus intelligent services are using it, as well as the context model. A set of Jena rules illustrates the usability of the context modelling and reasoning for the scenario. Our future work includes verifying the usability of our model by implementing more services, and studying how to apply the probabilistic reasoning within this semantical context representation. Furthermore, we will study how the service composition in the campus environment can be achieved. This will be an important stage for enabling the users to focus on their everyday life, rather than available services. Acknowledgements This work was funded by the MOTIVE research program of the Academy of Finland. The second author would thank the funding from the Infotech Oulu Graduate School. References [1] [2] [3]

[4] [5]

[6] [7]

[8]

[9] [10]

[11] [12]

[13]

[14]

Feruzan, A., "Context Modeling and Reasoning using Ontologies", University of Technology Berlin, July 2007. Wei, Edwin J. Y., and Chan, Alvin T.S., "Towards Context-Awareness in Ubiquitous Computing", International Conference of Embedded and Ubiquitous Computing, Taipei, Taiwan, pp.706-717, 2007. Strang, T., and Linnhoff-Popien, C., "A context modeling survey", Workshop on Advanced Context Modelling, Reasoning and Management, Sixth International Conference on Ubiquitous Computing, Nottingham, England, 2004. Nurmi, P., and Floréen, P., "Reasoning in Context-Aware Systems", Helsinki Institute for Information Technology, Position paper, 2004. Perttunen, M., Riekki, J., and Lassila, O., "Context Representation and Reasoning in Pervasive Computing: a Review", International Journal of Multimedia and Ubiquitous Engineering, vol. 4, no. 4, pp. 1-28, October 2009. Bikakis, A., Patkos, T., Antoniou, G., Plexousakis D., "A survey of semantics-based approaches for context reasoning in ambient intelligence", Constructing Ambient Intelligence, Springer, pp. 14-23, 2008. Bettini, C., Brdiczka, O., Henricksen, K., Indulska, J., Nicklas, D., Ranganathan, A., Riboni, D., "A survey of context modelling and reasoning techniques", Pervasive and Mobile Computing, vol. 6, no. 2, pp. 161-180, June 2009. Baader, F. and Nutt, W. 2003. Basic description logics. In the Description Logic Handbook: theory, Implementation, and Applications, F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, Eds. Cambridge University Press, New York, NY, 43-95. Jena - a Semantic Web Framework. http://jena.sourceforge.net/. Sadeh, Norman M., Chan, E., and Van, L., "MyCampus: An Agent-Based Environment for ContextAware Mobile Services", Workshop on Ubiquitous Agents on embedded, wearable and mobile devices, Bologna, 2002. Burrell,J., Gay,G., Kubo,K., and Farina, N., "Context-aware computing: A test case", Fourth international conference on Ubiquitous Computing, Göteborg, Sweden, pp. 647-653, 2002. Chan-Sik, J., Min-Jae, L., Seung-Hyun, O., "A Design and Implementation of Ubiquitous Campus Environment Information Service", International Conference on Advanced Language Processing and Web Information Technology, pp. 362-366, 2008. Erdur, R. C., Dikenelli, O., Onal, A., Gumus, O., Kardas, G., Bayrak, O. and Tetik, Y. E, "A Pervasive Environment for Location-Aware and Semantic Matching Based Information Gathering", 20th International Symposium, Computer and Information Sciences, pp. 352-361, 2005. Gavrilova, T., and Jin, H., "Ontology-Based Knowledge Portal Development for University Knowledge Management", Fourth international Conference on Networked Computing and Advanced information Management, Washington, DC, pp. 552-559, 2008.

E. Gilman et al. / Towards Context Modelling and Reasoning in a Ubiquitous Campus

[15]

287

Maron, M., Read, K., Schulze, M., "CAMPUS NEWS - Artifcial Intelligence Methods Combined for an Intelligent Information Network", Constructing Ambient Intelligence, pp. 44-52, 2008. [16] Wang, X.H., Zhang, D.Q., Gu, T., Pung, H.K., "Ontology based context modeling and reasoning using OWL", Pervasive Computing and Communications Workshops, pp. 18-22, 2004. [17] Chen, H., and Finin, T., "An Ontology for Context Aware Pervasive Computing Environments", Special Issue on Ontologies for Distributed Systems, Knowledge Engineering Review, vol. 18, no. 3, pp. 197207, 2004. [18] Chen, H., Perich, F., Finin, T., and Joshi, A., "SOUPA: Standard Ontology for Ubiquitous and Pervasive Applications", First Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services, pp. 258-267, 2004.

288

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-288

A Phenomena-of-Interest Approach for the Interconnection of Sensor Data and Spatiotemporal Web Contents Kyoung-Sook KIM a,1 , Takafumi NAKANISHI a , Hidenori HOMMA a , Koji ZETTSU a , Yutaka KIDAWARA a , and Yasushi KIYOKI a,b a Knowledge Creating Communication Research Center, National Institute of Information and Communications Technology, Japan b Keio University, Japan Abstract. With the advance of ubiquitous computing and mobile environments, we have begun to continuously monitor changes in real-world condition and environment through wireless sensor networks. Opportunities also exist for people to create information related to the world around them by using mobile phones equipped with sensing devices, and share that information online with others. In this paper, we propose a novel approach for the interconnection of earth observation data and spatiotemporal web contents on the basis of spatiotemporal and thematic relationships. In particular, we use the concept of moving phenomena of interests to link between measurement sensing data and people-centric contents on the basis of spatiotemporal proximity and thematic relevance. This paper also shows a simple application that automatically generates semantic tags with respect to natural geographic phenomena, such as typhoons, climate changes, and air pollution, on the basis of our interconnection approach. We are able to easily understand qualitative meanings with respect to a certain phenomenon expressed by quantitative numeric conditions. Keywords. geo-observation, sensing measurements, user-generated contents, spatiotemporal proximity, thematic relevance, events, phenomena

1. Introduction Various phenomena are continually occurring at different times and places in the real world. We generally want to know what happened, is happening, or will happen in our surroundings. For this, we collect information about various aspects of the real world, communicate with other people, and ultimately obtain comprehensive knowledge about real-world phenomena. In the recent decade, the development of mobile sensing and Geoweb[13] technologies has given us the chance to use the Internet to participate in capturing, creating, and sharing information about events or phenomena occurring in the world. Sensor web projects, for instance, attempt to continuously monitor the status of the Earth so that people can understand our planet and its environmental processes, 1 Corresponding Author: Kyoung-Sook Kim, Knowledge Creating Communication Research Center, NICT, 3-5 Hikaridai, Seika-cho, Sora-gun, Kyoto 609-0289, Japan; E-mail: [email protected].

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

289

seek to identify potential natural and human threats, such as hurricanes or earthquakes, and make it easy to share real-time sensor data streams available online. People have also begun capturing and storing their entire lives, or large portions of them, such as via “lifelogging” using blogs, and annotated photos, music, and videos. Mobile phones equipped with built-in sensors including GPS have become a particularly vital tool in sensing, processing, and communicating an array of valuable information about the real world, as mentioned in a Nokia report [2]. In [4], a new vision of people-centric sensing by carrying sensor-enabled mobile phones on a broad, global scale is introduced and compared with the small-scale specialized wireless sensor networking. In a people-centric sensing system, people as individuals or members in social or special interest groups have vast opportunities for sensing and sharing information of interest; anytime and anywhere. We can resultantly regard people as sensors for capturing our physical environments and making observation data informed by their personal activities and everyday experiences. Goodchild also deﬁnes Volunteered Geographic Information (VGI) as a special case of user-generated geospatial contents on the Web, and discusses the role of people as sensors for monitoring the world [7]. In this study, we propose a new approach for interconnecting spatiotemporal usergenerated contents with raw sensor data in the form of measurement values collected from wireless sensors. Through people-centric sensing, information such as places and times related to real-world events or phenomena are produced and consumed on the Web. There are some differences, however, between qualitative user contents, such as hypertext documents, photos, and video ﬁles, and quantitative sensor measurements, such as temperature, speed, or degree of CO2 in the areas of representation, computation, and application. In order to link these two different data sets, we use the concept of moving phenomena deﬁned by spatial, temporal, and thematic constraints, such as typhoons, climate changes, and air pollution. In [9], we have already proposed moving phenomena analysis and 3D visualization system, with spatio-temporal and thematic aspects. In this paper, we introduce sensing data integration to the moving phenomena analysis and visualization system environment. We also make a connection between two different types of spatiotemporal data on the basis of spatiotemporal proximity and thematic relevance. The rest of the paper is organized as follows. Section 2 explains basic terms for understanding this study, and section 3 describes the concept of moving phenomena over the sensor data and the interconnection between sensor data and spatiotemporal contents. In section 4, we propose a simple application to generate semantic tags of natural geographic phenomena by using our interconnection method, and section 5 presents a prototype for the case of using typhoons with real earth observation data and web news articles. Finally, section 6 concludes this study with discussion of future work.

2. Background This section introduces some basic terms for understanding our interconnection approach on the basis of moving phenomena of interest. 2.1. Quantitative Sensor Data As the effects of severe weather become more apparent in the form of hurricanes, ﬂoods, drought, and climate changes, interest in environmental monitoring and risk management

290

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data <swe:SimpleDataRecord> <swe:field name="time"> <swe:Time definition="urn:ogc:def:property:OGC:OriginTime"> [Time] <swe:value>2005-08-05T12:21:13Z <swe:field name="latitude"> <swe:Quantity definition="urn:ogc:def:property:OGC:Latitude:wgs84"> <swe:uom code="deg"/> <swe:value>-77.8912 [Location] <swe:field name="longitude"> <swe:Quantity definition="urn:ogc:def:property:OGC:Longitude:wgs84"> <swe:uom code="deg"/> <swe:value>38.512 <swe:field name="DPM"> <swe:Quantity definition="urn:ogc:def:property:OGC:ChemicalPresenceInAirDPM"> <swe:uom code="ppm"/> <swe:value>0 ˎˎˎˎˎˎˎˎ [Quantitative representation]

Figure 1. Example of measurements with scalar results from a simple geo-observation

has been growing rapidly in many countries. A sensor network provides a platform to monitor environmental conditions at different locations, such as temperature, pressure, pollutants, and humidity, using spatially distributed sensors. The Open Geospatial Consortium (OGC) [1], an international consortium of companies, universities, and government agencies for developing open standards for geospatial processing and location services, has been building Sensor Web Enablement (SWE) standards to enable the discovery, exchange, and processing of any type of sensor observation information and the tasking of sensor systems on the Web [3]. The SWE standards deﬁne a conceptual model and XML schema for encoding observations and measurements from sensors; both real-time and archived. In the speciﬁcations related to the SWE standards, an observation is an action for estimating a value describing a certain phenomenon, and its result value may have many data types. Yet in many cases an observation result is a measure described using a numerical value with a scale. Figure 1 shows simple observation results according to the OGC scheme. The information in the scheme mainly includes quantitative measurements as well as geographic location (longitude/latitude coordinates) and the time of the observing on sensor systems. For example, a model of temperature sensors measures a scalar value from -40◦ C to 100◦ C. 2.2. Qualitative Spatiotemporal Contents The spread of Geoweb encourages people to create, publish, and share their contents on a map by way of geo-tagging, a process of adding location information, such as geographic coordinates, IP addresses, and place names, to items such as documents, images, RSS feeds, videos, and websites. This uses a new type of geographical information (geoinformation) different from the traditional type that is authorized and used by internal data formats. Geo-information contents are easily available and freely distributed by using open and more human-friendly data formats such as GeoRSS, Keyhole Markup Language (KML), GPS Exchange Format (GPX), or Microformats. In particular, GeoRSS and KML are increasingly used to store geo-information on the contents of many Geoweb sites.

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

291

<entry> M 3.2, Mona Passage urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a2005-08-17T07:02:32Z

[Time]

<summary>We just had a big one.45.256 -71.92

[Location]

[Qualitative representation]

Figure 2. Examples of spatiotemporal contents on the basis of qualitative representation

Another important context of Geoweb contents is temporal information about content creation and updates. This means we can assume that the content describes a part of the real world at that time. For example, current micro-blogs such as Twitter describe topics ranging from individual experiences to social problems through short messages, photos, and videos. Thus, if we regard these spatiotemporal contents as an information shadow of the real world [12], we can obtain comprehensive knowledge about real-world situations and phenomena. User-generated spatiotemporal contents are, however, usually represented by qualitative methods such as natural language and multimedia. For example, people prefer to use the expressions ‘strong winds,’ ‘hot day,’ or ‘dangerous typhoon’ rather than ‘temperature > 33◦ C’ or ‘pressure < 900hPa.’ Figure 2 shows two examples of spatiotemporal contents as formed by GeoRSS feeds with qualitative representation.

3. Phenomena-based Interconnection with Spatiotemporal and Thematic Relationships In this section, we explain our rough idea for interconnecting quantitative sensor data and qualitative spatiotemporal contents, and then introduce data models and operations for the phenomena-based links between two spatiotemporal data sets. 3.1. Interconnection with Spatial Relationships The integration of multi-source spatial data is an important issue in the dynamic and heterogeneous environment of spatial data infrastructures (SDIs). Figure 3 shows a mean of integration for multiple layers of geospatial data by matching common geometrical elements and linking heterogeneous resources from different data providers on the basis of spatial relationships with respect to common reference objects. Making an explicit link between two different resources facilitates access from one to the other. In particular, topological relationships such as ‘CONTAINS,’ ‘INTERSECTS,’ or ‘EQUALS,’ as shown in [5], have been used for integrating two different spatial data resources. In this study, we also consider a topological relationship to connect measurements and usergenerated spatiotemporal contents. Suppose that we have two data sets: measurements of temperature obtained from sensors and user-generated contents. Over the heterogeneous resources, we deﬁne two spatial regions of interest: the temperature area that satisﬁes the condition of {25◦C ≤ temperature ≤ 26◦ C} and the area that covers the contents related to weather topics such as {d, e}, as given in Figure 4. If we consider the topological relationships between two spatial regions, e.g., ‘INTERSECTS,’ we can as-

292

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

Figure 3. Integration of heterogeneous geospatial data by matching and linking based on common geometrical objects

Figure 4. Connection between measurements of temperature and user-generated contents relevant to weather topic with spatial intersection relationship

sume that the intersected region R is characterized by both quantitative representation of {25◦ C ≤ temperature ≤ 26◦ C} and qualitative representation of {d, e}. Those intersecting spatial regions are, however, not static; i.e., they may change (continuously) over time. Therefore we need a spatiotemporal model to represent dynamic spatial regions over time. 3.2. Moving Phenomena Data Models In our previous study [9], we proposed the concepts of ‘geo-event’ and ‘moving phenomena’ for the aggregation and analysis of spatiotemporal user-generated contents on the Web. The moving-phenomena model was designed to represent phenomena of interest by aggregating web contents based on location, time, and topic, such as disease infection, rumour circulation, hurricane disaster, or global warming. In particular, it focuses on the representation of the continuous movement and thematic changes of a phenomenon over space and time. If we create a moving phenomenon over the scattered user contents, we get a continuum in space-time with thematic information as a collection of contents related to topics describing a certain phenomenon. In this study we also use the movingphenomena model to aggregate sensor data based on phenomena of interest. We regard

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

293

Figure 5. Typhoon phenomenon as a sequence of snapshots by spatial, temporal, and thematic constraints over geo-observation data

a measurement of features of interest as a geo-event with spatial, temporal, and thematic components in the following manner. Deﬁnition Geo-Events A geo-event is represented by a triple of (s, t, v), where s is a geometry projected on a (x, y) geographic coordinate (longitude, latitude) system, t is a time instance, and v is a measurement of the real numbers, integers, boolean values, or character strings, or a set of measurements. A geo-event derived from the measurement of a sensor describes what is observed at a speciﬁc location at a certain time, and we can infer phenomena of interest on the basis of observable geo-events occurring in the world. In our previous model we represented a moving phenomenon by a collection of geo-events and one continuous function that estimates thematic values at any location in space-time. However, the moving-phenomena model for sensor data such as the spread of ﬁre and a hurricane’s evolving is different from web contents because sensor data usually contain continuous features of interest in space and time, such as temperature, wind speed, and wind direction. This means we have to consider not only observed geo-events but also the measurement of sensing features at unobserved geographic locations and times. Figure 5 shows an example of real-world phenomena over geo-observation data. Nowadays, we can easily monitor the historical locations of a typhoon and its properties as geo-events, as shown in Figure 5(a); yet we usually draw a typhoon phenomenon by snapshots with the shape of the ellipsoid that approximates the affected area and the measurements inside the area, as shown in Figure 5(b). Thus we need to estimate the measurements of the unobservable area surrounding geo-events to deﬁne a snapshot at a certain time that satisﬁes spatial, temporal, and thematic constraints. In this study we redeﬁne a data model for moving phenomena over the spatiotemporal continuous measurements in the following manner. Deﬁnition Moving Phenomena A moving phenomenon is deﬁned as a mapping function mp from time-varying spatial locations into time-varying scalar values of an observation measurement over time; i.e., mp: S(t) → V(t) such that ∀ t ∈ T, where T is the life time of the phenomenon, and

294

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

Figure 6. Revised data model for the representation of moving phenomena

S(t), V(t) is a subset of locations, values at time instance t of spatial domain S, and measurement value range V, respectively. The measurement values include the real numbers, integers, and boolean values. According to the deﬁnition of moving phenomena, we need three continuous mapping functions: a spatial function mp and two temporal functions S(t) and V (t) to represent time-varying spatial locations and time-varying scalar values, respectively. S(t) is modelled as a function from time to spatial object, such as a point, line, or region, and V (t) returns a value derived from V at time t. All these functions provide the interpolation methods that estimate range values at an unobserved spatial, temporal, or spatiotemporal domain from a sampled domain set. Intuitively, a moving phenomenon of sensor data is abstractly represented by spatiotemporal continuums with measurement distribution, and captured by a sequence of snapshots by spatial, temporal, and thematic constraints in time. Finally, we modify the data model of moving phenomena in [9], as shown in Figure 6. Compared to the previous model of moving phenomena with a coverage function in a spatiotemporal domain, we classify coverage function as the spatial mapping function and movement function as the temporal mapping functions. This model is motivated by integration of the spatial ﬁeld representation in a geographic information system (GIS), as reviewed by [8], into the sliced fragments of moving objects proposed by [6]. In the sliced representation, a moving type is composed of unit moving types, and each unit moving type is deﬁned by a pair of a time interval and unit function that offers the temporal interpolation method; i.e., (timePeriod, unitFunction). This model allows us to retrieve the locations where an object is located at an arbitrary time instance on the moving object model. Thus, we also design data structures for moving phenomena by the aggregation of unit phenomena of (timePeriod, unitPhenomenon) associated with two temporal functions S(t) and V (t) and one spatial function mp to estimate measurement value at any time at any location inside the phenomena. The revised data model also covers our previous model in [9].

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data Thematic Constraint

Thematic Constraint

MovingPhenomenon

MovingPhenomenon

Temporal Constraint

295

Temporal Constraint

Spatial Constraint

Conceptual model

Interconnection by spatiotemporal and thematic relationships

GeoEvent

Spatial Constraint GeoEvent

Real data Sensor Data

Place

Time

User Content

Measure

Sensor network

Place

Time

Keyword

Social network

Figure 7. Interconnection of multi-source spatiotemporal data on the basis of the relationships between moving phenomena

3.3. Interconnection with Spatial, Temporal, and Thematic Relationships Now we explain how to connect two different spatiotemporal data with spatial, temporal, and thematic relationships. Figure 7 shows the conceptual idea of interconnection between sensing measurement data and user-generated spatiotemporal contents on the basis of moving phenomena. The process of phenomena-based interconnection of multisource spatiotemporal data is described as follows: Step 1: Creating geo-events from multi-source spatiotemporal data sources A geo-event consists of three elements: spatial, temporal, and thematic objects. The representation of the thematic object differs depending on the data source. For example, a sensor geo-event is represented by a numerical value and a geoevent derived from user contents is represented by a sorted list of distinct words excluding stop words. Step 2: Aggregating geo-events by spatial, temporal, and thematic constraint Step 3: Creating moving phenomena as continua in a spatiotemporal domain Step 4: Checking spatiotemporal and thematic relationships between two moving phenomena derived from different data sources Step 5: Making a connection between geo-events inside intersected moving phenomena By representing moving phenomena with spatiotemporal continuity from discrete geo-events we ﬁnd a new topological relationship between two geo-event sets. For example, Figure 8 shows the spatiotemporal intersection relationship between two moving phenomena. Of particular note, we can assume the spatiotemporal proximity of occurrences of geo-events in spatial and temporal dimensions. If a phenomenon occurs in a certain area, the related geo-sensing geo-events are observed at nearby locations and timed simultaneously, and people near to the phenomenon may have greater interest than those further away. We may also deﬁne a new phenomenon on the basis of common characteristics of events at the similar time and place. Even though many events occur accidentally, the co-occurrence of certain location-dependent or time-dependent events is useful information for connecting them. We can therefore consider the inter-connection

296

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data time (t)

Intersection in a spatiotemporal domain

longitude (y) latitude (x)

Figure 8. Intersection relationship between two moving phenomena in a spatiotemporal domain

of two different types of events contained within the spatiotemporal intersection area. In addition to the spatiotemporal proximity, we have to take into account the thematic relationship between moving phenomena. In this study, we use a viewpoint-dependent interconnection method from among the conceptual terms proposed in [11] to ﬁnd geoevents relevant to topics of interest. For example, if a viewpoint is given as ‘Hurricane,’ we select geo-events of web contents containing keywords relevant to ‘Hurricane’ and of observation measurements whose property name is lexically related to ‘Hurricane.’ Then, we create three-dimensional objects with the concept of moving phenomena in a spatiotemporal domain. Finally, we can make a connection between geo-events inside moving phenomena on the basis of their spatiotemporal proximity and thematic relevance.

4. Semantic Tagging System for Natural Geographic Phenomena In this section, we introduce a semantic tagging system for natural geographic phenomena on the basis of the phenomena-based interconnection of sensor data and spatiotemporal contents. It extracts semantic keywords with respect to geographic phenomena, such as hurricanes, ﬂoods, and earthquakes. Figure 9 illustrates the overall system architecture of our semantic tagging system. The system consists of three modules: a moving phenomenon generator, moving phenomenon interconnector, and GeoT3D visualization tool. The moving phenomenon generator helps users to aggregate geo-events by spatial, temporal, and thematic constraints and abstract them as a moving phenomenon using our data model. First the generator selects thematically relevant geo-events from the viewpoint of a user from each data source. Then it creates moving phenomena using those geo-events considering spatial, temporal, and thematic constraints. In particular, the generator should construct a continuous measurement ﬁeld occupying a whole space using observed geo-events at a sampling time that satisﬁes the temporal constraint in the case of geo-events of sensor data. Then it extracts each snapshot at the sampling time by using spatial and thematic constraints, and maps two successive snapshots depending on the temporal function. After creating moving phenomena, the moving phenomenon inter-

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

297

Figure 9. Overview of a semantic tagging system for natural geographic phenomena

connector checks the spatiotemporal relationships between phenomena and makes a link between different types of geo-events inside the spatiotemporal intersection area. We apply a three-dimensional (3D) computational geometry algorithm to check the spatiotemporal relationships between moving phenomena. Based on our previous work, we have tried to represent moving phenomena using 3D geometries, especially the composition of tetrahedrons, in the space-time cube as geographic space by the x- and y-axes (latitude and longitude). The z-axis(height) represents time progress. The GeoT3D visualization tool enables users to easily and rapidly recognize the behaviours and relationships of moving phenomena in a 3D space-time domain. The visualization tool also displays tag clouds for understanding semantic meanings based on the links between geo-events with respect to a moving phenomenon selected by the user.

5. Experiments We present here an implementation of the proposed semantic tagging system. For the experiment, we used real geo-observation data acquired from weather stations and real track data of typhoons that hit Japan in 2005. The geo-observation data contain various types of hourly observation data on the location of each station, such as temperature, pressure, and wind speed. Typhoon track data include the sampling information about pressure, the maximum speed of wind, and the radius as well as the location of the eye during its life span. We also used Web news articles published in 2005 as a set of user-generated spatiotemporal contents. Each data item is stored into a database system as a form of geo-event. In this study, we focus on the creation of moving phenomena using sensor data and the generation semantic tags by interconnecting the created moving phenomena and moving phenomena of user contents introduced in [9]. First, we show an example of the creation of a moving phenomenon over the measurements of wind speed of stations and the trajectory information of a typhoon with several spatial, temporal, and thematic constraints. To create a moving phenomenon, we ﬁrst capture snapshots of wind speed in the area surrounding the eye of the typhoon at each sampling time. Figure 10(a) exempliﬁes the spatial, temporal, and thematic conditions to capture each snapshot of a typhoon phenomenon. For example, we use the distance

298

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

(a) Setting spatial, temporal, and thematic constraints

(b) Sampling snapshots of a typhoon phenomenon

(c) Typhoon phenomenon as a moving phenomenon in a spatiotemporal space

Figure 10. Generating a typhoon moving phenomenon using geo-events of sensor data: measurements of wind speed and trajectory information

constraint ‘within 120 km of the trajectory of typhoon Nabi’ as a spatial constraint, time interval ‘2005/09/04–2005/9/8’ as a temporal constraint, and ‘wind speed is greater than 25 knots and less than 50 knots’ as a thematic constraint. In this study, we ﬁrst built base ﬁelds with the thematic constraint during the time interval, and then we cut out snapshots within the typhoon area from the base ﬁelds (Figure 10(b)). The typhoon area is computed by spatial and thematic constraints. Next, we created unit phenomena between two successive snapshots with a temporal function. In this study, we used the linear function on the basis of 3D Delaunay triangulation as introduced in [10]. Finally, we can acquire a typhoon phenomenon in spatiotemporal dimensions as shown in 10(c). Figure 11 shows a screenshot of the 3D visualization of moving phenomena depending on the data types. As we already mentioned, the 3D spatiotemporal representation helps users directly understand the behaviours of moving phenomena and the spatiotemporal relationships between them. Finally, we can see relevant keywords and contents for the moving phenomenon of sensor data using tag clouds, as shown in Figure 12. Therefore we can explain a typhoon phenomenon which is within 120 km of the centre of the typhoon, occurs during ‘2005/09/04–2005/9/8,’ and has wind speed between 25 knots and 50 knots by a set of keywords {‘typhoon Nabi,’ ‘disaster,’ ‘ﬂood,’ ‘Japan,’ ...}.

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

299

Figure 11. Example of 3D visualization of moving phenomena in a spatiotemporal domain

Figure 12. Visualization of keywords related to a geographic phenomenon with spatial, temporal, and thematic constraints

6. Conclusion In this study, we proposed an interconnection method for numerical sensor data and lexical web contents on the basis of the spatial, temporal, and thematic relationships. In our method, we used the concepts of geo-events and moving phenomena to represent spatial, temporal, and thematic information. In particular, we deﬁned a new data model of moving phenomena over the sensor data for the representation of time-varying continuous measurements, such as temperature, pressure, and degree of air pollution. We also demonstrated a simple system for automatically generating semantic tags of natural geographic phenomena, such as typhoons, climate changes, and air pollution. In the system, we make a connection between two geo-events on the basis of spatiotemporal proxim-

300

K.-S. Kim et al. / A Phenomena-of-Interest Approach for the Interconnection of Sensor Data

ity and thematic relevance and extract semantic keywords using these connections with respect to a certain phenomenon described by spatial, temporal, and numerical thematic constraints. In our future study, we will extend the connection using various types of spatial, temporal, and thematic relationships. Moreover, we will analyze the graph patterns over the linked geo-events on the basis of the moving phenomena.

Acknowledgements The authors would like to thank Mr. Hideki Murasato for his help in implementing the prototype used in this study.

References [1] [2] [3] [4]

[5] [6]

[7] [8] [9]

[10] [11]

[12] [13]

Open Geospatial Consortium. http://www.opengeospatial.org/. Sensing the World with Mobile Devices. Technical report, Nokia Research Center(NRC), December 2009. R Sensor Web Enablement: Overview And High M. Botts, G. Percivall, C. Reed, and J. Davidson. OGC Level Architecture. Technical report, OGC, December 2007. A. T. Campbell, S. B. Eisenman, N. D. Lane, E. Miluzzo, R. A. Peterson, H. Lu, X. Zheng, M. Musolesi, K. Fodor, and G.-S. Ahn. The Rise of People-Centric Sensing. IEEE Internet Computing, 12(4):12–21, 2008. M. J. Egenhofer. Reasoning about binary topological relations. In Proc. of the Second International Symposium on Advances in Spatial Databases(SSD), pages 143–160, 1991. L. Forlizzi, R. H. Güting, E. Nardelli, and M. Schneider. A Data Model and Data Structures for Moving Objects Databases. In Proc. of the 2000 ACM SIGMOD International Conference on Management of Data, pages 319–330, 2000. M. F. Goodchild. Citizens as Sensors: The World of Volunteered Geography. GeoJournal, 69(4):211– 221, 2007. K. Kemp. Environmental modeling with gis: A strategy for dealing with spatial continuity. Technical report, University of California at Santa Barbara, 1993. K.-S. Kim, K. Zettsu, Y. Kidawara, and Y. Kiyoki. Moving phenomenon: Aggregation and analysis of geotime-tagged contents on the web. In Proc. of the 9th International Symposium on Web and Wireless Geographical Information Systems, pages 7–24, 2009. H. Ledoux and C. M. Gold. Modelling three-dimensional geoscientiﬁc ﬁelds with the voronoi diagram and its dual. International Journal of Geographical Information Science, 5(22):547–574, 2008. T. Nakanishi, K. Zettsu, Y. Kidawara, and Y. Kiyoki. A context dependent dynamic interconnection method of heterogeneous knowledge bases by interrelation management function. In Proc. 19th European-Japanese Conference on Information Modelling and Knowledge Bases, 2009. T. O’Reilly and J. Battelle. Web squared: Web 2.0 ﬁve years on. Proc. of the 6th Annual Web 2.0 Summit, 2008. A. Scharl and K. Tochtermann, editors. The Geospatial Web: How Geobrowsers, Social Software and the Web 2.0 are Shaping the Network Society. Springer, 2007.

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-301

301

Modelling Contexts in Cross-Cultural Communication Environments Anneli HEIMBÜRGER1, Miika NURMINEN, Teijo VENÄLÄINEN and Suna KINNUNEN University of Jyväskylä Faculty of Information Technology Department of Mathematical Information Technology

Abstract. In our research, context is defined as a situation a user has at hand. The focus in our study is on modelling contexts in cross-cultural communication environments. These environments can be physical, virtual or hybrid. Crosscultural communication environment – user – situation is the key triplet in our context research. In our paper we discuss context as a key to situation-specific computing. We introduce our cross-cultural communication context tree and context flow architecture and an example of implementation i.e. Context-Based eAssistant for Cross-Cultural Communication (CeACCC). Keywords. Context models, context tree, context flow, cross-cultural communication, e-Assistant for cross-cultural communication, cross-cultural icons

Introduction Cultural competence has become an important dimension for success in today’s international business and research arena. Cultural computing is an emerging, multidisciplinary computer science field as discussed by Fei-Yue Wang in his letter from the editor in IEEE Intelligent Systems Special Issue for AI and Cultural Heritage [31]. In the near future, cultural computing will have several important applications in our knowledge societies in the fields such as business, environment, health care, education and research. What is culture? Culture is embodied in how people interact with other individuals and with their environment; it is a way of life formed under specific historical, natural and social conditions [10, 15, 18, 23, 31]. Culture can be considered as one example of context and cultural computing as a subset of context computing (see the definitions in Table 1). A computational method, a computer system, or an application is contextsensitive if it includes context-based functions and if it uses context to provide relevant information and services to the user, where relevancy depends on the user’s situation. Such applications have to adapt not only to the device, the connection state and the user environment but also to the user’s situation at hand. These parameters partially characterize a contextual situation. For example, a project manager monitors project’s forthcoming milestones by means of the project management system, or he/she examines the same system when preparing the next day advisory board meeting. In the 1

Corresponding Author.

302

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

first case the contextual situation is a long-term project monitoring undertaking with more general content, whereas in the second case the contextual situation is short-term project monitoring task with detailed content. A variety of context models have been subject of research. Many of them model only the physical environment, i.e. location, identity, and time [4, 29]. The focus of our context modelling is on users’ situations at hand in cross-cultural communication environments. We propose a two-level context model that includes a generic level and an application domain specific level. Our application domain is cross-culturality. Our study is a part of the two-year joint project on “Ubiquitous Cross-Cultural Knowledge Spaces/Ubiquitous Cross-Cultural Multimedia Systems for Mobile Computing Societies” between Keio University SFC, the Kanagawa Institute of Technology, Komazawa University, the Tampere University of Technology, Pori, and the University of Jyväskylä [14, 19, 30]. The essential concepts used in our paper are summarized in Table 1. The paper is organized as follows. Context definitions and models are summarized in Section 1. Our context tree and context flow architecture in cross-cultural communication environments are introduced in Section 2. In Section 3 we present an example of implementation “Context-Based e-Assistant for Supporting Cross-Cultural Communication”. Section 4 is reserved for conclusions and issues for further research. Table 1. The essential concepts used in our paper Concept

Definition

Culture

Culture is embodied in how people interact with other individuals and with their environment; it is a way of life formed under specific historical, natural and social conditions [31]. Other cultural levels also exist, such as organization and team cultures; these are out of scope of our paper [20].

Cross-cultural

Considers studies and knowledge between two cultures.

Cross-cultural communication

Consists of human-to-human, human-to-machine, and human-to-environment communication in cross-cultural environments. The environment can be physical or virtual or hybrid

Cultural computing

Research, development, design and implementation of computational models, methods, functions and algorithms for cultural applications.

Context

Situation and/or task at hand. Cross-cultural situation can be considered as one example of context.

Context-sensitive

A computational method, a computer system, or an application is context-sensitive if it includes context-based functions and if it uses context to provide relevant information and services to the user, where relevancy depends on the user’s situation.

Context computing

Context computing can be defined as the use of context in software applications, where the applications adapt to discovered contexts by changing their behavior. A context-sensitive application presents the following features: context sensing, presentation of information and services to a user, automatic execution of a service, and tagging of context to information for later retrieval.

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

303

1. Context: Key to Situation Specific Computing Various areas of computer science have been investigating the concept of context over the last decades. Ubiquitous computing is a new domain in which context is receiving growing attention. For long time, systems like Geographic Positioning System (GPS) and Geographic Information System (GIS) remained the sole source of context for the development of location-aware systems. [4, 29] In the literature several definitions of the term context can be found [4, 29, 34]. Some essential context definitions in the field of computer science are summarized in Table 2. Table 2. Summary of essential context definitions Bazire and Brezillon 2005

The context acts like a set of constraints that influence the behavior of a system (a user or a computer) embedded in a given task. The definition is based on the analysis of a collection of 150 context definitions from several fields of applications. [3]

Coutaz et al. 2005

Context is not simply the state of a predefined environment with a fixed set of interaction resources. It is part of a process of interacting with an everchanging environment composed of reconfigurable, migratory, distributed, and multiscale resource. [7]

Dey et al. 2005

Context is any information that can be used to characterize the situation of entities that are considered relevant to the interaction between a user and an application, including the user and the application themselves. [8]

Leppänen 2005

A context is a conceptual or intellectual construct that help us understand, analyze and design the natures, meanings and effects of more elementary things in the concerned environment or circumstances. It is a whole which is determined by the focal thing(s) of which making sense is important. It is composed of highly related things, each of which represents certain contextual domain. [22]

Winograd 2001

Context is an operational term: something is context because of the way it is used in interpretation, not due to its inherent properties. [32]

The concept of context is still a matter of discussion, and through the years several different definitions have been proposed. Coppola et al. 2009 in [6] divide the definitions into extensional and intensional definitions. Extensional definitions present the context through a list of possible context dimensions and their associated values. The context is represented by the location of the user, the surrounding objects, proximity to other people, temperature, computing devices, user profile, and physical conditions and time. Intensional definitions present the concept of context more formally. Extensional definitions seem to be useful in practical applications, where the abstract concept of context has to be made concrete. However, from a theoretical point of view they are not properly correct, as the context cannot be outlined just by some of its aspects. On the other hand intensional definitions are of little use in practice, despite being theoretically satisfying. Context is a multidimensional concept. Context modelling approaches can be classified by the scheme of data structures which are used to exchange contextual information in the respective system. Context models can be divided into seven categories which are summarized in Table 3.

304

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

Table 3. Context modelling approaches Key-Value Models The model of key-value pairs is the most simple data structure for modelling contextual information. Key-value pairs have been widely used to model the context by providing the value of context information (e.g. location information) to an application as an environment variable. Key-value pairs are easy to manage, but they lack functionalities for enabling advanced context retrieval algorithms. [29] Markup Scheme Models Common to all markup scheme modelling approaches is a hierarchical data structure consisting of markup tags with attributes and content. Profiles (e.g. Composite Capabilities/Preference Profile) are typical representatives of markup scheme based context modelling approach. The markup scheme models are usually derivatives of Standard Generalized Markup Language (SGML), such as the XML. They are often either proprietary or limited to a small set of contextual aspects, or both. [29] Graphical Models A very well known general purpose modelling instrument is the Unified Modeling Language (UML) which has a strong graphical component (UML diagrams). Due to its generic structure, UML is also appropriate for modelling the context. This is shown for instance by Bauer in [1], where contextual aspects relevant to air traffic management are modeled as UML extensions. Other graphical model examples have been introduced by Bauer in [1], by Halpin in [9] and by Henricksen et al. in [16]. Object-Oriented Models The intention of object-oriented context modelling approaches is to exploit the benefits of object oriented approach, encapsulation and reusability, to cover parts of the problems arising from the dynamics of the context in ubiquitous environments. The details of context processing is encapsulated on an object level and hence hidden to other components. Access to contextual information is provided through specified interfaces only [29]. UML-based graphical models can be used to specify object-oriented models. Logic Based Models A logic defines the conditions on which a concluding expression or fact may be derived (a process known as reasoning or inferencing) from a set of other expressions or facts. To describe these conditions in a set of rules, a formal system is applied. In a logic based context model, the context is consequently defined as facts, expressions and rules. Usually contextual information is added to, updated in and deleted from a logic based system in terms of facts or inferred from the rules in the system respectively. Common to all logic based models is a high degree of formality. [29] Ontology Based Models Ontology is a structure that represents relevant entities, their relationships and related rules. Ontologies are usually based on a formal logical model, but ontology modelling focuses more on conceptual knowledge, supplemented with logical rules. OWL is a web ontology language intended to be used when the information contained in documents needs to be processed by applications, as opposed to situations where the content is presented to humans. Ontology based context modelling approach provides a set of ontological concepts to characterize entities such as persons, places or several other kinds of objects within their contexts. An example is context broker architecture which provides runtime support for context-aware systems, for example in intelligent meeting rooms applications. [4, 5, 6, 17, 22, 26, 29] SECI/Shared Context Model Japanese has an interesting concept ba which can be translated as shared context [24, 28]. Nonaka adapted this concept for the purpose of elaborating SECI model of knowledge creation [25]. Ba can be considered as a shared context, a space that serves as a foundation for knowledge creation. This space can be physical (e.g. office, dispersed business space), virtual (e.g., email, teleconference), mental (e.g. shared experiences, ideas, ideals) or any combination of them. Ba provides a platform for advancing individual and/or collective knowledge.

Based on context related research, we can summarize that a complete and comprehensive model is still missing. Some of the main reasons may be the absence of a comprehensive international standard or at least W3C recommendation as well as the

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

305

lack of a reusable reference model that could be applied to manage context in various application domains. In our research, generally speaking, context is a situation at user’s hand. The focus in our study is on modelling cross-cultural communication contexts, i.e. situations at user’s hand in cross-cultural environments. These environments can be physical, virtual or hybrid. Cross-cultural communication environment – user/actor – situation is the key triplet in our context research.

2. Context Tree and Context Flow Architecture in Cross-Cultural Communication Environments In a cross-cultural environment, the user can communicate with (a) another user/actor (or users/actors), (b) a machine or (c) a physical, virtual or hybrid environment. In first stage, our objective is to model cross-cultural communication contexts. Our approach is extensional and ontology-based. We illustrate this by a context tree. In the second stage, our objective is to introduce context flow architecture in cross-cultural communication environments. Our context tree for cross-cultural communication (CTCC) includes two context descriptor classes: general context descriptors and application domain specific descriptors (Figure 1). CTCC is a multilevel tree where the root represents the global cross-cultural communication context: the nodes at the first level refer to the general and application domain specific contexts, the nodes at the second level refer to subclasses and the leaves at the third level refer to attributes, optionally specified at the fourth level. There are situations in each class of descriptors and in their interaction with one another that affect actions taken by or actions accepted by computing entities. The context can also be temporal. When context timing is necessary to keep, we can have a time stamp associated and stored with the context data [12]. We have implemented the context tree using Protégé [27]. Protégé is a free, open source ontology editor and knowledge-base framework. In Figure 2 we introduce information flow and processing architecture for crosscultural communication environments. The system has two main input modes: a situation/task-specific input mode and an explicit/tacit knowledge input mode. The explicit/tacit knowledge input mode can be used to store actor’s own experiences in everyday life or as a feedback from using the Context-Sensitive Service System. By means of the situation/task-specific interface the actor inputs static or dynamic contexts. The context can be divided into low and high level contexts. The inputted low level contexts can be mapped to high level contexts (for example the mapping function transforms geographical coordinates to a street address or a series of geographical coordinates into a route). The high level contexts are transformed to the context integrator and manager module. The contexts i.e. the situation the actor has at hand can be mapped to cross-cultural communication context ontology structure by the context manager. The mapping function transforms the inputted context for reasoning and decisions. The reasoning engine creates decisions which are inferred by means of a relation and rule database. Context logs database includes context history for more detailed situation analysis and for learning of user’s intentions. Reasoning and decision procedures create knowledge

306

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

Figure 1. Context Tree for Cross-Cultural Communication. Application domain specific descriptors are inside the lines on the left side. Others, mostly on right side are general context descriptors. Some examples of potential subclasses and attributes are given.

which will be used by the context-sensitive service in cross-cultural communication environment. Finally, the system gives context-sensitive output for the actor. The output can be knowledge explaining how to act in certain situation; it may also activate searching and delivering contents, running other applications, or more advanced data mining functions.

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

307

Actor Input Modes

Cross-Cultural Communication Context Input Interface

Explicit/Tacit Knowledge Input Interface

Unprompted Input

Static Context High Level • Street Address • Station Name

Feedback

Dynamic Context

Low Level • Coordinates

Low Level • Series of coordinates

Mapping Function to High Level Contexts

High Level • Route • Icon, Sign • Video

Context Integrator and Manager Context Logs DB

Context-Sensitive Service in Cross-Cultural Communication Environment

Reasoning Engine with Appropriate API (for example OWL API)

Cross-Cultural Communication Context Ontology DB External Web Resources

Rule DB

Output to Actor

Figure 2. Information flow and processing architecture for cross-cultural communication environments

3. A Scenario of Implementation: Context-Based e-Assistant for Supporting Cross-Cultural Communication As an example of implementing our context tree and context flow architecture we introduce the Context-based e-Assistant for Supporting Cross-Cultural Communication (CeACCC) [11]. The core idea of the CeACCC is to support the user/actor in a cross-

308

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

cultural situation. The situation can be for example a research or business meeting or travelling. An input to CeACCC is context i.e. situation. An output from the CeACCC system for the user/actor is how to interpret a given context and behave in it. One of the essential design principles in our CeACCC is time context. The user might not always have time to go through all the available information in detail. The application must be able to provide the information in a suitably detailed level, according to the user’s needs, either in greater detail (if time) or more compactly (if not). Let’s study an example where the user is in Japan for the first time and is trying to travel from Tsukuba to Shonandai by train during the rush hour. He/she needs information on the train routes and fares, as well as information on how to behave correctly in stations and trains. The user can use the free browsing feature to get information on various situations and get information on timetables, ticket prices etc. However, it may be tedious to search all the information items individually. The user could instead try to find the situation (travelling from Tsukuba to Shonandai) listed in guided tours and thus gain all the related information more easily. If the situation is not listed in the guided tours, the user can find information by using the search methods. The situation can be inserted in natural language, for example: “travelling from Tsukuba to Shonandai during rush our”. Also the map based search can be used. The user can first select a Tokyo district train map, select the stations and receive all the necessary information. The user can indicate two (or more) points in the map along with some additional preferences, like the shortest, quickest or cheapest route. In the CeACCC, we also sketch a new information search concept, a situation recognition functionality that analyses a user given pictorial file (an image, an icon, a sign or a symbol) of the situation (Figure 3). An example of a situation recognition mode is described as follows: CROSS-CULTURAL SITUATION: A train station in Japan, an unknown symbol for the actor (nationality - for example Finnish, first time in Japan). ACTIVITY: When encountering an unknown sign or symbol in the train station, the actor can take a picture of it with her/his mobile device, and use CeACCC’s image recognition feature to help interpret the sign or symbol. SERVICE: The actor submits the image by her/his mobile device to the CeACCC pictorial database. The actor can also give additional information in order to help the interpretation of the content of the image. The additional information can consist of instructions to focus on certain part of the image or of instructions to omit something from the image. The image service sends the picture to CeACCC’s pictorial database. FUNCTION: A pictorial recognition service indentifies the symbol and associated description of its meaning. SERVICE: The image service sends the symbol description and action guidelines for the actor. ACTIVITY: The actor knows how to interpret the symbol and how to behave in the situation at hand (= context).

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

309

Figure 3. Situation recognition mode. Symbols on the right hand side are examples of pictorial files that the actor could submit to CeACCC’s pictorial database for recognition.

4. Conclusions and Future Research In our paper we discussed context as a key to situation specific computing. We summarized the most relevant context definitions and models from the computer science point of view. The concept of context is still a matter of discussion in many scientific forums, although during the past decades several different definitions have been proposed. While early models mainly addressed the modelling of context with respect to one application or an application class, generic context models are of interest since many applications can benefit from these. Many models also limits only on the physical environment, i.e. location. In our research, context is defined as situation at user’s hand. The focus in our study is on modelling cross-cultural communication contexts, i.e. situations at user’s hand in cross-cultural environments. These environments can be physical, virtual or hybrid. Cross-cultural communication environment – user/actor – situation is the key triplet in our context research. We have introduced our cross-cultural communication context tree and context flow architecture, and an example of initial implementation. Future work involves further development of the CeACCC system, and formalization of the context tree. The image recognition functionality should be implemented on a mobile device and used in actual environment. Our next implementation scenario to be embedded into our CeACCC system is Icon-based Language for Cross-Cultural Communication. Icons are small-sized and isolated signs. Being embedded for example in maps, icons typically indicate points of interest or other discrete object classes. In addition to map icons, icons are familiar from emergency situations, airport signs, hotel information booklets and computer system icons. Traffic signs can also be regarded as a collection of icons. In our system icons can have three functions. They could be used as (1) symbols for cross-cultural knowledge categorization, (2) keys for information retrieval and knowledge mining, and (3) provide us with a culture and language independent way to communicate, i.e. universal communication language [13]. An example of the cross-cultural icon design is given in Figure 4. In our example the cross-cultural context is season, and its sub-context is spring. The Cross-Cultural

310

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

Spring Icon has a Finnish area and a Japanese area which both include symbols that are strongly related to spring in each country. The icon has both horizontal and vertical dimensions for information browsing and deeper knowledge mining in the Web [2, 21, 33]. The concept of visual vocabulary that anyone from any culture, any country, and in any context of life can understand is a very interesting research challenge.

Figure 4. An example of a cross-cultural icon design

Acknowledgements The authors thank the Finnish Academy, the Japan Society for Promoting Science and the Finnish Funding Agency for Technology and Innovation for supporting our research.

References Bauer, J. Identification and Modeling of Contexts for Different Information Scenarios in Air Traffic, Diplomarbeit, 2003. [2] Barakbah, A. and Kiyoki, Y. A Pillar Algorithm for K-Means Optimization by Distance Maximization for Initial Centroid Designation, Proceedings of the IEEE International Symposium on Computational Intelligence and Data Mining (CIDM) 2009, Nashville, Tennessee, USA, March 30 - April 2nd 2009. [3] Bazire, M. and Brezillon, P. Understanding context before using it. Proceedings of CONTEXT 2005, 29–40, Springer-Verlag, 2005. [4] Chaari, T., Ejigu, D., Laforest, F. and Scuturici, V-M. 2007. A Comprehensive Approach to Model and Use Context for Adapting Applications in Pervasive Environments. Journal of Systems and Software 80 (2007), 1973-1992. [5] Chen, H., Finin, T. and Joshi, A. An ontology for context-aware pervasive computing environments, Special Issue on Ontologies for Distributed Systems, Knowledge Engineering Review, 2003. [6] Coppola, P., Della Mea, V., Di Gaspero, L., Lomuscio, R., Mischis, D., Mizzaro, S., Nazzi, E., Scagnetto, I. and Vassena, L. 2009 AI Techniques in a Context-Aware Ubiquitous Environment. In: Hassanien, A-E., Abawajy, J. H., Akraham, A. and Hagras, H. (eds.). Pervasive Computing. Innovation in Intelligent Multimedia and Applications, 157-180. Springer, London 2009. [7] Coutaz, J. Crowley, J., Dobson, S. and Garlan, D. Context is key, Communications of the ACM 48 (2005). [8] Dey, A., Kokinov, B., Leake, D. and Turner, R. (eds). Modeling and Using Context. 5th International and Interdisciplinary Conference CONTEXT 2005, Paris, France, July 5-8, 2005. LNAI 3554. Springer, Berlin, 2005. [9] Halpin, T. A. Information Modeling and Relational Databases: From Conceptual Analysis to Logical Design. Morgan Kaufman Publishers, San Francisco, 2001. [10] Heimbürger, A. 2008, When Cultures Meet: Modelling Cross-Cultural Knowledge Spaces. Jaakkola, H., Kiyoki, Y. and Tokuda, T. (eds.) Information Modelling and Knowledge Bases XIX, Frontiers in Artificial Intelligence and Applications 166, 314-321, Amsterdam: IOS Press, 2008. [1]

A. Heimbürger et al. / Modelling Contexts in Cross-Cultural Communication Environments

311

[11] Heimbürger, A. 2009. CeACCC Design Issues. Presentation in the First Japan-Finland Workshop on Ubiquitous Cross-Cultural Multimedia Systems for Mobile Computing Societies, January 11-16, 2009, January 2009, Tokyo, Japan. [12] Heimbürger, A. 2009. Temporal Entities in the Context of Cross-Cultural Meetings and Negotiations. In: Kiyoki, Y., Tokuda, T. and Jaakkola, H. (eds.), Information Modelling and Knowledge Bases XX, Frontiers in Artificial Intelligence and Applications 190, 290-308, Amsterdam: IOS Press, 2009. [13] Heimbürger, A. 2009. Ubiquitous Cross-Cultural Knowledge Spaces (UbiKnows) Activities at University of Jyväskylä Spring 2009 – Autumn 2009. Presentation at Keio University SFC, Oct. 27th, 2009. [14] Heimbürger, A., Jaakkola, H., Sasaki, S., Yoshida, N. and Kiyoki, Y. 2009. Context-Based Knowledge Creation and Sharing in Cross-Cultural Collaborative Communities. In: Proceedings of the 19th European-Japanese Conference on Information Modelling and Knowledge Bases (EJC2009), Maribor, Slovenia, June 1 - 5, 2009, pp. 78-90 [15] Heimbürger, A., Sasaki, S., Yoshida, N., Venäläinen, T., Linna, P. and Welzer, T. 2010. Cross-Cultural Collaborative Systems: Towards Cultural Computing. In: Kiyoki, Y., Tokuda, T. and Jaakkola, H. (eds.) Information Modelling and Knowledge Bases XX. Frontiers in Artificial Intelligence and Applications, Amsterdam: IOS Press. (To be published in January 2010). [16] Henricksen, K., Indulska, J., and Rakotonirainy, A. Modeling Context Information in Pervasive Computing Systems. In Mattern, F. and Naghshineh, M. (eds.) LNCS 2414: Proceedings of 1st International Conference on Pervasive Computing, Springer, 167–180, 2002. [17] Hide Tokuda Lab 2009. Hide Tokuda Lab (referred Nov. 30th, 2009) <(URL: http://www.ht.sfc.keio.ac.jp/> [18] Hofstede, G. 2009. Geert Hofstede™ Cultural Dimensions (referred 13th Dec. 2009) . [19] Jaakkola, H., Heimbürger, A. and Linna, P. 2009. Knowledge-Oriented Software Engineering Process in a Multi-Cultural Context, Software Quality Journal, Published online: 5 December 2009, Springer. [20] King, W.R., A Research Agenda for the Relationships between Culture and Knowledge Management, Knowledge and Process Management 14 (2007), 226-236. [21] Kiyoki, Y., Kitagawa, T. and Hayama, T. A Metadatabase System for Semantic Image Search by a Mathematical Model of Meaning, ACM SIGMOD Record, 23, 4(1994), 34-41. [22] Leppänen, M. 2005. An Ontological Framework and a Methodical Skeleton for Method Engineering – A Contextual Approach. Jyväskylä Studies in Computing 52, 702 p., Jyväskylä 2005. [23] Lewis, R. 2009. The Lewis Model of Culture (referred Dec. 13th, 2009) . [24] Nishida, K. Fundamental Problems of Philosophy: The world of Action and the Dialectical World, Tokyo: Sophia University, 1970. [25] Nonaka, I and Konno, N. The concept of "Ba’: Building foundation for Knowledge Creation. California Management Review 40 (1998). [26] OWL 2009. Web Ontology Language (referred Dec. 1st, 2009) [27] Protégé 2009. Protégé Ontology Editor and Knowledge-base Framework (referred Dec.17.th, 2009), . [28] Shimizu, H. Ba-Principle: New Logic for the Real-time Emergence of Information, Holonics, 5 (1995), 67-69 [29] Strang, T. and Linnhoff-Popien, C. 2004. A Context Modeling Survey. Workshop on Advanced Context Modelling, Reasoning and Management, UbiComp 2004 – The Sixth International Conference on Ubiquitous Computing, Nottingham/England [30] UbiKnowS 2009. The First Japan-Finland Workshop on Ubiquitous Cross-Cultural Multimedia Systems for Mobile Computing Societies, January 11- 16, 2009, Tokyo and Fujisawa, Japan (referred August 25th, 2009) . [31] Wang, F-Y, Is Culture Computable? A Letter from the Editors, Special Issue: AI and Cultural Heritage, IEEE Intelligent Systems 24 (2009), 2-3. [32] Winograd, T.. Architectures for context. Human–Computer Interaction 16 (2001), 401–419. [33] Zettsu, K., Nakanishi, T., Iwazume, M., Kidawara, Y. and Kiyoki, Y. Knowledge Cluster Systems for Knowledge Sharing, Analysis and Delivery among Remote Sites In: Jaakkola, H., Kiyoki, Y. and Tokuda, T. (eds.) Frontiers in Artificial Intelligence and Applications, Vol. 166, Information Modelling and Knowledge Bases XIX. Amsterdam: IOS Press, 2008. [34] Zhou, M. X., Houck, K., Pan, S., Shaw, J., Aggarwa, V. and Wen, Z. Enabling Context-Sensitive Information Seeking. Proceedings of the 11th International Conference on Intelligent User Interfaces, Sydney, Australia, 2006, 116 – 123.

312

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-312

Towards Semantic Modelling of Cultural Historical Data Ari Häyrinen Digital Culture, University Of Jyväskylä

Abstract. In this paper a practical method is presented for creating documentation of cultural historical targets using an event-centric core ontology. By using semantic documentation templates and an XML-based query language, a domain specific documentation model can be created and flexible user interfaces can be built easily for accessing and editing the documentation. Keywords. ontologies, cultural historical documentation, information retrieval

Introduction The most challenging feature of cultural historical data is its great variety. This variety sets challenges for documentation, information retrieval and user interface design in cultural historical data systems. Also, lately the question of information integration and especially that of cross-cultural information exchange have been raised in organisations such as museums, libraries and archives that collect and maintain cultural historical data. To solve some important problems in the field of cultural historical documentation[1], semantic technologies have been introduced. Conceptual Reference Model (later CRM) is a formal ontology intended to facilitate integration, mediation and interchange of heterogeneous cultural heritage information[3]. The model was created by the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM) on empirical bases from real-world datasets which reflect the special needs of the cultural historical field. CRM also has the status of being an ISO-standard [8]. In this paper, an event-centric, CRM-based method for modelling cultural historical data is presented. The method is designed to help cultural historical documentation work by providing tools for semantically aware documentation of cultural historical items and events. In order to demonstrate method, a software called IDA-framework was implemented.

1.

CIDOC-CRM and event-centric documentation

CIDOC-CRM consists of 86 classes and 143 properties, and it is meant to be extended by users for more specific domains. An important aspect of the CRM class hierarchy is separation between temporal entities and persistent items (Figure 1). It allows

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

313

modelling of history as events, with actors participating in those events. Events can, for example, produce or modify persistent items, or persistent items can be used in the events.

Figure 1: partial CICOC-CRM class hierarchy

1.1.Event-centric model of history The tradition of documentation in the cultural historical field has been very itemcentric[2]. The documentation is organised around physical items, which makes documentation of immaterial objects challenging or even impossible. In the eventcentric approach, the root of documentation process is not necessarily a physical item. For example, when a hand-painted painting is modelled according to CIDOC-CRM, the painting has no direct property called author or artists. Instead, the painting is said to have been produced by a production event, which is carried out by one or more persons. Similarly, if painting is sold, damaged, or restored, these processes are modelled as individual events that affect the state of the painting. The event-centric approach has several advantages compared to traditional, itembased approaches. First, events provide a semantically meaningful way of describing links between physical things and actions of human beings[4]. Second, the eventcentric model allows a very flexible structure for an individual record. Events describe the history of an item, and new events can be added at any time. In practice, this means that the related documentation can be very detailed or just on a general level and that the level of details can be decided by the person creating the documentation. Therefore, documentation can be constructed based on the qualities of the target of documentation instead of some rigid structure. Using explicit events also simplifies data structure design, because events have a common structure: someone did something somewhere during a certain time period. With this formulation, it is possible to represent, for example, a creation of an artwork, a construction of a building or having a scientific seminar. Because events are individual records, they split documentation into smaller units, therefore making it semantically more precise and more accessible.

314

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

The third, and, from the perspective of documentation, very important benefit is that using explicit events in documentation makes events themselves documentable as units. For example, a design process of a building or restoration of a painting can be documented as an individual record. In the traditional item-centric approach this would need specific fields for every event type. To be able to define the cultural context of an item, it is beneficial if the cultural object can be separated from the physical carrier object(s). For example, in the case of architectural drawings, it can be said that the immaterial architectural design is carried out by physical drawings. This way the actual design can be documented independently from the physical documents. This separation also helps when modelling the relation between industrially made objects and conceptual designs. Industrially made furniture is produced by following a certain procedure or design. The design is documented only once even if there are several physical copies of the furniture. However, sometimes it is easier to use a shorter path and model the relations in a shorter way. CIDOC-CRM also allows more simple modelling. For example, one way to model the relation between an architectural drawing and its target is to say that the drawing shows a building. Although this is not semantically accurate, since the drawing illustrates the design of the building and not the building itself, it can be useful when the initial data does not contain the necessary information for the design or when there are no resources to make full mappings.

Figure 2: User interface created based on the documentation template

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

2.

315

IDA-framework

IDA-framework is a simple and flexible tool for making semantically-aware, eventcentric documentation [7]. The core ontology of the framework is based on CIDOCCRM. However, the purpose is not to produce full CIDOC-CRM mappings but to provide a semantically meaningful base that follows CIDOC-CRM's principles and conventions. IDA-framework is aimed specially for small memory organisations like local museums and highly specialized museums whose collections are densely linked. IDA-framework uses events explicitly in its documentation model. The complexity of the CIDOC-CRM is hidden from the end user, and the domain specific documentation structure is defined by semantic documentation templates [6]. 2.1.Semantic documentation template Since CRM does not suggest what to document in any specific case, there must be a mechanism that guides users in making new records[5]. The idea of a semantic documentation template is to provide a record-type specific documentation frame that can be defined by the organisation responsible for documentation. The template defines a typical case for a record, including default properties and default events. More precisely, the documentation template maps parts of thesauri to a partial domain ontology that is build on top of CIDOC-CRM. <Building > <message required="0" width="200" /> <was_produced_by class="Production" required="1" >

<start_day required="0" /> <start_month required="0" /> <start_year required="1" width="5" />

<used_specific_technique class="Architectural_Design" required="0" />

The documentation template does not define any user interface elements. A user interface can be freely created for the template concerned. Figure 2 shows an input form generated by Javascript in the browser. The template helps the user to create an initial target that can be later further refined. 2.2.Semantic query language For performance reasons, IDA-framework is implemented with the help of a relational database system which excludes the use of Xquery or SPARQL as a query language[10][9]. The native query language for relational databases is SQL. While SQL is a well-known and established query language, it is also complex and it requires information about the internal data structure. Another problem with SQL is that queries

316

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

Figure 3: IDA-framework system architecture

can get very complex with the recursive database structures that IDA-framework uses. For this project, a new query language called IDA-QL was developed. IDA-QL is very verbal, and it should be meaningful without any prior knowledge about query languages. The following query will give the IDs of all persons who were born in 1953 and have "Lin*" in their name ("Linus" or "Lintunen", for example): <search> Lin* <was_born>

<start_year>1953

The purpose of IDA-QL is to provide platform independent information retrieval and data manipulation language for cultural historical data. It is a meta query language that is translated to SQL queries. IDA-QL is based on XML, and it operates on classes and properties defined by the ontology. This also means that if the ontology is translated to another language, then also the query language gets translated. 2.3.Implementation IDA-framework server is written in PHP and it uses a relational database through the MDB2-abstraction layer. The ontology is defined by an RDFS file, and the documentation templates are described in an XML file. Communication between the server and the client is done with XML API. Client applications can be written with

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

317

Figure 4: Maps and architectural drawings mapped to CRM.

any programming language as long it can send http calls and is able to parse XML files. The overall implementation is very simple. There are fewer than 6000 lines of PHP code in the server application.

3.

Case study: University Museum of Jyväskylä

Demonstrations were made with the collections of the University Museum of Jyväskylä. Two kinds of items were selected from the collections: maps and architectural drawings. There were about 200 maps and 100 architectural drawings in the museum's collection at the time of the experiment. The museum also has a collection management tool called DUO. Direct export form DUO to IDA-framework was not possible because of the differences in the documentation models. Instead, data was exported from DUO to an HTML file, which was then parsed with Javascript. The input was then manually validated and added to IDA-framework. 3.1.Domain Analysis based on CIDOC-CRM A map depicts some geographical region produced by an individual person or organisation. In terms of CIDOC-CRM, map is a Physical Man-made Object produced by Production event and the Production event is carried out by an Actor. In this case, a simple model was used for modelling the authorship of the map. The producer of the physical map is not necessarily the creator of the map, but in this case it is the assumption in the initial data. In addition, there is a possibility to define several map subtypes, a road map or a tourist map, among them.

318

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

As stated earlier, a simple way to model a relation between an architectural drawing and its target is to say that the drawing shows a certain building. While useful in some cases, this model is not semantically accurate. The architectural drawing does not actually present the building, it merely presents the architectural design, which is an immaterial object. This relationship can be modeled with the conceptual class Architectural Design that is carried by the physical object – the drawing. In this case, a class called Architectural Design was derived from the CRM class Design_or_Procedure. This design was then linked to a production event of the building (Figure 4). This kind of modelling solves also the problem of how to document designs that were never carried out, in other words, plans that do not represent any physical building. The instance of Architectural Design exists even if there are no buildings constructed by following the design. The concept of architectural design also serves to organise drawings, because individual documents can be grouped as carriers of a single, named design. 3.2. Demo Applications In the map application, the user can browse through maps by region, by map maker or by map type. The contents of these navigation panels are created by IDA-QL-queries. The following query gives the names of all persons and organisations who have produced a map: <search> <Map/>

The query returns a list of actors. This list is then parsed by Javascript and displayed in a webpage. When the user clicks a certain name in the list, the following query is used to retrieve information: <search result='result'> <Map> <was_produced_by> <depicts/> <was_produced_by>

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

319

The query selects all the maps produced by the actor in question. The properties inside the result tag define what properties are included in the response XML. The XML response is then parsed with Javascript and the actual display is rendered. Architectural drawings can be viewed by buildings, by designs, by architects and by campus areas. Since this application is about architectural drawings, not about buildings, only the buildings that are presented in architectural drawings should be listed. The following query returns the list of buildings that are depicted by an architectural plan: <Building>

There is no concept of architect in the ontology used. However, the concept of architect can be defined by an IDA-QL query as a person who has produced a architectural drawing and thus a list of architects can be created.

4.

Discussion

The case material was quite limited, and therefore it is too early to draw final conclusions. However, the material shows that the method can be used successfully in the field of cultural historical data, and a semantically rich Web 2.0 application can be built entirely with Javascript using an XML interface provided by IDA-framework. Modifications of the data structure do not require changes in the application code, which makes the system more flexible than other relational databases. The main focus of the software development was in the server side and on developing a flexible way to build user interfaces. Therefore, no actual usability tests were conducted at this point.

5.

Conclusion

A practical method for CRM based documentation of cultural historical targets was presented. By hiding the complexity of the ontology with documentation templates, the system adapts semantic technologies in the field of cultural historical documentation. With a simple query language, it is possible to build flexible and browseable user interfaces without expertise needed for other semantic query languages.

Acknowledgements This work has been funded by the Department of Art and Culture studies at the University of Jyväskylä, by the Finnish Cultural Foundation, and by Kone Foundation.

320

A. Häyrinen / Towards Semantic Modelling of Cultural Historical Data

References [1] Bekiari, C., Constantopoulos P., Doerr, M., Information design for cultural documentation, 2005, 9th DELOS Network of Excellence thematic workshop "Digital Repositories: Interoperability and Common Services", Available from : http://www.ics.forth.gr/isl/publications/paperlink/Information%20design.pdf; accessed Jan 2010. [2] Cameron, F., Robinson, S.Digital Knowledgescapes: Cultural, Theoretical, Practical, and Usage Issues Facing Museum Collection Databases in a Digital Epoch, Theorizing Digital Cultural Heritage : A Critical Discourse. Cambridge, MA, USA: MIT Press, 2007. (165-191). [3] Crofts, N., Doerr, M., Gill, T., Stead, S., Stiff, M. (2006), Definition of the CIDOC Conceptual Reference Model. Available from http://cidoc.ics.forth.gr/docs/cidoc_crm_version_5.0.1_Nov09.pdf accessed Jan 2010 [4] Doerr, M., Kritsotaki, A. (2006), Documenting Events in Metadata. Available from http://cidoc.ics.forth. gr/docs/fin-paper.pdf; accessed April 2010. [5] Doerr M., Iorizzo D. The dream of a global knowledge network—A new approach, Journal on Computing and Cultural Heritage (JOCCH) , [6] Häyrinen, A., A template based, event-centric documentation framework., Available from : http://cidoc. mediahost.org/content/archive/cidoc2008/Documents/papers/drfile.2008-06-26.pdf; accessed Jan 2010. [7] IDA-framework development site: www.opendimension.org/ida [8] ISO-standard 21127:2006, A reference ontology for the interchange of cultural heritage information, available from http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber= 34424; accessed April 2010. [9] SPARQL, http://www.w3.org/TR/rdf-sparql-query/ [10] Xquery, http://www.w3.org/TR/xquery/

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-321

321

A Collaboration Model for Global Multicultural Software Development Taavi Ylikotilaa and Petri Linna b University of Jyväskylä, Faculty of Information Technology Department of Mathematical Information Technology, Finland; [email protected] b Tampere University of Technology/Pori Department of Information Technology, Finland; [email protected] a

Abstract. Software development projects seem to raise many challenges and issues. These issues are exacerbated when the projects are distributed globally and thus software development projects are multicultural. Globalization has increased the need to estimate the effectiveness and cost savings of every project. That is why many projects are outsourced or in other ways distributed to cheaper countries. The main problems of software development projects are related to knowledge sharing, communication, and cultural issues. This paper studies the challenges in global multicultural software development projects and the kinds of collaborative tools available for software development. This paper presents an examination of a collaboration model for global multicultural software development. The model is based on the authors' own work experience and literature research. We propose that this model could be used as a reference model when planning global software projects. Keywords. Global Software Development, Multicultural Software Development, Collaborative Model

Introduction Software development has a fifty-year history and has changed radically during these years. During the first decades, software (SW) development was done mainly by local teams, and after the PC revolution in the 1980s and Internet evolution, software development became really global during the 1990s. This globalization has led to the distribution of software development work to multiple sites in different countries. The development of software at geographically distant sites is referred to as Global Software Development (GSD). This software development may happen within one organization at several sites or between two or more organizations. The software companies may also use offshoring, outsourcing, or subcontracting business conventions, all of which have their own characteristics, but are beyond the scope of this paper. The most important business reasons for utilizing global software development [14] are: (a) the trend towards larger business units, (b) product localization to different countries, (c) the need to operate closer to the clients, (d) the growing need for skilled personnel, (e) the need to reduce costs, and (f) globalization as a path to growth. By utilizing time zone differences effectively it is also possible to achieve continuous, software development around the clock [4].

322 T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

When a software development project is established with development sites located in different countries, people with different national and cultural backgrounds work together on the same project. Cultural sensitivity and understanding is very important in global SW development. According to Hofstede [11], “Culture is a collective phenomenon, because it is shared with people who live or lived within the same social environment, which is where it was learned. Culture consists of the unwritten rules of the social game. It is the collective programming of the mind that separates the member of one group or category of people from others.” According to King [15], cultures can be considered at four levels: national cultures, organizational cultures, organizational subcultures, and subunit cultures. Multicultural is defined in this paper as the comparative knowledge and studies relating to, or including several cultures [17]. Multicultural project members may experience some differences relating to working habits, the usage of language, attitudes towards time, personal space, decision making, discussion style in meetings, and acceptance procedures. It would be very useful in daily working if cultural awareness could be increased, because in this way many misunderstandings and communication problems would be avoided. People with different national cultures in the same project set requirements, especially for project managers, but also for designers. Every experienced SW project manager knows that a geographically dispersed project team is much more difficult to manage than a project team located at the same site. Fluent communication is one of the most important factors in a successful SW project [9]. A classic study made by Tom Allen pointed out already in 1977 that communication drops significantly when the engineers' offices are more than 25 meters from one another [1]. Communication can be divided into formal and informal communication. Formal communication means agreed practices, like project meetings, information sharing sessions, and project status reports. Informal communication means spontaneously occurring communication, like corridor discussions or instant chatting between project members. It has been noted that informal communication is very important, and its significance rises when projects are more complex [10]. Often the news of change in requirements is propagated informally between project members. When a SW project is located at different sites, it sets requirements for communication and for collaborative tools. Many collaborative tools (e.g. telephone, email, teleconferences, videoconferences, net meeting tools, chats, document management systems, etc.) are used in GSD. These tools should make coordination, cooperation, and communication easier, but there are also cultural issues affecting how these tools are used. The main objective of this paper is to define and present a reference model for collaboration in global multicultural software development. This reference model can be used as a basis for collaboration planning in GSD projects. It must be noted that the size of the projects may vary from less than ten members up to thousands of members and thus this model should be adapted, taking into account what is practical for the project in question. This paper is based on the ongoing work at the Department of Mathematical Information Technology of the University of Jyväskylä and the Pori Unit of the Tampere University of Technology. These research projects: Steps in Multicultural Software Business Globalization: Models, Methods and Practices Towards Increasing Competitiveness (STEP), and Ubiquitous Cross-Cultural Knowledge Spaces (UbiKnowS), are researching multicultural software business globalization. This paper is organized as follows: in Section 1, we present the challenges that global

T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

323

multicultural software development is facing. In Section 2, we show the kind of collaborative tools that are currently available, and in Section 3 we discuss the multicultural factors in GSD. In Section 4, we introduce a collaborative model for multicultural GSD. Section 5 is reserved for conclusions and issues for further study.

1. Challenges in global multicultural SW development The software development process has traditionally been plan-driven. The software development process has it roots in the traditional "Waterfall Model" see [21] and its variants. The strict use of this kind of plan-driven development demands the freezing of the requirements of the software product at a very early stage of the development. This means that the development project is unable to make changes to the product requirements. The iterative and incremental modifications of the waterfall model are currently being used because they are more flexible to user requirement changes. One example of an iterative software development process is the Open Unified Process, which is used in current development projects based on agile methods. This model was originally developed for small and co-located teams and this can be problematic for GSD projects. The waterfall model has traditionally been used in global software development, because it relies on formal methods and freezing documents at early stages. This facilitates work distribution between sites and countries. Nowadays there is an increasing need to change product requirements also during development, in order to be competitive. There has also been an increasing interest in agile-based methods and processes, but agile development relies more on informal interaction than formal mechanisms that have been typical for distributed software development. The software industry faces new challenges when moving to agile methods in global software development as well. The most frequent problems for GSD are as follows [9]: (a) strategic issues, (b) cultural issues, (c) inadequate communication, (d) knowledge management, (e) project and process management, and (f) technical issues. In the GSD requirements engineering field, the following problems have been identified [5]: cultural diversity, inadequate communication, knowledge management, and (g) time difference. In GSD projects, it has also been noted that it is very difficult to (h) find who the experts are on various issues at remote sites, and another major problem that has been identified is (i) the lack of trust and lack of willingness to communicate openly across sites [20]: a) Strategic issues. When the decision has been made concerning a global software development project, the next difficult decision is how to divide the work across sites and countries. Usually there is organizational resistance to GSD and this poses a challenge in relation to trust building between sites, which is very important. b) Cultural issues. In GSD, close cooperation between individuals with different cultural backgrounds is needed, and this may cause conflicts if there is not enough cultural sensitivity and awareness. Although the English language is widely used in software development, language understanding and willingness to speak may also be sources of problems. Cultural issues will be discussed further later in this paper. c) Inadequate communication. Software development requires much communication, which can be divided into formal and informal communication.

324 T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

Formal communication is needed for official communication, like project status follow-up, solving the problems in the project, and for agreeing on responsibilities for different tasks. Informal communication is considered important for people to know what other people are working on and who has expertise in what area. There are also many other issues that help people work effectively together. d) Knowledge management. Effective knowledge and information sharing mechanisms are essential in GSD, so that teams know which tasks are on the critical path and team members have up-to-date documentation available. e) Project and process management. The synchronization of work is particularly important, for example, if some sites are doing the development and a team in some other site is doing the testing. There are models of concurrent software development [2] but effective utilization of these is difficult because of changing requirements, unstable specifications, unavailability of tools supporting collaboration, and lack of informal communication [9]. f) Technical issues. A reliable and satisfactory fast telecommunication network is currently a prerequisite for successful software development. Configuration management tools, in addition to other collaboration tools, require the transmission of critical data over the telecommunications network. g) Time difference. Especially when there are teams from several different continents, there are large time-zone differences between teams, and there is only a small overlap available for synchronous collaboration [5]. Synchronous meetings across continents are always difficult for at least one team: the meeting time is either too early in the morning or very late in the evening. In this case, asynchronous communication is used more. One solution could also be rotating meeting times, so that unsatisfactory meeting times are rotated in turn between the teams. h) Finding experts. In practice this means that it is difficult to know in GSD who understands a particular module well or who is responsible for a certain feature. Even if the right person is known, there may be difficulties to initiate contact with the remote site because of cultural and language differences. i) Lack of trust and willingness to communicate. There are many reasons for this, but especially in the case of outsourcing this can be very obvious. People do not want to share expertise if they are worrying about job security. Building trust takes time, and distance is an impediment to building relationships of trust [4]. When there are no (or rare) face-to-face meetings and informal events (like lunch discussions), trust building takes time. In GSD, the software architecture should be modular, so that it enables the distribution of modules to different sites. A modular design reduces complexity and allows parallel development in different sites [4]. A modular design and successful division into sites decrease the need for communication between sites, if one module or certain tightly coupled modules can be developed at the same site. When we look at the issues listed above, we notice that most of them are somehow connected to communication and more widely to collaboration between sites. It is evident that putting effort into collaboration planning and increasing cultural awareness would be worthwhile for global software projects.

T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

325

2. Collaborative tools used in GSD Collaborative issues are a current trend. Companies need to evaluate and develop their strategies for this collaborative trend. They need more homogeneous communication habits and must perhaps learn to use very different kinds of tools than those they are used to. The companies' networks and systems administrators face big challenges to secure all information in communications, but new collaborative tools are changing corporate communications habits. Employees may try to use the same collaborative tools in work that they use in civil life – especially social networking. Many times administrators try to prevent the use of these third parties' web-based collaborative tools, because they want to keep communications in secured networks and tools under their own control. In the 3C model, collaboration includes three perspectives: communication, coordination, and cooperation. The model idea was originally proposed by Ellis, Gibbs and Rein [6] (see i.e. [13]). Communication includes issues such as formal and informal communication, coordination includes issues like how to divide tasks between employees, and cooperation includes issues like what kind of cooperation process is used. All these perspectives should be taken into account when selecting a collaborative tool. The term “Computer Supported Cooperative Work” (CSCW) was introduced back in 1984 by Irene Greif and Paul M. Cashman [8]. It is a field that studies, for example, how to use groupware or design. CSCW systems are meant to support communications, coordination, and collaboration. CSCW attracts researchers from many fields, such as communication researchers or anthropologists. CSCW is not only a technological issue, there is also the human side. CSCW includes a matrix, which has time and place axes. The time axis means that an event can be asynchronous or synchronous. The aim of this matrix is divided into four different sections of groupware tools. The sections include different kinds of needs for tools. Typical groupware tools are email, group calendar, chat, and video communications. Nowadays it seems that this matrix is no longer enough. Collaborative tools of a new kind are making this matrix more complicated. Penichet [22] say that there is a need to expand this CSCW matrix, because there are now more complex tools, and more are coming every day. There are many kinds of collaborative tools [26], and even in the area of GSD, several categories can be found. Whitehead [25] divided them into four categories: model-based tools, process centered support, awareness, and collaboration infrastructure tools. The Tigris (www.tigris.org) project has categorized tools in eleven different areas. The aim of this project is to gather open-source GSD tools in one place. It may be problematical for large companies to select open-source based tools. They want reliable tools, which they can control themselves and could get help with if necessary. They may prefer commercial products, such as IBM Rational, Borland Together, or Sparx Systems Enterprise Architect. Cheaper open source-based tools can be a very good choice for smaller GSD companies. For the last couple of years, the social media concept has also existed, which is specialized in making multi-level communities on the web. This concept is also very interesting for GSD companies, because these create new opportunities, for example, to collect feedback from groups outside of the company, to find appropriate employees, or to advertise products with new technology. One collection of collaboration tools is Robin Good’s collaborative map [7]. This mind map is based on people’s recommendations of collaboration tools. The mind map

326 T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

categorizes tools in over ten different areas and helps to find web-based collaboration tools: x Project management tools, e.g. Basecamp and 5pm x Collaborative writing tools, e.g. Google docs x Collaborative reviewing tools x File sharing tools x Document sharing –wiki tools x Work grouping tools x Web presenting tools x Private social networking tools x Whiteboarding tools x Web conferencing tools x Video conferencing tools, e.g. Skype x Screen sharing tools x VOIP – audio conferencing tools, e.g.. Skype x Instant messaging tools x Chat tools x Event scheduling tools The software development project also has to select what kinds of tools are used in the following areas [4]: x Software Configuration Management tool x Document management tools x Computer Aided Engineering tools x Programming tools x Testing tools x Error management tools x Requirements management tools Email is still the most familiar tool that is widely used in GSD projects. The benefit of this tool is that it is easy to use and if users have problems with language, there is always the possibility to read and write messages in their own time. This tool is also asynchronous, which is good if there is a big time difference between sites. The disadvantage of email is that sometimes it may take time until you receive a response if the other party is delayed for some reason. Each project can be very different, and as a result, different variations of tools will be needed. Marcus noted that culture has an effect on user interfaces [19]. Consequently, global multicultural projects should select tools that are global and do not include nationalities or cultural specialties. This helps employees use tools effectively without needing to think of their cultural backgrounds. In summary, the project manager has a lot of tools available, and it is challenging to find the most appropriate tools to support most effectively the project’s needs.

3. Multicultural factors Culture has been the subject of research for a long time. Despite this long research history, there are many challenges, and the unclear definition of culture is one of them.

T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

327

In 1952, Kroeber and Kluckhohn [16] found over 164 definitions, and Lonner [18] found over 200 definitions in 1994. Hoft [12] has categorized culture into four Meta models: the onion, the pyramid, the iceberg, and the objective and subjective models. GSD companies need to understand that culture has an impact on many levels in the company: national, organizational, team, individual, customer, and product. At the national level, there are requirements, such as understanding each country’s management and communication styles. At the organizational level, there are requirements, such as understanding working styles or how strong the organizational culture is. In a weak organizational culture, national cultures are reflected in the organization more than in stronger organizations. At the team level, it is important, for example, to understand communication habits. In particular, GSD companies need to think about cultural factors, because teams do not necessarily have face-to-face communication. In addition, GSD projects need a lot of communication, and in more complex projects, the need for informal communication increases. When selecting new employees for a company, the HR unit has to take into consideration that these people will work in multicultural teams. It has a significant advantage if new employees already have good cultural awareness, which prevents and overcomes potential communication stumbling blocks [23]. At the product level, there are requirements to consider, such as product localization and designing in software case user interfaces [19]. The aforementioned 3C model’s perspectives in comparison with Hofstede’s national cultures dimensions give some ideas of how to select the appropriate collaboration tool for a multicultural team. Hofstede’s dimensions are power distance (PD), individualism (IDV), masculinity (MAS), uncertain avoidance (UA), and longterm orientation (LTO) [11]. The following section discusses some examples of how cultural issues can affect the 3C model perspectives and thus affect the selection of the most appropriate collaboration tool. There are PD differences country by country. There are many kinds of leadership styles, which affect the whole collaboration perspective, for example, strong hierarchical systems will tend to avoid open discussion with others or cooperation rules and coordination instructions can be understood in different ways. The impact of IDV on team structures is that there are many employees thinking in the “we” or “I” mindset. There might be a need to look for collaboration tools that make the team society stronger and motivated to work together. MAS cultural factors affect whether there is a need for a soft collaborative style, taking into account all ideas and opinions, or whether the more masculine style is better. UA affects how deeply collaboration rules should be dealt with first. This might be challenging to do later on without losing face. Informal communication channels are more helpful for asking something that cannot be asked formally without losing face. UA also affects how straight and clear the rules need to be. LTO affects what kind of time attitudes employees might have, for example, some like to do many tasks at the same time, but some like to do one thing at a time. It is not possible to know all the cultural differences, but it is more important to accept that there are cultural differences and to be flexible and try to turn it in a positive direction, by focusing on more idea-rich teams. With collaborative tools, it is possible to decrease the threshold and also maintain a good selection of informal communication tools. It is important to find a tool which enables one to build stronger teams, avoiding separation into cultural sub-teams and making a team which works toward a common target. After all, it is complicated to give direct advice on how to control cultural factors, because culture theories always incorporate generalizations and

328 T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

project management should lead individuals who may have very different kinds of personalities. Nevertheless, culture theories give good ideas of what kinds of scenarios may arise and give project managers some ideas of how to control them. Project management is very much about human leadership. The main challenge for software development projects may be knowledge sharing [24], including communications and document sharing. It is important to select a collaboration tool carefully to minimize multicultural conflicts. Communication should be supported on many levels, and the informal communication threshold can be lowered with collaborative tools. Informal communication becomes more important when teams are distributed globally, therefore it is good to use various communication tools. Face-to-face communication is important, because it offers non-verbal clues of others’ thoughts. The project manager has the major responsibility of selecting suitable collaborative tools. The manager has to recognize his team's skills and then choose the most appropriate tools to support all the project's needs. The project manager should have a very good cultural intelligence and awareness if the employees of the project team have very different backgrounds. The project manager has to be a good global player. This is not a simple task, when trying to avoid all the typical GSD challenges listed in Chapter two.

4. Collaboration model Global software development has increasingly been using collaboration technology, not only for communication, but also for project management, software configuration management, document management, error management, requirements management, and for computer aided design [4]. Successful selection and usage of those tools are thus essential for the software project. By introducing our model, we want to emphasize that collaboration is not only the correct selection of tools, but it also increases cultural awareness and effective ways of working together. This model gives view of collaboration issues and during actual projects there are usually more phases where issues like SW product inspection, checking, and testing should be planned carefully. We have divided this model into three phases: collaboration initiation, collaboration planning, and the collaborative project phase. In each phase there are certain tasks that should be accomplished and some main decisions that should be made in order to move from one phase to another. An organization enters the collaboration initiation phase when a need is identified to establish if the software development project should be developed globally. This paper does not describe how this decision should be made, because it is very much dependent on the organization’s size and what the planned SW business model is. In the collaboration initiation phase it is very important to take multicultural factors into account and when performing collaboration planning, the perspectives of the 3C model could be used.

T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

Study for global SW development

329

Not global project

Global collaboration initiation Decision to develop Software globally Global software project transfer to mainenance Collaboration main enablers: 1. Project management 2. Requirements management 3. Cultural awareness 4. Informal + formal communication 5. Information sharing 6 b ildi

Global Collaborative project

Global collaboration planning Global software project kickoff

1. 2. 3. 4. 5. 6. 7. 8. 9. x x x

Document management Version management Building planning Error management Meeting practices + decision making Cultural differences Testing coordination Trust building between sites Collaboration enablers:

1. 2. 3. 4.

Collaborative SW architecture Collaborative tools selected Collaborative Requirements analysis Communication plan including time difference planning

Finding experts Precense awareness Common calendars

Figure 1. Collaboration model for a GSD project

When a decision is made to implement a SW project globally, the project is moved to the collaboration planning phase. In this phase, the following collaborative tasks (Table 1) should be performed in project planning. Table 1. Tasks for the Global SW project collaborative planning phase Task Collaborative SW Architecture

Collaborative tools selected Collaborative requirements analysis

Communication plan including time difference

Task explanation When planning high level SW architecture it must be noted that SW modules should be easily distributed to different sites, and interfaces should be well defined [3]. Communication, requirement management, version management, etc. tools should be selected for the project. At least one representative from each site should be involved in the requirement analysis phase, in order to achieve a common understanding of the requirements and to gain commitment [3],[5] In the GSD project planning of communication conventions and meetings are very important. In this plan both formal and informal communication should be taken into account. However, if there are representatives from several continents in the project, finding suitable times for teleconferences may be very difficult.

During the transition from the global collaboration planning phase to the global collaborative project execution phase, a global software kickoff meeting should take

330 T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

place, in which there should be at least one representative from all the sites. It would be better if all project members attended this event, but it is not always possible for cost reasons. This event is a good place for information sharing and for people to meet each other. It is a good idea also to have some informal activities in this event, if possible. In the actual project execution phase there are several tasks where collaboration between sites should be taken into account (Table 2). Table 2. Tasks for the Global SW project collaborative planning phase Task Collaborative document management Collaborative version management Collaborative build planning Collaborative error management

Meeting practices and decision making Understanding cultural differences Testing coordination Trust building between sites

Collaboration enablers

Task explanation Document management practices should be implemented so that documents are available to all parties. Version management practices should be implemented so that all parties have access to the version management system. Build planning and practices should be implemented so that all sites have been taken into account. Error management practices should be implemented so that all sites are involved and it should be decided clearly which site is responsible for certain error solving. Project meeting practices and decision making should be performed in a way that information about the project situation and decisions is available to all parties. Global software project members should have basic knowledge of the main differences in other cultures. Testing coordination between sites should be implemented so that testing responsibilities are clear to all parties. This is a difficult but important issue and should be understood at least by the project management. Ways to increase trust between sites: the involvement of all sites in meetings and information sharing. Informal communication is also an important way to build trust. The following collaboration enablers should be taken into account from the GSD point of view: finding experts, presence awareness, and shared calendars. These issues should be taken into account because they are essential for the GSD project.

The global collaborative project is transferred to the maintenance phase when all the planned tasks in the project have been done. In our model the project phase is moved again to global collaboration initiation and it must be decided how maintenance will be implemented – at one site or globally? If maintenance is also done globally, this model can be utilized for maintenance collaboration planning. The most important enabling tasks for the global software project are listed in Table 3. Table 3. Main enablers for collaborative projects Task Collaborative project management Collaborative requirements management Cultural awareness

Informal and formal communication

Information sharing

Task explanation Careful project management practice planning is essential for a collaborative project. Careful requirements engineering and planning throughout the whole project schedule are essential for a successful GSD project. All sites should be involved in this work. Increasing cultural awareness between distant teams is very important so that members from different sites can understand each other better. Informal communication between sites should be encouraged and supported by the management. Meeting practices (e.g. weekly meetings) should be planned and agreed between sites. Information sharing practices should be planned, taking into account the distance between the sites.

T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

331

The collaboration planning for global software projects is supported by our reference model which should be used by program managers when they are planning a global software project. Our model may also be used for training purposes in order to identify what kinds of phases and tasks should be taken into account when planning collaboration in a global software project.

5.

Conclusions and issues for further research

In our paper we have analyzed the challenges currently faced by global multicultural software development. We have identified that many problems in GSD projects are initially related to communication. We have also introduced collaboration tools that have been used by software companies and organizations. We have briefly introduced what kinds of cultural issues global software projects are facing. In order to strengthen the planning and execution of a global multicultural software project, we have introduced our collaborative model. This model points out issues and tasks that should be taken into consideration during project planning and execution. This model includes cultural factors that should also be considered in global projects. By utilizing our model, companies can easily detect which issues and tasks are important when they plan a GSD project. Our future research will focus on the following subjects: x How do cultural factors affect global software development? x How do cultural factors affect the design and implementation of software products and services? x How do cultural factors affect internationalization, target country selection, and entry mode choice for software firms? Our research will be based on future interviews with Finnish companies and organizations that are working in the area of global multicultural SW development.

Acknowledgements The authors wish to thank the Academy of Finland and the Finnish Funding Agency for Technology and Innovation for supporting our research.

References [1] Allen T. Managing the flow of technology, MIT Press, Cambridge MA, (1977) [2] Blackburn J.D., Hoedemaker G., Van Wassenhove L.N., Concurrent Software Engineering: Prospects and Pitfalls, IEEE Transactions on Engineering Management Vol. 43 May 1996, (1996), 179-188 [3] Battin R.D., Crocker R., Kreidler J., Subramanian K., Leveraging Resources in Global Software Development. IEEE Software March/April 2001 (2001), 70-77 [4] Carmel E., Global Software Team: Collaborating Across Borders and Time Zones, Prentice-Hall, New Jersey, (1999), 9, 142-144, 137-129, 103-114, 91 [5] Damian D. E., Zowghi D., Requirements Engineering challenges in multi-site software development organizations. Requirements Engineering Journal, 8 (2003), 149-160

332 T. Ylikotila and P. Linna / A Collaboration Model for Global Multicultural Software Development

[6] Ellis C. A., Gibbs S. J. and Rein G. L. Groupware: some issues and experiences. Communications of the ACM, (1991) 34(1), 38-58 [7] Good R. 2009. Best Online Collaboration tools 2009. http://www.mindmeister.com/12213323, Accessed 12th January 2010 [8] Grudin J., Computer-Supported Cooperative Work: History and Focus. Computer 27, (1994),19-26 [9] Hersleb J.D., Moitra D., Global Software Development. IEEE Software March/April 2001, (2001), 16-20 [10] Hersleb J.D., Mockus A., Finholt T.A. Grinter R.E., Distance, Dependencies, and Delay in Global Collaboration. Proceedings of the 2000 ACM conference on Computer supported cooperative work , (2000), 319-328 [11] Hofstede, G., Hofstede, G. J., Cultures and Organizations: Software of the Mind: Intercultural Cooperation and Its Importance for Survival. New York: McGraw-Hill, (2005), 4 [12] Hoft N.L., Developing a Cultural Model. International User-Interfaces, (1996), 41-73. [13] Hugo Fuks, Alberto Raposo, Marco A. Gerosa, Mariano Pimentel and Carlos J.P. Lucena, 3C Collaboration Model. (Ed.) Ned F. Kock. Encyclopedia of E-Collaboration, (2007), 637-644. Idea Group Inc. [14] Jaakkola H., Heimbürger A., Cross-Cultural Software Engineering - Interkulturalni softverski inzenjering (Abstract). Paper presented at the International Conference on Society and Technology 2009. Juraj Plenkovic (ed.), Croatian Communication Association; Informatologia no: 12 (2009). Full paper will be published in Informatolologia December 2009, (2009) [15] King W. R., A Research Agenda for the Relationships between Culture and Knowledge Management. Knowledge and Process Management volume 14 number 3, (2007), 226 – 236 [16] Kroeber A.L., Kluckhohn C., Culture: A Critical Review of Concepts and Definitions. Peabody Museum. (1952) [17] Lewis R.D., When Cultures Collide. Managing Successfully Across Cultures, London: Nicholas Brealey Publishing. (1999) [18] Lonner W.J., Culture and human diversity. In Human diversity: Perspectives on people in context, E.J. Trickett, R.J. Watts and B. D Eds. Jossey-Bass, San Francisco, (1994), 230-243 [19] Marcus A., Cross-Cultural User-Experience Design 4045, 4th International Conference, Diagrams 2006 4045, (2006), 16-24. [20] Mockus A., Hersleb J. Challenges of Global Software Development. Proceedings of the Seventh International Software Metrics Symposium (Metrics '01) April 2001, (2001), 182 [21] Pfleeger S.L., Atlee J.M. Software Engineering: Theory and Practice. Third Edition. Pearson and Prentice Hall. (2006), 45-66 [22] Penichet V.M.R., Marin I., Gallud J.A., Lozano M.D. and Tesoriero R., A Classification Method for CSCW Systems. Electronic Notes in Theoretical Computer Science 168, (2007), 237-247 [23] Shah S. The researcher/interviewer in intercultural context: A social intruder! British Educational Research journal 30, (2004), 549-575 [24] Vesiluoma S., Understanding and Supporting Knowledge Sharing in Software Engineering. Ph.D. Thesis. Tampere University of Technology, (2009) [25] Whitehead J., Collaboration in Software Engineering: A Roadmap. In Proceedings of the 2007 Future of Software Engineering2007 IEEE Computer Society, (2007), 214-225 [26] Wikipedia, Collaborative Software. http://en.wikipedia.org/wiki/Collaborative_software Accessed 11th January 2010

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-333

333

A Culture-Dependent Metadata Creation Method for Color-based Impression Extraction with Cultural Color Spaces Totok Suhardijantoa, Kiyoki Yasushib, and Ali Ridho Barakbaha a Graduate School of Media and Governance, Keio University b Faculty of Environment and Information Studies, Keio University 5322 Endoh, Fujisawa, Kanagawa, Japan, 252-8520 [email protected], [email protected], [email protected] Abstract: It is becoming important to realize a cross-cultural communication environment among societies with different cultures. Images are effective media for exchanging cultural characteristics across cultures. This paper presents a culture-dependent color-emotion model for cross-culture oriented image retrieval system that realizes color-emotion spaces to search images with human emotion aspects. Many image retrieval systems have been featured by color analysis, but culture-dependent aspects of images are not considered intensively. Our system creates color impression spaces based on Ekman’s 17 basic emotions. For the first step, the culture-dependent color impression space is created by using culturalfeatures. We apply automatic clustering using our previous method “Valley Tracing” in order to generate dynamic representative colors. This system automatically creates a set of culture-dependent color impression metadata. Keywords. color-emotion based image search, emotion information, culturedependent color-emotion

1.

Introduction

Nowadays there are large quantities of images and visual information available. These images are existing in structured collections (e.g. museum collections) or independent (e.g. images found in Web pages in the form of individuals’ photographs, logos, and so on) [6]. Collection of images became vary not only in number, but also in types because nowadays people with different cultural backgrounds enable to share their images over the world. Along with the trend of cross-cultural communication issues, an image retrieval system that enables to take culture-dependent issue into consider has more challenges in near future to provide better ways and approaches in dealing with culture-dependent images. With regard to culture-dependent image retrieval, the issue of association between image color and emotions or impression is essential. This issue has already been addressed for many years. It has been attracting many scholars from various areas of studies. The color-emotion oriented image retrieval system has been already proposed in number of researches [8], [9], [25], [26]. Current computer vision techniques allow us to extract automatically low-level features of images, such as color, texture, shape and spatial location of image elements, but it is difficult to extract high level features automatically, such as names of objects, scenes, behaviors and emotions [8]. Current

334

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

image retrieval still has semantic gap in dealing with cross cultural environments [3], [4]. Although several emotion based image retrieval (EBIR) systems and emotion semantic image research (ESIR) systems have already been proposed in [9], [26], [25], these systems have not addressed issues yet for culture-dependent emotion. In this paper, we describe how to generate culture-dependent emotion-based image metadata by using vector space. In our previous research, we have introduced a method to deal with content based image retrieval based on three basic features including structure, shape and color [1]. The method makes use of 3D-Color Vector Quantization to extract color features in each block of image partition. We use this method to do same thing, but in this research we elaborate it with cross-cultural features of color and emotion. Our proposed method is based on assumption that perceived color-emotion in an image is mainly affected by human past experiences and cultures. We generated culture-dependent emotion based image metadata through these following steps. First, color features are extracted using 3-D Color Vector Quantization of RGB color space which is presented in detail in Section 4. In this step, color features are converted into RGB color space. Independently, culture-dependent emotion features are acquired from cultural knowledge through a survey that is addressed to 89 respondents from three different cultures including 15 Vietnamese, 42 Indonesian and 32 Japanese. Respondents were asked to indicate their emotion or impression based on Ekman’s 17 primary emotions [24] that associates to certain colors. This survey result is used for creating culture-dependent color-emotion model. Second, emotion features are represented as a vector and mapped onto the RGB color space in order to correlate with nearest colors. Because a particular color can associate with different emotions and vice versa, in order to generate dynamically representative colors of emotion, we use automatic clustering method. After representative colors are chosen and irrelevant colors are excluded, we used them in creating culture-dependent color-emotion metadata. The system overview is shown in Figure 1.

Figure 1: System Overview of Culture-Dependent Color-emotion Metadata Creation Method

This paper is organized as follows. In the Section 2, we review the association between color and emotion. Then, the basic emotion model used in this research is characterized. Section 3 provides description of color features and the 3-D Color

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

335

Vector Quantization approach. Section 4 describes the culture-dependent metadata creation method. In this section, we explain how to determine and select dynamic representative colors. In Section 5, we perform our experiment in using the cultural color space for calculating the cultural distance. In Section 6, we provide the conclusion and future works. 2.

Color and Emotions

2.1 Related Work The study of color-emotion for single colors and two-color combinations is an established research area. A number of scholars from various disciplines have already performed investigations in these subjects. A few numbers of papers addressed the problems and challenges in this research area. Wang & Wang [8] characterize four problems faced by researchers in this field of research such as (i) extracting emotion features from images, (ii) defining emotion information user’s request, (iii) building emotion user models, and (iv) individualizing emotion user models. They found that when users are looking for images, mostly the semantic structure of the image is the key for selection. Similar to [8], Wang & He [25] also mentions three key issues in dealing with emotion based image retrieval, that is, emotion semantic representation, image feature extraction, and emotion recognition. Several papers addressed the problem of including the problem of including coloremotions in the systems. Wang & Yu [9] and Corridoni et al. [10] focused on color content and applied clustering in the color space to segment images into regions with homogenous color. Then, the regions are associated to semantic emotion terms and used for indexing images. Wang & Yu [9] proposed method of annotation and retrieval based on three-dimensional emotion space. From histogram features, emotional factors are predicted using a Support Vector Machine. Similar to [9] that uses threedimentional emotion space, Solli and Lenz [26] implement psychophysical experiments, that is, color-emotion metric with three scales: activity, weight and heat in finding interest points corresponding to homogenous emotion regions and transitions between regions. Emotion characteristics for patches surrounding each interest point are saved in a-bag-of-emotion that can be used for retrieving images based on emotional content. There is an only small number of papers that addresses cultural-oriented emotionbased image retrieval systems. One of them is performed by Sasaki [7]. The method is to dynamically create image-query which represents user preference using common features of impressions as context. Aside from using selected context, in order to extract close-related images, the method also makes use of combination or multiple images. The method also provides users an image retrieval environment which reflects historical aspects and cultural-semantic impression of color and allows users to retrieve images by submitting particular queries such as historical period, culture, and artists. The works of Hofstede [28] and Lewis [29] are quite dominant in the area of culture computation. Because our system deal with computation in cross-cultural environment, it is better at the first time to place the difference between our system and both Hosftede [28] and Lewis [29] system. The work of Hofstede, which is held by itim Culture & Management Consultancy, approached culture in a more general view, that is, from organizational and institutional perspectives. They created a system called CultureGPS tool based on Hofstede’s 5-D Model of Culture. This tool allows users to

336

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

analyze behavior differences in intercultural encounters and to predict to a certain degree, which interactions evolve when people from different nationalities meet and work together. The system provides users with insight into differences in thinking, feeling, and action of people around the world. The work of Lewis [29] is implemented by The National Cultural Profiles (NCP) for creating a web-based global cultural database. The Lewis’ Model of Culture is claimed to be the most practical theoretical approach to classifying cultures. Lewis [29] distinguishes cultures into three polar types including multi-active, linear-active, and reactive types. The work of NCP is aimed at providing a compact guide for business practitioners to cope with cross-cultural communications. They created a system that allows users to recognize and think patterns of all the world’s major cultures when they meet counterparts from different backgrounds. In this system, they emphasize crosscultural communication aspect. Not similar to the previous works that deal with a wide scale of cultural aspects, this paper only copes with color and emotion association across cultures. We focus on how a correspondence between color and emotion in a culture and its comparison with those of other cultures. The main issue addressed in this paper is to provide metadata creation method in dealing with emotion and color aspects in an image retrieval system. 2.2 Association between Color and Emotion Color cannot be divisible with our everyday lives. It is widely known that colors have also a strong impact on our emotions and feelings [12], [13], [14]. For instance, the color red has been associated with excitement, yellow as cheerful, and blue with comfort and security [15], [16]. The relationship between color and emotion is closely related to color preferences. Particular colors have been claimed to be highly preferred regardless of age, racial group or culture [17], [18], while there is also an evidence that particular color is preferred with regard to the cultural basis [11], [19]. These numbers of studies show that color and emotion can be associated. Dealing with emotions is a difficult task. Plutchik [21] argued that study of emotion is difficult because some relate to ambiguities in the language of emotion, some to inconsistencies in definitions of the concept, some to the problem of how and to what extent emotion applies to animals, and some to the impact of different historical traditions. Due to the difficulties of research on emotion, Plutchik suggests two alternative approaches to deal with. First, the identification and meaning of emotion concepts should not depend entirely on naïve judgment but should depend on their place within a theory. Second, the identification of emotion can be performed by assuming the existence of primary emotions, somewhat like the primary colors. In this paper, we began the research based on second approach. If one accepts the concept of primary emotions, the question arises of their relations to each other and how many emotions could be regarded primary ones. Over the years, a number of studies have proposed lists of basic or primary emotions. Among these studies, the most influential works are [20], [21], [22], and [23]. Their proposed lists of primary emotions are shown in Table 1. Plutchik [20], [21] proposes his primary emotions based on evolutionary or adaptive biological processes, while Izard [22] suggests his basic emotion with regard to the neural basis. Ekman [23] proposes his basic emotion based on facial expressions. He believes that facial expressions of emotion are not culturally determined, but universal across human cultures and thus biological in origin. However, in his work in 1990s, Ekman [24]

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

337

expanded the primary emotions from 6 basic emotions to 17 basic emotions including a range of emotions are not encoded in facial expressions. Table 1: Renowned Basic Emotions Proposed by Plutchik, Izard, and Ekman Theory

Emotions

Plutchik [20], [21] Izard [22] Ekman [23] Ekman [24]

fear, anger, sadness, joy, acceptance, disgust, anticipation, surprise fear, anger, enjoyment, interest, disgust, surprise, shame/shyness, contempt, distress, guilt anger, disgust, fear, happiness, sadness, surprise anger, disgust, fear, happiness, sadness, surprise, amusement, contempt, contentment, embarrassment, excitement, guilt, pride in achievement, relief, satisfaction, sensory pleasure, shame

This study applied Ekman’s categorization of primary emotions to create coloremotion metadata due to following reasons. First, it is proposed based on facial expressions that are universal across cultures. Second, it is one of the most elaborated works in the area of emotion studies. Third, the Ekman’s basic emotion model in 1990s [24] is resulted from a comprehensive research that considered multicultural environment. These reasons fit with our purpose to create a color-emotion based image search system that deal with cross-cultural environment. 3.

Color Vector Quantization

For color feature extraction, we apply our previous approach [1] using Three Dimensional (3D)-Color Vector Quantization of RGB color space. The main idea of this approach is that the system uniformly represents image colors in certain positions of RGB’s vector color space. It intends to reduce a complexity RGB colors in the image and unifies the close colors into a color center in the vector space. In this paper, we use the 64x64x64 quantization size of the RGB color space so that it can represent 125 uniform discretization positions in the RGB color space, as shown in Figure 2.

Figure 2: Illustration of 3D-Color Vector Quantization of RGB color space.

Figure 3 shows the values of discretized RGB color space by 64x64x64 quantization size of 3D-Color Vector Quantization.

338

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

Figure 3: Values of 64x64x64 quantization size of 3D-Color Vector Quantization in RGB color space

4.

Color-Emotion Metadata Creation

In this session, we present how to realize the culture-dependent color-emotion space model for image retrieval. With this culture emotion space, an image retrieval system can allow users to assign the image query with culture-dependent features. Then, the system can execute it and provide results which are closely related to the query in terms of culture-dependent aspects. The culture-dependent color-emotion metadata creation is performed by the following steps (see also Figure 4). Step 1: assigning emotion information and color features to m by n matrix. Step 2: creating user preference metadata space by designated user color preferences and color features matrix. Step 3: selecting representative color, which is chosen by a user according to local majority consensus. In this step, we calculate a local average of each primary emotion to set a threshold for user color preference frequency value. Step 4: assigning the representative color of particular primary emotion into a matrix for creating the culture-dependent color-emotion model.

4.1 Developing Emotion Information and Color Features Space In order to generate culture dependent color-emotion model, it needs to acquire emotion information from particular culture knowledge. For this reason, we performed a survey toward people from particular cultures in Asian regions. We collected data from 89 students from Indonesian (42), Vietnamese (15), and Japanese (32) through a survey that associates emotion features to RGB color features. In dealing with emotion features, we have chosen 17 basic emotion model that is proposed by Ekman [24]. For RGB color features, this research uses 3D Color Vector Quantization with 125 color features that we have already used in our previous research [1]. In the area of color-emotion studies, a research work can be carried out to investigate single color-emotion or combined color-emotion [26] depending on the research scope and goal. In this paper, we limit the scope into single color-emotion. It means that we regard each color separately disjoined.

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

339

Figure 4: Overview of Color-Emotion Metadata Creation

In the next step, emotion features are then presented in the form of m by n matrix in order to generate emotion to color metadata. Because we use 17 primary emotions and 125 color features of RGB, the color-emotion data is presented in a 17 x 125 vector space. An emotion feature is marked by value of 1 when it is chosen by a respondent, and by value of 0 when it is passed. Primary emotion terms are designated as variables of 17 rows [iw1, iw2,…, iw17] and 125 color features in 125 columns [f1, f2,…, f125]. Each emotion to color matrix represents each respondent or user’s color to emotion association. After the emotion to color metadata space for individual user is created, we generates user preference to color matrix with user as variables of m rows [u1, u2,… um] where m is number of correspondences and 125 columns of color features in [f1, f2,…, f125]. In order to generate culture-dependent color-emotion model, user preference is grouped according to their ethnic groups or cultures. 4.2 Assigning Representative Colors When dealing with human emotion and color preference, data distribution and frequency often randomly occur. In our case, for a given emotion feature, user preferences can be concentrated on a few colors, while for other emotion feature user preferences are more spread. For this reason, in order to determine a representative color, we employ a statistical approach with this following mechanism.

340

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

Step 1: Calculating local average of each primary emotion to set a threshold for selecting representative colors. x=

1 n

n

1

∑ x = n (x + K + x ) i

1

n

(1)

i =1

Step 2: Representative colors are chosen by filtering out the insignificant colors which close to 0. In order to remove the insignificant colors from the data sets, we set threshold which can adjust to the data distribution. For this attempt, we use the mean M of binomial distribution that can be written as M (X ) =

n

∑x

p( x)

(2)

x

where

M is the mean of data distribution in a given emotion term x is the total number of data p is the probability of event n is the number of selected color

Step 3: Assigning threshold x > 0.4 for data with low distribution, but with higher frequency; and threshold x > 0.2 for data with high distribution, but with lower frequency. The colors which still remain left are regarded representative colors. After representative color is defined, each selected color is given the value of 1 and presented into m by n matrix where m rows are primary emotion, and n columns are color features. When the culture-specific color to an emotion matrix has been developed, it can be used as metadata for culture-specific emotion-based image retrieval system. In this research, the same process is performed for creating the culture-specific emotion model for each culture.

5.

Cultural Distance in Color-Emotion

In this research, we performed an experiment to evaluate the cultural color spaces which consist of three different culture-dependent color spaces. According to [21], there are points where the difference of color preference across cultures are slightly insignificant, but in some cases, they are considerably different. We compute the distance between color-emotion preference in Vietnamese, Indonesian and Japanese by euclidean distance and cosine similarity. We mapped color-emotion information from three different cultures to cultural color space in order to calculate the distances as shown in Figure 5. With both calculations, we found that there are close similarities between Indonesian and Vietnamese in term of color-emotion preference. However, by Euclidean distance calculation, the color-emotion preference among three cultures is slightly indifferent (see Figure 7).

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

341

Figure 5: Cultural Distance of Color-Emotion Preference among Three Cultures

Figure 6: Cultural Distance of Color-Emotion Preference among Three Cultures with Range of 0 to 1 (by Cosine Similarity)

The differences and similarities as shown in Figure 6 and 7 can reflect that there are two types of color-emotion preference. First is culture-bound color-emotion where there is no similarity among cultures. The other is more universal ones where there is almost no variation across cultures. An image retrieval with the function for distinguishing these colors of course will provide a better result in cross-cultural environments. The cultural color space can help an image retrieval to perform better in terms of color-based impression or emotion extraction. The space can function to extract color based on given culture.

6.

Conclusion

In this paper, we have presented the creation method of culture-dependent coloremotion space. By this method, an integrated color-emotion space and color-emotion user model based on culture-dependent can be generated. The method combined and made use of quantized RGB color space for extracting color feature and Ekman’s basic emotion model for creating emotion and color association information. This method is

342

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

Figure 7: Cultural Distance of Color-Emotion Preference among Three Cultures (by Euclidean Distance)

a part of our attempt to provide an image search system that is able to work with highlevel features of image such as emotion. In the future work, we will implement this culture-dependent color-emotion space to a multimedia information search system that allows us to deal with cross-cultural environments. This paper is a report of our preliminary research toward culturedependent multimedia information retrieval.

References [1] [2] [3] [4] [5] [6] [7]

[8] [9] [10] [11] [12] [13] [14]

Barakbah, A.R. and Y. Kiyoki. 3D-Color Vector Quantization for Image Retrieval Systems, International Database Forum (iDB), Iizaka, Japan, 2008. Barakbah, A.R. and K. Arai. Identifying moving variance to make automatic clustering for normal data set, IECI Japan Workshop (IJW), Tokyo, Japan, 2004. Semeulders, A.W.M. Content-Based image retrieval at the end of early years. IEEE Trans. On Pattern Analysis and Machine Intelligence 22, 12, 2000, pp. 1349-1380 Zhao, R. and W.I. Grosky. Bridging the semantic gap in image retrieval. Distributed multimedia databases: techniques & applications. Idea Group Publishing Hershey PA USA, 2002, 14-36. V. Gudivada, V.V. Raghavan, W.I. Grosky, and R. Kasanagottu. Information Retrieval on the World Wide Web. IEEE Internet Comput. 1, 15, 1997, pp. 58-68. Kherfi, M.L., D. Ziou and A. Bernardi. Image Retrieval From the World Wide Web: Issues, Techniques, and Systems. ACM Computing Surveys, Vol. 36, No. 1, March 2004, pp. 35–67. Sasaki, S., Y. Itabashi, Y. Kiyoki, and X. Chen. An Image-Query Creation Method for Representing Impression by Color-based Combination of Multiple Images. Frontiers in Artificial Intelligence and Applications; Vol. 190, Proceeding of the 2009 Conference on Information Modelling and Knowledge Bases XX, 2009. Wang, S. and X. Wang. Emotion Semantics Image Retrieval: A Brief Overview. ACII 2005, LCNS 3784, 2005, pp. 490-497. Wang, W.N. and Y.L. Yu. Image emotional semantic query based on color semantic description. Proceedings of ICMLC 2005, 2005, pp. 4571-4576. Corridoni, J., A. Del Bimbo, and P. Pala. Image retrieval by emotional semantics. Multimedia Syst. 7, 1999, pp. 175-183. Saito, M. Comparative Studies on Color Preference in Japan and Other Asian Regions, with Special Emphasis on the Preference for White. Color Research and Application, Vol. 21, 1, 1996, pp. 35-49. Hemphill, M. A Note on Adults’ Color-Emotion Associations. Journal of Genetic Psychology, 54, 1996, pp. 385-394. Lang, J. Creating Architectural Theory: The Role of The Behavioral Sciences in Environmental Design. New York: Van Nostrand Reinhold, 1993. Mahnke, F. Color, Environment, Human Response. New York: Van Nostrand Reinhold, 1996.

T. Suhardijanto et al. / A Culture-Dependent Metadata Creation Method

343

[15] Ballast, D.K. Interior Design Reference Manual. Belmont, CA: Professional Pub. Inc, 2002. [16] Wexner, L. B. The Degree to Which Colors are Associated with Mood-tones. Journal of Applied Psychology, 123, 4, pp. 394-409. [17] Adams, F.M. & Osgood, C.E. A Cross-cultural Study of the Affective Meaningof Color. Journal of Cross-cultural Psychology, 7, 1973, pp. 135-157. [18] Eyseck, H.J. A Critical and Experimental Study of Color-Preferences. American Journal of Psychology, 54, 1941, pp. 385-394. [19] Choungorian, A. Color Preference and Culture Variation. Perceptual & Motor Skills, 26, 1968, pp. 1203—1206. [20] Plutchik, R. Emotion: Psychoevolutionary Synthesis. New York: Harper, 1980. [21] Plutchik, R. Emotions and Life: Perspective from Psychology, Biology, and Evolution. Washington D.C.: American Psychology Associations, 2003. [22] Izard, C.E. Pattern of Emotions: A New Analysisof Anxiety and Depression. New York: Academic Press, 1972. [23] Ekman, P. Pictures of Facial Affect. Palo Alto, CA Consulting Psychologists Press, 1976. [24] Ekman, P. Basic Emotions. In T. Dalgleish and M. Power (Eds.). Handbook of Cognition and Emotion. Sussex, U.K.: John Wiley & Sons, Ltd., 1999. [25] Wang, W.N and Q.H. He. A Survey on Emotional Image Retrieval. Proceeding of ICIP 2008, 2008, pp. 117-120. [26] Solli, M. and R. Lenz. Color Based Bags-of-Emotions. CAIP 2009, LCNS 5702, 2009, pp. 573-580. [27] Hofstede, G. Culture’s Consequences: Comparing Values, Behaviors, Institutions, and Organizations across Nations. Thousand Oaks: SAGE Publication, 2001. [28] Hofstede, G. & J.G. Hofstede. Cultures and Organizations: Software of the Mind. New York: Mc Graw-Hill, 2005. [29] Lewis, R.D. When Cultures Collide Leading across Cultures: a Major New Edition of the Global Guide. Boston: Nicholas Brealey International, 2006. [30] Lewis, R.D. Cross Cultural Communication: A Visual Approach. Boston: Intercultural Press, 1999.

344

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-344

R-Web: A Role Accessibility Deﬁnition Based Web Application Generation Yusuke NISHIMURA, Kosuke MAEBARA, Tomoya NORO and Takehiro TOKUDA Department of Computer Science, Tokyo Institute of Technology, Japan Abstract. We present a role accessibility deﬁnition based Web application generation. A role accessibility deﬁnition speciﬁes what kind of data access can be done by which type of roles of users. From a given role accessibility deﬁnition, we can automatically derive data model, business logic and user interface to generate simple Web applications. With additional deﬁnitions of page transition and general computation using existing Web service functions, then we can generate more general type of Web applications. We can use ﬁne grained Web service functions for handling tables or external Web service functions on the Web. Our approach will help us (esp. non-programmers) create a variety of Web applications such as questionnaire systems, student assignment evaluation systems, and so on. Keywords. Web application generation, role accessibility, Web service functions, user interacting Web applications

Introduction Traditionally Web applications are constructed by methods such as manual writing in integrated environments or frameworks [1, 2], and automatic generation using diagrams or annotations [3–10]. However, such traditional methods are not suitable for constructing Web applications which deal with different roles of users interacting with data using limited accessibilities. We refer to such Web applications as user interacting Web applications. Examples of user interacting Web applications include questionnaire survey systems, meeting room reservation systems, report submission and evaluation systems, a conference paper review systems, and so on. In general, Web applications have the following four elements: data model, computation deﬁnition, appearance deﬁnition, and transition deﬁnition. In the case of user interacting Web applications, limited access methods are traditionally represented using part of computation deﬁnition. Let us take a look at one example of conference paper reviewing system. We have roles of PC chairs, reviewers, and authors. An author can upload a paper and download the uploaded paper. Reviewers can download papers speciﬁed by the PC chair and they can upload review reports to the speciﬁed papers. Web application developers have to embed such accessibility constraints in the program codes (computation deﬁnition) manually. It is hard for non-programmers to develop user interacting Web applications manually although they sometimes need such Web application. In this paper we propose a role accessibility deﬁnition based Web application generation. In our approach, we explicitly separate access model from computation deﬁnition.

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

345

A role accessibility deﬁnition speciﬁes what can be done by which type of roles of users to what kind of data. From a given role accessibility deﬁnition, we automatically derive data model, business logic and user interface. With additional deﬁnitions of page transition and general computation using existing Web service functions, then we can generate more general type of Web applications. We can use ﬁne grained Web service functions for handling tables or external Web service functions on the Web. Our approach will help us (especially non-programmers) create a variety of user interacting Web application easily since it requires minimum amount of accessibility deﬁnitions. Note that target of our proposal is creating Web applications used in a small group or for personal use, rather than large-scale, commercial applications such as Amazon, Facebook, and so on, which deal with large amount of data and need to ensure data consistency and scalability. The rest of this paper is organized as follows. In section 1 and 2 we present role accessibility deﬁnitions and some examples. In section 3 we explain additional deﬁnitions for general computation and transition. Section 4 and 5 describe implementation and evaluation. We discuss related work and provide conclusions in section 6 and 7.

1. Role Accessibility Deﬁnitions In our generator, named R-Web, Web application developers deﬁne the followings: roles of Web application users, account tables to store information about the Web application users, data tables to store data the Web application deals with, and role accessibility to the account tables and the data tables. In the rest of this paper, we refer Web application developers as “developers”. 1.1. Role and Account Table Developers declare what kind of roles the Web application users have. At least one role is required in R-Web. Each role has its own account table to store information about users of the role, which has to include information about user identiﬁcation and authentication. The information is used when users log in to the Web application generated by R-Web. Account tables have two mandatory ﬁelds, which are used for identifying and authenticating users. In R-Web, either user ID or e-mail address is used for identiﬁcation, and password is used for authentication. For each of the two ﬁelds, developers need to specify its name, maximum and minimum lengths and characters used for it. Developers can add some new ﬁelds to the account tables, each of which has a ﬁeld name and a data type. We have the following data types. e-mail E-mail addresses. It is different from e-mail address for identiﬁcation. integer Integer values. Maximum and minimum values have to be speciﬁed. string Character sequences whose length is less than/equals to 255. Maximum and minimum lengths have to be speciﬁed. long text Long character sequences whose length is more than 256. Maximum and minimum lengths have to be speciﬁed. date and time Data type to store both date and time. date Data type to store date. time Data type to store time.

346

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

selection Data type for allowing users to choose one value from a ﬁxed set of strings. The set of strings have to be deﬁned. In some cases, developers would like to let some users access the Web application without any identiﬁcation and authentication. Developers can create at most one role to deal with this situation. This role does not have a corresponding account table since users of the role are not identiﬁed and authenticated. 1.2. Data Table Data which the Web application deals with is stored in data tables. Data tables have at least one ﬁeld, each of which has a ﬁeld name and a data type. There are eight data types: “integer”, “string”, “long text”, “date and time”, “date”, “time”, “selection”, and “ﬁle”. The data type “ﬁle” indicates ﬁle data. File names are stored in the data tble, and the ﬁles themselves are stored in a certain directory. Maximum ﬁle size has to be speciﬁed. The other data types are described in the previous section. 1.3. Role Accessibility After creation of roles and deﬁnition of structure of account tables and data tables, for each of the roles, developers deﬁne the following accessibility to each of the tables. create (C) It determines whether the role can create records in the account/data table. delete (D) It determines whether the role can delete records in the account/data table. For each of the account tables and the data tables, we also deﬁne the following accessibility to each ﬁeld of the table. read (R) It determines whether the role can read data stored in the ﬁeld. write (W ) It determines whether the role can edit data stored in the ﬁeld. exclusive write (X) It determines whether the role can exclusively edit data stored in the ﬁeld (and upload the corresponding ﬁle), and cancel the restriction later. After a user of the role exclusively edit the ﬁeld, nobody else can edit the ﬁeld until the restriction is canceled by the user. When a user who has role accessibility C creates a new record, he/she can set data to all ﬁelds of the record even if he/she does not have role accessibility W to the ﬁelds. Developers can prohibit users from setting data to some ﬁelds in creating a new record (users create new records leaving the ﬁelds empty). In some Web applications, one record in a data table corresponds to one particular user. Suppose, for example, we are planning to develop a Web application for questionnaire survey as shown in Figure 1. Respondents need to be registered on the system, and they can edit their own answers as many times as needed. Now, a respondent answers the questionnaire for the ﬁrst time and sends the answer to the server. A record of the answer is created in the data table of the system, which means this record belongs to the respondent. The respondent can read and edit only some ﬁelds of his/her own record, and other records stored in the data table are neither readable nor editable. That is also true of account tables. Users can read and edit only their own account information, and they usually cannot neither read nor edit any other users’ account information. In order to deal

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

347

Figure 1. Example of a questionnaire survey system and role accessibility

with this situation, we introduce “individual accessibility deﬁnition”, which determines whether accessibility D, R, W , and X to each record (or each ﬁeld in the record) in the data table will be deﬁned for a particular user separately from accessibility deﬁnition for the user’s role. Developers determine which role’s user the individual accessibility should be assigned to. After a Web application is generated (i.e. in operation phase), a user who creates a new record in the data table determines a user the accessibility should actually be assigned to. For each role, individual accessibility of the role must be deﬁned in the role’s account table.

2. An Example We take an assignment evaluation system as an example. It is used for a lecture course. A supervisor is the chair of the course, and he/she invites two instructors. Each of them gives an assignment to students after his/her lecture. Students upload their paper ﬁles of each assignment. Instructors download and read the ﬁles of their own assignment, and input score to a ﬁeld of each student’s record. The supervisor checks all of the scores. In this system, we need four roles: supervisor, instructor A, instructor B, and student. The system has one data table to store paper ﬁles and scores of each student 1 . Each record in the data table has six ﬁelds: student ID, stident name, paper ﬁle of the 1st/2nd assignment, and score of the 1st/2nd assignment. Role accessibility to the data table will be as follows (Figure 2): 1 Although it also has account tables for each role, the details are omitted in this paper due to space limitation.

348

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

Figure 2. A data table for an assignment evaluation system and role accessibility

• Users of the role “supervisor” have accessibility C and D to the data table, and R to all of the ﬁelds. • Users of the role “instructor A” have accessibility R to the ﬁelds “student ID”, “student name”, “paper ﬁle of the 1st assignment”, and “score of the 1st assignment”, and W to the ﬁeld “score of the 1st assignment”. • Users of the role “instructor B” have accessibility R to the ﬁelds “student ID”, “student name”, “paper ﬁle of the 2nd assignment”, and “score of the 2nd assignment”, and W to the ﬁeld “score of the 2nd assignment”. • Individual accessibility deﬁnition for the role “student” is determined. ∗ The user of the role “student” whom individual accessibility is assigned to have accessibility R to all of the ﬁelds except for “score of the 1st assignment” and “score of the 2nd assignment”, and W to the ﬁelds “paper ﬁle of the 1st assignment” and “paper ﬁle of the 2nd assignment”. ∗ The others do not have any accessibility to all of the ﬁelds. 3. Page Transition And General Computation If developers only give role accessibility deﬁnition, a simple Web application can be generated. The Web application consists of several Web pages: Web pages for displaying information in the data/account tables, Web pages for editing information in the tables, Web pages for creating/deleting records to the tables, and a top page for logging in the system. Although every user can access each of the data tables and the account tables in the system, how information is displayed in each page depends on users or their role. Only information the user is allowed to read is displayed in displaying pages, and only information the user is allowed to edit (i.e. accessibility W /X) is displayed in editing pages. Only users who have accessibility C/D to a data/account table can go to pages for creating/deleting record/account. In order to be able to generate enhanced Web applications, we can deﬁne page transition and general computation as well as role accessibility. In page transition deﬁnition, we specify which information (ﬁelds of data/account tables) is to be displayed/edited in each page and which page users can go to from each page.

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

!"!#$ %&*

349

5679:

$ $;

&95

569:<

=%

&

+-/3

5679:

$ $;

&95

569:<

=%

&

Figure 3. Web service invocation for summing up scores of each student

Role accessibility deﬁnition and page transition deﬁnition cannot deal with general computation. In many Web applications, we need computation such as summing up all of the data in some ﬁelds of a table, search for records to satisfy a user’s query, sorting records in a table, and so on. In order to solve this problem, we provide some Web service functions for such general computation. Developers can specify Web service invocation and presentation of the result in page transition deﬁnition. They can also include external Web service functions on the Web. Suppose, for example, a developer is planning to develop an assignment evaluation system described in section 2. Since total score of each student is not stored in the data table of the application, it needs to be calculated if necessary (the supervisor may need the information). The developer can realize the computation as shown in Figure 3 if the following Web service functions are prepared. SELECT Select some columns from a table (require which columns are to be selected). SUM Sum up some ﬁeld values for each record in a table (require which ﬁeld values are to be summed up). JOIN Join two tables containing the same number of records (just join the n-th record of one table and the n-th record of the other). If the data table as shown in Figure 2 is given, results of the three Web service functions, SELECT, SUM, and JOIN will be like Table A, B, and C in Figure 3 respectively, and, ﬁnally, information in Table C will be displayed in the result page. 4. Implementation of R-Web Developers create a Web application by R-Web in the following procedure. 1. Create data tables For each data table, developers set name of the table and create at least one ﬁeld. Each ﬁeld consists of its name and data type. 2. Create roles For each role, developers set name of the role and, if necessary, upper limit of the number of users who have the role. A role for unidentiﬁed users, which does not have upper limit of the number of its accounts, is prepared by default.

350

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

3. Create account tables For each role created in the previous step (except for the role for unidentiﬁed users), developers determine which of user ID and e-mail address is used for identifying users in logging in to the Web application, then a ﬁeld corresponding to the data is automatically added to the role’s account table. A ﬁeld of password, which is used for authenticating users, is also added. Developers can add other ﬁelds, each of which consists of its name and data type. 4. Deﬁne role accessibility to each of the data tables At least one role which can create records in the table is chosen. Accessibility R, W , and X to each ﬁeld and accessibility D to each record is deﬁned for each role. Whether the accessibility is deﬁned for a particular user separately from accessibility deﬁnition for the user’s role or not (i.e. individual accessibility deﬁnition) is also determined in this step. 5. Deﬁne role accessibility to each of the account tables This step is almost the same as the previous step, except that individual accessibility for the role corresponding to the account table must be deﬁned. 6. Deﬁne page transition and data processing This step is optional. Instead of generating a simple Web application described in section 3, developers can deﬁne page transition and specify Web service invocation for data processing. In general, any users can reach a Web application without identiﬁcation and authentication unless access to the application is forbidden by server conﬁguration etc, which means developers always have to consider such unidentiﬁed users. In R-Web, the role for unidentiﬁed users is automatically created, and the role does not have any accessibility to all of the tables by default. Developers can change the accessibility in the fourth step. After the procedure described above, R-Web generates PHP code ﬁles and a ﬁle of SQL commands for MySQL. What the developers do after that is putting the PHP code ﬁles to a certain directory in a Web server and creating database by executing the SQL commands. Since R-Web is designed for non-programmers, it ensures security and reliability of the Web applications generated, such as cross-site scripting, SQL injection, and input check based on data type.

5. Evaluation R-Web can generate wide variety of user interacting Web applications such as the followings other than the examples described in section 1 and 2. BBS It has a data table consisting of two ﬁelds: user name and message. General users, who is not identiﬁed and authenticated, can post messages to the system and share them with anyone else while they cannot delete/edit any messages, which means that they have accessibility C to the data table and accessibility R to both of the two ﬁelds. The system has a role for administrator, users of which can delete/edit any messages, which means that they have accessibility D to the data table and accessibility R and W to both of the two ﬁelds. Conference paper reviewing system This is a system that authors submit their papers and, for each of the papers, three reviewers judge the paper. It has a data table consisting of ﬁve ﬁelds: author name, paper ﬁle, and review reports by the ﬁrst, second, and third reviewers. The system has ﬁve roles: authors, PC (co-)chairs, and reviewers (reviewers are divided into three groups).

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

351

PC (co-)chairs have accessibility C to the data table and accessibility R to all of the ﬁelds. Individual accessibility deﬁnition for roles of authors and three groups of reviewers is required. If an author applies for paper submission (e.g. by e-mail), a PC chair creates new record for the author’s paper and assigns individual accessibility (R to “author” name, and R and W to “paper ﬁle”) to the author. Also, the PC chair picks up one user from each groups of reviewers and assigns individual accessibility to them, which means, if a reviewer in the ﬁrst group of reviewers is designated by the PC chair, he/she has accessibility R to the paper ﬁle of the record and accessibility R and W to the ﬁeld of review report by the ﬁrst reviewer. Meeting room reservation system It is used for making reservations for a meeting room. Some authorized users can make reservation, and anyone else (general users) can check the status. This system has one data table representing a calendar: each record indicates one day. For example, each record has one ﬁeld for indicating date and some ﬁelds for indicating time slots (9:00-10:30, etc.). We need three roles in the system: authorized users, who make reservations, calendar updaters, who update the calendar, and general users, who browse the schedule. General users can access the system without identiﬁcation and authentication. If an authorized user has already made a reservation, anyone else cannot make a reservation at the same time slot until the previous reservation is canceled. Calendar updaters have accessibility C and D to the data table, and R and W to the date ﬁeld. Authorized users have accessibility R and X to all of the time slot ﬁelds, and R to the date ﬁeld. General users have accessibility R to all of the ﬁelds. The current R-Web has some limitations to be improved as follow. Dynamic change of role accessibility Once a Web application is generated, users cannot change any role accessibility. However, some users would like to change accessibility to some ﬁelds in some Web applications. An example of this Web application is a kind of memo pad system. Users post notes to the system and they can read only their own notes by default. In the case that some users would like to share their notes with others, they need to change accessibility to the notes. The current R-Web cannot deal with such situation. Deﬁning structure of data/account tables by a user Data/account table structure must be deﬁned by developers. Once a Web application is generated, any users cannot change the data/account table structure. In some Web applications, some users (not developers) need to deﬁne data table structure. Let us take a Web application for coordinating schedule for instance. This application has two roles: coordinator and member. A coordinator nominates some time slots and each member answers which time slots are convenient for himself/herself. The problem is that the coordinator needs to deﬁne structure of the data table (and also role accessibility to each ﬁeld) since he/she nominates time slots. The current R-Web cannot generate such Web applications. In order to solve this problem by the current R-Web, the coordinator should be a developer of the Web application, and generates the Web application by using R-Web. Timing of assigning individual accessibility to a particular user If individual accessibility for a role is deﬁned, a user must choose a particular user the individual accessibility is actually assigned to when he/she creates a new record. For example, in the case of the paper review system described above, the PC chair has to choose

352

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

three reviewers to assign individual accessibility when an author applied for paper submission (i.e. when the PC chair create a new record for the paper). However, in general, reviewers are assigned to each paper after paper submission deadline. In order to deal with this situation, the PC chair need to create a new record leaving the individual accessibility unassigned. Although the current R-Web has such shortcomings, we can develop a wide variety of user interacting Web applications only by deﬁning role accessibility. We do not need any programming skills and knowledge of security issues in Web applications.

6. Related Work T-Web [3, 4] and PF-Web [5] are diagram-based generators. T-Web generates Web applications from directed graphs, called Web transition diagrams, and program templates, while, in PF-Web, Web applications are considered as pipe/ﬁlter architecture-based applications and are generated from Web transition diagrams and program ﬁlters. In the case of developing user interacting Web applications, Web transition diagrams tend to be large and complicated since complicated user interaction logic have to be realized by combination of general-purpose program templates or ﬁlters. R-Web can generate user interacting Web applications easily from user interaction logic deﬁnition using a simple user interface. A-Web [6, 7] is an annotation-based generator. A developer deﬁnes behaviour of the Web application by associating annotations with dynamic parts of Web page templates embedded by server side programs. In the case of developing Web applications interacting with database, developers need knowledge of SQL since they have to describe SQL commands by themselves. R-Web automatically realize data ﬂow among databases, Web pages and sessions based on interaction logic deﬁned by developers, which means developers do not need knowledge of SQL. CakePHP [11] is a Web application development framework for PHP. It automatically generates program code for CRUD processing to database (creating, reading, updating, and deleting records). However, in the case of developing user interacting Web applications, developers have to describe program code to realize role accessibility logic by themselves, which means they need programming skills and knowledge of Web application reliability and security. In R-Web, developers can generate user interacting Web applications only by deﬁning table structure and role accessibility, and they do not need any programming skills and knowledge of Web application reliability and security. Tuigwaa [12] can generate Web applications interacting with database by deﬁning table structure and role accessibility to each Web page. For each Web page, developers can only determine whether each role can access the page or not, which means Tuigwaa cannot control role accessibility to the database (create/delete records, and read/update data) directly. For example, if a developer would like to allow users of a role to read some parts of the database and disallow them to read the other parts and to do any other operations, the developer creates a Web page displaying only the parts which the users can read, and allows them to access the page. In R-Web, developers can deﬁne role accessibility to database directly. Some approaches using modelling languages such as UML [8, 9] and WebML [10] have also been proposed. Although they can easily deal with navigation structure of

Y. Nishimura et al. / R-Web: A Role Accessibility Deﬁnition Based Web Application Generation

353

Web applications, they are weak in controlling interaction with database based on Web application users’ operation. From the viewpoint of access control of DBMS, in the case of MySQL, accessibility to each ﬁeld is determined for each user, and we cannot assign different accessibility to each record. In general, developers create only one account for Web application users, and deﬁne its accessibility to the database. They need to embed accessibility different among users to the Web application program. R-Web can deﬁne accessibility for each role, and assign different accessibility to each record by individual accessibility deﬁnition. R-Web does not require any programming skills. In the case of the UNIX ﬁle system, each ﬁle is owned by a user, and the owner can determine whether read/write/execution by the owner/group members/others is allowed or not. On the other hand, in the case of R-Web, for each of the tables of the Web application, developers determine which role can create/delete records in the table and which role can read/write/exclusively write data in each ﬁeld of the table.

7. Conclusion We have presented a role accessibility deﬁnition based Web application generation. We describe what kind of data access can be done by which role of users. Then we automatically derive data model, business logic and user interface of Web applications. With additional deﬁnitions of general computation and transition, we can generate more general type of Web applications. Our approach help us (especially non-programmers) to generate a variety of Web applications.

References [1] [2] [3]

[4] [5]

[6] [7] [8] [9] [10] [11] [12]

Apache Struts. http://struts.apache.org/. Ruby on Rails. http://rubyonrails.org/. Mitsuhisa Taguchi, Tetsuya Suzuki, and Takehiro Tokuda. Generation of server page type Web applications from diagrams. In 12th European-Japanese Conference on Information Modelling and Knowledge Bases, pages 117–130, 2002. Mitsuhisa Taguchi and Takehiro Tokuda. Automatic generation of client-server collaborative Web applications from diagrams. In 5th International Conference on Web Engineering, pages 117–130, 2005. Tomohiro Matsuzaki, Tetsuya Suzuki, and Takehiro Tokuda. A pipe/ﬁlter architecture based software generator PF-Web for constructing Web applications. Computer Software of Japan Society for Software Science and Technology, 19(4):266–282, 2002. In Japanese. Kazuhiro Asami and Takehiro Tokuda. Generation of Web applications from HTML page templates with annotations. In IASTED International Conference Applied Informatics, pages 295–300, 2002. Kazuhiro Asami and Takehiro Tokuda. Generation of Web applications from annotation-based deﬁnitions. In Engineering Information Systems in the Internet Context, pages 69–79, 2002. Jim Conallen. Building Web Applications with UML. Addison-Wesley Pub, 1999. Rolf Hennicker and Nora Koch. A UML-based methodology for hypermedia design. In UML2000 Conference, pages 410–424, 2000. Stefano Ceri, Piero Fraternali, Aldo Bongio, Marco Brambilla, Sara Comai, and Maristella Matera. Designing Data-Intensive Web Applications. Morgan Kaufmann, 2002. CakePHP. http://cakephp.org/. Tuigwaa. http://tuigwaa.sandbox.seasar.org/ (In Japanese).

354

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-354

NULL ‘Value’ Algebras and Logics a

Bernhard THALHEIM a,1 and Klaus-Dieter SCHEWE b,2 Christian-Albrechts-University Kiel, Computer Science Institute, 24098 Kiel, Germany b Software Competence Center Hagenberg, 4232 Hagenberg, Austria Abstract. NULL is a special marker used in SQL to indicate that a value for an attribute of an object does not exist in the database. Its aim is a representation of “missing information and inapplicable information”. Although NULL is called null ‘value’ is not at all a value. It is a marker. It is only an annotation of incomplete data. Since it is typically interpreted as a value, NULL has led to controversies and and debates because of its treatment by 3-valued logics, of its special requirements for its use in SQL joins, and the special handling required by aggregate functions and SQL grouping operators. The three-valued logics does not properly reﬂect the nature of this special marker. Markers should be based on their speciﬁc data type. This data type is then different from any other data types used in relational database technology. Due to this orthogonality we can combine any type with the special type. To support this we introduce a non-standard generalisation of para-consistent logics. This logics reﬂects the nature of these markers. This paper aims in developing a general approach to NULL ‘values’ and shows how they can be used without changing database technology. Keywords. NULL, NULL ‘value’, constraints, NULL logics, database engineering,

1. Introduction In the early 1970s E. F. Codd introduced the relational database model including the relational calculus, relational algebra, and relational database normalization. Later he also introduced an approach to handle queries in the presence of NULL ‘values’ in a relational database and proposed a 3-valued logic with truth values ‘True’, ‘False’, and ‘Unknown’. When a NULL value appears in a table, its evaluation in a condition produces the ‘Unknown’ truth value. He gave truth values to complex conditions by giving truthtables for the connectives ‘AND’, ‘OR’, and ‘NOT’. For example, ‘True OR Unknown’ has the truth value ‘True’ because a disjunction is true if one of its disjuncts is true. Codd evaluated ‘Unknown OR Unknown’ to ‘Unknown’. This speciﬁc Łukasiewicz logic L3 was the starting point for a rather difﬁcult treatment of missing or incomplete values in the database literature and technology. [8] states that “nulls are ipso facto nonsense”. “I apologize for the wording “contains a null”; as I’ve written elsewhere, to talk about anything “containing a null” actually 1 [email protected] 2 [email protected]

http://www.is.informatik.uni-kiel.de/∼ thalheim http://www.scch.at/

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

355

makes no logical sense. Indeed one of the problems with nulls is precisely that you can’t talk about them sensibly! ... the entire topic is a perfect illustration of The Principle of Incoherence ... ”. The third manifesto [7,9] is an attempt to clarify and bring up to date E.F. Codd’s 1970 model. It requires every tuple of a relation to contain exactly one value for each attribute of that relation, the value being “drawn from the domain” (Codd) or, synonymously, “of the declared type” (Date and Darwen) of that attribute. 1.1. NULL ‘Values’ in the Relational Database Model NULL ‘values’ can be used for attributes in the relational model. They are often considered to be a programmer’s nightmare. Allowing NULL ‘values’ into attribute values introduces a whole new degree of uncertainty into your database. Qualiﬁed guesses must be made by the SQL programmer to counter for erroneous results of NULL values in a database. We need to distinguish at least between the following kinds of NULLs: • NULL ‘values’ represent currently unknown values that may be replaced later with values when we know something. These NULL values can be represented by speciﬁc default values. For example, Gender can be coded by the following scheme: 0 (unknown), 1 (male), 2 (female), 9 (inapplicable). • Domain-speciﬁc NULL ‘values’ are used to denote ordinal or cardinal numbers. Ordinal numbers measure position. Cardinal numbers measure quantity or magnitude. There is a difference between the quantity 0 and an unknown quantity. 0 is the common default value for all numeric domain types. The blank can be used as a default type for character types. Date and time is speciﬁed by relative values and required by the schema to be absolute. In this case, NULL ‘values’ are not the appropriate solution. We split the corresponding attributes. • NULL ‘values’ are also used to represent inapplicability of a characterization for a given object. In this case, hierarchies can be used for separation of aspects. NULL ‘values’ can be derived. For instance, if two values are incomparable then the comparison evaluates to ‘unknown’ or ‘null’. For example, the color of a car and the color of hair can be incomparable in the application. From the other side, NULL ‘values’ used for characterization of properties of different objects can be equal. In this case, marked NULL ‘values’ or variables should be used. The treatment of NULL ‘values’ is different in DBMS. Some of them treat NULL ‘values’ as missing or unknown values. Evaluation of expressions with NULL ‘values’ is different in DBMS. For this reason, it is a good idea to restructure all relations to relations without NULL ‘values’ whenever possible. Since this approach is an implementation approach we are not using it during conceptual modelling. [31] proposed a speciﬁc treatment of missing information. NULL ‘values’ can be used based on the interpretations ‘value does not exist’, ‘no information neither on existence nor on values’ or ‘value exists but is currently unknown’[3,18]. This concept can be introduced in HERM in a shortened form. Since NULL ‘values’ can also be used in keys [29], the identiﬁcation property is not lost in general. We can use N OT E X IST for null-valued components in the ﬁrst approach. We can use U NKNOWN for ‘unknown’ and N O I NFORMATION for ‘no information’. The ‘value exists but is unknown’ approach can be represented by a formula ∃xPR (..., x, ...). The ‘no-information’ approach is based on the projection of R to the deﬁned components compon(R) \ {A},

356

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

i.e. Pπ(compon(R)\{A}) (...). The non-existence of values on A can be represented by ¬∃xPR (..., x, ...) ∧ Pπ(compon(R)\{A}) (...). We are going to extend this approach in this paper. Often the use of NULL ‘values’ is forbidden for the primary keys or for all keys. This restriction is an implementation restriction which is required by most commercial DBMSs. Default values or initial values can be used for speciﬁc values. The entity integrity rule forbids nulls in primary key columns. Primary keys cannot contain NULL (missing) data. The reason for this rule should be obvious for keys with a singleton attribute. An object cannot uniquely identiﬁed or referenced in a table if the primary key of that table can be NULL and there is more than one object with this property. It’s important to note that this rule must not be applied to composite keys. The entity integrity rule requires however for composite keys that none of the individual columns can be null. 1.2. The SQL Standard and NULL ‘Values’ The SQL:1999 standard deﬁnes NULL values in a similar way as the SQL2 standard. In detail, we ﬁnd the following statements: “A null value (NULL) is a special value or mark that is used to indicate the absence of any data value. ... Values are either null values or non-null values. A null value is an implementation-dependent special value that is distinct from all non-null values of the associated data type. There is effectively only one null value and that value is a member of every SQL data type. There is no for a null value although the keyword NULL is used in some places to indicate that a null value is desired. ... Two sets are equal if they contain the same elements and those elements are all non-null values. The equality of two sets is unknown if the sets contain null values and the replacement of those null value with appropriate non-null values would make the two sets equal. Otherwise, two sets are unequal. ... Every domain, column in a base table, SQL variable, and SQL-supplied parameter, has a null class. If no null class is speciﬁed, it is the general null class (which contains only the general null value); otherwise it is the deﬁned null class that is speciﬁed. A deﬁned null class, is created by a and is a named set of possible null values known as null states, together with the general null value. A null state is a named, implementation-dependent null value that is distinct from both the general null value and all other null states of the same null class. The null values of a null class are ordered on their position number, the general null value having position number one in every null class. This ordering is used to determine the result of an operation when more than one of its operands are null values. Except in the few cases where the null substitution principle yields a different result, if one or more operands of an expression is a null value then the result is a null value. The result of an attempted transfer of any null value between objects having different null classes is the general null value. The null class of the result of an operation is determined as follows. Case: - If no operand is null, then the result is as determined by the application of other General Rules. - If the operator is AND and either operand is false, then the result is false. - If the operator is OR and either operand is true, then the result is true. - If the result has a deﬁned null class, and one or more operands are null, then: Case: * If the operator is OR then the result is the null value having the maximum position number; * Otherwise, the result is the null value having the minimum position number. Note: The general null value effectively has the position number one.

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

357

- Otherwise, the result is the general null value. ... Every column an stored attribute has a nullability characteristic that indicates whether any attempt to store a null value into that column or stored attribute will inevitably cause an exception to be raised, and whether any attempt to retrieve a value from that column or stored attribute can ever result in a null value. The possible values of the nullability characteristic are known not nullable and possibly nullable. ... ”

SQL follows the semantics of predicate logic also in the case of missing values and thus cause a number of problems: The interpretation of nulls ‘values’ is not easy to adapt to a particular intuition; the meaning of relational operations when applied to both null values and certain data values is not well-deﬁned; and the treatment of integrity constraints in databases containing null values is not completely deﬁned. 1.3. 14 (or 20) Kinds of NULL ‘Values’ in Databases The ANSI/Sparc report [1] has been introducing 14 different kinds of incomplete data that could appear as a result of queries or as attribute values. This collection is not complete. We may distinguish the following kinds of NULL ‘values’: 1. The property is not applicable for this object but belongs to this class of objects. 1.1. Independently from the point of time t. “not applicable” 1.2. At the current point of time t. “currently not applicable” 2. The property does not belong to the object. 2.1. The property is not representable in the schema. 2.1.1. Due to changes of value type (temporarily, fuzzy, ...). “many-typed” 2.2. The property is representable in the schema. 2.2.1. But there is no value for the object. “unknown” 2.2.1.1. Because it has not been transferred from another database. 2.2.1.2. Because is has not yet inserted into the database. “existential null” 2.2.2. The value for the property exists but is “under change”. 2.2.2.1. However the value is trackable. 2.2.2.1.1. But is at the moment forbidden. 2.2.2.1.2. At the moment permitted. 2.2.2.1.2.1. But not deﬁned for the database. 2.2.2.1.2.1.1. Because it is currently under change. 2.2.2.1.2.2. The value is deﬁned for the system. 2.2.2.1.2.2.1. But is currently incorrect. 2.2.2.1.2.2. But is currently doubtful. 2.2.2.2. The value is not trackable. 2.2.2.2.1. Because of changes. 2.2.2.2.2. Because of reachability. “place-holder null” 2.2.3. There are several values for the property of this object. “partial null” (2.2.3.1., 2.2.3.2.1, 2.2.3.2.2. similarly to 2.2.2.) “nondeterministic” 2.2.4. There is no value for the property of this object. “not exists” 2.2.5. There is never a value for the property of this object. “never exists” 3. The property is may-be applicable for this object but it unknown whether it is true for the object in this case. “may-be null” 3.1. It is not known whether the property is applicable to the given object. If it is applicable then its value for this property is taken from certain domain. “partial may-be null”

358

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

The nondeterministic and the many-typed cases are caused by wrong modelling of the database. If we abstract from temporarily not available values then we need to distinguish at least the NULL ‘values’ into U NKNOWN, N OT A PPLICABLE, N OT E XISTS, and N EVER E XISTS. Finally, we should separate NULLs from other real values. Therefore we call these values either NULL ‘values’ or better NULL markers. 1.4. An Application Example It has often been claimed that speciﬁc markers are not necessary for relational databases. We need however to consider also schema parsimony as a schema quality criterion. Large schemata are a burden for users and programmers. They increase the possibility of inherent schema errors. Therefore, practicians tend to denormalise schemata for performance reasons and to combine types of common behaviour and shared structures into singleton types. As a surprise for theoreticians, the schema becomes easier to maintain and the theory of integrity constraints is not much more complex. We call this phenomenon the efﬁciency unapprehensiveness. Let us consider a simple example of a relational type Enrolls associating the Student and Course within a Program. A student might obtain a grading if this is applicable after the examination. The grading is however different for each type of student. A natural separation of concern would be the following one. Grade value during teaching term

U NKNOWN N EVER E XISTS N OTA PPLICABLE U NKNOWN for Diploma student for Diploma student for Guest student for Bachelor/Master student with certiﬁcate without certiﬁcate Grade value after examination

N OT E XISTS N EVER E XISTS N OTA PPLICABLE K NOWN K NOWN for Diploma student for Diploma student for Diploma student for Guest student for Bachelor /Master student with graded certiﬁcate with normal certiﬁcate without certiﬁcate

1.5. Research on NULL Values Database research that tackles the null value problem is fairly rich, e.g. [2,5,10,11,12, 19,21,23,25,26,17,27,32,34]. Research on null values is not yet complete. Most papers cover two or three kinds of null markers. The most comprehensive survey on some kinds of null values (existential, may-be, place-holder, partial, partial may-be) is given by [6]. The situation is different for efﬁcient query computation with null-valued object. [20] discusses in detail whether sure and potential answers can be generated from databases with different kinds of nulls (unknown, no information). Comprehensive data-complexity analysis for querying databases with null values has led to a deep understanding of the reasons for complexity increase. There is no extensive research on database modelling techniques for null values. [14,15] provides an insight into treatment of different nulls through variables and equational logics. [29] shows that the entity integrity rule can also be weakened in the case of primary keys.

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

359

[7,9] claim that NULLs are a disaster and explains how we can use SQL without any utilisation of NULL values. The proposal is based on horizontal and vertical decomposition. Each relational database structure R is decomposed for each nullable attribute A into a relational structure that contains all objects with no-NULL values and into another one R.W ithoutA without this attribute for objects that have a NULL value. This approach directly leads to a combinatorial explosion of database schemata since relationship types are also decomposed as a consequence. It is surprising to see that in almost all approaches to solve the null value problem by many-valued logics the truth values form a lattice with F being minimal with respect to a partial order on truth values – an exception is the recent paper [13] . In this paper we propose a logic that combines a para-consistent logic with a null value smaller than F and fold it into a four-valued logic. Thus, besides U (unknown) we consider values NE (does not exist) and NA (not applicable) as completely legal truth values. In the logic all six binary operators of Peirce’s triadic logic are present. Using NE states that it does not make sense to consider the conjunction with other truth values, which implies NE ∧ F = NE and NE ∨ F = F. Therefore, it makes sense to apply Kleene’s conjunction alteration [16] to this null value. For NA it makes sense to adopt Turquette’s deﬁnitions for conjunction and disjunction, which means that it is ignored in conjunctions and disjunctions. 1.6. Brief Survey on this Paper We are going to develop another logical and algebraic foundation for different kinds of null markers and show that these markers can be implemented in SQL:1999. This paper discovers that classical 3-valued logics does not provide a good foundation. Instead we introduce a speciﬁc para-consistent logics and fold it into a many-valued logics. Since we do not target on a full-ﬂedged treatment of all twenty different kinds of null markers we restrict the consideration to four main types.

2. NULL ‘Value’ Logics 2.1. NULL Types A base type is an algebraic structure B = (Dom(B), Op(B), P red(B)) with a name, a set of values in a domain, a set of operations and a set of predicates. A class B C on the base type is a collection of elements from Dom(B). Usually, B C is required to be set. It can be also a list, multi-set, tree etc. Classes may be changed by applying operations. Elements of a class may be classiﬁed by the predicates. The value set can be discrete or continuous, ﬁnite or inﬁnite. We typically assume discrete value sets. Typical predicates are the comparison predicates <, >, ≤, =, ≥, =. Typical functions are arithmetic functions such as +, −, and ×. The NULL type is a nominal type, i.e. a speciﬁc kind of intension based data types beyond the extension based types such as absolute types or ratio types [31]. It can also be an ordinal type.We use the following truth values: U = unknown: This results, when a value for a property exists, but is not known. NA = not applicable: This results, when a property is not applicable.

360

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

NE = not exists: This results, when a value for a property does not exist. The NULL type is an enumeration type with Dom(N U LL) = { U, NA, NE }3 , with two predicates P red(N U LL){=N U LL , =N U LL } and an empty set of operations. We require that Dom(N U LL) is disjoint with any other type we are using for modelling. 2.2. NULL Logics In the following we will present an integrated propositional logic combining the threevalued logics of Łukasiewicz and Turquette with para-consistent logic based on Kleene [16]. The syntax is standard with conjunction ∧, disjunction ∨ and negation ¬, but we will also add a weak negation [33]. For the semantics will use truth tables. On these grounds we can then deﬁne different types of implication. We ﬁrst show that negation ¬ and Łukasiewicz implication are sufﬁcient to express all constructs, then we extend Wajsbergs axiomatisation for L3 to our logic. We base interpretations of propositional formulae on ﬁve-valued interpretations using truth values NE, F, NA, U, T, i.e. we assume a total order on the truth values. The most commonly known approaches to handle null values concentrate on threevalued Łukasiewicz logic L3 , in which ∧ corresponds to building the minimum, while ∨ requires taking the maximum. In particular, this applies to U, in which case we obtain the following Łukasiewicz truth tables with the classical Peirce negation ¬. In addition, we deﬁne weak negation ∼. Note that the truth table deﬁnitions for conjunction and disjunction correspond to the operators Z and Θ in Peirce’s triadic logic. ∧ T U F

T T U F

U U U F

∨ T U F

F F F F

T T T T

U T U U

F T U F

¬ T U F

F U T

∼ T U F

F T T

For NA it makes more sense to adopt Turquette’s conjunction and disjunction, which correspond to the operators Ψ and Φ in Peirce’s triadic logic: ∧ T NA F

T T T F

NA T NA F

F F F F

∨ T NA F

T T T T

NA T NA F

F T F F

¬ T NA F

F NA T

∼ T NA F

F T T

The main difference is that both in conjunction and disjunction the result is NA only, if both factors are NA. This results from the simple consideration that if a property is not applicable, then it can be omitted in a conjunction or disjunction. This argument can be used to combine this two three-valued logics, i.e. U ∧ NA = U, and U ∨ NA = U hold. For the truth value NE, however, it makes sense to adopt Kleene’s weak conjunction and alteration [16] – these correspond to the operators Ω and Y in Peirce’s triadic logic – which deﬁne the following para-consistent logic: 3 We restrict the treatment of null markers to these four values. The value N EVER E XISTS and other NULL markers can be treated in a similar way.

361

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

∧ T NE F

T T NE F

NE NE NE NE

F F NE F

∨ T NE F

T T NE T

NE NE NE NE

F T NE F

¬ T NE F

F NE T

∼ T NE F

F NE T

If NE is involved, this reﬂects the consideration that if a value for a property does not exist, a conjunction or disjunction does not make sense. We can integrate this paraconsistent logic with the other two three-valued logics. Using the arguments above for disjunction we only have to add NE ∨ L = NE for all L. Thus, we obtain the following truth tables, which complement the truth tables above: ∧ NE NA U

NE NE NE NE

NA NE NA U

U NE U U

∨ NE NA U

NE NE NE NE

NA NE NA U

U NE U U

Let L5 denote the propositional logic with junctors ∧, ∨, ¬, ∼, and the ﬁve-valued semantics deﬁned by the truth tables above. Using these tables it is easy to verify the following result. Proposition 1 The following equivalences hold in the ﬁve-valued propositional logic L5 : ¬¬α = α ¬(α ∧ β) = ¬α ∨ ¬β ¬(α ∨ β) = ¬α ∧ ¬β In particular, conjunction can be expressed by disjunction and strong negation, which implies the following result. Proposition 2 The system {∨, ¬, ∼} is complete for L5 . 2.3. Folding NULL Types with Other Types Any ordinary domain type B = (Dom(B), Op(B), P red(B)) can be used to extend to a new type BW ithN U LL = (Dom(B) ∪ Dom(N U LL), Op(B), P red(B) P red(N U LL)) with partial operations that are deﬁned for Dom(B) and undeﬁned for arguments from Dom(N U LL) and with combined predicates =, = that inherit these predicates from the component types if they exist there and stating that values from Dom(B) and Dom(N U LL) are different in the speciﬁc form explained above.

3. NULL ‘Values’ and Integrity Constraint Logics 3.1. Dependencies with Unknown NULLs Dependencies can be generalized to relations containing null ‘values’. Two tuples t and t are strongly equivalent with respect to X (denoted by t ≈X t ) if both are deﬁned on X and are equal on X. They are weakly equivalent on X (denoted by t ∼X t ) if they are both equal on A whenever both are deﬁned on A for any A ∈ X, i.e. if they are

362

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

both equal on A whenever both are deﬁned on A or both are undeﬁned for any A ∈ X. Now we can deﬁne different kinds of validity for the functional dependency X → Y in a relation RC with null ‘values’. Some of them are as follows: • The relation RC 1-satisﬁes the functional dependency strongly X-equivalent tuples are strongly Y equivalent. • The relation RC 2-satisﬁes the functional dependency weakly X-equivalent tuples are weakly Y equivalent. • The relation RC 3-satisﬁes the functional dependency strongly X-equivalent tuples are weakly Y equivalent. • The relation RC 4-satisﬁes the functional dependency weakly X-equivalent tuples are strongly Y equivalent.

X → Y if all pairs of X → Y if all pairs of X → Y if all pairs of X → Y if all pairs of

We conclude with [31]: (1) 2-satisﬁability implies 3-satisﬁability. (2) 1-satisﬁability implies 3-satisﬁability. (3) 4-satisﬁability implies 1-satisﬁability and 2-satisﬁability. The classical Armstrong axiomatisation [31] for functional dependencies can be directly applied to the axiomatisation of 1- and 2-satisﬁability. The augmentation axiom X∪Y → Y is not valid for 4-satisﬁability. The transitivity rule does not apply to 3-satisﬁability, i.e., the 3-satisﬁability of X → Y and Y → Z in a relation RC does not imply the 3-satisﬁability of X → Z. A key K is called a sure key of RC if RC 4-satisﬁes K → attr(R). The key is called a possible key of RC if RC 3-satisﬁes K → attr(R). In the same manner multivalued, join and other dependencies can be generalized for relations with null ‘values’. 3.2. Dependencies with Inapplicable NULLs There are several kinds of null values which should be distinguished in this case, depending on whether a property is applicable to an object, whether a property is under change (incomplete, not committed), whether a value is available, whether a value is stored, whether a value is derivable from inconsistent or incomplete data and whether a value is secured. Context-dependent null values [29,30] are semantically deﬁned null values. 3.3. Possible Worlds Semantics for Unknown and Inapplicable NULLs Another approach to null values is based on possible world semantics [2,25,24]. A tuple t without null values is a completion of a tuple t which uses null values if the tuples t, t are weakly equivalent. A relation RC is a completion of a relation RC with null values if it is obtained by substitution of null values by non-null values from the corresponding domains. A functional dependency is weakly satisﬁed in RC if it is satisﬁed in one of the completions of RC . We observe that a functional dependency can be weakly satisﬁed in RC but is not i-satisﬁed for i ∈ {1, 2, 3, 4}. Weak satisfaction leads to the additivity problem [24], i.e. RC weakly satisﬁes a functional dependency α and weakly satisﬁes a functional dependency β but does not weakly satisfy {α, β}.

363

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

3.4. Towards a Logic for Integrity Constraints Integrity constraints are typically speciﬁed using the B(erri)V(ardi) frame Q(α → β) where α and β are quantor-free formulas with a set of variables X and Q is a sequence of quantiﬁers on X. We therefore need to introduce implication if we wish to use integrity constraints. In classical logic implication α → β is only a shortcut for ¬α ∨ β. In our logical treatment of null values, however, we introduced two different kinds of negation: strong Peirce negation ¬, and weak negation ∼. Accordingly, we have to permit at least two kinds of implication to capture integrity constraints: (strong) logical implication α → β = ¬α ∨ β, and (weak) material implication α ⊃ β =∼ α ∨ β. Then we also obtain two different kinds of equivalence: α ↔ β = (α → β) ∧ (β → α) (logical equivalence), and α ≡ β = (α ⊃ β) ∧ (β ⊂ α) (material equivalence). The following truth tables indicate these two forms of implication – they can be easily veriﬁed from the deﬁnition of ∨, ¬ and ∼ in the previous subsection: ⊃ T U NA F NE

T T T T T NE

U U T T T NE

NA F T T T NE

F F T T T NE

NE NE NE NE NE NE

→ T U NA F NE

T T T T T NE

U U U U T NE

NA F U NA T NE

F F U F T NE

NE NE NE NE NE NE

For the three-valued Łukasiewicz logic L3 , which we adopted for the truth value U, we have a third implication α ⇒ β, which we call the Łukasiewicz implication. This is deﬁned by the following truth table: ⇒ T U F

T T T T

U U T T

F F U T

Then it is easy to see that disjunction α ∨ β can be expressed as (α ⇒ β) ⇒ β, and weak negation ∼ α as α ⇒ ¬α. Thus, both logical and material implication can be expressed by strong negation ¬ and Łukasiewicz implication ⇒. With the following truth table we extend the deﬁnition of the Łukasiewicz implication to the other null values. ⇒ T U NA F NE

T T T T T NE

U U T T T NE

NA NA U T F NE

F F U T T NE

NE NE NE NE NE NE

So the logic L5 will have an additional implication operators →, ⊃ and ⇒. At the same time we preserve the equivalences for disjunction and weak negation, which again can be easily veriﬁed by using the truth tables. This gives the following result.

364

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

Proposition 3 In L5 the following equivalences hold: α ∨ β = (α ⇒ β) ⇒ β

∼ α = α ⇒ ¬α

In particular, all operators of the logic L5 can be expressed by using only Łukasiewicz implication and Peirce negation. Proposition 4 The system {¬, ⇒} is complete for L5 . 4. SQL Realisation 4.1. User-Deﬁned Domain Types and Functions for NULL Markers SQL:1999 supports user-deﬁned domain types and user-deﬁned functions. This support can be used for the development of extensions that can easily be integrated into any database application. We do thus not need an extension of SQL. Instead we use templates for generation of supplements to database schemata that can be integrated into any application. We therefore proceed in a way different from classical SQL handling. The ﬁrst template we use is the deﬁnition of a user-deﬁned domain. For instance, CREATE DOMAIN BOOLExt AS VARSTRING(14) [ DEFAULT value ] CHECK (VALUE = ’TRUE’ OR VALUE = ’FALSE’ OR VALUE IS ’UnknownNULL’ OR VALUE IS ’NotExistNULL’ OR VALUE IS ’NeverExistNULL’ OR VALUE IS ’NotApplNULL’); GO;

This example is only one realisation. We might also use such domain type extensions for other kinds of null values. The next extension is the explicit support for connectives. Since the theory presented above uses a conservative extension of conjunction, disjunction and negation, we may deﬁne functions that can be used instead of Boolean expressions in WHERE-clauses. These functions can directly be generated while parsing a query. Therefore, there is no need for SQL programmers to change their style of writing Boolean expressions. CREATE FUNCTION [dbo.]OrExtend (@FirstBool BOOLExt , @SecondBool BOOLExt) RETURNS BOOLExt WITH ENCRYPTION AS BEGIN DECLARE @ResultBool BOOLExt .... RESULT @ResultBool END; GO;

The function directly encodes the truth table for the connective ∨. In a similar way we may implement the truth table for negation and conjunction. 4.2. Treatment in Predicates There are two special operands used to test for the presence of the NULL. ISNULL returns TRUE only when the supplied operand has a NULL. Conversely, IS NOT NULL

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

365

returns TRUE when the supplied operand does not have a NULL. These are quite important functions. One of the most common database mistakes is to test an operand for a NULL by comparing it to the empty string or zero. Instead we reprogram this function as well and use a number of user deﬁned functions. SQL allows also set predicates. These set predicates can consider in better detail the meaning of elements in a set if our proposal for handling NULL markers is used. In this case also observe that NOT IN and NOT EXISTS are becoming different and cannot be transferred to each other by simply applying double negation and collection. Compare the following two variants for these predicates. NOT IN predicate

K NOWN

U NKNOWN

N OTA PPLICABLE

N OT E XISTS

treat as value

treat as NULL marker

remove object from result

questionable query

NOT EXISTS predicate

K NOWN

U NKNOWN

N OTA PPLICABLE

N OT E XISTS

treat as value

treat as NULL marker

treat as empty object

remove object from result

4.3. Support for Aggregation Most aggregate functions eliminate NULL in calculations; one exception is the COUNT function. When using the COUNT function against a column containing NULLs, the NULL will be eliminated from the calculation. However, if the COUNT function uses an asterisk, it will calculate all rows regardless of NULLs being present. A formal and coherent framework to aggregation handling has been developed in [22]. Aggregation functions may be either distributive (MAX, MIN, COUNT, SUM) or algebraic (AVG) or holistic (MEDIAN, RANK). For instance, SUM is reprogrammed by using the following options: SUM aggregation

K NOWN

U NKNOWN

N OTA PPLICABLE

N OT E XISTS

use value

use expectation value

use 0

questionable sum

Distributive aggregation functions are deﬁned by structural recursion and can thus be directly derived from the logic discussed above. Distributive functions can also be corrected for correct computation. for instance, we might use the ISNULL function for replacing the NULL marked attributes with a valid value (COUNT(ISNULL(Attribute,0))). Algebraic aggregation functions need a corrective. For instance, the AVG function must then separate objects for which a value is known from objects for which a value is

366

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

marked by a NULL marker. In this case we derive a split of the answer that corresponds to this separation. Instead we might also use completion strategies for algebraic functions. For instance, unknown but applicable and existing values may be completed to the expectation value if the distribution function is known. Holistic aggregation functions are far more difﬁcult.

5. Concluding Null ‘values’ are often considered to be the nightmare for programmers and database processing. We claim that they are an opportunity for better treatment of real world things. It is often natural in an application domain that null markers are attached to an object. This situation is similar to the reluctance of exceptions in programming. Since the application world is never perfect we need an explicit exception handling [4] that is implementable. Null markers are a means to handle exceptions for the occurrence of values in objects. Null markers are a very powerful weapon for database modelling. NULL is not the number zero and NULL is not the empty string (‘’) value. They allow to combine object collections with a general common behaviour into a class. We do not require to separate objects of the same natural kinds into different classes depending on a missing value. If we consider, for instance, the Enroll type for Student then we might have use other null-marked attributes beside the attribute Grade. Examples in real applications are Laboratory (Students may participate in practical courses and thus improve their grading.), Project (Students may submit a project and thus be graded in a different way.), Bonus (Students may use a bonus system for ﬁnal examinations and thus improve their chance for good grades in ﬁnal examinations.), and History (Students may have already enrolled a similar course and thus might be given a credit for their history.). If we would model all these different kinds by subtypes then we might easily have to use 5 × 5 different types instead of the singleton type Enroll together with complex integrity constraints. The nightmare is then complete if Enroll is used in other relationship types. It is surprising that logic users consider the logical value 0 as the minimal value. It is a value that has an infological value and a meaning, namely the truth value ‘False’ that is known to us. We consider the value NE as completely legal value since it states that it never makes sense to consider the truth value. It is thus valuable information for query processing.

References [1] ANSI/X3/SPARC. Study group on data base management systems. Interim Report. ACM SIGMOD Records, 7(2), 1975. [2] P. Atzeni and N. M. Morfuni. Functional dependencies in relations with null values. Information Processing Letters, 18(4):233–238, 1984. [3] P. Atzeni and R. Torlone. A metamodel approach for the management of multiple models and the translation of schemes. Information Systems, 18(6):349–362, 1993. [4] A. Berztiss and B. Thalheim. Exceptions in information systems. In Digital Libaries: Advanced Methods and Technologies, RCDL 2007, pages 284–295, 2007. [5] J. Biskup and H. H. Br¨uggemann. Designing acyclic database schemes. In H. Gallaire, J. Minker, and J.-M. Nicolas, editors, Advances in Database Theory, Vol. II, pages 3–26. Plenum Press, New York, 1983.

B. Thalheim and K.-D. Schewe / NULL ‘Value’ Algebras and Logics

367

[6] K. S. Candan, J. Grant, and V. S. Subrahmanian. A uniﬁed treatment of null values using constraints. Inf. Sci., 98(1-4):99–156, 1997. [7] H. Darwen and C. J. Date. The third manifesto. SIGMOD Record, 24(1):39–49, 1995. [8] C. J. Date. Logic and Databases - The roots of relational theory. Trafford Publishing, 2007. [9] C. J. Date and H. Darwen. Database Explorations. Essays on The Third Manifesto and related topics. Trafford Publishing, 2010. [10] B. S. Goldstein. Formal properties of constraints on null values in relational databases. Technical Report 80-013, SUNY at Stony Brook, Dept. of Computer Science, 1981. [11] J. Grant. Null values in a relational data base. Inf. Process. Lett., 6(5):156–157, 1977. [12] J. Grant. Null values in SQL. SIGMOD Record, 37(3):23–25, 2008. [13] S. Hartmann and S. Link. When data dependencies over sql tables meet the logics of paradox and s-3. In PODS, pages 317–326. ACM, 2010. [14] N. Cat Ho and H. Rasiowa. Subalgebras and homomorphisms of semi-Post algebras. Studia Logica, 46(2):161–175, 1987. [15] N. Cat Ho and B. Thalheim. On semantic and syntactic issues of null values in the relational model of databases. Manuscript, Dresden, 1987. [16] S. W. Jablonski, G. P. Gawrilow, and W. B. Kudrjavcev. Boolesche Funktionen und Postsche Klassen. Akademie-Verlag, Berlin, 1970. [17] W. Lipski Jr. On semantic issues connected with incomplete information databases. ACM Trans. Database Syst., 4(3):262–296, 1979. [18] P. Kandzia and H.-J. Klein. Theoretische Grundlagen relationaler Datenbanksysteme. Bibliographisches Institut, Darmstadt, 1993. [19] A. M. Keller. Set-theoretic problems of null completion in relational databases. Information Processing Letters, 22(5):261–265, 1986. [20] H.-J Klein. Gesicherte und m¨ogliche Antworten auf Anfragen an relationale Datenbanken mit partiellen Relationen. Habilitationsschrift, CAU Kiel, 1997. [21] A. L. Kulenovic and A. Kulenovic. Treatment of null values in the nfset data model. In BNCOD, pages 226–244, 1991. [22] H.-J. Lenz and B. Thalheim. A formal framework of aggregation for the OLAP-OLTP model. Journal of Universal Computer Science, 15(1):273 – 303, 2009. [23] M. Levene and G. Loizou. Inferring null join dependencies in relational databases. BIT, 32:413–429, 1992. [24] M. Levene and G. Loizou. The additivity problem for data dependencies in incomplete relational databases. In L. Libkin and B. Thalheim, editors, Proc. Semantics in Databases, LNCS 1358, pages 136–169. Springer, Berlin, 1998. [25] Y. E. Lien. Multivalued dependencies with null values in relational databases. In A. L. Furtado and H. L. Morgan, editors, Proc. 5th Int. Conf. on Very Large Data Bases - VLDB’79, pages 61–66, Rio de Janeiro, 1979, 1979. IEEE-CS. [26] S. Link. On the implication of multivalued dependencies in partial database relations. Int. J. Found. Comput. Sci., 19(3):691–715, 2008. [27] M. A. Roth, H. F. Korth, and A. Silberschatz. Null values in nested relational databases. Acta Inf., 26(7):615–642, 1989. [28] B. Thalheim. Completeness and representation conditions for automata logics. PhD thesis, Lomonossov University, Faculty of Mathematics and Mechanics, Moscov, Russia, 1979. (In Russian). [29] B. Thalheim. On semantic issues connected with keys in relational databases permitting null values. Journal of Information Processing and Cybernetics, EIK, 25(1/2):11–20, 1989. [30] B. Thalheim. Dependencies in relational databases. Teubner, Leipzig, 1991. [31] B. Thalheim. Entity-relationship modeling – Foundations of database technology. Springer, Berlin, 2000. [32] Y. Vassiliou. Null values in data base management: A denotational semantics approach. In SIGMOD Conference, pages 162–169. ACM, 1979. [33] G. Wagner. A database needs two kinds of negation. In MFDBS, volume 495 of Lecture Notes in Computer Science, pages 357–371. Springer, 1991. [34] C. Zaniolo. Database relations with null values. J. Comput. Syst. Sci., 28(1):142–166, 1984.

368

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-368

Ontology Representation and Inference Based on State Controlled Coloured Petri Nets Ke WANG a,b, James N.K. LIU b,1 and Wei-min MA a School of Economics and Management, Tongji University, Shanghai, China b Department of Computing, The Hong Kong Polytechnic University, Hong Kong a

Abstract. Many automatic or semi-automatic extraction techniques have been proposed for building domain ontologies in recent years but the correctness, consistency and completeness of the extracted ontologies is often either not considered or is not formally verified. The issue of detecting potential anomalies in an ontology has not to date been adequately addressed. In this paper we propose a formal technique for ontology representation and inference, based on which an automatic technique for ontology verification can be developed so as to be able to detect and identify potential anomalies in an ontology. The technique makes use of a State Controlled Coloured Petri Net (SCCPN), which is a high level net that combines a Coloured Petri Net and a State Controlled Petri Net. This work presents a formal definition of SCCPN for modeling ontologies and the mapping between them as well as formulating the ontology inference in SCCPN with specified inference mechanisms. Keywords. ontology, knowledge representation, knowledge inference, Petri net

1. Introduction An ontology is an explicit specification of a conceptualization [1]. It could take the simple form of a taxonomy (i.e. knowledge encoded in a hierarchical structure) or a vocabulary with standardized machine interpretable terminology supplemented with natural language definitions or it could be used to describe a logical domain theory with very expressive, complex, and meaningful information [2]. Domain ontologies can be built using automatic or semi-automatic extraction techniques (e.g. [3-5]) but the correctness, consistency and completeness of the resulting ontologies is often either not considered or is not formally verified. One possible way to check the correctness, consistency and completeness of an ontology may be to apply high level Petri net theory. Petri nets, introduced by Petri [6], are known to be well-suited to the modeling and analysis of parallel and asynchronous activity operating systems. High level Petri nets are extended from the basic models of Petri nets and offer more compact representations and more powerful modeling capabilities. They have been used in many AI applications, in particular to model and analyze expert systems (e.g. [7-9]) and it is anticipated that, by mapping ontologies into 1

Corresponding Author: James N.K. Liu, Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong; E-mail: [email protected].

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

369

Petri nets, these approaches can also be used to analyze and verify ontologies. In this paper we propose a formal technique for ontology representation and inference. It makes use of a State Controlled Coloured Petri Net (SCCPN), which is a high level net that combines a Coloured Petri Net [10] and a State Controlled Petri Net [11]. The concepts in ontologies and their associated relationships are represented by the structure of SCCPN with SCCPN using high order coloured tokens to provide a richer modeling capability. The rest of this paper is organized as follows. Section 2 gives the preliminary definitions of ontology first. Section 3 describes how SCCPN can be used to model ontologies. Section 4 provides a formal definition of SCCPN. Section 5 discusses ontology inference in SCCPN. Finally, Section 6 concludes this paper with a brief discussion of future work.

2. Preliminary Definitions Although ontologies have been widely used in knowledge sharing and by a wide variety of communities, there is no common formal definition of ontology. However, we can identify the common core components, principally individuals, classes and relations [12]: x x x

Individuals (or instances): Individuals are the basic objects in a domain. These are gathered into classes. Classes (or concepts): Classes are collections of individuals, and also the main entities of an ontology. Classes in ontology are usually organized in taxonomies which allow the application of inheritance mechanisms. Relations: Relations are subsets of the product of classes and individuals in the domain. An ontology will contain various kinds of relations but certain relations are common to most ontologies including: – Instantiation (membership): This is an anti-symmetric relation between a class and individual, and represents the creation of a real individual or instance as a member of a class. – Exclusion (disjointness): This is a symmetric and irreflexive relation between the classes. The intersection of two classes with the relation of exclusion is empty. – Specialization (subsumption): This is also known as an Is-A relation. This is a reflexive, transitive, and anti-symmetric (partial order) relation between classes.

In this work, we adopt a definition of ontology that includes the following core components. Definition 1. An ontology is a tuple O : (CL, IN , RE , d, A,) , where CL is a finite set of classes; IN is a finite set of individuals; RE is a finite set of relations; d is a relation on CL u CL called specialization; A is a relation on CL u CL called exclusion; is a relation on CL u IN called instantiation.

370

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

3. Mapping an Ontology into SCCPN State Controlled Coloured Petri Nets were initially proposed for the formal description and verification of hybrid rule/frame based expert systems [7] and are a combination of Coloured Petri Nets proposed by Jensen [10] and State Controlled Petri Nets proposed by Liu and Dillon [13,14]. The mapping between an ontology and SCCPN is in Table 1. If the wish is to model a concise ontology graph, it is possible to do so using the most primitive form of Petri net, representing classes as places and using transitions linked to the places to represent hierarchical relations between them. Individuals of a class are indicated by tokens in a place, accordingly, to instantiate a particular class is represented by creating an instance token in the corresponding place. These mappings allow a concise ontology graph to be easily transformed into a Place/Transition net. Figure 1 shows how an ontology graph, in this case a family ontology, can be mapped to a Place/Transition net, with all the elements of Fig. 1(a), its classes and their hierarchical relations, mapped into Fig. 1(b) as a Place/Transition net. In this net, the places, denoted by pi (1 d i d 11) , represent the classes in the ontology graph, and the transitions, denoted by t j (1 d j d 10) , represent the hierarchical relations. Using a more sophisticated net such as SCCPN, however, it is possible to represent more information and improve the ability of a net to make inferences. The major features of SCCPN are that, first, it denotes class attributes using colour sets where a coloured token records its class type information. To enable recording the state of inference, information concerning relation type is also added into the colour sets. The net uses three major types of token: instance, state, and control. Instance tokens represent a particular instance of a class. State tokens record information about inferences for a particular class (i.e., which type of relation is involved) and the state of predicates (i.e., true or false). Control tokens control the firing or executing of transitions. A transition becomes active when at least one of its input places has a control token and is enabled if a transition is active and the transition condition is met (i.e., all its input places have the correct state tokens). Each input place of a transition has a self-loop arc that maintains the state of a class or predicate while inferring, i.e., the execution of a transition will not affect the state of its input places. Finally, to enable inverse inference along with the hierarchical relations, the net also makes use of a generalization relation that is in effect the inverse of the specialization relation. Namely, the specialization relation is mapped into two transitions linking places in opposite directions, from subclass to its superclass, and from superclass to subclass. For some ontology languages, such as OWL, the user-defined “properties” in an ontology must be treated discriminatingly as two different cases according to the following criterion, when mapping them into the SCCPN. If the “property” describes a Table 1. Mapping between ontology and SCCPN Ontology Classes Class Attributes Individuals Specialization Exclusion Instantiation Further Relations Predicates Predicates States

SCCPN Class Places Colour Sets Instance Tokens Transitions Transitions Creation of Instance Token in Class Place Transitions Predicate Places State Tokens

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

371

relation between two or more classes, it is regarded as a further relation. That is to say, it is not a standard relation like specialization and exclusion, which can be obtained in all approaches and tools. In this case, it is mapped into SCCPN using a transition linking the places concerned. If the “property” denotes a quality or character of a class, it is treated as a predicate and is mapped into SCCPN by a predicate place, which is linked by a transition to the class having this quality. We will provide a formal definition of SCCPN and describe the inference mechanism in the following sections. To briefly illustrate the present discussion, we can consider the ontology and its mapping in SCCPN as shown in Figure 2. Fig. 2(a) represents a segment of the family ontology in Fig. 1 and Fig. 2(b) shows its mapping in SCCPN. To distinguish different types of places and transitions, the notations of place and transition are both extended to 2-dimensional variables. It should be noted that, for convenience, the subscripts of the notations in Fig. 2(b) are relabeled. pci (1 d i d 4) denotes the class place, as pc1 , pc 2 , pc 3 , pc 4 respectively represent the classes “mother”, “parent”, “female” and “male”. The predicate place is denoted by p pj ( j 1) . Relations are represented by tkW (1 d k d 5) , where k denotes the type of

relation and W indicates a particular transition with relation type k . The relation of specialization from subclass to superclass is denoted by t1W and t2W is its inverse. The information in the rectangles in Fig. 2(a) is some “properties” adhering to the corresponding classes. The notation “ A male” means an exclusion relation between the classes “female” and “male”. Since exclusion is a symmetric relation, the relation is mapped into two transitions in opposite directions (see t31 and t32 ). “Predicate: lacks Y chromosome” is a quality of the class “female”. It is modeled using a predicate place (i.e. p p1 ) linking to the corresponding class (via transition t41 ). The formulation “ { (parent female)” represents a necessary and sufficient relation between the classes “mother” and (parent female), i.e. mother parent female and parent female mother. The former is represented by t11 and t12 , and the latter is modeled by the transition t51 . The notation “SL” in Fig. 2(b) stands for a self-loop arc of the place pc 3 . This arc maintains the state of pc 3 while executing t41 . Each input

(a) A family ontology (b) The mapping in Place/Transition net Figure 1. An ontology graph and its mapping in a Place/Transition net

(a) A segment of the family ontology (b) The mapping in SCCPN Figure 2. An ontology and its mapping in SCCPN

372

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

place of a transition has a self-loop arc, however, for simplicity, other self-loop arcs are not indicated in the figure. Note that for the specialization and generalization relations, here we only considered them between two classes (concepts). In other words, for each hierarchical relation (represented by a link) in the ontology, it is modeled by two transitions linking the classes in opposite directions. This mapping is intuitive and reasonable. However, when consider the specialization and generalization relations among more than two classes, from the perspective of inference, it is not sufficient and the case will become more complex. For example, the specialization relations “mother-parent” and “motherfemale”, can be explained as “IF mother, THEN parent AND female” from the view of inference, and thus they can be represented in the net by one transition with this “AND” relation. For the generalization relations, to infer subclasses from a superclass, it refers to the “OR” relation. If the subclasses are exclusive, it is an “EXCLUSIVE OR” relation and can be represented by the transitions in each pair of classes with generalization relation, or else, it would be an “INCLUSIVE OR” relation and more transitions are required to model the “inclusive” state. More discussion concerning this problem will be given in our future work. In this work, for simplicity, we introduce different colours into the net to indicate the inference states in the following sections. 4. Formal Definition of SCCPN

The formal definition of SCCPN used here has been adopted and extended for ontology engineering from the SCCPN that is normally used to model hybrid rule/frame-based expert systems [7]. The extension enables the net to provide a richer modeling capability for ontology representation and inference. Definition 2. The SCCPN that models an ontology is a 9-tuple Net (6, P, T , A, N , C , G, E , I ) , where ¦ {w1 , w2 , , wi } is a finite set of non-empty types, called colour set, i t 1 ; P {Pc , Pp } is a finite set of places, and Pc Pp ;

Pc { pc1 , pc 2 , , pcj } , a finite set of places that model the classes in the ontology, called class places, j t 1 ; Pp { p p1 , p p 2 , , p pk } , a finite set of places that model the predicates asserted in the ontology, called predicate places, k t 0 ; T {T1 , T2 , Tl } is a finite set of transitions, where l is the number of relation types contained in the ontology and l t 2 ; Tn To (n z o, 1 d n, o d l) ; T1 {t11 , t12 , , t1m } , a finite set of transitions that are connected from subclass to superclass, modeling the hierarchical specialization (Is-A) relation; T2 {t21 , t22 , , t2 m } , a finite set of transitions that are connected from superclass to subclass, i.e., t1W and t2W are inverse relations in opposite directions; Similarly, T3 is a finite set of transitions modeling the exclusion relations and T4 , , Tl model the further relations in the ontology; A {a1 , a2 , , aq } is a finite set of arcs where q t 1 ; P T P A T A ; N : A o P u T T u P is a node function that maps each arc into a pair where the first element is the source node and the second is the destination node. The two

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

373

nodes must be of different kinds (i.e. one of the nodes must be a place while the other must be a transition). C : P o ¦ is a colour function that maps each place into a colour set; G : T o boolean value is a guard function that is defined from T into expressions such that t T : ª¬Type G (t ) BooleanValue Type Var G (t ) ¦ º¼ ; E : A o expression is an arc expression function that is defined from A into expressions such that a A : ª¬Type E (a) C p (a) MS Type Var E (a )

¦ @ , where p(a) is the place of N (a) and MS stands for multi-sets or bags;

I : P o expression is an initialization function that is defined from P into closed expressions such that p P : ¬ªType I ( p) C ( p ) MS ¼º .

Definition 3. The node functions N : A o P u T T u P can be further classified into the following 4 different types: Bc : T o P is an input control function, a mapping from transitions to the bags of places; Oc : T o P is an output control function, a mapping from transitions to the bags of places; Bs : T o P is an input state function, a mapping from transitions to the bags of places; Os : T o P is an output state function, a mapping from transitions to the bags of places. And for each transition ti T in the

SCCPN, Bs ti Os ti z , Bc ti Oc ti , such that p j Bs ti p j Os ti , p j Bc ti p j Oc ti .

It should be noted that, for the simplicity of representation, the instance tokens are indicated by a particular colour defined in the colour set, i.e., the instance token is treated as a special case of the state tokens. This is reasonable since it can also be seen as indicating the instantiation relation. Therefore, in the definition of node functions, we only represent the input/output control and state function. Similarly, in the following discussion of markings, we consider only control marking and state marking. 5. Ontology Inference in SCCPN

In this section we first describe the markings that are used to formally represent the states of the SCCPN. We then describe how to model the different types of relations that are contained in ontologies. In particular, we specify the inference mechanisms for different relation types by formulating the guard functions and arc expression functions. 5.1. Markings for the Representation of Inference The states of the SCCPN can be formally represented by markings. The formal analysis of ontology inference is based on the reachable markings generated by transition firings in the net. In SCCPN, a marking M is composed of M s , which represents the marking for the state tokens, and M c , which represents the marking for the control tokens. Formally, they are defined as follows. Definition 4. A marking M of SCCPN is a function that M M s , M c , where M s depicts the distribution of state tokens in the places. M s is represented in the form of a space vector spanned by the number of places in the net. Each

374

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

element M sp (the subscript p P indicates a particular place) is a multi-set over the colour set, denoting the state tokens deposited in the particular place. M c depicts the distribution of control tokens in the places. M c is a space vector spanned by the number of places in the net, mapping from places to nonnegative integers, which indicate the number of control tokens contained in the places. The multi-set (or bag) is employed to allow multiple appearances of tokens with the same colour. Unlike single-set that adding an element a into a set {a, b} we still have the set {a, b} , for the multi-set, if adding a into a multi-set {a, b} we get the multi-set {a, a, b} , which shows two occurrences of a . Definition 5. A transition ti is active in a marking M , if M ci t 1 for a place pci Bc (ti ) . In other words, a transition becomes active when at least one of its input places has a control token. Definition 6. A transition ti is enabled in a marking M , if ti is active and M s makes G (ti ) true. In other words, a transition is enabled if it is active and the transition condition is met. When a transition is enabled, it may be executed and once it is executed, the marking M changes to the marking M c . If M c is obtained from M by the execution of transition ti , it is said that M c is directly reachable from M , denoted as M [ti ! M c . Definition 7. A marking M n is said to be reachable from a marking M 0 if and only if (iff) there exists a finite sequence of firings that transform M 0 into M n , i.e. a sequence having M 0 as an initial marking and M n as an end marking such that M 0 [t1 ! M 1[t2 ! M n 1[tn ! M n , where n is a natural number denoting the length of

the sequence. Denoting V

(t1 , t2 , , tn ) , it is said that M n is reachable from M 0 by V .

5.2. Inference Mechanisms for Different Relation Types Since different types of relations are contained in ontologies, and all are modeled by transitions in SCCPN, it is necessary to specify the inference mechanisms for different relation types by formulating the guard functions and arc expression functions. For simplicity of representation, we consider here only one colour set, denoted by 6 {C ( p ) | C 0,1, l} , to indicate different relation types. Each output token created by a transition will be assigned a colour which indicates that the token was created by a particular type of relation. C 0 denotes the instantiation relation. It can also be seen as indicating the presence of an instance token. As defined above, T1 models the specialization relation, T2 is its inverse, T3 models the exclusion relation and T4 , , Tl model other further relations. Accordingly, C 1, 2,3, l refers to specialization, generalization, exclusion and other further relations, respectively. The following specifies the different relation types for these three inference mechanisms. (1) Specialization (Is-A) relation °True if C Bs (t1W ) 0 or 1, G t1W ® °¯ False if C Bs (t1W ) z 0 and 1.

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

where C Bs (t1W )

375

0 or 1 denotes that the token in the place Bs (t1W ) has a colour

“0” or “1”. In other words, the transition in T1 can be enabled only if its input place has a token whose colour is “0” or “1”. E : C Os (t1W ) Bs (t1W ) 1 This denotes that the new token created in the output place (except the input place), by executing the transitions in T1 , is assigned a colour that C 1 . Concerning the implication of this inference result, the colour “1” can be explained as “true”. For example, the class “parent” can be inferred from “mother” via the Is-A relation. The colour “1” indicates this inference with the meaning “a mother definitely is a parent”. (2) Generalization relation °True if C Bs (t2W ) 0 or 2, ; E : C Os (t2W ) Bs (t2W ) 2 G t2W ® °¯ False if C Bs (t2W ) z 0 and 2. Similarly, these two functions imply that the transition in T2 can be enabled only if its input place has a token whose colour is “0” or “2”, and after executing the transition in T2 , the new token created in the output place (except the input place) is assigned the colour that C 2 . Concerning the implication of this inference result, the colour “2” can be explained as “may be”. For example, the class “mother” and “father” both can be inferred from “parent” via the generalization relation. The colour “2” indicates this inference with the meaning “a parent may be a mother or may be a father”. (3) Exclusion relation °True if C Bs (t3W ) 0 or 1, ; E : C Os (t3W ) Bs (t3W ) 3 G t3W ® °¯ False if C Bs (t3W ) z 0 and 1. The implication of a token with colour C 3 can be explained as “false”. For example, the class “male” can be inferred from “female” via the exclusion relation. The colour “3” indicates this inference with the meaning “a female definitely is not a male”. It is also possible to similarly specify the guard functions and colour functions for the transitions modeling other relations in an ontology. 5.3. An Illustrative Example Figure 3 shows an example of ontology inference. The SCCPN in Fig. 3 is identical with that in Fig. 2(b), which models the segment of a family ontology as shown in Fig. 2(a). For simplicity, the figure does not show all self-loop arcs. Figure 3(a) shows the initial state of inference, denoted by the marking M 0 , in which pc1 contains a control token and an instance token ( C 0 ). Due to the presence of control token, both t11 and t12 are active. As defined above, the instance token makes the guard functions of t11 and t12 true. Thus t11 and t12 are enabled. The execution of t11 in Fig. 3(a) produces a new state of the net, denoted by the marking M 1 as in Fig. 3(b). A new state token with colour “1” and a control token are created in the output place pc 2 . Meanwhile, the control token in the input place is destroyed, and the instance token is returned to the input place via the self-loop arc. The result implies that an instance of the class “mother” is also a “parent”. At this point, the state of the net in Fig. 3(b) cannot be further inferred since no transition is enabled.

376

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

The execution of t12 in Fig. 3(a) will produce a new inference result, denoted by 2

M , as in Fig. 3(c). Similarly, the result implies that an instance of the class “mother” is also a “female”. Because of the presence of a state token ( C 1 ) and a control token in pc 3 , both t31 and t41 are enabled. Thus Fig. 3(c) can be further inferred. The separate execution of t31 and t41 in Fig. 3(c) will produce the results in Fig. 3(d) and Fig. 3(e) and neither offers any further opportunities for inference. The result in Fig. 3(d) implies that an instance of the class “mother” cannot be a “male”, whereas Fig. 3(e) implies it lacks a Y chromosome, as represented by the predicate place p p1 . Figure 4 shows the reachable markings representing the inference drawn from the example in Fig. 3. For simplicity, the markings are presented in the form of a matrix in which the first column denotes the marking of coloured state tokens (i.e. M s ) and the second column denotes the marking of control tokens (i.e. M c ).

Figure 3. An illustrative example of ontology inference

Figure 4. The sequence of markings reachable from M0

K. Wang et al. / Ontology Representation and Inference Based on SCCPN

377

6. Conclusion and Future Work

This paper proposes a formal technique for ontology representation and inference, using State Controlled Coloured Petri Net. SCCPN is based on high level Petri net theory, which provides not only a formal representation of the knowledge structure and an explicit control mechanism using coloured tokens for knowledge inference, it also offers various means of checking the correctness, consistency, and completeness of the knowledge base. It is anticipated that the mapping ontologies into SCCPN, will allow further analysis of ontologies using high level net approaches, especially for ontology verification, which has not to date been adequately addressed. Our future work will thus focus on the issue of the correctness and verification of ontologies, mapping ontologies into SCCPN as an approach to developing an automatic technique for ontology verification so as to be able to detect and identify potential anomalies in an ontology, such as contradiction or redundancy. Acknowledgement

The authors are grateful for the partial support of GRF grant 5237/08E, CRG grant GU756 and departmental grant 1-ZV41 of The Hong Kong Polytechnic University. References [1] T.R. Gruber, A translation approach to portable ontology specifications, Knowledge Acquisition, vol. 5, no. 2, pp. 199-220, Jun. 1993. [2] H. Pinto, and J. Martins, Ontologies: How can they be built?, Knowledge and Information Systems, vol. 6, no. 4, pp. 441-464, Jul. 2004. [3] R. Navigli, P. Velardi, and A. Gangemi, Ontology learning and its automated terminology translation, IEEE Intelligent Systems, vol. 18, no. 1, pp. 22-31, Jan.-Feb. 2003. [4] P. Cimiano, A. Hotho, and S. Staab, Learning concept hierarchies from text corpora using formal concept analysis, Journal of Artificial Intelligence Research, vol. 24, pp. 305-339, 2005. [5] R.Y.K. Lau, D.W. Song, Y.F. Li, T.C.H. Cheung and J.X. Hao, Toward a Fuzzy Domain Ontology Extraction Method for Adaptive e-Learning, IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 6, pp. 800-813, Jun. 2009. [6] C.A. Petri, Fundamentals of a Theory of Asynchronous Information Flow, in Proceedings of the 1962 IFIP congress, 1962, pp. 386-390. [7] S.C.K. Shiu, J.N.K. Liu, and D.S. Yeung, Formal description and verification of hybrid rule/frame-based expert systems, Expert Systems with Applications, vol. 13, no. 3, pp. 215-230, Oct. 1997. [8] Z. Zhang, S. Wang, and S. Liu, Dynamic knowledge inference, and learning of fuzzy Petri net expert system based on self-adaptation learning techniques, in Proceedings of Fourth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 1, 2007, pp. 377-381. [9] V.R.L. Shen, and T.T.Y. Juang, Verification of knowledge-based systems using predicate/transition nets, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, vol. 38, no. 1, pp. 78-87, Jan. 2008. [10] K. Jensen, Coloured Petri nets: basic concepts, analysis methods, and practical use, vol.1, 2nd ed., Berlin; New York: Springer-Verlag, 1996. [11] N.K. Liu, Formal verification of some potential contradictions in knowledge base using a high level net approach, Applied Intelligence, vol. 6, no. 4, pp. 325-343, Oct. 1996. [12] A. Gomez-Perez, M. Fernandez-Lopez, and O. Corcho, Ontological engineering: with examples from the areas of knowledge management, e-commerce and the semantic Web, London; New York: Springer-Verlag, 2004. [13] N.K. Liu, and T. Dillon, Formal description and verification of production systems, International Journal of Intelligent Systems, vol. 10, no. 4, pp. 399-442, Apr. 1995. [14] N.K. Liu, Formal description and verification of expert systems, PhD thesis, Department of Computer Science and Computer Engineering, School of Mathematical and Information Sciences, La Trobe University, Bundoora, Victoria, Australia, 1991.

378

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-378

The Discourse Tool: A Support Environment for Collaborative Modeling Efforts Denis KOZLOV1, Tore HOEL2, Mirja PULKKINEN1, Jan M. PAWLOWSKI1 Department of Computer Science and Information Systems, FI-40014 University of Jyväskylä, Finland 2 Oslo University College, N-0130 Oslo, Norway

1

Abstract. This paper describes the first experiences in development and usage of an environment to support a community in collaborative modeling efforts in the area of Technology Enhanced Learning (TEL). With first priority, the domain of interest, the sub-domains and the relationships have to be modeled to have a common understanding and a shared model for the further work and exchange. While the work continues, the model, and the sub-domain models continue to evolve. The tool that is presented here supports the modeling, especially the discourse conducted on the models in the collaborating network. With the tool i) the model rationale can be made transparent, ii) reasoning and agreements on changes in the model or its elements are systematically recorded by issue iii) the semantic wiki functionalities enhance the analysis of the discussion and discussion tracks. With the help of the tool, the community has continuously an updated model version at hand. Community members can raise an issue whenever a change or enhancement in the model is seen necessary. The members can share their updated sub-domain models and model variants for discussion. The tool is called Discourse Tool, since a discursive process is induced for negotiating and decision making on the model issues. Keywords. computer supported collaborative work, collaborative modeling, domain model, groupware, discourse tool

1. Introduction In this paper, we introduce a support tool for discourse processes in collaborative modeling. Modeling is in general a highly collaborative process that requires a lot of communication. To improve collaboration and communication, e.g., to achieve a common understanding on concepts, models and their relations, the rationales for modifications of existing models, we suggest a discourse tool. The discourse tool can be used in any collaborative modeling setting. We illustrate its use in the domain of Technology Enhanced Learning (TEL). This area brings together a variety of stakeholders and sub-domains. Therefore, it is highly necessary to gain a common understanding on different concepts, which means a lot of need for information exchange, negotiations, argumentation etc. when modeling collaboratively. In our sample implementation, we have used the discourse tool in the European project ICOPER (ICOPER, 2009). In this project, around 20 institutions collaboratively

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 379

develop a reference model for the TEL domain, based on existing good practices, standards and specifications. The main goal of this paper is to present the first stage in the development and use of a technical environment to support and enhance the collaboration in modeling the target domain in the ICOPER project. The presented technical support environment is called the Discourse Tool. Insights from the field of computer supported collaborative work (Bannon and Schmidt 1989, Grudin 2009) and the studies of Discourse (Fairclough 1989, 1995, 2003; Phillips and Hardy 2002; Van Dijk 1989, 2001) are taken as starting points for the tool design. Semantic web (Davies et al., 2006) and wiki technologies (Brown et al., 2007; Ebersbach et al., 2008) are the technological basis for the tool. Our study shows how to deploy these technologies to support a community in their collaboration on domain models and, in particular, standards. The Discourse tool provides the opportunity to share, collaboratively construct, discuss and further develop domain models. The paper is structured as follows. Firstly, we provide an overview of the most important previous studies to understand the problems and challenges of collaborative modeling. Those studies are related to the core elements of our study, i.e. the notion of discourse and computer supported collaborative work. Secondly, we highlight the technical implementation of the environment. Finally, we discuss the main achievements and limitations of our study and further work.

2. Research background As a starting point for the tool development, we take the insights from the computer supported collaborative work (CSCW) field of study, which gives insight into developing of support systems for groups and networks of people. In this specific case, the tool to be developed should enable broad, independent involvement in a case of collaborative research and development in the TEL domain. The goal is not only to bring up new models and proposals, but to achieve common understanding and agreements for e.g. standard developments. The notion of Discourse is taken as a guiding principle for the tool development. Discourse can be seen as the overall collection of language based human activities, within which takes place the creation, accumulation, transfer and elaboration of mental constructs as ideas, cultures and knowledge (Fairclough 1989, 1995, 2003; Phillips and Hardy 2002; Van Dijk 1989, 2001). Discourse has gained attention also in the information systems field of inquiry, since discourse as the representation of linguistic practice provides a theoretical basis for research and development of information systems, both from the point of view of activities related to their development and from the point of view of their purpose as means for human communication (Lyytinen 1987). Practical guidelines have been derived to systems design and development through the language practice lens (Auramäki et al. 1992), and the discourse analysis is introduced also as research method in Information Systems studies (e.g. Sarkkinen 2006). The potential to take impact in developments in ICT related developments through understanding discourse has been pointed out (e.g. Pulkkinen 2008). Whereas natural language elements collected in dictionaries and grammars are used merely to represent language elements and possibilities to combine them to convey meanings, discourse is representing the overall process of linguistic practices. In an

380 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

overall discourse process, the group or society carrying the discourse is involved in producing meanings, constructing ideas of future states, thus guiding societal developments as well as contributing to, and arguing for or against them (Phillips and Hardy 2002; Grant et al. 2004). For successful collaboration, common understanding of the underlying structures (Giddens, 1984) of the domain of interest is a prerequisite. These social construals (Lynch 1991) or models are depicted in models that also serve as common artifacts (Robinson 1991) to support any exchange about issues related to the domain, and collaborative construction within the domain. Computer Supported Collaborative Work (CSCW) (Bannon and Schmidt 1989, Grudin 2009) is the field of study on information systems and applications to facilitate human collaboration and coordination of contributions in a work area for groups and networks of people. Three distinct areas from this body of knowledge give basis for the effort of constructing the targeted tool: • Facilitating of both the synchronous, and asynchronous, as well as the distributed or also co-located collaboration and coordination which is at the core of CSCW (Bannon and Schmidt 1989, Grudin 2009). Providing common platforms for information processing is the first step in establishing a collaboration tool. • Support for group decision making is an area where the actual process of decision making is one of the focuses in CSCW (Grudin 2009). This means providing tool functionality for entering different ideas, comparing and discussing them, and finally making a decision with transparent arguments. • Looking at the activities in designing systems as a group effort, i.e. the process of collaborative construction of the models and designs (e.g. Kyng 1991, K. Grønbæk et al. 1997, Grudin 2009). The participation studies in the field underline the intertwined involvement of different stakeholder groups as the design models evolve, as opposed to distinct phases of user requirement collection and following modeling of designs by IT people only. The targeted discourse tool aims at enabling such continuous involvement of all stakeholder groups. Developments especially through wide area networking (Harasim, 1993) underline the need of understanding and experimenting with tools for group collaboration and coordination support also in the Internet and multiply the challenges in collaboration through e.g. cultural diversity (Ishii 1993). We understand that our setting is prone to cultural challenges, and the discourse tool is expected to mitigate the challenges an international community is facing. The collaboration across community and cultural boundaries calls for a facilitative platform, where a common discourse can be first of all started by collecting, storing and structuring the relevant information, and presenting it in a common model which facilitates the communication and collaboration. Creation of a common understanding of the domain in the focus, as well as of its structure underlies successful collaboration. Further, upcoming modifications and developments are intended to be transparent and traceable for the community. The community members can not only access the information, but also participate in discussions, and contribute through the Discourse Tool. In discourse terms, this means empowering community members by providing them through the tool the opportunity to participate and have impact in the activity, i.e. to voice their opinions, comments, suggestions and contributions to the models through the tool.

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 381

We have outlined the instantiation of a process within the overall discourse, that we call (a bit simplified) the discourse process. This means in our context that any topical issue found in the focus domain, which has previously not been dealt with, can be recorded into the Discourse tool. The issues can be solved to coherently incorporate or modify the relevant ideas from them into the result models presented as the agreedon, common overall models in the DT and in the web environment for further discussions. This means, the models presented in the tool and subjected to discussions may evolve over time, and are accessible in their latest version to the community. In line with the CSCW theories, and the technological enablers, the information on the work area and the sub-areas is firstly, collected to the tool for further use and reference, and secondly, the tool gives a structure to this information. Thirdly, the rationale behind the developments in the models can be traced in the recorded discussions and contributions. Last but not least, the technologies augment mere recording and re-displaying the information, as the semantic web technologies enable visualizing the conducted discussions and contributions.

Figure 1. Discourse Process Lifecycle

3. Research objectives and methods The main objective of our study is to elaborate a sophisticated methodology and a technical environment, i.e. the Discourse Tool, to implement that methodology to support wide area networking of a professional community in their need of a domain model and collaborative modeling effort. Our research method follows a design science approach (March and Smith, 1995), aiming at constructing a technical environment supporting different phases of a discourse process based on semantic MediaWiki and a concept mapping tool (CMAP).

382 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

The Discourse Tool aims at supporting the following activities: • Collaborative conceptual and data modeling in the TEL domain • Identifying and coding relevant, new topics and issues found in the focus domain to the tool. • Collaborative solving of issues to coherently incorporate or modify the relevant ideas from them into the result models presented as the agreed-on, common overall models in the DT and in the web environment for further discussions. • Continuous support for evolvement of the models presented in the tool and subjected to discussions. • Participative modeling, i.e. ability of each team member to participate and influence the construction of specific models in the focus domain. In the following, we will show how the challenges were addressed and resolved using our integrated tool.

4. Discourse Tool 4.1. Objectives and key design principles of the Discourse Tool This section presents the main objectives, challenges and key design principles of the Discourse Tool. The tool aims at implementing discourse processes, computer supported collaborative work activities as well as supporting collaborative modeling. The main objective of the Discourse Tool is to provide a sophisticated environment for experts to collaborate and discuss conceptual problems related to specific domain. This main objective can be broken down into the following sub-goals (adopted from Smidt and Rodden, 1996): • supporting informal interaction between experts, • supporting information sharing and exchange (sharing views, objects etc.), • supporting decision making, • supporting coordination of efforts (both synchronously and asynchronously), • supporting domain directories, e.g. structuring the information and providing catalogues). In the ICOPER context, the main focus is TEL domain. We aim at supporting of interactions, decision making and modeling efforts between TEL experts, as well as at structuring the information about the TEL domain by means of specific categories (e.g. concepts, issues etc, see below) and relations between those categories. The above main challenges can be transformed to the following design questions: • How to bring up existing domain-specific topics and issues and initiate a new discussion by involving experts synchronously and asynchronously? • How to solve the brought up topics and issues and transform the relevant ideas into specific modeling activities? • How to ensure participative modeling, i.e. synchronous and asynchronous involvement of all the experts in the process of discussion and model construction? • How to ensure continuous evolvement of the discourse process and collaborative modeling efforts?

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 383

4.2. Content of the Discourse Tool This section presents the main entities comprising the content of the Discourse Tool. The content of the Discourse Tool consists of the following high level entities: • Concepts are notions / terms representing and describing key aspects of a specific domain, derived from discussion and discourse in the community. • Semantic relations show the connections between atomic notions, i.e. concepts on the level of meaning and logic. • Issues are real problems existing in specific domain that lead to initiation of new discussions and collaborative solving of those problems by means of collaborative modeling and knowledge exchange • Discussions contain processes of collaboratively resolving of existing issues by a group of experts. We distinguish general, managed or targeted discussions: General discussions aim at identification and prioritization of general issues existing in a specific domain. In contrast, targeted discussions are discussions of relatively complex, urgent and narrow defined issues aiming at creating potential solutions. Those issues require extensive discussions with clearly defined steps, explicit moderation and facilitation. Managed discussions follow a clear process, in most cases with already identified solution alternatives and a clear timeline. In the context of the ICOPER project, the above categories are instantiated with respect to the TEL domain (Figure 2). Therefore the issues are meant to be real TEL problems (Figure 3). By concepts we mean primarily atomic notions existing in the TEL domain, such as learning resources, learning outcomes, teaching methods, or a unit of learning.

Figure 2. The scope of the Discourse Tool

A specific category for the TEL domain and the ICOPER project is the category of standards and specifications, i.e. de-facto and de-jure standards and specifications and their drafts existing in the TEL domain. Standards and specifications are, in most cases, data models agreed by a certain community or a formal body. These are of particular importance as they are discussed controversially in the community. There are also a number of other TEL specific categories, e.g. technical services supporting TEL activities of learners and teachers, processes covering the interactions between learners and teachers as well as between learners and learning management systems, etc (Figure 3). However, these components are specific to the project and the task of building a

384 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

reference model in the TEL domain. For generic usage, we recommend to use the categories of concepts, discussion, issues and specializations for the targeted domain which are then extended by categories derived from an initial discourse process. However, it is necessary to limit the number of categories as they increase the complexity of the discourse process. The balance of the expressiveness of the categories and the complexity is also a challenge in every new context.

Figure 3. TEL specific categories available in the Discourse Tool

There are numerous semantic relations between the TEL specific concepts and issues. In order to minimize the complexity of the overall picture, only the most important relations between the instances of the categories are identified and presented in the

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 385

Discourse Tool. For example, the category Standards and Specifications have relations only with the categories Concepts and Issues (Figure 4) as usually a standard is related and created to describe a concept. The standard European Qualification Framework (EQF) uses such concepts as knowledge, skills, competency and qualification. This standard has been discussed so far with respect to a number of existing issues, e.g. definition of learning method.

Figure 4. Relations between the standard EQF and related issues and concepts

4.3. Discourse Process and Collaborative Modeling in the Discourse Tool This section presents an example how the discourse process is initiated and carried out by means of the Discourse Tool, as well as how collaborative modeling supports the Discourse Process. The following example is provided to illustrate how the whole discourse process is initiated, carried out based on the categories available in the Discourse Tool and results into collaborative modeling efforts. The example is domain specific (TEL domain) and taken from the ICOPER project. However, the general ideas presented here can be used for any kind of domain – however, as the discourse also depends highly on the context (such as domain, stakeholders), it needs to be validated and adapted for other contexts. The first phase is the initiation of a discussion (Figure 5). At this stage an expert raises an issue existing in the TEL domain, e.g. how to develop a metadata model for learning outcome profiles (Figure 6). The expert provides a brief description of the issue and major relations to other entities, i.e. concepts and specifications that are associated with this issue.

386 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

Figure 5. Initiation of the discourse

In the discussion phase, an expert starts a discussion by stating his initial position (Figure 7). Other experts respond to the initial topic and provide their opinions. As soon as the experts identify the main entities to be modeled to solve the initial issue two types of graphs are used. The process of discourse is guided by the concepts, their descriptions and relations. As the models can become very complex, we also support the visualization of concepts and relations. We use two methods to support the graphical navigation and conceptualization: semantic graphs and concept maps. The semantic graph is used to visualize the relations between the entities of the same category, e.g. issues or concepts. The semantic graph is generated automatically by using specific syntax (Figure 8). Furthermore, we use a conceptual modeling tool: The “IHMC Cmap Tools” is used for graphical oriented, collaborative, synchronous and asynchronous modeling of concepts and relations between them. IHMC Cmap Tools is a standalone knowledge modeling kit developed by the Institute of Human and Machine Cognition (IHMC, 2009). At the moment the kit is integrated into the Discourse Tool only on the interface level, but not on the level of functionality. It means that it is possible to trace online the progress of modeling efforts carried out by a team of experts, but not to revise the developed models through the Discourse Tool (Figure 9). The goal is to combine the characteristics of a graphical modeling tool with the advantages of a structured formal model in the discourse tool. By the combination of a formal modeling process, containing descriptions and concrete attributes, with graphical modeling, we create different perspectives and ease orientation in the discourse. However, the full integration remains an issue for further implementation.

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 387

Figure 6. Example of issue in the Discourse Tool

4.4. Implementation of the Discourse Tool This section presents the technical implementation of the Discourse Tool. Firstly we provide an overview of possible technical platforms for implementation of the Discourse Tool and explain why Semantic MediaWiki has been chosen. Secondly, we highlight the architecture and main functionality of the Discourse Tool. From the technical viewpoint, the Discourse Tool can be implemented in different ways by using existing open-source tools and platforms for collaborative work. In this paper we focus only on those tools that are free and open source software products. The

388 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

Figure 7. Discussion Thread

ICOPER project for which the Discourse Tool is being developed is assumed to be a non-commercial project funded by the European Commission. To make the tool available after the project run-time, it was decided to base the implementation on opensource products. As a first challenge, we decided on different components and functions of the tool to provide appropriate functionalities to the users without overloading with possible functionalities. The following criteria were used for the selection of the platform for implementation: • The Discourse Tool should provide sophisticated semantic links between various categories and instances • The Discourse Tool should be modifiable and compatible with other technologies, projects and initiatives, e.g. conceptual maps, RSS feeds. • The tool should be easy to use for both, modeling and domain experts. Generally, platforms and tools suitable for development of the Discourse Tool can be divided into the following mutually non-exclusive categories (Brown et al., 2007), (Noy et al., 2008): • Groupware • Wiki engines • Tools for collaborative knowledge construction Groupware is “software designed to facilitate teamwork and information sharing among team members” (Yoo, 1998). Examples of groupware software include e.g.

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 389

Figure 8. Semantic graph showing the relations between the issues available in the Discourse Tool

phpGroupWare (phpGroupWare, 2009), Collabtive (Collabtive, 2009) and Google Apps (Google, 2009). A distinguishing feature of groupware software products is that they include many small web based configurable applications that enhance and facilitate the collaboration between team members. For example, phpGroupWare consists of 50 small applications such as calendar, address book, project manager, file manager, etc. Groupware is relevant for implementation of the Discourse Tool, since it supports collaborative work and can be used for initiation and carrying out of discussions and building relations between different instances. Wikis are “collaborative web sites that allow users to add and edit content” (Brown et al., 2007). Examples of well known wiki engines include MediaWiki (MediaWiki,

390 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

Figure 9. View of a model in the Discourse Tool

2009), TikiWiki (TikiWiki, 2009) and XWiki (XWiki, 2009). A distinguishing feature of wiki engines as compared to the project collaboration software and groupware is their high modifiability and availability of numerous extensions that add extra functionality to the original engines. Wikis are relevant for implementation of the Discourse Tool, since they support many aspects and activities of the collaborative work. Tools for collaborative knowledge construction support the integration of Web 2.0. and semantic web approaches (Noy et al., 2008). Unlike wiki engines and groupware, tools for collaborative knowledge construction provide for example semantic links to different concepts instead of annotations or tagging and relate those concepts in a structured form, i.e. ontology. Examples of tools for collaborative knowledge construction include Semantic MediaWiki (Semantic MediaWiki, 2009), OntoWiki (OntoWiki, 2009), Collaborative Protégé (Collaborative Protégé, 2009), Hozo (Hozo, 2009), DBin (DBin, 2009). For our purpose, it is not necessary to implement a full groupware system as the platform for the Discourse Tool, since their system of annotation and tagging is not sophisticated and flexible enough whereas other functions are not needed. Wiki engines are flexible enough in terms of functionality and flexibility/modifiability; however their original system of annotation and tagging is quite limited. In contrast, tools for collaborative knowledge construction provide a sophisticated system of semantic annotation, however their flexibility is limited. A good compromise seems to be to combine flexibility of wiki engines and sophisticated system of semantic annotation of tools for collaborative knowledge construction by using Semantic MediaWiki as an extension for MediaWiki. The architecture of the Discourse Tool is provided in Figure 10. It is implemented based on Semantic MediaWiki and a number of extensions to it.

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 391

Figure 10. Architecture of the Discourse Tool

A MediaWiki is used as a basic platform for the Discourse Tool, whose semantic functionality is extended using a Semantic MediaWiki. A number of further semantic features are implemented by means of SMW+ or Halo (Ontoprise, 2009), Semantic Forms (Koren, 2009c), Semantic DrillDown (Koren, 2009a), Semantic Query Form Tool (Zehetner, 2009) and Semantic Result Formats (Yergler, 2009). SMW+ provides a number of highly usable tools to display categories of concepts and relations between them as an ontology, as well as the analysis of knowledge inconsistencies, e.g. undefined entities, anomalies, referential integrity. ‘Semantic DrillDown’ represents categories of concepts in a user friendly way by means of a drill down list. In this case the concepts are presented in alphabetical order for each category. ‘Semantic Forms’ are used as a user friendly tool to create new categories of concepts and their semantic properties. ‘Semantic Query Form Tool’ provides a possibility to carry out semantic search of concepts and filter the results. ‘Semantic Result Formats’ provides further extensions for Semantic MediaWiki, such as semantic calendar and event lines, that can be used to display events arranged during the Discourse Process. Along with semantic extensions, a number of non-semantic tools are used to extend the capabilities of MediaWiki and Semantic MediaWiki. Those tools are FCK Editor (Knabben and Walc, 2009), DiscussionThreading (Pond and Brice, 2008), ToDo Tasks (Grinberg, 2009), Polls (Sbarnea, 2008), Skoffer screencasting feature (Steinmann, 2009), Data Transfer (Koren, 2009b) and Widgets (Chernyshev, 2009). The FCK Editor extension enables a more intuitive WYSIWYG editor when editing wiki pages. DiscussionThreading is an extension that implements an improved discussion page system. It makes the discussion process on a wiki page well structured and more user-friendly. ToDo Tasks extension provides the ability to create and modify to-do tasks and assign them to specific wiki users. Polls extension provides a possibility to add polls on wiki pages. Skoffer adds a screencast recorder to MediaWiki. This feature can be useful with respect to Discourse Process, since screencasts can be used to capture for example a process of development of concept maps. Data Transfer is an extension of MediaWiki that allows users to both export and import data from and to the wiki via XML files. Finally, Widgets extension allows adding widgets to wiki by just creating pages in Widget namespace. This feature is useful, since many tools and applications can be imported to the Discourse Tool as widgets, e.g. a calendar or a news channel.

392 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

With respect to supporting of collaborative modeling activities of experts, the Discourse Tool has two extensions. On the one hand we use the extension Anysite to display the content of the Camp Server, i.e. a server side of the modeling tool IHMC Cmap Tools (IHMC, 2009). On the other hand we use semantic graph to display the relations between the entities of the same category.

5. Discussion In our study, a Discourse Tool to support collaboration in construction and maintenance of a domain model together with sub-domain models and alternative models was successfully implemented. The tool is available at http://discourse.icoper.org/ and, in this initial phase, it is supporting the ICOPER project by making available the domain model and sub-domain models together with detailed information on these, and by providing a discourse space for any exchange of information, opinions and ideas on the models among the project participants. However, the tool is not restricted to one project only, it is available to TEL community as such. The idea of the Discourse Tool is applicable to any domain where collaborative modeling is addressed and where there is a need to combine conceptual and formal modeling. The Discourse Tool has been constructed and introduced to the TEL community during the pilot. Already in this pilot phase it is being used both within the ICOPER consortium and beyond. Experts both from within the project consortium and from the broader community provide their inputs to the Discourse Tool. Based on this initial trial phase, we provide a short case study showing initial results and experiences from the practical usage. In our case, the Discourse Tool was used for different purposes – from building a common understanding of general concepts to discussing detailed aspects of data models. As an example, one major task was to elaborate a data model of learning outcome profiles (for the structure, see also Figures 6 and 7) – this approach is highly important in the community, however, different views on the modeling processes and data models exist. The discourse process started with the raising of the initial issue, i.e. building a common understanding on competencies and learning outcomes. During the discussion process, the experts identified the main concepts to be reflected in the data model, their definitions and semantic relations. All the steps of the discussion (Figure 7) as well the main outcomes of it (Figure 6) could be easily traced in the Discourse Tool. When the main components of the data model were identified, the experts used the Cmap Tools to build the total model. The whole process of the model development was transparent to a broad community via the Discourse Tool. The final version of the model is available in the Discourse Tool as well (Figure 6). If a collaborative modeling tool was used, the whole discussion process would have to be integrated in a more complex and less transparent way for the participating experts and the community. It would have been more challenging to trace the decisions on each step of the discussion process to resolve the initial issue. In turn, the usage of a groupware would made the modeling part of the discussion process less transparent to the outside world, since collaborative modeling tools are purely integrated into the groupware. Moreover, both groupware and collaborative modeling tools would have required a more complex system of annotations to trace the semantic relations between the components of the data model.

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 393

Summarizing our first experiences, the key elements of the Discourse Tool in our trial phase were: • The combination of graphical conceptual modeling (CMAP and semantic maps) with concrete data modeling in a discourse process. In the modeling process, basic awareness and a common understanding was supported mainly by the graphical modeling tools, in particular CMAPs. In this process, the more concrete steps were initiated, such as issue discussions and building of data models. The transparent continues process was helpful to the stakeholders as different levels of discussions and awareness / understanding were supported. • The combination of modeling and discourse as a dynamic process was highly important, in particular, as the stakeholders had to develop a common understanding coming from different backgrounds and perspectives. The simultaneous building of understanding and modeling shortened the time from the idea to the data model. • Participation of different stakeholder groups and expertise levels as the tool supports non-expert general discussions as well as technical modeling questions. Therefore, most stakeholders were able to participate on some level. • Combination of modeling with overall general discussions also was used to integrate all user types and expertise. However, not all discussions were joined by all stakeholder groups. This is not necessarily a weakness of the tool. However, the discussions should be planned carefully to enable participation of stakeholders in all, not only in selected fields. The actual broader usage of the tool after the initial validation starts in the first half of 2010. The Discourse Tool is planned to be used not only for gathering its content, i.e. instances of specific TEL categories and semantic relations between them, but also user experiences and feedbacks concerning the evolving models and their further development.

6. Conclusions and Future Research In this paper, we have presented the pilot phase and experiences in development and usage of the Discourse Tool, an environment for collaborative construction of models and resolving of related problems. The ideas underlying the Discourse Tools are based on the notions of computer supported collaborative work (CSCW) and discourse. These ideas are domain independent and can be used by experts from different areas. The tool has been successfully implemented as a part of the ICOPER project for Technology Enhanced Learning (TEL) experts. Based on this example we have shown how a discourse process is initiated, carried out and supported and facilitated for the collaboration in modeling. Finally, we highlighted how the Discourse Tool is implemented from the technical viewpoint, i.e. based on semantic MediaWiki. Based on the first experiences and case study, the tool has shown promising perspectives for collaborative modeling, focusing on the integration of different user levels as well as modeling types and levels. However, a variety of question have evolved in the first phase – specific attention has to be paid to the types and topics of general discussions and how they can be related to concrete (data) models to assure the participation of stakeholders in all parts of the modeling task. Additionally, there is a

394 D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts

strong need for a continuous modeling process, from awareness building to conceptual and data modeling. The selection of tool functionalities should be further analyzed. Furthermore, the tool needs to be validated on a larger scale: User experiences will be collected, and further studies for tool enhancement will be needed to fully understand the effects of the tool on modeling tasks for different stakeholders. Another point needing attention is the representation of the models in the tool. The current version of the Discourse Tool is integrated with CMap Tools. A broader variety of model representations would greatly enhance the use of the tool, and more research is needed for implementing this feature.

7. Acknowledgements This paper has been written in the context of the ICOPER Best Practice Network (http://icoper.org), which is co-funded by the European Commission in the eContentplus programme, ECP 2007-EDU-417007. The paper only states the opinions of the authors.

8. References Auramäki E., Hirschheim R. and Lyytinen K., Modeling offices through discourse analysis: the SAMPO approach. The Computer Journal, 35(4), 342-352, 1992. Bannon L.J. and Schmidt K., Four characters in search of a context, In Proceedings of the First European Conference on Computer Supported Cooperative Work ECSCW’89, Gatwick, London, 13-15 September 1989, 358-372. Bjørn-Andersen N. and Pedersen P.H., Computer facilitated changes in the management structure, Accounting, Organizations and Society, 5(2), 203-216, 1980. Brown M.K., Huettner B. and James-Tanny C., Managing virtual teams: getting the most from wikis, blogs and other collaborative tools, Wordware Publishing, 2007. Chernyshev S., Widgets extension, Online http://www.mediawiki.org/wiki/Widgets, accessed 25.01.2010. Collabtive, Online http://collabtive.o-dyn.de/, accessed 25.01.2010. Conklin J., Begeman M.L., gIBIS: a hypertext tool for exploratory policy discussion, ACM Transactions on Information Systems (TOIS) 6(4), 303–331, 1988. Davies J., Studer R. and Warren P., Semantic web technologies: trends and research in ontology based systems, Wiley & Sons, 2006. Ebersbach A., Glaser M., Heigl R., Warta A., Adelung A. and Dueck G., Wiki: web collaboration, Springer, 2008. Fairclough N., Analysing discourse. Textual analysis for social research, London & New York: Routledge, 2003. Fairclough N., Critical discourse analysis. The critical study of language, London & New York: Longman Group Limited, 1995. Fairclough N., Language and power, London: Longman, 1989. Giddens A., The constitution of society, Berkeley: University of California Press, 1984. Google Apps, Online http://www.google.com/apps/intl/en/business/index.html, accessed 25.01.2010. Grant D., Hardy C., Oswick C. and Putnam L. (Eds.), The SAGE handbook of organizational discourse, Thousand Oaks, CA: Sage Publications, 2004. Grinberg P., ToDo Tasks, Online http://www.mediawiki.org/wiki/Extension:Todo_Tasks, accessed 25.01.2010. Grønbæk K., Kyng M. and Mogensen P., Toward a cooperative experimental system development approach, In M. Kyng and& L. Mathiassen (Eds.). Computers and Design in Context, The MIT Press, Cambridge, Masachusetts, 201-238, 1997. Gruber T.R., A translation approach to portable ontologies, Knowledge Acquisition, 5(2), Burlington: Academic Press, 199-220, 1993. Grudin J., CSCW. History and focus, Online http://research.microsoft.com/en-us/um/people/jgrudin/past/ Papers/IEEE94/IEEEComplastsub.html, accessed 20.01.2010.

D. Kozlov et al. / The Discourse Tool: A Support Environment for Collaborative Modeling Efforts 395

Grudin J., Why CSCW applications fail: problems in the design and evaluation of organization of organizational interfaces, In Proceedings of the 1988 ACM conference on Computer-supported cooperative work, ACM Press New York, NY, USA, 85–93, 1988. Harasim L.M., Global Networks. Computers and international communication, MIT Press, 1993. IHMC Cmap Tools, Online http://cmap.ihmc.us/conceptmap.html, accessed 25.01.2010. Ishii H., Cross-cultural communication and CSCW. In: Harasim L.M. (Ed.): Global Networks, Computers and International Communication, MIT Press, 134-151, 1990. Janis I.L., Groupthink in small groups and social interaction, New York: Wiley & Sons, 1983. Knabben F.C. and Walc W., FCK Editor, Online http://www.mediawiki.org/wiki/Extension:FCKeditor_(by_ FCKeditor_and_Wikia), accessed 25.01.2010. Koren Y., Data Transfer, Online http://www.mediawiki.org/wiki/Extension:Data_Transfer, accessed 25.01.2010, 2009b. Koren Y., Semantic DrillDown, Online http://www.mediawiki.org/wiki/Extension:Semantic_Drilldown, accesses 25.01.2010, 2009a. Koren Y., Semantic Forms Online http://www.mediawiki.org/wiki/Extension:Semantic_Forms, accessed 25.01.2010, 2009c. Kyng M., Designing for cooperation - cooperating in design, Communications of the ACM, 34(12): 64-73, 1991. Lynch M., Pictures of nothing? Visual construals in social theory, Sociological Theory 9(1), 1-22, 1991. Lyytinen K., A Taxonomic perspective of information systems development: theoretical constructs and recommendations, In Boland R.J. Jr. and Hirschheim R.A. (Eds.) Critical Issues in Information Systems Research, Chichester: John Wiley & Sons Ltd., 3-37, 1987. March S. and Smith G.F., Design and natural science research on information technology, Decision Support Systems 15(4), 251-266, 1995. Markus L., Power, politics and MIS implementation, Communications of the ACM, 26(6), 430 – 444, 1983. MediaWiki, Online http://www.mediawiki.org/wiki/MediaWiki, accessed 25.01.2010. Noy N.F., Chugh A. and Alani H., The CKC challenge: exploring tools for collaborative knowledge construction, IEEE Intelligent Systems, 23(1): 64-68, 2008. Ontoprise SMW+, Online http://wiki.ontoprise.com, accessed 25.01.2010. Phillips N., and Hardy C., Discourse analysis: investigating processes of social construction, Thousand Oaks: Sage, 2002. Pond J.D. and Brice D., Discussion Threading, Online http://www.mediawiki.org/wiki/Extension:DiscussionThreading, accessed 25.01.2010. Pulkkinen M., Enterprise architecture as a collaboration tool: discursive process for enterprise architecture management, planning and development, Dissertation. Jyväskylä studies in Computing 93, University of Jyväskylä, 2008. Robinson M., Double-level languages and co-operative working, AI & Society 5(1), 34 – 60, 1991. Sarkkinen J., Design as discourse: representation, representatonal practice, and social practice, University of Jyväskylä, Dissertation, 2006. Sbarnea S., Poll extension, Online http://www.mediawiki.org/wiki/Extension:Poll, accessed 25.01.2010. Schmidt K., and Rodden T., Putting it all together. Requirements of a CSCW platform. In: Shapiro D., Tauber M. and Traunmüller R. (eds), The Design of Computer-Supported Cooperative Work and Groupware Systems, North-Holland Elsevier, Amsterdam, 157-176, 1996. Steinmann S.S., Skoffer extension, Online http://www.mediawiki.org/wiki/Extension:Screencasting, accessed 25.01.2010. TikiWiki, Online http://info.tikiwiki.org/tiki-index.php, accessed 25.01.2010. van Dijk T., Critical discourse analysis, In Tannen D., Schiffrin D. and Hamilton H. (Eds.), Handbook of Discourse Analysis. Oxford: Blackwell, 352-371, 2001. van Dijk T., Structures of discourse and structures of power. In Anderson J.A. (Ed.), Communication Yearbook 12, Newbury Park, CA: Sage, 18-59, 1989. XWiki, Online http://www.xwiki.org/xwiki/bin/view/Main/WebHome, accessed 25.01.2010. Yergler N., Semantic Result Formats, Online http://www.mediawiki.org/wiki/Extension:Semantic_Result_ Formats, accessed 25.01.2010. Yoo Y., Predicting groupware usage, Proceedings of the 31st Hawaii International Conference on System Sciences, vol. 6, 510-517, IEEE Computer Society Press, 1998. Zehetner G., Semantic Query Form Tool, Online http://www.mediawiki.org/wiki/Extension:SemanticQueryFormTool, accessed 25.01.2010

396

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-396

On Context Modelling in Systems and Applications Development Anneli HEIMBÜRGER1a, Yasushi KIYOKIb, Tommi KÄRKKÄINENa, Ekaterina GILMANc, Kyoung-Sook KIMd and Naofumi YOSHIDAe a University of Jyväskylä, Finland b Keio University Shonan Fujisawa Campus, Japan c University of Oulu, Finland d National Institute of Information and Communications Technology, Japan e Komazawa University, Japan Abstract. Context is a multi-dimensional concept. It is hard to define context generally for computer science. Which information is considered as context, which is not? Why are the certain context elements relevant for a certain case, but irrelevant for another? How to explain this to computers? Can computers learn these issues as humans do? In our paper we present different viewpoints to the concept of context and to context modelling starting from requirements engineering and ending up to multi-disciplinary education. Based on context related literature research and discussions in our paper, we can summarize that a complete and comprehensive definition and model of context is difficult to achieve and may not even be appropriate at all. However we can conclude that there is a common understanding that context always relates to an entity, context is used to solve a problem, context depends on the domain of use, context depends on time and context is evolutionary. Keywords. Context, context modelling, requirements engineering, ubiquitous systems, location awareness, movement awareness, multi-disciplinary education

Introduction Context modeling and computing are current and important research and development issues when we are designing and realizing intelligent systems and applications for example in ubiquitous environments. Traditional software applications usually know by design in which situations they are to function. This is not the case with ubiquitous systems. These systems are interacting with the world that is changing all the time. Hence they have to dynamically adjust their functions and output behavior in real time. Various areas of computer science (and also other sciences) have been investigating the concept of context over the last decades. The concept of context is still a matter of discussion, and through the years several different definitions have been proposed [1, 2]. The definitions can be divided into extensional and intensional definitions. Extensional definitions present the context through a list of possible context dimensions and their associated values. Intensional definitions present the concept of context more formally. Extensional definitions seem to be useful in practical applications, where the abstract concept of context has to be made concrete. Intensional definitions are of little use in practice, despite being 1

Corresponding Author.

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

397

theoretically satisfying. Context modelling approaches can be classified by the scheme of data structures which are used to exchange contextual information in the respective system. Context should be seen as a function of interaction between users, objects and environment, and as a consequence of focus or attention. Context can emerge in the moment and it can change quickly [3]. Contexts can be static, dynamic, discrete, continuous, individual or collective. Almost any information available at the time of an interaction can be seen as context information. Some examples are: identity, spatial information, temporal information, and environmental information, social situation, nearby resources, physiological measurements, feelings and impressions. Our general approach to context typology is presented in Figure 1 [4].

Figure 1. Context typology

From the view point of intelligent systems and applications development, it is essential to integrate contextual information and knowledge with other types of data, information and/or knowledge related to the system or application under construction. Contextual information and knowledge function as an additional major source for reasoning, decision-making and adjustment to form a coherent and versatile architecture in ever-changing world. This can be realized by means of the ability to capture, model, represent, manipulate and manage contextual information. Contextual information may include in addition to physical characteristics of the task environment also cognitive factors such as knowledge states of both the application and the user, or user’s emotions (kansei in wider sense) and social factors such as networks, relations, roles and hierarchies [2].

398

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

For promoting discusssion around the concept of context we organized the paanel session “Context Modellin ng” in the 20th European-Japanese Conference on Informaation Modelling and Knowledgee Bases (EJC2010). The idea of the panel was initiatedd by Professor Yasushi Kiyoki from Keio University SFC and by Senior Researcher Annneli U of Jyväskylä, Department of Mathemattical Heimbürger from the University Information Technology. Our O paper is based on the presentations in the panel sesssion. Panelists present their ownn viewpoint to context. Our paper is organizeed as follows. On the role of context and it’s modellingg in requirements engineering is discussed by Professor Tommi Kärkkäinen in Sectionn 1. The section focuses on thhe concept of context and its relation to and appearancee in requirements engineering literature. l Section 2 presents context from ubiquitous sysstem developer’s perspective byy Researcher Ekaterina Gilman. The focus of the section iss on the phases of context proccessing: context acquisition, context modelling and conntext reasoning and it presents a general view on context-aware system development. Moobile n Section 3 by Expert Researcher Kim Kyoung-Sook. The intelligent is discussed in section focuses on location n and movement awareness. In Section 4 Associate Professor Naofumi Yoshida introducces multi-disciplinarity as one approach to context study. The focus of the section is on health care education and sign language for gloobal communication. Section 5 summarizes our paper.

1. On the Role of Contextt and It’s Modelling in Requirements Engineering (Tommi Kärkkäinen) Software/systems (S/S) devvelopment joints together two main facets: business (whaat is needed) and technology (w what can be provided). When the application domain of S//S is attached to these facets, thhe central role of user (needs and expectations) appears. It is this overall setting that und derlines the actual S/S development processes and phases,, i.e., requirements engineering, S/S design, implementation, testing, and maintenance.

Figure 2. 2 S/S context, Model Views (MV) according to [5]

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

399

If possible, these phases should originate from an iterative and incremental development (and deployment) model, which nowadays have in many cases also global, distributed, and multicultural character. S/S is always developed for certain environment to fulfill certain needs. Hence, the general S/S deployment context, from organizational process perspective, is steered by business processes and the supporting workflows by users (or machines if automatized). Such task orientation in the application domain, as emphasized by Adam et al. in [6], provides fruitful means for achieving more appropriate and usable systems. Taking universities as on example (in Finnish context, that is), the main business process as defined by the Ministry of Education and Culture, is to produce bachelor, master, and doctor degrees, and these degrees consist of courses provided by university teachers as part of their educational workflows. The context of use of S/S is, hence, modelled by describing the environment of use, the users in that environment, and the behavior provided by S/S. The technological realization, from modelling perspective, is then supported by characterizing S/S’s structure and implementation. Moreover, I propose to differentiate the Information System (IS) development and SoftWare (SW) development (or engineering) such that IS focuses on the models of use/user/UI and data, whereas SW is interested in architecture and implementation as well. This general view is illustrated in Figure 2, where the five Model Views (MV) are according to [5].

1.1. On the Concept of Context According to Merriam-Webster’s Online Dictionary [7] context refers to: 1. the parts of a discourse that surround a word or passage and can throw light on its meaning 2. the interrelated conditions in which something exists or occurs : environment, setting Similarly, Oxford Online Dictionary [8] provides a definition of the form: the circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood However, these basic definitions do not define or provide further guidelines to what are the constituents of S/S context, i.e. what attributes and qualities should be defined to capture the discourse, conditions, events, and other circumstances to fully understand the use of software/system to provide its specification. In order to reveal the contextual attributes we first explore context-awareness related to mobile computing. In [9] by Chen and Kotz, the division provided by Schilit et al. [10] is summarized (see also [11]) into the following three categories: • Computing context: network connectivity, communication costs, communication bandwidth, nearby resources such as printers, displays, and workstations • User context: user’s profile, location, people nearby, social situation • Physical context: lightning, noise levels, traffic conditions, temperature Furthermore, (ibid.) propose to add the fourth category • Time context: time of day, week, month, season of year Finally, (ibid.) propose the following overall definition: Context is the set of environmental states and settings that either determines an application’s behavior or in which an application event occurs and is interesting to the user.

400

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

Ali et al. in [12] summarize the meaning of context as conditions in the operating environment of a system that influences how the system should behave in different situations. From these we observe that because application’s behavior always depends on its own state as well, this dimension (if not one-to-one with environmental variables tracked) could be included separately in computing context. From S/S architecture point of view, important contextual information (see, e.g., [13]) is provided by available resources (e.g. existing databases) and other S/Ss to be interacted with. External interfaces with the environment of S/S are also emphasized in requirements specification standard [14]. Feather et al. in [15] point out that deployment in real setting means interjecting into a pre-existing context of activities (cf. Figure 3), established designs and standard practices. To conclude, I propose to include in the S/S context at least the following concepts: • Computing context: communication context (networks, costs, bandwidth), existing computational resources (S/Ss, DBs, devices, technology platforms, OS), system state • User context: user’s profile (age, skills, preferences), location, proximity (e.g. distance to other users), social situation (including laws, rules, regulations, culture, etc.; see, e.g., [16]), workflow(s) to be supported by S/S • Physical context: physical environment of S/S (temperature, humidity, noise, speed, lightning) • Temporal context: time, date, day of the week When attributes constituting the context are defined and their changing values are captured, we can start considering context history [9]. Hence, the temporal change of context becomes part of relevant information for context itself. We can then start a new research track on context mining, but that is another story… Anyway, contextual variability is defined by Salifu et al. in [17] (for fixed attributes) “as a space of variables whose different values require different application behaviors.” S/S variability, thus, naturally depends on the context. To this end, a relevant bootstrap-type of observation is to recapitulate the definition given by Lehman in [18] for the so-called E-Programs: “The program has become a part of the world it models, it is embedded in it.” E-Programs thus change their own context of deployment. According to the comprehensive definition guidelines just given, every S/S is potentially an E-Program.

1.2. On the Role of Context in Requirements Engineering Requirements engineering is concerned with describing S/S to be built. In general, one speaks about functional requirements when referring to the S/S’s actual behavior and nonfunctional requirements when the quality of the behavior is defined. Until 1990s, one typically stated the functional requirements in the form “The system shall…” e.g. “… store completed degrees” (see, e.g., [14] Section 5.3.2). However, nowadays user and context of use (role of user) of S/S are usually given, in the form of use cases [19] or user stories [20]: “The study secretary stores information on a completed degree to the university-wise database.” According to the famous article by Brooks [21, p. 17] "The hardest single part of building a software system is deciding precisely what to build.” We remind here that this statement is not concerned of just describing what to build, in appropriate form, but to decide what’s in it and what’s out of it (scope, boundary). Moreover, Brooks [21]

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

401

does not provide restrictions on when to make decisions, thus allowing incremental and iterative development processes to be applied. Next we briefly summarize our findings related to the role of context in requirements engineering. These observations are based on a non-representative sample obtained after quick exploration of some relevant requirements engineering forums (latest IEEE RE conferences 2007-2009, relevant IEEE’s publication series on software engineering in 2007-2009, quick search of “context” + “requirements engineering” with IEEEXplore®). However, our impression after only such an initial exploration is that the topic of contextuality in requirements engineering definitely needs (and is definitely worth of) further studies. Ali et al. [12] share this opinion. Maiden [22] provides an example of stating a functional requirement within computing/communication context: ”If device is accessing 3G network, then display local street map.” Glinz [23] states a nonfunctional requirement in (loosely defined) computing and user context: ”The response time shall be less than 0.5 seconds in 98% of all user input actions.” Callele et al. [24], especially in relation to computer game development, suggest to include user’s (= player) emotional state in user context description. Seyff et al. [25] propose to gather and discover new requirements with a mobile tool in the actual work (flow) context. From the RE importance perspective, Kamata and Tamai [26] conclude that rich description of context characterize normal projects whereas a poor description is one indication of an overrun project. Context modeling studies are even harder to find. The direct relation between context and S/S development is provided by the so-called context diagram, which is a particular form of data flow diagram that shows an entire system in the context of its external entities. Such a diagram is utilized e.g. in problem frames [27], as system context diagram [28], and together with use case –based development approach in [19]. Ali et al. [13] suggest integrating three different requirements engineering approaches – goal modeling, feature modeling, and problem frames – to facilitate treatment and modeling of contextual variability in requirements. Alshaikh and Boughton [29] suggest augmenting the scope and system boundary view of context with “dynamic element of every stage of (software) development”. This element is called context dynamic matrix (CDM) and it captures perception and influence related to the context of a requirement allowing reasoning to take place. We conclude this short exploration by noticing the lack of unified treatment of context, shortage of expressing all attributes defining the (relevant) context, and neglect of their full address to state a requirement in context. Moreover, depending on the application domain, the set of relevant attributes for context definition varies. To this end, special context models or modeling languages seem to be missing as well, because taking into account all context categories and attributes with diagram-based models supposedly overpopulate such a description. Hence, (e.g. XML-based) description with a structural form appears to be most appropriate way to define the context of S/S. A related example of XML-based representation is provided in [30] for incorporation user contextual information into software asset retrieval.

1.3. Future Work I think that context in relation to S/S requirements and their engineering provides a fruitful field of constructive and empirical research. We should agree on the definition of context in the field of S/S development, define in full flavor the set of all relevant attributes determining the context, and develop unified ways to express requirements in

402

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

context. Actually, without comprehensive characterization and documentation of context, it can be very difficult to assess such basic characteristics of good requirements as unambiguity, completeness, and verifiability (see [14]). In this writing, I have not much touched the actual context modeling. For me, modeling means precise description of relevant entities to be modeled. In the context of context modeling, we must first agree which are the relevant characteristics to be included in the characterization of context, and what and how to adapt for different situations to be described. Then these characteristics and their intrinsic relations with each other must be typed. This constitutes the constructs of the context modeling language, similarly to the archetypal example UML. Moreover, meta- and metametalevels can and should be used for precise definition of syntax and semantics of context-aware requirement specification languages.

2. Context from Ubiquitous Systems Developer’s Perspective (Ekaterina Gilman) Ubiquitous systems support users with information services at the right time, place and situation by embedding heterogeneous, invisibly interacting computer resources in human surroundings. These systems adjust themselves based on the user’s behaviour and environmental changes, such as time, location, other users etc. In other words, these systems are context-aware, they sense the current context and know how to behave in it, and hence the understanding of what is the context and how to operate with it is needed. A lot of research has been done trying to define context generally. Early context was defined by synonyms and examples, for instance, by location, time, etc. [31]. Schilit et al. [11] define the main important aspects of context, such as where you are, who you are with and what resources are nearby. Later, Dey suggests broadening the view on context and considers the overall situation of application and users involved [31]. Zimmermann puts constraints on Dey’s definition by introducing five categories for elements that describe the context: individuality, activity, location, time, and relations [32]. It is hardly possible to define context generally, however, it is clear that context constitutes the information that is important for the subject (system or person) and the set of important information varies significantly depending on subject’s nature, goal and situation. System is context-aware if it “uses context to provide relevant information and/or services to the user, where relevancy depends on the user’s task” [31]. To be usable the context should be formally represented, so the system can operate with it. Moreover, the system should know how to retrieve the context data and what to do when the certain situation occurs. Generally, “uses context” includes the following steps: context acquisition, context modelling and context reasoning.

2.1. Using Context for Building Ubiquitous Systems Context acquisition is obtaining of the environmental (physical, social, etc.) information. These data are gathered by using the physical sensors, services, manual input and by learning. The reliability of the sensors is important issue and should be addressed by a system developer. Often, complex pre-processing of signal data (such as segmentation, feature extraction, classification and post-processing) is needed before the meaningful low-level context can be obtained, for instance, movement patterns

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

403

based on accelerometer data. Difficulties appear to the situations when not all dvance, for instance which features characterize a pattern bbest. information is known in ad Moreover, often to get the meaningful context, several sensors data shouldd be nsor data fusion. combined, this is called sen Wei and Chan define context c modelling as representing, structuring and organizzing contextual data and relatioonships between them, in order to facilitate the storage and operations of them [33]. Th here are many requirements the context model should satiisfy, such as support for heterogeneity h of resources, validation, uncertainty and incompleteness of informaation. Plenty of formal representations for the context moodel have been suggested, for in nstance ontologies, key-value pairs, etc. [34]. Nurmi and Floréen define context reasoning as deducing new and relevvant information to the use of appplication(s) and user(s) from the various sources of conttextdata [35]. Generally, conttext reasoning is responsible for new knowledge discovvery from the given context datta (e.g. inferring high-level context from low-level), conttextdependant system behavio oural patterns, and system consistency support. There are many issues related to reasoning, r most important are the following. First, qquite restricted part of nature and our everyday activities can be logically formalizzed. hould discover or learn these logical and casual dependenncies. Second, how the system sh Figure 3 shows the geeneral architecture of a context-aware application. Sensorrs or concept observations, whicch build the lowest, context acquisition level, are used too get the data about the real worrld. At the second, context modelling level, these fused ddata are validated and wrapped d in an operable format, and this constitutes the low-leevel context. For higher-level contexts c more intelligent reasoning and learning techniqques (e.g., rule-based reasoning g, probabilistic reasoning) need to be applied together w with optional integration with external e data (context reasoning level). Applications use this information to perform their actions. Context acquisition, context modelling, conntext reasoning levels, and application form a context-aware application.

Figurre 3. Architecture for context-aware applications

2.2. Future Work Current knowledge and technology form a strong basis for ubiquitous systeems vailable solutions for the low-level issues of context-aw ware development. There are av systems (environmental sensing, communication infrastructure, etc.). Moreover, pleenty

404

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

of frameworks facilitating the context-aware systems development are available from the research community. Currently ubiquitous research is concentrated more on human aspects, such as multimodal interfaces development, reducing the explicit input to the systems, user support, acceptance and control of ubiquitous applications. There is the main question for ubiquitous systems research: “How can we make ubiquitous systems more useful, reliable and trustful for users?” To provide the solutions for this question, we have to address the following core issues. How to deal with incomplete context data? The problem is that we cannot fully describe the surrounded world, with all its relations and exceptions. The system should intelligently behave with incomplete or even wrong data. Sometimes, even in similar situations the user behaves differently. How can the system achieve this level of understanding for the user’s intentions? Current prototypes still require quite extensive user input. Many behavioural patterns cannot be pre-coded before the system usage, hence the system should continuously learn from the user. Adjustment of the system’s behaviour should be simple and natural for the user. As can be seen, all these tasks are related to the context. The existing technology and the ongoing efforts of the research community allow us to move in the semantic direction for context representation. The next step for practical implementation is to encode this semantic knowledge in capable devices of surrounding environment. This way system can acquire the knowledge from environment, not only data, by studying the objects and their semantic relations; hence, the learning facilities could be significantly improved. We expect the majority of applications to become more context-aware, providing the users with personalized and intelligent support. This way, applications would match humans’ needs more and allow users concentrate on their everyday activities rather than on the system control. The most significant contextaware application areas are healthcare and environment preservation, which have direct impact on humans’ wellbeing. Acknowledgements: The author of this section would like to thank Professor Jukka Riekki for discussions on the topic.

3. Location Awareness in Mobile Intelligence (Kim Kyoung-Sook) With the combination of ubiquitous mobile computing and knowledge processing, context-awareness is one of main research issues for improving system behavior to provide suitable contents, applications, or services to mobile by being aware of their current situation and needs. At first, context is defined as 'location, identities of nearby people and objects' by Schilit and Theimer in [36]. More recently Abowd and Dey reviews context as 'any information that can be used to characterize the situation of an entity, which is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves' in [37]. According to two definitions of context, we can think primary 4W (Who, What, Where, When) context information to represent identity, activity, location, and time. Among those contexts the location is one of the essential context to determining our activities and orientation in the real world, especially in mobile environment. In this section, we discuss the mobility context and privacy problem due to over-sharing location information in ubiquitous mobile applications.

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

405

3.1. From Location Awarenness to Movement Awareness Computing Along with the widespreadd of smart phones equipped with positioning technologiees as well as GPS, location-awarre computing is a key element to develop mobile applicatiions and services. We can caategorize location-aware features for mobile applicatiions according to [37] as follow ws: • Presentation of lo ocal information and services to a user based on the uuser location and interrest • Automatic execution of a service based on the user's current location • Tagging of location context such as geographical features to informationn for later retrieval A service that shows the users a list of nearby friends, points of interest suchh as restaurants, theaters, or shopping malls, navigation services which provide best rouutes to a driver, and automaatic geo-coding services, that put geographical locaation information to pictures wh hen they are taken are good examples of applications havving location-aware features. However, we still have many considerations for nnext generation location-aware computing on the basis of the context of mobility. Mobility means changge of location over time. In mobile environment, users exppect to access their contents annd receive services, and connect to their friends or famiilies consistently and seamlesslly with mobile devices, even though they fully move from fr place to place. However, many m traditional location-awareness applications focus onn the current location of a userr and who or what is surrounding. Only using the currrent location is difficult to provvide more suitable content and data to user's situation. The ability to determine the mobile m user's future location can significantly improve the quality of contents and seervices as mentioned in [38]. For example, a user wantts to meet his/her friends on thhe train when they get close to each other. However, iff the friend is about to go out of o a station when the user arrived at the same station, they t cannot make face-to-face interaction even though they are on the same place. By anticipating their locations, the service can determine that they will come close to eeach other and help them to meet m each other in the right time. Namely, movement-aw ware applications adapt their fuunctionality to the upcoming situation. Figure 4 shows the movement awareness as a next n evolutionary step in location awareness.

Fig gure 4. Evolution of location-aware computing

406

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

Movement context is rarely used in real application because of heterogeneous positioning techniques and representation of location space even though it is widely believed to be useful. For example, we have several methods to track and identify the location of objects such as GPS, Cellular Wireless Network, or Wi-Fi positioning. Depending on the positioning method, the location information of user may be differently estimated. The location representation of indoors such as ‘(BuildingA, Room201)’ also is unlike outdoors such as ‘(N35°41 E139°46)’. In order to understand movement context of mobile users and provide high-likely relevant information and services to them, a system should store historical locations, preserve the continuous trajectory information of each user, and make collaboration their trajectories. The trajectory is fundamental information to anticipate near-future location and behavior of the mobile user. As shown in [39], traditional trajectory models have been designed on the basis of the same geographic space and same coordinate reference system. In ubiquitous mobile computing, however, various forms of location information should be handled and integrated in the system. Thus, we need to consider a new model and methodology to represent trajectories which can cover heterogeneous location information and space and construct seamless space from outdoors to indoors. 3.2. Future Work: Preserving Privacy in Location-Aware Applications Location-aware services are changing of our life patterns. Many mobile users have already created their contents with real-time geographic location information and overshared them on their social network. However, it means they allow people to know where they are or that they are not in their house. For example, someone order pizza to your house using a fake identity when you are absent from home. In other words, location-awareness brings up future threads on location privacy as shown in [40]. Therefore, we need to consider technologies to protect users’ personal information including location and other context information in context-aware environments. In [41], the privacy preservation techniques are surveyed briefly in order to defense the location privacy threats. The authors distinguish different types of defense techniques as following: • Network and cryptographic protocols • Access control mechanisms to avoid obtaining certain context information • Obfuscation techniques • Identity anonymization techniques. For example k-location anonymity aggregates locations of k-number of users into a certain area making each user indistinguishable among a sufficiently large number of individuals. Also it is used for avoiding location tracking for continuous location updates or queries. Although many research efforts have been focused on privacy protecting techniques, still there are many open issues and challenges for more complex context-aware services.

4. Multi-Disciplinary Approach to Context Modelling (Naofumi Yoshida) A multi-disciplinary, an inter-disciplinary, or a cross-disciplinary approach is an interesting and a challenging way to study context modelling to find the modelling target itself. It is difficult to model context in general way because we have several

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

407

contexts to be defined. It is hard to design the context from a simple example in a single discipline. It is too complex to design the context for the real and complex world. When we focus on a certain topic with multi-disciplinarily, we have a possibility to find the way to design context model and apply it to various topics. In this section, the examples of context modelling in multi-disciplinary environments are given and the future work of the approach is discussed. 4.1. Example 1: Multi-Disciplinary Education We can find a lot of multi-disciplinary approaches in education at universities. For example, Faculty of Global Media Studies in Komazawa University [42] is one of the multi-disciplinary faculties. Figure 5 shows the overview of the conceptual structure. After learning basic skills such as Information Technology literacy, Media literacy, Languages, and practical examples, the structure provides the multi-disciplinary view from disciplines. Learners will obtain a chance to mixture of several disciplines after selecting a main discipline. 4.2. Example 2: On-Line Healthcare One of the important topics of multi-disciplinarily is the overlapping discipline of Medical and Information Technology (IT). It is directly related to our life. In this example, on-line healthcare service is provided by the combination of following three disciplines: IT, medical and health care. Figure 6 (a) shows the structure of this example. This project is led by International Medical Information Center, Japan [43]. IT provides on-line and digital sensing technology for individual users. Medical doctors are providing suggestions for individual on-line users. Health Care Nurses suggest concrete activities for daily life to keep the health. This structure is given by the collaboration of following three different kinds of organizations: universities, doctors and local governments. Each organization has a certain profession in each discipline. The structure is the key of the multi-disciplinary activities in the example.

Figure 5. An Example of multi-disciplinarily: multi-disciplinary education

408

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

Health Information

Analysis with Doctors

Cosmetics Users

Center for Medical Information

Interface Integration

Healthcare Services Health Management for Elder-People with Local Governments

Medical Care and Rehabilitation in Hospitals

Cosmetic Services Health Promotion in Sports Centers

Inter-Maker Product Databases

User Community on Wiki

Users and their families

(a) On-Line Healthcare

(b)

Cosmetic Community

Figure 6. An example of multi-disciplinarily: on-line healthcare and cosmetic community

4.3. Example 3: Cosmetic Community This example has been formulated for on-line communities by database and writable web pages. The example shows the multi-disciplinary structure of IT and Cosmetic areas as shown in Figure 6 (b). Universities, makers of cosmetic products, and users are joining to the structure. The structure provides exchanging function for user experiences in a certain area of user community. Fresh and real information can be obtained by the discussions about not only products information but also how to use the products. In the experiments we use Wiki system, which allows the easy creation and editing of web pages via a WWW browser. 4.4. Example 4: Sign Language for Global Communication This is the example of multi-disciplinarily of the combination of IT and Sign Language areas as shown in Figure 7. In the area of sign language, there are a lot of issues such as difficulty of learning sign language, and globalization for each sign language in each country. This example provides the functions for education of sign language by providing movies of actual sign language, and international dictionaries for local and international sign languages by building the dictionaries.

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

409

User Communities Sign Language Retrieval Dictionary including Movies Japanese Sign Language

American Sign Language

…

International Sign Language

Figure 7. An Example of multi-disciplinarily: sign language for global communication

4.5. Future Work Multi-disciplinary environments provide us a new and challenging way to extend the study of contexts and context models. We already have a lot of fundamental models in the area of information modelling and knowledge bases. We have a possibility to apply these models to various areas, for example to multimedia information and to ubiquitous computing. Education and medical areas also are important areas as discussed in this section. Extending the study of contexts and context models enrich directly our society and human life to solve multi-disciplinary problems in the real world.

5. Conclusions Context is a multi-dimensional concept. In our paper we have presented different viewpoints to the concept of context and to context modelling starting from requirements engineering and ending up to multidisciplinary education. Requirements engineering including requirements development and management is the core because no system exists without these activities. Based on context related literature research and discussions in our paper, we can summarize that a complete and comprehensive definition and model of context is difficult to achieve and may not even be appropriate at all. However we can conclude that there is a common understanding that firstly context always relates to an entity, secondly context is used to solve a problem, thirdly context depends on the domain of use, fourthly context depends on time, fifthly context is evolutionary because it relates to a dynamic process in a dynamic environment and finally that context has an important role not only (traditional) context-aware applications but also in other application areas as well. In Figure 8 we present our view of context flow architecture. An example of applying our context flow architecture to cross-cultural environments is given in [1].

410

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

Actor Input

Context Input Interface

Explicit/Tacit Knowledge Input Interface

Feedback

Static Context High Level • Street Address • Station Name

Dynamic Context

Low Level • Coordinates

Low Level • Series of coordinates

Mapping Function to High Level Contexts

Context Logs DB

Context Integrator and Manager

Context-Sensitive Service in the Application Environment

External Web Resources

High Level • Route • Icon, Sign • Video

Reasoning Engine with Appropriate API (for example OWL API)

Context Ontology DB

Output to Actor

Figure 8. Context flow architecture

Rule DB

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

411

References [1]

[2] [3] [4]

[5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26]

Heimbürger, A., Nurminen, M., Venäläinen, T. and Kinnunen, S. 2010. Modelling Contexts in CrossCultural Communication Environments. In: Heimbürger, A., Kiyoki, Y., Tokuda, T. and Yoshida, N. (Eds.). Proceedings of the 20th European-Japanese Conference in Information Modelling and Knowlwdge Bases, Jyväskylä, Finland, May 31 – June 4, 2010. University Printing House, Jyväskylä, Finland. P. 311 - 321 Cassens, J., Kodof-Petersen, A., Zacarias, M. and Wegener, R. K. (Eds.) Sixth International Workshop on Modeling and Reasoning in Context (MRC 2010), Lisbon, Portugal, July 17th, 2010. Heimbürger, A. 2009. Temporal Entities in the Context of Cross-Cultural Meetings and Negotiations. In: Kiyoki, Y., Tokuda, T. and Jaakkola, H. (eds.), Information Modelling and Knowledge Bases XX, Frontiers in Artificial Intelligence and Applications 190, 290-308, Amsterdam: IOS Press, 2009. Kiyoki, Y., Heimbürger, A., Jaakkola, H. and Takahashi Y. Contextual Computing of MultiDimensional Educational Knowledge based on Kiyoki’s Semantic Associative Search Method. In: Isomäki, H., Häkkinen, P. and Viteli, J. Future Educational Technologies, University of Jyväskylä, Publications of Information Technology Research Institute 20, 2009. University Printing House, Jyväskylä, Finland. P. 148 - 170 Pressmann, R. Software Engineering – A Practitioner’s Approach, McGraw-Hill, 5th edition, 2001. Adam, S., Doerr, J., Eisenbarth, M. and Gross, A. Using Task-Oriented Requirements Engineering in Different Domains – Experiences with Application in Research and Industry, Proceedings of 17th IEEE International Requirements Engineering Conference (2009), 267–272. Merriam-Webster Online Dictionary, referred September 27th, 2010 . Oxford Dictionary, referred September 27th, 2010 . Chen, G. and Kotz, D. A Survey of Context-Aware Mobile Computing Research, Technical Report TR2000-381, Dartmouth College, Department of Computer Science, 2000. Schilit, B., Adams, N. and Want, R. Context-Aware Computing Applications, Proceedings of 1st IEEE Conference on Mobile Computing Systems And Applications (1994), 85-90. Schilit, W. N., A System Architecture for Context-Aware Mobile Computing, PhD thesis, Columbia university, 1995. Ali, R., Yu, Y., Chitchyan, R., Nhlabatsi, A., and Giorgini, P., Towards a Unified Framework for Contextual Variability in Requirements, Proceedings of 3rd International Workshop on Software Product Management (2009), 31–34. Buschmann, F. Learning from Failure, part 1: Scoping and Requirements woes, IEEE Software 26 (2009), 68–69. IEEE Computer Society, IEEE Recommended Practice for Software Requirements Specifications, IEEE Standard 830, 1998. Feather, M., Hicks, K., Mackey, R. and Uckun, S., Guiding Technology Deployment Decisions Using a Quantitative Requirements Analysis Technique, Proceedings of 16th IEEE International Requirements Engineering Conference (2008), 271–276. Liu, L., Zhang, H. and Peng, F., Understanding Chinese Characteristics of Requirements Engineering, Proceedings of 17th IEEE International Requirements Engineering Conference (2009), 261–263. Salifu, M., Yu, Y. and Nuseibeh, B. Specifying Monitoring and Switching Problems in Context, Proceedings of 15th IEEE International Requirements Engineering Conference (2007), 211–220. Lehman, M. M., Programs, Life Cycles, an Laws of Software Evolution, Proceedings of the IEEE 68 (1980), 60-76. Cockburn, A. Writing Effective Use Cases, Addison-Wesley Longman Publishing Co., Inc, 2001. Beck, K. and Andres, C. Extreme Programming Explained: Embrace Change, 2nd Edition, AddisonWesley, 2005. Brooks, F., P. No Silver Bullet - Essence and Accidents of Software Engineering, IEEE Computer 20 (1987), 10–19. Maiden, N. Where are We? Handling Context, IEEE Software 26 (2009), 75-76. Glinz, M., A Risk-based, Value-oriented Approach to Quality Requirements, IEEE Software 25 (2008), 34-41. Callele, D., Neufeld, E. and Schneider, K. Emotional Requirements, IEEE Software 25 (2008), 43-45. Seyff, N., Graf, F., Grünbacher, P. and Maiden, N. The Mobile Scenario Presenter: A Tool for In Situ Requirements Discovery with Scenarios, Proceedings of 15th IEEE International Requirements Engineering Conference (2007), 365–366. Kamata M. I. and Tamai, T. How Does Requirements Quality Relate to Project Success or Failute?, Proceedings of 15th IEEE International Requirements Engineering Conference (2007), 69–78.

412

A. Heimbürger et al. / On Context Modelling in Systems and Applications Development

[27] Problem Frames Approach, referred September 27th, 2010. [28] System Context Diagram, referred September 27th, 2010 . [29] Alshaikh, Z. and Boughton, C. The Context Dynamics Matrix (CDM): An Approach to Modelling Context, Proceedings of 165h Asia-Pacific Software Engineering Conference (2009), 101–108. [30] Hadji, H. B., Kim, S-K. and Choi, H-J., A Representation Model for Reusable Assets to Support User Context, Proceedings of IEEE International Symposium on Service-Oriented System Engineering (2008), 91–96. [31] Dey, A., K., Understanding and Using Context, Personal Ubiquitous Computing, vol. 5, no. 1, pp. 4-7, 2001. [32] Zimmermann, A., Context Management and Personalisation: A Tool Suite for Context- and UserAware Computing, Doctoral Thesis, Fraunhofer FIT, 2007. [33] Wei, Edwin J. Y., and Chan, Alvin T.S., Towards Context-Awareness in Ubiquitous Computing, International Conference of Embedded and Ubiquitous Computing, Taipei, Taiwan, pp.706-717, 2007. [34] Strang, T., and Linnhoff-Popien, C., A Context Modeling Survey, Workshop on Advanced Context Modelling, Reasoning and Management, Sixth International Conference on Ubiquitous Computing, Nottingham, England, 2004. [35] Nurmi, P., and Floréen, P., Reasoning in Context-Aware Systems, Helsinki Institute for Information Technology, Position paper, 2004. [36] Schilit, B. and Theimer, M., Disseminating Active Map Information to Mobile Hosts, IEEE Network 8 (1994), 22-32. [37] Abowd, G. D., Dey, A. K., Brown, P. J., Davies, N., Smith, M. and Steggles, P. Towards a Better Understanding of Context and Context-Awareness, In Proc. of the 1st international Symposium on Handheld and Ubiquitous Computing (1999), 304-307. [38] Laasonen, K., Clustering and Prediction of Mobile User Routes from Cellular Data, In Proc. of Knowledge Discovery in Databases (2005), 569-576. [39] Forlizzi, L., Güting, R. H., Nardelli, E. and Schneider, M. A Data Model and Data Structures for Moving Objects Databases, SIGMOD Rec. 29, 2 (2000), 319-330. [40] Duckham, M. and Kulik, L., Location Privacy and Location-Aware Computing, In Drummond, J., Billen, R., Forrest, D. and Joao, E. (Eds) Dynamic & Mobile GIS: Investigating Change in Space and Time, chapter 3, 34-51. CRC Press, 2006. [41] Bettini, C., Jajodia, S., Samarati, P. and Wang, X. S., Eds, Privacy in Location-Based Applications: Research Issues and Emerging Trends, Lecture Notes In Computer Science, vol. 5599. Springer-Verlag, 2009. [42] Komazawa University, Educational Curriculum in Faculty of Global Media Studies, Komazawa University, referred September 27th, 2010, . [43] IMIC: International Medical Information Center, referred September 27th, 2010 .

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-690-4-413

413

Future Directions of Knowledge Systems Environments for Web 3.0 Koji ZETTSU a,1 , Bernhard THALHEIM b,2 , Yutaka KIDAWARA a,3 , Elina KARTTUNEN c,4 , and Hannu JAAKKOLA c,5 a National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289 Japan b Christian-Albrechts-University Kiel, Computer Science Institute, 24098 Kiel, Germany c Tampere University of Technology, P.O.Box 300, FI-28101 Pori, Finland Abstract. The internet and web applications have changed business and human life. Nowadays everybody is used to obtain data through the internet. Most applications are still Web 1.0 applications. Web 2.0 community collaboration and annotated data on the basis of Web 3.0 technologies supports new businesses and applications. The quality dimension of the web is however one of the main challenges. Knowledge systems target at high-quality data on safe grounds, with a good reference to established science and technology and with data adaptation to user’s needs and demands. Knowledge system can be build based on existing and novel technologies. This paper discusses the challenges, two solutions and the fundamentals of knowledge system environments. Keywords. web, information systems; knowledge web, next generation web; context, content, concept, and topic modelling; cloud computing, cloud services; knowledge system environments; information credibility, universal communication;

1. Introduction 1.1. Web x.0 Evolution and the Knowledge Web For almost two decades the internet was a linkage of networked servers, which was entirely used as a worldwide source for researches. It resulted in an aggregate of billions of static web sites, which was accessed via hyperlinks. Websites have mainly been author-driven. They have been aiming to support users depending on their information need and demand, so the focus was chieﬂy on the mutual trust between user and provider. The utilisation of these sites can be modelled by 1 [email protected] 2 [email protected] 3 [email protected] 4 elina.karttunen@tut.ﬁ 5 hannu.jaakkola@tut.ﬁ

414

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

story spaces. The story space speciﬁcation results in storyboards that are schemes for utilisation by a large variety of users. Web 1.0 is author driven and uses as stories • at the provider side publish/provide story/support or advertise/wait/attract/react/retain and • at the user side inform/subscribe/obtain/answer/come back. Web 1.0 has mainly be oriented towards content provision, which basically meant to deliver content together with a rudimentary functionality. These main functionalities can be: • navigation facilities for inside site or page navigation; • acquisition possibilities of information for users from simple content that is based on text, media data such as pictures, audio and video data; • linking facilities; • search and browse facilities providing to users. Websites are mainly oriented towards content delivery, provide some functionality and are using a large variety of presentation facilities. Web 1.0 has made author-driven static content available to numerous users. Users could access exclusively the web pages for researches and personal investigations. The control and management from the ’top’ didn’t provide any scope or client-side opportunities for development. This has changed with the evolution of Web 2.0, the so-called social web, as a development process powered by collaborative brainstorming, in which the collective cooperation is to the fore. Meanwhile there are no bounds set to the today’s web. With the establishment of user communities, users obtain an abundance of information by high-tech sophisticated services, interchange experiences and beneﬁts by the mass collaboration every single day, because data acquisition and data diﬀusion are basically accomplished by user interactions inside the whole web story space. While Web 2.0 integrates collaboration, Web 3.0 provides annotation techniques. These annotation techniques are typically based on linguistic semantics of words used for a reference of data chunks to user semantics. These techniques provide a very good background for sophisticated search and representation techniques. Fully-developed Web 3.0 is characterised by the formula (4C + P + VS) where • 4C means content, commerce, community and context • P is used for personalization and • VS denotes vertical search. But what is missed in the future of web, is quality. We want to reach this level of quality with the aid of semantics and pragmatics in respect of the user proﬁle and life cases. We are convinced that lexical semantics composes the base frame of the Next Generation Knowledge Web. Figures 1 illustrates the general facets of websites. We distinguish six diﬀerent facets: presentation (layout and playout) of pages within a website, (aggregated and prepared) data and functionality provided by the systems that support the website, stories and context behind the application logic of the website, and the user space that is based on a description of the intentions. Web 1.0 was mainly based on presentation systems with supporting systems for aggregated

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

415

data (called content) and functionality. Web 2.0 allows context injection and is user-centered and story-centered. Web 3.0 extends the data content by annotation that are meaningful to users, i.e. provides content together with topical data. The knowledge web extends this dimension by explicit support for concepts beside annotations. It additionally allows an adaptation to the user and the context thus providing information the user really needs. Goal, application area proﬁle, information demand

User and intention

Storyboard

Context Technics organisation

Stories tasks

Website development space Functionality

Data Content, objects, knowledge

Navigation, search, work

Presentation Interfaces depending on the environment Figure 1. Separation of concern for development of Web x.0 websites

1.2. Knowledge Web - Do We Have a Need for That? Human often meet a situation in which additional information, knowledge or at least fact are urgently demanded. This knowledge on demand is however not uniquely determined. It depends on the user, the current user situation, the data on hand, the background, the policies of data providers, etc. Example (knowledge demand): Let us consider the large variety for knowledge demand of people after the Iceland EyJafjallajokull Glacier volcano eruption on March 20, 2010: • How long this situation will inﬂuence travel in Europe? Remember that the last Eyjafjallajokull eruption lasted for two years, and it is possible that this one will do the same. How weather conditions such as the anti-cyclone situation inﬂuence on ash spread? • What are the contents of ash? Could particles of rock, glass and sand clog up aircraft engines? What are the fears of the eﬀect of volcanic ash on plane engines? Are there other components on aircraft that are equally sensitive to particles? Is driving more dangerous than ﬂying through ash? As ﬂights resume, how dangerous is it to ﬂy through a volcanic ash cloud? Are the airlines right with their requirement to resume ﬂights on manual control by pilots depending on visibility? Which safety tests showed that the engines could cope in areas of low-density ash? • Why mathematical simulations have been used for decision making? Why mathematics has partially failed in making predictions?

416

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

• How the weather changes can be explained after the volcano eruption? Why scientist were incorrect in their prediction for the weather impact? (The European summer in 2010 was far colder than any prediction could foresee. This summer seems to be a counterexample for the climate change discussion. Watching the enormous plumes of dust and ash rising from Eyjafjallajokull, it is hard to imagine that this almost week-long eruption would not have any eﬀect on weather and climate. But scientist expected that there is no change.) • What is the economical impact of such eruptions in general and of this eruption in special? What is the impact of the eruption for North Sea ﬁshery, for industry, for tourism, etc.? • What are the passengers rights for stranded passengers or cancellations? What are the best sources of advice? How I can cope with my personal situation? E.g., who gets priority on seats now ﬂights are running again? • Why icelanders enjoy their volcanos? • How clouds depends on volcanos and ﬂights? Jet contrails are eﬀectively acting as cirrus clouds, reﬂecting solar energy in the day, acting as a blanket by night. • Is there any correlation to other climate change drivers such as sun activity? What are the implications of ionospheric plasma bubbles? To what extent are sunspot activities related to economic cycles? This small list can be extended in many directions and illustrates the variety of knowledge that is necessary to satisfy the demand of people. The example shows that we need diﬀerent data, concepts, explanations, theories, and information. In general, knowledge system environments must support the following kinds: • state-of-the art, -aﬀairs, -knowledge, -science; • deﬁciencies, missing or withhold facts; • background, scientiﬁc explanations, science, potential theories, analysis; • cross links, bindings; • associations; • facts with quality properties, full or partial picture; • predictions, possible tactics and strategies for the future; • restrictions, generalisation; • analogies; • history beside news; • ways to cope with and the outcome for the future; • consequences; • links with headlines and quality assessment. This list of knowledge pieces or chunks that must be provided can be categorized by the utility that the knowledge provides as follows: Orientation knowledge allows to cope with the situation, to explain, and to survey the history, the scenario, the facts, the summarisation or generalisation and the overall view. Tacit or action knowledge is based on practices, technics, methods, and strategies. It provides rules, procedures, check lists, principles, strategies, law, regulations, comments to regulations in order to manage situations.

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

417

Explanation knowledge gives reasons, arguments for explanation of claims or arguments or assertions or recommendations (what, why,, ...). Sources knowledge links to knowledge on data sources (meta knowledge) such as knowledge on archives, references to communication, or cross links. Activity knowledge supports working, adaptation or processing, operating on analogies, and coping with errors. 1.3. The Notion of Knowledge The notion of knowledge6 is one of overused terms. Knowledge has two sides. It is knowledge in general deﬁned by a noun from one side and the knowledge by a user expressed by the verb ‘to know’ from the other side. Knowledge as sustainable, potentially durable and veriﬁable grounded consensus: The required information can be qualiﬁed as knowledge, if the information 1. is consensus with a world and a community, 2. is based on postulates or principles that create the fundament for the knowledge, 3. is true according to a certain notion of ’truth’, 4. is reusable in a rule system for new information, 5. is long-lasting and existing for a long time, 6. has an eﬀect and is sustaining within a society, community or world, and 7. is not equivalent to other information that can be generated with the aid of facts or preliminary information in the particular inventory of knowledge by a rule system. Knowledge as the state of information of a user: Diﬀerent kinds of ‘to know’ are: 1. The state or fact of knowing. 2. Familiarity, awareness, or understanding gained through experience or study. 3. The sum or range of what has been perceived, discovered or learned. 6 The deﬁnition provided by the Encyclopedia Britannica [27] considers two ‘Janus’ meanings beside the obsolete ‘cognizance’ and the archaic ‘sexual intercourse’: (I) as the fact of knowing something: (Ia1) the fact or condition of knowing something with familiarity gained through experience or association; (Ia2) acquaintance with or understanding of a science, art, or technique; (Ib1) the fact or condition of being aware of something; (Ib2) the range of one’s information or understanding; (Ic) the circumstance or condition of apprehending truth or fact through reasoning or cognition; (Id) the fact or condition of having information or of being learned; (II) the body of things known about or in science: (IIa) the sum of what is known: the body of truth, information, and principles acquired by mankind; (IIb) a branch of learning (synonyms of knowledge: learning, erudition, scholarship) meaning what is or can be known by an individual or by mankind. We prefer this approach over the approach taken by the Wikipedia community who distinguishes between communicating knowledge, situated knowledge, partial knowledge, scientiﬁc knowledge and know-how or know-what or know-why or know-who knowledge.

418

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

4. Learning; erudition: teachers of great knowledge. 5. Speciﬁc information about something. 6. Carnal knowledge. We conclude therefore that within the scope of the Knowledge-Centered Web, it is necessary to deliver knowledge as enduring, justiﬁed and true consensus to users depending on context, users demands, desiderata and intention, whereby these aspects are supported by social facets, the environment, the proﬁle, tasks and life cases of the users. Life cases, portfolios and tasks constitute the information demand of every users. The information demand of users requires a special content quality. It results in the requested knowledge, which is also depending on the understanding and motivation of users. So, the requested knowledge of users is a composition of understanding, and information demand, whereby the information demand is an aggregated component of life cases, motivation, intention and quality. 1.4. The Knowledge Delivery Task for Web 3.0 Knowledge Systems Environments The knowledge delivery task of the Knowledge-Centered Web is deﬁned as: Deliver the knowledge the user really needs through (1) concepts at the educational level level of the user that are illustrated and extended by (2) content which is quality content depending on the external and internal quality of the aggregated data (media object suite) and that are depicted by (3) topics in the language, in the culture and in the application portfolio of the user. Therefore, knowledge delivery and acquaintance for the user is user-oriented and life-case-based content, concepts and topics. 1.5. Survey of This Paper Section 2 starts with a discussion of one of the most challenging applications for knowledge system environments: Cloud computing is based on highly distributed content, a large variety in understanding its concepts and in annotating content. It also provides functionality on demand. Therefore, the knowledge access is going to change to combination of knowledge chunks based on knowledge sharing. Beside the classical understanding of systems/infrastructure/software as a service we face also the research challenge of developing solutions for knowledge-as-a-service. Cloud knowledge sharing and provisions raises many novel research topics since the demand and the proﬁles of potential users and the capabilities of cloud services must be taken into consideration. One of the most challenging questions is the integration of context. Section 3 approaches the development of knowledge system environment based on the observation that the carrier of knowledge is data. Therefore, knowledge services are based on data-intensive services that must provide knowledge based on the needs under direct supervision of data incompleteness and partnership context. A number of classical research solutions must be extended to such needs such as data or knowledge integration and collaboration support. Section

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

419

4 shows the potential and discusses the challenges of knowledge system environments to users. Knowledge processing is based on novel principles of universal communication. Since data is provided by diﬀerent services with varying quality, varying scope and varying support, integration of data and compilation of knowledge must be enhanced by automatic quality maintenance. Creditability assessment is of the basic functions for knowledge compilation. The WISDOM system allows to derive creditability and reputation of web pages. Section 5 develops a foundation for knowledge systems environments. This foundation is based on a separation of concern into content through data, concepts through models and notions, topics through language or carrier expressions and information through explicit representation of the user understanding. These dimensions can nicely be enhanced by a context dimensions that allows to develop cloud-based service systems as discussed in Section 2.

2. Cloud-Based Knowledge Services and Context Sensitivity The recent development of IT has been towards service orientation, globalization and adaptability [26]. These demands have created opportunities for the development of cloud technology. According to [2] the development towards clouds is rooted in the development of Web 2.0 and the impact it has on the nature of the services oﬀered. The shift from contractual and high-commitment services to low-commitment self-services has brought up the need for new kinds of knowledge sharing and storing opportunities. 2.1. Cloud Services Cloud computing refers not only to applications delivered as services over the Internet, but also to the hardware and systems software that provide these services. The applications are usually referred to as SaaS, where the datacenter hardware and software are called the cloud [2] further deﬁne clouds as public, where a service is being sold in pay-as-you-go manner to the general public, and private clouds as the systems inside a company. They refer to public clouds as utility computing, thus deﬁning cloud computing as the combination of SaaS and utility computing. According to [2] the most relevant new aspects brought by cloud computing from a hardware point of view are: 1) the illusion of inﬁnite computing resources available on demand, 2) the elimination of an up-front commitment by cloud users and 3) the ability to pay for use of computing resources on a short-term basis as needed. Utility computing is preferable when demand for service varies with time or when demand is unknown in advance. [3] divide commercial cloud service architecture into four levels, which include physical machines, virtual machines, resource allocators and users. Services can be directed to each of these levels, which can be seen using the [40] ontology of ﬁve cloud service layers as described in Figure 2. The services are directed to applications, software environments, software infrastructure and hardware, end additionally to the software kernel, which consists of the physical servers that

420

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

Figure 2. The cloud service layers and their services [40]

compose the cloud. The infrastructure layer consists of data, communication and computational resource services. 2.2. Using Clouds for Knowledge Sharing Providing knowledge management tools as a service is a recent development in the IT service industry. The trend is towards a knowledge cloud to enhance the speed and accuracy of ﬁnding relevant knowledge for company use or customer service. Xu and Zhang (2005) state that, with a knowledge cloud type of service, the customer sends a query to a server and gets back an answer according to the data and knowledge mining speciﬁcations in the cloud. For knowledge-sharing services, some new application opportunities are presented by [2]. For example, time consuming parallel batch processing and analytical tasks can be processed through an application that includes enough parallelism to hide the complexity of the task from the user, and simultaneously uses hundreds of cloud servers. The cost of moving large amounts of data into the cloud system must be evaluated in comparison to the need for speedy analysis. However, the need for data analysis is on the rise, as a growing share of computing resources is used for understanding customers, supply chains, buying habits, ranking and so on. The knowledge-as-a-service involves three participants, as described in detail in the Figure 3. Data owners collect data from their daily business and are allowed to utilize the data in the cloud. They are themselves responsible for protecting the secrecy of the data outside the cloud. The knowledge service provider is divided into a knowledge extractor and an accessible knowledge server. Data sent by the data owners is extracted through security algorithms and agreed sets of grouping or ﬁltering algorithms to enable data queries to hit the right data for each customer.

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

421

[41] accentuate the fact that the knowledge server owns the knowledge, but does not necessarily own the data behind it. Service providers serve the customers who consult the knowledge base in their decision making. The customers can hide the query instance from the service provider to further enhance the security of their knowledge.

Figure 3. The paradigm of knowledge-as-a-service[41]

2.3. Problems of Using Clouds for Knowledge Sharing Companies employ cloud services today to improve scalability and to react to varying resource demand. At the moment the services are often limited to one provider at a time due to inﬂexibility and varying interfaces [3]. In order to widen the use of clouds in services, some common standard interfaces have to be followed [3,40]. The lack of common solutions and regulations prevent the companies from utilizing the open market data processing providers; according to [3] this also poses a threat to the ability to develop market-oriented knowledge services. Cloud technology is improving quickly, but currently there are no common guidelines or policies. This creates issues of a legal and ethical nature regarding knowledge sharing. The security of the knowledge is of the upmost priority of many organizations, but there is no common legislation to cover the issue, even less if the cloud is international. Knowledge breaching (attack on the server) is one of the knowledge security issues connected to centralizing data in large service provider entities. Using a small number of adaptively selected queries, the attacker can use the responses to pry into parts of the knowledge. Combining these parts can generate knowledge close to the original. 2.4. Context Sensitivity of Knowledge and Clouds Cloud computing only works if the data ﬂow from the user to the cloud, within the cloud and back to the user is eﬃcient. Geographic, technical and political borders can fracture the cloud into smaller pieces and hinder the distribution of knowledge. According to [14] it is also important to consider how the user can extract certain information according to the context. Context is any information that can be used to characterize the situation of a person, place or object relevant to the interaction between a user and an application. This context deﬁnition by [4] also includes the user and application themselves.

422

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

A communication situation usually has a context, and as [14] state, context is expressed as associations between knowledge units. Without context, new knowledge will not necessarily be absorbed, understood, and accepted. In the case of the knowledge-as-a-service model by [41], the knowledge service provider acts as an intermediary between the input knowledge and the query responses of the customer, thus creating a context-sensitive environment for knowledge transfer. The response depends partly on the input data, but also on the ﬁltering and grouping algorithm of the data extractor and the way the queries are made by the customers. When receiving the knowledge service provider’s response, the customer will be further dependent on the context to interpret and absorb the knowledge, and to act accordingly. The context sensitivity of the cloud as a means to distribute knowledge is manifested in all of the knowledge-as-a-service model stages, as there might be multiple actors handling the data inputs from the owners and knowledge outputs to the customers. As stated by [17, p. 238], the absorption of modern techniques has virtually nothing to do with deep-rooted core beliefs. We standardize for convenience, but the mental agendas remain inviolate and hidden. 2.5. Future The infrastructure software of the future will probably run on virtual machines. Application software will most probably be divided into the customer part and the cloud part. Of these, current Web 2.0 systems do not yet oﬀer utilization of the customer application part when it is not connected to the cloud. The cloud part itself has to be very eﬃciently scalable both up and down, which is not typical in the case of traditional software systems. On the hardware system level, new clouds require development of memory hierarchy, as the lowest level of software will be virtual machines. Also, routers, switches, and bandwidth will need improvements to meet the needs of the future [2]. Globalization is one of the main trends in the IT world. In the increasingly geographically distributed IT environment, clouds are becoming tools that allow even more possibilities to collect and share knowledge between actors all over the world. Globalization and the use of cloud services increase the impact of context in knowledge sharing, bringing into consideration the eﬀect of national cultures. Current cloud interfaces, as mentioned above, have to be standardized in order to enable open cloud service markets. If the context sensitivity and knowledge security issues can be acknowledged and tackled, cloud computing could be the future of knowledge distribution and services.

3. Towards Knowledge Systems for Data-Intensive Science Knowledge is seen as human understanding gained through study and experience, which includes perception, skills, training, common sense, and experience. Traditional knowledge systems are passive archives of information on facts and heuristics, but pay little concern for the context in which they will be used. Meanwhile, knowledge systems are expected to provide actionable information available in

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

423

the right format, at the right time and at the right place for decision making. Therefore, next-generation knowledge systems need to evolve to be a more sophisticated platform for evaluating validating and presenting information content with respect to veriﬁable truth claims. A motivating example is shown in the evolution from experimental and theoretical sciences to emerging data-intensive science [12]. Today, scientists are exploring data captured by large-scale, complex instruments or generated by simulations stored in computers. In this emerging phase, knowledge developed primarily for the purpose of scientiﬁc understanding is being complemented by knowledge created to target practical decisions and action. Until now, the demand for applications science is emerging. Followings show the core characteristics of the applications science compared with the basic science: Need driven. The applications science focuses on the knowledge to seek courses of action and determine their consequences (i.e., societal needs), rather than seeking answers to questions (i.e., scientiﬁc curiosity). Externally constrained. External circumstances dictate when and how applications knowledge is needed, not according to academic schedules based on when and how the best knowledge can be obtained. Useful even when incomplete. Despite incomplete data or partial knowledge due to the loss of scientiﬁc stationarity, the means of making eﬀective use of partial knowledge must be developed, including robust inferences and statistical interpretation, in order to make decisions with establishing conﬁdence. Addressing the above challenges, interdisciplinary synthesis of knowledge resources is a crucial requirement for next-generation knowledge systems. Figure 4 shows a practice in National Institute of Information and Communications Technology (NICT) for space environment science [20]. The system aims to ﬁnd associations among heterogeneous domains of data, such as solar-space observation data, space weather forecast data, biology data, science chronological data, earth weather station data, and newspaper article data. The principal questions of users reﬂect societal needs rather than scientiﬁc curiosity, for instance “What are the social implications of ionospheric plasma bubbles?”, “To what extent are sunspot activities related to economic cycles?”. To answer those questions, the system analyzes correlations (both semantic and spatiotemporal) among the multidisciplinary data, and makes inferences to ﬁnd a set of associative data as an evidence for the questions, or user’s hypotheses. This kind of knowledge system introduces following unique challenges : Data intensive. Data sources that support basic science are often insuﬃcient to support applications. New applications-appropriate sources must be identiﬁed, and new ways of observing (including the use of communities as data gatherers) must be developed. While science data has value for data-owner scientists, sharing data with others should increase rather than decrease that value. Service-oriented architectures. Service orientation is widely acknowledged for its potential to revolutionize the world of computing by abstracting from all resources as services in a service-oriented architecture (SOA) [24]. The SOA

424

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

Figure 4. The knowledge system discovering phenomenon-oriented relationships among heterogeneous domains of data in space environment science.

helps to mitigate the transition to new underlying technologies and enable the linkage of data and resources. Migration to cloud computing. Cloud computing allows to host, process, and analyze large volumes of multidisciplinary data. It oﬀers obvious advantages, such as co-locating data with computations and an economy of scale in hosting the services. Studies on large synthesis datasets also create a need for collaborative tools in the cloud. 3.1. Knowledge Correlation Computing Large synthesis of multidisciplinary data builds an evolving network of community knowledge. NICT’s knowledge cluster systems [42] aim to realize it by linking heterogeneous knowledge resources owned by diﬀerent communities. The basic idea is similar as the World Wide Web (the Web in short). As the Web provides a framework of inﬁnitely-evolving document repository, the knowledge cluster systems provide a framework of inﬁnitely-evolving knowledge repository. Because knowledge is context-dependent information in nature (remember that knowledge is actionable information available in the right format, at the right time and at the right place for decision making), links between knowledge resources are also context-dependent in contrast with the simple links between Web pages in conventional Web. Linking two diﬀerent knowledge resources requires ”bridge concept”, the information describing semantic relations between two diﬀerent communities of knowledge. We are proposing a method for organizing the bridge concepts by using the semantic space based on vector space model, in which the knowledge is rep-

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

425

Figure 5. Knowledge correlation computing by semantic space model.

resented by a vector in the subspace (Figure 5). The advantage of this method is that we can manage context-dependent correlations between knowledge resources by measuring vector correlations in subspaces [21]. In addition, time scale or geographical scale can be included in the context. The link mechanism focuses on the relations between knowledge resources only in the given context, by which the complexity of bridge concept is greatly reduced compared with giving universal relations in traditional approaches. The relations are deﬁned as signiﬁcance of correlations in liner scale. It allows ambiguity or uncertainty of relations between diﬀerent domains of knowledge, which increase ﬂexibility of the bridge concept. 3.2. Data-centric Service Collaboration on Global Knowledge Grid The Global Knowledge Grid [43]is an ICT infrastructure for implementing the knowledge cluster systems based on collaborations of knowledge services [23] in distributed environments. The Global Knowledge Grid stems from the necessity of providing knowledge and processing capabilities to everybody, thus promoting the advent of a competitive knowledge-based economy. Knowledge services on the Global Knowledge Grid are classiﬁed into the following categories: (1) knowledge discovery service provides concept descriptions of an information source based on knowledge discovery approach by data mining, such as segmentation, classiﬁcation, summarisation, and ontological annotations. The MDS services in Figure 5 are examples of this type of service. (2) knowledge association service produces various kinds of associations between multiple information sources based on their concept descriptions provided by the corresponding knowledge discovery services. The CALC service in Figure 5 is an example of this type of service. (3) knowledge delivery service is responsible for (visual) presentation and navigation of the results from knowledge discovery services and knowledge association services. It also handles user interfaces and interactions. These types of knowledge ser-

426

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

vices are developed and deployed by a wide variety of groups joining the Global Knowledge Grid in parallel. An application on the Global Knowledge Grid is realized by a speciﬁc collaboration of those services. The Global Knowledge Grid provides a data-centric service collaboration, called Service MeshUp. Figure 6 illustrates how the Global Knowledge Grid works according to the Service MeshUp mechanism. The Service MeshUp, described by our original XML-based process deﬁnition language, consists of a set of aspects, each of which deﬁnes a single task or function in an application. Each aspect has its own properties (aspect properties) to be shared by the services. For each aspect, an application developer determines a set of services to be involved, and then, for each service, he/she deﬁnes precondition, behavior, and postcondition with respect to the aspect properties. Here, the precondition describes the conditions for activating the service with respect to the aspect properties. For instance, the precondition becomes true when a speciﬁc aspect property is modiﬁed, or has a speciﬁc value. The behavior deﬁnes what the corresponding service will do when it is activated. Basically, the functions of the corresponding service are invoked. The postcondition deﬁnes what will be done after the behavior. In most cases, the data obtained as the result of the behavior will be set to the aspect properties. In contrast with conventional service mashup methods based on workﬂow model, like WS-BPEL [8], our Service MeshUp aims at realizing service collaborations sharing common interests, each of which is represented by aspect properties. While both models are convertible with each other, our Service MeshUp model is more suitable to self-organization processes and ever-evolving processes such as building collective knowledge or monitoring situations than transactional processes like business workﬂows.

4. Knowledge Processing for Universal Communication 4.1. Universal Communication Nowadays, the Internet has become an essential information infrastructure. We search information on the Internet when we need to obtain knowledge. Furthermore, ubiquitous and mobile computing technologies are providing a new knowledge processing technology in the real-world. The ICT infrastructure which is capable of distributing massive amounts of information anytime and anywhere, is becoming more wide spread. The infrastructure supports human communication in order to provide useful information. Universal communication, any kind of information with anyone, is a big challenge of a next generation information analysis technology. Some of barriers are still remained to realize universal communication: language barrier, miscommunication due to untrusted information and long distance. Those barriers should be overcome by eﬀective new technology. We are focusing on information analysis technology to ﬁnd valuable and credible information from Web content.

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

427

Figure 6. Global Knowledge Grid: A data-centric SOA platform for synthesis of knowledge resources.

4.2. Information Analysis for Credibility of Web Content General users write their own daily news and post information they consider interesting as digital documents for blogs and SNS. Such digital content includes both valuable information and worthless, false, and demagogic information. Ordinary web search engines can display web pages in a particular order. The topranked web page does not always contain valuable and credible information. Nevertheless, readers often trust the authenticity of the displayed information. Web search engines such as Google retrieve web content using search keywords. The PageRank method evaluates the score of web content and generates a ranked list. The top-ranked web content on search engines is often relevant to the user’s query, though, in some cases, the content may not be credible or valuable. Even if users believe that the content is useful, the search engine cannot evaluate the retrieved digital content, and users have to retrieve various contents using diﬀerent keywords. NICT’s Information Analysis Technology addresses the issue of information credibility by analyzing credibility based on the following criteria: 1) content, 2) sender, 3) appearance, and 4) authenticity of the content. We believe that the

428

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

understanding of texts by a machine is important, and that an NLP approach is very eﬀective in evaluating the credibility criteria. By using diﬀerent methods for analyzing the information credibility criteria, credible information can be obtained, which eventually becomes valuable knowledge. Sometimes, vast amounts of knowledge may have to be combined to understand even a single social topic, which is not feasible when using conventional search engines. An Ordinary Web search engine does not consider the credibility of the Web content. The top-ranked Web content corresponds to the level of credibility the search engine displays. A novel technology that extracts information credibility criteria from the digital content on the Internet is expected to be developed as a next-generation Web technology. We strongly believe that the understanding of text by a machine is vital for the information credibility criteria. In the conventional document-processing method, a document is considered as a mere bag of words. NICT’s approaches are based on the natural language processing (NLP) technology. NICT’s ultimate goal is to develop information analysis methods to evaluate the four above mentioned criteria. The method will show multifaceted credibility criteria for supporting the human judgment of information credibility. The aim of this study is to make the selection and distribution of credible information easier by using our analysis methods. People use the Web not only to obtain information on what they want to know but also to collect information for decision making. They generally use conventional search engines such as Yahoo! and Google, but it is diﬃcult to judge the credibility of information from the obtained search results. This is because these search engines rank Web pages in terms of a single measure that is not always related to the credibility of the Web pages. To judge the credibility, users must examine the Web pages individually. For example, when a user searches for information on “bio-ethanol”, which is arguably said to be a bio-based fuel, using a conventional search engine, commercial pages that advertise its eﬀect on environment are highly ranked, and the other pages remain in low ranks. From this search result, it is diﬃcult to judge whether “bio-ethanol” is really good for environment. In order to obtain credible criteria information, it is necessary to develop a new system that automatically processes information on a given topic and provides a bird’s eye view and multifaceted views on the topic. For example, these views are a summary page that tabulates reports and opinions on the topic, people and organizations that wrote the pages, whether contact addresses and information sources are speciﬁed, and so forth. We have been developing an information credibility analysis system, WISDOM (Web Information Sensibly and Discreetly Ordered and Marshaled) [1]. WISDOM enables users to browse a large volume of information from a bird’s eye view and multifaceted views while changing a search condition on the basis of each factor of information credibility. By using this system, users can obtain credible information on a topic of interest more eﬃciently. We propose a method for providing a bird’s eye view of major statements on a given topic and their contradictions. This method targets Web pages on the topic, and extracts linguistic expressions occurring with a high frequency as major statements; this method also extracts contradictions to the major statements.

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

429

This resulting view is a summary that enables users to grasp what facts and opinions are found on the Web. In this summary, not only major statements but also their contradictions are presented even if the number of the contradictions is small. That is to say, this system can extract statements that are related to a given topic including minor ones, and provides users with a bird’s eye view of the topic. 4.3. The overview of WISDOM The purpose of WISDOM is to evaluate the credibility of information on the Web from multiple viewpoints. WISDOM considers the following to be source of information credibility: information contents, information senders, and information appearances. We aim at analyzing and organizing these measures on the basis of semantics-oriented natural language processing (NLP) techniques. The analysis of information contents focuses on the text information in Web pages and also includes the processes of page clustering,summarization, and opinion extraction. The analysis of information senders highlights the identity of a sender, classiﬁes his/her aﬃliation, and determines the level of expertise in the topic. The analysis of information appearances focuses on the impression of a Web page, and includes the processes of clarifying the information source and contact address and assessing the appropriateness of page design and writing style. WISDOM is designed to function on several hundred million Web pages, and currently targets 120 million Japanese Web pages. For each Web page,sentence extraction, morphological and syntactic analysis, opinion extraction [22], classiﬁcation of the information sender, and assessment of information appearance are performed in advance. These results are stored in an XML-based format. A user accesses WISDOM via a Web browser, and inputs a topic that he/she wants to know as a query. By clicking analysis button, he/she can see the results organized and summarized according to the information senders, contents,and opinions. These results represent an overall trend of the topic (“information sender analysis”, “page clustering” and “opinion distribution” . The user can also see the detailed information of each Web page, such as the information sender (author) of the page, the appropriateness of page appearance, and individual opinions with their positions among the entire opinion distribution. In this way, the user can browse information of the topic from various points of view and ﬁnd credible information more eﬃciently. 4.4. Universal Communication using Web Content NICT’s Information Analysis Technology can analyze social reputation on Web pages. The distribution of reputation enables us to ﬁnd more useful information when we visit unfamiliar places. For examples, mobile phone can search some restaurants around there when we want to go good restaurant in the place. NICT is developing Spoken Dialog System. The system is based on mobile computer and mobile phone. Users can input query by voice. The system has a speech recognition/synthesis and a dialog management function. Speech recognition function transfers from voice query to text query. Dialog function manages user’s situa-

430

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

%

#!! "

$

%

$

! "

Figure 7. Overview of NICT’s Information Analysis Infractructure.

tion and context. The system communicate with information analysis function in order to retrieve related information and analyze reputation on Web pages (See Fig 7) . Spoken dialog system supports user’s activities anywhere and anytime. The answer is output by synthesis speech. Users can overcome operation barrier of information retrieval. It is one of ICT supported system for the Universal Communication.

5. Foundation of Knowledge Systems by Knowledge Chunks Knowledge can be characterised through (1) its content, (2) its concepts, (3) its annotations or topics, and (4) its understanding by the user. Knowledge pieces cannot be considered in an isolated form. For this reason we imagine to use knowledge chunks as a suite of knowledge pieces consisting of content, concepts, topics and information. These dimensions are interdependent from each other. Figure 8 displays the knowledge space. 5.1. Content and Media Types: The Data Dimension. Content is complex and ready-to-use data. Content is typically provided with functions for its use. Content can be deﬁned n the basis of media types. Content management systems are information systems that support extraction, storage and delivery of complex data. Content in its actual deﬁnition is any kind of information that is shared within a community or organization. In diﬀerence to data in classical database systems

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

431

Context Topics ontology Databases utilisation

Content space Media types functionality, adaptation

Annotation, linking culture, context

Topic space Knowledge space Concept space

User proﬁles user portfolio

Information space Memes cultural units

Pragmatics Semantic theories ontology general culture

Figure 8. The four dimensions of the knowledge space surrounded by the context dimension: (1) data dimension through content; (2) foundation dimension through concepts; (3) language dimension through topics; (4) user dimension through information; (5) context of data (content) or theories (concept) or user (information) or carrier/language (topic)

content usually refers to aggregated macro data which is complex structured. Structuring of content can be distinguished: • The structure of the aggregated micro data is preserved but micro data was combined to build larger chunks of information. Examples are scientiﬁc data sets such as time series of certain measurements. There is a common (or even individual) structuring and meaning for each sampling vector but the compound of all sampling vectors adds additional semantics. • The structure of content is only partially known. A typical example is the content of Web pages: structuring is known up to a certain level of detail which may also be varying within one instance. • Content may be subsymbolic, such as pictures, videos, music or other multimedia content. Aggregation of content usually takes place by combining reusable fragments provided by diﬀerent sources in diﬀerent formats such as texts, pictures, video streams or structured data from databases. Content is subject to a content life cycle which implies a persistent change process to the content available in a content management system (CMS). The more generic ones agree in a major paradigm: the separation of data management and presentation management. Data management reﬂects the process of supporting content creation, content structuring, content versioning, and content distribution while presentation management grabs the data for delivering it to the user in various ways. Only content which is generated following this separation can be easily shared, distributed, and reused. Following new trends and developments in Web technologies, e.g., in the context of Web 2.0 or the Semantic Web the automated processing of content becomes more and more important. Because content represents valuable assets it may be reused in diﬀerent contexts (content syndication) or has to remain accessible for a long time.

432

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

The semistructured or even unstructured nature of content requires annotations to enable search facilities for content. Expressing semantics in a machine interpretable way has been under investigation since the early days of artiﬁcial intelligence, see e.g., [36] for a survey of knowledge representation techniques such as logical theories, rule-based systems, frames or semantic nets. Today systems handle semantical descriptions as metadata describing certain content instances. There are diﬀerent ways for associating data and metadata: • A conceptual, logical, or physical schema is deﬁned and instances are created according to this schema. This is the usual way for classical databases. The modelling language strongly restricts the capabilities of this description facility. Common languages such as Entity-Relationship Modelling or UML focus on structural properties with support of selected integrity constraints. • Deﬁning a schema is not applicable (or only in a restricted way) to semistructured or unstructured content. For that reason content instances are annotated. An annotation is a triple (S, P, O) where S denotes the subject to be annotated, P a predicate denoting the role or purpose of this annotation, and O the object (or resource) which is associated with S. The vocabulary for annotations is organized in ontologies and thesauri. A typical language for expressing annotations in the context of the Semantic Web is the Resource Description Framework (RDF, [39]) while the Web Ontology Language OWL ([38]) may be used to express semantic relationships between the concepts and resources used for annotation. There exist myriads of ontologies and parameter deﬁnitions for diﬀerent application domains such as the Dublin Core parameters [5]) for editorial content. 5.2. Concepts and Theories: The Foundation Dimension. Concepts are the basis for knowledge representation. They specify our knowledge what things are there and what properties things have. Concepts are used in everyday life as a communication vehicle and as a reasoning chunk. Concepts can be based on deﬁnitions of diﬀerent kinds. Therefore our goal for the development of knowledge web can only be achieved if the content deﬁnition covers any kind of content description and goes beyond the simple textual or narrative form. A general description of concepts is considered to be one of the most diﬃcult tasks. We analysed the deﬁnition pattern used for concept introduction in mathematics, chemistry, computer science, and economics. This analysis resulted in a number of discoveries: • Any concept can be deﬁned in a variety of ways. Sometimes some deﬁnitions are preferred over others, are time-dependent, have a level of rigidity, are usage-dependent, have levels of validity, and can only be used within certain restrictions. • The typical deﬁnition frame we observed is based on deﬁnition items. These items can also be classiﬁed by the kind of deﬁnition. The main part of the deﬁnition is a tree-structured structural expression of the following form

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

433

SpecOrderedTree(StructuralTreeExpression (DeﬁnitionItem, Modality(Suﬃciency, Necessity), Fuzziness, Importance, Rigidity, Relevance, GraduationWithinExpression, Category))) .

• Concepts typically also depend on the application context, i.e. the application area and the application schema. The association itself must be characterised by the kind of association. Concepts are typically hierarchically ordered and can thus be layered. We assume that this ordering is strictly hierarchical and the concept space can be depicted by a set of concept trees. A concept is also dependent on the community that prefers this concept. A concept is also typically given through an embedding into the knowledge space. The schema in Figure 9 displays the general structure for content deﬁnition. This schema also covers all aspects discussed in [19]. A concept has (0,n) AcceptanceLevel Community

Community Context

- Concept 6

UsageProﬁle Spec Ordered Tree

(0,1) Descriptor

Preference Deﬁned Time Usage Through Restriction Validity

- Structural Expression

? Deﬁnition Item Term

Deﬁnition Kind

-

Kind of Deﬁnition

Language Figure 9. The main schema for Concept Deﬁnition and Formation

typically a manyfold of deﬁnitions. Their utilisation, exploration and application depend on the user (e.g. the education proﬁle), the usage, and context. Example (concept of mathematical set): A set can be given by an enumeration of its elements, by inductive deﬁnition of its elements, by an algorithm for the construction of the set, or by explicit description of the properties of the set. Which of the deﬁnitions is more appropriate depends on the application domain. These set deﬁnitions are based on the principle of extensionalism: Two sets are equal if they contain the same elements. We might also use sets that are not based on this principle. 5.3. Topics and Ontologies: The Language Dimension. Content and concepts may be enhanced by topics that specify the pragmatic understanding of users. Semantic annotation in current content management systems is usually restricted to preselected ontologies and parameter sets. Rich conceptual data models are only available in more sophisticated systems. Because most generic CMS are

434

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

focused on Web content management semantic annotation is usually restricted to editorial parameters. Specialized content management systems which are adapted to certain application domains incorporate preselected and tailored ontologies. Especially for XML-based content there exist several annotation platforms which incorporate semantical annotation either manually or semi-automatically; see [25] for a survey on available platforms. Automated processing of semantical metadata is usually restricted to search facilities, e.g., searching for the author of an article. Because ontologies are preselected for most systems a full-featured reasoning support is usually not available. Especially for OWL ontologies there are reasoning tools based on description logics such as Racer ([10]) or FaCT which enable T-box (but also A-box) reasoning about semantic relationships between annotation concepts. Applying generic semantical annotation and classical reasoning facilities to content management suﬀers from several drawbacks: • Content as aggregated macro data is only partially analysable. The purpose of metadata is the description of properties which cannot be concluded from the data itself. The very simple annotation frame of (S, P, O) triples does not allow one to express complex properties. For that reason this information has to be kept in the underlying ontology by deﬁning appropriate concepts. The support of user-speciﬁc concepts increases the size of the ontology signiﬁcantly and makes reasoning support even harder. Ad hoc deﬁnitions of user-speciﬁc concepts is not supported in this annotation model. • Annotation with respect to arbitrary ontologies implies general purpose reasoning support by the system. Reasoning for even simple languages suffers from its high computational complexity (e.g., NEXPTIME for the restricted OWL-DL dialect, [13].) Dealing with high worst-case complexities implies a small size of input data but this is a contradiction to expressible ontologies and the deﬁnition of content as complex structured macro data. Especially the size of content instances is a crucial factor because A-box reasoning is a critical point for automated content processing ([11].) But there are advantages, too: • Usually, it is possible to distinguish between diﬀerent points of view on content instances. Not every property is important while looking from every point of view. The macro data may encapsulate and hide properties from its aggregated micro data. Reasoning about the properties of the compound can be separated from the properties of the elements as well as the properties of interconnections between content instances. • Typical application scenarios determine important properties and suggest evaluation strategies. So ontologies may be decomposed to enable a contextualized reasoning, e.g., on the basis of Local Model Semantics ([9]). Local reasoning may rely on a language that is just as expressive as needed in this context. Contexts relying on less expressive languages may support automated reasoning while contexts relying on more expressive languages may be used for manually interpreted information. Soundness and completeness

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

435

of the reasoning process are not of primary interest as long as the reasoning result is acceptable in the application domain. • The separation between annotations relying on common knowledge, userspeciﬁc annotations and (especially) usage-speciﬁc annotations reduces the size of incorporated ontologies signiﬁcantly. • If semantic annotations themselves are given a more sophisticated internal structure reasoning can be adapted to the requirements of the application domain. The major disadvantage of current semantic description in content management is the treatment of knowledge over content instances as metadata on a secondary level in a strongly restricted language. In the following sections we will introduce a data model for content which handles the semantic part on the same level as the content itself and gives additional structure to the semantic description. Content chunks are semantically enriched content instances. They are based on the notion of a schema for content chunks to incorporate typical functionality of content management systems such as content generation, content delivery, or content exchange. 5.4. Information and Memes: The User Dimension. There are several deﬁnitions for information7 . We categorize these notions: • The ﬁrst category of these deﬁnitions is based on the mathematical notion of entropy. This notion is independent of the user and thus inappropriate in our project context. • The second category of information deﬁnitions bases information on the data a user has currently in his data space and on the computational and reasoning abilities of the user. Information is any data that cannot be derived by the user. This deﬁnition is handy but has a very bad drawback. Reasoning and computation cannot be properly characterised. Therefore, the deﬁnition becomes fuzzy. • The third category is based on the notion of information utility. Business information systems understand information as data that have been shaped into a form that is meaningful, signiﬁcant and useful for human beings. 7 In general, information is • raw data and • well-formed and meaningful data • that (1) has been veriﬁed to be accurate and timely relative to its context, (2) is speciﬁc and organized for a purpose, (3) is presented within a context that gives it meaning and relevance, and which (4) leads to increase in understanding and decrease in uncertainty.

This notion extends the GDI notion (General Deﬁnition of Information). “Well-formed” means that the raw data are clustered together correctly, according to the rules (syntax) that govern the chosen system, code or language being analysed. Syntax is understood broadly, as what determines the form, construction, composition or structuring of something. “Meaningful” means that the data must comply with the meanings (semantics) of the chosen system, code or language in question. We refer to [35] for diﬀerent kinds of semantics. However, let us not forget that semantic information is not necessarily linguistic. For example, in the case of the manual of the car, the illustrations are such as to be visually meaningful to the reader.

436

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

These data satisfy an information demand and can be understood by this group. Typical data represent information about signiﬁcant people, places, and things within an organisation or in the environment surrounding it. • The fourth category is based on the general language understanding of information [27]. Information is either the communication or reception of knowledge or intelligence. Information can also deﬁned as ∗ knowledge obtained from investigation, study, or instruction, or ∗ intelligence, news or ∗ facts and data. Information can also be the act of informing against a person. Finally information is a formal accusation of a crime made by a prosecuting oﬃcer as distinguished from an indictment presented by a grand jury. All these deﬁnitions are too broad. We are thus interested in a deﬁnition that is more appropriate for the internet age (anthroposophic understanding) in extension of the GDI notion of information. Information as processed by humans, • is carried by data • that is perceived or noticed, selected and organized by its receiver, • because of his subjective human interests, originating from his instincts, feelings, experience, intuition, common sense, values, beliefs, personal knowledge, or wisdom, • simultaneously processed by his cognitive and mental processes, and • seamlessly integrated in his recallable knowledge. The value of information lies solely in its ability to aﬀect a behavior, decision, or outcome. A piece of information is considered valueless if, after receiving it, things remain unchanged. For the technical meaning of information we consider the notion used in information theory. Therefore, information is directed towards pragmatics, whereas content may be considered to highlight the syntactical dimension. If content is enhanced by concepts and topics, then users are able to capture the meaning and the utilisation of the data they receive. In order to ease perception we use metaphors. Metaphors may be separated into those that support perception of information and into those that support usage or functionality. Users are reﬂected by actors that are abstractions of groups of users. Pragmatics and syntactics share data and functions. The functionality is provided through functions and their representations. The web utilisation space depends on the technical environment of the user. It is speciﬁed through the layout and the playout. Layout places content on the basis of a data representation and in dependence of the technical environment. Playout is based on functionality and function representations, and depends on the technical environment. The information transfer from a user A to a user B depends on the users A and B, their abilities to send and to receive the data, to observe the data, and to interpret the data. Let us formalise this process. Let sX denote the function

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

sender

presentation

content message

appeal to

s-r-relationship

receiver

437

receiver

Figure 10. Dimensions of understanding messages

user by a user X for data extraction, transformation, and sending of data. Let rX denote the corresponding function for data receival and transformation, and let oX denote the ﬁltering or observation function. The data currently considered by X is denoted by DX . Finally, data ﬁltered or observed must be interpreted by the user X and integrated into the knowledge KX a user X has. Let us denote by iX the binary function from data and knowledge to knowledge. By default, we extend the function iX by the time tiX of the execution of the function. Thus, the data transfer and information reception (or brieﬂy information transfer) is formally expressed it by IB = iB (oB (rB (sA (DA ))), KB , tiX ) . In addition, time of sending, receiving, observing, and interpreting can be taken into consideration. In this case we extend the above functions by a time argument. The function sX is executed at moment tsX , rX at trX , and oX at toX . We assume tsA ≤ trB ≤ toB ≤ tiB for the time of sending data from A to B. The time of a computation f or data consideration D is denoted by tf or tD , respectively. In this extended case the information transfer is formally expressed it by IB = iB (oB (rB (sA (DA , tsA ), trB ), toB ), KB , tiB ) . The notion of information extends the dimensions of understanding of message displayed in Figure 10 to a web communication act that considers senders, receivers, their knowledge and experience. Figure 11 displays the multi-layering of communication, the inﬂuence of explicit knowledge and experience on the interpretation. The communication act is speciﬁed by • the communication message with the content or content chunk, the characterisation of the relationship between sender and receiver, the data that are transferred and may lead to information or misinformation, and the presentation, • the sender, the explicit knowledge the sender may use, and the experience the sender has, and • the receiver, the explicit knowledge the receiver may use, and the experience the receiver has. We approach the analysis of knowledge system usage as the ﬁrst important part of storyboarding pragmatics. Knowledge system usage analysis consists of three parts:

438

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

-

-

experience

data (information, misinformation) form appearance, appeal to communication act gestalt receiver presentation

explicit knowledge

experience explicit knowledge sender

data

-

receiver

sender-receiver relationship Figure 11. Dimensions of the communication act

1. Life cases capture observations of user behaviour in reality. They can be used in a pragmatic way to specify the story space. The work on life cases was reported in a previous publication [33]. 2. User models complement life cases by specifying user and actor proﬁles, and actor portfolios. The actor portfolios are used to get a better understanding of the tasks associated with the knowledge system. The work on user models was reported in a previous publication [34]. 3. Contexts complement life cases and user models by characterising the situation in which a user ﬁnds him/herself at a certain time in a particular location. We classify various aspects of contexts related to actors, storyboard, system and time, which make up the context space, then analyse each of these aspects in detail. This is formally support by lifting relations. User modelling is based on the speciﬁcation of user proﬁles that address the characterisation of the users, and the speciﬁcation of user portfolios that describe the users’ tasks and their involvement and collaboration on the basis of the mission of the knowledge system [30]. To characterize the users of a knowledge system we distinguish between education, work and personality proﬁles. The education proﬁle contains properties users can obtain by education or training. Capabilities and application knowledge as a result of educational activities are also suitable for this proﬁle. Properties will assigned to the work proﬁle, if they can be associated with task solving knowledge and skills in the application area, i.e. task expertise and experience as well as system experience. Another part of a work proﬁle is the interaction proﬁle of a user, which is determined by his frequency, intensity and style of utilization of the knowledge system. The personality proﬁle characterises the general properties and preferences of a user. General properties are the status in the enterprise, community, etc., and the psychological and sensory properties like hearing, motoric control, information processing and anxiety. A portfolio is determined by responsibilities and is based on a number of targets. Therefore, the actor portfolio (referring to actors as groups of users with similar behaviour) within an application is based on a set of tasks assigned to or intended by an actor and for which s/he has the authority and control, and a description of involvement within the task solution [32]. A task as a piece of work is characterized by a problem statement, initial and target states, collaboration

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

439

and presupposed proﬁles, auxiliary conditions and means for task completion. Tasks may consists of subtasks. Moreover, the task execution model deﬁnes what, when, how, by whom and with which data a task can be accomplished. The result of executing a task should present the ﬁnal state as well as the satisfaction of target conditions. For task completion users need the right kind of data, at the right time, in the right granularity and format, unabridged and within the frame agreed upon in advance. Moreover, users are bound by their ability to verbalise and digest data, and their habits, practices, and cultural environment. To avoid intellectual overburdening of users we observe real applications before the system development leading to life cases [33]. Life cases help closing the pragmatic gap between intentions and storyboarding. They are used to specify the concrete life situation of the user and characterise thus a bundle of tasks the user should solve. Syntax and semantics of life cases have already been well explored in [30]. In addition, each user has an information portfolio, which speciﬁes the information needs as well as the information entered into the system. We do not model the information portfolio as part of a user, but instead of this we will model the information “consumed” and “produced” with each more detailed speciﬁcation of a user request. 5.5. Context Dimension Characterisation and Adaptation of Knowledge Delivery by Context Taking the commonly accepted meaning a context [15] characterises the situation in which a user ﬁnds him/herself at a certain time in a particular location. In this sense context is usually deﬁned only statically referring to the content of a database. Only very few attempts have been made so far to consider context of scenarios or stories. More generally, we consider context as everything that surrounds a utilisation situation of a knowledge system by a user and can throw light on its meaning. Therefore, context is characterised by interrelated conditions for the existence and occurrence of the utilisation situation such as the external environment, the internal state, location, time, history, etc. For knowledge systems we need to handle the mental context that is based on the proﬁle of the actor or user, the storyboard context that is based on the story leading to a situation, the data context that is based on the available data, the stakeholder context, and the collaboration context. These diﬀerent kinds of contexts have an inﬂuence on the development of the storyboard and must thus be considered for the development of the knowledge system. We distinguish the following facets of context [34,33,30]: Actor context: The knowledge system is used by actors for a number of tasks in a variety of involvements and well understood collaboration. These actors impose their quality requirements on the knowledge system usage as described by their security and privacy proﬁles. They need additional auxiliary data and auxiliary functions. The variability of use is restricted by the actor’s context, which covers the actor’s speciﬁc tasks and speciﬁc data and function demand, and by chosen involvement, while the proﬁle of actors

440

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

imposes exceptions. The involvement and collaboration of actors is based on assumptions of social behaviour and restrictions due to organisational decisions. These assumptions and restrictions are components of the actor’s context. Storyboard context: The meaning of content and functionality to users depends on the stories, which are based on scenarios that reﬂect life cases and the portfolios of users or actors. According to the proﬁle of these users a number of quality requirements such as privacy, security and availability must be satisﬁed. The actor’s scenario context describes what the actor needs to understand in order to eﬃciently and eﬀectively solve his/her tasks in the actual portfolio. The actor’s determine the policy for following particular stories. System context: The knowledge system is developed to support a number of intentions. The purposes and intents lead to a number of decisions on the knowledge system architecture, the technical environment, and the implementation. The knowledge system architecture has an impact on its utilisation, which often is only implicit and thus leads to not understandable systems behaviour. The technical environment restricts the user due to restrictions imposed by server, channel and client properties. Adaptation to the current environment is deﬁned as context adaptation to the current channel, to the client infrastructure and to the server load. At the same time a number of legal decisions based on regulations, laws and business rules have been incorporated into the knowledge system. Temporal context: The utilisation of a scene by an actor depends on his/her history of utilisation. Actors may interrupt and resume their activities at any moment of time. As they may not be interested in repeating all previous actions they have already successfully completed, the temporal context must be taken into account. Due to availability of content and functionality the current utilisation may lead to a diﬀerent story within the same scenario. Provider context: Providers are characterised by their mission, intentions, and speciﬁc policies. Additionally, terms of business may be added. Vendors need to understand how to run the knowledge system economically. Typical parts of this context are intentions of the provider, themes of the website, mission or corporate identity of the site, and occasion and purpose of the visits of actors. Thus, providers may require additional content and functionality due to their mission and policy. They may apply their terms of business and may require a speciﬁc layout and playout. Based on this information, the knowledge system is extended by providerspeciﬁc content and functionality. The storyboard may be altered according to the intentions of the provider, and life cases may be extended or partially supported. Provider-based changes to portfolios are typical for knowledge systems in e-government and e-business applications. Developer context: The knowledge system implementation depends on the capability of the developer. Typically we need to take into account the potential environment, e.g. hard- and software, communication channels, the information systems that are to be incorporated, especially the associated databases, and the programming environment developers use.

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

441

Organisational and social context: The organisation of task solutions is often already predetermined by the application domain. It follows organisational structures within the institutions involved. We captured a part of these structures already on the basis of the portfolio and modelled it by collaboration. The other pars form the organisational context. Collaboration of partners consists of communication, coordination, and cooperation. Cooperation is based on cooperativity, i.e. the disposition to act in a way that is best helpful for the collaboration partners, taking their intentions, tasks, interests and abilities into account. At the same time, collaboration is established in order to achieve a common goal. Actors choose their actions and organise them such that their chances of success are optimised with respect to the portfolio they are engaged in. Additionally, the social context may be taken into account, which consists of interactive and reactive pressures. Typical social enhancements are socially indicated situations such as welcome greetings, thanking, apologising, and farewell greetings. Most systems today do not support adaptivity and user orientation. Information as processed by humans is perceived in a very subjective way. As for a knowledge system, the determining factor whether the user can derive advantage from the content delivered is the user’s individual situation, i.e. the life case, user model and context. The same category of information can cause various needs in diﬀerent life cases. Not any user can deal with any kind of content. For the casual user or the novice other content has to be delivered than for experts. The common knowledge system doesn’t reﬂect the user’s situation and neglects the user’s speciﬁc needs. As a result, the user is spammed with information which is predominantly out of focus. The abundance of information also makes it impossible to separate useful from for the user useless content. Any by the absence of meta data unspeciﬁed information reduces the usability of World Wide Web on the whole. Furthermore, users are limited • in their abilities for verbalisation, • in their abilities for digestion of data and • by their habits, practices and cultural environment. These limitations may cause intellectual overburdening of users. Most systems that require sophisticated learning courses for their exploration and utilization did not consider these limitations and did not cope with real life situations. The approach we use for avoiding overload is based on observation of real applications before developing the knowledge system. User typically request or need various content depending on their situation, on material available, on the actual information demand, on data already currently available and on technical equipment and channels on hand. Therefore, we need a facility for content adaptation depending on the context of the user. Content matching and adaptation may be thus considered as one of the ‘grand’ challenges of modern internet. To meet this challenge, the information has to be matched against the particular needs of the user [34,33,30]. Since the thinkable combinations of user life cases, user models and context [15] are indeﬁnitely, the deﬁnition of life cases

442

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

[33] has to be determined for the content and matched against the users situation. For a knowledge system, there should be not only concrete deﬁnitions of which content is applicable for which life case. To avoid making useful content useless by presenting it in an inappropriate way to the user, knowledge systems have also to consider the user’s speciﬁc proﬁle and context. By processing this data, the knowledge system should provide diﬀerent views of information and the appropriate media types for presenting their knowledge to various audiences. The implicit goals of content management and content delivery are: • to meet all the information (contextual) requirements of the entire spectrum of users in a given application area; • to provide a “natural” and easy-to-understand structuring of the information content; • to preserve the designers entire semantic information for a later redesign; • to achieve all the processing requirements and also a high degree of eﬃciency in processing; • to achieve logical independence of query and transaction formulation on this level; • to provide a simple and easily to comprehend user interface family. 6. Conclusion 6.1. Knowledge in the Two Facets The notion of knowledge is one of the most overused terms. It can be considered as knowledge in general deﬁned by a noun (objective knowledge) and the knowledge by a user expressed by the verb ‘to know’ (subjective knowledge). We conclude therefore that within the scope of the knowledge system environments, it is necessary to deliver knowledge as enduring, justiﬁed and true consensus to users depending on context, users demands, desiderata and intention, whereby these aspects are supported by social facets, the environment, the proﬁle, tasks and life cases of the users. Life cases, portfolios and tasks constitute the information demand of every users. The information demand of users requires a special content quality. It results in the requested knowledge, which is also depending on the understanding and motivation of users. So, the requested knowledge of users is a composition of understanding, and information demand, whereby the information demand is an aggregated component of life cases, motivation, intention and quality. It is surprising that the literature treats knowledge on a 100 % quality basis. We can however distinguish between validated knowledge that satisﬁable within a scope of axioms and derivation rules (application domain), within a certain generality and has validity and timelineness, veriﬁed and correct knowledge based on axioms and rules within a proof system that can be veriﬁed within a ﬁnite time, obey a correctness criteria (depending on proﬁles) and has some known interaction with other knowledge and ﬁnally

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

443

quality knowledge deﬁned by the quality of use (understandability, learnability, operability, attractiveness, appropriatedness), by the external quality (privacy, ubiquity, pervasiveness, analysability, changeability, stability, testability), and by the internal quality (accuracy, suitability, interoperability, robustness, self-contained/independence). These quality characteristics result in diﬀerences of the value of knowledge for the user. Sections 2, 3 and 4 demonstrate the challenges we face with knowledge system environments. Our general notion of knowledge is supported by current database and information systems technology based on a separation into content, concepts, topics and memes in Section 5. 6.2. Knowledge Management Systems Knowledge management aims in supporting the spread of knowledge of individuals or groups across interested in it communities in ways that directly aﬀect performance. Knowledge management envisions getting the right data (or information) within the right context to the right person at the right time for the right business purpose. Knowledge management systems allow to manage knowledge in communities of practice or interest, supporting creation, capture, and storage and sharing of expertise in the form of information in dependence on the user. Knowledge management systems use the entire variety of architecture solution known for intranet, internet or extranet systems. We observe two main architecture lines: 1. The task/proﬁle/portfolio-based approach focuses on the use of knowledge by collaborators depending on their tasks and interest. This approach is based on the data or information and knowledge needs of the communities, where they are located, and who needs them. The KMS is designed to capture knowledge and to make knowledge available when needed to whom needs it. These systems may be described by storyboards [18,31] that describe the life situations or life cases of the knowledge demanders or user, the context of knowledge use and the user proﬁles which specify the education, personality and practice proﬁle. 2. The infrastructure/generic system based approach focuses on building a knowledge base system to capture and distribute knowledge for use throughout the communities of practice. It concern of the technical details needed to provide good mnemonic functions associated with the identiﬁcation, retrieval, and use of knowledge. The approach focuses on network capacity, database structure and organization, and knowledge information classiﬁcation. Sections 3 and 4 describe one approach how such knowledge management systems can be developed. These approaches combine the two architecture lines. Section 5 develops a background for a general knowledge system environment. 6.3. Semantiﬁcation of Web 3.0 The “Semantic Web” is mainly based on syntax and partially uses micro-semantics of wordings. Semantics is used in the sense of rudimentary lexical semantics.

444

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

Rudimentary lexical semantics must be enhanced by explicit deﬁnitions of symbols or words used. These deﬁnitions can be combined with the name spaces that provide a source for the lexical units used in a web document. The semantiﬁcation project [6] of the group headed by J. Pokorny and P. Vojtas at Charles University Prague aims in enhancing the ontology based annotation in XML documents or RDFa-annotated HTML ﬁles by a semantic repository, by user proﬁles and by portfolio management. Web documents should be enhanced by context [15] or meta-data similar to the OpenCyc project. Lexical units may be characterised by time(absolut, type), place(absolut, type), culture, sophistiﬁcation/security, topic/usage, granularity, modality/disposition/epistemology, argument preferences, justiﬁcation, and lets [Len02]. The vocabulary of name spaces or of ontologies is not just a collection of words scattered at random throughout web documents. It is at least partially structured, and at various levels. There are various modes and ways of structuring, e.g., by branching, taxonomic or meronymic hierarchies or by linear bipole, monopolar or sequenced structures. Ontologies are often considered to be the silver bullet for web integration. They are sometimes considered to be an explicit speciﬁcation of conceptualisation or to be a shard understanding of some domain of interest that can be communicated across people and computers. We should however distinguish a variety of ontologies such as generic, semiotic, intention/extension, language, UoD, representational, context and abstraction ontologies. 6.4. Technical Environments for Knowledge on Demand In this paper we discussed three approaches to technical environments to knowledge systems: cloud services, database-backed systems and knowledge processing for universal communities. Technical environments for knowledge delivery system includes pull and push technology, notiﬁcation technology, knowledge discovery technology, knowledge documentation, knowledge quality and productivity, and human computer interface technology. These systems be developed by using eight layers, which includes storyboard layer as a top level one in order to allow to provide knowledge on demand and on context, and followed by seven interwoven technology layers that facilitate the community to work together to share, re-use and generate knowledge among them: interface layer, access layer, collaborative layer, application logic layer, transport layer, integration layer, and repositories layer. References [1]

[2]

S. Akamine, Y. Kato, D. Kawahara, K. Inui, S. Kurohashi, and Y Kidawara. Development of a Large-scale Web Crawler and Search Engine Infrastructure. Proceedings of the 3rd International Universal Communication Symposium (IUCS2009), 126-131, 2009 M. Armbrust, A. Fox, R. Griﬃth, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, Above The Clouds: A Berkeley View of Cloud Computing. Technical Report No. UCB/EECS-2009-28 . EECS Department. University of California, Berkeley. 2009.

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

[3]

445

R.Buyya, C. Shin Yeo, and S. Venugopal, 2008. Market-Oriented Cloud Computing: Vision, Hype and Reality for Delivering IT Services as Computing Utilities. Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications (HPCC-08, IEEE CS Press, Los Alamitos, CA, USA), Sept. 25-27, 2008, Dalian, China. 9p. 2008. [4] A. Dey, B. Kokinov, D. Leake, and R. Turner (Eds.). Modeling and using context. 5th International and Interdisciplinary Conference CONTEXT 2005. Springer. Berlin, 2005. [5] Dublin Core Metadata Initiative. Dublin Core. http://dublincore.org/, June 2007. [6] M. Duˇ z´ı, A. Heimburger, T. Tokuda, P. Vojtas, and N. Yoshida. Multi-agent knowledge modelling. In H. Jaakkola and Y. Kiyoki, editors, EJC’2008, Information Modeling and Knowledge Bases XVI. IOS Press, 2008. Panel summary, EJC 2008. [7] G. Fiedler and B. Thalheim. Towards semantic wikis: Modelling intensions, topics, and origin in content management systems. Information Modelling and Knowledge Bases, XX:1– 21, 2009. [8] X. Fu, T. Bultan, and J. Su. Analysis of Interacting BPEL Web Services. Proce. of the 13th international conference on World Wide Web, pp. 621–630, 2004. [9] C. Ghidini and F. Giunchiglia. Local models semantics, or contextual reasoning = Locality + compatibility. http://citeseer.ist.psu.edu/481285.html, April 2000. [10] V. Haarslev and R. M¨ oller. Racer: An OWL reasoning agent for the semantic web. In Proceedings of the International Workshop on Applications, Products and Services of Webbased Support Systems, in conjunction with the 2003 IEEE/WIC International Conference on Web Intelligence, Halifax, Canada, October 13, pages 91–95, 2003. [11] V. Haarslev, R. M¨ oller, and M. Wessel. Description Logic inference technology: Lessions learned in the trenches. In I. Horrocks, U. Sattler, and F. Wolter, editors, Proc. International Workshop on Description Logics, 2005. [12] T. Hey, S. Tansley, and K. Tolle. The Fourth Paradigm: Data-Intensive Scientiﬁc Discovery, Microsoft Research (2009). [13] Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From SHIQ and RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 1(1):7–26, 2003. [14] H. Jaakkola, A. Heimb¨ urger, and P. Linna. Knowledge-oriented software engineering process in a multi-cultural context. Software Quality Control. Vol 18 (2). pp. 299-319, 2010. [15] R. Kaschek, K.-D. Schewe, B. Thalheim, and Lei Zhang. Integrating context in conceptual modelling for web information systems, web services, e-business, and the semantic web. In WES 2003, LNCS 3095, pages 77–88. Springer, 2003. [Len02] D. Lenat. The dimensions of the context space. www.cyc.com/context-space.pdf, 2002. [16] J. Lewerenz, K.-D. Schewe, and B. Thalheim. Modeling data warehouses and OLAP applications by means of dialogue objects. In Proc. ER’99, LNCS 1728, pages 354–368. Springer, Berlin, 1999. [17] R. Lewis. The Cultural Imperative. Global Trends in the 21st Century. Intercultural Press. Brealey. Printed in Finland. 338p., 2007. [18] T. Moritz, K.-D. Schewe, and B. Thalheim. Strategic modelling of web information systems. International Journal on Web Information Systems, 1(4):77–94, 2005. [19] G. L. Murphy. The big book of concepts. MIT Press, 2001. [20] T. Nakanishi, H. Homma, K.-S. Kim, K. Zettsu K., Y. Kidawara, and Y. Kiyoki. A Threelayered Architecture for Event-centric Interconnections among Heterogeneous Data Repositories and its Application to Space Weather. Proc. of the 20th European Japanese Conference on Information Modelling and Knowledge Bases, EJC 2010, Jyv¨ askylla, Finland, pp. 29–44, 2010. [21] T. Nakanishi, K. Zettsu, , Y. Kidawara, and Y. Kiyoki. A Context Dependent Dynamic Interconnection Method of Heterogeneous Knowledge Bases by Interrelation Management Function, In Proc. of the 19th European-Japanese conference on information modelling and knowledge bases, EJC 2009, pp. 210-227, 2009. [22] , T. Nakagawa, K. Inui, and S. Kurohashi. Dependency Tree-based Sentiment Classiﬁcation using CRFs with Hidden Variables. Proc. of Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational

446

[23] [24] [25] [26]

[27] [28] [29] [30]

[31]

[32]

[33]

[34]

[35] [36] [37] [38] [39] [40]

[41] [42]

[43]

K. Zettsu et al. / Future Directions of Knowledge Systems Environments for Web 3.0

Linguistics (HLT-NAACL 2010), 2010 NGG: Next Generation Grids Expert Group. Future of European Grids: Grids and ServiceOriented Knowledge Utilities. Report 3, 2006. M.P. Papazoglou, and D. Georgakopoulos. Service-Oriented Computing. Communications of the ACM Vol. 46, No. 10, pp. 24–28, 2003. L. Reeve and H. Han. Survey of semantic annotation platforms. In SAC ’05, pages 1634–1638, New York, NY, USA, 2005. ACM Press. M. R¨ onkk¨ o, J. Ylitalo, J. Peltonen, N. Koivisto, O. Mutanen, J. Autere, A. Valtakoski, and Pentik¨ ainen, National Software Industry Survey 2009. Helsinki University of Technology. Helsinki. 128 p., 2009 J.E. Safra, I. Yeshua, and et. al. Encyclopædia Britannica. Merriam-Webster, 2007. K.-D. Schewe and B. Thalheim. Reasoning about web information systems using story algebra. In ADBIS’2004, LNCS 3255, pages 54–66, 2004. K.-D. Schewe and B. Thalheim. The co-design approach to web information systems development. International Journal of Web Information Systems, 1(1):5–14, March 2005. K.-D. Schewe and B. Thalheim. Usage-based storyboarding for web information systems. Technical Report 2006-13, Christian Albrechts University Kiel, Institute of Computer Science and Applied Mathematics, Kiel, 2006. K.-D. Schewe and B. Thalheim. Usage-based storyboarding for web information systems. Technical Report 2006-13, Christian Albrechts University Kiel, Institute of Computer Science and Applied Mathematics, Kiel, 2006. K.-D. Schewe and B. Thalheim. Development of collaboration frameworks for web information systems. In 20th Int. Joint Conf. on Artiﬁcal Intelligence, Section EMC07 (Evolutionary models of collaboration), pages 27–32, Hyderabad, 2007. K.-D. Schewe and B. Thalheim. Life cases: A kernel element for web information systems engineering. In Web Information Systems and Technologies, volume Volume 8. Lecture Notes in Business Information Processing, Springer Berlin Heidelberg, 2008. K.-D. Schewe and B. Thalheim. User models: A contribution to pragmatics of web information systems design. In K. Aberer, Z. Peng, and E. Rundensteiner, editors, Web Information Systems – Proceedings WISE 2006, volume 4255 of LNCS, pages 512–523. Springer-Verlag, 2006. K.-D. Schewe and B. Thalheim. Semantics in data and knowledge bases. In SDKB 2008, LNCS 4925, pages 1–25, Berlin, 2008. Springer. John F. Sowa. Knowledge Representation, Logical, Philosophical, and Computational Foundations. Brooks/Cole, a division of Thomson Learning, Paciﬁc Grove, California, 2000. B. Thalheim. The conceptual framework to user-oriented content management. Information Modelling and Knowledge Bases, XVII:30–49, 2007. W3C. Web Ontology Language Overview. http://www.w3.org/TR/owl-features/, Feb 2004. W3C RDF Core Working Group. Resource Description Framework (RDF). http://www.w3.org/RDF/, 2004. L. Youseﬀ, M. Butrico, and D. Da Silva. Toward a Uniﬁed Ontology of Cloud Computing. Grid Computing Environments Workshop, GCE ’08, pp.1-10, 2008. Youseﬀ, L.; Butrico, M.; Da Silva, D.; , ”Toward a Uniﬁed Ontology of Cloud Computing,” Grid Computing Environments Workshop, 2008. GCE ’08 , vol., no., pp.1-10, 12-16 Nov. 2008 S. Xu and W. Zhang. Knowledge as a Service and Knowledge Breaching. Proceedings of the 2005 IEEE International Conference on Services Computing. Vol 01. pp. 87-94, 2005. K. Zettsu, T. Nakanishi, M. Iwazume, Y. Kidawara, and Y. Kiyoki. Knowledge Cluster Systems for Knowledge Sharing, Analysis and Delivery among Remote Sites. Information Modeling and Knowledge Bases, Vol. XIX, IOS Press, pp.282–289 (2008). K. Zettsu, Y. Kidawara, and Y. Kiyoki. Developing Next Generation Web as Collaboration Media. Infocommunications Journal, Vol. LXV., No. 2010/I, pp.15–19, Scientiﬁc Association for Infocommunications, Hungary, 2010.

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved.

447

Subject Index abstract state machine 37 applicative algebra 37 architecture-driven development 97 artificial intelligence 137 association modelling 77 association network 77 autonomous control 57 balance 57 category theory 194 cloud computing 413 cloud services 413 cluster based similarity 117 collaborative model 321 collaborative modeling 378 color-emotion based image search 333 complex value 37 component 154 computer supported collaborative work 378 concept 194, 413 constraints 354 content 413 content-addressable memory 57 content-based image retrieval 258 context 396, 413 context flow 301 context modelling 396 context models 301 context tree 301 context-awareness 278 coreference-by-address 57 cross-cultural communication 301 cross-cultural icons 301 cultural historical documentation 312 culture-dependent color-emotion 333 database 154 database engineering 354 database transformation 37 decision support 137

discourse tool 378 domain model 378 e-Assistant for cross-cultural communication 301 emergence 226 emotion information 333 emotional context 117 event-centric interconnections 21 event-driven architecture 226 events 288 formal description 226 game programming system 226 game 226 geo-observation 288 global software development 321 groupware 378 heterogeneous data repositories 21 high-performance databases 206 human associative memory 77 hyperintension 1 image search 117 image-query creation 258 inference 278 inferencing 57 information credibility 413 information retrieval 77, 312 information systems 97, 413 integrity constraint 1 intension 1 interface wrapper 174 intermediary selection 137 knowledge inference 368 knowledge representation 368 knowledge system environments 413 knowledge web 413 location awareness 396 mobile mashup 174 mobile phone application 174 mobile Web server 174 modelling 97, 194 movement awareness 396 multicultural software development 321

448

multi-disciplinary education 396 multimedia database 258 multi-query images 117 next generation web 413 NULL 354 NULL logics 354 NULL ‘value’ 354 ontology 1, 194, 278, 312, 368 partial update 37 performance engineering 206 performance forecasting 206 Petri net 368 phenomena 288 process 194 query space modelling 77 query-by-an image 258 requirements engineering 396 role accessibility 344 schema derivation 57 semantic relations of structure 57 sensing measurements 288 software development 97 software engineering 194 space weather sensor data 21 spatiotemporal proximity 288 specification 226

story generation 247 strategic concept development 137 subactivation 57 subspace feature selection 117 supply chain modeling 137 system improvement 206 thematic relevance 288 three-layered architecture 21 topic bridging 247 topic extraction 247 topic modeling 413 transparent intensional logic 1 ubiquitous systems 396 uncertainties for interrelationships 21 universal communication 413 user interacting Web applications 344 user’s intentions 258 user-centric services 278 user-generated contents 288 Web 97, 413 Web application 174 Web application generation 344 Web service 174 Web service functions 344

Information Modelling and Knowledge Bases XXII A. Heimbürger et al. (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved.

449

Author Index Akaishi, M. 247 Barakbah, A.R. 117, 333 Chaisatien, P. 174 Chen, X. 258 Číhalová, M. 1 Duží, M. 1 Flatt, R. 137 Gilman, E. 278, 396 Hausser, R. 57 Hayashi, Y. 258 Häyrinen, A. 312 Hegner, S.J. 154 Heimbürger, A. v, 301, 396 Henno, J. 226 Hoel, T. 378 Homma, H. 21, 288 Hori, K. 247 Jaakkola, H. v, 97, 413 Karttunen, E. 413 Kasari, M. 77 Keto, H. 194 Kidawara, Y. 21, 288, 413 Kim, K.-S. 21, 288, 396 Kinnunen, S. 301 Kirchberg, M. 137 Kiyoki, Y. v, 21, 117, 258, 288, 396 Kozlov, D. 378 Kärkkäinen, T. 396 Link, S. 137

Linna, P. Liu, J.N.K. Ma, W.-m. Maebara, K. Menšík, M. Nakanishi, T. Nishimura, Y. Noro, T. Nurminen, M. Palomäki, J. Pawlowski, J.M. Pulkkinen, M. Riekki, J. Sato, M. Schewe, K.-D. Silvonen, P. Su, X. Suhardijanto, T. Thalheim, B. Timonen, M. Tokuda, T. Tropmann, M. Venäläinen, T. Wang, K. Wang, Q. Yasushi, K. Ylikotila, T. Yoshida, N. Zettsu, K.

321 368 368 344 1 21, 288 344 344 301 194 378 378 278 247 37, 354 77 278 333 97, 206, 354, 413 77 v, 174, 344 206 301 368 37 333 321 v, 396 21, 288, 413

This page intentionally left blank

Information Modelling and Knowledge Bases XIX: Volume 166 Frontiers in Artificial Intelligence and Applications (Frontiers in Artificial Intelligenece and Applications)

Information Modelling and Knowledge Bases XVII: Volume 136 Frontiers in Artificial Intelligence and Applications

Artificial Intelligence Research and Development (Frontiers in Artificial Intelligence and Applications, Vol. 146) (Frontiers in Artificial Intelligence and Applications)

Artificial Intelligence Research and Development: Volume 163 Frontiers in Artificial Intelligence and Applications

Ontology and the Semantic Web: Volume 156 Frontiers in Artificial Intelligence and Applications (Frontier in Artificial Intelligence and Applications)

Advances in Logic, Artificial Intelligence and Robotics: Laptec 2002 (Frontiers in Artificial Intelligence and Applications, 85)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications)

Legal Knowledge and Information Systems: JURIX 2005: The Eighteenth Annual Conference: Volume 134 Frontiers in Artificial Intelligence and Applications

Modular Ontologies (Frontiers in Artificial Intelligence and Applications, WoMO 2011)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications) (Frontiers in Artificial Interlligence and Applications)

Self-Organization and Autonomic Informatics (I): Volume 135 Frontiers in Artificial Intelligence and Applications

Ontology Learning and Population: Bridging the Gap between Text and Knowledge - Volume 167 Frontiers in Artificial Intelligence and Applications

Applications of Data Mining in E-Business and Finance (Frontiers in Artificial Intelligence and Applications)

Knowledge Transformation for the Semantic Web (Frontiers in Artificial Intelligence and Applications, 95)

Design Problems, Frames and Innovative Solutions - Volume 203 Frontiers in Artificial Intelligence and Applications (Subseries: Knowledge-Based Intelligent Engineering Systems)

Multi-Relational Data Mining: Volume 145 Frontiers in Artificial Intelligence and Applications

Our partners will collect data and use cookies for ad personalization and measurement. Learn how we and our ad partner Google, collect and use data. Agree & close

Information Modelling and Knowledge Bases XXII - Volume 225 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XIX: Volume 166 Frontiers in Artificial Intelligence and Applications (Frontiers in Artificial Intelligenece and Applications)

Information Modelling and Knowledge Bases XVII: Volume 136 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XVIII - Volume 154 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XX - Volume 190 Frontiers in Artificial Intelligence and Applications

Artificial Intelligence in Education (Frontiers in Artificial Intelligence and Applications)

Information modelling and knowledge bases XIV

Information Modelling and Knowledge Bases Xiii

Artificial Intelligence Research and Development (Frontiers in Artificial Intelligence and Applications, Vol. 146) (Frontiers in Artificial Intelligence and Applications)

Artificial Intelligence Research and Development: Volume 163 Frontiers in Artificial Intelligence and Applications

Artificial Intelligence Research and Development: Volume 163 Frontiers in Artificial Intelligence and Applications

Ontology and the Semantic Web: Volume 156 Frontiers in Artificial Intelligence and Applications (Frontier in Artificial Intelligence and Applications)

Ontology and the Semantic Web: Volume 156 Frontiers in Artificial Intelligence and Applications (Frontier in Artificial Intelligence and Applications)

Advances in Logic, Artificial Intelligence and Robotics: Laptec 2002 (Frontiers in Artificial Intelligence and Applications, 85)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications)

Legal Knowledge and Information Systems: JURIX 2005: The Eighteenth Annual Conference: Volume 134 Frontiers in Artificial Intelligence and Applications

Modular Ontologies (Frontiers in Artificial Intelligence and Applications, WoMO 2011)

Active Mining (Frontiers in Artificial Intelligence and Applications, 79)

Active Mining (Frontiers in Artificial Intelligence and Applications)

Algorithms and Architectures of Artificial Intelligence (Frontiers in Artificial Intelligence and Applications)

Algorithms and Architectures of Artificial Intelligence (Frontiers in Artificial Intelligence and Applications)

Formal Ontologies Meet Industry (Frontiers in Artificial Intelligence and Applications)

New Trends in Multimedia and Network Information Systems (Frontiers in Artificial Intelligence and Applications)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications) (Frontiers in Artificial Interlligence and Applications)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications) (Frontiers in Artificial Interlligence and Applications)

Self-Organization and Autonomic Informatics (I): Volume 135 Frontiers in Artificial Intelligence and Applications

Ontology Learning and Population: Bridging the Gap between Text and Knowledge - Volume 167 Frontiers in Artificial Intelligence and Applications

Applications of Data Mining in E-Business and Finance (Frontiers in Artificial Intelligence and Applications)

Knowledge Transformation for the Semantic Web (Frontiers in Artificial Intelligence and Applications, 95)

Design Problems, Frames and Innovative Solutions - Volume 203 Frontiers in Artificial Intelligence and Applications (Subseries: Knowledge-Based Intelligent Engineering Systems)

Multi-Relational Data Mining: Volume 145 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XXII - Volume 225 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XIX: Volume 166 Frontiers in Artificial Intelligence and Applications (Frontiers in Artificial Intelligenece and Applications)

Information Modelling and Knowledge Bases XVII: Volume 136 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XVIII - Volume 154 Frontiers in Artificial Intelligence and Applications

Information Modelling and Knowledge Bases XX - Volume 190 Frontiers in Artificial Intelligence and Applications

Artificial Intelligence in Education (Frontiers in Artificial Intelligence and Applications)

Information modelling and knowledge bases XIV

Information Modelling and Knowledge Bases Xiii

Artificial Intelligence Research and Development (Frontiers in Artificial Intelligence and Applications, Vol. 146) (Frontiers in Artificial Intelligence and Applications)

Artificial Intelligence Research and Development: Volume 163 Frontiers in Artificial Intelligence and Applications

Artificial Intelligence Research and Development: Volume 163 Frontiers in Artificial Intelligence and Applications

Ontology and the Semantic Web: Volume 156 Frontiers in Artificial Intelligence and Applications (Frontier in Artificial Intelligence and Applications)

Ontology and the Semantic Web: Volume 156 Frontiers in Artificial Intelligence and Applications (Frontier in Artificial Intelligence and Applications)

Advances in Logic, Artificial Intelligence and Robotics: Laptec 2002 (Frontiers in Artificial Intelligence and Applications, 85)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications)

Legal Knowledge and Information Systems: JURIX 2005: The Eighteenth Annual Conference: Volume 134 Frontiers in Artificial Intelligence and Applications

Modular Ontologies (Frontiers in Artificial Intelligence and Applications, WoMO 2011)

Active Mining (Frontiers in Artificial Intelligence and Applications, 79)

Active Mining (Frontiers in Artificial Intelligence and Applications)

Algorithms and Architectures of Artificial Intelligence (Frontiers in Artificial Intelligence and Applications)

Algorithms and Architectures of Artificial Intelligence (Frontiers in Artificial Intelligence and Applications)

Formal Ontologies Meet Industry (Frontiers in Artificial Intelligence and Applications)

New Trends in Multimedia and Network Information Systems (Frontiers in Artificial Intelligence and Applications)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications) (Frontiers in Artificial Interlligence and Applications)

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms (Frontiers in Artificial Intelligence and Applications) (Frontiers in Artificial Interlligence and Applications)

Self-Organization and Autonomic Informatics (I): Volume 135 Frontiers in Artificial Intelligence and Applications

Ontology Learning and Population: Bridging the Gap between Text and Knowledge - Volume 167 Frontiers in Artificial Intelligence and Applications

Applications of Data Mining in E-Business and Finance (Frontiers in Artificial Intelligence and Applications)

Knowledge Transformation for the Semantic Web (Frontiers in Artificial Intelligence and Applications, 95)

Design Problems, Frames and Innovative Solutions - Volume 203 Frontiers in Artificial Intelligence and Applications (Subseries: Knowledge-Based Intelligent Engineering Systems)

Multi-Relational Data Mining: Volume 145 Frontiers in Artificial Intelligence and Applications

Recommend Documents