Frontiers in Artificial Intelligence and Applications FAIA covers all aspects of theoretical and applied artificial intelligence research in the form of monographs, doctoral dissertations, textbooks, handbooks and proceedings volumes. The FAIA series contains several sub-series, including “Information Modelling and Knowledge Bases” and “Knowledge-Based Intelligent Engineering Systems”. It also includes the biennial ECAI, the European Conference on Artificial Intelligence, proceedings volumes, and other ECCAI – the European Coordinating Committee on Artificial Intelligence – sponsored publications. An editorial panel of internationally well-known scholars is appointed to provide a high quality selection. Series Editors: J. Breuker, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong
Volume 225

Recently published in this series

Vol. 224. J. Barzdins and M. Kirikova (Eds.), Databases and Information Systems VI – Selected Papers from the Ninth International Baltic Conference, DB&IS 2010
Vol. 223. R.G.F. Winkels (Ed.), Legal Knowledge and Information Systems – JURIX 2010: The Twenty-Third Annual Conference
Vol. 222. T. Ågotnes (Ed.), STAIRS 2010 – Proceedings of the Fifth Starting AI Researchers’ Symposium
Vol. 221. A.V. Samsonovich, K.R. Jóhannsdóttir, A. Chella and B. Goertzel (Eds.), Biologically Inspired Cognitive Architectures 2010 – Proceedings of the First Annual Meeting of the BICA Society
Vol. 220. R. Alquézar, A. Moreno and J. Aguilar (Eds.), Artificial Intelligence Research and Development – Proceedings of the 13th International Conference of the Catalan Association for Artificial Intelligence
Vol. 219. I. Skadiņa and A. Vasiļjevs (Eds.), Human Language Technologies – The Baltic Perspective – Proceedings of the Fourth Conference Baltic HLT 2010
Vol. 218. C. Soares and R. Ghani (Eds.), Data Mining for Business Applications
Vol. 217. H. Fujita (Ed.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the 9th SoMeT_10
Vol. 216. P. Baroni, F. Cerutti, M. Giacomin and G.R. Simari (Eds.), Computational Models of Argument – Proceedings of COMMA 2010
Vol. 215. H. Coelho, R. Studer and M. Wooldridge (Eds.), ECAI 2010 – 19th European Conference on Artificial Intelligence
ISSN 0922-6389 (print) ISSN 1879-8314 (online)
Information Modelling and Knowledge Bases XXII
Edited by
Anneli Heimbürger University of Jyväskylä, Finland
Yasushi Kiyoki Keio University, Japan
Takehiro Tokuda Tokyo Institute of Technology, Japan
Hannu Jaakkola Tampere University of Technology, Finland
Preface

In recent decades information modeling and knowledge bases have become hot topics, not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The 20th European-Japanese Conference on Information Modeling and Knowledge Bases (EJC2010) continues the series of events that originally started as a co-operation initiative between Japan and Finland, back in the second half of the 1980s. Later (1991) the geographical scope of these conferences expanded to cover the whole of Europe and other countries as well.

The EJC conferences constitute a worldwide research forum for the exchange of scientific results and experiences achieved in computer science and other related disciplines using innovative methods and progressive approaches. In this way a platform has been established drawing together both researchers and practitioners who deal with information modelling and knowledge bases. The main topics of EJC conferences target the variety of themes in the domain of information modeling: conceptual analysis, the design and specification of information systems, multimedia information modelling, multimedia systems, ontology, software engineering, knowledge and process management, knowledge bases, cross-cultural communication and context modelling. We also aim at applying new progressive theories. To this end much attention is also paid to theoretical disciplines including cognitive science, artificial intelligence, logic, linguistics and analytical philosophy.

In order to achieve the targets of the EJC, an international program committee selected 15 full papers and 10 short papers in a rigorous reviewing process from 34 submissions. The selected papers cover many areas of information modelling, namely the theory of concepts, database semantics, knowledge representation, software engineering, WWW information management, context-based information retrieval, ontological technology, image databases, temporal and spatial databases, document data management, process management, cultural modelling and many others.

The conference could not be a success without a lot of effort on the part of many people and organizations. In the program committee, 29 reputable researchers devoted a lot of energy to the review process, selecting the best papers and creating the EJC2010 program, and we are very grateful to them. Professor Yasushi Kiyoki and Professor Takehiro Tokuda acted as co-chairs of the program committee, while Senior Researcher Dr. Anneli Heimbürger and her team took care of the conference venue and local arrangements. Professor Hannu Jaakkola acted as the general organizing chair and Ms. Ulla Nevanranta as conference secretary for the general organizational matters necessary for running the annual conference series. Dr. Naofumi Yoshida and his Program Coordination Team managed the review process and the conference program. We also gratefully appreciate the efforts of all our supporters, especially the Department of Mathematical Information Technology at the University of Jyväskylä (Finland), for supporting this annual event and the 20th jubilee year of EJC.
We believe that the conference was productive and fruitful in the advance of research and application of information modelling and knowledge bases. This book features papers edited as a result of the presentation and discussion at the conference.

The Editors
Anneli Heimbürger, University of Jyväskylä, Finland
Yasushi Kiyoki, Keio University, Japan
Takehiro Tokuda, Tokyo Institute of Technology, Japan
Hannu Jaakkola, Tampere University of Technology (Pori), Finland
Naofumi Yoshida, Komazawa University, Japan
Conference Committee

General Programme Chair
Hannu Kangassalo, University of Tampere, Finland

Co-Chairs
Yasushi Kiyoki, Keio University, Japan
Takehiro Tokuda, Tokyo Institute of Technology, Japan

Members
Maria Bielikova, Slovak University of Technology in Bratislava, Slovakia
Boštjan Brumen, University of Maribor, Slovenia
Pierre-Jean Charrel, University of Toulouse and IRIT, France
Xing Chen, Kanagawa Institute of Technology, Japan
Alfredo Cuzzocrea, ICAR Institute and University of Calabria, Italy
Marie Duží, VSB-Technical University Ostrava, Czech Republic
Jørgen Fischer Nilsson, Technical University of Denmark, Denmark
Hele-Mai Haav, Institute of Cybernetics at Tallinn University of Technology, Estonia
Roland Hausser, Erlangen University, Germany
Anneli Heimbürger, University of Jyväskylä, Finland
Jaak Henno, Tallinn University of Technology, Estonia
Yoshihide Hosokawa, Gunma University, Japan
Hannu Jaakkola, Tampere University of Technology, Pori, Finland
Ahto Kalja, Tallinn University of Technology, Estonia
Eiji Kawaguchi, Kyushu Institute of Technology, Japan
Mauri Leppänen, University of Jyväskylä, Finland
Sebastian Link, Victoria University of Wellington, New Zealand
Tommi Mikkonen, Tampere University of Technology, Finland
Jari Palomäki, Tampere University of Technology, Pori, Finland
Hideyasu Sasaki, Ritsumeikan University, Japan
Tetsuya Suzuki, Shibaura Institute of Technology, Japan
Bernhard Thalheim, Kiel University, Germany
Peter Vojtáš, Charles University Prague, Czech Republic
Yoshimichi Watanabe, University of Yamanashi, Japan
Naofumi Yoshida, Komazawa University, Japan
Koji Zettsu, NICT, Japan

General Organizing Chair
Hannu Jaakkola, Tampere University of Technology, Pori, Finland
Organizing Committee
Anneli Heimbürger, University of Jyväskylä, Finland
Xing Chen, Kanagawa Institute of Technology, Japan
Ulla Nevanranta, Tampere University of Technology, Pori, Finland

Program Coordination Team
Naofumi Yoshida, Komazawa University, Japan
Xing Chen, Kanagawa Institute of Technology, Japan
Anneli Heimbürger, University of Jyväskylä, Finland
Jari Palomäki, Tampere University of Technology, Pori, Finland
Teppo Räisänen, University of Oulu, Finland
Daniela Ďuráková, Technical University of Ostrava, Czech Republic
Akio Takashima, Hokkaido University, Japan
Tomoya Noro, Tokyo Institute of Technology, Japan
Turkka Näppilä, University of Tampere, Finland
Jukka Aaltonen, University of Lapland, Finland

External Reviewers
Thomas Proisl
Besim Kabashi
Contents Preface Anneli Heimbürger, Yasushi Kiyoki, Takehiro Tokuda, Hannu Jaakkola and Naofumi Yoshida
v
Ontology As a Logic of Intensions Marie Duží, Martina Číhalová and Marek Menšík
1
A Three-Layered Architecture for Event-Centric Interconnections Among Heterogeneous Data Repositories and Its Application to Space Weather Takafumi Nakanishi, Hidenori Homma, Kyoung-Sook Kim, Koji Zettsu, Yutaka Kidawara and Yasushi Kiyoki
21
Partial Updates in Complex-Value Databases Klaus-Dieter Schewe and Qing Wang
37
Inferencing in Database Semantics Roland Hausser
57
Modelling a Query Space Using Associations Mika Timonen, Paula Silvonen and Melissa Kasari
77
Architecture-Driven Modelling Methodologies Hannu Jaakkola and Bernhard Thalheim
97
An Emotion-Oriented Image Search System with Cluster Based Similarity Measurement Using Pillar-Kmeans Algorithm Ali Ridho Barakbah and Yasushi Kiyoki
117
The Quadrupel – A Model for Automating Intermediary Selection in Supply Chain Management Remy Flatt, Markus Kirchberg and Sebastian Link
137
A Simple Model of Negotiation for Cooperative Updates on Database Schema Components Stephen J. Hegner
154
A Description-Based Approach to Mashup of Web Applications, Web Services and Mobile Phone Applications Prach Chaisatien and Takehiro Tokuda
174
A Formal Presentation of the Process-Ontological Model Jari Palomäki and Harri Keto
194
Performance Forecasting for Performance Critical Huge Databases Bernhard Thalheim and Marina Tropmann
206
Specification of Games Jaak Henno
226
Bridging Topics for Story Generation Makoto Sato, Mina Akaishi and Koichi Hori
247
A Combined Image-Query Creation Method for Expressing User’s Intentions with Shape and Color Features in Multiple Digital Images Yasuhiro Hayashi, Yasushi Kiyoki and Xing Chen
258
Towards Context Modelling and Reasoning in a Ubiquitous Campus Ekaterina Gilman, Xiang Su and Jukka Riekki
278
A Phenomena-of-Interest Approach for the Interconnection of Sensor Data and Spatiotemporal Web Contents Kyoung-Sook Kim, Takafumi Nakanishi, Hidenori Homma, Koji Zettsu, Yutaka Kidawara and Yasushi Kiyoki
288
Modelling Contexts in Cross-Cultural Communication Environments Anneli Heimbürger, Miika Nurminen, Teijo Venäläinen and Suna Kinnunen
301
Towards Semantic Modelling of Cultural Historical Data Ari Häyrinen
312
A Collaboration Model for Global Multicultural Software Development Taavi Ylikotila and Petri Linna
321
A Culture-Dependent Metadata Creation Method for Color-Based Impression Extraction with Cultural Color Spaces Totok Suhardijanto, Kiyoki Yasushi and Ali Ridho Barakbah
333
R-Web: A Role Accessibility Definition Based Web Application Generation Yusuke Nishimura, Kosuke Maebara, Tomoya Noro and Takehiro Tokuda
344
NULL ‘Value’ Algebras and Logics Bernhard Thalheim and Klaus-Dieter Schewe
354
Ontology Representation and Inference Based on State Controlled Coloured Petri Nets Ke Wang, James N.K. Liu and Wei-min Ma
368
The Discourse Tool: A Support Environment for Collaborative Modeling Efforts Denis Kozlov, Tore Hoel, Mirja Pulkkinen and Jan M. Pawlowski
378
On Context Modelling in Systems and Applications Development Anneli Heimbürger, Yasushi Kiyoki, Tommi Kärkkäinen, Ekaterina Gilman, Kyoung-Sook Kim and Naofumi Yoshida
396
Future Directions of Knowledge Systems Environments for Web 3.0 Koji Zettsu, Bernhard Thalheim, Yutaka Kidawara, Elina Karttunen and Hannu Jaakkola
Ontology as a Logic of Intensions

Marie DUŽÍ a,1, Martina ČÍHALOVÁ a, Marek MENŠÍK a,b

a VSB-Technical University Ostrava, 17. listopadu 15, 708 33 Ostrava, Czech Republic
b Institute of Computer Science, FPF, Silesian University in Opava, Bezručovo nám. 13, 746 01 Opava, Czech Republic
[email protected], [email protected], [email protected]

1 Corresponding Author.
Abstract. We view the content of ontology via a logic of intensions. This is due to the fact that particular intensions like properties, roles, attributes and propositions can stand in mutual necessary relations which should be registered in the ontology of a given domain, unlike some contingent facts. The latter are a subject of updates and are stored in a knowledge-base state. Thus we examine (higher-order) properties of intensions like being necessarily reflexive, irreflexive, symmetric, anti-symmetric, transitive, etc., and mutual relations between intensions like being incompatible, being a requisite, being complementary, and the like. We also define two kinds of entailment relation between propositions, viz. mere entailment and presupposition. Finally, we show that higher-order properties of propositions trigger necessary integrity constraints that should also be included in the ontology. As the logic of intensions we vote for Transparent Intensional Logic (TIL), because the TIL framework is smoothly applicable to all three kinds of context, viz. the extensional context of individuals, numbers and functions-in-extension (mappings), the intensional context of properties, roles, attributes and propositions, and finally the hyper-intensional context of procedures producing intensional and extensional entities as their products.
Keywords. Ontology, intension, hyperintension, Transparent Intensional Logic, integrity constraint.
Introduction

In informatics, the term ‘ontology’ has been borrowed from philosophy, where ontology is a systematic account of existence. In the most general sense, what exists is that which can be represented. Thus in recent Artificial Intelligence and information systems a formal ontology is an explicit and systematic conceptualization of a domain of interest. Given a domain, ontological analysis should clarify the structure of knowledge of what exists in the domain. A formal ontology is, or should be, the stable heart of an information system that makes knowledge sharing, reuse and reasoning possible. As J. Sowa says in [14, p. 51], “logic itself has no vocabulary for describing the things that exist. Ontology fills that gap: it is the study of existence, of all the kinds of entities abstract and concrete that make up the world”. Current languages and tools applicable in the area of ontology design focus in particular on the form of ontological representation rather than on what the semantic content of an ontology should be. Of course, a unified syntax is useful, but the problems of syntax
are almost trivial compared to the problems of developing a common semantics for any domain. In this paper we focus on ontology content rather than a form. We concentrate on describing concepts necessary for the specification of relations between higher-order entities like properties, roles/offices, attributes and propositions, which are all modelled as PWS (possible-world semantics) intensions, i.e. functions with the set of possible worlds as their domain. To this end we apply the procedural semantics of Transparent Intensional Logic (TIL), which provides a universal framework applicable smoothly in all three kinds of context, namely extensional context of individuals, numbers and functions-in-extension, intensional context of PWS-intensions and finally hyperintensional context of concepts viewed as abstract procedures producing extensional as well as intensional entities as their products.2 The paper is organised as follows. Ontology content and languages for ontology specification are introduced in Section 1. Here we also provide a brief introduction to Transparent Intensional Logic, the tool we are going to apply throughout the paper. In Section 2 we introduce our logic of intensions, in particular the logic of requisites. Section 3 tackles the phenomenon of presupposition and compares it with mere entailment. Finally, concluding Section 4 outlines further research.
1. Ontology content and knowledge representation

Knowledge representation is a multidisciplinary discipline that applies theories and tools of logic and ontology. It comprises both knowledge-base and ontology design. Yet there is a substantial distinction between the former and the latter. Whereas the content of a knowledge-base state consists in particular of contingent values of (empirical) attributes, the ontology content comprises in particular the taxonomy of entities, which should not depend on contingent facts. Thus, for instance, in Description Logic (DL) we distinguish between a definitional and an incidental part, the former containing concepts of attributes rather than their values. The main reason for building knowledge-based systems comprising ontologies can be characterized as making hidden knowledge explicit and logically tractable. To this end it is desirable to apply an expressive semantic framework in order that all the semantically salient features of knowledge specification can be adequately represented, so that reasoning based on this representation is logically adequate and does not yield paradoxes. In general, current ontology languages are mostly based on first-order predicate logic (FOL). Though FOL has become the stenography of mathematics, it is not expressive enough when applied in other areas such as ontology specification. The obvious shortcoming of the FOL approach is this: in FOL we must treat higher-order intensions and hyper-intensions as elements of a flat universe, due to which knowledge representation is not comprehensible enough. Moreover, when representing knowledge in FOL, the well-known paradox of omniscience is almost inevitable. For applications where FOL is not adequate, it would be desirable to extend the framework to a higher-order logic (HOL). A general objection against using HOL is its computational intractability. However, HOL formulas are relatively well understood, and reasoning systems for HOLs already exist, e.g., HOL [6] and Isabelle [13].
2 The most recent and up-to-date results and applications of TIL can be found in [5].
1.1. Standard ontological languages

There are a number of languages which have been developed for knowledge representation. They provide tools for knowledge-base specification and deductive reasoning using the specified knowledge. Of these, perhaps the best known and most broadly used logical calculi are F-logic and Description Logic (DL) in their various variants.3 F-logic arose from the practice of frame systems. Thus it can be viewed as a hierarchy of classes of elements which are furnished with attributes, accompanied by inference rules. The DL philosophy is different; it makes use of the notion of a logical theory defined as a set of special axioms built over the first-order predicate calculus. Particular classes and their mutual relations are defined by logical formulas. Thus in DL the class hierarchy typical of frame systems is not directly specified. Rather, it is dynamically derived using logical definitions (class descriptions). Though the existing ontology languages have been enriched by a few constructs exceeding the power of FOL, these additional constructs are usually not well defined and understood. Moreover, particular languages are neither syntactically nor semantically compatible. The W3C efforts at standardization resulted in accepting the Resource Description Framework (RDF) language as the Web ontological recommendation. However, this situation is far from satisfactory. Quoting from Horrocks and Patel-Schneider [8]: “The thesis of representation underlying RDF and RDFS is particularly troublesome in this regard, as it has several unusual aspects, both semantic and syntactic. A more-standard thesis of representation would result in the ability to reuse existing results and tools in the Semantic Web.” RDF includes three basic elements. Resources are anything with a URI address. Properties specify attributes and/or (binary) relations between resources and are used to describe resources. Statements of the form ‘subject, predicate, object’ associate a resource with a specific value of its property. RDF has unusual aspects that make its use as the foundation of representation in the area of ontology building and the Semantic Web difficult at best. In particular, RDF has a very limited collection of syntactic constructs, and these are treated in a very uniform manner in the semantics of RDF. The RDF syntax consists of so-called triples – subject, predicate and object – where only binary predicates are allowed. This causes serious problems concerning compatibility with more expressive languages. The RDF thesis requires that no syntactic constructs other than the RDF triples be used and that the uniform semantic treatment of syntactic constructs cannot be changed, only augmented. In RDFS we can specify classes and properties of individuals, constraints on properties, and the relation of subsumption (subclass, subproperty). It is not possible, for instance, to specify properties of properties, e.g., that a relation (property) is functional or transitive. Nor is it possible to define classes by means of properties of the individuals that belong to the class. The RDF-like languages originally did not have a model-theoretic semantics, which led to many discrepancies. As stated above, RDF(S) is recommended by W3C, and its usage is widespread. The question is whether this is a good decision. A classical FOL approach, or even its standard extension to HOL, would be more suitable for ontologies.
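To make the restriction to binary triples concrete, the following toy sketch (plain Python of ours with a hypothetical ancestorOf vocabulary; it is not RDF tooling) represents statements as subject–predicate–object triples and shows that a property of a property, such as the transitivity of ancestorOf, has to be supplied by machinery outside the RDFS data model itself:

# RDF-style statements as a set of <subject, predicate, object> triples
triples = {
    ("alice", "ancestorOf", "bob"),
    ("bob", "ancestorOf", "carol"),
}

def transitive_closure(ts, pred="ancestorOf"):
    # RDFS cannot declare pred transitive; we compute the closure externally
    closure, changed = set(ts), True
    while changed:
        changed = False
        for (s1, p1, o1) in list(closure):
            for (s2, p2, o2) in list(closure):
                if p1 == p2 == pred and o1 == s2 and (s1, p1, o2) not in closure:
                    closure.add((s1, p1, o2))
                    changed = True
    return closure

print(("alice", "ancestorOf", "carol") in transitive_closure(triples))  # True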
Formalisation in HOL is much more natural and comprehensible: the universe of discourse is not a flat set of ‘individuals’; rather, properties and relations can be naturally talked about as well, which is much more apt for the representation of ontologies.
3 For details on Description Logic and F-logic see, for instance, [1] and [11], respectively.
Recognition of the limitations of RDFS led to the development of ontology languages such as OIL, DAML-ONT and DAML+OIL, which resulted in OWL. OWL has been developed as an extension of RDFS. OWL (like DAML+OIL) uses the same syntax as RDF (and RDFS) to represent ontologies, so the two languages are syntactically compatible. However, the semantic layering of the two languages is more problematic. The difficulty stems from the fact that OWL (like DAML+OIL) is largely based on DL, the semantics of which would normally be given by a classical first-order model theory in which individuals are interpreted as elements of some domain (a set), classes are interpreted as subsets of the domain and properties are interpreted as binary relations on the domain. The semantics of RDFS, on the other hand, is given by a non-standard model theory, where individuals, classes and properties are all elements of the domain. Properties are further interpreted as having extensions which are binary relations on the domain, and class extensions are only implicitly defined by the extension of the rdf:type property. Moreover, RDFS supports reflection on its own syntax: the interpretation of classes and properties can be extended by statements in the language. Thus language layering is much more complex, because different layers subscribe to these two different approaches. A somewhat more sophisticated approach is provided by OWL (the Web Ontology Language), which is also recommended by W3C and is based on the DL framework. In DL we talk about individuals that are elements of a universe domain. The individuals are members of subclasses of the domain, and can be related to other individuals (or data values) by means of properties (n-ary relations are called properties in Web ontologies, for they are decomposed into n properties). The universe of discourse is divided into two disjoint sorts: the object domain of individuals and the data value domain of numbers. Thus the interpretation function assigns elements of the object domain to individual constants, elements of the data value domain to value constants, and subclasses of the data domain to data types. Further, object and data predicates are distinguished, the former being interpreted as a subset of the Cartesian product of the object domain, the latter as a subset of the Cartesian product of the value domain. DL is rather rich, despite being an FOL language. It makes it possible to distinguish intensional knowledge (knowledge of the analytically necessary relations between concepts) and extensional knowledge (of contingent facts). To this end a DL knowledge base includes so-called T-boxes (terminology or taxonomy) and A-boxes (contingent attributes of objects). A T-box contains verbal definitions, i.e., a new concept is defined by composing known concepts. For instance, a woman can be defined as WOMAN = PERSON & SEX_FEMALE, and a mother as MOTHER = WOMAN & ∃child (HAS_child). Thus the fact that, e.g., a mother is a woman is an analytic (necessary) truth. T-boxes also contain specifications of necessary properties of concepts and of relations between concepts: the property of satisfiability corresponds to a non-empty concept, the relation of subsumption (intensionally contained concepts), equivalence and disjointness (incompatibility). Thus, e.g., that a bachelor is not married is an analytically true proposition. On the other hand, the fact that, e.g., Mr. Jones is a bachelor is a contingent, unnecessary fact. Such contingent properties (attributes) of objects are recorded in A-boxes.
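For concreteness, a miniature T-box/A-box pair in standard DL notation might look as follows (the concept, role and individual names are merely illustrative, not taken from the paper):

T-box: WOMAN ≡ PERSON ⊓ FEMALE; MOTHER ≡ WOMAN ⊓ ∃hasChild.PERSON; BACHELOR ⊑ ¬MARRIED
A-box: BACHELOR(JONES); hasChild(MARY, JOHN)

Whatever follows from the T-box (e.g., that every mother is a woman) holds of analytical necessity, whereas the A-box facts are contingent and subject to update.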
The third group of ontology languages lies somewhere between the FOL framework and RDFS. This group comprises SKIF and Common Logic [7]. The SKIF syntax is compatible with the functional language LISP, but in principle it is an FOL syntax. These languages also have a non-standard model theory, with predicates being interpreted as individuals, i.e., elements of a domain. Classes are, however, treated as subsets of the domain, and their redefinition in the language syntax is not allowed.
Based on Common Logic, the SKIF language accommodates some higher-order constructs. The SKIF languages are syntactically compatible with LISP, i.e., the FOL syntax is extended with the possibility to mention properties and to use variables ranging over properties. For instance, we can specify that John and Peter have a common property: ∃p. p(John) ∧ p(Peter). The property they have in common can be, e.g., that they both love their wives. We can also specify that a property P is true of John, and that P has the property Q: P(John) ∧ Q(P). If P is being honest and Q is being eligible, the sentence can be read as saying that John is honest, which is eligible. The interpretation structure is a triple ⟨D, ext, V⟩, where D is the universe, V is the function that maps predicates, variables and constants to the elements of D, and ext is the function that maps D into sets of n-tuples of elements of D. SKIF does not reduce the arity of predicates. To the best of our knowledge, the only ontology language supporting inferences at this level is the Semantic Web Rule Language (SWRL) combining OWL and RuleML [9]. According to the OWL (Web Ontology Language) overview [19], OWL is intended to be used when information contained in documents needs to be processed by applications, as opposed to situations where the content only needs to be presented to humans. OWL can be used to represent the meaning of terms in vocabularies and the relationships between those terms. OWL has been designed on top of XML, XLink, RDF and RDFS in order to provide more facilities for expressing meaning and semantics, so as to represent machine-interpretable content on the Web. Summarising, a well-defined ontology should serve at least these goals: (1) a universal library to be accessed and used by humans in a variety of information-use contexts, (2) the backdrop for the work of computational agents carrying out activities on behalf of humans, and (3) a method for integrating knowledge bases and databases to perform tasks for humans. Current ontology languages, however, are far from meeting these goals, and their expressive power does not enable computational agents to make use of an adequate inference machine. Still worse, from a logical-semantic point of view these languages suffer from the following shortcomings. None of them (perhaps with the exception of languages based on DL) makes it possible to express modalities (what is necessary and what is contingent), or to distinguish three kinds of context, viz. the extensional level of objects like individuals, numbers and functions(-in-extension), the intensional level of properties, propositions, offices and roles, and finally the hyperintensional level of concepts (i.e., algorithmically structured procedures). Concepts of n-ary relations are unreasonably modelled by properties. True, each n-ary relation can be expressed by n unary relations (properties), but such a representation is misleading and incomprehensible. An ontology language should, however, be universal, highly expressive, with transparent semantics and meaning-driven axiomatisation. For these reasons we vote for the expressive system of Transparent Intensional Logic (TIL). From the formal point of view, TIL is a hyper-intensional, partial, typed λ-calculus. Hyperintensional, because we apply a top-down approach to semantics, from the hyper-intensional (conceptual) level of procedures, via the intensional level, down to the extensional level of abstraction. The basic semantic construct is an abstract procedure known as a TIL construction.
Since TIL has been referred to in numerous EJC papers, in the next paragraph we only briefly recapitulate basic principles of TIL. For the most up-to-date exposition, see [5] and also [10].
1.2. A brief introduction to TIL

TIL is an overarching semantic theory for all sorts of discourse, whether colloquial, scientific, mathematical or logical. The theory is a procedural one, according to which sense is an abstract, pre-linguistic procedure detailing what operations to apply to what procedural constituents to arrive at the product (if any) of the procedure. Such procedures are rigorously defined as TIL constructions. The semantics is entirely anticontextual and compositional, and it is, to the best of our knowledge, the only one that deals with all kinds of context in a uniform way. Thus the sense of a sentence is an algorithmically structured construction of the proposition denoted by the sentence. The denoted proposition is a flat, or unstructured, mapping with domain in a logical space of possible worlds. Our motive for working ‘top-down’ has to do with anticontextualism: any given unambiguous term or expression (even one involving indexicals or anaphoric pronouns) expresses the same construction as its sense whatever sort of context the term or expression is embedded within. And the meaning of an expression determines the respective denoted entity (if any), but not vice versa. The denoted entities are (possibly 0-ary) functions understood as set-theoretical mappings. Thus we strictly distinguish between a procedure (construction) and its product (here, a constructed function), and between a function and its value.

Intuitively, a construction C is a procedure (a generalised algorithm). Constructions are structured in the following way. Each construction C consists of sub-instructions (constituents), each of which needs to be executed when executing C. Thus a specification of a construction is a specification of an instruction on how to proceed in order to obtain the output entity given some input entities. There are two kinds of constructions, atomic and compound (molecular). Atomic constructions (Variables and Trivializations) do not contain any other constituent but themselves; they specify objects (of any type) on which compound constructions operate. The variables x, y, p, q, … construct objects dependently on a valuation; they v-construct. The Trivialisation of an object X (of any type, even a construction), in symbols 0X, constructs simply X without the mediation of any other construction. Compound constructions, which consist of other constituents as well, are Composition and Closure. Composition [F A1…An] is the operation of functional application. It v-constructs the value of the function f (valuation-, or v-, constructed by F) at a tuple argument A (v-constructed by A1, …, An), if the function f is defined at A; otherwise the Composition is v-improper, i.e., it fails to v-construct anything.4 Closure [λx1…xn X] spells out the instruction to v-construct a function by abstracting over the values of the variables x1,…,xn in the ordinary manner of the λ-calculi. Finally, higher-order constructions can be used twice over as constituents of composite constructions. This is achieved by a fifth construction called Double Execution, 2X, that behaves as follows: if X v-constructs a construction X′, and X′ v-constructs an entity Y, then 2X v-constructs Y; otherwise 2X is v-improper, failing as it does to v-construct anything.

TIL constructions, as well as the entities they construct, all receive a type. The formal ontology of TIL is bi-dimensional; one dimension is made up of constructions, the other dimension encompasses non-constructions.

4 As mentioned above, we treat functions as partial mappings, i.e., set-theoretical objects, unlike the constructions of functions.
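To give a simple non-empirical illustration of these kinds of constructions (our example, not the paper’s; it assumes the arithmetic functions +, ÷/(τττ) and a variable x →v τ, anticipating the type notation introduced below): the Trivialisation 0+ constructs the addition function itself; the Composition [0+ 02 05] applies it to the arguments 2 and 5 and thus v-constructs the number 7; the Closure λx [0+ x 01] constructs the successor function by abstracting over x; and a Composition such as [0÷ 05 00] (division by zero) is v-improper, because the constructed function is undefined at that argument.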
On the ground level of the type hierarchy, there are non-constructional entities unstructured from the algorithmic point of view belonging to a type of order 1. Given a so-called epistemic (or objectual) base of atomic types (ο – truth values, ι – individuals, τ – time moments / real numbers, ω – possible worlds), the induction rule for forming functional types is applied: where α, β1,…,βn are types of order 1, the set of partial mappings from β1 ×…× βn to α, denoted ‘(α β1…βn)’, is a type of order 1 as well.5 Constructions that construct entities of order 1 are constructions of order 1. They belong to a type of order 2, denoted ‘*1’. The type *1 together with atomic types of order 1 serves as a base for the induction rule: any collection of partial mappings, type (α β1…βn), involving *1 in their domain or range is a type of order 2. Constructions belonging to a type *2 that identify entities of order 1 or 2, and partial mappings involving such constructions, belong to a type of order 3. And so on ad infinitum.

The sense of an empirical expression is a hyperintension, that is, a construction that produces a (possible-world) α-intension, where α-intensions are members of type (αω), i.e., functions from possible worlds to an arbitrary type α. On the other hand, α-extensions are members of a type α, where α is not equal to (βω) for any β, i.e., extensions are functions whose domain is not the set of possible worlds. Intensions are frequently functions of a type ((ατ)ω), i.e., functions from possible worlds to chronologies of the type α (in symbols: ατω), where a chronology is a function of type (ατ). Some important kinds of intensions are:

Propositions, type οτω. They are denoted by empirical sentences.

Properties of members of a type α, or simply α-properties, type (οα)τω.6 General terms, some substantives and intransitive verbs (‘student’, ‘walks’) denote properties, mostly of individuals.

Relations-in-intension, type (οβ1…βm)τω. For example, transitive empirical verbs (‘like’, ‘worship’) and also attitudinal verbs denote these relations.

α-roles, also α-offices, type ατω, where α ≠ (οβ). Frequently ιτω. They are often denoted by the concatenation of a superlative and a noun (‘the highest mountain’).

An object A of a type α is denoted ‘A/α’. That a construction C/∗n v-constructs an object of type α is denoted ‘C →v α’. We use variables w and t as v-constructing elements of type ω (possible worlds) and τ (times), respectively. If C →v ατω v-constructs an α-intension, the frequently used Composition of the form [[C w] t], the intensional descent of the α-intension, is abbreviated ‘Cwt’.

The analysis of a sentence consists in discovering the logical construction (procedure) encoded by the sentence. To this end we apply a method of analysis that consists of three steps:7
1) Type-theoretical analysis, i.e., assigning types to the objects that receive mention in the analysed sentence.
2) Synthesis, i.e., combining the constructions of the objects ad (1) in order to construct the proposition of type οτω denoted by the whole sentence.
3) Type-theoretical checking.

5 TIL is an open-ended system. The above epistemic base {ο, ι, τ, ω} was chosen because it is apt for natural-language analysis, but the choice of base depends on the area and language to be analysed. For instance, possible worlds and times are out of place in the case of mathematics, and the base might then consist of, e.g., ο and ν, where ν is the type of natural numbers.
6 We model α-sets and (α1…αn)-relations by their characteristic functions of type (οα), (οα1…αn), respectively. Thus an α-property is an empirical function that, dependently on states-of-affairs (⟨w, t⟩ pairs), picks up a set of α-individuals, the population of the property.
7 For details see, e.g., [12].
To illustrate the method, let us analyse the sentence “All drivers are persons”.

Ad (1) The objects mentioned by the sentence are the individual properties of being a Driver and being a Person, and the quantifier All. Individual properties receive the type (((οι)τ)ω), (οι)τω for short. Given a world-time pair ⟨w, t⟩, a property applied to the world w and time t returns a class of individuals, its population at ⟨w, t⟩. Yet the sentence does not mention any particular individual, be it a driver or a person. It says that the population of drivers is a subset of the population of persons. Thus the type of the (restricted) quantifier All is ((ο(οι))(οι)). Given a set M/(οι) of individuals, the quantifier All returns all the supersets of M. Thus we have [0All 0M] → (ο(οι)).

Ad (2) Now we combine the constructions of the objects ad (1) in order to construct the proposition (of type οτω) denoted by the whole sentence. Since we aim at discovering the literal analysis of the sentence, the objects denoted by the semantically simple expressions ‘driver’, ‘person’ and ‘all’ are constructed by their Trivialisations: 0Driver, 0Person, 0All. By Composing these constructions, we obtain a truth-value (T or F), according as the population of persons belongs to the set of supersets of the population of drivers. Thus we have [[0All 0Driverwt] 0Personwt] →v ο. Finally, by abstracting over the values of the variables w and t, we construct the proposition: λwλt [[0All 0Driverwt] 0Personwt].

Ad (3) By drawing a type-theoretical structural tree, we check whether the particular constituents of the above Closure are combined in a type-theoretically correct way:

λwλt [[0All 0Driverwt] 0Personwt]
  0All → ((ο(οι))(οι)); 0Driverwt → (οι); 0Personwt → (οι)
  [0All 0Driverwt] → (ο(οι))
  [[0All 0Driverwt] 0Personwt] → ο
  λt [[0All 0Driverwt] 0Personwt] → (οτ)
  λwλt [[0All 0Driverwt] 0Personwt] → ((οτ)ω), the type of a proposition, οτω for short.
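The same analysis can be mirrored by a small executable sketch (our own illustration in Python, not TIL syntax; the world/time labels and the populations are invented). PWS intensions are modelled as functions from ⟨w, t⟩ pairs, a property returning its population at the given ⟨w, t⟩, and the restricted quantifier All returning the class of supersets of a given population:

def driver(w, t):
    # contingent population of the property Driver at <w, t> (invented data)
    return {"Ann", "Bob"} if (w, t) == ("w1", "t1") else {"Ann"}

def person(w, t):
    # contingent population of the property Person at <w, t> (invented data)
    return {"Ann", "Bob", "Cecil"}

def all_quantifier(m):
    # restricted quantifier All: given a set M, the class of supersets of M,
    # represented by its characteristic function
    return lambda n: m <= n

def proposition(w, t):
    # lambda w lambda t [[0All 0Driver_wt] 0Person_wt]
    return all_quantifier(driver(w, t))(person(w, t))

print(proposition("w1", "t1"))  # True iff every driver is a person at <w1, t1>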
So much for the method of analysis and the semantic schema of TIL.

1.3. Ontology content

A formal ontology is a result of the conceptualization of a given domain. It contains definitions of the most important entities, and forms a conceptual hierarchy together with the most important attributes and relations between entities. Material individuals are mereological sums of other individuals, but only contingently so. Similarly, values of attributes and properties are ascribed to individuals contingently, provided a given property is purely contingent, that is, without an essential core. Thus we advocate for a (modest) individual anti-essentialism. On the other hand, on the intensional level of propositions, properties, offices and roles, that is, the entities which we call ‘intensions’, the most important relation to be observed is that of requisite. For instance, the property of being a mammal is a requisite of the property of being a whale. It is an analytically necessary relation between intensions that gives rise to the so-called ISA hierarchy. Thus on the intensional level we advocate for intensional essentialism; an essence of a
property is the set of all its requisites. Finally, on the hyper-intensional level of concepts, the relations to be observed are equivalence (i.e., producing the same entity), refinement (a compound concept is substituted for a simpler yet equivalent concept), entailment and presupposition.

The process of ontology building starts on the hyper-intensional level with the specification of primitive concepts. Next we specify compound concepts as ontological definitions of the entities of a given domain. Having defined entities, we can specify their most important descriptive attributes. The building process continues by specifying particular (empirical) relations between entities and the analytical relations of requisites that serve to build up the ontological hierarchy. Finally, the most important general rules that govern the behaviour of the system are specified. Here again we distinguish analytically necessary constraints from nomic and common necessities that are given by laws and conventions, respectively; the latter are not analytically necessary. For instance, mathematical laws are analytically necessary; they hold independently of states of affairs. On the other hand, laws of physics are not logically or analytically necessary; they are only nomically necessary. It is even disputable whether these laws are eternal in our world. Still weaker constraints are, for instance, traffic laws. That we drive on the right-hand side of a lane is valid only by convention, and locally. Summarising, the basic parts of a formal ontology should encompass:

(1) A conceptual (terminological) dictionary, which contains:
a) primitive concepts
b) compound concepts (ontological definitions of entities)
c) the most important descriptive attributes, in particular the identification of entities

(2) Relations:
a) contingent empirical relations between entities, in particular the part-whole relation
b) analytical relations between intensions, i.e., requisites and essence, which give rise to the ISA hierarchy

(3) Integrity constraints:
a) analytically necessary rules
b) nomologically necessary rules
c) common rules of ‘necessity by convention’

Concerning (1), in particular ontological definitions, this topic has been dealt with in [4]. Briefly, an ontological definition of an entity is a compound construction of the entity. Such a definition often serves as a refinement of a primitive concept of the entity, which makes it possible to prove some analytic statements about the entity. For example, the sentence “Whales are not dolphins” contains the empirical predicates ‘is a whale’ and ‘is a dolphin’, yet the sentence is an analytic truth. At no world/time are the properties of being a whale and being a dolphin co-instantiated by the same individual. The proposition constructed by the sentence is the necessary proposition TRUE. In order to prove it, we need to refine the concept of a whale. To this end we make use of the fact that the property of being a whale can be defined as the property of being a marine mammal of the order Cetacea that is neither a dolphin nor a porpoise.8 Thus the ontological definition of the property of being a whale is
λwλt λx [[0Mammalwt x] ∧ [0Marinewt x] ∧ [0Cetaceawt x] ∧ ¬[0Dolphinwt x] ∧ ¬[0Porpoisewt x]]

Types: x → ι; Cetacea, Mammal, Marine, Dolphin, Porpoise/(οι)τω.

Using this definition instead of the primitive concept 0Whale we get:

λwλt [0No λx [[0Mammalwt x] ∧ [0Marinewt x] ∧ [0Cetaceawt x] ∧ ¬[0Dolphinwt x] ∧ ¬[0Porpoisewt x]] 0Dolphinwt].

Gloss: “No individual x such that x is a marine mammal of the order Cetacea and x is neither a dolphin nor a porpoise is a dolphin”.

In this paper we focus on problems (2) and (3); that is, we will examine relations between intensions, properties of intensions and various integrity constraints, viewed via the logic of intensions.

8 See, for instance, http://mmc.gov/species/speciesglobal.html#cetaceans or http://www.crru.org.uk/education/factfiles/taxonomy.htm
2. Logic of intensions

2.1. Requisites and ISA hierarchies

It is important to distinguish between purely contingent propositions and the proposition TRUE that takes the value T in all ⟨w, t⟩-pairs. The latter is denoted by analytically true sentences such as the above analysed sentences “No whale is a dolphin” or “All drivers are persons”. We have seen that the literal analysis does not make it possible to prove the analytic truth of such a sentence. To this end we have two possibilities. Either we can record ontological definitions refining the primitive concepts of the objects talked about (as illustrated by the above whale example), or we need to explicitly record in our ontology the fact that there is a necessary relation(-in-extension) between the two properties. We call this relation a requisite, in this case Req1/(ο(οι)τω(οι)τω), and it receives this definition:

[0Req1 0Person 0Driver] =df ∀w∀t [∀x [[0Driverwt x] ⊃ [0Personwt x]]]

Gloss. Being a person is a requisite of being a driver. In other words, necessarily and for any individual x, if x instantiates the property of being a driver then x also instantiates the property of being a person.

Now we set out the logic of requisites, because this relation is the basic relation that gives rise to ISA taxonomies.9 The requisite relations Req are a family of relations-in-extension between two intensions, hence of the polymorphous type (ο ατω βτω), where possibly α = β. The relation of a requisite can be defined between intensions of any type. For instance, a requisite of finding is the existence of the sought object. Infinitely many combinations of Req are possible, but the following four are the relevant ones we wish to consider:

(1) Req1/(ο (οι)τω (οι)τω): an individual property is a requisite of another such property.
(2) Req2/(ο ιτω ιτω): an individual office is a requisite of another such office.
(3) Req3/(ο (οι)τω ιτω): an individual property is a requisite of an individual office.
(4) Req4/(ο ιτω (οι)τω): an individual office is a requisite of an individual property.
9 Parts of this section draw on material presented in [5], Chapter 4.
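By analogy with the definition of Req1 just given, the heterogeneous kind Req3 might, for instance, be rendered as follows (a hedged sketch of ours, neglecting partiality; X → ιτω is an office, Y → (οι)τω a property, and Occ is the occupancy property of offices used in the proof of Claim 3 below):

[0Req3 Y X] =df ∀w∀t [[0Occwt X] ⊃ [Ywt Xwt]]

Gloss: necessarily, whenever the office X is occupied, its occupant instantiates the property Y; for example, whoever occupies the office of President of the USA is a US citizen.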
Neglecting complications due to partiality, the definitions of the particular kinds of requisites should be obvious: “Y is a requisite of X” iff “necessarily, whatever occupies/instantiates X at ⟨w, t⟩ also occupies/instantiates Y at this ⟨w, t⟩.” Examples. Being a Person and being a Driver is an example of Req1. An example of Req2 is the Commander-in-Chief and the President of the USA. The former office is a requisite of the latter, such that whoever is the President is also the Commander-in-Chief. However, it may happen that the Presidency goes vacant while somebody occupies the office of Commander-in-Chief. As an example of Req3 we can adduce the property of being a US citizen and the office of President of the USA. Finally, an example of Req4 is the pair of the God-office and the property of being Omniscient. Note that while Req1/(ο(οι)τω(οι)τω) and Req2/(οιτωιτω) are homogeneous, Req3 and Req4 are heterogeneous. Since the latter two do not have a unique domain, it is not sensible to ask what sort of ordering they are. Not so with the former two. We define them as quasi-orders (a.k.a. pre-orders) over (ο(οι)τω) and (οιτω), respectively, that can be strengthened to weak partial orderings. However, they cannot be strengthened to strict orderings on pain of paradox, since they would then be both reflexive and irreflexive. We wish to retain reflexivity, such that any intension having requisites will count itself among its requisites. Since intensions are properly partial functions, in order to deal with partiality we make use of three properties of propositions, True, False, Undef/(οοτω)τω. If P → οτω is a construction of a proposition, [0Truewt P] returns T if the proposition takes the truth-value T in a given ⟨w, t⟩, otherwise F. [0Falsewt P] returns T if the proposition takes the truth-value F in a given ⟨w, t⟩, otherwise F. [0Undefwt P] returns T in a given ⟨w, t⟩ if neither [0Truewt P] nor [0Falsewt P] returns T, otherwise F.

Claim 1 Req1 is a quasi-order on the set of ι-properties.

Proof. Let X, Y → (οι)τω. Then Req1 belongs to the class QO/(ο(ο(οι)τω(οι)τω)) of quasi-orders over the set of individual properties:

Reflexivity.
∀w∀t [∀x [[0Truewt λwλt [Xwt x]] ⊃ [0Truewt λwλt [Xwt x]]]].

Transitivity.

[∀w∀t [∀x [[[0Truewt λwλt [Xwt x]] ⊃ [0Truewt λwλt [Ywt x]]] ∧ [[0Truewt λwλt [Ywt x]] ⊃ [0Truewt λwλt [Zwt x]]]]] ⊃ ∀w∀t [∀x [[0Truewt λwλt [Xwt x]] ⊃ [0Truewt λwλt [Zwt x]]]]].

In order for a requisite relation to be a weak partial order, it would also need to be anti-symmetric. The Req1 relation is, however, not anti-symmetric. If properties X, Y are mutually in the Req1 relation, i.e., if [[0Req1 Y X] ∧ [0Req1 X Y]], then at each ⟨w, t⟩ the two properties are truly ascribed to exactly the same individuals. This does not entail, however, that X and Y are identical. It may be the case that there is an individual a such that [Xwt a] v-constructs F whereas [Ywt a] is v-improper. For instance, the following properties X, Y differ only in truth-values for those individuals who never
smoked (let StopSmoke/(οι)τω: the property of having stopped smoking).10 Whereas X yields truth-value gaps on such individuals, Y is false of them:

X = λwλt λx [0StopSmokewt x]
Y = λwλt λx [0Truewt λwλt [0StopSmokewt x]].

In order to abstract from such an insignificant difference, we introduce the equivalence relation Eq/(ο(οι)τω(οι)τω) on the set of individual properties; p, q → (οι)τω; =/(οοο):

0Eq = λpq [∀x [[0Truewt λwλt [pwt x]] = [0Truewt λwλt [qwt x]]]].

10 We take the property of having stopped smoking as presupposing that the individual previously smoked. For instance, that Tom stopped smoking can be true or false only if Tom was once a smoker. Similarly for the property of having stopped whacking one’s wife.
Now we define the Req1′ relation on the factor set of the set of ι-properties as follows. Let [p]eq = λq [0Eq p q] and [Req1′ [p]eq [q]eq] = [Req1 p q]. Then:

Claim 2 Req1′ is a weak partial order on the factor set of the set of ι-properties with respect to Eq.

Proof. It is sufficient to prove that Req1′ is well-defined. Let p′, q′ be ι-properties such that [0Eq p p′] and [0Eq q q′]. Then

[Req1′ [p]eq [q]eq] = [Req1 p q] = ∀w∀t [∀x [[0Truewt λwλt [pwt x]] ⊃ [0Truewt λwλt [qwt x]]]] = ∀w∀t [∀x [[0Truewt λwλt [p′wt x]] ⊃ [0Truewt λwλt [q′wt x]]]] = [Req1′ [p′]eq [q′]eq].

Now the relation Req1′ is obviously antisymmetric:

[[[0Req1′ [p]eq [q]eq] ∧ [0Req1′ [q]eq [p]eq]] ⊃ [[p]eq = [q]eq]].

Claim 3 Req2 is a weak partial order defined on the set of ι-offices.

Proof. Let X, Y → ιτω. Then the Req2 relation belongs to the class WO/(ο(ο ιτω ιτω)) of weak partial orders over the set of individual offices.

Reflexivity.
∀w∀t [[0Occwt X] ⊃ [0Truewt λwλt [Xwt = Xwt]]].

Transitivity.

[∀w∀t [[[0Occwt X] ⊃ [0Truewt λwλt [Xwt = Ywt]]] ∧ [[0Occwt Y] ⊃ [0Truewt λwλt [Ywt = Zwt]]]] ⊃ ∀w∀t [[0Occwt X] ⊃ [0Truewt λwλt [Xwt = Zwt]]]].

Remark. Antisymmetry requires the consistent identity of the offices constructed by X, Y: [X = Y]. The two offices are identical iff at all worlds/times they are either co-
occupied by the same individual or are both vacant: ∀w∀t [[0Truewt λwλt [Xwt = Ywt]] ∨ [0Undefwt λwλt [Xwt = Ywt]]] = ∀w∀t ¬[0Falsewt λwλt [Xwt = Ywt]], which is the case here.

It is a well-known fact that hierarchies of intensions based on requisite relations establish inheritance of attributes and possibly also of operations. For instance, a driver, in addition to his/her special attributes like having a driving license, inherits all the attributes of a person. This is another reason for including such a hierarchy in an ontology. This concludes our definition of the logic of the requisite relations. We turn now to dealing with the part-whole relation.

2.2. Part-whole relation

We advocate for the thesis of modest individual anti-essentialism: if an individual I has a property P necessarily (i.e., in all worlds and times), then P is a constant or partly constant function. In other words, the property has a non-empty essential core Ess, where Ess is the set of individuals that have the property necessarily, and I is an element of Ess. There is, however, a frequently voiced objection to individual anti-essentialism. If, for instance, Tom’s only car is disassembled into its elementary physical parts, then Tom’s car no longer exists; hence, the property of being a car is essential to the individual referred to by ‘Tom’s only car’. Our response to the objection is this. First, what is denoted (as opposed to referred to) by ‘Tom’s only car’ is not an individual, but an individual office/role, which is an intension of type ιτω having occasionally different individuals, and occasionally none, as values in different possible worlds at different times. Whenever Tom does buy a car, it is not logically necessary that Tom buy some one particular car rather than any other. Second, the individual referred to as ‘Tom’s only car’ does not cease to exist even after having been taken apart into its most elementary parts. It has simply lost some properties, among them the property of being a car, the property of being composed of its current parts, etc., while acquiring some other properties. Suppose somebody by chance happened to reassemble the parts so that the individual would regain the property of being a car. Then Tom would have no right to claim that this individual was his car, in case it was allowed that the individual had ceased to exist. Yet Tom should be entitled to claim the reassembled car as his.11 Therefore, when disassembled, Tom’s individual did not cease to exist; it had simply (unfortunately) obtained the property of completely disintegrating into its elementary physical parts. So much for modest individual anti-essentialism.

The second thesis we are going to argue for is this. A material entity that is a mereological sum of a number of parts, such as a particular car, is from a logical point of view a simple, hence unstructured, individual. Only its design, or construction, is a complex entity, namely a structured procedure. This is to say that a car is not a structured whole that organizes its parts in a particular manner. Tichý says:

[A] car is a simple entity. But is this not a reductio ad absurdum? Are cars not complex, as anyone who has tried to fix one will readily testify? No, they are not. If a car were a complex then it would be legitimate to ask: Exactly how complex is it? Now how many parts does a car consist of? One plausible answer which may suggest itself is that it has three parts: an engine, a chassis, and a body.
But an equally plausible answer can be given in terms of a much longer list: several spark plugs, several pistons, a starter, a carburettor, four tyres, two axles, six windows, etc. Despite being longer, the latter list does not overlap with the former: neither the engine, nor the chassis, nor the body appears on it. How can that be? How can an engine, for example, both be and not be a part of one and the very same car? There is no mystery, however. It is a commonplace that a car can be decomposed in several alternative ways. … Put in other words, a car can be constructed in a very simple way as a mereological sum of three things, or in a more elaborate way as a mereological sum of a much larger set of things. ([17], pp. 179-80.)

11 As Tichý argues in [16], where he uses the example of a watch being ‘repaired’ by a watchmaker in such a way as to become a key.
It is a contingent fact that this or that individual consists of other individuals and thereby creates a mereological sum. Importantly, being a part of is a relation between individuals, not between intensions. There can be no inheritance or implicative relation between the respective properties ascribed to a whole and its individual parts. Thus it is vital not to confuse the requisite relation, which obtains between intensions, with the part-whole relation, which obtains between individuals. The former relation obtains of necessity (e.g., necessarily, any individual that is an elephant is a mammal), while the latter relation obtains contingently. Logically speaking, any two individuals can enter into the part-whole relation. One possible combination has Saturn as a part of Socrates (or vice versa). There will be restrictions on possible combinations, but these restrictions are anchored to nomic necessity (provided a given possible world at which a combination of individuals is attempted has laws of nature at all). One impossible combination would have the largest mountain on Saturn be a part of π (or vice versa). Why impossible? Because of wrong typing: the arguments of the part-whole relation must be individuals (i.e., entities of type ι), but the largest mountain on Saturn is an individual office while π is a real number. Yet there is another question interesting from the ontological point of view: which parts are essential for an individual in order to have a property P? For instance, the property of having an engine is essential for the property of being a car, because something designed without an engine does not qualify as a car, but at most as a toy car, which is not a car. The answer to the question which parts are essential in order to have a property P is, in the car/engine example, that the property of having an engine is a requisite of the property of being a car. What is necessary is that a car, any car, should have an engine. It is even necessary that it should have a particular kind of engine, where being a kind of engine is a property of a property of individuals. This kind of requisite relation should also be included in the ontology. What is not necessary is that any car should have some one particular engine belonging to a particular kind of engine: mutatis mutandis, any two members of a particular kind of engine will be mutually replaceable.12 Thus the relation Part_of is of type (οιι)τω.

2.3. Some other properties of intensions

In addition to the above-described higher-degree relations of requisite, it is also useful to include in an ontology some other higher-degree relations between, and properties of, intensions. In particular, we examine properties of relations-in-intension; for instance, that a given relation is necessarily reflexive, anti-symmetric and transitive, like the partial order induced by a requisite relation.
12 This problem is connected with the analysis of property modification, including being a malfunctioning P.
These higher-order properties of intensions are necessarily valid due to the way the intensions are constructed. Since we explicate concepts as closed constructions modulo α- and η-transformation, we can also speak about mutual relations between, and properties of, the concepts that define particular intensions. Those that deserve our attention are in particular:
- Incompatibility of concepts defining particular properties, i.e., the respective populations are necessarily disjoint; example: bachelor vs. married man.
- Equivalence of concepts, i.e., the defined properties are one and the same property.
- Weak equivalence of concepts, i.e., the defined properties are 'almost the same'; as an example we echo the relation Eq between individual properties defined in the previous paragraph.
- Functionality of a relation-in-intension, that is, necessarily, in each ⟨w, t⟩-pair, a given relation R ⊆ Awt × Bwt is a mapping fR: Awt → Bwt assigning to each element of Awt at most one element of Bwt.
- Inverse functionality of a relation-in-intension, that is, necessarily, in each ⟨w, t⟩-pair, a given relation-in-extension R ⊆ Awt × Bwt is a mapping fR⁻¹: Bwt → Awt assigning to each element of Bwt at most one element of Awt.
We also often need to specify restrictions on the domain or range of a given mapping. Such local restrictions are specified as integrity constraints, which we deal with in the next paragraph.13

2.4. Integrity constraints

Classical integrity constraints specify whether a given function-in-intension (i.e. an attribute) must be single-valued or may be multi-valued, and whether it is mandatory or optional. These constraints are analytically necessary. As an example of a cardinality constraint we can adduce the constraint that everybody has just one (biological) mother and father. That each order must concern a customer, a producer/seller and some products is an example of a constraint on a mandatory relation. In addition to these analytical constraints it is useful to specify restrictions on cardinality in the case of multi-valued attributes, on the particular roles of individuals that enter into a given relation, etc. These constraints have the character of nomically necessary constraints given by conventions valid in a given domain. For instance, there can be a constraint valid in a given organization that each exporter can have at most five customers. Regardless of the character of a given domain, we should always specify the degree of necessity of a given integrity constraint. If C →v ο v-constructs the respective condition to be met, the basic kinds of constraints, ordered from the highest to the lowest degree of necessity, are:
a) Analytically necessary rules; these are specified by constructions of the form ∀w∀t C.
b) Nomologically necessary rules; these are specified by constructions of the form λw∀t C.
13 In the terminology of standard ontology languages, the so-called "properties" are actually relations-in-intension with 'slots'. Thus we can speak about 'slot constraints' and about facets that are local slot constraints. See [15].
c) Common rules of 'necessity by convention'; these are specified by constructions of the form λwλt ∀x [C …x…].
To adduce an example, imagine a mobile agent (typically a car) that encounters an obstacle in its way. In order to specify the behaviour of the agent properly, we must take into account the priorities of the particular constraints. First, the agent must take into account analytical constraints, such as that there cannot be two material objects at the same position at the same time. Second, physical laws must be considered; for instance, we must calculate the vehicle's stopping distance, taking into account the speed of the agent as well as that of the obstacle and the direction of their movement. Only then can conventional rules like traffic regulations be considered. If the agent comes to the conclusion that the stopping distance is greater than the distance to the obstacle, then, of course, rules like driving on the right-hand side of the lane or obeying traffic signs cannot be followed. So much for the logic of intensions. In the next section we tackle another important phenomenon that is useful to include in an ontology so that the reasoning of agents can be properly specified, namely two kinds of entailment relation, which can also be viewed as higher-order integrity constraints: presupposition vs. mere entailment.
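Before turning to that, here is a minimal sketch, in Python rather than TIL-Script, of how the three degrees of necessity just listed might be used to prioritise an agent's constraints. The constraint names, predicates and state fields are illustrative assumptions, not the authors' implementation.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Constraint:
    """A named rule with a degree of necessity (lower rank = higher necessity)."""
    name: str
    rank: int                      # 0 = analytic, 1 = nomological, 2 = conventional
    holds: Callable[[dict], bool]  # True if the rule can be respected in `state`

# Hypothetical constraints for the obstacle scenario; the predicates are toy stand-ins.
constraints: List[Constraint] = [
    Constraint("no two objects at the same position", 0,
               lambda s: s["agent_pos"] != s["obstacle_pos"]),
    Constraint("stopping distance within the gap", 1,
               lambda s: s["speed"] ** 2 / (2 * s["deceleration"]) <= s["gap"]),
    Constraint("keep to the right-hand lane", 2,
               lambda s: s["lane"] == "right"),
]

def admissible_rules(state: dict) -> List[str]:
    """Check constraints from the most to the least necessary; once a more
    necessary rule fails, the merely conventional rules no longer bind."""
    satisfied = []
    for c in sorted(constraints, key=lambda c: c.rank):
        if c.holds(state):
            satisfied.append(c.name)
        elif c.rank < 2:          # an analytic or nomological rule is violated
            break                 # conventional rules (traffic code) are set aside
    return satisfied

print(admissible_rules({"agent_pos": (0, 0), "obstacle_pos": (0, 30),
                        "speed": 20.0, "deceleration": 4.0, "gap": 30.0,
                        "lane": "left"}))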
3. Presupposition and entailment

When used in a communicative act, a sentence communicates something (the focus F) about something (the topic T). Thus the schematic structure of a sentence is F(T). The topic T of a sentence S is often associated with a presupposition P of S such that P is entailed both by S and by non-S. On the other hand, the clause in the focus usually triggers a mere entailment of some P by S. Schematically:
(i) S |= P and non-S |= P (P is a presupposition of S); Corollary: if non-P, then neither S nor non-S is true.
(ii) S |= P and neither (non-S |= P) nor (non-S |= non-P) (mere entailment).
More precisely, the entailment relation obtains between hyperpropositions P, S, i.e., the meaning of P is entailed or presupposed by the meaning of S. For the precise definition of entailment and presupposition, see [5], Section 1.5. The phenomenon of topic-focus articulation is associated with the de dicto / de re ambivalence. Consider a pair of sentences differing only in their topic-focus articulation:
(1) The critical situation on the highway D1 was caused by the agent a.
(2) The agent a caused the critical situation on the highway D1.
While (1) not only entails but also presupposes that there is a critical situation on D1, the truth-conditions of (2) are different, as our analysis clarifies. First, (1) as well as (1'),
(1') The critical situation on the highway D1 was not caused by the agent a.
are about the critical situation, and that there is such a situation is not only entailed but also presupposed by both sentences. As we have seen above, the meaning of a sentence is a procedure producing a proposition, i.e. an object of type οτω. Execution of this procedure in any world/time yields a truth-value T, F, or nothing. Thus we can conceive of the sense of a sentence as an
instruction on how to evaluate its truth-conditions in any world/time. The instruction encoded by (1), formulated in logician's English, is this: if there is a critical situation on the highway D1, then return T or F according as the situation was caused by the agent a; else fail (to produce a truth-value). Applying our method of analysis introduced in Section 1, we start by assigning types to the objects that receive mention in the sentence. Simplifying a bit, let the objects be: Crisis/οτω: the proposition that there is a critical situation on the highway D1; Cause/(οιοτω)τω: the relation-in-intension between an individual and a proposition which has been caused to be true by the individual; Agent_a/ι. A schematic analysis of (1) comes down to this procedure:
(1s)
λwλt [if 0Crisiswt then [0Causewt 0Agent_a 0Crisis] else Fail]
So far so good; yet there is a problem of how to analyse the connective if-then-else. There has been much dispute over the semantics of 'if-then-else' among computer scientists. We cannot simply apply material implication, ⊃. For instance, it might seem that the instruction expressed by "If 5=5 then output 1, else output the result of 1 divided by 0" received the analysis [[[05=05] ⊃ [n=01]] ∧ [¬[05=05] ⊃ [n=[0Div 01 00]]]], where n is the output number. But the output of the above procedure should be the number 1, because the else clause is never executed. However, due to the strict principle of compositionality that TIL observes, the above analysis fails to produce anything, the construction being improper. The reason is this. The Composition [0Div 01 00] does not produce anything: it is improper because the division function takes no value at the argument ⟨1, 0⟩. Thus the Composition [n = [0Div 01 00]] is v-improper for any valuation v, because the identity relation = does not receive an argument, and so any other Composition containing the improper Composition [0Div 01 00] as a constituent also comes out v-improper. The underlying principle is that partiality is strictly propagated up. This is the reason why the if-then-else connective is often said to be a non-strict function. However, there is no cogent reason to settle for non-strictness. We suggest applying a mechanism known in computer science as lazy evaluation. The procedural semantics of TIL operates smoothly even at the level of constructions. Thus it enables us to specify a strict definition of if-then-else that meets the compositionality constraint. The analysis of "If P then C, else D" is a procedure that decomposes into two phases. First, on the basis of the condition P →v ο, select one of C, D as the procedure to be executed. Second, execute the selected procedure. The first phase, viz. the selection, is realized by the Composition [0the_only λc [[P ∧ [c=0C]] ∨ [¬P ∧ [c=0D]]]]. The Composition [[P ∧ [c=0C]] ∨ [¬P ∧ [c=0D]]] v-constructs T in two cases. If P v-constructs T, then the variable c receives as its value the construction C, and if P v-constructs F, then the variable c receives the construction D as its value. In either case the set v-constructed by λc [[P ∧ [c=0C]] ∨ [¬P ∧ [c=0D]]] is a singleton. Applying the singulariser the_only to this set returns as its value the only member of the set, i.e., either the construction C or D.
Second, the chosen construction c is executed. As a result, the schematic analysis of “If P then C else D” turns out to be (*)
2[0ι λc [[P ∧ [c=0C]] ∨ [¬P ∧ [c=0D]]]].
Types: P →v ο (the condition of the choice between the execution of C or D); C, D/∗n; variable c →v ∗n; the_only/(∗n(ο∗n)): the singulariser function that associates a singleton set of constructions with the only construction that is an element of this singleton, and which is otherwise (i.e., if the set is empty or many-valued) undefined. Note that we do need a hyperintensional, procedural semantics here. First of all, we need a variable c ranging over constructions. Moreover, the evaluation of the first phase does not involve the execution of the constructions C and D. These constructions are only arguments of other constructions. Returning to the analysis of (1), in our case the condition P is that there be a crisis on the highway D1, i.e., 0Crisiswt. The construction C that is to be executed if P yields T is [0Causewt 0Agent_a 0Crisis], and if P yields F then no construction is to be selected. Thus the analysis of sentence (1) comes down to this Closure:
(1*) λwλt 2[0ι λc [[0Crisiswt ∧ [c = 0[0Causewt 0Agent_a 0Crisis]]] ∨ [¬0Crisiswt ∧ 0F]]]
The evaluation of (1*) in any ⟨w, t⟩-pair depends on whether the presupposition 0Crisiswt is true in ⟨w, t⟩. If true, then the singleton v-constructed by λc [ … ] contains as the only construction the Composition [0Causewt 0Agent_a 0Crisis], which is afterwards executed to return T or F, according as the agent a caused the crisis. If false, then the second disjunct in λc […] comes down to [0T ∧ 0F], and thus we get λc 0F. The v-constructed set is empty. Hence 2[0ι λc 0F] is v-improper, that is, the Double Execution fails to produce a truth-value. To generalise, the analytic schema of a sentence S associated with a presupposition P is a procedure of the form If P then S, else Fail. The corresponding schematic TIL construction is
(**)
λwλt 2[0ι λc [[Pwt ∧ [c=0Swt]] ∨ [¬Pwt ∧ 0F]]].
The truth-conditions of the other reading, i.e. the reading of (2),
(2)
“The agent a caused the critical situation on the highway D1”
are different. Now the sentence (2) is about the agent a (topic), ascribing to a the property of having caused the crisis (focus). Thus a scenario under which (2) is truly denied can be, for instance, this: though it is true that the agent a is known as a hit-and-run driver, this time he behaved well and prevented a critical situation from arising. Or a less optimistic scenario is thinkable: the critical situation on D1 is due not to the agent a's risky driving but to the highway being in a very bad condition. Hence, that there is a crisis is not presupposed by (2), and its analysis is this Closure:
(2*)
λwλt [0Causewt 0Agent_a 0Crisis]
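The different truth-conditions of (1*) and (2*) can be mimicked outside TIL by a three-valued evaluator. The sketch below (Python; the propositions and world/time records are toy stand-ins, not the authors' TIL-Script) implements the two-phase 'select, then execute' strategy: the presupposition only chooses which thunk to run, and a failed presupposition yields no truth-value at all rather than F.

from typing import Callable, Optional

# A (hyper)proposition is modelled as a thunk evaluated at a world/time index;
# returning None plays the role of a truth-value gap (v-improperness).
Prop = Callable[[dict], Optional[bool]]

def if_then_else_fail(presupposition: Prop, focus: Prop) -> Prop:
    """Schema (**): if P holds, evaluate S; else fail (no truth-value)."""
    def evaluate(wt: dict) -> Optional[bool]:
        p = presupposition(wt)
        if p:                        # first phase: select which thunk to execute
            return focus(wt)         # second phase: execute only the selected thunk
        return None                  # presupposition failure: neither True nor False
    return evaluate

# Toy world/time states for the highway example (hypothetical fields).
crisis: Prop = lambda wt: wt["crisis_on_D1"]
caused_by_a: Prop = lambda wt: wt["caused_by_agent_a"]

sentence_1 = if_then_else_fail(crisis, caused_by_a)   # reading (1*): the crisis is presupposed
# reading (2*), crudely: simply false whenever a did not cause a crisis
sentence_2: Prop = lambda wt: wt["crisis_on_D1"] and wt["caused_by_agent_a"]

for wt in ({"crisis_on_D1": True, "caused_by_agent_a": True},
           {"crisis_on_D1": False, "caused_by_agent_a": False}):
    print(sentence_1(wt), sentence_2(wt))
# Prints "True True", then "None False": without a crisis, (1) lacks a truth-value while (2) is false.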
The moral we can extract from these examples is this. Logical analysis cannot disambiguate any sentence, because it presupposes full linguistic competence. Thus we should include in our formal ontology the schematic rules that accompany activities like agents' seeking and finding, causing something, etc. Our fine-grained method can then contribute to language disambiguation by making these hidden features explicit and logically tractable. In case there are several non-equivalent senses of a sentence, we furnish the sentence with different TIL constructions. If an agent receives an ambiguous message, he or she can respond by asking for disambiguation. Having a formal, fine-grained encoding of a sense, the agent can then infer the relevant consequences.
4. Conclusion

The theoretical specification of particular rules is only the first step. When making these features explicit we keep in mind automatic deduction that will make use of these rules. To this end we are currently developing a computational, FIPA-compliant variant of TIL, the functional programming language TIL-Script (see [3]). The direction of further research is clear. We are going to continue the development of the TIL-Script language into its full-fledged version, equivalent to the TIL calculus. The development of TIL-Script is still work in progress, in particular the implementation of its inference machine. From the theoretical point of view, the calculus and the rules of inference have been specified in [5], Sections 2.6 and 2.7, yet their full implementation is a subject of further research. Currently we proceed in stages. First we implemented a method that decides a subset of the TIL-Script language computable by Prolog (see [2]). This subset has now been extended to a subset equivalent to standard FOL. For ontology building we combine traditional tools and languages like OWL (Web Ontology Language) with TIL-Script. We developed an extension of the Protégé-OWL editor so as to create an interface between OWL and TIL-Script. The whole method has been tested within the project 'Logic and Artificial Intelligence for Multi-Agent Systems' (see http://labis.vsb.cz/) using a traffic system as a case study. The sample test contained five mobile agents (cars), three car parks and a GIS agent. The GIS agent provided the mobile agents with 'visibility', i.e., the coordinates of the objects within their visibility range. All the agents communicated in TIL-Script and started with minimal (but not overlapping) ontologies. During the test they learned new concepts and enriched their ontologies in order to be able to meet their goals. The agents' goal was to find a vacant parking lot and park the car. All the agents succeeded and parked within a few seconds, which proved that the method is applicable and usable not only as an interesting theory but also in practice.
Acknowledgements. This research has been supported by the Grant Agency of the Czech Republic, projects No. 401/09/H007 ‘Logical Foundations of Semantics’ and 401/10/0792, ‘Temporal aspects of knowledge and information’, and by the internal grant agency of FEECS VSB-Technical University Ostrava, project No. IGA 22/2009, ‘Modeling, simulation and verification of software processes’.
References
[1] Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., and Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation and Application. Cambridge University Press, 2002.
[2] Číhalová, M., Ciprich, N., Duží, M., Menšík, M. (2009): Agents' reasoning using TIL-Script and Prolog. In 19th Information Modelling and Knowledge Bases, eds. T. Tokuda, Y. Kiyoki, H. Jaakkola, T. Welzer Družovec. Maribor, Slovenia: University of Maribor, 137-156.
[3] Ciprich, N., Duží, M. and Košinár, M.: The TIL-Script language. In Kiyoki, Y., Tokuda, T. (eds.): EJC 2008, Tsukuba, Japan, 2008, pp. 167-182.
[4] Duží, M., Materna, P. (2009): Concepts and Ontologies. In Information Modelling and Knowledge Bases XX, Y. Kiyoki, T. Tokuda, H. Jaakkola, X. Chen, N. Yoshida (eds.), Amsterdam: IOS Press, pp. 45-64.
[5] Duží, M., Jespersen, B. and Materna, P.: Procedural Semantics for Hyperintensional Logic; Foundations and Applications of Transparent Intensional Logic. Springer: series Logic, Epistemology, and the Unity of Science, Vol. 17, 2010, ISBN 978-90-481-8811-6.
[6] Gordon, M.J.C. and Melham, T.F. (eds.) 1993: Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge: Cambridge University Press.
[7] Hayes, P., Menzel, C., 2001: Semantics of knowledge interchange format. In: IJCAI 2001 Workshop on the IEEE Standard Upper Ontology.
[8] Horrocks, I. and Patel-Schneider, P.F. 2003: Three Theses of Representation in the Semantic Web. WWW2003, May 20-24, Budapest, Hungary, 2003 (retrieved 10.1.2005).
[9] Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B. and Dean, M. 2004: SWRL: A Semantic Web Rule Language Combining OWL and RuleML. W3C Member Submission, May 2004 (retrieved 10.1.2010).
[10] Jespersen, B. (2008): 'Predication and extensionalization'. Journal of Philosophical Logic, vol. 37, 479-499.
[11] Kifer, M., Lausen, G., and Wu, J.: Logical foundations of object-oriented and frame-based languages. Journal of the ACM, 42(4):741-843, 1995.
[12] Materna, P. and Duží, M. (2005): 'The Parmenides principle', Philosophia, 32, 155-180.
[13] Paulson, L.C. 1994: Isabelle: A Generic Theorem Prover. Number 828 in LNCS. Berlin: Springer.
[14] Sowa, J.F.: Knowledge Representation. Logical, Philosophical, and Computational Foundations. Brooks/Cole, 2000.
[15] Svátek, V.: Ontologie a WWW. Available at http://nb.vse.cz/~svatek/onto-www.pdf.
[16] Tichý, P. 1987: Individuals and their roles (in German; in Slovak in 1994). Reprinted in (Tichý 2004: 710-748).
[17] Tichý, P. 1995: Constructions as the subject-matter of mathematics. In The Foundational Debate: Complexity and Constructivity in Mathematics and Physics, eds. W. DePauli-Schimanovich, E. Köhler and F. Stadler, 175-185. Dordrecht, Boston, London, and Vienna: Kluwer. Reprinted in (Tichý 2004: 873-885).
[18] Tichý, P. 2004: Collected Papers in Logic and Philosophy, eds. V. Svoboda, B. Jespersen, C. Cheyne. Prague: Filosofia, Czech Academy of Sciences, and Dunedin: University of Otago Press.
[19] W3C 2004: The World Wide Web Consortium: OWL Web Ontology Language Overview, W3C Recommendation 10 February 2004 (retrieved 10.1.2010).
A Three-layered Architecture for Event-centric Interconnections among Heterogeneous Data Repositories and its Application to Space Weather

Takafumi NAKANISHI a, Hidenori HOMMA a, Kyoung-Sook KIM a, Koji ZETTSU a, Yutaka KIDAWARA a and Yasushi KIYOKI a,b

a National Institute of Information and Communications Technology (NICT), Japan
b Keio University, Japan
Abstract. Various knowledge resources are spread over a world-wide scope. Unfortunately, most of them are community-based and were never intended to be used by different communities. That makes it difficult to gain "connection merits" in a web-scale information space. This paper presents a three-layered system architecture for computing dynamic associations of events to related knowledge resources. The important feature of our system is that it realizes dynamic interconnection among heterogeneous knowledge resources by event-driven and event-centric computing with resolvers for the uncertainties existing among those resources. This system navigates various associated data, including heterogeneous data-types and fields, depending on the user's purpose and standpoint. It also leads to effective use of sensor data, because sensor data can be interconnected with those knowledge resources. This paper also presents an application to space weather sensor data.

Keywords. Event-centric interconnections, heterogeneous data repositories, three-layered architecture, uncertainties for interrelationships, space weather sensor data
Introduction

A wide variety of knowledge resources are spread over a worldwide scope via the Internet and the WWW. Most knowledge resources are produced in a community-based manner, and they are not shared and used well among different communities. In fact, most data repositories are constructed and used independently within a local community. It is difficult for users to interconnect these widely distributed data according to their purposes, tasks, or interests. That makes it difficult to gain "connection merits" in a web-scale information space. The difficulty in retrieving and interconnecting various knowledge resources arises because of heterogeneities of data-types, contents and utilization objectives. Recently, various sensor data resources have also been created and spread widely around the world. It is becoming very important to find out how to utilize them in related applications. For specialists in fields different from the community sharing the sensor data, it is difficult to use those data effectively because their usage and definitions are not clearly recognized. Each research community focuses on the sensors for its own research
purposes, which depend on the community. In the current state, most sensor data are not used widely and effectively, because each research community installs sensors for its own research purposes. It is necessary to share the sensor data together with information on the purpose of use and the background knowledge. For users in other fields, it is difficult to understand how the sensor data are related to their lives and what the sensor data mean. Generally, sensor data are expressed as an enumeration of numerical values with domain-specific formatting. To make it possible for specialists in other domains to utilize those data, it is important to show what the sensor data mean and what influence they have. Methods for directly annotating and connecting the sensor data might be expected; however, this is too hard and complex, because the interpretation and utilization of sensor data differ according to the user's background knowledge and purposes. It is therefore important to realize interconnection mechanisms for sensor data that depend on the user's background knowledge and purpose. Currently, we have organized joint research with the Space Environment Group of NICT to solve how to share sensor data related to the space weather field. The aim of this research is to create new applications of space-weather sensor data by combining the related knowledge resources. The Space Environment Group of NICT is delivering, by RSS [1], sensor data of solar activities and the space environment, which is called space weather. Space weather refers to conditions on the Sun and in the solar wind, magnetosphere, ionosphere, and thermosphere. These can endanger human life or health by affecting the performance and reliability of space-borne and ground-based man-made systems [2], causing communication failures, damage to electronic devices of space satellites, bombing, etc. The group is delivering these data so that various users may use them. In our current global environment, it is important to transmit significant knowledge to actual users from various data resources. In fact, most events affect various aspects of other areas, fields and communities. For example, in the case of space weather, sensor data representing an abnormality of the Dst index, which is one of the space weather indices related to geomagnetic storm events, and news articles on the interruption of the relay broadcast of the XVI Olympic Winter Games are interrelated in the context of "watching TV". The Dst index and those news articles are individually published by different communities. In order to understand a concept in its entirety from the user's standpoint, a user would need to know the various interrelationships between data in interdisciplinary fields. By using only existing search engines, however, it is difficult to find various data resources in interdisciplinary fields. Moreover, the interconnections will change over time. In order to manage ever-changing interrelations among a wide variety of data repositories, it is important to realize an approach for discovering "event-centric interrelations" of various types of data in different communities, depending on the user's standpoint. In this paper, we present a three-layered system architecture for computing dynamic associations of events in nature to related knowledge resources. The important feature of our system is to realize dynamic interconnection among heterogeneous data resources by event-driven and event-centric computing with resolvers for the uncertainties existing among those resources.
This realizes interconnection indirectly and dynamically by means of semantic units for data of various types, such as text data, multimedia data, sensor data, etc. In other words, it navigates various appropriate data, including data of heterogeneous data-types and fields, depending on the user's purpose and standpoint. In addition, it leads to effective use of sensor data, because
the sensor data are interconnected with various other data. We also propose a three-layer data structure for representing semantic units extracted from all types of data. The data structure represents semantic units depending on a constraint in each layer. By this data structure, we can compute interconnections between heterogeneous data in terms of semantic units. Actually, we consider that it is difficult to construct only static basic interrelationships that are acceptable in all cases. It is effective to provide the interrelationships corresponding to the user's standpoint dynamically. The essence of our system is to dynamically select, integrate and operate various appropriate content resources in a distributed environment. We define constraints in each layer of the three-layer data structure for semantic units – event, occurrence and scene. Therefore, our framework is important and effective for interconnecting distributed heterogeneous data resources. This paper is organized as follows. In section 1, we present a three-layer data structure for interconnection. In section 2, we present an overview of interconnection for heterogeneous content repositories. In sections 3, 4, and 5, we describe the detailed data structures and operations of an event, an occurrence, and a scene. In section 6, we describe an implementation example applied to space weather data. In section 7, we describe related works. Finally, in section 8, we conclude this paper.
1. Three-layer Data Structure for Interconnection

In this section, we present a three-layer data structure for realizing event-centric interconnection of heterogeneous data repositories. Currently, relationships between data are represented as static links. We consider that there are limits to uniquely representing global, static interrelationships, because interrelationships keep changing with various factors such as spatiotemporal conditions, background fields and situations. Of course, interrelations that everyone accepts may exist, too. However, it is important to represent interrelationships dynamically, depending on an arbitrary situation. It is difficult to represent a unique, global interrelationship because of its uncertainties. We define constraints for reducing these uncertainties and design a method for representing various interrelationships. In section 1.1, we describe the uncertainties of interrelationships between heterogeneous data. In section 1.2, we define a three-layer data structure for interrelationships that takes these uncertainties into account. Furthermore, in section 1.3, we consider why we apply interconnection rather than integration from the standpoint of the three uncertainties.

1.1. Uncertainties of Interrelationships between Heterogeneous Data

Generally, it is difficult to represent static interrelationships between heterogeneous data because of uncertainties. Nevertheless, most current systems utilize static link representations. They implicitly have limitations on interconnection, such as limitations of domain, data-type and field. For realizing interconnection between heterogeneous data, we have to clarify the sources of uncertainty. There are three uncertainties in the interrelationships between heterogeneous data, as follows:
(1) Which part of the data to focus on.
It is necessary to extract a metadata set as a semantic unit from the target data in order to handle heterogeneous data. In this case, the extracted semantic unit depends on which part of the data we focus on. For example, assume that a semantic unit is to be extracted from precipitation sensor data. If we focus on the periods when precipitation is zero, we can detect a semantic unit that represents fine or cloudy weather. If we focus on the periods when precipitation is higher than a threshold, we can detect a semantic unit that represents heavy rain. Different semantic units can be extracted from the same data source by changing the constraint. That is, it is important to clarify the focus point of the data as a constraint.
(2) From what standpoint to interpret the data.
The interpretation of each extracted semantic unit changes with the user's background knowledge, standpoint, etc. For example, suppose there are a disaster ontology and a climate-change ontology. When the heavy-rain semantic unit is mapped to the disaster ontology, the event will be semantically arranged close to swollen rivers, traffic damage, etc. When the same heavy-rain semantic unit is mapped to the climate-change ontology, the event will be semantically arranged close to global warming. This example shows that various interpretations of a semantic unit are possible by changing the constraint. That is, it is important to clarify, as a constraint, from what standpoint the data are interpreted.
(3) From what standpoint to interrelate the data.
The interrelationship of each extracted semantic unit also changes with the user's background knowledge, standpoint, etc. Actually, most interconnections depend on a situation. In such a case, we should represent the interrelationship according to the situation. That is, it is important to clarify, as a constraint, from what standpoint the data are interrelated.
We consider that we can uniquely represent an interconnection under the constraints if we apply constraints that exclude the three above-mentioned uncertainties. Therefore, it is important to design a data structure for defining the constraints that deal with the three uncertainties.

1.2. Three-layer Data Structure—Event, Occurrence and Scene

For representing interrelationships between heterogeneous data with these three uncertainties, we realize event-centric interconnection of heterogeneous data. It is necessary to design a new data structure for resolving the uncertainties. In this section, we design a new three-layer data structure for interconnection of heterogeneous data. The data structure consists of three layers based on the three uncertainties. By this data structure, we can represent interconnections between heterogeneous data depending on the user's purpose and standpoint. The data structure consists of three data-types, one in each layer – event, occurrence and scene. Figure 1 shows an overview of the data structure and its layers. Each data-type has a constraint – condition, context or viewpoint.

Figure 1. Overview of three-layer data structure for interconnection. The data structure consists of event, occurrence, and scene. There are three types of constraint – condition, context and viewpoint – for avoiding the uncertainties.

• Event: An event is a minimum semantic unit extracted from delivered target data. An event consists of a set of various metadata that represent its features. For detecting an event from target data, we have to determine a constraint. The constraint for event detection is called a condition. The condition represents which part of the target data to focus on. In other words, the condition is a constraint that represents how to summarize the target data and how to compose an event. Various events can be detected from the same target data by setting various conditions. That is, this solves uncertainty (1) shown in section 1.1. The event also carries its condition. By turning various different kinds of data resources into events, it becomes possible to process them in a unified way.
• Occurrence: An occurrence is an event projected according to a constraint that is called a context. The interpretation of an event differs according to the standpoint, the background knowledge, etc. The context is a constraint for uniquely fixing the interpretation of an event, such as the user's standpoint, background knowledge, etc. An occurrence is the projection of an event along a context. That is, the context solves uncertainty (2) shown in section 1.1. By the context, we can specify the meaning of an event. Conversely, various occurrences can be composed from the same event by setting various contexts. An occurrence consists of the metadata projected with the context.
• Scene: A scene is a set of relationships between occurrences according to a constraint that is called a viewpoint. The interconnection of occurrences differs according to the standpoint, the background knowledge, etc. The viewpoint is a constraint for uniquely fixing the interconnection of occurrences, such as the user's standpoint, background knowledge, etc. That is, the viewpoint solves uncertainty (3) shown in section 1.1. By the viewpoint, we can specify the interconnection. Conversely, various scenes can be composed from the same occurrences by setting various viewpoints.
The various interconnections between heterogeneous data can be represented by this three-layer data structure. For representing an interconnection between heterogeneous data, events are detected from target data according to a condition; occurrences are constructed by projecting events according to a context; and scenes
are constructed by interconnecting occurrences according to a viewpoint. The interconnection of heterogeneous data under the three constraints – condition, context and viewpoint – can be found by tracing this data structure backwards according to the three constraints.

1.3. Integration or Interconnection

Generally, techniques for arranging two or more resources include integration and interconnection. In this section, we consider whether integration or interconnection is effective in this case. Table 1 shows a summary of the general features of integration and interconnection.

Table 1. Summary of integration and interconnection

For realizing an integration technique, we have to reconstruct the whole system in most cases, because it is necessary to consolidate systems that are distributed. However, an integration technique provides efficient computation for arranging two or more resources. An integration technique can handle static, usual interrelationships quickly. Conversely, it cannot be applied to the arrangement of various dynamic relationships. On the other hand, it is easy to implement an interconnection technique in most cases, because it can make the best use of existing systems. However, the computational complexity tends to increase. It is better to apply an integration technique rather than an interconnection technique to arrange static, usual interrelationships, because of the computational complexity involved. An interconnection technique can be applied to the arrangement of various dynamic interrelationships. In this paper, we focus on interrelationships of heterogeneous data. It is difficult to represent static interrelationships between heterogeneous data because of the uncertainties shown in section 1.1. Under this assumption, we should present a method for representing various interrelationships that change dynamically depending on various constraints, by avoiding these uncertainties. Interconnection can realize such an environment. Recently, a lot of data repositories and resources have spread widely over the Internet. It is difficult to integrate these environments. Of course, it is not impossible to construct an integration system for a part of them. From the standpoint of extendibility, however, it is reasonable to apply interconnection to this environment, which grows every day. An interconnection can be applied without changing the arrangement of the resources in the distributed environment. Actually, effective use of the heterogeneous data repositories scattered in the distributed environment is becoming important. In this case, we also have to take care of the three uncertainties for interrelationships. In the case of the space weather sensor data delivered by the Space
Environment Group of NICT, we are grappling with a similar issue. They also require representing various relationships between their space weather sensor data and other data. Furthermore, we are working on "knowledge cluster systems" for knowledge sharing, analysis, and delivery among remote knowledge sites on a knowledge grid [3]. In this environment, we have constructed and allocated over 400 knowledge bases to the sites. One of the important issues in this environment is how to arrange and interrelate these knowledge bases. We have proposed a viewpoint-dependent interconnection method for knowledge bases that focuses on the concept words in each knowledge base [4]. In this case, interconnection is applied in order to arrange the knowledge bases while maintaining a distributed environment. Therefore, in order to compute interrelations among various resources in a distributed environment, it is important to realize an interconnection mechanism that depends on constraints for avoiding the uncertainties.
2. Overview of Interconnection for Heterogeneous Content Repositories

In this section, we give an overview of the event-centric interconnection of heterogeneous content repositories. This is a model for interconnecting interdisciplinary data resources in a distributed environment, depending on constraints for avoiding the uncertainties shown in section 1. In today's global environment, it is important to transmit significant knowledge to actual users from various data resources. In order to realize this, it is important to interrelate data resources depending on constraints for avoiding the uncertainties. This framework realizes interconnection indirectly and dynamically for data of various types, such as text data, multimedia data, sensor data, etc. That is, it helps a user to obtain various appropriate data, including data of heterogeneous data-types and fields, depending on the user's purpose and standpoint. The overview of event-centric interconnection for heterogeneous content repositories is shown in Figure 2. For realizing the framework, there are four modules – an event detection module, an event projection module, a correlation analysis module and a codifier module.

Figure 2. The overview of an event-centric interconnection for heterogeneous content repositories. This method consists of four modules—event detection module, event projection module, correlation analysis module, and codifier module.

• Event detection module: An event detection module extracts the events described in section 1.2 from target data depending on a condition. The condition is a kind of constraint for avoiding an uncertainty shown in section 1. The event detection module can compose various events from the same target data by setting various conditions. The diversity of the data itself, which is one of the uncertainties when an event is extracted, is avoided by a condition. The input of the module is target data; it must be set up for each data repository. The output of the module is the set of extracted events. By turning various heterogeneous data resources into events, it becomes possible to process them in a unified way.
• Event projection module: An event projection module projects a detected event depending on a context. We call a projected event an occurrence, as described in section 1.2. The projection process corresponds to the interpretation of the event according to the context. For example, assume that an event detection module extracts a heavy-rain event from article data and that there are a disaster ontology and a climate-change ontology. When the context is disaster, the heavy-rain event will be projected into the disaster ontology, constructing a new occurrence. The occurrence will be semantically arranged close to swollen rivers, traffic damage, etc. When the context is climate change, the heavy-rain event will be projected into the climate-change ontology, constructing a new occurrence that will be semantically arranged close to global warming. In these two cases, the event projection module projects the thematic metadata described in the heavy-rain event into each ontology as a new occurrence. When the context is a spatiotemporal constraint, a new occurrence may be constructed from the heavy-rain event as a shape that represents a spatiotemporal region on a 3D axis (latitude, longitude, and time). In this case, the event projection module projects the spatiotemporal metadata described in the heavy-rain event into a 3D shape as a new occurrence. An event projection module can compose various occurrences from the same event by setting various contexts. An occurrence consists of the metadata projected with the context.
• Correlation analysis module: A correlation analysis module interconnects occurrences depending on a viewpoint, based on computing correlations. We call a set of interconnections between occurrences a scene, as described in section 1.2. The interconnection of occurrences differs according to the standpoint, the background knowledge, etc. The viewpoint is a constraint for uniquely fixing the interconnection of occurrences, such as the user's standpoint, background knowledge, etc. By the viewpoint, we can specify the interconnection. Conversely, a correlation analysis module can compose various scenes from the same occurrences by setting various viewpoints. This module can indirectly interconnect heterogeneous data by utilizing occurrences.
• Codifier module: A codifier module arranges and organizes the scenes extracted by a correlation analysis module. The interconnection of heterogeneous data in
the three constraints – condition, context and viewpoint – can be found by tracing this data structure backwards according to the three constraints.
The process of event-centric interconnection of heterogeneous content repositories is as follows:
Step 1. Detecting events from heterogeneous data. An event detection module extracts an event from target data along an event class database, in which event models and their conditions are stored. This step produces semantic units as events, i.e., a unified data-type derived from various data. By this step, it becomes possible to process various heterogeneous data resources in a unified way by turning them into events.
Step 2. Projecting events as occurrences. An event projection module projects a detected event along an occurrence class database, in which occurrence models and their contexts are stored. This step produces projected events as occurrences. An event projection module can compose various occurrences by setting various contexts. An occurrence is an event interpreted by the context through projection. Therefore, for representing various interconnections, this step should produce various occurrences from the same event.
Step 3. Interconnecting occurrences as scenes. A correlation analysis module interconnects occurrences depending on a viewpoint along a scene class database, in which scene models and their viewpoints are stored. This step produces sets of interconnections of occurrences as scenes. This step can compose various scenes from the same occurrences by setting various viewpoints. These sets can indirectly interconnect the heterogeneous data represented in the interconnected occurrences.
Step 4. Providing organized scenes as event-centric interrelationships between heterogeneous data. A codifier module arranges and organizes the scenes extracted by a correlation analysis module. When a user gives queries representing a condition, a context and a viewpoint, this step provides an appropriate scene set dynamically.
By this process, a user obtains interconnections between heterogeneous data depending on the three constraints for avoiding the uncertainties. Figure 3 shows the three important operations for the representation of interrelationships between heterogeneous data: detection, projection and interconnection. Each operation has a constraint—condition, context, and viewpoint. From the viewpoint of the target data, it is possible to expand the various interconnections of the target data by these constraints. Conversely, from the viewpoint of a user, it is possible to narrow the candidate interconnections of the target data by these constraints.

Figure 3. Three important operations for representation of interrelationships between heterogeneous data—detection, projection and interconnection—and the data structure.

The computation result of this process can represent relationships between heterogeneous data by expressing scene data in RDF, etc. With regard to each step, any method is acceptable. Please note that this process dynamically represents interrelationships between heterogeneous data depending on a condition, a context and a viewpoint. Conversely, by this process, we can find the constraints under which the interrelationships hold (e.g. which data, which part of the data, from what standpoint the data are interpreted, and from what standpoint they are interrelated). This process dynamically represents various interconnections with the condition, context, and viewpoint. That is, it helps a user to obtain various appropriate data, including data of heterogeneous data-types and heterogeneous fields, depending on the user's purpose and standpoint, while supporting the user's understanding.
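As a rough, non-authoritative illustration of how the four modules and Steps 1-4 might be wired together, the following Python sketch uses plain dictionaries and stub logic; the function names, field names and threshold are assumptions made for the example, not the actual implementation.

from typing import Dict, List

def detect_events(target_data: List[float], condition: Dict) -> List[Dict]:
    """Event detection module (Step 1): apply a condition to raw sensor values."""
    return [{"eventLabel": condition["label"], "value": v, "condition": condition}
            for v in target_data if v >= condition["threshold"]]

def project_event(event: Dict, context: str) -> Dict:
    """Event projection module (Step 2): interpret an event under a context."""
    return {"occurrenceLabel": f'{event["eventLabel"]} ({context})',
            "eventSource": event, "context": context}

def interconnect(occurrences: List[Dict], viewpoint: str) -> List[Dict]:
    """Correlation analysis module (Step 3): a stub that links every ordered pair."""
    return [{"fromOccurrence": a, "toOccurrence": b, "viewpoint": viewpoint}
            for a in occurrences for b in occurrences if a is not b]

def codify(scenes: List[Dict]) -> List[Dict]:
    """Codifier module (Step 4): organize scenes for the user (here a pass-through)."""
    return scenes

# Toy run: one precipitation stream, one condition, one context, one viewpoint.
events = detect_events([0.0, 32.5, 48.0],
                       {"label": "heavy rain", "threshold": 30.0})
occurrences = [project_event(e, "disaster") for e in events]
print(codify(interconnect(occurrences, "traffic damage")))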
3. Event—Detection

Figure 4. Overview of an event and its condition. An event is extracted from target data depending on an event model including a condition. An event consists of a basic attribute (e.g. event label), feature attributes (e.g. date, place, keywords), and origin attributes (e.g. event type, source and condition).

Figure 4 shows an overview of event detection. An event is extracted from target data by an event model and its condition stored in the event class database shown in Figure 2. An event consists of seven attributes, as follows:
event = <eventLabel, eventType, date, place, keywords, source, condition>,
where eventLabel is the name of the event; eventType is the kind of the event and indicates which event model it belongs to; date denotes temporal annotations; place denotes spatial annotations; keywords represent thematic annotations; source is the URI of the source data; and condition represents the condition expression used for the event detection. Please note that not only each detected event but also each event model stored in the event class database shown in Figure 2 has the same seven attributes. These event models are used as basic patterns when the events are extracted. The attributes are roughly divided into the basic attribute (eventLabel), which represents basic information, the feature attributes (date, place, keywords), which represent the features of the event, and the origin attributes (eventType, source, condition), which represent how the event was extracted. That is, apart from the basic attribute, an event consists of two types of attribute—feature attributes and origin attributes. The feature attributes are used for interconnecting the target data represented by the event. The origin attributes are used to navigate back to the source data and to represent the reason for the extraction. Furthermore, each attribute is permitted to have two or more elements. The elements given to each attribute are roughly classified into two types—inheritance elements and data-dependence elements. An inheritance element is an element decided by the event model: events extracted by the same event model have the same inheritance elements. These elements are called inheritance elements because they are inherited from the model; they represent the features of the event type. A data-dependence element is extracted from the target data itself. Elements of this type change depending on the target data, even if the events are extracted by the same event model; they represent the features of the individual event. An event is detected from target data by using the condition in each event model; some elements of each attribute are inherited from the event model, and other elements are extracted from the target data. By this process, it becomes possible to process various heterogeneous data resources in a unified way by extracting minimum semantic units as events.
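A minimal sketch of the seven-attribute event record and of threshold-style detection is given below (Python); the field values, the AMeDAS-like reading format and the condition string are illustrative assumptions, not the actual event models.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    # basic attribute
    eventLabel: str
    # feature attributes (used for interconnection)
    date: List[str] = field(default_factory=list)
    place: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    # origin attributes (used to navigate back to the source)
    eventType: str = ""
    source: str = ""
    condition: str = ""

def detect_heavy_rain(readings: List[dict], threshold_mm: float) -> List[Event]:
    """Detect 'heavy rain' events from precipitation readings; the inherited
    elements (label, type, keywords) come from the event model, while the
    data-dependence elements (date, place, source) come from each reading."""
    return [Event(eventLabel="heavy rain",
                  date=[r["time"]], place=[r["station"]],
                  keywords=["precipitation", "rain"],
                  eventType="SensorEvent",
                  source=r["uri"],
                  condition=f"precipitation > {threshold_mm} mm/h")
            for r in readings if r["mm_per_hour"] > threshold_mm]

events = detect_heavy_rain(
    [{"time": "2010-02-16T09:00", "station": "Tsukuba", "uri": "sensor://amedas/1", "mm_per_hour": 42.0},
     {"time": "2010-02-16T10:00", "station": "Tsukuba", "uri": "sensor://amedas/1", "mm_per_hour": 0.0}],
    threshold_mm=30.0)
print(events[0])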
4. Occurrence—Projection

Figure 5. Overview of occurrences and their contexts. An occurrence is a projected event depending on a context.

Figure 5 shows an overview of the projection of an event as occurrences. An occurrence is an event projected by an occurrence model, including its context, stored in the occurrence class database shown in Figure 2. The occurrence model represents how to project events in each context. An occurrence is represented as follows:
occurrence = <occurrenceLabel, occurrenceType, eventSource, context, attri1', …, attrin'>,
where occurrenceLabel is the name of the occurrence; occurrenceType is the kind of the occurrence and indicates which occurrence model it belongs to; eventSource is the URI of the target event data; context represents the context expression used for projecting the event as the occurrence; and each attrii' represents a feature attribute projected depending on the context. As with an event, an occurrence has three types of attributes—a basic attribute (occurrenceLabel), feature attributes (attrii'), and origin attributes (occurrenceType, eventSource, context). Please note that the feature attribute set foccurrence of an occurrence changes depending on the occurrence model, which includes a context, Pcontext:
foccurrence = (attri1', attri2', …, attrin') = Pcontext(fevent), fevent = (attri1, attri2, …, attrim),
where attrij is a feature attribute of an event, attrii' is a feature attribute of an occurrence, and Pcontext is an occurrence model with a context. That is, an occurrence model Pcontext projects the event feature attributes attrij to the occurrence feature attributes attrii'. Various occurrences can be composed from the same event by setting various occurrence models with contexts. Composing various occurrences by using various occurrence models depending on the context means that various interpretations of an event are introduced. Therefore, for representing various interconnections, various occurrences should be produced from the same event. When this data structure is applied to the system, the interpretation of an event can be uniquely fixed by a context that represents the user's standpoint, background knowledge, etc. We specify the meaning of an event by an occurrence.
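The projection foccurrence = Pcontext(fevent) can be sketched as a context-indexed mapping over an event's feature attributes (Python); the two toy 'ontologies' and their concept lists are assumptions made for illustration, not the actual occurrence models.

from typing import Dict, List

# Hypothetical occurrence models: each context maps an event keyword to the
# concepts the resulting occurrence is arranged close to in that ontology.
OCCURRENCE_MODELS: Dict[str, Dict[str, List[str]]] = {
    "disaster":       {"rain": ["swollen river", "traffic damage"]},
    "climate change": {"rain": ["global warming"]},
}

def project(event: Dict, context: str) -> Dict:
    """P_context: map the event's feature attributes into the chosen context."""
    model = OCCURRENCE_MODELS[context]
    projected = [concept
                 for kw in event["keywords"]
                 for concept in model.get(kw, [])]
    return {"occurrenceLabel": f'{event["eventLabel"]} ({context})',
            "occurrenceType": "ProjectedEvent",
            "eventSource": event["source"],
            "context": context,
            "attributes": projected}

heavy_rain = {"eventLabel": "heavy rain", "keywords": ["rain"], "source": "sensor://amedas/1"}
print(project(heavy_rain, "disaster"))
print(project(heavy_rain, "climate change"))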
5. Scene—Interconnection

Figure 6. Overview of a scene and its viewpoint. A scene is a record including an interrelationship between occurrences.

Figure 6 shows an overview of a scene. A scene is a record including an interrelationship of occurrences produced by a scene model, including its viewpoint, stored in the scene class database shown in Figure 2. A scene is represented as follows:
scene = <sceneLabel, sceneType, interrelationship, viewpoint>,
interrelationship = <fromOccurrenceURI, toOccurrenceURI>,
where sceneLabel is the name of the scene; sceneType is the kind of the scene and indicates which scene model it belongs to; interrelationship denotes an interrelationship between occurrences; and viewpoint represents the viewpoint expression used for interconnecting the occurrences as the scene. The interrelationship refers to two kinds of occurrences: it consists of fromOccurrenceURI, which refers to the cause occurrence of the relationship, and toOccurrenceURI, which refers to the effect occurrence of the relationship. Please note that not only each scene but also each scene model stored in the scene class database shown in Figure 2 has the same attribute set. These scene models are used as basic patterns when the occurrences are interconnected by correlation analysis. Various scenes can be composed from the same occurrences by setting various viewpoints. This data set can indirectly interconnect the heterogeneous data represented in the interconnected occurrences. When this data structure is applied to the system, the interrelationships of an occurrence can be uniquely fixed by a viewpoint that represents the user's standpoint, background knowledge, etc. We specify the interconnection of occurrences by a scene. This process dynamically represents interrelationships between heterogeneous data depending on a viewpoint. Conversely, we can find the viewpoints under which the interrelationships hold.
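A scene record and a toy viewpoint-dependent interconnection might look as follows (Python); the shared-term test is a deliberately naive stand-in for the correlation analysis, and the two occurrences, their URIs and attribute lists are assumptions echoing the space weather example from the introduction.

from typing import Dict, List

def interconnect(occurrences: List[Dict], viewpoint: str) -> List[Dict]:
    """Build scene records <sceneLabel, sceneType, interrelationship, viewpoint>
    for every ordered pair of occurrences that both mention the viewpoint term."""
    scenes = []
    for cause in occurrences:
        for effect in occurrences:
            if cause is effect:
                continue
            if viewpoint in cause["attributes"] and viewpoint in effect["attributes"]:
                scenes.append({
                    "sceneLabel": f'{cause["occurrenceLabel"]} -> {effect["occurrenceLabel"]}',
                    "sceneType": "CorrelationScene",
                    "interrelationship": {"fromOccurrenceURI": cause["uri"],
                                          "toOccurrenceURI": effect["uri"]},
                    "viewpoint": viewpoint,
                })
    return scenes

geomagnetic_storm = {"occurrenceLabel": "Dst abnormality (broadcasting)",
                     "attributes": ["satellite link", "watching TV"],
                     "uri": "occ://dst-abnormality"}
broadcast_failure = {"occurrenceLabel": "relay broadcast interruption (broadcasting)",
                     "attributes": ["watching TV", "Olympic Games"],
                     "uri": "occ://news-article"}
print(interconnect([geomagnetic_storm, broadcast_failure], viewpoint="watching TV"))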
6. Implementation Example—Application to Space Weather

Figure 7. An implementation of the interconnection of heterogeneous content repositories applied to space weather data.

Figure 7 shows an implementation of the interconnection of heterogeneous content repositories applied to space weather data as an example. Currently, we are working jointly with the Space Environment Group of NICT. The Space Environment Group of NICT delivers, by RSS, sensor data of solar activities and the space environment, which is called space weather. One of the important problems is finding effective uses of the space weather data. One effective use is to show how the events that these sensors represent influence our everyday life. For realizing it, we are developing an interconnection method for space weather sensor data and other data, such as meteorological sensor data, general newspaper articles, etc., by using the three-layered architecture. This means that the system bridges the gap between general facts, such as events in our everyday life, and concepts in a specific field, such as space weather sensor data. As shown in Figure 7, the system consists of event extraction modules, a correlation analysis management module, correlation analysis modules and a codifier module.
• Event extraction modules: Each event extraction module detects events from its data, such as news article data, meteorological sensor data – AMeDAS (Automated Meteorological Data Acquisition System) data provided by the Japan Meteorological Agency – space weather sensor data, etc. These modules produce the semantic units described in section 3, i.e., a unified data-type derived from various data, as events.
• Correlation analysis management module: A correlation analysis management module has two operations. One is the projection of each detected event to the correlation analysis modules as occurrences; an occurrence is an event interpreted by the context through projection. The other is the organization of the correlation analysis modules. In this system, various types of correlation analysis modules provide various scenes that represent interrelationships between occurrences (projected events). The correlation analysis management module organizes these data; that is, this module is the input/output interface for the correlation analysis modules.
• Correlation analysis modules: A correlation analysis module interconnects occurrences depending on a viewpoint. In this system, we are developing two types of correlation analysis modules—a spatiotemporal correlation analysis module and a semantic correlation analysis module.
  – Spatiotemporal correlation analysis module:
A spatiotemporal correlation analysis module is an analysis module that specializes in the temporal and spatial axes. It finds interrelationships among projected events (occurrences) whose regions and times change hour by hour, treating them as moving phenomena. We are developing this module based on a moving phenomenon model [5].
  – Semantic correlation analysis module: A semantic correlation analysis module is an analysis module that specializes in semantics. It finds interrelationships among projected events (occurrences) depending on a viewpoint. We are developing this module based on [4]. The interrelations are extracted by mutual constraints between these two analysis modules.
• Codifier module: A codifier module arranges and organizes the scenes extracted from the correlation analysis management module, as described in section 2. When a user gives queries representing a condition, a context and a viewpoint, this module provides an appropriate scene set dynamically, in RDF.
By these modules, we can obtain interrelationships between heterogeneous data by bridging the gap between general facts and specific concepts. For example, in the case of space weather, sensor data showing an abnormality of the Dst index, which is one of the space weather indices related to geomagnetic storm events, and a news article on the interruption of the relay broadcast of the XVI Olympic Winter Games are interrelated under the viewpoint of "watching TV", even though they are individually published by different communities.
7. Related Work
In conventional approaches, the relationships among concepts are predefined on the basis of a bridge concept. Schema mappings [6] and bridge ontologies [7] are typically used as such bridge concepts. These methods predefine universal relationships between two different domains; however, it is quite difficult to understand these relationships in
most cases. As a result, conventional approaches can be employed only on a small scale. QOM [8] realizes fast, semi-automatic alignment of different ontologies. However, it does not take contexts into account; its purpose is to create static, complete ontologies. The feature of our method is the dynamic extraction of event-centric interrelationships depending on the content of the web feeds selected by a user. The essence of our approach is to dynamically select, integrate and operate various appropriate data resources depending on a context in a distributed environment. Therefore, our method is important and effective for realizing the interconnection of distributed heterogeneous data repositories. Recently, linked data [9], which connects various resources at the instance level, has attracted attention. In particular, the Linking Open Data community project [10] aims to connect various RDF data, enabling a large number of open, interlinked datasets to be used as structured data. Some works extract structured data from Wikipedia, such as DBpedia [11] and YAGO [12]. These works provide static interlinks for RDF data. In the near future, such interlinks will apply not only to data but also to devices, environments, resources, etc. In this sense, it is difficult to extend such interlinks without excluding the three uncertainties described in Section 1.1, because of the heterogeneity of data types, contents and utilization purposes. Our system realizes dynamic interconnection among heterogeneous data resources by event-driven and event-centric computing with resolvers for the uncertainties existing among those resources. Therefore, our architecture can solve these problems.
8. Conclusion
In this paper, we presented a three-layered system architecture for computing dynamic associations of events in nature with related knowledge resources. The important feature of our system is that it realizes dynamic interconnection among heterogeneous data resources by event-driven and event-centric computing with resolvers for the uncertainties existing among those resources. It realizes interconnection indirectly and dynamically through semantic units for data of various types, such as text data, multimedia data and sensor data. In other words, it navigates the user to various appropriate data, including data of heterogeneous data types and heterogeneous fields, depending on the user's purpose and standpoint. In our current global environment, it is important to convey significant knowledge from various data resources to actual users. In fact, most events affect various other areas, fields and communities. Our system helps a user to obtain related information across heterogeneous data types, contents and fields while providing a broad understanding of the relationships between them depending on the user's standpoint. As future work, we will extend the system to a peer-to-peer environment. We will also formulate evaluation indexes for the represented concepts and contents. Furthermore, we will apply our method to various fields and communities.
References
[1] Space Weather Information Center, NICT, http://swc.nict.go.jp/contents/.
[2] National Space Weather Program Implementation Plan, 2nd Edition, FCM-P31-2000, Washington, DC, July 2000. Available in PDF at http://www.ofcm.gov/nswp-ip/tableofcontents.htm.
[3] K. Zettsu, T. Nakanishi, M. Iwazume, Y. Kidawara, Y. Kiyoki: Knowledge cluster systems for knowledge sharing, analysis and delivery among remote sites, Information Modelling and Knowledge Bases, vol. 19, pp. 282–289, 2008.
[4] T. Nakanishi, K. Zettsu, Y. Kidawara, Y. Kiyoki: A Context Dependent Dynamic Interconnection Method of Heterogeneous Knowledge Bases by Interrelation Management Function, In Proceedings of the 19th European-Japanese Conference on Information Modelling and Knowledge Bases, Maribor, Slovenia, June 2009.
[5] K.-S. Kim, K. Zettsu, Y. Kidawara, Y. Kiyoki: Moving Phenomenon: Aggregation and Analysis of Geotime-Tagged Contents on the Web, In Proceedings of the 9th International Symposium on Web & Geographical Information Systems (W2GIS 2009), pp. 7–24, 2009.
[6] R. J. Miller, L. M. Haas, M. A. Hernandez: Schema Mapping as Query Discovery, In Proceedings of the 26th International Conference on Very Large Data Bases (VLDB 2000), pp. 77–88, 2000.
[7] A. H. Doan, J. Madhavan, P. Domingos, A. Halevy: Learning to Map between Ontologies on the Semantic Web, In Proceedings of the 11th International Conference on World Wide Web, pp. 662–673, 2002.
[8] M. Ehrig, S. Staab: QOM – Quick Ontology Mapping, In Proceedings of the Third International Semantic Web Conference (ISWC 2004), pp. 683–697, Hiroshima, Japan, 2004.
[9] T. Berners-Lee: Linked Data, http://www.w3.org/DesignIssues/LinkedData.html, 2006.
[10] Linking Open Data W3C SWEO Community Project, http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/.
[11] S. Auer, C. Bizer, J. Lehmann, G. Kobilarov, R. Cyganiak, Z. Ives: DBpedia: A Nucleus for a Web of Open Data, In Proceedings of the 6th International and 2nd Asian Semantic Web Conference (ISWC 2007 + ASWC 2007), pp. 715–728, 2007.
[12] F. M. Suchanek, G. Kasneci, G. Weikum: YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia, In Proceedings of the 16th International Conference on World Wide Web, pp. 697–706, 2007.
Partial Updates in Complex-Value Databases
Klaus-Dieter SCHEWE a,1 and Qing WANG b,2
a Software Competence Centre Hagenberg, Austria
b University of Otago, Dunedin, New Zealand
Abstract. Partial updates arise when a location bound to a complex value is updated in parallel. Compatibility of such partial updates to disjoint locations can be assured by applying applicative algebras. However, due to the arbitrary nesting of type constructors, the locations of a complex-value database are often defined at multiple abstraction levels and are thereby non-disjoint. Thus, the use of applicative algebras is not as smooth as their simple definition suggests. In this paper, we investigate this problem in the context of complex-value databases, where partial updates arise naturally in database transformations. We show that a more efficient solution can be obtained by generalising the notion of location and thus permitting dependencies between locations. On these grounds we develop a systematic approach to consistency checking for update sets that involve partial updates.
Keywords. Abstract State Machine, partial update, complex value, applicative algebra, database transformation
1. Introduction
According to Blass's and Gurevich's sequential and parallel ASM theses, sequential3 and parallel algorithms are captured by sequential and general Abstract State Machines (ASMs), respectively [3,6] (see also [4]). A decisive characteristic of ASMs is that states are first-order structures consisting of updatable (partial) functions. Thus, in each step a set of locations is updated to new values, where a location ℓ is defined by an n-ary function symbol f in the (fixed) state signature of the ASM and n values a1, . . . , an in the (fixed) base set B of the structures defining states. That is, in a state S the function symbol f is interpreted by a function fS : Bⁿ → B, and an update of f(a1, . . . , an) to a new value b ∈ B gives rise to fS′(a1, . . . , an) = b in the successor state S′. The progression from a state S to a successor state S′ is defined by an update set Δ, i.e. a set of updates (ℓ, b) with a location ℓ and a new value b for this location, provided Δ is consistent, where consistency of an update set is defined by the uniqueness of new values for all locations, i.e. whenever (ℓ, b), (ℓ, b′) ∈ Δ hold, we must have b = b′. However, this requirement is too strict, if the base set B contains values that themselves
3 In Gurevich's seminal work "parallelism" actually means unbounded parallelism, whereas algorithms with an a priori given bound on parallelism in elementary computation steps are still considered to be sequential.
have a complex structure. For instance, if the values for a location are tuples (A1 : a1, . . . , Ak : ak), then updates to different attributes Ai and Aj can still be compatible. The same applies to lists, finite sets, counters, labelled ordered trees, etc., and is therefore of particular interest for database transformations over complex-value databases. It is therefore desirable to distinguish between total and partial updates. For the former the consistency of an update set should remain unchanged, whereas for the latter we should strive to find a way to guarantee compatibility and then merge the partial updates to a location ℓ in an update set into a single total update on ℓ.
The problem of partial updates in ASMs was first observed by the research group on Foundations of Software Engineering at Microsoft Research during the development of the executable ASM specification language AsmL [7,8]. This motivated Gurevich's and Tillmann's investigation of the problem of partial updates over the data types counter, set and map [9]. An algebraic framework was established by defining particles as unary operations over a datatype, and the parallel composition of particles as an abstraction of order-independent sequential composition. However, this fails to address partial updates over data types such as sequences, as exemplified in [10]. This limitation led to the proposal of applicative algebras as a general solution to the problem of partial updates [11]. It was shown that the problem of partial updates over sequences and labelled ordered trees can be solved in this algebraic framework, and that the approach in [9] is a special kind of applicative algebra.
Definition 1.1 An applicative algebra consists of elements, which comprise a trivial element ⊥ and a non-empty set denoted by a client type τ, a monoid of total unary operations (called particles) over the elements including a null particle λ, and a parallel composition operation Ω, which assigns a particle ΩM to each finite multiset M of particles, such that the following two conditions (AA1) and (AA2) are satisfied:
(AA1) f(⊥) = ⊥ for each particle f, and λ(x) = ⊥ for every element x.
(AA2) Ω{{f}} = f, Ω(M ⊎ {{id}}) = ΩM, and Ω(M ⊎ {{λ}}) = λ.
A multiset M of particles is called consistent iff ΩM ≠ λ.
When applying applicative algebras to the problem of partial updates, each partial update (ℓ, b) has to be interpreted as a particle applied to the content of ℓ in state S (denoted by valS(ℓ)), and all these particles form a multiset M that is aggregated to ΩM such that valS′(ℓ) = ΩM(valS(ℓ)) holds, provided M is consistent.
In this paper, we investigate the partial update problem in the context of complex-value databases. In database transformations, bounded parallelism is intrinsic and complex data structures form the core of each data model. Thus, the problem of partial updates arises naturally. Several examples of partial update problems encountered in complex-value databases are provided in Section 2. Furthermore, in Section 2, we discuss the reasons why using applicative algebras is not as smooth as the simple definition above suggests. One of the important assumptions of applicative algebras is that the locations of partial updates must be disjoint. However, it is common in data models to permit the arbitrary nesting of complex-value constructors. Consequently, we need particles for each position in a complex value, and each nested structure requires its own parallel composition operation.
It means that we have to deal with the theoretical possibility of infinitely many applicative algebras, which requires a
mechanism for constructing such algebras out of the algebras for the parts of the type of every object in a complex-value database. This leads to the question of how to efficiently check consistency for sets of partial updates.
In view of these problems we propose an alternative solution to the problem of partial updates. The preliminaries, such as the definition of partial locations, partial updates, and the different kinds of dependencies among partial locations, are handled in Section 3. We relax the disjointness assumption on the notion of location in order to reflect a natural and flexible computing environment for database computations. While in principle the prime locations bound to complex values are independent from each other, we may consider each position within a complex value as a sublocation, which for simplicity of terminology we also prefer to call a location. A partial update to a location is then in fact a (partial) update to a sublocation. In doing so, we can split the problems of consistency checking and parallel composition into two stages: normalisation of shared updates and integration of total updates, which are discussed in Section 4 and Section 5, respectively. The first stage deals with the compatibility of operators in shared updates, and the second one deals with the compatibility of clusters of exclusive updates.
The work in this paper is part of our research on formal foundations of database transformations. Taking an approach analogous to the ASM thesis, we demonstrated that all database transformations are captured by a variant of Abstract State Machines [13]. Decisive for this work is the exploitation of meta-finite states [5] in order to capture the intrinsic finiteness of databases, the explicit use of background structures [2] to capture the requirements of data models, and the handling of genericity [1]. For XML database transformations the requirements for tree-based backgrounds were made explicit in [12], and a more convenient machine model called XML machines was developed, permitting the use of monadic second-order logic. On these grounds we developed a logic to reason about database transformations [14].
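To make Definition 1.1 more concrete, the following Python sketch (our own illustration, not part of AsmL or the cited work) models a tiny applicative algebra over integer counters: particles are increments, ⊥ is represented by None, λ maps every element to ⊥, and the parallel composition Ω of a multiset of increments is their order-independent combination.

```python
BOTTOM = None  # stands for the trivial element ⊥

def null_particle(x):
    """The null particle λ: maps every element to ⊥."""
    return BOTTOM

def increment(n):
    """Particle that adds n to a counter; respects f(⊥) = ⊥ (condition AA1)."""
    def particle(x):
        return BOTTOM if x is BOTTOM else x + n
    return particle

def identity(x):
    return x

def compose(particles):
    """Parallel composition Ω of a finite multiset of increment particles.

    Increments commute, so applying them in any order gives the same result;
    the multiset is consistent because the composed particle is not λ.
    """
    def particle(x):
        for f in particles:
            x = f(x)
        return x
    return particle

omega = compose([increment(3), increment(5), identity])
print(omega(10))      # 18
print(omega(BOTTOM))  # None, i.e. ⊥
```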
2. Motivation
We begin with modifications of tuples in a relation, since tuples represent a common view of locations in the relational model. As the following example reveals, parallel manipulations of distinct attributes of a tuple are prohibited if only tuples are permissible locations in a state.
Example 2.1 Let S be a state containing a nested relation schema R = {A1 : {A11 : D11, A12 : D12}, A2 : D2, A3 : D3} and a nested relation I(R) over R as shown in Figure 1, where oi (i = 1, 3) are tuple identifiers in I(R) and oij (j = 1, 2) are tuple identifiers in the relations in the attribute A1 of the tuples oi. Suppose that the following two rules execute in parallel, modifying the values of attributes A2 and A3 of the same tuple.
forall x, y, z with R(x, y, z) ∧ y = b3 do
  par
    R(x, y, z) := false
    R(x, y, c2) := true
  par
enddo
forall x, y, z with R(x, y, z) ∧ y = b3 do
  par
    R(x, y, z) := false
    R(x, b, z) := true
  par
enddo
       A1 {A11, A12}                        A2    A3
o1     o11: (a11, a12), o12: (a11, a12)     b     c1
o3     o31: (a31, a32)                      b3    c3

Figure 1. A relation I(R) in nested relational databases
The right rule changes the attribute value b3 in the second tuple to b, while the left rule changes the attribute value c3 in the same tuple to c2. They yield the pairs of updates {(R({(a31, a32)}, b, c3), true), (R({(a31, a32)}, b3, c3), false)} and {(R({(a31, a32)}, b3, c3), false), (R({(a31, a32)}, b3, c2), true)}, respectively. Since the rules are running in parallel, we get the set of updates {(R({(a31, a32)}, b, c3), true), (R({(a31, a32)}, b3, c3), false), (R({(a31, a32)}, b3, c2), true)}. However, applying such a set of updates results in replacing the tuple R({(a31, a32)}, b3, c3) by two tuples R({(a31, a32)}, b, c3) and R({(a31, a32)}, b3, c2) rather than by the single tuple R({(a31, a32)}, b, c2) as expected.
A straightforward solution to this problem is to add a finite number of attribute functions as locations for accessing the attributes of tuples. Thus, locations are extended to be either an n-ary relational function symbol R with n arguments, such as R(a1, ..., an), or a unary attribute function symbol with an argument, in the form fR.A1.....Ak(o) for a relation name R, attributes A1, . . . , Ak and an identifier o. Note that attribute functions cannot entirely replace relational functions. To delete a tuple from or add a tuple to a relation, we must still use relational functions. Attribute functions can only be used to modify the values of attributes, including NULL values. The following example illustrates how the values of distinct attributes in the same tuple can be modified in parallel using this approach.
Example 2.2 Let us consider again the nested relation I(R) in Figure 1. Assume that there is a set of attribute functions in one-to-one correspondence with the attributes in R, i.e., for each Ak ∈ {A1, A1.A11, A1.A12, A2, A3} there is a fR.Ak(x) = y for a tuple identifier x in I(R) of a state S and a value y in the domain of Ak. Thus, we have the following locations and their interpretations for the second tuple of I(R):
• valS(fR.A1(o3)) = {(a31, a32)}
• valS(fR.A1.A11(o31)) = a31
• valS(fR.A1.A12(o31)) = a32
• valS(fR.A2(o3)) = b3
• valS(fR.A3(o3)) = c3
Using this approach, the following rule is able to modify the values of attributes A2 and A3 of the same tuple in parallel.
forall x with R(x) ∧ fR.A2(x) = b3 do
  par
    fR.A2(x) := b
    fR.A3(x) := c2
  par
enddo
       A1 {A11, A12}                        A2           A3
o1     o11: (a11, a12), o12: (a11, a12)     [b11, b12]   {{c1, c1}}
o3     o31: (a31, a32)                      [b3]         {{c31, c32, c33}}

Figure 2. A relation I(R′) in complex-value databases
The nested relation is just one example of complex-value databases. Other complex-value data models are obtained by allowing the arbitrary nesting of various type constructors over base domains. Next we propose the locations necessary for two other common type constructors: list and multiset.
Following the terminology in [11], we call a position of a list the number referring to an element of the list, and a place of a list a point before the first element, between two adjacent elements or after the last element of the list; both start from zero and count from left to right. Let us take the list [b11, b12] as an example. There are two positions in [b11, b12], where b11 is in position 0 and b12 is in position 1. Moreover, the list [b11, b12] has three places, where place 0 is just before b11, place 1 is between b11 and b12, and place 2 is after b12. Here, we prefer to consider that, for a finite list s with length n, the locations of s are of the form fs(k, k) for k = 0, . . . , n and fs(k, k + 1) for k = 0, . . . , n − 1. That is, a location fs(k, k) indicates an insertion point of the list s, while a location fs(k, k + 1) targets an element of the list. The symbol ↓ is used to indicate a deletion operation.
For a multiset, we associate it with a pair (D, f), where D is a domain of elements and f : D → N is a function from D to the set of natural numbers. Correspondingly, the locations referring to elements of a multiset M are expressed as unary functions of the form fM(x), and an update (fM(x), y) specifies that there are y occurrences of the element x in M. If y is zero, then we say that the element x does not exist in M.
Example 2.3 Let us extend the relation I(R) in Figure 1 to a relation I(R′) with R′ = {A1 : {A11 : D11, A12 : D12}, A2 : N(D2), A3 : M(D3)} in Figure 2, where N(D2) denotes the set of all finite lists over the domain D2 and M(D3) denotes the set of all finite multisets over the domain D3. The attribute functions for the attributes A2 and A3 thus need to be changed, e.g.,
• valS(fR′.A2(o3)) = [b3]
• valS(fR′.A3(o3)) = {{c31, c32, c33}}
Therefore, for the attribute value [b3] we may have (fR′.A2(o3)(0, 0), b31) to insert b31 before b3, (fR′.A2(o3)(0, 1), b3′) to replace b3 with b3′, or (fR′.A2(o3)(0, 1), ↓) to delete b3. For the attribute value {{c31, c32, c33}} we may have (fR′.A3(o3)(c34), 2) to add a new element c34 with 2 occurrences, and (fR′.A3(o3)(c32), 0) to delete c32 from the multiset.
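The place/position encoding of list locations and the occurrence encoding of multiset locations described above can be sketched as follows in Python (the dictionary-based representation of updates is our own simplification):

```python
from collections import Counter

DELETE = object()  # stands for the deletion marker ↓

def apply_list_updates(lst, updates):
    """Apply updates keyed by (k, k) insertion places and (k, k+1) positions."""
    result = []
    n = len(lst)
    for k in range(n + 1):
        if (k, k) in updates:                        # insertion at place k
            result.append(updates[(k, k)])
        if k < n:
            value = updates.get((k, k + 1), lst[k])  # replacement or deletion at position k
            if value is not DELETE:
                result.append(value)
    return result

def apply_multiset_updates(multiset, updates):
    """An update x -> y sets the number of occurrences of x to y; y = 0 deletes x."""
    counts = Counter(multiset)
    for element, occurrences in updates.items():
        if occurrences == 0:
            counts.pop(element, None)
        else:
            counts[element] = occurrences
    return counts

print(apply_list_updates(["b3"], {(0, 0): "b31", (0, 1): "b3'"}))           # ['b31', "b3'"]
print(apply_multiset_updates(["c31", "c32", "c33"], {"c34": 2, "c32": 0}))  # c32 removed, c34 added twice
```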
In addition, we may want to increase the number of occurrences of an element in a multiset by a number k on top of the original occurrences. In this case, we do not care about the original number of occurrences as long as it has been increased by k. For this kind of modification, it is natural to associate an additional operator with an update so as to describe how the number of occurrences is to be changed, for example, increased or decreased.
The above approach of adding attribute functions works quite well in resolving the partial-update problem on distinct attributes of a tuple or a subtuple. Nevertheless, the co-existence of locations R(a1, ..., an) for relational functions and fR.A1.....Ak(o) for attribute functions gives rise to new problems, as illustrated by the following example.
Example 2.4 Suppose that we have the following rule executing over I(R) in Figure 1. The rule yields an update set containing the two updates (fR.A2(o3), b) and (R({(a31, a32)}, b3, c3), false). By the standard definition of a consistent update set, this update set is consistent. However, the two updates actually conflict with each other: the update (fR.A2(o3), b) intends to change the value of attribute A2 of the tuple with identifier o3 to b, while the update (R({(a31, a32)}, b3, c3), false) intends to delete this tuple.
par
  forall x with R(x) ∧ fR.A2(x) = b3 do
    fR.A2(x) := b
  enddo
  forall x, y, z with R(x, y, z) ∧ y = b3 do
    R(x, y, z) := false
  enddo
par
Due to the arbitrary nesting of type constructors, the locations of a complex-value database are often defined at multiple abstraction levels and are thereby non-disjoint. In fact, allowing locations at different abstraction levels plays a vital role in supporting requests to update the complex values of a database at different granularities. This brings us to the question of how to utilise applicative algebras to solve the partial-update problem in the setting of non-disjoint locations.
One possible approach is to transform updates with nested locations into updates with nested modifications but disjoint locations and then apply applicative algebras as suggested in [11]. Because all sorts of particles used to modify the nested internal structure of an element have to be defined on the outermost type of the element, this immediately leads to particles with complicated controls which encode the nested modifications. The second approach is to establish a mechanism for the nested construction of applicative algebras in accordance with the complex data structures used in a data model. Let us take I(R′) in Figure 2 as an example. Assume that Ai (i = 11, 12, 2, 3) are applicative algebras built upon the domains Di, and we use the notations set(A), tup(A), lis(A) and mul(A) to denote the applicative algebras built upon an applicative algebra A for the types set, tuple, list and multiset, respectively. Then the following nested applicative algebra needs to be constructed for I(R′): set(tup(set(tup(A11, A12)), lis(A2), mul(A3))).
Clearly, this kind of construction is quite complicated. Furthermore, there are two issues to be considered: (1) How to properly reflect the consistency and integration of partial updates at a particular level in the consistency and integration of partial updates at higher levels? (2) Is there an efficient algorithm that can handle the consistency and integration of a multiset of partial updates at different abstraction levels? In the rest of this paper, we will develop a customised and efficient mechanism to handle these problems.
3. Preliminaries on Partial Updates
We first formalise the notion of a partial location and then formally define partial updates.
Definition 3.1 Let S be a state, f be an auxiliary dynamic function symbol of arity n in the state signature, and a1, ..., an be elements in the base set of S. Then f(a1, ..., an) is called a non-prime location.
Definition 3.2 A location ℓ1 subsumes a location ℓ2 (notation: ℓ2 ⊑ ℓ1) if, for all states S, valS(ℓ1) uniquely determines valS(ℓ2).
While, in principle, prime locations bound to complex values are independent from each other, we may consider each position within a complex value as a non-prime location. We call a location ℓ2 a sublocation of a location ℓ1 iff ℓ2 ⊑ ℓ1 holds. A location is a sublocation of itself. A trivial location ⊥ is a sublocation of every location.
Example 3.1 fR′.A2(o3)(0, 0) and fR′.A3(o3)(c32), discussed in Example 2.3, are non-prime locations and also sublocations of R′({(a31, a32)}, [b3], {{c31, c32, c33}}).
From a constructive point of view, a prime location may be considered as an algebraic structure in which its sublocations refer to parts of the structure. Since such a structure is always constructed by using type constructors like set, tuple, list, multiset, etc. from a specific data model, we only allow sublocations of a prime location which either subsume one another or are disjoint to be partial locations, by the following definition. This restriction is more a technicality so that we can focus on discussing the integration and consistency checking of partial updates. Extending to the general case would be straightforward after adding a decomposition procedure to eliminate sublocations that overlap but do not subsume one another.
Let ℓ1, ℓ2, ℓ3 be any prime or non-prime locations. Then ℓ1 ⊔ ℓ2 = ℓ3 if ℓ1 ⊑ ℓ3, ℓ2 ⊑ ℓ3 and there is no other ℓ ∈ L such that ℓ ≠ ℓ3, ℓ1 ⊑ ℓ, ℓ2 ⊑ ℓ and ℓ ⊑ ℓ3. Similarly, ℓ1 ⊓ ℓ2 = ℓ3 if ℓ3 ⊑ ℓ1, ℓ3 ⊑ ℓ2 and there is no other ℓ ∈ L such that ℓ ≠ ℓ3, ℓ ⊑ ℓ1, ℓ ⊑ ℓ2 and ℓ3 ⊑ ℓ. We say ℓ1 ⊓ ℓ2 = ⊥ if ℓ1 and ℓ2 are disjoint.
Definition 3.3 Let S be a state. Then the set of partial locations of S is the smallest set such that
• each prime location is a partial location, and
• if a prime location ℓ is an algebraic structure (Lℓ, ⊔, ⊓, ℓ, ⊥ℓ) satisfying the following conditions, then each sublocation of ℓ is a partial location.
Figure 3. An algebraic structure of a prime location
— (Lℓ, ⊔, ⊓) is a lattice, consisting of the set Lℓ of all sublocations of ℓ and the two binary operations ⊔ (i.e., join) and ⊓ (i.e., meet) on Lℓ,
— ⊥ℓ is the identity element for the join operation ⊔,
— ℓ is the identity element for the meet operation ⊓, and
— for any ℓ1 and ℓ2 in Lℓ, one of the following conditions must be satisfied: (1) ℓ1 ⊓ ℓ2 = ℓ1, (2) ℓ1 ⊓ ℓ2 = ℓ2, or (3) ℓ1 ⊓ ℓ2 = ⊥ℓ.
Example 3.2 Let us consider the prime location R′({(a31, a32)}, [b3], {{c31, c32, c33}}) in the relation I(R′) of Figure 2. This prime location can be regarded as the algebraic structure shown in Figure 3, where the label i of a node in the picture on the left-hand side corresponds to the index i of the sublocation ℓi on the right-hand side. As all conditions required in Definition 3.3 are satisfied, these sublocations are partial locations.
In addition to the subsumption relation, one partial location may be dependent on another partial location, i.e., there is a dependence relation over the partial locations of a state.
Definition 3.4 A location ℓ1 depends on a location ℓ2 (notation: ℓ2 ⇝ ℓ1) if valS(ℓ2) = ⊥ implies valS(ℓ1) = ⊥ for all states S.
The dependency relation ⇝ is said to be strict on the location ℓ if, for all ℓ1, ℓ2, ℓ3 ∈ Lℓ = {ℓ′ | ℓ ⇝ ℓ′}, we have that whenever ℓ1 ⇝ ℓ2 and ℓ1 ⇝ ℓ3 hold, then either ℓ2 ⇝ ℓ3 or ℓ3 ⇝ ℓ2 holds as well. B⁺-trees provide examples of non-strict dependency relations that are at the same time not induced by subsumption. However, such a dependency may also occur without nesting, the prominent examples being sequences and trees.
Example 3.3 Consider the partial locations fR′.A2(o3)(0, 0), fR′.A2(o3)(0, 1) and fR′.A2(o3)(1, 1) in the relation I(R′) of Figure 2. As fR′.A2(o3)(k1, k2) ⇝ fR′.A2(o3)(k1′, k2′) holds for k1 < k2, the dependency relation is strict on fR′.A2(o3).
A partial location ℓ2 that is subsumed by a partial location ℓ1 certainly depends on it, in the sense that if ℓ2 is bound to a value other than ⊥ (representing undefinedness), then ℓ1 cannot be bound to ⊥ either. So the following lemma is straightforward.
Lemma 3.1 For two partial locations ℓ1, ℓ2 with ℓ2 ⊑ ℓ1 we also have ℓ1 ⇝ ℓ2.
Proof. Let S be a state. As valS(ℓ1) uniquely determines valS(ℓ2), clearly valS(ℓ1) = ⊥ implies valS(ℓ2) = ⊥. That is, ℓ2 depends on ℓ1, i.e. ℓ1 ⇝ ℓ2.
To formalise the definition of partial updates, we associate a type with each partial location ℓ = f(a1, ..., an), such that the type τ(ℓ) of f(a1, ..., an) is the codomain of the function f : D1 × ... × Dn → D, i.e., τ(ℓ) = D. Therefore, a type of partial locations can be a built-in type provided by database systems, such as String, Int, Date, etc., a complex-value type constructed by using the type constructors of a data model, such as the set, tuple, list and multiset constructors, or a customised type defined by users, i.e., user-defined types (UDTs) used in database applications.
Example 3.4 Reconsider the partial locations ℓ1, ℓ2 and ℓ3 of I(R′) in Figure 3. They have the following types: τ(ℓ1) = P(NT2(D11, D12)), τ(ℓ2) = N(D2) and τ(ℓ3) = M(D3), where P(D) denotes the set of all subsets over the domain D, and NT2(D1, D2) denotes the set of all 2-ary tuples over the domains D1 and D2.
Instead of particles, we will formalise the partial updates of a database transformation in terms of exclusive and shared updates.
Definition 3.5 An exclusive update is a pair (ℓ, b) consisting of a location ℓ and a value b of the same type τ as ℓ. A shared update is a triple (ℓ, b, μ) consisting of a location ℓ of type τ, a value b of type τ and a binary operator μ : τ × τ → τ. For a state S and an update set Δ containing a single (exclusive or shared) update, we have

  valS+Δ(ℓ) = b                  if Δ = {(ℓ, b)}
  valS+Δ(ℓ) = μ(valS(ℓ), b)      if Δ = {(ℓ, b, μ)}
Although exclusive updates have a similar form to updates of ASMs defined in the standard way, exclusive updates are allowed to have partial locations. This means that the locations of two exclusive updates may be related by a dependency, whereas the locations of two standard updates of ASMs are assumed to be disjoint. Therefore, the notion of exclusive update generalises the notion of update in ASMs: updates defined in ASMs become exclusive updates to prime locations in our definition.
In a shared update (ℓ, b, μ), the binary operator μ is used to specify how the value b partially affects the content of ℓ in a state. When multiple partial updates are generated for the same location simultaneously, a multiset of partial updates is obtained. For example, a location ℓ of type N may be associated with the multiset of shared updates {{(ℓ, 10, +), (ℓ, 10, +), (ℓ, 5, −)}} (i.e., increase the content of ℓ by 10 twice and decrease the content of ℓ by 5 once). The use of a binary operator μ in shared updates helps us to separate the concerns relating to database instances and database schemata. By this separation,
the consistency checking of incompatible operators can be conducted at the database schema level, which will be further discussed in the next section. This viewpoint is efficient in practice, particularly for database applications with large data sets.
We provide different update rules to generate exclusive and shared updates.
Definition 3.6 Let t1 and t2 be terms of type τ, and μ be a binary operator over type τ. Then the partial update rules take one of the following two forms:
• the rule for exclusive updates: t1 ⇔ t2;
• the rule for shared updates: t1 ⇔μ t2.
Semantically, the partial update rules generate updates in a multiset. Let S be a state and ζ be a variable assignment; then Δ̈(t1 ⇔ t2, S, ζ) = {{(ℓ, b)}} and Δ̈(t1 ⇔μ t2, S, ζ) = {{(ℓ, b, μ)}}, where ℓ = t1[a1/x1, . . . , an/xn] for var(t1) = {x1, . . . , xn} and ζ(xi) = ai (i = 1, . . . , n), and valS,ζ(t2) = b.
Remark 3.1 The addition of auxiliary functions as locations of a state requires a shifted view of partial updates in our definition. In contrast to an update (ℓ, b) defined in standard ASMs, for which valS+{(ℓ,b)}(ℓ) = b holds for every state S, the partial updates considered here do not satisfy such a condition.
Example 3.5 Consider a state S that has the relation I(R′) of Figure 2 and the partial updates (fR′.A2(o3)(0, 0), d31) and (fR′.A2(o3)(0, 1), d32). Applying these partial updates changes the value of attribute A2 of the tuple with identifier o3 from [d3] in the state S to [d31, d32] in the successor state S′ = S + {(fR′.A2(o3)(0, 0), d31), (fR′.A2(o3)(0, 1), d32)}. However, valS′(fR′.A2(o3)(0, 0)) ≠ d31, and similarly, valS′(fR′.A2(o3)(0, 1)) ≠ d32. Instead, we have valS′(fR′.A2(o3)(0, 0)) = null and valS′(fR′.A2(o3)(0, 1)) = d31.
For simplicity, we will simply speak of locations instead of partial locations in the rest of this paper.
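The effect of a single exclusive or shared update on a location, as given in Definition 3.5, can be sketched in a few lines of Python; the dictionary standing in for a state is our own simplification:

```python
import operator

state = {"l": 100}  # val_S(l) = 100

def apply_exclusive(state, loc, b):
    """Exclusive update (l, b): overwrite the content of the location."""
    state[loc] = b

def apply_shared(state, loc, b, mu):
    """Shared update (l, b, mu): combine b with the current content via mu."""
    state[loc] = mu(state[loc], b)

# the multiset {{(l, 10, +), (l, 10, +), (l, 5, -)}} from the text:
for value, mu in [(10, operator.add), (10, operator.add), (5, operator.sub)]:
    apply_shared(state, "l", value, mu)
print(state["l"])  # 115, independent of the order of application

apply_exclusive(state, "l", 42)
print(state["l"])  # 42
```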
4. Normalisation of Shared Updates
Normalisation of a multiset Δ̈ of partial updates is the process of merging all shared updates to the same location into a single exclusive update. Thus, Δ̈ is transformed into an update set Δ containing only exclusive updates.
Definition 4.1 An update multiset Δ̈ is in normal form if each update in it is an exclusive update with multiplicity 1.
As a convention, let Loc(Δ̈) and Opt(Δ̈) denote the set of locations and the set of operators occurring in an update multiset Δ̈, respectively, and let Δ̈ℓ denote the submultiset of an update multiset Δ̈ containing all shared updates that have the location ℓ.
4.1. Operator-Compatibility
The notion of operator-compatibility addresses the inconsistencies arising from shared updates to the same location in an update multiset, no matter at which abstraction level their locations reside and whether they are dependent on other locations in the same update multiset.
Example 4.1 Let Q∗ be the set of rational numbers excluding zero and R be the set of real numbers; then addition + and subtraction − are operators over R, and multiplication × and division ÷ are operators over Q∗, respectively. Suppose that ℓ is a location of type Q∗; then the following modifications can be executed in parallel.
par
  ℓ ⇔+ b1
  ℓ ⇔− b2
  ℓ ⇔× b3
  ℓ ⇔÷ b4
par
For this rule, the update multiset Δ̈ℓ = {{(ℓ, b1, +), (ℓ, b2, −), (ℓ, b3, ×), (ℓ, b4, ÷)}} is obtained. The operators in the submultisets {{(ℓ, b1, +), (ℓ, b2, −)}} and {{(ℓ, b3, ×), (ℓ, b4, ÷)}} are compatible. Nevertheless, the operators in Δ̈ℓ as a whole are not compatible, because applying the updates in Δ̈ℓ in different orders yields different results.
Many languages developed for database manipulation have set-theoretic operations, such as the Structured Query Language (SQL), the Relational Algebra (RA), etc. The partial-update problem relating to set-theoretic operations concerns parallel manipulations of sets via various set-based operations. The following example illustrates that, after a main computation initializes a set of subcomputations, each subcomputation may yield a set of values that are then unioned into the final result in parallel.
Example 4.2 Let P(D) be the powerset of the domain D; then the set-based operations union ∪, intersection ∩, difference −, symmetric difference, etc. can be regarded as common operators over the domain P(D). The following rule produces the operator-compatible update multiset {{(ℓ, {b1, b2}, ∪), (ℓ, {b2, b3, b4}, ∪)}}.
par
  ℓ ⇔∪ {b1, b2}
  ℓ ⇔∪ {b2, b3, b4}
par
These examples motivate a straightforward definition of operator-compatibility in terms of the order-independent application of shared updates to the same location.
Definition 4.2 Let Δ̈ℓ = {{(ℓ, ai, μi) | i = 1, ..., k}} be a multiset of shared updates on the same location ℓ. Then Δ̈ℓ is operator-compatible if for any two permutations (p1, ..., pk) and (q1, ..., qk) we have, for all x, μpk(. . . μp1(x, ap1) . . . , apk) = μqk(. . . μq1(x, aq1) . . . , aqk). An update multiset Δ̈ is operator-compatible if Δ̈ℓ is operator-compatible for each ℓ ∈ Loc(Δ̈).
As illustrated in Example 4.1, the order-independence of operators is easy to check when the number of shared updates is small. However, in the case of a large number of shared updates, compatibility checking by means of exploring all possible orderings is far too time-consuming. Therefore, we introduce an algebraic approach to characterize the operator-compatibility of shared updates to the same location.
Definition 4.3 A binary operator μ1 (over the domain D) is compatible to a binary operator μ2 (notation: μ1 ≼ μ2) over D iff μ2 is associative and commutative and for all x ∈ D there is some ẋ ∈ D such that for all y ∈ D we have y μ1 x = y μ2 ẋ.
Obviously, each associative and commutative operator μ is compatible to itself (i.e., self-compatible). The following lemma gives a sufficient condition for compatibility.
Lemma 4.1 Let μ1 and μ2 be two binary operators over a domain D such that (D, μ2) defines a commutative group, and (x μ1 y) μ2 y = x holds for all x, y ∈ D. Then μ1 ≼ μ2 holds.
Proof. Let e ∈ D be the neutral element for μ2 and ẋ be the inverse of x. Then we get y μ1 x = (y μ1 x) μ2 e = (y μ1 x) μ2 (x μ2 ẋ) = ((y μ1 x) μ2 x) μ2 ẋ = y μ2 ẋ.
Example 4.3 Let us look back at Example 4.1. Both (R, +) and (Q∗, ×) are abelian groups, and the duality property in Lemma 4.1 is satisfied by addition + and subtraction − on R, and by multiplication × and division ÷ on Q∗, respectively. Thus, − ≼ + holds on R and ÷ ≼ × holds on Q∗. Similarly, set operations such as union ∪, intersection ∩ and symmetric difference are self-compatible. Moreover, as x − y = x ∩ ȳ holds with the complement ȳ of the set y, set difference − is compatible to intersection ∩.
Compatibility μ1 ≼ μ2 permits replacing each shared update (ℓ, v, μ1) by the shared update (ℓ, v̇, μ2). Then the associativity and commutativity of μ2 guarantees order-independence. Thus, we obtain the following theorem.
Theorem 4.1 A non-empty multiset Δ̈ℓ of shared updates on the same location ℓ is operator-compatible if either |Δ̈ℓ| = 1 holds or there exists a μ ∈ Opt(Δ̈ℓ) such that, for all μ1 ∈ Opt(Δ̈ℓ), μ1 ≼ μ holds.
Proof. The first case is trivial. In the second case, if μ1 ≼ μ holds, we can replace all shared updates in Δ̈ℓ with μ1 by shared updates with μ. In doing so we obtain an update multiset in which only the self-compatible operator μ is used. The associativity and commutativity of μ implies (. . . ((x μ b1) μ b2) . . . μ bk) = (. . . ((x μ bς(1)) μ bς(2)) . . . μ bς(k)) for all x, b1, . . . , bk and all permutations ς, as desired.
Example 4.4 Suppose that we have Δ̈1 with Opt(Δ̈1) = {+, −}, Δ̈2 with Opt(Δ̈2) = {×, ÷}, Δ̈3 with Opt(Δ̈3) = {∩, −} and Δ̈4 with Opt(Δ̈4) = {∩, ∪}. From Theorem 4.1, we obtain that Δ̈1, Δ̈2 and Δ̈3 are operator-compatible, and Δ̈4 is not operator-compatible.
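Theorem 4.1 suggests a purely schema-level check: record, for each operator, an associative and commutative operator it is compatible to, and accept a multiset of shared updates iff all of its operators share one such dominating operator. The following Python sketch hard-codes such a compatibility table for the operators of Examples 4.1 and 4.2 (the table itself is our own illustrative assumption):

```python
# mu1 -> mu2 meaning mu1 is compatible to the associative-commutative mu2
COMPATIBLE_TO = {
    "+": "+", "-": "+",          # subtraction is compatible to addition (Lemma 4.1)
    "*": "*", "/": "*",          # division is compatible to multiplication
    "union": "union", "intersect": "intersect", "diff": "intersect",
}

def operator_compatible(operators):
    """Check Theorem 4.1: all operators must map to one common dominating operator."""
    if len(operators) <= 1:
        return True
    dominating = {COMPATIBLE_TO.get(mu) for mu in operators}
    return None not in dominating and len(dominating) == 1

print(operator_compatible({"+", "-"}))              # True
print(operator_compatible({"*", "/"}))              # True
print(operator_compatible({"intersect", "diff"}))   # True
print(operator_compatible({"union", "intersect"}))  # False, as in Example 4.4
```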
Remark 4.1 Theorem 4.1 allows checking the operator-compatibility of shared updates to the same location by utilising only the schema information. This approach ensures conformance to the genericity principle of database transformations, while considerably improving database performance.
4.2. Normalisation Algorithm
By the notation norm(Δ̈) we denote the normalisation of the update multiset Δ̈. Furthermore, let Δλ be a trivial update set, indicating that an update set is inconsistent. This comes into play when we do not have operator-compatibility.
Normalisation of a given update multiset Δ̈ is conducted for each location ℓ appearing in Δ̈, i.e. we normalise Δ̈ℓ. In doing so, Δ̈ℓ is transformed into a set containing exactly one exclusive update, provided Δ̈ℓ is operator-compatible. Otherwise norm(Δ̈ℓ) = Δλ. The following algorithm describes the normalisation process in detail.
Algorithm 4.1
Input: An update multiset Δ̈ and a state S
Output: An update set norm(Δ̈)
Procedure:
(i) By scanning through the updates in Δ̈, the set of locations Loc(Δ̈) appearing in Δ̈ is obtained; the shared updates to each location ℓ are put into Δ̈ℓ and all exclusive updates into Δ̈excl.
(ii) For each Δ̈ℓ, the following steps are processed:
  (a) If Δ̈ℓ = {{(ℓ, b, μ)}}, then norm(Δ̈ℓ) = {(ℓ, μ(valS(ℓ), b))};
  (b) otherwise, check Opt(Δ̈ℓ):
    i. If there exists μ′ ∈ Opt(Δ̈ℓ) such that for all μ ∈ Opt(Δ̈ℓ), μ ≼ μ′ holds, then
      • translate each update (ℓ, b, μ) ∈ Δ̈ℓ where μ ≠ μ′ into the form (ℓ, b′, μ′) according to the results from Lemma 4.1;
      • assuming that the update multiset after finishing the translation of each update in Δ̈ℓ is {{(ℓ, b′1, μ′), ..., (ℓ, b′k, μ′)}}, Δ̈ℓ can be integrated into the update set norm(Δ̈ℓ) = {(ℓ, b′)}, where b′ = valS(ℓ) μ′ b′1 μ′ ... μ′ b′k;
    ii. otherwise, norm(Δ̈) = Δλ, and exit the algorithm.
(iii) norm(Δ̈) is obtained as norm(Δ̈) = ⋃ℓ∈Loc(Δ̈) norm(Δ̈ℓ) ∪ Δ̈excl.
The following result is a direct consequence of the algorithm.
Corollary 4.1 For an update multiset Δ̈, its normalisation norm(Δ̈) is different from Δλ iff Δ̈ is operator-compatible.
If norm(Δ̈) = Δλ for an update multiset Δ̈, we can immediately draw the conclusion that Δ̈ is not consistent. Otherwise, we obtain an update set containing only exclusive updates. In the following section we will therefore assume norm(Δ̈) ≠ Δλ, and investigate further inconsistencies among exclusive updates in an update set after normalisation.
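A compact Python sketch of the normalisation of Algorithm 4.1 for numeric locations is given below; shared updates are triples (location, value, operator), the translation step of Lemma 4.1 is hard-coded for − into + and ÷ into ×, and Δλ is signalled by an exception. All of this is our own simplification for illustration:

```python
from collections import defaultdict

class InconsistentUpdateSet(Exception):
    """Plays the role of the trivial update set Delta_lambda."""

# translate (value, op) into an equivalent update using the dominating operator
TRANSLATE = {"+": ("+", lambda b: b), "-": ("+", lambda b: -b),
             "*": ("*", lambda b: b), "/": ("*", lambda b: 1 / b)}
APPLY = {"+": lambda x, b: x + b, "*": lambda x, b: x * b}

def normalise(shared_updates, state):
    """Merge all shared updates per location into one exclusive update each."""
    per_location = defaultdict(list)
    for loc, b, op in shared_updates:
        per_location[loc].append((b, op))
    exclusive = {}
    for loc, updates in per_location.items():
        dominating = {TRANSLATE[op][0] for _, op in updates}
        if len(dominating) != 1:
            raise InconsistentUpdateSet(loc)   # operators are not compatible
        mu = dominating.pop()
        value = state[loc]
        for b, op in updates:
            value = APPLY[mu](value, TRANSLATE[op][1](b))
        exclusive[loc] = value
    return exclusive

state = {"l1": 10, "l2": 4}
print(normalise([("l1", 3, "+"), ("l1", 5, "-"), ("l2", 2, "*"), ("l2", 8, "/")], state))
# {'l1': 8, 'l2': 1.0}
```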
5. Integration of Exclusive Updates
In this section we deal with the second stage of consistency checking, starting from a normalised update set that contains only exclusive updates. Since exclusive updates may have partial locations, the definition of the consistency of an update set cannot be taken directly from the standard definition for ASMs. Even if, for the values b and b′ of any two exclusive updates to the same location in an update set Δ, we have b = b′, Δ still might not be consistent. It is possible that inconsistencies arise from updates of distinct but non-disjoint locations, as illustrated in Example 2.4. Therefore, instead of consistent, we call an update set value-compatible if such a condition is satisfied.
Definition 5.1 A set Δ of exclusive updates is value-compatible if, for each location ℓ in Δ, whenever (ℓ, b), (ℓ, b′) ∈ Δ holds, we have b = b′.
An update set that contains exclusive updates may be value-compatible but not consistent. On the other hand, following the standard definition of the consistency of an update set, we have the following fact.
Fact 5.1 Let Δ = {(ℓ1, v1), ..., (ℓk, vk)} be an update set containing exclusive updates. If the condition ℓi ⊓ ℓj = ⊥ is satisfied for all 1 ≤ i ≠ j ≤ k, then Δ is consistent.
Obviously, this condition is sufficient but not necessary. There are cases in which a set of exclusive updates to non-disjoint locations is consistent.
Example 5.1 For the relation I(R′) in Example 2.3, suppose that we have Δ1 = {(fR′.A2(o3)(0, 1), b31), (fR′.A2(o3)(1, 1), b32), (fR′.A2(o3), [b31, b32])}, whose updates mean to replace the first element of [b3] by b31, to insert b32 after it, and to change [b3] into [b31, b32]. As applying the updates (fR′.A2(o3)(0, 1), b31) and (fR′.A2(o3)(1, 1), b32) simultaneously over the relation I(R′) results in the update (fR′.A2(o3), [b31, b32]), which coincides with the third update in Δ1, Δ1 is consistent.
The above example demonstrates that, in order to check the consistency of exclusive updates that may have non-disjoint locations, we need to compose exclusive updates to locations defined at the same abstraction level.
5.1. Parallel Composition
We start with the parallel composition operations for updates which have locations constructed by using the common type constructors set, multiset, list and tuple.
Set The set constructor has been widely used in data modeling. Assume that we have a location ℓ representing a set in a state S, i.e., valS(ℓ) = f, and that the locations referring to the elements a of the set f are expressed as f(a). For a set Δ of updates in which the locations refer only to the elements of the set f, if Δ is value-compatible, then the updates in Δ can be integrated into a single update such that Ω(Δ) = (ℓ, b), where b = (valS(ℓ) ∪ {ai | bi = true ∧ (f(ai), bi) ∈ Δ}) − {ai | bi = false ∧ (f(ai), bi) ∈ Δ}.
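For the set constructor, the composition Ω(Δ) just described can be sketched directly in Python; element locations f(a) are represented simply by the element a, and membership updates by Boolean values (our own encoding):

```python
def compose_set_updates(current, updates):
    """Parallel composition for element locations of a set.

    `updates` maps an element a to True (a is added / kept) or False (a is removed).
    """
    added = {a for a, flag in updates.items() if flag}
    removed = {a for a, flag in updates.items() if not flag}
    return (current | added) - removed

print(compose_set_updates({"x", "y"}, {"y": False, "z": True}))  # {'x', 'z'}
```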
Multiset The multiset constructor is also known as the bag constructor in data modeling. Assume that we have a location ℓ representing a multiset in a state S, i.e., valS(ℓ) = M, and that a location referring to an element a of the multiset M is expressed as f(a), as discussed in Section 2. Alternatively, a multiset M may be represented as a set of elements of the form (a, c), where a is an element of M and c its number of occurrences in the multiset M. A value-compatible set Δ of updates in which the locations refer only to the elements of the multiset M can be integrated into a single update such that Ω(Δ) = (ℓ, b), where
b = valS(ℓ) − {(a, b′) | (a, b′) ∈ valS(ℓ) ∧ (f(a), b) ∈ Δ ∧ b ≠ b′} ∪ {(a, b) | (f(a), b) ∈ Δ}.
List The list constructor provides the capability of modelling the order of elements when such an order is of interest. Consequently, the sublocations constructed by applying a list constructor are ordered, which we can capture by a strict dependence relation among them, as discussed in Section 3. Assume that we have a location ℓ representing a list f in a state S, and that the locations referring to the parts of the list are expressed by f(k1, k2), as discussed in Section 2. Then a value-compatible set Δ = {(ℓ1, b1), ..., (ℓn, bn)} of updates, in which the locations ℓi (i = 1, ..., n) refer to the elements of the list f, can be integrated into a single update such that Ω(Δ) = (ℓ, b), where b = valS+{(ℓp1, bp1)}+...+{(ℓpn, bpn)}(ℓ) and ℓpi ⇝ ℓpi+1 for a permutation p1, . . . , pn of the updates in Δ and i = 1, . . . , n − 1. That is, b is the list obtained by applying Δ over the list f in the current state S, applying first those updates whose locations are depended upon by the locations of the other updates.
Tuple The tuple constructor can be treated in a similar way to the list constructor, except that the order of applying the updates in an update set can be chosen arbitrarily. Assume that ℓ is the location representing a tuple in a state S. Then a value-compatible set Δ = {(ℓ1, b1), ..., (ℓn, bn)} of updates, in which the locations refer only to the attribute values of the tuple represented by ℓ, can be integrated into a single update such that Ω(Δ) = (ℓ, b), where b = valS+Δ(ℓ).
5.2. Location-Based Partitions
To efficiently handle dependencies between partial locations, we propose to partition a given update set containing only exclusive updates into a family of update sets. Each update set in such a family is called a cluster, which contains an update whose location subsumes the locations of all other updates in it. The notation SubL(ℓ) denotes the set of all sublocations of a location ℓ.
Lemma 5.1 Let LS denote the set of locations in a state S. Then there exists a unique partition LS = ⋃i∈I Li such that
• for all i, j ∈ I with i ≠ j we have ℓi ⋢ ℓj and ℓj ⋢ ℓi for all ℓi ∈ Li and ℓj ∈ Lj, and
• for each i ∈ I there exists a location ℓi ∈ LS with Li = SubL(ℓi).
Proof. By taking the connected components of the graph defined by (LS, ⊑) we can partition LS into Li (i ∈ I) satisfying the first property. Moreover, none of the Li can be further decomposed while still satisfying the first property, and we cannot combine multiple partition classes such that the second property holds. Thus, this partition is unique. According to the definition of the subsumption relation ⊑, each SubL(ℓ) is contained in one Li, and SubL(ℓ2) ⊆ SubL(ℓ1) holds for ℓ2 ⊑ ℓ1. On the other hand, maximal elements with respect to ⊑ define disjoint locations. Therefore, for a maximal element ℓ with respect to ⊑ we must have SubL(ℓ) = Li for some i ∈ I, which shows the second property.
Now let Δ be an update set containing exclusive updates. Using the partition of LS from Lemma 5.1 we obtain a partition Δ = ⋃i∈I′ Δi, where Δi = {(ℓ, b) ∈ Δ | ℓ ∈ Li}
and I′ = {i ∈ I | Δi ≠ ∅}. The following lemma is a direct consequence of the independence of the locations in different sets Li.
Lemma 5.2 Δ is consistent iff each Δi for i ∈ I′ is consistent.
As not all locations in Li appear in an update set Δ, we may further decompose each Δi for i ∈ I′. For this, let L(Δi) ⊆ Li be the set of locations appearing in Δi. By taking the connected components of the graph defined by (L(Δi), ⊑) we can get a partition L(Δi) = ⋃j∈Ji Lij such that for all j1, j2 ∈ Ji with j1 ≠ j2 we have ℓj1 ⋢ ℓj2 and ℓj2 ⋢ ℓj1 for all ℓj1 ∈ Lij1 and ℓj2 ∈ Lij2. As none of the Lij can be further decomposed, this partition is also unique. Taking Δij = {(ℓ, b) ∈ Δi | ℓ ∈ Lij} and omitting those of these update sets that are empty, we obtain a unique partition of Δi.
Lemma 5.3 Δi is consistent for i ∈ I′ iff each Δij with j ∈ Ji is consistent.
Proof. Consider the maximal elements ℓi1, ..., ℓik in L(Δi) with respect to ⊑ and the unique values vij (j = 1, ..., k) with (ℓij, vij) ∈ Δi. Let S be a state with valS(ℓi) = vi. If Δij is consistent, then valS+{(ℓij, vij)}(ℓ) = valS+Δij(ℓ) for all (ℓ, v) ∈ Δij. As the locations ℓij are pairwise disjoint, according to Fact 5.1 we may simultaneously apply all updates (ℓij, vij) to vi to obtain a value vi′; thus valS+{(ℓi, vi′)}(ℓ) = valS+{(ℓi1, vi1), ..., (ℓik, vik)}(ℓ) for all (ℓ, v) ∈ Δi. The converse, that Δi (i.e., the union of all Δij) is not consistent if any Δij is not consistent, is obvious.
In the proof we actually showed more, as we only need "upward consistency" for the set of locations below the maximal elements ℓij.
Corollary 5.1 For the maximal elements ℓi1, . . . , ℓik in L(Δi) with respect to ⊑, let Δij = {(ℓ, v) ∈ Δi | ℓ ⊑ ℓij}. Then Δi is consistent iff all Δij (j = 1, . . . , k) are consistent.
Note that the update sets Δij in Corollary 5.1 are uniquely determined by Δ. There exist locations ℓi and ℓij such that ℓij ⊑ ℓi and for all updates (ℓ, v) ∈ Δij we have ℓ ⊑ ℓij. We call such an update set Δij a cluster below ℓij.
With respect to the subsumption relation ⊑, the locations in Li may be assigned levels. Assume that the length of the longest downward path from the maximal element in Li to a minimal element is n. Then,
• the maximal element is a location at level n,
• the elements which are the children of a location at level k are locations at level k − 1.
Thus, the maximal element ℓi ∈ Li (as in Lemma 5.1) resides at the highest level, the minimal element in Li resides at the lowest level, and the other locations in Li are arranged at levels in between. A location at level n is denoted as ℓⁿ. For a cluster Δij below ℓij, the level of ℓij is called the height of Δij and is denoted as height(Δij).
Example 5.2 Let us consider again the prime location R′({(a31, a32)}, [b3], {{c31, c32, c33}}) and its sublocations (see Example 3.2).
• Suppose that we have Δ = {(ℓ112, a32), (ℓ22, b31), (ℓ23, b32), (ℓ2, [b31, b32])}. Because ℓ22 ⊑ ℓ2 and ℓ23 ⊑ ℓ2, the updates (ℓ22, b31), (ℓ23, b32) and (ℓ2, [b31, b32]) are partitioned into one cluster, while (ℓ112, a32) is in another cluster by itself.
• Suppose that we have Δ = {(ℓ112, a32), (ℓ22, b31), (ℓ23, b32), (ℓ2, [b31, b32]), (ℓ0, (∅, [b3], {{c31, c32}}))}. As ℓ112, ℓ22, ℓ23 and ℓ2 are all subsumed by the location ℓ0, they are all in one cluster.
5.3. Cluster-Compatibility
In light of Corollary 5.1, the problem of consistency checking is reduced to that of verifying the consistency of clusters.
Lemma 5.4 Let Δℓ be a cluster below the location ℓ. If the set {(ℓ1ⁿ, b1), ..., (ℓiⁿ, bi)} of all updates in Δℓ at a level n < height(Δℓ) is value-compatible, then, as discussed in Subsection 5.1, it is possible to define a set {(ℓ1ⁿ⁺¹, b′1), ..., (ℓjⁿ⁺¹, b′j)} of updates at the level n + 1 such that, for all states S and any location ℓ′ ∈ LS, we have
valS+{(ℓ1ⁿ, b1), ..., (ℓiⁿ, bi)}(ℓ′) = valS+{(ℓ1ⁿ⁺¹, b′1), ..., (ℓjⁿ⁺¹, b′j)}(ℓ′).
Proof. Since the level n is less than height(Δℓ), the set {(ℓ1ⁿ, b1), ..., (ℓiⁿ, bi)} of updates can be grouped according to whether their locations are subsumed by the same location at level n + 1, e.g., {(ℓk1ⁿ, bk1), ..., (ℓkpⁿ, bkp)} ⊆ {(ℓ1ⁿ, b1), ..., (ℓiⁿ, bi)} is the group in which the locations ℓk1ⁿ, ..., ℓkpⁿ are subsumed by some location ℓmⁿ⁺¹ ∈ {ℓ1ⁿ⁺¹, . . . , ℓjⁿ⁺¹}. Then, for each group of updates with locations at level n, if they are value-compatible, they can be integrated into an exclusive update that has a location at level n + 1 as follows:
Ω{(ℓk1ⁿ, bk1), ..., (ℓkpⁿ, bkp)} = (ℓmⁿ⁺¹, b′m),
where valS+{(ℓk1ⁿ, bk1), ..., (ℓkpⁿ, bkp)}(ℓ′) = valS+{(ℓmⁿ⁺¹, b′m)}(ℓ′) for each state S and all ℓ′ ∈ LS. In doing so, the set of updates {(ℓ1ⁿ, b1), ..., (ℓiⁿ, bi)}, if it is value-compatible, defines a set of exclusive updates {(ℓ1ⁿ⁺¹, b′1), ..., (ℓjⁿ⁺¹, b′j)} in which the locations are one level higher than n.
We finally obtain the following main result on the consistency of clusters.
Theorem 5.1 Let Δℓ be a cluster below the location ℓ. If Δℓ is "level-by-level" value-compatible, then Δℓ is consistent.
Proof. If Δℓ is "level-by-level" value-compatible, then for any state S, and starting from the updates on locations at the lowest level, the exclusive updates on locations at the same level in Δℓ can be replaced by exclusive updates on one-level-higher locations, as stated in Lemma 5.4. As the set of exclusive updates at each level is value-compatible, this procedure continues until we reach the highest level in Δℓ, i.e., the height of Δℓ. Finally, all the updates at the level height(Δℓ) are combined into a single exclusive update (ℓ, b) if they
are value-compatible, i.e., valS+{(ℓ, b)}(ℓ′) = valS+Δℓ(ℓ′) for all ℓ′ ∈ LS.
Example 5.3 Let us look back again at the cluster below the location ℓ0 in the second case of Example 5.2. First, (ℓ112, a32) at level 0 can be integrated into the update (ℓ11, (a31, a32)) at level 1. Then (ℓ11, (a31, a32)) at level 1 is integrated into the update (ℓ1, {(a31, a32)}) at level 2. Similarly, integrating (ℓ22, b31) and (ℓ23, b32) at level 1 results in the update (ℓ2, [b31, b32]) at level 2, which is identical with the original update to the location ℓ2 in the cluster. As (ℓ1, {(a31, a32)}) and (ℓ2, [b31, b32]) are also value-compatible, they can be integrated and checked for consistency against (ℓ0, (∅, [b3], {{c31, c32}})). Since the resulting update (ℓ0, ({(a31, a32)}, [b31, b32], {{c31, c32, c33}})) at level 3 is not value-compatible with the update (ℓ0, (∅, [b3], {{c31, c32}})) at level 3, this cluster below ℓ0 is not consistent.
5.4. Integration Algorithm
In this subsection, we present how to integrate exclusive updates algorithmically. For clarity, the procedure is given in terms of two algorithms. The first algorithm clusters the updates in a given set of exclusive updates. Every update is initially assumed to define a cluster. We then successively consider each pair of updates where one update subsumes the other, and amalgamate their respective clusters into larger ones until no more changes can be made.
Algorithm 5.1
Input: An update set Δ that contains only exclusive updates
Output: A set clus(Δ) of clusters
Procedure:
(i) Start with P = ∅ and clus(Δ) = {{u} | u ∈ Δ};
(ii) check the subsumption relation for any two updates ux, uy ∈ Δ:
  • if the locations of ux and uy are related by subsumption, then add {ux, uy} to P such that P = P ∪ {{ux, uy}};
(iii) do the following as long as there are changes to clus(Δ):
  • for each element V in P, do the following:
    — V′ = ⋃{x | x ∈ clus(Δ) and x ∩ V ≠ ∅},
    — clus(Δ) = clus(Δ) ∪ {V′} − {x | x ⊆ V′ and x ∈ clus(Δ)}.
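Algorithm 5.1 is essentially an amalgamation of connected components under subsumption. A brief Python sketch is given below; locations are encoded as tuples so that subsumption can be tested as prefix containment, which is our own illustrative assumption rather than the paper's encoding:

```python
def subsumes(l1, l2):
    """l1 subsumes l2 when l2's path extends l1's path (prefix test)."""
    return len(l2) >= len(l1) and l2[:len(l1)] == l1

def cluster(updates):
    """Amalgamate updates whose locations are related by subsumption."""
    clusters = [[u] for u in updates]
    changed = True
    while changed:
        changed = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(subsumes(l1, l2) or subsumes(l2, l1)
                       for l1, _ in clusters[i] for l2, _ in clusters[j]):
                    clusters[i] += clusters.pop(j)
                    changed = True
                    break
            if changed:
                break
    return clusters

updates = [(("R", "o3", "A2", 0), "b31"), (("R", "o3", "A2", 1), "b32"),
           (("R", "o3", "A2"), ["b31", "b32"]), (("R", "o1", "A3"), "c")]
for c in cluster(updates):
    print(c)   # two clusters: the three A2 updates of o3, and the A3 update of o1
```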
The second algorithm then takes the set of clusters and transforms it into a set of exclusive updates in which the locations are pairwise disjoint. This is done in accordance with Theorem 5.1, that is, through level-by-level integration, provided the updates in each cluster at each level are value-compatible.
Algorithm 5.2
Input: A set clus(Δ) of clusters
Output: An update set Δ′
Procedure:
(i) Δ′ = ∅;
(ii) for each cluster Δi ∈ clus(Δ), apply the following steps:
  • assign a level to each location in Loc(Δi) in accordance with the schema information provided by the database environment;
  • V = Δi;
  • do the following until the height of the cluster Δi is reached:
    — P = {(ℓ, b) | (ℓ, b) ∈ V and the level level(ℓ) of ℓ is minimal in V};
    — partition the updates in P such that, for each partition class {(ℓ1, b1), ..., (ℓn, bn)} ⊆ P, there exists a location ℓ one level higher than the minimal level with ℓi ⊑ ℓ (i = 1, ..., n);
    — for each partition class {(ℓ1, b1), ..., (ℓn, bn)} ⊆ P, check the value-compatibility of the update set {(ℓ1, b1), ..., (ℓn, bn)}:
      (a) if it is value-compatible, then do the following:
        ∗ apply the parallel composition operation (ℓ, b) = Ω{(ℓ1, b1), ..., (ℓn, bn)};
        ∗ V = (V − P) ∪ {(ℓ, b)};
      (b) otherwise, Δ′ = Δλ and exit the algorithm.
  • Δ′ = Δ′ ∪ V.
(iii) Exit the algorithm with Δ′.
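The level-by-level integration of Algorithm 5.2 can be sketched for list-valued parent locations: the position updates of the children are composed over the parent's current value and checked for value-compatibility against any exclusive update on the parent itself, echoing Examples 5.1 and 2.4. The sketch below is again only a hedged illustration, not the full algorithm:

```python
class Inconsistent(Exception):
    """Plays the role of Delta_lambda for exclusive updates."""

def integrate_level(parent_value, child_updates, parent_updates):
    """Compose child (position) updates over the parent's current value and
    check value-compatibility with any exclusive update on the parent itself."""
    composed = list(parent_value)
    for position, value in sorted(child_updates.items()):
        if position < len(composed):
            composed[position] = value          # replace an existing element
        else:
            composed.append(value)              # insert at the end
    for candidate in parent_updates:
        if candidate != composed:
            raise Inconsistent((candidate, composed))
    return composed

# Example 5.1: the composed child updates coincide with the parent update -> consistent
print(integrate_level(["b3"], {0: "b31", 1: "b32"}, [["b31", "b32"]]))

# Example 2.4-style conflict: the composed value disagrees with the parent update
try:
    integrate_level(["b3"], {0: "b"}, [["deleted"]])
except Inconsistent as e:
    print("inconsistent:", e)
```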
6. Conclusion

In this paper, we presented our research on the problem of partial updates in the context of complex-value databases. The work was motivated by the need for an efficient approach to checking the consistency of partial updates, in which locations may refer to parts of a complex object. We proposed an efficient approach for checking whether a given set of partial updates is consistent. In this approach, partial updates are classified into exclusive and shared updates, and the consistency check consists of two stages. The first stage uses an algebraic approach to normalize shared updates based on the compatibility of their operators, while the second stage checks the compatibility of clusters by integrating exclusive updates level by level. In future work, we will continue to explore the use of partial updates in optimising, rewriting and maintaining aggregate computations in database applications.
Inferencing in Database Semantics

Roland HAUSSER
Abteilung Computerlinguistik, Universität Erlangen-Nürnberg (CLUE)
Bismarckstr. 6, 91054 Erlangen, Germany
[email protected]

Abstract. As a computational model of natural language communication, Database Semantics1 (DBS) includes a hearer mode and a speaker mode. For the content to be mapped into language expressions, the speaker mode requires an autonomous control. The control is driven by the overall task of maintaining the agent in a state of balance by connecting the interfaces for recognition with those for action. This paper proposes to realize the principle of balance by sequences of inferences which respond to a deviation from the agent's balance (trigger situation) with a suitable blueprint for action (countermeasure). The control system is evaluated in terms of the agent's relative success in comparison with other agents and its absolute success in terms of survival, including the adaptation to new situations (learning). From a software engineering point of view, the central question of an autonomous control is how to structure the content in the agent's memory so that the agent's cognition can precisely select what is relevant and helpful to remedy a current imbalance in real time. Our solution is based on the content-addressable memory of a Word Bank, the data structure of proplets defined as non-recursive feature structures, and the time-linear algorithm of Left-Associative grammar.

1 For an introduction to DBS see [NLC'06]. For a concise summary see [Hausser 2009a].
Introduction

Designing an autonomous control as a software system requires a functional principle to drive it. Following earlier work such as [Bernard 1865] and [Wiener 1948], DBS control is based on the principle of balance, i.e., it is designed to maintain the agent in a steady state (equilibrium, homeostasis) relative to a continuously changing external and internal environment, short-, mid-, and long-term.2 In this way, changes of the environment are utilized as the main motor activating the agent's cognitive operations. The balance principle guides behavior towards daily survival in the agent's ecological niche. Behavior driven by instinct and by human desires not directly related to survival, such as power, love, belonging, freedom, and fun, may also be subsumed under the balance principle by treating them as part of the internal environment – like hunger.

The agent's balancing operations provide the foundation for a computational reconstruction of intention in DBS, just as the agent's recognition and action procedures provide the foundation for a computational reconstruction of concepts and of meanings
(cf. [AIJ'01]). This differs from [Grice 1965], who bases his notion of meaning on an elementary (undefined, atomic) notion of intention – which is unsuitable for computation.3 An autonomous control maintaining a balance by relating recognition to the evaluated outcome of possible reactions is decentralized,4 in line with [Brooks 1985].

2 Though conceptually much different from previous and current approaches to autonomous control, our mechanism is closer in spirit to circular causal systems in ecology [Hutchinson 1948] than to the more recent systems of control with a stratified architecture structured into the levels of organization, coordination, and execution [Antsaklis and Passino 1993].
3 Cf. [FoCL'99], Sect. 4.5, Example II.
4 The cooperative behavior of social animals, e.g., ants in a colony, may also be described in terms of balance.

1. Inferences of Database Semantics

Maintaining the agent in a state of balance is based on three kinds of DBS inference, called R(eactor), D(eductor), and E(ffector) inferences.5 R inferences are initiated by a trigger provided (i) by the agent's current external or internal recognition or (ii) by currently activated memories (subactivation, cf. Sect. 6). D and E inferences, in contrast, are initiated by other already active inferences, resulting in chaining. As a first, simple method of chaining, let us assume that the consequent of inference n must equal the antecedent of inference n+1.

R(eactor) inferences provide a response to actual or potential deviations from the agent's balance (cf. 1.1, 4.1, 12.1). A given trigger automatically initiates exactly those R inferences which contain the trigger concept, e.g., hot or hungry, in their antecedent.

D(eductor) inferences establish semantic relations of content, and are illustrated by summarizing (cf. 3.2), downward traversal (cf. 10.1), and upward traversal (cf. 10.4). Other kinds of D inferences are precondition and cause and effect. Triggered initially by an R inference, a D inference may activate another D inference or an E inference.

E(ffector) inferences provide blueprints for the agent's action components.6 Because E inferences connect central cognition with peripheral cognition, their definition has to be hand-in-glove with the robotic hardware they are intended to control.

The interaction of reactor, deductor, and effector inferences is illustrated by the following chain, using English rather than the formal data structure of proplets7 for simplicity:

1.1. CHAINING R, D, AND E INFERENCES
1. R: β is hungry cm β eats food.
2. D: β eats food pre β gets food.
3. D: β gets food ⇓ β gets α, where α ∈ {apple, pear, salad, steak}.
4. E: β gets α exec β locates α at γ.
5. E: β locates α at γ exec β takes α.
6. E: β takes α exec β eats α.
7. D: β eats α ⇑ β eats food.

5 This terminology is intended to distinguish DBS inferences from the inferences of symbolic logic. For example, while a deductive inference like modus ponens is based on form, the deductor inferences of DBS take content into account.
6 In robotics, effectors range from legs and wheels to arms and fingers. The E inferences of DBS should also include gaze control.
7 Proplets are defined as non-recursive (flat) feature structures and serve as the basic elements of propositions. Like the cell in biology, the proplet is a fundamental unit of structure, function, and organization in DBS.

Step 1 is an R inference with the connective cm (for countermeasure) and triggered by a sensation of hunger. Step 2 is a D inference with the connective pre (for precondition),
while step 3 is the D inference for downward traversal with the connective ⇓ (cf. 10.1). Steps 4, 5, and 6 are E inferences with the connective exec (for execute). Step 4 may be tried iteratively for the instantiations of food provided by the consequent of step 3 (see the restriction on the variable α). If the agent cannot locate an apple, for example, it tries next to locate a pear, etc. Individual food preferences of the agent may be expressed by the order of the elements in the variable restriction. Step 7 is based on the D inference for upward traversal with the connective ⇑ (cf. 10.4). This step is called the completion of the chain because the consequent of the inference equals the consequent of step 1. The completion indicates the successful execution of the countermeasure to the imbalance indicated by the antecedent of the initial reactor inference.
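The chaining discipline can be made concrete with a minimal Python sketch. The string representation of contents and the connective labels used below are simplifications introduced here for readability; they are not the proplet format of DBS.

    # A minimal sketch of the chaining discipline illustrated in 1.1.
    # Contents are plain strings; the connectives are recorded only for readability.
    chain = [
        ("R", "beta is hungry",              "cm",   "beta eats food"),
        ("D", "beta eats food",              "pre",  "beta gets food"),
        ("D", "beta gets food",              "down", "beta gets alpha"),
        ("E", "beta gets alpha",             "exec", "beta locates alpha at gamma"),
        ("E", "beta locates alpha at gamma", "exec", "beta takes alpha"),
        ("E", "beta takes alpha",            "exec", "beta eats alpha"),
        ("D", "beta eats alpha",             "up",   "beta eats food"),
    ]

    def check_chain(chain):
        """Check the two conditions stated in the text: consecutive inferences are
        linked consequent-to-antecedent, and the last consequent equals the
        consequent of the initial R inference (completion)."""
        linked = all(chain[i][3] == chain[i + 1][1] for i in range(len(chain) - 1))
        completed = chain[-1][3] == chain[0][3]
        return linked and completed

    print(check_chain(chain))   # True

The same check expresses the notion of completion: the chain of 1.1 is complete because its last consequent coincides with the consequent of the initial reactor inference.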
2. Coreference-by-Address

The implementation of DBS inferences depends on the DBS memory structure. Called Word Bank, it is content-addressable8 in that it does not require a separate index (inverted file) for the storage and retrieval of proplets. A content-addressable memory is especially suitable for fixed content, i.e., content is written once and never changed. This provides a major speed advantage over the more widely used coordinate-addressable memory (as in a relational database) because internal access may be based on pointers enabling direct access to data.

In DBS, the requirement of fixed content is accommodated by adding content instead of revising it, and by connecting the new content to the old by means of pointers. Consider, for example, a cognitive agent observing at moment ti that Julia is sleeping and at tj that Julia is awake, referring to the same person. Instead of representing this change by revising the first proposition into the second,9 the second proposition is added as new content, leaving the first proposition unaltered:

8 See [Chisvin and Duckworth 1992] for an overview.
9 A more application-oriented example would be fuel level high at ti and fuel level low at tj.

2.1. COREFERENTIAL COORDINATION IN A WORD BANK STORING PROPLETS
member proplets                                                                          owner proplets
... [noun: Julia | fnc: sleep | prn: 675]  [noun: (Julia 675) | fnc: wake | prn: 702]    ... [core: Julia]
... [verb: sleep | arg: Julia | prn: 675]                                                ... [core: sleep]
... [verb: wake | arg: (Julia 675) | prn: 702]                                           ... [core: wake]
In a proplet, the part-of-speech attribute, e.g., noun or verb, is called the core attribute and its value is called the core value. A Word Bank stores proplets with equivalent core values in the same token line in the order of their arrival. The occurrence of Julia in the
second proposition is represented by a proplet with a core attribute containing an address value, i.e., [noun: (Julia 675)], instead of a regular core value, e.g., [noun: Julia]. Coreference-by-address enables a given proplet to code as many semantic relations to other proplets as needed. For example, the proplets representing Julia in 2.1 have the fnc value sleep in proposition 675, but wake in proposition 702. The most recent (and thus most up-to-date) content relating to the original proplet is found by searching the relevant token line from right to left, i.e., in the anti-temporal direction.

Coreference-by-address combines with the semantic relations of functor-argument and coordination structure, as in the following example:

2.2. COREFERENCE-BY-ADDRESS CONNECTING NEW TO OLD CONTENT
[verb: sleep | arg: Julia | prn: 675]  ↔(1)  [noun: Julia | fnc: sleep | prn: 675]  ←(2)  [noun: (Julia 675) | fnc: wake | prn: 702]  ↔(3)  [verb: wake | arg: (Julia 675) | prn: 702]
The connections 1 and 3 are intrapropositional and based on the functor-argument relations between Julia and sleep, and Julia and wake, respectively. Connection 2 is extrapropositional and based on the coreference between the pointer proplet of proposition 702 and the original Julia proplet of proposition 675.10 One way to realize 2.2 in English would be Julia was asleep. Now she is awake.

10 In its basic form, coreference-by-address is one-directional, from the pointer proplet to the original. The inverse direction may be handled by building an additional index. As usual, the proplets in 2.2 are order-free. During language production, an order is re-introduced by navigating from one proplet to the next.
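A minimal sketch of such an append-only Word Bank, with token lines keyed by core value and pointer values stored in the token line of their original, may look as follows. The class and method names are illustrative assumptions of this sketch, not part of the DBS definitions.

    from collections import defaultdict

    class WordBank:
        """Sketch of a Word Bank: token lines keyed by core value, append-only."""

        def __init__(self):
            self.token_lines = defaultdict(list)   # core value -> member proplets

        def core_value(self, proplet):
            # the core attribute is the part-of-speech attribute (noun, verb, adj)
            for attr in ("noun", "verb", "adj"):
                if attr in proplet:
                    return proplet[attr]
            raise ValueError("proplet has no core attribute")

        def store(self, proplet):
            core = self.core_value(proplet)
            # a pointer value like ("Julia", 675) is stored in the token line
            # of its original core value, here "Julia"
            key = core[0] if isinstance(core, tuple) else core
            self.token_lines[key].append(proplet)

        def most_recent(self, key):
            # search the token line from right to left (anti-temporal direction)
            return self.token_lines[key][-1]

    wb = WordBank()
    wb.store({"noun": "Julia", "fnc": "sleep", "prn": 675})
    wb.store({"verb": "sleep", "arg": ["Julia"], "prn": 675})
    # the change observed at tj is added as new content with a pointer, not by revision
    wb.store({"noun": ("Julia", 675), "fnc": "wake", "prn": 702})
    wb.store({"verb": "wake", "arg": [("Julia", 675)], "prn": 702})
    print(wb.most_recent("Julia"))   # the proplet with fnc: wake, prn: 702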
3. Inference for Creating Summaries

Coreference-by-address allows not only (i) revising the fixed information in a content-addressable memory by extending it, as in 2.1, but also (ii) deriving new content from stored content by means of inferencing. One kind of DBS inference is condensing content into a meaningful summary. As an example, consider a short text, derived in detail in Chapts. 13 (hearer mode) and 14 (speaker mode) of [NLC'06]:

The heavy old car hit a beautiful tree. The car had been speeding. A farmer gave the driver a lift.

A reasonable summary of this content would be car accident. This summary may be represented in the agent's Word Bank as follows:

3.1. RELATING SUMMARY TO TEXT
[Word Bank display of the member and owner proplets for the text and for the summary (prn value 67) omitted]
Propositions 1 and 2 are connected (i) by adjacency-based coordination coded in the nc (next conjunct) and pc (previous conjunct) attribute values of their verb proplets hit and speed, and (ii) by coreferential coordination based on the original car proplet in proposition 1 and the corresponding pointer proplet in proposition 2. The summary consists of another car pointer proplet and the accident proplet, each with the same prn value (here 67) and related to each other by the modifier-modified relation. The connection between the summary and the original text is based on the address value (car 1), which serves as the core value of the rightmost car proplet as well as the mdr (modifier) value of the accident proplet.

The summary-creating inference deriving the new content with the prn value 67 is formally defined as the following D(eductor) inference rule, shown with the sample input and output of 3.1 at the content level:

3.2. SUMMARY-CREATING D INFERENCE

rule level
antecedent: [noun: α | fnc: hit | prn: K]  [verb: hit | arg: α β | prn: K]  [noun: β | fnc: hit | prn: K]
  ⇒
consequent: [noun: (α K) | mdd: accident | prn: K+M]  [noun: accident | mdr: (α K) | prn: K+M]
where α ∈ {car, truck, boat, ship, plane, ...} and β ∈ {tree, rock, wall, mountain, ...} ∪ α

(matching and binding)

content level
input:  [noun: car | fnc: hit | prn: 1]  [verb: hit | arg: car tree | nc: 2 speed | pc: | prn: 1]  [noun: tree | fnc: hit | prn: 1]
output: [noun: (car 1) | mdd: accident | prn: 67]  [noun: accident | mdr: (car 1) | prn: 67]
The rule level shows two sets of pattern proplets, called the antecedent and the consequent, and connected by the operator ⇒. Pattern proplets are defined as proplets with variables as values, while the proplets at the content level do not contain any variables. The consequent pattern uses the address (or pointer, cf. Sect. 2) value (α K) to relate to the antecedent and has the new prn value K+M, with M > 0. In the rule, the possible values which α and β may be bound to during matching are restricted by the co-domains of these variables: the restricted variable α generalizes the summary-creating inference to different kinds of accidents, e.g., car accident, truck accident, etc., while the restricted variable β limits the objects to be hit to trees, rocks, etc., as well as cars, trucks, etc. Any content represented by the proplet hit with a subject
and an object proplet satisfying the variable restrictions of α and β, respectively, will be automatically (i) summarized as an accident of a certain kind whereby (ii) the summary is related to the summarized by means of an address value, here (car 1), thus fulfilling the condition that the data in a content-addressable memory may not be modified.

By summarizing content into shorter and shorter versions, there emerges a hierarchy which provides retrieval relations for upward or downward traversal (cf. Sect. 10). An upward traversal supplies more and more general notions, which may be used by the agent to access inferences defined at the higher levels. A downward traversal supplies the agent with more and more concrete instantiations.

4. Horizontal and Vertical Aspects of Applying DBS Inferences

DBS inferences are defined as formal rules which are applied to content in the agent's Word Bank by means of pattern matching. As a software operation, such an application may be divided into phases which happen to have horizontal and vertical aspects. The horizontal aspect concerns the relation between the antecedent and the consequent of an inference and the chaining of inferences. The vertical aspect concerns the relation between the rule level and the content level, within an inference and in a chain of inferences. Consider the formal definition of the first inference in 1.1, applied to a suitable content:

4.1. FORMAL DEFINITION OF THE hungry-eat R(EACTOR) INFERENCE

rule level
antecedent: [noun: β | fnc: hungry | prn: K]  [verb: hungry | arg: β | prn: K]
  cm
consequent: [noun: (β K) | fnc: eat | prn: K+M]  [verb: eat | arg: (β K) food | prn: K+M]  [noun: food | fnc: eat | prn: K+M]
where 0 < M < θ

(matching and binding)

content level
input:  [noun: Julia | fnc: hungry | prn: 211]  [verb: hungry | arg: Julia | prn: 211]
output: [noun: (Julia 211) | fnc: eat | prn: 220]  [verb: eat | arg: (Julia 211) food | prn: 220]  [noun: food | fnc: eat | prn: 220]
The upper bound θ is intended to ensure that the content of the consequent closely follows the content of the antecedent. Furthermore, the inclusion of the antecedent’s subject in the consequent by means of the address value (β K) excludes cases in which one agent is hungry and another one eats food – which would fail as an effective countermeasure. The rule application starts with the vertical grounding of the antecedent in the trigger situation by matching and binding. Next there is the horizontal relation between the grounded antecedent and the consequent, which formalizes a countermeasure (cm) connected to the antecedent and its trigger situation. Finally, the patterns of the consequent vertically derive a new content as a (preliminary) blueprint for action which may horizontally activate another inference, as shown in 1.1. 5. Schema Derivation and Intersection The sets of connected pattern proplets constituting the antecedent and the consequent of an inference like 3.2 or 4.1 are each called a DBS schema. Schemata are used in
general for retrieving (visiting, activating) relevant content in a Word Bank. A schema is derived from a content, represented as a set of proplets, by simultaneously substituting all occurrences of a constant with a restricted variable. Consider the following example of a content:

5.1. PROPLETS CODING THE CONTENT OF Julia knows John
[noun: Julia | fnc: know | prn: 625]  [verb: know | arg: Julia John | prn: 625]  [noun: John | fnc: know | prn: 625]
This representation characterizes functor-argument structure in that the Julia and John proplets11 specify know as the value of their fnc attributes,12 and the know proplet specifies Julia and John as the values of its arg attribute. The content may be turned into a schema by replacing its prn value 625 with the variable K, restricted to the positive integers. This schema will select all propositions in a Word Bank with a content equivalent to 5.1. The set of proplets matched by a schema is called its yield.

11 When we refer to a proplet by its core value, we use Italic, e.g., John.
12 When we refer to an attribute or a value within a proplet, we use Helvetica, e.g., fnc or know.

The yield of a schema relative to a given Word Bank may be controlled precisely by two complementary methods. One is by the choice and number of constants in a content which are replaced by restricted variables. For example, the following schema results from replacing the constants Julia, John, and 625 in content 5.1 with the variables α, β, and K, respectively:

5.2. POSSIBLE SCHEMA RESULTING FROM 5.1
[noun: α | fnc: know | prn: K]  [verb: know | arg: α β | prn: K]  [noun: β | fnc: know | prn: K]
The yield of this schema comprises all contents in which someone knows someone. However, if only John and 625 in content 5.1 are replaced by variables, the resulting schema has a smaller, more specific yield, namely all contents in which Julia knows someone.

When a schema with several pattern proplets is used as a query, its yield is obtained by "intersecting" the token lines corresponding to the pattern proplets' core values (provided the latter are constants). As an example, consider the schema for hot potato:

5.3. SCHEMA FOR hot potato
[adj: hot | mdd: potato | prn: K]  [noun: potato | mdr: hot | prn: K]
The functor-argument structure of this example (consisting of a modifier and a modified) is a schema because the prn value is the variable K. Applying the schema to the corresponding token lines in the following example results in two intersections:
5.4. INTERSECTING TOKEN LINES FOR hot AND potato

member proplets                                                                          owner proplets
... [adj: hot | mdd: potato | prn: 20]  [adj: hot | mdd: water | prn: 32]  [adj: hot | mdd: potato | prn: 55]  [adj: hot | mdd: day | prn: 79]    [core: hot]
... [noun: potato | fnc: look_for | mdr: hot | prn: 20]  [noun: potato | ... | prn: 35]  [noun: potato | ... | mdr: hot | prn: 55]  [noun: potato | ... | prn: 88]    [core: potato]
The intersections contain the proplets with the prn values 20 and 55. They are selected because the pattern proplets of schema 5.3 match only hot proplets with the mdd (modified) value potato and only potato proplets with the mdr (modifier) value hot.

The other method to control and adjust the yield of a schema is in terms of the restrictions on the variables. Restrictions may consist in an explicit enumeration of what a variable may be bound to (cf. 3.2). Restrictions may also be specified by constants, like vehicle or obstacle, which lexically provide similar sets as the enumeration method by using a thesaurus, an ontology, WordNet, or the like. The two methods of fine-tuning a DBS schema result in practically13 perfect recall and precision. This is crucial for autonomous control because the effective activation of relevant data is essential for the artificial agent to make good decisions.

13 Recall and precision are defined in terms of subjective user satisfaction. Cf. [Salton 1989].
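The combination of matching, binding, and token-line intersection described above can be sketched as follows. The Var class for restricted variables and the dictionary representation of proplets and token lines are simplifying assumptions of this sketch, not the DBS implementation itself.

    class Var:
        """A restricted variable in a pattern proplet (sketch)."""
        def __init__(self, name, restriction=None):
            self.name, self.restriction = name, restriction

    def match(pattern, proplet, bindings):
        """Match one pattern proplet against one content proplet, extending the
        current variable bindings; return None on failure."""
        new = dict(bindings)
        for attr, pval in pattern.items():
            if attr not in proplet:
                return None
            cval = proplet[attr]
            if isinstance(pval, Var):
                if pval.restriction is not None and cval not in pval.restriction:
                    return None
                if pval.name in new and new[pval.name] != cval:
                    return None
                new[pval.name] = cval
            elif pval != cval:
                return None
        return new

    def intersect(schema, token_lines):
        """Yield binding environments for which every pattern proplet of the
        schema matches some proplet in the token line of its core value."""
        results = [{}]
        for pattern in schema:
            # the core attribute is the part-of-speech attribute
            core = next(pattern[a] for a in ("noun", "verb", "adj") if a in pattern)
            line = token_lines.get(core, [])
            results = [b2 for b in results for p in line
                       if (b2 := match(pattern, p, b)) is not None]
        return results

    K = Var("K")
    schema_hot_potato = [
        {"adj": "hot", "mdd": "potato", "prn": K},
        {"noun": "potato", "mdr": "hot", "prn": K},
    ]
    token_lines = {
        "hot":    [{"adj": "hot", "mdd": "potato", "prn": 20},
                   {"adj": "hot", "mdd": "water",  "prn": 32},
                   {"adj": "hot", "mdd": "potato", "prn": 55}],
        "potato": [{"noun": "potato", "fnc": "look_for", "mdr": "hot", "prn": 20},
                   {"noun": "potato", "fnc": "cook",     "mdr": "hot", "prn": 55}],
    }
    print(intersect(schema_hot_potato, token_lines))   # bindings with K = 20 and K = 55

Run on the hot and potato token lines of 5.4, the query returns exactly the two intersections with the prn values 20 and 55.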
6. Subactivation (Selective Attention)

In DBS, the selection of content by means of schemata is complemented by the equally powerful method of subactivation: the concepts provided by recognition and inferencing are used as a continuous stream of triggers which select corresponding data in the Word Bank. As an example, consider the following subactivation of a token line:

6.1. TRIGGER CONCEPT SUBACTIVATING A CORRESPONDING TOKEN LINE
member proplets                                                                          owner proplet      trigger concept
... [adj: hot | mdd: potato | prn: 20]  [adj: hot | mdd: water | prn: 32]  [adj: hot | mdd: potato | prn: 55]  [adj: hot | mdd: day | prn: 79]    [core: hot]  ⇐  hot
Subactivation is an automatic mechanism of association,14 resulting in a mild form of selective attention. It works like a dragnet, pulled by the incoming concepts serving as triggers and accompanying them with corresponding experiences from the agent's past. Intuitively, subactivation may be viewed as highlighting an area of content at half strength, setting it off against the rest of the Word Bank, but such that exceptional evaluations (cf. Sect. 8) are still visible as brighter spots. In this way, the agent will be alerted to potential threats or opportunities even in current situations which would otherwise seem innocuous – resulting in virtual triggers for suitable inferences.

14 Like associating a certain place with a happy memory.
The primary subactivation 6.1 may be extended into a secondary and tertiary one by spreading activation15 [Quillian 1968]. For example, using the semantic relations coded by the left-most proplet in 6.1, the following proposition may be subactivated, based on the continuation and prn values potato 20, look_for 20, and John 20:

6.2. SECONDARY SUBACTIVATION OF A PROPOSITION
[noun: John | fnc: look_for | prn: 20]  [verb: look_for | arg: John, potato | pc: cook 19 | nc: eat 21 | prn: 20]  [noun: potato | fnc: look_for | mdr: hot | prn: 20]  [adj: hot | mdd: potato | prn: 20]
While a secondary subactivation utilizes the intrapropositional relations of functor-argument and coordination structure (cf. [NLC'06], Chapts. 6 and 8), a tertiary subactivation is based on the corresponding extrapropositional relations (cf. [NLC'06], Chapts. 7 and 9). For example, using the pc (previous conjunct) and nc (next conjunct) values of the look_for proplet in 6.2, the tertiary subactivation may spread from John looked for a hot potato to the predecessor and successor propositions with the verb values cook and eat, and the prn values 19 and 21, respectively.

15 In fiction, our notion of triggering a spreading subactivation is illustrated by the madeleine experience of [Proust 1913], which brings back an almost forgotten area of what he calls "l'édifice immense du souvenir."
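A rough procedural rendering of primary subactivation and of its spreading may look as follows. For simplicity the sketch indexes stored proplets by their prn values, where DBS would follow pointers; the attribute layout is likewise a simplification.

    def primary_subactivation(trigger, token_lines):
        """A trigger concept subactivates the token line of the same core value."""
        return list(token_lines.get(trigger, []))

    def spread(proplets, proplets_by_prn):
        """Secondary/tertiary spreading (sketch): follow the proposition numbers
        and the pc/nc continuation values of already subactivated proplets."""
        activated = list(proplets)
        for p in proplets:
            # intrapropositional spreading: all proplets sharing the prn value
            activated += proplets_by_prn.get(p["prn"], [])
            # extrapropositional spreading: previous and next conjunct addresses
            for attr in ("pc", "nc"):
                if p.get(attr):
                    _, prn = p[attr]              # e.g. ("cook", 19) or ("eat", 21)
                    activated += proplets_by_prn.get(prn, [])
        return activated

Starting from the hot token line of 6.1, the first call corresponds to the primary subactivation; repeated calls of spread correspond to the secondary and tertiary subactivations described above.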
7. Semantic Relations

Subactivation may spread along any semantic relations between proplets. By coding the semantic relations inside and between propositions solely as proplet-internal values, proplets become order-free and are therefore suitable for efficient storage and retrieval in the content-addressable memory of a Word Bank. Subactivation is made especially efficient by coding the semantic relations as pointers (cf. Sect. 2).

In DBS, the semantic relations are of two kinds, (i) form and (ii) content. The semantic relations of form are functor-argument and coordination structure, intra- and extrapropositionally; they are established during recognition and are utilized in the encoding of blueprints for action. In natural language communication, for example, the semantic relations of grammatical form are established in the hearer mode (recognition) and encoded in the speaker mode (action). The semantic relations of content are exemplified by cause and effect, precondition, the semantic hierarchies, etc. Content relations have been used to define associative (or semantic) networks (cf. [Brachman 1979] for an overview). In DBS, semantic relations of content are established by inferences.

The topic of semantic relations in general and of content relations in particular is widely discussed in linguistics, psychology, and philosophy. Content relations in lexicography, for example, are classified in terms of synonymy, antonymy, hypernymy, hyponymy, meronymy, and holonymy. In philosophy, content relations are viewed from a different perspective, described by [Wiener 1948], p. 133, as follows:

According to Locke, this [i.e., the subactivation of ideas, R.H.] occurs according to three principles: the principle of contiguity, the principle of similarity, and the principle of cause and
effect. The third of these is reduced by Locke, and even more definitely by Hume, to nothing but constant concomitance, and so is subsumed under the first, contiguity.
Formal examples of semantic relations of content in DBS are the summary inference 3.2, the hungry-eat inference 4.1, and the hierarchy inferences for downward traversal 10.1 and for upward traversal 10.4. DBS inferences serve not only to maintain the agent's balance, but also code a kind of knowledge which is different from a content like 5.1.

8. Evaluation of Content

If a cognitive agent were to value all subactivated contents the same, they would provide little guidance towards successful behavior – neither absolute in terms of the agent's survival nor relative in comparison to other agents. Even the path of daily routine, of least resistance, or of following some majority is ultimately the result of choices based on evaluation.

As a general notion, content evaluation has been investigated in philosophy, linguistics, psychology, and neurology. In today's natural language processing, it has reappeared as the sentiment detection of data mining [Turney 2002]. In modern psychology, evaluation is analyzed in emotion theory [Arnold 1993] and in appraisal theory [Lazarus and Lazarus 1994]. For a software model of control, evaluations are not so much a question of how they are expressed or which of them are universal,16 but how they are assigned internally by individual agents.

In DBS, evaluations are assigned when new content is read into the agent's Word Bank – by recognition or by inference. At their lowest level, recognition-based evaluations must be integrated into the agent's hardware (else they would be figments of imagination). For example, hot and cold require a sensor for temperature. Evaluations have been classified in terms of joy, sadness, fear, or anger, and are expressed in terms of good vs. bad, true vs. false, excellent vs. poor, virtuous vs. depraved, brave vs. cowardly, generous vs. cheap, loyal vs. treacherous, desirable vs. undesirable, acceptable vs. unacceptable, etc. For guiding the autonomous control of a cognitive agent, DBS uses the features [eval: attract] and [eval: avoid]. They are of a more basic and more neutral nature, and fit into the data structure of proplets. Their values may be scalar and may be set between neutral (0) and the extremes asymptotically approaching -1 or +1.

The overall purpose of DBS evaluation is to record (i) any actual deviation from the agent's state of balance, (ii) any impending threat to the agent's balance, and (iii) any possibility to secure positive aspects of maintaining the agent's balance mid- and long-term. Each is used as a trigger for selecting an inference which provides an appropriate reaction. For example, if it is too hot (evaluation-based trigger), go to where it is cooler (inference-based reaction).

16 Cf. [Darwin 1872], Chapt. XIV, pp. 351–360.

9. Adaptation and Learning

The mechanism of deriving and adjusting DBS schemata (cf. Sect. 5) holds at a level of abstraction which applies to natural and artificial agents alike. Because of the simplicity
of this mechanism, artificial agents may be designed like natural agents in that they adjust automatically over time. Thereby, the following differences between natural and artificial agents do not stand in the way: In natural agents, adjusting to a changing environment as well as optimizing come in two varieties, (i) the biological adaptation of a species in which physical abilities and cognition are co-evolved, and (ii) the learning of individuals which is mostly limited to cognition. Adaptation and learning differ also in that they apply to different ranges of time and different media of storage (gene memory vs. brain memory). In artificial agents, in contrast, improvement of the hardware is the work of engineers, while development of an automatically adjusting cognition is the work of software designers. Because of this division between hardware and software, the automatic adjustment of artificial agents corresponds more to learning than to adaptation. Fortunately, the absence of natural inheritance in artificial agents may be easily compensated by copying the cognition software (including the artificial agent's experiences and adaptations) from the current hardware model to the next.

The DBS mechanism underlying adaptation as well as learning is based on (i) deriving schemata from sets of content proplets17 by replacing constants with variables and on (ii) adjusting the restrictions of the variables (cf. Sect. 5). This mechanism may be automated based on the frequency of partially overlapping contents:

17 Content proplets consist of context proplets and language proplets (cf. [NLC'06], Sect. 3.2). Language proplets consist of unconnected lexical proplets (e.g., [NLC'06], 5.6.1) and the connected proplets of language-based propositions (e.g., [NLC'06], 3.2.4).

9.1. A SET OF CONTENTS WITH PARTIAL OVERLAP
Julia eats an apple
Julia eats a pear
Julia eats a salad
Julia eats a steak

For simplicity, the propositions are presented in English rather than by corresponding sets of proplets. Because of their partial overlap, the propositions may be automatically summarized as the following schema:

9.2. SUMMARIZING THE SET 9.1 WITH A SCHEMA
Julia eats α, where α ∈ {apple, pear, salad, steak}

Due to the restriction on the variable α, 9.2 is strictly equivalent to 9.1. The next step is to replace α by a concept serving as a hypernym, here food:

9.3. REPLACING THE RESTRICTED VARIABLE BY A HYPERNYM
Julia eats food, where food ∈ {apple, pear, salad, steak}

This concept may serve as the literal meaning of the word food in English, aliment in French, Nahrung in German, etc. (cf. [Hausser 2009b]). Implicit in the content of 9.3 is the following semantic hierarchy:
9.4. REPRESENTING THE SEMANTIC HIERARCHY IMPLICIT IN 9.3 AS A TREE

food
  apple    pear    salad    steak
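The derivation of 9.2 from 9.1 can be sketched as a small generalization routine, shown below before the discussion of its empirical adequacy. Propositions are simplified to tuples here; the single position in which they differ is replaced by None and the observed instantiations are collected as the variable restriction.

    def derive_schema(propositions):
        """Generalize propositions that overlap in all but one position (sketch).

        propositions -- list of equal-length tuples, e.g. ("Julia", "eat", "apple")
        Returns (schema, restriction): the schema has None at the generalized
        position; the restriction collects the observed instantiations.
        """
        length = len(propositions[0])
        differing = [i for i in range(length)
                     if len({p[i] for p in propositions}) > 1]
        if len(differing) != 1:
            return None                      # no single-position overlap
        i = differing[0]
        schema = propositions[0][:i] + (None,) + propositions[0][i + 1:]
        restriction = {p[i] for p in propositions}
        return schema, restriction

    contents = [("Julia", "eat", "apple"), ("Julia", "eat", "pear"),
                ("Julia", "eat", "salad"), ("Julia", "eat", "steak")]
    print(derive_schema(contents))
    # (('Julia', 'eat', None), {'apple', 'pear', 'salad', 'steak'})
    # Replacing the generalized position by the hypernym food yields the content of 9.3.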
The automatic derivation of a semantic hierarchy illustrated in 9.1 – 9.3 is empirically adequate if the resulting class containing the instantiations corresponds to that of the surrounding humans. For example, if the artificial agent observes humans to habitually (frequency) eat müsli, the restriction list of α must be adjusted correspondingly.18 Furthermore, the language surface chosen by the artificial agent for the hypernym concept (cf. 9.3) must correspond to that of the natural language in use.

18 This method resembles the establishment of inductive inferences in logic, though based on individual agents.

10. Hierarchy Inferences

An agent looking for food must know that food is instantiated by apples, pears, salad, or steaks, just as an agent recognizing an apple must know that it can be used as food. In DBS, this knowledge is implemented in terms of inferences for the downward and the upward traversal of semantic hierarchies like 9.4. For example, if Julia is looking for food, the following downward inference will derive the new content that Julia is looking for an apple, a pear, a salad, or a steak:

10.1. HIERARCHY-INFERENCE FOR DOWNWARD TRAVERSAL
rule level
antecedent: [noun: food | fnc: β | prn: K]
  ⇓
consequent: [noun: α | fnc: (β K) | prn: K+M]
where α ∈ {apple, pear, salad, steak}

(matching and binding)

content level
input:  [noun: Julia | fnc: look_for | prn: 18]  [verb: look_for | arg: Julia food | prn: 18]  [noun: food | fnc: look_for | prn: 18]
output: [noun: α | fnc: (look_for 18) | prn: 25]
The antecedent consists of a single pattern proplet with the core value food. When this pattern matches a corresponding proplet at the content level, the consequent derives a new content containing the following disjunction19 of several proplets with core values corresponding to the elements of the restriction set of α:

10.2. OUTPUT DISJUNCTION OF THE DOWNWARD INFERENCE APPLICATION 10.1
[noun: apple | fnc: (look_for 18) | nc: pear | prn: 25] or [noun: pear | pc: apple | nc: salad | prn: 25] or [noun: salad | pc: pear | nc: steak | prn: 25] or [noun: steak | pc: salad | nc: | prn: 25]

19 See [NLC'06], Chapt. 8, for a detailed discussion of intrapropositional coordination such as conjunction and disjunction.
The proplets of the output disjunction are concatenated by the pc (for previous conjunct) and nc (for next conjunct) features, and have the new prn value 25. They are related to the original proposition by the pointer address (look_for 18) serving as the fnc value of the first disjunct. The output disjunction may be completed automatically into the new proposition Julia looks_for apple or pear or salad or steak, represented as follows:

10.3. PROPOSITION RESULTING FROM APPLYING DOWNWARD INFERENCE 10.1
[noun: (Julia 18) | fnc: (look_for 18) | prn: 25]  [verb: (look_for 18) | arg: (Julia 18) apple or | prn: 25]  [noun: apple | fnc: (look_for 18) | nc: pear | prn: 25] or [noun: pear | pc: apple | nc: salad | prn: 25] or ...
This new proposition with the prn value 25 is derived from the given proposition with the prn value 18 shown at the content level of 10.1, and related to it by pointer values.

The inverse of downward traversal is the upward traversal of a semantic hierarchy. An upward inference assigns a hypernym like food to concepts like salad or steak. Consider the following definition with an associated sample input and output at the content level:

10.4. HIERARCHY-INFERENCE FOR UPWARD TRAVERSAL

rule level
antecedent: α ∈ {apple, pear, salad, steak} & [noun: α | fnc: β | prn: K]
  ⇑
consequent: [noun: food | fnc: (β K) | prn: K+M]

(matching and binding)

content level
input:  [noun: Julia | fnc: prepare | prn: 23]  [verb: prepare | arg: Julia salad | prn: 23]  [noun: salad | fnc: prepare | prn: 23]
output: [noun: food | fnc: (prepare 23) | prn: 29]
Like the downward inference 10.1, the antecedent of the upward inference consists of a single pattern proplet with the restricted variable α as the core value. Due to the use of a pointer address as the fnc value of the output (required anyway by the content-addressable memory of DBS), there is sufficient information to complete the output proplets into the proposition Julia prepares food, with the prn value 29 and pointer proplets for Julia and prepare. The limited matching used by the upward and downward inferences has the advantage of generality. The automatic derivation and restriction of schemata (cf. Sect. 9) directly controls the automatic adaptation of the hierarchy inferences. They illustrate how DBS is intended to fulfill the three functions which define an autonomic system: "automatically configure itself in an environment, optimize its performance using the environment and mechanisms for performance, and continually adapt to improve performance and heal itself in a changing environment" [Naphade and Smith 2009].
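The downward and upward traversals can also be sketched procedurally. The dictionary encoding of the hierarchy 9.4 and the simplified proplet layout below are assumptions of the sketch, not the formal inference rules 10.1 and 10.4.

    hierarchy = {"food": ["apple", "pear", "salad", "steak"]}   # the tree of 9.4

    def downward(proplet, hierarchy, new_prn):
        """Downward traversal (sketch): replace a hypernym proplet by a
        disjunction of proplets for its instantiations, linked by pc/nc."""
        instances = hierarchy.get(proplet["noun"], [])
        out = []
        for i, concept in enumerate(instances):
            p = {"noun": concept, "prn": new_prn}
            if i == 0:
                p["fnc"] = (proplet["fnc"], proplet["prn"])   # pointer to the original
            if i > 0:
                p["pc"] = instances[i - 1]
            if i < len(instances) - 1:
                p["nc"] = instances[i + 1]
            out.append(p)
        return out

    def upward(proplet, hierarchy, new_prn):
        """Upward traversal (sketch): assign the hypernym to an instantiation."""
        for hypernym, instances in hierarchy.items():
            if proplet["noun"] in instances:
                return {"noun": hypernym,
                        "fnc": (proplet["fnc"], proplet["prn"]),
                        "prn": new_prn}
        return None

    food = {"noun": "food", "fnc": "look_for", "prn": 18}
    print(downward(food, hierarchy, 25))    # corresponds to the disjunction of 10.2
    salad = {"noun": "salad", "fnc": "prepare", "prn": 23}
    print(upward(salad, hierarchy, 29))     # corresponds to the food proplet of 10.4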
11. Analogical Models as Blueprints for Action

To obtain a suitable blueprint for an action, the agent may assemble reactor, deductor, and effector inferences creatively into a new chain – which may or may not turn out to be successful. Most of the time, however, it will be easier and safer for the agent to re-use an earlier action sequence, successfully self-performed or observed in others, provided such an analogical model is available in the agent's memory.

These earlier models are contained at various levels of detail in the contents subactivated by the initial R inference. The R inference defined in 4.1, for example, subactivates all contents matching the β is hungry schema (antecedent), the β eats food schema (consequent), as well as the token lines of the inference's constants, here hungry, eat, and food. By spreading to secondary and tertiary subactivations (cf. Sect. 6), the initial R inference may subactivate a large set of contents in the agent's Word Bank. These serve to illustrate the trigger situation with a cloud of subactivations (cf. [NLC'06], Sect. 5.6), but their precision is too low to provide a specific blueprint for practical, goal-directed action.

In order for a content stored in memory to be useful for resolving the agent's current challenge, it must (i) fit the trigger situation as precisely as possible and (ii) have a positively evaluated outcome. For this, our method of choice is DBS intersection (cf. Sect. 5). Assume that the agent is alone in Mary's house – which serves as a trigger (cf. 6.1) subactivating the token line of Mary in the agent's Word Bank. Furthermore, the agent is hungry, which triggers the hungry-eat inference 4.1. The constant eat in the consequent subactivates the corresponding token line, resulting in intersections between the Mary and eat token lines such as the following:

11.1. EXAMPLE OF TWO Mary eat INTERSECTIONS
[display of the two intersections omitted]
In other words, the agent remembers Mary once eating an apple and once eating müsli. The two proplets in each intersection share a prn value, namely 49 and 82, respectively, and are in a grammatical relation, namely functor-argument structure. In both intersections, the verb proplet eat provides two continuations. For example, the verb of the first intersection provides the continuation values apple and take 48, which may result in the following secondary and tertiary subactivations (cf. Sect. 6).

11.2. SUBACTIVATION SPREADING FROM Mary eat TO Mary take apple
[display of the spreading subactivation omitted]
The anti-temporal order corresponds to the spreading direction of the subactivation. The apple 49 proplet (secondary subactivation) contains the eval attribute with the value attract. Assuming that the corresponding subactivation for the second intersection happens to evaluate the müsli 82 proplet as eval: avoid20 (not shown), the agent would pursue only the tertiary subactivation from the first (and not the second) intersection in 11.1 as a possible candidate for an analogical model for regaining balance.

20 The assumed evaluations reflect the agent's preference of eating apples over eating müsli.

To get at the information relevant for finding something to eat in Mary's house, the subactivation 11.2 may spread further, based on the pc (for previous conjunct) value locate 47 of the take 48 proplet. In this way, the subactivation of the earlier eating event may be completed into the following backward sequence of propositions:

11.3. SUBACTIVATED SEQUENCE OF PROPOSITIONS (ANTI-TEMPORAL ORDER)
Mary eat apple [prn: 49].
Mary take apple [prn: 48].
Mary locate apple in blue cupboard [prn: 47].

The information relevant for the hungry agent is the location from where Mary got the apple, i.e., the blue cupboard. If the anti-temporal order is reversed, the propositions in 11.3 will match the antecedent of step 5 in Example 1.1 all the way to the consequent of step 7. This completes the chain relative to the consequent of the initial R inference 4.1 at the level of content, obviating steps 1–4 and thus without any assertion that Mary was hungry when she ate the apple.21 From the content 11.3 provided by memory via intersection, the agent may obtain an analogical model by (i) reversing the order and (ii) replacing the value Mary with a pointer to the agent, represented as moi:

11.4. RESULTING ANALOGICAL MODEL
exec Moi locate apple in blue cupboard [prn: 102]
exec Moi take apple [prn: 103]
exec Moi eat apple [prn: 104]
⇑ Moi eat food [prn: 105]

21 If the agent were to assume (unnecessarily) that Mary must have been hungry, then this would correspond to an abductive inference in logic. The point is that observing Mary eating is sufficient for the purpose at hand.

Whether or not these blueprints for the agent's action components will result in a successful countermeasure depends on whether proposition 102 turns out to hold in the agent's current situation or not.

12. Learning by Imitation

The purposeful subactivation of an earlier content in the Word Bank by means of intersection provides the agent with an analogical model potentially suitable to remedy its current imbalance. For example, instead of looking randomly through Mary's house for something to eat, the agent will begin by searching for an apple in the blue cupboard.

To implement such a system requires an agent with interfaces for recognition and action of a quality not yet available. Therefore, let us consider a simpler example, namely a robot loading its battery at one of several loading stations in its environment. In analogy to 1.1, this behavior may be controlled by the following chain of inferences:
12.1. AUTONOMOUS CONTROL AS A CHAIN OF R-D-E INFERENCES
1. R: β low battery cm β load battery.
2. D: β load battery pre β locate station.
3. D: β locate station ⇓ β locate α, where α ∈ {1, 2, 3, etc.}.
4. E: β locate α exec β attach to α.
5. D: β attach to α ⇑ β attach to station.
6. E: β attach to station exec β load battery.

The connectives cm (countermeasure), pre (precondition), ⇓ (is instantiated by), ⇑ (hypernym), and exec (execute) are as in 1.1. Steps 3 and 5 show a primitive semantic hierarchy, namely the term station for the instantiations of α. The consequent of step 6 provides completion.

In terms of current technology, each notion used in this software program, e.g., locate, attach, or load, has a rather straightforward procedural counterpart. It is therefore possible even today to build a real robot in a real environment performing this routine. Instead of programming the robot's operations directly, for example in C or Java, let us use a declarative specification in terms of proplets in a Word Bank. In other words, the robot's recognitions, e.g., locate α, are stored in its Word Bank as sets of proplets, and the robot's actions, e.g., attach_to α, are controlled by sequences of proplets.

To simulate learning by imitation, let us use two such robots, called A and B. Initially, each is training in its own environment, whereby A has the loading stations 1 and 2, and B has the loading stations 3, 4, and 5 – with their respective α variables defined accordingly. Once the individual loading routines are well established for both, A is put into the environment of B. To simplify A's recognition of loading events by B, let us assume that B emits a signal every time it is loading and that A can correctly interpret the signal.

In order for A to imitate B, A must follow B, remember the new locations, and adapt A's definition of α to the new environment. The new loading stations may differ in height, which may cause different efforts of reach, thus inducing preferences (evaluation). After following B around, A's battery is low. This imbalance triggers step 1 in 12.1. Being in B's environment, A subactivates the token line of B in A's Word Bank, while the consequent of step 1 subactivates the token line of load, leading to their intersection – in analogy to 11.1. Spreading results in secondary and tertiary subactivations:

12.2. SUBACTIVATED SEQUENCE OF PROPOSITIONS (ANTI-TEMPORAL ORDER)
B load battery [prn: 69].
B attach to station 3 [prn: 68].
B locate station 3 [prn: 67].

By reversing the spreading order into the temporal order and by replacing B by A, the visiting robot obtains the following blueprints for its action components:

12.3. BLUEPRINTS FOR ACTION
A locate station 3 [prn: 87].
A attach to station 3 [prn: 88].
A load battery [prn: 89].

Except for the replaced subject, these propositions consist of recognition content from A's memory. Therefore, their core values are tokens carrying sensory, motor, and conceptual information which is not provided by the types of the inference chain 12.1, but essential for action blueprints sufficiently detailed to master the situation at hand.
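The step from 12.2 to 12.3 – reversing the anti-temporal order and replacing the observed subject by the acting agent – can be sketched as follows. The tuple representation of propositions and the choice of the new prn values are simplifications made here.

    def derive_blueprint(remembered, old_subject, new_subject, start_prn):
        """Turn a remembered (anti-temporal) sequence of propositions into
        blueprints for action (sketch): reverse into temporal order and replace
        the observed subject by the acting agent."""
        blueprints = []
        for offset, proposition in enumerate(reversed(remembered)):
            subject, *rest = proposition
            assert subject == old_subject
            blueprints.append((new_subject, *rest, {"prn": start_prn + offset}))
        return blueprints

    remembered = [            # subactivated in anti-temporal order, as in 12.2
        ("B", "load", "battery"),
        ("B", "attach_to", "station 3"),
        ("B", "locate", "station 3"),
    ]
    for bp in derive_blueprint(remembered, "B", "A", 87):
        print(bp)
    # ('A', 'locate', 'station 3', {'prn': 87})
    # ('A', 'attach_to', 'station 3', {'prn': 88})
    # ('A', 'load', 'battery', {'prn': 89})

The same routine, applied to the content 11.3 with moi as the new subject, yields the analogical model 11.4.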
13. Fixed vs. Adaptive Behavior The behavior of robot A described above is flexible in that it can adapt to different environments of a known kind, here two rooms which differ in the number and location of loading stations. In this example, the artificial agents and their artificial environments are co-designed by the engineers. A more demanding setup is to take a given natural environment and to design a robot able to maintain a balance relative to internal and external changes. This requires (i) analysis of the external environment, (ii) construction of interfaces for the agent’s recognition of, and action in, the external environment, and (iii) definition of R(eactor), D(eductor), and E(ffector) inferences for optimal survival. The ultimate goal, however, is to design a robot with a basic learning software. It should be capable of deriving schemata (cf. Sect. 5) and semantic relations of content (cf. Sect. 7), and of automatically establishing and adapting instantiation classes22 (cf. Sect. 9). In this way, it should be able to continuously optimize behavior for daily survival in the agent’s ecological niche. This may be done in small steps, first testing the artificial agent in artificial environments it was specifically designed for, and then in new environments. By putting the artificial agent into more and more challenging test situations, the control software may be fine-tuned in small steps, by hand and by automatic adaptation.
14. Component Structure and Functional Flow

At any moment in time, the DBS model of a cognitive agent distinguishes three kinds of content: (i) old content stored in the Word Bank, (ii) new content provided by recognition, and (iii) new content provided by inference. Recognition, including language interpretation in the hearer mode, interprets the data stream provided by the external and internal interfaces non-selectively and adds the resulting content to the Word Bank. Inferences, in contrast, are triggered selectively by items which match their antecedent. Their derivation of new content is usually based on the subactivation of stored data (cf. Sect. 11), and is used as blueprints for action, including language production in the speaker mode. Memories of these actions are added non-selectively23 to the Word Bank.

22 [Steels 1999] presents algorithms for automatically evolving new classes from similar data by abstracting from what they take to be accidental (in the sense of Aristotle).
23 We are leaving aside the psychological phenomenon of repression (Unterdrückung) in natural agents.

The procedures of recognition and of inference are formally based on small sets of connected pattern proplets, called DBS schemata, which operate on corresponding sets of content proplets by means of pattern matching. The matching between individual pattern proplets and content proplets is greatly facilitated by their non-recursive feature structures (cf. [NLC'06], Sect. 3.2). So far, this method has been used for the following cognitive operations:

14.1. COGNITIVE OPERATIONS BASED ON MATCHING
a. natural language interpretation: matching between LA-hear grammar rules and language proplets (cf. [TCS'92], [NLC'06], Sect. 3.4)
b. navigation: matching between LA-think grammar rules and content proplets (cf. [NLC'06], Sect. 3.5, [Hausser 2009a])
c. querying: matching between query patterns and content proplets (cf. [NLC'06], Sect. 5.1)
d. inferencing: matching between inference rules and content proplets (cf. 3.2, 4.1, 10.1, 10.4).

Navigation (b) and inferencing (d) jointly provide the conceptualization (what to say?) and substantial parts of the realization (how to say it) for language production. The different kinds of matching between pattern proplets and content proplets in combination with the agent's cognitive input and output suggest the following component structure:24

14.2. COMPONENT STRUCTURE OF A COGNITIVE AGENT
[diagram omitted: a cognitive agent with peripheral cognition (the I/O component) and the rule and content components of central cognition]
The diagram shows three general components, (i) an I/O (input-output) component for recognition and action, (ii) a rule component for interpretation and production, and (iii) a content component for language and context (or non-language) data. The separation of patterns and of contents into distinct components provides a uniform structural basis for the rule component to govern the processing of content (7) – with data-driven feedback from the content component (8), including automatic schema derivation (Sect. 9).

The rule and the content component are each connected unidirectionally to the I/O component. All recognition output of this I/O component is input to the rule component (5), where it is processed and passed on to the content component (7). All action input to the I/O component comes from the content component (6), derived in frequent (8, 7) interaction with the rule component.

24 The component structure 14.2 raises the question of how it relates to an earlier proposal, presented in [NLC'06] as diagram 2.4.1. The [NLC'06] diagram models reference in the sense of analytic philosophy and linguistics, namely as a vertical relation between a horizontal language level and a horizontal context level – which is helpful for explaining the Seven Principles of Pragmatics (see [NLC'06], Sect. 2.6, for a summary). In diagram 14.2, this earlier component structure is embedded into the content component. Technically, the [NLC'06] diagram is integrated into 14.2 by changing to a different view: instead of viewing content proplets as sets with a common prn value (propositions), and separated into a language and a context level, the same proplets are viewed as items to be sorted into token lines according to their core value. Treating the [NLC'06] diagram as part of the content component in 14.2 serves to explain the separate input-output channels for the language and the context component in the earlier diagram: The I/O component of 14.2 provides the rule component with a (usually clear) distinction between language and non-language surfaces, resulting in a distinction between language proplets and context proplets during lexical lookup [Handl et al. 2009]. Therefore, the input channel to the content component 7 and the output channel 8 may each be divided into a part for language proplets and a part for context proplets.
Conclusion

Language production in the speaker mode of a cognitive agent raises the question of where the content to be realized should come from. The cycle of natural language communication modeled in DBS answers this question by providing two sources: (i) content provided by recognition, either current or stored in the agent's memory, and (ii) blueprints for action derived on-the-fly by the agent to maintain a state of balance (equilibrium, homeostasis) vis-à-vis a constantly changing external and internal environment.

So far, work on the speaker mode in DBS has concentrated on a systematic description of (i), i.e., production from recognition content (cf. [NLC'06], [Hausser 2009b]). This paper, in contrast, explores the foundations of (ii), i.e., a general solution to providing blueprints for meaningful actions by the agent, including natural language production. As a consequence, our focus here is on the what to say aspect of natural language production (conceptualization) rather than the how to say it aspect (realization).

A conceptualization based on a cognitive agent with a memory and interfaces to the external and internal environment stands in principled contrast to language production for weather reports or query answering for ship locations, train schedules, and the like. The latter are agentless applications; they are popular in the research literature because they make it possible to fudge the absence of an autonomous control. Their disadvantage, however, is that they cannot be extended to agent-based applications such as free dialog [Schegloff 2007], whereas the inverse direction from an agent-based to an agentless application is comparatively easy.

Proceeding on the assumption that a sound theoretical solution to natural language production must be agent-based, this paper shows how an autonomous control based on the principle of balance may be embedded into the cycle of natural language communication as formally modeled and computationally verified in DBS [NLC'06]. Founded technically on a content-addressable memory and coreference-by-address (pointers), this extension of the existing system requires a number of new procedures, such as automatic schema derivation, the subactivation and evaluation of content, adaptation and learning, the definition and chaining of inferences for deriving action blueprints, etc. The resulting conceptual model of a cognitive agent is summarized by showing the basic components and the functional flow connecting the interfaces for recognition with those for action.

To bring across the basic ideas, the presentation tries to be as intuitive as possible. Nevertheless, the formal illustrations of contents, patterns, rules, intersections, etc., provide the outline of a declarative specification for a straightforward transfer into efficiently running code.

Acknowledgements

This paper benefitted from comments by Johannes Handl, Thomas Proisl, Besim Kabashi, and Carsten Weber, research and teaching associates at the Abteilung für Computer-Linguistik Uni Erlangen (CLUE).
Therefore, the input channel to the content component 7 and the output channel 8 may each be divided into a part for language proplets and a part for context proplets.
References

[AIJ’01] Hausser, R. (2001). Database Semantics for natural language, Artificial Intelligence, 130.1:27–74, Elsevier. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html
[Anderson 1983] Anderson, J. R. (1983). A spreading activation theory of memory, Journal of Verbal Learning and Verbal Behavior, 22:261–295
[Antsaklis and Passino 1993] Antsaklis, P. J., and K. M. Passino, eds. (1993). An Introduction to Intelligent and Autonomous Control, Dordrecht: Kluwer Academic
[Arnold 1993] Arnold, M. B. (1984). Memory and the Brain, Hillsdale, NJ: Erlbaum
[Bernard 1865] Bernard, C. (1865). Introduction à l’étude de la médecine expérimentale, first English translation by Henry Copley Greene, published by Macmillan, 1927; reprinted in 1949
[Brachman 1979] Brachman, R. J. (1979). On the Epistemological Status of Semantic Networks, in N. Findler (ed.) Associative Networks, pp. 3–50, Academic Press
[Brooks 1985] Brooks, R. (1985). A Robust Layered Control System for a Mobile Robot, Cambridge, MA: MIT AI Lab Memo 864
[Chisvin and Duckworth 1992] Chisvin, L., and R. J. Duckworth (1992). Content-Addressable and Associative Memory, in M. C. Yovits (ed.) Advances in Computer Science, 2nd ed., pp. 159–235, Academic Press
[Darwin 1872] Darwin, C. (1872/1998). The Expression of the Emotions in Man and Animals, 3rd edition, London: Harper Collins
[FoCL’99] Hausser, R. (1999). Foundations of Computational Linguistics, 2nd ed., Heidelberg Berlin New York: Springer
[Grice 1965] Grice, P. (1965). Utterer’s meaning, sentence meaning, and word meaning, Foundations of Language, 4:1–18
[Handl et al. 2009] Handl, J., B. Kabashi, T. Proisl, and C. Weber (2009). JSLIM – Computational morphology in the framework of the SLIM theory of language, in C. Mahlow and M. Piotrowski (eds.) State of the Art in Computational Morphology, Berlin Heidelberg New York: Springer
[Hausser 2009a] Hausser, R. (2009). Modeling Natural Language Communication in Database Semantics, Proceedings of the APCCM 2009, Australian Comp. Sci. Inc., CRPIT, Vol. 96. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html
[Hausser 2009b] Hausser, R. (2009). From Word Form Surfaces to Communication, in T. Tokuda et al. (eds.) Information Modelling and Knowledge Bases XXI, Amsterdam: IOS Press Ohmsha. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html
[Hutchinson 1948] Hutchinson, G. E. (1948). Circular Causal Systems in Ecology, Ann. New York Acad. Science 50:221–246
[Lazarus and Lazarus 1994] Lazarus, R., and B. Lazarus (1994). Passion and Reason: Making Sense of Our Emotions, New York: Oxford University Press
[Naphade and Smith 2009] Naphade, M. R., and J. R. Smith (2009). Computer program product and system for autonomous classification, Patent Application #20090037358 – Class: 706 46 (USPTO)
[NLC’06] Hausser, R. (2006). A Computational Model of Natural Language Communication, Berlin Heidelberg New York: Springer
[Proust 1913] Proust, M. (1913). Du côté de chez Swann, ed. by Jean-Yves Tadié et al., Bibliothèque de la Pléiade, Paris: Gallimard, 1987–89
[Quillian 1968] Quillian, M. (1968). Semantic memory, in M. Minsky (ed.) Semantic Information Processing, pp. 227–270, Cambridge, MA: MIT Press
[Salton 1989] Salton, G. (1989). Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Reading, Mass.: Addison-Wesley
[Schegloff 2007] Schegloff, E. (2007). Sequence Organization in Interaction, New York: CUP
[Steels 1999] Steels, L. (1999). The Talking Heads Experiment, Antwerp: limited pre-edition for the Laboratorium exhibition
[TCS’92] Hausser, R. (1992). Complexity in Left-Associative Grammar, Theoretical Computer Science 106.2:283–308, Elsevier. Available online at http://www.linguistik.uni-erlangen.de/clue/de/publikationen.html
[Turney 2002] Turney, P. (2002). Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, Association for Computational Linguistics (ACL), pp. 417–424
[Wiener 1948] Wiener, N. (1948). Cybernetics: Or the Control and Communication in the Animal and the Machine, Cambridge, MA: MIT Press
Modelling a Query Space Using Associations

Mika TIMONEN a,1, Paula SILVONEN a,2 and Melissa KASARI b,3
a Technical Research Centre of Finland, PO Box 1000, FI-02044 VTT, Finland
b Department of Computer Science, PO Box 68, FI-00014 University of Helsinki, Finland

Abstract. We all use our associative memory constantly. Words and concepts form paths that we can follow to find new related concepts; for example, when we think about a car we may associate it with driving, roads or Japan, a country that produces cars. In this paper we present an approach for information modelling that is derived from human associative memory. The idea is to create a network of concepts where the links model the strength of the association between the concepts instead of, for example, semantics. The network, called an association network, can be learned with an unsupervised network learning algorithm using concept co-occurrences, frequencies and concept distances. The possibility of creating the network with unsupervised learning is a great benefit compared to semantic networks, where ontology development usually requires a lot of manual labour. We present a case where associations bring benefits over semantics due to their easier implementation and the overall approach. The case focuses on a business intelligence search engine whose query space we modelled using association modelling. We utilised the model in information retrieval and system development.

Keywords. Association network, Association modelling, Human Associative Memory, Query space modelling, Information retrieval
Introduction

Information modelling has been researched extensively in recent years. The aim has been to present the complex set of information related to different domains in a structured manner so that it can be utilised in different applications. The information is refined into knowledge that can be understood by intelligent agents, both human and artificial. A lot of work in this field has been done on ontologies, knowledge bases and semantic networks. Ontologies aim to define concepts at an abstract level, providing semantics to the knowledge located in knowledge bases. Semantic networks model the relationships between concepts; for example, a ’car’ is a ’transport device’ that has ’tyres’. Even though semantic networks are useful, their implementation is very labour intensive as the ontology is usually created manually; an example of this can be found in [1].
This is the biggest drawback of ontologies. There are cases where a simpler model of the domain is enough and implementing an ontology and a semantic network is not a suitable option. Especially when we want to link related concepts together to be used, for example, in a search engine or in a recommendation system, we do not necessarily need to identify their semantics. In these cases, a lighter approach is usually preferred. The term query space refers here to the collection of concepts found in the documents of the given domain. For example, a database of research articles forms a query space whose concepts are the terms found in the documents. By modelling this space we can map the concepts and find links between them. The mappings can then be used when processing users’ queries by finding related terms and expanding the query. For example, if there is a relation between the terms ’car’ and ’tyre’ and a user searches for tyres, the query can be expanded to also include cars, especially if the initial search does not produce any results. In this paper, we present a method for modelling a business intelligence related query space by identifying associations between the concepts. We model the associations using an association network that mimics human associative memory. For some reason, when we think of a concept, e.g., ’car’, our first association may be something completely unrelated in the semantic sense, e.g., ’Australia’. This association has been formed by our experiences; for instance, a long road trip in Australia. In a semantic network, the concept ’car’ would most probably be linked with concepts like ’vehicle’, ’automobile’ or ’tyre’. The idea behind association modelling is not to model the semantics of a domain but the associative relationships of the concepts in a domain. It does not necessarily link semantically similar concepts closely together; in an association network two concepts may have a strong association even if they do not have any semantic relationship. We used our association modelling approach to model the query space of a business intelligence search engine called BI-search. The idea was to tackle two major problems with the search engine: (1) as the searched databases are fairly limited, the query space of each database is also limited. Therefore, users often used terms not found in the query space (the databases) even though related terms were present. To address this problem, we needed to map related terms together and use the mapping in query expansion. We also wanted to (2) facilitate the search process by providing an intuitive and easy-to-use graphical user interface that presents the related terms and offers a possibility to refine and continue the search. We implemented the network using a project database, which contains information about approximately 9 000 on-going and completed projects done at the Technical Research Centre of Finland (VTT). The information includes the project name, start and end year, abstract and keywords. For each project there are two or more keywords that describe the relevant concepts of the given project. We assumed that (1) the keyword list holds the relevant concepts of the project in a concise way, and (2) if two keywords appear together they will form an association. The more often they appear together, the stronger the association. We developed an unsupervised graph learning algorithm to create the association network from the keywords.
The biggest challenge with the algorithm is the way the association weights are learned. We used confidence, which is an important metric in association rule mining [2], as the starting point for calculating the association weight. We calculated the confidence of each keyword pair, i.e., the probability that keyword B is linked to a project when keyword A is, and weighted it with the average distance of
the keywords in the keyword lists and the age of the keyword list (i.e., the project). This mimics human associative memory by giving stronger associations to concepts that are "fresh in the memory" and that often appear with each other. We assessed the created network by manually evaluating the associations. We also compared the utilisation of the network, i.e., query expansion, with information retrieval and other query expansion methods. We concluded that our approach brings several benefits over the compared methods. For example, space consumption was lower than with thesauri and term frequency - inverse document frequency methods. The precision (percentage of relevant results in the result set) was lower after using the method, but recall (percentage of relevant results compared to all relevant results in the query space) was better, as expected. By scoring and ranking the results, the negative effects of the lower precision were diminished and the benefits of higher recall emphasized. This document is organised as follows. We review related work in Section 1. In Section 2 we describe the BI-search engine and give the background for this work. In Section 3 we present association modelling using an association network and its implementation at an abstract level. In Sections 4 and 5 we describe the case study and the method we used to automatically model the query space associations. Section 6 presents the evaluation and its results. We conclude the paper in Section 7.
1. Related Work

An association network represents a conceptual model of a domain by modelling the associations between concepts. It should therefore not be confused with neural networks [3] and associative neural networks [4], which concentrate on, for example, pattern recognition and classification. In this section we survey psychology and neurobiology, information modelling and information retrieval, as they are closely related to the method presented in this paper.

1.1. Psychology and Neurobiology

Associationism, the theory that associations between concepts underlie mental processes, was first presented by Plato. Later, philosophers like David Hume, John Locke and James Mill continued this work [5]. Nowadays, associations are a cornerstone of psychology, where they are studied from the cognition and memory modelling perspective. Search of Associative Memory (SAM) [6] was initially created to model episodic memory. According to SAM, associations are made when two concepts occupy the same memory buffer at the same time. The more often this happens, the stronger the association gets. In other words, frequently co-occurring concepts will have a stronger association. Context is also included in the associations. The longer a concept is present in the given context, the higher the association between the concept and the context. The activation of the associated concept–context or concept–concept pair will be determined by the strength of the association. At the synaptic level, the neurons will have a higher degree of connection if they have a strong association. Hebb, the father of Hebbian theory, which concerns how neurons might connect themselves to become engrams (the way memory traces are stored as biochemical changes in the brain), stated in [7] that when two cells are repeatedly
activated, they tend to become associated, meaning that an activation in one tends to lead to activation in the other. However, these associations will gradually deteriorate if they are not used; newer concepts will have stronger associations than older ones. Our work on association modelling is based on these theories.

1.2. Information Modelling

Ontologies, knowledge bases and semantic networks are the information modelling methods most relevant to association modelling. They are usually used for formally modelling the concepts of a domain and the relationships between those concepts [8]. There are two major characteristics of a semantic network: first, the nodes, which contain the concepts, are usually linked to an ontology or taxonomy. This defines the nodes formally by stating an upper-level concept to which they are mapped. Second, the links between the nodes are labelled and define the type of relationship between the nodes. The types of relationship can be freely defined: is_a, is_part_of, has_synonym, is_needed, and so on. For example, the following could be found in a semantic network: wheel is_part_of car, where ’wheel’ and ’car’ are the nodes and is_part_of is the link between them. The node ’wheel’ may be mapped to the upper-level concept ’steering device’ and ’car’ to ’vehicle’ found in the taxonomy. By linking the nodes to the taxonomy and defining their relationships with each other, a semantic network is created. Ontology engineering is the research field concentrating on the implementation process and methods. There are some methods for automatic implementation of ontologies, for example [9], but usually the development process is done manually due to the complexity of the domain [1]. The difference between a semantic network and an association network is clear. In a semantic network, the network holds more knowledge about the entities, i.e., the nodes, and the relationships between the entities. In an association network there are only the entities and the weights between them. The biggest benefit of an association network compared to a semantic network is that it is easy to implement by training the network unsupervised. It should be noted, however, that combining a semantic network and an association network could produce even greater benefits than using either one alone.

1.3. Information Retrieval

Information retrieval aims to find documents that are relevant to a user’s information need. The user satisfies his or her information need by doing a search, i.e., a query. The problem with the search is usually how the query is formulated. When the query is well formulated the results are also good, but more often than not the query is too short or does not hold all the terms needed to satisfy the user’s information need. In this case, there is a need to reformulate the query by adding new search terms to it. This method is called query expansion, which is a widely researched method for improving the performance of information retrieval. Expanding a query is a difficult but important problem in information retrieval and there are many different approaches to how this reformulation is done. Several methods are relevant to our approach, including:
• Relevance feedback,
• Pseudo-relevance feedback,
• Statistically co-occurring words,
• WordNet,
• Term frequency - inverse document frequency,
• Spreading activation
Relevance feedback [10] is one of the first methods proposed for query expansion. The idea is that the user can select the relevant documents from the result set and do the search again. The query is reformulated by adding terms from the relevant documents to the query. Pseudo-relevance feedback is a method that does not require any input from the user [11,12]. It is based on automatic calculation of document relevance, using the top k most relevant documents from the result set as input to the relevance feedback method. Another approach is to expand the query before the initial search. This can be done, for example, by creating a list of terms that map terms together. For instance, if a term A is present in the query, the list could state that terms B and C should also be added to the query. One way of storing the term–term mappings is a thesaurus. There are different types of thesauri, but usually a thesaurus is defined as a set of mappings from terms to other related terms [13]. The classical way is to use semantic relation mappings such as synonym, hyponym and antonym. A good example of this is WordNet [14]. A thesaurus can be built using different methods, the most notable being a manually built thesaurus, a co-occurrence-based thesaurus and a thesaurus based on linguistic relations. Building a thesaurus from linguistic relations is based on the idea that terms that appear in similar contexts, e.g., have similar verbs near them, are similar [15]. The co-occurrence-based approach is fairly similar to ours. The method is based on the assumption that terms that often appear together in the same document are similar in some way. Hearst [16] proposed a method that divides the document into pseudo-sentences of n terms and calculates the similarity between the terms by checking how often the terms appear together in the pseudo-sentences. We have taken this approach further, as described later in this paper. Term frequency - inverse document frequency (tf-idf) [17] is the classic method used in information retrieval. The method weights the terms in each document by calculating how frequent the term is in the document and in the collection of documents. The term’s weight is larger if the term is frequent in one document and infrequent in the collection of documents, i.e., appears in only a few documents. This method is used by search engines to rank the documents with respect to the user’s search string. Even though tf-idf is mostly used for ranking documents, it can also be used to tackle the problem of query expansion. One approach is to use it for finding documents that are related to the original search string by comparing the documents’ term vectors; if two documents have similar term vectors they contain similar information even if they do not use the same terms. For example, the term ’road’ may appear when talking about ’cars’ and ’trucks’, making their document vectors similar. We can then deduce that ’cars’ and ’trucks’ have a connection between them. Spreading activation [18] is a method developed for searching a semantic or neural network. In the network, the nodes or the edges need to be weighted, as the activation is spread between the most strongly weighted nodes. The activation continues until the activation value falls below a given threshold. There is also a decay factor that lowers the activation value after each jump. Even though developed for a different purpose, this has also been used in information retrieval, where the nodes represent the documents and their
Figure 1. Search page.
terms [18]. We utilised this approach in our method to stop the expansion once the activation had spread far enough.
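As an illustration of the tf-idf weighting discussed above, the following minimal Python sketch computes the classic tf × idf weight for every term of a small document collection. It is not part of the original BI-search implementation; the toy documents and tokenisation are invented for the example.

import math
from collections import Counter

def tf_idf(documents):
    """Return a list of {term: tf-idf weight} dictionaries, one per document."""
    n_docs = len(documents)
    # document frequency: in how many documents each term appears
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Toy example: terms frequent in one document but rare in the collection get high weights.
docs = [["car", "road", "tyre", "car"],
        ["truck", "road", "cargo"],
        ["ontology", "semantics", "knowledge"]]
print(tf_idf(docs)[0])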
2. Background

We decided to use the business intelligence search (BI-search) engine as the test case for the association network. The BI-search application was implemented in collaboration with the Technical Research Centre of Finland (VTT) and Fujitsu Laboratories Japan. BI-search is an application that queries internal and external databases, integrates the information and presents the results to the user. The users of the system are researchers who want to do a quick business intelligence check related to a project idea or proposal they have. The idea behind the system is to provide a comprehensive and intuitive report of patents, projects, persons and companies that are relevant to the new project and its proposal. The search page is presented in Figure 1. The data sources we have integrated into the system include: (1) a project database called Research Register that contains approximately 9 000 on-going and completed projects done within VTT, (2) a personnel database called SkillBase that holds information about the employees and their skills, (3) a patent database called Patent Register, and (4) the Yahoo! search engine. Research Register is used for finding completed and on-going projects to support the project or the project proposal writing process. SkillBase, which holds a large collection of skills relevant to VTT, is organised into a taxonomy to form a hierarchy. These skills include java programming, which is a sub-skill of programming; data mining, a sub-skill of Technologies and methods; and customer relationship management, a sub-skill of Competence areas. Each employee has rated their skill level in each of the skills listed in SkillBase. In BI-search, SkillBase is used for finding out whether there are persons who can do the tasks required in the project.
Figure 2. The front page of the report view and an example of the term - company relationship graph presented to the user.
Patent Register is used for getting relevant patent information and finding which companies have relevant patents in this field. The Yahoo! search engine is used for finding companies related to the search terms. The system works as follows: the user inputs a set of search terms that are relevant to the new project. The different terms are separated and forwarded to the search engine. The set of search terms forms the query set Q. The search engine queries the different data sources using the query set Q. It should be noted that standard pre-processing of the terms is done before the queries. This includes lower-casing the terms and transforming them to singular form. The results are processed and analysed using different methods and heuristics to create an informative and intuitive report for the user. The results from Yahoo are processed using a text mining pipeline that extracts company names and locations from the results. The documents not containing any company names are discarded. The result analysis process includes scoring of the results. The results are shown in the report page, which holds the information found in the databases. The information is presented in descending order, starting with the highest ranking score. Some of the information is also presented in different types of graphs. An example report view is shown in Figure 2. More information about the implementation of the association network can be found in Sections 3 and 4. Query expansion and result scoring are described in Section 5.1.
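As a small illustration of the query handling described above, the sketch below (Python) lower-cases the search terms and reduces them to a naive singular form before they are sent to the data sources. The singularisation rule is an assumption made for the example; the paper only states that the terms are lower-cased and transformed to singular form.

def preprocess_query(search_terms):
    """Build the query set Q: lower-case each term and strip a trailing plural 's'."""
    query_set = set()
    for term in search_terms:
        term = term.strip().lower()
        # naive singularisation; a real system would use proper morphological analysis
        if term.endswith("s") and not term.endswith("ss"):
            term = term[:-1]
        query_set.add(term)
    return query_set

print(preprocess_query(["Knowledge Bases", "Patents", "Data Mining"]))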
3. Association Network

The idea behind the association network is to mimic the way human associative memory works. The method is based on the theory that when two concepts often appear with each
Figure 3. An example association network.
other, they tend to get a stronger association between them [6]. However, the associations are probabilistic in nature; we do not always follow the same association path but the paths vary. For example, we may usually associate the concept ’car’ with ’driving’, but we may also think of hundreds of other concepts, like ’road’, ’wheel’ and ’pavement’, among other things. We model the associations using a network. The nodes in the network represent the concepts, which can be words, terms or phrases like ’car’, ’arctic regions’ and ’road trip across Australia’. The nodes are linked together with directed edges that represent association and are weighted with the strength of the association. Figure 3 represents a small example of an association network. The network and its notation are nothing new in computer science; Bayesian networks look similar as they consist of nodes that model concepts, and edges that model the probabilities between the concepts. Therefore the contribution of the association network is more abstract than concrete: the idea of modelling the associations instead of semantics or probabilities. Associations between concepts are formed when we experience something [7,6]. The experiences usually consist of several unrelated concepts that we then associate with each other. For example, a road trip across Australia may form associations between concepts like ’Australia’, ’driving’, ’car’, and ’kangaroos’. The stronger the experience is, the stronger the association. In the human brain, stronger associations have more neural pathways between them [7]; in an association network we use a decimal value to indicate how strong the association is. The experiences can be just about anything, including actual events from everyday life, textual documents, signals and images. In our work we have concentrated on textual information found in databases. From a machine learning perspective, it is usually difficult, if not impossible, to identify how strong an "experience" is. Therefore we have based our association weighting method on a concept used in association rule mining: confidence. Confidence is the probability that concept B appears when concept A appears. For instance, when talking about cars, we might talk about tyres 25% of the time, making the confidence between cars and tyres 0.25. This is not symmetric, i.e., the confidence will be different when talking about tyres; cars may be talked about 50% of the time, making the confidence of tyres and cars 0.5. Using only the confidence is not enough, as we usually make a stronger association between the concepts that were experienced closely together. If we used only the confidence to indicate the association weight, all of the concepts from the same experience would have the same weight. In addition, the association tends to be stronger with newer experiences and gradually deteriorates as time passes.
Algorithm 1. Representation of an abstract level implementation of association network.
for Each concept c in experience E do
    Create node n
    n ← c
    for Each concept ce in E \ c do
        Create node m
        m ← ce
        Create edge e
        Calculate weight w(c, ce)
        we ← w(c, ce)
    end for
end for

Therefore we include two additional parameters to weight the association value: distance, which indicates how closely together the concepts were experienced, and time, which indicates the age of the concept pairing. Distance is an attribute that can vary depending on the data source. In unstructured text, distance can be measured as the number of words, noun phrases, sentences or even paragraphs between the concepts. In time series data, the distance can be temporal. In some cases, it may be possible to use Euclidean distance. When the age of the experience can be deduced or extracted from the data, it can be used to simulate the natural deterioration of neural pathways. In Section 4 we give a more detailed description of the association weight calculation.

Algorithm 1 presents an abstract-level algorithm of the association network implementation. Eq. (1) presents a simple approach for calculating the association weight that takes the distance and confidence into consideration. In Eq. (1) c denotes the concept, ce the concept it will have an association with, s the confidence, which is usually calculated with Eq. (2), and d the distance between c and ce. In Eq. (2) freq(c) is the frequency of concept c (how many times c has appeared), and freq(ce|c) is the frequency of concept ce's co-appearances with the concept c.

w(c, ce) = s(c, ce) × d(c, ce)    (1)

s(c, ce) = freq(ce|c) / freq(c)    (2)
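As an illustration of Algorithm 1 and Eqs. (1)–(2), the following Python sketch builds the abstract network from a list of experiences. It is only a sketch under stated assumptions: each experience is given as a list of concepts and the distance is the positional distance within that list. Note that Eq. (1) multiplies the confidence by the raw distance as written, whereas the refined scheme of Section 4 divides by a log-distance instead.

from collections import defaultdict

def build_association_network(experiences):
    """experiences: list of concept lists, e.g. one keyword list per document."""
    freq = defaultdict(int)            # freq(c): how many experiences contain c
    cofreq = defaultdict(int)          # freq(ce|c): co-appearances of the ordered pair
    dist_sum = defaultdict(float)      # summed positional distance per pair
    for concepts in experiences:
        for i, c in enumerate(concepts):
            freq[c] += 1
            for j, ce in enumerate(concepts):
                if ce == c:
                    continue
                cofreq[(c, ce)] += 1
                dist_sum[(c, ce)] += abs(i - j)
    network = {}
    for (c, ce), n in cofreq.items():
        s = n / freq[c]                # Eq. (2): confidence
        d = dist_sum[(c, ce)] / n      # average distance of the pair
        network[(c, ce)] = s * d       # Eq. (1): weight = confidence x distance
    return network

net = build_association_network([["car", "road", "australia"],
                                 ["car", "tyre"]])
print(net[("car", "road")])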
The association network has some similarities with a semantic network. Both have nodes and links but the idea behind association network is to remove the elements that require a lot of manual work. Therefore there is no ontology or taxonomy that would give semantics to the nodes. Also, the links between the nodes are a bit simpler as the labels are replaced by weights. These modifications are made so that the network would be lighter and it can be automatically implemented. We did not see any reason to add the semantics to the network but in case the semantics are needed (like ’car’ is a ’vehicle’), new information can be added to the network. Also, the relations do not have to be labelled as we only need the information about the weight of the relationship, i.e., how strong the association is. However, it may be useful to include the type of the relationship in the future as it may hold interesting information. The more information the network contains, the more usable it becomes but also more work is required in implementation. In our opinion, if seman-
tic and association networks were combined, the resulting network would provide the most benefit. When possible, it may be a good idea to add the associations to the semantic network, as extracting them is fast compared to the arduous task of modelling the semantics.
4. Query Space Model

When we started implementing the BI-search engine, we faced several challenges. The first and biggest challenge was mapping the search terms to the terms found in the databases. This was a major challenge due to the limitations of SkillBase. SkillBase consists of approximately 100 concepts; it was likely that a search term did not match any of the SkillBase concepts. For example, a search like ’knowledge base’, even though related to ’ontology’, which is found in SkillBase, did not produce any results. Another challenge was the usability of the system. When the results are not good, i.e., when some of the data sources produce no or incomplete results, the users wanted to update their search. For example, a search might produce no results from SkillBase even though the user knows there are people with expertise in ’knowledge bases’. The problem was that, as the feasible search term did not produce results, it was difficult to guess the related term that would generate the desired outcome. We addressed these issues by modelling the query space using the association network described in Section 3. A query space S is a collection of terms and concepts t that have some relevance to the domain in question. In the case of documents, the query space consists of the terms found in the documents. An association network G(V, E) holds a set V of nodes (or vertices) and a set E of directed edges. Each node n ∈ V represents a term t ∈ S. If terms tn and tm are experienced together (for example, found in the same document), the corresponding nodes n and m are linked with directed edges (n, m) and (m, n) in G. Each edge e ∈ E has a weight we (the strength of the association). We chose this approach as the association network can link related terms in the query space with very little effort. We base the work on the assumption that if a concept A often appears with a concept B, there is a good chance that concept B will be interesting from the user’s point of view. Even though the relationship between the terms is not defined in the network, terms that have a high association will be relevant in most cases. We used VTT’s Research Register when creating the network as it holds the key concepts of the query space. Each project in the Research Register holds several attributes, the most relevant being the title, abstract, start and end years, and keywords. For implementing the network, we used only the keywords of each project as they hold the key concepts in a concise way. Compared to abstracts, the biggest benefit of keywords is that they usually hold the same information but extracting them is notably easier. As described previously, we have based the implementation of the association network on two assumptions: (1) if two concepts appear often in the same context, their association is stronger, and (2) if two concepts often appear closely together, i.e., their average distance is small, the association between them is even stronger. We also adjust the weight with gradual deterioration. Algorithm 2 presents the association network creation. The first step when implementing the association network is to pre-process the input data; in this case the keywords.
Algorithm 2. Representation of the association network implementation algorithm that is used to create the query space model.
for Project p, collect keywords K do
    for Each keyword kn in K do
        Create node n
        n ← kn
        for Each keyword km in K \ kn do
            Create node m
            m ← km
            Create edge e
            Calculate weight w(kn, km)
            we ← w(kn, km)
        end for
    end for
end for

As the keywords were comma separated, keyword extraction was a trivial task. After the keywords of each project are extracted, each keyword pair (kn, km) linked to a project p is used to create the network. If a node for keyword kn or km does not exist, it will be created. The edge e between the nodes kn and km is created and its weight calculated.

Calculating the weight between the nodes is the most crucial part of the algorithm as it indicates the strength of the association between two concepts. In order to mimic associative memory, we base the weights on co-occurrences and frequencies. Our assumption is that when two concepts, i.e., keywords, occur together, an association between them will be formed. If the occurrence of the pair is rare, the association is weak. On the other hand, if they occur together often, they will have a strong association. We used this idea when we developed the calculation scheme for the association network. We started out by calculating the frequencies of each keyword pair (kn, km). The frequencies were then used to calculate the confidence S(kn, km) as described in Eq. (2). For example, when keyword A appears 10 times, and of those 10 times keyword B co-appears 7 times, the confidence S(kA, kB) = 0.7. This indicates that the association between kA and kB is 0.7. It should be noted that the edge between kn and km is directed (from kn to km). The weight of the association from km to kn is calculated separately. The intuition behind this is that when we think of the term ’tyre’ we may think of ’car’ 70% of the time, but when we think of ’car’ we may think of ’tyre’ only 10% of the time. If we use only the confidence for the association weight we will lose an important element. Consider a case where you have to memorise a list of words. When memorising, the words that appear next to each other will get a higher association when recollecting the words. As the keyword lists often consist of several keywords, we utilise this by taking the distance between the keywords into consideration; if two concepts appear close to each other in the keyword list, they will get a stronger association. It is clear that in the keyword lists some of the keywords appear next to each other by chance. But it is highly unlikely that they would appear together often enough to merit a high association value. In other words, if two terms often appear closely together, they will get a higher association weight; otherwise the weight will be lower.
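For illustration, the following Python sketch collects the directed pair statistics described above (pair frequency, confidence and average keyword distance) from per-project keyword lists; these are the inputs to the weight calculation of Eqs. (4)–(10) presented next. It is a hedged reconstruction of the described procedure, not the authors' code; the input format (one keyword list per project) follows the description of the Research Register data.

from collections import defaultdict
from itertools import permutations

def collect_pair_statistics(keyword_lists):
    """Return {(kn, km): (pair frequency, confidence, average distance)} for directed pairs."""
    kw_freq = defaultdict(int)
    pair_freq = defaultdict(int)
    pair_dist = defaultdict(float)
    for keywords in keyword_lists:
        positions = {k: i for i, k in enumerate(keywords)}
        for k in keywords:
            kw_freq[k] += 1
        for kn, km in permutations(keywords, 2):
            pair_freq[(kn, km)] += 1
            pair_dist[(kn, km)] += abs(positions[kn] - positions[km])
    stats = {}
    for pair, n in pair_freq.items():
        confidence = n / kw_freq[pair[0]]      # S(kn, km), Eq. (2)
        avg_distance = pair_dist[pair] / n     # average distance in the keyword lists
        stats[pair] = (n, confidence, avg_distance)
    return stats

projects = [["data mining", "ontology", "knowledge base"],
            ["ontology", "reasoning"]]
print(collect_pair_statistics(projects)[("ontology", "reasoning")])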
Table 1. Effect of confidence and distance to the association weight.

Confidence \ Distance      1       2       3       5       7       9
1.0                        1.0     1.0     1.0     1.0     1.0     1.0
0.8                        1.0     1.0     1.0     1.0     0.95    0.89
0.6                        1.0     1.0     1.0     0.86    0.71    0.63
0.5                        1.0     1.0     1.0     0.72    0.59    0.52
0.3                        1.0     0.997   0.63    0.43    0.35    0.31
0.2                        1.0     0.66    0.42    0.29    0.24    0.21
0.1                        1.0     0.33    0.21    0.14    0.11    0.10
0.05                       1.0     0.17    0.10    0.07    0.06    0.05
We added this distance factor to the calculations by taking the average distance of two keywords and calculating the logarithm of the distance. The distance d between two terms is simply

d = n − m    (3)

where n is the order number of the nth keyword (kn) and m is the order number of the mth keyword (km). If the average distance is 1 (the terms always appear next to each other), making log(1) = 0, we defined this factor to be 0.01. If the distance was more than 10, we defined the factor as 1.0. This way we will get factor values that vary between 0.01 and 1.0. Eq. (4) shows how we used the distance when calculating the weight.

w(kn, km) = S(kn, km) / log10(d(kn, km))    (4)
Table 1 presents how the weights range depending on the distance and confidence of the keyword pair (kn, km). The distance makes a big difference only when it is small. When the distance is near 2, the weight is approximately three times the confidence. The average distance between the keywords we used for creating the network was 3.7, making the average impact on the weight 176%. As there are keywords that appear only once, these keywords will have too much weight when compared with other keywords, especially with their neighbouring keywords. Therefore, we made a small adjustment to the distance calculation. This adjustment a, which can be seen in Eq. (5), gives more weight to the keywords that appear often.

a(kn, km) = 1 / freq(km|kn)    (5)

The distance is now calculated as:

d(kn, km) = n − m + a(kn, km)    (6)
When a keyword appears only once, its distance will be ’penalised’ 100%, but when it appears ten times, the penalty is at most 10% of the original distance. Eq. (7) presents the way we calculate the weight for each keyword pair (kn, km) after the adjustment a.

w(kn, km) = S(kn, km) / log10(n − m + a(kn, km))    (7)
It is possible that the weight is above 1, especially if the term appears only once. In this case, we normalise the value to be 1 or smaller. This is done with Eq. (8), where max w(kn, N) refers to the maximum weight in the node kn’s neighbourhood N.

w(kn, km) = w(kn, km) / max w(kn, N)    (8)
For example, if the weight w(kn, km) is 1.20, and there is a keyword kp in kn’s neighbourhood N to which the weight is 1.40 (making max w(kn, N) = 1.40), the weight w(kn, km) will be normalised to 0.86. Finally, we included the gradual deterioration of the associations in the weighting schema. The motivation for this is the fact that when there are two associations with otherwise similar attributes (distance, co-occurrence frequency), the newer one should have a greater probability to activate. Especially in our case, we feel that the younger associations are more interesting to the users: for example, a research project conducted in the 1970’s is far less interesting than a research project done last year. As the Research Register holds the start and end years of the projects, we were able to extract and use this information. Eq. (9) presents how we calculate the gradual deterioration function gd.

gd(kn, km) = 1 − ln(kage) / α    (9)
We used α = 30 in the calculations to make the values fall between 1.15 and 0.85. The value kage denotes the average age of the keyword pairing, calculated by taking the current year minus the average of the end years of the projects where kn and km occur together. If the average age is 0 or below (the concept pairing is new), we assign kage = 0.01. The final adjustment of the weight is done by multiplying it with the gradual deterioration, as shown in Eq. (10). The effect of the gd adjustment is small but noticeable. If the concept pairing’s average age is less than one year, the weight will increase slightly. If the age is five years, the weight will decrease by approximately 5.5%. By changing α we can give more emphasis to the age factor and make these changes more significant. For example, if α = 10, five-year-old pairings would get a 16% lower association weight and new pairings would get a 46% higher weight.

w(kn, km) = w(kn, km) × gd(kn, km)    (10)
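To make the chain of Eqs. (4)–(10) concrete, the following Python sketch computes one directed weight from the pair statistics described above. It is an illustrative reconstruction, not the authors' code; the input values (confidence, average distance, pair frequency, average age and the neighbourhood maximum used for normalisation) are assumed to have been collected from the keyword lists beforehand.

import math

def association_weight(confidence, avg_distance, pair_freq, avg_age_years,
                       max_neighbourhood_weight=1.0, alpha=30.0):
    """Directed association weight w(kn, km) following Eqs. (4)-(10)."""
    a = 1.0 / pair_freq                          # Eq. (5): adjustment for rare pairs
    d = avg_distance + a                         # Eq. (6): adjusted distance
    log_d = math.log10(d)
    log_d = min(max(log_d, 0.01), 1.0)           # factor clamped to [0.01, 1.0] as in the text
    w = confidence / log_d                       # Eqs. (4)/(7)
    w = w / max(max_neighbourhood_weight, w)     # Eq. (8): normalise so the weight stays <= 1
    age = max(avg_age_years, 0.01)               # new pairings are assigned age 0.01
    gd = 1.0 - math.log(age) / alpha             # Eq. (9): gradual deterioration
    return w * gd                                # Eq. (10)

# Confidence 0.7, average distance 2, pair seen 7 times, projects ended about 3 years ago.
print(round(association_weight(0.7, 2.0, 7, 3.0), 3))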
The result of this process was a network that contains approximately 14 000 nodes and 300 000 edges. It should be noted that there are always two edges between two nodes; from A to B and from B to A.
5. Utilisation of the Associations

We implemented the association network to tackle the following three problems: (1) facilitate search and query expansion, (2) integrate data sources, and (3) improve the user interface and usability of the system.
5.1. Query Expansion

Before the association network was included in the search engine, the biggest problem with BI-search was null results. It was too common that search terms that should have produced results returned nothing. The feedback received from the users indicated that this was a clear problem. The problem was due to the limitations of the queried data sources; it could have been tackled manually, but mapping hundreds of related search terms and database concepts together seemed too big a task. We based our query expansion algorithm on spreading activation [18]. Algorithm 3 presents the pseudo code of the query expansion; for each query term q, the algorithm finds the corresponding node n in the network. The query is expanded by extracting the neighbours of the node n into the set N. The top k neighbours, i.e., the nodes with the highest association weights w, are added to the expansion set E. Next, each of the nodes ne located in E is expanded by extracting its neighbours. The association weight between the original query node n and the expanded node ne2 (which is the neighbour of a neighbour) is calculated by multiplying the weights along the path from n to ne2, as shown in Eq. (11).

w(n, nej) = w(n, ne1) × w(ne1, ne2) × … × w(nej−1, nej)    (11)

In Eq. (11), nej indicates that the link distance between node n and nej is j; for example, ne2 is directly linked to ne1, which is directly linked to n. The node nej is added to E if it has a greater association weight than the smallest weight in E, i.e., w(n, nej) > min wE, or if E does not yet hold k nodes, i.e., |E| < k. After the expansion finishes, the nodes in set E are added to the query and the different databases are searched with this new set of query terms. The results of the search are analysed and the report is printed out for the user.

We use different types of heuristics to score and order the results. The scoring will usually rank the results from the expanded terms lower than the ones found using the user’s original search terms; however, if a result contains both expanded and original terms, its score will be high. We score the results using the following method: first, a result is scored by checking the query term that produced the result. If the query term is found only in the set E (i.e., E \ Q, where Q is the original query set), the score is calculated by multiplying the score with the term’s association weight. For example, if we have expanded the query ’car’ with the term ’road’ (w = 0.7), the results that were received with the query ’car’ will receive the weight 1 and those for ’road’ the weight 0.7. If a result holds both, its score will be 1.7. We also check other information about the result, such as the spatial location of the returned company, patent or person, and how old the document is. These affect the ranking only a little. As with all query expansion methods, it is evident that using query expansion will lower the overall precision of the results but the recall will be much higher. By scoring the results and weighting the score with the association value we ensure that the lower precision will not irritate the users. However, the higher recall will be noticed when it is needed, i.e., when the query would not otherwise produce any results.

5.2. Associative Search

The null results also produced another problem for the users. As the users’ search terms were feasible, users commented that they did not know how to modify their search to produce the results they wanted.
And even if the results were good, we wanted to provide an intuitive search option to continue and expand the search manually in case more information is needed. To address these issues we included an intuitive search option in the user interface called Associative Search. The idea behind the search is that the user can see the terms that have some association with the original search terms and use them to manually form the next query. We also included the SkillBase taxonomy in this search. We had to limit the expansion set to the top k nodes, as the precision of the search would otherwise be too low. When we present the nodes to the user we can set the limit higher. Therefore, when expanding the search with the top k nodes, as described in 5.1, we also get additional top j nodes that are presented to the user but not included in the query expansion. These k + j nodes are presented to the user in the user interface. Figure 4 presents the user interface of the Associative Search, which provides the user with the possibility of manually expanding the search by selecting new search terms from the list of concepts. The list also includes the association weight (the relevance weight from the user’s point of view) and the original search term to which it was mapped. The concepts can be added to form a new search by clicking them on the list.

Algorithm 3. Algorithm for query expansion using the association network.

for Each query term q in Q do
    Find corresponding node n = q
    N ← n’s neighbours
    Order N by association weight w
    E ← N’s top k nodes
    for Each node ne in E do
        Extract ne’s neighbours Nej
        for Each node nej in Nej do
            Calculate weight w between n and nej
            if w > min wE then
                E ← nej
            end if
        end for
    end for
end for
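The sketch below, in Python, illustrates the expansion procedure of Algorithm 3 on a network of the kind built earlier. It is a simplified reconstruction for illustration only; the network format (a dict mapping a node to its weighted neighbours) and the two-hop limit are assumptions of the example, not details taken from the BI-search implementation.

def expand_query(query_terms, network, k=3):
    """network: {node: {neighbour: weight}}; returns {expansion term: weight to a query node}."""
    expansion = {}
    for q in query_terms:
        neighbours = network.get(q, {})
        # top k directly associated nodes
        top = sorted(neighbours.items(), key=lambda kv: kv[1], reverse=True)[:k]
        for ne, w in top:
            expansion[ne] = max(expansion.get(ne, 0.0), w)
            # neighbours of neighbours: multiply the weights along the path (Eq. (11))
            for ne2, w2 in network.get(ne, {}).items():
                if ne2 in query_terms:
                    continue
                path_w = w * w2
                threshold = min(expansion.values()) if len(expansion) >= k else 0.0
                if path_w > threshold or len(expansion) < k:
                    expansion[ne2] = max(expansion.get(ne2, 0.0), path_w)
    return expansion

net = {"car": {"road": 0.7, "tyre": 0.5},
       "road": {"asphalt": 0.9, "car": 0.4},
       "tyre": {"car": 0.8}}
print(expand_query(["car"], net, k=2))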
6. Experiments

6.1. Evaluation Setup

Evaluating the network is a difficult task, as it is hard to assess whether the association weight between two concepts is feasible. It may not even be sensible to assess the associations as such, since they are, in fact, associations. Nonetheless, we conducted a small-scale evaluation of the network by manually checking approximately 300 of the associations and their weights, concentrating mostly on the top-weighted associations. This sample contains approximately 1% of the top-weighted associations; we considered that a sample of this
Figure 4. Associative search, located on the left, can be used to manually expand the query with the related terms found from the association network.
size gives a good indication of the feasibility of the results. When assessing the results, it was often difficult to know whether a result was good, as can be seen from Table 2. We evaluated the query expansion by comparing the space consumption and the results of our approach against other query expansion methods. The associative search was evaluated by collecting feedback from the users.

6.2. Results

This section describes the evaluation results of the association network, information retrieval and associative search.

6.2.1. Association Network

Approximately 9% of the associations were weighted 1 and approximately 1% of the associations were weighted 0.9 < w < 1. Approximately 26% of the associations were weighted over 0.5 and 33% below 0.1. Table 2 presents example associations and their weights. Figure 5 shows an example association from Table 2. From Table 2 we can see that most of the associations that have a weight over 0.9 are feasible. Some of them, such as satellite picture - satellite image, are synonyms. Several of them have a strong association in a real-life setting, such as GPRS - UMTS and road - asphalt. The table also shows the effect of the age factor. In most cases age lowers
Figure 5. An example association, where concept (from) is GPRS, concept (to) is UMTS and weight is 1. In other words, association from GPRS to UMTS is weighted 1.
the weight, but in some cases the weight is increased. In our opinion, the impact of the weight is feasible, as newer associations are usually more relevant from the user’s standpoint. We evaluated 300 randomly selected associations, 200 of them having weight 1.0; 50 of them weighted 0.3 < w < 0.7; and 50 of them below 0.1. The evaluation was difficult as there are several concepts that are unclear to us. In addition, assessing the associations may not be feasible. Therefore, when we classified an association as a "negative hit", we consider that the mapping would produce negative search results with a high probability. If we consider that the weight should be higher, we indicate it in the "higher" column of Table 3. It should be noted that when we evaluated the network, we discarded the misspelled concepts that were present in the training data. Table 3 presents the results of the evaluation. As can be seen from the table, we considered most of the associations with weight 1 as correct. The associations with weight between 0.3 and 0.7 were mostly correct, but there were a great number of associations that we considered too lightly weighted. However, this is a gray area as the association is quite strong. With an association weight below 0.1, approximately half of the association weights were too low. However, it is important to note that when assessing the associations we did not have all the information available. For instance, the same concept may have a stronger association with other concepts, making the lower weight sensible, as there should not be several strong associations for one concept. This is especially true in cases where a concept has dozens of associations. Therefore, even though the evaluation may seem to produce poor results when association weights are small, we think that these results can be justified in most cases.

6.2.2. Query Expansion

We compared query expansion with the association network against other information retrieval and query expansion methods. We used projects found in the Research Register and the data from SkillBase to compare the methods. Term frequency - inverse document frequency produced quite good results, but there were two major problems. First, as we used several different data sources, we could not tackle the problem of mapping the search terms to related terms found in SkillBase with tf-idf. For finding similar projects from the Research Register, tf-idf produced good results. However, it produced more irrelevant results, i.e., projects that are not relevant, than our query expansion method. This was due to the way tf-idf works: it takes all of the keywords and creates a vector that is compared to the original search string. Second, the space consumption of the method was substantial. We used an n × d matrix, where n is the number of terms and d is the number of documents, making the size of the matrix 126 000 000 entries. The thesaurus approach takes even more space than tf-idf, as it requires an n × n matrix to store the weights between the terms. We can weight the terms the same way we weight the edges in the association network, making the space consumption the only difference
Table 2. Example set of associations. The last column indicates the weight after the age has been factored in.

Concept (from)                    Concept (to)            Weight    Weight with age factor
satellite picture                 satellite image         1.0       1.0
building information modelling    safety                  1.0       1.0
waste combustion                  biomass                 1.0       0.977
ontology                          reasoning               1.0       0.967
regional construction             energy distribution     1.0       0.96
oulu                              energy conservation     1.0       0.96
rfid tag                          barcode                 1.0       0.96
pulping process                   pulping industry        1.0       0.95
gprs                              umts                    1.0       0.935
road                              asphalt                 1.0       0.92
competitor survey                 SME                     1.0       0.91
sun                               isotropy                1.0       0.91
rime                              ice formation           1.0       0.90
respirator                        occupational safety     1.0       0.896
screwdriver                       hand saw                1.0       0.895
polymer                           plastic                 1.0       0.85
apms                              paper machine           0.96      0.889
iron                              steel                   0.95      0.81
organic contaminant               enzyme                  0.93      0.998
sea level                         climatic change         0.93      0.877
felling                           pulpwood                0.90      0.845
mobile telephone                  local area network      0.90      0.85
lightweight concrete              stiffness               0.90      0.81
aerial photography                aerial survey           0.83      0.76
online measurement technology     high pressure           0.71      0.65
atmosphere                        scanning                0.63      0.58
testing methods                   failure                 0.55      0.5
process simulation                processes               0.52      0.49
rye                               wheat                   0.42      0.45
energy conservation               fuel consumption        0.22      0.21
food processing                   electric device         0.09      0.07
enzyme                            health care             0.013     0.015
between the methods. With this approach, the space consumption is 196 000 000 entries. The association network requires space only for each node and for the edges between the nodes. In our network, there are 13 712 nodes and 291 536 edges between the nodes, making the space requirement for the network approximately 300 000 entries. Pseudo-relevance feedback requires only the space for the results, making its additional space consumption 0. We tested the pseudo-relevance feedback method by extracting the keywords from the result set and expanding the search using these keywords. This lowered the precision drastically, as on average 4 new search terms were added per project in the result set. As there were approximately 28 projects in the result set, the number of new search terms was approximately 110. We did not weight or prune the set of new search
Table 3. Results of the network evaluation. Positives were considered as correctly weighted, negatives incorrectly weighted. Higher and lower indicates whether the negatives should be valued higher or lower.

Weight       Positives    Negatives    Higher    Lower
1.0          92%          8%           0%        100%
0.3 - 0.7    60%          40%          85%       15%
< 0.1        45%          55%          100%      0%
terms. It would be possible to use a variation of our association weighting method here; however, this would increase the running time of the algorithm, as the weights would need to be calculated separately on each run. As expected, the precision of the results is lower when using query expansion. This is due to the new search terms that are added to the search. On the other hand, recall is much higher for the same reason. Finding the balance between precision and recall is difficult, but as described in Section 5.1, we have avoided this problem with the result weighting schema in BI-search.

6.2.3. Associative Search

During the final stages of development we conducted user tests on the system and collected feedback about the search engine and the associative search. The test setup was simple: the user does a search, after which he or she checks the results and is asked to look at the associative search panel on the screen. If there are interesting terms present, a new search is made. The system received favourable comments, especially about the usability. First, it was easy to continue the search after the initial results, as the related concepts were present. A couple of users commented that a new search using the related terms produced new ideas for the project by pointing towards a possible domain for test cases and towards persons with similar completed projects within the company, even though the original search did not produce such results.
7. Conclusions
In this paper we presented an unsupervised method for implementing association networks. We used the method for modelling a query space and utilised the network in query expansion and in enhancing the usability of the BI-search system by presenting the relevant associative concepts to the user. We used keywords rather than free text as they contain approximately the same information in a more concise way, making it easier to extract the concepts of the domain. The network itself is a useful and intuitive tool to present the associations between the concepts. When compared, for example, to matrices, the network requires much less space and is more intuitive and efficient to use. The results proved this approach useful for our needs. Even though precision was lower, as was expected, recall was high. The network was able to make two improvements to the search: (1) to provide results when null results would otherwise occur, and (2) to provide additional results that could interest the user. The user interface and usability of the system were also successfully improved, as the user feedback indicated.
In the future we will experiment with the association network in other domains such as content-based recommendation systems. An interesting challenge is extracting the concepts from free text, such as abstracts. A future improvement to query expansion may be to find the strongest paths between the query terms and to expand the search using the concepts on each path. This may be efficient as it concentrates on several query terms at the same time instead of just one.
Architecture-Driven Modelling Methodologies
Hannu JAAKKOLA a,1 and Bernhard THALHEIM b,2
a Tampere University of Technology, P.O.Box 300, FI-28101 Pori, Finland
b Christian-Albrechts-University Kiel, Computer Science Institute, 24098 Kiel, Germany
Abstract. Classical software development methodologies take architectural issues for granted or as pre-determined. They thus neglect the impact that architectural decisions have within the development process. This omission is acceptable as long as we are considering monolithic systems. It cannot, however, be maintained once we move to distributed systems. Web information systems pay far more attention to user support and thus require sophisticated layout and playout systems. These systems go beyond what has been known for presentation systems. We thus discover that architecture plays a major role during systems analysis, design and development. We thus target building a framework that is based on early architectural decisions or on the integration of new solutions into existing architectures. We aim at the development of novel approaches to web information systems development that allow a co-evolution of architectures and software systems.
Keywords. architecture-driven development, software development, web, information systems, modelling.
1. Introduction
Typical components of modern information systems are large databases, which are utilized through internet connections. The applications, Web Information Systems (WIS), are usually large, and their structure is complex, covering different types of assets from reusable architectures to COTS components and tailored software elements. The complexity of information systems is also increased by the growing demand for interoperability. Barry Boehm, in his conference paper [1], uses the term "complex systems of systems" in this context. His message is that modern information systems are layered and complex structures based on interoperability between individual systems, products and services. There is no commonly agreed definition for the notion of a software architecture (compare the list of more than a hundred definitions collected from contributors at http://www.sei.cmu.edu/architecture/start/community.cfm). Some of the notions we found in the literature are too broad, some others are too narrow (compare http://www.sei.cmu.edu/architecture/start/moderndefs.cfm).
1 Corresponding Author: hannu.jaakkola@tut.fi, http://www.pori.tut.fi/~hj
2 [email protected], http://www.is.informatik.uni-kiel.de/~thalheim
Boehm [2] approaches the topic by analyzing the trends that are worth knowing when adapting software engineering practices and methods to current needs. One of his findings points out the importance of architectures. Architectures are a means to communicate about software, to set up preconditions for components and interfaces, to adopt beneficial approaches for strategic reuse in software development, etc. Architecture has three roles:
• to explain: architecture explains the structure of the software;
• to guide: architecture guides the designer to follow the predefined, commonly accepted rules;
• to enable: architecture provides high-level mechanisms to implement the requirements set for the product.
In modern software development, the enabling role of architectures in particular has been growing, as reuse becomes an increasing part of development. A similar observation has been made for advanced database system architectures [6,14]. A key observation for database management systems has been that the invariants in database processing determine the architecture of a system. [6] predicted that novel systems such as native XML systems must either use novel architectures or let the user experience the "performance catastrophe". Business information systems that target novel applications, e.g., SOA [15,21], require completely different architectures.
Architecture is a term that must cope with a variety of different aspect reflections and viewpoints. The Quasar model of sd&m [23] distinguishes between the application architecture, which reflects the outside or grey-box view of a system; the technical or module construction architecture, which separates components or modules for construction and implementation; and the technical infrastructure architecture, which considers the embedding of the system into a larger system or into the supporting infrastructure. This separation of concern is similar to the different viewpoints in geometry, which uses the top view, the profile view, and the ground view. These views are only three views out of a large variety of views. We use the following definition of the notion of architecture: A system architecture represents the conceptual model of a system together with models derived from it that represent (1) different viewpoints defined as views on top of the conceptual model, (2) facets or concerns of the system depending on the scope and abstraction level of various stakeholders, (3) restrictions for the deployment of the system and a description of the quality warranties of the system, and (4) embeddings into other (software) systems. (The conceptual model includes structural, behavioural and collaboration elements. Systems might be modularised or can also be monolithic. The conceptual model allows us to derive a specification of the system capacity.) We may distinguish between standard views and views that support different purposes such as system construction, system componentisation, documentation, communication, analysis, evolution or migration, mastering of system complexity, system embedding, and system examination or assessment.
We can distinguish five standard views in an architectural framework: (I) The information or data view represents the data that is required by the business to support its activities; it answers the question of what information is being processed. (II) The functional business or domain view represents all the business processes and activities that must be supported; it answers the question of what business activities are being carried out. (III) The integration or data-flow view represents the flow of information through the business, where it comes from and where it needs to go; it answers the question of which business activities require it.
(IV) The deployment or technology view represents the
physical configuration and technology components used to deploy the architecture in the operating environment; it answers the question of where the information is located. (V) The infrastructure or embedment view represents the system as a black- or grey-box and concentrates on the embedding of the system into other systems that are either supporting the system or are using system services.
Web information systems are typically layered or distributed systems. Layering and distribution result in rather specific data structures and functions that are injected in order to cope with the specific services provided by layers or components. The CottbusNet projects used a multi-layer and distributed environment. For instance, the events calendar in city information systems may use a dozen or more different database systems and a view tower. A view tower of such systems must provide advanced search facilities [4]. It uses views that compile a variety of ETL results into a common view for events, an extraction view for presentation of events at a classical website or at other media such as video text canvas or smart phone display, a derived search functionality for these data, and a collection view for the shopping cart of an event shopper. A similar observation can be made for OLTP-OLAP systems [12,13]. OLAP systems are typically built on top of OLTP systems by first applying grouping and aggregation functions and second by integrating the data obtained into a data mart presentation. In projects aiming at developing web information systems [25] we discovered that interactivity required redevelopment and adjustment of functionality and of the structuring of supporting database systems. Therefore, the presentation layer of a system "struck through" to the support system and resulted in changes to this system. This observation complements observations such as [6,14,21] and shows that web information systems must be built on a more flexible consideration of architectures. These observations can be summarized into the architecture/application impedance mismatch: Architecture solutions heavily influence the capability of a system and must be considered as an orthogonal dimension during systems development.
Outline of the Paper
This paper opens the discussion on Architecture-Driven Modelling Methodologies in connection with large Web Information Systems. The paper has its roots in a joint research project of the co-authors; the project has had connections to other related research activities of the participating organisations, and it is funded by DAAD in Germany and the Academy of Finland. This paper provides an overview of the approach and methodology developed in the project. Section 2 introduces the key concepts of the paper. Sections 3 and 4 cover the bindings of the topic to the state of the art of classical IS methodologies and to the Co-Design approach developed by one of the co-authors [25,19]. Architecture-driven methodologies are discussed in Section 5. The paper summarises the findings of the project by introducing a four-dimensional or four-faceted model of software development in Section 6.
2. Architecture-Driven Modelling of Web Information Systems
2.1. The Challenges of Modern Web-Based and Web Information Systems
Web information systems (WIS) [3,9,20] augment classical information systems by modern Web technologies. They require at the same time a careful development and
support for the interaction or story spaces besides the classical support for the working space of users. These dimensions complicate the system development process. Usually, WIS are data-intensive applications which are backed by a database. While the development of information systems is seen as a complex process, Web information systems engineering adds additional obstacles to this process because of technical and organizational specifics:
• WIS are open systems from any point of view. For example, the user dimension is a challenge. Although the purpose and usage of the system can be formulated in advance, user characteristics cannot be completely predefined. Applications have to be intuitively usable because there cannot be training courses for the users. Non-functional properties of the application like 'nice looking' user interfaces are far more important compared with standard business software. WIS-E is not only restricted to enterprises but is also driven by an enthusiastic community fulfilling different goals with different tools.
• WIS are based on Web technologies and standards. Important aspects are only covered by RFCs because of the conception of the Internet. These (quasi-)standards usually reflect the 'common sense' only, while important aspects are handled individually.
• Looking at the complete infrastructure, a WIS contains software components with uncontrollable properties like faulty, incomplete, or individualistically implemented Web browsers.
• Base technologies and protocols for the Web were defined more than 10 years ago to fulfill the tasks of the World Wide Web as they had been considered at that time. For example, the HTTP protocol was defined to transfer hypertext documents to enable users to browse the Web. The nature of the Web has changed significantly since those days, but there were only minor changes to protocols to keep the Holy Cow of Compatibility alive. Today, HTTP is used as a general purpose transfer protocol which serves as the backbone for complex interactive applications. Shortcomings like statelessness, loose coupling of client and server, or the restrictions of the request-response communication paradigm are covered by proprietary and heavy-weight frameworks on top of HTTP. Therefore, they are not covered by the standard and are handled individually by the framework and the browser, e.g., session management. Small errors may cause unwanted or uncontrollable behavior of the whole application or even security risks.
WIS can be considered from two perspectives: the system perspective and the user perspective. These perspectives are tightly related to each other. We consider the presentation system as an integral part of a WIS. It satisfies all user requirements. It is based on real life cases. Software engineering has divided properties into functional and non-functional properties, restrictions and pseudo-properties. This separation can be understood as a separation into essential properties and non-essential ones. If we consider the dichotomy of a WIS then this separation leads to a far more natural separation into information system requirements and presentation system requirements. The system perspective considers properties such as performance, efficiency, maintainability, portability, and other classical functional and non-functional requirements. Typical presentation system requirements are usability, reliability, and requirements oriented to high quality in use, e.g., effectiveness, productivity, safety, privacy, and satisfaction.
Safety and security are also considered to be restrictions since they specify undesired behavior of systems.
Pseudo-properties are concerned with technological decisions such as language, middleware, or operating system, or are imposed by the user environment, the channel to be used, or the variety of client systems. WIS must provide sophisticated support for a large variety of users, a large variety of usage stories, and for different (technical) environments. Due to this flexibility, the development of WIS differs from the development of information systems by careful elaboration of the application domain and by adaptation to users, stories, environments, etc. Classical software engineering typically climbs down the system ladder to the implementation layer in order to create a productive system. The usual way in today's WIS development is a manual approach: human modelling experts interpret the specification to enrich and transform it along the system ladder. This way of developing specifications is error-prone: even if the specification on a certain layer is given in a formal language, the modelling expert as a human being will not interpret it in a formal way. Misinterpretations, misunderstandings, and therefore the loss of already specified system properties are the usual business.
2.2. The Classical Presentation System Development for Web Information Systems
Classical approaches to web information systems are often based on late integration of presentation systems into the WIS information system. This approach is depicted in Figure 1. Classically, several layers of abstraction are identified. The top layer is called the application domain layer. It is used to describe the system in a general way: What are the intentions? Who are the expected users? The next lower layer is called the requirements prescription layer, which is used to concretise the ideas gathered on the application domain layer. This means getting a clearer picture of the different kinds of users and their profiles. This may also include the different roles of users and the tasks associated with these roles. The major part of this layer, however, deals with the description of the story board. Stories identify possible paths through the system and the information that is requested to enable such paths. So the general purpose of the business layer is to anticipate the behaviour of the system's users in order to set up the system in a way that supports the users as much as possible. The central layer is the conceptual layer. Whilst the requirements prescription layer did not pay much attention to technical issues, they come into play on the conceptual layer. The various scenes appearing in the story board have to be analysed and integrated, so that each scene can be supported by a unit combining some site content with some functionality. This will lead to designing abstract media types. The information content of the media types must be combined to design the structure of an underlying database. The next lower layer is the presentation layer, which is devoted to the problem of associating presentation options to the media types. This can be seen as a step towards implementing the system. Finally, the lowest layer is the implementation layer. All the aspects of the physical implementation have to be addressed on this layer. This includes setting up the logical and physical database schemata, the page layout, the realisation of functionality using scripting languages, etc. As far as possible, components on the implementation layer, especially web pages, should be generated from the description on the higher layers.
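As a toy illustration of generating implementation-layer artefacts from a higher-layer description (the media type structure and field names below are invented for this sketch and are not part of the methodology's tooling), a page could be rendered directly from an abstract content unit:

```python
# Hypothetical sketch: derive a web page from a higher-layer content description.

media_type = {                      # abstract description of a unit ("media type")
    "title": "Events calendar",
    "fields": ["event", "location", "date"],
}
content = [{"event": "Cabaret night", "location": "Restaurant", "date": "2010-06-12"}]

def render_page(media_type, rows):
    """Generate a minimal HTML page from the abstract description and its content."""
    header = "".join(f"<th>{f}</th>" for f in media_type["fields"])
    body = "".join(
        "<tr>" + "".join(f"<td>{row.get(f, '')}</td>" for f in media_type["fields"]) + "</tr>"
        for row in rows
    )
    return (f"<html><head><title>{media_type['title']}</title></head>"
            f"<body><table><tr>{header}</tr>{body}</table></body></html>")

print(render_page(media_type, content))
```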
[Figure 1. The classical dichotomy of human-computer systems and the systems ladder. The diagram relates three layers (description/prescription, conceptual, implementation): the application area description is refined into requirements prescriptions (the WIS description and prescription), designed into an information system specification and a presentation system specification (the WIS specification), and transformed and implemented into the information system and presentation system that together form the web information system.]
This approach has the advantage that the presentation system specification is based on database views. The entire presentation depends on the maturity of the information systems specification. For this reason we may prefer the development according to the methodology depicted in Figure 1 or better in Figure 4.
3. State of the Art and Classical (Web) Information Systems Methodologies ARIS (Architecture of Integrated Information Systems) [16] defines a framework with five views (functional, organizational, data, product, controlling) and three layers (conceptual (‘Fachkonzept’), technical (‘DV-Konzept’), and implementation). ARIS was designed as a general architecture for information systems in enterprise environments. Therefore, it is too general to cover directly the specifics of Web information systems and needs to be tailored. The Rational Unified Process (RUP) [10] is an iterative methodology incorporating different interleaving development phases. RUP is backed by sets of development tools.
RUP is strongly bound to the Unified Modelling Language (UML). Therefore, RUP limits the capabilities for customization. Like ARIS, RUP does not address the specifics of WIS-E. A similar discussion can be made for other general-purpose approaches from software engineering. OOHDM [22] is a methodology which deals with WIS-E specifics. It defines an iterative process with five subsequent activities: requirements gathering, conceptual design, navigational design, abstract interface design, and implementation. OOHDM considers Web applications to be hypermedia applications. Therefore, it assumes an inherent navigational structure which is derived from the conceptual model of the application domain. This is a valid assumption for data-driven (hypermedia-driven) Web applications but does not fit the requirements for Web information systems with dominating interactive components (e.g., entertainment sites) or process-driven applications. There are several other methodologies similar to OOHDM. Like OOHDM, most of these methodologies agree on an iterative process with a strict top-down ordering of steps in each phase. Surprisingly, most of these methodologies consider the implementation step as an 'obvious' one which is done by the way, although specifics of Web applications cause several pitfalls for the inexperienced programmer, especially in the implementation step. Knowledge management during the development cycles is usually neglected. There are several methodologies that cope with personalization of WIS. For example, the HERA methodology [7] provides a model-driven specification framework for personalized WIS supporting automated generation of presentation for different channels, integration and transformation of distributed data, and integration of Semantic Web technologies. Although some methodologies provide a solid ground for WIS-E, there is still a need for enhancing the possibilities for specifying the interaction space of the Web information system, especially interaction stories based on the portfolio of personal tasks and goals. This list of projects is not complete. Most of the projects do not support conceptual development but provide services for presentation layout or playout. The Yahoo pipes project (see http://pipes.yahoo.com) uses mashup services for remixing popular feed types. The Active Record pattern embeds the knowledge of how to interact with the database directly into the class performing the interaction.
4. Co-Design of Web Information Systems
We distinguish a number of facets or views on the application domain. Typical facets to be considered are business procedure and rule facets, intrinsic facets, support technology facets, management and organization facets, script facets, and human behavior. These facets are combined into the following aspects that describe different separate concerns:
• The structural aspect deals with the data which is processed by the system. Schemata are developed which express the characteristics of data such as types, classes, or static integrity constraints.
• The functional aspect considers functions and processes of the application.
• The interactivity aspect describes the handling of the system by the user on the basis of foreseen stories for a number of envisioned actors and is based on media
objects which are used to deliver the content of the database to users or to receive new content.
• The distribution aspect deals with the integration of different parts of the system which are (physically or logically) distributed, by the explicit specification of services and exchange frames.
Each aspect provides different modelling languages which focus on specific needs. While higher layers are usually based on specifications in natural language, lower layers facilitate formally given modelling languages. For example, the classical WIS Co-Design approach uses the Higher-Order Entity Relationship Modelling language for modelling structures, transition systems and Abstract State Machines for modelling functionality, SiteLang for the specification of interactivity, and collaboration frames for expressing distribution. Other languages such as UML may be used depending on the skills of the modelers and programmers involved in the development process. A specification of a WIS consists of a specification for each aspect such that the combination of these specifications (the integrated specification) fulfills the given requirements (a small illustrative sketch of this combination follows below). Integrated specifications are considered on different levels of abstraction (see Figure 2), while associations between specifications on different levels of abstraction reflect the progress of the development process as well as versions and variations of specifications. Unfortunately, the given aspects are not orthogonal to each other in a mathematical sense. Different combinations of specifications for structure, functionality, interactivity, and distribution can be used to fulfill given requirements, while the definition of the 'best combination' relies on non-functional parameters which are only partially given in a formal way. Especially the user perspective of a WIS contributes many informal and vague parameters, possibly depending on intuition. For example, ordering an article in an online shop may be modelled as a workflow. Alternatively, the same situation may be modelled by storyboards for the dialog flow, emphasizing the interactivity part. This principle of designing complex systems is called Co-Design, known from the design process of embedded systems where certain aspects can be realized alternatively in hardware or software (Hardware-Software Co-Design). The Co-Design approach for WIS-E developed in the Kiel project group defines the modelling spaces according to this perception.
We can identify two extremes of WIS development. Turnkey development is typically started from scratch in response to a specific development call. Commercial off-the-shelf development is based on software and infrastructure whose functionality is decided upon by the makers of the software and the infrastructure rather than by the customers. A number of software engineering models have been proposed in the past: waterfall model, iterative models, rapid prototyping models, etc. The Co-Design approach can be integrated with all these methods. At the same time, developers need certain flexibility during WIS engineering. Some information may not be available. We need to consider feedback loops for redoing work that has been considered to be complete. All dependencies and assumptions must be explicit in this case. In [5] we discussed one strategy to incorporate architectural concerns early into website development. The outcome was a methodology with a third development step that aims at the development of a systems architecture before any requirements elicitation is deployed.
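Purely as an illustration of the integrated specification (this is not part of the Co-Design toolset; all class and field names are invented), the four aspect specifications and their combination could be represented as simple containers:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch: one container per Co-Design aspect and one for their combination.

@dataclass
class AspectSpec:
    language: str                      # e.g. "HERM", "ASM", "SiteLang", "collaboration frames"
    artefacts: List[str] = field(default_factory=list)

@dataclass
class IntegratedWISSpec:
    structure: AspectSpec
    functionality: AspectSpec
    interactivity: AspectSpec
    distribution: AspectSpec

    def aspects(self):
        return [self.structure, self.functionality, self.interactivity, self.distribution]

spec = IntegratedWISSpec(
    structure=AspectSpec("HERM", ["event schema", "location schema"]),
    functionality=AspectSpec("ASM", ["ordering workflow"]),
    interactivity=AspectSpec("SiteLang", ["visitor storyboard"]),
    distribution=AspectSpec("collaboration frames", ["service exchange frame"]),
)
print([a.language for a in spec.aspects()])
```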
[Figure 2. Abstraction Layers and Model Categories in WIS Co-Design. The diagram shows the abstraction layers (application domain layer, requirements acquisition layer, business user layer, conceptual layer, implementation layer) connected by the development steps scoping, variating, designing and implementing, and, on the conceptual layer, the four model categories: structuring specification, functionality specification, distribution specification, and dialogue specification.]
Architectural styles provide an abstract description of general characteristics of a solution. The following table lists some of the styles.

Style | Description
Client-Server | Segregates the system into two applications, where the client makes a service request to the server.
Component-Based Architecture | Decomposes application design into reusable functional or logical components that are location-transparent and expose well-defined communication interfaces.
Layered Architecture | Partitions the concerns of the application into stacked groups (layers).
Message-Bus | A software system that can receive and send messages based on a set of known formats, so that systems can communicate with each other without needing to know the actual recipient.
N-tier / 3-tier | Segregates functionality into separate segments in much the same way as the layered style, but with each segment being a tier located on a physically separate computer.
Object-Oriented | An architectural style based on the division of tasks for an application or system into individual reusable and self-sufficient objects, each containing the data and the behaviour relevant to the object.
Separated Presentation | Separates the logic for managing user interaction from the user interface (UI) view and from the data with which the user works.
SOA | Refers to applications that expose and consume functionality as a service using contracts and messages.
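To make two of these styles concrete, here is a minimal, hypothetical Python sketch (names and data invented, not taken from the paper) combining the Layered and Separated Presentation styles: data access, business logic, and presentation are kept in separate classes with one-way dependencies.

```python
# Hypothetical sketch of a layered style with separated presentation.

class EventRepository:                      # data layer
    def __init__(self):
        self._events = [{"title": "Cabaret night", "city": "Cottbus"}]

    def find_by_city(self, city):
        return [e for e in self._events if e["city"] == city]

class EventService:                         # business layer, depends only on the data layer
    def __init__(self, repository):
        self._repository = repository

    def upcoming_events(self, city):
        return self._repository.find_by_city(city)

class EventView:                            # presentation layer, no business or data logic
    def render(self, events):
        return "\n".join(e["title"] for e in events)

# Wiring the layers; in an n-tier deployment each layer could live on its own node.
service = EventService(EventRepository())
print(EventView().render(service.upcoming_events("Cottbus")))
```

In a 3-tier or SOA variant the same separation would hold, but the service layer would be exposed through remote interfaces or service contracts.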
Each of these styles has strengths, weaknesses, opportunities, and threats. Strengths and opportunities of certain architectural styles are widely discussed. Weaknesses and threats are discovered after implementing and deploying the decision. For instance, the strengths of SOA (service-oriented architecture) are domain alignment, abstraction, reusable
components, and discoverability. Weaknesses of SOA are the acceptance of SOA within the organization, the harder aspects of architecture and service modeling, implementation difficulties for a team, the methodologies and approaches for implementing SOA, and the missing evaluations of various commercial products that purport to help with SOA rollouts. Threats to SOA are the development of a proper architectural plan, the process plan, resource scope, the application of an iterative methodology, the existence of a governance strategy, and the agreement on clear acceptance criteria. Therefore, the selection of an architecture has a deep impact on the web information system itself and drives the analysis, design and development of such systems. Figures 1 and 4 consider a separation of systems into a presentation system and a support system, i.e. the classical client-server decision. The picture is more complex if we decide to use 3-tier, SOA or other architectures. The structuring and the functionality that are provided by each of the subsystems must be properly designed. Therefore, the architectural style is going to drive the development process.
5. Architecture-Driven and Application-Domain-Ruled Modelling Methodologies
The project we report on aimed at bridging two technologies developed in the research groups at Kiel and Tampere universities. The Tampere team has in the past concentrated on software development technologies and methodologies. They have been contributing to corresponding standards. The Kiel team has gained deep insight into web information systems development. In the past the two groups have already collaborated on the development of a web information systems design methodology. We built a framework that is based on early architectural decisions or on the integration of new solutions into existing architectures. We aim at the development of novel approaches to web information systems development that allow a co-evolution of architectures and software systems.
WIS development results in a number of implemented features and aspects. These features and aspects are typically well understood since they are similar to classical software products. One dimension that has often been taken into consideration at the intentional level is the level of detail or granularity of the description. Classical database schemata are, for instance, schemata at the schema level of detail. This schema level is extended by views within the three-level architecture of database systems. These views are typically based on macro-schemata. Online analytical processing and data warehouse applications brought another level of detail and are based on aggregated data. Content management systems are additionally based on annotations of data sets and on concepts that explain these data sets and provide their foundation. Finally, scientific applications require another schema design since they use sensor data which are compacted and coded. These data must be stored together with the 'normal' data. The architectural component has been neglected for most systems since the architecture has been assumed to be canonically given. This non-consideration has led to a number of competing architectures for distributed, mainframe or client-server systems. These architectures can, however, be considered as elements of the architecture solution space. Therefore, the development space for software systems development can be considered to be three-dimensional. Figure 3 displays this space.
Web information systems development has sharpened the conflicting goals of system development. We must consider at the same time a bundle of different levels of details,
languages and schemata.
[Figure 3. The Development Space for Web Information Systems. The diagram spans three dimensions: the features and aspects of the system, the products of development, and the architectures (mainframe, client/server, federated, collaborated/collaborating, on demand).]
Systems will not provide all features and aspects to all users. Users will only get those services that are necessary for their work. At the same time, a number of architectural solutions must co-exist.
5.1. Development by Separation of Concern
Our approach concentrates on the separation of concern for development. We distinguish the user request diploid within a development: Application domain modelling aims at meeting the expectations of the users depending on their profile and their work portfolio. Users want to see a system as a companion and do not wish to need additional education before they can use a system. Architecture modelling proposes a realisation alternative. This architecture is typically either based on already existing solutions or must be combined with the user system. Separation of concern for development allows us to decompose an application into fields of action, thought or influence. All components have an internal structure formed from a set of smaller interlocking components (sub-components) performing well-defined functions within the overall application domain. Separation of concern covers the what, who, when and (if it is relevant) the why aspects of the business and allows us to identify 'owners' and 'influencers' of each significant business activity that we need to consult whenever we want to change any of these aspects. A prescriptive (i.e., principles-driven) separation is easier to justify to business stakeholders when proposals are put forward to restructure a business activity to improve overall efficiency. Functional business areas have a high influence on a system. They are identifiable vertical business areas such as finance, sales & marketing, human resources or product manufacturing; in other cases, they are cross-functional "horizontal" areas such as customer service or business intelligence. Therefore, the business areas already govern the architecture of a system. The establishment of "ownership" of an information flow makes the owner responsible for making the data available to other business areas as and when those business areas require it. "Influencers" of an information flow need to be consulted when any changes are proposed to ensure that they can comply with the
change. Coherence boundaries are the points at which different functional business areas have to communicate with the outside world in a consistent and grammatically structured language.
This request diploid is then mapped to different systems and can be separated as shown in Figure 1. We typically distinguish between the user system, e.g. consisting of the presentation system and possibly of supporting systems, and the computer system, which uses a certain architecture and platform and leads to an implementation. Based on the abstraction layer model in Figure 2 we may distinguish different realisations of systems:
Information-systems-driven development is based on late integration of the presentation and user system. Presentation systems are either developed after the conceptualisation has been finished (this leads to the typical ladder in Figure 1) or are started after the implementation has been developed. In this case we distinguish the following phases: 1. application domain description; 2. requirements elicitation, acquisition, and compilation prescription; 3. business user layer; 4. conceptual layer; 5. implementation layer.
Web information systems use more flexible architectures. Their development is often intentionally based on the development methodologies presented in Figure 4. So far, no systematic development of a methodology besides the methodology developed in our collaboration has been made. We typically may distinguish the following phases: 1. application domain description; 2. requirements elicitation, acquisition, and compilation prescription; 3. conceptual systems layer; 4. presentation systems layer; 5. implementation layer.
Additionally we may also consider the deployment, maintenance, etc. layers. We restricted our project to the layers discussed above.
5.2. Abstraction Layering During Systems Development
Our approach allows us to integrate architecture development with systems development. Top-down development of systems seems to be the most appropriate whenever a system is developed from scratch or a system is extended. For this reason, we may differentiate among three layers: the systems description and prescription layer, the conceptual specification layer, and the systems layer. These layers may be extended by the strategic layer that describes the general intention of the system, by the business user layer that describes how business users will see the system, and by the logical layer that relates the conceptual layer to the systems layer by using the systems languages for
programming and specification. Figure 4 relates the three main layers of systems development. The system ladder distinguishes at least between the following refinement layers: description/prescription, specification, and implementation. The refinement layers allow us to concentrate on different aspects of concern. At the same time, refinement is based on refinement decisions which should be explicitly recorded. The implementation is the basis for the usage. The dichotomy distinguishes between the user world and the system world. They are related to each other through user interfaces. So, we can base WIS engineering on either the user world description, the systems prescription, the developers' presentation specification, or the developers' systems specification. We may extend the ladder by the introduction layer, the deployment layer, and the maintenance layer. Since these last layers are often considered to be orthogonal to each other and we are mainly discussing WIS engineering, these three layers are out of our scope.
5.3. Another Dichotomy for Web Information Systems Development
We thus develop another methodology for web information systems. WIS have two different faces: the systems perspective and the user perspective. These perspectives are tightly related to each other. We consider the presentation system as an integral part of a WIS. It satisfies all user requirements. It is based on real life cases. The dichotomy is displayed in Figure 4, where the right side represents the system perspective and the left side of the ladder represents the user perspective. Software engineering has divided properties into functional and non-functional properties, restrictions and pseudo-properties. This separation can be understood as a separation into essential properties and non-essential ones. If we consider the dichotomy of a WIS then this separation leads to a far more natural separation into information system requirements and presentation system requirements. The system perspective considers properties such as performance, efficiency, maintainability, portability, and other classical functional requirements. Typical presentation system requirements are usability, reliability, and requirements oriented to high quality in use, e.g., effectiveness, productivity, safety, privacy, and satisfaction. Safety and security are also considered to be restrictions since they specify undesired behaviour of systems. Pseudo-properties are concerned with technological decisions such as language, middleware, or operating system, or are imposed by the user environment, the channel to be used, or the variety of client systems.
6. Extending the Triptych to the Software Modelling Quadruple
We are going to combine the results of the first three solutions into architecture development. One dimension of software engineering that has not yet been well integrated is the software architecture. Modelling has different targets and quality demands depending on the architecture. For instance, mainframe-oriented modelling concentrates on the development of a monolithic schema with support by view schemata for different aspects of the application. Three-tier architectures separate the system schema into presentation schemata, business process schemata and supporting database schemata based on separation of concern and information hiding.
[Figure 4. The dichotomy of human-computer systems and the systems ladder. The diagram spans the description/prescription, conceptual and implementation layers; on the user side, the application area description leads via design and refinement to the presentation system specification and the presentation system, on the system side to the information systems specification and the information system; together they form the requirements prescriptions (WIS description and prescription), the WIS specification, and finally the web information system.]
Component architectures are based on
'meta-schemata' that describe the intention of the component, the interfaces provided by the component, and the bindings among the interfaces. SOA architectures encapsulate functionality and structuring into services and use orchestration for the realisation of business tasks through mediators. Therefore, the application domain description is going to be extended by consideration of architectures and environments. Software architecture is often considered from the technical or structural point of view and shows the association of modules or packages of software. Besides this structural point of view we consider the application architecture that illustrates the structure of the software from the application domain perspective. Additionally we might include the perspective of the technical infrastructure, e.g. the periphery of the system. These three viewpoints are among the most important viewpoints of the same architecture. We call such an architecture documentation an architecture blueprint. Summarizing, we find four interwoven parts of a software system documentation that we need to consider; they are depicted in Figure 5. The tasks and the objectives of (conceptual) modelling change depending on the architecture that has been chosen for the system.
6.1. The Prescription of Requirements
Architecture has an impact on the development of early phases. We first consider requirements description.
Software engineering has divided properties into functional and non-functional properties, restrictions and pseudo-properties. This separation can be understood as a separation into essential properties and non-essential ones. If we consider the dichotomy of a WIS then this separation leads to a far more natural separation into information system requirements and presentation system requirements. The system perspective considers properties such as performance, efficiency, maintainability, portability, and other classical functional requirements. Typical presentation system requirements are usability, reliability, and requirements oriented to high quality in use, e.g., effectiveness, productivity, safety, privacy, and satisfaction. Safety and security are also considered to be restrictions since they specify undesired behaviour of systems. Pseudo-properties are concerned with technological decisions such as language, middleware, or operating system, or are imposed by the user environment, the channel to be used, or the variety of client systems. Properties are often difficult to specify and to check. We should concentrate on those and only those properties that can be shown to hold for the desired system. Since we are interested in proving or checking the adherence of the system to the properties, we need to define properties in such a way that tests or proofs can be formulated. They need to be adequate, i.e. cover what business users expect. At the same time, they need to be implementable. We also must be sure that they can be verified and validated.
6.2. Architecture-Driven System Development
WIS specification is often based on an incremental development of WIS components, their quality control and their immediate deployment when the component is approved. The development method is different from those we have used on the first layers. Application domain description aims at capturing the entire application based on exploration techniques. Requirements prescription refines the application domain description. Specification is based on incremental development, verification, model checking, and testing. This incremental process leads to different versions of the WIS: demo WIS, skeleton WIS, prototype WIS, and finally approved WIS. Software becomes surveyable, extensible and maintainable if a clear separation of concerns and application parts is applied. In this case, a skeleton of the application structure is developed. This skeleton separates parts or services. Parts are connected through interfaces. Based on this architecture blueprint, an application can be developed part by part. We combine modularity, star structuring, co-design, and architecture development into a novel framework based on components. Such a combination may not seem feasible. We discover, however, that we may integrate all these approaches by using a
component-based approach [26,27]. This skeleton can be refined during evolution of the schema. Then, each component is developed step by step. Structuring in component-based co-design is based on two constructs:
Components: Components are the main building blocks. They are used for structuring the main data. The association among components is based on 'connector' types (called hinge or bridge types) that enable associating the components in a variable fashion.
Skeleton-based construction: Components are assembled together by application of connector types. These connector types are usually relationship types.
A typical engineering approach to the development of large conceptual models is based on general solutions, on an architecture of the solution and on combination operations for parts of the solution. We may use a two-layer approach for this kind of modelling. First, generic solutions are developed. We call these solutions a conceptual schema pattern set. The architecture provides a general development contract for subparts of a schema under development. The theory of conceptual modelling may also be used for the selection and development of an assembly of modelling styles and perspectives. Typical well-known styles [24] are inside-out refinement, top-down refinement, bottom-up refinement, modular refinement, and mixed skeleton-driven refinement. A typical perspective is the three-layer architecture that uses a conceptual model together with a number of external models and an implementation model. Another perspective might be the separation into an OLTP-OLAP-DW system. The adaptation of a conceptual schema pattern set to development contracts and of styles and perspectives leads to a conceptual schema grid.
6.3. Architecture Blueprint
An architecture blueprint consists of models, documents, artifacts, deliverables etc. which are classified by the following states: The architecture framework consists of the information or data view, the functional business or domain view, the integration or data-flow view, the deployment or technology view, and the infrastructure or embedment view. The WIS development architectures guide: The current architecture is the set of all solution architecture models that have been developed by the delivery projects to date. Ownership of the solution architecture models is transferred to the current Enterprise Architecture when the delivery project is closed. The development state architecture represents the total set of architecture models that are currently under development within the current development projects. The target vision state architecture provides a blueprint for the future state of the architecture needed in order to satisfy the application domain descriptions and the target operating model.
7. Applying Architecture-Driven and Application-Domain-Ruled Modelling Methodologies
7.1. The CottbusNet Design and Development Decisions
Let us consider the event calendar in an infotainment setting of a city information system. This calendar must provide a variety of very different information from various heterogeneous resources:
• Event-related information: Which event is performed by whom? Where are the actors from? How does the event proceed?
• Location-based information: Which location can be reached by which traffic under which conditions and with whose support?
• Audience information: Which audience is sought under which conditions and regulations, and with which support?
• Marketing information: Which provider or supplier markets the event under which time restrictions and with which business rules?
• Time-related information: Which specific time data should be provided together with events?
• Intention information: Are there intentions of the event that should be provided?
The event calendar is based on a number of different databases: event databases for big events, marketing events, sport events, cultural events, minor art events etc.; location databases for the support of visitors of the event, also providing traffic, parking etc. information; auxiliary databases for business rules, time, regulations, official restrictions, art or sport activists, reports on former events etc. It is not surprising that this information is provided by heterogeneous databases, in a variety of formats, in a large bandwidth of data quality, and under a variety of update policies. Additionally, it is required to deliver the data to the user in the right size and structuring, at the right moment, and under consideration of the user's information demand. Consider, for instance, minor art events such as a cabaret event held in a restaurant. The information on this event is typically incomplete, not very current, partially inexact and partially authorised. The infotainment site policy, however, requires coping with such kinds of events as well. We might now consider a number of architectures, e.g., the following ones:
• Server-servlet-applet-client layered systems typically use a ground database system with the production data, a number of serving database systems with the summarised and aggregated data based on media type technology [17], and playout systems based on container technology [13], depending on adaptation to the storyboard [18].
• OLTP-OLAP-Warehouse systems [11,12] use a ground database system for OLTP computing, a derived (summarised, aggregated) OLAP system for comprehensive data delivery to the user, and a number of data warehouses for data playout to the various kinds of users.
Depending on these architectures we must enhance and extend the conceptual schema for the different databases and the workflow schemata for data input, storage, and data playout to the user.
7.2. The Resulting Quality of Service and Tracking Problems Back to Decisions Made
Quality of WIS is characterized depending on the abstraction layers [8]: Quality parameters at the business user layer may include ubiquity (access unrestricted in time and space) and security/privacy (against failures, attacks, errors; trustworthiness; privacy maintenance). Quality parameters at the conceptual layer subsume interpretability (a formal framework for interpretation) and consistency (of data and functions).
Quality parameters at the implementation layer include durability (access to the entire information unless it is explicitly overwritten), robustness (based on a failure model for resilience, conflicts, and persistency), performance (depending on the cost model, response time and throughput), and scalability (to changes in services, number of clients and servers). We use a number of measures that define quality of service (QoS) for WIS:
• Deadline Miss Ratio of User Transactions: In a WIS QoS specification, a developer can specify the target deadline miss ratio that can be tolerated for a specific real-time application.
• Data Freshness: We categorize data freshness into database freshness and perceived freshness. Database freshness is the ratio of fresh data to the entire temporal data in a database. Perceived freshness is the ratio of fresh data accessed to the total data accessed by timely transactions, i.e. transactions which finish within their deadlines.
• Overshoot is the worst-case system performance in the transient system state. In this paper, it is considered to be the highest miss ratio over the miss ratio threshold in the transient state. In general, a high transient miss ratio may imply a loss of profit in e-commerce.
• Settling time is the time for the transient overshoot to decay and reach the steady-state performance.
• Freshness of Derived Data: To maintain freshness, a derived data object has to be recomputed as the related ground database changes. A recomputation of derived data can be relatively expensive compared to a base data update.
• Differentiated Timeliness: In WIS QoS requirements, the relative response time between service classes can be specified. For example, the relative response time can be specified as 1:2 between premium and basic classes.
We observe that these quality of service characteristics are difficult to specify in systems if the architecture is not taken into consideration. Let us consider data freshness as an example for WIS. Data freshness is related to information logistics, which aims at providing the correct data at the best point of time, in the agreed format and quality, for the right user, at the right location and context. Methods for achieving the logistics goals are the analysis of the information demand, storyboarding of the WIS, an intelligent information system, the optimization of the flow of data, and technical and organizational flexibility. Therefore, data freshness can be considered to be a measure for the appropriateness of the system. Depending on the requested data freshness we derive the right architecture of the system.
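Read as ratios, the two freshness measures can be computed directly; the following sketch uses invented data purely for illustration and is not taken from the paper:

```python
# Hypothetical sketch of the two freshness ratios described above.

def database_freshness(temporal_objects):
    """Ratio of fresh objects to all temporal data objects in the database."""
    return sum(1 for obj in temporal_objects if obj["fresh"]) / len(temporal_objects)

def perceived_freshness(accessed_objects):
    """Ratio of fresh objects among the data accessed by timely transactions."""
    return sum(1 for obj in accessed_objects if obj["fresh"]) / len(accessed_objects)

data = [{"id": i, "fresh": i % 4 != 0} for i in range(100)]   # 75 of 100 objects are fresh
accessed_by_timely_transactions = data[:20]                   # what timely transactions read
print(database_freshness(data), perceived_freshness(accessed_by_timely_transactions))
```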
• Introduction of a tolerance model: We may introduce an explicit tolerance model that restricts the requirement of data actuality to those web pages for which complete actuality is essential.
• A cost-benefit model of updates: Updates may sometimes cause a large overhead of internal computing due to constraint maintenance and due to propagation of the update to all derived data. We thus may introduce delays of updates and specific update obligations for certain points in time. Typical resulting techniques are dynamic adaptation of updates and explicit treatment by an update policy.
• Data replication in a distributed environment: Data access can be limited in networking environments. The architecture may however introduce explicit data replication and specific update models for websites.
This list of techniques is not complete but demonstrates the potential of architecture-driven WIS development.
8. Conclusion
This paper discusses the results of a project that aimed at developing a methodological approach to web information systems development. Most approaches known so far did not take architectural issues into consideration. Typically, they are taken for granted or assumed by default. This paper shows that architectures have a deep impact on the development methodology. We took web information systems development as an example. These systems are typically based on 2-tier architectures. The information system development part is very well understood. The presentation system development is often mixed with the information system development. The two should not, however, be mixed. We separate these two systems from each other. In doing so we discover that the application domain description fits very well with the support by the presentation system. This description is the source for the requirements prescription. The latter results in the software specification and later in the development and coding of the system. The presentation system conceptualisation and coding can be done either before or after considering the information system. Classical approaches consider three facets of system development: application domain description, requirements prescription and software specification. We discover in this paper that there is a fourth facet that cannot be neglected: the architecture of the system. Therefore, we extend the classical framework to the software modelling quadruple.
References
[1] B. Boehm. A view of 20th and 21st century software engineering. In Proc. ICSE'06, pages 12–29, ACM Press, 2006.
[2] B. Boehm, D. Port, and K. Sullivan. White paper for value based software engineering. http://www.isis.vanderbilt.edu/sdp/Papers/, May 2007.
[3] Stefano Ceri, Piero Fraternali, and Maristella Matera. Conceptual modeling of data-intensive web applications. IEEE Internet Computing, 6(4):20–30, 2002.
[4] A. Düsterhöft and B. Thalheim. Linguistic based search facilities in snowflake-like database schemes. Data and Knowledge Engineering, 48:177–198, 2004.
[5] G. Fiedler, H. Jaakkola, T. Mäkinen, B. Thalheim, and T. Varkoi. Co-design of web information systems supported by SPICE. In Information Modelling and Knowledge Bases, volume XX, pages 123–138, Amsterdam, 2009. IOS Press.
[6] T. Härder. XML databases and beyond - plenty of architectural challenges ahead. In ADBIS, volume 3631 of Lecture Notes in Computer Science, pages 1–16. Springer, 2005.
[7] G.-J. Houben, P. Barna, F. Frasincar, and R. Vdovjak. HERA: Development of semantic web information systems. In Third International Conference on Web Engineering – ICWE 2003, volume 2722 of LNCS, pages 529–538. Springer-Verlag, 2003.
[8] H. Jaakkola and B. Thalheim. A framework for high quality software design and development: A systematic approach. IET Software, 2010. To appear.
[9] G. Kappel, B. Pröll, S. Reich, and W. Retschitzegger, editors. Web Engineering: Systematische Entwicklung von Web-Anwendungen. dpunkt, 2003.
[10] Philippe Kruchten. The Rational Unified Process - An Introduction. Addison-Wesley, 1998.
[11] H.-J. Lenz and B. Thalheim. OLAP databases and aggregation functions. In Proc. SSDBM 2001, pages 91–100. IEEE, 2001.
[12] H.-J. Lenz and B. Thalheim. OLTP-OLAP schemes for sound applications. In TEAA 2005, volume LNCS 3888, pages 99–113, Trondheim, 2005. Springer.
[13] J. Lewerenz, K.-D. Schewe, and B. Thalheim. Modeling data warehouses and OLAP applications by means of dialogue objects. In Proc. ER'99, LNCS 1728, pages 354–368. Springer, Berlin, 1999.
[14] Peter C. Lockemann. Information system architectures: From art to science. In BTW, volume 26 of LNI, pages 30–56. GI, 2003.
[15] C. Pahl, W. Hasselbring, and M. Voss. Service-centric integration architecture for enterprise software systems. J. Inf. Sci. Eng., 25(5):1321–1336, 2009.
[16] A.-W. Scheer. Architektur integrierter Informationssysteme - Grundlagen der Unternehmensmodellierung. Springer, Berlin, 1992.
[17] K.-D. Schewe and B. Thalheim. Modeling interaction and media objects. In Proc. NLDB 2000, LNCS 1959, pages 313–324. Springer, 2001.
[18] K.-D. Schewe and B. Thalheim. Reasoning about web information systems using story algebra. In ADBIS'2004, LNCS 3255, pages 54–66, 2004.
[19] K.-D. Schewe and B. Thalheim. The co-design approach to web information systems development. International Journal of Web Information Systems, 1(1):5–14, March 2005.
[20] K.-D. Schewe and B. Thalheim. Conceptual modelling of web information systems. Data and Knowledge Engineering, 54:147–188, 2005.
[21] T. Schmedes. Entwurfsmethode für service-orientierte Architekturen im dezentralen Energiemanagement. In Multikonferenz Wirtschaftsinformatik. GITO-Verlag, Berlin, 2008.
[22] D. Schwabe, G. Rossi, and S. Barbosa. Systematic hypermedia design with OOHDM. In Proc. Hypertext '96, pages 116–128. ACM Press, 1996.
[23] J. Siedersleben. Moderne Softwarearchitektur. dpunkt-Verlag, Heidelberg, 2004.
[24] B. Thalheim. Entity-relationship modeling – Foundations of database technology. Springer, Berlin, 2000.
[25] B. Thalheim. Co-design of structuring, functionality, distribution, and interactivity of large information systems. Technical Report 15/03, BTU Cottbus, Computer Science Institute, Cottbus, September 2003. 190 pp.
[26] B. Thalheim. Application development based on database components. In Y. Kiyoki and H. Jaakkola, editors, EJC'2004, Information Modeling and Knowledge Bases XVI. IOS Press, 2004.
[27] B. Thalheim. Component development and construction for database design. Data and Knowledge Engineering, 54:77–95, 2005.
Acknowledgement We would like to thank the Academy of Finland and the German Academic Exchange Service (DAAD) for the support of this research.
An Emotion-Oriented Image Search System with Cluster based Similarity Measurement using Pillar-Kmeans Algorithm
Ali Ridho BARAKBAH a and Yasushi KIYOKI b
a Graduate School of Media and Governance, Keio University, Japan
b Faculty of Environmental Information, Keio University, Japan
5322 Endoh, Fujisawa, Kanagawa, Japan, 252-8520
[email protected], [email protected]
Abstract. This paper presents an image search system with an emotion-oriented context recognition mechanism. Our motivation for implementing an emotional context is to let users express their impressions for the retrieval process in the image search system. This emotional context recognizes the most important features by connecting the user's impressions to the image queries. The Mathematical Model of Meaning (MMM: [2], [4] and [5]) is applied to recognize a series of emotional contexts and to retrieve the impressions most highly correlated with the context. These impressions are then projected onto a color impression metric to obtain the most significant colors for subspace feature selection. After applying subspace feature selection, the system clusters the subspace color features of the image dataset using our proposed Pillar-Kmeans algorithm. The Pillar algorithm optimizes the initial centroids for K-means clustering. It is robust and effective for initial centroid optimization because it positions all centroids far apart from each other in the data distribution. The algorithm is inspired by the observation that pillars distributed as far as possible from each other within the pressure distribution of a roof can withstand the roof's pressure and stabilize a house or building. It treats the pillars, which should be located as far as possible from each other to withstand the pressure distribution of a roof, as the centroids placed among the gravity weight of the data distribution in the vector space. Therefore, the algorithm designates the positions of the initial centroids at the farthest accumulated distance from each other in the data distribution. The cluster based similarity measurement also involves a semantic filtering mechanism. This mechanism filters out image data items that are unimportant to the context in order to speed up the computational execution of the image search process. The system then clusters the image dataset using our Pillar-Kmeans algorithm. The centroids of the clustering results are used for calculating the similarity measurements to the image query. We evaluate the proposed system experimentally with the Ukiyo-e image dataset from the Tokyo Metropolitan Library, representing the Japanese cultural image collections. Keywords. Image search, emotional context, multi-query images, subspace feature selection, cluster based similarity.
1. Introduction
The World Wide Web has become a significant source of information, including image data. Every day, abundant information resources are transformed and collected into huge databases, which makes it difficult to process and analyze the data without automatic approaches and techniques. With respect to image data, many researchers and developers have built efficient image searching, browsing, and retrieval systems in order to provide better ways and approaches for such activities.
Image retrieval systems based on content are an attractive and challenging research area in image searching. Many content-based image retrieval (CBIR) systems have been proposed and widely applied for both commercial purposes and research. Such a system analyzes the content of an image by extracting primitive features such as color, shape, texture, etc. Many approaches have been introduced to explore the content of an image and identify its primary and dominant features. QBIC [3] introduced an image retrieval system based on the color information inside an image. VisualSeek [7] represented a system by diagramming spatial arrangements based on a representation of color regions. NETRA [8] developed a CBIR system by extracting color and texture features. Virage [6] utilized color, texture, and shape features for its image retrieval engine. CoIRS [10] introduced a cluster-oriented image retrieval system based on color, shape, and texture features. Veltkamp and Tanase [9] and Liu et al. [11] presented surveys of many image retrieval systems using diverse features. Barakbah and Kiyoki introduced an image retrieval system combining color, shape and structure features [12].
Figure 1. System architecture of our proposed image search system
Several studies have addressed emotion recognition problems for image retrieval systems. Such search systems commonly construct the emotion model from the user's interaction with the system [17]. Park and Lee [18] introduced an emotion-based image retrieval system driven by users. The system constructed emotion recognition by analyzing consistency feedback from the users. Solli and Lenz [19] developed an image retrieval system involving bags of emotions. The system used color emotion models derived from psychophysical experiments, namely activity, weight and heat. However, it did not directly connect queries of emotional expressions to the models. Wang and He [20] presented a survey on emotional semantic image
retrieval. Supervised learning techniques are usually used to bridge the semantic gap between image features and emotional semantics. This paper presents an image search system with an emotion-oriented context recognition mechanism that connects a series of emotion expressions to color-based impressions. Our search system addresses a dynamic manipulation of unsupervised emotion recognition. Our motivation for implementing an emotional context in the image search system is to let users express their impressions for the retrieval process. This emotional context recognizes the most important features by connecting the user's impressions to the image queries. In this system, the Mathematical Model of Meaning (MMM: [2], [4] and [5]) is applied and transformed to the color features with a color impression metric for subspace feature selection. Our previous work [14] presented how to connect the user's impressions to the queries by involving a series of emotional contexts (such as "happy", "calm", "beautiful", "luxurious", etc.) and how to recognize the most important features of the image dataset and the image query. This paper continues our previous work by expanding the MMM vector space ([2], [4], [5]) with the list of impressions in the Color Image Scale. This paper also introduces a multi-query image search system by applying an aggregation mechanism that generates representative query colors for processing multi-query images. This paper implements a cluster-based similarity measurement in order to group the similar colors of the subspace color features in the process of similarity measurement. We apply our previous work, the Pillar-Kmeans algorithm, for the cluster-based similarity measurement, involving a semantic filtering mechanism to filter out the irrelevant data. The Pillar-Kmeans algorithm is K-means clustering optimized by our Pillar algorithm, which generates the initial centroids for K-means. Applying our Pillar-Kmeans algorithm for cluster-based similarity measurement is important to reach a high precision of the clustering result as well as to speed up the computational time of the clustering. Figure 1 shows the system architecture of the proposed image search system. We organize this paper as follows. In Section 2, the emotional context recognition mechanism using MMM is described. Section 3 discusses the feature extraction, the representative query color generation for multi-image queries and the subspace feature selection with a color impression metric. The cluster-based similarity measurement using the Pillar-Kmeans algorithm with a semantic filtering mechanism is described in Section 4. Section 5 describes the experimental results using the Ukiyo-e image dataset and discusses the performance analysis, followed by concluding remarks in Section 6.
2. Emotional Context Recognition Mechanism
Our idea for recognizing an emotional context in the image search system is to provide a function in which the users can express their impressions (such as "happy", "calm", "beautiful", "luxurious", etc.) for image search. This function finds the most essential features related to an emotional context, given as the user's impressions of the image query. The Mathematical Model of Meaning (MMM) is applied to recognize a series of emotional contexts and retrieve the impressions most highly correlated with the
context. In this section, the outline of the Mathematical Model of Meaning (MMM) is briefly reviewed. This model has been presented in detail in [2], [4] and [5]. 2.1. An overview of the Mathematical Model of Meaning In the Mathematical Model of Meaning [2][4][5], an orthogonal semantic space is created for semantic associative search. Retrieval candidates and queries are mapped onto the semantic space. The semantic associative search is performed by calculating the correlation of the retrieval candidates and the queries on the semantic space in the following steps: (1) A context represented as a set of impression words is given by a user, as shown in Figure 2(a). (2) A subspace is selected according to the given context, as shown in Figure 2(b). (3) Each information resource is mapped onto the subspace and the norm of p is calculated as the correlation value between the context and the information resource, as shown in Figure 2(c).
Figure 2. Semantic associative search in MMM
2.2. The outline of semantic associative search in MMM The outline of the MMM is expressed as follows [2][4][5]: (1) A set of m words is given, and each word is characterized by n features. That is, an m by n matrix M is given as the data matrix. (2) The correlation matrix M^T M with respect to the n features is constructed from the matrix M. Then, the eigenvalue decomposition of the correlation matrix is computed and the eigenvectors are normalized. The orthogonal semantic space MDS is created as the span of the eigenvectors which correspond to nonzero eigenvalues. (3) Context words are characterized by using the n features and represented as n-dimensional vectors. (4) The context words are mapped into the orthogonal semantic space by computing the Fourier expansion of the n-dimensional vectors. (5) A set of all the projections from the orthogonal semantic space to the invariant subspaces (eigenspaces) is defined. Each subspace represents a phase of meaning, and it corresponds to a context or situation. (6) A subspace of the orthogonal semantic space is selected according to the user's impressions expressed as context words in n-dimensional vectors, which are given as a context represented by a sequence of words.
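The following is a minimal sketch of steps (1)-(4) above. The toy word-feature matrix, its size and the example context vector are invented for illustration only and do not come from the MMM implementation described in [2][4][5].

```python
# Sketch of MMM space creation: correlation matrix, eigendecomposition,
# and Fourier expansion of a context word (toy data, not the authors' code).
import numpy as np

# (1) m words characterized by n features: an m x n data matrix M.
M = np.array([[1.0, 0.0, 0.5],
              [0.2, 1.0, 0.0],
              [0.0, 0.3, 1.0],
              [0.7, 0.1, 0.2]])          # 4 hypothetical words, 3 features

# (2) Correlation matrix M^T M, its eigendecomposition, normalized eigenvectors.
C = M.T @ M
eigvals, eigvecs = np.linalg.eigh(C)     # symmetric matrix, so eigh is appropriate
keep = eigvals > 1e-10                   # keep eigenvectors with nonzero eigenvalues
Q = eigvecs[:, keep]                     # orthonormal basis of the semantic space MDS

# (3)-(4) A context word given as an n-dimensional feature vector is mapped into
# the space by computing its coordinates (Fourier coefficients) with respect to Q.
context_word = np.array([0.5, 0.8, 0.1])
fourier_coefficients = Q.T @ context_word
print(fourier_coefficients)
```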
The dynamic interpretation of the meaning of data according to the given context words is realized through the selection of a semantic subspace from the entire semantic space, which consists of approximately 2000 orthogonal vectors. A subspace is extracted by the semantic projection operator when context words, i.e., the user's impressions, are given. Thus, vectors of document data in the semantic subspace have norms adjusted according to the given context words. The semantic interpretation is performed as projections of the semantic space dynamically, according to the given contexts, as shown in Figure 3. This process has been presented in our previous works [2][4][5] and is described as follows. 1. Defining a set of the semantic projections Πν: We consider the set of all the projections from the semantic space I to the invariant subspaces (eigenspaces). We refer to such a projection as a semantic projection and to the corresponding projected space as a semantic subspace. Since the number of i-dimensional invariant subspaces is v(v – 1)…(v – i + 1) / i!, the total number of the semantic projections is 2^v. That is, this model can express 2^v different phases of meaning. 2. Constructing the semantic operator Sp: Suppose a sequence sℓ of ℓ words which determines the context is given. We construct an operator Sp to determine the semantic projection according to the context. We call this operator a semantic operator. (a) First we map the ℓ context words in databases to the semantic space I. Mathematically, this means that we execute the Fourier expansion of the sequence sℓ in I and seek the Fourier coefficients of the words with respect to the semantic elements. This corresponds to seeking the correlation between each context word of sℓ and each semantic element. (b) Then we sum up the values of the Fourier coefficients for each semantic element (we call this sum the corresponding axis' weight). This corresponds to finding the correlation between the sequence sℓ and each semantic element. Since we have v semantic elements, we can constitute a v-dimensional vector. We call this vector, normalized in the infinity norm, the semantic center of the sequence sℓ. (c) If the sum obtained in (b) for a semantic element is greater than a given threshold ε, we employ the semantic element to form the projected semantic subspace. We define the semantic projection as the sum of such projections. This operator automatically selects the semantic subspace which is highly correlated with the sequence sℓ of the ℓ context words which determines the context. This model makes dynamic semantic interpretation possible. We emphasize here that, in our model, the "meaning" is the selection of the semantic subspace, namely, the selection of the semantic projection, and the "interpretation" is the best approximation in the selected subspace. Figure 3 shows the semantic interpretation according to contexts in MMM. The information resources most correlated with the given context are extracted in the selected subspace by applying the metric defined in the semantic space. We expand the 2000-word Longman vector space in MMM that was used in our previous work [14] with the 180 impression words of the Color Image Scale. The words most highly correlated with the
context are the representative impressions for the Color Image Scale and are used to select the subspace color features.
Figure 3. Semantic interpretation according to contexts in MMM
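The sketch below illustrates the semantic operator Sp described above: context-word vectors (already expressed in the semantic space) determine a subspace, and retrieval candidates are ranked by the norm of their projection onto it. All vectors, the space dimension and the threshold ε are invented toy values and do not reproduce the authors' system.

```python
# Illustrative sketch of subspace selection (semantic operator) and
# projection-based ranking in MMM; toy data only.
import numpy as np

def select_subspace(context_vectors, eps):
    # (a)-(b) sum Fourier coefficients per semantic element and normalize by the
    # infinity norm (the "semantic center" of the context word sequence).
    center = np.sum(context_vectors, axis=0)
    center = center / np.max(np.abs(center))
    # (c) keep the semantic elements whose weight exceeds the threshold eps.
    return np.where(center > eps)[0]

def rank_by_projection(candidates, axes):
    # Interpretation = best approximation in the selected subspace:
    # rank candidates by the norm of their projection onto the chosen axes.
    norms = np.linalg.norm(candidates[:, axes], axis=1)
    return np.argsort(-norms)

context = np.array([[0.9, 0.1, 0.0, 0.4],      # hypothetical vector for "calm"
                    [0.7, 0.0, 0.2, 0.5]])     # hypothetical vector for "quiet"
items = np.random.rand(10, 4)                  # 10 retrieval candidates
axes = select_subspace(context, eps=0.3)
print(rank_by_projection(items, axes))
```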
3. Feature extraction and Subspace Selection
This section consists of three parts: (1) the color feature extraction for the image dataset and the image query, with a quantization of the RGB color system using the Color Image Scale, (2) the aggregation mechanism of representative query color generation for processing multi-query images, and (3) the subspace feature selection with a color impression metric.
Figure 4. The 130 basic color features are mapped on RGB color space and used for expressing relations between colors and impressions
3.1. Color Feature Extraction
The system extracts color features using the 130 basic color features of the Color Image Scale [1]. These features constitute a non-uniform quantization of the RGB color space based on human impressions. The features contain 120 chromatic colors and 10 achromatic colors and encompass 10 hues and 12 tones. Each hue may be bright or dull, showy or sober, and has a number of tones. The tone of a color [1] is the result of the interaction of two factors: brightness or value, and color saturation or chroma. Colors of the same tone are arranged in order of hue, starting from red at the left of the scale. The lines linking colors of the same tone show the range of images that tone can convey [1]. Figure 4 shows the 130 non-uniform quantization of the RGB color space by the Color Image Scale for expressing relations between colors and impressions. These 130 basic color features will be projected to the lists of impressions discussed in Section 3.3.
3.2. Representative Query Color Generation
In this paper, our image search system provides a multi-query input that allows users to supply more than one image as the query. With this multi-query input, the users have more space and flexibility to express what they want to search for in the image dataset. To realize this, we construct an aggregation mechanism of representative query color generation for processing multi-query images. The mechanism works in the following steps.
Step 1: Extracting the color features f of the n image queries into the 130 color features of the Color Image Scale:

  [ f_{1,1}   ...  f_{1,130} ]
  [   ...     ...     ...    ]                                      (1)
  [ f_{n,1}   ...  f_{n,130} ]

where f_{q,c} is the c-th color feature of image query q.
Step 2: Calculating the local average L of each image query for normalizing the value of the histogram bins of each image query:

  [ L_{1,1}   ...  L_{1,130} ]
  [   ...     ...     ...    ]                                      (2)
  [ L_{n,1}   ...  L_{n,130} ]

where L_{q,c} is the local average of the c-th color feature for image query q, defined as

  L_{q,c} = f_{q,c} / f_q                                           (3)

Step 3: Accumulating the values of the local averages for each feature:

  [ M_1  ...  M_130 ]                                               (4)

where

  M_c = ( Σ_{q=1}^{n} L_{q,c} ) / n                                 (5)

Step 4: Calculating the average A and the standard deviation S of M, as shown in Eq. (6):

  [ A_1  ...  A_130 ]
  [ S_1  ...  S_130 ]                                               (6)

Step 5: Calculating the density D of each color feature. Because a color feature which is a candidate representative color feature is identified by a high A and a low S, the density D of each color feature is defined as

  D_c = ( A_c + α ) / ( S_c + α )                                   (7)

where α is a small number to avoid division by zero.
Step 6: Filtering out the irrelevant D_c which are close to zero. In this case, it is very important to filter out the irrelevant data with respect to the data distribution. For this purpose, an automatic clustering which can recognize the number of clusters automatically is applied using our previous work, Hill Climbing Automatic Clustering [15]. Hill Climbing Automatic Clustering analyzes the moving variances of clusters and then observes the pattern to find the global optimum for the number of clusters. After clustering the density D, the cluster members belonging to the cluster located closest to the zero point are filtered out. The remaining cluster members are selected as representative color features. Figure 5 shows the visual representation of the representative query color generation. Our approach can identify non-representative features (indicated by red color in Figure 5) and remove them from the selection.
Figure 5. The identified non-representative colors (indicated by red color) will be removed from query feature extraction
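A compact sketch of Steps 1-6 is given below. The feature values are random toy data, Step 4 is read here as the per-color mean and standard deviation of the local averages over the queries, and Step 6 replaces the Hill Climbing Automatic Clustering of [15] with a plain threshold, so this is only an approximation of the described method.

```python
# Sketch of representative query color generation (Steps 1-6), toy data only.
import numpy as np

n_queries, n_colors, alpha = 4, 130, 1e-6
f = np.random.rand(n_queries, n_colors)        # Step 1: 130-bin color histograms

L = f / f.sum(axis=1, keepdims=True)           # Step 2 (Eq. 3): local averages per query
M = L.mean(axis=0)                             # Step 3 (Eq. 5): accumulated local average
A, S = M, L.std(axis=0)                        # Step 4 (Eq. 6), as read here: mean/std over queries
D = (A + alpha) / (S + alpha)                  # Step 5 (Eq. 7): density favours high A, low S

# Step 6: the paper removes the cluster of D values closest to zero with
# Hill Climbing Automatic Clustering [15]; a simple threshold stands in here.
representative = np.where(D > 0.5 * D.mean())[0]
print(len(representative), "representative color features")
```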
3.3. Subspace Feature Selection
The impressions most highly correlated by MMM (discussed in Section 2) are projected onto the Color Impression Metric defined by the Color Image Scale [1]. The Color Impression Metric consists of 130 basic color features and 180 key impression words. The projection calculates the relationships between the representative impressions from MMM and the key impression words in the Color Image Scale. The most significant colors, which have the highest values of the projection, are obtained and then used to select the color features among the 130 color features of the image dataset and of the representative image query colors.
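A minimal sketch of this projection step follows. The metric values, the impression vocabulary and the cut-off of 78 colors are placeholders; the real values come from the Color Image Scale data [1] and from the MMM output.

```python
# Sketch of subspace color feature selection via the Color Impression Metric.
import numpy as np

impression_words = [f"impression_{i}" for i in range(180)]   # placeholder vocabulary
color_metric = np.random.rand(130, 180)                      # placeholder 130 x 180 metric

def select_subspace_colors(correlated_words, top_k=78):
    idx = [impression_words.index(w) for w in correlated_words]
    scores = color_metric[:, idx].sum(axis=1)   # strength of each color for the impressions
    return np.argsort(-scores)[:top_k]          # indices of the most significant colors

colors = select_subspace_colors(["impression_3", "impression_17", "impression_42"])
print(colors[:10])
```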
4. Cluster Based Similarity Measurement
After applying the subspace color feature selection to the image features, a cluster based similarity measurement is calculated, involving a semantic filtering mechanism. This mechanism filters out the image data items that are unimportant to the context in order to speed up the computational execution of the image search process. The system then clusters the subspace color features of the image dataset using our Pillar-Kmeans algorithm.
4.1. Semantic Filtering Mechanism
Before clustering the selected subspace color features for the similarity calculation, it is important to filter out the irrelevant data items that have a low correlation to the emotional contexts. Semantic information filtering was introduced in our previous work [16]. It provides a mechanism through which the users can express their impressions. When the users give contexts to express their impressions to the system, the contexts divide the data items into those with low and those with high correlation to the contexts. By filtering out retrieval candidate data items with a low semantic correlation to the given contexts, the retrieval process becomes effective because the analysis is only performed on data items with a high correlation to the contexts. Filtering out the irrelevant data reduces the number of data items and speeds up the computational time.
Figure 6. Semantic filtering mechanism for filtering out irrelevant data
The irrelevant data are semantically located close to the zero point in the vector space of the subspace color features. A case-dependent threshold th is used for the semantic information filtering. The vectors with norms less than th are considered unnecessary and are filtered out from the subspace, as shown in Figure 6. The users can set a high threshold if they want to filter out a relatively large amount of data and retrieve only data which are highly related to their impressions, or set the threshold at a lower value so that they obtain most of the data for a thorough analysis. In our case, we set th to the average color distance to the zero point. 4.2. Pillar-Kmeans Algorithm After applying the subspace feature selection, the system clusters the subspace color features of the image dataset using our previous work, the Pillar-Kmeans algorithm [13]. The Pillar algorithm optimizes the initial centroids for K-means clustering. It is robust and effective for the optimization of the initial centroids of K-means because it positions all centroids far apart from each other in the data distribution. The Pillar algorithm is inspired by the thought process of determining a set of pillars' locations in order to make a house or building stable. Figure 7 illustrates the locating of two, three, and four pillars withstanding the pressure distributions of several different roof structures composed of discrete points. The inspiration is that, by distributing the pillars as far as possible from each other within the pressure distribution of a roof, the pillars can withstand the roof's pressure and stabilize a house or building. The algorithm treats the pillars, which should be located as far as possible from each other to withstand the pressure distribution of a roof, as the centroids placed among the gravity weight of the data distribution in the vector space. Therefore, this algorithm designates the positions of the initial centroids at the farthest accumulated distance from each other in the data distribution.
Figure 7. Illustration of locating a set of pillars (white points) withstanding against different pressure distribution of roofs
The process of determining the initial centroids with the Pillar algorithm has been presented in our previous work [13] and is described as follows. First of all, the grand mean of the data points is calculated as the gravity center of the data distribution. The distance metric D (let D1 be D in this early step) is then created between each data point and the grand mean. The data point which has the highest distance in D1 will be selected as the first candidate of the initial centroid ж. Figure 8(a) illustrates m as the grand mean of the data points; ж, which has the farthest distance to m, is the candidate for the first
initial centroid. If ж is not an outlier, it is promoted to the first initial centroid c1. We then recalculate D (D2 in this step), which is the distance metric between each data point and c1. Starting from this step, we use the accumulated distance metric DM and assign D2 to DM. This step, which initiates the creation of DM, improves on our previous MDC algorithm [16], in which the construction of DM started from D1. To select a candidate for the second initial centroid, the same mechanism is applied using DM instead of D. The data point with the highest distance in DM is selected as the second initial centroid candidate ж, as shown in Figure 8(b). If ж is not classified as an outlier, it becomes c2. To select the next ж as a candidate for the remaining initial centroids, Dt (where t is the current iteration step) is recalculated between each data point and ct-1. Dt is then added to the accumulated distance metric DM (DM ← DM + Dt). This accumulation scheme prevents the data points nearest to ct-1 from being chosen as the candidate for the next initial centroid. It consequently spreads the next initial centroids far away from the previous ones. The data point with the highest distance in DM is then selected as ж, as shown in Figure 8(c). If ж is not an outlier, it becomes ct. The iterative process guarantees that all initial centroids are designated. In this way, all centroids can be located as far as possible from each other within the data distribution.
Figure 8. Selection for several candidates of the initial centroids
The detailed sequence of the Pillar algorithm is as follows. Let X = {xi | i = 1,…,n} be the data, k the number of clusters, C = {ci | i = 1,…,k} the initial centroids, SX ⊆ X the set of points of X already selected in the sequence of the process, DM the accumulated distance metric, D the distance metric of each iteration, and m the grand mean of X. The execution steps of the proposed algorithm are:
1. Set C = ∅, SX = ∅, and DM = []
2. Calculate D ← dis(X, m)
3. Set the number of neighbours nmin = α · n / k
4. Assign dmax ← max(D)
5. Set the neighbourhood boundary nbdis = β · dmax
6. Set i = 1 as the counter for determining the i-th initial centroid
7. DM = DM + D
8. Select ж ← x_argmax(DM) as the candidate for the i-th initial centroid
9. SX = SX ∪ {ж}
10. Set D as the distance metric between X and ж
11. Set no ← number of data points fulfilling D ≤ nbdis
12. Assign DM(ж) = 0
13. If no < nmin, go to step 8
14. Assign D(SX) = 0
15. C = C ∪ {ж}
16. i = i + 1
17. If i ≤ k, go back to step 7
18. Finish, with C as the solution, i.e., the optimized initial centroids.
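The sketch below is a compact reading of steps 1-18. It is illustrative rather than the authors' implementation [13]: the Euclidean distance, the default α and β values and the guard against exhausting all candidates are assumptions.

```python
# Sketch of the Pillar initial-centroid designation (steps 1-18), toy defaults.
import numpy as np

def pillar_initial_centroids(X, k, alpha=0.05, beta=0.1):
    n = len(X)
    m = X.mean(axis=0)                                  # grand mean of the data
    D = np.linalg.norm(X - m, axis=1)                   # step 2: distances to m
    n_min = alpha * n / k                               # step 3: minimum neighbourhood size
    nbdis = beta * D.max()                              # steps 4-5: neighbourhood boundary
    DM = np.zeros(n)                                    # accumulated distance metric
    SX = np.zeros(n, dtype=bool)                        # points already examined
    centroids = []
    for _ in range(k):                                  # steps 6, 16-17
        DM = DM + D                                     # step 7: accumulate distances
        while True:
            cand = int(np.argmax(DM))                   # step 8: farthest accumulated point
            SX[cand] = True                             # step 9
            D = np.linalg.norm(X - X[cand], axis=1)     # step 10: distances to the candidate
            no = int(np.sum(D <= nbdis))                # step 11: neighbourhood population
            DM[cand] = 0.0                              # step 12
            if no >= n_min or not DM.any():             # step 13: retry if the point is an outlier
                break
        D[SX] = 0.0                                     # step 14
        centroids.append(X[cand])                       # step 15
    return np.array(centroids)

# Usage sketch: X = np.random.rand(2893, 78)
# C = pillar_initial_centroids(X, k=20)   # seeds for a single K-means run (n_init=1)
```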
The centroids of the clustering results from the Pillar-Kmeans algorithm discussed in the previous section are used for calculating the similarity measurements to the representative query color features of the image queries. In this case, we use the cosine distance metric for the similarity calculation.
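A small sketch of this similarity step is shown below: low-norm feature vectors are filtered out (semantic filtering with th set to the average norm) and the cluster centroids are ranked against the representative query colors by cosine similarity. The data, the number of clusters and the stand-in centroids are toy values; the actual centroids would come from Pillar-Kmeans.

```python
# Sketch of semantic filtering plus cosine-based ranking of cluster centroids.
import numpy as np

features = np.random.rand(8743, 78)           # subspace color features of the dataset (toy)
query = np.random.rand(78)                    # representative query colors (toy)

norms = np.linalg.norm(features, axis=1)
th = norms.mean()                             # case-dependent threshold: average distance to zero
relevant = features[norms >= th]              # semantic filtering of irrelevant items

centroids = np.random.rand(20, 78)            # stand-in for the Pillar-Kmeans centroids

def cosine_similarity(a, b):
    return (a @ b) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b) + 1e-12)

ranking = np.argsort(-cosine_similarity(centroids, query))
print("cluster ranking:", ranking)
```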
5. Experimental Results
To apply our emotion-oriented image search system, we implement it for cultural image datasets. For the experimental study, we use the Ukiyo-e image dataset from the Tokyo Metropolitan Library, representing the Japanese cultural image collections. It contains 8743 typical images and artworks of famous paintings from the Edo and Meiji eras, including works by Hiroshige, Toyokuni, Kunisada, Yoshitoshi, Kunichika, Sadahige, Kuniteru, etc. We retrieve the 15 highest-ranked image results. For the performance analysis, we compare the 10 most highly rated impression words of the retrieved results, computed with the color impression metric, to the given emotional contexts, as shown in Eq. (8). The comparison to the given emotional contexts encompasses two things: (1) an exact comparison with the contexts, and (2) a semantic comparison with the closest meanings of the given contexts.

  precision = Σ_{i=1}^{15} prec_i,  where  prec_i = 1 if imprs(retrvimg_i) = contexts,
                                           prec_i = 0 otherwise                         (8)
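The following is a toy reading of Eq. (8): it counts how many of the top-15 results carry an impression matching the given context (an accepted set of close meanings would be substituted for the exact context to obtain the semantic variant). The impression lists are invented.

```python
# Toy sketch of the precision measure of Eq. (8).
def precision_at_15(result_impressions, contexts):
    return sum(1 for imprs in result_impressions[:15] if contexts & set(imprs))

results = [["calm", "peaceful"], ["showy"], ["quiet", "sober"]] + [["dynamic"]] * 12
print(precision_at_15(results, {"calm", "quiet"}))   # -> 2
```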
5.1. Experiment 1
Four images are given as multiple queries, as shown in Figure 9. We set two emotional contexts, "calm" and "quiet", to express the impressions with which we want to retrieve images from the image search system.
Figure 9. Multiple queries given to the search system
The computational steps are described as follows. First, the given contexts "calm" and "quiet" are processed by MMM to calculate the words most highly correlated with the context. We extend the 2000-word Longman vector space in MMM that was used in our
previous work [14] with the 180 impression words of the Color Image Scale. We compute a series of words correlated with the given contexts by MMM and obtain the 10 most highly correlated words, which are "calm", "peaceful", "clean", "fresh", "quiet", "rich", "tender", "pretty", "bitter", and "rational". These most highly correlated words are then projected onto the color impression metric to obtain the most significant colors for the subspace feature selection. As a result of this projection, the 78 most significant colors related to the impression words are selected among the 130 color features. This color feature subspace selection is applied to both the image dataset and the image queries. Before applying the color feature selection, the features of the image dataset and the image queries are extracted. Because multiple queries are given to the system, we need to aggregate the color features and generate their representative color features, as described in Section 3.2. Figure 10 shows the 130 histogram bins of the extracted color features of the 4 image queries in Figure 9. The histogram of representative query colors produced by our representative query color generation is shown in Figure 11. As shown in Figure 12, the selection of representative colors is not applied to all query color data items that have values in the histogram bins, but only to those with a high average value and a low standard deviation of the histogram bins.
Figure 10. Histogram of the multiple image queries
After extracting the features of the image dataset and the representative query colors, the 78 most significant colors resulting from the projection of MMM onto the color impression metric are used for the subspace color feature selection.
Figure 11. Histogram of the representative colors of image queries
The next step is the similarity calculation between the features of the subspace colors and the representative query colors. The semantic filtering mechanism is applied to filter out the irrelevant data items of the subspace color features of the image dataset that have a low correlation to the emotional contexts. In this experiment, our semantic filtering mechanism selected 2893 of the 8743 data items and filtered out the rest as irrelevant. After filtering out the irrelevant data items, clustering is applied to group the similarity distribution of the relevant data items. Our Pillar-Kmeans algorithm is used to cluster the data. In this case, we set the number of clusters to 20 for the clustering process. After grouping the data items, the cosine distance metric is used for the similarity calculation. The result of the calculation is ranked to obtain the best retrieved image results. Figure 12 shows the top 15 retrieval results of our image search system. For the performance analysis, we extract the most highly rated impression words from each retrieved image result using the color impression metric. Table 1 shows the lists of 10 impression words for each retrieved image result. Table 1 shows that 8 of the 15 retrieved image results (indicated by red font color) contain the "calm quiet" context. Moreover, if we consider that, in human perception, the given context "calm quiet" may also cover several related meanings, namely "restfull", "tranquil", "sedate", "solemn", "sober", "placid", "quiet_and_sophisticated", and "simple_quiet_elegant", then all retrieved image results are correct (indicated by blue font color). This experimental result shows that our proposed system is able to reach a high precision of image retrieval in accordance with the context given by the users.
Figure 12. The top 15 retrieved image results of “calm quiet” emotional contexts
Figure 13 shows the precision of the retrieval results for the top i image results. In that figure, PR1 indicates the precision of the image results containing impressions that are exactly the same as the contexts, PR2 indicates the precision of the image results containing impressions that are very close to the impressions of the contexts (in other words, semantically the same as the contexts), and MaxPR is the maximum bound of the precision. Even though PR1 reached only 53.33% of the
precision for the top i image results, PR2 achieved all correct retrieval results.
Figure 13. The precision of the retrieval results for the top i image results
Table 1. The impression words of retrieved images with contexts "calm quiet" (ranks 1–15)
5.2. Experiment 2
Figure 14 shows the eight images given as queries. We set two emotional contexts, "luxurious" and "elegant", to express the impressions with which we want to retrieve images from the image search system.
Figure 14. Multiple queries given to the search system
We compute a series of words correlated with the given contexts by MMM and obtain the 10 most highly correlated words, which are "elegant", "graceful", "luxurious", "stylish", "grand", "precious", "chic", "youthful", "masculine", and "feminine". These most highly correlated words are then projected onto the color impression metric to obtain the most significant colors for the subspace feature selection. This projection selected the 61 most significant colors related to the impression words among the 130 color features. This color feature subspace selection is applied to both the image dataset and the image queries. Before applying the subspace color feature selection, we need to aggregate the query color features and generate their representative features. Figure 15 shows the 130 histogram bins of the extracted color features of the image queries in Figure 14. The histogram of representative query colors produced by our representative query color generation is shown in Figure 16. After extracting the features of the image dataset and the representative query colors, the 61 most significant colors resulting from the projection of MMM onto the color impression metric are used for the subspace color feature selection. For the similarity calculation, our semantic filtering mechanism selected 2960 of the 8743 data items and filtered out the rest as irrelevant.
Figure 15. Histogram of the multiple image queries
Figure 16. Histogram of the representative colors of image queries
After filtering out the irrelevant data items, clustering using our Pillar-Kmeans algorithm is applied to group the similarity distribution of the relevant data items. In this case, we set the number of clusters to 20 for the clustering process. After clustering the data items, the cosine distance metric is used for the similarity calculation between the representative query colors and the clustered data items. The result of the calculation is ranked to obtain the best retrieved image results. Figure 17 shows the top 15 retrieval results of our image search system.
Figure 17. The top 15 retrieved image results of “luxurious elegant” emotional contexts
For the performance analysis, we extract the most highly rated impression words from each retrieved image result using the color impression metric. Table 2 shows the lists of 10 impression words for each retrieved image result. Table 2 shows that 11 of the 15 retrieved image results (indicated by red font color) contain the "luxurious elegant" context. Moreover, if we consider that, in human perception, the given context "luxurious elegant" may also cover several related meanings, namely "rich", "simple_quiet_elegant", "gentle_elegant", "grand", and "tasteful", then all retrieved image results are correct (indicated by blue font color). This experimental result shows that our proposed system is able to reach a high precision of image retrieval in accordance with the context given by the users.
Figure 18. The precision of the retrieval results for the top i image results
Figure 18 shows the precision of the retrieval results for the top i image results. In that figure, PR1 indicates the precision of the image results containing impressions that are exactly the same as the contexts, PR2 indicates the precision of the image results containing impressions that are very close to the impressions of the contexts, and MaxPR is the maximum bound of the precision. Figure 18 shows that PR1 reached 73.33% correct retrieval results, and PR2 achieved all correct results for the top i image results.
Table 2. The impression words of retrieved images with contexts "luxurious elegant" (ranks 1–15)
6. Conclusion and future works
This paper presented a semantic image search system that applies emotional contexts of the user's impressions to the retrieval process. The system provides a function with which the users can express their impressions (such as "happy", "calm", "beautiful", "luxurious", etc.) for image search. This emotional context recognizes the most important features by connecting the user's impressions to the image queries. A multi-query input is supported by the system so that the users have more space and flexibility to express what they want to retrieve. The Mathematical Model of Meaning is applied and transformed to the color features with a color impression metric for the subspace feature selection. After applying the subspace color feature selection to the image features, our Pillar-Kmeans algorithm is applied for the cluster based similarity measurement, involving a semantic filtering mechanism to filter out the irrelevant data. The Pillar algorithm designates the positions of the initial centroids at the farthest accumulated distance from each other in the data distribution in order to improve the precision of K-means clustering and to speed up the computational time of the clustering. Our image search system was examined in an experimental study with the 8743-image Ukiyo-e dataset from the Tokyo Metropolitan Library, representing the Japanese cultural image collections. The experimental results described in Section 5 showed that the proposed system reached precision rates of 53.33% and 73.33% with respect to the given contexts in Experiment 1 and Experiment 2, respectively, and achieved all correct retrieval results with respect to the close meanings of the given emotional contexts. In our future work, we will integrate our emotion-oriented image search system with our previous work on an image retrieval system involving shape and structure features.
References
[1] Shigenobu Kobayashi, Color Image Scale, 1st edition, Kodansha International, 1992.
[2] T. Kitagawa, Y. Kiyoki, A mathematical model of meaning and its application to multidatabase systems, Proc. 3rd IEEE International Workshop on Research Issues on Data Engineering: Interoperability in Multidatabase Systems, pp. 130-135, 1993.
[3] C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, W. Equitz, Efficient and effective querying by image content, Journal of Intelligent Information Systems 3 (3-4), pp. 231-262, 1994.
[4] Y. Kiyoki, T. Kitagawa, T. Hayama, A metadatabase system for semantic image search by a mathematical model of meaning, ACM SIGMOD Record, Vol. 23, No. 4, pp. 34-41, 1994.
[5] Y. Kiyoki, T. Kitagawa, Y. Hitomi, A fundamental framework for realizing semantic interoperability in a multidatabase environment, International Journal of Integrated Computer-Aided Engineering, Vol. 2, No. 1 (Special Issue on Multidatabase and Interoperable Systems), pp. 3-20, John Wiley & Sons, 1995.
[6] J. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Gorowitz, R. Humphrey, R. Jain, C. Shu, Virage image search engine: an open framework for image management, Proc. The SPIE, Storage and Retrieval for Image and Video Databases IV, San Jose, CA, pp. 76-87, 1996.
[7] J.R. Smith, S.F. Chang, VisualSEEk: a fully automated content-based image query system, Proc. The Fourth ACM International Conference on Multimedia, Boston, MA, pp. 87-98, 1996.
[8] W.Y. Ma, B.S. Manjunath, Netra: A toolbox for navigating large image databases, Multimedia Systems 7 (3), pp. 184-198, 1999.
[9] R.C. Veltkamp, M. Tanase, Content-Based Image Retrieval Systems: A survey, Technical Report UUCS-2000-34, 2000.
[10] H.M. Lotfy, A.S. Elmaghraby, CoIRS: Cluster-oriented Image Retrieval System, Proc. 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), pp. 224-231, 2004.
[11] Y. Liu, D. Zhang, G. Lu, W.Y. Ma, A survey of content-based image retrieval with high-level semantics, Pattern Recognition 40, pp. 262-282, 2007.
[12] A.R. Barakbah, Y. Kiyoki, An Image Database Retrieval System with 3D Color Vector Quantization and Cluster-based Shape and Structure Features, The 19th European-Japanese Conference on Information Modelling and Knowledge Bases, Maribor, 2009.
[13] A.R. Barakbah, Y. Kiyoki, A Pillar Algorithm for K-Means Optimization by Distance Maximization for Initial Centroid Designation, IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Nashville-Tennessee, 2009.
[14] A.R. Barakbah, Y. Kiyoki, Cluster Oriented Image Retrieval System with Context Based Color Feature Subspace Selection, In Proc. Industrial Electronics Seminar (IES) 2009, pp. C101-C106, Surabaya, 2009.
[15] A.R. Barakbah, K. Arai, Determining Constraints of Moving Variance to Find Global Optimum and Make Automatic Clustering, In Proc. Industrial Electronics Seminar (IES) 2004, pp. 409-413, Surabaya, Indonesia, 2004.
[16] D. Sakai, Y. Kiyoki, N. Yoshida, T. Kitagawa, A Semantic Information Filtering and Clustering Method for Document Data with a Context Recognition Mechanism, Journal of Information Modelling and Knowledge Base, Vol. XIII, pp. 325-343, 2002.
[17] S. Wang, X. Wang, Emotion Semantics Image Retrieval: An Brief Overview, ACII 2005, LNCS 3784, pp. 490-497, Springer-Verlag Berlin Heidelberg, 2005.
[18] E.J. Park, J.W. Lee, Emotion-Based Image Retrieval Using Multiple-Queries and Consistency Feedback, The 6th IEEE International Conference on Industrial Informatics (INDIN) 2008, pp. 1654-1659, 2008.
[19] M. Solli, R. Lenz, Color Based Bags-of-Emotions, CAIP 2009, LNCS 5702, pp. 573-580, Springer-Verlag Berlin Heidelberg, 2009.
[20] W. Wang, Q. He, A Survey On Emotional Semantic Image Retrieval, The 15th IEEE International Conference on Image Processing (ICIP) 2008, San Diego, USA, 2008.
The Quadrupel – A Model for Automating Intermediary Selection in Supply Chain Management
Remy FLATT b, Markus KIRCHBERG a, and Sebastian LINK b,1
a Agency for Science, Technology and Research (A*STAR), Singapore
b School of Information Management, Victoria University, New Zealand
Abstract. The selection of intermediaries is a fundamental and challenging problem in supply chain management. We propose a conceptual process model to guide the supply chain coordinator through the selection process. Besides supporting the agility, adaptability and alignment of the target supply chain, our model also provides extensive automated assistance for the selection of tactics by off-the-shelf tools from the area of artificial intelligence.
Keywords. Supply Chain Modeling, Strategic Concept Development, Intermediary Selection, Decision Support, Artificial Intelligence
1. Introduction Supply chain management (SCM) evolved from a traditional focus on purchasing and logistics practised between the mid-1960s and mid-1990s, to a broader, more integrated emphasis on value creation in the new millennium. Leading companies increasingly view supply chain excellence as more than just a source of cost reduction - rather, they see it as a source of competitive advantage, with the potential to drive performance improvement in customer service, profit generation, asset utilization, and cost reduction. The effective selection of intermediaries is essential to achieve these goals, individually and collectively. In electronic markets, the dynamics of market restructuring may lead some intermediaries to extinction, but the overall market picture will compensate for the losses by providing opportunities for both existing and new intermediaries to enter the market through providing value-added services to electronic transactions. The opportunities for dis-intermediation, re-intermediation and cyber-mediation in electronic markets are contingent on their market structures, products and services as well as relationships between the various market participants. On balance, the world of electronic commerce will be characterized by an increasing differentiation of market channels. The resulting outcome is a dynamic mixed-mode structure that represents a continuum of combinations of traditional channels, dis-, re- and cyber-mediation [7]. 1 Corresponding Author: Sebastian Link, School of Information Management, Victoria University, Wellington, New Zealand; E-mail: [email protected].
The design of the supply chain is a complex decision that involves the strategic choice of the appropriate channel structure and the tactical selection of the appropriate intermediaries. In general, if there are n intermediaries that are candidates for selection, then 2^n different selections are possible; and, hypothetically, each of these selections must be considered. Due to the required flexibility of the supply chain, selecting intermediaries is not a one-time process. These arguments suggest that the supply chain coordinator requires assistance in the selection process, e.g., in the form of advice from the experts that are currently available, of automated decision support, or of a process model that guides the intermediary selection process to support an agile, adaptable and aligned supply chain [16]. Contributions. As the first main contribution of this paper, we propose such a process model. The framework is generic in the sense that it is not tailored to any kind of product or to any specific part of a supply chain. Refinements and specializations of our model will be investigated in the future. Our model deals with the high complexity of the selection process by following a divide-and-conquer approach. That is, based on events, sudden changes and the expertise currently available, the supply chain is divided into different fragments. The domain experts then develop new or adjust existing recommendations for the fragments of their expertise to adapt to the current circumstances. In our model, these recommendations are abstract summaries of a careful analysis process, which we do not specify in detail in order to guarantee maximal generality of our model. Indeed, the recommendations are specified in a certain language (possibly by some language expert). This language restriction serves as a coordination mechanism which enables the supply chain coordinator to integrate and align the different recommendations into an overall strategy for selecting intermediaries. In fact, this mechanism guarantees off-the-shelf support for automatically generating all tactics available to implement such a strategy. Subsequently, there is also support to narrow down the choices for a preferred tactic, or to approximate a tactic as closely as possible. Our four-stage process model is iterative to accommodate the constant changes in the supply chain. Note that our framework may also be seen as a model for integrating different supply chains. It fits well into already existing models: it is an instance of the dynamic e-business strategy model [12], supports strategy definition in the generic strategy process model [4], and strongly supports the derivation and maintenance of the triple-A chain [16]. As a byproduct, we also propose explicit definitions of what a strategy and a tactic constitute, which we think is interesting in its own right. As the second major contribution, we demonstrate how off-the-shelf tools from artificial intelligence can provide automatic assistance to the supply chain coordinator in selecting intermediaries. Organization. We introduce our model in Section 2. In Section 3 we comment on the division of the supply chain into fragments and introduce a running example. We explain the syntax and semantics of propositional logic in Section 4. In Section 5 we show how to specify local plans for individual supply chain fragments. In Section 6 we define strategies and tactics, and describe how the supply chain coordinator can use off-the-shelf tools to reason about the consistency of local plans.
Section 7 shows how all available tactics of a plan can be determined. We discuss approximations of strategies in Section 8. Heuristics for selecting preferred tactics of approximations are analyzed in Section 9. Methods for evaluating the suitability of current tactics are proposed in Section 10.
Figure 1. A generic strategy process model: (1) strategic analysis (internal resources, external environment); (2) strategic objectives (vision and mission, objectives); (3) strategic definition (option generation, option evaluation, option selection); (4) strategic implementation (planning, execution, control); accompanied by monitoring, evaluation and response.
2. The Quadruple-A Model

The selection of intermediaries in the supply chain involves a strategic decision on the channel structures and a tactical decision on the appropriate intermediaries in each of the channel structures. As such, intermediary selection naturally belongs to the third phase of the generic four-stage strategy process model [4], illustrated in Figure 1. As part of the definition of the business strategy, options for intermediaries are generated, evaluated and selected. Since the supply chain is highly complex in nature, it is nearly impossible for a single supply chain coordinator to select the intermediaries. Instead, we propose an agile, adaptable, and aligned process model that also provides automated assistance to the supply chain coordinator.

Our model is iterative, and each iteration consists of four phases. The iterations can be triggered by events, and therefore support the agility of the target supply chain. Examples of such events may be sudden changes in supply or demand, revised sets of strategic objectives, or any type of disaster. In the first phase of every iteration, the supply chain is divided into different (possibly overlapping) fragments such that each intermediary candidate is covered by at least one fragment. The supply chain coordinator engages (a team of) domain experts to coordinate the fragments. In fact, the fragmentation may be based on the scope of the domain knowledge for which experts are currently available. Based on their key insights, the domain experts develop local plans for the selection of intermediaries within their fragments. These plans are abstract recommendations in some suitable formal language that we will specify later.
Figure 2. The Quadruple-A Model for Intermediary Selection: an iterative cycle of (i) fragmentation of the supply chain and allocation of experts (agility), (ii) development of local plans for individual fragments (adaptability), (iii) inference of a strategy or approximations for the whole supply chain (alignment), and (iv) selection of preferred tactics (assistance); events, key insights, priorities, integration, revision and feedback label the transitions between the phases.
Essentially, the recommendations are summaries of a careful analysis of the fragment that adapt the target supply chain to local market situations or changes. For the purpose of this paper, we view this careful analysis as a black box. The local plans may consist of specific recommendations for the selection of intermediaries already, or specify complex conditions under which such selections take place. One example of a suitable formal language that specifies the local plans is discussed in Section 5.

Subsequently, the supply chain coordinator attempts to align the local plans into a strategy for the whole supply chain, i.e., a selection of intermediaries that satisfies all the recommendations set out in the local plans. At this stage, it may well turn out that the recommendations of different local plans contradict one another. In that case, the coordinator may ask (some of) the domain experts to align their local plans, possibly by collaboration of different teams. This process will be iterated until the local plans become consistent, or the decision is made that the inconsistencies cannot be resolved presently. In the latter case, approximations of a strategy are developed subsequently.

At the end of this stage, the supply chain coordinator may have the choice between several tactics available for either a strategy or approximations of a strategy. In the final step of one iteration, the coordinator applies some heuristics to narrow down the choice of the tactics available for a strategy or for approximations thereof. These heuristics are based on corporate strategies, for instance to minimize the number of intermediaries. The preferred tactic identifies a unique selection of intermediaries that meets the strategy for the supply chain, or an approximation thereof. Our process model is illustrated in Figure 2. From the description so far, it becomes apparent that automated assistance is necessary for:
1. deciding whether an alignment of local plans into a strategy is possible at all,
2. inferring all tactics available for a strategy,
3. approximating a strategy as closely as possible, and
4. narrowing down the choices of tactics for a strategy or approximations thereof.
The first item requires us to reason about the consistency between local plans. For example, one expert may recommend to select an intermediary while another advises the
opposite (recall that the same intermediary may be part of different fragments). Basically, such contradictions can be hidden deeply inside the specifications of the local plans, and reasoning about consistency means detecting any such contradictions. That implies that we need a formal language which is expressive enough for the domain experts to specify their local plans, and which at the same time allows us to reason about consistency efficiently. As a first example of such a language, we choose Boolean propositional logic in this paper. We believe this language to be expressive enough to accommodate many recommendations that result from a careful analysis of the fragment under consideration. The limits of propositional logic can be seen as a kind of coordination mechanism by which the supply chain coordinator forces the domain experts to express their key insights. On the other hand, propositional logic has been studied extensively in artificial intelligence, and there are off-the-shelf tools available for us to reason efficiently about the consistency between local plans specified in this language. We will also describe what automated support propositional logic has to offer for the remaining items listed above. It may well turn out that there are other suitable candidates for such languages. These can simply be plugged into our framework. In summary, we propose a Quadruple-A model for intermediary selection that provides strong support for the agility, adaptability and alignment of a Triple-A supply chain [16], but also offers extensive automated assistance.
3. Dividing the Supply Chain

The first step in a single iteration of our process model consists of the division of the supply chain into different fragments, and the allocation of domain experts to these fragments. Formally, the supply chain candidates form a non-empty set, denoted by SCC, of potential intermediaries, i.e., SCC = {I1, …, In} for some positive integer n, where each Ij denotes some intermediary. A fragmentation of the supply chain candidates SCC is some collection F(SCC) ⊆ 2^SCC of subsets of SCC such that every element of SCC is an element of at least one fragment, i.e., for every I ∈ SCC there is some fragment F ∈ F(SCC) such that I ∈ F. The elements of a fragment F ∈ F(SCC) are also called the intermediaries of F.

Example 1. For our running example we consider a down-stream supply chain. The supply chain candidates consist of four different intermediaries. These are two wholesalers W1 and W2, and two retailers R1 and R2. That is, SCC = {W1, W2, R1, R2}. Incidentally, there are four different domain experts assigned to the task of selecting intermediaries. The first two are experts in the geographical locations of the first wholesaler and first retailer, and of the second wholesaler and second retailer, respectively. Furthermore, there is an expert in the domain of the wholesalers, and an expert in the domain of the retailers. More formally, the fragmentation F(SCC) consists of the following four fragments: F1 = {W1, R1}, F2 = {W2, R2}, F3 = {W1, W2}, and F4 = {R1, R2}. Indeed, every intermediary of SCC belongs to at least one of the overlapping fragments. For example, W1 is an intermediary of F1 and F3.

We assume implicitly that each fragment is allocated to some (team of) domain experts, for instance based on the scope of the expert's knowledge. A fragmentation of the
Figure 3. A fragmentation of the supply chain candidates SCC = {W1, W2, R1, R2}: fragment F1 = {W1, R1} (expert for geographic region 1), F2 = {W2, R2} (expert for geographic region 2), F3 = {W1, W2} (expert for the wholesalers), and F4 = {R1, R2} (expert for the retailers).
supply chain candidates can be illustrated by a hypergraph. The nodes of the hypergraph are given by the underlying supply chain candidates, and the edges of the hypergraph are given by the elements of the fragmentation. For instance, Figure 3 illustrates the fragmentation F(SCC) of the supply chain candidates from Example 1.

Alternatively, we could define a fragmentation to be a multiset F(SCC) of subsets of SCC. In that case, duplicate elements of F(SCC) may represent the fact that different agents work on the same fragment. As yet another alternative, we may define a fragmentation to be an anti-chain F(SCC) of subsets of SCC, i.e., for any two fragments F and F′ of F(SCC) it holds that F is not a subset of F′ and F′ is not a subset of F. For the framework of this paper, it does not matter which definition we pick; we just offer some alternatives here.

The local plans for the supply chain candidates will be specified over each of the fragments in the fragmentation of the supply chain candidates. More specifically, a local plan over fragment F will be a propositional formula over F. Before we introduce the local plans in Section 5, we will therefore define the syntax and semantics of propositional logic in the next section. The reader who is already familiar with propositional logic may skip this section.
4. A Primer on Propositional Logic

In this section, we give a self-contained summary of the syntax and semantics of Boolean propositional logic [6]. We will also briefly comment on the state-of-the-art of a decision problem associated with formulae in propositional logic, and one of its search variants. In subsequent sections we will see that these problems naturally occur in the process of intermediary selection.

4.1. Syntax

We define the language of Boolean propositional logic, i.e., we specify which objects belong to this language. In a first step we fix a countably infinite set of propositional variables, denoted by V. The elements of V form the atomic objects of our language, and all other objects will be derived from them. That is, we now specify the set of formulae over V, denoted by FV. In fact, we define FV to be the smallest set that satisfies the following rules:
• every propositional variable in V is a formula in FV, i.e., V ⊆ FV,
• if ϕ ∈ FV, then (¬ϕ) ∈ FV, and we say that (¬ϕ) is the negation of ϕ,
• if ψ1, ψ2 ∈ FV, then (ψ1 ∧ ψ2) ∈ FV, and we say that (ψ1 ∧ ψ2) is the conjunction of ψ1 and ψ2.

Suppose that V1, V2, and V3 are propositional variables in V. Then the following objects are examples of formulae in FV: (¬V2), (V1 ∧ (¬V2)), (¬(V1 ∧ (¬V2))). For convenience, we introduce the following shortcuts. The formula (ψ1 ∨ ψ2) is a shortcut for ¬(¬ψ1 ∧ ¬ψ2), (ψ1 ⇒ ψ2) denotes (¬ψ1 ∨ ψ2), and (ψ1 ⇔ ψ2) denotes ((ψ1 ⇒ ψ2) ∧ (ψ2 ⇒ ψ1)). We call (ψ1 ∨ ψ2) the disjunction of ψ1 and ψ2, (ψ1 ⇒ ψ2) the material implication of ψ2 by ψ1, and (ψ1 ⇔ ψ2) the equivalence between ψ1 and ψ2. The operators of negation, conjunction, disjunction, material implication and equivalence are also known as connectives. For convenience, we also introduce the following rules of precedence: ¬ binds stronger than ∧ and ∨, which both bind stronger than ⇒, which binds stronger than ⇔. We may also omit the outermost parentheses in a formula. For example, the formula (¬(V1 ∧ (¬V2))) reduces to ¬(V1 ∧ ¬V2).

4.2. Semantics

Now we attach some meaning to the formulae in FV, i.e., we will specify the conditions under which any element ϕ of FV will be true given an assignment of truth values to the propositional variables that occur in ϕ. That is, the truth of a complex formula ϕ in FV can be derived from the truth values assigned to the variables that occur in ϕ. Let false and true denote the Boolean propositional truth values. A truth assignment over V is a mapping θ : V → {false, true} that assigns to each variable in V either true or false. We extend θ to a function Θ : FV → {false, true} that maps every formula ϕ in FV to its truth value Θ(ϕ) as follows:
• if ϕ ∈ V, then Θ(ϕ) = θ(ϕ),
• if ϕ = (¬ψ) for some ψ ∈ FV, then Θ(ϕ) = true if Θ(ψ) = false, and Θ(ϕ) = false otherwise,
• if ϕ = (ψ1 ∧ ψ2) for some ψ1, ψ2 ∈ FV, then Θ(ϕ) = true if Θ(ψ1) = Θ(ψ2) = true, and Θ(ϕ) = false otherwise.
Even though the semantics of the shortcut connectives can be derived from the semantics of negation ¬ and conjunction ∧, we make them explicit in Table 1. The names of these connectives become apparent when we look at their semantics. Negation negates the truth value, a conjunction ψ1 ∧ ψ2 is true precisely when both of its conjuncts ψ1 and ψ2 are, a disjunction ψ1 ∨ ψ2 is true precisely when at least one of its disjuncts ψ1 or ψ2 is, etc.
ψ1      ψ2      ψ1 ∨ ψ2    ψ1 ⇒ ψ2    ψ1 ⇔ ψ2
true    true    true       true       true
true    false   true       false      false
false   true    true       true       false
false   false   false      true       true

Table 1. The semantics of disjunction, material implication and equivalence
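To make the definitions of this section concrete, the following short Python sketch (ours, not part of the original paper; all identifiers are illustrative) represents formulae as nested tuples over ¬ and ∧, defines the shortcut connectives as above, and implements the extension Θ of a truth assignment θ.

    # A formula is a variable name (str), ("not", f) or ("and", f, g).
    def neg(f): return ("not", f)
    def conj(f, g): return ("and", f, g)

    # Shortcut connectives, defined via negation and conjunction as in Section 4.1.
    def disj(f, g): return neg(conj(neg(f), neg(g)))             # f ∨ g
    def implies(f, g): return disj(neg(f), g)                    # f ⇒ g
    def equiv(f, g): return conj(implies(f, g), implies(g, f))   # f ⇔ g

    def evaluate(formula, theta):
        """Extension Θ of the truth assignment theta (a dict: variable -> bool)."""
        if isinstance(formula, str):
            return theta[formula]
        if formula[0] == "not":
            return not evaluate(formula[1], theta)
        if formula[0] == "and":
            return evaluate(formula[1], theta) and evaluate(formula[2], theta)
        raise ValueError("unknown connective")

    # Example: V1 ∧ ¬V2 evaluates to true under θ(V1) = true, θ(V2) = false.
    print(evaluate(conj("V1", neg("V2")), {"V1": True, "V2": False}))   # True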
4.3. SAT

We say that a truth assignment θ over V satisfies the formula ϕ in FV, denoted by |=θ ϕ, if and only if Θ(ϕ) = true. If θ satisfies ϕ, we also call θ a model of ϕ. We say that θ is a model of a set Σ of propositional formulae if it is a model of every element of Σ. If θ is not a model of ϕ (of Σ), we also say that θ violates ϕ (Σ). A set Σ of propositional formulae over V is said to be satisfiable if there is some model of Σ. Satisfiable sets of propositional formulae are also said to be consistent. The satisfiability problem, SAT, is to decide whether an arbitrary set Σ of propositional formulae is satisfiable. For instance, the set Σ1 = {V1, V1 ⇒ V2, V1 ⇒ ¬V2} is not satisfiable, while the set Σ2 = {V1 ⇒ V2, V1 ⇒ ¬V2} is indeed satisfiable. SAT was the first problem to be shown NP-complete [5]. That means that, unless P=NP, there is no deterministic polynomial time algorithm for deciding SAT. Despite this suspected intractability, there are SAT-solvers that can deal efficiently with instances of SAT that contain up to a million different variables [14]. For a comprehensive survey on SAT-solvers we recommend [9].

4.4. ALLSAT

A search version of SAT computes a satisfying truth assignment for an arbitrary given set Σ of propositional formulae, if there is one. For the purpose of this paper, we are interested in a search variant of SAT known as ALLSAT, where the aim is to enumerate all satisfying truth assignments of an arbitrary given set of formulae. For instance, for the input Σ2 = {V1 ⇒ V2, V1 ⇒ ¬V2} an ALLSAT-solver would return two truth assignments, both of which assign false to V1; one assigns false to V2 and the other assigns true to V2. SAT-solvers only require modest modifications to solve ALLSAT. A popular approach is the use of blocking clauses, where the negation of each satisfying truth assignment that is found is added to the original problem, and the computation restarts. There are several optimizations for this method, focused on minimizing the number of assigned variables in a solution, such that each blocking clause represents a set of solutions. The effort required to enumerate all satisfying truth assignments is proportional to the number of such assignments and to the effort required to generate each satisfying truth assignment in isolation. For most of the instances arising in intermediary selection, the application of ALLSAT-solvers is feasible in practice.
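The following brute-force sketch (ours; any realistic instance would instead be handed to one of the off-the-shelf solvers surveyed in [9,14]) makes both problems tangible on small instances such as Σ1 and Σ2 by enumerating all truth assignments over the relevant variables.

    from itertools import product

    def all_models(variables, formulas):
        """Yield every truth assignment over `variables` (as a dict) that
        satisfies every formula in `formulas` (predicates on assignments)."""
        for values in product([True, False], repeat=len(variables)):
            theta = dict(zip(variables, values))
            if all(f(theta) for f in formulas):
                yield theta

    def satisfiable(variables, formulas):
        return next(all_models(variables, formulas), None) is not None

    # Σ1 = {V1, V1 ⇒ V2, V1 ⇒ ¬V2} and Σ2 = {V1 ⇒ V2, V1 ⇒ ¬V2},
    # with ψ1 ⇒ ψ2 written as (not ψ1) or ψ2.
    sigma1 = [lambda t: t["V1"],
              lambda t: (not t["V1"]) or t["V2"],
              lambda t: (not t["V1"]) or (not t["V2"])]
    sigma2 = sigma1[1:]

    print(satisfiable(["V1", "V2"], sigma1))        # False
    print(list(all_models(["V1", "V2"], sigma2)))   # two models, both with V1 = False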
5. Development of Local Plans

In this section, we start to describe how the language of propositional logic can be applied to our process model for intermediary selection. Given a fragmentation F(SCC) of the set SCC of supply chain candidates, we will define what a local plan (for F(SCC)) constitutes. In order to get a feeling for what kind of plans we have in mind, we start with an example in which the plan is given in natural language.

Example 2. Recall our fragmentation from Example 1, where F(SCC) consists of the following four fragments: F1 = {W1, R1}, F2 = {W2, R2}, F3 = {W1, W2}, and F4 = {R1, R2}. The domain expert for geographical region 1, i.e., for fragment F1, develops the following local plan (LSF1): if W1 is selected as an intermediary, then R1 is selected as an intermediary as well. The domain expert for geographical region 2, i.e., for fragment F2, follows the same local plan in her domain (LSF2): if W2 is selected as an intermediary, then R2 is selected as an intermediary as well. The domain expert for the wholesalers, i.e., for fragment F3, decides to select both W1 and W2 (LSF3). The expert for the retailers, i.e., for fragment F4, develops the local plan (LSF4) that either R1 is selected or R2 (i.e., precisely one of them).

Local plans are defined for each fragment F of a fragmentation F(SCC) for a set SCC of supply chain candidates. Therefore, we fix the propositional language FF for each of the fragments F. That is, the set of propositional variables of FF is given by F. Therefore, each supply chain candidate I of F is an atomic formula of FF. We interpret the atomic formula I ∈ FF as "the domain expert allocated to F recommends to select I for the supply chain". From this interpretation of the atomic formulae, the interpretation of the more complex formulae in FF can be derived. For example, ¬I ∈ FF means that it is recommended not to select I; or the formula R1 ⇔ ¬R2 ∈ FF4 expresses the fact that the domain expert of fragment F4 recommends to select R1 if and only if R2 is not selected. That is, it is recommended to select precisely one of R1 or R2.

Let F ∈ F(SCC) denote a fragment of SCC. A local plan for F is a propositional formula over F, i.e., an element of FF, which we usually denote by λπF. A local plan for F(SCC) is a local plan for some fragment in F(SCC). Note that the condition of having just one formula represent a local plan is not a restriction: if there are several formulae, then let λπF just be the conjunction of these.

Example 3. Using our interpretation of the variables for the intermediaries, the local plans LSF1 to LSF4 of Example 2 are specified by the following propositional formulae:
• λπF1 = W1 ⇒ R1,
• λπF2 = W2 ⇒ R2,
• λπF3 = R1 ⇔ ¬R2, and
• λπF4 = W1 ∧ W2.
Note that local plans can be rather complex. For example, suppose that we have three different manufacturers M1 , M2 and M3 , and a distributor D. A local plan could be to
select precisely two manufacturers when the distributor is not selected. In this case, the plan is formalized by ¬D ⇒ (M1 ∧ M2 ∧ ¬M3 ) ∨ (M1 ∧ ¬M2 ∧ M3 ) ∨ (¬M1 ∧ M2 ∧ M3 ).
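To illustrate how such a complex local plan constrains the possible selections, the following sketch (ours; all names are illustrative) encodes the formula above as a predicate and lists the selections that satisfy it when the distributor is not selected.

    from itertools import product

    # ¬D ⇒ (M1 ∧ M2 ∧ ¬M3) ∨ (M1 ∧ ¬M2 ∧ M3) ∨ (¬M1 ∧ M2 ∧ M3)
    def local_plan(t):
        if t["D"]:
            return True                       # the implication is trivially satisfied
        return ((t["M1"] and t["M2"] and not t["M3"]) or
                (t["M1"] and not t["M2"] and t["M3"]) or
                (not t["M1"] and t["M2"] and t["M3"]))

    variables = ["D", "M1", "M2", "M3"]
    for values in product([True, False], repeat=4):
        t = dict(zip(variables, values))
        if local_plan(t) and not t["D"]:
            print({v for v in variables if t[v]})   # prints the three two-manufacturer selections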
6. Conquer the Supply Chain: Strategies and Tactics

In this section, we continue to describe the application of propositional logic to our framework. We will give an explicit definition of what a strategy and a tactic for intermediary selection constitute, identify a decision problem fundamental to our framework, and describe the decision support available for it.

A plan for an intermediary selection with respect to a fragmentation F(SCC) of supply chain candidates is the union π = ∪_{F ∈ F(SCC)} {λπF}. A policy ϑ of a plan π for an intermediary selection with respect to a fragmentation F(SCC) is a truth assignment ϑ : SCC → {true, false}. A policy ϑ of a plan π is said to be a tactic of π if ϑ is a model of π. A plan is said to be a strategy, usually denoted by ζ, if there is some tactic for it. An intermediary selection from a set SCC of supply chain candidates with respect to a fragmentation F(SCC) of SCC is a subset ι ⊆ SCC such that there is a strategy ζ with respect to F(SCC) and a tactic ϑ of ζ such that for all I ∈ SCC we have: I ∈ ι if and only if ϑ(I) = true. Hence, each tactic ϑ of a strategy ζ defines the intermediary selection ιϑ = {I ∈ SCC | |=ϑ I}. We say that the intermediary selection ιϑ is defined by the tactic ϑ. This terminology results in the following decision problem for the supply chain coordinator:

Problem: Strategy
INPUT: A plan π
QUESTION: Is π a strategy?

Example 4. Let π = {λπF1, λπF2, λπF3, λπF4} be a plan for an intermediary selection with respect to the fragmentation F(SCC) from Example 1 that results from the local plans of Example 3. Table 2 enumerates all policies of π. However, none of these policies satisfies all local plans of π. Consequently, there is no tactic for π, or in other words, π is not a strategy for an intermediary selection. Let ζ = {λπF1, λπF2, λπF3} denote another plan, without the local plan λπF4. In this case, the plan ζ is indeed a strategy. For example, the policy ϑ that assigns true to the intermediary R1 and false to the intermediaries W1, W2, and R2 is a tactic for ζ. Consequently, the intermediary selection defined by ϑ is {R1}.

In our process model illustrated in Figure 2, the supply chain coordinator accumulates the local plans λπF for all fragments F ∈ F(SCC) into the plan π. Before different tactics are identified to implement this plan, it is helpful to decide whether there are any such tactics at all. If not, then either the local plans need to be revised, or the plan π can only be approximated. Consequently, decision support for the problem Strategy is fundamental to the framework we propose.
However, the problem Strategy is nothing else but the satisfiability problem SAT, i.e., to decide whether there is a model for the set π of propositional formulae. Since SAT is one of the most studied problems in AI, there is plenty of off-the-shelf, state-of-the-art decision support available [14].
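This correspondence can be made concrete with a small sketch (ours; it enumerates policies by brute force, whereas in practice a SAT-solver would be called): the local plans of Example 3 are encoded as predicates over policies, and a plan is a strategy exactly if some policy satisfies all of its local plans.

    from itertools import product

    SCC = ["W1", "W2", "R1", "R2"]

    # Local plans of Example 3 as predicates over a policy (dict: intermediary -> bool).
    lp_F1 = lambda p: (not p["W1"]) or p["R1"]     # W1 ⇒ R1
    lp_F2 = lambda p: (not p["W2"]) or p["R2"]     # W2 ⇒ R2
    lp_F3 = lambda p: p["R1"] == (not p["R2"])     # R1 ⇔ ¬R2
    lp_F4 = lambda p: p["W1"] and p["W2"]          # W1 ∧ W2

    def is_strategy(plan):
        """Decide the problem Strategy by checking all 2^|SCC| policies."""
        for values in product([True, False], repeat=len(SCC)):
            policy = dict(zip(SCC, values))
            if all(lp(policy) for lp in plan):
                return True
        return False

    print(is_strategy([lp_F1, lp_F2, lp_F3, lp_F4]))   # False: π is not a strategy
    print(is_strategy([lp_F1, lp_F2, lp_F3]))          # True:  ζ is a strategy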
7. Enumerating All Tactics of a Strategy

Once the supply chain coordinator knows that the plan ζ is actually a strategy, i.e., the problem Strategy with input ζ has an affirmative answer, then the question is what the tactics of this strategy are. In a nutshell, there might be plenty of tactics and it might not be wise to let an automated procedure pick such a tactic. Instead, the supply chain coordinator should be aware of all such tactics to ensure that the best tactic has not been overlooked. On the other hand, all policies that are not tactics should be removed from the attention of the supply chain coordinator. Hence, we have the following problem.

Problem: All-Tactics
INPUT: A plan π
QUESTION: What are all tactics for π?

Note that All-Tactics is a more general problem than Strategy: if there are no tactics for input π, then π is not a strategy, and if there is at least one tactic, then π is a strategy.
However, it is generally more efficient to decide Strategy before moving on to enumerate all tactics of a strategy. As was the case with Strategy, the problem All-Tactics enjoys full decision support since the problem is equivalent to the well-studied problem ALLSAT in AI.

Example 5. Let ζ = {λπF1, λπF2, λπF3} denote the plan that is input to the problem All-Tactics. The table

Tactic ϑi   W1      W2      R1      R2      Selection ιϑi
ϑ6          true    false   true    false   {W1, R1}
ϑ11         false   true    false   true    {W2, R2}
ϑ14         false   false   true    false   {R1}
ϑ15         false   false   false   true    {R2}

shows all four tactics ϑi of ζ, and the associated intermediary selections ιϑi defined by ϑi.
8. Approximations of Strategies

As mentioned previously, it might become necessary to decide that contradictions between the local plans cannot be resolved, and therefore that a strategy cannot be obtained. In that situation, it would be helpful to approximate a strategy as closely as possible. Informally, an approximation of a plan is a strategy that contains as many simultaneously satisfiable local plans of the plan as possible. A maximal approximation of a plan is an approximation of the plan of maximum cardinality. Formally, an approximation of a plan π is a maximal sub-strategy of π, i.e., a subset ς ⊆ π such that ς is a strategy and no strategy ς′ ⊆ π is a proper superset of ς. Note that an approximation of a strategy ζ is unique, and that it is ζ itself. A best approximation of a plan π is an approximation ς of π with a maximum number of local plans, i.e., there is no approximation ς′ of π that consists of more local plans than ς. Considering our framework, we would be looking for automated support for the following problem.

Problem: All-Best-Approximations
INPUT: A plan π
QUESTION: What are all best approximations of π?

The problem All-Best-Approximations is what is known in the AI literature as the problem ALL-MC. The problem ALL-MC is to enumerate, for an arbitrary given finite set Σ of propositional formulae, all maximally satisfiable subsets of Σ with maximum cardinality. Again, this problem and variations thereof have been well-studied in the AI literature [3].
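The following sketch (ours; it enumerates subsets exhaustively rather than using a dedicated ALL-MC procedure) computes all best approximations of the plan π of Example 4 by examining subsets of the local plans in decreasing cardinality.

    from itertools import combinations, product

    SCC = ["W1", "W2", "R1", "R2"]
    plan = {"lpF1": lambda p: (not p["W1"]) or p["R1"],     # W1 ⇒ R1
            "lpF2": lambda p: (not p["W2"]) or p["R2"],     # W2 ⇒ R2
            "lpF3": lambda p: p["R1"] == (not p["R2"]),     # R1 ⇔ ¬R2
            "lpF4": lambda p: p["W1"] and p["W2"]}          # W1 ∧ W2

    def satisfiable(formulas):
        return any(all(f(dict(zip(SCC, v))) for f in formulas)
                   for v in product([True, False], repeat=len(SCC)))

    def best_approximations(plan):
        """All satisfiable subsets of the plan of maximum cardinality (ALL-MC)."""
        names = list(plan)
        for size in range(len(names), 0, -1):
            found = [set(subset) for subset in combinations(names, size)
                     if satisfiable([plan[n] for n in subset])]
            if found:
                return found
        return [set()]

    print(best_approximations(plan))
    # four best approximations: every three-element subset of {lpF1, lpF2, lpF3, lpF4}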
Example 6. Let π = {λπF1, λπF2, λπF3, λπF4} denote the plan that is input to the problem All-Best-Approximations. In this case we obtain four best approximations of π, which are the three-element sub-strategies of π. The table shows all four best approximations α of π, all their available tactics, and the associated intermediary selections ι.

9. Heuristics for Intermediary Selection

At the final stage of an iteration in our process model, the supply chain coordinator applies heuristics to narrow down the choices for the tactics of a strategy or an approximation thereof. The heuristics can be derived from corporate objectives or preferences. A prime example of such an objective could be to minimize the number of intermediaries. Informally, a minimal tactic selects a minimal number of intermediaries among all tactics. Formally, a minimal tactic of a plan π is a tactic ϑ of π such that there is no other tactic ϑ′ of π which defines an intermediary selection ιϑ′ that is a proper subset of the intermediary selection ιϑ.

Problem: All-Minimal-Tactics
INPUT: A plan π
QUESTION: What are all minimal tactics for π?

The problem All-Minimal-Tactics is what is known in the AI literature as the problem ALL-MINIMAL. The problem ALL-MINIMAL is to enumerate, for an arbitrary given finite set Σ of propositional formulae, all minimal models of Σ. A minimal model of a formula ϕ is a model θ of ϕ such that there is no other model θ′ of ϕ where {V | θ′(V) = true} is a proper subset of {V | θ(V) = true}. This problem has been well-studied in the AI literature [2].

Example 7. Suppose that the previous steps of our process model have resulted in the approximation α = {λπF1, λπF2, λπF3}. The table of Example 6 shows the four different tactics available for α. If α is the input to the problem All-Minimal-Tactics, then the tactics:

Tactic (W1, W2, R1, R2)          Selection ι
(false, false, true, false)      {R1}
(false, false, false, true)      {R2}
are returned. For example, the tactic (true, false, true, false) is not minimal since it defines the selection of both W1 and R1, but the tactic (false, false, true, false) defines the selection of R1 only.

The corporate objective to minimize the number of selected intermediaries might be more refined, e.g., the minimality requirement may only apply to a certain selection X of candidate intermediaries. Let X ⊆ SCC be a subset of candidate intermediaries. An X-minimal tactic of a plan π is a tactic ϑ of π such that there is no other tactic ϑ′ of π where ιϑ′ ∩ X is a proper subset of ιϑ ∩ X.

Problem: All-X-Minimal-Tactics
INPUT: A plan π, a subset X of candidate intermediaries
QUESTION: What are all X-minimal tactics for π?

Note that All-X-Minimal-Tactics subsumes the problem All-Minimal-Tactics for the special case where X = SCC. The problem All-X-Minimal-Tactics is what is known in the AI literature as the problem ALL-X-MINIMAL. The problem ALL-X-MINIMAL is to enumerate, for an arbitrary given finite set Σ of propositional formulae, all X-minimal models of Σ. An X-minimal model of a formula ϕ is a model θ of ϕ such that there is no other model θ′ of ϕ where {V | θ′(V) = true} ∩ X is a proper subset of {V | θ(V) = true} ∩ X. Again, this problem has been well-studied in the AI literature [1].

Example 8. Consider again the approximation α = {λπF1, λπF2, λπF3}, with the four different tactics available for α illustrated in Example 6. Let X denote the collection {W1, R1} of intermediaries, i.e., it is the corporate strategy to minimize the selection of W1 and R1. If α and X form the input to the problem All-X-Minimal-Tactics, then the tactics:

Tactic (W1, W2, R1, R2)          Selection ι
(false, true, false, true)       {W2, R2}
(false, false, false, true)      {R2}

are returned. For example, the tactic (false, false, true, false) is not X-minimal, since it defines the selection of R1, but the tactic (false, true, false, true) defines a selection that selects neither W1 nor R1.

The corporate strategy may also suggest some order of priority for the different fragments of the supply chain. In other words, the selection of the tactics might be based on a ranking of the local plans. The following example illustrates how such a ranking can be combined with the approximation of a strategy.

Example 9. As before, let π = {λπF1, λπF2, λπF3, λπF4}. Since π does not have a strategy, we determine the best approximations of π in a first step to evaluate our options.
Table 3. Ranking of best approximations of π according to preference order 4,3,2,1
The tactics available for these best approximations are the seven policies ϑ1, ϑ2, ϑ3, ϑ6, ϑ11, ϑ14 and ϑ15 from Table 2. The corporate strategy tells us that policies that satisfy λπF4 have highest priority: this leaves us with ϑ1, ϑ2, and ϑ3. The next highest priority is given to those policies that satisfy λπF3, which gives us an option between ϑ2 and ϑ3. Finally, the priority of λπF2 over λπF1 determines the preferred tactic ϑ3. This policy defines the intermediary selection ιϑ3 = {W1, W2, R2}. The ranking of the approximations of π is illustrated in Table 3.
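For the approximation α of Examples 7 and 8, the minimality checks reduce to subset comparisons between the intermediary selections defined by the tactics. The sketch below (ours) takes the four tactics of α from Example 5 as given and filters the minimal and the X-minimal ones.

    # The four tactics of α = {λπF1, λπF2, λπF3} as intermediary selections (Example 5).
    tactics = [{"W1", "R1"}, {"W2", "R2"}, {"R1"}, {"R2"}]

    def minimal(selections):
        """Selections for which no other selection is a proper subset."""
        return [s for s in selections
                if not any(o < s for o in selections)]

    def x_minimal(selections, X):
        """Selections s for which no other selection o has o ∩ X a proper subset of s ∩ X."""
        return [s for s in selections
                if not any((o & X) < (s & X) for o in selections)]

    print(minimal(tactics))                     # the two minimal tactics of Example 7: {R1} and {R2}
    print(x_minimal(tactics, {"W1", "R1"}))     # the two X-minimal tactics of Example 8: {W2, R2} and {R2}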
10. Assessment of Intermediary Selections

In this section we briefly mention two related problems that are of value when the current selection of intermediaries is to be assessed with respect to a plan. Such situations may occur, for example, when a plan has been revised but the current tactic has not. The first problem is to decide whether an arbitrary given policy ϑ is a tactic for an arbitrary given plan π.

Problem: Tactic
INPUT: A plan π, a policy ϑ
QUESTION: Is ϑ a tactic for π?

In terms of propositional logic, this is the model checking problem MODEL, i.e., given a finite set Σ of propositional formulae and some truth assignment θ, decide whether θ is a model of Σ. A related problem is to decide whether an arbitrary given policy ϑ is a minimal tactic for an arbitrary given plan π.

Problem: Minimal Tactic
INPUT: A plan π, a policy ϑ
QUESTION: Is ϑ a minimal tactic for π?

In terms of propositional logic, this is the minimal model checking problem MINMODEL, i.e., given a finite set Σ of propositional formulae and some truth assignment θ, decide whether θ is a minimal model of Σ.
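Operationally, both checks are straightforward; the sketch below (ours, with a deliberately tiny plan used purely for illustration) tests whether a given policy is a tactic of a plan, and whether it is a minimal one, by comparing it against all other policies.

    from itertools import product

    def is_tactic(plan, policy):
        """MODEL: does the policy satisfy every local plan?"""
        return all(lp(policy) for lp in plan)

    def is_minimal_tactic(plan, policy, candidates):
        """MINMODEL: is the policy a tactic such that no other tactic selects
        a proper subset of the intermediaries selected by the policy?"""
        if not is_tactic(plan, policy):
            return False
        selected = {i for i, v in policy.items() if v}
        for values in product([True, False], repeat=len(candidates)):
            other = dict(zip(candidates, values))
            other_sel = {i for i, v in other.items() if v}
            if is_tactic(plan, other) and other_sel < selected:
                return False
        return True

    # A two-candidate plan used purely for illustration: "if A is selected, so is B".
    plan = [lambda p: (not p["A"]) or p["B"]]
    print(is_tactic(plan, {"A": True, "B": True}))                          # True
    print(is_minimal_tactic(plan, {"A": True, "B": True}, ["A", "B"]))      # False; the empty selection is also a tactic
    print(is_minimal_tactic(plan, {"A": False, "B": False}, ["A", "B"]))    # True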
11. Related Work

As explained previously, intermediary selection fits directly into the strategic definition phase of the strategy process [4]. In the context of e-business, it is a specialization of the dynamic e-business strategy model [12]. Our model supports the development and maintenance of the Triple-A supply chain [16] and adds considerable automated decision support. The authors of [8,13] call for supply chain collaboration. In [13], a model of iterative loops is suggested which is similar to ours: choosing strategic partners, aligning supply chain strategy and corporate strategy, and identifying the most appropriate supply chain strategy. Our model can thus be viewed as a collaborative way of selecting intermediaries. It also demonstrates what decision support might be of use [13]. Other models for intermediary selection have been proposed in the literature. An example is the model in [20], which, in terms of our model, focuses on the development of local plans (without specifying a language), based on the strategy to maximize profits within given budget constraints. To the authors' best knowledge, our model is the first to suggest a divide-and-conquer approach that enjoys full decision support. In particular, the abstract specification of our local plans results in the ability to generate optimal tactics that may not only accommodate a single parameter, but may show the relative impact of altering different parameters in the supply chain. This property was identified as one of the future modelling opportunities in supply chain management [21]. Supply Chain Management views a business as a chain of interconnected entities of commercial activities. Therefore, multi-agent systems may be utilized to explore optimum chain connections from procurement to the customer [11]. We refer the interested reader to [10], or to [15] for a more recent survey.
12. Conclusion and Future Work

We have proposed a process model that assists supply chain coordinators in their task of selecting intermediaries. Our model follows an iterative, four-stage, divide-and-conquer approach that fosters the idea of a Quadruple-A supply chain: agility, adaptability, alignment, and assistance. We have proposed to use propositional logic as a formal language to specify local plans for each of the fragments of a supply chain. This results in a concise representation of the key insights of each of the domain experts assigned to the fragments. Most importantly, it enables automated assistance for many tasks in the selection process. We have identified at least seven different problems that are fundamental to our process model. Each of the problems has a counterpart in propositional logic that has been well-studied by the artificial intelligence community. Table 4 provides a summary of these problems and their relationship. Even though the problems are, in general, perceived to be intractable, modern algorithms can deal efficiently with instances that contain a huge number of variables [14]. This number is usually significantly greater than the number of intermediary candidates in any supply chain.

In future work we will test our process model in various case studies. This will provide useful insight into the level of support that our framework has to offer, but also into its limits. We would also like to analyze the potential of other formal languages, e.g., first-order logic and modal logics. Since different domain experts are likely to have different opinions, it also seems natural to look at various approaches to dealing with inconsistencies.
Table 4. Correspondences of Problems in Intermediary Selection and AI
References

[1] C. Avin and R. Ben-Eliyahu-Zohary. An upper bound on computing all X-minimal models. AI Commun., 20(2):87–92, 2007.
[2] R. Ben-Eliyahu-Zohary. An incremental algorithm for generating all minimal models. Artif. Intell., 169(1):1–22, 2005.
[3] E. Birnbaum and E. Lozinskii. Consistent subsets of inconsistent systems: structure and behaviour. J. Exp. Theor. Artif. Intell., 15(1):25–46, 2003.
[4] D. Chaffey. E-Business and E-Commerce Management. Prentice-Hall, 2007.
[5] S. Cook. The complexity of theorem-proving procedures. In ACM Symposium on Theory of Computing, pages 151–158, 1971.
[6] H. Enderton. A Mathematical Introduction to Logic, Second Edition. Academic Press, 2001.
[7] G. Giaglis, S. Klein, and R. O'Keefe. The role of intermediaries in electronic marketplaces: developing a contingency model. Information Systems Journal, 12:231–246, 2002.
[8] M. Grieger. Electronic marketplaces: A literature review and a call for supply chain management research. European Journal of Operational Research, 144:280–294, 2003.
[9] J. Gu, P. Purdom, J. Franco, and B. Wah. Algorithms for the satisfiability (SAT) problem: A survey. In Satisfiability Problem: Theory and Applications, pages 19–152. Amer. Math. Soc., 1997.
[10] R. Guttman, A. Moukas, and P. Maes. Agent-mediated electronic commerce: a survey. The Knowledge Engineering Review, 13(2):147–159, 1998.
[11] B. Hellingrath, C. Böhle, and J. van Hueth. A framework for the development of multi-agent systems in supply chain management. In HICSS, pages 1–9, 2009.
[12] R. Kalakota and M. Robinson. E-Business: Roadmap for Success. Addison-Wesley, 2000.
[13] P. Kampstra, J. Ashayeri, and J. Gattorna. Realities of supply chain collaboration. CentER Discussion Paper Series No. 2006-59, available at SSRN: http://ssrn.com/abstract=919813, 2006.
[14] H. Kautz and B. Selman. The state of SAT. Discrete Applied Mathematics, 155(12):1514–1524, 2007.
[15] N. Lang, H. Moonen, F. Srour, and R. Zuidwijk. Multi-agent systems in logistics: A literature and state-of-the-art review. ERIM Report Series Reference No. ERS-2008-043-LIS, available at SSRN: http://ssrn.com/abstract=1206705, 2008.
[16] H. Lee. The Triple-A supply chain. Harvard Business Review, 10(11):102–112, 2004.
[17] S. Link. On the implication of multivalued dependencies in partial database relations. Int. J. Found. Comp. Sci., 19(3):691–715, 2008.
[18] S. Link. On the logical implication of multivalued dependencies with null values. Conferences in Research and Practice in Information Technology, 51:113–122, 2006.
[19] S. Link. Consistency enforcement in databases. Semantics in Databases, Lecture Notes in Computer Science, 2582:139–159, 2001.
[20] V. Rangan, A. Zoltners, and R. Becker. The channel intermediary selection decision: A model and an application. Management Science, 32(9):1114–1122, 1986.
[21] J. Swaminathan and S. Tayur. Models for supply chains in e-business. Management Science, 49(10):1387–1406, 2003.
A Simple Model of Negotiation for Cooperative Updates on Database Schema Components

Stephen J. HEGNER
Umeå University, Department of Computing Science
SE-901 87 Umeå, Sweden
[email protected]
http://www.cs.umu.se/~hegner

Abstract. Modern applications involving information systems often require the cooperation of several distinct users, and many models of such cooperation have arisen over the years. One way to model such situations is via a cooperative update on a database; that is, an update for which no single user has the necessary access rights, so that several users, each with distinct rights, must cooperate to achieve the desired goal. However, cooperative update mandates new ways of modelling and extending certain fundamentals of database systems. In this paper, such extensions are explored, using database schema components as the underlying model. The main contribution is an effective three-stage process for inter-component negotiation.

Keywords. database, component
Introduction

The idea of modelling large software systems as the interconnection of simpler components, or componentware [3], has long been a central topic of investigation. In recent work, Thalheim has forwarded the idea that a similar approach, that of database componentware, is a fruitful direction for the modelling of large database systems [23]. Database componentware is a true software-component approach, in that it embodies the principle of co-design [24,10] — that applications should be integrated into the design of information systems. Indeed, the formal model [25] is closely related to that of the software components of Broy [5,6]. While this approach has obvious merits, it does involve one substantial compromise; namely, the classical notion of conceptual data independence [17, p. 33] is sacrificed, since the applications are integral to the design. As new applications become necessary, or as existing applications must be modified, a change to the entire design may become necessary. It is therefore appropriate to ask whether a component-based approach to modelling database systems which preserves conceptual data independence, and thus mirrors more closely the traditional notions of a database schema, is feasible. In [12], the foundations for such a framework were presented. The core idea is that of a schema component, consisting of a database schema and a collection
of its views, called ports. Interconnections are formed by connecting ports; that is, by requiring the states of connected ports to match. Such an interconnection defines a composite database schema. The idea is closely related to lossless and dependency-preserving decomposition, but it is really a theory of composition — the main schema is constructed from components rather than decomposed into constituents. The structure necessary to connect components together is part of the definition of the components themselves.

The ultimate value of any concept lies in its applicability. In [15], initial ideas surrounding the use of schema components as the underlying framework for the support of cooperative update were presented. The model developed was a proof-of-concept effort, and many simplifying assumptions were made. Furthermore, the focus was upon a formal computational model rather than upon an illustration of how the technique may be used to model situations requiring cooperative update. The goal of this paper is to complement and extend [15]. The main contribution is the presentation of a simple yet effective negotiation process. Any approach to cooperative update must support negotiation while still providing for reasonable convergence. While the process described in [15] is guaranteed to converge, the number of steps which are possible can be very large [15, 3.5(a)]. In this paper, a much more efficient negotiation process is developed, in which each component executes at most three negotiating steps. This process is illustrated via an extended and annotated example, rather than via a completely formal model.

There are a number of other aspects of cooperative update which were not even mentioned, much less addressed, in [15]. In this paper, several of the most important are discussed briefly, and illustrated relative to the running example. One of the most important is relative authority. Even in cooperative situations, there will typically be a hierarchy of authority, so that some players will be obligated in certain ways to accommodate the proposals of others. Others include models of behavior when actors are presented with choices for supporting an update request, and models for ensuring that cooperation does not lead to corruption.

There has been considerable research on the topic of cooperative work in general and cooperative transactions in particular [16,22,28]. There has also been some very recent work on synchronizing updates to repositories [18]. Relative to these, the focus of this paper is upon how an update which is proposed by a single agent (the initiator) to a single schema component may be realized via suitable updates to other components. It does not address more general situations in which a group of agents must begin from scratch to produce a desired final result, although such situations could conceivably be modelled within the context of schema components also.
1. Fundamentals of Schema Components and Cooperative Update

The work of this paper is based upon the formal foundations of schema components and cooperative update, as presented in [12] and [15], respectively. While a complete understanding of the formalisms of those papers is not absolutely necessary for this paper, it is nevertheless useful for the reader to be familiar with the basic concepts and notation. The purpose of this section is to summarize the material from those two references which is central to the rest of this paper. The reader may wish to skim this section rather briefly, referring back to it as the need arises. In any case, the reader is referred to those papers for details and a more systematic presentation. The ideas are presented in terms of the
classical relational model, although they may easily be generalized to any data model admitting the notions of state and of view.

1.1. Schema Components

Let E0 be the relational schema with the single relation symbol R[ABCDE], constrained by the functional dependencies (FDs) F = {B → C, C → DE}. The notation LDB(E0) is used to represent the set of all legal databases of E0; that is, the set of all relations on ABCDE which satisfy the FDs in F, while DB(E0) denotes the set of all databases on E0 which may or may not satisfy the constraints of F. Consider the decomposition of this schema into its four projections in {AB, BC, CD, CE}. Using classical relational database theory, it is easy to establish that this decomposition is lossless, in the sense that the original database may be reconstructed by joining together the projections, and dependency preserving in the sense that the elements of F may be recovered from the dependencies which are implied on the projections. Together, these two properties imply that there is a natural bijective correspondence between LDB(E0) and the decomposed databases. More precisely, if N = (NAB, NBC, NCD, NCE) is a quadruple of databases, with NAB a relation on AB which satisfies all of the dependencies in (the closure of) F which embed into AB, and likewise for NBC, NCD, and NCE on their respective projections, then there is an M ∈ LDB(E0) which decomposes into N.

To proceed further, a more comprehensive notation is essential. Define Π^{E0}_{BC} = (E0^{BC}, π^{E0}_{BC}) to be the view which is the projection of R onto BC. Here E0^{BC} is the relational schema with the single relation symbol RBC, constrained by F_BC = {B → C}, and π^{E0}_{BC} : E0 → E0^{BC} is the projection of R onto RBC. The views Π^{E0}_{AB}, Π^{E0}_{CD}, and Π^{E0}_{CE} are defined in a completely analogous fashion, with analogous notation, as the projections onto the given sets of attributes.

Modelling using components embraces explicitly two related notions which are only implicit in the above view-based approach. First, the model is totally distributed, in the sense that no reference to a main schema is necessary. Second, because of this lack of an explicit main schema, the means by which the components are interconnected must be made explicit. These ideas are now examined in more detail in the light of the above example.

The component corresponding to Π^{E0}_{AB} consists of the schema E0^{AB} together with the view Π^{E0^{AB}}_B of E0^{AB} which projects AB onto B. Write K_AB = (E0^{AB}, {Π^{E0^{AB}}_B}). The view Π^{E0^{AB}}_B is called a port of K_AB because it is used to connect to other components. A component may have more than one port. Indeed, K_BC = (E0^{BC}, {Π^{E0^{BC}}_B, Π^{E0^{BC}}_C}) has two ports. The components K_CD = (E0^{CD}, {Π^{E0^{CD}}_C}) and K_CE = (E0^{CE}, {Π^{E0^{CE}}_C}), each with a single port, are defined similarly. For each of these components, the first entry is the schema and the second its set of ports.

It is convenient to have a graphical notation for the representation of interconnected components. Figure 1 illustrates this notation for the example just given. The components are represented as rectangles, with the ports depicted as circles. When two ports are connected, they are shown as a single circle. The interconnection family for Figure 1 specifies how the components are interconnected, and gives the sets of ports which are connected together. In this case, it is J0 = {{Π^{E0^{AB}}_B, Π^{E0^{BC}}_B}, {Π^{E0^{BC}}_C, Π^{E0^{CD}}_C, Π^{E0^{CE}}_C}}. A single member of an interconnection
Figure 1. An interconnection of components: K_AB (relation RAB[AB]), K_BC (relation RBC[BC], FD B → C), K_CD (relation RCD[CD], FD C → D), and K_CE (relation RCE[CE], FD C → E), connected via the common port schemata RB[B] (between K_AB and K_BC) and RC[C] (among K_BC, K_CD, and K_CE).
family is called a star interconnection. Thus, J0 consists of two star interconnections. For this notation to be unambiguous, the set of components must be name normalized, in that globally, over all components, no two ports have the same name. Since this is just a naming convention, it can always be met through suitable renaming. Note, on the other hand, that for two ports to be members of the same star interconnection, they must have identical schemata. For example, even though Π^{E0^{AB}}_B and Π^{E0^{BC}}_B are distinct ports, from distinct components, they have identical (and not just isomorphic) schemata. This condition is essential because the semantic condition on such an interconnection is that the states of all such view schemata must be identical. When the port schema (defined by RB in this case) is from a view of a main schema (Π^{E0}_B in this case), this happens automatically, but in the case of component interconnection without reference to a main schema, it must be enforced explicitly. Note further that the graphical notation of Figure 1 embodies this idea implicitly, since each common port schema is represented by a single circle.

1.2. Cooperative Update

For convenience, assume that the current state of the main schema is M = {R(a1, b1, c1, d1, e1), R(a2, b2, c2, d2, e2)}. The state of Π^{E0}_{AB} is then MAB = {RAB(a1, b1), RAB(a2, b2)}, with the states MBC, MCD, and MCE of Π^{E0}_{BC}, Π^{E0}_{CD}, and Π^{E0}_{CE} obtained similarly. Suppose that a given user aAB has access to the database only through the view Π^{E0}_{AB}, and wishes to insert RAB(a3, b2). This update can be realized entirely within Π^{E0}_{AB}. By inserting R(a3, b2, c2, d2, e2) into M, the desired update to Π^{E0}_{AB} is achieved without altering the state of any of the other three views. Indeed, this is an instance of update via the classical constant-complement strategy [2]. The mutual view Π^{E0}_B, the projection onto B, is called the meet of Π^{E0}_{AB} and Π^{E0}_{BC}, and is precisely that which must be held constant under the constant-complement strategy [11].

Now suppose instead that user aAB wishes to insert RAB(a3, b3). This update cannot be realized by a change to the state of Π^{E0}_{AB} which holds the states of the other three views constant. Indeed, it is necessary to insert a tuple of the form RBC(b3, c?) into the state of Π^{E0}_{BC}. Since user aAB does not have write access to the view Π^{E0}_{BC}, the cooperation of another user who has such write access, say aBC, is necessary. If that user chooses to insert, say, RBC(b3, c2), then the process terminates without any need for cooperation from Π^{E0}_{CD} or Π^{E0}_{CE}. However, if user aBC chooses to cooperate by inserting, say, RBC(b3, c3), then the cooperation of additional users, one for Π^{E0}_{CD} and one for Π^{E0}_{CE}, is necessary. Finally, if these additional users choose to insert RCD(c3, d3) and RCE(c3, e3), respectively, then
the tuple R(a3 , b3 , c3 , d3 , e3 ) may be inserted into the state M of E0 to achieve the desired result. Note that no single user, of a single view, could effect this update; by its very nature it requires the cooperation of distinct views, likely controlled by distinct users.
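To make the example tangible, the following sketch (ours, not part of the formal development of [12] or [15]; all identifiers are illustrative) represents the state M and its four projections as sets of tuples, and shows that the insertion of RAB(a3, b2) needs no cooperation, whereas the insertion of RAB(a3, b3) is lost unless cooperating users also extend the BC, CD and CE projections.

    # The state M of E0 and its projections onto AB, BC, CD and CE.
    M = {("a1", "b1", "c1", "d1", "e1"), ("a2", "b2", "c2", "d2", "e2")}

    def proj(rel, idx):
        return {tuple(t[i] for i in idx) for t in rel}

    MAB, MBC, MCD, MCE = proj(M, (0, 1)), proj(M, (1, 2)), proj(M, (2, 3)), proj(M, (2, 4))

    def joined(ab, bc, cd, ce):
        """Reconstruct the ABCDE relation by joining the four projections."""
        return {(a, b, c, d, e)
                for (a, b) in ab for (b2, c) in bc if b == b2
                for (c2, d) in cd if c == c2
                for (c3, e) in ce if c == c3}

    assert joined(MAB, MBC, MCD, MCE) == M          # the decomposition is lossless

    # Insert RAB(a3, b2): b2 already occurs in MBC, so no other view needs to change.
    assert joined(MAB | {("a3", "b2")}, MBC, MCD, MCE) >= M

    # Insert RAB(a3, b3): b3 does not occur in MBC, so the new AB tuple is lost
    # unless a cooperating user also inserts RBC(b3, c?) and, if c? is new,
    # further users insert matching CD and CE tuples.
    assert proj(joined(MAB | {("a3", "b3")}, MBC, MCD, MCE), (0, 1)) == MAB
    after = joined(MAB | {("a3", "b3")}, MBC | {("b3", "c3")},
                   MCD | {("c3", "d3")}, MCE | {("c3", "e3")})
    assert ("a3", "b3", "c3", "d3", "e3") in after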
2. Three-Stage Negotiation for Cooperative Update

In this section, a three-stage negotiation process for cooperative update on an interconnection of schema components is developed. Rather than presenting a completely formal model, the main ideas are developed in detail in the context of a simple business process, the approval of a travel request. This example is superficially similar to that found in [15]; however, not only the example process but also the underlying schema differ substantially, because the points which require emphasis are quite different.

2.1. The Schemata and Components of the Example

Figures 2 and 3, together with Table 1, provide the basic definitions for the example, which is presented in the relational model. In Figure 2, the immutable relations of the model, that is, the ones which may not be updated (at least for the purposes of servicing a business process), are shown. Keys are marked with an underline, while set-valued attributes (i.e., multisets in the terminology of SQL:2003 [8]) are marked with a wavy underscore. Thus, each employee has an employee ID, a home department defined by the ID of that department, and a set of assigned projects. Similarly, each department has a supervisor, each account has an account manager, and each project has a supervisor and a set of accounts (for travel funds). These relations are shared by all components.
Figure 2. The immutable relations of the running example: Employee[EmpID, DeptID, ProjIDs], Department[DeptID, SupID], Project[ProjID, SupID, ProjAccts], and Account[AcctID, AMgrID], where ProjIDs and ProjAccts are set-valued.
Figure 3, which employs the symbolic notation that was introduced in [12] and is summarized in Section 1, shows the basic schema components and ports. The upper line in each rectangle (e.g., Accounting) gives the name of the associated component, while the lower line (e.g., RActg, SBank) identifies the mutable relations which define the schema of that component; that is, the relations which may be modified in the course of servicing a travel request. Shown within each circle is the relation defining the schema of the associated port. Information on the attributes of the individual relations of the components, aside from the port relations, is given in Table 1. For each attribute name, a checkmark in the column of a relation indicates that the attribute is included in that relation, and an underline of a checkmark indicates that the given attribute is a key. Thus, for example, RActg may be expressed more completely in standard relational notation as RActg[TripID, EmpID, ProjID, TotalCost, AcctID, ApprvAcct]. Since TripID is a key for every relation of the form Rxxx (i.e., every relation except SBank), those relations may be joined together to form one large relation R on the set of all attributes
Figure 3. The components of the running example and their relations: Secretariat (RSecrt) forms the hub and is connected via the ports RSeEm, RSeHt, RSeAc, RSeDm and RSePm to Employee (REmpl), Hotel (RHotel), Accounting (RActg, SBank), DeptMgr (RDeptMgr) and ProjectMgr (RProjMgr), respectively; Accounting is further connected to AccountMgr (RActtMgr) via the port RAcAm.
Table 1. The mutable relations of the running example. For each of the relations Travel, REmpl, RSecrt, RHotel, RActg, RActtMgr, RProjMgr, RDeptMgr and SBank, the table indicates which of the attributes TripID, EmpID, ProjID, Purpose, StartDate, EndDate, Location, HotelCost, TotalCost, AcctID, ApprvProj, ApprvSup, ApprvAcct, HotelName and Balance occur in it, and which of them form a key; for example, RActg = RActg[TripID, EmpID, ProjID, TotalCost, AcctID, ApprvAcct] with key TripID.
shown in Table 1, save for the last one, Balance, which is used only in SBank. Then SBank may be joined with R, since AcctID is a key for it, and thus a universal relation Travel on all of the attributes may be obtained, with each of the component relations a projection of Travel. Each relation associated with a port is also a projection of Travel; the attributes of a port schema are given by the intersection of the attributes associated with the connecting components. For example, the attributes of RSeAc are {TripID, EmpID, ProjID, TotalCost, AcctID, ApprvAcct}. The semantics of the attributes of Table 1 are, for the most part, self-explanatory. Each trip is taken by a single employee and is associated with a single project. It has a purpose, a start date, an end date, and a location. There is a total cost for the entire trip, as well as the cost of just the hotel. The costs are charged to a single account. A trip must receive three distinct approvals, one by the project supervisor, one by the department supervisor, and one by the account manager for the account to which the charges are made.
Finally, the relation SBank recaptures that each account has a balance, which is reduced accordingly when a trip is charged to that account. The component interconnection of Figure 3 illustrates a spoke-and-hub topology, in that there is a central vertex (in this case Secretariat) which embodies most, but not all, of the mutable information. This is not an essential feature of the schema-component model, but it is a very useful architecture for many applications, such as the travel-request example considered here. Also, in Figure 3, each port schema connects only two components, but this is not a general requirement either, as the example of Section 1 illustrates. 2.2. The Representation of a Simple Update Request In principle, a travel request may be initiated as an update to any of the components. Indeed, this is one of the advantages of using schema components to model business processes — the actual control flow need not be specified; rather, only the constraints on that flow imposed by the model need be respected. One of the most common cases is that an employee, say Annie for the sake of concreteness, initiates a request for her own travel. Annie has write access only to the component Employee, and indeed, only to tuples of REmpl which are associated with her EmpID. Suppose that she is working on the French project and wishes to travel to one of Nantes or Nice from April 1 to April 5. To express this request as an update, she obtains a new TripID from a server and proposes an insertion of a single tuple into REmpl satisfying the following expression. uEmpl:0 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), 1000 ≤ TotalCost ≤ 1500, HotelCost ≤ 1500, HotelName = ∗. The plus sign indicates that the update is an insertion; that is, the tuple(s) indicated by the expression are to be inserted. It actually represents many possibilities, and so is termed a nondeterministic update request, and the expression uEmpl:0 identifies an update family. Each possible update inserts only one tuple, but the values of the TotalCost, HotelCost, and HotelName fields are not fixed. No values for HotelCost and HotelName are excluded. Since Annie does not know Nantes, she has used the ∗ wildcard to indicate that she expresses no preference for a hotel, and allows a cost up to and including the total amount for the trip. Similarly, any value for TotalCost between 1000 and 1500 Euros inclusive is a possibility. In effect, an update family may be thought of as a set of ordinary, deterministic updates. In this case, there is one deterministic update in uEmpl:0 for each quadruple (Loc, TC, HC, HN) in which Loc ∈ {Nantes, Nice}, 1000 ≤ TC ≤ 1500, 0 ≤ HC ≤ 1500, and HN is the name of a hotel in the appropriate city. It is assumed that all such update families are checked for integrity with the given constraints. For example, the relation Employee must reflect that Annie is a member of the French project. 2.3. The Three Stages of the Negotiation Process Annie has the authority to update REmpl only insofar as that update does not affect the other components. However, any of these proposed updates would affect the state of
RSeEm as well. Thus, the cooperation of neighboring components, in this case the Secretariat component, must be secured in order to obtain a completion of her initial request. The component Secretariat will then need to cooperate with other components. The process by which all components come to agreement on a completion of the initial update request uEmpl:0 is called negotiation. In [15], a negotiation process is described in which any component can make a decision at any time. While such a model is very attractive theoretically and is well suited for the formal model presented there, convergence may be very slow. Here, a simple negotiation process is described in which each component goes through three distinct stages, although different components may be in different stages at different times. For a given component, each stage requires the execution of one well-specified task. Once these tasks are completed, the negotiation process is complete. In particular, negotiation cannot continue indefinitely in a back-and-forth fashion. The description given below assumes that the interconnection is acyclic [12, Sec. 3], in the sense that there are no cycles in the graph which represents the interconnection of the components. The example interconnection of Figure 3 is acyclic. It also requires a few simple definitions. For components K and K′, a simple path from K to K′ goes through no component more than once. For example, in Figure 3, Employee, Secretariat, DeptMgr is a simple path from Employee to DeptMgr, while Employee, Secretariat, ProjectMgr, Secretariat, DeptMgr is a path which is not simple. For an acyclic graph, there is at most one simple path between any two components. Let Γ be a port of K′. Call Γ inner relative to K if it occurs on the simple path from K to K′, and outer otherwise. For example, the port of Accounting defined by RSeAc is inner with respect to Employee, while the port defined by RAcAm is outer. Call a component K′ extremal with respect to another component K if there is a simple path K = K0, K1, ..., Kn = K′ from K to K′ and this path cannot be extended beyond K′ while keeping it simple. Relative to Employee, the components Hotel, AccountMgr, ProjectMgr, and DeptMgr are extremal, while the others are not. The three stages of the negotiation process are described as follows. Stage 1 — Outward propagation: During Stage 1, the initial update request is radiated from the initiating component outwards to the other components. Each user of a given component, as it receives information about the initial update request, makes a decision regarding the way in which it is willing to support that request. It is only during this stage that such decisions may be made. In the later stages, each user must respect the decisions which were made in Stage 1. Since the underlying graph is assumed to be acyclic, each component receives information about the proposed update from at most one of its neighbors. Thus, there is no need to integrate information from different sources during this step. The component which initiates the update request enters Stage 1 immediately. It then projects this request onto its ports; each neighboring component then lifts the state on the port to an update request on its own schema. These neighboring components enter Stage 1 as soon as they have performed this lifting. The process then continues, with each component which is newly in Stage 1 projecting its lifting onto its outer ports relative to the initiating component.
It ends when the liftings have been propagated to the extremal components.
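The graph-theoretic notions used in these definitions (simple paths, inner and outer ports, extremal components) are easy to compute for the acyclic interconnection of Figure 3. The sketch below is one possible encoding; it assumes only the component-port incidences that Figure 3 shows.

```python
# Ports of Figure 3; each port connects exactly two components.
PORTS = {
    "RSeEm": ("Employee", "Secretariat"),
    "RSeHt": ("Secretariat", "Hotel"),
    "RSeAc": ("Secretariat", "Accounting"),
    "RSePm": ("Secretariat", "ProjectMgr"),
    "RSeDm": ("Secretariat", "DeptMgr"),
    "RAcAm": ("Accounting", "AccountMgr"),
}

def neighbours(comp):
    return [b if a == comp else a
            for a, b in PORTS.values() if comp in (a, b)]

def simple_path(start, goal, seen=()):
    """Unique simple path in an acyclic interconnection, or None."""
    if start == goal:
        return [start]
    for nxt in neighbours(start):
        if nxt not in seen:
            rest = simple_path(nxt, goal, seen + (start,))
            if rest is not None:
                return [start] + rest
    return None

def is_inner(port, owner, initiator):
    """A port of `owner` is inner relative to `initiator` if it lies on
    the simple path from `initiator` to `owner`."""
    a, b = PORTS[port]
    return {a, b} <= set(simple_path(initiator, owner))

def is_extremal(comp, initiator):
    """Extremal: the simple path from `initiator` to `comp` cannot be
    extended beyond `comp` while remaining simple."""
    path = set(simple_path(initiator, comp))
    return all(n in path for n in neighbours(comp))

assert is_inner("RSeAc", "Accounting", "Employee")        # inner port
assert not is_inner("RAcAm", "Accounting", "Employee")    # outer port
assert {c for c in ("Hotel", "AccountMgr", "ProjectMgr", "DeptMgr",
                    "Secretariat", "Accounting")
        if is_extremal(c, "Employee")} == {
       "Hotel", "AccountMgr", "ProjectMgr", "DeptMgr"}
```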
Stage 2 — Propagate inward and merge: During Stage 2, the liftings which were chosen during Stage 1 are radiated back inwards towards the initiating component. In each component, the information from its neighbors which are connected to its outer ports is merged into a single update family. Since an extremal component has no outer ports, it enters Stage 2 as soon as it has decided upon a lifting for the update request. After that decision has been made, it is transmitted back to the component from which the initial update request was received during Stage 1 by projecting it onto the appropriate port. Components which are not extremal enter Stage 2 when they have received a return update request from each neighbor which is connected to an outer port, and have then merged the possibilities of these into a single update family. This merged update family is then transmitted back towards the initiating component via the inner ports of the current component. This merger may be empty, in which case it is impossible to realize the initial update request. However, even if it is empty, it is transmitted back. Stage 3 — Choose final state and commit: Once the initiator of the update request has received and merged all of the incoming requests, it has reached Stage 2, and that marks the end of Stage 2 for all components, since all components have now merged the information from their more outward neighbors. The final step is for the initiating component to select one of the possibilities which it has computed in its merge as the actual update to its schema. (If this set of possibilities is empty, the update request fails.) Once it has chosen a possibility, it transmits this decision outward, just as in Stage 1. Each component must make a decision as to which of the possibilities in the update family determined in Stage 2 will be the actual update. This decision process is called Stage 3. Once all of these decisions are made, the update can be committed to the database. There is one detail which was not elaborated in the above description. It is possible that some components will not need to be involved in the negotiation process, because none of the possible liftings will change their states. These components are simply ignored in the process. 2.4. The Negotiation Process via Example The three-stage process described above is now illustrated on the running example, using the update family uEmpl:0 defined in 2.2. In the first step, the update to the component Employee is projected onto the view RSeEm; in this case RSeEm and REmpl have the same attributes and so this projection is the identity. At this point, Employee has completed Stage 1. Next, this projection must be lifted to an update family on the schema of the component Secretariat, which must include values for every attribute of RSecrt; that is, every attribute listed in Table 1 save for Balance. Without further restrictions, a user of the Secretariat component (a human secretary, say) could choose any subset of the set of possible liftings to propagate forward, including the empty set, which would abort the proposed update. This liberal model is in fact used in [15]. In a real modelling situation, the set of liftings which are allowed must be regulated in some way; this topic is discussed further in 3.3. For now, assume that the rôle of the Secretariat carries no decision-making authority; thus, it must allow all possible liftings which do not involve extraneous riders, such as additional
travel for someone else. See 3.2 for an elaboration of this notion. The lifting will then have a representation of the following form. uSecrt:0 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), HotelName = ∗, HotelCost ≤ 1500, 1000 ≤ TotalCost ≤ 1500, ApprvProj = Carl, ApprvSup = Barbara, (AcctID = A1, ApprvAcct = AM1 ∨ AcctID = A2, ApprvAcct = AM2 ∨ AcctID = A3, ApprvAcct = AM3 ∨ AcctID = A4, ApprvAcct = AM4) The IDs for the project supervisor and department manager have been filled in, since these are single valued and given in the immutable tables Project and Department. Similarly, the identities of the four accounts which are associated with the French project, together with their managers, are obtained from the table Account. No decision on the part of the secretariat is required to determine these values. To complete the process for Stage 1 for component Secretariat, uSecrt:0 is projected onto each outer port. At this point, Stage 1 for component Secretariat is complete. Consider first the communication with the component Hotel, which is assumed to be autonomous (with no decision-making authority) and simply returns a list of available hotel rooms for the given time interval. Suppose that the following lifting is obtained. uHotel:0 := +TripID = 12345, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, (HotelCost = 1600, HotelName = TrèsCher ∨ HotelCost = 1200, HotelName = AssezCher ∨ HotelCost = 400, HotelName = PasCher ∨ HotelCost = 200, HotelName = Simple) Thus, there are no hotels available in Nice for the requested period of time, but there are four from which to choose in Nantes (although one turns out to be too expensive). Hotel is an extremal component, so upon placing this lifting on the port defined by RSeHt, both Stage 1 and Stage 2 for that component are complete. This result is held by Secretariat until the other responses are received and it can complete its processing for Stage 2. Next, consider the projection onto the outer port defined by RSeAc, connected to component Accounting. Only the values for TripID, EmpID, ProjID, and TotalCost, as well as the alternatives for AcctID and ApprvAcct, are included. The lifting to the component Accounting must add information on the relation SBank, as shown below. uActg:0 := +TripID = 12345, EmpID = Annie, ProjID = French, (AcctID = A1, 1000 ≤ TotalCost ≤ 1500, ApprvAcct = AM1 ∨ AcctID = A2, TotalCost = 1000, ApprvAcct = AM2 ∨ AcctID = A3, 1000 ≤ TotalCost ≤ 1100, ApprvAcct = AM3) ∪ ±Balance ← Balance − TotalCost
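As noted in 2.2, an update family can be read as a finite set of deterministic alternatives. A minimal sketch of that reading for the two liftings just shown follows; the dictionaries are only a convenient surrogate for the formal expressions, the ±Balance sub-update of uActg:0 is left out, and whole-Euro amounts are an assumption.

```python
# uHotel:0 as an explicit set of deterministic insertions, one per hotel.
BASE = {"TripID": 12345, "StartDate": "01.04.10",
        "EndDate": "05.04.10", "Location": "Nantes"}
u_hotel_0 = [
    dict(BASE, HotelCost=1600, HotelName="TrèsCher"),
    dict(BASE, HotelCost=1200, HotelName="AssezCher"),
    dict(BASE, HotelCost=400,  HotelName="PasCher"),
    dict(BASE, HotelCost=200,  HotelName="Simple"),
]

# The insertion part of uActg:0 still contains a range on TotalCost, so it
# yields one deterministic alternative per (account, admissible total) pair.
u_actg_0 = [
    {"TripID": 12345, "EmpID": "Annie", "ProjID": "French",
     "AcctID": acct, "ApprvAcct": mgr, "TotalCost": total}
    for acct, mgr, lo, hi in [("A1", "AM1", 1000, 1500),
                              ("A2", "AM2", 1000, 1000),
                              ("A3", "AM3", 1000, 1100)]
    for total in range(lo, hi + 1)   # whole-Euro amounts, an assumption
]
```

The trimming discussed next simply removes some of these alternatives from the family.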
The account A4 has been excluded because the balance was insufficient to fund the trip. (Assume that it was 900 Euros, say.) Similarly, the amounts allowed for accounts A2 and A3 are below those of the initial request, since these accounts cannot fund the entire 1500 Euros. This process of reducing the allowed liftings is called trimming. A decision to exclude other accounts, such as A2, might also be made; whether or not this would be allowed would depend upon the authority of the user of this component (see 3.3). However, in this example, all applicable accounts with sufficient balance have been included. Also, in this model, the entire cost of the trip must be paid from one account; the cost of a single trip may not be shared amongst accounts. In contrast to the update families which have been obtained thus far, this one is not a pure insertion. In order to pay for the trip, funds must be removed from the paying account. Thus, the update, which is tagged with a “+” indicating an insertion, also has a sub-update which is tagged with a “±”, indicating a modification. Standard imperative programming notation has been used to express this. To complete Stage 1 for Accounting, this update family is passed to component AccountMgr via the port with schema RAcAm. Here there is not a single user which must construct a lifting; rather, each account manager must make a decision, and these decisions are subsequently combined into a single lifting. However, no negotiation amongst these managers is required; the individual decisions are independent of one another. Suppose that two of the account managers agree to funding, each at a different level, but a third (AM2 for account A2) does not, so that the lifting in AccountMgr is given by the following expression. uActtMgr:0 := +TripID = 12345, EmpID = Annie, ProjID = French, (AcctID = A1, 1000 ≤ TotalCost ≤ 1500, ApprvAcct = AM1 ∨ AcctID = A3, 1000 ≤ TotalCost ≤ 1100, ApprvAcct = AM3) Since AccountMgr is an extremal component, this lifting is transmitted back to component Accounting, thus completing not only Stage 1 but also Stage 2 for AccountMgr. This information requires that component Accounting trim its initial proposal to remove the possibility of using account A2. The following is computed as the final lifting in Accounting. uActg:1 := +TripID = 12345, EmpID = Annie, ProjID = French, (AcctID = A1, 1000 ≤ TotalCost ≤ 1500, ApprvAcct = AM1 ∨ AcctID = A3, 1000 ≤ TotalCost ≤ 1100, ApprvAcct = AM3) ∪ ±Balance ← Balance − TotalCost Component Accounting now projects this result back to its inner port defined by RSeAc, thus completing its Stage 2. The component Secretariat is still in Stage 1, and must communicate the initial update request to the other two manager components, ProjectMgr and DeptMgr. The project manager and department manager make only approve/disapprove decisions; no other parameters are involved. They are presented only with the proposed values for TripID, EmpID, ProjID, Purpose, StartDate, EndDate, and Location. They indicate approval by placing their IDs in the respective approval fields: ApprvProj or ApprvSup. For example, the update expression which is passed to the component ProjectMgr is
uSePm:0 := TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), ApprvProj = Carl Observe in particular that the location is given as either Nantes or else Nice. Even though there are no hotels available in Nice, that location is still offered as an alternative here, because in this simple model the communication of component Secretariat with Hotel, Accounting, ProjectMgr, and DeptMgr occurs in parallel. Thus, it is not necessarily known that there are no hotels available in Nice when this update request is sent to ProjectMgr. Furthermore, even if Secretariat had received the reply from Hotel before initiating communication with ProjectMgr, it may not have the authority to pass this information along to that component. See 3.1 and 3.3 for a further discussion of this type of situation. Returning to the communication with ProjectMgr, it indicates approval by returning this same expression, and indicates rejection by returning the empty expression. In either case, since it is an extremal component, returning the decision completes Stages 1 and 2 for it. An analogous expression applies for communication with the component DeptMgr. In the decision flow of this example, assume that both return positive decisions. At this point the Secretariat component has received all of the responses, and is in a position to complete its Stage 2. To do this, it merges all of these responses to find a greatest common expression; that is, the largest update family which respects each of the update families which was reflected back to it. The expression which is obtained is the following. uSecrt:1 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, ApprvSup = Barbara, ApprvProj = Carl, (1200 ≤ TotalCost ≤ 1300, AcctID = A1, ApprvAcct = AM1, HotelCost = 1200, HotelName = AssezCher ∨ 1000 ≤ TotalCost ≤ 1300, AcctID = A1, ApprvAcct = AM1, (HotelCost = 400, HotelName = PasCher ∨ HotelCost = 200, HotelName = Simple) ∨ 1000 ≤ TotalCost ≤ 1100, AcctID = A3, ApprvAcct = AM3, (HotelCost = 400, HotelName = PasCher ∨ HotelCost = 200, HotelName = Simple)) To complete Stage 2 for Secretariat, this expression is projected back to component Employee as the following. Note that details about approval and about which account can fund the trip are not included; such information is not part of the view for Employee.
uEmpl:1 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, (1200 ≤ TotalCost ≤ 1300, HotelCost = 1200, HotelName = AssezCher ∨ 1000 ≤ TotalCost ≤ 1300, (HotelCost = 400, HotelName = PasCher ∨ HotelCost = 200, HotelName = Simple)) This completes Stage 2 for Employee. Now, for Stage 3, Annie must choose one of the possibilities. If she decides to take as much travel funds as possible, namely 1300 Euros, and stays at the hotel AssezCher for 1200 Euros, she will have only 100 Euros left over after paying for the hotel. So, she chooses the hotel PasCher for 400 Euros instead. Because she is a very responsible person, and because the hotel is so inexpensive, she decides to take only 1100 Euros in total expenses, since 700 Euros is more than enough to cover the other expenses. Her final, deterministic update request is thus the following. uEmpl:2 := +TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, Location = Nantes, TotalCost = 1100, HotelCost = 400, HotelName = PasCher To complete Stage 3 for all components, this decision must be propagated to the other components, and then committed to the database. This is not quite trivial, because even though Annie has made a decision, there is still a choice to be made in another component. In this example, since she chose to take only 1100 Euros, either account A1 or account A3 may be charged. It is within the domain of the administrator who has update rights on the Accounting component to make this decision. In any case, the process of propagating the decision to the other components is again a simple project-lift process, which will not be elaborated further here. Once these decisions are made, the update may be committed to the database, completing Stage 3. 2.5. Analysis of the Three-Stage Negotiation Process The process presented here is a very simple one. Basically, there are only two points at which an actor may make a decision. The first is during Stage 1, when the set of alternatives which the actor will accept is asserted. In effect, the actor agrees to support each of these alternatives for the life of the negotiation process. This stands in sharp contrast to the model put forward in [15], in which an actor may at any time decide to withdraw alternatives which it previously agreed to support. Similarly, in Stage 3, an actor must decide which of the alternatives to support in the final update, but this is also a single decision which may not be modified once it is made. Stage 2 does not involve any decisions at all. Rather, its purpose is to merge the decisions made in Stage 1, and it may be carried out in an entirely automated fashion, without any input at all from the actors. Again, this is in contrast to the approach of [15], in which the actors may examine the results of merging the previous results and make new decisions as to which alternatives to support and which to reject. The upshot is that the total number of steps required in the negotiation process is effectively independent of the number of alternatives considered.
In contrast, the process described in [15] will in the worst case require a number of steps proportional to the total number of alternatives possible for satisfying the update request. Of course, this reduction comes at the expense of some flexibility in the process itself, but for many applications it should be more than adequate. The dominant cost for this approach is governed not by the number of decisions but rather by the resources required to specify and manage nondeterministic update specifications. This is indeed an important issue which requires further work. It may be addressed both by exploring efficient methods for representing such specifications, as discussed in Section 4.2, and by controlling the number of such alternatives and the ways in which they are propagated, as discussed further in Sections 3.1 and 3.2. However, the point is that with the approach to negotiation presented here, the evolution of that process itself is not the bottleneck.
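To make this cost claim concrete, the whole protocol can be pictured as the recursive skeleton below. This is a deliberately simplified sketch, not the formal model: update families are treated as plain sets of interchangeable alternatives, the projection and lifting steps between port and component schemata are folded into a single lift function, and the outward propagation of the Stage 3 choice is only indicated by a comment.

```python
def negotiate(initiator, initial_family, lift, choose, children):
    """Sketch of the three-stage negotiation on an acyclic interconnection.

    lift(component, family)   -> sub-family the component agrees to support
    choose(component, family) -> one alternative (the Stage 3 decision)
    children(component)       -> neighbours farther from the initiator
    Families are modelled as sets of hashable alternatives.
    """
    def stages_1_and_2(component, incoming):
        supported = lift(component, incoming)              # Stage 1 decision
        returned = [stages_1_and_2(child, supported)       # outward ...
                    for child in children(component)]
        for family in returned:                            # ... then merge
            supported &= family                            # (Stage 2)
        return supported

    merged = stages_1_and_2(initiator, initial_family)
    if not merged:
        return None            # empty merge: the update request fails
    # Stage 3: the initiator commits to one alternative; in the full model
    # that choice is then propagated outward and committed at each component.
    return choose(initiator, merged)
```

Each component is visited a bounded number of times in this skeleton, once on the way out, once on the way back and once for the final commit, so the number of negotiation steps grows with the size of the interconnection rather than with the number of alternatives, which is exactly the contrast with [15] drawn above.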
3. Further Modelling Issues for Cooperative Update In describing the update and negotiation process via the running example of Section 2, some issues were glossed over in the interest of not clouding the main ideas with details. In this section, some of these more important details are elaborated. On the other hand, issues which are not addressed at all in this paper, such as concurrency control, are discussed in 4.2. 3.1. Context Sensitivity of the Lifting Strategy In the example of Section 2, employee Annie made a request to travel either to Nantes or else to Nice for the French project, and department manager Barbara approved this request. However, suppose that Barbara had instead rejected this request, but would have approved a reduced request which includes only the possibility to travel to Nantes, but not to Nice. In other words, she would reject the request to travel to Nantes were it accompanied by an alternative to travel to Nice, but not if Nantes were given as the sole possibility for the destination. In this case, it is said that her decision is context sensitive. Although context-sensitive lifting behavior might seem less than completely rational, it must be acknowledged that human actors may sometimes exhibit such characteristics in their decision making. This work is not primarily about modelling human decision makers. However, context sensitivity in lifting behavior does have important implications. Suppose that, for efficiency purposes, the component Secretariat were allowed to check hotel availability before forwarding travel requests on to the managers. In that case, since no hotel is available in Nice for the requested time period, the department manager would not see that Annie had requested also to travel to that city, since that information would be filtered out before being transmitted to DeptMgr. Thus, Barbara would see only the request to travel to Nantes, and so would approve it. In this case, whether or not the travel request is approved depends upon the order in which impossibilities are filtered out. On the other hand, if Barbara exhibited a context-free decision behavior; that is, if whether she would approve the trip to Nantes were independent of any other requests which Annie had made, then allowing the Secretariat to check hotel availability before forwarding the request on to the managers would not affect the final outcome.
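The distinction can also be phrased operationally: a context-free decision can be computed by filtering the alternatives one at a time, whereas a context-sensitive decision needs to see the whole family at once. A small sketch follows, in which Barbara's context-sensitive policy is invented purely for illustration.

```python
def context_free(approve_one):
    """Lift a per-alternative predicate to a decision on a whole family."""
    return lambda family: {alt for alt in family if approve_one(alt)}

# A context-sensitive policy, hypothetical: Barbara rejects the whole
# request whenever Nice appears among the alternatives.
def barbara(family):
    return set() if "Nice" in family else set(family)

full_request = {"Nantes", "Nice"}
prefiltered = {"Nantes"}       # Nice already removed by the hotel check

assert barbara(full_request) == set()        # rejected outright
assert barbara(prefiltered) == {"Nantes"}    # approved after filtering
# A context-free policy gives the same answer in either order:
ok_with_nantes = context_free(lambda loc: loc == "Nantes")
assert ok_with_nantes(full_request) == ok_with_nantes(prefiltered)
```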
It is important to emphasize that this notion of context sensitivity relates to alternatives in the update family, not to conjunctive combinations. For example, if the request of Annie contained two alternatives, one to travel just to Nantes, and a second to travel both to Nantes and to Nice, then to approve the travel to Nantes, but not the combined travel to both Nantes and Nice, would be perfectly context free. Context sensitivity has only to do with rejecting a given alternative on the grounds of the presence of other alternatives. 3.2. Admissibility for the Lifting Strategy In Stage 1 of the negotiation process, the liftings should be minimal in the sense that they do not make any changes which are not essential to the update request. Within the limited framework of the running example, it is difficult to illustrate liftings which are not minimal. However, suppose that the component DeptMgr contains an additional relation SBudget (DeptID, Amount) which represents the department budget, and that this component is connected to an additional component UpperMgt representing upper management, as illustrated in Figure 4.
Figure 4. Additional component for rider update. (The component DeptMgr now has the relations RDeptMgr and SBudget, retains its port to Secretariat, and is additionally connected via the port RDmUm to the new component UpperMgt, whose relation is SBudget.)
Now, suppose that in approving the travel for the trip of Annie, the department manager also adds to the lifting an increase of 100000 Euros to the department budget, so that it becomes uDeptMgr:0 := TripID = 12345, EmpID = Annie, ProjID = French, Purpose = “meet with project partners”, StartDate = 01.04.10, EndDate = 05.04.10, (Location = Nantes ∨ Location = Nice), ApprvSup = Barbara ∪ ±DeptID = CDpt, Amount ← Amount + 100000 Here Barbara has added a rider to the update request; to be approved, an additional update which is irrelevant to the original request must be realized as well. This lifting is not minimal because the rider could be removed without compromising support for the original update request. It may not always be possible to characterize minimality of a lifting in terms of inserting and deleting the minimal number of tuples. There might be a situation, such as a funds transfer, in which the amount should be minimal. However, the principle remains clear.
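One crude way to flag riders of the kind just shown is to compare the relations touched by a lifting with those that the projected request could possibly require; anything outside that set is suspect. The sketch below is only a heuristic under that assumption, not a characterization of minimality.

```python
def find_riders(lifting, required_relations):
    """Return the sub-updates of a lifting that touch relations which the
    projected update request cannot require (candidate riders)."""
    return [sub for sub in lifting
            if sub["relation"] not in required_relations]

lifting_deptmgr = [
    {"relation": "RDeptMgr", "change": "approve trip 12345"},
    {"relation": "SBudget",  "change": "Amount <- Amount + 100000"},
]
# Servicing the travel request can only require RDeptMgr in this component.
assert find_riders(lifting_deptmgr, {"RDeptMgr"}) == [lifting_deptmgr[1]]
```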
3.3. The Model of Authority A suitable framework for describing and managing access rights in the context of cooperative update requires certain special features beyond those of conventional database systems, since traditional access rights do not take into account any form of cooperation. One suitable model builds upon the widely-used notion of rôle-based access control, which was introduced in [1] using the terminology named protection domain or NPD, and which is elaborated more fully in articles such as [20]. The key idea is that rights are assigned not to individual users, but to rôles. Each user may have one or more rôles, and each rôle may have one or more users as members. For example, Barbara may have the rôle of department manager, but she may also be an ordinary employee when making a travel request for herself. In addition to the usual privileges hierarchy, in which A ≤ B means that B has all privileges which A has, there is an authority hierarchy, in which A ≤ B means that A must support fully the requests of B. A possible authority hierarchy for the example of Section 2 might be the following, in which the ordering is represented from left to right.
TravelAgent < Scientist, Secretary < Scientist, Scientist < Manager, Scientist < Accountant
The employee Annie might make the travel request from the component Employee in the rôle of Scientist, in which case someone (or something — a program perhaps) in the rôle of Secretary using the component Secretariat and someone/something in the rôle of TravelAgent using the component Hotel would need to respect the update request of Annie, but those assuming the rôles of Accountant or of Manager (in the components with corresponding names) would have the right to trim her request as they see fit. This is only a sketch of how the model of authority works; the details will appear in a forthcoming paper. 3.4. The Model of Representation and Computation The representation of update families, and the computations involved in lifting and merging them, are illustrated via example in Section 2, with the basic ideas hopefully clear. It is nevertheless appropriate to provide a bit more information as to what is allowed. First of all, update families are generally taken to be finite; that is, they represent only a finite number of alternatives. This means that, at least in theory, the liftings of Stage 1 of the negotiation process can be computed on a case-by-case basis. Consider the initial update request uEmpl:0 of 2.2. While the ranges on values for TotalCost and HotelCost are finite, the range for HotelName is specified by a wildcard and thus appears to be unconstrained. However, it is assumed that there are only a finite number of hotels, so this range may be taken to be finite. A second, computational issue arises in the context of computing merges in Stage 2 of the negotiation process. Here the set of liftings which agree with the update requests on each of several ports must be computed. In the most general case, this is an unsolvable problem. There is nevertheless a very natural case in which such problems do not arise. If the port views are defined by basic SPJ (select-project-join) queries, and if the schema
has the finite-extension property [14, Def. 28]; that is, if the classical chase procedure [9] always terminates with a finite structure, then the merger can be computed as the result of the chase. Of course, there will be one such chase for each set of alternatives in the respective update families, but the total number of such alternatives is finite. In [19], many cases which guarantee such termination, and thus the finite-extension property, are identified. Included in these is the classical situation of schemata constrained by functional dependencies and unary inclusion dependencies (which include in particular foreign-key dependencies), provided that the latter have the property of being acyclic [7]. The bottom line is that, from a theoretical standpoint, there are no problems with representation and computation. However, further work is needed to identify suitable cases which are both useful and efficiently solvable. See 4.2 for a further discussion.
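Since each update family is finite, the remark that there is one chase per set of alternatives can be pictured as a plain enumeration: every combination of one alternative per port yields one deterministic merging problem, to which the chase (not sketched here) would then be applied. The following is a minimal illustration with toy alternative sets.

```python
from itertools import product

def merge_candidates(port_families):
    """Enumerate one deterministic combination per chase invocation.
    `port_families` maps each port to its finite list of alternatives."""
    ports = sorted(port_families)
    for combo in product(*(port_families[p] for p in ports)):
        yield dict(zip(ports, combo))

toy = {"RSeHt": ["PasCher", "Simple"], "RSeAc": ["A1", "A3"]}
# Finitely many combinations, hence finitely many chase invocations.
assert len(list(merge_candidates(toy))) == 4
```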
4. Conclusions and Further Directions 4.1. Conclusions A straightforward but useful model of negotiation for cooperative update on database schemata defined by components has been presented. In contrast to the approach given in [15], the method presented here involves only three simple stages for each component and thus terminates rapidly. The key idea is that decisions are made only during the first stage; thereafter the operations involve only merging those decisions and then selecting one of them as the final result. Other aspects of the modelling process, such as the representation of update requests, have been illustrated via a detailed example. This has illustrated that, at least for some examples, such representation is a viable alternative to more traditional, task-based representations. Nevertheless, there are many issues which remain to be solved before the ideas can be put into practice. 4.2. Further Directions Relationship to workflow and business-process modelling formalisms The kinds of applications which can be modelled effectively via cooperative update overlap in substantial part with those which are typically modelled using workflow [26] and/or business-process modelling languages [4]. Furthermore, some database transaction models, such as the ConTract model [27], [21], are oriented towards modelling these sorts of processes. Relative to all of these, the cooperative update approach developed here is constraint based, in that it does not specify any flow of control explicitly; rather, it places constraints on what that flow may be. The identification of workflow and business-process representations for those flows of control which are representable by cooperative update, as well as a way to translate between the various representations, is an important direction which warrants further investigation. An appropriate model of concurrency control Update requests to databases, whether cooperative or not, typically overlap, thus requiring some form of concurrency control. However, traditional approaches are generally inadequate for cooperative update. Since they typically involve at least some human interaction, cooperative update processes are by their very nature long running, and so locking large parts of the database in order to avoid unwanted interaction of distinct transactions is not a feasible solution.
On the other hand, cooperative transactions typically involve changes to only a very small part of the overall database. Work is currently underway on a non-locking approach which uses information contained in the initial update request to identify tight bounds on the part of the database which must be protected during a cooperative transaction [13]. A distributed model of control and communication The operation of a database system constructed from schema components, particularly in the context of cooperative updates, involves the passing of messages (i.e., projections and liftings) from component to component. Thus, a unified model of control and communication which is distributed amongst the components is essential to an effective realization of systems with this architecture. Future work will look at the properties and realization of such models. An efficient representation for nondeterministic update families This issue has already been discussed briefly in 3.4. Work is currently underway in two areas. The first is to identify economical and computationally flexible representations for nondeterministic update families. The second is to identify ways of computing merges of such nondeterministic update families using only one, or at least relatively few, instances of the chase procedure. More complex models of negotiation The model of negotiation which has been developed and presented in this paper is a very simple one. Although it is useful in modelling many business processes, there is clearly also a need for more complex negotiation processes, particularly ones with a back-and-forth nature in which parties compromise to reach a decision. Future work will look at such general notions of negotiation.
Acknowledgments For three to four months each year from 2005 to 2008, the author was a guest researcher at the Information Systems Engineering Group at Christian-Albrechts-Universität zu Kiel, and many of the ideas in this paper were developed during that time. He is particularly indebted to Bernhard Thalheim for suggesting that Thalheim's ideas of database components and the author's work on views and view updates could have a fruitful intersection, as well as for inviting him to work with his group on this problem. He is furthermore indebted to Peggy Schmidt, for countless discussions and also for fruitful collaboration on the ideas of schema components. She furthermore read initial drafts of this paper and made several insightful comments.
References
[1] R. W. Baldwin. Naming and grouping privileges to simplify security management in large databases. In Proc. 1990 IEEE Symposium on Research in Security and Privacy, pages 116–132. IEEE Computer Society Press, 1990.
[2] F. Bancilhon and N. Spyratos. Update semantics of relational views. ACM Trans. Database Systems, 6:557–575, 1981.
[3] G. Beneken, U. Hammerschall, M. Broy, M. V. Cengarle, J. Jürjens, B. Rumpe, and M. Schoenmakers. Componentware - State of the Art 2003. In Proceedings of the CUE Workshop Venedig, 2003.
[4] Business process modeling notation v1.1. http://www.omg.org/spec/BPMN/1.1/PDF, 2008.
[5] M. Broy. A logical basis for modular software and systems engineering. In B. Rovan, editor, SOFSEM, volume 1521 of Lecture Notes in Computer Science, pages 19–35. Springer, 1998.
[6] M. Broy. Model-driven architecture-centric engineering of (embedded) software intensive systems: modeling theories and architectural milestones. Innovations Syst. Softw. Eng., 3(1):75–102, 2007.
[7] S. S. Cosmadakis and P. C. Kanellakis. Functional and inclusion dependencies. Advances in Computing Research, 3:163–184, 1986.
[8] A. Eisenberg, J. Melton, K. G. Kulkarni, J.-E. Michels, and F. Zemke. SQL:2003 has been published. SIGMOD Record, 33(1):119–126, 2004.
[9] R. Fagin, P. G. Kolaitis, R. J. Miller, and L. Popa. Data exchange: Semantics and query answering. Theoret. Comput. Sci., 336:89–124, 2005.
[10] G. Fiedler, H. Jaakkola, T. Mäkinen, B. Thalheim, and T. Varkoi. Co-design of Web information systems supported by SPICE. In Y. Kiyoki, T. Tokuda, H. Jaakkola, X. Chen, and N. Yoshida, editors, Information Modelling and Knowledge Bases XX, 18th European-Japanese Conference on Information Modelling and Knowledge Bases (EJC 2008), Tsukuba, Japan, June 2-6, 2008, volume 190 of Frontiers in Artificial Intelligence and Applications, pages 123–138. IOS Press, 2008.
[11] S. J. Hegner. An order-based theory of updates for closed database views. Ann. Math. Art. Intell., 40:63–125, 2004.
[12] S. J. Hegner. A model of database components and their interconnection based upon communicating views. In H. Jaakkola, Y. Kiyoki, and T. Tokuda, editors, Information Modelling and Knowledge Systems XIX, Frontiers in Artificial Intelligence and Applications, pages 79–100. IOS Press, 2008.
[13] S. J. Hegner. A model of independence and overlap for transactions on database schemata. In B. Catania, M. Ivanovic, and B. Thalheim, editors, Advances in Databases and Information Systems, 14th East European Conference, ADBIS 2010, Novi Sad, Serbia, September 20-24, 2010, Proceedings, volume 6295 of Lecture Notes in Computer Science, pages 209–223. Springer-Verlag, 2010.
[14] S. J. Hegner. Internal representation of database views. J. Universal Comp. Sci., 17, 2011. In press.
[15] S. J. Hegner and P. Schmidt. Update support for database views via cooperation. In Y. Ioannidis, B. Novikov, and B. Rachev, editors, Advances in Databases and Information Systems, 11th East European Conference, ADBIS 2007, Varna, Bulgaria, September 29 - October 3, 2007, Proceedings, volume 4690 of Lecture Notes in Computer Science, pages 98–113. Springer-Verlag, 2007.
[16] G. E. Kaiser. Cooperative transactions for multiuser environments. In W. Kim, editor, Modern Database Systems: The Object Model, Interoperability, and Beyond, pages 409–433. ACM Press and Addison-Wesley, 1995.
[17] M. Kifer, A. Bernstein, and P. M. Lewis. Database Systems: An Application-Oriented Approach. Addison-Wesley, second edition, 2006.
[18] L. Kot and C. Koch. Cooperative update exchange in the Youtopia system. Proc. VLDB Endow., 2(1):193–204, 2009.
[19] M. Meier, M. Schmidt, and G. Lausen. On chase termination beyond stratification. CoRR, abs/0906.4228, 2009.
[20] S. L. Osborn and Y. Guo. Modeling users in role-based access control. In ACM Workshop on Role-Based Access Control, pages 31–37, 2000.
[21] A. Reuter and F. Schwenkreis. ConTracts – a low-level mechanism for building general-purpose workflow management systems. IEEE Data Eng. Bull., 18(1):4–10, 1995.
[22] M. C. Sampaio and S. Turc. Cooperative transactions: A data-driven approach. In 29th Annual Hawaii International Conference on System Sciences (HICSS-29), January 3-6, 1996, Maui, Hawaii, pages 41–50. IEEE Computer Society, 1996.
[23] B. Thalheim. Database component ware. In K.-D. Schewe and X. Zhou, editors, Database Technologies 2003, Proceedings of the 14th Australasian Database Conference, ADC 2003, Adelaide, South Australia, February 2003, volume 17 of CRPIT, pages 13–26. Australian Computer Society, 2003.
[24] B. Thalheim. Co-design of structuring, functionality, distribution, and interactivity for information systems. In S. Hartmann and J. F. Roddick, editors, APCCM, volume 31 of CRPIT, pages 3–12. Australian Computer Society, 2004.
[25] B. Thalheim. Component development and construction for database design. Data Knowl. Eng., 54(1):77–95, 2005.
[26] W. van der Aalst and K. van Hee. Workflow Management: Models, Methods, and Systems. MIT Press, 2002.
[27] H. Wächter and A. Reuter. The ConTract model. In A. K. Elmagarmid, editor, Database Transaction Models for Advanced Applications, pages 219–263. Morgan Kaufmann, 1992.
[28] W. Wieczerzycki. Multiuser transactions for collaborative database applications. In G. Quirchmayr, E. Schweighofer, and T. J. M. Bench-Capon, editors, Database and Expert Systems Applications, 9th International Conference, DEXA '98, Vienna, Austria, August 24-28, 1998, Proceedings, volume 1460 of Lecture Notes in Computer Science, pages 145–154. Springer, 1998.
A Description-based Approach to Mashup of Web Applications, Web Services and Mobile Phone Applications
Prach CHAISATIEN, Takehiro TOKUDA
{prach, tokuda}@tt.cs.titech.ac.jp
Department of Computer Science, Tokyo Institute of Technology, Meguro, Tokyo 152-8552, Japan
Abstract. Recent developments in mobile technology have enabled mobile phones to work as mobile Web servers. However, the composition of mobile phone applications and Web resources to form new mashup applications requires mobile programming knowledge ranging from how to create user interfaces and network connections to how to access Web resources. Furthermore, the unique capabilities of mobile phone applications such as access to camera inputs or sensor data are often limited to local use only. To address these problems, we present a description-based approach and an Integration Model for the composition of mobile mashup applications combining Web applications, Web services and mobile phone applications (i.e., generic components). The compositions appear to require less native mobile programming knowledge. In the current work, to leverage access to these services and applications, an Interface Wrapper is used to transform generic components into mashup components. Composers are able to transform and reuse form-based query results from Web applications and integrate them with wrapped output from users' interaction with mobile phone applications, and other Web services. The final applications can be configured to work in two ways: 1) as native mobile phone applications or 2) as Web applications accessible externally via a mobile Web server application. Keywords. Mobile phone application, mobile Web server, Web service, Web application, mobile mashup, Interface Wrapper
1. Introduction Mobile phone applications deliver unique capabilities such as GPS location services, voice recognition and camera/image processing. There are some problems related to the composition of mashup applications from these components and existing Web resources. One of these problems is the lack of mobile programming language knowledge needed for the creation of user interfaces and control parts. Another issue is that composers not only need to know how to create a standalone mobile application, but also need additional skills to program the mobile phone to access and reuse Web resources. To address these problems, this paper presents a description-based approach to flexibly compose mashup applications from three generic component categories: Web applications, Web services, and mobile phone applications. With minimum
configuration required, our approach allows composers to accomplish the following tasks in the aforementioned categories:
• Simplify and reuse form-based query results from Web applications.
• Extract selected portions from Web services' outputs.
• Generate and configure Web service interfaces for mobile phone applications.
In the composition procedure, first, the Integration Model is used to describe and plan the data flows of the mashup components. The Integration Model is later expanded into the configurations of a Mobile Integration Description file (MID file). Then the mashup application generator uses the file to generate the actual mashup application. In a similar manner, composers are required to fill in the control parameters of each component in the MID file, and a mashup application is generated according to those configurations. Lastly, composers are able to configure the final mashup application to run on the device as a mobile phone application or to be accessed externally as a Web application via the mobile Web server application. To leverage access to each mashup component, we term the components that transform communication interfaces between component categories “Interface Wrappers”. For instance, the Web service wrapper detailed in this study enables inter-component communication and external access to a mobile phone application using a Web service interface.
Figure 1. Overview of the mashup applications, Interface Wrappers and their relation to outputs and clients
This study's contribution is the presentation of the model and the methodology to reuse non-API Web resources together with existing mobile phone applications to form mashup applications. The established method is to use Java classes to build and to connect components, while our method controls the data flows of existing mashup components through the utilization of configuration files and their parameters. The implementation of this study shows that our approach allows composers to flexibly reuse capabilities of sensors and peripherals controlled by mobile phone applications, to integrate them with Web resources, and to generate new mashup applications. The organization of the rest of this paper is as follows. Related work and research background are reviewed in Section 2. A mashup example is presented in Section 3 to demonstrate our approach. Section 4 explains the method of composing a mobile mashup application. The composition is divided into three working processes: planning
process, configuration process, and application generation process. Also in Section 4, we present the Web information extraction tool used in the configuration process. In Section 5, we provide detailed mashup composition examples, and then evaluate them by presenting the applications' actual drawbacks and the problems that arise when applying the same model to other resources. In Section 6 we give a general discussion, making a comparison to conventional approaches in terms of generation process, objectives and limitations. In Section 7, we describe this study's future work and present our concluding remarks.
2. Background and Related Work Research disciplines in mobile mashup are usually related to these fields of study:
1. Web page tailoring and adaptation.
2. Web information extraction and reproduction.
3. Mobile mashup languages, modeling and their applications.
4. Mobile servers and ubiquitous environments.
Generally, the conventional focus in tailoring and adapting Web pages for viewing on mobile devices gives more importance to extracting and simplifying visual outputs. DOM tree based extraction and proxy server architecture, as presented in [1], [4] and [11], are used to adapt the presentation of a Web page on mobile devices to assist navigation effectiveness. Although these methods promote minimization of information and visualization, they offer little support for communication and integration over multiple working components when composing a mashup application for mobile phones. Research in the field of Web information extraction emphasizes methods to correctly indicate and reproduce parts of Web applications for creating new mashup Web applications. The study in [7] proposed a Web information extraction method to generate virtual Web service functions from Web applications at the client side. This research targeted static contents, a limitation which was later corrected in [8] by allowing dynamic Web contents created by client-side scripts. These two systems are implemented using large external Java libraries, including Java Applets. In our case, a mobile device cannot handle the load of external libraries needed to extract and simulate entire Web pages. Among approaches using description languages based on XML, research in [13] and [15] has shown that the majority of description-based XML languages are designed to support content delivery to mobile phones and handheld devices. However, most languages target user interface design and do not facilitate integration with Web information. XISL [12], which extends interaction and input methods, requires an implementation of interpreter and dialog manager modules. One substantial difference when compared to our approach is that we reuse interactions from existing mobile phone applications, and do not create new user-interaction applications from the description file. A method to generate a mobile phone application using configuration files was presented in mFoundry's Mojax [14]. This framework borrows syntaxes from JavaScript, CSS and XML. Mojax applications are compiled and run as native Java code (J2ME) on a device. Mojax also supports development of plug-ins that can access device capabilities such as location services, address book, audio and video. Our approach introduces the transformation of generic components into mashup
components. Moreover, developers are able to write the optional control parts using Web or mobile programming. In 2002, Intel Research introduced the Personal Server [19], which wirelessly connects to the local network environment. The Personal Server allows an HTTP connection to access personal pages or file storage. A more specific study of component-based infrastructure can be found in [16]. This system used abstract user interface descriptions to represent software components on an embedded hardware system. Although a method to display system information and control the hardware system using a variety of clients was presented, those connections were specific to the transport layer, and the use of this information for integration with Web information was not addressed. The sharing of mobile services presented in [10] is a system based on websites that support user-generated mobile services. Our approach instead promotes the use of mobile integration with Web content, allowing contributors from these two platforms to share their work. The mobile service system in [17] provides an extension of presence awareness to mobile users. Without the implementation of a central server system, our approach works with an always-on HTTP connection via the mobile Web server, which allows quick access to shared information anywhere and anytime. Concerning the development of mobile Web servers, ServersMan [18] is a mobile application targeted at major mobile platforms (iPhone, Windows Mobile and Android). The application enables Web access to a device's file storage and other parameters such as GPS latitude and longitude. Operations to access the device's other resources, such as the digital compass or accelerometer, are not yet defined. Moreover, the reuse of existing mobile applications on the device is also not presented.
3. Mobile Mashup Example We present an example to demonstrate how the description-based approach is used in composing a mashup application. In Example 5.1.1, we show the composition of a mashup application for displaying the nearest Wikipedia article and local weather information according to the mobile phone's location.
Figure 2. Mobile Mashup Example
This application is targeted at a mobile Internet device (i.e., Output, Web application, iPod Touch, Safari Browser) and is built by composing a location service from a mobile phone application (i.e., Component A, publisher, GPS Locator, mobile application) with Web services. With no built-in GPS hardware at the client side, components B and C can alternatively retrieve information from the location service on component A and perform queries for their Web service outputs (i.e., Component B, subscriber, Wikinear, Web service, and Component C, subscriber, LocalWeather, Web service). The working procedure for composing this mashup application is as follows.
1. Specify a starting component: The data flows in this mashup application begin from a component that accesses GPS parameters from a mobile application (GPSLocator). For compatibility with the next components, we first transform the parameters by applying the Web service wrapper to GPSLocator. Composers must specify the Intent parameters and the Intent's extra parameters to retrieve data from this mobile application. It is also required that composers assign the publisher role to the component, including the publisher's ID. Composers need to specify the Web service wrapper's JSON message as well.
2. Specify the next components: The required parameters for the next components are the Web services' URLs, query field names, and each field's value. In this example both Wikinear and LocalWeather use fields named lat for latitude and lng for longitude. The components' role must be set to subscriber, using the publisher's ID where lat and lng are referred to.
3. Specify the output component: The output component, which is in the form of a Web application, uses the query results from the Web services described in item 2. Composers must specify the mobile Web server's access path and the output page in the form of HTML code, and refer to parameters from the Web services' output.
4. Generate the final mashup application: Composers enter the information from items 1–3 into the MID file and generate the output Web application, which is placed on the mobile Web server. Users can access it using the mobile Web server host name and access path according to the configuration.
To support the composition of mashup applications, the Integration Model is used to plan the data flows in the mashup applications.
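Although the concrete MID file syntax is not reproduced at this point, the information collected in items 1 to 3 can be pictured as a small configuration structure. The sketch below uses Python dictionaries with invented key names purely for illustration; it is not the actual MID format.

```python
# Hypothetical, simplified view of the information a MID file would carry
# for the GPS/Wikipedia/weather example (all key names are invented).
mid_config = {
    "components": [
        {"id": "A", "role": "publisher", "type": "mobile_application",
         "name": "GPSLocator",
         "intent": "<Intent and extra parameters go here>",
         "wrapper": "web_service_json"},
        {"id": "B", "role": "subscriber", "type": "web_service",
         "name": "Wikinear", "url": "<Wikinear query URL>",
         "query": {"lat": "A.lat", "lng": "A.lng"}},
        {"id": "C", "role": "subscriber", "type": "web_service",
         "name": "LocalWeather", "url": "<weather query URL>",
         "query": {"lat": "A.lat", "lng": "A.lng"}},
    ],
    "output": {"type": "web_application",
               "server_path": "/mashup/wikinear-weather",
               "template": "<HTML referring to the outputs of B and C>"},
}
```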
4. Method in the Composition of Mobile Mashup Application Our method of composing mobile mashup applications consists of planning, configuration, and generation processes. In the planning process, the Integration Model is used to outline the components' roles, the data flows, and the format of output forms for mashup applications. In the configuration process, the Integration Model is adapted and expanded into the actual configuration of MID files. We use the Web extraction tool to aid composers in retrieving configurations for data extraction from Web applications. Later, in the application generation process, the mobile mashup application generator uses MID files to generate the actual mobile mashup application. The data flows involve different components located in different parts of a larger system. Therefore, we explain the current system architecture to assist in understanding the data flows and how the generated mobile mashup applications are to be placed in the system. Then we present the detailed processes of generating mashup applications.
4.1. Planning Process: Integration Model

Table 1. Model representation of mashup components, roles and output forms. Columns: Category, Mashup component, Role, Output form.
A model representation of the three mashup components, roles and output forms is shown in Table 1. Parameter indices indicate the publisher-subscriber relationship between component couples. As an extension to the publisher (P) and subscriber (S) roles, we use the medium role (M) to describe a component that publishes a subscribed output from another component. The representation of component A in mobile mashup example 5.1.1, i.e. the Web service wrapper applied to a mobile phone application, can be written as

C [Mobile Phone Application (GPS Locator)], O [Web Service]
We call these one-tier compositions Interface Wrappers; they are used to transform an output's interface for communication with other mashup components. Table 2 contains a model representation of the wrappers, their corresponding functions and sample usages.

Table 2. Model representation of Interface Wrappers, their corresponding functions and sample usages.

(a) C [Web Application] O [Web Service] = W [WS[WA(name)]]
    Function: Web content extractor functioning as a Web service.
    Sample usage: Extracts texts from a query-based Web page (e.g. product search, book reviews, game ratings).

(b) C [Mobile Phone Application] O [Web Service] = W [WS[MA(name)]]
    Function: Mobile Web Service Wrapper.
    Sample usage: Retrieves GPS coordinates from a mobile phone application via a Web service.

(c) C [Web Application] O [Mobile Phone Application] = W [MA(name)[WA(name)]] *
    Function: Mobile application functioning as a Web content extractor.
    Sample usage: Displays part of a query-based Web page.

(d) C [Web Service] O [Mobile Phone Application] = W [MA(name)[WS(name)]] *
    Function: Mobile application functioning as a Web service connector.
    Sample usage: Selects and displays texts from a Web service's result using a native mobile phone application.

(e) C [Web Service] O [Web Application] = W [WA[WS(name)]]
    Function: Web application functioning as a Web service connector.
    Sample usage: Searches and displays results from a Web service.

(f) C [Mobile Phone Application] O [Web Application] = W [WA[MA(name)]]
    Function: Mobile Web Application Wrapper.
    Sample usage: Searches for contact info, media or database queries on the mobile phone.

* Since the usage and output of mobile applications differ, the application name has to be declared.
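As an illustration of case (b), the Mobile Web Service Wrapper, the following sketch publishes a location reading as a JSON Web service. It is our own stand-in, not the authors' wrapper: the stubbed coordinates, the port and the handler name are arbitrary, and the real wrapper obtains the location from the GPS Locator application through Intent parameters.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def read_gps_from_mobile_app():
    # Placeholder: in the paper this data comes from the GPS Locator mobile
    # application via Intent parameters; here it is simply stubbed.
    return {"lat": 35.6595, "lng": 139.7005}

class GPSWrapperHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Publish the mobile application's location as a JSON Web service message.
        body = json.dumps(read_gps_from_mobile_app()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), GPSWrapperHandler).serve_forever()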
Outputs in the form of a Web application (WA) or a mobile application (MA) can be used as end points in creating a mashup application. A composition that contains only one wrapper may not be enough to form a meaningful application. In Section 5, we select cases (a), (b) and (c) for our implementation for the following reasons.

• Cases (a) and (e) can be considered existing Web extraction techniques. The Web service (WS) output of the wrapped WA in case (a) is more appropriate for showing the complexity of creating a mashup application.

• Cases (b) and (f) are similar to each other; only the output forms differ. We select case (b) to show further integration of its WS output.

• Case (c) is more complex than (d). We would like to show how information is extracted from a WA in (c), whereas (d) contains only simple operations on WS output.

By applying the new interface syntax (b) to example 5.1.1, a new abstract model can be declared as

W [WS[MA(GPS Locator)]], R [P1],
C [WS(Wikinear)], R [S1],
C [WS(Weather Report)], R [S1],
O [WA(iPod Touch, Desktop)]
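Read literally, this declares one interface wrapper acting as publisher 1, two Web service subscribers bound to that publisher, and one Web application output. The following small data structure is one possible encoding of the same declaration (our own notation, not the MID file syntax), which the configuration process would then expand:

integration_model = [
    # Interface wrapper: GPS Locator mobile application exposed as a Web service, publisher with index 1
    {"kind": "W", "inner": "MA(GPS Locator)", "outer": "WS", "role": "P", "index": 1},
    # Two Web service components subscribing to publisher 1
    {"kind": "C", "component": "WS(Wikinear)", "role": "S", "index": 1},
    {"kind": "C", "component": "WS(Weather Report)", "role": "S", "index": 1},
    # Output form: a Web application for iPod Touch and desktop browsers
    {"kind": "O", "output": "WA(iPod Touch, Desktop)"},
]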
In the next section, we describe how to adapt and expand this model into an actual configuration file.

4.2. Configuration Process: MID File and Web Extraction Tool

In the configuration process, the Integration Model is adapted and expanded into the actual configuration of MID files. We use the Web extraction tool to aid composers in retrieving configurations for data extraction from Web applications.

4.2.1. MID File

Table 3. Structure of the <project> scope in MID files.
Scope: <project>